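docs(openclaw): update the integration guide and refactor the Rongcheng Yigou processing logic
- Move the Rongcheng Yigou (蓉城易购) processing logic out of the launcher into a dedicated service class to improve modularity
- Update the OpenClaw guide to describe the newly added processing modes (Excel, special suppliers, barcode mapping)
- Unify the headless_api entry point so that command-line parameters select between the processing flows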
parent 83405a9b8e
commit 3e2f46d26d
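@@ -1,51 +1,66 @@
-# OCR Order Processing System - OpenClaw Integration Guide
+# OCR Order Processing System - OpenClaw Integration Guide (Enhanced Edition)

-This guide helps OpenClaw (or any automated AI agent) integrate with this project and close the loop of "send an image -> process automatically -> return an Excel file".
+This guide helps OpenClaw automate every function of the purchase order processing system: image recognition, Excel cleaning, special-supplier handling, and barcode mapping management.

-## 1. Project Overview
-This project is an automated purchase order processing system. It uses OCR to recognize tabular data in images, cleans the data according to preset rules (for example, splitting units and inferring specifications), and generates a purchase order in a standard Excel format.
+## 1. Core Interface (headless_api.py)

-## 2. Core Workflow
-Automated integrations should follow these steps:
+`headless_api.py` is the system's single command-line entry point. It supports several modes, selected through command-line arguments.

-1. **Prepare the image**: save the received purchase order image to the project's `data/input/` directory.
-2. **Trigger processing**: run the `headless_api.py` script.
-3. **Capture the path**: read the absolute path of the generated Excel file from the script's standard output (stdout).
-4. **Read the result**: read the file at that path and send it back to the user.
-
-## 3. API Reference (headless_api.py)
-
-This entry point is designed specifically for automated integration.
-
-### How to Run
+### 1.1 Image OCR (default mode)
+Use this mode when the user sends a purchase order image.
 ```bash
-# Option A: automatically process the newest image in the input directory
+# Automatically process the newest image in data/input
 python headless_api.py

-# Option B: process the image at a given path
-python headless_api.py "E:/2025Code/python/orc-order-v2/data/input/invoice_001.jpg"
+# Process a specific image
+python headless_api.py "data/input/my_invoice.jpg"
 ```

-### Output Conventions
-- **Standard output (stdout)**: on success only, prints the **absolute path of the final generated Excel file**.
-  - *Example*: `E:\2025Code\python\orc-order-v2\data\result\采购单_invoice_001.xlsx`
-- **Standard error (stderr)**: receives all run logs, progress messages, warnings, and error details.
-  - *Note*: if a unit-price validation discrepancy occurs (the generated price differs from the product-data price by more than 1 yuan), the related warning is printed to this stream.
+### 1.2 Excel-only Order Processing
+Use this mode when the user directly provides an Excel file exported by a supplier.
+```bash
+# Process the newest Excel file in data/input
+python headless_api.py --excel

-## 4. Key Directory Structure
-- `data/input/`: raw images awaiting processing (jpg, png, and bmp are supported).
-- `data/result/`: the final standard Excel purchase orders.
-- `templates/商品资料.xlsx`: the system automatically validates unit prices against this file.
-- `config/suppliers_config.json`: stores the suppliers' cleaning rules.
+# Process a specific Excel file
+python headless_api.py "data/input/supplier_order.xlsx" --excel
+```

-## 5. Exit Codes and Error Handling
-- Exit code `0`: processing succeeded and stdout contains the file path.
-- Exit code `1`: processing failed; see stderr for the cause.
+### 1.3 Special-Supplier Preprocessing
+For suppliers whose formats are highly unusual (such as Rongcheng Yigou (蓉城易购) and the tobacco company), the system has dedicated built-in logic.
+```bash
+# Process a Rongcheng Yigou order (automatically picks the newest order file in data/output)
+python headless_api.py --rongcheng

-## 6. Suggested Strategy for OpenClaw
-1. **Poll or trigger**: when a user-sent image is detected, either clear `data/input/` first or write directly into it.
-2. **Capture the path**: run the script with something like `subprocess.check_output` and take the last line of output as the file path.
-3. **Price alerts**: OpenClaw can scan stderr; if it contains `WARNING: Price validation found...`, it can proactively tell the user: "Processing finished, but some unit prices do not match the product database. Please check them."
+# Process a tobacco company order
+python headless_api.py --tobacco
+```
+
+### 1.4 Barcode Mapping Management
+When a barcode recognized by OCR does not match the system database, use this interface to update the mapping.
+```bash
+# Map the original barcode 12345 to the system's target barcode 67890
+python headless_api.py --update-mapping --barcode "12345" --target "67890"
+```
+
+## 2. OpenClaw Integration Strategy
+
+### 2.1 Task Dispatch Logic
+OpenClaw should choose the arguments based on the user's input:
+- **Image received** -> use the default mode.
+- **Excel file received** -> use `--excel`.
+- **User mentions "蓉城" (Rongcheng) or "易购" (Yigou)** -> use `--rongcheng`.
+- **User mentions "烟草" (tobacco)** -> use `--tobacco`.
+- **User corrects a barcode error** -> use `--update-mapping`.
+
+### 2.2 Retrieving Results
+- **Success**: the script prints the absolute path of the generated Excel file to `stdout`. OpenClaw captures this path and sends the file.
+- **Unit-price warning**: the script writes `WARNING: Price validation found...` to `stderr`. OpenClaw should capture it and relay it to the user: "The file has been generated, but some product prices look abnormal. Please review them."
+
+## 3. Common Directories
+- `data/input/`: raw images or Excel files awaiting processing.
+- `data/output/`: intermediate files produced by OCR (the Rongcheng and tobacco modes read from this directory).
+- `data/result/`: the final standard Yinbao (银豹) purchase orders.

 ---
-*Automatically generated by the system on 2026-03-25*
+*Version: 2.0 | Last updated: 2026-03-25*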
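The dispatch rules and result conventions in the guide can be sketched on the OpenClaw side roughly as follows. This is a minimal illustration, not part of the commit: `build_command` and `interpret_result` are hypothetical helper names, the precedence of keywords over attachments is an assumption, and barcode corrections are left out because they need the two barcodes extracted from the conversation first.

```python
import sys
from typing import List, Optional, Tuple

# Trigger keywords copied from the guide's dispatch table.
RONGCHENG_KEYWORDS = ("蓉城", "易购")
TOBACCO_KEYWORDS = ("烟草",)
EXCEL_SUFFIXES = (".xlsx", ".xls")


def build_command(message: str = "", attachment: Optional[str] = None) -> List[str]:
    """Translate a user request into an argv list for headless_api.py.

    Assumption: supplier keywords take precedence over the attachment type.
    """
    cmd = [sys.executable, "headless_api.py"]
    if any(kw in message for kw in RONGCHENG_KEYWORDS):
        return cmd + ["--rongcheng"]
    if any(kw in message for kw in TOBACCO_KEYWORDS):
        return cmd + ["--tobacco"]
    if attachment and attachment.lower().endswith(EXCEL_SUFFIXES):
        return cmd + [attachment, "--excel"]
    if attachment:
        return cmd + [attachment]  # image: default OCR mode
    return cmd  # no attachment: newest image in data/input


def interpret_result(returncode: int, stdout: str, stderr: str) -> Tuple[Optional[str], bool]:
    """Return (excel_path_or_None, price_warning_flag) for a finished run."""
    price_warning = "WARNING: Price validation found" in stderr
    if returncode != 0:
        return None, price_warning
    # The path is expected on the last non-empty stdout line.
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    return (lines[-1] if lines else None), price_warning
```

A caller would typically run `subprocess.run(build_command(msg, path), capture_output=True, text=True)` and pass the `returncode`, `stdout`, and `stderr` of the completed process to `interpret_result`.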
144
app/services/special_suppliers_service.py
Normal file
@@ -0,0 +1,144 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+import os
+import re
+import time
+import pandas as pd
+import logging
+from typing import Optional, Callable
+from app.services.order_service import OrderService
+
+logger = logging.getLogger(__name__)
+
+class SpecialSuppliersService:
+    """
+    Service class for special-supplier logic, such as Rongcheng Yigou (蓉城易购).
+    """
+
+    def __init__(self, config_manager=None):
+        self.config_manager = config_manager
+        self.order_service = OrderService(config_manager)
+
+    def process_rongcheng_yigou(self, src_path: str, progress_cb: Optional[Callable[[int, str], None]] = None) -> Optional[str]:
+        """
+        Process a Rongcheng Yigou order.
+        """
+        try:
+            if progress_cb: progress_cb(10, "正在处理蓉城易购...")
+
+            def _pick_col(df, exact_list=None, contains_list=None):
+                cols = list(df.columns)
+                if exact_list:
+                    for name in exact_list:
+                        for c in cols:
+                            if str(c).strip() == str(name).strip():
+                                return c
+                if contains_list:
+                    for kw in contains_list:
+                        for c in cols:
+                            if kw in str(c):
+                                return c
+                return None
+
+            try:
+                df_raw = pd.read_excel(src_path, header=2)
+            except Exception:
+                df_raw = pd.read_excel(src_path)
+                df_raw = df_raw.iloc[2:].reset_index(drop=True)
+
+            # Drop columns and rows that are entirely empty
+            df_raw = df_raw.dropna(how='all', axis=1).dropna(how='all', axis=0)
+
+            # Select the key columns
+            col_no = _pick_col(df_raw, contains_list=['序号'])
+            col_name = _pick_col(df_raw, contains_list=['商品名称','品名','名称'])
+            col_bc = _pick_col(df_raw, contains_list=['商品条码','条码'])
+            col_unit = _pick_col(df_raw, exact_list=['单位(订购单位)'], contains_list=['订购单位','小单位','单位'])
+            col_qty = _pick_col(df_raw, contains_list=['订购数量','订货数量','数量'])
+            col_price= _pick_col(df_raw, exact_list=['优惠后金额(小单位)'], contains_list=['单价','销售价','进货价','优惠后金额'])
+            col_amt = _pick_col(df_raw, exact_list=['出库小计(元)'], contains_list=['金额','优惠后金额','小计','合计','出库小计'])
+
+            selected = [c for c in [col_no,col_name,col_bc,col_unit,col_qty,col_price,col_amt] if c]
+
+            if not selected or len(selected) < 4:
+                df = pd.read_excel(src_path)
+                df = df.iloc[2:].reset_index(drop=True)
+                keep_idx = [0, 2, 3, 9, 12, 15, 17]
+                keep_idx = [i for i in keep_idx if i < df.shape[1]]
+                df2 = df.iloc[:, keep_idx].copy()
+                target_cols = ['序号','商品名称','商品条码','单位','数量','单价','金额']
+                df2.columns = target_cols[:len(df2.columns)]
+            else:
+                df2 = df_raw[selected].copy()
+                rename_map = {}
+                if col_no: rename_map[col_no] = '序号'
+                if col_name: rename_map[col_name] = '商品名称'
+                if col_bc: rename_map[col_bc] = '商品条码(小条码)'
+                if col_unit: rename_map[col_unit] = '单位'
+                if col_qty: rename_map[col_qty] = '订购数量(小单位)'
+                if col_price: rename_map[col_price] = '单价(小单位)'
+                if col_amt: rename_map[col_amt] = '优惠后金额(小单位)'
+                df2 = df2.rename(columns=rename_map)
+
+            if '单位' in df2.columns:
+                df2['单位'] = df2['单位'].astype(str).str.strip().replace({'件':'份'})
+
+            # Split rows that hold multiple barcodes and divide the quantity evenly
+            bc_col = '商品条码(小条码)' if '商品条码(小条码)' in df2.columns else ('商品条码' if '商品条码' in df2.columns else ('条码' if '条码' in df2.columns else None))
+            qty_col = '订购数量(小单位)' if '订购数量(小单位)' in df2.columns else ('订购数量' if '订购数量' in df2.columns else ('数量' if '数量' in df2.columns else None))
+            up_col = '单价(小单位)' if '单价(小单位)' in df2.columns else ('单价' if '单价' in df2.columns else ('销售价' if '销售价' in df2.columns else None))
+            amt_col = '优惠后金额(小单位)' if '优惠后金额(小单位)' in df2.columns else ('金额' if '金额' in df2.columns else ('小计' if '小计' in df2.columns else None))
+
+            if bc_col and qty_col:
+                rows = []
+                for _, row in df2.iterrows():
+                    bc_val = str(row.get(bc_col, '')).strip()
+                    if bc_val and any(sep in bc_val for sep in [',',',','、','/',' ']):
+                        parts = []
+                        temp_bc = bc_val
+                        for sep in [',',',','、','/',' ']:
+                            temp_bc = temp_bc.replace(sep, ' ')
+                        for token in temp_bc.split():
+                            tok = ''.join([ch for ch in token if ch.isdigit()])
+                            if tok: parts.append(tok)
+                        parts = [p for p in parts if p]
+                        if len(parts) >= 2:
+                            try:
+                                q_total = float(row.get(qty_col, 0) or 0)
+                            except Exception:
+                                q_total = 0
+                            if q_total > 0:
+                                n = len(parts)
+                                base = int(q_total) // n if q_total.is_integer() else q_total / n
+                                remainder = int(q_total) % n if q_total.is_integer() else 0
+                                for i, bc in enumerate(parts):
+                                    new_row = row.copy()
+                                    new_row[bc_col] = bc
+                                    q_each = base + (1 if remainder > 0 and i < remainder else 0)
+                                    new_row[qty_col] = q_each
+                                    if up_col and amt_col:
+                                        try:
+                                            upv = float(new_row.get(up_col, 0) or 0)
+                                            new_row[amt_col] = upv * float(q_each)
+                                        except Exception: pass
+                                    rows.append(new_row)
+                            else: rows.append(row)
+                        else: rows.append(row)
+                    else: rows.append(row)
+                df2 = pd.DataFrame(rows)
+
+            out_dir = os.path.dirname(src_path)
+            base = os.path.basename(src_path)
+            final_name = f"蓉城易购预处理-{base}"
+            final_path = os.path.join(out_dir, final_name)
+            df2.to_excel(final_path, index=False)
+
+            if progress_cb: progress_cb(60, "预处理完成,开始标准流程...")
+
+            result = self.order_service.process_excel(final_path, progress_cb=lambda p: progress_cb(60 + int(p*0.4), "Excel处理中...") if progress_cb else None)
+            return result
+
+        except Exception as e:
+            logger.error(f"处理蓉城易购订单出错: {e}")
+            return None
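The multi-barcode split in `process_rongcheng_yigou` is easier to follow in isolation. The sketch below restates the same rules as two standalone functions without pandas. `split_barcodes` and `split_quantity` are hypothetical names introduced for illustration; they are not part of the service class.

```python
from typing import List, Union

# Separators the service treats as barcode delimiters
# (ASCII comma, full-width comma, ideographic comma, slash, space).
SEPARATORS = [",", ",", "、", "/", " "]


def split_barcodes(cell: str) -> List[str]:
    """Split a cell holding several barcodes, keeping only the digits of each token."""
    text = str(cell).strip()
    for sep in SEPARATORS:
        text = text.replace(sep, " ")
    digits_only = ("".join(ch for ch in token if ch.isdigit()) for token in text.split())
    return [tok for tok in digits_only if tok]


def split_quantity(total: float, n: int) -> List[Union[int, float]]:
    """Divide a quantity across n barcodes.

    Whole quantities are split into integers, with the remainder handed out
    one unit at a time to the first rows; fractional quantities are divided evenly.
    """
    total = float(total)
    if total.is_integer():
        base, remainder = divmod(int(total), n)
        return [base + (1 if i < remainder else 0) for i in range(n)]
    return [total / n] * n
```

For example, a row with three barcodes and a quantity of 7 becomes three rows with quantities 3, 2, and 2, so the total is preserved. The service then recomputes each row's amount as unit price times the new quantity.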
163
headless_api.py
@@ -12,8 +12,10 @@ import os
 import sys
 import logging
 import time
+import argparse
+import json
 from pathlib import Path
-from typing import Optional
+from typing import Optional, List, Dict

 # Add the current directory to the module search path
 sys.path.append(os.path.dirname(os.path.abspath(__file__)))
@@ -21,9 +23,11 @@ sys.path.append(os.path.dirname(os.path.abspath(__file__)))
 from app.config.settings import ConfigManager
 from app.services.ocr_service import OCRService
 from app.services.order_service import OrderService
+from app.services.tobacco_service import TobaccoService
+from app.services.special_suppliers_service import SpecialSuppliersService
 from app.core.utils.log_utils import set_log_level

-# Send log output to stderr so it does not interfere with the path printed on stdout
+# Send log output to stderr
 logging.basicConfig(
     level=logging.INFO,
     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
@@ -31,73 +35,121 @@ logging.basicConfig(
 )
 logger = logging.getLogger("HeadlessAPI")

-def get_latest_input_image() -> Optional[str]:
-    """Return the newest image file in the input directory."""
-    input_dir = Path("data/input")
-    if not input_dir.exists():
+def get_latest_file(directory: str, extensions: List[str]) -> Optional[str]:
+    """Return the newest file in the directory that has one of the given extensions."""
+    dir_path = Path(directory)
+    if not dir_path.exists():
         return None

-    extensions = ['.jpg', '.jpeg', '.png', '.bmp']
     files = []
     for ext in extensions:
-        files.extend(input_dir.glob(f"*{ext}"))
-        files.extend(input_dir.glob(f"*{ext.upper()}"))
+        files.extend(dir_path.glob(f"*{ext}"))
+        files.extend(dir_path.glob(f"*{ext.upper()}"))

     if not files:
         return None

-    # Sort by modification time
     latest_file = max(files, key=lambda p: p.stat().st_mtime)
     return str(latest_file)

-def run_pipeline(image_path: Optional[str] = None):
+def update_barcode_mapping(barcode: str, target_barcode: str):
+    """Update the barcode mapping."""
+    try:
+        config_path = os.path.join("config", "barcode_mappings.json")
+        mappings = {}
+        if os.path.exists(config_path):
+            with open(config_path, 'r', encoding='utf-8') as f:
+                mappings = json.load(f)
+
+        mappings[barcode] = target_barcode
+
+        with open(config_path, 'w', encoding='utf-8') as f:
+            json.dump(mappings, f, ensure_ascii=False, indent=2)
+
+        logger.info(f"成功更新条码映射: {barcode} -> {target_barcode}")
+        return True
+    except Exception as e:
+        logger.error(f"更新条码映射失败: {e}")
+        return False
+
+def run_pipeline(args):
     """Run the processing pipeline."""
     try:
-        # 1. Determine the input file
-        if not image_path:
-            image_path = get_latest_input_image()
-
-        if not image_path:
-            print("ERROR: No input image found.", file=sys.stderr)
-            return None
-
-        logger.info(f"开始处理图片: {image_path}")
-
-        # 2. Initialize the services
         config_manager = ConfigManager()
-        ocr_service = OCRService(config_manager)
         order_service = OrderService(config_manager)
-
-        # 3. OCR recognition
         start_time = time.perf_counter()
-        excel_intermediate = ocr_service.process_image(image_path)
-        if not excel_intermediate:
-            print(f"ERROR: OCR failed for {image_path}", file=sys.stderr)
+        final_excel = None
+
+        # 1. Handle barcode mapping updates
+        if args.update_mapping:
+            if not args.barcode or not args.target:
+                print("ERROR: --barcode and --target are required for --update-mapping", file=sys.stderr)
+                return None
+            if update_barcode_mapping(args.barcode, args.target):
+                print(f"SUCCESS: Mapping updated {args.barcode} -> {args.target}")
+                return "MAPPING_UPDATED"
             return None

-        # 4. Excel processing and cleaning
-        final_excel = order_service.process_excel(excel_intermediate)
-        if not final_excel:
-            print(f"ERROR: Excel processing failed for {excel_intermediate}", file=sys.stderr)
-            return None
+        # 2. Tobacco company orders
+        if args.tobacco:
+            input_path = args.input or get_latest_file("data/output", [".xlsx", ".xls"])
+            if not input_path:
+                print("ERROR: No tobacco order file found.", file=sys.stderr)
+                return None
+            logger.info(f"开始处理烟草订单: {input_path}")
+            tobacco_service = TobaccoService(config_manager)
+            final_excel = tobacco_service.process_tobacco_order(input_path)

-        # 5. Unit-price validation (output goes to stderr)
-        discrepancies = order_service.validate_unit_price(final_excel)
-        if discrepancies:
-            print(f"WARNING: Price validation found {len(discrepancies)} issues:", file=sys.stderr)
-            for d in discrepancies:
-                print(f" - {d}", file=sys.stderr)
+        # 3. Rongcheng Yigou orders
+        elif args.rongcheng:
+            input_path = args.input or get_latest_file("data/output", [".xlsx", ".xls"])
+            if not input_path:
+                print("ERROR: No Rongcheng Yigou order file found.", file=sys.stderr)
+                return None
+            logger.info(f"开始处理蓉城易购订单: {input_path}")
+            special_service = SpecialSuppliersService(config_manager)
+            final_excel = special_service.process_rongcheng_yigou(input_path)
+
+        # 4. Plain Excel processing
+        elif args.excel:
+            input_path = args.input or get_latest_file("data/input", [".xlsx", ".xls"])
+            if not input_path:
+                print("ERROR: No Excel file found in input.", file=sys.stderr)
+                return None
+            logger.info(f"开始处理 Excel: {input_path}")
+            final_excel = order_service.process_excel(input_path)
+
+        # 5. Image OCR processing (default)
         else:
-            logger.info("单价校验通过")
+            input_path = args.input or get_latest_file("data/input", [".jpg", ".jpeg", ".png", ".bmp"])
+            if not input_path:
+                print("ERROR: No input image found.", file=sys.stderr)
+                return None
+            logger.info(f"开始 OCR 处理图片: {input_path}")
+            ocr_service = OCRService(config_manager)
+            excel_intermediate = ocr_service.process_image(input_path)
+            if excel_intermediate:
+                final_excel = order_service.process_excel(excel_intermediate)

-        duration = time.perf_counter() - start_time
-        logger.info(f"处理完成,耗时: {duration:.2f}s")
+        # 6. Post-processing (validation and output)
+        if final_excel:
+            # Unit-price validation
+            discrepancies = order_service.validate_unit_price(final_excel)
+            if discrepancies:
+                print(f"WARNING: Price validation found {len(discrepancies)} issues:", file=sys.stderr)
+                for d in discrepancies:
+                    print(f" - {d}", file=sys.stderr)

-        # 6. Print the final result path to stdout
-        # Make sure it is an absolute path
-        abs_path = os.path.abspath(final_excel)
-        print(abs_path)
-        return abs_path
+            duration = time.perf_counter() - start_time
+            logger.info(f"处理完成,耗时: {duration:.2f}s")
+
+            # Print the final path
+            abs_path = os.path.abspath(final_excel)
+            print(abs_path)
+            return abs_path
+        else:
+            print("ERROR: Processing failed.", file=sys.stderr)
+            return None

     except Exception as e:
         import traceback
@@ -106,7 +158,18 @@ def run_pipeline(image_path: Optional[str] = None):
         return None

 if __name__ == "__main__":
-    # Accept an image path from the command line
-    input_path = sys.argv[1] if len(sys.argv) > 1 else None
-    result = run_pipeline(input_path)
+    parser = argparse.ArgumentParser(description="OCR订单处理系统 - 无界面自动化接口")
+    parser.add_argument('input', nargs='?', help='输入文件路径 (图片或Excel)')
+
+    group = parser.add_mutually_exclusive_group()
+    group.add_argument('--excel', action='store_true', help='处理普通 Excel 文件')
+    group.add_argument('--tobacco', action='store_true', help='处理烟草公司订单')
+    group.add_argument('--rongcheng', action='store_true', help='处理蓉城易购订单')
+    group.add_argument('--update-mapping', action='store_true', help='更新条码映射')
+
+    parser.add_argument('--barcode', help='待映射的原始条码 (用于 --update-mapping)')
+    parser.add_argument('--target', help='目标条码 (用于 --update-mapping)')
+
+    args = parser.parse_args()
+    result = run_pipeline(args)
     sys.exit(0 if result else 1)
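`get_latest_file` globs each extension twice, once as given and once upper-cased. On a case-sensitive file system that would presumably miss a mixed-case suffix such as `.Xlsx`, and on a case-insensitive one the same file can be listed twice (harmless for `max`, but redundant). A possible alternative, shown here only as a sketch, compares lower-cased suffixes instead. `get_latest_file_ci` is a hypothetical name and is not part of the commit.

```python
from pathlib import Path
from typing import Iterable, Optional


def get_latest_file_ci(directory: str, extensions: Iterable[str]) -> Optional[str]:
    """Return the most recently modified file whose suffix matches, ignoring case."""
    dir_path = Path(directory)
    if not dir_path.is_dir():
        return None

    # Normalize the wanted suffixes once, e.g. {".xlsx", ".xls"}.
    wanted = {ext.lower() for ext in extensions}
    files = [p for p in dir_path.iterdir() if p.is_file() and p.suffix.lower() in wanted]
    if not files:
        return None

    # Pick the file with the largest modification time.
    return str(max(files, key=lambda p: p.stat().st_mtime))
```

The call shape matches the original, for example `get_latest_file_ci("data/output", [".xlsx", ".xls"])`.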
181
启动器.py
@@ -1609,182 +1609,51 @@ def process_rongcheng_yigou_with_status(log_widget, status_bar):
|
||||
try:
|
||||
reporter = ProgressReporter(status_bar)
|
||||
reporter.running()
|
||||
reporter.set("正在处理蓉城易购...", 10)
|
||||
add_to_log(log_widget, "开始处理蓉城易购\n", "info")
|
||||
|
||||
s = load_user_settings()
|
||||
out_dir = os.path.abspath(s.get('output_folder', 'data/output'))
|
||||
if not os.path.exists(out_dir):
|
||||
os.makedirs(out_dir, exist_ok=True)
|
||||
candidates = []
|
||||
for f in os.listdir(out_dir):
|
||||
fn = f.lower()
|
||||
if re.match(r'^订单\d+\.xlsx$', fn):
|
||||
p = os.path.join(out_dir, f)
|
||||
try:
|
||||
if os.path.exists(out_dir):
|
||||
for f in os.listdir(out_dir):
|
||||
if re.match(r'^订单\d+\.xlsx$', f.lower()):
|
||||
p = os.path.join(out_dir, f)
|
||||
candidates.append((p, os.path.getmtime(p)))
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if not candidates:
|
||||
add_to_log(log_widget, "未在输出目录找到蓉城易购订单文件\n", "warning")
|
||||
reporter.done()
|
||||
return
|
||||
|
||||
candidates.sort(key=lambda x: x[1], reverse=True)
|
||||
src_path = candidates[0][0]
|
||||
reporter.set("读取并清洗数据...", 25)
|
||||
def _pick_col(df, exact_list=None, contains_list=None):
|
||||
cols = list(df.columns)
|
||||
if exact_list:
|
||||
for name in exact_list:
|
||||
for c in cols:
|
||||
if str(c).strip() == str(name).strip():
|
||||
return c
|
||||
if contains_list:
|
||||
for kw in contains_list:
|
||||
for c in cols:
|
||||
if kw in str(c):
|
||||
return c
|
||||
return None
|
||||
try:
|
||||
df_raw = pd.read_excel(src_path, header=2)
|
||||
except Exception:
|
||||
df_raw = pd.read_excel(src_path)
|
||||
df_raw = df_raw.iloc[2:].reset_index(drop=True)
|
||||
# Drop columns and rows that are entirely empty
|
||||
df_raw = df_raw.dropna(how='all', axis=1).dropna(how='all', axis=0)
|
||||
# Select the key columns (by keyword match)
|
||||
col_no = _pick_col(df_raw, contains_list=['序号'])
|
||||
col_name = _pick_col(df_raw, contains_list=['商品名称','品名','名称'])
|
||||
col_bc = _pick_col(df_raw, contains_list=['商品条码','条码'])
|
||||
col_unit = _pick_col(df_raw, exact_list=['单位(订购单位)'], contains_list=['订购单位','小单位','单位'])
|
||||
col_qty = _pick_col(df_raw, contains_list=['订购数量','订货数量','数量'])
|
||||
# New template mapping: 优惠后金额(小单位) serves as 单价(小单位), and 出库小计(元) serves as 优惠后金额(小单位)
|
||||
col_price= _pick_col(df_raw, exact_list=['优惠后金额(小单位)'], contains_list=['单价','销售价','进货价','优惠后金额'])
|
||||
col_amt = _pick_col(df_raw, exact_list=['出库小计(元)'], contains_list=['金额','优惠后金额','小计','合计','出库小计'])
|
||||
selected = [c for c in [col_no,col_name,col_bc,col_unit,col_qty,col_price,col_amt] if c]
|
||||
if not selected or len(selected) < 4:
|
||||
# Fallback: keep the old logic (select columns by index)
|
||||
df = pd.read_excel(src_path)
|
||||
df = df.iloc[2:].reset_index(drop=True)
|
||||
keep_idx = [0, 2, 3, 9, 12, 15, 17]
|
||||
keep_idx = [i for i in keep_idx if i < df.shape[1]]
|
||||
df2 = df.iloc[:, keep_idx].copy()
|
||||
target_cols = ['序号','商品名称','商品条码','单位','数量','单价','金额']
|
||||
df2.columns = target_cols[:len(df2.columns)]
|
||||
else:
|
||||
df2 = df_raw[selected].copy()
|
||||
# Normalize column names to those the old configuration expects, to simplify later mapping
|
||||
rename_map = {}
|
||||
if col_no: rename_map[col_no] = '序号'
|
||||
if col_name: rename_map[col_name] = '商品名称'
|
||||
if col_bc: rename_map[col_bc] = '商品条码(小条码)'
|
||||
if col_unit: rename_map[col_unit] = '单位'
|
||||
if col_qty: rename_map[col_qty] = '订购数量(小单位)'
|
||||
if col_price: rename_map[col_price] = '单价(小单位)'
|
||||
if col_amt: rename_map[col_amt] = '优惠后金额(小单位)'
|
||||
df2 = df2.rename(columns=rename_map)
|
||||
# Clean the unit column (same as the old version: change 件 to 份 and strip whitespace)
|
||||
if '单位' in df2.columns:
|
||||
try:
|
||||
df2['单位'] = df2['单位'].astype(str).str.strip().replace({'件':'份'})
|
||||
except Exception:
|
||||
pass
|
||||
# Keep the original ordering unit
|
||||
# Split rows that hold multiple barcodes and divide the quantity evenly
|
||||
try:
|
||||
bc_col = '商品条码(小条码)' if '商品条码(小条码)' in df2.columns else ('商品条码' if '商品条码' in df2.columns else ('条码' if '条码' in df2.columns else None))
|
||||
qty_col = '订购数量(小单位)' if '订购数量(小单位)' in df2.columns else ('订购数量' if '订购数量' in df2.columns else ('数量' if '数量' in df2.columns else None))
|
||||
up_col = '单价(小单位)' if '单价(小单位)' in df2.columns else ('单价' if '单价' in df2.columns else ('销售价' if '销售价' in df2.columns else None))
|
||||
amt_col = '优惠后金额(小单位)' if '优惠后金额(小单位)' in df2.columns else ('金额' if '金额' in df2.columns else ('小计' if '小计' in df2.columns else None))
|
||||
if bc_col and qty_col:
|
||||
rows = []
|
||||
for _, row in df2.iterrows():
|
||||
bc_val = str(row.get(bc_col, '')).strip()
|
||||
if bc_val and (',' in bc_val or ',' in bc_val or '、' in bc_val or ' ' in bc_val or '/' in bc_val):
|
||||
parts = []
|
||||
for sep in [',',',','、','/',' ']:
|
||||
bc_val = bc_val.replace(sep, ' ')
|
||||
for token in bc_val.split():
|
||||
tok = ''.join([ch for ch in token if ch.isdigit()])
|
||||
if tok:
|
||||
parts.append(tok)
|
||||
parts = [p for p in parts if p]
|
||||
if len(parts) >= 2:
|
||||
try:
|
||||
q_total = float(row.get(qty_col, 0) or 0)
|
||||
except Exception:
|
||||
q_total = 0
|
||||
if q_total > 0:
|
||||
n = len(parts)
|
||||
base = int(q_total) // n if q_total.is_integer() else q_total / n
|
||||
remainder = int(q_total) % n if q_total.is_integer() else 0
|
||||
for i, bc in enumerate(parts):
|
||||
new_row = row.copy()
|
||||
new_row[bc_col] = bc
|
||||
q_each = base + (1 if remainder > 0 and i < remainder else 0)
|
||||
new_row[qty_col] = q_each
|
||||
if up_col and amt_col:
|
||||
try:
|
||||
upv = float(new_row.get(up_col, 0) or 0)
|
||||
new_row[amt_col] = upv * float(q_each)
|
||||
except Exception:
|
||||
pass
|
||||
rows.append(new_row)
|
||||
else:
|
||||
# Quantity cannot be allocated; keep the original row
|
||||
rows.append(row)
|
||||
else:
|
||||
rows.append(row)
|
||||
else:
|
||||
rows.append(row)
|
||||
df2 = pd.DataFrame(rows)
|
||||
except Exception:
|
||||
pass
|
||||
base = os.path.basename(src_path)
|
||||
inter_name = f"蓉城易购_处理后_{base}"
|
||||
inter_path = os.path.join(out_dir, inter_name)
|
||||
reporter.set("保存处理结果...", 45)
|
||||
df2.to_excel(inter_path, index=False)
|
||||
final_name = f"蓉城易购-{base}"
|
||||
final_path = os.path.join(out_dir, final_name)
|
||||
try:
|
||||
if os.path.exists(final_path):
|
||||
os.remove(final_path)
|
||||
except Exception:
|
||||
pass
|
||||
try:
|
||||
os.replace(inter_path, final_path)
|
||||
except Exception:
|
||||
final_path = inter_path
|
||||
add_to_log(log_widget, f"蓉城易购预处理完成: {final_path}\n", "success")
|
||||
reporter.set("准备进行普通Excel处理...", 60)
|
||||
add_recent_file(final_path)
|
||||
time.sleep(3)
|
||||
order_service = OrderService()
|
||||
result = order_service.process_excel(final_path, progress_cb=lambda p: reporter.set("Excel处理中...", p))
|
||||
|
||||
from app.services.special_suppliers_service import SpecialSuppliersService
|
||||
service = SpecialSuppliersService(ConfigManager())
|
||||
|
||||
result = service.process_rongcheng_yigou(
|
||||
src_path,
|
||||
progress_cb=lambda p, m: (reporter.set(m, p), add_to_log(log_widget, f"{m}\n", "info"))
|
||||
)
|
||||
|
||||
if result:
|
||||
add_to_log(log_widget, "Excel普通处理完成\n", "success")
|
||||
add_to_log(log_widget, f"处理完成: {result}\n", "success")
|
||||
add_recent_file(result)
|
||||
try:
|
||||
validate_unit_price_against_item_data(result, log_widget)
|
||||
except Exception:
|
||||
pass
|
||||
except Exception: pass
|
||||
open_result_directory_from_settings()
|
||||
reporter.set("处理完成", 100)
|
||||
else:
|
||||
add_to_log(log_widget, "Excel普通处理失败\n", "error")
|
||||
add_to_log(log_widget, "处理失败\n", "error")
|
||||
except Exception as e:
|
||||
add_to_log(log_widget, f"处理蓉城易购时出错: {str(e)}\n", "error")
|
||||
msg = str(e)
|
||||
suggestion = None
|
||||
if 'pandas' in msg:
|
||||
suggestion = "安装依赖:pip install pandas openpyxl"
|
||||
if suggestion:
|
||||
show_error_dialog("蓉城易购处理错误", msg, suggestion)
|
||||
add_to_log(log_widget, f"处理出错: {str(e)}\n", "error")
|
||||
finally:
|
||||
try:
|
||||
reporter.done()
|
||||
except Exception:
|
||||
pass
|
||||
reporter.done()
|
||||
|
||||
thread = Thread(target=run_in_thread)
|
||||
thread.daemon = True
|
||||
thread.start()
|
||||
thread = Thread(target=run_in_thread)
|
||||
thread.daemon = True
|
||||
thread.start()
|
||||
|
||||
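After this change the launcher passes a two-argument `(percent, message)` callback into the service, and the service maps the nested Excel stage's 0 to 100 progress into the 60 to 100 band with `60 + int(p*0.4)`. The sketch below generalizes that rescaling into a small adapter. `scaled_progress` is a hypothetical helper name, and the clamping of out-of-range values is an added assumption that the original lambda does not have.

```python
from typing import Callable, Optional

ProgressCallback = Callable[[int, str], None]


def scaled_progress(
    progress_cb: Optional[ProgressCallback],
    start: int,
    end: int,
    message: str,
) -> Callable[[float], None]:
    """Wrap a (percent, message) callback so a sub-task's 0-100 maps onto start-end."""
    span = end - start

    def _report(percent: float) -> None:
        if progress_cb is None:
            return  # nothing to report to
        # Clamp so a misbehaving sub-task cannot push the bar outside its band.
        clamped = max(0.0, min(100.0, float(percent)))
        progress_cb(start + int(clamped * span / 100), message)

    return _report
```

With `start=60` and `end=100` this gives the same values as the expression in `process_rongcheng_yigou` for inputs between 0 and 100, and it remains safe to call when no callback was supplied.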