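docs(openclaw): update the integration guide and refactor the Rongcheng Yigou processing logic
- Move the Rongcheng Yigou processing logic from the launcher into a dedicated service class to improve modularity
- Update the OpenClaw guide to describe the newly added processing modes (Excel, special suppliers, barcode mapping)
- Unify the headless_api entry point so that different processing flows are selected through parameters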
parent 83405a9b8e
commit 3e2f46d26d
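@@ -1,51 +1,66 @@
-# OCR Order Processing System - OpenClaw Integration Guide
+# OCR Order Processing System - OpenClaw Integration Guide (Enhanced Edition)

-This guide helps OpenClaw (or any automated AI agent) understand how to integrate with this project and complete the closed loop of "send image -> automatic processing -> return Excel".
+This guide helps OpenClaw automate the full feature set of the purchase-order processing system, covering image recognition, Excel cleaning, special-supplier handling, and barcode-mapping management.

-## 1. Project Overview
-This project is an automated purchase-order processing system. It uses OCR to recognize table data in images, cleans the data according to preset rules (such as splitting units and inferring specifications), and finally generates a purchase order in a standard Excel format.
+## 1. Core Interface (headless_api.py)

-## 2. Core Workflow
-Automated integrations should follow these steps:
+`headless_api.py` is the system's unified command-line entry point. It supports several modes, selected through command-line arguments.

-1. **Prepare the image**: save the received purchase-order image to the project's `data/input/` directory.
-2. **Trigger processing**: run the `headless_api.py` script.
-3. **Capture the path**: read the absolute path of the generated Excel file from the script's standard output (stdout).
-4. **Read the result**: read the file at that path and send it back to the user.
-
-## 3. API Reference (headless_api.py)
-
-This entry point is designed specifically for automated integration.
-
-### How to Run
+### 1.1 Image OCR Recognition (default mode)
+Use this mode when the user sends a purchase-order image.
 ```bash
-# Option A: automatically process the newest image in the input directory
+# Automatically process the newest image in data/input
 python headless_api.py

-# Option B: process the image at a specified path
-python headless_api.py "E:/2025Code/python/orc-order-v2/data/input/invoice_001.jpg"
+# Process a specific image
+python headless_api.py "data/input/my_invoice.jpg"
 ```

-### Output Conventions
-- **Standard output (stdout)**: prints **the absolute path of the final generated Excel file**, and only when processing succeeds.
-  - *Example*: `E:\2025Code\python\orc-order-v2\data\result\采购单_invoice_001.xlsx`
-- **Standard error (stderr)**: receives all run logs, progress information, warnings, and error details.
-  - *Note*: if a unit-price validation discrepancy occurs (the generated price differs from the item-data price by more than 1 yuan), the related warning is printed to this stream.
+### 1.2 Plain Excel Order Processing
+Use this mode when the user directly provides an Excel file exported by the supplier.
+```bash
+# Process the newest Excel file in data/input
+python headless_api.py --excel

-## 4. Key Directory Structure
-- `data/input/`: holds the original images to be processed (jpg, png, and bmp are supported).
-- `data/result/`: holds the final standard Excel purchase orders.
-- `templates/商品资料.xlsx`: the system automatically validates unit prices against this file.
-- `config/suppliers_config.json`: stores the suppliers' cleaning rules.
+# Process a specific Excel file
+python headless_api.py "data/input/supplier_order.xlsx" --excel
+```

-## 5. Exit Codes and Error Handling
-- Exit code `0`: processing succeeded; stdout contains the file path.
-- Exit code `1`: processing failed; see stderr for the cause.
+### 1.3 Special-Supplier Preprocessing
+For suppliers whose formats are highly unusual (such as Rongcheng Yigou and the tobacco company), the system has dedicated built-in logic.
+```bash
+# Process a Rongcheng Yigou order (automatically picks the newest order file in data/output)
+python headless_api.py --rongcheng

-## 6. Recommended Strategy for OpenClaw
-1. **Polling or triggering**: when a user-sent image is detected, first clear `data/input/` or write directly into it.
-2. **Path capture**: run the script with something like `subprocess.check_output` and take the last line of output as the file path.
-3. **Price alert**: OpenClaw can scan stderr; if it contains `WARNING: Price validation found...`, it can proactively tell the user: "Processing finished, but some unit prices do not match the item database. Please verify."
+# Process a tobacco company order
+python headless_api.py --tobacco
+```
+
+### 1.4 Barcode Mapping Management
+When a barcode recognized by OCR does not match the system database, update the mapping through this interface.
+```bash
+# Map the original barcode 12345 to the system target barcode 67890
+python headless_api.py --update-mapping --barcode "12345" --target "67890"
+```
+
+## 2. OpenClaw Integration Strategy
+
+### 2.1 Task Dispatch Logic
+OpenClaw should choose the arguments based on the user's input:
+- **Image received** -> call the default mode.
+- **Excel received** -> call `--excel` mode.
+- **User mentions "蓉城" (Rongcheng) or "易购" (Yigou)** -> call `--rongcheng`.
+- **User mentions "烟草" (tobacco)** -> call `--tobacco`.
+- **User corrects a barcode error** -> call `--update-mapping`.
+
+### 2.2 Retrieving Results
+- **Success**: the script prints the absolute path of the generated Excel file to `stdout`. OpenClaw captures this path and sends the file.
+- **Unit-price alert**: the script writes `WARNING: Price validation found...` to `stderr`. OpenClaw should capture it and relay it to the user: "The file has been generated, but some item prices look abnormal. Please review them."
+
+## 3. Common Directories
+- `data/input/`: original images or Excel files to be processed.
+- `data/output/`: intermediate files produced by OCR (the Rongcheng and tobacco modes read from this directory).
+- `data/result/`: the final standard Yinbao (银豹) purchase orders.

 ---
-*Automatically generated by the system on 2026-03-25*
+*Version: 2.0 | Updated: 2026-03-25*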
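The dispatch rules in section 2.1 of the guide and the result contract in section 2.2 can be sketched in Python as follows. This is a minimal sketch, not part of the commit: the function names `choose_args`, `parse_result`, and `run_headless`, and the `kind` label, are hypothetical. Only the CLI flags, the exit-code convention, and the `WARNING: Price validation found` prefix come from the guide and from `headless_api.py`.

```python
import subprocess
import sys
from typing import List, Optional, Tuple

# Prefix that headless_api.py prints to stderr on a unit-price discrepancy.
PRICE_WARNING = "WARNING: Price validation found"


def choose_args(kind: str, user_text: str = "", path: Optional[str] = None) -> List[str]:
    """Map a user request to headless_api.py arguments (guide section 2.1).

    `kind` is a hypothetical label supplied by the caller: "image" or "excel".
    """
    if "蓉城" in user_text or "易购" in user_text:
        args = ["--rongcheng"]
    elif "烟草" in user_text:
        args = ["--tobacco"]
    elif kind == "excel":
        args = ["--excel"]
    else:
        args = []  # default mode: image OCR
    return ([path] if path else []) + args


def parse_result(returncode: int, stdout: str, stderr: str) -> Tuple[Optional[str], bool]:
    """Return (excel_path, has_price_warning) following guide section 2.2."""
    if returncode != 0:
        return None, False
    # Take the last non-empty stdout line as the result path.
    lines = [ln.strip() for ln in stdout.splitlines() if ln.strip()]
    path = lines[-1] if lines else None
    return path, PRICE_WARNING in stderr


def run_headless(args: List[str]) -> Tuple[Optional[str], bool]:
    """Run the script (assumed to be in the working directory) and interpret its output."""
    proc = subprocess.run(
        [sys.executable, "headless_api.py", *args],
        capture_output=True,
        text=True,
    )
    return parse_result(proc.returncode, proc.stdout, proc.stderr)
```

Note that `--update-mapping` prints a `SUCCESS:` line instead of a file path, so a caller would need to handle that mode separately from the path-returning modes.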
144
app/services/special_suppliers_service.py
Normal file
@@ -0,0 +1,144 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+import os
+import re
+import time
+import pandas as pd
+import logging
+from typing import Optional, Callable
+from app.services.order_service import OrderService
+
+logger = logging.getLogger(__name__)
+
+class SpecialSuppliersService:
+    """
+    Service class for special-supplier logic, such as Rongcheng Yigou
+    """
+
+    def __init__(self, config_manager=None):
+        self.config_manager = config_manager
+        self.order_service = OrderService(config_manager)
+
+    def process_rongcheng_yigou(self, src_path: str, progress_cb: Optional[Callable[[int, str], None]] = None) -> Optional[str]:
+        """
+        Process a Rongcheng Yigou order
+        """
+        try:
+            if progress_cb: progress_cb(10, "正在处理蓉城易购...")
+
+            def _pick_col(df, exact_list=None, contains_list=None):
+                cols = list(df.columns)
+                if exact_list:
+                    for name in exact_list:
+                        for c in cols:
+                            if str(c).strip() == str(name).strip():
+                                return c
+                if contains_list:
+                    for kw in contains_list:
+                        for c in cols:
+                            if kw in str(c):
+                                return c
+                return None
+
+            try:
+                df_raw = pd.read_excel(src_path, header=2)
+            except Exception:
+                df_raw = pd.read_excel(src_path)
+                df_raw = df_raw.iloc[2:].reset_index(drop=True)
+
+            # Drop columns and rows that are entirely empty
+            df_raw = df_raw.dropna(how='all', axis=1).dropna(how='all', axis=0)
+
+            # Select the key columns
+            col_no = _pick_col(df_raw, contains_list=['序号'])
+            col_name = _pick_col(df_raw, contains_list=['商品名称','品名','名称'])
+            col_bc = _pick_col(df_raw, contains_list=['商品条码','条码'])
+            col_unit = _pick_col(df_raw, exact_list=['单位(订购单位)'], contains_list=['订购单位','小单位','单位'])
+            col_qty = _pick_col(df_raw, contains_list=['订购数量','订货数量','数量'])
+            col_price= _pick_col(df_raw, exact_list=['优惠后金额(小单位)'], contains_list=['单价','销售价','进货价','优惠后金额'])
+            col_amt = _pick_col(df_raw, exact_list=['出库小计(元)'], contains_list=['金额','优惠后金额','小计','合计','出库小计'])
+
+            selected = [c for c in [col_no,col_name,col_bc,col_unit,col_qty,col_price,col_amt] if c]
+
+            if not selected or len(selected) < 4:
+                df = pd.read_excel(src_path)
+                df = df.iloc[2:].reset_index(drop=True)
+                keep_idx = [0, 2, 3, 9, 12, 15, 17]
+                keep_idx = [i for i in keep_idx if i < df.shape[1]]
+                df2 = df.iloc[:, keep_idx].copy()
+                target_cols = ['序号','商品名称','商品条码','单位','数量','单价','金额']
+                df2.columns = target_cols[:len(df2.columns)]
+            else:
+                df2 = df_raw[selected].copy()
+                rename_map = {}
+                if col_no: rename_map[col_no] = '序号'
+                if col_name: rename_map[col_name] = '商品名称'
+                if col_bc: rename_map[col_bc] = '商品条码(小条码)'
+                if col_unit: rename_map[col_unit] = '单位'
+                if col_qty: rename_map[col_qty] = '订购数量(小单位)'
+                if col_price: rename_map[col_price] = '单价(小单位)'
+                if col_amt: rename_map[col_amt] = '优惠后金额(小单位)'
+                df2 = df2.rename(columns=rename_map)
+
+            if '单位' in df2.columns:
+                df2['单位'] = df2['单位'].astype(str).str.strip().replace({'件':'份'})
+
+            # Split rows with multiple barcodes and divide the quantity evenly
+            bc_col = '商品条码(小条码)' if '商品条码(小条码)' in df2.columns else ('商品条码' if '商品条码' in df2.columns else ('条码' if '条码' in df2.columns else None))
+            qty_col = '订购数量(小单位)' if '订购数量(小单位)' in df2.columns else ('订购数量' if '订购数量' in df2.columns else ('数量' if '数量' in df2.columns else None))
+            up_col = '单价(小单位)' if '单价(小单位)' in df2.columns else ('单价' if '单价' in df2.columns else ('销售价' if '销售价' in df2.columns else None))
+            amt_col = '优惠后金额(小单位)' if '优惠后金额(小单位)' in df2.columns else ('金额' if '金额' in df2.columns else ('小计' if '小计' in df2.columns else None))
+
+            if bc_col and qty_col:
+                rows = []
+                for _, row in df2.iterrows():
+                    bc_val = str(row.get(bc_col, '')).strip()
+                    if bc_val and any(sep in bc_val for sep in [',',',','、','/',' ']):
+                        parts = []
+                        temp_bc = bc_val
+                        for sep in [',',',','、','/',' ']:
+                            temp_bc = temp_bc.replace(sep, ' ')
+                        for token in temp_bc.split():
+                            tok = ''.join([ch for ch in token if ch.isdigit()])
+                            if tok: parts.append(tok)
+                        parts = [p for p in parts if p]
+                        if len(parts) >= 2:
+                            try:
+                                q_total = float(row.get(qty_col, 0) or 0)
+                            except Exception:
+                                q_total = 0
+                            if q_total > 0:
+                                n = len(parts)
+                                base = int(q_total) // n if q_total.is_integer() else q_total / n
+                                remainder = int(q_total) % n if q_total.is_integer() else 0
+                                for i, bc in enumerate(parts):
+                                    new_row = row.copy()
+                                    new_row[bc_col] = bc
+                                    q_each = base + (1 if remainder > 0 and i < remainder else 0)
+                                    new_row[qty_col] = q_each
+                                    if up_col and amt_col:
+                                        try:
+                                            upv = float(new_row.get(up_col, 0) or 0)
+                                            new_row[amt_col] = upv * float(q_each)
+                                        except Exception: pass
+                                    rows.append(new_row)
+                            else: rows.append(row)
+                        else: rows.append(row)
+                    else: rows.append(row)
+                df2 = pd.DataFrame(rows)
+
+            out_dir = os.path.dirname(src_path)
+            base = os.path.basename(src_path)
+            final_name = f"蓉城易购预处理-{base}"
+            final_path = os.path.join(out_dir, final_name)
+            df2.to_excel(final_path, index=False)
+
+            if progress_cb: progress_cb(60, "预处理完成,开始标准流程...")
+
+            result = self.order_service.process_excel(final_path, progress_cb=lambda p: progress_cb(60 + int(p*0.4), "Excel处理中...") if progress_cb else None)
+            return result
+
+        except Exception as e:
+            logger.error(f"处理蓉城易购订单出错: {e}")
+            return None
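The multi-barcode branch of `process_rongcheng_yigou` turns one row into one row per barcode and divides the quantity among them. The standalone sketch below isolates that arithmetic. `split_barcodes` and `split_quantity` are hypothetical helper names, not functions in the commit. The logic mirrors the diff as read here: whole-number quantities use floor division with the remainder given to the first rows, and fractional quantities are divided evenly.

```python
from typing import List, Union

# Separators treated as barcode delimiters in the service code.
SEPARATORS = [",", ",", "、", "/", " "]


def split_barcodes(cell: str) -> List[str]:
    """Split a barcode cell on the separators above, keeping digits only."""
    for sep in SEPARATORS:
        cell = cell.replace(sep, " ")
    parts = ["".join(ch for ch in token if ch.isdigit()) for token in cell.split()]
    return [p for p in parts if p]


def split_quantity(total: float, n: int) -> List[Union[int, float]]:
    """Distribute `total` over `n` rows.

    Whole-number totals: floor division, with the remainder spread one unit
    at a time over the first rows. Fractional totals: an even split.
    """
    if float(total).is_integer():
        base, remainder = divmod(int(total), n)
        return [base + (1 if i < remainder else 0) for i in range(n)]
    return [total / n] * n
```

One consequence of this scheme is that a quantity smaller than the number of barcodes leaves some rows at zero: `split_quantity(1, 2)` gives `[1, 0]`.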
145
headless_api.py
@@ -12,8 +12,10 @@ import os
 import sys
 import logging
 import time
+import argparse
+import json
 from pathlib import Path
-from typing import Optional
+from typing import Optional, List, Dict

 # Add the current directory to the path
 sys.path.append(os.path.dirname(os.path.abspath(__file__)))
@@ -21,9 +23,11 @@ sys.path.append(os.path.dirname(os.path.abspath(__file__)))
 from app.config.settings import ConfigManager
 from app.services.ocr_service import OCRService
 from app.services.order_service import OrderService
+from app.services.tobacco_service import TobaccoService
+from app.services.special_suppliers_service import SpecialSuppliersService
 from app.core.utils.log_utils import set_log_level

-# Send log output to stderr so that it does not interfere with the path printed on stdout
+# Send log output to stderr
 logging.basicConfig(
     level=logging.INFO,
     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
@@ -31,73 +35,121 @@ logging.basicConfig(
 )
 logger = logging.getLogger("HeadlessAPI")

-def get_latest_input_image() -> Optional[str]:
-    """Return the newest image file in the input directory"""
-    input_dir = Path("data/input")
-    if not input_dir.exists():
+def get_latest_file(directory: str, extensions: List[str]) -> Optional[str]:
+    """Return the newest file in the directory that has one of the given suffixes"""
+    dir_path = Path(directory)
+    if not dir_path.exists():
         return None

-    extensions = ['.jpg', '.jpeg', '.png', '.bmp']
     files = []
     for ext in extensions:
-        files.extend(input_dir.glob(f"*{ext}"))
-        files.extend(input_dir.glob(f"*{ext.upper()}"))
+        files.extend(dir_path.glob(f"*{ext}"))
+        files.extend(dir_path.glob(f"*{ext.upper()}"))

     if not files:
         return None

-    # Sort by modification time
     latest_file = max(files, key=lambda p: p.stat().st_mtime)
     return str(latest_file)

-def run_pipeline(image_path: Optional[str] = None):
+def update_barcode_mapping(barcode: str, target_barcode: str):
+    """Update the barcode mapping"""
+    try:
+        config_path = os.path.join("config", "barcode_mappings.json")
+        mappings = {}
+        if os.path.exists(config_path):
+            with open(config_path, 'r', encoding='utf-8') as f:
+                mappings = json.load(f)
+
+        mappings[barcode] = target_barcode
+
+        with open(config_path, 'w', encoding='utf-8') as f:
+            json.dump(mappings, f, ensure_ascii=False, indent=2)
+
+        logger.info(f"成功更新条码映射: {barcode} -> {target_barcode}")
+        return True
+    except Exception as e:
+        logger.error(f"更新条码映射失败: {e}")
+        return False
+
+def run_pipeline(args):
     """Run the processing pipeline"""
     try:
-        # 1. Determine the input file
-        if not image_path:
-            image_path = get_latest_input_image()
-
-        if not image_path:
-            print("ERROR: No input image found.", file=sys.stderr)
-            return None
-
-        logger.info(f"开始处理图片: {image_path}")
-
-        # 2. Initialize services
-        config_manager = ConfigManager()
-        ocr_service = OCRService(config_manager)
-        order_service = OrderService(config_manager)
-
-        # 3. OCR recognition
-        start_time = time.perf_counter()
-        excel_intermediate = ocr_service.process_image(image_path)
-        if not excel_intermediate:
-            print(f"ERROR: OCR failed for {image_path}", file=sys.stderr)
-            return None
-
-        # 4. Excel processing and cleaning
-        final_excel = order_service.process_excel(excel_intermediate)
-        if not final_excel:
-            print(f"ERROR: Excel processing failed for {excel_intermediate}", file=sys.stderr)
-            return None
-
-        # 5. Unit-price validation (written to stderr)
-        discrepancies = order_service.validate_unit_price(final_excel)
-        if discrepancies:
-            print(f"WARNING: Price validation found {len(discrepancies)} issues:", file=sys.stderr)
-            for d in discrepancies:
-                print(f" - {d}", file=sys.stderr)
-        else:
-            logger.info("单价校验通过")
-
-        duration = time.perf_counter() - start_time
-        logger.info(f"处理完成,耗时: {duration:.2f}s")
-
-        # 6. Print the final result path to stdout
-        # Make sure it is an absolute path
-        abs_path = os.path.abspath(final_excel)
-        print(abs_path)
-        return abs_path
+        config_manager = ConfigManager()
+        order_service = OrderService(config_manager)
+        start_time = time.perf_counter()
+        final_excel = None
+
+        # 1. Handle barcode-mapping updates
+        if args.update_mapping:
+            if not args.barcode or not args.target:
+                print("ERROR: --barcode and --target are required for --update-mapping", file=sys.stderr)
+                return None
+            if update_barcode_mapping(args.barcode, args.target):
+                print(f"SUCCESS: Mapping updated {args.barcode} -> {args.target}")
+                return "MAPPING_UPDATED"
+            return None
+
+        # 2. Tobacco company processing
+        if args.tobacco:
+            input_path = args.input or get_latest_file("data/output", [".xlsx", ".xls"])
+            if not input_path:
+                print("ERROR: No tobacco order file found.", file=sys.stderr)
+                return None
+            logger.info(f"开始处理烟草订单: {input_path}")
+            tobacco_service = TobaccoService(config_manager)
+            final_excel = tobacco_service.process_tobacco_order(input_path)
+
+        # 3. Rongcheng Yigou processing
+        elif args.rongcheng:
+            input_path = args.input or get_latest_file("data/output", [".xlsx", ".xls"])
+            if not input_path:
+                print("ERROR: No Rongcheng Yigou order file found.", file=sys.stderr)
+                return None
+            logger.info(f"开始处理蓉城易购订单: {input_path}")
+            special_service = SpecialSuppliersService(config_manager)
+            final_excel = special_service.process_rongcheng_yigou(input_path)
+
+        # 4. Plain Excel processing
+        elif args.excel:
+            input_path = args.input or get_latest_file("data/input", [".xlsx", ".xls"])
+            if not input_path:
+                print("ERROR: No Excel file found in input.", file=sys.stderr)
+                return None
+            logger.info(f"开始处理 Excel: {input_path}")
+            final_excel = order_service.process_excel(input_path)
+
+        # 5. Image OCR processing (default)
+        else:
+            input_path = args.input or get_latest_file("data/input", [".jpg", ".jpeg", ".png", ".bmp"])
+            if not input_path:
+                print("ERROR: No input image found.", file=sys.stderr)
+                return None
+            logger.info(f"开始 OCR 处理图片: {input_path}")
+            ocr_service = OCRService(config_manager)
+            excel_intermediate = ocr_service.process_image(input_path)
+            if excel_intermediate:
+                final_excel = order_service.process_excel(excel_intermediate)
+
+        # 6. Post-processing (validation and output)
+        if final_excel:
+            # Unit-price validation
+            discrepancies = order_service.validate_unit_price(final_excel)
+            if discrepancies:
+                print(f"WARNING: Price validation found {len(discrepancies)} issues:", file=sys.stderr)
+                for d in discrepancies:
+                    print(f" - {d}", file=sys.stderr)
+
+            duration = time.perf_counter() - start_time
+            logger.info(f"处理完成,耗时: {duration:.2f}s")
+
+            # Print the final path
+            abs_path = os.path.abspath(final_excel)
+            print(abs_path)
+            return abs_path
+        else:
+            print("ERROR: Processing failed.", file=sys.stderr)
+            return None

     except Exception as e:
         import traceback
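`get_latest_file` globs each extension twice, once as given and once upper-cased, so on a case-sensitive filesystem a mixed-case suffix such as `.Jpg` is not matched. If case-insensitive matching is wanted, one possible alternative is to compare the lower-cased suffix instead. The sketch below is not part of the commit, and the name `latest_file_by_suffix` is hypothetical.

```python
from pathlib import Path
from typing import Iterable, Optional


def latest_file_by_suffix(directory: str, extensions: Iterable[str]) -> Optional[str]:
    """Return the most recently modified file whose suffix matches, ignoring case."""
    wanted = {ext.lower() for ext in extensions}
    dir_path = Path(directory)
    if not dir_path.is_dir():
        return None
    # Compare lower-cased suffixes so .jpg, .JPG and .Jpg are all accepted.
    files = [p for p in dir_path.iterdir() if p.is_file() and p.suffix.lower() in wanted]
    if not files:
        return None
    return str(max(files, key=lambda p: p.stat().st_mtime))
```

Because each file is visited once, this variant also avoids collecting the same file twice on filesystems where globbing is already case-insensitive.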
@@ -106,7 +158,18 @@ def run_pipeline(image_path: Optional[str] = None):
         return None

 if __name__ == "__main__":
-    # Allow the image path to be passed on the command line
-    input_path = sys.argv[1] if len(sys.argv) > 1 else None
-    result = run_pipeline(input_path)
+    parser = argparse.ArgumentParser(description="OCR订单处理系统 - 无界面自动化接口")
+    parser.add_argument('input', nargs='?', help='输入文件路径 (图片或Excel)')
+
+    group = parser.add_mutually_exclusive_group()
+    group.add_argument('--excel', action='store_true', help='处理普通 Excel 文件')
+    group.add_argument('--tobacco', action='store_true', help='处理烟草公司订单')
+    group.add_argument('--rongcheng', action='store_true', help='处理蓉城易购订单')
+    group.add_argument('--update-mapping', action='store_true', help='更新条码映射')
+
+    parser.add_argument('--barcode', help='待映射的原始条码 (用于 --update-mapping)')
+    parser.add_argument('--target', help='目标条码 (用于 --update-mapping)')
+
+    args = parser.parse_args()
+    result = run_pipeline(args)
     sys.exit(0 if result else 1)
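The parser added in this hunk can be exercised in isolation to see how the modes interact. The sketch below copies the argument definitions from the diff into a hypothetical `build_parser` helper, with help strings omitted. The helper name is an assumption; the commit builds the parser inline under `if __name__ == "__main__":`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the headless_api.py arguments for experimentation."""
    parser = argparse.ArgumentParser(description="headless_api argument sketch")
    # Optional positional: an image or Excel path.
    parser.add_argument("input", nargs="?")
    # At most one mode flag may be given; no flag means image OCR.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--excel", action="store_true")
    group.add_argument("--tobacco", action="store_true")
    group.add_argument("--rongcheng", action="store_true")
    group.add_argument("--update-mapping", action="store_true")
    # Only meaningful together with --update-mapping.
    parser.add_argument("--barcode")
    parser.add_argument("--target")
    return parser
```

With this layout `argparse` rejects two mode flags given together and exits with status 2, whereas a missing `--barcode` or `--target` is only detected later, inside `run_pipeline`, which returns `None` and leads to exit code 1.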
175
启动器.py
@@ -1609,182 +1609,51 @@ def process_rongcheng_yigou_with_status(log_widget, status_bar):
         try:
             reporter = ProgressReporter(status_bar)
             reporter.running()
-            reporter.set("正在处理蓉城易购...", 10)
             add_to_log(log_widget, "开始处理蓉城易购\n", "info")
+
             s = load_user_settings()
             out_dir = os.path.abspath(s.get('output_folder', 'data/output'))
-            if not os.path.exists(out_dir):
-                os.makedirs(out_dir, exist_ok=True)
             candidates = []
-            for f in os.listdir(out_dir):
-                fn = f.lower()
-                if re.match(r'^订单\d+\.xlsx$', fn):
-                    p = os.path.join(out_dir, f)
-                    try:
-                        candidates.append((p, os.path.getmtime(p)))
-                    except Exception:
-                        pass
+            if os.path.exists(out_dir):
+                for f in os.listdir(out_dir):
+                    if re.match(r'^订单\d+\.xlsx$', f.lower()):
+                        p = os.path.join(out_dir, f)
+                        candidates.append((p, os.path.getmtime(p)))
             if not candidates:
                 add_to_log(log_widget, "未在输出目录找到蓉城易购订单文件\n", "warning")
                 reporter.done()
                 return
+
             candidates.sort(key=lambda x: x[1], reverse=True)
             src_path = candidates[0][0]
-            reporter.set("读取并清洗数据...", 25)
-            def _pick_col(df, exact_list=None, contains_list=None):
-                cols = list(df.columns)
-                if exact_list:
-                    for name in exact_list:
-                        for c in cols:
-                            if str(c).strip() == str(name).strip():
-                                return c
-                if contains_list:
-                    for kw in contains_list:
-                        for c in cols:
-                            if kw in str(c):
-                                return c
-                return None
-            try:
-                df_raw = pd.read_excel(src_path, header=2)
-            except Exception:
-                df_raw = pd.read_excel(src_path)
-                df_raw = df_raw.iloc[2:].reset_index(drop=True)
-            # Drop columns and rows that are entirely empty
-            df_raw = df_raw.dropna(how='all', axis=1).dropna(how='all', axis=0)
-            # Select the key columns (by keyword match)
-            col_no = _pick_col(df_raw, contains_list=['序号'])
-            col_name = _pick_col(df_raw, contains_list=['商品名称','品名','名称'])
-            col_bc = _pick_col(df_raw, contains_list=['商品条码','条码'])
-            col_unit = _pick_col(df_raw, exact_list=['单位(订购单位)'], contains_list=['订购单位','小单位','单位'])
-            col_qty = _pick_col(df_raw, contains_list=['订购数量','订货数量','数量'])
-            # New template mapping: 优惠后金额(小单位) serves as "单价(小单位)", and 出库小计(元) serves as "优惠后金额(小单位)"
-            col_price= _pick_col(df_raw, exact_list=['优惠后金额(小单位)'], contains_list=['单价','销售价','进货价','优惠后金额'])
-            col_amt = _pick_col(df_raw, exact_list=['出库小计(元)'], contains_list=['金额','优惠后金额','小计','合计','出库小计'])
-            selected = [c for c in [col_no,col_name,col_bc,col_unit,col_qty,col_price,col_amt] if c]
-            if not selected or len(selected) < 4:
-                # Fallback: keep the old logic (select columns by index)
-                df = pd.read_excel(src_path)
-                df = df.iloc[2:].reset_index(drop=True)
-                keep_idx = [0, 2, 3, 9, 12, 15, 17]
-                keep_idx = [i for i in keep_idx if i < df.shape[1]]
-                df2 = df.iloc[:, keep_idx].copy()
-                target_cols = ['序号','商品名称','商品条码','单位','数量','单价','金额']
-                df2.columns = target_cols[:len(df2.columns)]
-            else:
-                df2 = df_raw[selected].copy()
-                # Normalize column names to those the old configuration expects, to simplify later mapping
-                rename_map = {}
-                if col_no: rename_map[col_no] = '序号'
-                if col_name: rename_map[col_name] = '商品名称'
-                if col_bc: rename_map[col_bc] = '商品条码(小条码)'
-                if col_unit: rename_map[col_unit] = '单位'
-                if col_qty: rename_map[col_qty] = '订购数量(小单位)'
-                if col_price: rename_map[col_price] = '单价(小单位)'
-                if col_amt: rename_map[col_amt] = '优惠后金额(小单位)'
-                df2 = df2.rename(columns=rename_map)
-            # Unit cleaning (same as the old version: change "件" to "份" and strip whitespace)
-            if '单位' in df2.columns:
-                try:
-                    df2['单位'] = df2['单位'].astype(str).str.strip().replace({'件':'份'})
-                except Exception:
-                    pass
-            # Keep the original ordering unit
-            # Split rows with multiple barcodes and divide the quantity evenly
-            try:
-                bc_col = '商品条码(小条码)' if '商品条码(小条码)' in df2.columns else ('商品条码' if '商品条码' in df2.columns else ('条码' if '条码' in df2.columns else None))
-                qty_col = '订购数量(小单位)' if '订购数量(小单位)' in df2.columns else ('订购数量' if '订购数量' in df2.columns else ('数量' if '数量' in df2.columns else None))
-                up_col = '单价(小单位)' if '单价(小单位)' in df2.columns else ('单价' if '单价' in df2.columns else ('销售价' if '销售价' in df2.columns else None))
-                amt_col = '优惠后金额(小单位)' if '优惠后金额(小单位)' in df2.columns else ('金额' if '金额' in df2.columns else ('小计' if '小计' in df2.columns else None))
-                if bc_col and qty_col:
-                    rows = []
-                    for _, row in df2.iterrows():
-                        bc_val = str(row.get(bc_col, '')).strip()
-                        if bc_val and (',' in bc_val or ',' in bc_val or '、' in bc_val or ' ' in bc_val or '/' in bc_val):
-                            parts = []
-                            for sep in [',',',','、','/',' ']:
-                                bc_val = bc_val.replace(sep, ' ')
-                            for token in bc_val.split():
-                                tok = ''.join([ch for ch in token if ch.isdigit()])
-                                if tok:
-                                    parts.append(tok)
-                            parts = [p for p in parts if p]
-                            if len(parts) >= 2:
-                                try:
-                                    q_total = float(row.get(qty_col, 0) or 0)
-                                except Exception:
-                                    q_total = 0
-                                if q_total > 0:
-                                    n = len(parts)
-                                    base = int(q_total) // n if q_total.is_integer() else q_total / n
-                                    remainder = int(q_total) % n if q_total.is_integer() else 0
-                                    for i, bc in enumerate(parts):
-                                        new_row = row.copy()
-                                        new_row[bc_col] = bc
-                                        q_each = base + (1 if remainder > 0 and i < remainder else 0)
-                                        new_row[qty_col] = q_each
-                                        if up_col and amt_col:
-                                            try:
-                                                upv = float(new_row.get(up_col, 0) or 0)
-                                                new_row[amt_col] = upv * float(q_each)
-                                            except Exception:
-                                                pass
-                                        rows.append(new_row)
-                                else:
-                                    # The quantity cannot be allocated; keep the original row
-                                    rows.append(row)
-                            else:
-                                rows.append(row)
-                        else:
-                            rows.append(row)
-                    df2 = pd.DataFrame(rows)
-            except Exception:
-                pass
-            base = os.path.basename(src_path)
-            inter_name = f"蓉城易购_处理后_{base}"
-            inter_path = os.path.join(out_dir, inter_name)
-            reporter.set("保存处理结果...", 45)
-            df2.to_excel(inter_path, index=False)
-            final_name = f"蓉城易购-{base}"
-            final_path = os.path.join(out_dir, final_name)
-            try:
-                if os.path.exists(final_path):
-                    os.remove(final_path)
-            except Exception:
-                pass
-            try:
-                os.replace(inter_path, final_path)
-            except Exception:
-                final_path = inter_path
-            add_to_log(log_widget, f"蓉城易购预处理完成: {final_path}\n", "success")
-            reporter.set("准备进行普通Excel处理...", 60)
-            add_recent_file(final_path)
-            time.sleep(3)
-            order_service = OrderService()
-            result = order_service.process_excel(final_path, progress_cb=lambda p: reporter.set("Excel处理中...", p))
+
+            from app.services.special_suppliers_service import SpecialSuppliersService
+            service = SpecialSuppliersService(ConfigManager())
+
+            result = service.process_rongcheng_yigou(
+                src_path,
+                progress_cb=lambda p, m: (reporter.set(m, p), add_to_log(log_widget, f"{m}\n", "info"))
+            )
+
             if result:
-                add_to_log(log_widget, "Excel普通处理完成\n", "success")
+                add_to_log(log_widget, f"处理完成: {result}\n", "success")
                 add_recent_file(result)
                 try:
                     validate_unit_price_against_item_data(result, log_widget)
-                except Exception:
-                    pass
+                except Exception: pass
                 open_result_directory_from_settings()
                 reporter.set("处理完成", 100)
             else:
-                add_to_log(log_widget, "Excel普通处理失败\n", "error")
+                add_to_log(log_widget, "处理失败\n", "error")
         except Exception as e:
-            add_to_log(log_widget, f"处理蓉城易购时出错: {str(e)}\n", "error")
-            msg = str(e)
-            suggestion = None
-            if 'pandas' in msg:
-                suggestion = "安装依赖:pip install pandas openpyxl"
-            if suggestion:
-                show_error_dialog("蓉城易购处理错误", msg, suggestion)
+            add_to_log(log_widget, f"处理出错: {str(e)}\n", "error")
         finally:
-            try:
-                reporter.done()
-            except Exception:
-                pass
+
+            reporter.done()
+
+    thread = Thread(target=run_in_thread)
+    thread.daemon = True
+    thread.start()
     thread = Thread(target=run_in_thread)
     thread.daemon = True
     thread.start()
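The launcher now hands the service a two-argument callback `(percent, message)`, and the service maps the progress of `process_excel` into the 60 to 100 range with `60 + int(p*0.4)`. A reusable form of that mapping might look like the sketch below. `scale_progress` is a hypothetical helper, not part of the commit, and it assumes the sub-task reports a single percentage from 0 to 100, as the lambda in the diff suggests.

```python
from typing import Callable, Optional

# Same shape as the service's progress_cb: (percent, message).
ProgressCb = Callable[[int, str], None]


def scale_progress(cb: Optional[ProgressCb], start: int, end: int, message: str) -> Callable[[int], None]:
    """Wrap `cb` so that a 0-100 sub-task percentage lands in [start, end]."""
    def inner(p: int) -> None:
        # Silently do nothing when no callback was supplied.
        if cb is not None:
            cb(start + int(p * (end - start) / 100), message)
    return inner
```

With `start=60` and `end=100` this reproduces the factor 0.4 used in the service, and the `None` check lives in one place instead of inside each lambda.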