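docs(openclaw): update the integration guide and refactor the Rongcheng Yigou (蓉城易购) processing logic

- Move the Rongcheng Yigou processing logic out of the launcher into a dedicated service class to improve modularity
- Update the OpenClaw guide to document the new processing modes (Excel, special suppliers, barcode mapping)
- Unify the headless_api entry point so that command-line flags select the processing flow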
侯欢 2026-03-30 10:15:01 +08:00
parent 83405a9b8e
commit 3e2f46d26d
4 changed files with 337 additions and 246 deletions


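@@ -1,51 +1,66 @@
-# OCR Order Processing System - OpenClaw Integration Guide
+# OCR Order Processing System - OpenClaw Integration Guide (Enhanced Edition)
-This guide helps OpenClaw (or any automated AI agent) understand how to integrate with this project and close the loop of "send image -> automatic processing -> return Excel".
+This guide helps OpenClaw drive the purchase-order processing system end to end, covering image recognition, Excel cleaning, special-supplier handling, and barcode-mapping management.
-## 1. Project Overview
-This project is an automated purchase-order processing system. It uses OCR to recognize tabular data in images, cleans the data according to preset rules (for example splitting units and inferring specifications), and finally generates a purchase order in a standard Excel format.
-## 2. Core Workflow
-Automated integrations should follow these steps:
-1. **Prepare the image**: save the received purchase-order image to the project's `data/input/` directory.
-2. **Trigger processing**: run the `headless_api.py` script.
-3. **Capture the path**: read the absolute path of the generated Excel file from the script's standard output (stdout).
-4. **Read the result**: read the file at that path and send it back to the user.
-## 3. API Reference (headless_api.py)
-This entry point is designed specifically for automated integration.
-### How to Run
+## 1. Core Interface (headless_api.py)
+`headless_api.py` is the system's unified command-line entry point. It supports several modes, selected through command-line arguments.
+### 1.1 Image OCR (default mode)
+Use this mode when the user sends a purchase-order image.
 ```bash
-# Option A: automatically process the newest image in the input directory
+# Automatically process the newest image in data/input
 python headless_api.py
-# Option B: process the image at a given path
-python headless_api.py "E:/2025Code/python/orc-order-v2/data/input/invoice_001.jpg"
+# Process a specific image
+python headless_api.py "data/input/my_invoice.jpg"
 ```
-### Output Conventions
-- **Standard output (stdout)**: on success only, prints **the absolute path of the generated Excel file**.
-  - *Example*: `E:\2025Code\python\orc-order-v2\data\result\采购单_invoice_001.xlsx`
-- **Standard error (stderr)**: receives all run logs, progress messages, warnings, and error details.
-  - *Note*: if unit-price validation finds a discrepancy (the generated price differs from the product-data price by more than 1 yuan), the warning is printed to this stream.
-## 4. Key Directories
-- `data/input/`: original images awaiting processing (jpg, png, and bmp are supported).
-- `data/result/`: the final standard Excel purchase orders.
-- `templates/商品资料.xlsx`: the product-data file the system uses automatically for unit-price validation.
-- `config/suppliers_config.json`: the cleaning rules for each supplier.
-## 5. Exit Codes and Error Handling
-- Exit code `0`: processing succeeded and stdout contains the file path.
-- Exit code `1`: processing failed; see stderr for the cause.
-## 6. Suggested Strategy for OpenClaw
-1. **Poll or trigger**: when the user sends an image, clear `data/input/` first or write the image into it directly.
-2. **Capture the path**: run the script with something like `subprocess.check_output` and take the last line of output as the file path.
-3. **Price alert**: OpenClaw can scan stderr; if it contains `WARNING: Price validation found...`, it can proactively tell the user: "Processing finished, but some unit prices do not match the product database. Please verify."
+### 1.2 Excel-Only Order Processing
+Use this mode when the user directly provides an Excel file exported by the supplier.
+```bash
+# Process the newest Excel file in data/input
+python headless_api.py --excel
+# Process a specific Excel file
+python headless_api.py "data/input/supplier_order.xlsx" --excel
+```
+### 1.3 Special-Supplier Preprocessing
+For suppliers whose formats are highly unusual (such as Rongcheng Yigou (蓉城易购) and the tobacco company), the system has dedicated built-in logic.
+```bash
+# Process a Rongcheng Yigou order (automatically picks the newest order file in data/output)
+python headless_api.py --rongcheng
+# Process a tobacco company order
+python headless_api.py --tobacco
+```
+### 1.4 Barcode Mapping Management
+When a barcode recognized by OCR does not match the one in the system database, update the mapping through this interface.
+```bash
+# Map original barcode 12345 to target system barcode 67890
+python headless_api.py --update-mapping --barcode "12345" --target "67890"
+```
+## 2. OpenClaw Integration Strategy
+### 2.1 Task Dispatch
+OpenClaw should choose the arguments based on the user's input:
+- **Image received** -> use the default mode.
+- **Excel file received** -> use `--excel`.
+- **User mentions "蓉城" (Rongcheng) or "易购" (Yigou)** -> use `--rongcheng`.
+- **User mentions "烟草" (tobacco)** -> use `--tobacco`.
+- **User corrects a barcode error** -> use `--update-mapping`.
+### 2.2 Retrieving Results
+- **Success**: the script prints the absolute path of the generated Excel file to `stdout`. OpenClaw captures this path and sends the file.
+- **Unit-price warning**: the script writes `WARNING: Price validation found...` to `stderr`. OpenClaw should capture it and relay it to the user: "The file has been generated, but some product prices look abnormal. Please review."
+## 3. Common Directories
+- `data/input/`: original images or Excel files awaiting processing.
+- `data/output/`: intermediate files produced by OCR (the Rongcheng Yigou and tobacco modes read from this directory).
+- `data/result/`: the final standard 银豹 (Yinbao) purchase orders.
 ---
-*Automatically generated by the system on 2026-03-25*
+*Version: 2.0 | Updated: 2026-03-25*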
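
The dispatch and result-retrieval rules in sections 2.1 and 2.2 of the updated guide could be sketched on the OpenClaw side roughly as follows. This is a minimal illustration, not part of the commit: `choose_flags`, `parse_result`, and `run_headless` are hypothetical helper names, the script is assumed to live in the current working directory, and the barcode-correction case is left out because it needs two extra values (`--barcode`, `--target`).

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional, Tuple

# Prefix of the stderr warning quoted in section 2.2 of the guide.
PRICE_WARNING_PREFIX = "WARNING: Price validation found"


def choose_flags(user_text: str, attachment: Optional[str]) -> List[str]:
    """Map a user message to headless_api.py arguments (hypothetical helper)."""
    text = user_text or ""
    if "蓉城" in text or "易购" in text:
        return ["--rongcheng"]
    if "烟草" in text:
        return ["--tobacco"]
    if attachment and Path(attachment).suffix.lower() in {".xlsx", ".xls"}:
        return [attachment, "--excel"]
    if attachment:
        return [attachment]  # default mode: image OCR on the given file
    return []  # default mode: newest image in data/input


def parse_result(returncode: int, stdout: str, stderr: str) -> Tuple[Optional[str], bool]:
    """Return (excel_path, has_price_warning) for a finished run."""
    has_warning = PRICE_WARNING_PREFIX in stderr
    if returncode != 0:
        return None, has_warning
    lines = [ln.strip() for ln in stdout.splitlines() if ln.strip()]
    # The guide says the path is printed on stdout; take the last non-empty line.
    return (lines[-1] if lines else None), has_warning


def run_headless(flags: List[str], script: str = "headless_api.py") -> Tuple[Optional[str], bool]:
    """Run the script and interpret its output; the script location is an assumption."""
    proc = subprocess.run([sys.executable, script, *flags],
                          capture_output=True, text=True)
    return parse_result(proc.returncode, proc.stdout, proc.stderr)
```

Keeping `parse_result` separate from the subprocess call makes the interpretation of exit code, stdout, and stderr easy to check without running the real pipeline.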


@@ -0,0 +1,144 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import re
import time
import pandas as pd
import logging
from typing import Optional, Callable
from app.services.order_service import OrderService

logger = logging.getLogger(__name__)


class SpecialSuppliersService:
    """
    Service class for supplier-specific logic, e.g. Rongcheng Yigou (蓉城易购).
    """

    def __init__(self, config_manager=None):
        self.config_manager = config_manager
        self.order_service = OrderService(config_manager)

    def process_rongcheng_yigou(self, src_path: str, progress_cb: Optional[Callable[[int, str], None]] = None) -> Optional[str]:
        """
        Process a Rongcheng Yigou order.
        """
        try:
            if progress_cb: progress_cb(10, "Processing Rongcheng Yigou order...")

            def _pick_col(df, exact_list=None, contains_list=None):
                # Exact header matches win over substring matches
                cols = list(df.columns)
                if exact_list:
                    for name in exact_list:
                        for c in cols:
                            if str(c).strip() == str(name).strip():
                                return c
                if contains_list:
                    for kw in contains_list:
                        for c in cols:
                            if kw in str(c):
                                return c
                return None

            def _first_present(df, names):
                # First candidate column name that exists in df, else None
                return next((n for n in names if n in df.columns), None)

            try:
                df_raw = pd.read_excel(src_path, header=2)
            except Exception:
                df_raw = pd.read_excel(src_path)
                df_raw = df_raw.iloc[2:].reset_index(drop=True)

            # Drop columns and rows that are entirely empty
            df_raw = df_raw.dropna(how='all', axis=1).dropna(how='all', axis=0)

            # Locate the key columns (no., name, barcode, unit, quantity, price, amount)
            col_no = _pick_col(df_raw, contains_list=['序号'])
            col_name = _pick_col(df_raw, contains_list=['商品名称','品名','名称'])
            col_bc = _pick_col(df_raw, contains_list=['商品条码','条码'])
            col_unit = _pick_col(df_raw, exact_list=['单位(订购单位)'], contains_list=['订购单位','小单位','单位'])
            col_qty = _pick_col(df_raw, contains_list=['订购数量','订货数量','数量'])
            col_price = _pick_col(df_raw, exact_list=['优惠后金额(小单位)'], contains_list=['单价','销售价','进货价','优惠后金额'])
            col_amt = _pick_col(df_raw, exact_list=['出库小计(元)'], contains_list=['金额','优惠后金额','小计','合计','出库小计'])

            selected = [c for c in [col_no,col_name,col_bc,col_unit,col_qty,col_price,col_amt] if c]
            if not selected or len(selected) < 4:
                # Fallback: select columns by position
                df = pd.read_excel(src_path)
                df = df.iloc[2:].reset_index(drop=True)
                keep_idx = [0, 2, 3, 9, 12, 15, 17]
                keep_idx = [i for i in keep_idx if i < df.shape[1]]
                df2 = df.iloc[:, keep_idx].copy()
                target_cols = ['序号','商品名称','商品条码','单位','数量','单价','金额']
                df2.columns = target_cols[:len(df2.columns)]
            else:
                df2 = df_raw[selected].copy()
                rename_map = {}
                if col_no: rename_map[col_no] = '序号'
                if col_name: rename_map[col_name] = '商品名称'
                if col_bc: rename_map[col_bc] = '商品条码(小条码)'
                if col_unit: rename_map[col_unit] = '单位'
                if col_qty: rename_map[col_qty] = '订购数量(小单位)'
                if col_price: rename_map[col_price] = '单价(小单位)'
                if col_amt: rename_map[col_amt] = '优惠后金额(小单位)'
                df2 = df2.rename(columns=rename_map)

            # Unit cleanup: strip whitespace and change 件 to 份
            if '单位' in df2.columns:
                df2['单位'] = df2['单位'].astype(str).str.strip().replace({'件':'份'})

            # Split multi-barcode rows and divide the quantity evenly
            bc_col = _first_present(df2, ['商品条码(小条码)', '商品条码', '条码'])
            qty_col = _first_present(df2, ['订购数量(小单位)', '订购数量', '数量'])
            up_col = _first_present(df2, ['单价(小单位)', '单价', '销售价'])
            amt_col = _first_present(df2, ['优惠后金额(小单位)', '金额', '小计'])

            # Assumption: the two non-ASCII separators were lost in this listing;
            # the full-width comma and the ideographic comma are assumed here.
            seps = [',', ',', '、', '/', ' ']

            if bc_col and qty_col:
                rows = []
                for _, row in df2.iterrows():
                    bc_val = str(row.get(bc_col, '')).strip()
                    if bc_val and any(sep in bc_val for sep in seps):
                        parts = []
                        temp_bc = bc_val
                        for sep in seps:
                            temp_bc = temp_bc.replace(sep, ' ')
                        for token in temp_bc.split():
                            # Keep only the digits of each token
                            tok = ''.join([ch for ch in token if ch.isdigit()])
                            if tok: parts.append(tok)
                        parts = [p for p in parts if p]
                        if len(parts) >= 2:
                            try:
                                q_total = float(row.get(qty_col, 0) or 0)
                            except Exception:
                                q_total = 0
                            if q_total > 0:
                                n = len(parts)
                                # Whole quantities split into integers; the first
                                # `remainder` rows receive one extra unit each
                                base = int(q_total) // n if q_total.is_integer() else q_total / n
                                remainder = int(q_total) % n if q_total.is_integer() else 0
                                for i, bc in enumerate(parts):
                                    new_row = row.copy()
                                    new_row[bc_col] = bc
                                    q_each = base + (1 if remainder > 0 and i < remainder else 0)
                                    new_row[qty_col] = q_each
                                    if up_col and amt_col:
                                        try:
                                            upv = float(new_row.get(up_col, 0) or 0)
                                            new_row[amt_col] = upv * float(q_each)
                                        except Exception: pass
                                    rows.append(new_row)
                            else: rows.append(row)
                        else: rows.append(row)
                    else: rows.append(row)
                df2 = pd.DataFrame(rows)

            out_dir = os.path.dirname(src_path)
            base = os.path.basename(src_path)
            final_name = f"蓉城易购预处理-{base}"
            final_path = os.path.join(out_dir, final_name)
            df2.to_excel(final_path, index=False)

            if progress_cb: progress_cb(60, "Preprocessing finished; starting the standard pipeline...")

            # Map the 0-100 progress of the Excel stage onto the 60-100 range
            result = self.order_service.process_excel(
                final_path,
                progress_cb=lambda p: progress_cb(60 + int(p*0.4), "Processing Excel...") if progress_cb else None
            )
            return result

        except Exception as e:
            logger.error(f"Error processing Rongcheng Yigou order: {e}")
            return None
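
The row-splitting step in `process_rongcheng_yigou` (one order line carrying several barcodes is expanded into one line per barcode, with the quantity shared out) can be isolated from pandas for easier testing. The sketch below is an illustration under assumptions, not code from the commit: `split_barcodes` and `split_quantity` are hypothetical names, and only ASCII separators are used by default.

```python
from typing import List, Sequence, Union

Number = Union[int, float]


def split_barcodes(cell: str, separators: Sequence[str] = (",", "/", " ")) -> List[str]:
    """Extract digit-only barcodes from a cell that holds several codes."""
    for sep in separators:
        cell = cell.replace(sep, " ")
    tokens = ("".join(ch for ch in tok if ch.isdigit()) for tok in cell.split())
    return [t for t in tokens if t]


def split_quantity(total: float, n: int) -> List[Number]:
    """Share `total` across `n` barcodes.

    Whole-number totals are split into integers, and the first `remainder`
    barcodes get one extra unit; fractional totals are divided equally.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    total = float(total)
    if total.is_integer():
        base, remainder = divmod(int(total), n)
        return [base + (1 if i < remainder else 0) for i in range(n)]
    return [total / n] * n
```

For a whole-number total the shares always sum back to the original quantity, which is the property the per-row amount recalculation (`unit price * quantity`) relies on.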


@@ -12,8 +12,10 @@ import os
 import sys
 import logging
 import time
+import argparse
+import json
 from pathlib import Path
-from typing import Optional
+from typing import Optional, List, Dict

 # Add the current directory to the path
 sys.path.append(os.path.dirname(os.path.abspath(__file__)))
@@ -21,9 +23,11 @@ sys.path.append(os.path.dirname(os.path.abspath(__file__)))
 from app.config.settings import ConfigManager
 from app.services.ocr_service import OCRService
 from app.services.order_service import OrderService
+from app.services.tobacco_service import TobaccoService
+from app.services.special_suppliers_service import SpecialSuppliersService
 from app.core.utils.log_utils import set_log_level

-# Send log output to stderr so that it does not interfere with the path printed on stdout
+# Send log output to stderr
 logging.basicConfig(
     level=logging.INFO,
     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
@@ -31,73 +35,121 @@ logging.basicConfig(
 )
 logger = logging.getLogger("HeadlessAPI")

-def get_latest_input_image() -> Optional[str]:
-    """Return the newest image file in the input directory."""
-    input_dir = Path("data/input")
-    if not input_dir.exists():
+def get_latest_file(directory: str, extensions: List[str]) -> Optional[str]:
+    """Return the newest file in the directory that has one of the given extensions."""
+    dir_path = Path(directory)
+    if not dir_path.exists():
         return None

-    extensions = ['.jpg', '.jpeg', '.png', '.bmp']
     files = []
     for ext in extensions:
-        files.extend(input_dir.glob(f"*{ext}"))
-        files.extend(input_dir.glob(f"*{ext.upper()}"))
+        files.extend(dir_path.glob(f"*{ext}"))
+        files.extend(dir_path.glob(f"*{ext.upper()}"))

     if not files:
         return None

-    # Sort by modification time
     latest_file = max(files, key=lambda p: p.stat().st_mtime)
     return str(latest_file)

-def run_pipeline(image_path: Optional[str] = None):
+def update_barcode_mapping(barcode: str, target_barcode: str):
+    """Update the barcode mapping."""
+    try:
+        config_path = os.path.join("config", "barcode_mappings.json")
+        mappings = {}
+        if os.path.exists(config_path):
+            with open(config_path, 'r', encoding='utf-8') as f:
+                mappings = json.load(f)
+
+        mappings[barcode] = target_barcode
+
+        with open(config_path, 'w', encoding='utf-8') as f:
+            json.dump(mappings, f, ensure_ascii=False, indent=2)
+
+        logger.info(f"Barcode mapping updated: {barcode} -> {target_barcode}")
+        return True
+    except Exception as e:
+        logger.error(f"Failed to update barcode mapping: {e}")
+        return False
+
+def run_pipeline(args):
     """Run the processing pipeline."""
     try:
-        # 1. Determine the input file
-        if not image_path:
-            image_path = get_latest_input_image()
-
-        if not image_path:
-            print("ERROR: No input image found.", file=sys.stderr)
-            return None
-
-        logger.info(f"Start processing image: {image_path}")
-
-        # 2. Initialize the services
         config_manager = ConfigManager()
-        ocr_service = OCRService(config_manager)
         order_service = OrderService(config_manager)
-
-        # 3. OCR
         start_time = time.perf_counter()
-        excel_intermediate = ocr_service.process_image(image_path)
-        if not excel_intermediate:
-            print(f"ERROR: OCR failed for {image_path}", file=sys.stderr)
-            return None
-
-        # 4. Excel processing and cleaning
-        final_excel = order_service.process_excel(excel_intermediate)
-        if not final_excel:
-            print(f"ERROR: Excel processing failed for {excel_intermediate}", file=sys.stderr)
-            return None
-
-        # 5. Unit-price validation (written to stderr)
-        discrepancies = order_service.validate_unit_price(final_excel)
-        if discrepancies:
-            print(f"WARNING: Price validation found {len(discrepancies)} issues:", file=sys.stderr)
-            for d in discrepancies:
-                print(f" - {d}", file=sys.stderr)
-        else:
-            logger.info("Unit-price validation passed")
-
-        duration = time.perf_counter() - start_time
-        logger.info(f"Processing finished in {duration:.2f}s")
-
-        # 6. Print the final result path to stdout
-        # Make sure it is an absolute path
-        abs_path = os.path.abspath(final_excel)
-        print(abs_path)
-        return abs_path
+        final_excel = None
+
+        # 1. Barcode mapping update
+        if args.update_mapping:
+            if not args.barcode or not args.target:
+                print("ERROR: --barcode and --target are required for --update-mapping", file=sys.stderr)
+                return None
+            if update_barcode_mapping(args.barcode, args.target):
+                print(f"SUCCESS: Mapping updated {args.barcode} -> {args.target}")
+                return "MAPPING_UPDATED"
+            return None
+
+        # 2. Tobacco company orders
+        if args.tobacco:
+            input_path = args.input or get_latest_file("data/output", [".xlsx", ".xls"])
+            if not input_path:
+                print("ERROR: No tobacco order file found.", file=sys.stderr)
+                return None
+            logger.info(f"Start processing tobacco order: {input_path}")
+            tobacco_service = TobaccoService(config_manager)
+            final_excel = tobacco_service.process_tobacco_order(input_path)
+
+        # 3. Rongcheng Yigou orders
+        elif args.rongcheng:
+            input_path = args.input or get_latest_file("data/output", [".xlsx", ".xls"])
+            if not input_path:
+                print("ERROR: No Rongcheng Yigou order file found.", file=sys.stderr)
+                return None
+            logger.info(f"Start processing Rongcheng Yigou order: {input_path}")
+            special_service = SpecialSuppliersService(config_manager)
+            final_excel = special_service.process_rongcheng_yigou(input_path)
+
+        # 4. Plain Excel processing
+        elif args.excel:
+            input_path = args.input or get_latest_file("data/input", [".xlsx", ".xls"])
+            if not input_path:
+                print("ERROR: No Excel file found in input.", file=sys.stderr)
+                return None
+            logger.info(f"Start processing Excel: {input_path}")
+            final_excel = order_service.process_excel(input_path)
+
+        # 5. Image OCR processing (default)
+        else:
+            input_path = args.input or get_latest_file("data/input", [".jpg", ".jpeg", ".png", ".bmp"])
+            if not input_path:
+                print("ERROR: No input image found.", file=sys.stderr)
+                return None
+            logger.info(f"Start OCR on image: {input_path}")
+            ocr_service = OCRService(config_manager)
+            excel_intermediate = ocr_service.process_image(input_path)
+            if excel_intermediate:
+                final_excel = order_service.process_excel(excel_intermediate)
+
+        # 6. Post-processing (validation and output)
+        if final_excel:
+            # Unit-price validation
+            discrepancies = order_service.validate_unit_price(final_excel)
+            if discrepancies:
+                print(f"WARNING: Price validation found {len(discrepancies)} issues:", file=sys.stderr)
+                for d in discrepancies:
+                    print(f" - {d}", file=sys.stderr)
+
+            duration = time.perf_counter() - start_time
+            logger.info(f"Processing finished in {duration:.2f}s")
+
+            # Print the final path
+            abs_path = os.path.abspath(final_excel)
+            print(abs_path)
+            return abs_path
+        else:
+            print("ERROR: Processing failed.", file=sys.stderr)
+            return None

     except Exception as e:
         import traceback
@@ -106,7 +158,18 @@ def run_pipeline(image_path: Optional[str] = None):
         return None

 if __name__ == "__main__":
-    # Accept an image path from the command line
-    input_path = sys.argv[1] if len(sys.argv) > 1 else None
-    result = run_pipeline(input_path)
+    parser = argparse.ArgumentParser(description="OCR order processing system - headless automation interface")
+    parser.add_argument('input', nargs='?', help='Input file path (image or Excel)')
+
+    group = parser.add_mutually_exclusive_group()
+    group.add_argument('--excel', action='store_true', help='Process a plain Excel file')
+    group.add_argument('--tobacco', action='store_true', help='Process a tobacco company order')
+    group.add_argument('--rongcheng', action='store_true', help='Process a Rongcheng Yigou order')
+    group.add_argument('--update-mapping', action='store_true', help='Update the barcode mapping')
+
+    parser.add_argument('--barcode', help='Original barcode to map (used with --update-mapping)')
+    parser.add_argument('--target', help='Target barcode (used with --update-mapping)')
+
+    args = parser.parse_args()
+    result = run_pipeline(args)
     sys.exit(0 if result else 1)
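
To see how the new command line resolves to a processing branch without importing the real services, the parser can be rebuilt on its own. The sketch below mirrors the flags added in this hunk; `build_parser` and `mode_of` are hypothetical names introduced only for illustration, and the returned mode strings are labels of my own, not values used by `headless_api.py`.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the argument layout shown in the diff (help texts omitted)."""
    parser = argparse.ArgumentParser(description="headless_api.py argument sketch")
    parser.add_argument("input", nargs="?")
    group = parser.add_mutually_exclusive_group()
    for flag in ("--excel", "--tobacco", "--rongcheng", "--update-mapping"):
        group.add_argument(flag, action="store_true")
    parser.add_argument("--barcode")
    parser.add_argument("--target")
    return parser


def mode_of(argv: List[str]) -> str:
    """Name the branch run_pipeline would take, in the same order it checks them."""
    args = build_parser().parse_args(argv)
    if args.update_mapping:
        # Both values are required, as in run_pipeline
        return "update-mapping" if args.barcode and args.target else "error"
    if args.tobacco:
        return "tobacco"
    if args.rongcheng:
        return "rongcheng"
    if args.excel:
        return "excel"
    return "ocr"
```

Because the four mode flags share one mutually exclusive group, argparse itself rejects combinations such as `--excel --tobacco` (it exits with an error), so the branch order only matters for the default OCR mode.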


@@ -1609,182 +1609,51 @@ def process_rongcheng_yigou_with_status(log_widget, status_bar):
         try:
             reporter = ProgressReporter(status_bar)
             reporter.running()
-            reporter.set("Processing Rongcheng Yigou order...", 10)
             add_to_log(log_widget, "Started processing Rongcheng Yigou order\n", "info")
             s = load_user_settings()
             out_dir = os.path.abspath(s.get('output_folder', 'data/output'))
-            if not os.path.exists(out_dir):
-                os.makedirs(out_dir, exist_ok=True)
             candidates = []
-            for f in os.listdir(out_dir):
-                fn = f.lower()
-                if re.match(r'^订单\d+\.xlsx$', fn):
-                    p = os.path.join(out_dir, f)
-                    try:
-                        candidates.append((p, os.path.getmtime(p)))
-                    except Exception:
-                        pass
+            if os.path.exists(out_dir):
+                for f in os.listdir(out_dir):
+                    if re.match(r'^订单\d+\.xlsx$', f.lower()):
+                        p = os.path.join(out_dir, f)
+                        candidates.append((p, os.path.getmtime(p)))
             if not candidates:
                 add_to_log(log_widget, "No Rongcheng Yigou order file found in the output directory\n", "warning")
                 reporter.done()
                 return
             candidates.sort(key=lambda x: x[1], reverse=True)
             src_path = candidates[0][0]
-            reporter.set("Reading and cleaning data...", 25)
-            def _pick_col(df, exact_list=None, contains_list=None):
-                cols = list(df.columns)
-                if exact_list:
-                    for name in exact_list:
-                        for c in cols:
-                            if str(c).strip() == str(name).strip():
-                                return c
-                if contains_list:
-                    for kw in contains_list:
-                        for c in cols:
-                            if kw in str(c):
-                                return c
-                return None
-            try:
-                df_raw = pd.read_excel(src_path, header=2)
-            except Exception:
-                df_raw = pd.read_excel(src_path)
-                df_raw = df_raw.iloc[2:].reset_index(drop=True)
-            # Drop columns and rows that are entirely empty
-            df_raw = df_raw.dropna(how='all', axis=1).dropna(how='all', axis=0)
-            # Select the key columns (by keyword match)
-            col_no = _pick_col(df_raw, contains_list=['序号'])
-            col_name = _pick_col(df_raw, contains_list=['商品名称','品名','名称'])
-            col_bc = _pick_col(df_raw, contains_list=['商品条码','条码'])
-            col_unit = _pick_col(df_raw, exact_list=['单位(订购单位)'], contains_list=['订购单位','小单位','单位'])
-            col_qty = _pick_col(df_raw, contains_list=['订购数量','订货数量','数量'])
-            # New template mapping: 优惠后金额(小单位) serves as 单价(小单位), and 出库小计(元) serves as 优惠后金额(小单位)
-            col_price= _pick_col(df_raw, exact_list=['优惠后金额(小单位)'], contains_list=['单价','销售价','进货价','优惠后金额'])
-            col_amt = _pick_col(df_raw, exact_list=['出库小计(元)'], contains_list=['金额','优惠后金额','小计','合计','出库小计'])
-            selected = [c for c in [col_no,col_name,col_bc,col_unit,col_qty,col_price,col_amt] if c]
-            if not selected or len(selected) < 4:
-                # Fallback: keep the old logic (select columns by index)
-                df = pd.read_excel(src_path)
-                df = df.iloc[2:].reset_index(drop=True)
-                keep_idx = [0, 2, 3, 9, 12, 15, 17]
-                keep_idx = [i for i in keep_idx if i < df.shape[1]]
-                df2 = df.iloc[:, keep_idx].copy()
-                target_cols = ['序号','商品名称','商品条码','单位','数量','单价','金额']
-                df2.columns = target_cols[:len(df2.columns)]
-            else:
-                df2 = df_raw[selected].copy()
-                # Normalize column names to those the old config expects, to ease later mapping
-                rename_map = {}
-                if col_no: rename_map[col_no] = '序号'
-                if col_name: rename_map[col_name] = '商品名称'
-                if col_bc: rename_map[col_bc] = '商品条码(小条码)'
-                if col_unit: rename_map[col_unit] = '单位'
-                if col_qty: rename_map[col_qty] = '订购数量(小单位)'
-                if col_price: rename_map[col_price] = '单价(小单位)'
-                if col_amt: rename_map[col_amt] = '优惠后金额(小单位)'
-                df2 = df2.rename(columns=rename_map)
-            # Unit cleanup (same as the old version: change 件 to 份 and strip whitespace)
-            if '单位' in df2.columns:
-                try:
-                    df2['单位'] = df2['单位'].astype(str).str.strip().replace({'件':'份'})
-                except Exception:
-                    pass
-            # Keep the original order unit
-            # Split multi-barcode rows and divide the quantity evenly
-            try:
-                bc_col = '商品条码(小条码)' if '商品条码(小条码)' in df2.columns else ('商品条码' if '商品条码' in df2.columns else ('条码' if '条码' in df2.columns else None))
-                qty_col = '订购数量(小单位)' if '订购数量(小单位)' in df2.columns else ('订购数量' if '订购数量' in df2.columns else ('数量' if '数量' in df2.columns else None))
-                up_col = '单价(小单位)' if '单价(小单位)' in df2.columns else ('单价' if '单价' in df2.columns else ('销售价' if '销售价' in df2.columns else None))
-                amt_col = '优惠后金额(小单位)' if '优惠后金额(小单位)' in df2.columns else ('金额' if '金额' in df2.columns else ('小计' if '小计' in df2.columns else None))
-                if bc_col and qty_col:
-                    rows = []
-                    for _, row in df2.iterrows():
-                        bc_val = str(row.get(bc_col, '')).strip()
-                        # Assumption: the two non-ASCII separators lost in this listing are ',' and '、'
-                        if bc_val and (',' in bc_val or ',' in bc_val or '、' in bc_val or ' ' in bc_val or '/' in bc_val):
-                            parts = []
-                            for sep in [',',',','、','/',' ']:
-                                bc_val = bc_val.replace(sep, ' ')
-                            for token in bc_val.split():
-                                tok = ''.join([ch for ch in token if ch.isdigit()])
-                                if tok:
-                                    parts.append(tok)
-                            parts = [p for p in parts if p]
-                            if len(parts) >= 2:
-                                try:
-                                    q_total = float(row.get(qty_col, 0) or 0)
-                                except Exception:
-                                    q_total = 0
-                                if q_total > 0:
-                                    n = len(parts)
-                                    base = int(q_total) // n if q_total.is_integer() else q_total / n
-                                    remainder = int(q_total) % n if q_total.is_integer() else 0
-                                    for i, bc in enumerate(parts):
-                                        new_row = row.copy()
-                                        new_row[bc_col] = bc
-                                        q_each = base + (1 if remainder > 0 and i < remainder else 0)
-                                        new_row[qty_col] = q_each
-                                        if up_col and amt_col:
-                                            try:
-                                                upv = float(new_row.get(up_col, 0) or 0)
-                                                new_row[amt_col] = upv * float(q_each)
-                                            except Exception:
-                                                pass
-                                        rows.append(new_row)
-                                else:
-                                    # The quantity cannot be distributed; keep the original row
-                                    rows.append(row)
-                            else:
-                                rows.append(row)
-                        else:
-                            rows.append(row)
-                    df2 = pd.DataFrame(rows)
-            except Exception:
-                pass
-            base = os.path.basename(src_path)
-            inter_name = f"蓉城易购_处理后_{base}"
-            inter_path = os.path.join(out_dir, inter_name)
-            reporter.set("Saving the processed result...", 45)
-            df2.to_excel(inter_path, index=False)
-            final_name = f"蓉城易购-{base}"
-            final_path = os.path.join(out_dir, final_name)
-            try:
-                if os.path.exists(final_path):
-                    os.remove(final_path)
-            except Exception:
-                pass
-            try:
-                os.replace(inter_path, final_path)
-            except Exception:
-                final_path = inter_path
-            add_to_log(log_widget, f"Rongcheng Yigou preprocessing finished: {final_path}\n", "success")
-            reporter.set("Preparing standard Excel processing...", 60)
-            add_recent_file(final_path)
-            time.sleep(3)
-            order_service = OrderService()
-            result = order_service.process_excel(final_path, progress_cb=lambda p: reporter.set("Processing Excel...", p))
+
+            from app.services.special_suppliers_service import SpecialSuppliersService
+            service = SpecialSuppliersService(ConfigManager())
+
+            result = service.process_rongcheng_yigou(
+                src_path,
+                progress_cb=lambda p, m: (reporter.set(m, p), add_to_log(log_widget, f"{m}\n", "info"))
+            )
+
             if result:
-                add_to_log(log_widget, "Standard Excel processing finished\n", "success")
+                add_to_log(log_widget, f"Processing finished: {result}\n", "success")
                 add_recent_file(result)
                 try:
                     validate_unit_price_against_item_data(result, log_widget)
-                except Exception:
-                    pass
+                except Exception: pass
                 open_result_directory_from_settings()
                 reporter.set("Processing finished", 100)
             else:
-                add_to_log(log_widget, "Standard Excel processing failed\n", "error")
+                add_to_log(log_widget, "Processing failed\n", "error")
         except Exception as e:
-            add_to_log(log_widget, f"Error while processing Rongcheng Yigou order: {str(e)}\n", "error")
-            msg = str(e)
-            suggestion = None
-            if 'pandas' in msg:
-                suggestion = "Install dependencies: pip install pandas openpyxl"
-            if suggestion:
-                show_error_dialog("Rongcheng Yigou processing error", msg, suggestion)
+            add_to_log(log_widget, f"Processing error: {str(e)}\n", "error")
         finally:
-            try:
-                reporter.done()
-            except Exception:
-                pass
+            reporter.done()

     thread = Thread(target=run_in_thread)
     thread.daemon = True
     thread.start()