docs: update guide and API to reflect smart file-type detection

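Update OPENCLAW_GUIDE.md to highlight the new fully automatic smart mode and simplify the usage instructions. Also change the default processing logic in headless_api.py so that it detects the input file type (image or Excel) from the file extension and routes it to the matching pipeline, improving the user experience.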
2026-03-30 15:40:19 +08:00
parent 32d41244e5
commit c06e3e55f9
2 changed files with 28 additions and 30 deletions
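The routing described in the commit message can be sketched in isolation as below. This is a minimal illustration, not the code in `headless_api.py`: the names `classify_input`, `EXCEL_EXTS`, and `IMAGE_EXTS` are hypothetical, and only the extension lists are taken from the diff. One deliberate difference: the committed default branch sends every non-Excel path to OCR, whereas this sketch returns `None` for an unknown extension so that a caller could reject it explicitly.

```python
import os
from typing import Optional

# Extension lists as they appear in the diff; the constant names are hypothetical.
EXCEL_EXTS = {".xlsx", ".xls"}
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def classify_input(path: str) -> Optional[str]:
    """Return "excel", "image", or None when the extension is not recognized."""
    # Compare case-insensitively, as the commit does with .lower().
    ext = os.path.splitext(path)[1].lower()
    if ext in EXCEL_EXTS:
        return "excel"
    if ext in IMAGE_EXTS:
        return "image"
    return None
```

A caller would branch on the result: run the Excel pipeline for `"excel"`, run OCR and then the Excel pipeline for `"image"`, and report an error for `None`.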
@@ -4,36 +4,28 @@

 ## 1. Core Interface (headless_api.py)

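-`headless_api.py` is the system's unified command-line entry point. It supports multiple modes, selected via arguments.
+`headless_api.py` is the system's unified command-line entry point. It now supports **smart file-type detection**, so in most cases OpenClaw **does not need to pass any arguments**.
+
+### 1.1 Fully automatic smart mode (recommended)
+Call it directly whether the input is an **image** or an **Excel** file. The system determines the file type automatically: for an Excel file it detects the supplier fingerprint and processes the file; for an image it runs OCR first and then applies the same smart processing.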

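-### 1.1 Smart Excel order processing (recommended)
-Use this when the user directly provides an Excel file exported by a supplier. **The system automatically recognizes content features of the file (e.g. 蓉城易购 RCDH, the tobacco company's monopoly license number, the handler 杨碧月) and routes it to the dedicated preprocessing flow.**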
 ```bash
-# Process the latest Excel file in data/input
-python headless_api.py --excel
-
-# Process a specific Excel file
-python headless_api.py "data/input/supplier_order.xlsx" --excel
-```
-
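-### 1.2 Image OCR
-Use this when the user sends an image of a purchase order. After OCR completes, the generated Excel file also goes through smart preprocessing.
-```bash
-# Automatically process the latest image in data/input
+# Scenario 1: automatically process the latest file in data/input (image or Excel)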
 python headless_api.py

-# Process a specific image
-python headless_api.py "data/input/my_invoice.jpg"
+# Scenario 2: process any specified file (image or Excel)
+python headless_api.py "data/input/my_file.jpg"
+python headless_api.py "data/input/my_file.xlsx"
 ```

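-### 1.3 Explicit special-supplier handling
-Use only when smart detection fails, to force a specific processing mode.
+### 1.2 Explicit special commands
+Use only when a specific flow must be forced.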
 ```bash
-# Explicit: 蓉城易购 mode
-python headless_api.py --rongcheng
+# Force Excel processing mode (even if the file lacks the correct extension)
+python headless_api.py --excel

-# Explicit: tobacco company mode
-python headless_api.py --tobacco
+# Force a barcode mapping
+python headless_api.py --update-mapping --barcode "123" --target "456"
 ```

 ## 2. Field and Logic Changes (v2.1)
@@ -121,17 +121,23 @@ def run_pipeline(args):
            # OrderService.process_excel calls _check_special_preprocess internally
            final_excel = order_service.process_excel(input_path)

-        # 5. Image OCR processing (default)
+        # 5. Smart processing (default: automatically determine image vs. Excel)
        else:
-            input_path = args.input or get_latest_file("data/input", [".jpg", ".jpeg", ".png", ".bmp"])
+            input_path = args.input or get_latest_file("data/input", [".jpg", ".jpeg", ".png", ".bmp", ".xlsx", ".xls"])
            if not input_path:
-                print("ERROR: No input image found.", file=sys.stderr)
+                print("ERROR: No input file found in data/input.", file=sys.stderr)
                return None
-            logger.info(f"Starting OCR on image: {input_path}")
-            ocr_service = OCRService(config_manager)
-            excel_intermediate = ocr_service.process_image(input_path)
-            if excel_intermediate:
-                final_excel = order_service.process_excel(excel_intermediate)
+            
+            ext = os.path.splitext(input_path)[1].lower()
+            if ext in [".xlsx", ".xls"]:
+                logger.info(f"Detected Excel file, starting processing: {input_path}")
+                final_excel = order_service.process_excel(input_path)
+            else:
+                logger.info(f"Detected image file, starting OCR: {input_path}")
+                ocr_service = OCRService(config_manager)
+                excel_intermediate = ocr_service.process_image(input_path)
+                if excel_intermediate:
+                    final_excel = order_service.process_excel(excel_intermediate)

        # 6. Post-processing (validation and output)
        if final_excel: