New system

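Another update; more convenient now.
Fix barcode validation: handle over-long barcodes during the validation stage by stripping surplus trailing zeros, so that a barcode never exceeds the standard length.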
2025-08-17 15:52:49 +08:00 · 2025-08-17 15:50:21 +08:00 · 2025-05-30 12:38:25 +08:00 · 2025-05-30 12:14:53 +08:00 · 2025-05-30 12:08:06 +08:00 · 2025-05-30 11:54:08 +08:00
166 changed files with 72114 additions and 22184 deletions
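The barcode fix described in the commit message can be pictured roughly as follows. This is a minimal sketch, not the project's actual validator: the function name `trim_barcode` is hypothetical, and the 13-digit limit (EAN-13) is an assumption, since the commit message only says "standard length".

```python
MAX_BARCODE_LEN = 13  # assumption: EAN-13; the commit does not state the limit


def trim_barcode(raw: str, max_len: int = MAX_BARCODE_LEN) -> str:
    """Strip surplus trailing zeros until the barcode fits within max_len.

    Only trailing "0" characters are removed; a barcode that is still too
    long afterwards is returned unchanged for the caller to reject.
    """
    barcode = raw.strip()
    while len(barcode) > max_len and barcode.endswith("0"):
        barcode = barcode[:-1]
    return barcode
```

Stopping as soon as the length fits keeps legitimate trailing zeros of a valid 13-digit code intact.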
@@ -1,3 +0,0 @@
 # Baidu OCR API configuration
 BAIDU_API_KEY=your_api_key_here
 BAIDU_SECRET_KEY=your_secret_key_here
@@ -1,46 +1,29 @@
-# Environment
+# Python cache files
 .env
 # Python
 __pycache__/
-*.pyc
+*.py[cod]
-*.pyo
+*$py.class
 .pytest_cache/
 .venv/
-# Build & dist
+# Virtual environments
-build/
+venv/
-dist/
+env/
-release/
+ENV/
 *.spec
-# Logs & temp
+# Log files
-logs/
+logs/*.log
 logs/*.active
 *.log.*
 # Temporary files and caches
 data/temp/
-
+data/*.bak
-# Runtime data (all runtime outputs, caches, databases)
+*.bak
 data/
 # Claude Code / IDE
 .claude/
 .playwright-mcp/
 .trae/
 # Old project
 wework_xiaoai_bot/
 # Node.js
 node_modules/
 # Frontend build output
 web/backend/static/
 # Screenshots (from testing)
 *.png
 # OS/IDE
 .DS_Store
-Thumbs.db
+
 # Output files (optionally ignored)
 # data/output/
 # IDE files
 .idea/
 .vscode/
-
+*.swp
 *.swo 
@@ -1,39 +1,30 @@
-# Changelog
+# Changelog
-## [v2.2.0] - 2026-03-31
+## v1.1.0 (2025-05-30)
 ### Added
 - **UI Simplification**: Removed dedicated buttons for Rongcheng and Tobacco; all Excel orders now use the intelligent auto-routing.
 - **Enhanced Yang Biyue Support**: Fixed column mapping for Yang Biyue orders, ensuring standard fields (Barcode, Quantity, Price) are correctly extracted.
 - **Headless API Auto-Detect**: `headless_api.py` now automatically distinguishes between Image (OCR) and Excel (Direct) inputs based on file extension.
-### Fixed
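+### New Features
- **Yang Biyue Preprocessing**: Resolved issue where data was empty due to incorrect column renaming.
+- Added handling for the special barcode 6958620703716, with support for setting a specification and a barcode mapping at the same time
- **Interference Filtering**: Added logic to exclude distractor columns like "Settlement Unit" or "Base Quantity" during preprocessing.
+- Improved parsing of irregular specification formats (e.g. "IL*12", "6oo*12")
 - Added parsing of specifications that carry a weight unit (e.g. "5kg*6")
 - Added automatic calculation of the quantity from the amount and unit price when the quantity is empty
-### Removed
+### Fixes
- **Redundant Files**: Cleaned up `run.py`, `clean.py`, and unused CLI modules.
+- Fixed barcode mapping not taking effect after special handling
- **Legacy UI Elements**: Removed tobacco-specific keyboard shortcuts and help entries.
+- Fixed an error when merging purchase orders caused by the missing merge_all_purchase_orders method in OrderService
 - Fixed the barcode mapping dialog being unable to add special handling and a mapping at the same time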
-## [v2.1.0] - 2026-03-30
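+### Improvements
-### Added
+- Improved the BarcodeMapper class so that it supports special handling and barcode mapping at the same time
- **Intelligent Recognition**: Automated fingerprinting for Rongcheng Yigou, Tobacco, and Yang Biyue orders.
+- Improved the specification parsing logic, adding support for more units and formats
- **Auto-Routing**: `OrderService.process_excel` now automatically handles preprocessing without explicit flags.
+- Added visual markers for mapping relationships in the barcode mapping dialog
- **Headless API Enhancements**: `headless_api.py` updated to support the new intelligent recognition mode.
+- Updated the barcode mapping configuration file with more special barcode handling
- **Comprehensive Documentation**: Added `OPENCLAW_GUIDE.md` and `FINAL_UPDATE_REPORT.md`.
+- Improved the product validator to calculate the quantity automatically when it is empty but the unit price and amount are present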
-### Fixed
+## v1.0.0 (2025-05-01)
 - **Rongcheng Yigou**: Fixed barcode splitting issue where quantities were incorrectly distributed (30 to 5).
 - **Tobacco Orders**: Corrected unit price calculation (divided by 10) and quantity calculation (multiplied by 10).
 - **Identification Failure**: Fixed issue where `header=0` caused identification keywords at the very first row to be missed.
-## [v2.0.0] - 2026-03-25
+### Initial Release
-### Added
+- Basic OCR recognition
- **Headless API**: First release of `headless_api.py` for OpenClaw integration.
+- Excel processing
- **Price Validation**: Integration with PosPal item data for unit price auditing.
+- Purchase order merging
- **Asynchronous Logging**: GUI now uses a queue for log output to prevent UI freezing.
+- Tobacco order processing
-
+- Graphical user interface
 ## [v1.1.0] - 2026-03-10
 ### Added
 - **Rongcheng Yigou Support**: Initial support for Rongcheng Excel templates.
 - **Tobacco Support**: Initial support for Tobacco Excel templates.
 - **Excel Processor**: Refactored core processing logic into `ExcelProcessor`.
@@ -1,146 +0,0 @@
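 # CLAUDE.md - Yixuan OCR Order Processing System
 ## Project Overview
 The Yixuan OCR Order Processing System (orc-order-v2) is a purchase-order processing tool for retail and distribution scenarios.
 **Core flow**: image OCR → Excel normalization → template filling → merged export
 **Target system**: 银豹 (PosPal) POS system
 **Tech stack**: Python 3.9+, Tkinter, Pandas, Baidu OCR API, xlrd/xlwt/openpyxl
 ## Project Structure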
 ```
 orc-order-v2/
 ├── 启动器.py                    # 入口桩 (~13行, 仅导入 main)
 ├── headless_api.py              # CLI 自动化接口 (OpenClaw 对接)
 ├── build_exe.py                 # PyInstaller 打包脚本
 ├── config.ini                   # 全局配置 (API密钥、路径)
 ├── config/
 │   ├── config.ini               # 配置副本
 │   ├── barcode_mappings.json    # 条码映射规则
 │   └── suppliers_config.json    # 供应商配置 (列映射/清洗规则/计算规则)
 ├── app/
 │   ├── config/
 │   │   ├── settings.py          # ConfigManager 单例
 │   │   └── defaults.py          # 默认配置
 │   ├── core/
 │   │   ├── excel/
 │   │   │   ├── processor.py     # ExcelProcessor - 标准化转换核心
 │   │   │   ├── converter.py     # UnitConverter - 单位转换与规格推断
 │   │   │   ├── merger.py        # PurchaseOrderMerger - 采购单合并
 │   │   │   ├── validators.py    # ProductValidator
 │   │   │   └── handlers/        # 条码映射、单位转换处理器
 │   │   ├── handlers/
 │   │   │   ├── rule_engine.py   # 通用规则引擎 (split/extract/normalize/mark)
 │   │   │   ├── column_mapper.py # 列映射器
 │   │   │   ├── data_cleaner.py  # 数据清洗器
 │   │   │   └── calculator.py    # 计算器
 │   │   ├── ocr/
 │   │   │   ├── table_ocr.py     # OCRProcessor
 │   │   │   └── baidu_ocr.py     # BaiduOCRClient
 │   │   ├── processors/
 │   │   │   ├── base.py          # BaseProcessor 抽象基类
 │   │   │   ├── tobacco_processor.py
 │   │   │   ├── ocr_processor.py
 │   │   │   └── supplier_processors/
 │   │   │       └── generic_supplier_processor.py
 │   │   └── utils/
 │   │       ├── file_utils.py    # 文件操作工具
 │   │       ├── log_utils.py     # 日志工具
 │   │       ├── string_utils.py  # 字符串工具
 │   │       └── dialog_utils.py  # Tkinter 对话框工具
 │   ├── services/
 │   │   ├── order_service.py     # 订单服务 (智能路由分发)
 │   │   ├── ocr_service.py       # OCR 服务
 │   │   ├── processor_service.py # 处理器调度服务
 │   │   ├── tobacco_service.py   # 烟草公司专用服务
 │   │   └── special_suppliers_service.py  # 特殊供应商服务 (蓉城/杨碧月)
 │   └── ui/                      # GUI 模块 (从启动器.py拆分)
 │       ├── error_utils.py       # L0 错误对话框
 │       ├── theme.py             # L0 主题管理 (THEMES, create_modern_button)
 │       ├── logging_ui.py        # L0 日志队列与GUI日志处理器
 │       ├── ui_widgets.py        # L0 StatusBar, ProgressReporter, center_window
 │       ├── user_settings.py     # L1 用户设置与最近文件管理
 │       ├── result_previews.py   # L1 处理结果预览对话框
 │       ├── command_runner.py    # L1 命令执行器 (subprocess + 日志重定向)
 │       ├── file_operations.py   # L2 文件选择/清理/目录操作
 │       ├── action_handlers.py   # L2 业务操作 (OCR/Excel/合并/拖拽)
 │       ├── barcode_editor.py    # L2 条码映射编辑
 │       ├── config_dialog.py     # L3 系统设置对话框
 │       ├── shortcuts.py         # L3 键盘快捷键绑定
 │       └── main_window.py       # L4 main() 主窗口构建
 ├── templates/
 │   ├── 银豹-采购单模板.xls       # 输出模板
 │   └── 商品资料.xlsx             # 单价校验参考数据
 ├── data/
 │   ├── input/                   # 输入文件
 │   ├── output/                  # OCR 输出
 │   ├── result/                  # 最终采购单
 │   └── user_settings.json       # 用户设置
 └── docs/
    └── SYSTEM_ARCHITECTURE.md   # 系统架构文档
 ```
 ## Commands
 ```bash
 # GUI mode
 python 启动器.py
 # CLI mode (OpenClaw integration)
 python headless_api.py [input] [--excel|--tobacco|--rongcheng] [--barcode X --target Y]
 # Build the EXE
 python build_exe.py
 # Update a barcode mapping
 python headless_api.py --update-mapping --barcode 6920584471055 --target 6920584471017
 ```
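 ## Intelligent Supplier Recognition
 The system routes each file automatically by scanning the first 50 rows of the Excel sheet for characteristic content:
 | Supplier | Identifying feature | Preprocessing |
 |--------|----------|-----------|
 | Tobacco company | "专卖证号" or "510109104938" | Map columns B/E/G/H; quantity*10; unit price/10 |
 | Rongcheng Yigou | "RCDH" | Map columns E/N/Q/S; split multi-barcode rows and divide the quantity evenly |
 | Yang Biyue | "经手人" + "杨碧月" | Column alignment; unit conversion (件 → 瓶) |
 | Generic supplier | suppliers_config.json configuration | Column mapping + rule engine |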
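The routing table above amounts to a keyword scan over the first rows of the sheet. A minimal sketch, assuming the sheet has already been loaded as a list of rows of cell values; `detect_supplier` and the returned labels are hypothetical names used only for illustration, not the project's actual API:

```python
from typing import Iterable, List, Optional

SCAN_ROWS = 50  # the text says the first 50 rows are scanned


def detect_supplier(rows: Iterable[List[object]]) -> Optional[str]:
    """Return a supplier label based on keywords found in the first rows."""
    cells: List[str] = []
    for index, row in enumerate(rows):
        if index >= SCAN_ROWS:
            break
        cells.extend(str(cell) for cell in row if cell is not None)
    text = " ".join(cells)

    # Keywords are taken verbatim from the table above.
    if "专卖证号" in text or "510109104938" in text:
        return "tobacco"
    if "RCDH" in text:
        return "rongcheng"
    if "经手人" in text and "杨碧月" in text:
        return "yang_biyue"
    return None  # caller falls back to the generic suppliers_config.json rules
```

Returning `None` rather than raising leaves the generic, configuration-driven path as the default for unrecognized files.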
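 ## Configuration System
 - **ConfigManager** (`app/config/settings.py`): singleton; reads `config.ini` via configparser
 - **Supplier configuration** (`config/suppliers_config.json`): JSON; defines column mappings, cleaning rules, and calculation rules
 - **Barcode mappings** (`config/barcode_mappings.json`): barcode conversion rules that can be updated at runtime
 ## Key Conventions
 ### Output Format
 - 银豹 purchase order template: 4 columns, namely barcode (B), purchase quantity (C), gift quantity (D), purchase unit price (E)
 - Unit prices keep 4 decimal places and use xlwt.XFStyle
 - Purchase order file name: `采购单_{original file name}.xls`
 ### Unit Conversion Rules
 - Pack-level units "件"/"箱"/"提"/"盒" → quantity*pack size, unit price/pack size, unit → "瓶" (bottle)
 - Gifts: rows whose price is 0 or whose amount is 0 are marked as gifts
 - Barcode mapping takes priority over unit conversion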
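The unit conversion and gift rules above can be sketched as below. This is an illustration under stated assumptions: `OrderLine` and `normalize_line` are hypothetical names, the pack size is assumed to be already known from the specification, and the barcode-mapping priority rule is not modeled.

```python
from dataclasses import dataclass, replace

PACK_UNITS = {"件", "箱", "提", "盒"}  # pack-level units listed in the rule
BASE_UNIT = "瓶"


@dataclass(frozen=True)
class OrderLine:
    quantity: float
    unit_price: float
    amount: float
    unit: str
    pack_size: int  # items per pack, e.g. 12 for a specification of 1*12
    is_gift: bool = False


def normalize_line(line: OrderLine) -> OrderLine:
    """Convert pack-level quantity and price to per-item values."""
    # A zero price or a zero amount marks the row as a gift.
    is_gift = line.unit_price == 0 or line.amount == 0
    if line.unit in PACK_UNITS and line.pack_size > 0:
        return replace(
            line,
            quantity=line.quantity * line.pack_size,
            unit_price=round(line.unit_price / line.pack_size, 4),
            unit=BASE_UNIT,
            is_gift=is_gift,
        )
    return replace(line, is_gift=is_gift)
```

The 4-decimal rounding mirrors the output-format convention for unit prices stated earlier in this section.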
 ### Specification Inference
 - Inferred from the product name: "24入纸箱" → 1*24, "450g*15" → 1*15
 - Three-level specifications are supported: 1*5*12
 - OCR corrections: "IL" → "1L", "6oo" → "600"
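The inference rules above can be approximated with a few regular expressions. A rough sketch only: `fix_ocr` and `parse_spec` are hypothetical helpers, and treating a leading weight or volume such as "450g" as a single item is an assumption drawn from the "450g*15" → 1*15 example.

```python
import re
from typing import Optional, Tuple

# A leading part such as "450g", "5kg" or "1L" describes one item.
UNIT_PART = re.compile(r"^\d+(?:\.\d+)?\s*(?:kg|g|ml|l)$", re.IGNORECASE)


def fix_ocr(text: str) -> str:
    """Repair the two OCR confusions named above."""
    text = re.sub(r"(?<![A-Za-z0-9])IL", "1L", text)  # "IL" -> "1L"
    # Letter "o" directly after a digit is read as zero: "6oo" -> "600".
    return re.sub(r"(?<=\d)[oO]+", lambda m: "0" * len(m.group()), text)


def parse_spec(text: str) -> Optional[Tuple[int, ...]]:
    """Return the pack levels, e.g. (1, 5, 12), or None if not recognized."""
    cleaned = fix_ocr(text.strip())
    carton = re.search(r"(\d+)\s*入", cleaned)  # "24入纸箱" -> 1*24
    if carton:
        return (1, int(carton.group(1)))
    parts = [part.strip() for part in cleaned.split("*")]
    if len(parts) < 2:
        return None
    if UNIT_PART.match(parts[0]):
        parts[0] = "1"
    if not all(part.isdecimal() for part in parts):
        return None
    return tuple(int(part) for part in parts)
```

The last element of the returned tuple is the innermost count, so the total number of items per outer pack is the product of all elements.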
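 ## Known Technical Debt
 1. ~~**启动器.py too large**~~ (split into 13 `app/ui/` modules; the entry stub is only 13 lines)
 2. **Code duplication**: header detection, column mapping, and amount parsing are each implemented in several places
 3. **Inconsistent configuration**: config.ini, suppliers_config.json, and hard-coded paths are mixed
 4. **No tests**: the test directory is empty; there are no automated tests
 5. **Legacy format dependency**: xlrd/xlwt only support .xls and cannot write .xlsx
 6. **Plaintext API keys**: the Baidu OCR API keys in config.ini are not encrypted
 7. **Hard-coded path**: `template_folder = E:\2025Code\python\orc-order-v2\templates` in config.ini
 8. **Inconsistent logging**: `get_logger()` and `logging.getLogger()` are both in use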
@@ -0,0 +1,82 @@
 # -*- mode: python ; coding: utf-8 -*-
 block_cipher = None
 # 需要包含的数据文件
 added_files = [
    ('config.ini', '.'),
    ('config/barcode_mappings.json', 'config/'),
    ('config/config.ini', 'config/'),
    ('templates/银豹-采购单模板.xls', 'templates/'),
    ('app', 'app'),
 ]
 # 需要隐式导入的模块
 hidden_imports = [
    'tkinter',
    'tkinter.ttk',
    'tkinter.filedialog',
    'tkinter.messagebox',
    'tkinter.scrolledtext',
    'pandas',
    'numpy',
    'openpyxl',
    'xlrd',
    'xlwt',
    'xlutils',
    'requests',
    'configparser',
    'threading',
    'datetime',
    'json',
    're',
    'subprocess',
    'shutil',
    'app.config.settings',
    'app.services.ocr_service',
    'app.services.order_service',
    'app.services.tobacco_service',
    'app.core.utils.dialog_utils',
    'app.core.excel.converter',
 ]
 a = Analysis(
    ['启动器.py'],
    pathex=[],
    binaries=[],
    datas=added_files,
    hiddenimports=hidden_imports,
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    win_no_prefer_redirects=False,
    win_private_assemblies=False,
    cipher=block_cipher,
    noarchive=False,
 )
 pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)
 exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.zipfiles,
    a.datas,
    [],
    name='OCR订单处理系统',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    upx_exclude=[],
    runtime_tmpdir=None,
    console=False,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
 )
@@ -1,206 +1,43 @@
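-# Yixuan OCR Order Processing System
+# Yixuan-OCR Order Processing System
-A purchase-order processing tool for retail and distribution scenarios. It supports the full flow of image OCR → Excel normalization → template filling → merged export, with output adapted to the 银豹 (PosPal) POS system.
+A purchase-order processing system that combines OCR recognition, Excel processing, and order merging.
-## Core Features
+## Main Features
- **Intelligent supplier recognition**: scans the first 50 rows of an Excel file for characteristic content and routes it to the matching preprocessing logic (Rongcheng Yigou, tobacco company, Yang Biyue, etc.)
+- **OCR recognition**: recognizes product information in images, including barcode, name, quantity, and unit price
- **Image OCR**: calls the Baidu OCR table recognition API to turn purchase-order images into structured Excel files
+- **Excel processing**: turns OCR results into a normalized Excel purchase order
- **Rule engine**: supports automated rules such as column mapping, data cleaning, unit conversion, specification inference, and gift marking
+- **Purchase order merging**: merges multiple purchase orders and aggregates identical products
- **Barcode mapping**: configurable barcode conversion rules, with runtime editing and cloud sync
+- **Barcode mapping**: maps specific barcodes to other barcodes to meet the requirements of different systems
- **Unit price validation**: compares against `商品资料.xlsx` automatically; a price difference of more than 1.0 yuan triggers a warning
+- **Specification handling**: parses product specifications intelligently and converts units automatically
- **Cloud sync**: syncs configuration files (barcode mappings, supplier configuration, product data, purchase order template) across devices through the Gitea REST API
+- **Tobacco order processing**: dedicated handling of tobacco company orders
 - **Drag-and-drop one-click processing**: drop in an image or Excel file to run the whole OCR → normalization → merge flow
 - **CLI interface**: `headless_api.py` supports headless automated invocation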
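-## Quick Start
+## Technical Highlights
-### Desktop (GUI / CLI)
+- Developed in Python, with a graphical interface built on Tkinter
 - Modular design that is easy to extend and maintain
 - Automatic handling of various irregular data formats
 - Configuration file support for customizing processing parameters
 - Logging for easier troubleshooting
-```bash
+## Usage
 # Install dependencies
 pip install -r requirements.txt
-# GUI mode
+1. Run `启动器.py` to open the main window
-python 启动器.py
+2. Choose the appropriate function button as needed
 3. Follow the prompts to complete the data processing
-# CLI mode
+## System Requirements
 python headless_api.py data/input/xxx.xlsx
 python headless_api.py data/input/xxx.jpg --barcode 6920584471055 --target 6920584471017
-# Build the EXE
+- Python 3.8+
-python build_exe.py
+- Required third-party libraries: see `requirements.txt`
 ```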
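-### Web
+## Recent Updates
-```bash
+See the [changelog](CHANGELOG.md) for the latest version changes.
 # Backend dependencies
 cd web/backend && pip install -r requirements.txt
-# Frontend dependencies
+## Contributors
 cd web/frontend && npm install
-# Start the backend (port 8000)
+- 欢欢欢
 cd web && python -m uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload
-# Start the frontend dev server (port 5173)
+## Copyright
 cd web/frontend && npm run dev
-# Build the frontend into the backend static directory
+© 2025 Yixuan-OCR Order Processing System
 cd web/frontend && npm run build
 # After the build, open http://localhost:8000 directly
 # Production deployment (backend only; the frontend is already embedded)
 cd web && python -m uvicorn backend.main:app --host 0.0.0.0 --port 8000
 ```
 **Default account:** `admin` / `admin123` (changing the password after the first login is recommended)
 ## Project Structure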
 ```
 ├── 启动器.py                     # GUI 入口
 ├── headless_api.py               # CLI 自动化接口
 ├── config.ini                    # 全局配置（API密钥、路径、Gitea）
 ├── config/
 │   ├── config.ini                # 配置副本
 │   ├── barcode_mappings.json     # 条码映射规则
 │   └── suppliers_config.json     # 供应商配置（列映射/规则引擎）
 ├── app/
 │   ├── config/                   # 配置管理（ConfigManager 单例）
 │   ├── core/
 │   │   ├── excel/                # Excel 处理（标准化、转换、合并、校验）
 │   │   ├── handlers/             # 规则引擎、列映射、数据清洗、计算器
 │   │   ├── ocr/                  # 百度 OCR 客户端
 │   │   ├── processors/           # 处理器（通用/烟草/OCR）
 │   │   └── utils/                # 工具（日志、文件、字符串、云端同步、对话框）
 │   ├── services/                 # 业务服务（订单、OCR、处理器调度）
 │   └── ui/                       # GUI 模块（主题、日志、快捷键、主窗口）
 ├── web/                          # Web 端
 │   ├── backend/
 │   │   ├── main.py               # FastAPI 入口
 │   │   ├── auth/                 # JWT 认证（登录、Token、权限）
 │   │   ├── routers/              # API 路由（文件、处理、记忆、条码、同步、任务、日志）
 │   │   ├── services/             # 后端服务（任务管理、数据库、文件同步）
 │   │   └── middleware/           # HTTP 日志中间件
 │   └── frontend/
 │       ├── src/
 │       │   ├── views/            # 页面（Dashboard、Layout、文件管理、任务、日志等）
 │       │   ├── stores/           # Pinia 状态管理（auth、processing）
 │       │   ├── composables/      # 共享逻辑（useDebounce、useFileUtils、useFilePreview）
 │       │   ├── api.ts            # Axios 封装
 │       │   └── router/           # Vue Router 路由
 │       ├── package.json
 │       └── vite.config.ts
 ├── templates/
 │   ├── 银豹-采购单模板.xls        # 输出模板（条码/采购量/赠送量/单价）
 │   └── 商品资料.xlsx              # 单价校验参考数据
 ├── data/
 │   ├── input/                    # 输入文件
 │   ├── output/                   # OCR 输出
 │   ├── result/                   # 最终采购单
 │   └── web_data.db               # Web 端数据库（SQLite）
 └── tests/                        # 单元测试（191 个）
 ```
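 ## Web Features
 A browser-based management interface built on Vue 3 + Element Plus + FastAPI; it shares the same `data/` directory with the desktop app.
 ### Processing Center (Dashboard)
 - **One-click full flow**: after uploading an image or Excel file, run OCR → normalization → merge in one click
 - **Batch OCR / batch processing**: the OCR step or the Excel normalization step can be run on its own
 - **Real-time progress**: task progress, logs, and status changes are pushed over WebSocket
 - **Multi-task monitoring**: view the progress and logs of several running tasks at once
 - **Task retry**: for a failed task, view the error details and retry in one click
 ### File Management
 - **Image processing**: manage image files in `data/input/`; supports upload, preview, batch OCR, and batch purchase order generation
 - **Spreadsheet processing**: manage Excel files in `data/output/`; supports upload, preview, and batch normalization
 - **Purchase order management**: manage purchase orders in `data/result/`; supports preview, download, merge, and batch delete
 - **Real-time sync**: disk files are synced to the database on page load, so new files are visible immediately
 - **Clear processing cache**: delete processed output files so they can be processed again
 ### Tasks and Logs
 - **Task history**: view the status, progress, and logs of all processing tasks, with filtering by status and type
 - **HTTP logs**: records all API requests, with filtering by method and status code
 ### Memory Store
 - **Product memory**: learns product information automatically from OCR and processing results
 - **Confidence system**: rates the reliability of a memory automatically by how often it appears
 - **Search and management**: search, edit, and delete memory entries
 ### Barcode Mapping
 - **Mapping rule management**: add, edit, and delete barcode conversion rules
 - **Batch operations**: batch import and deletion of mappings
 ### Cloud Sync
 - **Gitea sync**: sync configuration files across devices through the Gitea REST API
 - **One-click push/pull**: choose files to push or pull; no git client is needed
 ### System Configuration
 - **Configuration editing**: edit the system configuration (API keys, paths, parameters) in the browser
 - **Change password**: change the Web login password
 ### UI/UX
 - **Responsive layout**: adapts to desktop and mobile; small screens switch to drawer navigation automatically
 - **Global error handling**: uncaught Vue errors show a user-facing message automatically
 - **Form validation**: actions such as changing the password have full input validation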
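 ## Intelligent Supplier Routing
 | Supplier | Identifying feature | Processing |
 |--------|----------|----------|
 | Tobacco company | "专卖证号" or "510109104938" | Map columns B/E/G/H; quantity×10; unit price÷10 |
 | Rongcheng Yigou | "RCDH" | Map columns E/N/Q/S; split multi-barcode rows and divide the quantity evenly |
 | Yang Biyue | "经手人" + "杨碧月" | Column alignment; unit conversion (件 → 瓶) |
 | Generic supplier | `suppliers_config.json` configuration | Column mapping + rule engine |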
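 ## Cloud Sync
 Configuration is synced across devices through the Gitea REST API; no git client is needed.
 **Files that can be synced:**
 - Barcode mappings (`barcode_mappings.json`)
 - Supplier configuration (`suppliers_config.json`)
 - Product data (`templates/商品资料.xlsx`)
 - Purchase order template (`templates/银豹-采购单模板.xls`)
 **Setup:**
 1. System settings → enter the Gitea URL, repository information, and access token
 2. Main window → "云端同步" (Cloud Sync) button → choose files to push or pull
 **Gitea repository:** `https://gitea.94kan.cn/houhuan/yixuan-sync-data`
 ## Configuration
 | Setting | File | Description |
 |--------|------|------|
 | API keys | `.env` or `config.ini` | Baidu OCR API; read from environment variables first |
 | Gitea token | `.env` or `config.ini` | Cloud sync token; read from environment variables first |
 | Supplier rules | `config/suppliers_config.json` | Column mappings, cleaning rules, calculation rules |
 | Barcode mappings | `config/barcode_mappings.json` | Barcode conversion rules; can be updated at runtime |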
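The "environment variables first" precedence in the table above can be illustrated with the standard-library `configparser`. This is a sketch rather than the project's `ConfigManager`: `apply_env_overrides` is a hypothetical helper, and the variable names `BAIDU_API_KEY`, `BAIDU_SECRET_KEY`, and `GITEA_TOKEN` are the ones that appear elsewhere in this diff.

```python
import configparser
import os
from typing import Mapping, Optional

# (section, option) in config.ini -> environment variable that overrides it
ENV_MAPPING = {
    ("API", "api_key"): "BAIDU_API_KEY",
    ("API", "secret_key"): "BAIDU_SECRET_KEY",
    ("Gitea", "token"): "GITEA_TOKEN",
}


def apply_env_overrides(
    config: configparser.ConfigParser,
    environ: Optional[Mapping[str, str]] = None,
) -> configparser.ConfigParser:
    """Overwrite sensitive options with non-empty environment values."""
    env = os.environ if environ is None else environ
    for (section, option), env_key in ENV_MAPPING.items():
        value = env.get(env_key, "").strip()
        if not value:
            continue  # keep whatever config.ini provided
        if not config.has_section(section):
            config.add_section(section)
        config.set(section, option, value)
    return config
```

Passing `environ` explicitly makes the precedence easy to exercise without touching the real process environment.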
 ## Build and Packaging
 ```bash
 pip install pyinstaller
 python build_exe.py
 # Output: dist/OCR订单处理系统.exe
 # Portable package: release/OCR订单处理系统.exe (includes the templates and product data)
 ```
 ## Testing
 ```bash
 python -m pytest tests/ -v
 ```
@@ -0,0 +1,5 @@
 """
 OCR订单处理系统 - 命令行接口
 -------------------------
 提供命令行工具，便于用户使用系统功能。
 """ 
@@ -0,0 +1,138 @@
 """
 Excel处理命令行工具
 ---------------
 提供Excel处理相关的命令行接口。
 """
 import os
 import sys
 import argparse
 from typing import List, Optional
 from ..config.settings import ConfigManager
 from ..core.utils.log_utils import get_logger, close_logger
 from ..services.order_service import OrderService
 logger = get_logger(__name__)
 def create_parser() -> argparse.ArgumentParser:
    """
    创建命令行参数解析器
    Returns:
        参数解析器
    """
    parser = argparse.ArgumentParser(description='Excel处理工具')
    # 通用选项
    parser.add_argument('--config', type=str, help='配置文件路径')
    # 子命令
    subparsers = parser.add_subparsers(dest='command', help='子命令')
    # 处理Excel命令
    process_parser = subparsers.add_parser('process', help='处理Excel文件')
    process_parser.add_argument('--input', type=str, help='输入Excel文件路径，如果不指定则处理最新的文件')
    # 查看命令
    list_parser = subparsers.add_parser('list', help='获取最新的Excel文件')
    return parser
 def process_excel(order_service: OrderService, input_file: Optional[str] = None) -> bool:
    """
    处理Excel文件
    Args:
        order_service: 订单服务
        input_file: 输入文件路径，如果为None则处理最新的文件
    Returns:
        处理是否成功
    """
    if input_file:
        if not os.path.exists(input_file):
            logger.error(f"输入文件不存在: {input_file}")
            return False
        result = order_service.process_excel(input_file)
    else:
        latest_file = order_service.get_latest_excel()
        if not latest_file:
            logger.warning("未找到可处理的Excel文件")
            return False
        logger.info(f"处理最新的Excel文件: {latest_file}")
        result = order_service.process_excel(latest_file)
    if result:
        logger.info(f"处理成功，输出文件: {result}")
        return True
    else:
        logger.error("处理失败")
        return False
 def list_latest_excel(order_service: OrderService) -> bool:
    """
    获取最新的Excel文件
    Args:
        order_service: 订单服务
    Returns:
        是否找到Excel文件
    """
    latest_file = order_service.get_latest_excel()
    if latest_file:
        logger.info(f"最新的Excel文件: {latest_file}")
        return True
    else:
        logger.info("未找到Excel文件")
        return False
 def main(args: Optional[List[str]] = None) -> int:
    """
    Excel处理命令行主函数
    Args:
        args: 命令行参数，如果为None则使用sys.argv
    Returns:
        退出状态码
    """
    parser = create_parser()
    parsed_args = parser.parse_args(args)
    if parsed_args.command is None:
        parser.print_help()
        return 1
    try:
        # 创建配置管理器
        config = ConfigManager(parsed_args.config) if parsed_args.config else ConfigManager()
        # 创建订单服务
        order_service = OrderService(config)
        # 根据命令执行不同功能
        if parsed_args.command == 'process':
            success = process_excel(order_service, parsed_args.input)
        elif parsed_args.command == 'list':
            success = list_latest_excel(order_service)
        else:
            parser.print_help()
            return 1
        return 0 if success else 1
    except Exception as e:
        logger.error(f"执行过程中发生错误: {e}")
        return 1
    finally:
        # 关闭日志
        close_logger(__name__)
 if __name__ == '__main__':
    sys.exit(main()) 
@@ -0,0 +1,147 @@
 """
 订单合并命令行工具
 --------------
 提供订单合并相关的命令行接口。
 """
 import os
 import sys
 import argparse
 from typing import List, Optional
 from ..config.settings import ConfigManager
 from ..core.utils.log_utils import get_logger, close_logger
 from ..services.order_service import OrderService
 logger = get_logger(__name__)
 def create_parser() -> argparse.ArgumentParser:
    """
    创建命令行参数解析器
    Returns:
        参数解析器
    """
    parser = argparse.ArgumentParser(description='订单合并工具')
    # 通用选项
    parser.add_argument('--config', type=str, help='配置文件路径')
    # 子命令
    subparsers = parser.add_subparsers(dest='command', help='子命令')
    # 合并命令
    merge_parser = subparsers.add_parser('merge', help='合并采购单')
    merge_parser.add_argument('--input', type=str, help='输入采购单文件路径列表，以逗号分隔，如果不指定则合并所有采购单')
    # 列出采购单命令
    list_parser = subparsers.add_parser('list', help='列出采购单文件')
    return parser
 def merge_orders(order_service: OrderService, input_files: Optional[str] = None) -> bool:
    """
    合并采购单
    Args:
        order_service: 订单服务
        input_files: 输入文件路径列表，以逗号分隔，如果为None则合并所有采购单
    Returns:
        合并是否成功
    """
    if input_files:
        # 分割输入文件列表
        file_paths = [path.strip() for path in input_files.split(',')]
        # 检查文件是否存在
        for path in file_paths:
            if not os.path.exists(path):
                logger.error(f"输入文件不存在: {path}")
                return False
        result = order_service.merge_orders(file_paths)
    else:
        # 获取所有采购单文件
        file_paths = order_service.get_purchase_orders()
        if not file_paths:
            logger.warning("未找到采购单文件")
            return False
        logger.info(f"合并 {len(file_paths)} 个采购单文件")
        result = order_service.merge_orders()
    if result:
        logger.info(f"合并成功，输出文件: {result}")
        return True
    else:
        logger.error("合并失败")
        return False
 def list_purchase_orders(order_service: OrderService) -> bool:
    """
    列出采购单文件
    Args:
        order_service: 订单服务
    Returns:
        是否有采购单文件
    """
    files = order_service.get_purchase_orders()
    if not files:
        logger.info("未找到采购单文件")
        return False
    logger.info(f"采购单文件 ({len(files)}):")
    for file in files:
        logger.info(f"  {file}")
    return True
 def main(args: Optional[List[str]] = None) -> int:
    """
    订单合并命令行主函数
    Args:
        args: 命令行参数，如果为None则使用sys.argv
    Returns:
        退出状态码
    """
    parser = create_parser()
    parsed_args = parser.parse_args(args)
    if parsed_args.command is None:
        parser.print_help()
        return 1
    try:
        # 创建配置管理器
        config = ConfigManager(parsed_args.config) if parsed_args.config else ConfigManager()
        # 创建订单服务
        order_service = OrderService(config)
        # 根据命令执行不同功能
        if parsed_args.command == 'merge':
            success = merge_orders(order_service, parsed_args.input)
        elif parsed_args.command == 'list':
            success = list_purchase_orders(order_service)
        else:
            parser.print_help()
            return 1
        return 0 if success else 1
    except Exception as e:
        logger.error(f"执行过程中发生错误: {e}")
        return 1
    finally:
        # 关闭日志
        close_logger(__name__)
 if __name__ == '__main__':
    sys.exit(main()) 
@@ -0,0 +1,164 @@
 """
 OCR命令行工具
 ----------
 提供OCR识别相关的命令行接口。
 """
 import os
 import sys
 import argparse
 from typing import List, Optional
 from ..config.settings import ConfigManager
 from ..core.utils.log_utils import get_logger, close_logger
 from ..services.ocr_service import OCRService
 logger = get_logger(__name__)
 def create_parser() -> argparse.ArgumentParser:
    """
    创建命令行参数解析器
    Returns:
        参数解析器
    """
    parser = argparse.ArgumentParser(description='OCR识别工具')
    # 通用选项
    parser.add_argument('--config', type=str, help='配置文件路径')
    # 子命令
    subparsers = parser.add_subparsers(dest='command', help='子命令')
    # 单文件处理命令
    process_parser = subparsers.add_parser('process', help='处理单个文件')
    process_parser.add_argument('--input', type=str, required=True, help='输入图片文件路径')
    # 批量处理命令
    batch_parser = subparsers.add_parser('batch', help='批量处理文件')
    batch_parser.add_argument('--batch-size', type=int, help='批处理大小')
    batch_parser.add_argument('--max-workers', type=int, help='最大线程数')
    # 查看未处理文件命令
    list_parser = subparsers.add_parser('list', help='列出未处理的文件')
    return parser
 def process_file(ocr_service: OCRService, input_file: str) -> bool:
    """
    处理单个文件
    Args:
        ocr_service: OCR服务
        input_file: 输入文件路径
    Returns:
        处理是否成功
    """
    if not os.path.exists(input_file):
        logger.error(f"输入文件不存在: {input_file}")
        return False
    if not ocr_service.validate_image(input_file):
        logger.error(f"输入文件无效: {input_file}")
        return False
    result = ocr_service.process_image(input_file)
    if result:
        logger.info(f"处理成功，输出文件: {result}")
        return True
    else:
        logger.error("处理失败")
        return False
 def process_batch(ocr_service: OCRService, batch_size: Optional[int] = None, max_workers: Optional[int] = None) -> bool:
    """
    批量处理文件
    Args:
        ocr_service: OCR服务
        batch_size: 批处理大小
        max_workers: 最大线程数
    Returns:
        处理是否成功
    """
    total, success = ocr_service.process_images_batch(batch_size, max_workers)
    if total == 0:
        logger.warning("没有找到需要处理的文件")
        return False
    logger.info(f"批量处理完成，总计: {total}，成功: {success}")
    return success > 0
 def list_unprocessed(ocr_service: OCRService) -> bool:
    """
    列出未处理的文件
    Args:
        ocr_service: OCR服务
    Returns:
        是否有未处理的文件
    """
    files = ocr_service.get_unprocessed_images()
    if not files:
        logger.info("没有未处理的文件")
        return False
    logger.info(f"未处理的文件 ({len(files)}):")
    for file in files:
        logger.info(f"  {file}")
    return True
 def main(args: Optional[List[str]] = None) -> int:
    """
    OCR命令行主函数
    Args:
        args: 命令行参数，如果为None则使用sys.argv
    Returns:
        退出状态码
    """
    parser = create_parser()
    parsed_args = parser.parse_args(args)
    if parsed_args.command is None:
        parser.print_help()
        return 1
    try:
        # 创建配置管理器
        config = ConfigManager(parsed_args.config) if parsed_args.config else ConfigManager()
        # 创建OCR服务
        ocr_service = OCRService(config)
        # 根据命令执行不同功能
        if parsed_args.command == 'process':
            success = process_file(ocr_service, parsed_args.input)
        elif parsed_args.command == 'batch':
            success = process_batch(ocr_service, parsed_args.batch_size, parsed_args.max_workers)
        elif parsed_args.command == 'list':
            success = list_unprocessed(ocr_service)
        else:
            parser.print_help()
            return 1
        return 0 if success else 1
    except Exception as e:
        logger.error(f"执行过程中发生错误: {e}")
        return 1
    finally:
        # 关闭日志
        close_logger(__name__)
 if __name__ == '__main__':
    sys.exit(main()) 
@@ -12,20 +12,14 @@ DEFAULT_CONFIG = {
        'timeout': '30',
        'max_retries': '3',
        'retry_delay': '2',
-        'api_url': 'https://aip.baidubce.com/rest/2.0/ocr/v1/table',
+        'api_url': 'https://aip.baidubce.com/rest/2.0/ocr/v1/table'
        'token_url': 'https://aip.baidubce.com/oauth/2.0/token',
        'form_ocr_url': 'https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/get_request_result'
    },
    'Paths': {
        'input_folder': 'data/input',
        'output_folder': 'data/output',
        'result_folder': 'data/result',
        'temp_folder': 'data/temp',
        'template_folder': 'templates',
-        'template_file': '银豹-采购单模板.xls',
+        'processed_record': 'data/processed_files.json'
        'processed_record': 'data/processed_files.json',
        'data_dir': 'data',
        'product_db': 'data/product_cache.db'
    },
    'Performance': {
        'max_workers': '4',
@@ -38,13 +32,6 @@ DEFAULT_CONFIG = {
        'max_file_size_mb': '4'
    },
    'Templates': {
-        'purchase_order': '银豹-采购单模板.xls',
+        'purchase_order': '银豹-采购单模板.xls'
        'item_data': '商品资料.xlsx'
    },
    'Gitea': {
        'base_url': 'https://gitea.94kan.cn',
        'owner': 'houhuan',
        'repo': 'yixuan-sync-data',
        'token': ''
    }
 } 
@@ -6,16 +6,12 @@
 import os
 import configparser
 import logging
 from typing import Dict, List, Optional, Any
 from dotenv import load_dotenv
 from ..core.utils.log_utils import get_logger
 from .defaults import DEFAULT_CONFIG
-# 加载 .env 文件
+logger = logging.getLogger(__name__)
 load_dotenv()
 logger = get_logger(__name__)
 class ConfigManager:
    """
@@ -33,23 +29,13 @@ class ConfigManager:
    def _init(self, config_file):
        """初始化配置管理器"""
-        # 计算应用根目录（不依赖 os.getcwd()）
+        self.config_file = config_file or 'config.ini'
        import sys
        if getattr(sys, 'frozen', False):
            # PyInstaller 打包后，根目录是 exe 所在目录
            self.app_root = os.path.dirname(sys.executable)
        else:
            # 源码运行，根目录是 app/config/ 的上两级
            self.app_root = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
        self.config_file = config_file or os.path.join(self.app_root, 'config.ini')
        self.config = configparser.ConfigParser()
        self.load_config()
    def load_config(self) -> None:
        """
        加载配置文件，如果不存在则创建默认配置
        API 密钥优先从环境变量 (.env) 读取
        """
        if not os.path.exists(self.config_file):
            self.create_default_config()
@@ -67,9 +53,6 @@ class ConfigManager:
                        if not self.config.has_option(section, option):
                            self.config.set(section, option, value)
                # API 密钥优先从环境变量读取
                self._override_from_env()
                # 保存更新后的配置
                self.save_config()
                logger.info(f"已加载并更新配置文件: {self.config_file}")
@@ -78,18 +61,6 @@ class ConfigManager:
                logger.info("使用默认配置")
                self.create_default_config(save=False)
    def _override_from_env(self) -> None:
        """从环境变量覆盖敏感配置"""
        env_mapping = {
            ('API', 'api_key'): 'BAIDU_API_KEY',
            ('API', 'secret_key'): 'BAIDU_SECRET_KEY',
            ('Gitea', 'token'): 'GITEA_TOKEN',
        }
        for (section, option), env_key in env_mapping.items():
            env_val = os.getenv(env_key, '').strip()
            if env_val:
                self.config.set(section, option, env_val)
    def create_default_config(self, save: bool = True) -> None:
        """创建默认配置"""
        for section, options in DEFAULT_CONFIG.items():
@@ -104,25 +75,13 @@ class ConfigManager:
            logger.info(f"已创建默认配置文件: {self.config_file}")
    def save_config(self) -> None:
-        """保存配置到文件（API 密钥不写入文件）"""
+        """保存配置到文件"""
        # 保存前临时清空 API 密钥，避免写入文件
        saved_keys = {}
        for option in ('api_key', 'secret_key'):
            try:
                saved_keys[option] = self.config.get('API', option, fallback='')
            except Exception:
                saved_keys[option] = ''
            self.config.set('API', option, '')
        try:
            with open(self.config_file, 'w', encoding='utf-8') as f:
                self.config.write(f)
            logger.info(f"配置已保存到: {self.config_file}")
-        finally:
+        except Exception as e:
-            # 恢复内存中的值（即使写入失败也恢复）
+            logger.error(f"保存配置文件时出错: {e}")
            for option, val in saved_keys.items():
                if val:
                    self.config.set('API', option, val)
    def get(self, section: str, option: str, fallback: Any = None) -> Any:
        """获取配置值"""
@@ -158,29 +117,25 @@ class ConfigManager:
        获取路径配置并确保它是一个有效的绝对路径
        如果create为True，则自动创建该目录
        """
-        from pathlib import Path
+        path = self.get(section, option, fallback)
        path_str = self.get(section, option, fallback)
        path = Path(path_str)
-        if not path.is_absolute():
+        if not os.path.isabs(path):
-            # 相对路径，转为绝对路径（相对于应用根目录）
+            # 相对路径，转为绝对路径
-            path = Path(self.app_root) / path
+            path = os.path.abspath(path)
-        if create:
+        if create and not os.path.exists(path):
            try:
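-                # Decide whether this is a file or a directory
+                # If this is a file path, create its parent directory
-                # A path with a suffix is treated as a file; create its parent directory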
+                if '.' in os.path.basename(path):
-                if path.suffix:
+                    directory = os.path.dirname(path)
-                    directory = path.parent
+                    if directory and not os.path.exists(directory):
-                    if not directory.exists():
+                        os.makedirs(directory, exist_ok=True)
-                        directory.mkdir(parents=True, exist_ok=True)
+                        logger.info(f"已创建目录: {directory}")
                        logger.info(f"已创建父目录: {directory}")
                else:
                    # Otherwise treat it as a directory path
-                    if not path.exists():
+                    os.makedirs(path, exist_ok=True)
-                        path.mkdir(parents=True, exist_ok=True)
+                    logger.info(f"已创建目录: {path}")
                        logger.info(f"已创建目录: {path}")
            except Exception as e:
                logger.error(f"创建目录失败: {path}, 错误: {e}")
-        return str(path.absolute())
+        return path 
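The `get_path` hunk above switches from resolving relative paths against the application root (`Path(self.app_root) / path`) to `os.path.abspath`, which resolves against the current working directory. As a rough sketch of the root-relative variant, assuming a hypothetical standalone helper `resolve_path` and a temporary directory standing in for the application root:

```python
import os
import tempfile
from pathlib import Path


def resolve_path(path_str, app_root, create=False):
    """Resolve path_str against app_root and optionally create the directory it needs."""
    path = Path(path_str)
    if not path.is_absolute():
        path = Path(app_root) / path
    if create:
        # A path with a suffix is treated as a file, so only its parent is created.
        target = path.parent if path.suffix else path
        target.mkdir(parents=True, exist_ok=True)
    return str(path)


# Hypothetical inputs: a temp dir plays the role of the application root.
app_root = tempfile.mkdtemp()
output_dir = resolve_path("data/output", app_root, create=True)
db_file = resolve_path("data/product_cache.db", app_root, create=True)
```

The result of this variant does not depend on which directory the process was started from, which is the main practical difference from `os.path.abspath`.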
@@ -1,609 +0,0 @@
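 """
 Product master data SQLite database + product memory store
 Memory store features:
 - Learns product data automatically after each processing step (confidence score plus consistency boost)
 - Fills in missing OCR fields from the memory store (adopted directly when conf > 50)
 - Price anomaly detection: a deviation of more than 2x triggers field completion; a deviation of more than 50% logs a warning
 - Bulk preload → in-memory operations → bulk write-back, for performance
 """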
 import os
 import json
 import sqlite3
 from datetime import datetime
 from typing import Dict, List, Optional, Tuple, Callable
 import pandas as pd
 from ..utils.log_utils import get_logger
 from ..utils.file_utils import smart_read_excel
 from ...core.handlers.column_mapper import ColumnMapper
 logger = get_logger(__name__)
 class ProductDatabase:
    """Product master data SQLite database + product memory store"""
    SCHEMA = """
    CREATE TABLE IF NOT EXISTS products (
        barcode TEXT PRIMARY KEY,
        name TEXT DEFAULT '',
        price REAL DEFAULT 0.0,
        unit TEXT DEFAULT '',
        updated_at TEXT,
        specification TEXT DEFAULT '',
        source TEXT DEFAULT 'template',
        confidence INTEGER DEFAULT 0,
        usage_count INTEGER DEFAULT 0,
        last_seen TEXT,
        avg_price REAL DEFAULT 0.0,
        min_price REAL DEFAULT 0.0,
        max_price REAL DEFAULT 0.0,
        price_count INTEGER DEFAULT 0
    );
    """
    _NEW_COLUMNS = {
        'specification': "TEXT DEFAULT ''",
        'source': "TEXT DEFAULT 'template'",
        'confidence': 'INTEGER DEFAULT 0',
        'usage_count': 'INTEGER DEFAULT 0',
        'last_seen': 'TEXT',
        'avg_price': 'REAL DEFAULT 0.0',
        'min_price': 'REAL DEFAULT 0.0',
        'max_price': 'REAL DEFAULT 0.0',
        'price_count': 'INTEGER DEFAULT 0',
    }
    def __init__(self, db_path: str, excel_source: str):
        self.db_path = db_path
        self.excel_source = excel_source
        self._ensure_db()
    def _connect(self) -> sqlite3.Connection:
        return sqlite3.connect(self.db_path)
    def _ensure_db(self):
        if os.path.exists(self.db_path):
            self._migrate_schema()
            return
        if not os.path.exists(self.excel_source):
            logger.warning(f"商品资料 Excel 不存在: {self.excel_source}")
            self._create_empty_db()
            return
        logger.info(f"首次运行，从 Excel 导入商品资料: {self.excel_source}")
        os.makedirs(os.path.dirname(self.db_path), exist_ok=True)
        self._create_empty_db()
        count = self.import_from_excel(self.excel_source)
        logger.info(f"商品资料导入完成: {count} 条记录")
    def _create_empty_db(self):
        conn = self._connect()
        try:
            conn.executescript(self.SCHEMA)
            conn.commit()
        finally:
            conn.close()
    def _migrate_schema(self):
        conn = self._connect()
        try:
            cursor = conn.execute("PRAGMA table_info(products)")
            existing_cols = {row[1] for row in cursor.fetchall()}
            for col_name, col_type in self._NEW_COLUMNS.items():
                if col_name not in existing_cols:
                    conn.execute(f"ALTER TABLE products ADD COLUMN {col_name} {col_type}")
                    logger.info(f"数据库迁移: 添加列 {col_name}")
            conn.commit()
        finally:
            conn.close()
    # ══════════════════════════════════════════════════════════════
    # Import
    # ══════════════════════════════════════════════════════════════
    def import_from_excel(self, excel_path: str) -> int:
        df = smart_read_excel(excel_path)
        if df is None or df.empty:
            return 0
        barcode_col = ColumnMapper.find_column(list(df.columns), 'barcode')
        if not barcode_col:
            return 0
        price_col = ColumnMapper.find_column(list(df.columns), 'unit_price')
        if not price_col:
            for col in df.columns:
                if '进货价' in str(col).strip():
                    price_col = col
                    break
        name_col = ColumnMapper.find_column(list(df.columns), 'name')
        unit_col = ColumnMapper.find_column(list(df.columns), 'unit')
        spec_col = ColumnMapper.find_column(list(df.columns), 'specification')
        now = datetime.now().isoformat()
        rows = []
        for _, row in df.iterrows():
            barcode = str(row.get(barcode_col, '')).strip()
            if not barcode or barcode == 'nan':
                continue
            price = 0.0
            if price_col:
                try:
                    p = row.get(price_col)
                    if p is not None and str(p).strip() not in ('', 'nan', 'None'):
                        price = float(p)
                except (ValueError, TypeError):
                    pass
            name = str(row.get(name_col, '')).strip() if name_col else ''
            if name == 'nan': name = ''
            unit = str(row.get(unit_col, '')).strip() if unit_col else ''
            if unit == 'nan': unit = ''
            spec = str(row.get(spec_col, '')).strip() if spec_col else ''
            if spec == 'nan': spec = ''
            # Records from the template source get confidence 50
            rows.append((barcode, name, price, unit, now, spec, 'template', 50, 0, now,
                         price, price, price, 1 if price > 0 else 0))
        if not rows:
            return 0
        conn = self._connect()
        try:
            conn.executemany(
                "INSERT OR REPLACE INTO products "
                "(barcode, name, price, unit, updated_at, specification, source, confidence, "
                "usage_count, last_seen, avg_price, min_price, max_price, price_count) "
                "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
                rows)
            conn.commit()
        finally:
            conn.close()
        return len(rows)
    def reimport(self) -> int:
        conn = self._connect()
        try:
            conn.execute("DELETE FROM products")
            conn.commit()
        finally:
            conn.close()
        return self.import_from_excel(self.excel_source)
    # ══════════════════════════════════════════════════════════════
    # Queries
    # ══════════════════════════════════════════════════════════════
    def get_price(self, barcode: str) -> Optional[float]:
        conn = self._connect()
        try:
            row = conn.execute("SELECT avg_price FROM products WHERE barcode=?",
                               (str(barcode).strip(),)).fetchone()
            return row[0] if row and row[0] else None
        finally:
            conn.close()
    def get_prices(self, barcodes: List[str]) -> Dict[str, float]:
        if not barcodes:
            return {}
        conn = self._connect()
        try:
            placeholders = ','.join('?' * len(barcodes))
            rows = conn.execute(
                f"SELECT barcode, avg_price FROM products WHERE barcode IN ({placeholders})",
                [str(b).strip() for b in barcodes]).fetchall()
            return {r[0]: r[1] for r in rows if r[1]}
        finally:
            conn.close()
    def count(self) -> int:
        conn = self._connect()
        try:
            return conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
        finally:
            conn.close()
    def get_memory(self, barcode: str) -> Optional[Dict]:
        conn = self._connect()
        conn.row_factory = sqlite3.Row
        try:
            row = conn.execute("SELECT * FROM products WHERE barcode=?",
                               (str(barcode).strip(),)).fetchone()
            return dict(row) if row else None
        finally:
            conn.close()
    def get_memories(self, barcodes: List[str]) -> Dict[str, Dict]:
        if not barcodes:
            return {}
        conn = self._connect()
        conn.row_factory = sqlite3.Row
        try:
            placeholders = ','.join('?' * len(barcodes))
            rows = conn.execute(
                f"SELECT * FROM products WHERE barcode IN ({placeholders})",
                [str(b).strip() for b in barcodes]).fetchall()
            return {r['barcode']: dict(r) for r in rows}
        finally:
            conn.close()
    def get_all_memories(self) -> List[Dict]:
        conn = self._connect()
        conn.row_factory = sqlite3.Row
        try:
            return [dict(row) for row in
                    conn.execute("SELECT * FROM products ORDER BY usage_count DESC, barcode").fetchall()]
        finally:
            conn.close()
    # ══════════════════════════════════════════════════════════════
    # Bulk preload: the performance-critical path
    # ══════════════════════════════════════════════════════════════
    def load_batch(self, barcodes: List[str]) -> Dict[str, Dict]:
        """Preload the memory records for a batch of barcodes into a dict: one SQL query, then in-memory operations only"""
        if not barcodes:
            return {}
        conn = self._connect()
        conn.row_factory = sqlite3.Row
        try:
            placeholders = ','.join('?' * len(barcodes))
            rows = conn.execute(
                f"SELECT * FROM products WHERE barcode IN ({placeholders})",
                [str(b).strip() for b in barcodes]).fetchall()
            return {r['barcode']: dict(r) for r in rows}
        finally:
            conn.close()
    # ══════════════════════════════════════════════════════════════
    # Learning logic: consistency boost + price range
    # ══════════════════════════════════════════════════════════════
    def learn_from_product(self, product: Dict, source: str = 'ocr',
                           memory: Dict[str, Dict] = None,
                           add_log: Callable = None) -> Optional[str]:
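        """
        Learn from a processing result and return a log string.
        memory: optional preloaded batch memory; when it contains the barcode, the lookup needs no DB query.
        """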
        barcode = str(product.get('barcode', '')).strip()
        if not barcode:
            return None
        name = str(product.get('name', ''))
        spec = str(product.get('specification', ''))
        unit = str(product.get('unit', ''))
        price = float(product.get('price', 0))
        now = datetime.now().isoformat()
        # Look up the existing record (prefer the preloaded memory)
        if memory is not None and barcode in memory:
            row = memory[barcode]
            old_name = row.get('name', '')
            old_spec = row.get('specification', '')
            old_unit = row.get('unit', '')
            old_conf = row.get('confidence', 0)
            old_count = row.get('usage_count', 0)
            old_avg = row.get('avg_price', 0) or 0
            old_min = row.get('min_price') or price
            old_max = row.get('max_price') or price
            pc = row.get('price_count', 0) or 0
            exists = True
        else:
            conn = self._connect()
            try:
                cursor = conn.execute(
                    "SELECT name, specification, unit, confidence, usage_count, "
                    "avg_price, min_price, max_price, price_count FROM products WHERE barcode=?",
                    (barcode,)).fetchone()
            finally:
                conn.close()
            if cursor is None:
                exists = False
            else:
                old_name, old_spec, old_unit, old_conf, old_count, old_avg, old_min, old_max, pc = cursor
                old_avg = old_avg or 0
                pc = pc or 0
                old_min = old_min if old_min is not None else price
                old_max = old_max if old_max is not None else price
                exists = True
        new_count = old_count + 1 if exists else 1
        # ── Confidence ──
        if source == 'user_confirmed':
            new_conf = 90
        elif source == 'template':
            new_conf = 50
        elif exists and old_conf < 50:
            # Consistency boost
            spec_match = bool(spec and old_spec and spec == old_spec)
            unit_match = bool(unit and old_unit and unit == old_unit)
            if spec_match and unit_match:
                boost = 10
            elif unit_match:
                boost = 5
            else:
                boost = 3
            new_conf = min(50, old_conf + boost)
        elif exists:
            new_conf = old_conf  # already >= 50: keep unchanged
        else:
            new_conf = 10  # new OCR record
        # ── Price range ──
        if price > 0:
            new_pc = (pc if exists else 0) + 1
            new_avg = ((old_avg * (new_pc - 1)) + price) / new_pc if exists else price
            new_min = min(old_min, price) if exists else price
            new_max = max(old_max, price) if exists else price
        else:
            new_avg = old_avg if exists else 0
            new_min = old_min if exists else 0
            new_max = old_max if exists else 0
            new_pc = pc if exists else 0
        # ── Write ──
        conn = self._connect()
        try:
            if not exists:
                conn.execute(
                    "INSERT INTO products (barcode, name, specification, unit, price, "
                    "source, confidence, usage_count, last_seen, updated_at, "
                    "avg_price, min_price, max_price, price_count) "
                    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
                    (barcode, name, spec, unit, price, source, new_conf, 1, now, now,
                     new_avg, new_min, new_max, new_pc))
                log = f"记忆库新增: {barcode} {name} 源={source} 可信度={new_conf}"
            else:
                # High-confidence sources overwrite all fields; low-confidence sources only fill empty ones
                if source in ('template', 'user_confirmed') or new_conf > 50:
                    conn.execute(
                        "UPDATE products SET name=?, specification=?, unit=?, price=?, "
                        "source=?, confidence=?, usage_count=?, last_seen=?, updated_at=?, "
                        "avg_price=?, min_price=?, max_price=?, price_count=? WHERE barcode=?",
                        (name or old_name, spec or old_spec, unit or old_unit, price,
                         source, new_conf, new_count, now, now,
                         new_avg, new_min, new_max, new_pc, barcode))
                else:
                    conn.execute(
                        "UPDATE products SET "
                        "name=CASE WHEN name='' THEN ? ELSE name END, "
                        "specification=CASE WHEN specification='' THEN ? ELSE specification END, "
                        "unit=CASE WHEN unit='' THEN ? ELSE unit END, "
                        "source=?, confidence=?, usage_count=?, last_seen=?, updated_at=?, "
                        "avg_price=?, min_price=?, max_price=?, price_count=? WHERE barcode=?",
                        (name, spec, unit, source, new_conf, new_count, now, now,
                         new_avg, new_min, new_max, new_pc, barcode))
                log = f"记忆库更新: {barcode} 可信度{old_conf if exists else 0}→{new_conf}"
                if price > 0:
                    log += f" 均价{new_avg:.4f}({new_pc}次)"
            conn.commit()
            # Update the in-memory dict (if one was passed in)
            if memory is not None and barcode in memory:
                memory[barcode].update({
                    'confidence': new_conf, 'usage_count': new_count,
                    'avg_price': new_avg, 'min_price': new_min,
                    'max_price': new_max, 'price_count': new_pc,
                    'name': name or old_name,
                    'specification': spec or old_spec,
                    'unit': unit or old_unit,
                })
            if add_log:
                add_log(log)
            return log
        finally:
            conn.close()
    def learn_from_products(self, products: List[Dict], source: str = 'ocr',
                            add_log: Callable = None) -> int:
        """Batch learning: preload in bulk first, then process items one by one; returns the number of records updated"""
        barcodes = [str(p.get('barcode', '')) for p in products if p.get('barcode')]
        memory = self.load_batch(barcodes)
        count = 0
        for p in products:
            try:
                result = self.learn_from_product(p, source, memory=memory, add_log=add_log)
                if result:
                    count += 1
            except Exception as e:
                logger.warning(f"学习商品记忆失败: {e}")
        return count
    # ══════════════════════════════════════════════════════════════
    # Memory assistance: OCR completion
    # ══════════════════════════════════════════════════════════════
    def _price_anomaly(self, product: Dict, mem: Dict) -> bool:
        """Price anomaly: the price is more than 2x the historical average or less than half of it"""
        price = float(product.get('price', 0))
        avg = mem.get('avg_price', 0)
        if not price or not avg:
            return False
        return price > avg * 2 or price < avg * 0.5
    def fill_from_memory(self, barcode: str, ocr_result: Dict,
                         memory: Dict[str, Dict] = None) -> Tuple[Dict, str]:
        """Fill in fields missing from the OCR result using the memory store. Returns (completed dict, log string)"""
        if memory:
            mem = memory.get(barcode)
        else:
            mem = self.get_memory(barcode)
        if not mem or mem.get('confidence', 0) < 10:
            return ocr_result, ""
        logs = []
        result = dict(ocr_result)
        conf = mem.get('confidence', 0)
        has_spec = result.get('specification')
        has_unit = result.get('unit')
        price = float(result.get('price', 0))
        if conf > 50 and not has_spec and mem.get('specification'):
            result['specification'] = mem['specification']
            logs.append(f"规格补全(可信{conf}): {barcode} → {mem['specification']}")
        elif not has_spec and mem.get('specification') and self._price_anomaly(result, mem):
            result['specification'] = mem['specification']
            logs.append(f"价格异常→规格补全: {barcode} 本次{price:.2f} vs 均价{mem['avg_price']:.2f} → {mem['specification']}")
        if conf > 50 and not has_unit and mem.get('unit'):
            result['unit'] = mem['unit']
            logs.append(f"单位补全(可信{conf}): {barcode} → {mem['unit']}")
        elif not has_unit and mem.get('unit') and self._price_anomaly(result, mem):
            result['unit'] = mem['unit']
            logs.append(f"价格异常→单位补全: {barcode} → {mem['unit']}")
        return result, "; ".join(logs)
    def price_warning(self, barcode: str, price: float,
                      memory: Dict[str, Dict] = None) -> Optional[str]:
        """Price warning: flag prices more than 50% above or below the historical average"""
        if memory:
            mem = memory.get(barcode)
        else:
            mem = self.get_memory(barcode)
        if not mem or not mem.get('avg_price'):
            return None
        avg = mem['avg_price']
        min_p = mem.get('min_price', avg)
        max_p = mem.get('max_price', avg)
        pc = mem.get('price_count', 0)
        if price > avg * 1.5 or price < avg * 0.5:
            return (f"单价预警: {barcode} 本次{price:.4f}元 vs "
                    f"历史均价{avg:.4f} (范围{min_p:.4f}~{max_p:.4f}, {pc}次)")
        return None
    # ══════════════════════════════════════════════════════════════
    # Manual editing
    # ══════════════════════════════════════════════════════════════
    def update_memory(self, barcode: str, fields: Dict) -> bool:
        barcode = str(barcode).strip()
        if not barcode:
            return False
        allowed = {'name', 'specification', 'unit', 'price', 'confidence'}
        updates = {k: v for k, v in fields.items() if k in allowed}
        if not updates:
            return False
        now = datetime.now().isoformat()
        set_clause = ', '.join(f"{k}=?" for k in updates)
        values = list(updates.values())
        extra_sql = ", source='user_confirmed'"
        if 'confidence' not in updates:
            extra_sql += ", confidence=90"
        conn = self._connect()
        try:
            conn.execute(
                f"UPDATE products SET {set_clause}{extra_sql}, updated_at=? WHERE barcode=?",
                values + [now, barcode])
            conn.commit()
            return conn.total_changes > 0
        finally:
            conn.close()
    def delete_memory(self, barcode: str) -> bool:
        conn = self._connect()
        try:
            conn.execute("DELETE FROM products WHERE barcode=?", (str(barcode).strip(),))
            conn.commit()
            return conn.total_changes > 0
        finally:
            conn.close()
    # ══════════════════════════════════════════════════════════════
    # Cloud sync
    # ══════════════════════════════════════════════════════════════
    def export_for_sync(self) -> Dict:
        conn = self._connect()
        try:
            cursor = conn.execute(
                "SELECT barcode, name, specification, unit, price, source, "
                "confidence, usage_count, last_seen, avg_price, min_price, max_price, price_count "
                "FROM products")
            result = {}
            for row in cursor.fetchall():
                result[row[0]] = {
                    'name': row[1], 'specification': row[2], 'unit': row[3],
                    'price': row[4], 'source': row[5], 'confidence': row[6],
                    'usage_count': row[7], 'last_seen': row[8],
                    'avg_price': row[9], 'min_price': row[10],
                    'max_price': row[11], 'price_count': row[12],
                }
            return result
        finally:
            conn.close()
    def import_from_sync(self, data: Dict) -> int:
        now = datetime.now().isoformat()
        count = 0
        conn = self._connect()
        try:
            for barcode, info in data.items():
                barcode = str(barcode).strip()
                if not barcode:
                    continue
                name = str(info.get('name', ''))
                spec = str(info.get('specification', ''))
                unit = str(info.get('unit', ''))
                price = float(info.get('price', 0))
                remote_source = str(info.get('source', 'ocr'))
                remote_conf = int(info.get('confidence', 50))
                remote_count = int(info.get('usage_count', 1))
                remote_seen = str(info.get('last_seen', now))
                remote_avg = float(info.get('avg_price', price))
                remote_min = float(info.get('min_price', price))
                remote_max = float(info.get('max_price', price))
                remote_pc = int(info.get('price_count', 1))
                row = conn.execute("SELECT confidence FROM products WHERE barcode=?",
                                   (barcode,)).fetchone()
                if row is None:
                    conn.execute(
                        "INSERT INTO products (barcode, name, specification, unit, price, "
                        "source, confidence, usage_count, last_seen, updated_at, "
                        "avg_price, min_price, max_price, price_count) "
                        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
                        (barcode, name, spec, unit, price, remote_source, remote_conf,
                         remote_count, remote_seen, now,
                         remote_avg, remote_min, remote_max, remote_pc))
                    count += 1
                else:
                    local_conf = row[0]
                    if remote_conf > local_conf:
                        conn.execute(
                            "UPDATE products SET name=?, specification=?, unit=?, price=?, "
                            "source=?, confidence=?, usage_count=?, last_seen=?, updated_at=?, "
                            "avg_price=?, min_price=?, max_price=?, price_count=? WHERE barcode=?",
                            (name, spec, unit, price, remote_source, remote_conf,
                             remote_count, remote_seen, now,
                             remote_avg, remote_min, remote_max, remote_pc, barcode))
                        count += 1
                    elif remote_conf == local_conf:
                        conn.execute(
                            "UPDATE products SET "
                            "name=CASE WHEN name='' THEN ? ELSE name END, "
                            "specification=CASE WHEN specification='' THEN ? ELSE specification END, "
                            "unit=CASE WHEN unit='' THEN ? ELSE unit END, "
                            "usage_count=MAX(usage_count, ?), updated_at=? WHERE barcode=?",
                            (name, spec, unit, remote_count, now, barcode))
                        count += 1
            conn.commit()
        finally:
            conn.close()
        return count
    def _export_memory_json(self, json_path=None):
        """Export the memory store as JSON (kept for compatibility with older callers)"""
        import os as _os
        if json_path is None:
            json_path = _os.path.join(_os.path.dirname(self.db_path), 'product_memory.json')
        data = self.export_for_sync()
        _os.makedirs(_os.path.dirname(json_path), exist_ok=True)
        with open(json_path, 'w', encoding='utf-8') as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
        return json_path
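The learning and warning rules in the removed `ProductDatabase` class are easier to check in isolation. The sketch below restates them as pure functions. The function names are hypothetical and chosen for illustration; the thresholds and boosts are copied from `learn_from_product`, `_price_anomaly`, and `price_warning` above.

```python
def next_confidence(source, exists, old_conf, spec, old_spec, unit, old_unit):
    """Confidence after one observation, following the branches in learn_from_product."""
    if source == "user_confirmed":
        return 90
    if source == "template":
        return 50
    if not exists:
        return 10  # first OCR sighting
    if old_conf >= 50:
        return old_conf  # stable once it has reached 50
    spec_match = bool(spec and old_spec and spec == old_spec)
    unit_match = bool(unit and old_unit and unit == old_unit)
    if spec_match and unit_match:
        boost = 10
    elif unit_match:
        boost = 5
    else:
        boost = 3
    return min(50, old_conf + boost)


def running_average(old_avg, old_count, price):
    """Incremental mean: returns (new_avg, new_count) after adding one price."""
    new_count = old_count + 1
    return (old_avg * old_count + price) / new_count, new_count


def is_price_anomaly(price, avg):
    """More than 2x the average or less than half of it; False when either value is missing."""
    if not price or not avg:
        return False
    return price > avg * 2 or price < avg * 0.5


def needs_price_warning(price, avg):
    """More than 50% above or below the average."""
    return price > avg * 1.5 or price < avg * 0.5
```

Under these rules an OCR-only record can climb to 50 at most; only the `user_confirmed` source pushes it above the `conf > 50` threshold used for direct field completion.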
@@ -285,16 +285,6 @@ class UnitConverter:
            logger.debug(f"解析规格: {spec}")
            # New: handle equation-style specs such as "1件=12桶/袋/盒...", normalized to 1*12
            eq_match = re.match(r'(\d+(?:\.\d+)?)\s*(?:件|箱|提|盒)\s*[=＝]\s*(\d+)\s*(?:瓶|桶|盒|支|个|袋|罐|包|卷)', spec)
            if eq_match:
                try:
                    level2 = int(eq_match.group(2))
                    logger.info(f"解析等式规格: {spec} -> 1*{level2}")
                    return 1, level2, None
                except ValueError:
                    pass
            # Handle three-level packaging, e.g. 1*5*12
            three_level_match = re.match(r'(\d+)[*](\d+)[*](\d+)', spec)
            if three_level_match:
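To make the equation-style spec handling in this hunk concrete, here is a small standalone sketch. It reuses the regular expression shown above unchanged; the function name `parse_equation_spec` and the constant `EQUATION_SPEC` are assumptions for illustration, not names from the project.

```python
import re

# Same pattern as in the hunk: "<number><outer unit> = <count><inner unit>".
EQUATION_SPEC = re.compile(
    r'(\d+(?:\.\d+)?)\s*(?:件|箱|提|盒)\s*[=＝]\s*(\d+)\s*(?:瓶|桶|盒|支|个|袋|罐|包|卷)'
)


def parse_equation_spec(spec):
    """Return (1, inner_count, None) for specs like '1件=12桶', else None."""
    match = EQUATION_SPEC.match(spec)
    if not match:
        return None
    return 1, int(match.group(2)), None
```

For instance, `parse_equation_spec("1件=12桶")` yields `(1, 12, None)`, while a star-separated spec such as `"1*5*12"` returns `None` and would fall through to the later branches.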
@@ -63,9 +63,8 @@ class JianUnitHandler(UnitHandler):
        Returns:
            Whether this handler can process the product
        """
-        unit = str(product.get('unit', '')).strip()
+        unit = product.get('unit', '')
-        # Match "件", "件、", "件装", etc.
+        return unit == '件'
        return unit == '件' or unit.startswith('件')
    def handle(self, product: Dict[str, Any], level1: int, level2: int, level3: Optional[int]) -> Dict[str, Any]:
        """
@@ -118,9 +117,8 @@ class BoxUnitHandler(UnitHandler):
        Returns:
            Whether this handler can process the product
        """
-        unit = str(product.get('unit', '')).strip()
+        unit = product.get('unit', '')
-        # Match "箱", "箱、", "箱装", etc.
+        return unit == '箱'
        return unit == '箱' or unit.startswith('箱')
    def handle(self, product: Dict[str, Any], level1: int, level2: int, level3: Optional[int]) -> Dict[str, Any]:
        """
@@ -173,8 +171,8 @@ class TiHeUnitHandler(UnitHandler):
        Returns:
            Whether this handler can process the product
        """
-        unit = str(product.get('unit', '')).strip()
+        unit = product.get('unit', '')
-        return unit in ['提', '盒'] or unit.startswith('提') or unit.startswith('盒')
+        return unit in ['提', '盒']
    def handle(self, product: Dict[str, Any], level1: int, level2: int, level3: Optional[int]) -> Dict[str, Any]:
        """
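The three `can_handle` hunks above differ only in how strictly the unit is matched. As an illustrative sketch with hypothetical helper names, the two behaviors can be compared side by side:

```python
def matches_unit_strict(product, units):
    """Exact comparison on the raw value, as in the '+' lines."""
    return product.get("unit", "") in units


def matches_unit_lenient(product, units):
    """Strip whitespace and accept prefixes such as '件装', as in the '-' lines."""
    unit = str(product.get("unit", "")).strip()
    return any(unit == u or unit.startswith(u) for u in units)
```

A unit read from OCR as `" 件"` or `"件装"` passes the lenient check but not the strict one, so the choice decides whether such rows reach the handler at all.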
@@ -11,12 +11,11 @@ import numpy as np
 import xlrd
 import xlwt
 from xlutils.copy import copy as xlcopy
-from typing import Dict, List, Optional, Tuple, Union, Any, Callable
+from typing import Dict, List, Optional, Tuple, Union, Any
 from datetime import datetime
 from ...config.settings import ConfigManager
 from ..utils.log_utils import get_logger
 from ..handlers.column_mapper import ColumnMapper
 from ..utils.file_utils import (
    ensure_dir,
    get_file_extension,
@@ -49,7 +48,7 @@ class PurchaseOrderMerger:
        # Work around ConfigParser objects having no get_path method
        try:
            # Get the output directory
-            self.output_dir = config.get_path('Paths', 'output_folder', fallback='data/output', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/output')
+            self.output_dir = config.get('Paths', 'output_folder', fallback='data/output')
            # Make sure the directory exists
            os.makedirs(self.output_dir, exist_ok=True)
@@ -96,8 +95,8 @@ class PurchaseOrderMerger:
        Returns:
            List of purchase order file paths
        """
-        # Purchase order files are saved in the result directory
+        # Purchase order files are saved in the data/result directory
-        result_dir = self.config.get_path('Paths', 'result_folder', fallback='data/result', create=True) if hasattr(self.config, 'get_path') else os.path.abspath('data/result')
+        result_dir = "data/result"
        logger.info(f"搜索目录 {result_dir} 中的采购单Excel文件")
        # Make sure the directory exists
@@ -141,45 +140,91 @@ class PurchaseOrderMerger:
            logger.debug(f"Excel文件的列名: {df.columns.tolist()}")
            # Special case: check whether a specific row should be read as the header row
-            header_row_idx = ColumnMapper.detect_header_row(df, max_rows=5, min_matches=3)
+            for header_row_idx in range(5):  # 检查前5行
-            if header_row_idx >= 0:
+                if len(df) <= header_row_idx:
-                logger.info(f"检测到表头在第 {header_row_idx+1} 行")
+                    continue
-                # 使用此行作为列名，数据从下一行开始
+                potential_header = df.iloc[header_row_idx].astype(str)
-                header_row = df.iloc[header_row_idx].astype(str)
+                header_keywords = ['条码', '条形码', '商品条码', '商品名称', '规格', '单价', '数量', '金额', '单位', '必填']
-                data_rows = df.iloc[header_row_idx+1:].reset_index(drop=True)
+                matches = sum(1 for keyword in header_keywords if any(keyword in str(val) for val in potential_header.values))
-                # 为每一列分配名称（避免重复的列名）
+                if matches >= 3:  # 如果至少匹配3个关键词，认为是表头
-                new_columns = []
+                    logger.info(f"检测到表头在第 {header_row_idx+1} 行")
                for i, col in enumerate(header_row):
                    col_str = str(col)
                    if col_str == 'nan' or col_str == 'None' or pd.isna(col):
                        new_columns.append(f"Col_{i}")
                    else:
                        new_columns.append(col_str)
-                # 使用新列名创建新的DataFrame
+                    # 使用此行作为列名，数据从下一行开始
-                data_rows.columns = new_columns
+                    header_row = potential_header
-                df = data_rows
+                    data_rows = df.iloc[header_row_idx+1:].reset_index(drop=True)
                logger.debug(f"重新构建的数据帧列名: {df.columns.tolist()}")
-            # 使用 ColumnMapper 统一查找列名（保留中文键名以兼容下游代码）
+                    # 为每一列分配名称（避免重复的列名）
                    new_columns = []
                    for i, col in enumerate(header_row):
                        col_str = str(col)
                        if col_str == 'nan' or col_str == 'None' or pd.isna(col):
                            new_columns.append(f"Col_{i}")
                        else:
                            new_columns.append(col_str)
                    # 使用新列名创建新的DataFrame
                    data_rows.columns = new_columns
                    df = data_rows
                    logger.debug(f"重新构建的数据帧列名: {df.columns.tolist()}")
                    break
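The inlined header detection above can be read as a keyword count over the first few rows. Below is a dependency-free sketch of that idea, working on plain lists instead of a DataFrame. The function names are hypothetical, and this is not the actual body of `ColumnMapper.detect_header_row`, which this diff does not show.

```python
HEADER_KEYWORDS = ['条码', '条形码', '商品条码', '商品名称', '规格', '单价', '数量', '金额', '单位', '必填']


def detect_header_row(rows, keywords=HEADER_KEYWORDS, max_rows=5, min_matches=3):
    """Return the index of the first row containing at least min_matches keywords, else -1."""
    for idx, row in enumerate(rows[:max_rows]):
        cells = [str(cell) for cell in row]
        matches = sum(1 for kw in keywords if any(kw in cell for cell in cells))
        if matches >= min_matches:
            return idx
    return -1


def build_column_names(header_row):
    """Use header cells as column names, substituting Col_<i> for empty cells."""
    names = []
    for i, cell in enumerate(header_row):
        text = str(cell)
        if text in ("nan", "None"):
            names.append(f"Col_{i}")
        else:
            names.append(text)
    return names


# Hypothetical sheet: a title row, then the real header, then data.
rows = [
    ["采购单", None, None, None],
    ["商品条码", "商品名称", "数量", "单价"],
    ["6901234567890", "可乐", 12, 2.5],
]
header_idx = detect_header_row(rows)
```

Note that overlapping keywords such as '条码' and '商品条码' both count toward the threshold, so a single cell can contribute more than one match.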
            # Define the candidate column-name mappings
            column_mapping = {
                '条码': ['条码', '条形码', '商品条码', 'barcode', '商品条形码', '条形码', '商品条码', '商品编码', '商品编号', '条形码', '条码（必填）'],
                '采购量': ['数量', '采购数量', '购买数量', '采购数量', '订单数量', '采购数量', '采购量（必填）', '采购量', '数量（必填）'],
                '采购单价': ['单价', '价格', '采购单价', '销售价', '采购单价（必填）', '单价（必填）', '价格（必填）'],
                '赠送量': ['赠送量', '赠品数量', '赠送数量', '赠品']
            }
            # Log all column names for debugging
            all_columns = df.columns.tolist()
            logger.info(f"列名: {all_columns}")
-            standard_to_chinese = {
-                'barcode': '条码',
-                'quantity': '采购量',
-                'unit_price': '采购单价',
-                'gift_quantity': '赠送量',
-            }
+            # Map the actual column names
            mapped_columns = {}
-            for std_name, chinese_name in standard_to_chinese.items():
-                matched = ColumnMapper.find_column(all_columns, std_name)
-                if matched:
-                    mapped_columns[chinese_name] = matched
-                    logger.info(f"列名映射: {matched} -> {chinese_name}")
+            for target_col, possible_names in column_mapping.items():
+                for col in all_columns:
+                    # Normalize the column name before matching
+                    col_str = str(col).strip()
+
                    # 直接匹配整个列名
                    if col_str in possible_names:
                        mapped_columns[target_col] = col
                        logger.info(f"直接匹配列名: {col_str} -> {target_col}")
                        break
                    # 移除列名中的空白字符进行比较
                    clean_col = re.sub(r'\s+', '', col_str)
                    for name in possible_names:
                        clean_name = re.sub(r'\s+', '', name)
                        # 完全匹配
                        if clean_col == clean_name:
                            mapped_columns[target_col] = col
                            logger.info(f"清理后匹配列名: {col_str} -> {target_col}")
                            break
                        # 部分匹配（列名包含关键词）
                        elif clean_name in clean_col:
                            mapped_columns[target_col] = col
                            logger.info(f"部分匹配列名: {col_str} -> {target_col}")
                            break
                    if target_col in mapped_columns:
                        break
                # 如果没有找到匹配，尝试模糊匹配
                if target_col not in mapped_columns:
                    for col in all_columns:
                        col_str = str(col).strip().lower()
                        for name in possible_names:
                            name_lower = name.lower()
                            if name_lower in col_str:
                                mapped_columns[target_col] = col
                                logger.info(f"模糊匹配列名: {col} -> {target_col}")
                                break
                        if target_col in mapped_columns:
                            break
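The matching loop above tries an exact match, then a whitespace-insensitive match, then a substring match. A minimal standalone sketch of that tiering follows. The function name `match_column` is hypothetical and not part of this repository. Unlike the loop in the diff, which tries every tier on one column before moving to the next, this sketch exhausts each tier across all columns first, so an exact match on a later column always beats a substring match on an earlier one.

```python
import re
from typing import List, Optional


def match_column(columns: List[str], candidates: List[str]) -> Optional[str]:
    """Return the first column that matches a candidate name, or None.

    Tiers: exact match, whitespace-insensitive match, case-insensitive substring.
    """
    def squash(text: object) -> str:
        # Remove every whitespace character, e.g. "商品 条码" -> "商品条码"
        return re.sub(r"\s+", "", str(text))

    # Tier 1: exact match on the stripped header text
    for col in columns:
        if str(col).strip() in candidates:
            return col

    # Tier 2: match after removing all whitespace from both sides
    squashed = {squash(name) for name in candidates}
    for col in columns:
        if squash(col) in squashed:
            return col

    # Tier 3: a candidate appears somewhere inside the header (case-insensitive)
    lowered = [squash(name).lower() for name in candidates]
    for col in columns:
        key = squash(col).lower()
        if any(name in key for name in lowered):
            return col

    return None
```

For example, with headers `["序号", "商品 条码", "采购数量(件)", "单价"]`, the candidates `["条码", "商品条码"]` resolve to `"商品 条码"` through the second tier.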
            # 如果找到了必要的列，重命名列
            if mapped_columns:
@@ -354,9 +399,9 @@ class PurchaseOrderMerger:
                # 采购单价（必填）- E列(4)
                output_sheet.write(r, price_col, float(row['采购单价']), price_style)
-            # 生成输出文件名，保存到result目录
+            # 生成输出文件名，保存到data/result目录
            timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
-            result_dir = self.config.get_path('Paths', 'result_folder', fallback='data/result', create=True) if hasattr(self.config, 'get_path') else os.path.abspath('data/result')
+            result_dir = "data/result"
            os.makedirs(result_dir, exist_ok=True)
            output_file = os.path.join(result_dir, f"合并采购单_{timestamp}.xls")
@@ -369,7 +414,7 @@ class PurchaseOrderMerger:
            logger.error(f"创建合并采购单时出错: {e}")
            return None
-    def process(self, file_paths: Optional[List[str]] = None, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def process(self, file_paths: Optional[List[str]] = None) -> Optional[str]:
        """
        处理采购单合并
@@ -382,11 +427,6 @@ class PurchaseOrderMerger:
        # 如果未指定文件路径，则获取所有采购单文件
        if file_paths is None:
            file_paths = self.get_purchase_orders()
-            try:
-                if progress_cb:
-                    progress_cb(97)
-            except Exception:
-                pass
        # 检查是否有文件需要合并
        if not file_paths:
@@ -398,22 +438,12 @@ class PurchaseOrderMerger:
        if merged_df is None:
            logger.error("合并采购单失败")
            return None
-        try:
-            if progress_cb:
-                progress_cb(98)
-        except Exception:
-            pass
        # 创建合并的采购单文件
        output_file = self.create_merged_purchase_order(merged_df)
        if output_file is None:
            logger.error("创建合并采购单文件失败")
            return None
-        try:
-            if progress_cb:
-                progress_cb(100)
-        except Exception:
-            pass
        # 记录已合并文件
        for file_path in file_paths:
@@ -11,7 +11,7 @@ import numpy as np
 import xlrd
 import xlwt
 from xlutils.copy import copy as xlcopy
-from typing import Dict, List, Optional, Tuple, Union, Any, Callable
+from typing import Dict, List, Optional, Tuple, Union, Any
 from datetime import datetime
 from ...config.settings import ConfigManager
@@ -25,12 +25,11 @@ from ..utils.file_utils import (
 )
 from ..utils.string_utils import (
    clean_string,
    clean_barcode,
    extract_number,
-    format_barcode,
+    format_barcode
-    parse_monetary_string
 )
 from .converter import UnitConverter
 from ..handlers.column_mapper import ColumnMapper
 logger = get_logger(__name__)
@@ -40,20 +39,19 @@ class ExcelProcessor:
    提取条码、单价和数量，并按照采购单模板的格式填充
    """
-    def __init__(self, config, product_db=None):
+    def __init__(self, config):
        """
        初始化Excel处理器
        Args:
            config: 配置信息
-            product_db: product database instance (optional; passed in by the caller so it can be shared)
        """
        self.config = config
        # 修复ConfigParser对象没有get_path方法的问题
        try:
            # 获取输入和输出目录
-            self.output_dir = config.get_path('Paths', 'output_folder', fallback='data/output', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/output')
+            self.output_dir = config.get('Paths', 'output_folder', fallback='data/output')
            self.temp_dir = config.get('Paths', 'temp_folder', fallback='data/temp')
            # 获取模板文件路径
@@ -75,18 +73,6 @@ class ExcelProcessor:
            # 加载单位转换器和配置
            self.unit_converter = UnitConverter()
-            # Product memory store
-            if product_db is not None:
-                self.product_db = product_db
-            else:
-                from ..db.product_db import ProductDatabase
-                db_path = config.get_path('Paths', 'product_db', fallback='data/product_cache.db') if hasattr(config, 'get_path') else 'data/product_cache.db'
-                tpl_folder = config.get('Paths', 'template_folder', fallback='templates')
-                item_data = config.get('Templates', 'item_data', fallback='商品资料.xlsx')
-                tpl_path = os.path.join(tpl_folder, item_data)
-                self.product_db = ProductDatabase(db_path, tpl_path)
            logger.info(f"初始化ExcelProcessor完成，模板文件: {self.template_path}")
        except Exception as e:
            logger.error(f"初始化ExcelProcessor失败: {e}")
@@ -135,6 +121,48 @@ class ExcelProcessor:
        logger.info(f"找到最新的Excel文件: {latest_file}")
        return latest_file
    def validate_barcode(self, barcode: Any) -> bool:
        """
        验证条码是否有效
        新增功能：如果条码是"仓库"，则返回False以避免误认为有效条码
        Args:
            barcode: 条码值
        Returns:
            条码是否有效
        """
        # 处理"仓库"特殊情况
        if isinstance(barcode, str) and barcode.strip() in ["仓库", "仓库全名"]:
            logger.warning(f"条码为仓库标识: {barcode}")
            return False
        # 清理条码格式
        barcode_clean = clean_barcode(barcode)
        # 对特定的错误条码进行修正（开头改6开头）
        if len(barcode_clean) > 8 and barcode_clean.startswith('5') and not barcode_clean.startswith('53'):
            barcode_clean = '6' + barcode_clean[1:]
            logger.info(f"修正条码前缀 5->6: {barcode} -> {barcode_clean}")
        # 验证条码长度
        if len(barcode_clean) < 8 or len(barcode_clean) > 13:
            logger.warning(f"条码长度异常: {barcode_clean}, 长度={len(barcode_clean)}")
            return False
        # 验证条码是否全为数字
        if not barcode_clean.isdigit():
            logger.warning(f"条码包含非数字字符: {barcode_clean}")
            return False
        # 对于序号9的特殊情况，允许其条码格式
        if barcode_clean == "5321545613":
            logger.info(f"特殊条码验证通过: {barcode_clean}")
            return True
        logger.debug(f"条码验证通过: {barcode_clean}")
        return True
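The checks in `validate_barcode` can be exercised in isolation. The sketch below restates them as a standalone function under two assumptions: `check_barcode` is a hypothetical name, and `clean_barcode` (whose body is not shown in this diff) is approximated by stripping every non-digit character. It returns the cleaned value alongside the verdict so the `5` to `6` prefix rewrite is visible to the caller.

```python
import re
from typing import Tuple


def check_barcode(raw: object) -> Tuple[bool, str]:
    """Return (is_valid, cleaned_barcode) following the rules in validate_barcode."""
    text = "" if raw is None else str(raw).strip()

    # Warehouse labels sometimes land in the barcode column; reject them outright
    if text in ("仓库", "仓库全名"):
        return False, text

    # Assumption: stand-in for clean_barcode, keeping digits only
    digits = re.sub(r"\D", "", text)

    # Rewrite a leading 5 to 6 for longer codes, except for the 53 prefix
    if len(digits) > 8 and digits.startswith("5") and not digits.startswith("53"):
        digits = "6" + digits[1:]

    # Accept only lengths from 8 to 13 digits
    if not 8 <= len(digits) <= 13:
        return False, digits

    return True, digits
```

Because the stand-in cleaner keeps digits only, the separate "all digits" check from the original method is implied and not repeated here.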
    def extract_barcode(self, df: pd.DataFrame) -> List[str]:
        """
        从数据帧中提取条码列名
@@ -145,7 +173,12 @@ class ExcelProcessor:
        Returns:
            可能的条码列名列表
        """
-        possible_barcode_columns = ColumnMapper.STANDARD_COLUMNS['barcode']
+        possible_barcode_columns = [
            '条码', '条形码', '商品条码', '商品条形码', 
            '商品编码', '商品编号', '条形码', '条码（必填）', 
            'barcode', 'Barcode', '编码', '条形码', '电脑条码',
            '条码ID', '产品条码', 'BarCode'
        ]
        found_columns = []
@@ -221,20 +254,6 @@ class ExcelProcessor:
                if not product['barcode']:
                    continue
-                # Check remark columns and filter out non-purchase rows (exchanges, returns, voided lines, etc.)
-                skip_row = False
-                for col in df.columns:
-                    col_str = str(col)
-                    if any(k in col_str for k in ['备注', '说明', '类型', '备注1']):
-                        val = str(row[col]).strip()
-                        # Filter common non-purchase keywords
-                        if any(k in val for k in ['换货', '退货', '作废', '减钱', '冲减', '赠品单', '补货']):
-                            logger.info(f"过滤非采购行: {product['barcode']} - {product.get('name', '')}, 原因: {col_str}包含 '{val}'")
-                            skip_row = True
-                            break
-                if skip_row:
-                    continue
                # 提取商品名称
                if '商品名称' in df.columns and not pd.isna(row['商品名称']):
                    product['name'] = str(row['商品名称']).strip()
@@ -262,22 +281,6 @@ class ExcelProcessor:
                    product['amount'] = row['小计']
                elif column_mapping.get('amount') and not pd.isna(row[column_mapping['amount']]):
                    product['amount'] = row[column_mapping['amount']]
-                # Detect gifts from the amount: zero, empty, or the letter o/O
-                amt = product.get('amount', None)
-                try:
-                    is_amt_gift = False
-                    if amt is None:
-                        is_amt_gift = True
-                    elif isinstance(amt, str):
-                        parsed = parse_monetary_string(amt)
-                        is_amt_gift = (parsed is None or parsed == 0.0)
-                    else:
-                        parsed = parse_monetary_string(amt)
-                        is_amt_gift = (parsed is not None and parsed == 0.0)
-                    if is_amt_gift:
-                        product['is_gift'] = True
-                except Exception:
-                    pass
                # 提取数量
                if '数量' in df.columns and not pd.isna(row['数量']):
@@ -376,17 +379,27 @@ class ExcelProcessor:
                # 如果数量为0但单价和金额都存在，计算数量 = 金额/单价
                if (product['quantity'] == 0 or product['quantity'] is None) and product['price'] > 0 and product['amount']:
                    try:
-                        amount = parse_monetary_string(product['amount'])
-                        if amount is not None and amount > 0:
+                        # Make sure the amount is numeric
+                        if isinstance(product['amount'], str):
+                            # Strip currency symbols and other non-numeric characters; keep digits, dots, and commas
+                            amount_str = re.sub(r'[^\d\.,]', '', product['amount'].strip())
+                            # Treat a comma as the decimal separator when there is no dot
+                            if ',' in amount_str and '.' not in amount_str:
+                                amount_str = amount_str.replace(',', '.')
+                            # With both a comma and a dot, the comma is usually a thousands separator
+                            elif ',' in amount_str and '.' in amount_str:
+                                amount_str = amount_str.replace(',', '')
+                            amount = float(amount_str)
+                        else:
+                            amount = float(product['amount'])
+                        # Compute the quantity
+                        if amount > 0:
                            quantity = amount / product['price']
                            logger.info(f"数量为空或为0，通过金额({amount})和单价({product['price']})计算得出数量: {quantity}")
                            product['quantity'] = quantity
                    except Exception as e:
                        logger.warning(f"通过金额和单价计算数量失败: {e}")
-                # Fill in missing fields from the product memory store
-                product = self._apply_memory(product)
                products.append(product)
            except Exception as e:
                logger.error(f"提取第{idx+1}行商品信息时出错: {e}", exc_info=True)
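The inline amount parsing introduced in this hunk can be factored into a small helper. The sketch below mirrors the same rules. The names `parse_amount` and `quantity_from_amount` are hypothetical and are not the removed `parse_monetary_string`, whose implementation is not shown in this diff.

```python
import re
from typing import Optional, Union


def parse_amount(value: Union[str, int, float, None]) -> Optional[float]:
    """Parse a monetary value such as '¥1,234.50' into a float, or return None."""
    if value is None:
        return None
    if not isinstance(value, str):
        return float(value)

    # Keep only digits, dots, and commas (drops currency symbols and units)
    cleaned = re.sub(r"[^\d.,]", "", value.strip())

    if "," in cleaned and "." not in cleaned:
        # A lone comma is treated as the decimal separator
        cleaned = cleaned.replace(",", ".")
    elif "," in cleaned and "." in cleaned:
        # With both present, commas are treated as thousands separators
        cleaned = cleaned.replace(",", "")

    try:
        return float(cleaned)
    except ValueError:
        return None


def quantity_from_amount(amount: Union[str, int, float, None], price: float) -> Optional[float]:
    """Derive quantity = amount / price when both are positive."""
    parsed = parse_amount(amount)
    if parsed is None or parsed <= 0 or price <= 0:
        return None
    return parsed / price
```

One consequence of the lone-comma rule is worth keeping in mind: `"1,234"` parses as `1.234`, not `1234`, so inputs that use a comma as a thousands separator without a decimal part are read as much smaller amounts. A minus sign is also stripped by the cleaning regex, as it is in the code above.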
@@ -395,59 +408,6 @@ class ExcelProcessor:
        logger.info(f"提取到 {len(products)} 个商品信息")
        return products
-    def _apply_memory(self, product: Dict) -> Dict:
-        """Look up the memory store and fill in fields that OCR missed or got wrong"""
-        barcode = product.get('barcode', '')
-        if not barcode:
-            return product
-        try:
-            memory = self.product_db.get_memory(barcode)
-        except Exception:
-            return product
-        if memory is None or memory.get('confidence', 0) < 80:
-            return product
-        # Fill in the specification
-        ocr_spec = product.get('specification', '')
-        mem_spec = memory.get('specification', '') or ''
-        if mem_spec and (not ocr_spec or self._is_spec_suspicious(ocr_spec)):
-            product['specification'] = mem_spec
-            logger.info(f"记忆修正规格: {barcode} '{ocr_spec}' -> '{mem_spec}'")
-        # Fill in the name
-        ocr_name = product.get('name', '')
-        mem_name = memory.get('name', '') or ''
-        if mem_name and not ocr_name:
-            product['name'] = mem_name
-            logger.info(f"记忆修正名称: {barcode} -> '{mem_name}'")
-        # Fill in the unit
-        ocr_unit = product.get('unit', '')
-        mem_unit = memory.get('unit', '') or ''
-        if mem_unit and not ocr_unit:
-            product['unit'] = mem_unit
-            logger.info(f"记忆修正单位: {barcode} -> '{mem_unit}'")
-        # Quantity and unit price are left untouched (they differ per order)
-        return product
-    def _is_spec_suspicious(self, spec: str) -> bool:
-        """Check whether a specification looks like OCR garbage"""
-        if not spec:
-            return True
-        # IL*12 (I confused with 1)
-        if re.search(r'^[Ii][Ll*]', spec):
-            return True
-        # 4.51*4 (L recognized as 1)
-        if re.search(r'\d+\.\d+1\*\d+', spec):
-            return True
-        # Contains unusual characters (common specification characters excluded)
-        if re.search(r'[^\d.*xX\-LlKkGgMm升毫瓶桶盒箱件提\s]', spec):
-            return True
-        return False
    def fill_template(self, products: List[Dict], output_file_path: str) -> bool:
        """
        填充采购单模板
@@ -490,8 +450,21 @@ class ExcelProcessor:
                # 如果数量为0但单价和金额都存在，计算数量 = 金额/单价
                if (quantity == 0 or quantity is None) and price > 0 and amount:
                    try:
-                        amount = parse_monetary_string(amount)
-                        if amount is not None and amount > 0:
+                        # Make sure the amount is numeric
+                        if isinstance(amount, str):
+                            # Strip currency symbols and other non-numeric characters; keep digits, dots, and commas
+                            amount_str = re.sub(r'[^\d\.,]', '', amount.strip())
+                            # Treat a comma as the decimal separator when there is no dot
+                            if ',' in amount_str and '.' not in amount_str:
+                                amount_str = amount_str.replace(',', '.')
+                            # With both a comma and a dot, the comma is usually a thousands separator
+                            elif ',' in amount_str and '.' in amount_str:
+                                amount_str = amount_str.replace(',', '')
+                            amount = float(amount_str)
+                        else:
+                            amount = float(amount)
+                        # Compute the quantity
+                        if amount > 0:
                            quantity = amount / price
                            logger.info(f"数量为空或为0，通过金额({amount})和单价({price})计算得出数量: {quantity}")
                            product['quantity'] = quantity
@@ -499,7 +472,7 @@ class ExcelProcessor:
                        logger.warning(f"通过金额和单价计算数量失败: {e}")
                # 判断是否为赠品（价格为0）
-                is_gift = bool(product.get('is_gift', False)) or (price == 0)
+                is_gift = price == 0
                logger.info(f"处理商品: 条码={barcode}, 数量={quantity}, 单价={price}, 是否赠品={is_gift}")
@@ -588,20 +561,77 @@ class ExcelProcessor:
            return False
    def _find_header_row(self, df: pd.DataFrame) -> Optional[int]:
-        """自动识别表头行，委托给 ColumnMapper.detect_header_row"""
-        result = ColumnMapper.detect_header_row(df, max_rows=30)
-        if result >= 0:
-            logger.info(f"找到表头行: 第{result+1}行")
-            return result
-        # 回退：找第一个非空行
+        """
+        Automatically detect the header row.
+
+        The header is identified with several rules:
+        1. Whether the row contains typical header keywords (barcode, product name, quantity, etc.)
+        2. Whether more than half of the cells in the row are non-empty
+        3. Whether the row has many string cells (headers are usually strings)
+        Args:
+            df: the DataFrame
+        Returns:
+            Index of the header row, or None if no header is found
+        """
+        # Candidate header keywords
+        header_keywords = [
+            '条码', '条形码', '商品条码', '商品名称', '名称', '数量', '单位', '单价', 
+            '规格', '商品编码', '采购数量', '采购单位', '商品', '品名'
+        ]
+        # Score for each row
+        row_scores = []
+        # Scan the first 10 rows (headers are rarely further down)
+        max_rows_to_check = min(10, len(df))
+        for row in range(max_rows_to_check):
+            row_data = df.iloc[row]
+            score = 0
+            # Check 1: keyword matches
+            for cell in row_data:
+                if isinstance(cell, str):
+                    cell_clean = str(cell).strip().lower()
+                    for keyword in header_keywords:
+                        if keyword.lower() in cell_clean:
+                            score += 5  # 5 points per matched keyword
+            # Check 2: share of non-empty cells
+            non_empty_cells = row_data.count()
+            if non_empty_cells / len(row_data) > 0.5:  # more than half of the cells have content
+                score += 2
+            # Check 3: share of string cells
+            string_cells = sum(1 for cell in row_data if isinstance(cell, str))
+            if string_cells / len(row_data) > 0.5:  # more than half of the cells are strings
+                score += 3
+            row_scores.append((row, score))
+            # Log the score of each row
+            logger.debug(f"第{row+1}行评分: {score}，内容: {row_data.values}")
+        # Sort by score, highest first
+        row_scores.sort(key=lambda x: x[1], reverse=True)
+        # Treat the best row as the header if it reaches the threshold
+        if row_scores and row_scores[0][1] >= 5:
+            best_row = row_scores[0][0]
+            logger.info(f"找到可能的表头行: 第{best_row+1}行，评分: {row_scores[0][1]}")
+            return best_row
+        # No clear header found: fall back to the first sufficiently filled row
        for row in range(len(df)):
-            if df.iloc[row].notna().sum() > 3:
+            if df.iloc[row].notna().sum() > 3:  # more than 3 non-empty cells
                logger.info(f"未找到明确表头，使用第一个有效行: 第{row+1}行")
                return row
        logger.warning("无法识别表头行")
        return None
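The scoring rules in `_find_header_row` can be tried without pandas. The sketch below applies the same weights (5 per keyword hit, 2 for a mostly filled row, 3 for a mostly string row) to plain lists. It is an illustration under stated assumptions: `score_row` and `find_header_row` are hypothetical names, the keyword list is shortened, empty cells are represented by `None` or `""` rather than NaN, and the "first sufficiently filled row" fallback is left out.

```python
from typing import List, Optional, Sequence

# Shortened keyword list for illustration; the method above uses a longer one
HEADER_KEYWORDS = ["条码", "商品名称", "数量", "单位", "单价", "规格"]


def score_row(row: Sequence[object]) -> int:
    """Score how much a row looks like a table header."""
    if not row:
        return 0
    score = 0
    # Rule 1: 5 points for every keyword found inside a string cell
    for cell in row:
        if isinstance(cell, str):
            text = cell.strip().lower()
            score += 5 * sum(1 for kw in HEADER_KEYWORDS if kw in text)
    # Rule 2: 2 points when more than half of the cells are filled
    filled = sum(1 for cell in row if cell is not None and cell != "")
    if filled / len(row) > 0.5:
        score += 2
    # Rule 3: 3 points when more than half of the cells are strings
    strings = sum(1 for cell in row if isinstance(cell, str))
    if strings / len(row) > 0.5:
        score += 3
    return score


def find_header_row(rows: List[Sequence[object]], max_rows: int = 10,
                    threshold: int = 5) -> Optional[int]:
    """Return the index of the best-scoring row, or None below the threshold."""
    scored = [(score_row(row), idx) for idx, row in enumerate(rows[:max_rows])]
    if not scored:
        return None
    # Highest score wins; ties go to the earliest row
    best_score, best_idx = max(scored, key=lambda item: (item[0], -item[1]))
    return best_idx if best_score >= threshold else None
```

With a title row, an empty row, a header row, and a data row, the header row is the only one to pass the threshold, since a data row with two numeric cells earns only the 2 points from the fill rule.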
-    def process_specific_file(self, file_path: str, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def process_specific_file(self, file_path: str) -> Optional[str]:
        """
        处理指定的Excel文件
@@ -619,11 +649,6 @@ class ExcelProcessor:
        try:
            # 读取Excel文件时不立即指定表头
-            if progress_cb:
-                try:
-                    progress_cb(92)
-                except Exception:
-                    pass
            df = pd.read_excel(file_path, header=None)
            logger.info(f"成功读取Excel文件: {file_path}, 共 {len(df)} 行")
@@ -635,58 +660,31 @@ class ExcelProcessor:
            logger.info(f"识别到表头在第 {header_row+1} 行")
-            # Reset the header in place to avoid a second read
-            if progress_cb:
-                try:
-                    progress_cb(94)
-                except Exception:
-                    pass
-            # Use the detected header row as column names and drop the rows before it
-            df.columns = df.iloc[header_row]
-            df = df.iloc[header_row + 1:].reset_index(drop=True)
-            logger.info(f"重新整理数据结构，共 {len(df)} 行有效数据")
+            # Re-read the Excel file with the detected header row
+            df = pd.read_excel(file_path, header=header_row)
+            logger.info(f"使用表头行重新读取数据，共 {len(df)} 行有效数据")
            # Extract product information
-            if progress_cb:
-                try:
-                    progress_cb(96)
-                except Exception:
-                    pass
            products = self.extract_product_info(df)
            if not products:
                logger.warning("未提取到有效商品信息")
                return None
-            # 生成输出文件名，保存到result目录
+            # 生成输出文件名，保存到data/result目录
            file_name = os.path.splitext(os.path.basename(file_path))[0]
-            result_dir = self.config.get_path('Paths', 'result_folder', fallback='data/result', create=True) if hasattr(self.config, 'get_path') else os.path.abspath('data/result')
+            result_dir = "data/result"
            os.makedirs(result_dir, exist_ok=True)
            output_file = os.path.join(result_dir, f"采购单_{file_name}.xls")
            # 填充模板并保存
            if self.fill_template(products, output_file):
-                # Learn product memory from the processed result
-                try:
-                    self.product_db.learn_from_products(products, source='ocr')
-                    self.product_db._export_memory_json()
-                    logger.info(f"已从处理结果学习 {len(products)} 条商品记忆")
-                except Exception as e:
-                    logger.warning(f"学习商品记忆失败: {e}")
                # 记录已处理文件
                self.processed_files[file_path] = output_file
                self._save_processed_files()
                # 不再自动打开输出目录
                logger.info(f"采购单已保存到: {output_file}")
-                if progress_cb:
-                    try:
-                        progress_cb(100)
-                    except Exception:
-                        pass
                return output_file
@@ -696,7 +694,7 @@ class ExcelProcessor:
            logger.error(f"处理Excel文件时出错: {file_path}, 错误: {e}")
            return None
-    def process_latest_file(self, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def process_latest_file(self) -> Optional[str]:
        """
        处理最新的Excel文件
@@ -710,7 +708,7 @@ class ExcelProcessor:
            return None
        # 处理文件
-        return self.process_specific_file(latest_file, progress_cb=progress_cb)
+        return self.process_specific_file(latest_file)
    def _detect_column_mapping(self, df: pd.DataFrame) -> Dict[str, str]:
        """
@@ -730,25 +728,44 @@ class ExcelProcessor:
            logger.error("未找到条码列，无法处理")
            return {}
-        # Use ColumnMapper for unified column lookup
-        mapped_columns = {'barcode': barcode_cols[0]}
+        # Define the column-name mapping
+        column_mapping = {
+            'name': ['商品名称', '名称', '品名', '商品', '商品名', '商品或服务名称', '品项名', '产品名称', '品项', '名 称'],
+            'specification': ['规格', '规格型号', '型号', '商品规格', '产品规格', '包装规格','规 格'],
+            'quantity': ['数量', '采购数量', '购买数量', '采购数量', '订单数量', '数量（必填）', '入库数', '入库数量','数 量'],
+            'unit': ['单位', '采购单位', '计量单位', '单位（必填）', '单位名称', '计价单位','单 位'],
+            'price': ['单价', '价格', '采购单价', '销售价', '进货价', '单价（必填）', '采购价', '参考价', '入库单价','单 价'],
+            'amount': ['金额', '小计', '总价', '合计金额', '小计金额', '金额（元）', '金额合计', '合计', '总额']
+        }
+        # Map column names to standard names
+        mapped_columns = {'barcode': barcode_cols[0]}  # use the first barcode column found
+        # Log the column mapping details
        logger.info(f"使用条码列: {mapped_columns['barcode']}")
-        # Internal key -> standard column name (processor.py uses price/amount as internal keys)
-        field_map = [
-            ('name', 'name'),
-            ('specification', 'specification'),
-            ('quantity', 'quantity'),
-            ('unit', 'unit'),
-            ('price', 'unit_price'),
-            ('amount', 'total_price'),
-        ]
-        for internal_key, standard_name in field_map:
-            matched = ColumnMapper.find_column(list(df.columns), standard_name)
-            if matched:
-                mapped_columns[internal_key] = matched
-                logger.info(f"找到{internal_key}列: {matched}")
+        for target, possible_names in column_mapping.items():
+            for col in df.columns:
+                col_str = str(col).strip()
+                for name in possible_names:
+                    if col_str == name:
+                        mapped_columns[target] = col
+                        logger.info(f"找到{target}列: {col}")
+                        break
+                if target in mapped_columns:
+                    break
+            # No exact match found: try a partial match
+            if target not in mapped_columns:
+                for col in df.columns:
+                    col_str = str(col).strip().lower()
+                    for name in possible_names:
+                        if name.lower() in col_str:
+                            mapped_columns[target] = col
+                            logger.info(f"找到{target}列(部分匹配): {col}")
+                            break
+                    if target in mapped_columns:
+                        break
        return mapped_columns
@@ -872,11 +889,6 @@ class ExcelProcessor:
            logger.debug(f"清理后的规格字符串: {spec_str}")
            # 新增：匹配“1件=12桶/袋/盒…”等等式规格，取右侧数量作为包装数量
            eq_match = re.search(r'(\d+(?:\.\d+)?)\s*(?:件|箱|提|盒)\s*[=＝]\s*(\d+)\s*(?:瓶|桶|盒|支|个|袋|罐|包|卷)', spec_str)
            if eq_match:
                return int(eq_match.group(2))
            # 匹配带单位的格式，如"5kg*6"、"450g*15"、"450ml*15"
            weight_pattern = r'(\d+(?:\.\d+)?)\s*(?:kg|KG|千克|公斤)[*×](\d+)'
            match = re.search(weight_pattern, spec_str)
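The two patterns visible in this hunk (the "1件=12桶" equation form and the weight form such as "5kg*6") can be checked in isolation. The sketch below reuses both regular expressions verbatim. The wrapper `package_count` is a hypothetical name, and the remaining patterns of the real method (for example gram or millilitre specifications) are outside this hunk and are not reproduced.

```python
import re
from typing import Optional

# Equation form: "<n> 件/箱/提/盒 = <count> 瓶/桶/..." (regex copied from the hunk)
EQUATION = re.compile(
    r'(\d+(?:\.\d+)?)\s*(?:件|箱|提|盒)\s*[=＝]\s*(\d+)\s*(?:瓶|桶|盒|支|个|袋|罐|包|卷)'
)
# Weight form: "<weight>kg*<count>" (regex copied from the hunk)
WEIGHT = re.compile(r'(\d+(?:\.\d+)?)\s*(?:kg|KG|千克|公斤)[*×](\d+)')


def package_count(spec: str) -> Optional[int]:
    """Return the units-per-package count encoded in a specification string."""
    eq = EQUATION.search(spec)
    if eq:
        # The right-hand side of the equation is the package count
        return int(eq.group(2))
    weight = WEIGHT.search(spec)
    if weight:
        # The number after * or × is the package count
        return int(weight.group(2))
    return None
```

A gram-based specification such as `"450g*15"` returns `None` here because neither of these two patterns covers it.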
@@ -0,0 +1,355 @@
 """
 单位转换器测试模块
 ---------------
 测试单位转换和条码映射逻辑
 """
 import os
 import sys
 import unittest
 from typing import Dict, Any
 # 添加项目根目录到Python路径
 sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '../../..')))
 from app.core.excel.converter import UnitConverter
 from app.core.excel.validators import ProductValidator
 class TestUnitConverter(unittest.TestCase):
    """
    测试单位转换器功能
    """
    def setUp(self):
        """
        测试前的准备工作
        """
        self.converter = UnitConverter()
    def test_jian_unit_conversion(self):
        """
        测试"件"单位的转换
        """
        # 准备测试数据
        product = {
            'barcode': '6954767400129',
            'name': '美汁源果粒橙1.8L*8瓶',
            'specification': '1.8L*8',
            'quantity': 1.0,
            'unit': '件',
            'price': 65.0
        }
        # 执行转换
        result = self.converter.process_unit_conversion(product)
        # 验证结果
        self.assertEqual(result['quantity'], 8.0)
        self.assertEqual(result['price'], 8.125)
        self.assertEqual(result['unit'], '瓶')
    def test_box_unit_conversion(self):
        """
        测试"箱"单位的转换
        """
        # 准备测试数据
        product = {
            'barcode': '6925303721244',
            'name': '统一鲜橙多2L*6瓶',
            'specification': '2L*6',
            'quantity': 1.0,
            'unit': '箱',
            'price': 43.0
        }
        # 执行转换
        result = self.converter.process_unit_conversion(product)
        # 验证结果
        self.assertEqual(result['quantity'], 6.0)
        self.assertAlmostEqual(result['price'], 43.0 / 6)
        self.assertEqual(result['unit'], '瓶')
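The arithmetic these two tests expect is: multiply the quantity by the last factor of the specification and divide the price by the same factor. The sketch below shows only that arithmetic. `convert_case_to_bottles` is a hypothetical function and not the project's `UnitConverter.process_unit_conversion`, whose implementation is not part of this diff. It covers only the "件" and "箱" units with a `*`-separated specification, and ignores the "提" rules and barcode mappings tested further down.

```python
from typing import Any, Dict


def convert_case_to_bottles(product: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a case-level line ("件" or "箱") into bottle-level quantity and price."""
    result = dict(product)  # work on a copy so the input is not mutated
    if product.get("unit") not in ("件", "箱"):
        return result
    try:
        # "1.8L*8" -> 8 bottles per case
        per_case = int(str(product["specification"]).split("*")[-1])
    except (KeyError, ValueError):
        return result
    if per_case <= 0:
        return result
    result["quantity"] = product["quantity"] * per_case
    result["price"] = product["price"] / per_case
    result["unit"] = "瓶"
    return result
```

For the first test case above, one case at 65.0 with specification `1.8L*8` becomes 8 bottles at 8.125 each. For the second, 43.0 / 6 is not exactly representable as a decimal literal, which is why a tolerance-based comparison is safer than exact equality for that price.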
    def test_tihe_unit_conversion_level3(self):
        """
        测试"提"单位的转换（三级规格）
        """
        # 准备测试数据（三级规格：1*6*4，表示1排6提，每提4瓶）
        product = {
            'barcode': '6921168509347',
            'name': '农夫山泉550ml*24瓶',
            'specification': '1*6*4',
            'quantity': 2.0,
            'unit': '提',
            'price': 16.0
        }
        # 执行转换
        result = self.converter.process_unit_conversion(product)
        # 验证结果：三级规格，提单位特殊处理，数量*最后一级
        self.assertEqual(result['quantity'], 8.0)  # 2提 * 4瓶/提
        self.assertEqual(result['price'], 4.0)     # 16元/提 ÷ 4瓶/提
        self.assertEqual(result['unit'], '瓶')
    def test_tihe_unit_conversion_level2(self):
        """
        测试"提"单位的转换（二级规格）
        """
        # 准备测试数据（二级规格：1*4，表示每件4提）
        product = {
            'barcode': '6921168509347',
            'name': '农夫山泉550ml*4瓶',
            'specification': '1*4',
            'quantity': 5.0,
            'unit': '提',
            'price': 10.0
        }
        # 执行转换
        result = self.converter.process_unit_conversion(product)
        # 验证结果：二级规格，提单位保持不变
        self.assertEqual(result['quantity'], 5.0)
        self.assertEqual(result['price'], 10.0)
        self.assertEqual(result['unit'], '提')
    def test_barcode_mapping(self):
        """
        测试条码映射
        """
        # 准备测试数据（使用需要被映射的条码）
        product = {
            'barcode': '6920584471055',  # 这个条码应映射到6920584471017
            'name': '测试映射条码商品',
            'specification': '1*12',
            'quantity': 1.0,
            'unit': '件',
            'price': 60.0
        }
        # 执行转换
        result = self.converter.process_unit_conversion(product)
        # 验证结果：条码应该被映射
        self.assertEqual(result['barcode'], '6920584471017')
        self.assertEqual(result['quantity'], 12.0)  # 同时处理件单位转换
        self.assertEqual(result['price'], 5.0)      # 60元/件 ÷ 12瓶/件
        self.assertEqual(result['unit'], '瓶')
    def test_special_barcode_multiplier(self):
        """
        测试特殊条码的倍数处理
        """
        # 准备测试数据（使用特殊条码）
        product = {
            'barcode': '6925019900087',  # 特殊条码：数量*10，单位转瓶
            'name': '特殊条码商品',
            'specification': '1*10',
            'quantity': 2.0,
            'unit': '箱',
            'price': 100.0
        }
        # 执行转换
        result = self.converter.process_unit_conversion(product)
        # 验证结果：特殊条码乘数应该生效
        self.assertEqual(result['quantity'], 20.0)  # 2箱 * 10倍数
        self.assertEqual(result['price'], 5.0)      # 100元/箱 ÷ 10倍数/箱
        self.assertEqual(result['unit'], '瓶')
 class TestProductValidator(unittest.TestCase):
    """
    测试商品数据验证器功能
    """
    def setUp(self):
        """
        测试前的准备工作
        """
        self.validator = ProductValidator()
    def test_validate_barcode(self):
        """
        测试条码验证
        """
        # 测试有效条码
        is_valid, barcode, error = self.validator.validate_barcode('6925303721244')
        self.assertTrue(is_valid)
        self.assertEqual(barcode, '6925303721244')
        self.assertIsNone(error)
        # 测试包含非数字字符的条码
        is_valid, barcode, error = self.validator.validate_barcode('6925303-721244')
        self.assertTrue(is_valid)
        self.assertEqual(barcode, '6925303721244')
        self.assertIsNone(error)
        # 测试5开头的条码修正
        is_valid, barcode, error = self.validator.validate_barcode('5925303721244')
        self.assertTrue(is_valid)
        self.assertEqual(barcode, '6925303721244')
        self.assertIsNone(error)
        # 测试过短的条码
        is_valid, barcode, error = self.validator.validate_barcode('12345')
        self.assertFalse(is_valid)
        self.assertEqual(barcode, '12345')
        self.assertIn("条码长度异常", error)
        # 测试仓库标识
        is_valid, barcode, error = self.validator.validate_barcode('仓库')
        self.assertFalse(is_valid)
        self.assertEqual(barcode, '仓库')
        self.assertEqual(error, "条码为仓库标识")
        # 测试空值
        is_valid, barcode, error = self.validator.validate_barcode(None)
        self.assertFalse(is_valid)
        self.assertEqual(barcode, "")
        self.assertEqual(error, "条码为空")
    def test_validate_quantity(self):
        """
        测试数量验证
        """
        # 测试有效数量
        is_valid, quantity, error = self.validator.validate_quantity(10)
        self.assertTrue(is_valid)
        self.assertEqual(quantity, 10.0)
        self.assertIsNone(error)
        # 测试字符串数量
        is_valid, quantity, error = self.validator.validate_quantity("25.5")
        self.assertTrue(is_valid)
        self.assertEqual(quantity, 25.5)
        self.assertIsNone(error)
        # 测试带单位的数量
        is_valid, quantity, error = self.validator.validate_quantity("30瓶")
        self.assertTrue(is_valid)
        self.assertEqual(quantity, 30.0)
        self.assertIsNone(error)
        # 测试零数量
        is_valid, quantity, error = self.validator.validate_quantity(0)
        self.assertFalse(is_valid)
        self.assertEqual(quantity, 0.0)
        self.assertIn("数量必须大于0", error)
        # 测试负数量
        is_valid, quantity, error = self.validator.validate_quantity(-5)
        self.assertFalse(is_valid)
        self.assertEqual(quantity, 0.0)
        self.assertIn("数量必须大于0", error)
        # Test a non-numeric value
        is_valid, quantity, error = self.validator.validate_quantity("abc")
        self.assertFalse(is_valid)
        self.assertEqual(quantity, 0.0)
        self.assertIn("数量不包含数字", error)
        # Test an empty value
        is_valid, quantity, error = self.validator.validate_quantity(None)
        self.assertFalse(is_valid)
        self.assertEqual(quantity, 0.0)
        self.assertEqual(error, "数量为空")
    def test_validate_price(self):
        """
        Test unit-price validation.
        """
        # Test a valid unit price
        is_valid, price, is_gift, error = self.validator.validate_price(12.5)
        self.assertTrue(is_valid)
        self.assertEqual(price, 12.5)
        self.assertFalse(is_gift)
        self.assertIsNone(error)
        # Test a unit price passed as a string
        is_valid, price, is_gift, error = self.validator.validate_price("8.0")
        self.assertTrue(is_valid)
        self.assertEqual(price, 8.0)
        self.assertFalse(is_gift)
        self.assertIsNone(error)
        # Test a zero unit price (gift)
        is_valid, price, is_gift, error = self.validator.validate_price(0)
        self.assertTrue(is_valid)
        self.assertEqual(price, 0.0)
        self.assertTrue(is_gift)
        self.assertIsNone(error)
        # Test the "赠品" (gift) marker
        is_valid, price, is_gift, error = self.validator.validate_price("赠品")
        self.assertTrue(is_valid)
        self.assertEqual(price, 0.0)
        self.assertTrue(is_gift)
        self.assertIsNone(error)
        # Test a negative unit price
        is_valid, price, is_gift, error = self.validator.validate_price(-5)
        self.assertFalse(is_valid)
        self.assertEqual(price, 0.0)
        self.assertTrue(is_gift)
        self.assertIn("单价不能为负数", error)
        # Test an empty value
        is_valid, price, is_gift, error = self.validator.validate_price(None)
        self.assertFalse(is_valid)
        self.assertEqual(price, 0.0)
        self.assertTrue(is_gift)
        self.assertEqual(error, "单价为空，视为赠品")
    def test_validate_product(self):
        """
        Test product data validation.
        """
        # Prepare test data (a valid product)
        product = {
            'barcode': '6954767400129',
            'name': '测试商品',
            'specification': '1*12',
            'quantity': 3.0,
            'price': 36.0,
            'unit': '件',
            'is_gift': False
        }
        # Validate a valid product
        result = self.validator.validate_product(product)
        self.assertEqual(result['barcode'], '6954767400129')
        self.assertEqual(result['quantity'], 3.0)
        self.assertEqual(result['price'], 36.0)
        self.assertFalse(result['is_gift'])
        # Validate a gift product
        gift_product = product.copy()
        gift_product['price'] = 0
        result = self.validator.validate_product(gift_product)
        self.assertEqual(result['price'], 0.0)
        self.assertTrue(result['is_gift'])
        # Validate a product that needs repair
        invalid_product = {
            'barcode': '5954767-400129',  # prefix needs fixing and non-digits removed
            'name': '测试商品',
            'specification': '1*12',
            'quantity': '2件',  # the number must be extracted
            'price': '赠品',     # gift marker
            'unit': '件',
            'is_gift': False
        }
        result = self.validator.validate_product(invalid_product)
        self.assertEqual(result['barcode'], '6954767400129')  # 5 -> 6, '-' removed
        self.assertEqual(result['quantity'], 2.0)  # number extracted
        self.assertEqual(result['price'], 0.0)     # gift price is 0
        self.assertTrue(result['is_gift'])         # flagged as a gift
 if __name__ == '__main__':
    unittest.main() 
@@ -9,7 +9,6 @@ import logging
 from typing import Dict, Any, Optional, List, Tuple, Union
 from ..utils.log_utils import get_logger
 from ..utils.string_utils import parse_monetary_string
 logger = get_logger(__name__)
@@ -157,8 +156,23 @@ class ProductValidator:
            if price_str in ["赠品", "gift", "赠送", "0", ""]:
                return True, 0.0, True, None
-            price_value = parse_monetary_string(price_str)
+            # Strip whitespace and non-numeric characters (keep dots and commas)
-            if price_value is None:
+            price_clean = re.sub(r'[^\d\.,]', '', price_str)
            # Handle dots and commas
            if ',' in price_clean and '.' not in price_clean:
                # Comma but no dot: treat the comma as the decimal point
                price_clean = price_clean.replace(',', '.')
            elif ',' in price_clean and '.' in price_clean:
                # Both comma and dot: drop the commas (assumed to be thousands separators)
                price_clean = price_clean.replace(',', '')
            if not price_clean:
                return False, 0.0, True, "单价不包含数字，视为赠品"
            try:
                price_value = float(price_clean)
            except ValueError:
                return False, 0.0, True, f"无法将单价 '{price}' 转换为数字，视为赠品"
        else:
            # Try a direct conversion
@@ -212,17 +226,6 @@ class ProductValidator:
            if error_msg:
                logger.info(error_msg)
        amount = product.get('amount', None)
        try:
            is_amount_gift = False
            parsed_amount = parse_monetary_string(amount)
            if parsed_amount is None or parsed_amount == 0.0:
                is_amount_gift = True
            if is_amount_gift:
                validated_product['is_gift'] = True
        except Exception:
            pass
        # Validate the quantity
        quantity = product.get('quantity', None)
        is_valid, fixed_quantity, error_msg = self.validate_quantity(quantity)
@@ -236,9 +239,18 @@ class ProductValidator:
            if fixed_price > 0 and amount is not None:
                try:
                    # Make sure the amount is numeric
-                    amount = parse_monetary_string(amount)
+                    if isinstance(amount, str):
-                    if amount is None:
+                        # Strip currency symbols and other non-numeric characters; keep digits, dots and commas
-                        raise ValueError("无法解析金额")
+                        amount_str = re.sub(r'[^\d\.,]', '', amount.strip())
                        # Replace the comma with a dot (when the comma is the decimal separator)
                        if ',' in amount_str and '.' not in amount_str:
                            amount_str = amount_str.replace(',', '.')
                        # Both comma and dot present (the comma is usually a thousands separator)
                        elif ',' in amount_str and '.' in amount_str:
                            amount_str = amount_str.replace(',', '')
                        amount = float(amount_str)
                    else:
                        amount = float(amount)
                    # Compute quantity = amount / unit price
                    if amount > 0:
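The inline cleanup added in the two hunks above (for `price` and for `amount`) follows the same comma/dot rules. A minimal standalone sketch of those rules is shown below; `normalize_amount` is a hypothetical helper name used only for illustration and is not part of the repository.

```python
import re
from typing import Optional


def normalize_amount(raw: str) -> Optional[float]:
    """Hypothetical helper mirroring the inline cleanup shown in the diff."""
    # Keep only digits, dots and commas (drops currency symbols, spaces, units).
    cleaned = re.sub(r'[^\d\.,]', '', raw.strip())
    if ',' in cleaned and '.' not in cleaned:
        # Comma only: treat it as the decimal separator.
        cleaned = cleaned.replace(',', '.')
    elif ',' in cleaned and '.' in cleaned:
        # Both present: treat commas as thousands separators.
        cleaned = cleaned.replace(',', '')
    if not cleaned:
        return None
    try:
        return float(cleaned)
    except ValueError:
        # e.g. "1,234,567" becomes "1.234.567", which float() rejects
        return None


print(normalize_amount("¥1,234.50"))  # 1234.5
print(normalize_amount("12,5"))       # 12.5
print(normalize_amount("1,234"))      # 1.234 (comma-only input is read as a decimal)
print(normalize_amount("赠品"))        # None
```

Two consequences of these rules are worth keeping in mind: a comma-only value such as `1,234` is read as `1.234` rather than `1234`, and the regular expression also strips a leading minus sign, so a negative string comes out positive in this sketch.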
@@ -1,9 +0,0 @@
 """
 Initialization file for the data-processing handlers module
 """
 from .data_cleaner import DataCleaner
 from .column_mapper import ColumnMapper
 from .calculator import DataCalculator
 __all__ = ['DataCleaner', 'ColumnMapper', 'DataCalculator']
@@ -1,378 +0,0 @@
 """
 Data calculation handler
 Provides calculation utilities such as quantity calculation, price calculation, and summary statistics
 """
 import pandas as pd
 import numpy as np
 from typing import Dict, Any, Optional, List, Union
 from ...core.utils.log_utils import get_logger
 logger = get_logger(__name__)
 class DataCalculator:
    """Data calculation handler.
    Provides standardized calculations and supports a variety of business calculation rules.
    """
    def __init__(self, config: Optional[Dict[str, Any]] = None):
        """初始化数据计算器
        Args:
            config: 计算配置
        """
        self.config = config or {}
        self.calculation_rules = []
    def add_rule(self, rule_type: str, **kwargs):
        """添加计算规则
        Args:
            rule_type: 规则类型
            **kwargs: 规则参数
        """
        rule = {'type': rule_type, **kwargs}
        self.calculation_rules.append(rule)
        logger.debug(f"添加计算规则: {rule_type}")
    def calculate(self, df: pd.DataFrame) -> pd.DataFrame:
        """执行数据计算
        Args:
            df: 输入数据
        Returns:
            计算后的数据
        """
        logger.info(f"开始数据计算，原始数据形状: {df.shape}")
        result_df = df.copy()
        for i, rule in enumerate(self.calculation_rules):
            try:
                logger.debug(f"执行计算规则 {i+1}/{len(self.calculation_rules)}: {rule['type']}")
                result_df = self._apply_rule(result_df, rule)
                logger.debug(f"规则执行完成，数据形状: {result_df.shape}")
            except Exception as e:
                logger.error(f"计算规则执行失败: {rule}, 错误: {e}")
                # Continue with the next rule instead of aborting the whole pipeline
                continue
        logger.info(f"数据计算完成，最终数据形状: {result_df.shape}")
        return result_df
    def _apply_rule(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """应用单个计算规则
        Args:
            df: 数据
            rule: 规则配置
        Returns:
            处理后的数据
        """
        rule_type = rule.get('type')
        if rule_type == 'multiply':
            return self._multiply(df, rule)
        elif rule_type == 'divide':
            return self._divide(df, rule)
        elif rule_type == 'add':
            return self._add(df, rule)
        elif rule_type == 'subtract':
            return self._subtract(df, rule)
        elif rule_type == 'formula':
            return self._formula(df, rule)
        elif rule_type == 'round':
            return self._round(df, rule)
        elif rule_type == 'sum':
            return self._sum(df, rule)
        elif rule_type == 'aggregate':
            return self._aggregate(df, rule)
        else:
            logger.warning(f"未知的计算规则类型: {rule_type}")
            return df
    def _multiply(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """乘法计算
        Args:
            df: 数据
            rule: 规则配置
        Returns:
            处理后的数据
        """
        source_column = rule.get('source_column')
        target_column = rule.get('target_column')
        factor = rule.get('factor', 1)
        if source_column and target_column:
            if source_column in df.columns:
                df[target_column] = df[source_column] * factor
                logger.debug(f"乘法计算: {source_column} * {factor} -> {target_column}")
            else:
                logger.warning(f"源列不存在: {source_column}")
        return df
    def _divide(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """除法计算
        Args:
            df: 数据
            rule: 规则配置
        Returns:
            处理后的数据
        """
        source_column = rule.get('source_column')
        target_column = rule.get('target_column')
        divisor = rule.get('divisor', 1)
        if source_column and target_column and divisor != 0:
            if source_column in df.columns:
                df[target_column] = df[source_column] / divisor
                logger.debug(f"除法计算: {source_column} / {divisor} -> {target_column}")
            else:
                logger.warning(f"源列不存在: {source_column}")
        elif divisor == 0:
            logger.error("除数不能为0")
        return df
    def _add(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """加法计算
        Args:
            df: 数据
            rule: 规则配置
        Returns:
            处理后的数据
        """
        columns = rule.get('columns', [])
        target_column = rule.get('target_column')
        constant = rule.get('constant', 0)
        if target_column:
            if isinstance(columns, str):
                columns = [columns]
            if columns:
                # 列相加
                valid_columns = [col for col in columns if col in df.columns]
                if valid_columns:
                    df[target_column] = df[valid_columns].sum(axis=1) + constant
                    logger.debug(f"加法计算: {valid_columns} + {constant} -> {target_column}")
                else:
                    logger.warning(f"没有有效的列用于加法计算: {columns}")
            else:
                # 只加常数
                if target_column in df.columns:
                    df[target_column] = df[target_column] + constant
                    logger.debug(f"加法计算: {target_column} + {constant}")
                else:
                    logger.warning(f"目标列不存在: {target_column}")
        return df
    def _subtract(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """减法计算
        Args:
            df: 数据
            rule: 规则配置
        Returns:
            处理后的数据
        """
        minuend = rule.get('minuend')  # 被减数列
        subtrahend = rule.get('subtrahend')  # 减数列
        target_column = rule.get('target_column')
        constant = rule.get('constant', 0)
        if target_column and minuend and minuend in df.columns:
            if subtrahend and subtrahend in df.columns:
                df[target_column] = df[minuend] - df[subtrahend] - constant
                logger.debug(f"减法计算: {minuend} - {subtrahend} - {constant} -> {target_column}")
            else:
                df[target_column] = df[minuend] - constant
                logger.debug(f"减法计算: {minuend} - {constant} -> {target_column}")
        else:
            logger.warning("减法计算参数不完整或列不存在")
        return df
    def _formula(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """公式计算
        Args:
            df: 数据
            rule: 规则配置
        Returns:
            处理后的数据
        """
        formula = rule.get('formula')
        target_column = rule.get('target_column')
        if formula and target_column:
            try:
                df[target_column] = df.eval(formula)
                logger.debug(f"公式计算: {formula} -> {target_column}")
            except Exception as e:
                logger.error(f"公式计算失败: {formula}, 错误: {e}")
        else:
            logger.warning("公式计算缺少公式或目标列")
        return df
    def _round(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """四舍五入
        Args:
            df: 数据
            rule: 规则配置
        Returns:
            处理后的数据
        """
        columns = rule.get('columns', [])
        decimals = rule.get('decimals', 0)
        if isinstance(columns, str):
            columns = [columns]
        target_columns = columns or df.select_dtypes(include=[np.number]).columns
        for col in target_columns:
            if col in df.columns and pd.api.types.is_numeric_dtype(df[col]):
                df[col] = df[col].round(decimals)
                logger.debug(f"四舍五入: {col} 保留 {decimals} 位小数")
        return df
    def _sum(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """求和计算
        Args:
            df: 数据
            rule: 规则配置
        Returns:
            处理后的数据
        """
        columns = rule.get('columns', [])
        target_column = rule.get('target_column')
        group_by = rule.get('group_by')
        if isinstance(columns, str):
            columns = [columns]
        if group_by and group_by in df.columns:
            # Grouped sum
            # NOTE: sum_result is computed but never written back to df or returned,
            # so this branch currently has no effect on the output.
            if columns:
                for col in columns:
                    if col in df.columns:
                        sum_result = df.groupby(group_by)[col].sum()
                        logger.debug(f"分组求和: {col} 按 {group_by} 分组")
            else:
                # Grouped sum over all numeric columns
                numeric_columns = df.select_dtypes(include=[np.number]).columns
                sum_result = df.groupby(group_by)[numeric_columns].sum()
                logger.debug(f"分组求和: 所有数值列 按 {group_by} 分组")
        else:
            # 总体求和
            if columns:
                valid_columns = [col for col in columns if col in df.columns]
                if valid_columns and target_column:
                    df[target_column] = df[valid_columns].sum(axis=1)
                    logger.debug(f"求和计算: {valid_columns} -> {target_column}")
            else:
                # 所有数值列求和
                numeric_columns = df.select_dtypes(include=[np.number]).columns
                if target_column and len(numeric_columns) > 0:
                    df[target_column] = df[numeric_columns].sum(axis=1)
                    logger.debug(f"求和计算: {list(numeric_columns)} -> {target_column}")
        return df
    def _aggregate(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """聚合计算
        Args:
            df: 数据
            rule: 规则配置
        Returns:
            处理后的数据
        """
        group_by = rule.get('group_by')
        aggregations = rule.get('aggregations', {})
        if group_by and group_by in df.columns:
            # Build the aggregation dictionary
            agg_dict = {}
            for column, func in aggregations.items():
                if column in df.columns:
                    # Accept a single function name or a list of them
                    if isinstance(func, (str, list)):
                        agg_dict[column] = func
            if agg_dict:
                result = df.groupby(group_by).agg(agg_dict)
                logger.debug(f"聚合计算: 按 {group_by} 分组, 聚合: {agg_dict}")
                return result.reset_index()
        return df
    # Convenience methods
    def multiply(self, source_column: str, target_column: str, factor: float):
        """乘法计算"""
        self.add_rule('multiply', source_column=source_column, 
                     target_column=target_column, factor=factor)
        return self
    def divide(self, source_column: str, target_column: str, divisor: float):
        """除法计算"""
        self.add_rule('divide', source_column=source_column, 
                     target_column=target_column, divisor=divisor)
        return self
    def add(self, columns: Union[str, List[str]], target_column: str, constant: float = 0):
        """加法计算"""
        self.add_rule('add', columns=columns, target_column=target_column, constant=constant)
        return self
    def subtract(self, minuend: str, target_column: str, 
                  subtrahend: Optional[str] = None, constant: float = 0):
        """减法计算"""
        self.add_rule('subtract', minuend=minuend, target_column=target_column,
                     subtrahend=subtrahend, constant=constant)
        return self
    def formula(self, formula: str, target_column: str):
        """公式计算"""
        self.add_rule('formula', formula=formula, target_column=target_column)
        return self
    def round_columns(self, columns: Optional[Union[str, List[str]]] = None, decimals: int = 0):
        """四舍五入"""
        self.add_rule('round', columns=columns, decimals=decimals)
        return self
    def sum_columns(self, columns: Optional[Union[str, List[str]]] = None, 
                   target_column: Optional[str] = None, group_by: Optional[str] = None):
        """求和计算"""
        self.add_rule('sum', columns=columns, target_column=target_column, group_by=group_by)
        return self
    def aggregate(self, group_by: str, aggregations: Dict[str, Union[str, List[str]]]):
        """聚合计算"""
        self.add_rule('aggregate', group_by=group_by, aggregations=aggregations)
        return self
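The `if/elif` chain in `_apply_rule` maps a rule type to a handler method. One common alternative is dictionary dispatch. The sketch below illustrates that pattern on plain lists of dicts instead of a DataFrame so that it stays self-contained; `HANDLERS`, `apply_rules`, and the row-based `multiply`/`divide` functions are illustrative names, not code from the repository.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, float]
Rule = Dict[str, Any]


def multiply(rows: List[Row], rule: Rule) -> List[Row]:
    """target = source * factor for every row that has the source key."""
    src, dst = rule['source_column'], rule['target_column']
    for row in rows:
        if src in row:
            row[dst] = row[src] * rule.get('factor', 1)
    return rows


def divide(rows: List[Row], rule: Rule) -> List[Row]:
    """target = source / divisor; a zero divisor leaves the rows untouched."""
    src, dst = rule['source_column'], rule['target_column']
    divisor = rule.get('divisor', 1)
    if divisor == 0:
        return rows
    for row in rows:
        if src in row:
            row[dst] = row[src] / divisor
    return rows


# Lookup table standing in for the if/elif chain.
HANDLERS: Dict[str, Callable[[List[Row], Rule], List[Row]]] = {
    'multiply': multiply,
    'divide': divide,
}


def apply_rules(rows: List[Row], rules: List[Rule]) -> List[Row]:
    result = [dict(row) for row in rows]  # work on a copy, similar to df.copy()
    for rule in rules:
        handler = HANDLERS.get(rule.get('type'))
        if handler is None:
            continue  # unknown rule type: skip it and keep going
        result = handler(result, rule)
    return result


rows = [{'price': 1.25, 'amount': 10.0}]
rules = [
    {'type': 'multiply', 'source_column': 'price', 'target_column': 'case_price', 'factor': 4},
    {'type': 'divide', 'source_column': 'amount', 'target_column': 'half', 'divisor': 2},
    {'type': 'no_such_rule'},
]
print(apply_rules(rows, rules))
```

With a table like this, adding a rule type means registering one more entry rather than extending the conditional. Whether that is worth it for eight rule types is a judgment call.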
@@ -1,382 +0,0 @@
 """
 Column mapping handler
 Provides column-name mapping and conversion, standardizing column names across suppliers
 """
 import re
 import pandas as pd
 from typing import Dict, Any, Optional, List, Union
 from ...core.utils.log_utils import get_logger
 logger = get_logger(__name__)
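 class ColumnMapper:
    """Column mapping handler.
    Standardizes column names by mapping supplier-specific column names to standard ones.
    """
    # Standard column definitions (the single source of truth for all column-name aliases)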
    STANDARD_COLUMNS = {
        'barcode': [
            '条码', '条形码', '商品条码', '商品条形码', '产品条码', '商品编码',
            '商品编号', '条码（必填）', '电脑条码', '条码ID',
            'barcode', 'Barcode', 'BarCode', 'code', '编码',
        ],
        'name': [
            '商品名称', '产品名称', '名称', '商品', '产品', '商品名', '品名',
            '品项名', '商品或服务名称', '品项', '名 称',
            'name', 'product_name',
        ],
        'specification': [
            '规格', '规格型号', '型号', '商品规格', '产品规格', '包装规格', '规 格',
            'specification', 'spec', 'model',
        ],
        'quantity': [
            '数量', '采购量', '订货数量', '订单量', '需求量', '采购数量', '购买数量',
            '订单数量', '数量（必填）', '采购量（必填）', '入库数', '入库数量', '数 量',
            'quantity', 'qty',
        ],
        'unit': [
            '单位', '计量单位', '采购单位', '单位（必填）', '单位名称', '计价单位', '单 位',
            'unit', 'units',
        ],
        'unit_price': [
            '单价', '价格', '采购单价', '进货价', '销售价', '采购价', '参考价',
            '入库单价', '单价（必填）', '采购单价（必填）', '价格（必填）', '单 价',
            'unit_price', 'price',
        ],
        'total_price': [
            '总价', '金额', '小计', '合计金额', '小计金额', '金额（元）',
            '金额合计', '合计', '总额',
            'total_price', 'total', 'amount',
        ],
        'gift_quantity': [
            '赠送量', '赠品数量', '赠送数量', '赠品',
        ],
        'category': ['类别', '分类', '商品类别', 'category', 'type'],
        'brand': ['品牌', '商标', 'brand'],
        'supplier': ['供应商', '供货商', 'supplier', 'vendor'],
    }
    def __init__(self, mapping_config: Optional[Dict[str, Any]] = None):
        """初始化列映射器
        Args:
            mapping_config: 映射配置
        """
        self.mapping_config = mapping_config or {}
        self.custom_mappings = {}
        self._build_reverse_mapping()
    def _build_reverse_mapping(self):
        """构建反向映射表"""
        self.reverse_mapping = {}
        # 添加标准列的反向映射
        for standard_name, variations in self.STANDARD_COLUMNS.items():
            for variation in variations:
                self.reverse_mapping[variation.lower()] = standard_name
        # 添加自定义映射
        for standard_name, custom_names in self.mapping_config.items():
            if isinstance(custom_names, str):
                custom_names = [custom_names]
            for custom_name in custom_names:
                self.reverse_mapping[custom_name.lower()] = standard_name
                self.custom_mappings[custom_name.lower()] = standard_name
    def map_columns(self, df: pd.DataFrame, target_columns: Optional[List[str]] = None) -> pd.DataFrame:
        """映射列名
        Args:
            df: 输入数据
            target_columns: 目标列名列表，如果为None则使用所有标准列
        Returns:
            列名映射后的数据
        """
        if target_columns is None:
            target_columns = list(self.STANDARD_COLUMNS.keys())
        logger.info(f"开始列名映射，目标列: {target_columns}")
        logger.info(f"原始列名: {list(df.columns)}")
        # Build the column-name mapping
        column_mapping = {}
        used_columns = set()
        for target_col in target_columns:
            # Look for a matching original column, skipping columns already claimed
            available = [col for col in df.columns if col not in used_columns]
            matched_column = self._find_matching_column(available, target_col)
            if matched_column:
                column_mapping[matched_column] = target_col
                used_columns.add(matched_column)
                logger.debug(f"列名映射: {matched_column} -> {target_col}")
        # Rename the columns
        if column_mapping:
            df_mapped = df.rename(columns=column_mapping)
            # Add any missing target columns
            for target_col in target_columns:
                if target_col not in df_mapped.columns:
                    df_mapped[target_col] = self._get_default_value(target_col)
                    logger.debug(f"添加缺失列: {target_col}")
            # Keep only the target columns
            existing_target_columns = [col for col in target_columns if col in df_mapped.columns]
            df_result = df_mapped[existing_target_columns]
            logger.info(f"列名映射完成，结果列名: {list(df_result.columns)}")
            return df_result
        else:
            logger.warning("没有找到可映射的列名")
            return df
    def _find_matching_column(self, columns: List[str], target_column: str) -> Optional[str]:
        """查找匹配的列名
        Args:
            columns: 原始列名列表
            target_column: 目标标准列名
        Returns:
            匹配的原始列名或None
        """
        # 获取目标列的所有可能变体
        possible_names = []
        # 标准列名变体
        if target_column in self.STANDARD_COLUMNS:
            possible_names.extend(self.STANDARD_COLUMNS[target_column])
        # 自定义映射
        for standard_name, custom_names in self.mapping_config.items():
            if standard_name == target_column:
                if isinstance(custom_names, str):
                    possible_names.append(custom_names)
                else:
                    possible_names.extend(custom_names)
        # Look for a match
        for possible_name in possible_names:
            # Exact match (case-insensitive)
            for column in columns:
                if column.lower() == possible_name.lower():
                    return column
            # Fuzzy match (substring in either direction)
            for column in columns:
                if possible_name.lower() in column.lower() or column.lower() in possible_name.lower():
                    return column
        return None
    def _get_default_value(self, column_name: str) -> Any:
        """获取列的默认值
        Args:
            column_name: 列名
        Returns:
            默认值
        """
        # 根据列名类型返回合适的默认值
        if column_name in ['quantity', 'unit_price', 'total_price']:
            return 0
        elif column_name in ['barcode', 'name', 'specification', 'unit', 'category', 'brand', 'supplier']:
            return ''
        else:
            return None
    def add_custom_mapping(self, standard_name: str, custom_names: Union[str, List[str]]):
        """添加自定义列名映射
        Args:
            standard_name: 标准列名
            custom_names: 自定义列名或列名列表
        """
        if isinstance(custom_names, str):
            custom_names = [custom_names]
        # 更新配置
        self.mapping_config[standard_name] = custom_names
        # 更新反向映射
        for custom_name in custom_names:
            self.reverse_mapping[custom_name.lower()] = standard_name
            self.custom_mappings[custom_name.lower()] = standard_name
        logger.info(f"添加自定义映射: {standard_name} <- {custom_names}")
    def detect_column_types(self, df: pd.DataFrame) -> Dict[str, str]:
        """检测列的数据类型
        Args:
            df: 数据
        Returns:
            列类型字典
        """
        column_types = {}
        for column in df.columns:
            if pd.api.types.is_numeric_dtype(df[column]):
                column_types[column] = 'numeric'
            elif pd.api.types.is_datetime64_any_dtype(df[column]):
                column_types[column] = 'datetime'
            elif pd.api.types.is_bool_dtype(df[column]):
                column_types[column] = 'boolean'
            else:
                column_types[column] = 'text'
        return column_types
    def suggest_column_mapping(self, df: pd.DataFrame) -> Dict[str, List[str]]:
        """建议列名映射
        Args:
            df: 数据
        Returns:
            建议的映射关系
        """
        suggestions = {}
        for column in df.columns:
            column_lower = column.lower()
            suggestions[column] = []
            # 检查标准列名
            for standard_name, variations in self.STANDARD_COLUMNS.items():
                for variation in variations:
                    if column_lower in variation.lower() or variation.lower() in column_lower:
                        suggestions[column].append(standard_name)
            # 检查自定义映射
            for custom_name, standard_name in self.custom_mappings.items():
                if column_lower in custom_name or custom_name in column_lower:
                    suggestions[column].append(standard_name)
            # 去重
            suggestions[column] = list(set(suggestions[column]))
        # 只返回有建议的列
        return {k: v for k, v in suggestions.items() if v}
    def validate_mapping(self, df: pd.DataFrame, required_columns: List[str]) -> Dict[str, Any]:
        """验证列映射结果
        Args:
            df: 映射后的数据
            required_columns: 必需的列名列表
        Returns:
            验证结果
        """
        result = {
            'valid': True,
            'missing_columns': [],
            'empty_columns': [],
            'warnings': []
        }
        # 检查缺失列
        for col in required_columns:
            if col not in df.columns:
                result['missing_columns'].append(col)
                result['valid'] = False
        # 检查空列
        for col in df.columns:
            if df[col].isnull().all():
                result['empty_columns'].append(col)
                result['warnings'].append(f"列 '{col}' 全部为空值")
        # 检查数值列
        numeric_columns = ['quantity', 'unit_price', 'total_price']
        for col in numeric_columns:
            if col in df.columns and not pd.api.types.is_numeric_dtype(df[col]):
                result['warnings'].append(f"列 '{col}' 不是数值类型")
        return result
    @classmethod
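    def find_column(cls, columns: List[str], standard_name: str) -> Optional[str]:
        """Find the column in a list of column names that matches a standard column name.
        Matching strategy: exact match → whitespace-insensitive match → substring match
        Args:
            columns: list of actual column names
            standard_name: standard column name (a key of STANDARD_COLUMNS)
        Returns:
            The matching actual column name, or None if nothing matches
        """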
        candidates = cls.STANDARD_COLUMNS.get(standard_name, [])
        if not candidates:
            return None
        columns_str = [str(c) for c in columns]
        # Exact match
        for col in columns_str:
            col_clean = col.strip()
            for candidate in candidates:
                if col_clean == candidate:
                    return col
        # Whitespace-insensitive match
        for col in columns_str:
            col_clean = re.sub(r'\s+', '', col.strip())
            for candidate in candidates:
                if col_clean == re.sub(r'\s+', '', candidate):
                    return col
        # Substring match (the candidate name is contained in the column name)
        for col in columns_str:
            col_lower = col.strip().lower()
            for candidate in candidates:
                if candidate.lower() in col_lower:
                    return col
        return None
    @staticmethod
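    def detect_header_row(df: pd.DataFrame, max_rows: int = 10, min_matches: int = 3) -> int:
        """Detect which row holds the table header.
        Scans the first max_rows rows and returns the index of the row with the most keyword matches.
        Args:
            df: the DataFrame
            max_rows: maximum number of rows to scan
            min_matches: minimum number of keyword matches
        Returns:
            The header row index, or -1 if none is found
        """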
        header_keywords = [
            '条码', '条形码', '商品条码', '商品名称', '名称', '规格',
            '单价', '数量', '金额', '单位', '必填', '编码',
        ]
        best_row = -1
        best_matches = 0
        for row_idx in range(min(max_rows, len(df))):
            row_values = df.iloc[row_idx].astype(str)
            matches = sum(
                1 for kw in header_keywords
                if any(kw in str(val) for val in row_values.values)
            )
            if matches >= min_matches and matches > best_matches:
                best_matches = matches
                best_row = row_idx
        return best_row
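The three-tier strategy used by `find_column` (exact, then whitespace-insensitive, then substring) can be sketched without pandas. The version below uses a trimmed alias table, `ALIASES`, which is an illustrative stand-in for `STANDARD_COLUMNS` and contains only a few of its entries.

```python
import re
from typing import Dict, List, Optional

# Trimmed, illustrative alias table; the real STANDARD_COLUMNS is much larger.
ALIASES: Dict[str, List[str]] = {
    'barcode': ['条码', '商品条码', 'barcode'],
    'quantity': ['数量', '数 量', 'qty'],
}


def find_column(columns: List[str], standard_name: str) -> Optional[str]:
    candidates = ALIASES.get(standard_name, [])
    cols = [str(c) for c in columns]
    # Tier 1: exact match after trimming surrounding whitespace
    for col in cols:
        if col.strip() in candidates:
            return col
    # Tier 2: compare with all whitespace removed on both sides
    squashed = {re.sub(r'\s+', '', c) for c in candidates}
    for col in cols:
        if re.sub(r'\s+', '', col) in squashed:
            return col
    # Tier 3: a candidate appears inside the column name (case-insensitive)
    for col in cols:
        low = col.strip().lower()
        if any(c.lower() in low for c in candidates):
            return col
    return None


# An exact match wins even when an earlier column matches only by substring.
print(find_column(['商品条码备注', '条码'], 'barcode'))
print(find_column(['Barcode (required)'], 'barcode'))
print(find_column(['名称'], 'barcode'))
```

Because each tier scans every column before the next tier starts, a looser match never beats a stricter one. In the first call `条码` is returned even though `商品条码备注` appears earlier in the list.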
@@ -1,401 +0,0 @@
 """
 Data cleaning handler
 Provides cleaning utilities such as null handling, duplicate handling, and data type conversion
 """
 import pandas as pd
 from typing import Dict, Any, Optional, List, Union
 from ...core.utils.log_utils import get_logger
 logger = get_logger(__name__)
 class DataCleaner:
    """Data cleaning handler.
    Provides standardized data cleaning with support for method chaining and rule configuration.
    """
    def __init__(self, config: Optional[Dict[str, Any]] = None):
        """初始化数据清洗器
        Args:
            config: 清洗配置
        """
        self.config = config or {}
        self.cleaning_rules = []
    def add_rule(self, rule_type: str, **kwargs):
        """添加清洗规则
        Args:
            rule_type: 规则类型
            **kwargs: 规则参数
        """
        rule = {'type': rule_type, **kwargs}
        self.cleaning_rules.append(rule)
        logger.debug(f"添加清洗规则: {rule_type}")
    def clean(self, df: pd.DataFrame) -> pd.DataFrame:
        """执行数据清洗
        Args:
            df: 输入数据
        Returns:
            清洗后的数据
        """
        logger.info(f"开始数据清洗，原始数据形状: {df.shape}")
        result_df = df.copy()
        for i, rule in enumerate(self.cleaning_rules):
            try:
                logger.debug(f"执行清洗规则 {i+1}/{len(self.cleaning_rules)}: {rule['type']}")
                result_df = self._apply_rule(result_df, rule)
                logger.debug(f"规则执行完成，数据形状: {result_df.shape}")
            except Exception as e:
                logger.error(f"清洗规则执行失败: {rule}, 错误: {e}")
                # Continue with the next rule instead of aborting the whole pipeline
                continue
        logger.info(f"数据清洗完成，最终数据形状: {result_df.shape}")
        return result_df
    def _apply_rule(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """应用单个清洗规则
        Args:
            df: 数据
            rule: 规则配置
        Returns:
            处理后的数据
        """
        rule_type = rule.get('type')
        if rule_type == 'remove_duplicates':
            return self._remove_duplicates(df, rule)
        elif rule_type == 'fill_na':
            return self._fill_na(df, rule)
        elif rule_type == 'remove_rows':
            return self._remove_rows(df, rule)
        elif rule_type == 'convert_type':
            return self._convert_type(df, rule)
        elif rule_type == 'strip_whitespace':
            return self._strip_whitespace(df, rule)
        elif rule_type == 'normalize_text':
            return self._normalize_text(df, rule)
        elif rule_type == 'validate_data':
            return self._validate_data(df, rule)
        else:
            logger.warning(f"未知的清洗规则类型: {rule_type}")
            return df
    def _remove_duplicates(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """Remove duplicate rows.
        Args:
            df: Input data.
            rule: Rule configuration.
        Returns:
            The processed data.
        """
        subset = rule.get('subset')  # Columns used to identify duplicates
        keep = rule.get('keep', 'first')  # Which duplicate to keep
        before_count = len(df)
        df_cleaned = df.drop_duplicates(subset=subset, keep=keep)
        after_count = len(df_cleaned)
        logger.info(f"移除重复项: {before_count - after_count} 行被移除")
        return df_cleaned
    def _fill_na(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """Fill missing values.
        Args:
            df: Input data.
            rule: Rule configuration.
        Returns:
            The processed data.
        """
        columns = rule.get('columns')  # Columns to process
        value = rule.get('value', 0)  # Fill value
        method = rule.get('method')  # Fill method ('ffill', 'bfill', 'mean', 'median')
        if columns:
            # Process the specified columns
            if isinstance(columns, str):
                columns = [columns]
            for col in columns:
                if col in df.columns:
                    if method == 'ffill':
                        # fillna(method=...) is deprecated in recent pandas releases; use ffill()/bfill()
                        df[col] = df[col].ffill()
                    elif method == 'bfill':
                        df[col] = df[col].bfill()
                    elif method == 'mean':
                        df[col] = df[col].fillna(df[col].mean())
                    elif method == 'median':
                        df[col] = df[col].fillna(df[col].median())
                    else:
                        df[col] = df[col].fillna(value)
                    logger.debug(f"填充列 {col} 的空值: {method or value}")
        else:
            # Process all columns ('mean' and 'median' are only applied per column)
            if method == 'ffill':
                df = df.ffill()
            elif method == 'bfill':
                df = df.bfill()
            else:
                df = df.fillna(value)
            logger.debug(f"填充所有列的空值: {method or value}")
        return df
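    def _remove_rows(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """Remove rows.
        Args:
            df: Input data.
            rule: Rule configuration.
        Returns:
            The processed data.
        """
        condition = rule.get('condition')  # Query expression; rows matching it are kept
        columns = rule.get('columns')  # Columns to check
        values = rule.get('values')  # Values whose rows are removed
        if condition:
            # Filter with the query expression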
            try:
                before_count = len(df)
                df_filtered = df.query(condition)
                after_count = len(df_filtered)
                logger.info(f"条件过滤: {condition}, 移除了 {before_count - after_count} 行")
                return df_filtered
            except Exception as e:
                logger.error(f"条件表达式执行失败: {condition}, 错误: {e}")
                return df
        if columns and values:
            # Filter by column values
            if isinstance(columns, str):
                columns = [columns]
            if not isinstance(values, list):
                values = [values]
            df_filtered = df.copy()
            for col in columns:
                if col in df_filtered.columns:
                    mask = ~df_filtered[col].isin(values)
                    df_filtered = df_filtered[mask]
                    logger.debug(f"列 {col} 过滤值 {values}")
            return df_filtered
        logger.warning("移除行规则缺少条件或列配置")
        return df
    def _convert_type(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """Convert column types.
        Args:
            df: Input data.
            rule: Rule configuration.
        Returns:
            The processed data.
        """
        columns = rule.get('columns')
        target_type = rule.get('target_type', 'float')
        errors = rule.get('errors', 'coerce')  # Error handling mode
        if isinstance(columns, str):
            columns = [columns]
        # Guard against a rule without 'columns' (iterating over None would raise TypeError)
        for col in columns or []:
            if col in df.columns:
                try:
                    if target_type == 'int':
                        df[col] = pd.to_numeric(df[col], errors=errors).astype('Int64')
                    elif target_type == 'float':
                        df[col] = pd.to_numeric(df[col], errors=errors)
                    elif target_type == 'datetime':
                        df[col] = pd.to_datetime(df[col], errors=errors)
                    elif target_type == 'string':
                        df[col] = df[col].astype(str)
                    else:
                        df[col] = df[col].astype(target_type)
                    logger.debug(f"列 {col} 类型转换: {target_type}")
                except Exception as e:
                    logger.error(f"列 {col} 类型转换失败: {e}")
        return df
    def _strip_whitespace(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """Strip leading and trailing whitespace.
        Args:
            df: Input data.
            rule: Rule configuration.
        Returns:
            The processed data.
        """
        columns = rule.get('columns')
        if columns:
            if isinstance(columns, str):
                columns = [columns]
            for col in columns:
                if col in df.columns and df[col].dtype == 'object':
                    df[col] = df[col].str.strip()
                    logger.debug(f"列 {col} 去除空白字符")
        else:
            # Process all text columns
            text_columns = df.select_dtypes(include=['object']).columns
            for col in text_columns:
                df[col] = df[col].str.strip()
            logger.debug(f"所有文本列去除空白字符: {list(text_columns)}")
        return df
    def _normalize_text(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """Normalize text.
        Args:
            df: Input data.
            rule: Rule configuration.
        Returns:
            The processed data.
        """
        columns = rule.get('columns')
        lowercase = rule.get('lowercase', False)
        uppercase = rule.get('uppercase', False)
        replace_map = rule.get('replace_map', {})  # Replacement mapping
        if isinstance(columns, str):
            columns = [columns]
        target_columns = columns or df.select_dtypes(include=['object']).columns
        for col in target_columns:
            if col in df.columns and df[col].dtype == 'object':
                if lowercase:
                    df[col] = df[col].str.lower()
                elif uppercase:
                    df[col] = df[col].str.upper()
                # Apply the replacement mapping as literal text (the default for regex differs across pandas versions)
                for old, new in replace_map.items():
                    df[col] = df[col].str.replace(old, new, regex=False)
                logger.debug(f"列 {col} 文本标准化完成")
        return df
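    def _validate_data(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """Validate data (problems are logged; the data is returned unchanged).
        Args:
            df: Input data.
            rule: Rule configuration.
        Returns:
            The unmodified data.
        """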
        columns = rule.get('columns')
        min_value = rule.get('min_value')
        max_value = rule.get('max_value')
        required = rule.get('required', False)
        if isinstance(columns, str):
            columns = [columns]
        validation_results = []
        for col in columns:
            if col in df.columns:
                # Check for required values
                if required:
                    null_count = df[col].isnull().sum()
                    if null_count > 0:
                        validation_results.append(f"{col}: {null_count} 个空值")
                # Check the numeric range
                if min_value is not None or max_value is not None:
                    if pd.api.types.is_numeric_dtype(df[col]):
                        invalid_mask = pd.Series(False, index=df.index)
                        if min_value is not None:
                            invalid_mask |= df[col] < min_value
                        if max_value is not None:
                            invalid_mask |= df[col] > max_value
                        invalid_count = invalid_mask.sum()
                        if invalid_count > 0:
                            validation_results.append(f"{col}: {invalid_count} 个值超出范围")
        if validation_results:
            logger.warning(f"数据验证发现问题: {', '.join(validation_results)}")
        else:
            logger.debug("数据验证通过")
        return df
    # Convenience methods
    def remove_duplicates(self, subset: Optional[List[str]] = None, keep: str = 'first'):
        """Remove duplicate rows."""
        self.add_rule('remove_duplicates', subset=subset, keep=keep)
        return self
    def fill_na(self, columns: Optional[Union[str, List[str]]] = None, 
                value: Any = 0, method: Optional[str] = None):
        """Fill missing values."""
        self.add_rule('fill_na', columns=columns, value=value, method=method)
        return self
    def remove_rows(self, condition: Optional[str] = None,
                   columns: Optional[Union[str, List[str]]] = None,
                   values: Optional[Any] = None):
        """Remove rows."""
        self.add_rule('remove_rows', condition=condition, columns=columns, values=values)
        return self
    def convert_type(self, columns: Union[str, List[str]], target_type: str, errors: str = 'coerce'):
        """Convert column types."""
        self.add_rule('convert_type', columns=columns, target_type=target_type, errors=errors)
        return self
    def strip_whitespace(self, columns: Optional[Union[str, List[str]]] = None):
        """Strip leading and trailing whitespace."""
        self.add_rule('strip_whitespace', columns=columns)
        return self
    def normalize_text(self, columns: Optional[Union[str, List[str]]] = None,
                      lowercase: bool = False, uppercase: bool = False,
                      replace_map: Optional[Dict[str, str]] = None):
        """Normalize text."""
        self.add_rule('normalize_text', columns=columns, lowercase=lowercase, 
                     uppercase=uppercase, replace_map=replace_map or {})
        return self
    def validate_data(self, columns: Union[str, List[str]], 
                     min_value: Optional[float] = None,
                     max_value: Optional[float] = None,
                     required: bool = False):
        """Validate data."""
        self.add_rule('validate_data', columns=columns, min_value=min_value,
                     max_value=max_value, required=required)
        return self
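Note on the chained API above: each convenience method only registers a rule and returns `self`, and `clean()` later applies the rules in registration order, skipping any rule that raises. The block below is a minimal, pandas-free sketch of that pattern. `MiniCleaner` and its list-of-dicts rows are hypothetical stand-ins chosen so the example runs on the standard library alone; it is not the repository's `DataCleaner`.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Rule = Callable[[List[Row]], List[Row]]


def _strip(value: Any) -> Any:
    """Strip whitespace from strings and leave other values untouched."""
    return value.strip() if isinstance(value, str) else value


class MiniCleaner:
    """Hypothetical stand-in for DataCleaner that works on lists of dicts."""

    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def strip_whitespace(self, column: str) -> "MiniCleaner":
        self.rules.append(
            lambda rows: [{**r, column: _strip(r.get(column))} for r in rows]
        )
        return self  # returning self is what makes chaining possible

    def fill_na(self, column: str, value: Any = 0) -> "MiniCleaner":
        self.rules.append(
            lambda rows: [
                {**r, column: value if r.get(column) is None else r[column]}
                for r in rows
            ]
        )
        return self

    def remove_duplicates(self, column: str) -> "MiniCleaner":
        def rule(rows: List[Row]) -> List[Row]:
            seen = set()
            kept = []
            for r in rows:
                key = r.get(column)
                if key not in seen:  # keep the first occurrence only
                    seen.add(key)
                    kept.append(r)
            return kept

        self.rules.append(rule)
        return self

    def clean(self, rows: List[Row]) -> List[Row]:
        result = [dict(r) for r in rows]  # work on a copy, like df.copy()
        for rule in self.rules:
            try:
                result = rule(result)
            except Exception:
                continue  # a failing rule is skipped, the rest still run
        return result


rows = [
    {"barcode": " 6901 ", "quantity": None},
    {"barcode": "6901", "quantity": 3},
    {"barcode": "6902", "quantity": 5},
]
cleaned = (
    MiniCleaner()
    .strip_whitespace("barcode")
    .fill_na("quantity", value=0)
    .remove_duplicates("barcode")
    .clean(rows)
)
```

Rule order matters in this sketch: stripping whitespace before deduplicating is what lets `" 6901 "` and `"6901"` collapse into one row.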
@@ -1,150 +0,0 @@
 import re
 import pandas as pd
 from typing import List, Dict, Any, Optional
 def _split_quantity_unit(df: pd.DataFrame, source: str, dictionary: Optional[Dict[str, Any]] = None) -> pd.DataFrame:
    if source in df.columns:
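        # Fill NaN before casting: astype(str) turns NaN into the string "nan", which float() parses as NaN
        vals = df[source].fillna("").astype(str)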
        nums = []
        units = []
        default_unit = (dictionary or {}).get("default_unit", "")
        unit_synonyms = (dictionary or {}).get("unit_synonyms", {})
        for v in vals:
            m = re.search(r"(\d+(?:\.\d+)?)(箱|件|提|盒|瓶)", v)
            if m:
                nums.append(float(m.group(1)))
                u = unit_synonyms.get(m.group(2), m.group(2))
                units.append(u)
            else:
                try:
                    nums.append(float(v))
                    units.append(unit_synonyms.get(default_unit, default_unit))
                except Exception:
                    nums.append(0.0)
                    units.append(unit_synonyms.get(default_unit, default_unit))
        df["quantity"] = nums
        df["unit"] = units
    return df
 def _extract_spec_from_name(df: pd.DataFrame, source: str, dictionary: Optional[Dict[str, Any]] = None) -> pd.DataFrame:
    if source in df.columns:
        # Fill NaN before casting so missing names become "" rather than the string "nan"
        names = df[source].fillna("").astype(str)
        specs = []
        packs = []
        ignore_words = (dictionary or {}).get("ignore_words", [])
        name_patterns = (dictionary or {}).get("name_patterns", [])
        for s in names:
            if ignore_words:
                for w in ignore_words:
                    s = s.replace(w, "")
            matched = False
            for pat in name_patterns:
                try:
                    m = re.search(pat, s)
                    if m and len(m.groups()) >= 2:
                        try:
                            qty = int(m.group(len(m.groups())))
                        except Exception:
                            qty = None
                        specs.append(s)
                        packs.append(qty)
                        matched = True
                        break
                except Exception:
                    pass
            if matched:
                continue
            m = re.search(r"(\d+(?:\.\d+)?)(ml|l|升|毫升)[*×xX](\d+)", s, re.IGNORECASE)
            if m:
                specs.append(f"{m.group(1)}{m.group(2)}*{m.group(3)}")
                packs.append(int(m.group(3)))
                continue
            m2 = re.search(r"(\d+)[*×xX](\d+)", s)
            if m2:
                specs.append(f"1*{m2.group(2)}")
                packs.append(int(m2.group(2)))
                continue
            m3 = re.search(r"(\d{2,3})\D*(\d{1,3})\D*", s)
            if m3:
                specs.append(f"1*{m3.group(2)}")
                packs.append(int(m3.group(2)))
                continue
            specs.append("")
            packs.append(None)
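        # Align the fallback Series with df's index so rows are not mismatched on non-default indexes
        df["specification"] = df.get("specification", pd.Series(specs, index=df.index))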
        df["package_quantity"] = packs
    return df
 def _normalize_unit(df: pd.DataFrame, target: str, unit_map: Dict[str, str], dictionary: Optional[Dict[str, Any]] = None) -> pd.DataFrame:
    if target in df.columns:
        df[target] = df[target].astype(str)
        df[target] = df[target].apply(lambda u: unit_map.get(u, u))
        pack_multipliers = (dictionary or {}).get("pack_multipliers", {})
        default_pq = (dictionary or {}).get("default_package_quantity", 1)
        try:
            if "quantity" in df.columns:
                def convert_qty(row):
                    u = row.get(target)
                    q = row.get("quantity")
                    pq = row.get("package_quantity")
                    if u in ("件", "箱", "提", "盒"):
                        mult = pq or pack_multipliers.get(u, default_pq)
                        if pd.notna(q) and pd.notna(mult) and float(mult) > 0:
                            return float(q) * float(mult)
                    return q
                df["quantity"] = df.apply(convert_qty, axis=1)
                df[target] = df[target].apply(lambda u: "瓶" if u in ("件","箱","提","盒") else u)
        except Exception:
            pass
    return df
 def _compute_quantity_from_total(df: pd.DataFrame) -> pd.DataFrame:
    if "quantity" in df.columns and "unit_price" in df.columns:
        qty = df["quantity"].fillna(0)
        up = pd.to_numeric(df.get("unit_price", 0), errors="coerce").fillna(0)
        tp = pd.to_numeric(df.get("total_price", 0), errors="coerce").fillna(0)
        need = (qty <= 0) & (up > 0) & (tp > 0)
        df.loc[need, "quantity"] = (tp[need] / up[need]).round(6)
    return df
 def _fill_missing(df: pd.DataFrame, fills: Dict[str, Any]) -> pd.DataFrame:
    for k, v in fills.items():
        if k in df.columns:
            df[k] = df[k].fillna(v)
        else:
            df[k] = v
    return df
 def _mark_gift(df: pd.DataFrame) -> pd.DataFrame:
    df["is_gift"] = False
    tp = df.get("total_price")
    up = df.get("unit_price")
    # Build the mask on df's own index so the boolean operations and df.loc stay aligned
    flags = pd.Series(False, index=df.index)
    if tp is not None:
        tpn = pd.to_numeric(tp, errors="coerce").fillna(0)
        flags = flags | (tpn == 0)
    if up is not None:
        upn = pd.to_numeric(up, errors="coerce").fillna(0)
        flags = flags | (upn == 0)
    if "name" in df.columns:
        flags = flags | df["name"].astype(str).str.contains(r"赠品|^o$|^O$", regex=True)
    df.loc[flags, "is_gift"] = True
    return df
 def apply_rules(df: pd.DataFrame, rules: List[Dict[str, Any]], dictionary: Optional[Dict[str, Any]] = None) -> pd.DataFrame:
    out = df.copy()
    for r in rules or []:
        t = r.get("type")
        if t == "split_quantity_unit":
            out = _split_quantity_unit(out, r.get("source", "quantity"), dictionary)
        elif t == "extract_spec_from_name":
            out = _extract_spec_from_name(out, r.get("source", "name"), dictionary)
        elif t == "normalize_unit":
            out = _normalize_unit(out, r.get("target", "unit"), r.get("map", {}), dictionary)
        elif t == "compute_quantity_from_total":
            out = _compute_quantity_from_total(out)
        elif t == "fill_missing":
            out = _fill_missing(out, r.get("fills", {}))
        elif t == "mark_gift":
            out = _mark_gift(out)
    return out
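Note on `_split_quantity_unit` above: the rule looks for a number followed by one of five packaging units, and otherwise falls back to parsing the whole cell as a number with the default unit. The following is a hedged, standalone sketch of that per-value logic. The function name `split_quantity_unit` and its keyword arguments are illustrative assumptions; only the regular expression and the fallback order come from the diff.

```python
import re
from typing import Dict, Optional, Tuple

# Same pattern as in the diff: a number (optionally decimal) followed by a unit.
_QTY_UNIT = re.compile(r"(\d+(?:\.\d+)?)(箱|件|提|盒|瓶)")


def split_quantity_unit(
    text: str,
    default_unit: str = "",
    unit_synonyms: Optional[Dict[str, str]] = None,
) -> Tuple[float, str]:
    """Return (quantity, unit) for one cell value (hypothetical helper)."""
    synonyms = unit_synonyms or {}
    match = _QTY_UNIT.search(text)
    if match:
        unit = match.group(2)
        return float(match.group(1)), synonyms.get(unit, unit)
    fallback_unit = synonyms.get(default_unit, default_unit)
    try:
        # No unit found: treat the whole cell as a bare number.
        return float(text), fallback_unit
    except ValueError:
        # Unparseable text: quantity 0.0 with the default unit.
        return 0.0, fallback_unit


with_unit = split_quantity_unit("3箱")
bare_number = split_quantity_unit("2.5", default_unit="瓶")
garbage = split_quantity_unit("abc")
mapped = split_quantity_unit("12件", unit_synonyms={"件": "箱"})
```

One caveat worth keeping in mind: `float("nan")` succeeds, so a cell containing the text `nan` yields a NaN quantity rather than falling into the `0.0` branch.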
@@ -4,25 +4,24 @@
 Provides access to and invocation of the Baidu OCR API.
 """
 import os
 import time
 import base64
 import requests
-from typing import Dict, Optional, Union
+import logging
 from typing import Dict, Optional, Any, Union
 from ...config.settings import ConfigManager
 from ..utils.log_utils import get_logger
 logger = get_logger(__name__)
 # Token expiry constants
 _DEFAULT_TOKEN_LIFETIME = 30 * 24 * 3600  # 30 days (seconds)
 _TOKEN_EARLY_EXPIRY = 3600                 # Refresh 1 hour early (seconds)
 class TokenManager:
    """
    Token manager responsible for obtaining and refreshing Baidu API access tokens.
    """
-    def __init__(self, api_key: str, secret_key: str, max_retries: int = 3, retry_delay: int = 2, token_url: str = None):
+    def __init__(self, api_key: str, secret_key: str, max_retries: int = 3, retry_delay: int = 2):
        """
        Initialize the token manager.
@@ -31,13 +30,11 @@ class TokenManager:
            secret_key: Baidu Secret Key
            max_retries: Maximum number of retries
            retry_delay: Delay between retries (seconds)
            token_url: Token endpoint URL
        """
        self.api_key = api_key
        self.secret_key = secret_key
        self.max_retries = max_retries
        self.retry_delay = retry_delay
        self.token_url = token_url or 'https://aip.baidubce.com/oauth/2.0/token'
        self.access_token = None
        self.token_expiry = 0
@@ -72,7 +69,7 @@ class TokenManager:
        Returns:
            The new access token, or None if it could not be obtained
        """
-        url = self.token_url
+        url = "https://aip.baidubce.com/oauth/2.0/token"
        params = {
            "grant_type": "client_credentials",
            "client_id": self.api_key,
@@ -87,7 +84,7 @@ class TokenManager:
                    if "access_token" in result:
                        self.access_token = result["access_token"]
                        # Set the token expiry time (30 days by default, expiring 1 hour early as a safety margin)
-                        self.token_expiry = time.time() + result.get("expires_in", _DEFAULT_TOKEN_LIFETIME) - _TOKEN_EARLY_EXPIRY
+                        self.token_expiry = time.time() + result.get("expires_in", 2592000) - 3600
                        logger.info("成功获取访问令牌")
                        return self.access_token
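Note on the expiry arithmetic in this hunk: the cached token is treated as expired one hour before the lifetime reported by the server, so a request never starts with a token that is about to lapse. A minimal sketch of that caching rule follows. `CachedToken`, the injected `fetch` callable, and the injectable `clock` are assumptions made to keep the example offline and deterministic; the real `TokenManager` performs an HTTP request instead.

```python
import time
from typing import Any, Callable, Dict, Optional

DEFAULT_TOKEN_LIFETIME = 30 * 24 * 3600  # 30 days, in seconds
TOKEN_EARLY_EXPIRY = 3600                # refresh one hour early


class CachedToken:
    """Hypothetical token cache; `fetch` stands in for the HTTP call."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, Any]],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self.access_token: Optional[str] = None
        self.token_expiry = 0.0

    def get(self) -> Optional[str]:
        # Reuse the cached token while it is still inside its safe window.
        if self.access_token and self._clock() < self.token_expiry:
            return self.access_token
        result = self._fetch()
        if "access_token" not in result:
            return None
        self.access_token = result["access_token"]
        lifetime = result.get("expires_in", DEFAULT_TOKEN_LIFETIME)
        self.token_expiry = self._clock() + lifetime - TOKEN_EARLY_EXPIRY
        return self.access_token


# Deterministic demo with a fake clock and a fake fetcher.
now = [1000.0]
calls = []


def fake_fetch() -> Dict[str, Any]:
    calls.append(1)
    return {"access_token": f"tok{len(calls)}", "expires_in": 7200}


cache = CachedToken(fake_fetch, clock=lambda: now[0])
first = cache.get()          # fetched at t=1000, safe until t=4600
now[0] = 4599.0
still_cached = cache.get()   # inside the safe window, no new fetch
now[0] = 4600.0
refreshed = cache.get()      # safe window over, fetch again
```

Naming the two durations, as the removed lines in this hunk did, keeps the intent of `2592000` and `3600` readable at the call site.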
@@ -147,8 +144,7 @@ class BaiduOCRClient:
                self.api_key, 
                self.secret_key, 
                self.max_retries, 
-                self.retry_delay,
+                self.retry_delay
                token_url=config.get('API', 'token_url', fallback='https://aip.baidubce.com/oauth/2.0/token')
            )
            # Validate the API configuration
@@ -303,8 +299,7 @@ class BaiduOCRClient:
            logger.error(f"无法从结果中提取有效的request_id: {request_id_or_result}")
            return None
-        base_url = self.config.get('API', 'form_ocr_url', fallback='https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/get_request_result')
+        url = f"https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/get_request_result?access_token={access_token}"
        url = f"{base_url}?access_token={access_token}"
        payload = {
            'request_id': request_id,
@@ -5,11 +5,15 @@
 """
 import os
 import sys
 import time
 import json
 import base64
 from datetime import datetime
 from concurrent.futures import ThreadPoolExecutor
-from typing import Dict, List, Optional, Tuple, Callable
+from typing import Dict, List, Optional, Tuple, Union, Any
 from ...config.settings import ConfigManager
 from ..utils.log_utils import get_logger
 from ..utils.file_utils import (
    ensure_dir, 
@@ -114,9 +118,9 @@ class OCRProcessor:
        # Work around ConfigParser objects not having a get_path method
        try:
            # Read the input and output directories
-            self.input_folder = config.get_path('Paths', 'input_folder', fallback='data/input', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/input')
+            self.input_folder = config.get('Paths', 'input_folder', fallback='data/input')
-            self.output_folder = config.get_path('Paths', 'output_folder', fallback='data/output', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/output')
+            self.output_folder = config.get('Paths', 'output_folder', fallback='data/output')
-            self.temp_folder = config.get_path('Paths', 'temp_folder', fallback='data/temp', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/temp')
+            self.temp_folder = config.get('Paths', 'temp_folder', fallback='data/temp')
            # Make sure the directories exist
            os.makedirs(self.input_folder, exist_ok=True)
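Note on the path handling in this hunk: the added lines read each folder with `config.get(..., fallback=...)` and then create it with `os.makedirs(..., exist_ok=True)`. Relative values are therefore resolved against the current working directory at run time. The sketch below shows one way to make that explicit with the standard `configparser` module. `resolve_folder` and the `base_dir` argument are hypothetical and not part of the repository.

```python
import configparser
import os
import tempfile


def resolve_folder(
    config: configparser.ConfigParser,
    option: str,
    default: str,
    base_dir: str,
) -> str:
    """Read a folder from [Paths], anchor it to base_dir, and create it."""
    # fallback= also covers a missing [Paths] section, not just a missing option.
    raw = config.get("Paths", option, fallback=default)
    path = raw if os.path.isabs(raw) else os.path.join(base_dir, raw)
    os.makedirs(path, exist_ok=True)  # no error if it already exists
    return path


config = configparser.ConfigParser()
config.read_string("[Paths]\ninput_folder = inbox\n")

base = tempfile.mkdtemp()
input_folder = resolve_folder(config, "input_folder", "data/input", base)
output_folder = resolve_folder(config, "output_folder", "data/output", base)
```

Here `input_folder` comes from the config file, while `output_folder` is absent from it and falls back to the default.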
@@ -173,7 +177,7 @@ class OCRProcessor:
        skip_existing = True
        try:
            skip_existing = self.config.getboolean('Performance', 'skip_existing', fallback=True)
-        except Exception:
+        except:
            pass
        if skip_existing:
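Note on the `except Exception:` to `except:` change that recurs in this file: the two forms are not equivalent. A bare `except:` also catches `KeyboardInterrupt` and `SystemExit`, which derive from `BaseException` rather than `Exception`, so Ctrl+C inside the guarded block would be silently swallowed. The sketch below demonstrates the difference with hypothetical helper names.

```python
from typing import Callable


def read_flag_bare(loader: Callable[[], bool]) -> bool:
    """Fall back to True on *anything*, including KeyboardInterrupt."""
    try:
        return loader()
    except:  # noqa: E722 - deliberately bare, to show what it swallows
        return True


def read_flag_narrow(loader: Callable[[], bool]) -> bool:
    """Fall back to True on ordinary errors only."""
    try:
        return loader()
    except Exception:
        return True


def interrupted() -> bool:
    raise KeyboardInterrupt  # simulates the user pressing Ctrl+C


def broken() -> bool:
    raise ValueError("bad config value")


bare_swallowed_interrupt = read_flag_bare(interrupted)
narrow_handled_error = read_flag_narrow(broken)

try:
    read_flag_narrow(interrupted)
    narrow_propagated_interrupt = False
except KeyboardInterrupt:
    narrow_propagated_interrupt = True
```

For configuration lookups like the ones in this diff, `except Exception:` (or a narrower class such as `ValueError`) keeps the fallback behaviour without masking interrupts.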
@@ -210,7 +214,7 @@ class OCRProcessor:
        max_size_mb = 4.0
        try:
            max_size_mb = float(self.config.get('File', 'max_file_size_mb', fallback='4.0'))
-        except Exception:
+        except:
            pass
        if not is_file_size_valid(image_path, max_size_mb):
@@ -237,7 +241,7 @@ class OCRProcessor:
        skip_existing = True
        try:
            skip_existing = self.config.getboolean('Performance', 'skip_existing', fallback=True)
-        except Exception:
+        except:
            pass
        # If already-processed files should be skipped
@@ -253,7 +257,7 @@ class OCRProcessor:
            excel_extension = '.xlsx'
            try:
                excel_extension = self.config.get('File', 'excel_extension', fallback='.xlsx')
-            except Exception:
+            except:
                pass
            # Build the output file path
@@ -328,7 +332,7 @@ class OCRProcessor:
            logger.error(f"处理图片时出错: {image_path}, 错误: {e}")
            return None
-    def process_images_batch(self, batch_size: int = None, max_workers: int = None, progress_cb: Optional[Callable[[int], None]] = None) -> Tuple[int, int]:
+    def process_images_batch(self, batch_size: int = None, max_workers: int = None) -> Tuple[int, int]:
        """
        Process images in batches
@@ -343,13 +347,13 @@ class OCRProcessor:
        if batch_size is None:
            try:
                batch_size = self.config.getint('Performance', 'batch_size', fallback=5)
-            except Exception:
+            except:
                batch_size = 5
        if max_workers is None:
            try:
                max_workers = self.config.getint('Performance', 'max_workers', fallback=4)
-            except Exception:
+            except:
                max_workers = 4
        # Collect the images that have not been processed yet
@@ -365,13 +369,6 @@ class OCRProcessor:
        for i in range(0, total, batch_size):
            batch = unprocessed_images[i:i+batch_size]
            logger.info(f"处理批次 {i//batch_size+1}/{(total+batch_size-1)//batch_size}: {len(batch)} 个文件")
            try:
                if progress_cb:
                    # Estimate progress per batch (10% to 90%), leaving the last 10% for later stages
                    percent = int(10 + (i / max(total, 1)) * 80)
                    progress_cb(min(percent, 90))
            except Exception:
                pass
            # Process the batch with multiple threads
            with ThreadPoolExecutor(max_workers=max_workers) as executor:
@@ -381,9 +378,4 @@ class OCRProcessor:
                success_count += sum(1 for result in results if result is not None)
        logger.info(f"所有图片处理完成, 总计: {total}, 成功: {success_count}")
        try:
            if progress_cb:
                progress_cb(90)
        except Exception:
            pass
        return total, success_count 
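Note on the batch loop above: the input list is sliced into fixed-size batches, each batch is run on a thread pool, and the optional progress callback maps the batch start index onto the 10% to 90% range. The sketch below reproduces that flow with the standard library only. `process_in_batches`, `batch_progress`, and the `worker` argument are hypothetical names; the real method calls the OCR pipeline per image and guards the callback with `try/except`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def batch_progress(start_index: int, total: int) -> int:
    """Map a batch start index to a percentage in the 10..90 range."""
    return min(int(10 + (start_index / max(total, 1)) * 80), 90)


def process_in_batches(
    items: Sequence[T],
    worker: Callable[[T], Optional[object]],
    batch_size: int = 5,
    max_workers: int = 4,
    progress_cb: Optional[Callable[[int], None]] = None,
) -> Tuple[int, int]:
    """Return (total, success_count); a None result counts as a failure."""
    total = len(items)
    success_count = 0
    for i in range(0, total, batch_size):
        batch = items[i:i + batch_size]
        if progress_cb:
            progress_cb(batch_progress(i, total))
        # One pool per batch, mirroring the structure of the original loop.
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            results = list(executor.map(worker, batch))
        success_count += sum(1 for result in results if result is not None)
    if progress_cb:
        progress_cb(90)  # the remaining 10% belongs to later stages
    return total, success_count


seen: List[int] = []
summary = process_in_batches(
    list(range(7)),
    worker=lambda n: n if n % 2 == 0 else None,  # odd inputs "fail"
    batch_size=3,
    progress_cb=seen.append,
)
```

Note that the check is `is not None`, so a worker returning a falsy value such as `0` still counts as a success.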
@@ -1,9 +0,0 @@
 """
 Initialization file for the processors package
 """
 from .base import BaseProcessor
 from .ocr_processor import OCRProcessor
 from .tobacco_processor import TobaccoProcessor
 __all__ = ['BaseProcessor', 'OCRProcessor', 'TobaccoProcessor']
@@ -1,167 +0,0 @@
 """
 Base processor interface module.
 Defines the base class for all processors and provides a unified processing interface.
 """
 from abc import ABC, abstractmethod
 from typing import Dict, Any, Optional, List
 from pathlib import Path
 import logging
 import pandas as pd
 from ...core.utils.log_utils import get_logger
 logger = get_logger(__name__)
 class BaseProcessor(ABC):
    """Base processor interface: the base class for all processors.
    Follows the strategy pattern; each processor handles one specific type of file.
    """
    def __init__(self, config: Dict[str, Any]):
        """Initialize the processor.
        Args:
            config: Processor configuration dictionary.
        """
        self.config = config
        self.name = self.__class__.__name__
        self.description = ""
        self._setup_logging()
    def _setup_logging(self):
        """Set up the processor's logger."""
        self.logger = logging.getLogger(f"{__name__}.{self.name}")
    @abstractmethod
    def can_process(self, file_path: Path) -> bool:
        """Decide whether this processor can handle the file.
        Args:
            file_path: File path.
        Returns:
            True if the file can be processed.
        """
        pass
    @abstractmethod
    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
        """Process the file and return the output file path.
        Args:
            input_file: Input file path.
            output_dir: Output directory path.
        Returns:
            The output file path, or None if processing failed.
        """
        pass
    @abstractmethod
    def get_required_columns(self) -> List[str]:
        """Return the list of required column names.
        Returns:
            List of column names.
        """
        pass
    def validate_input(self, file_path: Path) -> bool:
        """Check that the input file is valid.
        Args:
            file_path: File path.
        Returns:
            True if the file is valid.
        """
        try:
            if not file_path.exists():
                self.logger.warning(f"文件不存在: {file_path}")
                return False
            if not file_path.is_file():
                self.logger.warning(f"不是文件: {file_path}")
                return False
            supported_extensions = self.get_supported_extensions()
            if supported_extensions and file_path.suffix.lower() not in supported_extensions:
                self.logger.warning(f"不支持的文件类型: {file_path.suffix}, 支持的类型: {supported_extensions}")
                return False
            return True
        except Exception as e:
            self.logger.error(f"验证文件时出错: {e}")
            return False
    def get_supported_extensions(self) -> List[str]:
        """Return the supported file extensions.
        Returns:
            List of supported extensions; an empty list means every type is accepted.
        """
        return []
    def get_output_filename(self, input_file: Path, suffix: str = "_processed") -> str:
        """Build the output file name.
        Args:
            input_file: Input file path.
            suffix: Suffix appended to the file stem.
        Returns:
            The output file name.
        """
        return f"{input_file.stem}{suffix}{input_file.suffix}"
    def _read_excel_safely(self, file_path: Path, **kwargs) -> pd.DataFrame:
        """Pick a suitable read engine based on the file extension.
        Args:
            file_path: File path.
            **kwargs: Arguments passed through to pd.read_excel.
        Returns:
            DataFrame
        Raises:
            Exception: Raised when reading fails.
        """
        suffix = file_path.suffix.lower()
        if suffix == '.xlsx':
            return pd.read_excel(file_path, engine='openpyxl', **kwargs)
        elif suffix == '.xls':
            try:
                return pd.read_excel(file_path, engine='xlrd', **kwargs)
            except Exception as e:
                self.logger.warning(f"读取xls失败，可能缺少xlrd: {e}")
                raise
        else:
            return pd.read_excel(file_path, **kwargs)
    def log_processing_start(self, input_file: Path):
        """Log the start of processing."""
        self.logger.info(f"开始处理文件: {input_file}")
        self.logger.info(f"处理器: {self.name} - {self.description}")
    def log_processing_end(self, input_file: Path, output_file: Optional[Path] = None, success: bool = True):
        """Log the end of processing."""
        if success:
            self.logger.info(f"处理完成: {input_file}")
            if output_file:
                self.logger.info(f"输出文件: {output_file}")
        else:
            self.logger.error(f"处理失败: {input_file}")
    def __str__(self) -> str:
        """String representation."""
        return f"{self.name}({self.description})"
    def __repr__(self) -> str:
        """Detailed string representation."""
        return f"{self.__class__.__module__}.{self.__class__.__name__}(name='{self.name}', description='{self.description}')"
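Note on the strategy pattern named in the `BaseProcessor` docstring: the interface exposes `can_process` so that a caller can ask each registered processor in turn and hand the file to the first one that accepts it. The dispatcher itself is not part of this chunk, so the sketch below is an assumption about how such an interface is typically consumed. `Processor`, `ExtensionProcessor`, and `pick_processor` are hypothetical names, and `process` here only computes an output path without writing anything.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List, Optional, Sequence


class Processor(ABC):
    """Trimmed-down, hypothetical version of the BaseProcessor interface."""

    @abstractmethod
    def can_process(self, file_path: Path) -> bool:
        ...

    @abstractmethod
    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
        ...


class ExtensionProcessor(Processor):
    """Accepts files purely by extension (illustrative only)."""

    def __init__(self, name: str, extensions: List[str]) -> None:
        self.name = name
        self.extensions = extensions

    def can_process(self, file_path: Path) -> bool:
        return file_path.suffix.lower() in self.extensions

    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
        # Only derives the target path; a real processor would write the file.
        return output_dir / f"{input_file.stem}_{self.name}{input_file.suffix}"


def pick_processor(
    processors: Sequence[Processor], file_path: Path
) -> Optional[Processor]:
    """Return the first processor that accepts the file, or None."""
    for processor in processors:
        if processor.can_process(file_path):
            return processor
    return None


registry = [
    ExtensionProcessor("image", [".jpg", ".jpeg", ".png", ".bmp"]),
    ExtensionProcessor("excel", [".xlsx", ".xls"]),
]
image_choice = pick_processor(registry, Path("order.PNG"))
excel_choice = pick_processor(registry, Path("order.xlsx"))
no_choice = pick_processor(registry, Path("notes.txt"))
target = image_choice.process(Path("order.PNG"), Path("out"))
```

Because the first match wins, registration order acts as the priority order in this sketch.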
@@ -1,192 +0,0 @@
 """
 OCR processor
 Runs the full OCR flow for image files: image recognition → Excel processing → standard purchase order generation
 """
 import os
 from pathlib import Path
 from typing import Optional, Dict, Any, List
 from .base import BaseProcessor
 from ...services.ocr_service import OCRService
 from ...services.order_service import OrderService
 from ...core.utils.log_utils import get_logger
 logger = get_logger(__name__)
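 class OCRProcessor(BaseProcessor):
    """OCR processor.
    Runs the full OCR recognition flow for image files:
    1. Recognize the table information in the image via OCR
    2. Turn the recognition result into an Excel file
    3. Convert it to the standard purchase order format
    """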
    def __init__(self, config: Dict[str, Any]):
        """Initialize the OCR processor.
        Args:
            config: Configuration.
        """
        super().__init__(config)
        self.description = "OCR识别完整流程（图片→识别→Excel→采购单）"
        # Initialize the services
        self.ocr_service = OCRService(config)
        self.order_service = OrderService(config)
    def can_process(self, file_path: Path) -> bool:
        """Decide whether the file is a supported image.
        Args:
            file_path: File path.
        Returns:
            True if the file can be processed.
        """
        if not self.validate_input(file_path):
            return False
        # Supported image formats
        supported_extensions = ['.jpg', '.jpeg', '.png', '.bmp']
        if file_path.suffix.lower() in supported_extensions:
            self.logger.info(f"识别为图片文件: {file_path.name}")
            return True
        return False
    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
        """Run the full OCR flow for an image file.
        Args:
            input_file: Input image file path.
            output_dir: Output directory path.
        Returns:
            The output file path, or None if processing failed.
        """
        self.log_processing_start(input_file)
        try:
            self.logger.info("开始OCR识别流程...")
            # Step 1: OCR recognition
            self.logger.info("步骤1/3: OCR识别图片...")
            ocr_result = self._perform_ocr(input_file, output_dir)
            if not ocr_result:
                self.logger.error("OCR识别失败")
                self.log_processing_end(input_file, success=False)
                return None
            # Step 2: Excel processing
            self.logger.info("步骤2/3: 处理Excel文件...")
            excel_result = self._process_excel(ocr_result, output_dir)
            if not excel_result:
                self.logger.error("Excel处理失败")
                self.log_processing_end(input_file, success=False)
                return None
            # Step 3: Generate the standard purchase order
            self.logger.info("步骤3/3: 生成标准采购单...")
            final_result = self._generate_purchase_order(excel_result, output_dir)
            if final_result:
                self.logger.info(f"OCR处理流程完成，输出文件: {final_result}")
                self.log_processing_end(input_file, final_result, success=True)
                return final_result
            else:
                self.logger.error("生成采购单失败")
                self.log_processing_end(input_file, success=False)
                return None
        except Exception as e:
            self.logger.error(f"OCR处理流程出错: {e}", exc_info=True)
            self.log_processing_end(input_file, success=False)
            return None
    def get_required_columns(self) -> List[str]:
        """Return the list of required column names."""
        # OCR processing does not depend on column names directly; later steps decide them
        return []
    def get_supported_extensions(self) -> List[str]:
        """Supported file extensions."""
        return ['.jpg', '.jpeg', '.png', '.bmp']
    def _perform_ocr(self, input_file: Path, output_dir: Path) -> Optional[Path]:
        """Run OCR recognition.
        Args:
            input_file: Input image file.
            output_dir: Output directory.
        Returns:
            Path of the Excel file produced by OCR, or None on failure.
        """
        try:
            self.logger.info(f"开始OCR识别: {input_file}")
            # Process the image with the OCR service
            result_path = self.ocr_service.process_image(str(input_file))
            if result_path:
                # Confirm the result file actually exists (output_dir is not enforced here)
                result_path = Path(result_path)
                if result_path.exists():
                    self.logger.info(f"OCR识别成功，输出文件: {result_path}")
                    return result_path
                else:
                    self.logger.error(f"OCR结果文件不存在: {result_path}")
                    return None
            else:
                self.logger.error("OCR服务返回None")
                return None
        except Exception as e:
            self.logger.error(f"OCR识别失败: {e}", exc_info=True)
            return None
    def _process_excel(self, excel_file: Path, output_dir: Path) -> Optional[Path]:
        """Process the Excel file.
        Args:
            excel_file: Path to the Excel file.
            output_dir: Output directory.
        Returns:
            Path to the processed Excel file, or None on failure.
        """
        try:
            self.logger.info(f"开始处理Excel文件: {excel_file}")
            # Process the Excel file with the order service (this generates the purchase order)
            result_path = self.order_service.process_excel(str(excel_file))
            if result_path:
                result_path = Path(result_path)
                if result_path.exists():
                    self.logger.info(f"Excel处理成功，输出文件: {result_path}")
                    return result_path
                else:
                    self.logger.error(f"Excel处理结果文件不存在: {result_path}")
                    return None
            else:
                self.logger.error("Excel处理服务返回None")
                return None
        except Exception as e:
            self.logger.error(f"Excel处理失败: {e}", exc_info=True)
            return None
    def _generate_purchase_order(self, processed_file: Path, output_dir: Path) -> Optional[Path]:
        """Purchase order generation is handled by OrderService; this step just returns the processed file."""
        try:
            if processed_file and processed_file.exists():
                return processed_file
            return None
        except Exception:
            return None
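The three-step flow in the processor above (OCR, Excel processing, purchase order) follows a "stop at the first stage that returns None" pattern. A minimal sketch of that pattern follows; `run_stages` and the `Stage` alias are hypothetical names used only for illustration and are not part of this repository.

```python
from pathlib import Path
from typing import Callable, Optional, Sequence

# Hypothetical alias: a stage takes the previous stage's file and returns the next one, or None on failure.
Stage = Callable[[Path], Optional[Path]]


def run_stages(input_file: Path, stages: Sequence[Stage]) -> Optional[Path]:
    """Run stages in order and stop as soon as one of them reports failure (None)."""
    current: Optional[Path] = input_file
    for stage in stages:
        current = stage(current)
        if current is None:
            # Same early-return behaviour as the step-by-step checks in process()
            return None
    return current
```

Each stage only sees the previous stage's output path, which mirrors how `_process_excel` receives the result of `_perform_ocr`.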
@@ -1,7 +0,0 @@
 """
 Supplier processor package initializer.
 """
 from .generic_supplier_processor import GenericSupplierProcessor
 __all__ = ['GenericSupplierProcessor']
@@ -1,340 +0,0 @@
 """
 Generic supplier processor.
 A configurable supplier processor whose processing rules are defined in a config file.
 """
 import fnmatch
 import pandas as pd
 from typing import Optional, Dict, Any, List
 from pathlib import Path
 from ..base import BaseProcessor
 from ...utils.log_utils import get_logger
 from ...handlers.rule_engine import apply_rules
 from ...handlers.column_mapper import ColumnMapper
 from ...handlers.data_cleaner import DataCleaner
 from ...handlers.calculator import DataCalculator
 logger = get_logger(__name__)
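 class GenericSupplierProcessor(BaseProcessor):
    """Generic supplier processor.
    Processes Excel files from different suppliers based on configuration. Supports:
    - filename pattern matching
    - content indicator detection
    - column mapping
    - data cleaning rules
    - calculation rules
    """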
    def __init__(self, config: Dict[str, Any], supplier_config: Dict[str, Any]):
        """Initialize the generic supplier processor.
        Args:
            config: System configuration.
            supplier_config: Supplier-specific configuration.
        """
        super().__init__(config)
        self.supplier_config = supplier_config
        # Basic information taken from the configuration
        self.name = supplier_config.get('name', 'GenericSupplier')
        self.description = supplier_config.get('description', '通用供应商处理器')
        # Processing rule configuration
        self.filename_patterns = supplier_config.get('filename_patterns', [])
        self.content_indicators = supplier_config.get('content_indicators', [])
        self.column_mapping = supplier_config.get('column_mapping', {})
        self.cleaning_rules = supplier_config.get('cleaning_rules', [])
        self.calculations = supplier_config.get('calculations', [])
        # Output configuration
        self.output_template = supplier_config.get('output_template', 'templates/银豹-采购单模板.xls')
        self.output_suffix = supplier_config.get('output_suffix', '_银豹采购单')
    def can_process(self, file_path: Path) -> bool:
        """Decide whether this processor can handle the file.
        Args:
            file_path: Path to the file.
        Returns:
            True if the file can be processed.
        """
        if not self.validate_input(file_path):
            return False
        # Check filename patterns
        if self.filename_patterns:
            filename_match = self._check_filename_patterns(file_path)
            if filename_match:
                return True
        # Check content indicators
        if self.content_indicators:
            content_match = self._check_content_indicators(file_path)
            if content_match:
                return True
        # With neither kind of rule configured, the processor cannot decide
        if not self.filename_patterns and not self.content_indicators:
            self.logger.warning(f"处理器 {self.name} 没有配置识别规则")
            return False
        return False
    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
        """Process the file.
        Args:
            input_file: Path to the input file.
            output_dir: Path to the output directory.
        Returns:
            Path to the output file, or None if processing fails.
        """
        self.log_processing_start(input_file)
        try:
            # Step 1: read the data
            self.logger.info("步骤1/4: 读取数据...")
            df = self._read_supplier_data(input_file)
            if df is None or df.empty:
                self.logger.error("读取数据失败或数据为空")
                self.log_processing_end(input_file, success=False)
                return None
            # Step 2: apply the column mapping
            self.logger.info("步骤2/4: 应用列映射...")
            mapped_df = self._apply_column_mapping(df)
            if mapped_df is None:
                self.logger.error("列映射失败")
                self.log_processing_end(input_file, success=False)
                return None
            # Step 3: clean the data
            self.logger.info("步骤3/4: 数据清洗...")
            cleaned_df = self._apply_data_cleaning(mapped_df)
            if cleaned_df is None:
                self.logger.error("数据清洗失败")
                self.log_processing_end(input_file, success=False)
                return None
            try:
                rules = self.supplier_config.get('rules', [])
                dictionary = self.supplier_config.get('dictionary')
                standardized_df = apply_rules(cleaned_df, rules, dictionary)
            except Exception as e:
                self.logger.warning(f"规则执行失败: {e}")
                standardized_df = cleaned_df
            # Step 4: apply calculations
            self.logger.info("步骤4/4: 计算处理...")
            calculated_df = self._apply_calculations(standardized_df)
            if calculated_df is None:
                self.logger.error("计算处理失败")
                self.log_processing_end(input_file, success=False)
                return None
            # Generate the output file
            output_file = self._generate_output(calculated_df, input_file, output_dir)
            if output_file and output_file.exists():
                self.logger.info(f"处理完成，输出文件: {output_file}")
                self.log_processing_end(input_file, output_file, success=True)
                return output_file
            else:
                self.logger.error("输出文件生成失败")
                self.log_processing_end(input_file, success=False)
                return None
        except Exception as e:
            self.logger.error(f"处理文件时出错: {e}", exc_info=True)
            self.log_processing_end(input_file, success=False)
            return None
    def get_required_columns(self) -> List[str]:
        """Return the list of required column names."""
        # Take the target column names from the column mapping configuration
        return list(self.column_mapping.values()) if self.column_mapping else []
    def _check_filename_patterns(self, file_path: Path) -> bool:
        """Check the filename against the configured patterns.
        Args:
            file_path: Path to the file.
        Returns:
            True if any pattern matches.
        """
        try:
            filename = file_path.name
            for pattern in self.filename_patterns:
                if fnmatch.fnmatch(filename.lower(), pattern.lower()):
                    self.logger.info(f"文件名匹配成功: {filename} -> {pattern}")
                    return True
            return False
        except Exception as e:
            self.logger.error(f"检查文件名模式时出错: {e}")
            return False
    def _check_content_indicators(self, file_path: Path) -> bool:
        """Check the file content against the configured indicators.
        Args:
            file_path: Path to the file.
        Returns:
            True if any indicator matches.
        """
        try:
            df = self._read_excel_safely(file_path, nrows=5)
            # Check whether the column names contain any configured keyword
            columns_str = str(list(df.columns)).lower()
            for indicator in self.content_indicators:
                if indicator.lower() in columns_str:
                    self.logger.info(f"内容特征匹配成功: {indicator}")
                    return True
            return False
        except Exception as e:
            self.logger.error(f"检查内容特征时出错: {e}")
            return False
    def _read_supplier_data(self, file_path: Path) -> Optional[pd.DataFrame]:
        """Read the supplier data.
        Args:
            file_path: Path to the file.
        Returns:
            The data as a DataFrame, or None.
        """
        try:
            specified = self.supplier_config.get('header_row')
            if specified is not None:
                try:
                    df = self._read_excel_safely(file_path, header=int(specified))
                except Exception:
                    df = self._read_excel_safely(file_path)
            else:
                df0 = self._read_excel_safely(file_path, header=None)
                if df0 is None:
                    return None
                header_row = self._find_header_row(df0)
                if header_row is not None:
                    df = self._read_excel_safely(file_path, header=header_row)
                else:
                    df = self._read_excel_safely(file_path)
            if df is None or df.empty:
                self.logger.warning("数据文件为空")
                return None
            self.logger.info(f"成功读取数据，形状: {df.shape}")
            return df
        except Exception as e:
            self.logger.error(f"读取数据失败: {e}")
            return None
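    def _find_header_row(self, df: pd.DataFrame) -> Optional[int]:
        """Return the header row index found by ColumnMapper in the first 30 rows, or None if it reports a negative index."""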
        result = ColumnMapper.detect_header_row(df, max_rows=30)
        return result if result >= 0 else None
    def _apply_column_mapping(self, df: pd.DataFrame) -> Optional[pd.DataFrame]:
        """Apply the column mapping.
        Args:
            df: Raw data.
        Returns:
            The mapped data, or None.
        """
        if not self.column_mapping:
            self.logger.info("没有列映射配置")
            return df
        try:
            # Rename the columns
            df_renamed = df.rename(columns=self.column_mapping)
            # Check that the required columns exist
            required_columns = self.get_required_columns()
            missing_columns = [col for col in required_columns if col not in df_renamed.columns]
            if missing_columns:
                self.logger.warning(f"缺少必需的列: {missing_columns}")
                # Create missing columns with defaults: 0 for quantity/price columns (names containing '量' or '价'), '' otherwise
                for col in missing_columns:
                    df_renamed[col] = 0 if '量' in col or '价' in col else ''
                    self.logger.info(f"创建缺失列: {col}，默认值: {df_renamed[col].iloc[0] if len(df_renamed) > 0 else 'N/A'}")
            self.logger.info(f"列映射完成，列名: {list(df_renamed.columns)}")
            return df_renamed
        except Exception as e:
            self.logger.error(f"列映射失败: {e}")
            return None
    def _apply_data_cleaning(self, df: pd.DataFrame) -> Optional[pd.DataFrame]:
        """Apply the data cleaning rules by delegating to DataCleaner."""
        if not self.cleaning_rules:
            self.logger.info("没有数据清洗规则")
            return df
        try:
            cleaner = DataCleaner()
            for rule in self.cleaning_rules:
                cleaner.add_rule(rule.get('type'), **{k: v for k, v in rule.items() if k != 'type'})
            result = cleaner.clean(df)
            self.logger.info(f"数据清洗完成，数据形状: {result.shape}")
            return result
        except Exception as e:
            self.logger.error(f"数据清洗失败: {e}")
            return None
    def _apply_calculations(self, df: pd.DataFrame) -> Optional[pd.DataFrame]:
        """Apply the calculation rules by delegating to DataCalculator."""
        if not self.calculations:
            self.logger.info("没有计算规则")
            return df
        try:
            calculator = DataCalculator()
            for calc in self.calculations:
                calculator.add_rule(calc.get('type'), **{k: v for k, v in calc.items() if k != 'type'})
            result = calculator.calculate(df)
            self.logger.info(f"计算处理完成，数据形状: {result.shape}")
            return result
        except Exception as e:
            self.logger.error(f"计算处理失败: {e}")
            return None
    def _generate_output(self, df: pd.DataFrame, input_file: Path, output_dir: Path) -> Optional[Path]:
        """Generate the output file.
        Args:
            df: Final data.
            input_file: Path to the input file.
            output_dir: Output directory.
        Returns:
            Path to the output file, or None.
        """
        try:
            # Build the output filename
            timestamp = pd.Timestamp.now().strftime("%Y%m%d_%H%M%S")
            output_filename = f"{input_file.stem}{self.output_suffix}_{timestamp}.xls"
            output_file = output_dir / output_filename
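            # The real template-based generation logic should go here
            # For now the DataFrame is saved directly as an Excel file
            # Note: writing .xls relies on the xlwt engine, which pandas removed in 2.0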
            df.to_excel(output_file, index=False)
            self.logger.info(f"输出文件生成成功: {output_file}")
            return output_file
        except Exception as e:
            self.logger.error(f"生成输出文件失败: {e}")
            return None
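The filename check in `_check_filename_patterns` lowercases both the filename and each pattern before calling `fnmatch.fnmatch`, which makes the match case-insensitive on every platform. A minimal standalone sketch of that idea follows; `matches_any` and the sample patterns are hypothetical and not taken from any supplier configuration in this repository.

```python
import fnmatch
from typing import Iterable


def matches_any(filename: str, patterns: Iterable[str]) -> bool:
    """Case-insensitive glob match of a filename against several patterns."""
    name = filename.lower()
    # Lowercasing both sides avoids relying on the OS-specific case rules of fnmatch.fnmatch
    return any(fnmatch.fnmatch(name, pattern.lower()) for pattern in patterns)


# Hypothetical usage: the patterns below are made up for illustration
print(matches_any("Supplier_ORDER_0530.XLSX", ["*order*.xlsx"]))
print(matches_any("report.csv", ["*.xlsx", "*.xls"]))
```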
@@ -1,347 +0,0 @@
 """
 Tobacco order processor.
 Processes order detail files in the tobacco company's format and generates a Pospal (银豹) purchase order.
 """
 import os
 import datetime
 import pandas as pd
 import xlrd
 import xlwt
 from xlutils.copy import copy
 from openpyxl import load_workbook
 from typing import Optional, Dict, Any, List, Tuple
 from pathlib import Path
 from .base import BaseProcessor
 from ...core.utils.log_utils import get_logger
 from ...core.utils.string_utils import parse_monetary_string
 from ...core.utils.dialog_utils import show_custom_dialog
 logger = get_logger(__name__)
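 class TobaccoProcessor(BaseProcessor):
    """Tobacco order processor.
    Reads tobacco company order detail files, extracts item data, and generates the standard Pospal (银豹) purchase order format.
    """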
    def __init__(self, config: Dict[str, Any]):
        """Initialize the tobacco order processor.
        Args:
            config: Configuration object.
        """
        super().__init__(config)
        self.description = "处理烟草公司订单明细文件"
        self.template_file = config.get('Paths', 'template_file', fallback='templates/银豹-采购单模板.xls')
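        # Output directory configuration
        self.result_dir = Path(config.get_path('Paths', 'result_folder', fallback='data/result', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/result'))
        # parents=True so a missing parent directory (e.g. data/) does not raise
        self.result_dir.mkdir(parents=True, exist_ok=True)
        # Default output filename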
        self.default_output_name = "银豹采购单_烟草公司.xls"
    def can_process(self, file_path: Path) -> bool:
        """Decide whether the file is a tobacco order file.
        Args:
            file_path: Path to the file.
        Returns:
            True if this processor can handle the file.
        """
        if not self.validate_input(file_path):
            return False
        # Filename keywords
        filename = file_path.name
        tobacco_keywords = ['烟草', '卷烟', '订单明细', 'tobacco', '烟']
        # Check content features
        try:
            df = self._read_excel_safely(file_path, nrows=5)
            required_columns = ['商品', '盒码', '订单量']
            # Match on either the filename or the content
            filename_match = any(keyword in filename for keyword in tobacco_keywords)
            content_match = all(col in df.columns for col in required_columns)
            if filename_match or content_match:
                self.logger.info(f"识别为烟草订单文件: {filename}")
                return True
            return False
        except Exception as e:
            self.logger.warning(f"检查文件内容时出错: {e}")
            # If the content cannot be read, fall back to the filename alone
            return any(keyword in filename for keyword in tobacco_keywords)
    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
        """Process a tobacco order.
        Args:
            input_file: Path to the input file.
            output_dir: Path to the output directory.
        Returns:
            Path to the output file, or None if processing fails.
        """
        self.log_processing_start(input_file)
        try:
            # Read the order info (time and total amount)
            order_info = self._read_order_info(input_file)
            if not order_info:
                self.logger.error(f"读取订单信息失败: {input_file}")
                self.log_processing_end(input_file, success=False)
                return None
            order_time, total_amount = order_info
            self.logger.info(f"订单信息 - 时间: {order_time}, 总金额: {total_amount}")
            # Read the order rows
            order_data = self._read_order_data(input_file)
            if order_data is None or order_data.empty:
                self.logger.error(f"读取订单数据失败或数据为空: {input_file}")
                self.log_processing_end(input_file, success=False)
                return None
            self.logger.info(f"成功读取订单数据，共{len(order_data)}条记录")
            # Build the output file path
            timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
            output_filename = f"银豹采购单_烟草公司_{timestamp}.xls"
            output_file = output_dir / output_filename
            # Make sure the output directory exists
            output_file.parent.mkdir(parents=True, exist_ok=True)
            # Generate the Pospal purchase order
            result = self._generate_pospal_order(order_data, order_time, output_file)
            if result:
                self.logger.info(f"采购单生成成功: {output_file}")
                self.log_processing_end(input_file, output_file, success=True)
                # Show the processing result
                self._show_processing_result(output_file, order_time, len(order_data), total_amount)
                return output_file
            else:
                self.logger.error("生成银豹采购单失败")
                self.log_processing_end(input_file, success=False)
                return None
        except Exception as e:
            self.logger.error(f"处理烟草订单时发生错误: {e}", exc_info=True)
            self.log_processing_end(input_file, success=False)
            return None
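    def get_required_columns(self) -> List[str]:
        """Return the list of required column names."""
        return ['商品', '盒码', '条码', '建议零售价', '批发价', '需求量', '订单量', '金额']
    def get_supported_extensions(self) -> List[str]:
        """Return the supported file extensions."""
        # Note: _read_order_info uses openpyxl, which cannot open legacy .xls files
        return ['.xlsx', '.xls']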
    def _read_order_info(self, file_path: Path) -> Optional[Tuple[str, float]]:
        """Read the order info (time and total amount).
        Args:
            file_path: Path to the file.
        Returns:
            A tuple of (order time, total amount), or None.
        """
        try:
            wb_info = load_workbook(file_path, data_only=True)
            ws_info = wb_info.active
            # Read the order info from fixed cells (H1: order time, H3: total amount)
            order_time = ws_info["H1"].value or "（空）"
            total_amount = ws_info["H3"].value or 0.0
            self.logger.info(f"成功读取订单信息: 时间={order_time}, 总金额={total_amount}")
            return (order_time, total_amount)
        except Exception as e:
            self.logger.error(f"读取订单信息出错: {e}")
            return None
    def _read_order_data(self, file_path: Path) -> Optional[pd.DataFrame]:
        """Read the order rows.
        Args:
            file_path: Path to the file.
        Returns:
            The order data as a DataFrame, or None.
        """
        columns = ['商品', '盒码', '条码', '建议零售价', '批发价', '需求量', '订单量', '金额']
        try:
            df_old = self._read_excel_safely(file_path, header=None, skiprows=3, names=columns)
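            # Keep only rows with a non-zero order quantity. Blank (NaN) quantities are dropped too,
            # because NaN != 0 would otherwise let them through and break int() when writing the sheet
            df_filtered = df_old[df_old['订单量'].notna() & (df_old['订单量'] != 0)].copy()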
            if df_filtered.empty:
                self.logger.warning("没有订单量不为0的记录")
                return None
            # Compute the purchase quantity and unit price
            df_filtered['采购量'] = df_filtered['订单量'] * 10  # tobacco order quantities usually need to be multiplied by 10
            df_filtered['采购单价'] = df_filtered['金额'] / df_filtered['采购量']
            df_filtered = df_filtered.reset_index(drop=True)
            self.logger.info(f"成功处理订单数据，有效记录数: {len(df_filtered)}")
            return df_filtered
        except Exception as e:
            self.logger.error(f"读取订单数据失败: {e}")
            return None
    def _generate_pospal_order(self, order_data: pd.DataFrame, order_time: str, output_file: Path) -> bool:
        """Generate the Pospal purchase order.
        Args:
            order_data: Order data.
            order_time: Order time.
            output_file: Path to the output file.
        Returns:
            True if the file was generated successfully.
        """
        try:
            # Check that the template file exists
            template_path = Path(self.template_file)
            if not template_path.exists():
                self.logger.error(f"采购单模板文件不存在: {template_path}")
                return False
            self.logger.info(f"使用模板文件: {template_path}")
            # Open the template and prepare a writable copy
            template_rd = xlrd.open_workbook(str(template_path), formatting_info=True)
            template_wb = copy(template_rd)
            template_ws = template_wb.get_sheet(0)
            # Read the header row from the template
            header_row = template_rd.sheet_by_index(0).row_values(0)
            # Look up the indexes of the columns we need
            try:
                barcode_col = header_row.index("条码（必填）")
                amount_col = header_row.index("采购量（必填）")
                gift_col = header_row.index("赠送量")
                price_col = header_row.index("采购单价（必填）")
            except ValueError as e:
                self.logger.error(f"模板列查找失败: {e}")
                return False
            self.logger.info(f"模板列索引 - 条码:{barcode_col}, 采购量:{amount_col}, 赠送量:{gift_col}, 单价:{price_col}")
            # Write the data into the template
            for i, row in order_data.iterrows():
                template_ws.write(i + 1, barcode_col, row['盒码'])  # item barcode (pack code)
                template_ws.write(i + 1, amount_col, int(row['采购量']))  # purchase quantity
                template_ws.write(i + 1, gift_col, "")  # gift quantity left empty
                template_ws.write(i + 1, price_col, round(row['采购单价'], 2))  # unit price rounded to two decimals
            # Make sure the output directory exists
            output_file.parent.mkdir(parents=True, exist_ok=True)
            # Save the output file
            template_wb.save(str(output_file))
            self.logger.info(f"采购单生成成功: {output_file}")
            return True
        except Exception as e:
            self.logger.error(f"生成银豹采购单失败: {e}", exc_info=True)
            return False
    def _show_processing_result(self, output_file: Path, order_time: str, total_count: int, total_amount: float):
        """Show the processing result.
        Args:
            output_file: Path to the output file.
            order_time: Order time.
            total_count: Number of processed items.
            total_amount: Total amount.
        """
        try:
            # Build the additional info
            additional_info = {
                "订单来源": "烟草公司",
                "处理时间": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            }
            # Format the amount for display
            parsed = parse_monetary_string(total_amount)
            total_amount = parsed if parsed is not None else 0.0
            amount_display = f"¥{total_amount:.2f}"
            # Show the custom dialog
            show_custom_dialog(
                title="烟草订单处理结果",
                message="烟草订单处理完成",
                result_file=str(output_file),
                time_info=order_time,
                count_info=f"{total_count}个商品",
                amount_info=amount_display,
                additional_info=additional_info
            )
            self.logger.info(f"显示处理结果 - 文件:{output_file}, 时间:{order_time}, 数量:{total_count}, 金额:{total_amount}")
        except Exception as e:
            self.logger.error(f"显示处理结果时出错: {e}")
    def get_latest_tobacco_order(self) -> Optional[Path]:
        """Return the latest tobacco order detail file (kept for the legacy interface).
        Returns:
            The file path, or None.
        """
        try:
            # Timestamp for the start of today
            today = datetime.date.today()
            today_start = datetime.datetime.combine(today, datetime.time.min).timestamp()
            # Locate the folder that holds the order detail files
            result_dir = Path(self.config.get_path('Paths', 'output_folder', fallback='data/output') if hasattr(self.config, 'get_path') else os.path.abspath('data/output'))
            if not result_dir.exists():
                return None
            # Collect files from today (st_ctime is creation time on Windows but metadata-change time on Unix)
            candidates = []
            for file_path in result_dir.glob("订单明细*.xlsx"):
                if file_path.stat().st_ctime >= today_start:
                    candidates.append(file_path)
            if not candidates:
                self.logger.warning("未找到今天创建的烟草订单明细文件")
                # Fall back to the most recent file of any date
                all_files = list(result_dir.glob("订单明细*.xlsx"))
                if all_files:
                    all_files.sort(key=lambda x: x.stat().st_ctime, reverse=True)
                    return all_files[0]
                return None
            # Return the most recent file
            candidates.sort(key=lambda x: x.stat().st_ctime, reverse=True)
            latest_file = candidates[0]
            self.logger.info(f"找到最新烟草订单明细文件: {latest_file}")
            return latest_file
        except Exception as e:
            self.logger.error(f"获取最新烟草订单文件时出错: {e}")
            return None
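The quantity and unit-price arithmetic in `_read_order_data` can be illustrated without pandas. The sketch below assumes the same rule as the code above (purchase quantity = order quantity × 10, unit price = amount ÷ purchase quantity, rows with no order quantity skipped); `to_purchase_lines`, the `units_per_order` parameter, and the sample rows are hypothetical and exist only for illustration.

```python
from typing import Dict, List


def to_purchase_lines(rows: List[Dict], units_per_order: int = 10) -> List[Dict]:
    """Convert order rows into purchase lines, skipping rows with no order quantity."""
    lines = []
    for row in rows:
        order_qty = row.get("订单量")
        if not order_qty:
            # Zero or missing order quantity: nothing to purchase
            continue
        purchase_qty = int(order_qty * units_per_order)
        # Unit price is derived from the line amount, rounded to two decimals
        unit_price = round(row["金额"] / purchase_qty, 2)
        lines.append({"barcode": row["盒码"], "quantity": purchase_qty, "unit_price": unit_price})
    return lines


# Made-up sample rows, not real order data
sample = [
    {"盒码": "6900000000011", "订单量": 2, "金额": 450.0},
    {"盒码": "6900000000028", "订单量": 0, "金额": 0.0},
]
print(to_purchase_lines(sample))
```

Only the first sample row produces a purchase line, since the second has an order quantity of zero.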
@@ -1,247 +0,0 @@
 """Cloud sync module: file synchronization based on the Gitea REST API."""
 import base64
 import json
 from typing import Optional, Tuple
 import requests
 from .log_utils import get_logger
 logger = get_logger(__name__)
 class GiteaSync:
    """Read and write repository files through the Gitea REST API."""
    def __init__(self, base_url: str, owner: str, repo: str, token: str, timeout: int = 15):
        self.base_url = base_url.rstrip("/")
        self.owner = owner
        self.repo = repo
        self.token = token
        self.timeout = timeout
    @property
    def _headers(self) -> dict:
        return {"Authorization": f"token {self.token}"}
    def _api_url(self, path: str) -> str:
        return f"{self.base_url}/api/v1/repos/{self.owner}/{self.repo}/contents/{path}"
    def pull_file(self, remote_path: str) -> Optional[Tuple[bytes, str]]:
        """Download a file from the repository.
        Returns:
            (content_bytes, sha), or None if the file does not exist or the request fails.
        """
        try:
            resp = requests.get(
                self._api_url(remote_path),
                headers=self._headers,
                timeout=self.timeout,
            )
            if resp.status_code == 404:
                logger.info(f"云端文件不存在: {remote_path}")
                return None
            if resp.status_code != 200:
                logger.warning(f"拉取文件失败: {resp.status_code} {resp.text[:200]}")
                return None
            data = resp.json()
            sha = data.get("sha", "")
            content_b64 = data.get("content", "")
            # The base64 returned by Gitea may contain newlines
            content_bytes = base64.b64decode(content_b64.replace("\n", ""))
            logger.info(f"拉取文件成功: {remote_path} ({len(content_bytes)} bytes)")
            return content_bytes, sha
        except requests.RequestException as e:
            logger.error(f"拉取文件网络错误: {e}")
            return None
    def push_file(
        self,
        remote_path: str,
        content: bytes,
        message: str,
        sha: Optional[str] = None,
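    ) -> Optional[str]:
        """Upload or update a file in the repository.
        Args:
            remote_path: File path inside the repository.
            content: File content (bytes).
            message: Commit message.
            sha: Current sha of the file (required when updating, omitted when creating).
        Returns:
            The new sha, or None on failure.
        """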
        payload = {
            "message": message,
            "content": base64.b64encode(content).decode("ascii"),
        }
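        # The Gitea contents API documents POST for creating a file and PUT (with sha) for updating one
        method = "PUT" if sha else "POST"
        if sha:
            payload["sha"] = sha
        try:
            resp = requests.request(
                method,
                self._api_url(remote_path),
                headers={**self._headers, "Content-Type": "application/json"},
                json=payload,
                timeout=self.timeout,
            )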
            if resp.status_code not in (200, 201):
                logger.warning(f"推送文件失败: {resp.status_code} {resp.text[:200]}")
                return None
            new_sha = resp.json().get("content", {}).get("sha", "")
            logger.info(f"推送文件成功: {remote_path} (sha={new_sha[:12]})")
            return new_sha
        except requests.RequestException as e:
            logger.error(f"推送文件网络错误: {e}")
            return None
    def file_exists(self, remote_path: str) -> Optional[str]:
        """Check whether a file exists.
        Returns:
            The file sha if it exists, otherwise None.
        """
        try:
            resp = requests.head(
                self._api_url(remote_path),
                headers=self._headers,
                timeout=self.timeout,
            )
            if resp.status_code == 200:
                # HEAD returns no body, so a GET is needed to obtain the sha
                result = self.pull_file(remote_path)
                return result[1] if result else None
            return None
        except requests.RequestException:
            return None
    def pull_json(self, remote_path: str) -> Optional[Tuple[dict, str]]:
        """Pull and parse a JSON file.
        Returns:
            (parsed_dict, sha), or None.
        """
        result = self.pull_file(remote_path)
        if result is None:
            return None
        content_bytes, sha = result
        try:
            data = json.loads(content_bytes)
            return data, sha
        except json.JSONDecodeError as e:
            logger.error(f"解析 JSON 失败: {e}")
            return None
    def push_json(self, remote_path: str, data: dict, message: str, sha: Optional[str] = None) -> Optional[str]:
        """Serialize a dict to JSON and push it.
        Returns:
            The new sha, or None on failure.
        """
        content = json.dumps(data, ensure_ascii=False, indent=2).encode("utf-8")
        return self.push_file(remote_path, content, message, sha)
    def push_binary(self, remote_path: str, local_path: str, message: str) -> Optional[str]:
        """Read a local binary file and push it to the cloud.
        Returns:
            The new sha, or None on failure.
        """
        try:
            with open(local_path, "rb") as f:
                content = f.read()
        except OSError as e:
            logger.error(f"读取本地文件失败: {local_path} — {e}")
            return None
        existing_sha = self.file_exists(remote_path)
        return self.push_file(remote_path, content, message, sha=existing_sha)
    def push(self) -> str:
        """Push local data to the cloud: product_cache.json and barcode_mappings.json."""
        from pathlib import Path
        project_root = Path(__file__).resolve().parent.parent.parent.parent
        results = []
        # 1. Product cache
        from app.core.db.product_db import ProductDatabase
        excel_source = str(project_root / "templates" / "商品资料.xlsx")
        db_path = str(project_root / "data" / "product_cache.db")
        product_db = ProductDatabase(db_path, excel_source)
        product_data = product_db.export_for_sync()
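        # Pass the existing sha (if any) so that an already-synced file is updated rather than rejected
        sha = self.push_json("product_cache.json", product_data, "sync: update product cache",
                             sha=self.file_exists("product_cache.json"))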
        results.append(f"product_cache: {'ok' if sha else 'skip'}")
        # 2. Barcode mappings
        barcode_path = project_root / "config" / "barcode_mappings.json"
        if barcode_path.exists():
            with open(barcode_path, "r", encoding="utf-8") as f:
                barcode_data = json.loads(f.read())
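            # Pass the existing sha (if any) so that an already-synced file is updated rather than rejected
            sha = self.push_json("barcode_mappings.json", barcode_data, "sync: update barcode mappings",
                                 sha=self.file_exists("barcode_mappings.json"))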
            results.append(f"barcode_mappings: {'ok' if sha else 'skip'}")
        return "; ".join(results) if results else "无数据需要同步"
    def pull(self) -> str:
        """Pull data from the cloud and write it to local files."""
        import os
        from pathlib import Path
        project_root = Path(__file__).resolve().parent.parent.parent.parent
        results = []
        # 1. Product cache
        result = self.pull_json("product_cache.json")
        if result is not None:
            data, sha = result
            from app.core.db.product_db import ProductDatabase
            excel_source = str(project_root / "templates" / "商品资料.xlsx")
            db_path = str(project_root / "data" / "product_cache.db")
            os.makedirs(os.path.dirname(db_path), exist_ok=True)
            product_db = ProductDatabase(db_path, excel_source)
            count = product_db.import_from_sync(data)
            results.append(f"product_cache: 导入 {count} 条")
        else:
            results.append("product_cache: 云端无数据")
        # 2. Barcode mappings
        barcode_result = self.pull_json("barcode_mappings.json")
        if barcode_result is not None:
            barcode_data, sha = barcode_result
            barcode_path = project_root / "config" / "barcode_mappings.json"
            barcode_path.parent.mkdir(parents=True, exist_ok=True)
            with open(barcode_path, "w", encoding="utf-8") as f:
                json.dump(barcode_data, f, ensure_ascii=False, indent=2)
            results.append(f"barcode_mappings: 已更新")
        else:
            results.append("barcode_mappings: 云端无数据")
        return "; ".join(results) if results else "无数据需要同步"
    @classmethod
    def from_config(cls, config) -> Optional["GiteaSync"]:
        """Create an instance from a ConfigManager.
        Returns:
            A GiteaSync instance, or None if the configuration is incomplete.
        """
        base_url = config.get("Gitea", "base_url", fallback="").strip()
        owner = config.get("Gitea", "owner", fallback="").strip()
        repo = config.get("Gitea", "repo", fallback="").strip()
        token = config.get("Gitea", "token", fallback="").strip()
        if not all([base_url, owner, repo, token]):
            logger.debug("Gitea 配置不完整，跳过云端同步")
            return None
        return cls(base_url=base_url, owner=owner, repo=repo, token=token)
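The `from_config` classmethod above returns `None` when any required Gitea setting is blank, so callers can skip cloud sync instead of failing. A minimal sketch of that pattern follows, assuming the standard-library `ConfigParser` in place of the project's `ConfigManager`; `SyncSettings` and `load_sync_settings` are hypothetical names used only for illustration.

```python
from configparser import ConfigParser
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyncSettings:
    """Hypothetical stand-in for the four values the sync client needs."""
    base_url: str
    owner: str
    repo: str
    token: str


def load_sync_settings(config: ConfigParser) -> Optional[SyncSettings]:
    """Return settings only when every required field is non-empty."""
    values = {
        key: config.get("Gitea", key, fallback="").strip()
        for key in ("base_url", "owner", "repo", "token")
    }
    # A missing section, a missing key, or a whitespace-only value all count as incomplete.
    if not all(values.values()):
        return None
    return SyncSettings(**values)
```

Callers then only need a single `if settings is None: return` guard, which is how `_get_sync` and `show_cloud_sync_dialog` in this diff use the result.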
@@ -8,14 +8,10 @@
 """
 import os
 import json
 import tkinter as tk
 from tkinter import messagebox, ttk, simpledialog
 from datetime import datetime
 from .cloud_sync import GiteaSync
 from app.config.settings import ConfigManager
 def create_custom_dialog(title="提示", message="", result_file=None, time_info=None, 
                        count_info=None, amount_info=None, additional_info=None):
    """
@@ -82,12 +78,11 @@ def create_custom_dialog(title="提示", message="", result_file=None, time_info
            file_size = os.path.getsize(result_file)
            file_time = datetime.fromtimestamp(os.path.getmtime(result_file))
-            from .file_utils import format_file_size
+            size_text = f"{file_size / 1024:.1f} KB" if file_size < 1024*1024 else f"{file_size / (1024*1024):.1f} MB"
            size_text = format_file_size(file_size)
            tk.Label(file_frame, text=f"文件大小: {size_text}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
            tk.Label(file_frame, text=f"创建时间: {file_time.strftime('%Y-%m-%d %H:%M:%S')}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
-        except Exception:
+        except:
            tk.Label(file_frame, text="无法获取文件信息", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
        # Add buttons
@@ -107,7 +102,7 @@ def create_custom_dialog(title="提示", message="", result_file=None, time_info
        button_frame = tk.Frame(dialog)
        button_frame.pack(pady=10)
-        tk.Button(button_frame, text="打开输出目录", command=lambda: os.startfile(ConfigManager().get_path('Paths', 'output_folder', fallback='data/output', create=True))).pack(side=tk.LEFT, padx=5)
+        tk.Button(button_frame, text="打开输出目录", command=lambda: os.startfile(os.path.abspath("data/output"))).pack(side=tk.LEFT, padx=5)
        tk.Button(button_frame, text="关闭", command=dialog.destroy).pack(side=tk.LEFT, padx=5)
    # Make sure the window is shown on top
@@ -488,135 +483,6 @@ def create_barcode_mapping_dialog(parent=None, on_save=None, current_mappings=No
    cancel_btn = tk.Button(bottom_frame, text="取消", command=cancel)
    cancel_btn.pack(side=tk.RIGHT, padx=5)
    # ---- Cloud sync buttons ----
    def _build_current_mappings():
        """Build the mappings dict from the dialog's current data (same logic as save_mappings)."""
        mappings = {}
        for source, target in mapping_list:
            mappings[source] = {
                'map_to': target,
                'description': f'条码映射：{source} -> {target}'
            }
        for barcode, multiplier, unit, price, spec, desc in special_list:
            if barcode not in mappings:
                mappings[barcode] = {}
            if multiplier:
                try:
                    if isinstance(multiplier, str):
                        mappings[barcode]['multiplier'] = float(multiplier) if '.' in multiplier else int(multiplier)
                    else:
                        mappings[barcode]['multiplier'] = multiplier
                except ValueError:
                    mappings[barcode]['multiplier'] = multiplier
            if unit:
                mappings[barcode]['target_unit'] = unit
            if price:
                try:
                    mappings[barcode]['fixed_price'] = float(price)
                except ValueError:
                    mappings[barcode]['fixed_price'] = price
            if spec:
                mappings[barcode]['specification'] = spec
            if desc and "映射到:" in desc:
                parts = desc.split("映射到:")
                base_desc = parts[0].strip()
                target_barcode = parts[1].strip()
                if base_desc:
                    mappings[barcode]['description'] = base_desc
                mappings[barcode]['map_to'] = target_barcode
            elif desc:
                mappings[barcode]['description'] = desc
        return mappings
    def _get_sync():
        """Get a GiteaSync instance; prompt the user if the configuration is incomplete."""
        sync = GiteaSync.from_config(ConfigManager())
        if sync is None:
            messagebox.showwarning("云端同步", "请先在「系统设置」中配置 Gitea 云端同步参数（token）")
        return sync
    def _refresh_trees(new_mappings):
        """Refresh both Treeviews with the new data."""
        # Clear existing rows
        for item in mapping_tree.get_children():
            mapping_tree.delete(item)
        mapping_list.clear()
        for item in special_tree.get_children():
            special_tree.delete(item)
        special_list.clear()
        # Repopulate
        if new_mappings:
            for barcode, data in new_mappings.items():
                if 'map_to' in data and 'multiplier' not in data:
                    mapping_list.append((barcode, data['map_to']))
                    mapping_tree.insert('', 'end', values=(barcode, data['map_to']))
                else:
                    mult = data.get('multiplier', '')
                    unit = data.get('target_unit', '')
                    price = data.get('fixed_price', '')
                    spec = data.get('specification', '')
                    desc = data.get('description', '')
                    if 'map_to' in data:
                        desc = f"{desc} 映射到: {data['map_to']}" if desc else f"映射到: {data['map_to']}"
                    special_list.append((barcode, mult, unit, price, spec, desc))
                    tags = ("mapped",) if 'map_to' in data else ()
                    special_tree.insert('', 'end', values=(barcode, mult, unit, price, spec, desc), tags=tags)
            if any('map_to' in d for d in new_mappings.values()):
                special_tree.tag_configure("mapped", foreground="blue")
    def push_to_cloud():
        sync = _get_sync()
        if not sync:
            return
        mappings = _build_current_mappings()
        if not mappings:
            messagebox.showwarning("同步到云端", "当前没有映射数据可同步")
            return
        # Fetch the current sha first (if the file already exists)
        sha = None
        existing = sync.pull_file("barcode_mappings.json")
        if existing:
            sha = existing[1]
        new_sha = sync.push_json(
            "barcode_mappings.json",
            mappings,
            f"同步条码映射 ({len(mappings)} 条)",
            sha=sha,
        )
        if new_sha:
            messagebox.showinfo("同步成功", f"已推送 {len(mappings)} 条映射到云端")
        else:
            messagebox.showerror("同步失败", "推送到云端失败，请检查网络和 Gitea 配置")
    def pull_from_cloud():
        sync = _get_sync()
        if not sync:
            return
        result = sync.pull_json("barcode_mappings.json")
        if result is None:
            messagebox.showwarning("拉取失败", "云端没有找到条码映射文件，或网络错误")
            return
        data, sha = result
        if not isinstance(data, dict) or len(data) == 0:
            messagebox.showwarning("拉取失败", "云端数据格式异常")
            return
        # Also save locally
        from app.core.excel.converter import UnitConverter
        uc = UnitConverter()
        uc.update_barcode_mappings(data)
        # Refresh the dialog
        _refresh_trees(data)
        messagebox.showinfo("拉取成功", f"已从云端拉取 {len(data)} 条映射，本地已同步更新")
    sync_frame = tk.Frame(bottom_frame)
    sync_frame.pack(side=tk.LEFT, padx=5)
    push_btn = tk.Button(sync_frame, text="同步到云端", command=push_to_cloud, fg="white", bg="#4a90d9")
    push_btn.pack(side=tk.LEFT, padx=3)
    pull_btn = tk.Button(sync_frame, text="从云端拉取", command=pull_from_cloud, fg="white", bg="#5cb85c")
    pull_btn.pack(side=tk.LEFT, padx=3)
    # Import the current mapping data
    if current_mappings:
        for barcode, data in current_mappings.items():
@@ -800,362 +666,5 @@ def show_config_dialog(parent, config_manager, on_save=None):
    dialog.transient(parent)
    dialog.grab_set()
-
+    # Wait for the window to close
-# ──────────────────────────────────────────────────────────────
+    parent.wait_window(dialog) 
 #  Cloud sync management dialog
 # ──────────────────────────────────────────────────────────────
 SYNC_FILES = [
    {
        "name": "条码映射",
        "remote": "barcode_mappings.json",
        "local": "config/barcode_mappings.json",
        "type": "json",
    },
    {
        "name": "供应商配置",
        "remote": "suppliers_config.json",
        "local": "config/suppliers_config.json",
        "type": "json",
    },
    {
        "name": "商品资料",
        "remote": "templates/商品资料.xlsx",
        "local": "templates/商品资料.xlsx",
        "type": "binary",
    },
    {
        "name": "采购单模板",
        "remote": "templates/银豹-采购单模板.xls",
        "local": "templates/银豹-采购单模板.xls",
        "type": "binary",
    },
    {
        "name": "商品记忆库",
        "remote": "product_memory.json",
        "local": "data/product_memory.json",
        "type": "json",
    },
 ]
 def _format_size(path: str) -> str:
    try:
        size = os.path.getsize(path)
        if size < 1024 * 1024:
            return f"{size / 1024:.1f} KB"
        return f"{size / (1024 * 1024):.1f} MB"
    except OSError:
        return "—"
 def show_cloud_sync_dialog(parent=None):
    """Unified cloud sync management dialog."""
    sync = GiteaSync.from_config(ConfigManager())
    if sync is None:
        messagebox.showwarning(
            "配置不完整",
            "请先在「系统设置」中配置 Gitea 地址和 Access Token",
        )
        return
    dlg = tk.Toplevel(parent)
    dlg.title("云端同步管理")
    dlg.geometry("620x440")
    dlg.resizable(False, False)
    # Center on screen
    dlg.update_idletasks()
    x = (dlg.winfo_screenwidth() - 620) // 2
    y = (dlg.winfo_screenheight() - 440) // 2
    dlg.geometry(f"620x440+{x}+{y}")
    # ── Treeview ──
    columns = ("name", "local_status", "cloud_status")
    tree = ttk.Treeview(dlg, columns=columns, show="headings", height=6)
    tree.heading("name", text="文件")
    tree.heading("local_status", text="本地状态")
    tree.heading("cloud_status", text="云端状态")
    tree.column("name", width=140)
    tree.column("local_status", width=220)
    tree.column("cloud_status", width=220)
    tree.pack(fill=tk.BOTH, expand=True, padx=16, pady=(16, 8))
    # Tag colors
    tree.tag_configure("synced", foreground="#2e7d32")
    tree.tag_configure("cloud_only", foreground="#e65100")
    tree.tag_configure("local_only", foreground="#1565c0")
    tree.tag_configure("missing", foreground="#999999")
    # Each row is identified by iid = remote_path
    cloud_sha_cache: dict = {}  # remote_path -> sha
    def _load_local_status():
        """Load local status only; no network requests are made."""
        for item in tree.get_children():
            tree.delete(item)
        for entry in SYNC_FILES:
            local = entry["local"]
            if os.path.exists(local):
                if entry["type"] == "json":
                    try:
                        with open(local, "r", encoding="utf-8") as f:
                            data = json.load(f)
                        if isinstance(data, dict):
                            local_text = f"{len(data)} 项"
                        elif isinstance(data, list):
                            local_text = f"{len(data)} 条记录"
                        else:
                            local_text = "已存在"
                    except Exception:
                        local_text = "已存在(解析异常)"
                else:
                    local_text = _format_size(local)
                tag = "local_only"
            else:
                local_text = "不存在"
                tag = "missing"
            tree.insert(
                "", tk.END,
                iid=entry["remote"],
                values=(entry["name"], local_text, "点「刷新状态」检查"),
                tags=(tag,),
            )
    def refresh_status():
        """Refresh the local/cloud status of every row."""
        cloud_sha_cache.clear()
        for item in tree.get_children():
            tree.delete(item)
        for entry in SYNC_FILES:
            remote = entry["remote"]
            local = entry["local"]
            # Local status
            if os.path.exists(local):
                if entry["type"] == "json":
                    try:
                        with open(local, "r", encoding="utf-8") as f:
                            data = json.load(f)
                        if isinstance(data, dict):
                            local_text = f"{len(data)} 项"
                        elif isinstance(data, list):
                            local_text = f"{len(data)} 条记录"
                        else:
                            local_text = "已存在"
                    except Exception:
                        local_text = "已存在(解析异常)"
                else:
                    local_text = _format_size(local)
            else:
                local_text = "不存在"
            # Cloud status: this is a network request and may be slow
            sha = sync.file_exists(remote)
            if sha:
                cloud_sha_cache[remote] = sha
                cloud_text = "已存在"
            else:
                cloud_text = "未上传"
            # tag
            local_ok = os.path.exists(local)
            cloud_ok = sha is not None
            if local_ok and cloud_ok:
                tag = "synced"
            elif cloud_ok and not local_ok:
                tag = "cloud_only"
            elif local_ok and not cloud_ok:
                tag = "local_only"
            else:
                tag = "missing"
            tree.insert(
                "", tk.END,
                iid=remote,
                values=(entry["name"], local_text, cloud_text),
                tags=(tag,),
            )
    # ── Actions ──
    def _get_selected_entries():
        """Return the list of selected file entries."""
        selected = tree.selection()
        if not selected:
            messagebox.showinfo("提示", "请先选中要操作的文件")
            return []
        return [e for e in SYNC_FILES if e["remote"] in selected]
    def push_selected():
        entries = _get_selected_entries()
        if not entries:
            return
        ok, fail = 0, 0
        for entry in entries:
            local, remote = entry["local"], entry["remote"]
            if not os.path.exists(local):
                messagebox.showwarning("跳过", f"本地文件不存在: {local}")
                fail += 1
                continue
            if entry["type"] == "json":
                try:
                    with open(local, "r", encoding="utf-8") as f:
                        data = json.load(f)
                    sha = cloud_sha_cache.get(remote)
                    result = sync.push_json(remote, data, f"同步 {entry['name']}", sha=sha)
                except Exception as e:
                    messagebox.showerror("推送失败", f"{entry['name']}: {e}")
                    fail += 1
                    continue
            else:
                result = sync.push_binary(remote, local, f"同步 {entry['name']}")
            if result:
                ok += 1
            else:
                fail += 1
        if ok:
            messagebox.showinfo("推送完成", f"成功 {ok} 个" + (f"，失败 {fail} 个" if fail else ""))
        refresh_status()
    def pull_selected():
        entries = _get_selected_entries()
        if not entries:
            return
        ok, fail = 0, 0
        for entry in entries:
            remote, local = entry["remote"], entry["local"]
            if entry["type"] == "json":
                result = sync.pull_json(remote)
                if result is None:
                    messagebox.showwarning("拉取失败", f"云端文件不存在: {entry['name']}")
                    fail += 1
                    continue
                content, sha = result
                # Write to the local file
                os.makedirs(os.path.dirname(local) or ".", exist_ok=True)
                with open(local, "w", encoding="utf-8") as f:
                    json.dump(content, f, ensure_ascii=False, indent=2)
                # Special post-processing
                _post_pull(entry, content)
            else:
                result = sync.pull_file(remote)
                if result is None:
                    messagebox.showwarning("拉取失败", f"云端文件不存在: {entry['name']}")
                    fail += 1
                    continue
                content, sha = result
                os.makedirs(os.path.dirname(local) or ".", exist_ok=True)
                with open(local, "wb") as f:
                    f.write(content)
            ok += 1
        if ok:
            messagebox.showinfo("拉取完成", f"成功 {ok} 个" + (f"，失败 {fail} 个" if fail else ""))
        refresh_status()
    def _post_pull(entry, data):
        """Special handling after pulling a JSON file."""
        if entry["remote"] == "barcode_mappings.json":
            try:
                from app.core.excel.converter import UnitConverter
                UnitConverter().update_barcode_mappings(data)
            except Exception:
                pass
        elif entry["remote"] == "suppliers_config.json":
            try:
                from app.services.processor_service import ProcessorService
                ProcessorService(ConfigManager()).reload_processors()
            except Exception:
                pass
        elif entry["remote"] == "product_memory.json":
            try:
                from app.core.db.product_db import ProductDatabase
                cfg = ConfigManager()
                db_path = cfg.get_path('Paths', 'product_db', fallback='data/product_cache.db') if hasattr(cfg, 'get_path') else 'data/product_cache.db'
                tpl_folder = cfg.get('Paths', 'template_folder', fallback='templates')
                item_data = cfg.get('Templates', 'item_data', fallback='商品资料.xlsx')
                tpl_path = os.path.join(tpl_folder, item_data)
                db = ProductDatabase(db_path, tpl_path)
                count = db.import_from_sync(data)
                logger.info(f"从云端导入商品记忆: {count} 条")
            except Exception:
                pass
    def push_all():
        ok, fail = 0, 0
        for entry in SYNC_FILES:
            local, remote = entry["local"], entry["remote"]
            if not os.path.exists(local):
                fail += 1
                continue
            if entry["type"] == "json":
                try:
                    with open(local, "r", encoding="utf-8") as f:
                        data = json.load(f)
                    sha = cloud_sha_cache.get(remote)
                    result = sync.push_json(remote, data, f"批量同步 {entry['name']}", sha=sha)
                except Exception:
                    fail += 1
                    continue
            else:
                result = sync.push_binary(remote, local, f"批量同步 {entry['name']}")
            if result:
                ok += 1
            else:
                fail += 1
        messagebox.showinfo("批量推送完成", f"成功 {ok} 个，失败 {fail} 个")
        refresh_status()
    def pull_all():
        ok, fail = 0, 0
        for entry in SYNC_FILES:
            remote, local = entry["remote"], entry["local"]
            if entry["type"] == "json":
                result = sync.pull_json(remote)
                if result is None:
                    fail += 1
                    continue
                content, sha = result
                os.makedirs(os.path.dirname(local) or ".", exist_ok=True)
                with open(local, "w", encoding="utf-8") as f:
                    json.dump(content, f, ensure_ascii=False, indent=2)
                _post_pull(entry, content)
            else:
                result = sync.pull_file(remote)
                if result is None:
                    fail += 1
                    continue
                content, sha = result
                os.makedirs(os.path.dirname(local) or ".", exist_ok=True)
                with open(local, "wb") as f:
                    f.write(content)
            ok += 1
        messagebox.showinfo("批量拉取完成", f"成功 {ok} 个，失败 {fail} 个")
        refresh_status()
    # ── Button area ──
    btn_frame = ttk.Frame(dlg)
    btn_frame.pack(fill=tk.X, padx=16, pady=(4, 16))
    # Left: batch operations
    ttk.Button(btn_frame, text="全部推送到云端", command=push_all).pack(side=tk.LEFT, padx=4)
    ttk.Button(btn_frame, text="全部从云端拉取", command=pull_all).pack(side=tk.LEFT, padx=4)
    # Right: actions on the selection + refresh + close
    ttk.Button(btn_frame, text="关闭", command=dlg.destroy).pack(side=tk.RIGHT, padx=4)
    ttk.Button(btn_frame, text="刷新状态", command=refresh_status).pack(side=tk.RIGHT, padx=4)
    tk.Button(btn_frame, text="推送到云端", command=push_selected, fg="white", bg="#4a90d9").pack(side=tk.RIGHT, padx=4)
    tk.Button(btn_frame, text="从云端拉取", command=pull_selected, fg="white", bg="#5cb85c").pack(side=tk.RIGHT, padx=4)
    # Show local status only; cloud status requires manually clicking "刷新状态" (Refresh status)
    _load_local_status()
    dlg.transient(parent)
    dlg.grab_set()
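The row tag chosen in `refresh_status` depends on only two facts: whether the local file exists and whether the cloud returned a sha. A minimal sketch of that decision as a pure function follows; `classify_sync_state` is a hypothetical helper name, not part of the diff. Note that "synced" here means only that both copies exist, not that their contents match.

```python
from typing import Optional


def classify_sync_state(local_exists: bool, cloud_sha: Optional[str]) -> str:
    """Map (local presence, cloud sha) to the Treeview tag names used in the dialog."""
    cloud_exists = cloud_sha is not None
    if local_exists and cloud_exists:
        return "synced"        # both copies exist (contents are not compared)
    if cloud_exists:
        return "cloud_only"    # present in the cloud, missing locally
    if local_exists:
        return "local_only"    # present locally, not uploaded yet
    return "missing"           # neither side has the file
```

Keeping this logic free of Tk and network calls would let `_load_local_status` and `refresh_status` share it and make it straightforward to unit-test.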
@@ -219,34 +219,6 @@ def save_json(data: Any, file_path: str, ensure_ascii: bool = False, indent: int
        logger.error(f"保存JSON文件失败: {file_path}, 错误: {e}")
        return False
 def smart_read_excel(file_path: Union[str, Path], **kwargs) -> Any:
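    """
    Read an Excel file, choosing the engine automatically; read errors are logged and re-raised
    Args:
        file_path: Path to the Excel file
        **kwargs: Extra arguments passed through to pd.read_excel
    Returns:
        A pandas.DataFrame object
    """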
    import pandas as pd
    path_str = str(file_path)
    ext = os.path.splitext(path_str)[1].lower()
    # Choose the engine automatically
    if ext == '.xlsx':
        kwargs.setdefault('engine', 'openpyxl')
    elif ext == '.xls':
        kwargs.setdefault('engine', 'xlrd')
    try:
        return pd.read_excel(path_str, **kwargs)
    except Exception as e:
        logger.error(f"读取 Excel 文件失败: {path_str}, 错误: {e}")
        raise
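The engine choice in `smart_read_excel` is a lookup on the file extension. A minimal sketch of only that step follows, written without pandas so it runs on its own; `pick_excel_engine` is a hypothetical helper name.

```python
import os
from typing import Optional

# Extension -> pandas engine name, mirroring the branches in smart_read_excel.
_ENGINE_BY_EXT = {".xlsx": "openpyxl", ".xls": "xlrd"}


def pick_excel_engine(file_path) -> Optional[str]:
    """Return the engine name for a path, or None to let pandas decide."""
    ext = os.path.splitext(str(file_path))[1].lower()
    return _ENGINE_BY_EXT.get(ext)
```

Because `smart_read_excel` uses `kwargs.setdefault`, an `engine` passed explicitly by the caller still takes precedence over this lookup.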
 def get_file_size(file_path: str) -> int:
    """
    Get the file size (in bytes)
@@ -277,10 +249,3 @@ def is_file_size_valid(file_path: str, max_size_mb: float) -> bool:
    size_bytes = get_file_size(file_path)
    max_size_bytes = max_size_mb * 1024 * 1024
    return size_bytes <= max_size_bytes 
 def format_file_size(size_bytes: int) -> str:
    """Format a byte count as a human-readable file size string (KB/MB)."""
    if size_bytes < 1024 * 1024:
        return f"{size_bytes / 1024:.1f} KB"
    return f"{size_bytes / (1024 * 1024):.1f} MB"
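For reference, the helper above behaves as follows at its boundaries. The function body is copied from the diff; the sample values are illustrative only.

```python
def format_file_size(size_bytes: int) -> str:
    """Format a byte count as KB below 1 MiB, otherwise as MB."""
    if size_bytes < 1024 * 1024:
        return f"{size_bytes / 1024:.1f} KB"
    return f"{size_bytes / (1024 * 1024):.1f} MB"


# Files under 1 KiB show up as fractional KB, since there is no "bytes" branch.
print(format_file_size(512))              # 0.5 KB
print(format_file_size(1536))             # 1.5 KB
print(format_file_size(1024 * 1024))      # 1.0 MB
print(format_file_size(5 * 1024 * 1024))  # 5.0 MB
```

The same two-branch formatting also appears inline in `create_custom_dialog` and in `_format_size` elsewhere in this diff.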
@@ -7,7 +7,6 @@
 import os
 import sys
 import logging
 from logging.handlers import RotatingFileHandler
 from datetime import datetime
 from pathlib import Path
 from typing import Optional, Dict
@@ -59,8 +58,7 @@ def setup_logger(name: str,
        # Create the file handler
        try:
-            # Use a rotating log to cap the size of each log file and the number of backups
+            file_handler = logging.FileHandler(log_file, encoding='utf-8')
            file_handler = RotatingFileHandler(log_file, maxBytes=5 * 1024 * 1024, backupCount=3, encoding='utf-8')
            file_handler.setFormatter(formatter)
            file_handler.setLevel(level)
            logger.addHandler(file_handler)
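The hunk above switches between a plain `FileHandler` and a `RotatingFileHandler` capped at 5 MB with 3 backups. A minimal, self-contained sketch of the rotating variant follows; `build_rotating_logger` and the format string are assumptions for illustration and are not the project's `setup_logger`.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_rotating_logger(name: str, log_file: str) -> logging.Logger:
    """Attach a size-capped rotating file handler to the named logger."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = RotatingFileHandler(
        log_file,
        maxBytes=5 * 1024 * 1024,  # roll over once the file reaches about 5 MB
        backupCount=3,             # keep log_file.1 .. log_file.3
        encoding="utf-8",
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

With these settings the log directory holds at most four files per logger, which bounds disk usage for long-running sessions.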
@@ -5,7 +5,7 @@
 """
 import re
-from typing import Dict, List, Optional, Tuple, Any
+from typing import Dict, List, Optional, Tuple, Any, Match, Pattern
 def clean_string(text: str) -> str:
    """
@@ -192,51 +192,6 @@ def is_scientific_notation(value: str) -> bool:
    """
    return bool(re.match(r'^-?\d+(\.\d+)?[eE][+-]?\d+$', str(value)))
 def parse_monetary_string(value: Any) -> Optional[float]:
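    """
    Parse a monetary/quantity string into a float.
    Handles: currency symbols (¥/$), comma as decimal point, comma as thousands separator, the Chinese "元" suffix, etc.
    Args:
        value: The amount (string, number, or other type)
    Returns:
        The parsed float, or None if the value cannot be parsed
    """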
    if value is None:
        return None
    if isinstance(value, (int, float)):
        return float(value)
    if not isinstance(value, str):
        return None
    s = value.strip()
    if not s or s.lower() in ('o', 'none', 'null', '-', '--'):
        return None
    # Strip non-numeric characters, keeping digits, decimal point, comma and minus sign
    cleaned = re.sub(r'[^\d\.\-,]', '', s)
    if not cleaned or cleaned in ('-', '.', '-.', ','):
        return None
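    # Comma handling strategy:
    #   multiple commas -> thousands separators, remove all (e.g. "1,234,567" = 1234567)
    #   one comma + no decimal point -> comma is the decimal point (e.g. "1,5" = 1.5)
    #   one comma + a decimal point -> thousands separator, remove it (e.g. "1,234.56" = 1234.56)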
    comma_count = cleaned.count(',')
    if comma_count > 1:
        cleaned = cleaned.replace(',', '')
    elif comma_count == 1 and '.' not in cleaned:
        cleaned = cleaned.replace(',', '.')
    elif comma_count == 1 and '.' in cleaned:
        cleaned = cleaned.replace(',', '')
    try:
        return float(cleaned)
    except (ValueError, TypeError):
        return None
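The comma strategy above is easier to judge against concrete inputs. The sketch below is a condensed restatement of the same rules (the two "remove commas" branches are merged), followed by sample calls. One consequence to be aware of: a single comma with no decimal point is always read as a decimal separator, so `"1,234"` parses as 1.234 rather than 1234.

```python
import re
from typing import Any, Optional


def parse_monetary_string(value: Any) -> Optional[float]:
    """Condensed restatement of the parsing rules shown in the diff."""
    if value is None:
        return None
    if isinstance(value, (int, float)):
        return float(value)
    if not isinstance(value, str):
        return None
    s = value.strip()
    if not s or s.lower() in ("o", "none", "null", "-", "--"):
        return None
    # Keep only digits, '.', ',' and '-'
    cleaned = re.sub(r"[^\d\.\-,]", "", s)
    if not cleaned or cleaned in ("-", ".", "-.", ","):
        return None
    if cleaned.count(",") == 1 and "." not in cleaned:
        cleaned = cleaned.replace(",", ".")   # lone comma acts as decimal point
    else:
        cleaned = cleaned.replace(",", "")    # otherwise commas are separators
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_monetary_string("¥1,234.56"))   # 1234.56
print(parse_monetary_string("1,5"))         # 1.5
print(parse_monetary_string("1,234,567"))   # 1234567.0
print(parse_monetary_string("12.5元"))       # 12.5
print(parse_monetary_string("1,234"))       # 1.234 (not 1234)
```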
 def format_barcode(barcode: Any) -> str:
    """
    Format a barcode, handling scientific notation
@@ -4,7 +4,7 @@ OCR服务模块
 Provides the OCR recognition service and coordinates the OCR workflow.
 """
-from typing import Dict, List, Optional, Tuple, Union, Any, Callable
+from typing import Dict, List, Optional, Tuple, Union, Any
 import os
 from ..config.settings import ConfigManager
@@ -88,7 +88,7 @@ class OCRService:
            logger.error(f"处理图片时发生错误: {e}", exc_info=True)
            return None
-    def process_images_batch(self, batch_size: int = None, max_workers: int = None, progress_cb: Optional[Callable[[int], None]] = None) -> Tuple[int, int]:
+    def process_images_batch(self, batch_size: int = None, max_workers: int = None) -> Tuple[int, int]:
        """
        Process images in batch
@@ -100,10 +100,10 @@ class OCRService:
            Tuple of (total processed, successfully processed)
        """
        logger.info(f"OCRService开始批量处理图片, batch_size={batch_size}, max_workers={max_workers}")
-        return self.ocr_processor.process_images_batch(batch_size, max_workers, progress_cb)
+        return self.ocr_processor.process_images_batch(batch_size, max_workers)
    # Add batch_process as an alias of process_images_batch for compatibility
-    def batch_process(self, batch_size: int = None, max_workers: int = None, progress_cb: Optional[Callable[[int], None]] = None) -> Tuple[int, int]:
+    def batch_process(self, batch_size: int = None, max_workers: int = None) -> Tuple[int, int]:
        """
        Process images in batch (alias method, same behavior as process_images_batch)
@@ -115,7 +115,7 @@ class OCRService:
            Tuple of (total processed, successfully processed)
        """
        logger.info(f"OCRService.batch_process被调用，转发到process_images_batch")
-        return self.process_images_batch(batch_size, max_workers, progress_cb)
+        return self.process_images_batch(batch_size, max_workers)
    def validate_image(self, image_path: str) -> bool:
        """
@@ -154,7 +154,7 @@ class OCRService:
        # Get the file name (without extension)
        base_name = os.path.splitext(os.path.basename(image_path))[0]
        # Build the Excel file path
-        output_dir = self.config.get_path('Paths', 'output_folder', fallback='data/output', create=True) if hasattr(self.config, 'get_path') else os.path.abspath('data/output')
+        output_dir = self.config.get('Paths', 'output_folder', fallback='data/output')
        excel_path = os.path.join(output_dir, f"{base_name}.xlsx")
        return excel_path
@@ -4,14 +4,12 @@
 Provides the order processing service and coordinates Excel processing and order merging.
 """
-import os
+from typing import Dict, List, Optional, Tuple, Union, Any
 from typing import Dict, List, Optional, Tuple, Union, Any, Callable
 from ..config.settings import ConfigManager
 from ..core.utils.log_utils import get_logger
 from ..core.excel.processor import ExcelProcessor
 from ..core.excel.merger import PurchaseOrderMerger
 from ..core.db.product_db import ProductDatabase
 logger = get_logger(__name__)
@@ -30,15 +28,8 @@ class OrderService:
        logger.info("初始化OrderService")
        self.config = config or ConfigManager()
        # Create the shared product database instance
        db_path = self.config.get_path('Paths', 'product_db', fallback='data/product_cache.db') if hasattr(self.config, 'get_path') else 'data/product_cache.db'
        tpl_folder = self.config.get('Paths', 'template_folder', fallback='templates')
        item_data = self.config.get('Templates', 'item_data', fallback='商品资料.xlsx')
        tpl_path = os.path.join(tpl_folder, item_data)
        self.product_db = ProductDatabase(db_path, tpl_path)
        # Create the Excel processor and the purchase order merger
-        self.excel_processor = ExcelProcessor(self.config, product_db=self.product_db)
+        self.excel_processor = ExcelProcessor(self.config)
        self.order_merger = PurchaseOrderMerger(self.config)
        logger.info("OrderService初始化完成")
@@ -52,9 +43,9 @@ class OrderService:
        """
        return self.excel_processor.get_latest_excel()
-    def process_excel(self, file_path: Optional[str] = None, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def process_excel(self, file_path: Optional[str] = None) -> Optional[str]:
        """
-        Process an Excel order file and generate a standard purchase order
+        Process an Excel file and generate a purchase order
        Args:
            file_path: Path to the Excel file; if None, the latest file is processed
@@ -62,84 +53,12 @@ class OrderService:
        Returns:
            Path to the output purchase order file, or None if processing fails
        """
-        if not file_path:
+        if file_path:
            file_path = self.excel_processor.get_latest_excel()
            if not file_path:
                logger.warning("未找到可处理的Excel文件")
                return None
            logger.info("OrderService开始处理最新Excel文件")
        else:
            logger.info(f"OrderService开始处理指定Excel文件: {file_path}")
-
+            return self.excel_processor.process_specific_file(file_path)
-        # Check whether special supplier preprocessing is needed (e.g. Yang Biyue)
+        else:
-        try:
+            logger.info("OrderService开始处理最新Excel文件")
-            from .special_suppliers_service import SpecialSuppliersService
+            return self.excel_processor.process_latest_file()
            special_service = SpecialSuppliersService(self.config)
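            # Try to detect and preprocess (note: progress_cb is no longer passed here, to avoid infinite recursion or duplicate progress bars;
            # alternatively, the result is returned directly once special_service has finished its internal logic).
            # To avoid circular calls, SpecialSuppliersService no longer calls process_excel internally;
            # instead, process_excel decides after detection whether to process the preprocessed file.
            # A new check_and_preprocess method is added for this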
            preprocessed_path = self._check_special_preprocess(file_path)
            if preprocessed_path:
                logger.info(f"检测到特殊供应商，已生成预处理文件: {preprocessed_path}")
                file_path = preprocessed_path
        except Exception as e:
            logger.error(f"检查特殊预处理时出错: {e}")
        return self.excel_processor.process_specific_file(file_path, progress_cb=progress_cb)
    def _check_special_preprocess(self, file_path: str) -> Optional[str]:
        """Check for and run special preprocessing (supports Yang Biyue, the tobacco company, and Rongcheng Yigou)."""
        try:
            from app.core.utils.file_utils import smart_read_excel
            import pandas as pd
            import re
            # Read only the first 50 rows for detection (header=None ensures the first row is included)
            df_head = smart_read_excel(file_path, nrows=50, header=None)
            df_str = df_head.astype(str)
            # 1. Detect: tobacco company (Tobacco)
            # Signature: the content contains "专卖证号" or the specific license number "510109104938"
            is_tobacco = df_str.apply(lambda x: x.str.contains('专卖证号|510109104938')).any().any()
            if is_tobacco:
                logger.info("识别到烟草公司订单，执行专用预处理...")
                from .tobacco_service import TobaccoService
                tobacco_svc = TobaccoService(self.config)
                return tobacco_svc.preprocess_tobacco_order(file_path)
            # 2. Detect: Rongcheng Yigou (蓉城易购)
            # Signature: the content contains the order number marker "RCDH"
            is_rongcheng = df_str.apply(lambda x: x.str.contains('RCDH')).any().any()
            if is_rongcheng:
                logger.info("识别到蓉城易购订单，执行专用预处理...")
                from .special_suppliers_service import SpecialSuppliersService
                special_svc = SpecialSuppliersService(self.config)
                return special_svc.preprocess_rongcheng_yigou(file_path)
            # 3. Detect: Yang Biyue (杨碧月)
            # Signature: the handler ("经手人") column contains "杨碧月"
            handler_col = None
            for col in df_head.columns:
                # Search the first 50 rows for the keyword "经手人"
                if df_head[col].astype(str).str.contains('经手人').any():
                    handler_col = col
                    break
            if handler_col is not None:
                # Check whether that column contains "杨碧月"
                if df_head[handler_col].astype(str).str.contains('杨碧月').any():
                    logger.info("识别到杨碧月订单，执行专用预处理...")
                    from .special_suppliers_service import SpecialSuppliersService
                    special_svc = SpecialSuppliersService(self.config)
                    return special_svc.process_yang_biyue_only(file_path)
        except Exception as e:
            logger.warning(f"智能预处理识别失败: {e}")
        return None
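The detection order above (tobacco first, then Rongcheng Yigou, then Yang Biyue) can be sketched without pandas. This is a minimal illustration only: `detect_supplier` and its return labels are hypothetical names, and it works on plain lists of rows rather than on the DataFrame returned by `smart_read_excel`.

```python
from typing import Iterable, Optional, Sequence


def detect_supplier(rows: Iterable[Sequence[object]]) -> Optional[str]:
    """Return a hypothetical supplier label based on marker strings in the cells."""
    cells = [[str(c) for c in row] for row in rows]
    flat = [c for row in cells for c in row]

    # 1. Tobacco: license keyword or the specific license number anywhere in the sheet
    if any("专卖证号" in c or "510109104938" in c for c in flat):
        return "tobacco"

    # 2. Rongcheng Yigou: order-number marker anywhere in the sheet
    if any("RCDH" in c for c in flat):
        return "rongcheng_yigou"

    # 3. Yang Biyue: first column that mentions the handler keyword must also hold the name
    width = max((len(r) for r in cells), default=0)
    for col in range(width):
        column = [r[col] for r in cells if col < len(r)]
        if any("经手人" in c for c in column):
            if any("杨碧月" in c for c in column):
                return "yang_biyue"
            break  # only the first handler column is inspected, as in the method above
    return None
```

Because the checks run in a fixed order, a sheet that happens to contain both a tobacco marker and "RCDH" is treated as a tobacco order.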
    def get_purchase_orders(self) -> List[str]:
        """
@@ -150,7 +69,7 @@ class OrderService:
        """
        return self.order_merger.get_purchase_orders()
-    def merge_purchase_orders(self, file_paths: List[str], progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def merge_purchase_orders(self, file_paths: List[str]) -> Optional[str]:
        """
        合并指定的采购单文件
@@ -161,9 +80,9 @@ class OrderService:
            合并后的采购单文件路径，如果合并失败则返回None
        """
        logger.info(f"OrderService开始合并指定采购单: {file_paths}")
-        return self.merge_orders(file_paths, progress_cb)
+        return self.merge_orders(file_paths)
-    def merge_all_purchase_orders(self, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def merge_all_purchase_orders(self) -> Optional[str]:
        """
        合并所有可用的采购单文件
@@ -171,9 +90,9 @@ class OrderService:
            合并后的采购单文件路径，如果合并失败则返回None
        """
        logger.info("OrderService开始合并所有采购单")
-        return self.merge_orders(None, progress_cb)
+        return self.merge_orders(None)
-    def merge_orders(self, file_paths: Optional[List[str]] = None, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def merge_orders(self, file_paths: Optional[List[str]] = None) -> Optional[str]:
        """
        合并采购单
@@ -188,58 +107,4 @@ class OrderService:
        else:
            logger.info("OrderService开始合并所有采购单")
-        return self.order_merger.process(file_paths, progress_cb)
+        return self.order_merger.process(file_paths) 
    def validate_unit_price(self, result_path: str) -> List[str]:
        """
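        Compare each unit price in a purchase order against the purchase price stored in the product data
        Args:
            result_path: path of the purchase order to check
        Returns:
            A list of difference messages; an empty list when no price differs by more than 1.0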
        """
        try:
            import pandas as pd
            from app.core.utils.file_utils import smart_read_excel
            from app.core.handlers.column_mapper import ColumnMapper as CM
            # Use the shared product database instance
            product_db = self.product_db
            # Read the purchase order to be checked
            df_res = smart_read_excel(result_path)
            res_barcode_col = CM.find_column(list(df_res.columns), 'barcode')
            res_price_col = CM.find_column(list(df_res.columns), 'unit_price')
            if not res_barcode_col or not res_price_col:
                logger.warning("未能在采购单中找到条码或单价列")
                return []
            # Look up purchase prices for all barcodes in one batch
            barcodes = df_res[res_barcode_col].astype(str).str.strip().tolist()
            item_prices = product_db.get_prices(barcodes)
            results = []
            for _, row in df_res.iterrows():
                bc = str(row[res_barcode_col]).strip()
                if bc not in item_prices:
                    continue
                try:
                    res_price = float(row[res_price_col])
                except (ValueError, TypeError):
                    continue
                item_price = item_prices[bc]
                diff = abs(res_price - item_price)
                if diff > 1.0:
                    results.append(f"条码 {bc}: 采购单价={res_price} vs 进货价={item_price} 差异={diff:.2f}")
            return results
        except Exception as e:
            logger.error(f"单价校验过程中发生错误: {e}")
            return []
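The comparison performed by `validate_unit_price` can be sketched on plain Python data. This is an illustration under stated assumptions: `find_price_mismatches` is a hypothetical helper, the reference prices are passed in as a dict instead of coming from `product_db.get_prices`, and the 1.0 tolerance mirrors the threshold in the method above.

```python
from typing import Dict, Iterable, List, Tuple


def find_price_mismatches(
    rows: Iterable[Tuple[object, object]],
    reference_prices: Dict[str, float],
    tolerance: float = 1.0,
) -> List[str]:
    """Return one message per row whose price differs from the reference by more than `tolerance`."""
    messages = []
    for barcode, raw_price in rows:
        barcode = str(barcode).strip()
        if barcode not in reference_prices:
            continue  # unknown product: nothing to compare against
        try:
            price = float(raw_price)
        except (ValueError, TypeError):
            continue  # non-numeric cell: skip the row
        reference = reference_prices[barcode]
        diff = abs(price - reference)
        if diff > tolerance:
            messages.append(f"{barcode}: order={price} reference={reference} diff={diff:.2f}")
    return messages
```

Rows with unknown barcodes or unparseable prices are skipped rather than reported, which matches the behaviour of the loop above.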
@@ -1,297 +0,0 @@
 """
 Processor dispatch service
 Manages and dispatches the file processors, detecting the file type and selecting a suitable processor
 """
 import logging
 from typing import Dict, Any, Optional, List
 from pathlib import Path
 from ..core.processors.base import BaseProcessor
 from ..core.processors.tobacco_processor import TobaccoProcessor
 from ..core.processors.ocr_processor import OCRProcessor
 from ..core.utils.log_utils import get_logger
 logger = get_logger(__name__)
 class ProcessorService:
    """Processor dispatch service
    Manages all processor instances and provides a single entry point for file processing
    """
    def __init__(self, config: Dict[str, Any]):
        """初始化处理器服务
        Args:
            config: 系统配置字典
        """
        self.config = config
        self.processors: List[BaseProcessor] = []
        self._load_processors()
        logger.info(f"处理器服务初始化完成，加载了{len(self.processors)}个处理器")
    def _load_processors(self):
        """加载所有处理器"""
        try:
            self.processors = [
                TobaccoProcessor(self.config),
                OCRProcessor(self.config),
            ]
            supplier_configs = []
            try:
                import json
                from pathlib import Path
                # Prefer loading from `config/suppliers_config.json`
                config_path = Path("config/suppliers_config.json")
                if not config_path.exists():
                    # Fall back to the working directory for compatibility
                    config_path = Path("./suppliers_config.json")
                if config_path.exists():
                    with open(config_path, 'r', encoding='utf-8') as f:
                        data = json.load(f)
                        ok, errs, supplier_configs = self._validate_suppliers_config(data)
                        if not ok:
                            logger.error("供应商配置校验失败：\n" + "\n".join([f"- {e}" for e in errs]))
                        else:
                            logger.info(f"从 {config_path} 加载供应商配置，共 {len(supplier_configs)} 项")
                else:
                    logger.info("未找到供应商配置文件，跳过供应商处理器加载")
            except Exception as e:
                logger.error(f"读取供应商配置失败: {e}")
            for supplier_config in supplier_configs:
                try:
                    from ..core.processors.supplier_processors.generic_supplier_processor import GenericSupplierProcessor
                    processor = GenericSupplierProcessor(self.config, supplier_config)
                    self.processors.append(processor)
                    logger.info(f"加载供应商处理器: {processor.name}")
                except Exception as e:
                    logger.error(f"加载供应商处理器失败: {e}")
            logger.info(f"成功加载{len(self.processors)}个处理器")
        except Exception as e:
            logger.error(f"加载处理器时出错: {e}", exc_info=True)
            self.processors = [
                TobaccoProcessor(self.config),
                OCRProcessor(self.config),
            ]
    def _validate_suppliers_config(self, data):
        try:
            suppliers = data.get('suppliers')
            errors = []
            valid = []
            if not isinstance(suppliers, list) or not suppliers:
                errors.append('suppliers必须是非空数组')
                return False, errors, []
            for idx, s in enumerate(suppliers):
                e = self._validate_single_supplier(s, idx)
                if e:
                    errors.extend(e)
                else:
                    valid.append(s)
            return len(errors) == 0, errors, valid
        except Exception as e:
            return False, [f'配置解析异常: {e}'], []
    def _validate_single_supplier(self, s, idx):
        errs = []
        prefix = f'suppliers[{idx}]'
        name = s.get('name')
        if not name or not isinstance(name, str):
            errs.append(f'{prefix}.name 必须为字符串')
        fp = s.get('filename_patterns', [])
        ci = s.get('content_indicators', [])
        if not fp and not ci:
            errs.append(f'{prefix} 必须至少提供 filename_patterns 或 content_indicators 之一')
        cm = s.get('column_mapping', {})
        if cm and not isinstance(cm, dict):
            errs.append(f'{prefix}.column_mapping 必须为对象')
        cr = s.get('cleaning_rules') or []
        if not isinstance(cr, list):
            errs.append(f'{prefix}.cleaning_rules 必须为数组')
        else:
            for i, rule in enumerate(cr):
                # Guard against non-dict entries, which would otherwise raise AttributeError on .get()
                if not isinstance(rule, dict):
                    errs.append(f'{prefix}.cleaning_rules[{i}] 必须为对象')
                    continue
                rtype = rule.get('type')
                if rtype not in ('remove_rows','fill_na','convert_type'):
                    errs.append(f'{prefix}.cleaning_rules[{i}].type 非法: {rtype}')
                if rtype == 'remove_rows' and not rule.get('condition'):
                    errs.append(f'{prefix}.cleaning_rules[{i}].condition 必填')
                if rtype in ('fill_na','convert_type'):
                    if not rule.get('columns') and not rule.get('column'):
                        errs.append(f'{prefix}.cleaning_rules[{i}] 需提供 columns 或 column')
        calc = s.get('calculations') or []
        if not isinstance(calc, list):
            errs.append(f'{prefix}.calculations 必须为数组')
        else:
            for i, c in enumerate(calc):
                # Guard against non-dict entries, which would otherwise raise AttributeError on .get()
                if not isinstance(c, dict):
                    errs.append(f'{prefix}.calculations[{i}] 必须为对象')
                    continue
                ctype = c.get('type')
                if ctype not in ('multiply','divide','formula'):
                    errs.append(f'{prefix}.calculations[{i}].type 非法: {ctype}')
                if ctype in ('multiply','divide'):
                    if not c.get('source_column') or not c.get('target_column'):
                        errs.append(f'{prefix}.calculations[{i}] 需提供 source_column 与 target_column')
                if ctype == 'formula' and (not c.get('formula') or not c.get('target_column')):
                    errs.append(f'{prefix}.calculations[{i}] 需提供 formula 与 target_column')
        return errs
    def process_file(self, input_file: Path, output_dir: Path, 
                    preferred_processor: Optional[str] = None) -> Optional[Path]:
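        """Process a file, automatically selecting a suitable processor
        Args:
            input_file: path of the input file
            output_dir: path of the output directory
            preferred_processor: name of the processor to try first (optional)
        Returns:
            Path of the output file, or None if processing failed
        """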
        if not input_file.exists():
            logger.error(f"输入文件不存在: {input_file}")
            return None
        if not output_dir.exists():
            output_dir.mkdir(parents=True, exist_ok=True)
        try:
            # 如果指定了优先处理器，先尝试使用它
            if preferred_processor:
                processor = self._get_processor_by_name(preferred_processor)
                if processor and processor.can_process(input_file):
                    logger.info(f"使用指定的处理器: {processor.name}")
                    return processor.process(input_file, output_dir)
                else:
                    logger.warning(f"指定的处理器不可用或无法处理该文件: {preferred_processor}")
            # 自动选择合适的处理器
            suitable_processors = [p for p in self.processors if p.can_process(input_file)]
            if not suitable_processors:
                logger.warning(f"未找到适合处理文件的处理器: {input_file}")
                logger.info(f"支持的文件类型: {self.get_supported_types()}")
                return None
            # 使用第一个合适的处理器
            processor = suitable_processors[0]
            logger.info(f"使用处理器 {processor.name} 处理文件: {input_file}")
            return processor.process(input_file, output_dir)
        except Exception as e:
            logger.error(f"处理文件时出错: {e}", exc_info=True)
            return None
    def _get_processor_by_name(self, name: str) -> Optional[BaseProcessor]:
        """根据名称获取处理器
        Args:
            name: 处理器名称
        Returns:
            处理器实例或None
        """
        for processor in self.processors:
            if processor.name == name or processor.__class__.__name__ == name:
                return processor
        return None
    def get_supported_types(self) -> List[Dict[str, Any]]:
        """获取支持的文件类型信息
        Returns:
            处理器类型信息列表
        """
        return [
            {
                'name': processor.name,
                'description': processor.description,
                'extensions': processor.get_supported_extensions(),
                'class_name': processor.__class__.__name__
            }
            for processor in self.processors
        ]
    def get_processor_info(self) -> List[Dict[str, Any]]:
        """获取处理器详细信息
        Returns:
            处理器详细信息列表
        """
        return [
            {
                'name': processor.name,
                'description': processor.description,
                'extensions': processor.get_supported_extensions(),
                'required_columns': processor.get_required_columns(),
                'class_name': processor.__class__.__name__,
                'module': processor.__class__.__module__
            }
            for processor in self.processors
        ]
    def can_process_file(self, file_path: Path) -> bool:
        """检查是否有处理器能处理该文件
        Args:
            file_path: 文件路径
        Returns:
            是否有处理器能处理
        """
        if not file_path.exists():
            return False
        return any(processor.can_process(file_path) for processor in self.processors)
    def get_suitable_processors(self, file_path: Path) -> List[BaseProcessor]:
        """获取能处理该文件的所有处理器
        Args:
            file_path: 文件路径
        Returns:
            合适的处理器列表
        """
        if not file_path.exists():
            return []
        return [p for p in self.processors if p.can_process(file_path)]
    def reload_processors(self):
        """重新加载处理器"""
        logger.info("重新加载处理器...")
        self.processors.clear()
        self._load_processors()
        logger.info(f"重新加载完成，共{len(self.processors)}个处理器")
    def add_processor(self, processor: BaseProcessor):
        """添加处理器
        Args:
            processor: 处理器实例
        """
        self.processors.append(processor)
        logger.info(f"添加处理器: {processor.name}")
    def remove_processor(self, processor_name: str) -> bool:
        """移除处理器
        Args:
            processor_name: 处理器名称
        Returns:
            是否成功移除
        """
        for i, processor in enumerate(self.processors):
            if processor.name == processor_name or processor.__class__.__name__ == processor_name:
                del self.processors[i]
                logger.info(f"移除处理器: {processor_name}")
                return True
        logger.warning(f"未找到要移除的处理器: {processor_name}")
        return False
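For orientation, here is a sketch of a supplier entry that would satisfy the checks in `_validate_single_supplier`, together with a reduced validator. The field names come from the validator above; every value in `sample`, and the helper name `check_supplier`, are hypothetical, and the real `suppliers_config.json` may carry additional keys that this sketch does not know about.

```python
from typing import Any, Dict, List

ALLOWED_RULE_TYPES = ("remove_rows", "fill_na", "convert_type")

# Hypothetical entry: field names follow the validator, values are made up for illustration
sample = {
    "name": "Example Supplier",
    "filename_patterns": ["example_*.xlsx"],
    "column_mapping": {"Barcode": "barcode"},
    "cleaning_rules": [{"type": "fill_na", "columns": ["quantity"]}],
    "calculations": [
        {"type": "multiply", "source_column": "quantity", "target_column": "amount"}
    ],
}


def check_supplier(entry: Dict[str, Any], idx: int) -> List[str]:
    """Reduced version of the validation rules: name, detection hints, rule types."""
    prefix = f"suppliers[{idx}]"
    errors = []
    name = entry.get("name")
    if not isinstance(name, str) or not name:
        errors.append(f"{prefix}.name must be a non-empty string")
    if not entry.get("filename_patterns") and not entry.get("content_indicators"):
        errors.append(f"{prefix} needs filename_patterns or content_indicators")
    for i, rule in enumerate(entry.get("cleaning_rules") or []):
        if not isinstance(rule, dict) or rule.get("type") not in ALLOWED_RULE_TYPES:
            errors.append(f"{prefix}.cleaning_rules[{i}] has an invalid type")
    return errors
```

An entry needs at least one way to be recognised (`filename_patterns` or `content_indicators`); everything else is optional in the validator above.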
@@ -1,237 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 import os
 import re
 import time
 import pandas as pd
 from typing import Optional, Callable
 from ..core.utils.log_utils import get_logger
 logger = get_logger(__name__)
 class SpecialSuppliersService:
    """
    Service class for supplier-specific handling, e.g. Rongcheng Yigou
    """
    def __init__(self, config_manager=None):
        self.config_manager = config_manager
    def process_yang_biyue_only(self, src_path: str) -> Optional[str]:
        """
        Run only the preprocessing step for Yang Biyue orders and return the path of the preprocessed file
        """
        try:
            from app.core.utils.file_utils import smart_read_excel
            # 读取原始数据
            df = smart_read_excel(src_path)
            # 检查是否包含“杨碧月”
            handler_col = None
            for col in df.columns:
                if '经手人' in str(col):
                    handler_col = col
                    break
            if handler_col is None or not df[handler_col].astype(str).str.contains('杨碧月').any():
                return None
            # 识别到杨碧月订单，执行专用清洗
            logger.info("识别到杨碧月订单，正在执行专用清洗...")
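            # Column mapping (source header -> Chinese column name expected by ExcelProcessor)
            # Exact matches are tried first so that "结算单位" (settlement unit) is not mistaken for "单位" (unit)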
            column_map = {
                '商品条码': '商品条码',
                '商品名称': '商品名称',
                '商品规格': '规格',
                '单位': '单位',
                '数量': '数量',
                '含税单价': '单价',
                '含税金额': '金额'
            }
            found_cols = {}
            # 1. First pass: exact header match
            for target_zh, std_name in column_map.items():
                for col in df.columns:
                    if str(col).strip() == target_zh:
                        found_cols[col] = std_name
                        break
            # 2. Second pass: substring match for columns still unmatched, skipping known distractor headers
            for target_zh, std_name in column_map.items():
                if std_name in found_cols.values():
                    continue
                for col in df.columns:
                    col_str = str(col)
                    if target_zh in col_str:
                        # 排除干扰列
                        if target_zh == '单位' and '结算单位' in col_str:
                            continue
                        if target_zh == '数量' and '基本单位数量' in col_str:
                            continue
                        found_cols[col] = std_name
                        break
            if len(found_cols) < 4:
                logger.error(f"杨碧月订单列匹配不足: 找到 {list(found_cols.values())}")
                return None
            df_clean = df[list(found_cols.keys())].copy()
            df_clean = df_clean.rename(columns=found_cols)
            # 过滤掉空的条码行
            df_clean = df_clean.dropna(subset=['商品条码'])
            # 保存预处理文件到输出目录（而非源文件目录）
            if self.config_manager and hasattr(self.config_manager, 'get_path'):
                out_dir = self.config_manager.get_path('Paths', 'output_folder', fallback='data/output', create=True)
            else:
                from app.config.settings import ConfigManager
                out_dir = ConfigManager().get_path('Paths', 'output_folder', fallback='data/output', create=True)
            os.makedirs(out_dir, exist_ok=True)
            base = os.path.basename(src_path)
            final_path = os.path.join(out_dir, f"预处理之后_{base}")
            df_clean.to_excel(final_path, index=False)
            return final_path
        except Exception as e:
            logger.error(f"预处理杨碧月订单出错: {e}")
            return None
    def process_yang_biyue(self, src_path: str, progress_cb: Optional[Callable[[int, str], None]] = None) -> Optional[str]:
        """
        处理杨碧月经手的订单（预处理+处理）
        """
        try:
            if progress_cb: progress_cb(10, "正在进行杨碧月订单预处理...")
            preprocessed_path = self.process_yang_biyue_only(src_path)
            if not preprocessed_path:
                return None
            if progress_cb: progress_cb(60, "预处理文件已保存，开始标准转换流程...")
            # 延迟导入以避免循环依赖
            from app.services.order_service import OrderService
            order_service = OrderService(self.config_manager)
            result = order_service.process_excel(preprocessed_path, progress_cb=lambda p: progress_cb(60 + int(p*0.4), "生成采购单中...") if progress_cb else None)
            return result
        except Exception as e:
            logger.error(f"处理杨碧月订单出错: {e}")
            return None
    def preprocess_rongcheng_yigou(self, src_path: str, progress_cb: Optional[Callable[[int, str], None]] = None) -> Optional[str]:
        """
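        Rongcheng Yigou order preprocessing: extract data by the fixed column positions E, N, Q, S supplied by the user (plus C for the product name)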
        """
        try:
            if progress_cb: progress_cb(10, "正在处理蓉城易购预处理...")
            from app.core.utils.file_utils import smart_read_excel
            # Rongcheng Yigou layout: row 0 is the order number, row 1 the contact, row 2 the header, data starts at row 3
            df_raw = smart_read_excel(src_path, header=None)
            # 检查数据行数
            if len(df_raw) <= 3:
                logger.error("蓉城易购文件数据行数不足")
                return None
            # 提取数据部分 (Row 3开始)
            df_data = df_raw.iloc[3:].reset_index(drop=True)
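            # Column mapping specified by the user:
            # Column E (index 4)  -> 商品条码 (barcode)
            # Column N (index 13) -> 数量 (quantity)
            # Column Q (index 16) -> 单价 (unit price)
            # Column S (index 18) -> 金额 (amount)
            # Column C (index 2)  -> 商品名称 (product name, generally required)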
            idx_map = {
                2: '商品名称',
                4: '商品条码',
                13: '数量',
                16: '单价',
                18: '金额'
            }
            # 确保列索引不越界
            available_indices = [i for i in idx_map.keys() if i < df_data.shape[1]]
            df2 = df_data.iloc[:, available_indices].copy()
            df2.columns = [idx_map[i] for i in available_indices]
            # 强制转换类型
            for c in ['数量', '单价', '金额']:
                if c in df2.columns:
                    df2[c] = pd.to_numeric(df2[c], errors='coerce').fillna(0)
            # 过滤掉空的条码行
            df2 = df2.dropna(subset=['商品条码'])
            df2['商品条码'] = df2['商品条码'].astype(str).str.strip()
            df2 = df2[df2['商品条码'] != '']
            # Core logic: split rows that list several barcodes and spread the quantity evenly across them
            if '商品条码' in df2.columns and '数量' in df2.columns:
                rows = []
                for _, row in df2.iterrows():
                    bc_val = str(row.get('商品条码', '')).strip()
                    # Recognised separators: / , ， 、
                    if any(sep in bc_val for sep in ['/', ',', '，', '、']):
                        parts = re.split(r'[/,，、]+', bc_val)
                        parts = [p.strip() for p in parts if p.strip()]
                        if len(parts) >= 2:
                            q_total = float(row.get('数量', 0) or 0)
                            if q_total > 0:
                                n = len(parts)
                                base_qty = int(q_total // n)
                                remainder = int(q_total % n)
                                for i, p_bc in enumerate(parts):
                                    new_row = row.copy()
                                    new_row['商品条码'] = p_bc
                                    current_qty = base_qty + (1 if i < remainder else 0)
                                    new_row['数量'] = current_qty
                                    if '单价' in new_row:
                                        try:
                                            up = float(new_row['单价'] or 0)
                                            new_row['金额'] = up * current_qty
                                        except Exception:
                                            pass
                                    rows.append(new_row)
                                continue
                    rows.append(row)
                df2 = pd.DataFrame(rows)
            # 保存预处理文件到输出目录（而非源文件目录）
            if self.config_manager and hasattr(self.config_manager, 'get_path'):
                out_dir = self.config_manager.get_path('Paths', 'output_folder', fallback='data/output', create=True)
            else:
                from app.config.settings import ConfigManager
                out_dir = ConfigManager().get_path('Paths', 'output_folder', fallback='data/output', create=True)
            os.makedirs(out_dir, exist_ok=True)
            base = os.path.basename(src_path)
            final_path = os.path.join(out_dir, f"预处理之后_{base}")
            df2.to_excel(final_path, index=False)
            if progress_cb: progress_cb(100, "蓉城易购预处理完成")
            return final_path
        except Exception as e:
            logger.error(f"预处理蓉城易购订单出错: {e}")
            return None
    def process_rongcheng_yigou(self, src_path: str, progress_cb: Optional[Callable[[int, str], None]] = None) -> Optional[str]:
        """
        兼容性方法：处理蓉城易购订单并执行后续转换
        """
        cleaned_path = self.preprocess_rongcheng_yigou(src_path, progress_cb)
        if cleaned_path:
            from app.services.order_service import OrderService
            order_service = OrderService(self.config_manager)
            return order_service.process_excel(cleaned_path, progress_cb=lambda p: progress_cb(60 + int(p*0.4), "生成采购单中...") if progress_cb else None)
        return None
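The multi-barcode split in `preprocess_rongcheng_yigou` gives each barcode the integer quotient of the quantity and hands the remainder out one unit at a time to the first barcodes. A minimal sketch of that distribution, assuming an integer quantity (`split_barcodes` is a hypothetical helper, not part of the service):

```python
import re
from typing import List, Tuple


def split_barcodes(cell: str, total_qty: int) -> List[Tuple[str, int]]:
    """Split a cell such as 'A/B、C' and spread `total_qty` across the barcodes."""
    parts = [p.strip() for p in re.split(r"[/,，、]+", cell) if p.strip()]
    if len(parts) < 2 or total_qty <= 0:
        # Single barcode or nothing to distribute: keep the row as it is
        return [(cell.strip(), total_qty)]
    base, remainder = divmod(total_qty, len(parts))
    # The first `remainder` barcodes receive one extra unit so the total is preserved
    return [(bc, base + (1 if i < remainder else 0)) for i, bc in enumerate(parts)]
```

For example, a quantity of 10 over three barcodes becomes 4, 3 and 3. In the service each split row also has its amount recomputed as unit price times the new quantity.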
@@ -18,7 +18,6 @@ from xlutils.copy import copy
 from openpyxl import load_workbook
 from typing import Optional, Dict, Any, List, Tuple
 from app.core.utils.log_utils import get_logger
 from app.core.utils.string_utils import parse_monetary_string
 from app.core.utils.dialog_utils import show_custom_dialog  # 导入自定义弹窗工具
 from ..config.settings import ConfigManager
@@ -36,10 +35,10 @@ class TobaccoService:
        """
        self.config = config
        # 修复配置获取方式，使用fallback机制
-        self.output_dir = config.get_path('Paths', 'output_folder', fallback='data/output', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/output')
+        self.output_dir = config.get('Paths', 'output_folder', fallback='data/output')
        self.template_file = config.get('Paths', 'template_file', fallback='templates/银豹-采购单模板.xls')
        # 将烟草订单保存到result目录
-        result_dir = config.get_path('Paths', 'result_folder', fallback='data/result', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/result')
+        result_dir = "data/result"
        os.makedirs(result_dir, exist_ok=True)
        self.output_file = os.path.join(result_dir, '银豹采购单_烟草公司.xls')
@@ -74,78 +73,6 @@ class TobaccoService:
            logger.warning(f"找到的烟草订单明细文件不是今天创建的: {latest_file}")
            return latest_file  # 仍然返回最新文件，但给出警告
    def preprocess_tobacco_order(self, file_path: str) -> Optional[str]:
        """
        Tobacco order preprocessing: extract data by the fixed column positions B, E, G, H supplied by the user (plus A for the product name)
        """
        try:
            logger.info(f"执行烟草订单专用预处理: {file_path}")
            from app.core.utils.file_utils import smart_read_excel
            # Tobacco layout: row 0 is the license number, row 1 the header, row 2 the totals, data starts at row 3
            df_raw = smart_read_excel(file_path, header=None)
            if len(df_raw) <= 3:
                logger.error("烟草订单文件数据行数不足")
                return None
            # 提取数据部分 (Row 3开始)
            df_data = df_raw.iloc[3:].reset_index(drop=True)
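            # Column mapping specified by the user:
            # Column A (index 0) -> 商品名称 (product name)
            # Column B (index 1) -> 商品条码 (barcode, pack code)
            # Column E (index 4) -> 批发价 (wholesale price, used for the unit price)
            # Column G (index 6) -> 订单量 (ordered amount, used for the quantity)
            # Column H (index 7) -> 金额 (amount)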
            idx_map = {
                0: '商品名称',
                1: '商品条码',
                4: '批发价',
                6: '数量',
                7: '金额'
            }
            available_indices = [i for i in idx_map.keys() if i < df_data.shape[1]]
            df = df_data.iloc[:, available_indices].copy()
            df.columns = [idx_map[i] for i in available_indices]
            # 1. Keep only rows whose ordered amount is non-zero
            df['数量'] = pd.to_numeric(df['数量'], errors='coerce').fillna(0)
            df = df[df['数量'] != 0].copy()
            if df.empty:
                logger.warning("烟草订单无有效订单量记录")
                return None
            # 2. Core conversion:
            # quantity   = ordered amount * 10 (column G)
            # unit price = wholesale price / 10 (column E)
            df['单价'] = pd.to_numeric(df['批发价'], errors='coerce').fillna(0) / 10
            df['数量'] = df['数量'] * 10
            # 3. Coerce the amount (column H) to numeric; unparseable values become 0
            df['金额'] = pd.to_numeric(df['金额'], errors='coerce').fillna(0)
            # 4. 只保留需要的列
            final_cols = ['商品条码', '商品名称', '数量', '单价', '金额']
            df_final = df[final_cols].copy()
            # 保存预处理文件到输出目录（而非源文件目录）
            out_dir = self.output_dir
            os.makedirs(out_dir, exist_ok=True)
            base = os.path.basename(file_path)
            final_path = os.path.join(out_dir, f"预处理之后_{base}")
            df_final.to_excel(final_path, index=False)
            logger.info(f"烟草订单预处理完成: {final_path}")
            return final_path
        except Exception as e:
            logger.error(f"烟草订单预处理失败: {e}")
            return None
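The conversion in `preprocess_tobacco_order` multiplies the ordered amount by 10 and divides the wholesale price by 10, so the line total stays the same. A small sketch of that arithmetic follows. The reading that one ordered unit is a carton of 10 packs is an assumption made here for illustration, and `carton_to_pack` is a hypothetical helper.

```python
from typing import Tuple


def carton_to_pack(
    order_qty: float, wholesale_price: float, packs_per_carton: int = 10
) -> Tuple[float, float]:
    """Convert a per-carton order line into a per-pack quantity and unit price."""
    quantity = order_qty * packs_per_carton        # more units...
    unit_price = wholesale_price / packs_per_carton  # ...at a proportionally lower price
    return quantity, unit_price


qty, price = carton_to_pack(2, 250.0)
# The line total is unchanged by the conversion
assert qty * price == 2 * 250.0
```

This is why the amount column can be carried over unchanged after the quantity and unit price have been rescaled.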
    def process_tobacco_order(self, input_file=None):
        """
        处理烟草订单
@@ -238,9 +165,8 @@ class TobaccoService:
        columns = ['商品', '盒码', '条码', '建议零售价', '批发价', '需求量', '订单量', '金额']
        try:
            from app.core.utils.file_utils import smart_read_excel
            # 读取Excel文件
-            df_old = smart_read_excel(file_path, header=None, skiprows=3, names=columns)
+            df_old = pd.read_excel(file_path, header=None, skiprows=3, names=columns)
            # 过滤订单量不为0的数据，并计算采购量和单价
            df_filtered = df_old[df_old['订单量'] != 0].copy()
@@ -318,9 +244,13 @@ class TobaccoService:
        }
        # 确保 total_amount 是数字类型
-        parsed = parse_monetary_string(total_amount)
-        total_amount = parsed if parsed is not None else 0.0
-        amount_display = f"¥{total_amount:.2f}"
+        try:
+            if isinstance(total_amount, str):
+                total_amount = float(total_amount.replace(',', ''))
+            amount_display = f"¥{total_amount:.2f}"
+        except (ValueError, TypeError):
+            # 如果转换失败，直接使用原始值
+            amount_display = f"¥{total_amount}"
        # 显示自定义对话框
        show_custom_dialog(
@@ -1,2 +0,0 @@
 # -*- coding: utf-8 -*-
 """益选-OCR订单处理系统 UI 模块"""
@@ -1,565 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """业务操作处理模块"""
 import os
 import time
 import datetime
 import json
 import logging
 import tkinter as tk
 from tkinter import messagebox
 from threading import Thread
 from app.config.settings import ConfigManager
 from app.services.ocr_service import OCRService
 from app.services.order_service import OrderService
 from app.core.utils.log_utils import get_logger
 from .logging_ui import add_to_log, init_gui_logger, dispose_gui_logger, GUILogHandler
 from .ui_widgets import ProgressReporter
 from .error_utils import show_error_dialog, get_error_suggestion
 logger = get_logger(__name__)
 from .result_previews import show_ocr_result_preview, show_excel_result_preview, show_merge_result_preview
 from .user_settings import add_recent_file
 from .command_runner import get_running_task, set_running_task
 from .file_operations import select_file, select_excel_file, validate_unit_price_against_item_data
 def _ask_and_merge_purchase_orders(order_service, log_widget, add_to_recent=False):
    """Ask via a dialog whether to merge purchase orders; return the merged file path or None.
    Shared logic used by run_pipeline_directly and batch_process_orders_with_status.
    """
    try:
        purchase_orders = order_service.get_purchase_orders()
        if len(purchase_orders) == 0:
            add_to_log(log_widget, "没有找到采购单文件，跳过合并步骤\n", "info")
        elif len(purchase_orders) == 1:
            add_to_log(log_widget, f"只有1个采购单文件，无需合并: {os.path.basename(purchase_orders[0])}\n", "info")
        else:
            add_to_log(log_widget, f"找到{len(purchase_orders)}个采购单文件\n", "info")
            file_list = "\n".join([f"• {os.path.basename(f)}" for f in purchase_orders])
            merge_choice = messagebox.askyesnocancel(
                "采购单合并选择",
                f"发现{len(purchase_orders)}个采购单文件：\n\n{file_list}\n\n是否需要合并这些采购单？\n\n• 选择'是'：合并所有采购单\n• 选择'否'：保持文件分离\n• 选择'取消'：跳过此步骤",
                icon='question'
            )
            if merge_choice is True:
                add_to_log(log_widget, "开始合并采购单...\n", "info")
                merge_result = order_service.merge_all_purchase_orders()
                if merge_result:
                    add_to_log(log_widget, "采购单合并完成\n", "success")
                    if add_to_recent:
                        try:
                            add_recent_file(merge_result)
                        except Exception as e:
                            logger.debug(f"添加最近文件失败: {e}")
                    return merge_result
                else:
                    add_to_log(log_widget, "合并失败\n", "warning")
            elif merge_choice is False:
                add_to_log(log_widget, "用户选择不合并采购单，保持文件分离\n", "info")
            else:
                add_to_log(log_widget, "用户取消合并操作\n", "info")
    except Exception as e:
        add_to_log(log_widget, f"合并过程出现问题: {str(e)}\n", "warning")
    return None
 def process_single_image_with_status(log_widget, status_bar):
    status_bar.set_status("选择图片中...")
    file_path = select_file(log_widget, [("图片文件", "*.jpg *.jpeg *.png *.bmp"), ("所有文件", "*.*")], "选择图片")
    if not file_path:
        status_bar.set_status("操作已取消")
        add_to_log(log_widget, "未选择文件，操作已取消\n", "warning")
        return
    def run_in_thread():
        try:
            status_bar.set_running(True)
            status_bar.set_status("开始处理图片...")
            gui_handler = GUILogHandler(log_widget)
            gui_handler.setLevel(logging.INFO)
            formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
            gui_handler.setFormatter(formatter)
            root_logger = logging.getLogger()
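            for handler in root_logger.handlers[:]:
                # FileHandler subclasses StreamHandler; keep file handlers so logs still reach disk
                if isinstance(handler, logging.StreamHandler) and not isinstance(handler, logging.FileHandler):
                    root_logger.removeHandler(handler)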
            root_logger.addHandler(gui_handler)
            root_logger.setLevel(logging.INFO)
            ocr_service = OCRService()
            add_to_log(log_widget, f"开始处理图片: {file_path}\n", "info")
            try:
                add_recent_file(file_path)
            except Exception as e:
                logger.debug(f"添加最近文件失败: {e}")
            excel_path = ocr_service.process_image(file_path)
            if excel_path:
                add_to_log(log_widget, "图片OCR处理完成\n", "success")
                preview_output = f"采购单已保存到: {excel_path}\n"
                show_excel_result_preview(preview_output)
                try:
                    add_recent_file(excel_path)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
            else:
                add_to_log(log_widget, "图片OCR处理失败\n", "error")
        except Exception as e:
            add_to_log(log_widget, f"处理单个图片时出错: {str(e)}\n", "error")
            sugg = get_error_suggestion(str(e))
            if sugg:
                show_error_dialog("OCR处理错误", str(e), sugg)
        finally:
            try:
                root_logger = logging.getLogger()
                for handler in root_logger.handlers[:]:
                    if isinstance(handler, GUILogHandler):
                        root_logger.removeHandler(handler)
                        handler.close()
            except Exception as e:
                logger.debug(f"清理日志处理器失败: {e}")
            status_bar.set_running(False)
            status_bar.set_status("就绪")
    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()
 def run_pipeline_directly(log_widget, status_bar):
    """Run the complete processing pipeline directly."""
    if get_running_task() is not None:
        messagebox.showinfo("任务进行中", "请等待当前任务完成后再执行新的操作。")
        return
    def run_in_thread():
        set_running_task("pipeline")
        if status_bar:
            status_bar.set_running(True)
            status_bar.set_status("开始完整处理流程...")
        start_time = datetime.datetime.now()
        start_perf = time.perf_counter()
        log_widget.configure(state=tk.NORMAL)
        log_widget.delete(1.0, tk.END)
        log_widget.insert(tk.END, "执行命令: 完整处理流程\n", "command")
        log_widget.insert(tk.END, f"开始时间: {start_time.strftime('%Y-%m-%d %H:%M:%S')}\n", "time")
        log_widget.insert(tk.END, "=" * 50 + "\n\n", "separator")
        log_widget.configure(state=tk.DISABLED)
        # Create the reporter before the try block so the finally clause can always call reporter.done().
        reporter = ProgressReporter(status_bar)
        try:
            config = ConfigManager()
            gui_handler = init_gui_logger(log_widget)
            ocr_service = OCRService(config)
            order_service = OrderService(config)
            reporter.running()
            reporter.set("开始OCR批量处理...", 10)
            total, success = ocr_service.batch_process(progress_cb=lambda p: reporter.set("OCR处理中...", p))
            if total == 0:
                add_to_log(log_widget, "没有找到需要处理的图片\n", "warning")
                if status_bar:
                    status_bar.set_status("未找到图片文件")
                return
            elif success == 0:
                add_to_log(log_widget, "OCR处理没有成功处理任何新文件\n", "warning")
            else:
                add_to_log(log_widget, f"OCR处理完成，共处理 {success}/{total} 个文件\n", "success")
                try:
                    processed_map = {}
                    config = ConfigManager()
                    pjson = config.get('Paths', 'processed_record', fallback='data/processed_files.json')
                    if os.path.exists(pjson):
                        with open(pjson, 'r', encoding='utf-8') as f:
                            processed_map = json.load(f)
                    outputs = list(processed_map.values())
                    for p in outputs[-10:]:
                        if p:
                            add_recent_file(os.path.abspath(p))
                except Exception as e:
                    logger.debug(f"加载已处理文件记录失败: {e}")
            reporter.set("开始Excel处理...", 92)
            add_to_log(log_widget, "开始Excel处理...\n", "info")
            result = order_service.process_excel()
            if not result:
                add_to_log(log_widget, "Excel处理失败\n", "error")
            else:
                add_to_log(log_widget, "Excel处理完成\n", "success")
                try:
                    add_recent_file(result)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
                try:
                    validate_unit_price_against_item_data(result, log_widget)
                except Exception as e:
                    logger.debug(f"单价校验失败: {e}")
            # Keep progress monotonic: the Excel step above already reported 92.
            reporter.set("检查是否需要合并采购单...", 96)
            _ask_and_merge_purchase_orders(order_service, log_widget, add_to_recent=True)
            end_time = datetime.datetime.now()
            duration_sec = max(0.0, time.perf_counter() - start_perf)
            add_to_log(log_widget, f"\n{'=' * 50}\n", "separator")
            add_to_log(log_widget, "完整处理流程执行完毕！\n", "success")
            add_to_log(log_widget, f"结束时间: {end_time.strftime('%Y-%m-%d %H:%M:%S')}\n", "time")
            add_to_log(log_widget, f"耗时: {duration_sec:.2f} 秒\n", "time")
            reporter.set("处理完成", 100)
        except Exception as e:
            add_to_log(log_widget, f"执行过程中发生错误: {str(e)}\n", "error")
            import traceback
            add_to_log(log_widget, f"详细错误信息: {traceback.format_exc()}\n", "error")
        finally:
            dispose_gui_logger()
            reporter.done()
            set_running_task(None)
            if status_bar:
                status_bar.set_running(False)
                status_bar.set_status("就绪")
    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()
 def batch_ocr_with_status(log_widget, status_bar):
    """Batch OCR recognition."""
    def run_in_thread():
        try:
            reporter = ProgressReporter(status_bar)
            reporter.running()
            reporter.set("正在进行OCR批量识别...", 10)
            add_to_log(log_widget, "开始OCR批量识别\n", "info")
            init_gui_logger(log_widget)
            ocr_service = OCRService()
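            # batch_process returns (total, success), as unpacked in run_pipeline_directly;
            # a non-empty tuple is always truthy, so test the success count instead.
            total, success = ocr_service.batch_process()
            if success: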
                add_to_log(log_widget, "OCR批量识别完成\n", "success")
                show_ocr_result_preview("OCR批量识别成功完成")
                reporter.set("批量识别完成", 100)
                try:
                    processed_map = {}
                    config = ConfigManager()
                    pjson = config.get('Paths', 'processed_record', fallback='data/processed_files.json')
                    if os.path.exists(pjson):
                        with open(pjson, 'r', encoding='utf-8') as f:
                            processed_map = json.load(f)
                    outputs = list(processed_map.values())
                    for p in outputs[-10:]:
                        if p:
                            add_recent_file(p)
                    inputs = list(processed_map.keys())
                    for p in inputs[-10:]:
                        if p:
                            add_recent_file(p)
                except Exception as e:
                    logger.debug(f"加载已处理文件记录失败: {e}")
            else:
                add_to_log(log_widget, "OCR批量识别失败\n", "error")
        except Exception as e:
            add_to_log(log_widget, f"OCR批量识别出错: {str(e)}\n", "error")
            sugg = get_error_suggestion(str(e))
            if sugg:
                show_error_dialog("OCR处理错误", str(e), sugg)
        finally:
            dispose_gui_logger()
            reporter.done()
    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()
 def batch_process_orders_with_status(log_widget, status_bar):
    """Batch-process orders (Excel processing only, including the merge confirmation)."""
    def run_in_thread():
        try:
            reporter = ProgressReporter(status_bar)
            reporter.running()
            reporter.set("正在批量处理订单...", 10)
            add_to_log(log_widget, "开始批量处理订单\n", "info")
            init_gui_logger(log_widget)
            order_service = OrderService()
            add_to_log(log_widget, "开始Excel处理...\n", "info")
            try:
                latest_input = order_service.get_latest_excel()
                if latest_input:
                    add_recent_file(latest_input)
            except Exception as e:
                logger.debug(f"获取最新Excel失败: {e}")
            result = order_service.process_excel(progress_cb=lambda p: reporter.set("Excel处理中...", p))
            if result:
                add_to_log(log_widget, "Excel处理完成\n", "success")
                try:
                    validate_unit_price_against_item_data(result, log_widget)
                except Exception as e:
                    logger.debug(f"单价校验失败: {e}")
                reporter.set("检查是否需要合并采购单...", 70)
                add_to_log(log_widget, "检查是否需要合并采购单...\n", "info")
                _ask_and_merge_purchase_orders(order_service, log_widget)
                add_to_log(log_widget, "批量处理订单完成\n", "success")
                reporter.set("批量处理订单完成", 100)
                show_excel_result_preview(f"采购单已保存到: {result}\n")
                try:
                    add_recent_file(result)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
            else:
                add_to_log(log_widget, "批量处理订单失败\n", "error")
        except Exception as e:
            add_to_log(log_widget, f"批量处理订单时出错: {str(e)}\n", "error")
            sugg = get_error_suggestion(str(e))
            if sugg:
                show_error_dialog("Excel处理错误", str(e), sugg)
        finally:
            dispose_gui_logger()
            reporter.done()
    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()
 def merge_orders_with_status(log_widget, status_bar):
    """Merge purchase orders."""
    def run_in_thread():
        try:
            reporter = ProgressReporter(status_bar)
            reporter.running()
            reporter.set("正在合并采购单...", 10)
            add_to_log(log_widget, "开始合并采购单\n", "info")
            init_gui_logger(log_widget)
            order_service = OrderService()
            result = order_service.merge_all_purchase_orders(progress_cb=lambda p: reporter.set("合并处理中...", p))
            if result:
                add_to_log(log_widget, "采购单合并完成\n", "success")
                show_merge_result_preview(f"已保存到: {result}\n")
                try:
                    add_recent_file(result)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
                try:
                    validate_unit_price_against_item_data(result, log_widget)
                except Exception as e:
                    logger.debug(f"单价校验失败: {e}")
            else:
                add_to_log(log_widget, "采购单合并失败\n", "error")
        except Exception as e:
            add_to_log(log_widget, f"采购单合并出错: {str(e)}\n", "error")
            sugg = get_error_suggestion(str(e))
            if sugg:
                show_error_dialog("合并错误", str(e), sugg)
        finally:
            dispose_gui_logger()
            reporter.done()
    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()
 def process_excel_file_with_status(log_widget, status_bar):
    """Process a user-selected Excel file."""
    # Choose the file before starting the worker, as process_single_image_with_status does:
    # Tk dialogs should be opened from the GUI thread, not from a background thread.
    status_bar.set_status("选择Excel文件中...")
    file_path = select_excel_file(log_widget)
    if not file_path:
        status_bar.set_status("操作已取消")
        add_to_log(log_widget, "未选择文件，操作已取消\n", "warning")
        return
    def run_in_thread():
        try:
            status_bar.set_running(True)
            status_bar.set_status("开始处理Excel文件...")
            add_to_log(log_widget, f"开始处理Excel文件: {file_path}\n", "info")
            init_gui_logger(log_widget)
            order_service = OrderService()
            try:
                add_recent_file(file_path)
            except Exception as e:
                logger.debug(f"添加最近文件失败: {e}")
            result = order_service.process_excel(file_path, progress_cb=lambda p: status_bar.set_status("Excel处理中...", p))
            if result:
                add_to_log(log_widget, "Excel文件处理完成\n", "success")
                show_excel_result_preview(f"采购单已保存到: {result}\n")
                try:
                    add_recent_file(result)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
                try:
                    validate_unit_price_against_item_data(result, log_widget)
                except Exception as e:
                    logger.debug(f"单价校验失败: {e}")
            else:
                add_to_log(log_widget, "Excel文件处理失败\n", "error")
        except Exception as e:
            add_to_log(log_widget, f"Excel文件处理出错: {str(e)}\n", "error")
            msg = str(e)
            suggestion = None
            if 'openpyxl' in msg or 'engine' in msg:
                suggestion = "安装依赖：pip install openpyxl"
            elif 'xlrd' in msg:
                suggestion = "安装依赖：pip install xlrd"
            if suggestion:
                show_error_dialog("Excel处理错误", msg, suggestion)
        finally:
            dispose_gui_logger()
            status_bar.set_running(False)
            status_bar.set_status("就绪")
    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()
 def process_dropped_file(log_widget, status_bar, file_path):
    try:
        ext = os.path.splitext(file_path)[1].lower()
        if ext in ['.jpg', '.jpeg', '.png', '.bmp']:
            def _run_img():
                try:
                    reporter = ProgressReporter(status_bar)
                    reporter.running()
                    init_gui_logger(log_widget)
                    add_to_log(log_widget, f"开始一键处理图片: {file_path}\n", "info")
                    try:
                        add_recent_file(file_path)
                    except Exception as e:
                        logger.debug(f"添加最近文件失败: {e}")
                    # Step 1: OCR recognition
                    reporter.set("OCR识别中...", 10)
                    ocr_service = OCRService()
                    excel_path = ocr_service.process_image(file_path)
                    if not excel_path:
                        add_to_log(log_widget, "图片OCR处理失败\n", "error")
                        return
                    add_to_log(log_widget, f"OCR识别完成: {excel_path}\n", "success")
                    # Step 2: Excel processing
                    reporter.set("Excel处理中...", 40)
                    order_service = OrderService()
                    result = order_service.process_excel(excel_path, progress_cb=lambda p: reporter.set("Excel处理中...", p))
                    if not result:
                        add_to_log(log_widget, "Excel处理失败\n", "error")
                        return
                    add_to_log(log_widget, f"Excel处理完成: {result}\n", "success")
                    try:
                        add_recent_file(result)
                    except Exception as e:
                        logger.debug(f"添加最近文件失败: {e}")
                    try:
                        validate_unit_price_against_item_data(result, log_widget)
                    except Exception as e:
                        logger.debug(f"单价校验失败: {e}")
                    # Step 3: merge purchase orders
                    reporter.set("检查合并采购单...", 80)
                    _ask_and_merge_purchase_orders(order_service, log_widget, add_to_recent=True)
                    reporter.set("处理完成", 100)
                    add_to_log(log_widget, "一键处理完成！\n", "success")
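                except Exception as e:
                    # Surface worker-thread failures in the GUI log instead of losing them on stderr.
                    add_to_log(log_widget, f"一键处理图片时出错: {str(e)}\n", "error")
                finally: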
                    dispose_gui_logger()
                    reporter.done()
            t = Thread(target=_run_img)
            t.daemon = True
            t.start()
        elif ext in ['.xlsx', '.xls']:
            def _run_xls():
                try:
                    reporter = ProgressReporter(status_bar)
                    reporter.running()
                    init_gui_logger(log_widget)
                    order_service = OrderService()
                    add_to_log(log_widget, f"开始一键处理Excel文件: {file_path}\n", "info")
                    try:
                        add_recent_file(file_path)
                    except Exception as e:
                        logger.debug(f"添加最近文件失败: {e}")
                    # Step 1: Excel processing
                    reporter.set("Excel处理中...", 20)
                    result = order_service.process_excel(file_path, progress_cb=lambda p: reporter.set("Excel处理中...", p))
                    if not result:
                        add_to_log(log_widget, "Excel文件处理失败\n", "error")
                        return
                    add_to_log(log_widget, f"Excel处理完成: {result}\n", "success")
                    try:
                        add_recent_file(result)
                    except Exception as e:
                        logger.debug(f"添加最近文件失败: {e}")
                    try:
                        validate_unit_price_against_item_data(result, log_widget)
                    except Exception as e:
                        logger.debug(f"单价校验失败: {e}")
                    # Step 2: merge purchase orders
                    reporter.set("检查合并采购单...", 80)
                    _ask_and_merge_purchase_orders(order_service, log_widget, add_to_recent=True)
                    reporter.set("处理完成", 100)
                    add_to_log(log_widget, "一键处理完成！\n", "success")
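                except Exception as e:
                    # Surface worker-thread failures in the GUI log instead of losing them on stderr.
                    add_to_log(log_widget, f"一键处理Excel文件时出错: {str(e)}\n", "error")
                finally: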
                    dispose_gui_logger()
                    reporter.done()
            t = Thread(target=_run_xls)
            t.daemon = True
            t.start()
        else:
            add_to_log(log_widget, f"不支持的文件类型: {file_path}\n", "warning")
    except Exception as e:
        add_to_log(log_widget, f"处理拖拽文件失败: {str(e)}\n", "error")
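`run_pipeline_directly` and `batch_ocr_with_status` both read the processed-files record and push its last ten outputs into the recent-files list. The lookup could be factored into one helper, roughly as sketched below. This assumes the record is a JSON object mapping input paths to output paths, which is what the `json.load` / `.values()` usage suggests; the name `recent_outputs` and its `limit` parameter are hypothetical and not part of the project.

```python
import json
import os


def recent_outputs(record_path, limit=10):
    """Return absolute paths of the last `limit` record entries that have an output.

    Hypothetical helper: assumes `record_path` holds a JSON object that maps
    input file paths to output file paths, in processing order.
    """
    if not os.path.exists(record_path):
        return []
    with open(record_path, "r", encoding="utf-8") as f:
        processed_map = json.load(f)
    # json.load preserves the key order of the file, so the tail holds the newest entries.
    outputs = list(processed_map.values())[-limit:]
    # Skip empty values, mirroring the `if p:` check in the handlers.
    return [os.path.abspath(p) for p in outputs if p]
```

A caller would then loop over `recent_outputs(pjson)` and pass each path to `add_recent_file`, which also makes the two handlers agree on whether paths are made absolute.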
@@ -1,33 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """Barcode mapping editor module."""
 from tkinter import messagebox
 from app.core.excel.converter import UnitConverter
 from app.core.utils.dialog_utils import show_barcode_mapping_dialog
 from .logging_ui import add_to_log
 def edit_barcode_mappings(log_widget):
    """Edit the barcode mapping configuration."""
    try:
        add_to_log(log_widget, "正在加载条码映射配置...\n", "info")
        unit_converter = UnitConverter()
        current_mappings = unit_converter.special_barcodes
        def save_mappings(new_mappings):
            success = unit_converter.update_barcode_mappings(new_mappings)
            if success:
                add_to_log(log_widget, f"成功保存条码映射配置，共{len(new_mappings)}项\n", "success")
            else:
                add_to_log(log_widget, "保存条码映射配置失败\n", "error")
        show_barcode_mapping_dialog(None, save_mappings, current_mappings)
    except Exception as e:
        add_to_log(log_widget, f"编辑条码映射时出错: {str(e)}\n", "error")
        messagebox.showerror("错误", f"编辑条码映射时出错: {str(e)}")
@@ -1,160 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """Command runner module."""
 import os
 import sys
 import time
 import subprocess
 import datetime
 import re
 import tkinter as tk
 from tkinter import messagebox
 from threading import Thread
 from .logging_ui import LogRedirector
 from .result_previews import show_result_preview
 # Running-task state tracking
 _RUNNING_TASK = None
 def get_running_task():
    return _RUNNING_TASK
 def set_running_task(val):
    global _RUNNING_TASK
    _RUNNING_TASK = val
 def run_command_with_logging(command, log_widget, status_bar=None, on_complete=None):
    """Run a command and redirect its output to the log window."""
    if _RUNNING_TASK is not None:
        messagebox.showinfo("任务进行中", "请等待当前任务完成后再执行新的操作。")
        return
    def run_in_thread():
        global _RUNNING_TASK
        _RUNNING_TASK = command
        if status_bar:
            status_bar.set_running(True)
        start_time = datetime.datetime.now()
        start_perf = time.perf_counter()
        log_widget.configure(state=tk.NORMAL)
        log_widget.delete(1.0, tk.END)
        log_widget.insert(tk.END, f"执行命令: {' '.join(command)}\n", "command")
        log_widget.insert(tk.END, f"开始时间: {start_time.strftime('%Y-%m-%d %H:%M:%S')}\n", "time")
        log_widget.insert(tk.END, "=" * 50 + "\n\n", "separator")
        log_widget.configure(state=tk.DISABLED)
        old_stdout = sys.stdout
        old_stderr = sys.stderr
        log_redirector = LogRedirector(log_widget)
        env = os.environ.copy()
        try:
            from app.config.settings import ConfigManager
            cfg = ConfigManager()
            env["OCR_OUTPUT_DIR"] = cfg.get_path('Paths', 'output_folder', fallback='data/output', create=True)
            env["OCR_INPUT_DIR"] = cfg.get_path('Paths', 'input_folder', fallback='data/input', create=True)
            env["OCR_TEMP_DIR"] = cfg.get_path('Paths', 'temp_folder', fallback='data/temp', create=True)
        except Exception:
            # Fallback: use the directory that contains the exe/script
            app_root = os.path.dirname(sys.executable) if getattr(sys, 'frozen', False) else os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
            env["OCR_OUTPUT_DIR"] = os.path.join(app_root, "data", "output")
            env["OCR_INPUT_DIR"] = os.path.join(app_root, "data", "input")
            env["OCR_TEMP_DIR"] = os.path.join(app_root, "data", "temp")
        env["OCR_LOG_LEVEL"] = "DEBUG"
        try:
            sys.stdout = log_redirector
            sys.stderr = log_redirector
            print("日志重定向已启动，现在同时输出到终端和GUI")
            process = subprocess.Popen(
                command,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                text=True,
                bufsize=1,
                universal_newlines=True,
                env=env
            )
            output_data = []
            for line in process.stdout:
                output_data.append(line)
                print(line.rstrip())
                if status_bar:
                    progress = extract_progress_from_log(line)
                    if progress is not None:
                        log_widget.after(0, lambda p=progress: status_bar.set_status(f"处理中: {p}%完成", p))
            process.wait()
            end_time = datetime.datetime.now()
            duration_sec = max(0.0, time.perf_counter() - start_perf)
            print(f"\n{'=' * 50}")
            print(f"执行完毕！返回码: {process.returncode}")
            print(f"结束时间: {end_time.strftime('%Y-%m-%d %H:%M:%S')}")
            print(f"耗时: {duration_sec:.2f} 秒")
            output_text = ''.join(output_data)
            is_pipeline = "pipeline" in command
            no_merge_files = "未找到采购单文件" in output_text
            single_file = "只有1个采购单文件" in output_text
            if is_pipeline and (no_merge_files or single_file):
                print("完整流程中没有需要合并的文件，但其他步骤执行成功，视为成功完成")
                if status_bar:
                    log_widget.after(0, lambda: status_bar.set_status("处理完成", 100))
                log_widget.after(0, lambda: show_result_preview(command, output_text))
            else:
                if on_complete:
                    log_widget.after(0, lambda: on_complete(process.returncode, output_text))
                elif process.returncode == 0:
                    if status_bar:
                        log_widget.after(0, lambda: status_bar.set_status("处理完成", 100))
                    log_widget.after(0, lambda: show_result_preview(command, output_text))
                else:
                    if status_bar:
                        log_widget.after(0, lambda: status_bar.set_status(f"处理失败 (返回码: {process.returncode})", 0))
                    log_widget.after(0, lambda: messagebox.showerror("操作失败", f"处理失败，返回码：{process.returncode}"))
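        except Exception as e:
            # Capture the message now: `e` is unbound when the except block exits,
            # so a lambda that reads it later (via after) would raise NameError.
            err_msg = str(e)
            print(f"\n执行出错: {err_msg}")
            if status_bar:
                log_widget.after(0, lambda: status_bar.set_status(f"执行出错: {err_msg}", 0))
            log_widget.after(0, lambda: messagebox.showerror("执行错误", f"执行命令时出错: {err_msg}"))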
        finally:
            sys.stdout = old_stdout
            sys.stderr = old_stderr
            _RUNNING_TASK = None
            if status_bar:
                log_widget.after(0, lambda: status_bar.set_running(False))
    Thread(target=run_in_thread).start()
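 def extract_progress_from_log(log_line):
    """Extract progress information (a percentage) from a log line."""
    batch_match = re.search(r'处理批次 (\d+)/(\d+)', log_line)
    if batch_match:
        current = int(batch_match.group(1))
        total = int(batch_match.group(2))
        # Guard against "N/0", which would otherwise raise ZeroDivisionError.
        if total > 0:
            return int(current / total * 100)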
    percent_match = re.search(r'(\d+)%', log_line)
    if percent_match:
        return int(percent_match.group(1))
    return None
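The core of `run_command_with_logging` is reading the child's merged stdout/stderr line by line and turning matching lines into a percentage. A minimal sketch of that loop without the Tkinter parts, assuming the child is a Python script that emits UTF-8; `extract_progress` and `stream_progress` are illustrative names, not the module's API, and the zero-total guard is an addition of this sketch.

```python
import os
import re
import subprocess
import sys


def extract_progress(line):
    """Parse '处理批次 N/M' or 'NN%' from a log line; return an int percent or None."""
    batch = re.search(r"处理批次 (\d+)/(\d+)", line)
    if batch and int(batch.group(2)) > 0:
        return int(int(batch.group(1)) / int(batch.group(2)) * 100)
    percent = re.search(r"(\d+)%", line)
    return int(percent.group(1)) if percent else None


def stream_progress(command):
    """Run `command`, read its merged output line by line, and collect progress values."""
    env = os.environ.copy()
    env["PYTHONIOENCODING"] = "utf-8"  # assumption: the child is a Python script
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, as the original does
        text=True,
        encoding="utf-8",
        bufsize=1,  # line-buffered so progress arrives as it is printed
        env=env,
    )
    seen = []
    with process.stdout:
        for line in process.stdout:
            progress = extract_progress(line)
            if progress is not None:
                seen.append(progress)
    return process.wait(), seen


if __name__ == "__main__":
    # Stand-in child process: two progress lines and one plain line.
    child = r"print('\u5904\u7406\u6279\u6b21 1/4'); print('OCR 50%'); print('done')"
    print(stream_progress([sys.executable, "-c", child]))
```

In the GUI version each parsed value is handed to the status bar through `log_widget.after(0, ...)` rather than collected in a list.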
@@ -1,205 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """System settings dialog module."""
 import os
 import tkinter as tk
 from tkinter import messagebox, filedialog, ttk
 from app.config.settings import ConfigManager
 from .user_settings import load_user_settings, save_user_settings
 from .ui_widgets import center_window
 from app.core.utils.dialog_utils import show_cloud_sync_dialog
 def show_config_dialog(root, cfg: ConfigManager):
    settings = load_user_settings()
    dlg = tk.Toplevel(root)
    dlg.title("系统设置")
    dlg.geometry("700x460")
    center_window(dlg)
    content = ttk.Frame(dlg)
    content.pack(fill=tk.BOTH, expand=True, padx=12, pady=12)
    content.columnconfigure(0, weight=1)
    # ── Helper functions ──
    def _add_pair(parent, row, col, label_text, widget, label_width=None):
        """Place the label at (row, col*2) and the widget at (row, col*2+1) in parent."""
        lbl = ttk.Label(parent, text=label_text)
        if label_width:
            lbl.configure(width=label_width)
        lbl.grid(row=row, column=col * 2, sticky='w', padx=(6, 2), pady=3)
        widget.grid(row=row, column=col * 2 + 1, sticky='ew', padx=(2, 6), pady=3)
    def _make_dir_widget(parent, var, label):
        f = ttk.Frame(parent)
        e = ttk.Entry(f, textvariable=var)
        e.pack(side=tk.LEFT, fill=tk.X, expand=True)
        def _select_dir():
            d = filedialog.askdirectory(title=f"选择{label}")
            if d:
                try:
                    var.set(os.path.relpath(d, os.getcwd()))
                except Exception:
                    var.set(d)
        ttk.Button(f, text="选择", command=_select_dir).pack(side=tk.LEFT, padx=4)
        return f
    # ── Current values ──
    log_level_val = tk.StringVar(value=settings.get('log_level', 'INFO'))
    max_workers_val = tk.StringVar(value=str(settings.get('concurrency_max_workers', cfg.getint('Performance', 'max_workers', 4))))
    batch_size_val = tk.StringVar(value=str(settings.get('concurrency_batch_size', cfg.getint('Performance', 'batch_size', 5))))
    template_path_val = tk.StringVar(value=settings.get('template_path', os.path.join(cfg.get('Paths', 'template_folder', 'templates'), cfg.get('Templates', 'purchase_order', '银豹-采购单模板.xls'))))
    input_dir_val = tk.StringVar(value=settings.get('input_folder', cfg.get('Paths', 'input_folder', 'data/input')))
    output_dir_val = tk.StringVar(value=settings.get('output_folder', cfg.get('Paths', 'output_folder', 'data/output')))
    result_dir_val = tk.StringVar(value=settings.get('result_folder', 'data/result'))
    api_key_val = tk.StringVar(value=settings.get('api_key', cfg.get('API', 'api_key', '')))
    secret_key_val = tk.StringVar(value=settings.get('secret_key', cfg.get('API', 'secret_key', '')))
    timeout_val = tk.StringVar(value=str(settings.get('timeout', cfg.getint('API', 'timeout', 30))))
    max_retries_val = tk.StringVar(value=str(settings.get('max_retries', cfg.getint('API', 'max_retries', 3))))
    retry_delay_val = tk.StringVar(value=str(settings.get('retry_delay', cfg.getint('API', 'retry_delay', 2))))
    api_url_val = tk.StringVar(value=settings.get('api_url', cfg.get('API', 'api_url', '')))
    gitea_url_val = tk.StringVar(value=cfg.get('Gitea', 'base_url', fallback='https://gitea.94kan.cn'))
    gitea_owner_val = tk.StringVar(value=cfg.get('Gitea', 'owner', fallback='houhuan'))
    gitea_repo_val = tk.StringVar(value=cfg.get('Gitea', 'repo', fallback='yixuan-sync-data'))
    gitea_token_val = tk.StringVar(value=cfg.get('Gitea', 'token', fallback=''))
    # ═══════════════════════════════════════════════════
    #  Section 1: Basic settings
    # ═══════════════════════════════════════════════════
    f1 = ttk.LabelFrame(content, text=" 基本设置 ", padding=(8, 4))
    f1.pack(fill=tk.X, pady=(0, 6))
    for c in range(4):
        f1.columnconfigure(c, weight=1 if c % 2 == 1 else 0)
    lvl = ttk.Combobox(f1, textvariable=log_level_val, values=['DEBUG', 'INFO', 'WARNING', 'ERROR'], state='readonly', width=12)
    _add_pair(f1, 0, 0, "日志级别", lvl)
    _add_pair(f1, 0, 1, "最大并发", ttk.Entry(f1, textvariable=max_workers_val, width=6))
    _add_pair(f1, 1, 0, "批次大小", ttk.Entry(f1, textvariable=batch_size_val, width=6))
    # Template path (with a browse button, in the right-hand column)
    tpl_frame = ttk.Frame(f1)
    tpl_entry = ttk.Entry(tpl_frame, textvariable=template_path_val)
    tpl_entry.pack(side=tk.LEFT, fill=tk.X, expand=True)
    def _select_template():
        p = filedialog.askopenfilename(title="选择模板文件", filetypes=[("Excel模板", "*.xls *.xlsx"), ("所有文件", "*.*")])
        if p:
            try:
                template_path_val.set(os.path.relpath(p, os.getcwd()))
            except Exception:
                template_path_val.set(p)
    ttk.Button(tpl_frame, text="选择", command=_select_template).pack(side=tk.LEFT, padx=4)
    _add_pair(f1, 1, 1, "采购模板", tpl_frame)
    _add_pair(f1, 2, 0, "输入目录", _make_dir_widget(f1, input_dir_val, "输入目录"))
    _add_pair(f1, 2, 1, "输出目录", _make_dir_widget(f1, output_dir_val, "输出目录"))
    _add_pair(f1, 3, 0, "结果目录", _make_dir_widget(f1, result_dir_val, "结果目录"))
    # ═══════════════════════════════════════════════════
    #  Section 2: API settings
    # ═══════════════════════════════════════════════════
    f2 = ttk.LabelFrame(content, text=" API 设置 ", padding=(8, 4))
    f2.pack(fill=tk.X, pady=(0, 6))
    for c in range(4):
        f2.columnconfigure(c, weight=1 if c % 2 == 1 else 0)
    _add_pair(f2, 0, 0, "API Key", ttk.Entry(f2, textvariable=api_key_val))
    secret_entry = ttk.Entry(f2, textvariable=secret_key_val, show='*')
    _add_pair(f2, 0, 1, "Secret Key", secret_entry)
    _add_pair(f2, 1, 0, "Timeout", ttk.Entry(f2, textvariable=timeout_val, width=6))
    _add_pair(f2, 1, 1, "Max Retries", ttk.Entry(f2, textvariable=max_retries_val, width=6))
    _add_pair(f2, 2, 0, "Retry Delay", ttk.Entry(f2, textvariable=retry_delay_val, width=6))
    _add_pair(f2, 2, 1, "API URL", ttk.Entry(f2, textvariable=api_url_val))
    # ═══════════════════════════════════════════════════
    #  Section 3: Cloud sync (Gitea)
    # ═══════════════════════════════════════════════════
    f3 = ttk.LabelFrame(content, text=" 云端同步 (Gitea) ", padding=(8, 4))
    f3.pack(fill=tk.X, pady=(0, 8))
    for c in range(4):
        f3.columnconfigure(c, weight=1 if c % 2 == 1 else 0)
    _add_pair(f3, 0, 0, "Gitea 地址", ttk.Entry(f3, textvariable=gitea_url_val))
    _add_pair(f3, 0, 1, "仓库所有者", ttk.Entry(f3, textvariable=gitea_owner_val))
    _add_pair(f3, 1, 0, "仓库名称", ttk.Entry(f3, textvariable=gitea_repo_val))
    _add_pair(f3, 1, 1, "Access Token", ttk.Entry(f3, textvariable=gitea_token_val, show='*'))
    # ═══════════════════════════════════════════════════
    #  Button area
    # ═══════════════════════════════════════════════════
    btns = ttk.Frame(content)
    btns.pack(fill=tk.X, pady=(4, 0))
    def save_settings():
        try:
            s = load_user_settings()
            s['log_level'] = log_level_val.get()
            s['concurrency_max_workers'] = int(max_workers_val.get() or '4')
            s['concurrency_batch_size'] = int(batch_size_val.get() or '5')
            tp = template_path_val.get()
            inp = input_dir_val.get()
            outp = output_dir_val.get()
            resp = result_dir_val.get()
            def _to_relative(path):
                """Convert an absolute path to one relative to the working directory; return it unchanged on failure."""
                if not path or not os.path.isabs(path):
                    return path
                try:
                    return os.path.relpath(path, os.getcwd())
                except Exception:
                    # e.g. a different drive on Windows; keep this path as entered
                    # without skipping the conversion of the remaining paths.
                    return path
            tp, inp, outp, resp = (_to_relative(p) for p in (tp, inp, outp, resp))
            s['template_path'] = tp
            s['input_folder'] = inp
            s['output_folder'] = outp
            s['result_folder'] = resp
            save_user_settings(s)
            try:
                from app.core.utils.log_utils import set_log_level
                set_log_level(s['log_level'])
            except Exception:
                pass
            try:
                tpl_path = s['template_path']
                tpl_dir = os.path.dirname(tpl_path)
                tpl_name = os.path.basename(tpl_path)
                cfg.update('Paths', 'template_folder', tpl_dir)
                cfg.update('Templates', 'purchase_order', tpl_name)
                try:
                    cfg.update('Paths', 'template_file', os.path.join(tpl_dir, tpl_name))
                except Exception:
                    pass
                cfg.update('Paths', 'input_folder', s['input_folder'])
                cfg.update('Paths', 'output_folder', s['output_folder'])
                cfg.update('Performance', 'max_workers', s['concurrency_max_workers'])
                cfg.update('Performance', 'batch_size', s['concurrency_batch_size'])
                cfg.update('API', 'api_key', api_key_val.get())
                cfg.update('API', 'secret_key', secret_key_val.get())
                cfg.update('API', 'timeout', timeout_val.get())
                cfg.update('API', 'max_retries', max_retries_val.get())
                cfg.update('API', 'retry_delay', retry_delay_val.get())
                cfg.update('API', 'api_url', api_url_val.get())
                cfg.update('Gitea', 'base_url', gitea_url_val.get())
                cfg.update('Gitea', 'owner', gitea_owner_val.get())
                cfg.update('Gitea', 'repo', gitea_repo_val.get())
                cfg.update('Gitea', 'token', gitea_token_val.get())
                cfg.save_config()
            except Exception:
                pass
            messagebox.showinfo("设置已保存", "系统设置已更新并保存")
            dlg.destroy()
        except Exception as e:
            messagebox.showerror("保存失败", str(e))
    ttk.Button(btns, text="云端同步", command=lambda: show_cloud_sync_dialog(dlg)).pack(side=tk.LEFT)
    ttk.Button(btns, text="取消", command=dlg.destroy).pack(side=tk.RIGHT)
    ttk.Button(btns, text="保存", command=save_settings).pack(side=tk.RIGHT, padx=6)
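The path handling in `save_settings` above can be looked at in isolation. The following is a minimal sketch, not the repository's code: `to_relative` is a hypothetical helper name, and it assumes the intent is to store absolute paths relative to a base directory while leaving relative or empty values untouched.

```python
import os


def to_relative(path, base=None):
    """Return `path` relative to `base` if it is absolute; otherwise unchanged."""
    if not path or not os.path.isabs(path):
        return path
    try:
        return os.path.relpath(path, base or os.getcwd())
    except ValueError:
        # os.path.relpath raises ValueError on Windows when the drives differ.
        return path


# Hypothetical directory layout, used only for illustration.
base = os.path.abspath(os.path.join("project", "root"))
absolute = os.path.join(base, "templates", "order.xlsx")

print(to_relative(absolute, base))      # path below base becomes relative
print(to_relative("data/input", base))  # already relative: unchanged
print(repr(to_relative("", base)))      # empty value: unchanged
```

Handling the failure per path means one unconvertible value does not prevent the others from being normalized.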
@@ -1,41 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """Error-handling utility module."""
 from tkinter import messagebox
 from typing import Optional
 from app.core.utils.log_utils import get_logger
 logger = get_logger(__name__)
 def show_error_dialog(title: str, message: str, suggestion: Optional[str] = None):
    try:
        full_msg = message
        if suggestion:
            full_msg = f"{message}\n\n建议操作：\n- {suggestion}"
        messagebox.showerror(title, full_msg)
    except Exception as e:
        logger.debug(f"显示错误对话框失败: {e}")
 def get_error_suggestion(message: str) -> Optional[str]:
    msg = (message or "").lower()
    if 'openpyxl' in msg or ('engine' in msg and 'xlsx' in msg):
        return '安装依赖：pip install openpyxl'
    if 'xlrd' in msg or ('engine' in msg and 'xls' in msg):
        return '安装依赖：pip install xlrd'
    if 'timeout' in msg or 'timed out' in msg:
        return '检查网络，增大API超时时间或稍后重试'
    if 'invalid access_token' in msg or 'access token' in msg:
        return '刷新百度OCR令牌或检查api_key/secret_key'
    if '429' in msg or 'too many requests' in msg:
        return '降低识别频率或稍后重试'
    if '模板文件不存在' in msg or ('no such file' in msg and '模板' in msg):
        return '在系统设置中选择正确的模板文件路径'
    if '没有找到采购单' in msg or ('未在' in msg and '找到采购单' in msg):
        return '确认result目录内存在采购单文件'
    if 'permission denied' in msg:
        return '以管理员权限运行或更改目录写入权限'
    return None
@@ -1,205 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """File and directory operations module."""
 import os
 import json
 import tkinter as tk
 from tkinter import messagebox, filedialog, scrolledtext
 from .logging_ui import add_to_log
 from .ui_widgets import center_window
 from app.config.settings import ConfigManager
 def select_file(log_widget, file_types=None, title="选择文件"):
    """Generic file-selection dialog."""
    if file_types is None:
        file_types = [("所有文件", "*.*")]
    file_path = filedialog.askopenfilename(title=title, filetypes=file_types)
    if file_path:
        add_to_log(log_widget, f"已选择文件: {file_path}\n", "info")
    return file_path
 def select_excel_file(log_widget):
    """Select an Excel file."""
    return select_file(
        log_widget,
        [("Excel文件", "*.xlsx *.xls"), ("所有文件", "*.*")],
        "选择Excel文件"
    )
 def ensure_directories():
    """Ensure the required directory structure exists."""
    config = ConfigManager()
    directories = [
        config.get_path('Paths', 'input_folder', fallback='data/input', create=True),
        config.get_path('Paths', 'output_folder', fallback='data/output', create=True),
        config.get_path('Paths', 'result_folder', fallback='data/result', create=True),
        config.get_path('Paths', 'temp_folder', fallback='data/temp', create=True),
        os.path.join(config.app_root, 'logs')
    ]
    for directory in directories:
        if not os.path.exists(directory):
            os.makedirs(directory, exist_ok=True)
            print(f"创建目录: {directory}")
 def clean_cache(log_widget):
    """Clear the processing cache."""
    from .command_runner import set_running_task
    try:
        config = ConfigManager()
        processed_record = config.get_path('Paths', 'processed_record', fallback='data/processed_files.json')
        output_folder = config.get_path('Paths', 'output_folder', fallback='data/output')
        cache_files = [
            processed_record,
            os.path.join(output_folder, "processed_files.json"),
            os.path.join(output_folder, "merged_files.json")
        ]
        for cache_file in cache_files:
            if os.path.exists(cache_file):
                os.remove(cache_file)
                add_to_log(log_widget, f"已清除缓存文件: {cache_file}\n", "success")
        temp_dir = config.get_path('Paths', 'temp_folder', fallback='data/temp')
        if os.path.exists(temp_dir):
            for file in os.listdir(temp_dir):
                file_path = os.path.join(temp_dir, file)
                try:
                    if os.path.isfile(file_path):
                        os.remove(file_path)
                        add_to_log(log_widget, f"已清除临时文件: {file_path}\n", "info")
                except Exception as e:
                    add_to_log(log_widget, f"清除文件时出错: {file_path}, 错误: {str(e)}\n", "error")
        log_dir = os.path.join(config.app_root, 'logs')
        if os.path.exists(log_dir):
            for file in os.listdir(log_dir):
                if file.endswith(".active"):
                    file_path = os.path.join(log_dir, file)
                    try:
                        os.remove(file_path)
                        add_to_log(log_widget, f"已清除活动日志标记: {file_path}\n", "info")
                    except Exception as e:
                        add_to_log(log_widget, f"清除文件时出错: {file_path}, 错误: {str(e)}\n", "error")
        set_running_task(None)
        add_to_log(log_widget, "缓存清除完成，系统将重新处理所有文件\n", "success")
        messagebox.showinfo("缓存清除", "缓存已清除，系统将重新处理所有文件。")
    except Exception as e:
        add_to_log(log_widget, f"清除缓存时出错: {str(e)}\n", "error")
        messagebox.showerror("错误", f"清除缓存时出错: {str(e)}")
 def open_result_directory():
    try:
        config = ConfigManager()
        result_dir = config.get_path('Paths', 'result_folder', fallback='data/result', create=True)
        os.startfile(result_dir)
    except Exception as e:
        messagebox.showerror("错误", f"无法打开结果目录: {str(e)}")
 def _open_directory_from_settings(config_key, default_path, label):
    """Read a path from the config and open that directory."""
    try:
        config = ConfigManager()
        path = config.get_path('Paths', config_key, fallback=default_path, create=True)
        os.startfile(path)
    except Exception as e:
        messagebox.showerror("错误", f"无法打开{label}: {str(e)}")
 def open_input_directory_from_settings():
    _open_directory_from_settings('input_folder', 'data/input', '输入目录')
 def open_output_directory_from_settings():
    _open_directory_from_settings('output_folder', 'data/output', '输出目录')
 def open_result_directory_from_settings():
    _open_directory_from_settings('result_folder', 'data/result', '结果目录')
 def clean_data_files(log_widget):
    """Clean data files (input and output directories only)."""
    try:
        if not messagebox.askyesno("确认清理", "确定要清理input和output目录的文件吗？这将删除所有输入和输出数据。"):
            add_to_log(log_widget, "操作已取消\n", "info")
            return
        config = ConfigManager()
        files_cleaned = 0
        input_dir = config.get_path('Paths', 'input_folder', fallback='data/input')
        if os.path.exists(input_dir):
            for file in os.listdir(input_dir):
                file_path = os.path.join(input_dir, file)
                if os.path.isfile(file_path):
                    os.remove(file_path)
                    files_cleaned += 1
            add_to_log(log_widget, "已清理input目录\n", "info")
        output_dir = config.get_path('Paths', 'output_folder', fallback='data/output')
        if os.path.exists(output_dir):
            for file in os.listdir(output_dir):
                file_path = os.path.join(output_dir, file)
                if os.path.isfile(file_path):
                    os.remove(file_path)
                    files_cleaned += 1
            add_to_log(log_widget, "已清理output目录\n", "info")
        add_to_log(log_widget, f"清理完成，共清理 {files_cleaned} 个文件\n", "success")
        messagebox.showinfo("清理完成", f"已成功清理 {files_cleaned} 个文件")
    except Exception as e:
        add_to_log(log_widget, f"清理数据文件时出错: {str(e)}\n", "error")
        messagebox.showerror("错误", f"清理数据文件时出错: {str(e)}")
 def clean_result_files(log_widget):
    try:
        if not messagebox.askyesno("确认清理", "确定要清理result目录的文件吗？这将删除所有已生成的采购单文件。"):
            add_to_log(log_widget, "操作已取消\n", "info")
            return
        config = ConfigManager()
        count = 0
        result_dir = config.get_path('Paths', 'result_folder', fallback='data/result')
        if os.path.exists(result_dir):
            for file in os.listdir(result_dir):
                file_path = os.path.join(result_dir, file)
                if os.path.isfile(file_path):
                    os.remove(file_path)
                    count += 1
        add_to_log(log_widget, f"已清理result目录，共 {count} 个文件\n", "success")
        messagebox.showinfo("清理完成", f"已清理result目录 {count} 个文件")
    except Exception as e:
        add_to_log(log_widget, f"清理result目录时出错: {str(e)}\n", "error")
        messagebox.showerror("错误", f"清理result目录时出错: {str(e)}")
 def validate_unit_price_against_item_data(result_path: str, log_widget=None):
    try:
        from app.services.order_service import OrderService
        service = OrderService()
        bad_results = service.validate_unit_price(result_path)
        if bad_results:
            display_count = min(len(bad_results), 10)
            msg = f"存在{len(bad_results)}条单价与商品资料进货价差异超过1元:\n" + "\n".join(bad_results[:display_count])
            if len(bad_results) > 10:
                msg += f"\n...(其余 {len(bad_results) - 10} 条已省略)"
            messagebox.showwarning("单价校验提示", msg)
            if log_widget is not None:
                add_to_log(log_widget, f"单价校验发现{len(bad_results)}条差异>1元\n", "warning")
        else:
            if log_widget is not None:
                add_to_log(log_widget, "单价校验通过（差异<=1元）\n", "success")
    except Exception as e:
        if log_widget is not None:
            add_to_log(log_widget, f"单价校验出错: {str(e)}\n", "error")
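The warning text built in `validate_unit_price_against_item_data` caps the list at ten entries and notes how many were left out. A minimal sketch of that truncation follows. `summarize_mismatches` and the English wording are assumptions made for illustration; the sample entries are made up.

```python
def summarize_mismatches(items, limit=10):
    """Build a summary that lists at most `limit` entries and counts the rest."""
    if not items:
        return ""
    header = f"{len(items)} unit prices differ from the item-data purchase price by more than 1 yuan:"
    lines = [header] + list(items[:limit])
    if len(items) > limit:
        lines.append(f"...({len(items) - limit} more omitted)")
    return "\n".join(lines)


# Twelve made-up entries: ten are listed, two are summarized.
sample = [f"item-{n}" for n in range(1, 13)]
summary = summarize_mismatches(sample)
print(summary)
```

Keeping the cap in one `limit` parameter avoids repeating the literal `10` in both the slice and the omitted-count message.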
@@ -1,126 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """GUI log-handling module."""
 import logging
 import queue
 import sys
 import tkinter as tk
 # Global log queue used to update the UI asynchronously
 LOG_QUEUE = queue.Queue()
 class LogRedirector:
    """Log redirector that captures command output and shows it in the UI."""
    def __init__(self, text_widget):
        self.text_widget = text_widget
        self.buffer = ""
        self.terminal = sys.__stdout__
    def write(self, string):
        self.buffer += string
        self.terminal.write(string)
        self.text_widget.after(0, self.update_text_widget)
    def update_text_widget(self):
        self.text_widget.configure(state=tk.NORMAL)
        if self.buffer.strip():
            if any(marker in self.buffer.lower() for marker in ["错误", "error", "失败", "异常", "exception"]):
                self.text_widget.insert(tk.END, self.buffer, "error")
            elif any(marker in self.buffer.lower() for marker in ["警告", "warning"]):
                self.text_widget.insert(tk.END, self.buffer, "warning")
            elif any(marker in self.buffer.lower() for marker in ["成功", "success", "完成", "成功处理"]):
                self.text_widget.insert(tk.END, self.buffer, "success")
            elif any(marker in self.buffer.lower() for marker in ["info", "信息", "开始", "处理中"]):
                self.text_widget.insert(tk.END, self.buffer, "info")
            else:
                self.text_widget.insert(tk.END, self.buffer, "normal")
        else:
            self.text_widget.insert(tk.END, self.buffer)
        self.text_widget.see(tk.END)
        self.text_widget.configure(state=tk.DISABLED)
        self.buffer = ""
    def flush(self):
        self.terminal.flush()
 class GUILogHandler(logging.Handler):
    """Custom log handler that puts records on a queue for the GUI main thread to consume periodically."""
    def __init__(self, text_widget):
        super().__init__()
        self.text_widget = text_widget
    def emit(self, record):
        try:
            msg = self.format(record)
            if record.levelno >= logging.ERROR:
                tag = "error"
            elif record.levelno >= logging.WARNING:
                tag = "warning"
            elif record.levelno >= logging.INFO:
                tag = "info"
            else:
                tag = "normal"
            LOG_QUEUE.put((msg + "\n", tag))
        except Exception:
            self.handleError(record)
 def poll_log_queue(text_widget):
    """Periodically read log entries from the queue and update the UI."""
    try:
        updated = False
        while not LOG_QUEUE.empty():
            msg, tag = LOG_QUEUE.get_nowait()
            text_widget.configure(state=tk.NORMAL)
            text_widget.insert(tk.END, msg, tag)
            updated = True
        if updated:
            text_widget.see(tk.END)
            text_widget.configure(state=tk.DISABLED)
    except Exception:
        pass
    finally:
        text_widget.after(100, lambda: poll_log_queue(text_widget))
 def init_gui_logger(text_widget, level=logging.INFO):
    handler = GUILogHandler(text_widget)
    handler.setLevel(level)
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    handler.setFormatter(formatter)
    root_logger = logging.getLogger()
    for h in root_logger.handlers[:]:
        if isinstance(h, logging.StreamHandler):
            root_logger.removeHandler(h)
    if not any(isinstance(h, GUILogHandler) for h in root_logger.handlers):
        root_logger.addHandler(handler)
    root_logger.setLevel(level)
    return handler
 def dispose_gui_logger():
    root_logger = logging.getLogger()
    for handler in root_logger.handlers[:]:
        if isinstance(handler, GUILogHandler):
            root_logger.removeHandler(handler)
            try:
                handler.close()
            except Exception:
                pass
 def add_to_log(log_widget, text, tag="normal"):
    """Put text on the log queue; poll_log_queue consumes it and updates the UI."""
    if log_widget is None:
        print(f"[{tag}] {text}", end="")
        return
    LOG_QUEUE.put((text, tag))
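The handler-plus-queue pattern used by `GUILogHandler` and `poll_log_queue` can be exercised without Tk. This is a minimal sketch under that assumption: `QueueTagHandler`, `drain`, and the logger name are hypothetical, and a real GUI would call something like `drain` from a timer on the main thread.

```python
import logging
import queue

log_queue = queue.Queue()


class QueueTagHandler(logging.Handler):
    """Put (message, tag) pairs on a queue instead of touching any widget."""

    def emit(self, record):
        try:
            msg = self.format(record)
            if record.levelno >= logging.ERROR:
                tag = "error"
            elif record.levelno >= logging.WARNING:
                tag = "warning"
            elif record.levelno >= logging.INFO:
                tag = "info"
            else:
                tag = "normal"
            log_queue.put((msg + "\n", tag))
        except Exception:
            self.handleError(record)


def drain(q):
    """Collect everything currently queued without blocking."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


logger = logging.getLogger("sketch.gui_log")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the sketch isolated from root handlers
logger.addHandler(QueueTagHandler())

logger.debug("loading")
logger.warning("slow response")
logger.error("request failed")

drained = drain(log_queue)
print(drained)
```

Worker threads only ever call `put`, so all widget updates can stay on the thread that drains the queue.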
@@ -1,501 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """Main window module."""
 import os
 import sys
 import subprocess
 import tkinter as tk
 from tkinter import messagebox, filedialog, scrolledtext
 from app.config.settings import ConfigManager
 from app.core.utils.log_utils import set_log_level
 from .theme import THEMES, get_theme_mode, set_theme_mode, create_modern_button, create_card_frame
 from .logging_ui import add_to_log, poll_log_queue
 from .ui_widgets import StatusBar
 from .user_settings import (
    load_user_settings, save_user_settings, refresh_recent_list_widget,
    _extract_path_from_recent_item, clear_recent_files, RECENT_LIST_WIDGET,
 )
 from .file_operations import (
    ensure_directories, open_result_directory, clean_cache,
    clean_data_files, clean_result_files,
 )
 from .action_handlers import (
    process_single_image_with_status, run_pipeline_directly,
    batch_ocr_with_status, batch_process_orders_with_status,
    merge_orders_with_status, process_excel_file_with_status,
    process_dropped_file,
 )
 from .memory_editor import show_memory_editor
 from .config_dialog import show_config_dialog
 from .barcode_editor import edit_barcode_mappings
 from .shortcuts import bind_keyboard_shortcuts
 from app.core.utils.dialog_utils import show_cloud_sync_dialog
 def _init_window():
    """Initialize the window, theme, and settings; return (root, theme, settings, dnd_supported)."""
    ensure_directories()
    dnd_supported = False
    try:
        from tkinterdnd2 import TkinterDnD, DND_FILES
        root = TkinterDnD.Tk()
        dnd_supported = True
    except Exception:
        root = tk.Tk()
    settings = load_user_settings()
    theme_mode = settings.get('theme_mode', get_theme_mode())
    set_theme_mode(theme_mode)
    try:
        cfg_for_title = ConfigManager()
        ver = cfg_for_title.get('App', 'version', fallback='dev')
        root.title(f"益选-OCR订单处理系统 v{ver} by 欢欢欢")
    except Exception:
        root.title("益选-OCR订单处理系统 by 欢欢欢")
    root.geometry("900x600")
    settings['window_size'] = "900x600"
    theme = THEMES[get_theme_mode()]
    root.configure(bg=theme["bg"])
    try:
        log_level = settings.get('log_level')
        if log_level:
            set_log_level(log_level)
        concurrency = settings.get('concurrency_max_workers')
        if concurrency:
            cfg = ConfigManager()
            cfg.update('Performance', 'max_workers', str(concurrency))
            cfg.save_config()
    except Exception:
        pass
    try:
        root.iconbitmap(default="")
    except Exception:
        pass
    return root, theme, settings, dnd_supported
 def _create_left_panel(content_frame, theme, log_text, status_bar):
    """Create the left panel: full pipeline, OCR processing, Excel processing, recent files."""
    left_panel = create_card_frame(content_frame)
    left_panel.pack(side=tk.LEFT, fill=tk.BOTH, expand=False, padx=(0, 5), pady=5)
    left_panel.configure(width=160)
    panel_content = tk.Frame(left_panel, bg=theme["card_bg"])
    panel_content.pack(fill=tk.BOTH, expand=True, padx=10, pady=(5, 10))
    # Full pipeline section
    pipeline_section = tk.LabelFrame(
        panel_content, text="完整流程", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    pipeline_section.pack(fill=tk.X, pady=(0, 8))
    pipeline_frame = tk.Frame(pipeline_section, bg=theme["card_bg"])
    pipeline_frame.pack(fill=tk.X, padx=8, pady=6)
    create_modern_button(pipeline_frame, "一键处理", lambda: run_pipeline_directly(log_text, status_bar), "primary", px_width=150, px_height=32).pack(anchor='w', pady=3)
    # OCR processing section
    core_section = tk.LabelFrame(
        panel_content, text="OCR处理", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    core_section.pack(fill=tk.X, pady=(0, 8))
    core_buttons_frame = tk.Frame(core_section, bg=theme["card_bg"])
    core_buttons_frame.pack(fill=tk.X, padx=8, pady=6)
    core_row1 = tk.Frame(core_buttons_frame, bg=theme["card_bg"])
    core_row1.pack(fill=tk.X, pady=3)
    create_modern_button(core_row1, "批量识别", lambda: batch_ocr_with_status(log_text, status_bar), "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(0, 3))
    create_modern_button(core_row1, "单个识别", lambda: process_single_image_with_status(log_text, status_bar), "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(3, 0))
    # Excel processing section
    ocr_section = tk.LabelFrame(
        panel_content, text="Excel处理", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    ocr_section.pack(fill=tk.X, pady=(0, 8))
    ocr_buttons_frame = tk.Frame(ocr_section, bg=theme["card_bg"])
    ocr_buttons_frame.pack(fill=tk.X, padx=8, pady=6)
    ocr_row1 = tk.Frame(ocr_buttons_frame, bg=theme["card_bg"])
    ocr_row1.pack(fill=tk.X, pady=3)
    create_modern_button(ocr_row1, "批量处理", lambda: batch_process_orders_with_status(log_text, status_bar), "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(0, 3))
    create_modern_button(ocr_row1, "单个处理", lambda: process_excel_file_with_status(log_text, status_bar), "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(3, 0))
    # Recent files section
    _create_recent_files_section(panel_content, theme, log_text)
 def _create_recent_files_section(parent, theme, log_text):
    """Create the recent-files list area."""
    recent_section = tk.LabelFrame(
        parent, text="最近文件", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    recent_section.pack(fill=tk.BOTH, pady=(0, 12))
    recent_frame = tk.Frame(recent_section, bg=theme["card_bg"])
    recent_frame.pack(fill=tk.BOTH, padx=8, pady=6)
    recent_top = tk.Frame(recent_frame, bg=theme["card_bg"])
    recent_top.pack(fill=tk.X)
    def _resize_recent_top(e):
        try:
            h = max(int(e.height * 0.85), 180)
            recent_top.configure(height=h)
        except Exception:
            pass
    try:
        recent_top.pack_propagate(False)
    except Exception:
        pass
    recent_frame.bind('<Configure>', _resize_recent_top)
    recent_rect = tk.Frame(recent_top, bg=theme["card_bg"], highlightbackground=theme["border"], highlightthickness=1)
    recent_rect.pack(fill=tk.BOTH, expand=True)
    recent_list = tk.Listbox(recent_rect, height=20)
    recent_scrollbar = tk.Scrollbar(recent_rect)
    recent_list.configure(yscrollcommand=recent_scrollbar.set)
    recent_scrollbar.configure(command=recent_list.yview)
    recent_list.pack(side=tk.LEFT, fill=tk.BOTH, expand=True)
    recent_scrollbar.pack(side=tk.RIGHT, fill=tk.Y)
    import app.ui.user_settings as _us_mod
    _us_mod.RECENT_LIST_WIDGET = recent_list
    def _open_selected_event(evt=None):
        try:
            idxs = recent_list.curselection()
            if not idxs:
                return
            p = _extract_path_from_recent_item(recent_list.get(idxs[0]))
            if os.path.exists(p):
                os.startfile(p)
            else:
                messagebox.showwarning("文件不存在", p)
        except Exception as e:
            messagebox.showerror("打开失败", str(e))
    recent_list.bind('<Double-Button-1>', _open_selected_event)
    refresh_recent_list_widget()
    rf_btns = tk.Frame(recent_frame, bg=theme["card_bg"])
    rf_btns.pack(fill=tk.X, pady=6)
    def clear_list():
        clear_recent_files()
        recent_list.delete(0, tk.END)
    create_modern_button(rf_btns, "清空列表", clear_list, "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(3, 0))
    def purge_invalid():
        try:
            kept = []
            for i in range(recent_list.size()):
                item = recent_list.get(i)
                p = _extract_path_from_recent_item(item)
                if os.path.exists(p):
                    kept.append(p)
            try:
                kept_sorted = sorted(kept, key=lambda p: os.path.getmtime(p), reverse=True)
            except Exception:
                kept_sorted = kept
            s = load_user_settings()
            s['recent_files'] = kept_sorted
            save_user_settings(s)
            recent_list.delete(0, tk.END)
            for i, p in enumerate(kept_sorted, start=1):
                recent_list.insert(tk.END, f"{i}. {p}")
            refresh_recent_list_widget()
            add_to_log(log_text, "已清理无效的最近文件条目\n", "success")
        except Exception as e:
            messagebox.showerror("清理失败", str(e))
    create_modern_button(rf_btns, "清理无效", purge_invalid, "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(3, 0))
 def _create_right_panel(content_frame, theme, log_text, root):
    """Create the right panel: quick actions and system settings."""
    right_panel = create_card_frame(content_frame)
    right_panel.pack(side=tk.RIGHT, fill=tk.BOTH, expand=False, padx=(5, 0), pady=5)
    right_panel.configure(width=380)
    right_panel_content = tk.Frame(right_panel, bg=theme["card_bg"])
    right_panel_content.pack(fill=tk.BOTH, expand=True, padx=10, pady=(5, 10))
    # Tools section
    tools_section = tk.LabelFrame(
        right_panel_content, text="快捷操作", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    tools_section.pack(fill=tk.X, pady=(0, 8))
    tools_buttons_frame = tk.Frame(tools_section, bg=theme["card_bg"])
    tools_buttons_frame.pack(fill=tk.X, padx=8, pady=6)
    tk.Frame(tools_buttons_frame, bg=theme["card_bg"]).pack(fill=tk.X, pady=3)
    create_modern_button(tools_buttons_frame, "打开结果目录", lambda: open_result_directory(), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(tools_buttons_frame, "打开输出目录", lambda: os.startfile(ConfigManager().get_path('Paths', 'output_folder', fallback='data/output', create=True)), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(tools_buttons_frame, "打开输入目录", lambda: os.startfile(ConfigManager().get_path('Paths', 'input_folder', fallback='data/input', create=True)), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
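    # NOTE: StatusBar(root) constructs a new StatusBar instance on every click; this panel is not passed the main window's status bar.
    create_modern_button(tools_buttons_frame, "合并订单", lambda: merge_orders_with_status(log_text, StatusBar(root)), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)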
    create_modern_button(tools_buttons_frame, "清除缓存", lambda: clean_cache(log_text), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(tools_buttons_frame, "清理input/out文件", lambda: clean_data_files(log_text), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(tools_buttons_frame, "清理result文件", lambda: clean_result_files(log_text), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    # System settings section
    settings_section = tk.LabelFrame(
        right_panel_content, text="系统设置", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    settings_section.pack(fill=tk.X, pady=(0, 8))
    settings_buttons_frame = tk.Frame(settings_section, bg=theme["card_bg"])
    settings_buttons_frame.pack(fill=tk.X, padx=8, pady=6)
    create_modern_button(settings_buttons_frame, "系统设置", lambda: show_config_dialog(root, ConfigManager()), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(settings_buttons_frame, "条码映射", lambda: edit_barcode_mappings(log_text), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(settings_buttons_frame, "云端同步", lambda: show_cloud_sync_dialog(root), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(settings_buttons_frame, "商品记忆库", lambda: show_memory_editor(root), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
 def _setup_drag_area(mid_container, theme, dnd_supported, log_text, status_bar):
    """Create the drag-and-drop / click-to-select file area."""
    drag_panel = create_card_frame(mid_container)
    drag_panel.pack(side=tk.TOP, fill=tk.X, padx=(5, 5), pady=(0, 5))
    drag_panel_content = tk.Frame(drag_panel, bg=theme["card_bg"])
    drag_panel_content.pack(fill=tk.X, padx=10, pady=6)
    dnd_section = tk.LabelFrame(
        drag_panel_content, bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    dnd_section.pack(fill=tk.X, pady=(0, 0))
    dnd_frame = tk.Frame(dnd_section, bg=theme["card_bg"], highlightthickness=1, highlightbackground=theme["border"])
    dnd_frame.configure(height=60)
    dnd_frame.pack(fill=tk.X, padx=8, pady=6)
    try:
        dnd_frame.pack_propagate(False)
    except Exception:
        pass
    def _set_highlight(active: bool):
        try:
            dnd_frame.configure(highlightbackground=theme["info"] if active else theme["border"])
        except Exception:
            pass
    dnd_frame.bind('<Enter>', lambda e: _set_highlight(True))
    dnd_frame.bind('<Leave>', lambda e: _set_highlight(False))
    msg_row = tk.Frame(dnd_frame, bg=theme["card_bg"])
    msg_row.pack(fill=tk.X)
    if dnd_supported:
        tk.Label(
            msg_row, text="拖拽已启用：拖拽或点击此区域选择文件",
            bg=theme["card_bg"], fg="#999999", justify="center"
        ).pack(fill=tk.X)
    else:
        tk.Label(
            msg_row, text="点击此区域选择文件；可安装拖拽支持",
            bg=theme["card_bg"], fg="#999999", justify="center"
        ).pack(fill=tk.X)
    if not dnd_supported:
        btn_row = tk.Frame(dnd_frame, bg=theme["card_bg"])
        btn_row.pack(fill=tk.X)
        is_frozen = getattr(sys, 'frozen', False)
        def copy_install():
            try:
                mid_container.winfo_toplevel().clipboard_clear()
                mid_container.winfo_toplevel().clipboard_append("pip install tkinterdnd2")
                messagebox.showinfo("已复制", "已复制安装命令：pip install tkinterdnd2")
            except Exception as e:
                messagebox.showwarning("复制失败", str(e))
        if is_frozen:
            tk.Label(
                btn_row, text="EXE版不支持运行时安装，请用源码版安装后重新打包",
                bg=theme["card_bg"], fg="#999999", font=("Microsoft YaHei UI", 8)
            ).pack(side=tk.RIGHT, padx=4)
        else:
            def install_and_restart():
                try:
                    add_to_log(log_text, "开始安装拖拽支持库 tkinterdnd2...\n", "info")
                    cmd = [sys.executable, "-m", "pip", "install", "tkinterdnd2"]
                    result = subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
                    add_to_log(log_text, result.stdout + "\n", "info")
                    add_to_log(log_text, "安装成功，准备重启程序以启用拖拽...\n", "success")
                    if messagebox.askyesno("安装完成", "已安装拖拽支持，是否立即重启应用？"):
                        os.execl(sys.executable, sys.executable, *sys.argv)
                except subprocess.CalledProcessError as e:
                    add_to_log(log_text, f"安装失败: {e.stderr}\n", "error")
                    messagebox.showerror("安装失败", f"安装输出:\n{e.stderr}")
                except Exception as e:
                    add_to_log(log_text, f"安装失败: {str(e)}\n", "error")
                    messagebox.showerror("安装失败", str(e))
            create_modern_button(btn_row, "一键安装拖拽", install_and_restart, "primary", px_width=132, px_height=28).pack(side=tk.RIGHT, padx=(3, 0))
        create_modern_button(btn_row, "复制安装命令", copy_install, "primary", px_width=132, px_height=28).pack(side=tk.RIGHT)
    # Click the drop area to choose files
    def _click_select(evt=None):
        try:
            files = filedialog.askopenfilenames(
                title="选择图片或Excel文件",
                filetypes=[
                    ("支持文件", "*.xlsx *.xls *.jpg *.jpeg *.png *.bmp"),
                    ("Excel", "*.xlsx *.xls"),
                    ("图片", "*.jpg *.jpeg *.png *.bmp"),
                    ("所有文件", "*.*"),
                ]
            )
            if not files:
                return
            for p in files:
                process_dropped_file(log_text, status_bar, p)
        except Exception as e:
            messagebox.showerror("选择失败", str(e))
    dnd_frame.bind('<Button-1>', _click_select)
    msg_row.bind('<Button-1>', _click_select)
    if dnd_supported:
        def _on_drop(event):
            try:
                data = event.data
                paths = []
                buf = ""
                in_brace = False
                for ch in data:
                    if ch == '{':
                        in_brace = True
                        buf = ""
                    elif ch == '}':
                        in_brace = False
                        paths.append(buf)
                        buf = ""
                    elif ch == ' ' and not in_brace:
                        if buf:
                            paths.append(buf)
                            buf = ""
                    else:
                        buf += ch
                if buf:
                    paths.append(buf)
                for p in paths:
                    process_dropped_file(log_text, status_bar, p)
            except Exception as e:
                add_to_log(log_text, f"拖拽处理失败: {str(e)}\n", "error")
        try:
            from tkinterdnd2 import DND_FILES
            dnd_frame.drop_target_register(DND_FILES)
            dnd_frame.dnd_bind('<<Drop>>', _on_drop)
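        except Exception as e:
            # Report the failure instead of silently leaving drag-and-drop disabled.
            add_to_log(log_text, f"拖拽注册失败: {str(e)}\n", "warning")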
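The brace-aware loop in `_on_drop` can be isolated into a pure function, which makes it testable without a Tk window. The following is a minimal sketch under that assumption; `split_drop_paths` is a hypothetical name that does not exist in the original module.

```python
from typing import List


def split_drop_paths(data: str) -> List[str]:
    """Split a drop payload into paths.

    Paths containing spaces arrive wrapped in braces, e.g.
    "{C:/a b/x.png} C:/y.xlsx"; unbraced paths are separated by spaces.
    """
    paths: List[str] = []
    buf: List[str] = []
    in_brace = False
    for ch in data:
        if ch == "{" and not in_brace and not buf:
            # Opening brace starts a quoted path.
            in_brace = True
        elif ch == "}" and in_brace:
            # Closing brace ends the quoted path.
            in_brace = False
            paths.append("".join(buf))
            buf = []
        elif ch == " " and not in_brace:
            # A space outside braces separates two paths.
            if buf:
                paths.append("".join(buf))
                buf = []
        else:
            buf.append(ch)
    if buf:
        paths.append("".join(buf))
    return paths


if __name__ == "__main__":
    print(split_drop_paths("{C:/a b/x.png} C:/y.xlsx"))
```

Tk can also split such lists itself via `widget.tk.splitlist(event.data)`, which may be worth considering in place of a hand-written parser.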
 def _create_log_panel(mid_container, theme):
    """创建中间日志面板，返回 log_text widget"""
    log_panel = create_card_frame(mid_container, "处理日志")
    log_panel.pack(side=tk.TOP, fill=tk.BOTH, expand=True, padx=(5, 5), pady=5)
    log_text = scrolledtext.ScrolledText(
        log_panel, wrap=tk.WORD, width=68, height=26,
        bg=theme["log_bg"], fg=theme["log_fg"],
        font=("Consolas", 9), state=tk.DISABLED,
        relief="flat", borderwidth=0
    )
    log_text.pack(fill=tk.BOTH, expand=True, padx=10, pady=(5, 10))
    log_text.tag_configure("command", foreground=theme["info"], font=("Consolas", 9, "bold"))
    log_text.tag_configure("time", foreground=theme["secondary_bg"], font=("Consolas", 8))
    log_text.tag_configure("separator", foreground=theme["border"])
    log_text.tag_configure("success", foreground=theme["success"], font=("Consolas", 9, "bold"))
    log_text.tag_configure("error", foreground=theme["error"], font=("Consolas", 9, "bold"))
    log_text.tag_configure("warning", foreground=theme["warning"], font=("Consolas", 9, "bold"))
    log_text.tag_configure("info", foreground=theme["info"], font=("Consolas", 9))
    poll_log_queue(log_text)
    try:
        _ver = ConfigManager().get('App', 'version', fallback='')
        _ver_str = f" v{_ver}" if _ver else ""
    except Exception:
        _ver_str = ""
    add_to_log(log_text, f"欢迎使用 益选-OCR订单处理系统{_ver_str}\n", "success")
    add_to_log(log_text, "系统已就绪，请选择相应功能进行操作。\n\n", "info")
    add_to_log(log_text, "功能说明：\n", "command")
    add_to_log(log_text, "• 完整处理流程：一键完成OCR识别和Excel处理\n", "info")
    add_to_log(log_text, "• 批量处理订单：批量处理多个订单文件\n", "info")
    add_to_log(log_text, "• 处理烟草订单：专门处理烟草类订单\n", "info")
    add_to_log(log_text, "• 合并订单：将多个订单合并为一个文件\n\n", "info")
    cfg = ConfigManager()
    add_to_log(log_text, f"请将需要处理的图片文件放入 {cfg.get_path('Paths', 'input_folder', fallback='data/input')} 目录中。\n", "warning")
    add_to_log(log_text, f"OCR识别结果保存在 {cfg.get_path('Paths', 'output_folder', fallback='data/output')} 目录，处理完成的订单保存在 {cfg.get_path('Paths', 'result_folder', fallback='data/result')} 目录中。\n\n", "warning")
    add_to_log(log_text, "=" * 50 + "\n\n", "separator")
    return log_text
 def main():
    """主函数"""
    try:
        root, theme, settings, dnd_supported = _init_window()
        # 主容器
        main_container = tk.Frame(root, bg=theme["bg"])
        main_container.pack(fill=tk.BOTH, expand=True, padx=10, pady=10)
        content_frame = tk.Frame(main_container, bg=theme["bg"])
        content_frame.pack(fill=tk.BOTH, expand=True)
        # 中间容器（拖拽区 + 日志区）
        mid_container = tk.Frame(content_frame, bg=theme["bg"])
        mid_container.pack(side=tk.LEFT, fill=tk.BOTH, expand=True, padx=(5, 5), pady=5)
        log_text = _create_log_panel(mid_container, theme)
        # 状态栏
        status_bar = StatusBar(root)
        status_bar.pack(side=tk.BOTTOM, fill=tk.X)
        # 左侧面板
        _create_left_panel(content_frame, theme, log_text, status_bar)
        # 右侧面板
        _create_right_panel(content_frame, theme, log_text, root)
        # 拖拽区域
        _setup_drag_area(mid_container, theme, dnd_supported, log_text, status_bar)
        # 快捷键 + 关闭事件
        def on_close():
            try:
                w = root.winfo_width()
                h = root.winfo_height()
                settings['window_size'] = f"{w}x{h}"
                settings['theme_mode'] = get_theme_mode()
                save_user_settings(settings)
            except Exception:
                pass
            root.destroy()
        root.protocol("WM_DELETE_WINDOW", on_close)
        bind_keyboard_shortcuts(root, log_text, status_bar)
        root.mainloop()
    except Exception as e:
        import traceback
        error_msg = f"程序启动失败: {str(e)}\n详细错误信息:\n{traceback.format_exc()}"
        print(error_msg)
        try:
            import tkinter.messagebox as mb
            mb.showerror("启动错误", f"程序启动失败:\n{str(e)}")
        except Exception:
            pass
@@ -1,198 +0,0 @@
 """商品记忆库查看/编辑对话框"""
 import os
 import tkinter as tk
 from tkinter import ttk, messagebox, simpledialog
 from app.config.settings import ConfigManager
 from app.core.db.product_db import ProductDatabase
 from .ui_widgets import center_window
 def _get_product_db():
    cfg = ConfigManager()
    db_path = cfg.get_path('Paths', 'product_db', fallback='data/product_cache.db') if hasattr(cfg, 'get_path') else 'data/product_cache.db'
    tpl_folder = cfg.get('Paths', 'template_folder', fallback='templates')
    item_data = cfg.get('Templates', 'item_data', fallback='商品资料.xlsx')
    tpl_path = os.path.join(tpl_folder, item_data)
    return ProductDatabase(db_path, tpl_path)
 def show_memory_editor(root):
    """显示商品记忆库编辑器"""
    db = _get_product_db()
    dlg = tk.Toplevel(root)
    dlg.title("商品记忆库")
    dlg.geometry("950x520")
    center_window(dlg)
    # ── 顶部搜索栏 ──
    top = ttk.Frame(dlg)
    top.pack(fill=tk.X, padx=8, pady=(8, 4))
    ttk.Label(top, text="搜索:").pack(side=tk.LEFT)
    search_var = tk.StringVar()
    search_entry = ttk.Entry(top, textvariable=search_var, width=30)
    search_entry.pack(side=tk.LEFT, padx=4)
    # ── 统计标签 ──
    stats_label = ttk.Label(top, text="")
    stats_label.pack(side=tk.RIGHT)
    # ── Treeview ──
    columns = ("barcode", "name", "specification", "unit", "price", "source", "confidence", "usage_count", "last_seen")
    tree = ttk.Treeview(dlg, columns=columns, show="headings", height=18)
    headers = {
        "barcode": ("条码", 120),
        "name": ("名称", 180),
        "specification": ("规格", 80),
        "unit": ("单位", 50),
        "price": ("单价", 70),
        "source": ("来源", 80),
        "confidence": ("置信度", 60),
        "usage_count": ("使用次数", 70),
        "last_seen": ("最后使用", 140),
    }
    for col, (text, width) in headers.items():
        tree.heading(col, text=text)
        tree.column(col, width=width, anchor="center")
    # 置信度颜色标签
    tree.tag_configure("high", foreground="#28a745")    # >= 80 绿
    tree.tag_configure("medium", foreground="#ffc107")   # 50-79 黄
    tree.tag_configure("low", foreground="#dc3545")      # < 50 红
    scrollbar = ttk.Scrollbar(dlg, orient=tk.VERTICAL, command=tree.yview)
    tree.configure(yscrollcommand=scrollbar.set)
    tree.pack(side=tk.LEFT, fill=tk.BOTH, expand=True, padx=(8, 0), pady=4)
    scrollbar.pack(side=tk.LEFT, fill=tk.Y, padx=(0, 8), pady=4)
    # ── 数据加载 ──
    all_records = []
    def load_data(filter_text=""):
        nonlocal all_records
        all_records = db.get_all_memories()
        tree.delete(*tree.get_children())
        filtered = all_records
        if filter_text:
            ft = filter_text.lower()
            filtered = [r for r in all_records
                        if ft in str(r.get('barcode', '')).lower()
                        or ft in str(r.get('name', '')).lower()]
        for r in filtered:
            conf = r.get('confidence', 0) or 0
            tag = "high" if conf >= 80 else ("medium" if conf >= 50 else "low")
            last_seen = r.get('last_seen', '') or ''
            if last_seen and len(last_seen) > 16:
                last_seen = last_seen[:16]
            source_display = {
                'template': '模板',
                'ocr': 'OCR',
                'user_confirmed': '手动',
            }.get(r.get('source', ''), r.get('source', ''))
            tree.insert("", tk.END, values=(
                r.get('barcode', ''),
                r.get('name', ''),
                r.get('specification', ''),
                r.get('unit', ''),
                f"{r.get('price', 0):.2f}" if r.get('price') else '',
                source_display,
                conf,
                r.get('usage_count', 0) or 0,
                last_seen,
            ), tags=(tag,))
        stats_label.config(text=f"共 {len(filtered)} / {len(all_records)} 条")
    def on_search(*_):
        load_data(search_var.get())
    search_var.trace_add("write", on_search)
    # ── 按钮区 ──
    btn_frame = ttk.Frame(dlg)
    btn_frame.pack(fill=tk.X, padx=8, pady=(0, 8))
    def edit_selected():
        sel = tree.selection()
        if not sel:
            messagebox.showwarning("提示", "请先选择一条记录")
            return
        item = tree.item(sel[0])
        vals = item['values']
        barcode = vals[0]
        # 弹出编辑对话框
        edit_dlg = tk.Toplevel(dlg)
        edit_dlg.title(f"编辑: {barcode}")
        edit_dlg.geometry("380x260")
        center_window(edit_dlg)
        fields = [
            ("名称", "name", vals[1]),
            ("规格", "specification", vals[2]),
            ("单位", "unit", vals[3]),
            ("单价", "price", vals[4]),
        ]
        entries = {}
        for i, (label, key, val) in enumerate(fields):
            ttk.Label(edit_dlg, text=label).grid(row=i, column=0, sticky='w', padx=8, pady=4)
            var = tk.StringVar(value=str(val) if val else '')
            ttk.Entry(edit_dlg, textvariable=var, width=30).grid(row=i, column=1, padx=8, pady=4)
            entries[key] = var
        def save_edit():
            updates = {}
            for key, var in entries.items():
                v = var.get().strip()
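                if key == 'price':
                    try:
                        updates[key] = float(v) if v else 0
                    except ValueError:
                        # Reject non-numeric input instead of silently saving a price of 0.
                        messagebox.showwarning("提示", "单价必须是数字", parent=edit_dlg)
                        return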
                else:
                    updates[key] = v
            db.update_memory(barcode, updates)
            edit_dlg.destroy()
            load_data(search_var.get())
        ttk.Button(edit_dlg, text="保存", command=save_edit).grid(row=len(fields), column=0, columnspan=2, pady=12)
    def delete_selected():
        sel = tree.selection()
        if not sel:
            messagebox.showwarning("提示", "请先选择一条记录")
            return
        item = tree.item(sel[0])
        barcode = item['values'][0]
        if messagebox.askyesno("确认删除", f"确定要删除条码 {barcode} 的记忆记录吗？"):
            db.delete_memory(barcode)
            load_data(search_var.get())
    def reimport_template():
        if messagebox.askyesno("确认", "重新从商品资料导入将重置所有模板商品的置信度为100，确定继续吗？"):
            count = db.reimport()
            messagebox.showinfo("完成", f"已重新导入 {count} 条记录")
            load_data(search_var.get())
    ttk.Button(btn_frame, text="编辑", command=edit_selected).pack(side=tk.LEFT, padx=4)
    ttk.Button(btn_frame, text="删除", command=delete_selected).pack(side=tk.LEFT, padx=4)
    ttk.Button(btn_frame, text="重新导入模板", command=reimport_template).pack(side=tk.LEFT, padx=4)
    ttk.Button(btn_frame, text="刷新", command=lambda: load_data(search_var.get())).pack(side=tk.LEFT, padx=4)
    ttk.Button(btn_frame, text="关闭", command=dlg.destroy).pack(side=tk.RIGHT, padx=4)
    # 双击编辑
    tree.bind("<Double-1>", lambda e: edit_selected())
    # 初始加载
    load_data()
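The row tagging and search filter inside `load_data` are plain data logic and can be checked independently of the Treeview. A minimal sketch, assuming records are dictionaries with the same keys as above; `confidence_tag` and `filter_records` are hypothetical helper names, not functions from the original file.

```python
from typing import Dict, List, Optional


def confidence_tag(conf: Optional[float]) -> str:
    """Map a confidence score to the Treeview tag used for row colouring."""
    conf = conf or 0  # treat None / missing as 0
    if conf >= 80:
        return "high"
    if conf >= 50:
        return "medium"
    return "low"


def filter_records(records: List[Dict], filter_text: str) -> List[Dict]:
    """Keep records whose barcode or name contains filter_text (case-insensitive)."""
    if not filter_text:
        return list(records)
    ft = filter_text.lower()
    return [
        r for r in records
        if ft in str(r.get("barcode", "")).lower()
        or ft in str(r.get("name", "")).lower()
    ]


if __name__ == "__main__":
    sample = [
        {"barcode": "6958620703716", "name": "Cola", "confidence": 95},
        {"barcode": "6901234567890", "name": "Tea", "confidence": None},
    ]
    for rec in filter_records(sample, "69"):
        print(rec["barcode"], confidence_tag(rec.get("confidence")))
```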
@@ -1,377 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """Result preview dialogs."""
 import os
 import re
 import datetime
 import tkinter as tk
 from tkinter import messagebox, scrolledtext
 from .theme import THEMES, get_theme_mode, apply_theme
 from .ui_widgets import center_window
 from app.core.utils.file_utils import format_file_size
 from app.config.settings import ConfigManager
 TOBACCO_PREVIEW_WINDOW = None
 def _get_output_dir():
    """获取输出目录的绝对路径"""
    return ConfigManager().get_path('Paths', 'output_folder', fallback='data/output', create=True)
 def show_result_preview(command, output):
    """显示处理结果预览"""
    if "ocr" in command:
        show_ocr_result_preview(output)
    elif "excel" in command:
        show_excel_result_preview(output)
    elif "merge" in command:
        show_merge_result_preview(output)
    elif "pipeline" in command:
        show_pipeline_result_preview(output)
    else:
        messagebox.showinfo("处理完成", f"操作已成功完成！\n请在{_get_output_dir()}目录查看结果。")
 def show_ocr_result_preview(output):
    """显示OCR处理结果预览"""
    # Parse the summary line emitted by the OCR step (total and successful counts).
    processed_match = re.search(r'所有图片处理完成, 总计: (\d+), 成功: (\d+)', output)
    if processed_match:
        total = int(processed_match.group(1))
        success = int(processed_match.group(2))
        preview = tk.Toplevel()
        preview.title("OCR处理结果")
        preview.geometry("400x300")
        preview.resizable(False, False)
        center_window(preview)
        tk.Label(preview, text="OCR处理完成", font=("Arial", 16, "bold")).pack(pady=10)
        result_frame = tk.Frame(preview)
        result_frame.pack(pady=10, fill=tk.BOTH, expand=True)
        tk.Label(result_frame, text=f"总共处理: {total} 个文件", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"成功处理: {success} 个文件", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"失败数量: {total - success} 个文件", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        if success == total:
            result_text = "全部处理成功！"
            result_color = "#28a745"
        elif success > total * 0.8:
            result_text = "大部分处理成功。"
            result_color = "#ffc107"
        else:
            result_text = "处理失败较多，请检查日志。"
            result_color = "#dc3545"
        tk.Label(result_frame, text=result_text, font=("Arial", 12, "bold"), fg=result_color).pack(pady=10)
        button_frame = tk.Frame(preview)
        button_frame.pack(pady=10)
        tk.Button(button_frame, text="查看输出文件", command=lambda: os.startfile(_get_output_dir())).pack(side=tk.LEFT, padx=10)
        tk.Button(button_frame, text="关闭", command=preview.destroy).pack(side=tk.LEFT, padx=10)
    else:
        messagebox.showinfo("OCR处理完成", f"OCR处理已完成，请在{_get_output_dir()}目录查看结果。")
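The summary parsing and the success grading in `show_ocr_result_preview` can be separated from the window code. This is a sketch only: the regular expression is the one used above, while `parse_ocr_summary` and `grade_ocr_result` are hypothetical names introduced for illustration.

```python
import re
from typing import Optional, Tuple

# Same pattern as in show_ocr_result_preview; it depends on the exact log wording.
_OCR_SUMMARY = re.compile(r'所有图片处理完成, 总计: (\d+), 成功: (\d+)')


def parse_ocr_summary(output: str) -> Optional[Tuple[int, int]]:
    """Return (total, success) from the OCR log, or None if no summary line exists."""
    m = _OCR_SUMMARY.search(output)
    if not m:
        return None
    return int(m.group(1)), int(m.group(2))


def grade_ocr_result(total: int, success: int) -> str:
    """Mirror the three-way grading above: all, most (strictly more than 80%), or many failed."""
    if success == total:
        return "all"
    if success > total * 0.8:
        return "most"
    return "many_failed"


if __name__ == "__main__":
    log = "...\n所有图片处理完成, 总计: 10, 成功: 9\n"
    parsed = parse_ocr_summary(log)
    if parsed is not None:
        print(parsed, grade_ocr_result(*parsed))
```

Note that the comparison is strict, so exactly 80% success (for example 4 of 5) falls into the "many failed" bucket.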
 def show_excel_result_preview(output):
    """显示Excel处理结果预览"""
    extract_match = re.search(r'提取到 (\d+) 个商品信息', output)
    file_match = re.search(r'采购单已保存到: (.+?)(?:\n|$)', output)
    if extract_match and file_match:
        products_count = int(extract_match.group(1))
        output_file = file_match.group(1)
        preview = tk.Toplevel()
        preview.title("Excel处理结果")
        preview.geometry("450x320")
        preview.resizable(False, False)
        center_window(preview)
        tk.Label(preview, text="Excel处理完成", font=("Arial", 16, "bold")).pack(pady=10)
        result_frame = tk.Frame(preview)
        result_frame.pack(pady=10, fill=tk.BOTH, expand=True)
        tk.Label(result_frame, text=f"提取商品数量: {products_count} 个", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"输出文件: {os.path.basename(output_file)}", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text="采购单已成功生成！", font=("Arial", 12, "bold"), fg="#28a745").pack(pady=10)
        file_frame = tk.Frame(result_frame, relief=tk.GROOVE, borderwidth=1)
        file_frame.pack(fill=tk.X, padx=15, pady=5)
        tk.Label(file_frame, text="文件信息", font=("Arial", 10, "bold")).pack(anchor=tk.W, padx=10, pady=5)
        try:
            file_size = os.path.getsize(output_file)
            file_time = datetime.datetime.fromtimestamp(os.path.getmtime(output_file))
            size_text = format_file_size(file_size)
            tk.Label(file_frame, text=f"文件大小: {size_text}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
            tk.Label(file_frame, text=f"创建时间: {file_time.strftime('%Y-%m-%d %H:%M:%S')}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
        except Exception:
            tk.Label(file_frame, text="无法获取文件信息", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
        button_frame = tk.Frame(preview)
        button_frame.pack(pady=10)
        tk.Button(button_frame, text="打开文件", command=lambda: os.startfile(output_file)).pack(side=tk.LEFT, padx=5)
        tk.Button(button_frame, text="打开所在文件夹", command=lambda: os.startfile(os.path.dirname(output_file))).pack(side=tk.LEFT, padx=5)
        tk.Button(button_frame, text="关闭", command=preview.destroy).pack(side=tk.LEFT, padx=5)
    else:
        messagebox.showinfo("Excel处理完成", f"Excel处理已完成，请在{_get_output_dir()}目录查看结果。")
 def show_merge_result_preview(output):
    """显示合并结果预览"""
    merged_match = re.search(r'合并了 (\d+) 个采购单', output)
    product_match = re.search(r'共处理 (\d+) 个商品', output)
    output_match = re.search(r'已保存到: (.+?)(?:\n|$)', output)
    if merged_match and output_match:
        merged_count = int(merged_match.group(1))
        product_count = int(product_match.group(1)) if product_match else 0
        output_file = output_match.group(1)
        preview = tk.Toplevel()
        preview.title("采购单合并结果")
        preview.geometry("450x300")
        preview.resizable(False, False)
        apply_theme(preview)
        tk.Label(preview, text="采购单合并完成", font=("Arial", 16, "bold")).pack(pady=10)
        result_frame = tk.Frame(preview)
        result_frame.pack(pady=10, fill=tk.BOTH, expand=True)
        tk.Label(result_frame, text=f"合并采购单数量: {merged_count} 个", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"处理商品数量: {product_count} 个", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"输出文件: {os.path.basename(output_file)}", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        theme = THEMES[get_theme_mode()]
        tk.Label(result_frame, text="采购单已成功合并！", font=("Arial", 12, "bold"), fg=theme["success"]).pack(pady=10)
        button_frame = tk.Frame(preview)
        button_frame.pack(pady=10)
        tk.Button(button_frame, text="打开文件", command=lambda: os.startfile(output_file)).pack(side=tk.LEFT, padx=10)
        tk.Button(button_frame, text="打开所在文件夹", command=lambda: os.startfile(os.path.dirname(output_file))).pack(side=tk.LEFT, padx=10)
        tk.Button(button_frame, text="关闭", command=preview.destroy).pack(side=tk.LEFT, padx=10)
    else:
        messagebox.showinfo("采购单合并完成", f"采购单合并已完成，请在{_get_output_dir()}目录查看结果。")
 def show_pipeline_result_preview(output):
    """显示完整流程结果预览"""
    ocr_match = re.search(r'所有图片处理完成, 总计: (\d+), 成功: (\d+)', output)
    excel_match = re.search(r'提取到 (\d+) 个商品信息', output)
    output_file_match = re.search(r'采购单已保存到: (.+?)(?:\n|$)', output)
    preview = tk.Toplevel()
    preview.title("完整流程处理结果")
    preview.geometry("500x400")
    preview.resizable(False, False)
    center_window(preview)
    tk.Label(preview, text="完整处理流程已完成", font=("Arial", 16, "bold")).pack(pady=10)
    no_files_match = re.search(r'未找到可合并的文件', output)
    if no_files_match:
        tk.Label(preview, text="未找到可合并的文件，但其他步骤已成功执行", font=("Arial", 12)).pack(pady=0)
    result_frame = tk.Frame(preview)
    result_frame.pack(pady=10, fill=tk.BOTH, expand=True)
    result_text = scrolledtext.ScrolledText(result_frame, wrap=tk.WORD, height=15, width=60)
    result_text.pack(fill=tk.BOTH, expand=True, padx=15, pady=5)
    result_text.configure(state=tk.NORMAL)
    result_text.insert(tk.END, "===== 流程执行结果 =====\n\n", "title")
    result_text.insert(tk.END, "步骤1: OCR识别\n", "step")
    if ocr_match:
        total = int(ocr_match.group(1))
        success = int(ocr_match.group(2))
        result_text.insert(tk.END, f"  处理图片: {total} 个\n", "info")
        result_text.insert(tk.END, f"  成功识别: {success} 个\n", "info")
        if success == total:
            result_text.insert(tk.END, "  结果: 全部识别成功\n", "success")
        else:
            result_text.insert(tk.END, f"  结果: 部分识别成功 ({success}/{total})\n", "warning")
    else:
        result_text.insert(tk.END, "  结果: 无OCR处理或处理信息不完整\n", "warning")
    result_text.insert(tk.END, "\n步骤2: Excel处理\n", "step")
    if excel_match:
        products = int(excel_match.group(1))
        result_text.insert(tk.END, f"  提取商品: {products} 个\n", "info")
        result_text.insert(tk.END, "  结果: 成功生成采购单\n", "success")
        if output_file_match:
            output_file = output_file_match.group(1)
            result_text.insert(tk.END, f"  输出文件: {os.path.basename(output_file)}\n", "info")
    else:
        result_text.insert(tk.END, "  结果: 无Excel处理或处理信息不完整\n", "warning")
    result_text.insert(tk.END, "\n===== 整体评估 =====\n", "title")
    has_errors = "错误" in output or "失败" in output
    no_files_match2 = re.search(r'未找到采购单文件', output)
    single_file_match = re.search(r'只有1个采购单文件', output)
    if no_files_match2:
        result_text.insert(tk.END, "没有找到可合并的文件，但处理流程已成功完成。\n", "warning")
        result_text.insert(tk.END, "可以选择打开Excel文件或查看输出文件夹。\n", "info")
    elif single_file_match:
        result_text.insert(tk.END, "只有一个采购单文件，无需合并，处理流程已成功完成。\n", "warning")
        result_text.insert(tk.END, "可以选择打开生成的Excel文件。\n", "info")
    elif ocr_match and excel_match and not has_errors:
        result_text.insert(tk.END, "流程完整执行成功！\n", "success")
    elif ocr_match or excel_match:
        result_text.insert(tk.END, "流程部分执行成功，请检查日志获取详情。\n", "warning")
    else:
        result_text.insert(tk.END, "流程执行可能存在问题，请查看详细日志。\n", "error")
    result_text.tag_configure("title", font=("Arial", 12, "bold"))
    result_text.tag_configure("step", font=("Arial", 11, "bold"))
    result_text.tag_configure("info", font=("Arial", 10))
    result_text.tag_configure("success", font=("Arial", 10, "bold"), foreground="#28a745")
    result_text.tag_configure("warning", font=("Arial", 10, "bold"), foreground="#ffc107")
    result_text.tag_configure("error", font=("Arial", 10, "bold"), foreground="#dc3545")
    result_text.configure(state=tk.DISABLED)
    button_frame = tk.Frame(preview)
    button_frame.pack(pady=10)
    if output_file_match:
        output_file = output_file_match.group(1)
        tk.Button(button_frame, text="打开Excel文件", command=lambda: os.startfile(output_file)).pack(side=tk.LEFT, padx=10)
    else:
        if excel_match or no_files_match or single_file_match:
            output_dir = _get_output_dir()
            excel_files = [f for f in os.listdir(output_dir) if f.startswith('采购单_') and (f.endswith('.xls') or f.endswith('.xlsx'))]
            if excel_files:
                excel_files.sort(key=lambda x: os.path.getmtime(os.path.join(output_dir, x)), reverse=True)
                latest_file = os.path.join(output_dir, excel_files[0])
                tk.Button(button_frame, text="打开最新Excel文件",
                         command=lambda: os.startfile(latest_file)).pack(side=tk.LEFT, padx=10)
    tk.Button(button_frame, text="查看输出文件夹", command=lambda: os.startfile(_get_output_dir())).pack(side=tk.LEFT, padx=10)
    tk.Button(button_frame, text="关闭", command=preview.destroy).pack(side=tk.LEFT, padx=10)
 def show_tobacco_result_preview(returncode, output):
    """显示烟草订单处理结果预览"""
    global TOBACCO_PREVIEW_WINDOW
    if returncode != 0:
        return
    try:
        try:
            if TOBACCO_PREVIEW_WINDOW and TOBACCO_PREVIEW_WINDOW.winfo_exists():
                TOBACCO_PREVIEW_WINDOW.lift()
                return
        except Exception:
            TOBACCO_PREVIEW_WINDOW = None
        result_file = None
        order_time = "（未知）"
        total_amount = "（未知）"
        items_count = 0
        abs_path_match = re.search(r'烟草订单处理完成，绝对路径: (.+)(?:\n|$)', output)
        if abs_path_match:
            result_file = abs_path_match.group(1).strip()
        for line in output.split('\n'):
            if "烟草公司订单处理成功" in line and "订单时间" in line:
                time_match = re.search(r'订单时间: ([^,]+)', line)
                amount_match = re.search(r'总金额: ([^,]+)', line)
                items_match = re.search(r'处理条目: (\d+)', line)
                if time_match:
                    order_time = time_match.group(1).strip()
                if amount_match:
                    total_amount = amount_match.group(1).strip()
                if items_match:
                    items_count = int(items_match.group(1).strip())
        if not result_file or not os.path.exists(result_file):
            default_path = os.path.join(_get_output_dir(), "银豹采购单_烟草公司.xls")
            if os.path.exists(default_path):
                result_file = default_path
        preview = tk.Toplevel()
        preview.title("烟草订单处理结果")
        preview.geometry("450x320")
        preview.resizable(False, False)
        TOBACCO_PREVIEW_WINDOW = preview
        def _close_preview():
            global TOBACCO_PREVIEW_WINDOW
            TOBACCO_PREVIEW_WINDOW = None
            try:
                preview.destroy()
            except Exception:
                pass
        preview.protocol("WM_DELETE_WINDOW", _close_preview)
        center_window(preview)
        tk.Label(preview, text="烟草订单处理完成", font=("Arial", 16, "bold")).pack(pady=10)
        result_frame = tk.Frame(preview)
        result_frame.pack(pady=10, fill=tk.BOTH, expand=True)
        tk.Label(result_frame, text=f"订单时间: {order_time}", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"订单总金额: {total_amount}", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"处理商品数量: {items_count} 个", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        if result_file and os.path.exists(result_file):
            tk.Label(result_frame, text=f"输出文件: {os.path.basename(result_file)}", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
            tk.Label(result_frame, text="银豹采购单已成功生成！", font=("Arial", 12, "bold"), fg="#28a745").pack(pady=10)
            file_frame = tk.Frame(result_frame, relief=tk.GROOVE, borderwidth=1)
            file_frame.pack(fill=tk.X, padx=15, pady=5)
            tk.Label(file_frame, text="文件信息", font=("Arial", 10, "bold")).pack(anchor=tk.W, padx=10, pady=5)
            try:
                file_size = os.path.getsize(result_file)
                file_time = datetime.datetime.fromtimestamp(os.path.getmtime(result_file))
                size_text = format_file_size(file_size)
                tk.Label(file_frame, text=f"文件大小: {size_text}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
                tk.Label(file_frame, text=f"创建时间: {file_time.strftime('%Y-%m-%d %H:%M:%S')}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
            except Exception:
                tk.Label(file_frame, text="无法获取文件信息", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
            button_frame = tk.Frame(preview)
            button_frame.pack(pady=10)
            tk.Button(button_frame, text="打开文件", command=lambda: os.startfile(result_file)).pack(side=tk.LEFT, padx=5)
            tk.Button(button_frame, text="打开所在文件夹", command=lambda: os.startfile(os.path.dirname(result_file))).pack(side=tk.LEFT, padx=5)
            tk.Button(button_frame, text="关闭", command=_close_preview).pack(side=tk.LEFT, padx=5)
        else:
            tk.Label(result_frame, text="未找到输出文件", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
            tk.Label(result_frame, text=f"请检查{_get_output_dir()}目录", font=("Arial", 12, "bold"), fg="#dc3545").pack(pady=10)
            button_frame = tk.Frame(preview)
            button_frame.pack(pady=10)
            tk.Button(button_frame, text="打开输出目录", command=lambda: os.startfile(_get_output_dir())).pack(side=tk.LEFT, padx=5)
            tk.Button(button_frame, text="关闭", command=_close_preview).pack(side=tk.LEFT, padx=5)
        preview.lift()
        preview.attributes('-topmost', True)
        preview.after_idle(lambda: preview.attributes('-topmost', False))
    except Exception as e:
        messagebox.showerror(
            "处理异常",
            f"显示预览时发生错误: {e}\n请检查日志了解详细信息。"
        )
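Every "open file" and "open folder" button in this module calls `os.startfile`, which exists only on Windows. If the tool ever needs to run elsewhere, one option is a small wrapper like the sketch below. `build_open_command` and `open_path` are hypothetical helpers, and the use of `open` (macOS) and `xdg-open` (most Linux desktops) is an assumption about the target environments.

```python
import os
import subprocess
import sys
from typing import List, Optional


def build_open_command(path: str, platform: str = sys.platform) -> Optional[List[str]]:
    """Return the external command that opens `path`, or None on Windows,
    where os.startfile should be used instead."""
    if platform.startswith("win"):
        return None
    opener = "open" if platform == "darwin" else "xdg-open"
    return [opener, path]


def open_path(path: str) -> None:
    """Open a file or folder with the system default application."""
    cmd = build_open_command(path)
    if cmd is None:
        os.startfile(path)  # Windows-only API
    else:
        # check=False: a missing opener should not crash the GUI callback.
        subprocess.run(cmd, check=False)
```

A button callback would then read `command=lambda: open_path(output_file)` instead of calling `os.startfile` directly.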
@@ -1,60 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """Keyboard shortcuts."""
 import tkinter as tk
 from tkinter import messagebox
 from .ui_widgets import center_window
 from .action_handlers import (
    process_single_image_with_status,
    process_excel_file_with_status,
    batch_ocr_with_status,
    run_pipeline_directly,
    merge_orders_with_status,
 )
 from .file_operations import clean_cache
 def bind_keyboard_shortcuts(root, log_widget, status_bar):
    """绑定键盘快捷键"""
    root.bind('<Control-o>', lambda e: process_single_image_with_status(log_widget, status_bar))
    root.bind('<Control-e>', lambda e: process_excel_file_with_status(log_widget, status_bar))
    root.bind('<Control-b>', lambda e: batch_ocr_with_status(log_widget, status_bar))
    root.bind('<Control-p>', lambda e: run_pipeline_directly(log_widget, status_bar))
    root.bind('<Control-m>', lambda e: merge_orders_with_status(log_widget, status_bar))
    root.bind('<F5>', lambda e: clean_cache(log_widget))
    root.bind('<Escape>', lambda e: root.quit() if messagebox.askyesno("确认退出", "确定要退出程序吗？") else None)
    root.bind('<F1>', lambda e: show_shortcuts_help())
 def show_shortcuts_help():
    """显示快捷键帮助对话框"""
    help_dialog = tk.Toplevel()
    help_dialog.title("快捷键帮助")
    help_dialog.geometry("400x450")
    center_window(help_dialog)
    tk.Label(help_dialog, text="键盘快捷键", font=("Arial", 16, "bold")).pack(pady=10)
    help_text = tk.Text(help_dialog, wrap=tk.WORD, width=50, height=20)
    help_text.pack(padx=20, pady=10, fill=tk.BOTH, expand=True)
    shortcuts = """
    Ctrl+O: 处理单个图片
    Ctrl+E: 处理Excel文件
    Ctrl+B: OCR批量识别
    Ctrl+P: 完整处理流程
    Ctrl+M: 合并采购单
    F5: 清除处理缓存
    Esc: 退出程序
    """
    help_text.insert(tk.END, shortcuts)
    help_text.configure(state=tk.DISABLED)
    tk.Button(help_dialog, text="确定", command=help_dialog.destroy).pack(pady=10)
    help_dialog.lift()
    help_dialog.attributes('-topmost', True)
    help_dialog.after_idle(lambda: help_dialog.attributes('-topmost', False))
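The bindings in `bind_keyboard_shortcuts` and the help text in `show_shortcuts_help` are maintained separately, so they can drift apart. One possible arrangement is a single table from which the help text is generated. The sketch below covers only the text generation; `SHORTCUTS` and `format_shortcuts_help` are hypothetical names, and the entries are copied from the help text above.

```python
from typing import List, Tuple

# (Tk event sequence, label shown to the user, description)
SHORTCUTS: List[Tuple[str, str, str]] = [
    ("<Control-o>", "Ctrl+O", "处理单个图片"),
    ("<Control-e>", "Ctrl+E", "处理Excel文件"),
    ("<Control-b>", "Ctrl+B", "OCR批量识别"),
    ("<Control-p>", "Ctrl+P", "完整处理流程"),
    ("<Control-m>", "Ctrl+M", "合并采购单"),
    ("<F5>", "F5", "清除处理缓存"),
    ("<Escape>", "Esc", "退出程序"),
]


def format_shortcuts_help(shortcuts: List[Tuple[str, str, str]]) -> str:
    """Render one 'label: description' line per shortcut."""
    return "\n".join(f"{label}: {desc}" for _seq, label, desc in shortcuts)


if __name__ == "__main__":
    print(format_shortcuts_help(SHORTCUTS))
```

The same table could drive the `root.bind` calls by pairing each event sequence with its handler, so adding a shortcut would update both places at once.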
@@ -1,193 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """Theme management."""
 import tkinter as tk
 from tkinter import scrolledtext, ttk
 # Module-private theme mode; must be a key of THEMES ("light" or "dark")
 _theme_mode = "light"
 # Colour palettes for the light and dark themes
 THEMES = {
    "light": {
        "bg": "#f8f9fa",
        "fg": "#212529",
        "button_bg": "#ffffff",
        "button_fg": "#495057",
        "button_hover": "#e9ecef",
        "primary_bg": "#007bff",
        "primary_fg": "#ffffff",
        "secondary_bg": "#6c757d",
        "secondary_fg": "#ffffff",
        "log_bg": "#ffffff",
        "log_fg": "#212529",
        "highlight_bg": "#007bff",
        "highlight_fg": "#ffffff",
        "border": "#dee2e6",
        "success": "#28a745",
        "error": "#dc3545",
        "warning": "#ffc107",
        "info": "#17a2b8",
        "card_bg": "#ffffff",
        "shadow": "#00000010"
    },
    "dark": {
        "bg": "#1a1a1a",
        "fg": "#e9ecef",
        "button_bg": "#343a40",
        "button_fg": "#e9ecef",
        "button_hover": "#495057",
        "primary_bg": "#0d6efd",
        "primary_fg": "#ffffff",
        "secondary_bg": "#6c757d",
        "secondary_fg": "#ffffff",
        "log_bg": "#212529",
        "log_fg": "#e9ecef",
        "highlight_bg": "#0d6efd",
        "highlight_fg": "#ffffff",
        "border": "#495057",
        "success": "#198754",
        "error": "#dc3545",
        "warning": "#ffc107",
        "info": "#0dcaf0",
        "card_bg": "#2d3748",
        "shadow": "#00000030"
    }
 }
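 def get_theme_mode() -> str:
    """Return the current theme mode ("light" or "dark")."""
    return _theme_mode
 def set_theme_mode(mode: str):
    """Set the current theme mode; `mode` should be a key of THEMES."""
    global _theme_mode
    _theme_mode = mode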
 def create_modern_button(parent, text, command, style="primary", width=None, height=None, px_width=None, px_height=None):
    """Create a modern flat-style button and return the frame that wraps it."""
    theme = THEMES[_theme_mode]
    if style == "secondary":
        bg_color = theme["secondary_bg"]
        fg_color = theme["secondary_fg"]
        hover_color = theme["button_hover"]
        border_color = theme["secondary_bg"]
    else:
        # "primary" and any unrecognized style share the same outlined look
        bg_color = "white"
        fg_color = theme["primary_bg"]
        hover_color = "#f0f8ff"
        border_color = theme["primary_bg"]
    button_frame = tk.Frame(parent, bg=border_color, highlightthickness=0)
    button_frame.configure(relief="flat", bd=0)
    if px_width or px_height:
        try:
            w = px_width if px_width else button_frame.winfo_reqwidth()
            h = px_height if px_height else 32
            button_frame.configure(width=w, height=h)
            button_frame.pack_propagate(False)
        except Exception:
            pass
    button = tk.Button(
        button_frame,
        text=text,
        command=command,
        bg=bg_color,
        fg=fg_color,
        font=("Microsoft YaHei UI", 8),
        relief="flat",
        bd=0,
        padx=14,
        pady=4,
        anchor="center",
        cursor="hand2",
        activebackground=hover_color,
        activeforeground=fg_color
    )
    # Fall back to a width of 12 characters and a height of 1 text line
    button.configure(width=width if width else 12)
    button.configure(height=height if height is not None else 1)
    # Hover effect
    def on_enter(e):
        button.configure(bg=hover_color)
    def on_leave(e):
        button.configure(bg=bg_color)
    button.bind("<Enter>", on_enter)
    button.bind("<Leave>", on_leave)
    button_frame.bind("<Enter>", on_enter)
    button_frame.bind("<Leave>", on_leave)
    button.pack(fill=tk.BOTH, expand=True, padx=1, pady=1)
    return button_frame
 def create_card_frame(parent, title=None):
    """Create a card-style frame with an optional title label."""
    theme = THEMES[_theme_mode]
    card = tk.Frame(
        parent,
        bg=theme["card_bg"],
        relief="flat",
        borderwidth=1,
        highlightbackground=theme["border"],
        highlightthickness=1
    )
    if title:
        title_label = tk.Label(
            card,
            text=title,
            bg=theme["card_bg"],
            fg=theme["fg"],
            font=("Microsoft YaHei UI", 10, "bold")
        )
        title_label.pack(pady=(6, 3))
    return card
 def apply_theme(widget, theme_mode=None):
    """Recursively apply the theme colors to a widget and its descendants."""
    if theme_mode is None:
        theme_mode = _theme_mode
    theme = THEMES[theme_mode]
    # Choose colors by the widget's own type, so the recursive call below
    # cannot overwrite button or log colors with the generic bg/fg pair.
    if isinstance(widget, tk.Button):
        bg, fg = theme["button_bg"], theme["button_fg"]
    elif isinstance(widget, scrolledtext.ScrolledText):
        bg, fg = theme["log_bg"], theme["log_fg"]
    else:
        bg, fg = theme["bg"], theme["fg"]
    try:
        widget.configure(bg=bg, fg=fg)
    except Exception:
        # Widgets without bg/fg options (e.g. ttk widgets) are left unchanged
        pass
    for child in widget.winfo_children():
        apply_theme(child, theme_mode)
@@ -1,121 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """UI widgets module: StatusBar, ProgressReporter, collapsible frames, etc."""
 import tkinter as tk
 from tkinter import ttk
 from .theme import THEMES, get_theme_mode
 class StatusBar(tk.Frame):
    """Status bar that shows the current system status and progress."""
    def __init__(self, master, **kwargs):
        super().__init__(master, **kwargs)
        self.configure(height=25, relief=tk.SUNKEN, borderwidth=1)
        self.status_label = tk.Label(self, text="Ready", anchor=tk.W, padx=5)
        self.status_label.pack(side=tk.LEFT, fill=tk.X, expand=True)
        self.progress = ttk.Progressbar(self, orient=tk.HORIZONTAL, length=200, mode='determinate')
        self.progress.pack(side=tk.RIGHT, padx=5, pady=2)
        self.progress.pack_forget()
    def set_status(self, text, progress=None):
        """Set the status text and, optionally, a progress value from 0 to 100."""
        self.status_label.config(text=text)
        if progress is not None and 0 <= progress <= 100:
            self.progress.pack(side=tk.RIGHT, padx=5, pady=2)
            self.progress.config(value=progress)
        else:
            self.progress.pack_forget()
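    def set_running(self, is_running=True):
        """Toggle the running state (indeterminate progress animation)."""
        theme = THEMES[get_theme_mode()]
        if is_running:
            self.status_label.config(text="Processing...", foreground=theme["info"])
            self.progress.pack(side=tk.RIGHT, padx=5, pady=2)
            self.progress.config(mode='indeterminate')
            self.progress.start()
        else:
            self.status_label.config(text="Ready", foreground=theme["fg"])
            self.progress.stop()
            # Restore determinate mode so later set_status(text, progress) calls show a percentage
            self.progress.config(mode='determinate', value=0)
            self.progress.pack_forget()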
 class ProgressReporter:
    """Thin wrapper around StatusBar that silently ignores UI errors."""
    def __init__(self, status_bar: StatusBar):
        self.status_bar = status_bar
    def set(self, text: str, percent: int = None):
        try:
            if percent is not None:
                self.status_bar.set_status(text, percent)
            else:
                self.status_bar.set_status(text)
        except Exception:
            pass
    def running(self):
        try:
            self.status_bar.set_running(True)
        except Exception:
            pass
    def done(self):
        try:
            self.status_bar.set_running(False)
            self.status_bar.set_status("Ready")
        except Exception:
            pass
 def create_collapsible_frame(parent, title, initial_state=True):
    """Create a collapsible panel; returns (content_frame, state_var)."""
    frame = tk.Frame(parent)
    frame.pack(fill=tk.X, pady=5)
    title_frame = tk.Frame(frame)
    title_frame.pack(fill=tk.X)
    state_var = tk.BooleanVar(value=initial_state)
    indicator = "▼" if initial_state else "►"
    state_label = tk.Label(title_frame, text=indicator, font=("Arial", 10, "bold"))
    state_label.pack(side=tk.LEFT, padx=5)
    title_label = tk.Label(title_frame, text=title, font=("Arial", 11, "bold"))
    title_label.pack(side=tk.LEFT, padx=5)
    content_frame = tk.Frame(frame)
    if initial_state:
        content_frame.pack(fill=tk.X, padx=20, pady=5)
    def toggle_collapse(event=None):
        current_state = state_var.get()
        new_state = not current_state
        state_var.set(new_state)
        state_label.config(text="▼" if new_state else "►")
        if new_state:
            content_frame.pack(fill=tk.X, padx=20, pady=5)
        else:
            content_frame.pack_forget()
    title_frame.bind("<Button-1>", toggle_collapse)
    state_label.bind("<Button-1>", toggle_collapse)
    title_label.bind("<Button-1>", toggle_collapse)
    return content_frame, state_var
 def center_window(window):
    """Center the window on the screen."""
    window.update_idletasks()
    width = window.winfo_width()
    height = window.winfo_height()
    x = (window.winfo_screenwidth() // 2) - (width // 2)
    y = (window.winfo_screenheight() // 2) - (height // 2)
    window.geometry('{}x{}+{}+{}'.format(width, height, x, y))
@@ -1,130 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """User settings and recent-files management module."""
 import os
 import json
 import re
 import tkinter as tk
 from typing import Dict, List, Any
 from app.core.utils.log_utils import get_logger
 from app.config.settings import ConfigManager
 logger = get_logger(__name__)
 RECENT_LIST_WIDGET = None
 def load_user_settings():
    try:
        path = os.path.abspath(os.path.join('data', 'user_settings.json'))
        if os.path.exists(path):
            with open(path, 'r', encoding='utf-8') as f:
                return json.load(f)
    except Exception as e:
        logger.debug(f"Failed to load user settings: {e}")
    return {}
 def save_user_settings(settings: Dict[str, Any]):
    try:
        os.makedirs('data', exist_ok=True)
        path = os.path.abspath(os.path.join('data', 'user_settings.json'))
        with open(path, 'w', encoding='utf-8') as f:
            json.dump(settings, f, ensure_ascii=False, indent=2)
    except Exception as e:
        logger.debug(f"Failed to save user settings: {e}")
 def get_recent_files() -> List[str]:
    s = load_user_settings()
    items = s.get('recent_files', [])
    if not isinstance(items, list):
        return []
    def _allowed(p: str) -> bool:
        try:
            if not isinstance(p, str) or not os.path.isfile(p):
                return False
            ext = os.path.splitext(p)[1].lower()
            return ext in {'.xlsx', '.xls', '.jpg', '.jpeg', '.png', '.bmp'}
        except Exception:
            return False
    kept = [p for p in items if _allowed(p)]
    if not kept:
        candidates = []
        cfg = ConfigManager()
        for d in [cfg.get_path('Paths', 'output_folder', fallback='data/output'), cfg.get_path('Paths', 'result_folder', fallback='data/result')]:
            try:
                if os.path.exists(d):
                    for name in os.listdir(d):
                        p = os.path.join(d, name)
                        if _allowed(p):
                            candidates.append(p)
            except Exception:
                pass
        if candidates:
            kept = candidates
    try:
        kept_sorted = sorted(kept, key=lambda p: os.path.getmtime(p), reverse=True)
    except Exception:
        kept_sorted = kept
    if kept_sorted != items:
        s['recent_files'] = kept_sorted[:20]
        save_user_settings(s)
    return kept_sorted[:10]
 def refresh_recent_list_widget():
    try:
        global RECENT_LIST_WIDGET
        if RECENT_LIST_WIDGET is None:
            return
        RECENT_LIST_WIDGET.delete(0, tk.END)
        for i, p in enumerate(get_recent_files(), start=1):
            RECENT_LIST_WIDGET.insert(tk.END, f"{i}. {p}")
    except Exception as e:
        logger.debug(f"Failed to refresh the recent files list: {e}")
 def _extract_path_from_recent_item(s: str) -> str:
    try:
        m = re.match(r'^(\d+)\.\s+(.*)$', s)
        p = m.group(2) if m else s
        return p.strip().strip('"')
    except Exception:
        return s.strip().strip('"')
 def add_recent_file(path: str) -> None:
    try:
        if not path:
            return
        try:
            if not os.path.isfile(path):
                return
            ext = os.path.splitext(path)[1].lower()
            if ext not in {'.xlsx', '.xls', '.jpg', '.jpeg', '.png', '.bmp'}:
                return
        except Exception:
            return
        s = load_user_settings()
        items = s.get('recent_files', [])
        items = [p for p in items if p != path]
        items.insert(0, path)
        s['recent_files'] = items[:20]
        save_user_settings(s)
        refresh_recent_list_widget()
    except Exception as e:
        logger.debug(f"Failed to add recent file: {e}")
 def clear_recent_files():
    try:
        s = load_user_settings()
        s['recent_files'] = []
        save_user_settings(s)
    except Exception as e:
        logger.debug(f"Failed to clear recent files: {e}")
@@ -0,0 +1,316 @@
 This file lists modules PyInstaller was not able to find. This does not
 necessarily mean this module is required for running your program. Python and
 Python 3rd-party packages include a lot of conditional or optional modules. For
 example the module 'ntpath' only exists on Windows, whereas the module
 'posixpath' only exists on Posix systems.
 Types if import:
 * top-level: imported at the top-level - look at these first
 * conditional: imported within an if-statement
 * delayed: imported within a function
 * optional: imported within a try-except-statement
 IMPORTANT: Do NOT post this list to the issue-tracker. Use it as a basis for
            tracking down the missing module yourself. Thanks!
 missing module named _posixshmem - imported by multiprocessing.resource_tracker (conditional), multiprocessing.shared_memory (conditional)
 missing module named 'org.python' - imported by copy (optional), xml.sax (delayed, conditional)
 missing module named _scproxy - imported by urllib.request (conditional)
 missing module named termios - imported by getpass (optional), tty (top-level)
 missing module named pwd - imported by posixpath (delayed, conditional), shutil (optional), tarfile (optional), pathlib (delayed, conditional, optional), subprocess (optional), netrc (delayed, conditional), getpass (delayed), http.server (delayed, optional), webbrowser (delayed)
 missing module named 'java.lang' - imported by platform (delayed, optional), xml.sax._exceptions (conditional)
 missing module named multiprocessing.BufferTooShort - imported by multiprocessing (top-level), multiprocessing.connection (top-level)
 missing module named multiprocessing.AuthenticationError - imported by multiprocessing (top-level), multiprocessing.connection (top-level)
 missing module named _posixsubprocess - imported by subprocess (optional), multiprocessing.util (delayed)
 missing module named multiprocessing.get_context - imported by multiprocessing (top-level), multiprocessing.pool (top-level), multiprocessing.managers (top-level), multiprocessing.sharedctypes (top-level)
 missing module named multiprocessing.TimeoutError - imported by multiprocessing (top-level), multiprocessing.pool (top-level)
 missing module named org - imported by pickle (optional)
 missing module named multiprocessing.set_start_method - imported by multiprocessing (top-level), multiprocessing.spawn (top-level)
 missing module named multiprocessing.get_start_method - imported by multiprocessing (top-level), multiprocessing.spawn (top-level)
 missing module named grp - imported by shutil (optional), tarfile (optional), pathlib (delayed, optional), subprocess (optional)
 missing module named pep517 - imported by importlib.metadata (delayed)
 missing module named posix - imported by os (conditional, optional), shutil (conditional), importlib._bootstrap_external (conditional)
 missing module named resource - imported by posix (top-level)
 excluded module named _frozen_importlib - imported by importlib (optional), importlib.abc (optional), zipimport (top-level)
 missing module named _frozen_importlib_external - imported by importlib._bootstrap (delayed), importlib (optional), importlib.abc (optional), zipimport (top-level)
 missing module named pyimod02_importers - imported by C:\Program Files\Python39\Lib\site-packages\PyInstaller\hooks\rthooks\pyi_rth_pkgutil.py (delayed)
 missing module named simplejson - imported by requests.compat (conditional, optional)
 missing module named dummy_threading - imported by requests.cookies (optional)
 missing module named typing_extensions - imported by urllib3.util.retry (conditional), urllib3._collections (conditional), urllib3.util.ssltransport (conditional), urllib3.connectionpool (conditional), urllib3.poolmanager (conditional), urllib3.contrib.emscripten.fetch (conditional), charset_normalizer.legacy (conditional), pandas._typing (conditional)
 missing module named zstandard - imported by urllib3.util.request (optional), urllib3.response (optional)
 missing module named compression - imported by urllib3.util.request (optional), urllib3.response (optional)
 missing module named 'h2.events' - imported by urllib3.http2.connection (top-level)
 missing module named 'h2.connection' - imported by urllib3.http2.connection (top-level)
 missing module named h2 - imported by urllib3.http2.connection (top-level)
 missing module named brotli - imported by urllib3.util.request (optional), urllib3.response (optional)
 missing module named brotlicffi - imported by urllib3.util.request (optional), urllib3.response (optional)
 missing module named socks - imported by urllib3.contrib.socks (optional)
 missing module named 'typing.io' - imported by importlib.resources (top-level)
 missing module named cryptography - imported by urllib3.contrib.pyopenssl (top-level), requests (conditional, optional)
 missing module named 'OpenSSL.crypto' - imported by urllib3.contrib.pyopenssl (delayed, conditional)
 missing module named 'cryptography.x509' - imported by urllib3.contrib.pyopenssl (delayed, optional)
 missing module named OpenSSL - imported by urllib3.contrib.pyopenssl (top-level)
 missing module named chardet - imported by requests (optional)
 missing module named 'pyodide.ffi' - imported by urllib3.contrib.emscripten.fetch (delayed, optional)
 missing module named pyodide - imported by urllib3.contrib.emscripten.fetch (top-level)
 missing module named js - imported by urllib3.contrib.emscripten.fetch (top-level)
 missing module named cStringIO - imported by xlrd.timemachine (conditional)
 missing module named PIL - imported by openpyxl.drawing.image (optional)
 missing module named 'defusedxml.ElementTree' - imported by openpyxl.xml.functions (conditional)
 missing module named 'lxml.etree' - imported by openpyxl.xml.functions (conditional), pandas.io.xml (delayed), pandas.io.formats.xml (delayed), pandas.io.html (delayed)
 missing module named openpyxl.tests - imported by openpyxl.reader.excel (optional)
 missing module named defusedxml - imported by openpyxl.xml (delayed, optional)
 missing module named lxml - imported by openpyxl.xml (delayed, optional), pandas.io.xml (conditional)
 missing module named _dummy_thread - imported by numpy._core.arrayprint (optional)
 missing module named numpy._typing._ufunc - imported by numpy._typing (conditional)
 missing module named 'numpy_distutils.cpuinfo' - imported by numpy.f2py.diagnose (delayed, conditional, optional)
 missing module named 'numpy_distutils.fcompiler' - imported by numpy.f2py.diagnose (delayed, conditional, optional)
 missing module named 'numpy_distutils.command' - imported by numpy.f2py.diagnose (delayed, conditional, optional)
 missing module named numpy_distutils - imported by numpy.f2py.diagnose (delayed, optional)
 missing module named psutil - imported by numpy.testing._private.utils (delayed, optional)
 missing module named readline - imported by cmd (delayed, conditional, optional), code (delayed, conditional, optional), pdb (delayed, optional)
 missing module named win32pdh - imported by numpy.testing._private.utils (delayed, conditional)
 missing module named asyncio.DefaultEventLoopPolicy - imported by asyncio (delayed, conditional), asyncio.events (delayed, conditional)
 missing module named threadpoolctl - imported by numpy.lib._utils_impl (delayed, optional)
 missing module named numpy._core.zeros - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.vstack - imported by numpy._core (top-level), numpy.lib._shape_base_impl (top-level), numpy (conditional)
 missing module named numpy._core.void - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.vecdot - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.ushort - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.unsignedinteger - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.ulonglong - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.ulong - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.uintp - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.uintc - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.uint64 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.uint32 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.uint16 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.uint - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.ubyte - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.trunc - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.true_divide - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.transpose - imported by numpy._core (top-level), numpy.lib._function_base_impl (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.trace - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.timedelta64 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.tensordot - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.tanh - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.tan - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.swapaxes - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.sum - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.subtract - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.str_ - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.square - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.sqrt - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional), numpy.fft._pocketfft (top-level)
 missing module named numpy._core.spacing - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.sort - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.sinh - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.single - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.signedinteger - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.signbit - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy (conditional)
 missing module named numpy._core.sign - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.short - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.rint - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.right_shift - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.result_type - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy (conditional), numpy.fft._pocketfft (top-level)
 missing module named numpy._core.remainder - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.reciprocal - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional), numpy.fft._pocketfft (top-level)
 missing module named numpy._core.radians - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.rad2deg - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.prod - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.power - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.positive - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.pi - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.outer - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.ones - imported by numpy._core (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
 missing module named numpy._core.object_ - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (delayed), numpy (conditional)
 missing module named numpy._core.number - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy (conditional)
 missing module named numpy._core.not_equal - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.newaxis - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.negative - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.ndarray - imported by numpy._core (top-level), numpy.lib._utils_impl (top-level), numpy.testing._private.utils (top-level), numpy (conditional)
 missing module named numpy._core.multiply - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.moveaxis - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.modf - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.mod - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.minimum - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.maximum - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.max - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy (conditional)
 missing module named numpy._core.matrix_transpose - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.matmul - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.longdouble - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.long - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.logical_xor - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.logical_or - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.logical_not - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.logical_and - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.logaddexp2 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.logaddexp - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.log2 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.log1p - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.log - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.linspace - imported by numpy._core (top-level), numpy.lib._index_tricks_impl (top-level), numpy (conditional)
 missing module named numpy._core.less_equal - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.less - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.left_shift - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.ldexp - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.lcm - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.isscalar - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy.lib._polynomial_impl (top-level), numpy (conditional)
 missing module named numpy._core.isnat - imported by numpy._core (top-level), numpy.testing._private.utils (top-level), numpy (conditional)
 missing module named numpy._core.isnan - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (delayed), numpy (conditional)
 missing module named numpy._core.isfinite - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.intp - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (top-level), numpy (conditional)
 missing module named numpy._core.integer - imported by numpy._core (conditional), numpy (conditional), numpy.fft._helper (top-level)
 missing module named numpy._core.intc - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.int8 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.int64 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.int32 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.int16 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.inf - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (delayed), numpy (conditional)
 missing module named numpy._core.inexact - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.iinfo - imported by numpy._core (top-level), numpy.lib._twodim_base_impl (top-level), numpy (conditional)
 missing module named numpy._core.hypot - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.hstack - imported by numpy._core (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
 missing module named numpy._core.heaviside - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.half - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.greater_equal - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.greater - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.gcd - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.frompyfunc - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.frexp - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.fmod - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.fmin - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.fmax - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.floor_divide - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.floor - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.floating - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.float_power - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.float32 - imported by numpy._core (top-level), numpy.testing._private.utils (top-level), numpy (conditional)
 missing module named numpy._core.float16 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.finfo - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
 missing module named numpy._core.fabs - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.expm1 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.exp - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.euler_gamma - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.errstate - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (delayed), numpy (conditional)
 missing module named numpy._core.equal - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.empty_like - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional), numpy.fft._pocketfft (top-level)
 missing module named numpy._core.empty - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (top-level), numpy (conditional), numpy.fft._helper (top-level)
 missing module named numpy._core.e - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.double - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.dot - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
 missing module named numpy._core.divmod - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.divide - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.diagonal - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.degrees - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.deg2rad - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.datetime64 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.csingle - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.cross - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.count_nonzero - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.cosh - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.cos - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.copysign - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.conjugate - imported by numpy._core (conditional), numpy (conditional), numpy.fft._pocketfft (top-level)
 missing module named numpy._core.conj - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.complexfloating - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.complex64 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.clongdouble - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.character - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.ceil - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.cdouble - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.cbrt - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.bytes_ - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.byte - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.bool_ - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.bitwise_xor - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.bitwise_or - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.bitwise_count - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.bitwise_and - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.atleast_3d - imported by numpy._core (top-level), numpy.lib._shape_base_impl (top-level), numpy (conditional)
 missing module named numpy._core.atleast_2d - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.atleast_1d - imported by numpy._core (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
 missing module named numpy._core.asarray - imported by numpy._core (top-level), numpy.lib._array_utils_impl (top-level), numpy.linalg._linalg (top-level), numpy (conditional), numpy.fft._pocketfft (top-level), numpy.fft._helper (top-level)
 missing module named numpy._core.asanyarray - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.array_repr - imported by numpy._core (top-level), numpy.testing._private.utils (top-level), numpy (conditional)
 missing module named numpy._core.array2string - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy (conditional)
 missing module named numpy._core.array - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
 missing module named numpy._core.argsort - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.arctanh - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.arctan2 - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.arctan - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.arcsinh - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.arcsin - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.arccosh - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.arccos - imported by numpy._core (conditional), numpy (conditional)
 missing module named numpy._core.arange - imported by numpy._core (top-level), numpy.testing._private.utils (top-level), numpy (conditional), numpy.fft._helper (top-level)
 missing module named numpy._core.amin - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.amax - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named numpy._core.all - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (delayed), numpy (conditional)
 missing module named numpy._core.add - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
 missing module named yaml - imported by numpy.__config__ (delayed)
 missing module named numpy._distributor_init_local - imported by numpy (optional), numpy._distributor_init (optional)
 missing module named vms_lib - imported by platform (delayed, optional)
 missing module named java - imported by platform (delayed)
 missing module named _winreg - imported by platform (delayed, optional)
 missing module named six.moves.range - imported by six.moves (top-level), dateutil.rrule (top-level)
 runtime module named six.moves - imported by dateutil.tz.tz (top-level), dateutil.tz._factories (top-level), dateutil.tz.win (top-level), dateutil.rrule (top-level)
 missing module named dateutil.tz.tzfile - imported by dateutil.tz (top-level), dateutil.zoneinfo (top-level)
 missing module named StringIO - imported by six (conditional), xlutils.compat (conditional)
 missing module named numexpr - imported by pandas.core.computation.expressions (conditional), pandas.core.computation.engines (delayed)
 missing module named numba - imported by pandas.core._numba.executor (delayed, conditional), pandas.core.util.numba_ (delayed, conditional), pandas.core.window.numba_ (delayed, conditional), pandas.core.window.online (delayed, conditional), pandas.core._numba.kernels.mean_ (top-level), pandas.core._numba.kernels.shared (top-level), pandas.core._numba.kernels.sum_ (top-level), pandas.core._numba.kernels.min_max_ (top-level), pandas.core._numba.kernels.var_ (top-level), pandas.core.groupby.numba_ (delayed, conditional), pandas.core._numba.extensions (top-level)
 missing module named 'numba.extending' - imported by pandas.core._numba.kernels.sum_ (top-level)
 missing module named 'pyarrow.compute' - imported by pandas.core.arrays._arrow_string_mixins (conditional), pandas.core.arrays.string_arrow (conditional), pandas.core.reshape.merge (delayed, conditional), pandas.core.arrays.arrow.array (conditional), pandas.core.arrays.arrow.accessors (conditional)
 missing module named 'numba.typed' - imported by pandas.core._numba.extensions (delayed)
 missing module named 'numba.core' - imported by pandas.core._numba.extensions (top-level)
 missing module named pyarrow - imported by pandas.core.arrays._arrow_string_mixins (conditional), pandas.core.arrays.masked (delayed), pandas.core.arrays.boolean (delayed, conditional), pandas.core.arrays.numeric (delayed, conditional), pandas.core.arrays.arrow._arrow_utils (top-level), pandas.core.interchange.utils (delayed, conditional), pandas.core.strings.accessor (delayed, conditional), pandas.io._util (conditional), pandas.io.parsers.base_parser (delayed, conditional), pandas.core.arrays.interval (delayed), pandas.core.arrays.arrow.extension_types (top-level), pandas.core.arrays.period (delayed), pandas.core.methods.describe (delayed, conditional), pandas.io.sql (delayed, conditional), pandas.core.arrays.string_arrow (conditional), pandas.core.reshape.merge (delayed, conditional), pandas.core.arrays.arrow.array (conditional), pandas.core.interchange.buffer (conditional), pandas.io.feather_format (delayed), pandas.core.indexes.base (delayed, conditional), pandas.core.dtypes.cast (delayed, conditional), pandas.core.arrays.string_ (delayed, conditional), pandas.core.arrays.arrow.accessors (conditional), pandas.core.dtypes.dtypes (delayed, conditional), pandas.compat.pyarrow (optional), pandas.core.reshape.encoding (delayed, conditional), pandas._testing (conditional)
 missing module named 'scipy.stats' - imported by pandas.core.nanops (delayed, conditional)
 missing module named scipy - imported by pandas.core.dtypes.common (delayed, conditional, optional), pandas.core.missing (delayed)
 missing module named traitlets - imported by pandas.io.formats.printing (delayed, conditional)
 missing module named 'IPython.core' - imported by pandas.io.formats.printing (delayed, conditional)
 missing module named IPython - imported by pandas.io.formats.printing (delayed)
 missing module named xlsxwriter - imported by pandas.io.excel._xlsxwriter (delayed)
 missing module named 'odf.config' - imported by pandas.io.excel._odswriter (delayed)
 missing module named 'odf.style' - imported by pandas.io.excel._odswriter (delayed)
 missing module named 'odf.text' - imported by pandas.io.excel._odfreader (delayed), pandas.io.excel._odswriter (delayed)
 missing module named 'odf.table' - imported by pandas.io.excel._odfreader (delayed), pandas.io.excel._odswriter (delayed)
 missing module named 'odf.opendocument' - imported by pandas.io.excel._odfreader (delayed), pandas.io.excel._odswriter (delayed)
 missing module named pyxlsb - imported by pandas.io.excel._pyxlsb (delayed, conditional)
 missing module named 'odf.office' - imported by pandas.io.excel._odfreader (delayed)
 missing module named 'odf.element' - imported by pandas.io.excel._odfreader (delayed)
 missing module named 'odf.namespaces' - imported by pandas.io.excel._odfreader (delayed)
 missing module named odf - imported by pandas.io.excel._odfreader (conditional)
 missing module named python_calamine - imported by pandas.io.excel._calamine (delayed, conditional)
 missing module named 'matplotlib.pyplot' - imported by pandas.io.formats.style (optional)
 missing module named matplotlib - imported by pandas.plotting._core (conditional), pandas.io.formats.style (optional)
 missing module named 'matplotlib.colors' - imported by pandas.plotting._misc (conditional), pandas.io.formats.style (conditional)
 missing module named markupsafe - imported by pandas.io.formats.style_render (top-level)
 missing module named botocore - imported by pandas.io.common (delayed, conditional, optional)
 missing module named sets - imported by pytz.tzinfo (optional)
 missing module named collections.Mapping - imported by collections (optional), pytz.lazy (optional)
 missing module named UserDict - imported by pytz.lazy (optional)
 missing module named 'scipy.sparse' - imported by pandas.core.arrays.sparse.array (conditional), pandas.core.arrays.sparse.scipy_sparse (delayed, conditional), pandas.core.arrays.sparse.accessor (delayed)
 missing module named pandas.core.internals.Block - imported by pandas.core.internals (conditional), pandas.io.pytables (conditional)
 missing module named Foundation - imported by pandas.io.clipboard (delayed, conditional, optional)
 missing module named AppKit - imported by pandas.io.clipboard (delayed, conditional, optional)
 missing module named PyQt4 - imported by pandas.io.clipboard (delayed, conditional, optional)
 missing module named qtpy - imported by pandas.io.clipboard (delayed, conditional, optional)
 missing module named 'sqlalchemy.engine' - imported by pandas.io.sql (delayed)
 missing module named 'sqlalchemy.types' - imported by pandas.io.sql (delayed, conditional)
 missing module named 'sqlalchemy.schema' - imported by pandas.io.sql (delayed)
 missing module named 'sqlalchemy.sql' - imported by pandas.io.sql (conditional)
 missing module named sqlalchemy - imported by pandas.io.sql (delayed, conditional)
 missing module named tables - imported by pandas.io.pytables (delayed, conditional)
 missing module named 'pyarrow.fs' - imported by pandas.io.orc (conditional)
 missing module named fsspec - imported by pandas.io.orc (conditional)
 missing module named 'pyarrow.parquet' - imported by pandas.io.parquet (delayed)
 missing module named google - imported by pandas.io.gbq (conditional)
 missing module named 'lxml.html' - imported by pandas.io.html (delayed)
 missing module named bs4 - imported by pandas.io.html (delayed)
 missing module named pytest - imported by pandas._testing._io (delayed), pandas._testing (delayed)
 missing module named 'matplotlib.axes' - imported by pandas.plotting._misc (conditional), pandas._testing.asserters (delayed)
 missing module named 'matplotlib.artist' - imported by pandas._testing.asserters (delayed)
 missing module named 'matplotlib.table' - imported by pandas.plotting._misc (conditional)
 missing module named 'matplotlib.figure' - imported by pandas.plotting._misc (conditional)
 missing module named errorhandler - imported by xlutils.filter (delayed)
 missing module named guppy - imported by xlutils.filter (optional)
@@ -57,8 +57,6 @@ hidden_imports = [
    'xlwt',
    'xlutils',
    'requests',
    'dotenv',
    'tkinterdnd2',
    'configparser',
    'threading',
    'datetime',
@@ -70,28 +68,8 @@ hidden_imports = [
    'app.services.ocr_service',
    'app.services.order_service',
    'app.services.tobacco_service',
    'app.services.processor_service',
    'app.core.utils.dialog_utils',
    'app.core.utils.file_utils',
    'app.core.utils.log_utils',
    'app.core.utils.string_utils',
    'app.core.handlers.column_mapper',
    'app.core.excel.converter',
    'app.core.db.product_db',
    'app.ui.error_utils',
    'app.ui.theme',
    'app.ui.logging_ui',
    'app.ui.ui_widgets',
    'app.ui.user_settings',
    'app.ui.result_previews',
    'app.ui.command_runner',
    'app.ui.file_operations',
    'app.ui.action_handlers',
    'app.ui.barcode_editor',
    'app.ui.config_dialog',
    'app.ui.shortcuts',
    'app.ui.main_window',
    'app.ui.memory_editor',
 ]
 a = Analysis(
@@ -143,40 +121,6 @@ def build_exe():
    """Build the EXE file."""
    print("Starting EXE build...")
    try:
        # Inject version info into the root config.ini
        try:
            root_cfg = Path('config.ini')
            from datetime import datetime
            version_str = datetime.now().strftime('%Y.%m.%d.%H%M')
            if root_cfg.exists():
                lines = root_cfg.read_text(encoding='utf-8').splitlines()
                has_app = any(l.strip().lower() == '[app]' for l in lines)
                if not has_app:
                    lines.append('[App]')
                    lines.append(f'version = {version_str}')
                else:
                    # Update or append version
                    new_lines = []
                    in_app = False
                    app_written = False
                    for l in lines:
                        if l.strip().lower() == '[app]':
                            in_app = True
                            new_lines.append(l)
                            continue
                        if in_app and l.strip().lower().startswith('version'):
                            new_lines.append(f'version = {version_str}')
                            app_written = True
                            in_app = True
                            continue
                        new_lines.append(l)
                    if not app_written:
                        new_lines.append('version = ' + version_str)
                    lines = new_lines
                root_cfg.write_text('\n'.join(lines), encoding='utf-8')
                print(f"Version written: {version_str}")
        except Exception as e:
            print(f"Failed to inject version info: {e}")
        result = subprocess.run([
            'pyinstaller', 
            'OCR订单处理系统.spec'
@@ -206,9 +150,6 @@ def build_exe():
        if root_config_file.exists():
            shutil.copy2(root_config_file, dist_dir)
            print(f"Copied root config file to dist: {root_config_file} -> {dist_dir}")
        else:
            print("Warning: root config file not found; creating a default version")
            (dist_dir / 'config.ini').write_text('[App]\nversion = dev\n', encoding='utf-8')
    except subprocess.CalledProcessError as e:
        print(f"Build failed: {e}")
@@ -223,18 +164,8 @@ def create_portable_package():
    # Create the release directory
    release_dir = Path('release')
    if release_dir.exists():
-        try:
-            shutil.rmtree(release_dir)
-        except Exception as e:
-            print(f"Warning: could not fully clean the release directory (files may be in use): {e}")
-            # If the directory still exists, remove whatever can be removed
-            for item in release_dir.iterdir():
-                try:
-                    if item.is_dir(): shutil.rmtree(item)
-                    else: item.unlink()
-                except Exception: pass
-    release_dir.mkdir(exist_ok=True)
+        shutil.rmtree(release_dir)
+    release_dir.mkdir()
    # Copy the exe file
    exe_file = Path('dist/OCR订单处理系统.exe')
@@ -279,17 +210,6 @@ def create_portable_package():
        print(f"Copied template file: {template_file} -> {release_dir / 'templates'}")
    else:
        print(f"Warning: template file not found: {template_file}")
    item_file = Path('templates/商品资料.xlsx')
    if item_file.exists():
        try:
            (Path('dist') / 'templates').mkdir(exist_ok=True)
            shutil.copy2(item_file, Path('dist') / 'templates')
        except Exception:
            pass
        shutil.copy2(item_file, release_dir / 'templates')
        print(f"Copied product data: {item_file} -> {release_dir / 'templates'}")
    else:
        print(f"Warning: product data file not found: {item_file}")
    # Create the README file
    readme_content = '''
@@ -0,0 +1,88 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """
 Cleanup script - deletes unneeded files and logs
 """
 import os
 import shutil
 import glob
 def clean_logs():
    """Clean up log files."""
    print("Cleaning log files...")
    # Delete .active files
    active_files = glob.glob("logs/*.active")
    for file in active_files:
        try:
            os.remove(file)
            print(f"Deleted: {file}")
        except Exception as e:
            print(f"Error deleting file {file}: {e}")
    # Keep the latest log; delete old rotated backups
    log_files = glob.glob("logs/*.log.*")
    for file in log_files:
        try:
            os.remove(file)
            print(f"Deleted: {file}")
        except Exception as e:
            print(f"Error deleting file {file}: {e}")
 def clean_temp_files():
    """Clean up temporary files."""
    print("Cleaning temporary files...")
    # Empty the temp directory
    temp_dir = "data/temp"
    if os.path.exists(temp_dir):
        for file in os.listdir(temp_dir):
            file_path = os.path.join(temp_dir, file)
            try:
                if os.path.isfile(file_path):
                    os.remove(file_path)
                    print(f"Deleted: {file_path}")
                elif os.path.isdir(file_path):
                    shutil.rmtree(file_path)
                    print(f"Deleted directory: {file_path}")
            except Exception as e:
                print(f"Error deleting file {file_path}: {e}")
    # Delete backup files
    backup_files = glob.glob("data/*.bak") + glob.glob("config/*.bak")
    for file in backup_files:
        try:
            os.remove(file)
            print(f"Deleted: {file}")
        except Exception as e:
            print(f"Error deleting file {file}: {e}")
 def clean_pycache():
    """Clean up Python cache files."""
    print("Cleaning Python cache files...")
    # Find and delete all __pycache__ directories
    for root, dirs, files in os.walk("."):
        for dir in dirs:
            if dir == "__pycache__":
                cache_dir = os.path.join(root, dir)
                try:
                    shutil.rmtree(cache_dir)
                    print(f"Deleted directory: {cache_dir}")
                except Exception as e:
                    print(f"Error deleting directory {cache_dir}: {e}")
 def main():
    """Main entry point."""
    print("Starting cleanup of unneeded files...")
    clean_logs()
    clean_temp_files()
    clean_pycache()
    print("Cleanup complete!")
 if __name__ == "__main__":
    main()
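The three cleanup passes in the script above can also be written as one collection step followed by one deletion step, which allows a dry run before anything is removed. This is a minimal sketch that assumes the same directory layout (`logs/`, `data/temp/`, `data/*.bak`, `config/*.bak`, `__pycache__/`). The names `collect_cleanup_targets` and `remove_targets` are hypothetical and do not exist in the repository.

```python
import shutil
from pathlib import Path


def collect_cleanup_targets(root="."):
    """Return the paths the cleanup would delete, without deleting anything.

    Hypothetical helper: mirrors the glob patterns of the cleanup script.
    """
    root = Path(root)
    targets = []
    # Active-log markers and rotated log backups; plain *.log files are kept.
    targets += root.glob("logs/*.active")
    targets += root.glob("logs/*.log.*")
    # Every direct child of the temp directory (files and subdirectories).
    temp_dir = root / "data" / "temp"
    if temp_dir.is_dir():
        targets += temp_dir.iterdir()
    # Backup files in data/ and config/.
    targets += root.glob("data/*.bak")
    targets += root.glob("config/*.bak")
    # Bytecode cache directories at any depth.
    targets += root.rglob("__pycache__")
    return sorted(set(targets))


def remove_targets(targets):
    """Delete each target, reporting failures instead of aborting."""
    for path in targets:
        try:
            if path.is_dir():
                shutil.rmtree(path)
            elif path.exists():
                path.unlink()
        except OSError as exc:
            print(f"Failed to remove {path}: {exc}")
```

Printing the result of `collect_cleanup_targets()` before passing it to `remove_targets()` gives a preview of what would be deleted.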
@@ -1,23 +1,17 @@
 [API]
-api_key = 
+api_key = O0Fgk3o69RWJ86eAX8BTHRaB
-secret_key = 
+secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
 timeout = 30
 max_retries = 3
 retry_delay = 2
 api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
 token_url = https://aip.baidubce.com/oauth/2.0/token
 form_ocr_url = https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/get_request_result
 [Paths]
 input_folder = data/input
 output_folder = data/output
 temp_folder = data/temp
 template_folder = templates
 template_file = templates\银豹-采购单模板.xls
 processed_record = data/processed_files.json
 data_dir = data
 product_db = data/product_cache.db
 result_folder = data/result
 [Performance]
 max_workers = 4
@@ -27,22 +21,8 @@ skip_existing = true
 [File]
 allowed_extensions = .jpg,.jpeg,.png,.bmp
 excel_extension = .xlsx
-max_file_size_mb = 5
+max_file_size_mb = 4
 [Templates]
 purchase_order = 银豹-采购单模板.xls
 item_data = 商品资料.xlsx
 [App]
 version = 2026.05.05.0239
 [Gitea]
 base_url = https://gitea.94kan.cn
 owner = houhuan
 repo = yixuan-sync-data
 token = 50b61e43a141d606ae2529cd1755bc666d800e08
 [WebAuth]
 username = admin
 password_hash = $2b$12$nllT8o1QIMfWKuTlpQI3G./E2NS.gqf0EHZyNkJ8gMpVa9grTXRoC
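The `version` value in the `[App]` section above has the form `%Y.%m.%d.%H%M`. As a hedged alternative to editing the file line by line, the same stamp could be written with the standard-library `configparser`. This is only a sketch: the function name `stamp_version` and its `now` parameter are assumptions for illustration, and `configparser` rewrites the whole file, so comments are dropped and option names are lowercased.

```python
import configparser
from datetime import datetime
from pathlib import Path


def stamp_version(config_path, now=None):
    """Set [App] version to a build timestamp and return the value written.

    Hypothetical helper; `now` exists only to make the result reproducible.
    """
    version = (now or datetime.now()).strftime("%Y.%m.%d.%H%M")
    # interpolation=None keeps values containing '%' from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    path = Path(config_path)
    if path.exists():
        parser.read(path, encoding="utf-8")
    if not parser.has_section("App"):
        parser.add_section("App")
    parser.set("App", "version", version)
    with path.open("w", encoding="utf-8") as fh:
        parser.write(fh)
    return version
```

Because the parser tracks sections itself, the key always lands inside `[App]`, whether or not the section or the key existed beforehand.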
@@ -179,62 +179,6 @@
    "map_to": "69021343",
    "description": "条码映射：6923450653012 -> 69021343"
  },
  "6923644295844": {
    "map_to": "6923644285036",
    "description": "条码映射：6923644295844 -> 6923644285036"
  },
  "6907992513157": {
    "map_to": "6907992513195",
    "description": "条码映射：6907992513157 -> 6907992513195"
  },
  "6902083893842": {
    "map_to": "6902083907150",
    "description": "条码映射：6902083893842 -> 6902083907150"
  },
  "6902083904685": {
    "map_to": "6902083905217",
    "description": "条码映射：6902083904685 -> 6902083905217"
  },
  "6917878036849": {
    "map_to": "6917878036847",
    "description": "条码映射：6917878036849 -> 6917878036847"
  },
  "6903979000078": {
    "map_to": "6903979000061",
    "description": "条码映射：6903979000078 -> 6903979000061"
  },
  "6937003706353": {
    "map_to": "6937003706360",
    "description": "条码映射：6937003706353 -> 6937003706360"
  },
  "6923644242961": {
    "map_to": "6907992100043",
    "description": "条码映射：6923644242961 -> 6907992100043"
  },
  "6923644258382": {
    "map_to": "6923644252823",
    "description": "条码映射：6923644258382 -> 6923644252823"
  },
  "6923450657430": {
    "map_to": "69029110",
    "description": "条码映射：6923450657430 -> 69029110"
  },
  "6923450660232": {
    "map_to": "6923450690123",
    "description": "条码映射：6923450660232 -> 6923450690123"
  },
  "6923450657614": {
    "map_to": "6923450657607",
    "description": "条码映射：6923450657614 -> 6923450657607"
  },
  "6972556000022": {
    "map_to": "6977826050028",
    "description": "条码映射：6972556000022 -> 6977826050028"
  },
  "6949352266280": {
    "map_to": "6949352266273",
    "description": "条码映射：6949352266280 -> 6949352266273"
  },
  "6925019900087": {
    "multiplier": 10,
    "target_unit": "瓶",
@@ -257,17 +201,5 @@
    "target_unit": "个",
    "specification": "1*14",
    "description": "友臣肉松，1盒14个"
  },
  "6921734933485": {
    "multiplier": 12,
    "target_unit": "支",
    "specification": "1*12",
    "description": "得力铅笔"
  },
  "6901826888244": {
    "multiplier": 30,
    "target_unit": "对",
    "specification": "1*30",
    "description": "南孚电池"
  }
 }
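The mapping file above holds two kinds of entries: `map_to` entries that substitute one barcode for another, and `multiplier` entries that carry a `target_unit` and `specification`. How the application consumes them is not shown in this diff. The sketch below is one plausible reading, and it assumes that a multiplier scales the quantity up and the unit price down by the same factor. The function name `apply_barcode_rule` and the item field names are hypothetical.

```python
def apply_barcode_rule(item, rules):
    """Return a copy of `item` with the rule for its barcode applied, if any.

    Assumed item fields: barcode, quantity, unit_price, unit, specification.
    """
    result = dict(item)
    rule = rules.get(item["barcode"])
    if rule is None:
        return result
    if "map_to" in rule:
        # Barcode substitution entry.
        result["barcode"] = rule["map_to"]
    if "multiplier" in rule:
        # Assumption: convert a pack-level line into single units.
        factor = rule["multiplier"]
        result["quantity"] = item["quantity"] * factor
        result["unit_price"] = item["unit_price"] / factor
        if "target_unit" in rule:
            result["unit"] = rule["target_unit"]
        if "specification" in rule:
            result["specification"] = rule["specification"]
    return result


# Two entries copied from the mapping file in this diff.
RULES = {
    "6923644295844": {"map_to": "6923644285036"},
    "6921734933485": {"multiplier": 12, "target_unit": "支", "specification": "1*12"},
}

line = {"barcode": "6921734933485", "quantity": 2, "unit_price": 24.0, "unit": "盒"}
converted = apply_barcode_rule(line, RULES)
```

Returning a copy rather than mutating `item` keeps the original order line available for logging or comparison.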
@@ -1,23 +1,17 @@
 [API]
-api_key =
+api_key = O0Fgk3o69RWJ86eAX8BTHRaB
-secret_key =
+secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
 timeout = 30
 max_retries = 3
 retry_delay = 2
 api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
 token_url = https://aip.baidubce.com/oauth/2.0/token
 form_ocr_url = https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/get_request_result
 [Paths]
 input_folder = data/input
 output_folder = data/output
 result_folder = data/result
 temp_folder = data/temp
 template_folder = templates
 template_file = 银豹-采购单模板.xls
 processed_record = data/processed_files.json
 data_dir = data
 product_db = data/product_cache.db
 [Performance]
 max_workers = 4
@@ -31,11 +25,4 @@ max_file_size_mb = 4
 [Templates]
 purchase_order = 银豹-采购单模板.xls
 item_data = 商品资料.xlsx
 [Gitea]
 base_url = https://gitea.94kan.cn
 owner = houhuan
 repo = yixuan-sync-data
 token =
@@ -1,237 +0,0 @@
 {
  "suppliers": [
    {
      "name": "蓉城易购",
      "description": "蓉城易购供应商订单处理",
      "filename_patterns": [
        "*蓉城*",
        "*rongcheng*",
        "*易*"
      ],
      "content_indicators": [
        "蓉城易购",
        "商品编码",
        "订货数量"
      ],
      "column_mapping": {
        "商品条码（小条码）": "barcode",
        "商品名称": "name",
        "规格": "specification",
        "订购数量（小单位）": "quantity",
        "单位": "unit",
        "单价（小单位)": "unit_price",
        "优惠后金额（小单位）": "total_price",
        "备注": "category",
        "行号": "supplier"
      },
      "cleaning_rules": [
        {
          "type": "remove_rows",
          "condition": "订货数量 == 0 or 订货数量.isna()"
        },
        {
          "type": "fill_na",
          "columns": [
            "unit_price"
          ],
          "value": 0
        }
      ],
      "calculations": [
        {
          "type": "multiply",
          "source_column": "quantity",
          "target_column": "quantity",
          "factor": 1
        }
      ],
      "output_suffix": "_蓉城易购_银豹采购单",
      "header_row": 2,
      "rules": [
        {
          "type": "split_quantity_unit",
          "source": "订购数量（小单位）"
        },
        {
          "type": "extract_spec_from_name",
          "source": "商品名称"
        },
        {
          "type": "normalize_unit",
          "target": "unit",
          "map": {
            "箱": "件",
            "提": "件",
            "盒": "件"
          }
        },
        {
          "type": "compute_quantity_from_total"
        },
        {
          "type": "mark_gift"
        },
        {
          "type": "fill_missing",
          "fills": {
            "unit": "瓶"
          }
        }
      ],
      "output_templates": [
        "templates/银豹-采购单模板.xls"
      ],
      "current_template_index": 0
    },
    {
      "name": "通用食品供应商",
      "description": "通用食品类供应商订单",
      "filename_patterns": [
        "*食品*",
        "*配送*",
        "*供货*"
      ],
      "content_indicators": [
        "产品条码",
        "订购量",
        "进货价"
      ],
      "column_mapping": {
        "产品条码": "barcode",
        "产品名称": "name",
        "订购量": "quantity",
        "进货价": "unit_price"
      },
      "cleaning_rules": [
        {
          "type": "convert_type",
          "columns": [
            "unit_price"
          ],
          "target_type": "float"
        },
        {
          "type": "fill_na",
          "columns": [
            "barcode",
            "name",
            "quantity"
          ],
          "value": 0
        }
      ],
      "output_suffix": "_食品供应商_银豹采购单",
      "rules": [
        {
          "type": "split_quantity_unit",
          "source": "订购量"
        },
        {
          "type": "extract_spec_from_name",
          "source": "产品名称"
        },
        {
          "type": "normalize_unit",
          "target": "unit",
          "map": {
            "箱": "件",
            "提": "件",
            "盒": "件"
          }
        },
        {
          "type": "compute_quantity_from_total"
        },
        {
          "type": "mark_gift"
        },
        {
          "type": "fill_missing",
          "fills": {
            "unit": "瓶"
          }
        }
      ],
      "output_templates": [
        "templates/银豹-采购单模板.xls"
      ],
      "current_template_index": 0
    },
    {
      "name": "农夫山泉",
      "description": "",
      "filename_patterns": [],
      "content_indicators": [],
      "column_mapping": {
        "条形码": "barcode",
        "商品名称": "name",
        "销售价": "unit_price",
        "订单金额": "total_price",
        "Unnamed: 0": "supplier",
        "备注": "brand"
      },
      "header_row": 0,
      "rules": [
        {
          "type": "split_quantity_unit",
          "source": "订单数量"
        },
        {
          "type": "extract_spec_from_name",
          "source": "name"
        },
        {
          "type": "normalize_unit",
          "target": "unit",
          "map": {
            "箱": "件",
            "提": "件",
            "盒": "件"
          }
        },
        {
          "type": "compute_quantity_from_total"
        },
        {
          "type": "mark_gift"
        },
        {
          "type": "fill_missing",
          "fills": {
            "unit": "瓶"
          }
        }
      ],
      "dictionary": {
        "ignore_words": [
          "白膜",
          "彩膜",
          "赠品"
        ],
        "unit_synonyms": {
          "箱": "件",
          "提": "件",
          "盒": "件",
          "瓶": "瓶"
        },
        "pack_multipliers": {
          "件": 24,
          "箱": 24,
          "提": 12,
          "盒": 10
        },
        "name_patterns": [
          "(\\d+(?:\\.\\d+)?)(ml|mL|ML|l|L|升|毫升)[*×xX](\\d+)",
          "(\\d+)[*×xX](\\d+)瓶",
          "(\\d{2,3}).*?(\\d{1,3})"
        ],
        "default_unit": "瓶",
        "default_package_quantity": 1
      },
      "output_templates": [
        "templates/银豹-采购单模板.xls"
      ],
      "current_template_index": 0
    }
  ]
 }
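Each supplier entry in the removed file pairs `filename_patterns` (shell-style wildcards) with a `column_mapping` from source headers to internal field names. The sketch below shows how such a config could be consulted, assuming the first supplier with a matching pattern wins and unmapped columns are dropped. The repository's actual matching logic is not shown in this diff, `content_indicators` are ignored here, and the names `match_supplier` and `rename_columns` are hypothetical.

```python
from fnmatch import fnmatch


def match_supplier(filename, suppliers):
    """Return the first supplier whose filename_patterns match, else None."""
    for supplier in suppliers:
        patterns = supplier.get("filename_patterns", [])
        if any(fnmatch(filename, pattern) for pattern in patterns):
            return supplier
    return None


def rename_columns(row, column_mapping):
    """Map source headers to internal field names; drop unmapped columns."""
    return {
        column_mapping[header]: value
        for header, value in row.items()
        if header in column_mapping
    }


# Trimmed copies of two entries from the supplier config in this diff.
SUPPLIERS = [
    {
        "name": "蓉城易购",
        "filename_patterns": ["*蓉城*", "*rongcheng*", "*易*"],
        "column_mapping": {"商品名称": "name", "规格": "specification"},
    },
    {
        "name": "通用食品供应商",
        "filename_patterns": ["*食品*", "*配送*", "*供货*"],
        "column_mapping": {"产品条码": "barcode", "产品名称": "name"},
    },
]

supplier = match_supplier("订单_食品配送.xlsx", SUPPLIERS)
```

Note that `fnmatch` normalizes case only on case-insensitive platforms such as Windows, so a pattern like `*rongcheng*` may or may not match `Rongcheng.xlsx` depending on where the code runs.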
@@ -0,0 +1,3 @@
 {
  "data/output\\7a3a78a02fcf6ccef5daad31bd50bdf2.xlsx": "data/result\\采购单_7a3a78a02fcf6ccef5daad31bd50bdf2.xls"
 }
@@ -0,0 +1 @@
 {"theme": "light"}
@@ -0,0 +1,28 @@
 [API]
 api_key = O0Fgk3o69RWJ86eAX8BTHRaB
 secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
 timeout = 30
 max_retries = 3
 retry_delay = 2
 api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
 [Paths]
 input_folder = data/input
 output_folder = data/output
 temp_folder = data/temp
 template_folder = templates
 processed_record = data/processed_files.json
 [Performance]
 max_workers = 4
 batch_size = 5
 skip_existing = true
 [File]
 allowed_extensions = .jpg,.jpeg,.png,.bmp
 excel_extension = .xlsx
 max_file_size_mb = 4
 [Templates]
 purchase_order = 银豹-采购单模板.xls
@@ -0,0 +1,205 @@
 {
  "6920584471055": {
    "map_to": "6920584471017",
    "description": "条码映射：6920584471055 -> 6920584471017"
  },
  "6925861571159": {
    "map_to": "69021824",
    "description": "条码映射：6925861571159 -> 69021824"
  },
  "6923644268923": {
    "map_to": "6923644268480",
    "description": "条码映射：6923644268923 -> 6923644268480"
  },
  "6925861571466": {
    "map_to": "6925861571459",
    "description": "条码映射：6925861571466 -> 6925861571459"
  },
  "6907992508344": {
    "map_to": "6907992508191",
    "description": "条码映射：6907992508344 -> 6907992508191"
  },
  "6903979000979": {
    "map_to": "6903979000962",
    "description": "条码映射：6903979000979 -> 6903979000962"
  },
  "6923644283582": {
    "map_to": "6923644283575",
    "description": "条码映射：6923644283582 -> 6923644283575"
  },
  "6923644268930": {
    "map_to": "6923644268497",
    "description": "条码映射：6923644268930 -> 6923644268497"
  },
  "6923644268916": {
    "map_to": "6923644268503",
    "description": "条码映射：6923644268916 -> 6923644268503"
  },
  "6923644268909": {
    "map_to": "6923644268510",
    "description": "条码映射：6923644268909 -> 6923644268510"
  },
  "6923644299804": {
    "map_to": "6923644299774",
    "description": "条码映射：6923644299804 -> 6923644299774"
  },
  "6923644266318": {
    "map_to": "6923644266066",
    "description": "条码映射：6923644266318 -> 6923644266066"
  },
  "6923644210151": {
    "map_to": "6923644223458",
    "description": "条码映射：6923644210151 -> 6923644223458"
  },
  "6907992501819": {
    "map_to": "6907992500133",
    "description": "条码映射：6907992501819 -> 6907992500133"
  },
  "6907992502052": {
    "map_to": "6907992100272",
    "description": "条码映射：6907992502052 -> 6907992100272"
  },
  "6907992507385": {
    "map_to": "6907992507095",
    "description": "条码映射：6907992507385 -> 6907992507095"
  },
  "6973726149671": {
    "map_to": "6973726149657",
    "description": "条码映射：6973726149671 -> 6973726149657"
  },
  "6977426410574": {
    "map_to": "6977426410567",
    "description": "条码映射：6977426410574 -> 6977426410567"
  },
  "6973726149688": {
    "map_to": "6973726149664",
    "description": "条码映射：6973726149688 -> 6973726149664"
  },
  "6935205322012": {
    "map_to": "6935205320018",
    "description": "条码映射：6935205322012 -> 6935205320018"
  },
  "6943497411024": {
    "map_to": "6943497411017",
    "description": "条码映射：6943497411024 -> 6943497411017"
  },
  "6921734968821": {
    "map_to": "6921734968814",
    "description": "条码映射：6921734968821 -> 6921734968814"
  },
  "6921734968258": {
    "map_to": "6921734968241",
    "description": "条码映射：6921734968258 -> 6921734968241"
  },
  "6921734968180": {
    "map_to": "6921734968173",
    "description": "条码映射：6921734968180 -> 6921734968173"
  },
  "6921734908735": {
    "map_to": "6935205372772",
    "description": "条码映射：6921734908735 -> 6935205372772"
  },
  "6923644248222": {
    "map_to": "6923644248208",
    "description": "条码映射：6923644248222 -> 6923644248208"
  },
  "6902083881122": {
    "map_to": "6902083881085",
    "description": "条码映射：6902083881122 -> 6902083881085"
  },
  "6907992501857": {
    "map_to": "6907992500010",
    "description": "条码映射：6907992501857 -> 6907992500010"
  },
  "6902083891015": {
    "map_to": "6902083890636",
    "description": "条码映射：6902083891015 -> 6902083890636"
  },
  "6923450605240": {
    "map_to": "6923450605226",
    "description": "条码映射：6923450605240 -> 6923450605226"
  },
  "6923450605196": {
    "map_to": "6923450614624",
    "description": "条码映射：6923450605196 -> 6923450614624"
  },
  "6923450665213": {
    "map_to": "6923450665206",
    "description": "条码映射：6923450665213 -> 6923450665206"
  },
  "6923450666821": {
    "map_to": "6923450666838",
    "description": "条码映射：6923450666821 -> 6923450666838"
  },
  "6923450661505": {
    "map_to": "6923450661499",
    "description": "条码映射：6923450661505 -> 6923450661499"
  },
  "6923450676103": {
    "map_to": "6923450676097",
    "description": "条码映射：6923450676103 -> 6923450676097"
  },
  "6923450614631": {
    "map_to": "6923450614624",
    "description": "条码映射：6923450614631 -> 6923450614624"
  },
  "6901424334174": {
    "map_to": "6973730760015",
    "description": "条码映射：6901424334174 -> 6973730760015"
  },
  "6958620703716": {
    "map_to": "6958620703907",
    "description": "条码映射：6958620703716 -> 6958620703907"
  },
  "6937003706322": {
    "map_to": "6937003703833",
    "description": "条码映射：6937003706322 -> 6937003703833"
  },
  "6950783203494": {
    "map_to": "6950873203494",
    "description": "条码映射：6950783203494 -> 6950873203494"
  },
  "6907992501871": {
    "map_to": "6907992500010",
    "description": "条码映射：6907992501871 -> 6907992500010"
  },
  "6907992501864": {
    "map_to": "6907992100012",
    "description": "条码映射：6907992501864 -> 6907992100012"
  },
  "6923644264192": {
    "map_to": "6923644264116",
    "description": "条码映射：6923644264192 -> 6923644264116"
  },
  "6923450667316": {
    "map_to": "69042386",
    "description": "条码映射：6923450667316 -> 69042386"
  },
  "6923450653012": {
    "map_to": "69021343",
    "description": "条码映射：6923450653012 -> 69021343"
  },
  "6925019900087": {
    "multiplier": 10,
    "target_unit": "瓶",
    "description": "特殊处理：数量*10，单位转换为瓶"
  },
  "6921168593804": {
    "multiplier": 30,
    "target_unit": "瓶",
    "description": "NFC产品特殊处理：每箱30瓶"
  },
  "6901826888138": {
    "multiplier": 30,
    "target_unit": "瓶",
    "fixed_price": 3.7333333333333334,
    "specification": "1*30",
    "description": "特殊处理: 规格1*30，数量*30，单价=112/30"
  },
  "6958620703907": {
    "multiplier": 14,
    "target_unit": "个",
    "specification": "1*14",
    "description": "友臣肉松，1盒14个"
  }
 }
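The entries above come in two shapes: a plain redirect (`map_to`) and a special-handling rule (`multiplier`, `target_unit`, and optionally `fixed_price` and `specification`). A minimal sketch of how such a table might be applied to one order line is shown here. The function name `apply_barcode_rule`, the item dict layout, and the choice to resolve `map_to` before looking up the special rule are assumptions for illustration, not the project's actual converter code.

```python
from typing import Any, Dict

# A three-entry excerpt of the mapping file above.
MAPPINGS: Dict[str, Dict[str, Any]] = {
    "6958620703716": {"map_to": "6958620703907"},
    "6958620703907": {"multiplier": 14, "target_unit": "个", "specification": "1*14"},
    "6901826888138": {"multiplier": 30, "target_unit": "瓶",
                      "fixed_price": 112 / 30, "specification": "1*30"},
}


def apply_barcode_rule(item: Dict[str, Any], mappings: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of `item` with redirect and special-handling rules applied.

    Hypothetical helper: resolves one `map_to` hop, then applies the
    multiplier rule (if any) registered for the resulting barcode.
    """
    result = dict(item)
    rule = mappings.get(result["barcode"], {})
    if "map_to" in rule:
        result["barcode"] = rule["map_to"]
        rule = mappings.get(result["barcode"], {})

    multiplier = rule.get("multiplier")
    if multiplier:
        # Assumption: the per-unit price is the pack price divided by the
        # multiplier unless the rule pins a fixed price.
        result["quantity"] = result["quantity"] * multiplier
        result["price"] = rule.get("fixed_price", result["price"] / multiplier)
        result["unit"] = rule.get("target_unit", result["unit"])
    if "specification" in rule:
        result["specification"] = rule["specification"]
    return result


line = {"barcode": "6958620703716", "quantity": 2, "price": 28.0, "unit": "盒"}
converted = apply_barcode_rule(line, MAPPINGS)
```

Under these assumptions, a line for 2 boxes of `6958620703716` at 28.0 per box is redirected to `6958620703907` and expanded to 28 single units at 2.0 each. Whether the real pipeline chains a redirect into the target's special rule is not shown in this diff.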
@@ -0,0 +1,28 @@
 [API]
 api_key = O0Fgk3o69RWJ86eAX8BTHRaB
 secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
 timeout = 30
 max_retries = 3
 retry_delay = 2
 api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
 [Paths]
 input_folder = data/input
 output_folder = data/output
 temp_folder = data/temp
 template_folder = templates
 processed_record = data/processed_files.json
 [Performance]
 max_workers = 4
 batch_size = 5
 skip_existing = true
 [File]
 allowed_extensions = .jpg,.jpeg,.png,.bmp
 excel_extension = .xlsx
 max_file_size_mb = 4
 [Templates]
 purchase_order = 银豹-采购单模板.xls
@@ -1,212 +0,0 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """
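 OCR Order Processing System - Headless Automation Interface
 -----------------------------
 Designed for integration with automation platforms such as openclaw.
 Pipeline: input image -> OCR -> data cleaning -> price validation -> output result path.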
 """
 import os
 import sys
 import time
 import argparse
 import json
 from pathlib import Path
 from typing import Optional, List, Dict
 # Add the current directory to the import path
 sys.path.append(os.path.dirname(os.path.abspath(__file__)))
 from app.config.settings import ConfigManager
 from app.services.ocr_service import OCRService
 from app.services.order_service import OrderService
 from app.services.tobacco_service import TobaccoService
 from app.services.special_suppliers_service import SpecialSuppliersService
 from app.core.utils.log_utils import get_logger, set_log_level
 logger = get_logger("HeadlessAPI")
 def get_latest_file(directory: str, extensions: List[str]) -> Optional[str]:
    """Return the most recently modified file in the directory that has one of the given extensions."""
    dir_path = Path(directory)
    if not dir_path.exists():
        return None
    files = []
    for ext in extensions:
        files.extend(dir_path.glob(f"*{ext}"))
        files.extend(dir_path.glob(f"*{ext.upper()}"))
    if not files:
        return None
    latest_file = max(files, key=lambda p: p.stat().st_mtime)
    return str(latest_file)
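 def update_barcode_mapping(barcode: str, target_barcode: Optional[str] = None, multiplier: Optional[float] = None, unit: Optional[str] = None, price: Optional[float] = None, spec: Optional[str] = None) -> bool:
    """Update a barcode mapping or a special-handling rule in config/barcode_mappings.json."""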
    try:
        config_path = os.path.join("config", "barcode_mappings.json")
        mappings = {}
        if os.path.exists(config_path):
            with open(config_path, 'r', encoding='utf-8') as f:
                mappings = json.load(f)
        # Get or create the config entry for this barcode
        config = mappings.get(barcode, {})
        if target_barcode:
            config["map_to"] = target_barcode
            config["description"] = config.get("description", "") + f" 条码映射 -> {target_barcode}"
        if multiplier is not None:
            config["multiplier"] = multiplier
            config["description"] = config.get("description", "") + f" 数量倍数*{multiplier}"
        if unit:
            config["target_unit"] = unit
        if price is not None:
            config["fixed_price"] = price
        if spec:
            config["specification"] = spec
        if not config.get("description"):
            config["description"] = f"特殊条码配置: {barcode}"
        mappings[barcode] = config
        with open(config_path, 'w', encoding='utf-8') as f:
            json.dump(mappings, f, ensure_ascii=False, indent=2)
        logger.info(f"成功更新条码配置: {barcode} -> {config}")
        return True
    except Exception as e:
        logger.error(f"更新条码配置失败: {e}")
        return False
 def run_pipeline(args):
    """Run the processing pipeline."""
    try:
        config_manager = ConfigManager()
        order_service = OrderService(config_manager)
        start_time = time.perf_counter()
        final_excel = None
        input_folder = config_manager.get('Paths', 'input_folder', fallback='data/input')
        output_folder = config_manager.get('Paths', 'output_folder', fallback='data/output')
        # 1. Handle barcode-mapping updates
        if args.update_mapping:
            if not args.barcode:
                print("ERROR: --barcode is required for --update-mapping", file=sys.stderr)
                return None
            # At least one update option is required (check for None so that 0 is accepted)
            if all(v is None for v in [args.target, args.multiplier, args.unit, args.price, args.spec]):
                print("ERROR: At least one update option (--target, --multiplier, --unit, --price, --spec) is required", file=sys.stderr)
                return None
            if update_barcode_mapping(args.barcode, args.target, args.multiplier, args.unit, args.price, args.spec):
                print(f"SUCCESS: Barcode configuration updated for {args.barcode}")
                return "MAPPING_UPDATED"
            return None
        # 2. Tobacco company orders (explicitly requested)
        if args.tobacco:
            input_path = args.input or get_latest_file(output_folder, [".xlsx", ".xls"])
            if not input_path:
                print("ERROR: No tobacco order file found.", file=sys.stderr)
                return None
            logger.info(f"开始显式处理烟草订单: {input_path}")
            # process_tobacco_order runs the preprocessing step and generates the 银豹-format file
            tobacco_service = TobaccoService(config_manager)
            final_excel = tobacco_service.process_tobacco_order(input_path)
        # 3. Rongcheng Yigou orders (explicitly requested)
        elif args.rongcheng:
            input_path = args.input or get_latest_file(output_folder, [".xlsx", ".xls"])
            if not input_path:
                print("ERROR: No Rongcheng Yigou order file found.", file=sys.stderr)
                return None
            logger.info(f"开始显式处理蓉城易购订单: {input_path}")
            special_service = SpecialSuppliersService(config_manager)
            final_excel = special_service.process_rongcheng_yigou(input_path)
        # 4. Regular Excel processing (auto-detects Tobacco / Rongcheng / Yang Biyue orders)
        elif args.excel:
            input_path = args.input or get_latest_file(input_folder, [".xlsx", ".xls"])
            if not input_path:
                print("ERROR: No Excel file found in input.", file=sys.stderr)
                return None
            logger.info(f"开始处理 Excel (支持智能识别): {input_path}")
            # OrderService.process_excel calls _check_special_preprocess internally
            final_excel = order_service.process_excel(input_path)
        # 5. Smart processing (default: decide automatically between image and Excel)
        else:
            input_path = args.input or get_latest_file(input_folder, [".jpg", ".jpeg", ".png", ".bmp", ".xlsx", ".xls"])
            if not input_path:
                print(f"ERROR: No input file found in {input_folder}.", file=sys.stderr)
                return None
            ext = os.path.splitext(input_path)[1].lower()
            if ext in [".xlsx", ".xls"]:
                logger.info(f"智能识别为 Excel 文件，开始处理: {input_path}")
                final_excel = order_service.process_excel(input_path)
            else:
                logger.info(f"智能识别为图片文件，开始 OCR 处理: {input_path}")
                ocr_service = OCRService(config_manager)
                excel_intermediate = ocr_service.process_image(input_path)
                if excel_intermediate:
                    final_excel = order_service.process_excel(excel_intermediate)
        # 6. Post-processing (validation and output)
        if final_excel:
            # Unit-price validation
            discrepancies = order_service.validate_unit_price(final_excel)
            if discrepancies:
                print(f"WARNING: Price validation found {len(discrepancies)} issues:", file=sys.stderr)
                for d in discrepancies:
                    print(f"  - {d}", file=sys.stderr)
            duration = time.perf_counter() - start_time
            logger.info(f"处理完成，耗时: {duration:.2f}s")
            # Print the final output path
            abs_path = os.path.abspath(final_excel)
            print(abs_path)
            return abs_path
        else:
            print("ERROR: Processing failed.", file=sys.stderr)
            return None
    except Exception as e:
        import traceback
        print(f"CRITICAL ERROR: {str(e)}", file=sys.stderr)
        traceback.print_exc(file=sys.stderr)
        return None
 if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="OCR订单处理系统 - 无界面自动化接口")
    parser.add_argument('input', nargs='?', help='输入文件路径 (图片或Excel)')
    group = parser.add_mutually_exclusive_group()
    group.add_argument('--excel', action='store_true', help='处理普通 Excel 文件')
    group.add_argument('--tobacco', action='store_true', help='处理烟草公司订单')
    group.add_argument('--rongcheng', action='store_true', help='处理蓉城易购订单')
    group.add_argument('--update-mapping', action='store_true', help='更新条码映射')
    parser.add_argument('--barcode', help='待映射的原始条码 (用于 --update-mapping)')
    parser.add_argument('--target', help='目标条码 (用于 --update-mapping)')
    parser.add_argument('--multiplier', type=float, help='数量倍数 (例如箱转瓶填写30)')
    parser.add_argument('--unit', help='目标单位 (例如"瓶")')
    parser.add_argument('--price', type=float, help='固定单价')
    parser.add_argument('--spec', help='固定规格 (例如"1*30")')
    args = parser.parse_args()
    result = run_pipeline(args)
    sys.exit(0 if result else 1)
@@ -0,0 +1,9 @@
 2025-08-16 00:52:16,840 - app.core.excel.converter - INFO - 成功加载条码映射配置，共49项
 2025-08-16 00:52:17,144 - app.core.excel.converter - INFO - 解析容量(ml)规格: 500ml*15 -> 1*15
 2025-08-16 00:52:17,217 - app.core.excel.converter - INFO - 解析容量(ml)规格: 600mL*15 -> 1*15
 2025-08-16 00:52:17,283 - app.core.excel.converter - INFO - 解析容量(ml)规格: 600ml*15 -> 1*15
 2025-08-16 00:52:17,346 - app.core.excel.converter - INFO - 解析容量(ml)规格: 900ml*12 -> 1*12
 2025-08-16 00:52:17,399 - app.core.excel.converter - INFO - 解析容量(ml)规格: 900ml*12 -> 1*12
 2025-08-16 00:52:17,462 - app.core.excel.converter - INFO - 解析容量(ml)规格: 900ml*12 -> 1*12
 2025-08-16 00:52:17,515 - app.core.excel.converter - INFO - 解析容量(ml)规格: 950ml*12 -> 1*12
 2025-08-16 00:52:17,579 - app.core.excel.converter - INFO - 解析容量(ml)规格: 480ml*15 -> 1*15
@@ -0,0 +1 @@
 2025-08-16 00:52:17,210 - app.core.excel.handlers.barcode_mapper - INFO - 条码映射: 6937003706322 -> 6937003703833
@@ -0,0 +1,8 @@
 2025-08-16 00:52:17,160 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 1.0 -> 15.0, 单价: 68.0 -> 4.533333333333333, 单位: 件 -> 瓶
 2025-08-16 00:52:17,236 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 1.0 -> 15.0, 单价: 68.0 -> 4.533333333333333, 单位: 件 -> 瓶
 2025-08-16 00:52:17,298 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 1.0 -> 15.0, 单价: 68.0 -> 4.533333333333333, 单位: 件 -> 瓶
 2025-08-16 00:52:17,366 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 2.0 -> 24.0, 单价: 45.0 -> 3.75, 单位: 件 -> 瓶
 2025-08-16 00:52:17,415 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 2.0 -> 24.0, 单价: 45.0 -> 3.75, 单位: 件 -> 瓶
 2025-08-16 00:52:17,477 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 1.0 -> 12.0, 单价: 45.0 -> 3.75, 单位: 件 -> 瓶
 2025-08-16 00:52:17,533 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 1.0 -> 12.0, 单价: 73.0 -> 6.083333333333333, 单位: 件 -> 瓶
 2025-08-16 00:52:17,594 - app.core.excel.handlers.unit_converter_handlers - INFO - 赠品瓶单位处理: 保持原样 数量: 5.0, 单价: 0, 单位: 瓶
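The figures logged above follow from simple arithmetic: a line priced per case (件) is expanded into bottles by multiplying the quantity by the pack count parsed from the specification and dividing the unit price by the same count. A minimal sketch, assuming the pack count is the number after `*` in specs such as `500ml*15`; `pack_count` and `case_to_bottles` are illustrative names, not functions from the project.

```python
import re
from typing import Tuple

# Volume-style spec such as "500ml*15" or "600mL*15". The pattern mirrors the
# first name_patterns entry in the supplier config; reusing it here is an assumption.
SPEC_RE = re.compile(r"(\d+(?:\.\d+)?)(ml|mL|ML|l|L|升|毫升)[*×xX](\d+)")


def pack_count(spec: str) -> int:
    """Return the units-per-case count parsed from a spec, or 1 if none is found."""
    match = SPEC_RE.search(spec)
    return int(match.group(3)) if match else 1


def case_to_bottles(quantity: float, price: float, unit: str, spec: str) -> Tuple[float, float, str]:
    """Convert a per-case line to per-bottle; lines in other units pass through unchanged."""
    if unit != "件":
        return quantity, price, unit
    count = pack_count(spec)
    return quantity * count, price / count, "瓶"
```

With these definitions, `case_to_bottles(1.0, 68.0, "件", "500ml*15")` yields a quantity of 15.0 and a unit price of 68.0 / 15, matching the first log entry, while a gift line already counted in bottles is returned as-is, matching the last one.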
@@ -0,0 +1,2 @@
 2025-08-16 00:52:16,853 - app.core.excel.merger - INFO - 使用输出目录: E:\2025Code\python\orc-order-v2\data\output
 2025-08-16 00:52:16,861 - app.core.excel.merger - INFO - 初始化PurchaseOrderMerger完成，模板文件: templates\银豹-采购单模板.xls
@@ -0,0 +1,57 @@
 2025-08-16 00:52:16,835 - app.core.excel.processor - INFO - 使用输出目录: E:\2025Code\python\orc-order-v2\data\output
 2025-08-16 00:52:16,839 - app.core.excel.processor - INFO - 使用临时目录: E:\2025Code\python\orc-order-v2\data\temp
 2025-08-16 00:52:16,847 - app.core.excel.processor - INFO - 初始化ExcelProcessor完成，模板文件: templates/银豹-采购单模板.xls
 2025-08-16 00:52:16,877 - app.core.excel.processor - INFO - 搜索目录 data/output 中的Excel文件
 2025-08-16 00:52:16,886 - app.core.excel.processor - INFO - 找到最新的Excel文件: data/output\7a3a78a02fcf6ccef5daad31bd50bdf2.xlsx
 2025-08-16 00:52:16,895 - app.core.excel.processor - INFO - 开始处理Excel文件: data/output\7a3a78a02fcf6ccef5daad31bd50bdf2.xlsx
 2025-08-16 00:52:16,934 - app.core.excel.processor - INFO - 成功读取Excel文件: data/output\7a3a78a02fcf6ccef5daad31bd50bdf2.xlsx, 共 10 行
 2025-08-16 00:52:16,935 - app.core.excel.processor - INFO - 找到可能的表头行: 第1行，评分: 60
 2025-08-16 00:52:16,941 - app.core.excel.processor - INFO - 识别到表头在第 1 行
 2025-08-16 00:52:16,965 - app.core.excel.processor - INFO - 使用表头行重新读取数据，共 9 行有效数据
 2025-08-16 00:52:16,974 - app.core.excel.processor - INFO - 找到精确匹配的条码列: 商品条码
 2025-08-16 00:52:16,988 - app.core.excel.processor - INFO - 使用条码列: 商品条码
 2025-08-16 00:52:17,003 - app.core.excel.processor - INFO - 找到name列: 商品名称
 2025-08-16 00:52:17,016 - app.core.excel.processor - INFO - 找到specification列: 规格型号
 2025-08-16 00:52:17,032 - app.core.excel.processor - INFO - 找到quantity列: 数量
 2025-08-16 00:52:17,049 - app.core.excel.processor - INFO - 找到unit列: 单位
 2025-08-16 00:52:17,064 - app.core.excel.processor - INFO - 找到price列: 单价
 2025-08-16 00:52:17,079 - app.core.excel.processor - INFO - 找到amount列: 金额
 2025-08-16 00:52:17,094 - app.core.excel.processor - INFO - 检测到列映射: {'barcode': '商品条码', 'name': '商品名称', 'specification': '规格型号', 'quantity': '数量', 'unit': '单位', 'price': '单价', 'amount': '金额'}
 2025-08-16 00:52:17,110 - app.core.excel.processor - INFO - 从映射列解析规格: 500ml*15 -> 包装数量=15
 2025-08-16 00:52:17,177 - app.core.excel.processor - INFO - 从映射列解析规格: 600mL*15 -> 包装数量=15
 2025-08-16 00:52:17,252 - app.core.excel.processor - INFO - 从映射列解析规格: 600ml*15 -> 包装数量=15
 2025-08-16 00:52:17,314 - app.core.excel.processor - INFO - 从映射列解析规格: 900ml*12 -> 包装数量=12
 2025-08-16 00:52:17,373 - app.core.excel.processor - INFO - 从映射列解析规格: 900ml*12 -> 包装数量=12
 2025-08-16 00:52:17,431 - app.core.excel.processor - INFO - 从映射列解析规格: 900ml*12 -> 包装数量=12
 2025-08-16 00:52:17,493 - app.core.excel.processor - INFO - 从映射列解析规格: 950ml*12 -> 包装数量=12
 2025-08-16 00:52:17,549 - app.core.excel.processor - INFO - 从映射列解析规格: 480ml*15 -> 包装数量=15
 2025-08-16 00:52:17,610 - app.core.excel.processor - INFO - 提取到 8 个商品信息
 2025-08-16 00:52:17,634 - app.core.excel.processor - INFO - 开始处理8 个产品信息
 2025-08-16 00:52:17,644 - app.core.excel.processor - INFO - 处理商品: 条码=6970399922365, 数量=15.0, 单价=4.533333333333333, 是否赠品=False
 2025-08-16 00:52:17,659 - app.core.excel.processor - INFO - 发现正常商品：条码6970399922365, 数量=15.0, 单价=4.533333333333333
 2025-08-16 00:52:17,675 - app.core.excel.processor - INFO - 处理商品: 条码=6937003703833, 数量=15.0, 单价=4.533333333333333, 是否赠品=False
 2025-08-16 00:52:17,692 - app.core.excel.processor - INFO - 发现正常商品：条码6937003703833, 数量=15.0, 单价=4.533333333333333
 2025-08-16 00:52:17,707 - app.core.excel.processor - INFO - 处理商品: 条码=6937003706346, 数量=15.0, 单价=4.533333333333333, 是否赠品=False
 2025-08-16 00:52:17,723 - app.core.excel.processor - INFO - 发现正常商品：条码6937003706346, 数量=15.0, 单价=4.533333333333333
 2025-08-16 00:52:17,738 - app.core.excel.processor - INFO - 处理商品: 条码=6973003703413, 数量=24.0, 单价=3.75, 是否赠品=False
 2025-08-16 00:52:17,753 - app.core.excel.processor - INFO - 发现正常商品：条码6973003703413, 数量=24.0, 单价=3.75
 2025-08-16 00:52:17,768 - app.core.excel.processor - INFO - 处理商品: 条码=6975176784785, 数量=24.0, 单价=3.75, 是否赠品=False
 2025-08-16 00:52:17,784 - app.core.excel.processor - INFO - 发现正常商品：条码6975176784785, 数量=24.0, 单价=3.75
 2025-08-16 00:52:17,800 - app.core.excel.processor - INFO - 处理商品: 条码=6937003708876, 数量=12.0, 单价=3.75, 是否赠品=False
 2025-08-16 00:52:17,815 - app.core.excel.processor - INFO - 发现正常商品：条码6937003708876, 数量=12.0, 单价=3.75
 2025-08-16 00:52:17,830 - app.core.excel.processor - INFO - 处理商品: 条码=6937003703826, 数量=12.0, 单价=6.083333333333333, 是否赠品=False
 2025-08-16 00:52:17,845 - app.core.excel.processor - INFO - 发现正常商品：条码6937003703826, 数量=12.0, 单价=6.083333333333333
 2025-08-16 00:52:17,859 - app.core.excel.processor - INFO - 处理商品: 条码=6970399920415, 数量=5.0, 单价=0, 是否赠品=True
 2025-08-16 00:52:17,876 - app.core.excel.processor - INFO - 发现赠品：条码6970399920415, 数量=5.0
 2025-08-16 00:52:17,891 - app.core.excel.processor - INFO - 分组后共8 个不同条码的商品
 2025-08-16 00:52:17,906 - app.core.excel.processor - INFO - 条码 6970399922365 处理结果：正常商品数量15.0，单价4.533333333333333，赠品数量0
 2025-08-16 00:52:17,923 - app.core.excel.processor - INFO - 条码 6937003703833 处理结果：正常商品数量15.0，单价4.533333333333333，赠品数量0
 2025-08-16 00:52:17,939 - app.core.excel.processor - INFO - 条码 6937003706346 处理结果：正常商品数量15.0，单价4.533333333333333，赠品数量0
 2025-08-16 00:52:17,955 - app.core.excel.processor - INFO - 条码 6973003703413 处理结果：正常商品数量24.0，单价3.75，赠品数量0
 2025-08-16 00:52:17,970 - app.core.excel.processor - INFO - 条码 6975176784785 处理结果：正常商品数量24.0，单价3.75，赠品数量0
 2025-08-16 00:52:17,987 - app.core.excel.processor - INFO - 条码 6937003708876 处理结果：正常商品数量12.0，单价3.75，赠品数量0
 2025-08-16 00:52:17,994 - app.core.excel.processor - INFO - 条码 6937003703826 处理结果：正常商品数量12.0，单价6.083333333333333，赠品数量0
 2025-08-16 00:52:18,012 - app.core.excel.processor - INFO - 条码 6970399920415 处理结果：只有赠品，数量=5.0
 2025-08-16 00:52:18,029 - app.core.excel.processor - INFO - 条码 6970399920415 填充：仅有赠品，采购量=0，赠品数量=5.0
 2025-08-16 00:52:18,051 - app.core.excel.processor - INFO - 采购单已保存到: data/result\采购单_7a3a78a02fcf6ccef5daad31bd50bdf2.xls
 2025-08-16 00:52:18,082 - app.core.excel.processor - INFO - 采购单已保存到: data/result\采购单_7a3a78a02fcf6ccef5daad31bd50bdf2.xls
@@ -0,0 +1,8 @@
 2025-08-16 00:52:17,127 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
 2025-08-16 00:52:17,193 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
 2025-08-16 00:52:17,268 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
 2025-08-16 00:52:17,329 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
 2025-08-16 00:52:17,381 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
 2025-08-16 00:52:17,446 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
 2025-08-16 00:52:17,508 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
 2025-08-16 00:52:17,565 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
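These entries correspond to the fix described in the commit history (strip extra trailing zeros so that barcodes never exceed the standard length). A minimal sketch of such a normalization, assuming a 13-character EAN-13 limit and that surplus characters are trailing zeros first, with plain truncation as the fallback; `normalize_barcode` is a hypothetical name and the project's validator may behave differently.

```python
MAX_BARCODE_LENGTH = 13  # assumption: EAN-13 is the longest accepted form


def normalize_barcode(raw: str, max_length: int = MAX_BARCODE_LENGTH) -> str:
    """Trim an over-long barcode to at most `max_length` characters."""
    barcode = raw.strip()
    # Drop surplus trailing zeros one at a time while the code is too long.
    while len(barcode) > max_length and barcode.endswith("0"):
        barcode = barcode[:-1]
    # Fallback: if it is still too long, keep only the leading characters.
    return barcode[:max_length]
```

Shorter codes, such as the 8-digit targets in the mapping file, pass through untouched because the function only acts when the length exceeds the limit.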
@@ -0,0 +1,3 @@
 2025-08-16 00:52:16,815 - app.services.order_service - INFO - 初始化OrderService
 2025-08-16 00:52:16,863 - app.services.order_service - INFO - OrderService初始化完成
 2025-08-16 00:52:16,867 - app.services.order_service - INFO - OrderService开始处理最新Excel文件
@@ -0,0 +1,19 @@
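 # OCR Order Processing System - Portable Edition
 ## Usage
 1. Double-click "OCR订单处理系统.exe" to start the program
 2. Put the image files to be processed in the data/input directory
 3. Results are saved to the data/output directory
 4. Log files are saved to the logs directory
 ## Notes
 - Configure the Baidu OCR API key on first run
 - Supported image formats: jpg, jpeg, png, bmp
 - Each file must not exceed 4 MB
 ## Directory layout
 - OCR订单处理系统.exe - main program
 - data/input/ - input image directory
 - data/output/ - output directory
 - logs/ - log directory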
@@ -0,0 +1,28 @@
 [API]
 api_key = O0Fgk3o69RWJ86eAX8BTHRaB
 secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
 timeout = 30
 max_retries = 3
 retry_delay = 2
 api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
 [Paths]
 input_folder = data/input
 output_folder = data/output
 temp_folder = data/temp
 template_folder = templates
 processed_record = data/processed_files.json
 [Performance]
 max_workers = 4
 batch_size = 5
 skip_existing = true
 [File]
 allowed_extensions = .jpg,.jpeg,.png,.bmp
 excel_extension = .xlsx
 max_file_size_mb = 4
 [Templates]
 purchase_order = 银豹-采购单模板.xls
@@ -0,0 +1,205 @@
 {
  "6920584471055": {
    "map_to": "6920584471017",
    "description": "条码映射：6920584471055 -> 6920584471017"
  },
  "6925861571159": {
    "map_to": "69021824",
    "description": "条码映射：6925861571159 -> 69021824"
  },
  "6923644268923": {
    "map_to": "6923644268480",
    "description": "条码映射：6923644268923 -> 6923644268480"
  },
  "6925861571466": {
    "map_to": "6925861571459",
    "description": "条码映射：6925861571466 -> 6925861571459"
  },
  "6907992508344": {
    "map_to": "6907992508191",
    "description": "条码映射：6907992508344 -> 6907992508191"
  },
  "6903979000979": {
    "map_to": "6903979000962",
    "description": "条码映射：6903979000979 -> 6903979000962"
  },
  "6923644283582": {
    "map_to": "6923644283575",
    "description": "条码映射：6923644283582 -> 6923644283575"
  },
  "6923644268930": {
    "map_to": "6923644268497",
    "description": "条码映射：6923644268930 -> 6923644268497"
  },
  "6923644268916": {
    "map_to": "6923644268503",
    "description": "条码映射：6923644268916 -> 6923644268503"
  },
  "6923644268909": {
    "map_to": "6923644268510",
    "description": "条码映射：6923644268909 -> 6923644268510"
  },
  "6923644299804": {
    "map_to": "6923644299774",
    "description": "条码映射：6923644299804 -> 6923644299774"
  },
  "6923644266318": {
    "map_to": "6923644266066",
    "description": "条码映射：6923644266318 -> 6923644266066"
  },
  "6923644210151": {
    "map_to": "6923644223458",
    "description": "条码映射：6923644210151 -> 6923644223458"
  },
  "6907992501819": {
    "map_to": "6907992500133",
    "description": "条码映射：6907992501819 -> 6907992500133"
  },
  "6907992502052": {
    "map_to": "6907992100272",
    "description": "条码映射：6907992502052 -> 6907992100272"
  },
  "6907992507385": {
    "map_to": "6907992507095",
    "description": "条码映射：6907992507385 -> 6907992507095"
  },
  "6973726149671": {
    "map_to": "6973726149657",
    "description": "条码映射：6973726149671 -> 6973726149657"
  },
  "6977426410574": {
    "map_to": "6977426410567",
    "description": "条码映射：6977426410574 -> 6977426410567"
  },
  "6973726149688": {
    "map_to": "6973726149664",
    "description": "条码映射：6973726149688 -> 6973726149664"
  },
  "6935205322012": {
    "map_to": "6935205320018",
    "description": "条码映射：6935205322012 -> 6935205320018"
  },
  "6943497411024": {
    "map_to": "6943497411017",
    "description": "条码映射：6943497411024 -> 6943497411017"
  },
  "6921734968821": {
    "map_to": "6921734968814",
    "description": "条码映射：6921734968821 -> 6921734968814"
  },
  "6921734968258": {
    "map_to": "6921734968241",
    "description": "条码映射：6921734968258 -> 6921734968241"
  },
  "6921734968180": {
    "map_to": "6921734968173",
    "description": "条码映射：6921734968180 -> 6921734968173"
  },
  "6921734908735": {
    "map_to": "6935205372772",
    "description": "条码映射：6921734908735 -> 6935205372772"
  },
  "6923644248222": {
    "map_to": "6923644248208",
    "description": "条码映射：6923644248222 -> 6923644248208"
  },
  "6902083881122": {
    "map_to": "6902083881085",
    "description": "条码映射：6902083881122 -> 6902083881085"
  },
  "6907992501857": {
    "map_to": "6907992500010",
    "description": "条码映射：6907992501857 -> 6907992500010"
  },
  "6902083891015": {
    "map_to": "6902083890636",
    "description": "条码映射：6902083891015 -> 6902083890636"
  },
  "6923450605240": {
    "map_to": "6923450605226",
    "description": "条码映射：6923450605240 -> 6923450605226"
  },
  "6923450605196": {
    "map_to": "6923450614624",
    "description": "条码映射：6923450605196 -> 6923450614624"
  },
  "6923450665213": {
    "map_to": "6923450665206",
    "description": "条码映射：6923450665213 -> 6923450665206"
  },
  "6923450666821": {
    "map_to": "6923450666838",
    "description": "条码映射：6923450666821 -> 6923450666838"
  },
  "6923450661505": {
    "map_to": "6923450661499",
    "description": "条码映射：6923450661505 -> 6923450661499"
  },
  "6923450676103": {
    "map_to": "6923450676097",
    "description": "条码映射：6923450676103 -> 6923450676097"
  },
  "6923450614631": {
    "map_to": "6923450614624",
    "description": "条码映射：6923450614631 -> 6923450614624"
  },
  "6901424334174": {
    "map_to": "6973730760015",
    "description": "条码映射：6901424334174 -> 6973730760015"
  },
  "6958620703716": {
    "map_to": "6958620703907",
    "description": "条码映射：6958620703716 -> 6958620703907"
  },
  "6937003706322": {
    "map_to": "6937003703833",
    "description": "条码映射：6937003706322 -> 6937003703833"
  },
  "6950783203494": {
    "map_to": "6950873203494",
    "description": "条码映射：6950783203494 -> 6950873203494"
  },
  "6907992501871": {
    "map_to": "6907992500010",
    "description": "条码映射：6907992501871 -> 6907992500010"
  },
  "6907992501864": {
    "map_to": "6907992100012",
    "description": "条码映射：6907992501864 -> 6907992100012"
  },
  "6923644264192": {
    "map_to": "6923644264116",
    "description": "条码映射：6923644264192 -> 6923644264116"
  },
  "6923450667316": {
    "map_to": "69042386",
    "description": "条码映射：6923450667316 -> 69042386"
  },
  "6923450653012": {
    "map_to": "69021343",
    "description": "条码映射：6923450653012 -> 69021343"
  },
  "6925019900087": {
    "multiplier": 10,
    "target_unit": "瓶",
    "description": "特殊处理：数量*10，单位转换为瓶"
  },
  "6921168593804": {
    "multiplier": 30,
    "target_unit": "瓶",
    "description": "NFC产品特殊处理：每箱30瓶"
  },
  "6901826888138": {
    "multiplier": 30,
    "target_unit": "瓶",
    "fixed_price": 3.7333333333333334,
    "specification": "1*30",
    "description": "特殊处理: 规格1*30，数量*30，单价=112/30"
  },
  "6958620703907": {
    "multiplier": 14,
    "target_unit": "个",
    "specification": "1*14",
    "description": "友臣肉松，1盒14个"
  }
 }
@@ -0,0 +1,28 @@
 [API]
 api_key = O0Fgk3o69RWJ86eAX8BTHRaB
 secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
 timeout = 30
 max_retries = 3
 retry_delay = 2
 api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
 [Paths]
 input_folder = data/input
 output_folder = data/output
 temp_folder = data/temp
 template_folder = templates
 processed_record = data/processed_files.json
 [Performance]
 max_workers = 4
 batch_size = 5
 skip_existing = true
 [File]
 allowed_extensions = .jpg,.jpeg,.png,.bmp
 excel_extension = .xlsx
 max_file_size_mb = 4
 [Templates]
 purchase_order = 银豹-采购单模板.xls
Author	SHA1	Message	Date
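houhuan	9f97ac3f21	New system	2025-08-17 15:52:49 +08:00
houhuan	3414df5317	Another update; more convenient now	2025-08-17 15:50:21 +08:00
houhuan	556f8d8020	Fix barcode validation: handle over-long barcodes during validation by stripping extra trailing zeros so that barcodes never exceed the standard length	2025-05-30 12:38:25 +08:00
houhuan	53e907411d	Fix barcode handling: change format_barcode to strip extra trailing zeros so that barcodes never exceed the standard length	2025-05-30 12:14:53 +08:00
houhuan	c9afe413f5	Fix barcode handling and quantity calculation: fix the barcode formatting function and make sure the quantity is computed correctly when it is empty	2025-05-30 12:08:06 +08:00
houhuan	5cf3eeed0f	Compute the quantity from amount and unit price when the quantity is empty; improve specification parsing	2025-05-30 11:54:08 +08:00
houhuan	ae8d479acd	Bug fixes	2025-05-30 10:25:46 +08:00
houhuan	b3c175836a	v1.1.0: version update - improved specification parsing, fixed barcode mapping, improved special-barcode handling	2025-05-30 10:24:30 +08:00
houhuan	c0fceea9dc	Final	2025-05-13 10:02:29 +08:00
houhuan	772902c919	Complete; essentially the final commit	2025-05-10 17:41:11 +08:00
houhuan	c3a0e29b19	Optimizations	2025-05-10 14:28:50 +08:00
houhuan	9b2007a995	Small update, but a fairly complete version. Keep going!	2025-05-10 13:05:02 +08:00
houhuan	4a8169ff63	## v1.5.3 (2024-03-21) - Optimized the full-workflow processing logic: - Changed the OCR logic to skip already-processed images automatically and continue - Improved error handling so an already-processed image no longer interrupts the workflow - Improved log messages to give clearer processing status feedback - Improved OCRService's process_image method: - Added a file-existence check - Added file-type validation - Added an already-processed check - Improved error handling and logging	2025-05-10 12:58:28 +08:00
houhuan	201aac35e6	Add keyboard shortcuts and log statistics	2025-05-10 12:32:10 +08:00
houhuan	f5eda6cbd8	Add case-to-bottle mapping for milk	2025-05-10 12:13:04 +08:00
houhuan	5c0b709528	Add a GUI for editing barcode mappings	2025-05-10 11:39:11 +08:00
houhuan	7b7d491663	Updated; not sure whether anything is broken	2025-05-08 21:16:58 +08:00
houhuan	390eeb67af	Add logical barcode mapping; split items sold by the case (件) into single units	2025-05-07 22:30:41 +08:00
houhuan	4c8def4b04	Update README	2025-05-07 19:29:02 +08:00
houhuan	2f088c87ca	Update and fix specification logic	2025-05-07 19:16:33 +08:00
houhuan	b9739b5267	Fix some issues	2025-05-05 19:38:51 +08:00
houhuan	0b40caaf91	Latest commit; checking before committing for anything left to optimize	2025-05-02 22:46:04 +08:00
houhuan	693c17283b	Updated the README; added version info and changelog	2025-05-02 19:58:27 +08:00
houhuan	71ca90ba6e	v1.0 official release	2025-05-02 19:05:42 +08:00
houhuan	14eeb7b39a	Mirror logs to the console; strengthen processing logic	2025-05-02 18:52:39 +08:00
houhuan	b3cecda175	excel	2025-05-02 18:17:24 +08:00
houhuan	131fff6a7d	The AI says the Excel part is fine now; trusting it for once and committing the files	2025-05-02 17:55:29 +08:00
houhuan	0035cd1893	Enhanced v2: initialize the repository; OCR part verified, backing up first	2025-05-02 17:25:47 +08:00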
		`@@ -1,2 +0,0 @@`
			`# -- coding: utf-8 --`
			`"""益选-OCR订单处理系统 UI 模块"""`