Compare commits
28 Commits
0c28031e81..v1.0.0
| Author | SHA1 | Date |
|---|---|---|
| | 9f97ac3f21 | |
| | 3414df5317 | |
| | 556f8d8020 | |
| | 53e907411d | |
| | c9afe413f5 | |
| | 5cf3eeed0f | |
| | ae8d479acd | |
| | b3c175836a | |
| | c0fceea9dc | |
| | 772902c919 | |
| | c3a0e29b19 | |
| | 9b2007a995 | |
| | 4a8169ff63 | |
| | 201aac35e6 | |
| | f5eda6cbd8 | |
| | 5c0b709528 | |
| | 7b7d491663 | |
| | 390eeb67af | |
| | 4c8def4b04 | |
| | 2f088c87ca | |
| | b9739b5267 | |
| | 0b40caaf91 | |
| | 693c17283b | |
| | 71ca90ba6e | |
| | 14eeb7b39a | |
| | b3cecda175 | |
| | 131fff6a7d | |
| | 0035cd1893 | |
@@ -1,3 +0,0 @@
# Baidu OCR API configuration
BAIDU_API_KEY=your_api_key_here
BAIDU_SECRET_KEY=your_secret_key_here
+22
-39
@@ -1,46 +1,29 @@
|
|||||||
# Environment
|
# Python缓存文件
|
||||||
.env
|
|
||||||
|
|
||||||
# Python
|
|
||||||
__pycache__/
|
__pycache__/
|
||||||
*.pyc
|
*.py[cod]
|
||||||
*.pyo
|
*$py.class
|
||||||
.pytest_cache/
|
|
||||||
.venv/
|
|
||||||
|
|
||||||
# Build & dist
|
# 虚拟环境
|
||||||
build/
|
venv/
|
||||||
dist/
|
env/
|
||||||
release/
|
ENV/
|
||||||
*.spec
|
|
||||||
|
|
||||||
# Logs & temp
|
# 日志文件
|
||||||
logs/
|
logs/*.log
|
||||||
|
logs/*.active
|
||||||
|
*.log.*
|
||||||
|
|
||||||
|
# 临时文件和缓存
|
||||||
data/temp/
|
data/temp/
|
||||||
|
data/*.bak
|
||||||
# Runtime data (all runtime outputs, caches, databases)
|
*.bak
|
||||||
data/
|
|
||||||
|
|
||||||
# Claude Code / IDE
|
|
||||||
.claude/
|
|
||||||
.playwright-mcp/
|
|
||||||
.trae/
|
|
||||||
|
|
||||||
# Old project
|
|
||||||
wework_xiaoai_bot/
|
|
||||||
|
|
||||||
# Node.js
|
|
||||||
node_modules/
|
|
||||||
|
|
||||||
# Frontend build output
|
|
||||||
web/backend/static/
|
|
||||||
|
|
||||||
# Screenshots (from testing)
|
|
||||||
*.png
|
|
||||||
|
|
||||||
# OS/IDE
|
|
||||||
.DS_Store
|
.DS_Store
|
||||||
Thumbs.db
|
|
||||||
|
# 输出文件(可选是否忽略)
|
||||||
|
# data/output/
|
||||||
|
|
||||||
|
# IDE文件
|
||||||
.idea/
|
.idea/
|
||||||
.vscode/
|
.vscode/
|
||||||
|
*.swp
|
||||||
|
*.swo
|
||||||
+24
-33
@@ -1,39 +1,30 @@
|
|||||||
# Changelog
|
# 更新日志
|
||||||
|
|
||||||
## [v2.2.0] - 2026-03-31
|
## v1.1.0 (2025-05-30)
|
||||||
### Added
|
|
||||||
- **UI Simplification**: Removed dedicated buttons for Rongcheng and Tobacco; all Excel orders now use the intelligent auto-routing.
|
|
||||||
- **Enhanced Yang Biyue Support**: Fixed column mapping for Yang Biyue orders, ensuring standard fields (Barcode, Quantity, Price) are correctly extracted.
|
|
||||||
- **Headless API Auto-Detect**: `headless_api.py` now automatically distinguishes between Image (OCR) and Excel (Direct) inputs based on file extension.
|
|
||||||
|
|
||||||
### Fixed
|
### 新特性
|
||||||
- **Yang Biyue Preprocessing**: Resolved issue where data was empty due to incorrect column renaming.
|
- 添加对特殊条码6958620703716的处理,支持同时设置规格和条码映射
|
||||||
- **Interference Filtering**: Added logic to exclude distractor columns like "Settlement Unit" or "Base Quantity" during preprocessing.
|
- 增强不规范规格格式的解析能力(如"IL*12"、"6oo*12"等)
|
||||||
|
- 支持带重量单位的规格解析(如"5kg*6")
|
||||||
|
- 添加数量为空时通过金额和单价自动计算数量的功能
|
||||||
|
|
||||||
### Removed
|
### 修复
|
||||||
- **Redundant Files**: Cleaned up `run.py`, `clean.py`, and unused CLI modules.
|
- 修复条码映射功能在特殊处理后不生效的问题
|
||||||
- **Legacy UI Elements**: Removed tobacco-specific keyboard shortcuts and help entries.
|
- 修复OrderService中缺少merge_all_purchase_orders方法导致合并采购单报错的问题
|
||||||
|
- 修复了条码映射对话框无法同时添加特殊处理和映射的问题
|
||||||
|
|
||||||
## [v2.1.0] - 2026-03-30
|
### 改进
|
||||||
### Added
|
- 改进了BarcodeMapper类,使其支持同时进行特殊处理和条码映射
|
||||||
- **Intelligent Recognition**: Automated fingerprinting for Rongcheng Yigou, Tobacco, and Yang Biyue orders.
|
- 改进了规格解析逻辑,增加了对各种单位和格式的支持
|
||||||
- **Auto-Routing**: `OrderService.process_excel` now automatically handles preprocessing without explicit flags.
|
- 添加条码映射对话框中可视化标记映射关系
|
||||||
- **Headless API Enhancements**: `headless_api.py` updated to support the new intelligent recognition mode.
|
- 更新了条码映射配置文件,增加了更多特殊条码处理
|
||||||
- **Comprehensive Documentation**: Added `OPENCLAW_GUIDE.md` and `FINAL_UPDATE_REPORT.md`.
|
- 改进商品验证器,在数量为空但单价和金额存在时,自动计算数量
|
||||||
|
|
||||||
### Fixed
|
## v1.0.0 (2025-05-01)
|
||||||
- **Rongcheng Yigou**: Fixed barcode splitting issue where quantities were incorrectly distributed (30 to 5).
|
|
||||||
- **Tobacco Orders**: Corrected unit price calculation (divided by 10) and quantity calculation (multiplied by 10).
|
|
||||||
- **Identification Failure**: Fixed issue where `header=0` caused identification keywords at the very first row to be missed.
|
|
||||||
|
|
||||||
## [v2.0.0] - 2026-03-25
|
### 初始版本
|
||||||
### Added
|
- 基础OCR识别功能
|
||||||
- **Headless API**: First release of `headless_api.py` for OpenClaw integration.
|
- Excel处理功能
|
||||||
- **Price Validation**: Integration with PosPal item data for unit price auditing.
|
- 采购单合并功能
|
||||||
- **Asynchronous Logging**: GUI now uses a queue for log output to prevent UI freezing.
|
- 烟草订单处理功能
|
||||||
|
- 图形用户界面
|
||||||
## [v1.1.0] - 2026-03-10
|
|
||||||
### Added
|
|
||||||
- **Rongcheng Yigou Support**: Initial support for Rongcheng Excel templates.
|
|
||||||
- **Tobacco Support**: Initial support for Tobacco Excel templates.
|
|
||||||
- **Excel Processor**: Refactored core processing logic into `ExcelProcessor`.
|
|
||||||
@@ -1,146 +0,0 @@
# CLAUDE.md - 益选 OCR Order Processing System

## Project Overview

The 益选 OCR Order Processing System (orc-order-v2) is a purchase-order processing tool for retail and distribution scenarios.

**Core pipeline**: image OCR → Excel normalization → template filling → merge and export

**Target system**: 银豹 (PosPal) POS system

**Tech stack**: Python 3.9+, Tkinter, Pandas, Baidu OCR API, xlrd/xlwt/openpyxl

## Project Structure

```
orc-order-v2/
├── 启动器.py  # Entry stub (~13 lines, only imports main)
├── headless_api.py  # CLI automation interface (OpenClaw integration)
├── build_exe.py  # PyInstaller packaging script
├── config.ini  # Global configuration (API keys, paths)
├── config/
│   ├── config.ini  # Copy of the configuration
│   ├── barcode_mappings.json  # Barcode mapping rules
│   └── suppliers_config.json  # Supplier configuration (column mapping / cleaning rules / calculation rules)
├── app/
│   ├── config/
│   │   ├── settings.py  # ConfigManager singleton
│   │   └── defaults.py  # Default configuration
│   ├── core/
│   │   ├── excel/
│   │   │   ├── processor.py  # ExcelProcessor - core of the normalization conversion
│   │   │   ├── converter.py  # UnitConverter - unit conversion and specification inference
│   │   │   ├── merger.py  # PurchaseOrderMerger - purchase-order merging
│   │   │   ├── validators.py  # ProductValidator
│   │   │   └── handlers/  # Barcode-mapping and unit-conversion handlers
│   │   ├── handlers/
│   │   │   ├── rule_engine.py  # Generic rule engine (split/extract/normalize/mark)
│   │   │   ├── column_mapper.py  # Column mapper
│   │   │   ├── data_cleaner.py  # Data cleaner
│   │   │   └── calculator.py  # Calculator
│   │   ├── ocr/
│   │   │   ├── table_ocr.py  # OCRProcessor
│   │   │   └── baidu_ocr.py  # BaiduOCRClient
│   │   ├── processors/
│   │   │   ├── base.py  # BaseProcessor abstract base class
│   │   │   ├── tobacco_processor.py
│   │   │   ├── ocr_processor.py
│   │   │   └── supplier_processors/
│   │   │       └── generic_supplier_processor.py
│   │   └── utils/
│   │       ├── file_utils.py  # File utilities
│   │       ├── log_utils.py  # Logging utilities
│   │       ├── string_utils.py  # String utilities
│   │       └── dialog_utils.py  # Tkinter dialog utilities
│   ├── services/
│   │   ├── order_service.py  # Order service (intelligent routing and dispatch)
│   │   ├── ocr_service.py  # OCR service
│   │   ├── processor_service.py  # Processor dispatch service
│   │   ├── tobacco_service.py  # Tobacco-company service
│   │   └── special_suppliers_service.py  # Special-supplier service (Rongcheng / Yang Biyue)
│   └── ui/  # GUI modules (split out of 启动器.py)
│       ├── error_utils.py  # L0 error dialogs
│       ├── theme.py  # L0 theme management (THEMES, create_modern_button)
│       ├── logging_ui.py  # L0 log queue and GUI log handler
│       ├── ui_widgets.py  # L0 StatusBar, ProgressReporter, center_window
│       ├── user_settings.py  # L1 user settings and recent-file management
│       ├── result_previews.py  # L1 result preview dialogs
│       ├── command_runner.py  # L1 command runner (subprocess + log redirection)
│       ├── file_operations.py  # L2 file selection / cleanup / directory operations
│       ├── action_handlers.py  # L2 business actions (OCR / Excel / merge / drag-and-drop)
│       ├── barcode_editor.py  # L2 barcode mapping editor
│       ├── config_dialog.py  # L3 system settings dialog
│       ├── shortcuts.py  # L3 keyboard shortcut bindings
│       └── main_window.py  # L4 main() builds the main window
├── templates/
│   ├── 银豹-采购单模板.xls  # Output template
│   └── 商品资料.xlsx  # Reference data for unit-price validation
├── data/
│   ├── input/  # Input files
│   ├── output/  # OCR output
│   ├── result/  # Final purchase orders
│   └── user_settings.json  # User settings
└── docs/
    └── SYSTEM_ARCHITECTURE.md  # System architecture document
```

## Commands and Running

```bash
# GUI mode
python 启动器.py

# CLI mode (OpenClaw integration)
python headless_api.py [input] [--excel|--tobacco|--rongcheng] [--barcode X --target Y]

# Build the EXE
python build_exe.py

# Update a barcode mapping
python headless_api.py --update-mapping --barcode 6920584471055 --target 6920584471017
```

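## Intelligent Supplier Detection Logic

The system routes each file automatically by scanning the first 50 rows of the Excel sheet for identifying features:

| Supplier | Identifying feature | Preprocessing logic |
|--------|----------|-----------|
| Tobacco company | "专卖证号" or "510109104938" | Map columns B/E/G/H; quantity × 10; unit price ÷ 10 |
| Rongcheng Yigou (蓉城易购) | "RCDH" | Map columns E/N/Q/S; split multi-barcode rows and divide the quantity evenly |
| Yang Biyue (杨碧月) | "经手人" + "杨碧月" | Column alignment; unit conversion (件 → 瓶) |
| Generic supplier | `suppliers_config.json` configuration | Column mapping + rule engine |
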
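The keyword routing in the table above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `detect_supplier`, `SCAN_ROWS`, the returned tags, and the order in which the fingerprints are checked are all assumptions; only the keywords and the 50-row window come from the table.

```python
from typing import Iterable, Sequence

# Number of leading rows to scan, per the description above.
SCAN_ROWS = 50


def detect_supplier(rows: Iterable[Sequence[object]]) -> str:
    """Return a supplier tag based on keywords found in the first 50 rows.

    Hypothetical helper: the tags and the check order are assumptions.
    """
    cells = []
    for index, row in enumerate(rows):
        if index >= SCAN_ROWS:
            break
        cells.extend(str(cell) for cell in row if cell is not None)
    text = "\n".join(cells)

    if "专卖证号" in text or "510109104938" in text:
        return "tobacco"
    if "RCDH" in text:
        return "rongcheng"
    if "经手人" in text and "杨碧月" in text:
        return "yangbiyue"
    # Anything else falls through to the suppliers_config.json rules.
    return "generic"
```

Because the scan works on raw cell text rather than a parsed header row, a keyword sitting on the very first row of the sheet is still seen.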
## Configuration System

- **ConfigManager** (`app/config/settings.py`): singleton; reads `config.ini` via configparser
- **Supplier configuration** (`config/suppliers_config.json`): JSON; defines column mapping, cleaning rules, and calculation rules
- **Barcode mappings** (`config/barcode_mappings.json`): barcode conversion rules that can be updated at runtime

## Key Conventions

### Output Format
- 银豹 purchase-order template: 4 columns, namely barcode (B), purchase quantity (C), gift quantity (D), and purchase unit price (E)
- Unit prices keep 4 decimal places, formatted with xlwt.XFStyle
- Purchase-order file name: `采购单_{original file name}.xls`

### Unit Conversion Rules
- "件"/"箱"/"提"/"盒" → quantity × pack size, unit price ÷ pack size, unit → "瓶"
- Gifts: rows whose price is 0 or whose amount is 0 are marked as gifts
- Barcode mapping takes precedence over unit conversion

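A small sketch of the pack-unit conversion and gift marking described above. The `OrderLine` record and `normalize_line` function are hypothetical names introduced for illustration; rounding the unit price to 4 decimals follows the output-format convention, and barcode mapping (which takes precedence) is assumed to have run before this step.

```python
from dataclasses import dataclass, replace

# Units that denote a whole pack, per the rules above.
PACK_UNITS = {"件", "箱", "提", "盒"}


@dataclass(frozen=True)
class OrderLine:  # hypothetical record type, for illustration only
    barcode: str
    quantity: float
    unit_price: float
    unit: str
    amount: float
    pack_size: int = 1
    is_gift: bool = False


def normalize_line(line: OrderLine) -> OrderLine:
    """Convert pack units to single bottles and flag zero-priced rows as gifts."""
    if line.unit in PACK_UNITS and line.pack_size > 0:
        line = replace(
            line,
            quantity=line.quantity * line.pack_size,
            unit_price=round(line.unit_price / line.pack_size, 4),
            unit="瓶",
        )
    # A row is a gift when either the price or the line amount is zero.
    if line.unit_price == 0 or line.amount == 0:
        line = replace(line, is_gift=True)
    return line
```

For example, 2 箱 at 48.0 with a pack size of 24 becomes 48 瓶 at 2.0 each, while the line amount stays unchanged.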
### Specification Inference
- Inferred from the product name: "24入纸箱" → 1*24, "450g*15" → 1*15
- Three-level specifications are supported: 1*5*12
- OCR corrections: "IL" → "1L", "6oo" → "600"

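The inference rules above could be approximated with a few regular expressions. This is a sketch under assumptions: `fix_ocr_spec` and `infer_spec` are hypothetical names, the patterns cover only the examples listed, and the real converter may handle many more formats.

```python
import re
from typing import Optional


def _zeros(match: "re.Match[str]") -> str:
    """Replace a run of letter 'o' after a digit with the same number of zeros."""
    digit = match.group(1)
    return digit + "0" * (len(match.group(0)) - 1)


def fix_ocr_spec(text: str) -> str:
    """Apply the two OCR corrections listed above."""
    # "IL" standing alone (not inside a word) is a misread "1L".
    text = re.sub(r"(?<![A-Za-z0-9])IL(?![A-Za-z])", "1L", text)
    # Letter "o" directly after a digit is a misread zero: "6oo" -> "600".
    text = re.sub(r"(\d)[oO]+(?![A-Za-z])", _zeros, text)
    return text


def infer_spec(name: str) -> Optional[str]:
    """Infer a pack specification such as "1*24" from a product name."""
    name = fix_ocr_spec(name)
    # Already a three-level specification, e.g. "1*5*12".
    m = re.search(r"\d+\*\d+\*\d+", name)
    if m:
        return m.group(0)
    # "<size><unit>*<count>", e.g. "450g*15" -> "1*15".
    m = re.search(r"\d+(?:\.\d+)?\s*(?:g|kg|ml|l)\s*\*\s*(\d+)", name, re.IGNORECASE)
    if m:
        return f"1*{m.group(1)}"
    # "<count>入", e.g. "24入纸箱" -> "1*24".
    m = re.search(r"(\d+)\s*入", name)
    if m:
        return f"1*{m.group(1)}"
    return None
```

Running the OCR corrections first means a misread such as "IL*12" is treated as "1L*12" before the size-and-count pattern is tried.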
## Known Technical Debt

1. ~~**启动器.py too large**~~ (split into 13 `app/ui/` modules; the entry stub is only 13 lines)
2. **Code duplication**: header detection, column mapping, and amount parsing are each implemented in several places
3. **Inconsistent configuration**: config.ini, suppliers_config.json, and hard-coded paths are mixed
4. **No tests**: the test directory is empty; there are no automated tests
5. **Legacy format dependency**: xlrd/xlwt only support .xls and cannot write .xlsx
6. **Plaintext API keys**: the Baidu OCR API keys in config.ini are not encrypted
7. **Hard-coded path**: `template_folder = E:\2025Code\python\orc-order-v2\templates` in config.ini
8. **Inconsistent logging**: `get_logger()` and `logging.getLogger()` are both used
@@ -0,0 +1,82 @@
# -*- mode: python ; coding: utf-8 -*-

block_cipher = None

# Data files to bundle
added_files = [
    ('config.ini', '.'),
    ('config/barcode_mappings.json', 'config/'),
    ('config/config.ini', 'config/'),
    ('templates/银豹-采购单模板.xls', 'templates/'),
    ('app', 'app'),
]

# Modules that must be listed as hidden imports
hidden_imports = [
    'tkinter',
    'tkinter.ttk',
    'tkinter.filedialog',
    'tkinter.messagebox',
    'tkinter.scrolledtext',
    'pandas',
    'numpy',
    'openpyxl',
    'xlrd',
    'xlwt',
    'xlutils',
    'requests',
    'configparser',
    'threading',
    'datetime',
    'json',
    're',
    'subprocess',
    'shutil',
    'app.config.settings',
    'app.services.ocr_service',
    'app.services.order_service',
    'app.services.tobacco_service',
    'app.core.utils.dialog_utils',
    'app.core.excel.converter',
]

a = Analysis(
    ['启动器.py'],
    pathex=[],
    binaries=[],
    datas=added_files,
    hiddenimports=hidden_imports,
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    win_no_prefer_redirects=False,
    win_private_assemblies=False,
    cipher=block_cipher,
    noarchive=False,
)

pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.zipfiles,
    a.datas,
    [],
    name='OCR订单处理系统',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    upx_exclude=[],
    runtime_tmpdir=None,
    console=False,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)
@@ -1,206 +1,43 @@
|
|||||||
# 益选 OCR 订单处理系统
|
# 益选-OCR订单处理系统
|
||||||
|
|
||||||
面向零售与分销场景的采购单处理工具,支持图片 OCR → Excel 规范化 → 模板填充 → 合并导出全流程,输出适配银豹 (PosPal) POS 系统。
|
一个集OCR识别、Excel处理和订单合并功能于一体的采购单处理系统。
|
||||||
|
|
||||||
## 核心功能
|
## 主要功能
|
||||||
|
|
||||||
- **智能供应商识别**:自动扫描 Excel 前 50 行内容特征,路由到对应的预处理逻辑(蓉城易购、烟草公司、杨碧月等)
|
- **OCR识别**:识别图片中的商品信息,包括条码、名称、数量、单价等
|
||||||
- **图片 OCR**:调用百度 OCR 表格识别 API,将采购单图片转为结构化 Excel
|
- **Excel处理**:将OCR识别结果处理成规范的Excel采购单
|
||||||
- **规则引擎**:支持列映射、数据清洗、单位转换、规格推断、赠品标记等自动化规则
|
- **采购单合并**:合并多个采购单,汇总相同商品
|
||||||
- **条码映射**:可配置的条码转换规则,支持运行时编辑和云端同步
|
- **条码映射**:支持将特定条码映射为其他条码,适应不同系统要求
|
||||||
- **单价校验**:自动比对 `商品资料.xlsx`,价差超过 1.0 元触发预警
|
- **规格处理**:智能解析商品规格,实现单位自动转换
|
||||||
- **云端同步**:通过 Gitea REST API 在多台设备间同步配置文件(条码映射、供应商配置、商品资料、采购模板)
|
- **烟草订单处理**:专门处理烟草公司订单
|
||||||
- **拖拽一键处理**:拖入图片或 Excel 自动走完 OCR → 规范化 → 合并全流程
|
|
||||||
- **CLI 接口**:`headless_api.py` 支持无界面自动化调用
|
|
||||||
|
|
||||||
## 快速开始
|
## 技术特点
|
||||||
|
|
||||||
### 桌面端 (GUI / CLI)
|
- 基于Python开发,使用Tkinter构建图形界面
|
||||||
|
- 采用模块化设计,易于扩展和维护
|
||||||
|
- 自动处理各种不规范数据格式
|
||||||
|
- 配置文件支持,可自定义各种处理参数
|
||||||
|
- 日志记录,便于问题排查
|
||||||
|
|
||||||
```bash
|
## 使用方法
|
||||||
# 安装依赖
|
|
||||||
pip install -r requirements.txt
|
|
||||||
|
|
||||||
# GUI 模式
|
1. 运行`启动器.py`打开主界面
|
||||||
python 启动器.py
|
2. 根据需要选择相应功能按钮
|
||||||
|
3. 按照提示操作,完成数据处理
|
||||||
|
|
||||||
# CLI 模式
|
## 系统要求
|
||||||
python headless_api.py data/input/xxx.xlsx
|
|
||||||
python headless_api.py data/input/xxx.jpg --barcode 6920584471055 --target 6920584471017
|
|
||||||
|
|
||||||
# 打包 EXE
|
- Python 3.8+
|
||||||
python build_exe.py
|
- 所需第三方库:详见`requirements.txt`
|
||||||
```
|
|
||||||
|
|
||||||
### Web 端
|
## 最近更新
|
||||||
|
|
||||||
```bash
|
请查看[更新日志](CHANGELOG.md)了解最新版本变更。
|
||||||
# 后端依赖
|
|
||||||
cd web/backend && pip install -r requirements.txt
|
|
||||||
|
|
||||||
# 前端依赖
|
## 贡献者
|
||||||
cd web/frontend && npm install
|
|
||||||
|
|
||||||
# 启动后端 (端口 8000)
|
- 欢欢欢
|
||||||
cd web && python -m uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload
|
|
||||||
|
|
||||||
# 启动前端开发服务器 (端口 5173)
|
## 版权
|
||||||
cd web/frontend && npm run dev
|
|
||||||
|
|
||||||
# 构建前端到后端静态目录
|
© 2025 益选-OCR订单处理系统
|
||||||
cd web/frontend && npm run build
|
|
||||||
# 构建后直接访问 http://localhost:8000 即可
|
|
||||||
|
|
||||||
# 生产部署 (仅后端,前端已内嵌)
|
|
||||||
cd web && python -m uvicorn backend.main:app --host 0.0.0.0 --port 8000
|
|
||||||
```
|
|
||||||
|
|
||||||
**默认账号:** `admin` / `admin123`(首次登录后建议修改密码)
|
|
||||||
|
|
||||||
## Project Structure

```
├── 启动器.py  # GUI entry point
├── headless_api.py  # CLI automation interface
├── config.ini  # Global configuration (API keys, paths, Gitea)
├── config/
│   ├── config.ini  # Copy of the configuration
│   ├── barcode_mappings.json  # Barcode mapping rules
│   └── suppliers_config.json  # Supplier configuration (column mapping / rule engine)
├── app/
│   ├── config/  # Configuration management (ConfigManager singleton)
│   ├── core/
│   │   ├── excel/  # Excel processing (normalization, conversion, merging, validation)
│   │   ├── handlers/  # Rule engine, column mapping, data cleaning, calculator
│   │   ├── ocr/  # Baidu OCR client
│   │   ├── processors/  # Processors (generic / tobacco / OCR)
│   │   └── utils/  # Utilities (logging, files, strings, cloud sync, dialogs)
│   ├── services/  # Business services (orders, OCR, processor dispatch)
│   └── ui/  # GUI modules (theme, logging, shortcuts, main window)
├── web/  # Web app
│   ├── backend/
│   │   ├── main.py  # FastAPI entry point
│   │   ├── auth/  # JWT authentication (login, tokens, permissions)
│   │   ├── routers/  # API routes (files, processing, memory, barcodes, sync, tasks, logs)
│   │   ├── services/  # Backend services (task management, database, file sync)
│   │   └── middleware/  # HTTP logging middleware
│   └── frontend/
│       ├── src/
│       │   ├── views/  # Pages (Dashboard, Layout, file management, tasks, logs, etc.)
│       │   ├── stores/  # Pinia state management (auth, processing)
│       │   ├── composables/  # Shared logic (useDebounce, useFileUtils, useFilePreview)
│       │   ├── api.ts  # Axios wrapper
│       │   └── router/  # Vue Router routes
│       ├── package.json
│       └── vite.config.ts
├── templates/
│   ├── 银豹-采购单模板.xls  # Output template (barcode / purchase quantity / gift quantity / unit price)
│   └── 商品资料.xlsx  # Reference data for unit-price validation
├── data/
│   ├── input/  # Input files
│   ├── output/  # OCR output
│   ├── result/  # Final purchase orders
│   └── web_data.db  # Web database (SQLite)
└── tests/  # Unit tests (191)
```

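## Web Features

A browser-based management interface built on Vue 3 + Element Plus + FastAPI. It shares the same `data/` directory with the desktop app.

### Processing Center (Dashboard)

- **One-click full pipeline**: after uploading images or Excel files, run OCR → normalization → merge in one click
- **Batch OCR / batch processing**: the OCR step or the Excel normalization step can be run on its own
- **Real-time progress**: task progress, logs, and status changes are pushed over WebSocket
- **Multi-task monitoring**: view the progress and logs of several running tasks at once
- **Task retry**: failed tasks show error details and can be retried in one click

### File Management

- **Image processing**: manage image files in `data/input/`; supports upload, preview, batch OCR, and batch purchase-order generation
- **Spreadsheet processing**: manage Excel files in `data/output/`; supports upload, preview, and batch normalization
- **Purchase-order management**: manage purchase orders in `data/result/`; supports preview, download, merge, and batch delete
- **Real-time sync**: on page load, files on disk are synced to the database automatically, so new files are visible immediately
- **Clear processing cache**: delete already-processed output files so they can be processed again

### Tasks and Logs

- **Task history**: view the status, progress, and logs of all processing tasks; filter by status and type
- **HTTP logs**: record every API request; filter by method and status code

### Memory Store

- **Product memory**: automatically learns product information from OCR and processing results
- **Confidence system**: automatically rates how reliable a memory entry is based on how often it appears
- **Search and management**: search, edit, and delete memory entries

### Barcode Mapping

- **Mapping rule management**: add, edit, and delete barcode conversion rules
- **Batch operations**: batch import and batch delete of mappings

### Cloud Sync

- **Gitea sync**: sync configuration files across devices through the Gitea REST API
- **One-click push/pull**: choose files to push or pull; no git client is required

### System Configuration

- **Configuration editing**: edit the system configuration (API keys, paths, parameters) in the browser
- **Change password**: change the Web login password

### UI/UX

- **Responsive layout**: adapts to desktop and mobile; small screens switch to drawer navigation automatically
- **Global error handling**: uncaught Vue errors automatically show a user-facing message
- **Form validation**: operations such as changing the password have full input validation
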
## Intelligent Supplier Routing

| Supplier | Identifying feature | Processing logic |
|--------|----------|----------|
| Tobacco company | "专卖证号" or "510109104938" | Map columns B/E/G/H; quantity × 10; unit price ÷ 10 |
| Rongcheng Yigou (蓉城易购) | "RCDH" | Map columns E/N/Q/S; split multi-barcode rows and divide the quantity evenly |
| Yang Biyue (杨碧月) | "经手人" + "杨碧月" | Column alignment; unit conversion (件 → 瓶) |
| Generic supplier | `suppliers_config.json` configuration | Column mapping + rule engine |

## Cloud Sync

Configuration is synced across devices through the Gitea REST API; no git client is required.

**Files that can be synced:**
- Barcode mappings (`barcode_mappings.json`)
- Supplier configuration (`suppliers_config.json`)
- Product data (`templates/商品资料.xlsx`)
- Purchase-order template (`templates/银豹-采购单模板.xls`)

**How to configure:**
1. System settings → fill in the Gitea address, repository information, and access token
2. Main window → "Cloud Sync" button → choose the files to push or pull

**Gitea repository:** `https://gitea.94kan.cn/houhuan/yixuan-sync-data`

## Configuration Reference

| Setting | File | Description |
|--------|------|------|
| API keys | `.env` or `config.ini` | Baidu OCR API; environment variables take precedence |
| Gitea token | `.env` or `config.ini` | Cloud sync token; environment variables take precedence |
| Supplier rules | `config/suppliers_config.json` | Column mapping, cleaning rules, calculation rules |
| Barcode mappings | `config/barcode_mappings.json` | Barcode conversion rules; can be updated at runtime |

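The "environment variables take precedence" behaviour in the table above can be illustrated with a short lookup helper. This is only a sketch: `get_secret` and the `[API]` section and `api_key` option used in the usage note are assumed names, not the project's actual layout. `BAIDU_API_KEY` is the variable name shown in the removed `.env.example`.

```python
import configparser
import os
from typing import Mapping, Optional


def get_secret(
    env_name: str,
    config: configparser.ConfigParser,
    section: str,
    option: str,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the environment value if set and non-empty, else the config.ini value.

    Hypothetical helper; `environ` can be injected to make the lookup testable.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(env_name, "").strip()
    if value:
        return value
    # fallback=None also covers a missing section or option.
    return config.get(section, option, fallback=None)
```

A call such as `get_secret("BAIDU_API_KEY", cfg, "API", "api_key")` would then return the exported variable when present and fall back to the file otherwise, which keeps secrets out of a committed `config.ini`.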
## Build and Packaging

```bash
pip install pyinstaller
python build_exe.py
# Output: dist/OCR订单处理系统.exe
# Portable package: release/OCR订单处理系统.exe (includes the template and product data)
```

## Tests

```bash
python -m pytest tests/ -v
```
@@ -0,0 +1,5 @@
"""
OCR Order Processing System - Command-Line Interface
----------------------------------------------------
Provides command-line tools that make the system's features easy to use.
"""
@@ -0,0 +1,138 @@
"""
Excel processing command-line tool
----------------------------------
Provides the command-line interface for Excel processing.
"""

import os
import sys
import argparse
from typing import List, Optional

from ..config.settings import ConfigManager
from ..core.utils.log_utils import get_logger, close_logger
from ..services.order_service import OrderService

logger = get_logger(__name__)

def create_parser() -> argparse.ArgumentParser:
    """
    Create the command-line argument parser.

    Returns:
        The argument parser.
    """
    parser = argparse.ArgumentParser(description='Excel processing tool')

    # Common options
    parser.add_argument('--config', type=str, help='Path to the configuration file')

    # Subcommands
    subparsers = parser.add_subparsers(dest='command', help='Subcommand')

    # "process" command: process an Excel file
    process_parser = subparsers.add_parser('process', help='Process an Excel file')
    process_parser.add_argument('--input', type=str, help='Path to the input Excel file; if omitted, the latest file is processed')

    # "list" command: show the latest Excel file
    list_parser = subparsers.add_parser('list', help='Get the latest Excel file')

    return parser

def process_excel(order_service: OrderService, input_file: Optional[str] = None) -> bool:
    """
    Process an Excel file.

    Args:
        order_service: The order service.
        input_file: Path to the input file; if None, the latest file is processed.

    Returns:
        Whether processing succeeded.
    """
    if input_file:
        if not os.path.exists(input_file):
            logger.error(f"Input file does not exist: {input_file}")
            return False

        result = order_service.process_excel(input_file)
    else:
        latest_file = order_service.get_latest_excel()
        if not latest_file:
            logger.warning("No Excel file available to process")
            return False

        logger.info(f"Processing the latest Excel file: {latest_file}")
        result = order_service.process_excel(latest_file)

    if result:
        logger.info(f"Processing succeeded, output file: {result}")
        return True
    else:
        logger.error("Processing failed")
        return False

def list_latest_excel(order_service: OrderService) -> bool:
    """
    Get the latest Excel file.

    Args:
        order_service: The order service.

    Returns:
        Whether an Excel file was found.
    """
    latest_file = order_service.get_latest_excel()

    if latest_file:
        logger.info(f"Latest Excel file: {latest_file}")
        return True
    else:
        logger.info("No Excel file found")
        return False

def main(args: Optional[List[str]] = None) -> int:
    """
    Main function of the Excel processing command line.

    Args:
        args: Command-line arguments; if None, sys.argv is used.

    Returns:
        The exit status code.
    """
    parser = create_parser()
    parsed_args = parser.parse_args(args)

    if parsed_args.command is None:
        parser.print_help()
        return 1

    try:
        # Create the configuration manager
        config = ConfigManager(parsed_args.config) if parsed_args.config else ConfigManager()

        # Create the order service
        order_service = OrderService(config)

        # Dispatch on the subcommand
        if parsed_args.command == 'process':
            success = process_excel(order_service, parsed_args.input)
        elif parsed_args.command == 'list':
            success = list_latest_excel(order_service)
        else:
            parser.print_help()
            return 1

        return 0 if success else 1

    except Exception as e:
        logger.error(f"An error occurred during execution: {e}")
        return 1

    finally:
        # Close the logger
        close_logger(__name__)

if __name__ == '__main__':
    sys.exit(main())
@@ -0,0 +1,147 @@
"""
Order merging command-line tool
-------------------------------
Provides the command-line interface for merging orders.
"""

import os
import sys
import argparse
from typing import List, Optional

from ..config.settings import ConfigManager
from ..core.utils.log_utils import get_logger, close_logger
from ..services.order_service import OrderService

logger = get_logger(__name__)

def create_parser() -> argparse.ArgumentParser:
    """
    Create the command-line argument parser.

    Returns:
        The argument parser.
    """
    parser = argparse.ArgumentParser(description='Order merging tool')

    # Common options
    parser.add_argument('--config', type=str, help='Path to the configuration file')

    # Subcommands
    subparsers = parser.add_subparsers(dest='command', help='Subcommand')

    # "merge" command
    merge_parser = subparsers.add_parser('merge', help='Merge purchase orders')
    merge_parser.add_argument('--input', type=str, help='Comma-separated list of purchase-order file paths; if omitted, all purchase orders are merged')

    # "list" command: list purchase-order files
    list_parser = subparsers.add_parser('list', help='List purchase-order files')

    return parser

def merge_orders(order_service: OrderService, input_files: Optional[str] = None) -> bool:
    """
    Merge purchase orders.

    Args:
        order_service: The order service.
        input_files: Comma-separated list of input file paths; if None, all purchase orders are merged.

    Returns:
        Whether the merge succeeded.
    """
    if input_files:
        # Split the list of input files
        file_paths = [path.strip() for path in input_files.split(',')]

        # Check that every file exists
        for path in file_paths:
            if not os.path.exists(path):
                logger.error(f"Input file does not exist: {path}")
                return False

        result = order_service.merge_orders(file_paths)
    else:
        # Collect all purchase-order files
        file_paths = order_service.get_purchase_orders()
        if not file_paths:
            logger.warning("No purchase-order files found")
            return False

        logger.info(f"Merging {len(file_paths)} purchase-order files")
        result = order_service.merge_orders()

    if result:
        logger.info(f"Merge succeeded, output file: {result}")
        return True
    else:
        logger.error("Merge failed")
        return False

def list_purchase_orders(order_service: OrderService) -> bool:
    """
    List purchase-order files.

    Args:
        order_service: The order service.

    Returns:
        Whether any purchase-order files exist.
    """
    files = order_service.get_purchase_orders()

    if not files:
        logger.info("No purchase-order files found")
        return False

    logger.info(f"Purchase-order files ({len(files)}):")
    for file in files:
        logger.info(f" {file}")

    return True

def main(args: Optional[List[str]] = None) -> int:
    """
    Main function of the order merging command line.

    Args:
        args: Command-line arguments; if None, sys.argv is used.

    Returns:
        The exit status code.
    """
    parser = create_parser()
    parsed_args = parser.parse_args(args)

    if parsed_args.command is None:
        parser.print_help()
        return 1

    try:
        # Create the configuration manager
        config = ConfigManager(parsed_args.config) if parsed_args.config else ConfigManager()

        # Create the order service
        order_service = OrderService(config)

        # Dispatch on the subcommand
        if parsed_args.command == 'merge':
            success = merge_orders(order_service, parsed_args.input)
        elif parsed_args.command == 'list':
            success = list_purchase_orders(order_service)
        else:
            parser.print_help()
            return 1

        return 0 if success else 1

    except Exception as e:
        logger.error(f"An error occurred during execution: {e}")
        return 1

    finally:
        # Close the logger
        close_logger(__name__)

if __name__ == '__main__':
    sys.exit(main())
@@ -0,0 +1,164 @@
|
|||||||
|
"""
|
||||||
|
OCR命令行工具
|
||||||
|
----------
|
||||||
|
提供OCR识别相关的命令行接口。
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import argparse
|
||||||
|
from typing import List, Optional
|
||||||
|
|
||||||
|
from ..config.settings import ConfigManager
|
||||||
|
from ..core.utils.log_utils import get_logger, close_logger
|
||||||
|
from ..services.ocr_service import OCRService
|
||||||
|
|
||||||
|
logger = get_logger(__name__)
|
||||||
|
|
||||||
|
def create_parser() -> argparse.ArgumentParser:
|
||||||
|
"""
|
||||||
|
创建命令行参数解析器
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
参数解析器
|
||||||
|
"""
|
||||||
|
parser = argparse.ArgumentParser(description='OCR识别工具')
|
||||||
|
|
||||||
|
# 通用选项
|
||||||
|
parser.add_argument('--config', type=str, help='配置文件路径')
|
||||||
|
|
||||||
|
# 子命令
|
||||||
|
subparsers = parser.add_subparsers(dest='command', help='子命令')
|
||||||
|
|
||||||
|
# 单文件处理命令
|
||||||
|
process_parser = subparsers.add_parser('process', help='处理单个文件')
|
||||||
|
process_parser.add_argument('--input', type=str, required=True, help='输入图片文件路径')
|
||||||
|
|
||||||
|
# 批量处理命令
|
||||||
|
batch_parser = subparsers.add_parser('batch', help='批量处理文件')
|
||||||
|
batch_parser.add_argument('--batch-size', type=int, help='批处理大小')
|
||||||
|
batch_parser.add_argument('--max-workers', type=int, help='最大线程数')
|
||||||
|
|
||||||
|
# 查看未处理文件命令
|
||||||
|
list_parser = subparsers.add_parser('list', help='列出未处理的文件')
|
||||||
|
|
||||||
|
return parser
|
||||||
|
|
||||||
|
def process_file(ocr_service: OCRService, input_file: str) -> bool:
|
||||||
|
"""
|
||||||
|
处理单个文件
|
||||||
|
|
||||||
|
Args:
|
||||||
|
ocr_service: OCR服务
|
||||||
|
input_file: 输入文件路径
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
处理是否成功
|
||||||
|
"""
|
||||||
|
if not os.path.exists(input_file):
|
||||||
|
logger.error(f"输入文件不存在: {input_file}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
if not ocr_service.validate_image(input_file):
|
||||||
|
logger.error(f"输入文件无效: {input_file}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
result = ocr_service.process_image(input_file)
|
||||||
|
|
||||||
|
if result:
|
||||||
|
logger.info(f"处理成功,输出文件: {result}")
|
||||||
|
return True
|
||||||
|
else:
|
||||||
|
logger.error("处理失败")
|
||||||
|
return False
|
||||||
|
|
||||||
|
+def process_batch(ocr_service: OCRService, batch_size: Optional[int] = None, max_workers: Optional[int] = None) -> bool:
+    """
+    Process files in batch.
+
+    Args:
+        ocr_service: OCR service
+        batch_size: Batch size
+        max_workers: Maximum number of worker threads
+
+    Returns:
+        Whether processing succeeded
+    """
+    total, success = ocr_service.process_images_batch(batch_size, max_workers)
+
+    if total == 0:
+        logger.warning("No files found to process")
+        return False
+
+    logger.info(f"Batch processing finished, total: {total}, succeeded: {success}")
+    return success > 0
+
+def list_unprocessed(ocr_service: OCRService) -> bool:
+    """
+    List unprocessed files.
+
+    Args:
+        ocr_service: OCR service
+
+    Returns:
+        Whether there are any unprocessed files
+    """
+    files = ocr_service.get_unprocessed_images()
+
+    if not files:
+        logger.info("No unprocessed files")
+        return False
+
+    logger.info(f"Unprocessed files ({len(files)}):")
+    for file in files:
+        logger.info(f" {file}")
+
+    return True
+
+def main(args: Optional[List[str]] = None) -> int:
+    """
+    Main entry point of the OCR command-line tool.
+
+    Args:
+        args: Command-line arguments; if None, sys.argv is used
+
+    Returns:
+        Exit status code
+    """
+    parser = create_parser()
+    parsed_args = parser.parse_args(args)
+
+    if parsed_args.command is None:
+        parser.print_help()
+        return 1
+
+    try:
+        # Create the configuration manager
+        config = ConfigManager(parsed_args.config) if parsed_args.config else ConfigManager()
+
+        # Create the OCR service
+        ocr_service = OCRService(config)
+
+        # Dispatch on the subcommand
+        if parsed_args.command == 'process':
+            success = process_file(ocr_service, parsed_args.input)
+        elif parsed_args.command == 'batch':
+            success = process_batch(ocr_service, parsed_args.batch_size, parsed_args.max_workers)
+        elif parsed_args.command == 'list':
+            success = list_unprocessed(ocr_service)
+        else:
+            parser.print_help()
+            return 1
+
+        return 0 if success else 1
+
+    except Exception as e:
+        logger.error(f"Error during execution: {e}")
+        return 1
+
+    finally:
+        # Shut down logging
+        close_logger(__name__)
+
+if __name__ == '__main__':
+    sys.exit(main())
+3
-16
@@ -12,20 +12,14 @@ DEFAULT_CONFIG = {
         'timeout': '30',
         'max_retries': '3',
         'retry_delay': '2',
-        'api_url': 'https://aip.baidubce.com/rest/2.0/ocr/v1/table',
-        'token_url': 'https://aip.baidubce.com/oauth/2.0/token',
-        'form_ocr_url': 'https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/get_request_result'
+        'api_url': 'https://aip.baidubce.com/rest/2.0/ocr/v1/table'
     },
     'Paths': {
         'input_folder': 'data/input',
         'output_folder': 'data/output',
-        'result_folder': 'data/result',
         'temp_folder': 'data/temp',
         'template_folder': 'templates',
-        'template_file': '银豹-采购单模板.xls',
-        'processed_record': 'data/processed_files.json',
-        'data_dir': 'data',
-        'product_db': 'data/product_cache.db'
+        'processed_record': 'data/processed_files.json'
     },
     'Performance': {
         'max_workers': '4',
@@ -38,13 +32,6 @@ DEFAULT_CONFIG = {
         'max_file_size_mb': '4'
     },
     'Templates': {
-        'purchase_order': '银豹-采购单模板.xls',
-        'item_data': '商品资料.xlsx'
-    },
-    'Gitea': {
-        'base_url': 'https://gitea.94kan.cn',
-        'owner': 'houhuan',
-        'repo': 'yixuan-sync-data',
-        'token': ''
+        'purchase_order': '银豹-采购单模板.xls'
     }
 }
+20
-65
@@ -6,16 +6,12 @@
 
 import os
 import configparser
+import logging
 from typing import Dict, List, Optional, Any
 
-from dotenv import load_dotenv
-from ..core.utils.log_utils import get_logger
 from .defaults import DEFAULT_CONFIG
 
-# Load the .env file
-load_dotenv()
-
-logger = get_logger(__name__)
+logger = logging.getLogger(__name__)
 
 class ConfigManager:
     """
@@ -33,23 +29,13 @@ class ConfigManager:
 
     def _init(self, config_file):
         """Initialize the configuration manager"""
-        # Compute the application root directory (without relying on os.getcwd())
-        import sys
-        if getattr(sys, 'frozen', False):
-            # When packaged with PyInstaller, the root is the directory containing the exe
-            self.app_root = os.path.dirname(sys.executable)
-        else:
-            # When running from source, the root is two levels above app/config/
-            self.app_root = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
-
-        self.config_file = config_file or os.path.join(self.app_root, 'config.ini')
+        self.config_file = config_file or 'config.ini'
         self.config = configparser.ConfigParser()
         self.load_config()
 
     def load_config(self) -> None:
         """
         Load the configuration file; create a default configuration if it does not exist
-        API keys are read from environment variables (.env) first
         """
         if not os.path.exists(self.config_file):
             self.create_default_config()
@@ -67,9 +53,6 @@ class ConfigManager:
                     if not self.config.has_option(section, option):
                         self.config.set(section, option, value)
 
-            # API keys are read from environment variables first
-            self._override_from_env()
-
             # Save the updated configuration
             self.save_config()
             logger.info(f"Loaded and updated configuration file: {self.config_file}")
@@ -78,18 +61,6 @@ class ConfigManager:
             logger.info("Using default configuration")
             self.create_default_config(save=False)
 
-    def _override_from_env(self) -> None:
-        """Override sensitive settings from environment variables"""
-        env_mapping = {
-            ('API', 'api_key'): 'BAIDU_API_KEY',
-            ('API', 'secret_key'): 'BAIDU_SECRET_KEY',
-            ('Gitea', 'token'): 'GITEA_TOKEN',
-        }
-        for (section, option), env_key in env_mapping.items():
-            env_val = os.getenv(env_key, '').strip()
-            if env_val:
-                self.config.set(section, option, env_val)
-
     def create_default_config(self, save: bool = True) -> None:
         """Create the default configuration"""
         for section, options in DEFAULT_CONFIG.items():
@@ -104,25 +75,13 @@ class ConfigManager:
             logger.info(f"Created default configuration file: {self.config_file}")
 
     def save_config(self) -> None:
-        """Save the configuration to the file (API keys are not written to the file)"""
-        # Temporarily clear the API keys before saving so they are not written to the file
-        saved_keys = {}
-        for option in ('api_key', 'secret_key'):
-            try:
-                saved_keys[option] = self.config.get('API', option, fallback='')
-            except Exception:
-                saved_keys[option] = ''
-            self.config.set('API', option, '')
-
+        """Save the configuration to the file"""
         try:
             with open(self.config_file, 'w', encoding='utf-8') as f:
                 self.config.write(f)
             logger.info(f"Configuration saved to: {self.config_file}")
-        finally:
-            # Restore the in-memory values (even if the write failed)
-            for option, val in saved_keys.items():
-                if val:
-                    self.config.set('API', option, val)
+        except Exception as e:
+            logger.error(f"Error while saving the configuration file: {e}")
 
     def get(self, section: str, option: str, fallback: Any = None) -> Any:
         """Get a configuration value"""
@@ -158,29 +117,25 @@ class ConfigManager:
         Get a path setting and ensure it is a valid absolute path
         If create is True, the directory is created automatically
         """
-        from pathlib import Path
-        path_str = self.get(section, option, fallback)
-        path = Path(path_str)
+        path = self.get(section, option, fallback)
 
-        if not path.is_absolute():
-            # Relative path: convert to an absolute path (relative to the application root)
-            path = Path(self.app_root) / path
+        if not os.path.isabs(path):
+            # Relative path: convert to an absolute path
+            path = os.path.abspath(path)
 
-        if create:
+        if create and not os.path.exists(path):
             try:
-                # Decide whether the path is a file or a directory
-                # If it has a suffix, treat it as a file and create its parent directory
-                if path.suffix:
-                    directory = path.parent
-                    if not directory.exists():
-                        directory.mkdir(parents=True, exist_ok=True)
-                        logger.info(f"Created parent directory: {directory}")
+                # If it is a file path, create its parent directory
+                if '.' in os.path.basename(path):
+                    directory = os.path.dirname(path)
+                    if directory and not os.path.exists(directory):
+                        os.makedirs(directory, exist_ok=True)
+                        logger.info(f"Created directory: {directory}")
                 else:
                     # Otherwise treat it as a directory path
-                    if not path.exists():
-                        path.mkdir(parents=True, exist_ok=True)
-                        logger.info(f"Created directory: {path}")
+                    os.makedirs(path, exist_ok=True)
+                    logger.info(f"Created directory: {path}")
             except Exception as e:
                 logger.error(f"Failed to create directory: {path}, error: {e}")
 
-        return str(path.absolute())
+        return path
@@ -1,609 +0,0 @@
-"""
-Product data SQLite database + product memory store
-
-Memory store features:
-- Automatically learns product data after each processing step (confidence + consistency boost)
-- Fills in missing OCR fields from the memory store (adopted directly when conf > 50)
-- Price anomaly detection: a deviation of more than 2x triggers completion; a deviation of more than 50% logs a warning
-- Batch preload → in-memory operations → batch write-back, for performance
-"""
-
-import os
-import json
-import sqlite3
-from datetime import datetime
-from typing import Dict, List, Optional, Tuple, Callable
-
-import pandas as pd
-
-from ..utils.log_utils import get_logger
-from ..utils.file_utils import smart_read_excel
-from ...core.handlers.column_mapper import ColumnMapper
-
-logger = get_logger(__name__)
-
-
-class ProductDatabase:
-    """Product data SQLite database + product memory store"""
-
-    SCHEMA = """
-    CREATE TABLE IF NOT EXISTS products (
-        barcode TEXT PRIMARY KEY,
-        name TEXT DEFAULT '',
-        price REAL DEFAULT 0.0,
-        unit TEXT DEFAULT '',
-        updated_at TEXT,
-        specification TEXT DEFAULT '',
-        source TEXT DEFAULT 'template',
-        confidence INTEGER DEFAULT 0,
-        usage_count INTEGER DEFAULT 0,
-        last_seen TEXT,
-        avg_price REAL DEFAULT 0.0,
-        min_price REAL DEFAULT 0.0,
-        max_price REAL DEFAULT 0.0,
-        price_count INTEGER DEFAULT 0
-    );
-    """
-
-    _NEW_COLUMNS = {
-        'specification': "TEXT DEFAULT ''",
-        'source': "TEXT DEFAULT 'template'",
-        'confidence': 'INTEGER DEFAULT 0',
-        'usage_count': 'INTEGER DEFAULT 0',
-        'last_seen': 'TEXT',
-        'avg_price': 'REAL DEFAULT 0.0',
-        'min_price': 'REAL DEFAULT 0.0',
-        'max_price': 'REAL DEFAULT 0.0',
-        'price_count': 'INTEGER DEFAULT 0',
-    }
-
-    def __init__(self, db_path: str, excel_source: str):
-        self.db_path = db_path
-        self.excel_source = excel_source
-        self._ensure_db()
-
-    def _connect(self) -> sqlite3.Connection:
-        return sqlite3.connect(self.db_path)
-
-    def _ensure_db(self):
-        if os.path.exists(self.db_path):
-            self._migrate_schema()
-            return
-        if not os.path.exists(self.excel_source):
-            logger.warning(f"Product data Excel file does not exist: {self.excel_source}")
-            self._create_empty_db()
-            return
-        logger.info(f"First run, importing product data from Excel: {self.excel_source}")
-        os.makedirs(os.path.dirname(self.db_path), exist_ok=True)
-        self._create_empty_db()
-        count = self.import_from_excel(self.excel_source)
-        logger.info(f"Product data import finished: {count} records")
-
-    def _create_empty_db(self):
-        conn = self._connect()
-        try:
-            conn.executescript(self.SCHEMA)
-            conn.commit()
-        finally:
-            conn.close()
-
-    def _migrate_schema(self):
-        conn = self._connect()
-        try:
-            cursor = conn.execute("PRAGMA table_info(products)")
-            existing_cols = {row[1] for row in cursor.fetchall()}
-            for col_name, col_type in self._NEW_COLUMNS.items():
-                if col_name not in existing_cols:
-                    conn.execute(f"ALTER TABLE products ADD COLUMN {col_name} {col_type}")
-                    logger.info(f"Database migration: added column {col_name}")
-            conn.commit()
-        finally:
-            conn.close()
-
-    # ══════════════════════════════════════════════════════════════
-    # Import
-    # ══════════════════════════════════════════════════════════════
-
-    def import_from_excel(self, excel_path: str) -> int:
-        df = smart_read_excel(excel_path)
-        if df is None or df.empty:
-            return 0
-        barcode_col = ColumnMapper.find_column(list(df.columns), 'barcode')
-        if not barcode_col:
-            return 0
-        price_col = ColumnMapper.find_column(list(df.columns), 'unit_price')
-        if not price_col:
-            for col in df.columns:
-                if '进货价' in str(col).strip():
-                    price_col = col
-                    break
-        name_col = ColumnMapper.find_column(list(df.columns), 'name')
-        unit_col = ColumnMapper.find_column(list(df.columns), 'unit')
-        spec_col = ColumnMapper.find_column(list(df.columns), 'specification')
-
-        now = datetime.now().isoformat()
-        rows = []
-        for _, row in df.iterrows():
-            barcode = str(row.get(barcode_col, '')).strip()
-            if not barcode or barcode == 'nan':
-                continue
-            price = 0.0
-            if price_col:
-                try:
-                    p = row.get(price_col)
-                    if p is not None and str(p).strip() not in ('', 'nan', 'None'):
-                        price = float(p)
-                except (ValueError, TypeError):
-                    pass
-            name = str(row.get(name_col, '')).strip() if name_col else ''
-            if name == 'nan': name = ''
-            unit = str(row.get(unit_col, '')).strip() if unit_col else ''
-            if unit == 'nan': unit = ''
-            spec = str(row.get(spec_col, '')).strip() if spec_col else ''
-            if spec == 'nan': spec = ''
-            # Template source: confidence 50
-            rows.append((barcode, name, price, unit, now, spec, 'template', 50, 0, now,
-                         price, price, price, 1 if price > 0 else 0))
-
-        if not rows:
-            return 0
-        conn = self._connect()
-        try:
-            conn.executemany(
-                "INSERT OR REPLACE INTO products "
-                "(barcode, name, price, unit, updated_at, specification, source, confidence, "
-                "usage_count, last_seen, avg_price, min_price, max_price, price_count) "
-                "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
-                rows)
-            conn.commit()
-        finally:
-            conn.close()
-        return len(rows)
-
-    def reimport(self) -> int:
-        conn = self._connect()
-        try:
-            conn.execute("DELETE FROM products")
-            conn.commit()
-        finally:
-            conn.close()
-        return self.import_from_excel(self.excel_source)
-
-    # ══════════════════════════════════════════════════════════════
-    # Queries
-    # ══════════════════════════════════════════════════════════════
-
-    def get_price(self, barcode: str) -> Optional[float]:
-        conn = self._connect()
-        try:
-            row = conn.execute("SELECT avg_price FROM products WHERE barcode=?",
-                               (str(barcode).strip(),)).fetchone()
-            return row[0] if row and row[0] else None
-        finally:
-            conn.close()
-
-    def get_prices(self, barcodes: List[str]) -> Dict[str, float]:
-        if not barcodes:
-            return {}
-        conn = self._connect()
-        try:
-            placeholders = ','.join('?' * len(barcodes))
-            rows = conn.execute(
-                f"SELECT barcode, avg_price FROM products WHERE barcode IN ({placeholders})",
-                [str(b).strip() for b in barcodes]).fetchall()
-            return {r[0]: r[1] for r in rows if r[1]}
-        finally:
-            conn.close()
-
-    def count(self) -> int:
-        conn = self._connect()
-        try:
-            return conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
-        finally:
-            conn.close()
-
-    def get_memory(self, barcode: str) -> Optional[Dict]:
-        conn = self._connect()
-        conn.row_factory = sqlite3.Row
-        try:
-            row = conn.execute("SELECT * FROM products WHERE barcode=?",
-                               (str(barcode).strip(),)).fetchone()
-            return dict(row) if row else None
-        finally:
-            conn.close()
-
-    def get_memories(self, barcodes: List[str]) -> Dict[str, Dict]:
-        if not barcodes:
-            return {}
-        conn = self._connect()
-        conn.row_factory = sqlite3.Row
-        try:
-            placeholders = ','.join('?' * len(barcodes))
-            rows = conn.execute(
-                f"SELECT * FROM products WHERE barcode IN ({placeholders})",
-                [str(b).strip() for b in barcodes]).fetchall()
-            return {r['barcode']: dict(r) for r in rows}
-        finally:
-            conn.close()
-
-    def get_all_memories(self) -> List[Dict]:
-        conn = self._connect()
-        conn.row_factory = sqlite3.Row
-        try:
-            return [dict(row) for row in
-                    conn.execute("SELECT * FROM products ORDER BY usage_count DESC, barcode").fetchall()]
-        finally:
-            conn.close()
-
-    # ══════════════════════════════════════════════════════════════
-    # Batch preload: the performance core
-    # ══════════════════════════════════════════════════════════════
-
-    def load_batch(self, barcodes: List[str]) -> Dict[str, Dict]:
-        """Batch-preload barcode memories into a dict: a single SQL query, then pure in-memory operations"""
-        if not barcodes:
-            return {}
-        conn = self._connect()
-        conn.row_factory = sqlite3.Row
-        try:
-            placeholders = ','.join('?' * len(barcodes))
-            rows = conn.execute(
-                f"SELECT * FROM products WHERE barcode IN ({placeholders})",
-                [str(b).strip() for b in barcodes]).fetchall()
-            return {r['barcode']: dict(r) for r in rows}
-        finally:
-            conn.close()
-
-    # ══════════════════════════════════════════════════════════════
-    # Learning logic: consistency boost + price range
-    # ══════════════════════════════════════════════════════════════
-
-    def learn_from_product(self, product: Dict, source: str = 'ocr',
-                           memory: Dict[str, Dict] = None,
-                           add_log: Callable = None) -> Optional[str]:
-        """
-        Learn from a processing result and return a log string.
-        memory: optional preloaded batch memory; when passed, no DB query is made.
-        """
-        barcode = str(product.get('barcode', '')).strip()
-        if not barcode:
-            return None
-
-        name = str(product.get('name', ''))
-        spec = str(product.get('specification', ''))
-        unit = str(product.get('unit', ''))
-        price = float(product.get('price', 0))
-        now = datetime.now().isoformat()
-
-        # Look up the existing record (in-memory dict first)
-        if memory is not None and barcode in memory:
-            row = memory[barcode]
-            old_name = row.get('name', '')
-            old_spec = row.get('specification', '')
-            old_unit = row.get('unit', '')
-            old_conf = row.get('confidence', 0)
-            old_count = row.get('usage_count', 0)
-            old_avg = row.get('avg_price', 0) or 0
-            old_min = row.get('min_price') or price
-            old_max = row.get('max_price') or price
-            pc = row.get('price_count', 0) or 0
-            exists = True
-        else:
-            conn = self._connect()
-            try:
-                cursor = conn.execute(
-                    "SELECT name, specification, unit, confidence, usage_count, "
-                    "avg_price, min_price, max_price, price_count FROM products WHERE barcode=?",
-                    (barcode,)).fetchone()
-            finally:
-                conn.close()
-            if cursor is None:
-                exists = False
-            else:
-                old_name, old_spec, old_unit, old_conf, old_count, old_avg, old_min, old_max, pc = cursor
-                old_avg = old_avg or 0
-                pc = pc or 0
-                old_min = old_min if old_min is not None else price
-                old_max = old_max if old_max is not None else price
-                exists = True
-
-        new_count = old_count + 1 if exists else 1
-
-        # ── Confidence ──
-        if source == 'user_confirmed':
-            new_conf = 90
-        elif source == 'template':
-            new_conf = 50
-        elif exists and old_conf < 50:
-            # Consistency boost
-            spec_match = bool(spec and old_spec and spec == old_spec)
-            unit_match = bool(unit and old_unit and unit == old_unit)
-            if spec_match and unit_match:
-                boost = 10
-            elif unit_match:
-                boost = 5
-            else:
-                boost = 3
-            new_conf = min(50, old_conf + boost)
-        elif exists:
-            new_conf = old_conf  # >= 50: stays unchanged
-        else:
-            new_conf = 10  # new OCR record
-
-        # ── Price range ──
-        if price > 0:
-            new_pc = (pc if exists else 0) + 1
-            new_avg = ((old_avg * (new_pc - 1)) + price) / new_pc if exists else price
-            new_min = min(old_min, price) if exists else price
-            new_max = max(old_max, price) if exists else price
-        else:
-            new_avg = old_avg if exists else 0
-            new_min = old_min if exists else 0
-            new_max = old_max if exists else 0
-            new_pc = pc if exists else 0
-
-        # ── Write ──
-        conn = self._connect()
-        try:
-            if not exists:
-                conn.execute(
-                    "INSERT INTO products (barcode, name, specification, unit, price, "
-                    "source, confidence, usage_count, last_seen, updated_at, "
-                    "avg_price, min_price, max_price, price_count) "
-                    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
-                    (barcode, name, spec, unit, price, source, new_conf, 1, now, now,
-                     new_avg, new_min, new_max, new_pc))
-                log = f"Memory added: {barcode} {name} source={source} confidence={new_conf}"
-            else:
-                # High-confidence sources overwrite all fields; low-confidence ones only fill blanks
-                if source in ('template', 'user_confirmed') or new_conf > 50:
-                    conn.execute(
-                        "UPDATE products SET name=?, specification=?, unit=?, price=?, "
-                        "source=?, confidence=?, usage_count=?, last_seen=?, updated_at=?, "
-                        "avg_price=?, min_price=?, max_price=?, price_count=? WHERE barcode=?",
-                        (name or old_name, spec or old_spec, unit or old_unit, price,
-                         source, new_conf, new_count, now, now,
-                         new_avg, new_min, new_max, new_pc, barcode))
-                else:
-                    conn.execute(
-                        "UPDATE products SET "
-                        "name=CASE WHEN name='' THEN ? ELSE name END, "
-                        "specification=CASE WHEN specification='' THEN ? ELSE specification END, "
-                        "unit=CASE WHEN unit='' THEN ? ELSE unit END, "
-                        "source=?, confidence=?, usage_count=?, last_seen=?, updated_at=?, "
-                        "avg_price=?, min_price=?, max_price=?, price_count=? WHERE barcode=?",
-                        (name, spec, unit, source, new_conf, new_count, now, now,
-                         new_avg, new_min, new_max, new_pc, barcode))
-                log = f"Memory updated: {barcode} confidence {old_conf if exists else 0}→{new_conf}"
-                if price > 0:
-                    log += f" avg price {new_avg:.4f} ({new_pc} times)"
-
-            conn.commit()
-
-            # Update the in-memory dict (if one was passed in)
-            if memory is not None and barcode in memory:
-                memory[barcode].update({
-                    'confidence': new_conf, 'usage_count': new_count,
-                    'avg_price': new_avg, 'min_price': new_min,
-                    'max_price': new_max, 'price_count': new_pc,
-                    'name': name or old_name,
-                    'specification': spec or old_spec,
-                    'unit': unit or old_unit,
-                })
-
-            if add_log:
-                add_log(log)
-            return log
-        finally:
-            conn.close()
-
-    def learn_from_products(self, products: List[Dict], source: str = 'ocr',
-                            add_log: Callable = None) -> int:
-        """Batch learning: preload in one batch, then process item by item; returns the number of updated records"""
-        barcodes = [str(p.get('barcode', '')) for p in products if p.get('barcode')]
-        memory = self.load_batch(barcodes)
-        count = 0
-        for p in products:
-            try:
-                result = self.learn_from_product(p, source, memory=memory, add_log=add_log)
-                if result:
-                    count += 1
-            except Exception as e:
-                logger.warning(f"Failed to learn product memory: {e}")
-        return count
-
-    # ══════════════════════════════════════════════════════════════
-    # Memory assistance: OCR completion
-    # ══════════════════════════════════════════════════════════════
-
-    def _price_anomaly(self, product: Dict, mem: Dict) -> bool:
-        """Price anomaly: deviation of more than 2x"""
-        price = float(product.get('price', 0))
-        avg = mem.get('avg_price', 0)
-        if not price or not avg:
-            return False
-        return price > avg * 2 or price < avg * 0.5
-
-    def fill_from_memory(self, barcode: str, ocr_result: Dict,
-                         memory: Dict[str, Dict] = None) -> Tuple[Dict, str]:
-        """Fill in fields missing from the OCR result using the memory store. Returns (completed dict, log string)"""
-        if memory:
-            mem = memory.get(barcode)
-        else:
-            mem = self.get_memory(barcode)
-
-        if not mem or mem.get('confidence', 0) < 10:
-            return ocr_result, ""
-
-        logs = []
-        result = dict(ocr_result)
-        conf = mem.get('confidence', 0)
-
-        has_spec = result.get('specification')
-        has_unit = result.get('unit')
-        price = float(result.get('price', 0))
-
-        if conf > 50 and not has_spec and mem.get('specification'):
-            result['specification'] = mem['specification']
-            logs.append(f"Specification filled (confidence {conf}): {barcode} → {mem['specification']}")
-        elif not has_spec and mem.get('specification') and self._price_anomaly(result, mem):
-            result['specification'] = mem['specification']
-            logs.append(f"Price anomaly → specification filled: {barcode} current {price:.2f} vs avg {mem['avg_price']:.2f} → {mem['specification']}")
-
-        if conf > 50 and not has_unit and mem.get('unit'):
-            result['unit'] = mem['unit']
-            logs.append(f"Unit filled (confidence {conf}): {barcode} → {mem['unit']}")
-        elif not has_unit and mem.get('unit') and self._price_anomaly(result, mem):
-            result['unit'] = mem['unit']
-            logs.append(f"Price anomaly → unit filled: {barcode} → {mem['unit']}")
-
-        return result, "; ".join(logs)
-
-    def price_warning(self, barcode: str, price: float,
-                      memory: Dict[str, Dict] = None) -> Optional[str]:
-        """Price warning. Warns on a deviation of more than 50%"""
-        if memory:
-            mem = memory.get(barcode)
-        else:
-            mem = self.get_memory(barcode)
-        if not mem or not mem.get('avg_price'):
-            return None
-        avg = mem['avg_price']
-        min_p = mem.get('min_price', avg)
-        max_p = mem.get('max_price', avg)
-        pc = mem.get('price_count', 0)
-        if price > avg * 1.5 or price < avg * 0.5:
-            return (f"Unit price warning: {barcode} current {price:.4f} yuan vs "
-                    f"historical avg {avg:.4f} (range {min_p:.4f}~{max_p:.4f}, {pc} times)")
-        return None
-
-    # ══════════════════════════════════════════════════════════════
-    # Manual editing
-    # ══════════════════════════════════════════════════════════════
-
-    def update_memory(self, barcode: str, fields: Dict) -> bool:
-        barcode = str(barcode).strip()
-        if not barcode:
-            return False
-        allowed = {'name', 'specification', 'unit', 'price', 'confidence'}
-        updates = {k: v for k, v in fields.items() if k in allowed}
-        if not updates:
-            return False
-        now = datetime.now().isoformat()
-        set_clause = ', '.join(f"{k}=?" for k in updates)
-        values = list(updates.values())
-        extra_sql = ", source='user_confirmed'"
-        if 'confidence' not in updates:
-            extra_sql += ", confidence=90"
-        conn = self._connect()
-        try:
-            conn.execute(
-                f"UPDATE products SET {set_clause}{extra_sql}, updated_at=? WHERE barcode=?",
-                values + [now, barcode])
-            conn.commit()
-            return conn.total_changes > 0
-        finally:
-            conn.close()
-
-    def delete_memory(self, barcode: str) -> bool:
-        conn = self._connect()
-        try:
-            conn.execute("DELETE FROM products WHERE barcode=?", (str(barcode).strip(),))
-            conn.commit()
-            return conn.total_changes > 0
-        finally:
-            conn.close()
-
# ══════════════════════════════════════════════════════════════
|
|
||||||
# 云端同步
|
|
||||||
# ══════════════════════════════════════════════════════════════
|
|
||||||
|
|
||||||
def export_for_sync(self) -> Dict:
|
|
||||||
conn = self._connect()
|
|
||||||
try:
|
|
||||||
cursor = conn.execute(
|
|
||||||
"SELECT barcode, name, specification, unit, price, source, "
|
|
||||||
"confidence, usage_count, last_seen, avg_price, min_price, max_price, price_count "
|
|
||||||
"FROM products")
|
|
||||||
result = {}
|
|
||||||
for row in cursor.fetchall():
|
|
||||||
result[row[0]] = {
|
|
||||||
'name': row[1], 'specification': row[2], 'unit': row[3],
|
|
||||||
'price': row[4], 'source': row[5], 'confidence': row[6],
|
|
||||||
'usage_count': row[7], 'last_seen': row[8],
|
|
||||||
'avg_price': row[9], 'min_price': row[10],
|
|
||||||
'max_price': row[11], 'price_count': row[12],
|
|
||||||
}
|
|
||||||
return result
|
|
||||||
finally:
|
|
||||||
conn.close()
|
|
||||||
|
|
||||||
    def import_from_sync(self, data: Dict) -> int:
        now = datetime.now().isoformat()
        count = 0
        conn = self._connect()
        try:
            for barcode, info in data.items():
                barcode = str(barcode).strip()
                if not barcode:
                    continue
                name = str(info.get('name', ''))
                spec = str(info.get('specification', ''))
                unit = str(info.get('unit', ''))
                price = float(info.get('price', 0))
                remote_source = str(info.get('source', 'ocr'))
                remote_conf = int(info.get('confidence', 50))
                remote_count = int(info.get('usage_count', 1))
                remote_seen = str(info.get('last_seen', now))
                remote_avg = float(info.get('avg_price', price))
                remote_min = float(info.get('min_price', price))
                remote_max = float(info.get('max_price', price))
                remote_pc = int(info.get('price_count', 1))

                row = conn.execute("SELECT confidence FROM products WHERE barcode=?",
                                   (barcode,)).fetchone()
                if row is None:
                    conn.execute(
                        "INSERT INTO products (barcode, name, specification, unit, price, "
                        "source, confidence, usage_count, last_seen, updated_at, "
                        "avg_price, min_price, max_price, price_count) "
                        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
                        (barcode, name, spec, unit, price, remote_source, remote_conf,
                         remote_count, remote_seen, now,
                         remote_avg, remote_min, remote_max, remote_pc))
                    count += 1
                else:
                    local_conf = row[0]
                    if remote_conf > local_conf:
                        conn.execute(
                            "UPDATE products SET name=?, specification=?, unit=?, price=?, "
                            "source=?, confidence=?, usage_count=?, last_seen=?, updated_at=?, "
                            "avg_price=?, min_price=?, max_price=?, price_count=? WHERE barcode=?",
                            (name, spec, unit, price, remote_source, remote_conf,
                             remote_count, remote_seen, now,
                             remote_avg, remote_min, remote_max, remote_pc, barcode))
                        count += 1
                    elif remote_conf == local_conf:
                        conn.execute(
                            "UPDATE products SET "
                            "name=CASE WHEN name='' THEN ? ELSE name END, "
                            "specification=CASE WHEN specification='' THEN ? ELSE specification END, "
                            "unit=CASE WHEN unit='' THEN ? ELSE unit END, "
                            "usage_count=MAX(usage_count, ?), updated_at=? WHERE barcode=?",
                            (name, spec, unit, remote_count, now, barcode))
                        count += 1
            conn.commit()
        finally:
            conn.close()
        return count

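The merge rule that `import_from_sync` applies per barcode can be read as a small pure function. The following is a minimal sketch under stated assumptions, not the project's code: `merge_record` and the plain-dict record shape are hypothetical stand-ins for the SQL statements, and timestamps and price statistics are left out.

```python
from typing import Optional


def merge_record(local: Optional[dict], remote: dict) -> Optional[dict]:
    """Return the record to store, or None when the local row stays unchanged."""
    if local is None:
        # Unknown barcode: take the remote record as-is (the INSERT branch).
        return dict(remote)
    if remote["confidence"] > local["confidence"]:
        # Remote is more trusted: overwrite every field (the full UPDATE branch).
        return dict(remote)
    if remote["confidence"] == local["confidence"]:
        # Equal trust: fill only blank text fields and keep the larger usage count.
        merged = dict(local)
        for field in ("name", "specification", "unit"):
            if merged.get(field, "") == "":
                merged[field] = remote.get(field, "")
        merged["usage_count"] = max(local["usage_count"], remote["usage_count"])
        return merged
    # Remote is less trusted: the local row wins and nothing is written.
    return None
```

Read this way, a remote record with lower confidence never changes local data, which is why the method's return value counts only inserted or updated rows.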
    def _export_memory_json(self, json_path=None):
        """Export the memory store as JSON (kept for compatibility with legacy callers)."""
        import os as _os
        if json_path is None:
            json_path = _os.path.join(_os.path.dirname(self.db_path), 'product_memory.json')
        data = self.export_for_sync()
        _os.makedirs(_os.path.dirname(json_path), exist_ok=True)
        with open(json_path, 'w', encoding='utf-8') as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
        return json_path
@@ -285,16 +285,6 @@ class UnitConverter:
|
|||||||
|
|
||||||
logger.debug(f"解析规格: {spec}")
|
logger.debug(f"解析规格: {spec}")
|
||||||
|
|
||||||
# New: handle equation-style specs such as "1件=12桶/袋/盒..." and normalize them to 1*12
|
|
||||||
eq_match = re.match(r'(\d+(?:\.\d+)?)\s*(?:件|箱|提|盒)\s*[==]\s*(\d+)\s*(?:瓶|桶|盒|支|个|袋|罐|包|卷)', spec)
|
|
||||||
if eq_match:
|
|
||||||
try:
|
|
||||||
level2 = int(eq_match.group(2))
|
|
||||||
logger.info(f"解析等式规格: {spec} -> 1*{level2}")
|
|
||||||
return 1, level2, None
|
|
||||||
except ValueError:
|
|
||||||
pass
|
|
||||||
|
|
||||||
# Handle three-level packaging, e.g. 1*5*12
|
# Handle three-level packaging, e.g. 1*5*12
|
||||||
three_level_match = re.match(r'(\d+)[*](\d+)[*](\d+)', spec)
|
three_level_match = re.match(r'(\d+)[*](\d+)[*](\d+)', spec)
|
||||||
if three_level_match:
|
if three_level_match:
|
||||||
|
|||||||
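The equation-style branch in this hunk can be exercised on its own. A minimal sketch, assuming a hypothetical helper name `parse_equation_spec`; the regular expression is copied from the hunk, and the return shape `(level1, level2, level3)` follows its `return 1, level2, None`.

```python
import re
from typing import Optional, Tuple

# Pattern copied from the hunk: "<n><outer unit> = <m><inner unit>".
# The character class accepts both the full-width and the ASCII equals sign.
EQ_SPEC = re.compile(
    r'(\d+(?:\.\d+)?)\s*(?:件|箱|提|盒)\s*[==]\s*(\d+)\s*(?:瓶|桶|盒|支|个|袋|罐|包|卷)'
)


def parse_equation_spec(spec: str) -> Optional[Tuple[int, int, None]]:
    """Return (1, inner_count, None) for specs like "1件=12桶", else None."""
    match = EQ_SPEC.match(spec)
    if match is None:
        return None
    # As in the hunk, the outer quantity is always normalized to 1.
    return 1, int(match.group(2)), None
```

Note that the first capture group is matched but never used, so a spec such as "2件=12桶" would also be normalized to 1*12.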
@@ -63,9 +63,8 @@ class JianUnitHandler(UnitHandler):
|
|||||||
Returns:
|
Returns:
|
||||||
Whether this handler can process the product
|
Whether this handler can process the product
|
||||||
"""
|
"""
|
||||||
unit = str(product.get('unit', '')).strip()
|
unit = product.get('unit', '')
|
||||||
# Match "件", "件、", "件装", etc.
|
return unit == '件'
|
||||||
return unit == '件' or unit.startswith('件')
|
|
||||||
|
|
||||||
def handle(self, product: Dict[str, Any], level1: int, level2: int, level3: Optional[int]) -> Dict[str, Any]:
|
def handle(self, product: Dict[str, Any], level1: int, level2: int, level3: Optional[int]) -> Dict[str, Any]:
|
||||||
"""
|
"""
|
||||||
@@ -118,9 +117,8 @@ class BoxUnitHandler(UnitHandler):
|
|||||||
Returns:
|
Returns:
|
||||||
Whether this handler can process the product
|
Whether this handler can process the product
|
||||||
"""
|
"""
|
||||||
unit = str(product.get('unit', '')).strip()
|
unit = product.get('unit', '')
|
||||||
# Match "箱", "箱、", "箱装", etc.
|
return unit == '箱'
|
||||||
return unit == '箱' or unit.startswith('箱')
|
|
||||||
|
|
||||||
def handle(self, product: Dict[str, Any], level1: int, level2: int, level3: Optional[int]) -> Dict[str, Any]:
|
def handle(self, product: Dict[str, Any], level1: int, level2: int, level3: Optional[int]) -> Dict[str, Any]:
|
||||||
"""
|
"""
|
||||||
@@ -173,8 +171,8 @@ class TiHeUnitHandler(UnitHandler):
|
|||||||
Returns:
|
Returns:
|
||||||
Whether this handler can process the product
|
Whether this handler can process the product
|
||||||
"""
|
"""
|
||||||
unit = str(product.get('unit', '')).strip()
|
unit = product.get('unit', '')
|
||||||
return unit in ['提', '盒'] or unit.startswith('提') or unit.startswith('盒')
|
return unit in ['提', '盒']
|
||||||
|
|
||||||
def handle(self, product: Dict[str, Any], level1: int, level2: int, level3: Optional[int]) -> Dict[str, Any]:
|
def handle(self, product: Dict[str, Any], level1: int, level2: int, level3: Optional[int]) -> Dict[str, Any]:
|
||||||
"""
|
"""
|
||||||
|
|||||||
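The three `can_handle` hunks differ only in how strictly the unit is compared. A minimal sketch of the tolerant variant, assuming a hypothetical helper `unit_matches`; it relies on `str.startswith` accepting a tuple of prefixes, which also makes the separate equality check unnecessary.

```python
from typing import Any, Tuple


def unit_matches(raw_unit: Any, prefixes: Tuple[str, ...]) -> bool:
    """True when the normalized unit starts with one of the given prefixes."""
    # Coerce to str and strip, as the tolerant variant in the hunks does.
    unit = str(raw_unit).strip()
    return unit.startswith(prefixes)
```

For example, `unit_matches(product.get('unit', ''), ('提', '盒'))` would cover the `TiHeUnitHandler` case, while the strict variant only accepts the exact strings.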
+84
-54
@@ -11,12 +11,11 @@ import numpy as np
|
|||||||
import xlrd
|
import xlrd
|
||||||
import xlwt
|
import xlwt
|
||||||
from xlutils.copy import copy as xlcopy
|
from xlutils.copy import copy as xlcopy
|
||||||
from typing import Dict, List, Optional, Tuple, Union, Any, Callable
|
from typing import Dict, List, Optional, Tuple, Union, Any
|
||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
|
|
||||||
from ...config.settings import ConfigManager
|
from ...config.settings import ConfigManager
|
||||||
from ..utils.log_utils import get_logger
|
from ..utils.log_utils import get_logger
|
||||||
from ..handlers.column_mapper import ColumnMapper
|
|
||||||
from ..utils.file_utils import (
|
from ..utils.file_utils import (
|
||||||
ensure_dir,
|
ensure_dir,
|
||||||
get_file_extension,
|
get_file_extension,
|
||||||
@@ -49,7 +48,7 @@ class PurchaseOrderMerger:
|
|||||||
# Work around ConfigParser objects not having a get_path method
|
# Work around ConfigParser objects not having a get_path method
|
||||||
try:
|
try:
|
||||||
# Get the output directory
|
# Get the output directory
|
||||||
self.output_dir = config.get_path('Paths', 'output_folder', fallback='data/output', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/output')
|
self.output_dir = config.get('Paths', 'output_folder', fallback='data/output')
|
||||||
|
|
||||||
# Make sure the directory exists
|
# Make sure the directory exists
|
||||||
os.makedirs(self.output_dir, exist_ok=True)
|
os.makedirs(self.output_dir, exist_ok=True)
|
||||||
@@ -96,8 +95,8 @@ class PurchaseOrderMerger:
|
|||||||
Returns:
|
Returns:
|
||||||
List of purchase-order file paths
|
List of purchase-order file paths
|
||||||
"""
|
"""
|
||||||
# Purchase-order files are stored in the result directory
|
# Purchase-order files are stored in the data/result directory
|
||||||
result_dir = self.config.get_path('Paths', 'result_folder', fallback='data/result', create=True) if hasattr(self.config, 'get_path') else os.path.abspath('data/result')
|
result_dir = "data/result"
|
||||||
logger.info(f"搜索目录 {result_dir} 中的采购单Excel文件")
|
logger.info(f"搜索目录 {result_dir} 中的采购单Excel文件")
|
||||||
|
|
||||||
# Make sure the directory exists
|
# Make sure the directory exists
|
||||||
@@ -141,45 +140,91 @@ class PurchaseOrderMerger:
|
|||||||
logger.debug(f"Excel文件的列名: {df.columns.tolist()}")
|
logger.debug(f"Excel文件的列名: {df.columns.tolist()}")
|
||||||
|
|
||||||
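# Special case: check whether a specific row should be read as the header row
|
# Special case: check whether a specific row should be read as the header row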
|
||||||
header_row_idx = ColumnMapper.detect_header_row(df, max_rows=5, min_matches=3)
|
for header_row_idx in range(5):  # check the first 5 rows
|
||||||
if header_row_idx >= 0:
|
if len(df) <= header_row_idx:
|
||||||
logger.info(f"检测到表头在第 {header_row_idx+1} 行")
|
continue
|
||||||
|
|
||||||
# Use this row as the column names; data starts on the next row
|
potential_header = df.iloc[header_row_idx].astype(str)
|
||||||
header_row = df.iloc[header_row_idx].astype(str)
|
header_keywords = ['条码', '条形码', '商品条码', '商品名称', '规格', '单价', '数量', '金额', '单位', '必填']
|
||||||
data_rows = df.iloc[header_row_idx+1:].reset_index(drop=True)
|
matches = sum(1 for keyword in header_keywords if any(keyword in str(val) for val in potential_header.values))
|
||||||
|
|
||||||
# Assign a name to every column (avoiding duplicate column names)
|
if matches >= 3:  # treat the row as the header if at least 3 keywords match
|
||||||
new_columns = []
|
logger.info(f"检测到表头在第 {header_row_idx+1} 行")
|
||||||
for i, col in enumerate(header_row):
|
|
||||||
col_str = str(col)
|
|
||||||
if col_str == 'nan' or col_str == 'None' or pd.isna(col):
|
|
||||||
new_columns.append(f"Col_{i}")
|
|
||||||
else:
|
|
||||||
new_columns.append(col_str)
|
|
||||||
|
|
||||||
# Build the new DataFrame with the new column names
|
# Use this row as the column names; data starts on the next row
|
||||||
data_rows.columns = new_columns
|
header_row = potential_header
|
||||||
df = data_rows
|
data_rows = df.iloc[header_row_idx+1:].reset_index(drop=True)
|
||||||
logger.debug(f"重新构建的数据帧列名: {df.columns.tolist()}")
|
|
||||||
|
|
||||||
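# Look up column names through ColumnMapper (Chinese keys are kept for downstream compatibility)
|
# Assign a name to every column (avoiding duplicate column names)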
|
||||||
|
new_columns = []
|
||||||
|
for i, col in enumerate(header_row):
|
||||||
|
col_str = str(col)
|
||||||
|
if col_str == 'nan' or col_str == 'None' or pd.isna(col):
|
||||||
|
new_columns.append(f"Col_{i}")
|
||||||
|
else:
|
||||||
|
new_columns.append(col_str)
|
||||||
|
|
||||||
|
# Build the new DataFrame with the new column names
|
||||||
|
data_rows.columns = new_columns
|
||||||
|
df = data_rows
|
||||||
|
logger.debug(f"重新构建的数据帧列名: {df.columns.tolist()}")
|
||||||
|
break
|
||||||
|
|
||||||
|
# Define the candidate column-name mappings
|
||||||
|
column_mapping = {
|
||||||
|
'条码': ['条码', '条形码', '商品条码', 'barcode', '商品条形码', '条形码', '商品条码', '商品编码', '商品编号', '条形码', '条码(必填)'],
|
||||||
|
'采购量': ['数量', '采购数量', '购买数量', '采购数量', '订单数量', '采购数量', '采购量(必填)', '采购量', '数量(必填)'],
|
||||||
|
'采购单价': ['单价', '价格', '采购单价', '销售价', '采购单价(必填)', '单价(必填)', '价格(必填)'],
|
||||||
|
'赠送量': ['赠送量', '赠品数量', '赠送数量', '赠品']
|
||||||
|
}
|
||||||
|
|
||||||
|
# Log all column names for debugging
|
||||||
all_columns = df.columns.tolist()
|
all_columns = df.columns.tolist()
|
||||||
logger.info(f"列名: {all_columns}")
|
logger.info(f"列名: {all_columns}")
|
||||||
|
|
||||||
standard_to_chinese = {
|
# Map the actual column names
|
||||||
'barcode': '条码',
|
|
||||||
'quantity': '采购量',
|
|
||||||
'unit_price': '采购单价',
|
|
||||||
'gift_quantity': '赠送量',
|
|
||||||
}
|
|
||||||
|
|
||||||
mapped_columns = {}
|
mapped_columns = {}
|
||||||
for std_name, chinese_name in standard_to_chinese.items():
|
for target_col, possible_names in column_mapping.items():
|
||||||
matched = ColumnMapper.find_column(all_columns, std_name)
|
for col in all_columns:
|
||||||
if matched:
|
# Normalize the column name before matching
|
||||||
mapped_columns[chinese_name] = matched
|
col_str = str(col).strip()
|
||||||
logger.info(f"列名映射: {matched} -> {chinese_name}")
|
|
||||||
|
# Exact match on the whole column name
|
||||||
|
if col_str in possible_names:
|
||||||
|
mapped_columns[target_col] = col
|
||||||
|
logger.info(f"直接匹配列名: {col_str} -> {target_col}")
|
||||||
|
break
|
||||||
|
|
||||||
|
# Compare with whitespace removed from the column name
|
||||||
|
clean_col = re.sub(r'\s+', '', col_str)
|
||||||
|
for name in possible_names:
|
||||||
|
clean_name = re.sub(r'\s+', '', name)
|
||||||
|
# Exact match
|
||||||
|
if clean_col == clean_name:
|
||||||
|
mapped_columns[target_col] = col
|
||||||
|
logger.info(f"清理后匹配列名: {col_str} -> {target_col}")
|
||||||
|
break
|
||||||
|
# Partial match (the column name contains the keyword)
|
||||||
|
elif clean_name in clean_col:
|
||||||
|
mapped_columns[target_col] = col
|
||||||
|
logger.info(f"部分匹配列名: {col_str} -> {target_col}")
|
||||||
|
break
|
||||||
|
|
||||||
|
if target_col in mapped_columns:
|
||||||
|
break
|
||||||
|
|
||||||
|
# If nothing matched, fall back to fuzzy matching
|
||||||
|
if target_col not in mapped_columns:
|
||||||
|
for col in all_columns:
|
||||||
|
col_str = str(col).strip().lower()
|
||||||
|
for name in possible_names:
|
||||||
|
name_lower = name.lower()
|
||||||
|
if name_lower in col_str:
|
||||||
|
mapped_columns[target_col] = col
|
||||||
|
logger.info(f"模糊匹配列名: {col} -> {target_col}")
|
||||||
|
break
|
||||||
|
if target_col in mapped_columns:
|
||||||
|
break
|
||||||
|
|
||||||
# If the required columns were found, rename them
|
# If the required columns were found, rename them
|
||||||
if mapped_columns:
|
if mapped_columns:
|
||||||
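The inline column matching in this hunk tries progressively looser rules: exact alias, whitespace-insensitive equality, then substring. A minimal sketch of that idea, assuming a hypothetical helper `find_column`; unlike the hunk, it finishes the strict pass over all columns before trying the loose one, so an exact column is never shadowed by an earlier partial hit. It is not the `ColumnMapper.find_column` referenced on the other side of the diff, whose implementation is not shown here.

```python
import re
from typing import List, Optional


def find_column(columns: List[str], aliases: List[str]) -> Optional[str]:
    """Return the first column matching any alias, trying strict rules first."""
    cleaned_aliases = [re.sub(r'\s+', '', alias) for alias in aliases]
    # Pass 1: exact match, ignoring whitespace inside the column name.
    for col in columns:
        if re.sub(r'\s+', '', str(col)) in cleaned_aliases:
            return col
    # Pass 2: case-insensitive substring match (the column contains an alias).
    for col in columns:
        cleaned = re.sub(r'\s+', '', str(col)).lower()
        if any(alias.lower() in cleaned for alias in cleaned_aliases):
            return col
    return None
```

A caller would invoke it once per target key, for example `find_column(all_columns, ['条码', '条形码', '商品条码'])`, and skip the rename when it returns `None`.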
@@ -354,9 +399,9 @@ class PurchaseOrderMerger:
|
|||||||
# Purchase unit price (required) - column E (4)
|
# Purchase unit price (required) - column E (4)
|
||||||
output_sheet.write(r, price_col, float(row['采购单价']), price_style)
|
output_sheet.write(r, price_col, float(row['采购单价']), price_style)
|
||||||
|
|
||||||
# Build the output file name; save to the result directory
|
# Build the output file name; save to the data/result directory
|
||||||
timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
|
timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
|
||||||
result_dir = self.config.get_path('Paths', 'result_folder', fallback='data/result', create=True) if hasattr(self.config, 'get_path') else os.path.abspath('data/result')
|
result_dir = "data/result"
|
||||||
os.makedirs(result_dir, exist_ok=True)
|
os.makedirs(result_dir, exist_ok=True)
|
||||||
output_file = os.path.join(result_dir, f"合并采购单_{timestamp}.xls")
|
output_file = os.path.join(result_dir, f"合并采购单_{timestamp}.xls")
|
||||||
|
|
||||||
@@ -369,7 +414,7 @@ class PurchaseOrderMerger:
|
|||||||
logger.error(f"创建合并采购单时出错: {e}")
|
logger.error(f"创建合并采购单时出错: {e}")
|
||||||
return None
|
return None
|
||||||
|
|
||||||
def process(self, file_paths: Optional[List[str]] = None, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
|
def process(self, file_paths: Optional[List[str]] = None) -> Optional[str]:
|
||||||
"""
|
"""
|
||||||
Merge purchase orders
|
Merge purchase orders
|
||||||
|
|
||||||
@@ -382,11 +427,6 @@ class PurchaseOrderMerger:
|
|||||||
# If no file paths were given, collect all purchase-order files
|
# If no file paths were given, collect all purchase-order files
|
||||||
if file_paths is None:
|
if file_paths is None:
|
||||||
file_paths = self.get_purchase_orders()
|
file_paths = self.get_purchase_orders()
|
||||||
try:
|
|
||||||
if progress_cb:
|
|
||||||
progress_cb(97)
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
|
|
||||||
# Check whether there is anything to merge
|
# Check whether there is anything to merge
|
||||||
if not file_paths:
|
if not file_paths:
|
||||||
@@ -398,22 +438,12 @@ class PurchaseOrderMerger:
|
|||||||
if merged_df is None:
|
if merged_df is None:
|
||||||
logger.error("合并采购单失败")
|
logger.error("合并采购单失败")
|
||||||
return None
|
return None
|
||||||
try:
|
|
||||||
if progress_cb:
|
|
||||||
progress_cb(98)
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
|
|
||||||
# Create the merged purchase-order file
|
# Create the merged purchase-order file
|
||||||
output_file = self.create_merged_purchase_order(merged_df)
|
output_file = self.create_merged_purchase_order(merged_df)
|
||||||
if output_file is None:
|
if output_file is None:
|
||||||
logger.error("创建合并采购单文件失败")
|
logger.error("创建合并采购单文件失败")
|
||||||
return None
|
return None
|
||||||
try:
|
|
||||||
if progress_cb:
|
|
||||||
progress_cb(100)
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
|
|
||||||
# Record the merged files
|
# Record the merged files
|
||||||
for file_path in file_paths:
|
for file_path in file_paths:
|
||||||
|
|||||||
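The `try` / `if progress_cb` / `except` / `pass` pattern appears three times in the `process` hunks (97, 98, 100). A minimal sketch of how it could be factored out, assuming a hypothetical helper name `report_progress`; the swallow-all-errors policy is taken from the hunks, not recommended independently.

```python
from typing import Callable, Optional


def report_progress(progress_cb: Optional[Callable[[int], None]], percent: int) -> None:
    """Call the callback if one was given; never let it break the caller."""
    if not progress_cb:
        return
    try:
        progress_cb(percent)
    except Exception:
        # Same policy as the hunks: a failing progress callback is ignored.
        pass
```

With such a helper each call site shrinks to one line, for example `report_progress(progress_cb, 97)`.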
+191
-179
@@ -11,7 +11,7 @@ import numpy as np
|
|||||||
import xlrd
|
import xlrd
|
||||||
import xlwt
|
import xlwt
|
||||||
from xlutils.copy import copy as xlcopy
|
from xlutils.copy import copy as xlcopy
|
||||||
from typing import Dict, List, Optional, Tuple, Union, Any, Callable
|
from typing import Dict, List, Optional, Tuple, Union, Any
|
||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
|
|
||||||
from ...config.settings import ConfigManager
|
from ...config.settings import ConfigManager
|
||||||
@@ -25,12 +25,11 @@ from ..utils.file_utils import (
|
|||||||
)
|
)
|
||||||
from ..utils.string_utils import (
|
from ..utils.string_utils import (
|
||||||
clean_string,
|
clean_string,
|
||||||
|
clean_barcode,
|
||||||
extract_number,
|
extract_number,
|
||||||
format_barcode,
|
format_barcode
|
||||||
parse_monetary_string
|
|
||||||
)
|
)
|
||||||
from .converter import UnitConverter
|
from .converter import UnitConverter
|
||||||
from ..handlers.column_mapper import ColumnMapper
|
|
||||||
|
|
||||||
logger = get_logger(__name__)
|
logger = get_logger(__name__)
|
||||||
|
|
||||||
@@ -40,20 +39,19 @@ class ExcelProcessor:
|
|||||||
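Extract barcodes, unit prices and quantities, then fill them in using the purchase-order template format
|
Extract barcodes, unit prices and quantities, then fill them in using the purchase-order template format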
|
||||||
"""
|
"""
|
||||||
|
|
||||||
def __init__(self, config, product_db=None):
|
def __init__(self, config):
|
||||||
"""
|
"""
|
||||||
Initialize the Excel processor
|
Initialize the Excel processor
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
config: configuration object
|
config: configuration object
|
||||||
product_db: product database instance (optional; pass one in to share it)
|
|
||||||
"""
|
"""
|
||||||
self.config = config
|
self.config = config
|
||||||
|
|
||||||
# Work around ConfigParser objects not having a get_path method
|
# Work around ConfigParser objects not having a get_path method
|
||||||
try:
|
try:
|
||||||
# Get the input and output directories
|
# Get the input and output directories
|
||||||
self.output_dir = config.get_path('Paths', 'output_folder', fallback='data/output', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/output')
|
self.output_dir = config.get('Paths', 'output_folder', fallback='data/output')
|
||||||
self.temp_dir = config.get('Paths', 'temp_folder', fallback='data/temp')
|
self.temp_dir = config.get('Paths', 'temp_folder', fallback='data/temp')
|
||||||
|
|
||||||
# Get the template file path
|
# Get the template file path
|
||||||
@@ -75,18 +73,6 @@ class ExcelProcessor:
|
|||||||
|
|
||||||
# Load the unit converter and configuration
|
# Load the unit converter and configuration
|
||||||
self.unit_converter = UnitConverter()
|
self.unit_converter = UnitConverter()
|
||||||
|
|
||||||
# Product memory store
|
|
||||||
if product_db is not None:
|
|
||||||
self.product_db = product_db
|
|
||||||
else:
|
|
||||||
from ..db.product_db import ProductDatabase
|
|
||||||
db_path = config.get_path('Paths', 'product_db', fallback='data/product_cache.db') if hasattr(config, 'get_path') else 'data/product_cache.db'
|
|
||||||
tpl_folder = config.get('Paths', 'template_folder', fallback='templates')
|
|
||||||
item_data = config.get('Templates', 'item_data', fallback='商品资料.xlsx')
|
|
||||||
tpl_path = os.path.join(tpl_folder, item_data)
|
|
||||||
self.product_db = ProductDatabase(db_path, tpl_path)
|
|
||||||
|
|
||||||
logger.info(f"初始化ExcelProcessor完成,模板文件: {self.template_path}")
|
logger.info(f"初始化ExcelProcessor完成,模板文件: {self.template_path}")
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error(f"初始化ExcelProcessor失败: {e}")
|
logger.error(f"初始化ExcelProcessor失败: {e}")
|
||||||
@@ -135,6 +121,48 @@ class ExcelProcessor:
|
|||||||
logger.info(f"找到最新的Excel文件: {latest_file}")
|
logger.info(f"找到最新的Excel文件: {latest_file}")
|
||||||
return latest_file
|
return latest_file
|
||||||
|
|
||||||
|
def validate_barcode(self, barcode: Any) -> bool:
|
||||||
|
"""
|
||||||
|
Check whether a barcode is valid.
|
||||||
|
New behavior: return False when the barcode is "仓库" (warehouse), so it is not mistaken for a valid barcode
|
||||||
|
|
||||||
|
Args:
|
||||||
|
barcode: the barcode value
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Whether the barcode is valid
|
||||||
|
"""
|
||||||
|
# Special case: "仓库" (warehouse) labels
|
||||||
|
if isinstance(barcode, str) and barcode.strip() in ["仓库", "仓库全名"]:
|
||||||
|
logger.warning(f"条码为仓库标识: {barcode}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
# Normalize the barcode format
|
||||||
|
barcode_clean = clean_barcode(barcode)
|
||||||
|
|
||||||
|
# Correct a known misread prefix (a leading 5 becomes 6)
|
||||||
|
if len(barcode_clean) > 8 and barcode_clean.startswith('5') and not barcode_clean.startswith('53'):
|
||||||
|
barcode_clean = '6' + barcode_clean[1:]
|
||||||
|
logger.info(f"修正条码前缀 5->6: {barcode} -> {barcode_clean}")
|
||||||
|
|
||||||
|
# Validate the barcode length
|
||||||
|
if len(barcode_clean) < 8 or len(barcode_clean) > 13:
|
||||||
|
logger.warning(f"条码长度异常: {barcode_clean}, 长度={len(barcode_clean)}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
# Verify that the barcode is all digits
|
||||||
|
if not barcode_clean.isdigit():
|
||||||
|
logger.warning(f"条码包含非数字字符: {barcode_clean}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
# Special case for item no. 9: accept this barcode format
|
||||||
|
if barcode_clean == "5321545613":
|
||||||
|
logger.info(f"特殊条码验证通过: {barcode_clean}")
|
||||||
|
return True
|
||||||
|
|
||||||
|
logger.debug(f"条码验证通过: {barcode_clean}")
|
||||||
|
return True
|
||||||
|
|
||||||
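In `validate_barcode` the 5-to-6 prefix correction only changes a local variable, and the `5321545613` special case is reached only after the length and digit checks, which that value already passes. A minimal sketch of a variant that hands the corrected value back, assuming a hypothetical name `normalize_barcode` and using `str(...).strip()` as a stand-in for `clean_barcode`, whose implementation is not shown in this diff.

```python
from typing import Any, Optional

WAREHOUSE_LABELS = {"仓库", "仓库全名"}


def normalize_barcode(barcode: Any) -> Optional[str]:
    """Return the corrected barcode, or None when it is not usable."""
    text = str(barcode).strip()  # assumption: stand-in for clean_barcode()
    if text in WAREHOUSE_LABELS:
        return None
    # Prefix correction from the hunk: a leading 5 (but not 53) becomes 6.
    if len(text) > 8 and text.startswith("5") and not text.startswith("53"):
        text = "6" + text[1:]
    # Length and digit checks from the hunk: 8 to 13 digits, nothing else.
    if not (8 <= len(text) <= 13) or not text.isdigit():
        return None
    return text
```

Returning the string instead of a bool lets the caller store the corrected barcode rather than the raw one.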
def extract_barcode(self, df: pd.DataFrame) -> List[str]:
|
def extract_barcode(self, df: pd.DataFrame) -> List[str]:
|
||||||
"""
|
"""
|
||||||
Extract candidate barcode column names from the DataFrame
|
Extract candidate barcode column names from the DataFrame
|
||||||
@@ -145,7 +173,12 @@ class ExcelProcessor:
|
|||||||
Returns:
|
Returns:
|
||||||
List of candidate barcode column names
|
List of candidate barcode column names
|
||||||
"""
|
"""
|
||||||
possible_barcode_columns = ColumnMapper.STANDARD_COLUMNS['barcode']
|
possible_barcode_columns = [
|
||||||
|
'条码', '条形码', '商品条码', '商品条形码',
|
||||||
|
'商品编码', '商品编号', '条形码', '条码(必填)',
|
||||||
|
'barcode', 'Barcode', '编码', '条形码', '电脑条码',
|
||||||
|
'条码ID', '产品条码', 'BarCode'
|
||||||
|
]
|
||||||
|
|
||||||
found_columns = []
|
found_columns = []
|
||||||
|
|
||||||
@@ -221,20 +254,6 @@ class ExcelProcessor:
|
|||||||
if not product['barcode']:
|
if not product['barcode']:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
# Check remark columns and filter out non-purchase rows (exchanges, returns, voided entries, etc.)
|
|
||||||
skip_row = False
|
|
||||||
for col in df.columns:
|
|
||||||
col_str = str(col)
|
|
||||||
if any(k in col_str for k in ['备注', '说明', '类型', '备注1']):
|
|
||||||
val = str(row[col]).strip()
|
|
||||||
# Filter on common non-purchase keywords
|
|
||||||
if any(k in val for k in ['换货', '退货', '作废', '减钱', '冲减', '赠品单', '补货']):
|
|
||||||
logger.info(f"过滤非采购行: {product['barcode']} - {product.get('name', '')}, 原因: {col_str}包含 '{val}'")
|
|
||||||
skip_row = True
|
|
||||||
break
|
|
||||||
if skip_row:
|
|
||||||
continue
|
|
||||||
|
|
||||||
# Extract the product name
|
# Extract the product name
|
||||||
if '商品名称' in df.columns and not pd.isna(row['商品名称']):
|
if '商品名称' in df.columns and not pd.isna(row['商品名称']):
|
||||||
product['name'] = str(row['商品名称']).strip()
|
product['name'] = str(row['商品名称']).strip()
|
||||||
@@ -262,22 +281,6 @@ class ExcelProcessor:
|
|||||||
product['amount'] = row['小计']
|
product['amount'] = row['小计']
|
||||||
elif column_mapping.get('amount') and not pd.isna(row[column_mapping['amount']]):
|
elif column_mapping.get('amount') and not pd.isna(row[column_mapping['amount']]):
|
||||||
product['amount'] = row[column_mapping['amount']]
|
product['amount'] = row[column_mapping['amount']]
|
||||||
# Detect gifts by amount: the amount is 0, empty, or the letter o/O
|
|
||||||
amt = product.get('amount', None)
|
|
||||||
try:
|
|
||||||
is_amt_gift = False
|
|
||||||
if amt is None:
|
|
||||||
is_amt_gift = True
|
|
||||||
elif isinstance(amt, str):
|
|
||||||
parsed = parse_monetary_string(amt)
|
|
||||||
is_amt_gift = (parsed is None or parsed == 0.0)
|
|
||||||
else:
|
|
||||||
parsed = parse_monetary_string(amt)
|
|
||||||
is_amt_gift = (parsed is not None and parsed == 0.0)
|
|
||||||
if is_amt_gift:
|
|
||||||
product['is_gift'] = True
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
|
|
||||||
# Extract the quantity
|
# Extract the quantity
|
||||||
if '数量' in df.columns and not pd.isna(row['数量']):
|
if '数量' in df.columns and not pd.isna(row['数量']):
|
||||||
@@ -376,17 +379,27 @@ class ExcelProcessor:
|
|||||||
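# If the quantity is 0 but both unit price and amount are present, compute quantity = amount / unit price
|
# If the quantity is 0 but both unit price and amount are present, compute quantity = amount / unit price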
|
||||||
if (product['quantity'] == 0 or product['quantity'] is None) and product['price'] > 0 and product['amount']:
|
if (product['quantity'] == 0 or product['quantity'] is None) and product['price'] > 0 and product['amount']:
|
||||||
try:
|
try:
|
||||||
amount = parse_monetary_string(product['amount'])
|
# Make sure the amount is numeric
|
||||||
if amount is not None and amount > 0:
|
if isinstance(product['amount'], str):
|
||||||
|
# Strip currency symbols and other non-numeric characters; keep digits, dots and commas
|
||||||
|
amount_str = re.sub(r'[^\d\.,]', '', product['amount'].strip())
|
||||||
|
# Replace the comma with a dot (when the comma is the decimal separator)
|
||||||
|
if ',' in amount_str and '.' not in amount_str:
|
||||||
|
amount_str = amount_str.replace(',', '.')
|
||||||
|
# Both comma and dot present (the comma is usually a thousands separator)
|
||||||
|
elif ',' in amount_str and '.' in amount_str:
|
||||||
|
amount_str = amount_str.replace(',', '')
|
||||||
|
amount = float(amount_str)
|
||||||
|
else:
|
||||||
|
amount = float(product['amount'])
|
||||||
|
# Compute the quantity
|
||||||
|
if amount > 0:
|
||||||
quantity = amount / product['price']
|
quantity = amount / product['price']
|
||||||
logger.info(f"数量为空或为0,通过金额({amount})和单价({product['price']})计算得出数量: {quantity}")
|
logger.info(f"数量为空或为0,通过金额({amount})和单价({product['price']})计算得出数量: {quantity}")
|
||||||
product['quantity'] = quantity
|
product['quantity'] = quantity
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.warning(f"通过金额和单价计算数量失败: {e}")
|
logger.warning(f"通过金额和单价计算数量失败: {e}")
|
||||||
|
|
||||||
# Fill in missing fields from the memory store
|
|
||||||
product = self._apply_memory(product)
|
|
||||||
|
|
||||||
products.append(product)
|
products.append(product)
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error(f"提取第{idx+1}行商品信息时出错: {e}", exc_info=True)
|
logger.error(f"提取第{idx+1}行商品信息时出错: {e}", exc_info=True)
|
||||||
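The inline amount parsing in this hunk (and its twin in `fill_template`) can be isolated for testing. A minimal sketch, assuming a hypothetical helper name `parse_amount`; it mirrors the inline logic shown here and is not the `parse_monetary_string` utility imported on the other side of the diff, whose implementation is not shown.

```python
import re
from typing import Any, Optional


def parse_amount(value: Any) -> Optional[float]:
    """Convert an amount cell to float, or None when it cannot be parsed."""
    try:
        if isinstance(value, str):
            # Keep digits, dots and commas; drop currency symbols and other text.
            text = re.sub(r'[^\d\.,]', '', value.strip())
            if ',' in text and '.' not in text:
                # Only commas: treat the comma as the decimal separator.
                text = text.replace(',', '.')
            elif ',' in text and '.' in text:
                # Both present: treat commas as thousands separators.
                text = text.replace(',', '')
            return float(text)
        return float(value)
    except (TypeError, ValueError):
        return None
```

One consequence of the comma rule is worth knowing: a string such as "1,234" is read as 1.234 rather than 1234, and a leading minus sign is stripped along with the currency symbol.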
@@ -395,59 +408,6 @@ class ExcelProcessor:
|
|||||||
logger.info(f"提取到 {len(products)} 个商品信息")
|
logger.info(f"提取到 {len(products)} 个商品信息")
|
||||||
return products
|
return products
|
||||||
|
|
||||||
def _apply_memory(self, product: Dict) -> Dict:
|
|
||||||
"""Look up the memory store and fill in fields that OCR missed or got wrong."""
|
|
||||||
barcode = product.get('barcode', '')
|
|
||||||
if not barcode:
|
|
||||||
return product
|
|
||||||
|
|
||||||
try:
|
|
||||||
memory = self.product_db.get_memory(barcode)
|
|
||||||
except Exception:
|
|
||||||
return product
|
|
||||||
|
|
||||||
if memory is None or memory.get('confidence', 0) < 80:
|
|
||||||
return product
|
|
||||||
|
|
||||||
# Fill in the specification
|
|
||||||
ocr_spec = product.get('specification', '')
|
|
||||||
mem_spec = memory.get('specification', '') or ''
|
|
||||||
if mem_spec and (not ocr_spec or self._is_spec_suspicious(ocr_spec)):
|
|
||||||
product['specification'] = mem_spec
|
|
||||||
logger.info(f"记忆修正规格: {barcode} '{ocr_spec}' -> '{mem_spec}'")
|
|
||||||
|
|
||||||
# Fill in the name
|
|
||||||
ocr_name = product.get('name', '')
|
|
||||||
mem_name = memory.get('name', '') or ''
|
|
||||||
if mem_name and not ocr_name:
|
|
||||||
product['name'] = mem_name
|
|
||||||
logger.info(f"记忆修正名称: {barcode} -> '{mem_name}'")
|
|
||||||
|
|
||||||
# Fill in the unit
|
|
||||||
ocr_unit = product.get('unit', '')
|
|
||||||
mem_unit = memory.get('unit', '') or ''
|
|
||||||
if mem_unit and not ocr_unit:
|
|
||||||
product['unit'] = mem_unit
|
|
||||||
logger.info(f"记忆修正单位: {barcode} -> '{mem_unit}'")
|
|
||||||
|
|
||||||
# Leave quantity and unit price untouched (they differ per order)
|
|
||||||
return product
|
|
||||||
|
|
||||||
def _is_spec_suspicious(self, spec: str) -> bool:
|
|
||||||
"""Detect whether a specification looks like OCR garbage."""
|
|
||||||
if not spec:
|
|
||||||
return True
|
|
||||||
# IL*12 (I confused with 1)
|
|
||||||
if re.search(r'^[Ii][Ll*]', spec):
|
|
||||||
return True
|
|
||||||
# 4.51*4 (L misread as 1)
|
|
||||||
if re.search(r'\d+\.\d+1\*\d+', spec):
|
|
||||||
return True
|
|
||||||
# Contains unusual characters (anything outside the common specification characters)
|
|
||||||
if re.search(r'[^\d.*xX\-LlKkGgMm升毫瓶桶盒箱件提\s]', spec):
|
|
||||||
return True
|
|
||||||
return False
|
|
||||||
|
|
||||||
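The three heuristics in `_is_spec_suspicious` are easy to check against sample strings. A minimal sketch as a free function, with the regular expressions copied from the hunk; the sample inputs in the usage note are illustrative, not taken from project data.

```python
import re


def is_spec_suspicious(spec: str) -> bool:
    """Return True when a specification string looks like an OCR misread."""
    if not spec:
        return True
    # Leading "I" followed by "L" or "*": a 1 read as the letter I.
    if re.search(r'^[Ii][Ll*]', spec):
        return True
    # A decimal ending in 1 right before "*": an L read as the digit 1.
    if re.search(r'\d+\.\d+1\*\d+', spec):
        return True
    # Any character outside the usual specification alphabet.
    if re.search(r'[^\d.*xX\-LlKkGgMm升毫瓶桶盒箱件提\s]', spec):
        return True
    return False
```

Under these rules "IL*12" and "4.51*4" are flagged while "1*12", "500ml*24" and "1.5L*6" pass. A slash, as in "12瓶/箱", is also flagged because "/" is not in the allowed set.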
def fill_template(self, products: List[Dict], output_file_path: str) -> bool:
|
def fill_template(self, products: List[Dict], output_file_path: str) -> bool:
|
||||||
"""
|
"""
|
||||||
Fill in the purchase-order template
|
Fill in the purchase-order template
|
||||||
@@ -490,8 +450,21 @@ class ExcelProcessor:
|
|||||||
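# If the quantity is 0 but both unit price and amount are present, compute quantity = amount / unit price
|
# If the quantity is 0 but both unit price and amount are present, compute quantity = amount / unit price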
|
||||||
if (quantity == 0 or quantity is None) and price > 0 and amount:
|
if (quantity == 0 or quantity is None) and price > 0 and amount:
|
||||||
try:
|
try:
|
||||||
amount = parse_monetary_string(amount)
|
# Make sure the amount is numeric
|
||||||
if amount is not None and amount > 0:
|
if isinstance(amount, str):
|
||||||
|
# Strip currency symbols and other non-numeric characters; keep digits, dots and commas
|
||||||
|
amount_str = re.sub(r'[^\d\.,]', '', amount.strip())
|
||||||
|
# Replace the comma with a dot (when the comma is the decimal separator)
|
||||||
|
if ',' in amount_str and '.' not in amount_str:
|
||||||
|
amount_str = amount_str.replace(',', '.')
|
||||||
|
# Both comma and dot present (the comma is usually a thousands separator)
|
||||||
|
elif ',' in amount_str and '.' in amount_str:
|
||||||
|
amount_str = amount_str.replace(',', '')
|
||||||
|
amount = float(amount_str)
|
||||||
|
else:
|
||||||
|
amount = float(amount)
|
||||||
|
# Compute the quantity
|
||||||
|
if amount > 0:
|
||||||
quantity = amount / price
|
quantity = amount / price
|
||||||
logger.info(f"数量为空或为0,通过金额({amount})和单价({price})计算得出数量: {quantity}")
|
logger.info(f"数量为空或为0,通过金额({amount})和单价({price})计算得出数量: {quantity}")
|
||||||
product['quantity'] = quantity
|
product['quantity'] = quantity
|
||||||
@@ -499,7 +472,7 @@ class ExcelProcessor:
|
|||||||
logger.warning(f"通过金额和单价计算数量失败: {e}")
|
logger.warning(f"通过金额和单价计算数量失败: {e}")
|
||||||
|
|
||||||
# Decide whether the item is a gift (price is 0)
|
# Decide whether the item is a gift (price is 0)
|
||||||
is_gift = bool(product.get('is_gift', False)) or (price == 0)
|
is_gift = price == 0
|
||||||
|
|
||||||
logger.info(f"处理商品: 条码={barcode}, 数量={quantity}, 单价={price}, 是否赠品={is_gift}")
|
logger.info(f"处理商品: 条码={barcode}, 数量={quantity}, 单价={price}, 是否赠品={is_gift}")
|
||||||
|
|
||||||
@@ -588,20 +561,77 @@ class ExcelProcessor:
|
|||||||
return False
|
return False
|
||||||
|
|
||||||
def _find_header_row(self, df: pd.DataFrame) -> Optional[int]:
|
def _find_header_row(self, df: pd.DataFrame) -> Optional[int]:
|
||||||
"""Detect the header row automatically by delegating to ColumnMapper.detect_header_row."""
|
"""
|
||||||
result = ColumnMapper.detect_header_row(df, max_rows=30)
|
Detect the header row automatically
|
||||||
if result >= 0:
|
|
||||||
logger.info(f"找到表头行: 第{result+1}行")
|
The header is identified with several rules:
|
||||||
return result
|
1. Check whether the row contains typical header keywords (barcode, product name, quantity, etc.)
|
||||||
# Fallback: find the first non-empty row
|
2. Check whether it is the first non-empty row
|
||||||
|
3. Check whether the row has many string cells (headers are usually strings)
|
||||||
|
|
||||||
|
Args:
|
||||||
|
df: the DataFrame
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Index of the header row, or None if it was not found
|
||||||
|
"""
|
||||||
|
# Define the candidate header keywords
|
||||||
|
header_keywords = [
|
||||||
|
'条码', '条形码', '商品条码', '商品名称', '名称', '数量', '单位', '单价',
|
||||||
|
'规格', '商品编码', '采购数量', '采购单位', '商品', '品名'
|
||||||
|
]
|
||||||
|
|
||||||
|
# Store the match score of every row
|
||||||
|
row_scores = []
|
||||||
|
|
||||||
|
# Scan the first 10 rows (the header is rarely further down)
|
||||||
|
max_rows_to_check = min(10, len(df))
|
||||||
|
for row in range(max_rows_to_check):
|
||||||
|
row_data = df.iloc[row]
|
||||||
|
score = 0
|
||||||
|
|
||||||
|
# Check 1: keyword matches
|
||||||
|
for cell in row_data:
|
||||||
|
if isinstance(cell, str):
|
||||||
|
cell_clean = str(cell).strip().lower()
|
||||||
|
for keyword in header_keywords:
|
||||||
|
if keyword.lower() in cell_clean:
|
||||||
|
score += 5 # 每匹配一个关键词加5分
|
||||||
|
|
||||||
|
# Check 2: share of non-empty cells
|
||||||
|
non_empty_cells = row_data.count()
|
||||||
|
if non_empty_cells / len(row_data) > 0.5: # 如果超过一半的单元格有内容
|
||||||
|
score += 2
|
||||||
|
|
||||||
|
# Check 3: share of string cells
|
||||||
|
string_cells = sum(1 for cell in row_data if isinstance(cell, str))
|
||||||
|
if string_cells / len(row_data) > 0.5: # 如果超过一半的单元格是字符串
|
||||||
|
score += 3
|
||||||
|
|
||||||
|
row_scores.append((row, score))
|
||||||
|
|
||||||
|
# Log the score of every row
|
||||||
|
logger.debug(f"第{row+1}行评分: {score},内容: {row_data.values}")
|
||||||
|
|
||||||
|
# Sort by score
|
||||||
|
row_scores.sort(key=lambda x: x[1], reverse=True)
|
||||||
|
|
||||||
|
# If the best score reaches the threshold, treat that row as the header
|
||||||
|
if row_scores and row_scores[0][1] >= 5:
|
||||||
|
best_row = row_scores[0][0]
|
||||||
|
logger.info(f"找到可能的表头行: 第{best_row+1}行,评分: {row_scores[0][1]}")
|
||||||
|
return best_row
|
||||||
|
|
||||||
|
# If no clear header was found, fall back to the first non-empty row
|
||||||
for row in range(len(df)):
|
for row in range(len(df)):
|
||||||
if df.iloc[row].notna().sum() > 3:
|
if df.iloc[row].notna().sum() > 3: # 至少有3个非空单元格
|
||||||
logger.info(f"未找到明确表头,使用第一个有效行: 第{row+1}行")
|
logger.info(f"未找到明确表头,使用第一个有效行: 第{row+1}行")
|
||||||
return row
|
return row
|
||||||
|
|
||||||
logger.warning("无法识别表头行")
|
logger.warning("无法识别表头行")
|
||||||
return None
|
return None
|
||||||
|
|
||||||
def process_specific_file(self, file_path: str, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
|
def process_specific_file(self, file_path: str) -> Optional[str]:
|
||||||
"""
|
"""
|
||||||
Process the given Excel file
|
Process the given Excel file
|
||||||
|
|
||||||
@@ -619,11 +649,6 @@ class ExcelProcessor:
|
|||||||
|
|
||||||
try:
|
try:
|
||||||
# Read the Excel file without specifying a header row yet
|
# Read the Excel file without specifying a header row yet
|
||||||
if progress_cb:
|
|
||||||
try:
|
|
||||||
progress_cb(92)
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
df = pd.read_excel(file_path, header=None)
|
df = pd.read_excel(file_path, header=None)
|
||||||
logger.info(f"成功读取Excel文件: {file_path}, 共 {len(df)} 行")
|
logger.info(f"成功读取Excel文件: {file_path}, 共 {len(df)} 行")
|
||||||
|
|
||||||
@@ -635,58 +660,31 @@ class ExcelProcessor:
 
             logger.info(f"识别到表头在第 {header_row+1} 行")
 
-            # Reset the header in place to avoid reading the file a second time
-            if progress_cb:
-                try:
-                    progress_cb(94)
-                except Exception:
-                    pass
-
-            # Use the detected header row as the column names and drop the rows above it
-            df.columns = df.iloc[header_row]
-            df = df.iloc[header_row + 1:].reset_index(drop=True)
-
-            logger.info(f"重新整理数据结构,共 {len(df)} 行有效数据")
+            # Re-read the Excel file with the detected header row
+            df = pd.read_excel(file_path, header=header_row)
+            logger.info(f"使用表头行重新读取数据,共 {len(df)} 行有效数据")
 
             # Extract product information
-            if progress_cb:
-                try:
-                    progress_cb(96)
-                except Exception:
-                    pass
             products = self.extract_product_info(df)
 
             if not products:
                 logger.warning("未提取到有效商品信息")
                 return None
 
-            # Build the output file name; save to the result directory
+            # Build the output file name; save to the data/result directory
             file_name = os.path.splitext(os.path.basename(file_path))[0]
-            result_dir = self.config.get_path('Paths', 'result_folder', fallback='data/result', create=True) if hasattr(self.config, 'get_path') else os.path.abspath('data/result')
+            result_dir = "data/result"
             os.makedirs(result_dir, exist_ok=True)
             output_file = os.path.join(result_dir, f"采购单_{file_name}.xls")
 
             # Fill the template and save
             if self.fill_template(products, output_file):
-                # Learn product memory from the processed result
-                try:
-                    self.product_db.learn_from_products(products, source='ocr')
-                    self.product_db._export_memory_json()
-                    logger.info(f"已从处理结果学习 {len(products)} 条商品记忆")
-                except Exception as e:
-                    logger.warning(f"学习商品记忆失败: {e}")
-
                 # Record the processed file
                 self.processed_files[file_path] = output_file
                 self._save_processed_files()
 
                 # No longer open the output directory automatically
                 logger.info(f"采购单已保存到: {output_file}")
-                if progress_cb:
-                    try:
-                        progress_cb(100)
-                    except Exception:
-                        pass
 
                 return output_file
 
@@ -696,7 +694,7 @@ class ExcelProcessor:
             logger.error(f"处理Excel文件时出错: {file_path}, 错误: {e}")
             return None
 
-    def process_latest_file(self, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def process_latest_file(self) -> Optional[str]:
         """
         Process the latest Excel file
 
@@ -710,7 +708,7 @@ class ExcelProcessor:
             return None
 
         # Process the file
-        return self.process_specific_file(latest_file, progress_cb=progress_cb)
+        return self.process_specific_file(latest_file)
 
     def _detect_column_mapping(self, df: pd.DataFrame) -> Dict[str, str]:
         """
@@ -730,25 +728,44 @@ class ExcelProcessor:
             logger.error("未找到条码列,无法处理")
             return {}
 
-        # Use ColumnMapper as the single place to look up column names
-        mapped_columns = {'barcode': barcode_cols[0]}
+        # Define the column-name mapping
+        column_mapping = {
+            'name': ['商品名称', '名称', '品名', '商品', '商品名', '商品或服务名称', '品项名', '产品名称', '品项', '名 称'],
+            'specification': ['规格', '规格型号', '型号', '商品规格', '产品规格', '包装规格','规 格'],
+            'quantity': ['数量', '采购数量', '购买数量', '采购数量', '订单数量', '数量(必填)', '入库数', '入库数量','数 量'],
+            'unit': ['单位', '采购单位', '计量单位', '单位(必填)', '单位名称', '计价单位','单 位'],
+            'price': ['单价', '价格', '采购单价', '销售价', '进货价', '单价(必填)', '采购价', '参考价', '入库单价','单 价'],
+            'amount': ['金额', '小计', '总价', '合计金额', '小计金额', '金额(元)', '金额合计', '合计', '总额']
+        }
+
+        # Map column names to the standard names
+        mapped_columns = {'barcode': barcode_cols[0]}  # use the first barcode column found
+
+        # Log the column mapping details
         logger.info(f"使用条码列: {mapped_columns['barcode']}")
 
-        # Internal key -> standard column name (processor.py uses price/amount as internal keys)
-        field_map = [
-            ('name', 'name'),
-            ('specification', 'specification'),
-            ('quantity', 'quantity'),
-            ('unit', 'unit'),
-            ('price', 'unit_price'),
-            ('amount', 'total_price'),
-        ]
-
-        for internal_key, standard_name in field_map:
-            matched = ColumnMapper.find_column(list(df.columns), standard_name)
-            if matched:
-                mapped_columns[internal_key] = matched
-                logger.info(f"找到{internal_key}列: {matched}")
+        for target, possible_names in column_mapping.items():
+            for col in df.columns:
+                col_str = str(col).strip()
+                for name in possible_names:
+                    if col_str == name:
+                        mapped_columns[target] = col
+                        logger.info(f"找到{target}列: {col}")
+                        break
+                if target in mapped_columns:
+                    break
+
+            # If no exact match was found, try a partial match
+            if target not in mapped_columns:
+                for col in df.columns:
+                    col_str = str(col).strip().lower()
+                    for name in possible_names:
+                        if name.lower() in col_str:
+                            mapped_columns[target] = col
+                            logger.info(f"找到{target}列(部分匹配): {col}")
+                            break
+                    if target in mapped_columns:
+                        break
 
         return mapped_columns
 
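Review note: the inline mapping added in the hunk above is a two-pass lookup, exact match first and substring match second. The sketch below restates that idea in plain Python so the matching order is easy to test. It is an illustration under assumptions: `COLUMN_ALIASES` is an abbreviated, hypothetical subset of the alias table, and `map_columns` is not a function from the repository.

```python
from typing import Dict, List, Sequence

# Abbreviated alias table for illustration only (a subset of the lists in the hunk).
COLUMN_ALIASES: Dict[str, List[str]] = {
    "name": ["商品名称", "名称", "品名"],
    "quantity": ["数量", "采购数量"],
    "price": ["单价", "价格"],
}


def map_columns(
    columns: Sequence[object],
    aliases: Dict[str, List[str]] = COLUMN_ALIASES,
) -> Dict[str, object]:
    """Map standard field names to the first matching source column."""
    mapped: Dict[str, object] = {}
    for target, names in aliases.items():
        # Pass 1: exact match on the stripped column name.
        exact = next((col for col in columns if str(col).strip() in names), None)
        if exact is not None:
            mapped[target] = exact
            continue
        # Pass 2: case-insensitive substring match, first hit wins.
        for col in columns:
            col_str = str(col).strip().lower()
            if any(name.lower() in col_str for name in names):
                mapped[target] = col
                break
    return mapped
```

One design consequence worth checking in review: short aliases make the substring pass permissive. With an alias such as '商品' in the `name` list, a sheet that has no exact name column but does have a column like "商品条码" would probably have that column picked up as the name.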
@@ -872,11 +889,6 @@ class ExcelProcessor:
 
         logger.debug(f"清理后的规格字符串: {spec_str}")
 
-        # New: match equation-style specs such as "1件=12桶/袋/盒…" and take the right-hand number as the pack quantity
-        eq_match = re.search(r'(\d+(?:\.\d+)?)\s*(?:件|箱|提|盒)\s*[==]\s*(\d+)\s*(?:瓶|桶|盒|支|个|袋|罐|包|卷)', spec_str)
-        if eq_match:
-            return int(eq_match.group(2))
-
         # Match formats with units, e.g. "5kg*6", "450g*15", "450ml*15"
         weight_pattern = r'(\d+(?:\.\d+)?)\s*(?:kg|KG|千克|公斤)[*×](\d+)'
         match = re.search(weight_pattern, spec_str)
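Review note: the equation-style spec match removed in the hunk above is small enough to exercise on its own. The sketch below is a hedged reconstruction, not the repository code: `package_quantity` is a hypothetical helper, and the character class is assumed to have been meant to accept both the ASCII `=` and the full-width `=`, since the class as shown in the diff lists the same character twice.

```python
import re
from typing import Optional

# Assumption: accept ASCII "=" and full-width "=" between the two sides.
EQ_SPEC = re.compile(
    r'(\d+(?:\.\d+)?)\s*(?:件|箱|提|盒)\s*[==]\s*(\d+)\s*(?:瓶|桶|盒|支|个|袋|罐|包|卷)'
)


def package_quantity(spec: str) -> Optional[int]:
    """Return the right-hand count of a spec like '1件=12桶', or None if it does not match."""
    match = EQ_SPEC.search(spec)
    return int(match.group(2)) if match else None
```

Specs in the `450g*15` style do not match this pattern and fall through to the weight and volume patterns that remain in the function.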
@@ -0,0 +1,355 @@
+"""
+Unit converter test module
+---------------
+Tests the unit conversion and barcode mapping logic
+"""
+
+import os
+import sys
+import unittest
+from typing import Dict, Any
+
+# Add the project root to the Python path
+sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '../../..')))
+
+from app.core.excel.converter import UnitConverter
+from app.core.excel.validators import ProductValidator
+
+
+class TestUnitConverter(unittest.TestCase):
+    """
+    Tests for the unit converter
+    """
+
+    def setUp(self):
+        """
+        Set up before each test
+        """
+        self.converter = UnitConverter()
+
+    def test_jian_unit_conversion(self):
+        """
+        Test conversion of the "件" unit
+        """
+        # Prepare test data
+        product = {
+            'barcode': '6954767400129',
+            'name': '美汁源果粒橙1.8L*8瓶',
+            'specification': '1.8L*8',
+            'quantity': 1.0,
+            'unit': '件',
+            'price': 65.0
+        }
+
+        # Run the conversion
+        result = self.converter.process_unit_conversion(product)
+
+        # Verify the result
+        self.assertEqual(result['quantity'], 8.0)
+        self.assertEqual(result['price'], 8.125)
+        self.assertEqual(result['unit'], '瓶')
+
+    def test_box_unit_conversion(self):
+        """
+        Test conversion of the "箱" unit
+        """
+        # Prepare test data
+        product = {
+            'barcode': '6925303721244',
+            'name': '统一鲜橙多2L*6瓶',
+            'specification': '2L*6',
+            'quantity': 1.0,
+            'unit': '箱',
+            'price': 43.0
+        }
+
+        # Run the conversion
+        result = self.converter.process_unit_conversion(product)
+
+        # Verify the result
+        self.assertEqual(result['quantity'], 6.0)
+        self.assertEqual(result['price'], 7.1666666666666667)
+        self.assertEqual(result['unit'], '瓶')
+
+    def test_tihe_unit_conversion_level3(self):
+        """
+        Test conversion of the "提" unit (three-level spec)
+        """
+        # Prepare test data (three-level spec 1*6*4: 1 row of 6 提, 4 bottles per 提)
+        product = {
+            'barcode': '6921168509347',
+            'name': '农夫山泉550ml*24瓶',
+            'specification': '1*6*4',
+            'quantity': 2.0,
+            'unit': '提',
+            'price': 16.0
+        }
+
+        # Run the conversion
+        result = self.converter.process_unit_conversion(product)
+
+        # Verify: with a three-level spec the 提 unit is special-cased, quantity * last level
+        self.assertEqual(result['quantity'], 8.0)  # 2 提 * 4 bottles per 提
+        self.assertEqual(result['price'], 4.0)  # 16 yuan per 提 ÷ 4 bottles per 提
+        self.assertEqual(result['unit'], '瓶')
+
+    def test_tihe_unit_conversion_level2(self):
+        """
+        Test conversion of the "提" unit (two-level spec)
+        """
+        # Prepare test data (two-level spec 1*4: 4 提 per 件)
+        product = {
+            'barcode': '6921168509347',
+            'name': '农夫山泉550ml*4瓶',
+            'specification': '1*4',
+            'quantity': 5.0,
+            'unit': '提',
+            'price': 10.0
+        }
+
+        # Run the conversion
+        result = self.converter.process_unit_conversion(product)
+
+        # Verify: with a two-level spec the 提 unit is left unchanged
+        self.assertEqual(result['quantity'], 5.0)
+        self.assertEqual(result['price'], 10.0)
+        self.assertEqual(result['unit'], '提')
+
+    def test_barcode_mapping(self):
+        """
+        Test barcode mapping
+        """
+        # Prepare test data (a barcode that needs to be mapped)
+        product = {
+            'barcode': '6920584471055',  # this barcode should map to 6920584471017
+            'name': '测试映射条码商品',
+            'specification': '1*12',
+            'quantity': 1.0,
+            'unit': '件',
+            'price': 60.0
+        }
+
+        # Run the conversion
+        result = self.converter.process_unit_conversion(product)
+
+        # Verify: the barcode should be mapped
+        self.assertEqual(result['barcode'], '6920584471017')
+        self.assertEqual(result['quantity'], 12.0)  # the 件 unit conversion is applied as well
+        self.assertEqual(result['price'], 5.0)  # 60 yuan per 件 ÷ 12 bottles per 件
+        self.assertEqual(result['unit'], '瓶')
+
+    def test_special_barcode_multiplier(self):
+        """
+        Test the multiplier handling for special barcodes
+        """
+        # Prepare test data (a special barcode)
+        product = {
+            'barcode': '6925019900087',  # special barcode: quantity * 10, unit becomes 瓶
+            'name': '特殊条码商品',
+            'specification': '1*10',
+            'quantity': 2.0,
+            'unit': '箱',
+            'price': 100.0
+        }
+
+        # Run the conversion
+        result = self.converter.process_unit_conversion(product)
+
+        # Verify: the special barcode multiplier should take effect
+        self.assertEqual(result['quantity'], 20.0)  # 2 箱 * multiplier 10
+        self.assertEqual(result['price'], 5.0)  # 100 yuan per 箱 ÷ multiplier 10 per 箱
+        self.assertEqual(result['unit'], '瓶')
+
+
+class TestProductValidator(unittest.TestCase):
+    """
+    Tests for the product data validator
+    """
+
+    def setUp(self):
+        """
+        Set up before each test
+        """
+        self.validator = ProductValidator()
+
+    def test_validate_barcode(self):
+        """
+        Test barcode validation
+        """
+        # Valid barcode
+        is_valid, barcode, error = self.validator.validate_barcode('6925303721244')
+        self.assertTrue(is_valid)
+        self.assertEqual(barcode, '6925303721244')
+        self.assertIsNone(error)
+
+        # Barcode containing non-digit characters
+        is_valid, barcode, error = self.validator.validate_barcode('6925303-721244')
+        self.assertTrue(is_valid)
+        self.assertEqual(barcode, '6925303721244')
+        self.assertIsNone(error)
+
+        # Correction of a barcode that starts with 5
+        is_valid, barcode, error = self.validator.validate_barcode('5925303721244')
+        self.assertTrue(is_valid)
+        self.assertEqual(barcode, '6925303721244')
+        self.assertIsNone(error)
+
+        # Barcode that is too short
+        is_valid, barcode, error = self.validator.validate_barcode('12345')
+        self.assertFalse(is_valid)
+        self.assertEqual(barcode, '12345')
+        self.assertIn("条码长度异常", error)
+
+        # Warehouse marker
+        is_valid, barcode, error = self.validator.validate_barcode('仓库')
+        self.assertFalse(is_valid)
+        self.assertEqual(barcode, '仓库')
+        self.assertEqual(error, "条码为仓库标识")
+
+        # Empty value
+        is_valid, barcode, error = self.validator.validate_barcode(None)
+        self.assertFalse(is_valid)
+        self.assertEqual(barcode, "")
+        self.assertEqual(error, "条码为空")
+
+    def test_validate_quantity(self):
+        """
+        Test quantity validation
+        """
+        # Valid quantity
+        is_valid, quantity, error = self.validator.validate_quantity(10)
+        self.assertTrue(is_valid)
+        self.assertEqual(quantity, 10.0)
+        self.assertIsNone(error)
+
+        # Quantity given as a string
+        is_valid, quantity, error = self.validator.validate_quantity("25.5")
+        self.assertTrue(is_valid)
+        self.assertEqual(quantity, 25.5)
+        self.assertIsNone(error)
+
+        # Quantity with a unit suffix
+        is_valid, quantity, error = self.validator.validate_quantity("30瓶")
+        self.assertTrue(is_valid)
+        self.assertEqual(quantity, 30.0)
+        self.assertIsNone(error)
+
+        # Zero quantity
+        is_valid, quantity, error = self.validator.validate_quantity(0)
+        self.assertFalse(is_valid)
+        self.assertEqual(quantity, 0.0)
+        self.assertIn("数量必须大于0", error)
+
+        # Negative quantity
+        is_valid, quantity, error = self.validator.validate_quantity(-5)
+        self.assertFalse(is_valid)
+        self.assertEqual(quantity, 0.0)
+        self.assertIn("数量必须大于0", error)
+
+        # Non-numeric value
+        is_valid, quantity, error = self.validator.validate_quantity("abc")
+        self.assertFalse(is_valid)
+        self.assertEqual(quantity, 0.0)
+        self.assertIn("数量不包含数字", error)
+
+        # Empty value
+        is_valid, quantity, error = self.validator.validate_quantity(None)
+        self.assertFalse(is_valid)
+        self.assertEqual(quantity, 0.0)
+        self.assertEqual(error, "数量为空")
+
+    def test_validate_price(self):
+        """
+        Test unit price validation
+        """
+        # Valid unit price
+        is_valid, price, is_gift, error = self.validator.validate_price(12.5)
+        self.assertTrue(is_valid)
+        self.assertEqual(price, 12.5)
+        self.assertFalse(is_gift)
+        self.assertIsNone(error)
+
+        # Unit price given as a string
+        is_valid, price, is_gift, error = self.validator.validate_price("8.0")
+        self.assertTrue(is_valid)
+        self.assertEqual(price, 8.0)
+        self.assertFalse(is_gift)
+        self.assertIsNone(error)
+
+        # Zero unit price (gift)
+        is_valid, price, is_gift, error = self.validator.validate_price(0)
+        self.assertTrue(is_valid)
+        self.assertEqual(price, 0.0)
+        self.assertTrue(is_gift)
+        self.assertIsNone(error)
+
+        # "赠品" (gift) marker
+        is_valid, price, is_gift, error = self.validator.validate_price("赠品")
+        self.assertTrue(is_valid)
+        self.assertEqual(price, 0.0)
+        self.assertTrue(is_gift)
+        self.assertIsNone(error)
+
+        # Negative unit price
+        is_valid, price, is_gift, error = self.validator.validate_price(-5)
+        self.assertFalse(is_valid)
+        self.assertEqual(price, 0.0)
+        self.assertTrue(is_gift)
+        self.assertIn("单价不能为负数", error)
+
+        # Empty value
+        is_valid, price, is_gift, error = self.validator.validate_price(None)
+        self.assertFalse(is_valid)
+        self.assertEqual(price, 0.0)
+        self.assertTrue(is_gift)
+        self.assertEqual(error, "单价为空,视为赠品")
+
+    def test_validate_product(self):
+        """
+        Test product data validation
+        """
+        # Prepare test data (a valid product)
+        product = {
+            'barcode': '6954767400129',
+            'name': '测试商品',
+            'specification': '1*12',
+            'quantity': 3.0,
+            'price': 36.0,
+            'unit': '件',
+            'is_gift': False
+        }
+
+        # Validate the valid product
+        result = self.validator.validate_product(product)
+        self.assertEqual(result['barcode'], '6954767400129')
+        self.assertEqual(result['quantity'], 3.0)
+        self.assertEqual(result['price'], 36.0)
+        self.assertFalse(result['is_gift'])
+
+        # Validate a gift product
+        gift_product = product.copy()
+        gift_product['price'] = 0
+        result = self.validator.validate_product(gift_product)
+        self.assertEqual(result['price'], 0.0)
+        self.assertTrue(result['is_gift'])
+
+        # Validate a product that needs fixing
+        invalid_product = {
+            'barcode': '5954767-400129',  # prefix must be fixed and non-digits removed
+            'name': '测试商品',
+            'specification': '1*12',
+            'quantity': '2件',  # the number must be extracted
+            'price': '赠品',  # gift marker
+            'unit': '件',
+            'is_gift': False
+        }
+
+        result = self.validator.validate_product(invalid_product)
+        self.assertEqual(result['barcode'], '6954767400129')  # 5 -> 6, '-' removed
+        self.assertEqual(result['quantity'], 2.0)  # number extracted
+        self.assertEqual(result['price'], 0.0)  # gift price is 0
+        self.assertTrue(result['is_gift'])  # flagged as a gift
+
+
+if __name__ == '__main__':
+    unittest.main()
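Review note: the expectations in `TestUnitConverter` above imply a simple rule for the common cases. "件" and "箱" are multiplied out by the last spec level, and "提" is only converted when the spec has three levels. The sketch below reconstructs that rule from the test expectations alone; `parse_spec_levels` and `convert_to_bottles` are hypothetical names, the real `UnitConverter` is not shown in this diff, and barcode mapping and special-barcode multipliers are deliberately left out.

```python
import re
from typing import Any, Dict, List


def parse_spec_levels(spec: str) -> List[float]:
    """Split a spec such as '1.8L*8' or '1*6*4' into the leading number of each level."""
    levels: List[float] = []
    for part in re.split(r'[*×]', spec):
        match = re.match(r'\s*(\d+(?:\.\d+)?)', part)
        if match:
            levels.append(float(match.group(1)))
    return levels


def convert_to_bottles(product: Dict[str, Any]) -> Dict[str, Any]:
    """Convert case-level quantity and price to per-bottle values (assumed rule)."""
    result = dict(product)  # do not mutate the caller's dict
    levels = parse_spec_levels(str(product.get('specification', '')))
    if len(levels) < 2 or levels[-1] <= 0:
        return result
    per_pack = levels[-1]
    unit = product.get('unit')
    # Assumption drawn from the tests: 提 is converted only for three-level specs.
    if unit in ('件', '箱') or (unit == '提' and len(levels) >= 3):
        result['quantity'] = product['quantity'] * per_pack
        result['price'] = product['price'] / per_pack
        result['unit'] = '瓶'
    return result
```

For the first test case, 1 件 at 65.0 with spec `1.8L*8` becomes 8 bottles at 65.0 / 8 = 8.125 each, which matches the asserted values.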
@@ -9,7 +9,6 @@ import logging
 from typing import Dict, Any, Optional, List, Tuple, Union
 
 from ..utils.log_utils import get_logger
-from ..utils.string_utils import parse_monetary_string
 
 logger = get_logger(__name__)
 
@@ -157,8 +156,23 @@ class ProductValidator:
             if price_str in ["赠品", "gift", "赠送", "0", ""]:
                 return True, 0.0, True, None
 
-            price_value = parse_monetary_string(price_str)
-            if price_value is None:
+            # Strip whitespace and non-numeric characters (keep the decimal point and comma)
+            price_clean = re.sub(r'[^\d\.,]', '', price_str)
+
+            # Handle the decimal point and comma
+            if ',' in price_clean and '.' not in price_clean:
+                # Only a comma and no decimal point: treat the comma as the decimal point
+                price_clean = price_clean.replace(',', '.')
+            elif ',' in price_clean and '.' in price_clean:
+                # Both a comma and a decimal point: drop the comma (assumed to be a thousands separator)
+                price_clean = price_clean.replace(',', '')
+
+            if not price_clean:
+                return False, 0.0, True, "单价不包含数字,视为赠品"
+
+            try:
+                price_value = float(price_clean)
+            except ValueError:
                 return False, 0.0, True, f"无法将单价 '{price}' 转换为数字,视为赠品"
         else:
             # Try a direct conversion
@@ -212,17 +226,6 @@ class ProductValidator:
         if error_msg:
             logger.info(error_msg)
 
-        amount = product.get('amount', None)
-        try:
-            is_amount_gift = False
-            parsed_amount = parse_monetary_string(amount)
-            if parsed_amount is None or parsed_amount == 0.0:
-                is_amount_gift = True
-            if is_amount_gift:
-                validated_product['is_gift'] = True
-        except Exception:
-            pass
-
         # Validate the quantity
         quantity = product.get('quantity', None)
         is_valid, fixed_quantity, error_msg = self.validate_quantity(quantity)
@@ -236,9 +239,18 @@ class ProductValidator:
             if fixed_price > 0 and amount is not None:
                 try:
                     # Make sure the amount is a number
-                    amount = parse_monetary_string(amount)
-                    if amount is None:
-                        raise ValueError("无法解析金额")
+                    if isinstance(amount, str):
+                        # Remove currency symbols and non-numeric characters; keep digits, the decimal point and comma
+                        amount_str = re.sub(r'[^\d\.,]', '', amount.strip())
+                        # Replace the comma with a decimal point (when the comma is the decimal separator)
+                        if ',' in amount_str and '.' not in amount_str:
+                            amount_str = amount_str.replace(',', '.')
+                        # Both a comma and a decimal point: the comma is usually a thousands separator
+                        elif ',' in amount_str and '.' in amount_str:
+                            amount_str = amount_str.replace(',', '')
+                        amount = float(amount_str)
+                    else:
+                        amount = float(amount)
 
                     # Quantity = amount / unit price
                     if amount > 0:
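Review note: the inline price and amount cleanup that replaces `parse_monetary_string` in the hunks above can be isolated to see how it treats separators. The function below is a minimal sketch that mirrors the inline logic; `parse_price` is a hypothetical name, and returning `None` on failure is an assumption made for the sketch (the validator returns an error tuple instead).

```python
import re
from typing import Optional


def parse_price(text: str) -> Optional[float]:
    """Parse a price string, keeping only digits, '.' and ','."""
    cleaned = re.sub(r'[^\d.,]', '', text)
    if ',' in cleaned and '.' not in cleaned:
        # Only commas: every comma is read as a decimal point.
        cleaned = cleaned.replace(',', '.')
    elif ',' in cleaned and '.' in cleaned:
        # Commas and a point: commas are dropped as thousands separators.
        cleaned = cleaned.replace(',', '')
    if not cleaned:
        return None
    try:
        return float(cleaned)
    except ValueError:
        return None
```

Two behaviors of this approach may deserve attention in review. A value such as `1,234` with no decimal point is read as 1.234 rather than 1234, and the cleanup regex also removes a leading minus sign, so a negative string price would come out positive.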
@@ -1,9 +0,0 @@
-"""
-Initialization file for the data-processing handlers module
-"""
-
-from .data_cleaner import DataCleaner
-from .column_mapper import ColumnMapper
-from .calculator import DataCalculator
-
-__all__ = ['DataCleaner', 'ColumnMapper', 'DataCalculator']
@@ -1,378 +0,0 @@
-"""
-Data calculation handler
-
-Provides data calculation features such as quantity calculation, price calculation and summary statistics
-"""
-
-import pandas as pd
-import numpy as np
-from typing import Dict, Any, Optional, List, Union
-from ...core.utils.log_utils import get_logger
-
-logger = get_logger(__name__)
-
-
-class DataCalculator:
-    """Data calculation handler
-
-    Provides standardized data calculations and supports various business calculation rules
-    """
-
-    def __init__(self, config: Optional[Dict[str, Any]] = None):
-        """Initialize the data calculator
-
-        Args:
-            config: calculation configuration
-        """
-        self.config = config or {}
-        self.calculation_rules = []
-
-    def add_rule(self, rule_type: str, **kwargs):
-        """Add a calculation rule
-
-        Args:
-            rule_type: rule type
-            **kwargs: rule parameters
-        """
-        rule = {'type': rule_type, **kwargs}
-        self.calculation_rules.append(rule)
-        logger.debug(f"添加计算规则: {rule_type}")
-
-    def calculate(self, df: pd.DataFrame) -> pd.DataFrame:
-        """Run the calculations
-
-        Args:
-            df: input data
-
-        Returns:
-            the calculated data
-        """
-        logger.info(f"开始数据计算,原始数据形状: {df.shape}")
-
-        result_df = df.copy()
-
-        for i, rule in enumerate(self.calculation_rules):
-            try:
-                logger.debug(f"执行计算规则 {i+1}/{len(self.calculation_rules)}: {rule['type']}")
-                result_df = self._apply_rule(result_df, rule)
-                logger.debug(f"规则执行完成,数据形状: {result_df.shape}")
-            except Exception as e:
-                logger.error(f"计算规则执行失败: {rule}, 错误: {e}")
-                # Continue with the next rule instead of aborting the whole run
-                continue
-
-        logger.info(f"数据计算完成,最终数据形状: {result_df.shape}")
-        return result_df
-
-    def _apply_rule(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
-        """Apply a single calculation rule
-
-        Args:
-            df: data
-            rule: rule configuration
-
-        Returns:
-            the processed data
-        """
-        rule_type = rule.get('type')
-
-        if rule_type == 'multiply':
-            return self._multiply(df, rule)
-        elif rule_type == 'divide':
-            return self._divide(df, rule)
-        elif rule_type == 'add':
-            return self._add(df, rule)
-        elif rule_type == 'subtract':
-            return self._subtract(df, rule)
-        elif rule_type == 'formula':
-            return self._formula(df, rule)
-        elif rule_type == 'round':
-            return self._round(df, rule)
-        elif rule_type == 'sum':
-            return self._sum(df, rule)
-        elif rule_type == 'aggregate':
-            return self._aggregate(df, rule)
-        else:
-            logger.warning(f"未知的计算规则类型: {rule_type}")
-            return df
-
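Review note: the removed `DataCalculator` is a small rule pipeline. Rules are queued as dicts, dispatched by type, and a failing rule is skipped rather than aborting the run. The sketch below shows the same pattern over plain lists of dicts with a dispatch table instead of an `if/elif` chain. `RuleCalculator` is a hypothetical stand-in written for illustration; only two rule types are included, and `add_rule` returns `self` here as a simplification (in the removed class only the convenience methods do).

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Handler = Callable[[List[Row], Dict[str, Any]], List[Row]]


class RuleCalculator:
    """Minimal rule pipeline over a list of dict rows (illustrative only)."""

    def __init__(self) -> None:
        self.rules: List[Dict[str, Any]] = []
        # Dispatch table: rule type -> handler.
        self._handlers: Dict[str, Handler] = {
            'multiply': self._multiply,
            'round': self._round,
        }

    def add_rule(self, rule_type: str, **kwargs: Any) -> 'RuleCalculator':
        self.rules.append({'type': rule_type, **kwargs})
        return self  # returning self allows chaining

    def calculate(self, rows: List[Row]) -> List[Row]:
        result = [dict(row) for row in rows]  # work on a copy of the input
        for rule in self.rules:
            handler = self._handlers.get(rule['type'])
            if handler is None:
                continue  # unknown rule types are skipped
            try:
                result = handler(result, rule)
            except Exception:
                continue  # a failing rule does not abort the pipeline
        return result

    @staticmethod
    def _multiply(rows: List[Row], rule: Dict[str, Any]) -> List[Row]:
        for row in rows:
            if rule['source_column'] in row:
                row[rule['target_column']] = row[rule['source_column']] * rule['factor']
        return rows

    @staticmethod
    def _round(rows: List[Row], rule: Dict[str, Any]) -> List[Row]:
        for row in rows:
            for col in rule['columns']:
                if isinstance(row.get(col), (int, float)):
                    row[col] = round(row[col], rule.get('decimals', 0))
        return rows
```

A chained call such as `RuleCalculator().add_rule('multiply', ...).add_rule('round', ...).calculate(rows)` then applies the rules in insertion order and leaves the input rows untouched.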
def _multiply(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""乘法计算
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
source_column = rule.get('source_column')
|
|
||||||
target_column = rule.get('target_column')
|
|
||||||
factor = rule.get('factor', 1)
|
|
||||||
|
|
||||||
if source_column and target_column:
|
|
||||||
if source_column in df.columns:
|
|
||||||
df[target_column] = df[source_column] * factor
|
|
||||||
logger.debug(f"乘法计算: {source_column} * {factor} -> {target_column}")
|
|
||||||
else:
|
|
||||||
logger.warning(f"源列不存在: {source_column}")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _divide(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""除法计算
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
source_column = rule.get('source_column')
|
|
||||||
target_column = rule.get('target_column')
|
|
||||||
divisor = rule.get('divisor', 1)
|
|
||||||
|
|
||||||
if source_column and target_column and divisor != 0:
|
|
||||||
if source_column in df.columns:
|
|
||||||
df[target_column] = df[source_column] / divisor
|
|
||||||
logger.debug(f"除法计算: {source_column} / {divisor} -> {target_column}")
|
|
||||||
else:
|
|
||||||
logger.warning(f"源列不存在: {source_column}")
|
|
||||||
elif divisor == 0:
|
|
||||||
logger.error("除数不能为0")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _add(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""加法计算
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
columns = rule.get('columns', [])
|
|
||||||
target_column = rule.get('target_column')
|
|
||||||
constant = rule.get('constant', 0)
|
|
||||||
|
|
||||||
if target_column:
|
|
||||||
if isinstance(columns, str):
|
|
||||||
columns = [columns]
|
|
||||||
|
|
||||||
if columns:
|
|
||||||
# 列相加
|
|
||||||
valid_columns = [col for col in columns if col in df.columns]
|
|
||||||
if valid_columns:
|
|
||||||
df[target_column] = df[valid_columns].sum(axis=1) + constant
|
|
||||||
logger.debug(f"加法计算: {valid_columns} + {constant} -> {target_column}")
|
|
||||||
else:
|
|
||||||
logger.warning(f"没有有效的列用于加法计算: {columns}")
|
|
||||||
else:
|
|
||||||
# 只加常数
|
|
||||||
if target_column in df.columns:
|
|
||||||
df[target_column] = df[target_column] + constant
|
|
||||||
logger.debug(f"加法计算: {target_column} + {constant}")
|
|
||||||
else:
|
|
||||||
logger.warning(f"目标列不存在: {target_column}")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _subtract(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""减法计算
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
minuend = rule.get('minuend') # 被减数列
|
|
||||||
subtrahend = rule.get('subtrahend') # 减数列
|
|
||||||
target_column = rule.get('target_column')
|
|
||||||
constant = rule.get('constant', 0)
|
|
||||||
|
|
||||||
if target_column and minuend and minuend in df.columns:
|
|
||||||
if subtrahend and subtrahend in df.columns:
|
|
||||||
df[target_column] = df[minuend] - df[subtrahend] - constant
|
|
||||||
logger.debug(f"减法计算: {minuend} - {subtrahend} - {constant} -> {target_column}")
|
|
||||||
else:
|
|
||||||
df[target_column] = df[minuend] - constant
|
|
||||||
logger.debug(f"减法计算: {minuend} - {constant} -> {target_column}")
|
|
||||||
else:
|
|
||||||
logger.warning(f"减法计算参数不完整或列不存在")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _formula(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""公式计算
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
formula = rule.get('formula')
|
|
||||||
target_column = rule.get('target_column')
|
|
||||||
|
|
||||||
if formula and target_column:
|
|
||||||
try:
|
|
||||||
df[target_column] = df.eval(formula)
|
|
||||||
logger.debug(f"公式计算: {formula} -> {target_column}")
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"公式计算失败: {formula}, 错误: {e}")
|
|
||||||
else:
|
|
||||||
logger.warning("公式计算缺少公式或目标列")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _round(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""四舍五入
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
columns = rule.get('columns', [])
|
|
||||||
decimals = rule.get('decimals', 0)
|
|
||||||
|
|
||||||
if isinstance(columns, str):
|
|
||||||
columns = [columns]
|
|
||||||
|
|
||||||
target_columns = columns or df.select_dtypes(include=[np.number]).columns
|
|
||||||
|
|
||||||
for col in target_columns:
|
|
||||||
if col in df.columns and pd.api.types.is_numeric_dtype(df[col]):
|
|
||||||
df[col] = df[col].round(decimals)
|
|
||||||
logger.debug(f"四舍五入: {col} 保留 {decimals} 位小数")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _sum(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""求和计算
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
columns = rule.get('columns', [])
|
|
||||||
target_column = rule.get('target_column')
|
|
||||||
group_by = rule.get('group_by')
|
|
||||||
|
|
||||||
if isinstance(columns, str):
|
|
||||||
columns = [columns]
|
|
||||||
|
|
||||||
if group_by and group_by in df.columns:
|
|
||||||
# 分组求和
|
|
||||||
if columns:
|
|
||||||
for col in columns:
|
|
||||||
if col in df.columns:
|
|
||||||
sum_result = df.groupby(group_by)[col].sum()
|
|
||||||
logger.debug(f"分组求和: {col} 按 {group_by} 分组")
|
|
||||||
else:
|
|
||||||
# 所有数值列分组求和
|
|
||||||
numeric_columns = df.select_dtypes(include=[np.number]).columns
|
|
||||||
sum_result = df.groupby(group_by)[numeric_columns].sum()
|
|
||||||
logger.debug(f"分组求和: 所有数值列 按 {group_by} 分组")
|
|
||||||
else:
|
|
||||||
# 总体求和
|
|
||||||
if columns:
|
|
||||||
valid_columns = [col for col in columns if col in df.columns]
|
|
||||||
if valid_columns and target_column:
|
|
||||||
df[target_column] = df[valid_columns].sum(axis=1)
|
|
||||||
logger.debug(f"求和计算: {valid_columns} -> {target_column}")
|
|
||||||
else:
|
|
||||||
# 所有数值列求和
|
|
||||||
numeric_columns = df.select_dtypes(include=[np.number]).columns
|
|
||||||
if target_column and len(numeric_columns) > 0:
|
|
||||||
df[target_column] = df[numeric_columns].sum(axis=1)
|
|
||||||
logger.debug(f"求和计算: {list(numeric_columns)} -> {target_column}")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _aggregate(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""聚合计算
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
group_by = rule.get('group_by')
|
|
||||||
aggregations = rule.get('aggregations', {})
|
|
||||||
|
|
||||||
if group_by and group_by in df.columns:
|
|
||||||
# 构建聚合函数字典
|
|
||||||
agg_dict = {}
|
|
||||||
for column, func in aggregations.items():
|
|
||||||
if column in df.columns:
|
|
||||||
if isinstance(func, str):
|
|
||||||
agg_dict[column] = func
|
|
||||||
elif isinstance(func, list):
|
|
||||||
agg_dict[column] = func
|
|
||||||
|
|
||||||
if agg_dict:
|
|
||||||
result = df.groupby(group_by).agg(agg_dict)
|
|
||||||
logger.debug(f"聚合计算: 按 {group_by} 分组, 聚合: {agg_dict}")
|
|
||||||
return result.reset_index()
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
    # Convenience methods (each records a rule and returns self for chaining)
    def multiply(self, source_column: str, target_column: str, factor: float):
        """Add a multiplication rule."""
        self.add_rule('multiply', source_column=source_column,
                      target_column=target_column, factor=factor)
        return self

    def divide(self, source_column: str, target_column: str, divisor: float):
        """Add a division rule."""
        self.add_rule('divide', source_column=source_column,
                      target_column=target_column, divisor=divisor)
        return self

    def add(self, columns: Union[str, List[str]], target_column: str, constant: float = 0):
        """Add an addition rule."""
        self.add_rule('add', columns=columns, target_column=target_column, constant=constant)
        return self

    def subtract(self, minuend: str, target_column: str,
                 subtrahend: Optional[str] = None, constant: float = 0):
        """Add a subtraction rule."""
        self.add_rule('subtract', minuend=minuend, target_column=target_column,
                      subtrahend=subtrahend, constant=constant)
        return self

    def formula(self, formula: str, target_column: str):
        """Add a formula rule."""
        self.add_rule('formula', formula=formula, target_column=target_column)
        return self

    def round_columns(self, columns: Optional[Union[str, List[str]]] = None, decimals: int = 0):
        """Add a rounding rule."""
        self.add_rule('round', columns=columns, decimals=decimals)
        return self

    def sum_columns(self, columns: Optional[Union[str, List[str]]] = None,
                    target_column: Optional[str] = None, group_by: Optional[str] = None):
        """Add a sum rule."""
        self.add_rule('sum', columns=columns, target_column=target_column, group_by=group_by)
        return self

    def aggregate(self, group_by: str, aggregations: Dict[str, Union[str, List[str]]]):
        """Add an aggregation rule."""
        self.add_rule('aggregate', group_by=group_by, aggregations=aggregations)
        return self
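
The convenience methods above each record a rule and return `self`, which is what allows calls to be chained. The following is a minimal standalone sketch of that pattern, not the project's implementation: `RuleChain` and its `apply` method are hypothetical names (the class these methods belong to is not visible in this excerpt), and plain dict rows stand in for a DataFrame.

```python
from typing import Any, Dict, List


class RuleChain:
    """Hypothetical stand-in: records rules, then applies them in order."""

    def __init__(self) -> None:
        self.rules: List[Dict[str, Any]] = []

    def add_rule(self, rule_type: str, **kwargs: Any) -> None:
        self.rules.append({"type": rule_type, **kwargs})

    def multiply(self, source_column: str, target_column: str, factor: float) -> "RuleChain":
        self.add_rule("multiply", source_column=source_column,
                      target_column=target_column, factor=factor)
        return self  # returning self is what enables chaining

    def round_columns(self, columns: List[str], decimals: int = 0) -> "RuleChain":
        self.add_rule("round", columns=columns, decimals=decimals)
        return self

    def apply(self, rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        out = [dict(r) for r in rows]  # copy so the caller's rows are untouched
        for rule in self.rules:
            for row in out:
                if rule["type"] == "multiply":
                    row[rule["target_column"]] = row[rule["source_column"]] * rule["factor"]
                elif rule["type"] == "round":
                    for col in rule["columns"]:
                        row[col] = round(row[col], rule["decimals"])
        return out


source_rows = [{"unit_price": 2.5}]
chain = RuleChain().multiply("unit_price", "total_price", 3).round_columns(["total_price"], 2)
result = chain.apply(source_rows)
```

Rules are only recorded while chaining; nothing is computed until `apply` runs, so the same chain can be reused on several inputs.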
|
|
||||||
@@ -1,382 +0,0 @@
|
|||||||
"""
|
|
||||||
Column mapping processor
|
|
||||||
|
|
||||||
Provides column-name mapping and conversion, and normalizes the column names used by different suppliers.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import re
|
|
||||||
import pandas as pd
|
|
||||||
from typing import Dict, Any, Optional, List, Union
|
|
||||||
from ...core.utils.log_utils import get_logger
|
|
||||||
|
|
||||||
logger = get_logger(__name__)
|
|
||||||
|
|
||||||
|
|
||||||
class ColumnMapper:
"""Column mapping processor

Normalizes column names by mapping each supplier's column names to the standard column names.
|
|
||||||
"""
|
|
||||||
|
|
||||||
# Standard column name definitions (the single source of truth for all column-name aliases)
|
|
||||||
STANDARD_COLUMNS = {
|
|
||||||
'barcode': [
|
|
||||||
'条码', '条形码', '商品条码', '商品条形码', '产品条码', '商品编码',
|
|
||||||
'商品编号', '条码(必填)', '电脑条码', '条码ID',
|
|
||||||
'barcode', 'Barcode', 'BarCode', 'code', '编码',
|
|
||||||
],
|
|
||||||
'name': [
|
|
||||||
'商品名称', '产品名称', '名称', '商品', '产品', '商品名', '品名',
|
|
||||||
'品项名', '商品或服务名称', '品项', '名 称',
|
|
||||||
'name', 'product_name',
|
|
||||||
],
|
|
||||||
'specification': [
|
|
||||||
'规格', '规格型号', '型号', '商品规格', '产品规格', '包装规格', '规 格',
|
|
||||||
'specification', 'spec', 'model',
|
|
||||||
],
|
|
||||||
'quantity': [
|
|
||||||
'数量', '采购量', '订货数量', '订单量', '需求量', '采购数量', '购买数量',
|
|
||||||
'订单数量', '数量(必填)', '采购量(必填)', '入库数', '入库数量', '数 量',
|
|
||||||
'quantity', 'qty',
|
|
||||||
],
|
|
||||||
'unit': [
|
|
||||||
'单位', '计量单位', '采购单位', '单位(必填)', '单位名称', '计价单位', '单 位',
|
|
||||||
'unit', 'units',
|
|
||||||
],
|
|
||||||
'unit_price': [
|
|
||||||
'单价', '价格', '采购单价', '进货价', '销售价', '采购价', '参考价',
|
|
||||||
'入库单价', '单价(必填)', '采购单价(必填)', '价格(必填)', '单 价',
|
|
||||||
'unit_price', 'price',
|
|
||||||
],
|
|
||||||
'total_price': [
|
|
||||||
'总价', '金额', '小计', '合计金额', '小计金额', '金额(元)',
|
|
||||||
'金额合计', '合计', '总额',
|
|
||||||
'total_price', 'total', 'amount',
|
|
||||||
],
|
|
||||||
'gift_quantity': [
|
|
||||||
'赠送量', '赠品数量', '赠送数量', '赠品',
|
|
||||||
],
|
|
||||||
'category': ['类别', '分类', '商品类别', 'category', 'type'],
|
|
||||||
'brand': ['品牌', '商标', 'brand'],
|
|
||||||
'supplier': ['供应商', '供货商', 'supplier', 'vendor'],
|
|
||||||
}
|
|
||||||
|
|
||||||
def __init__(self, mapping_config: Optional[Dict[str, Any]] = None):
|
|
||||||
"""初始化列映射器
|
|
||||||
|
|
||||||
Args:
|
|
||||||
mapping_config: 映射配置
|
|
||||||
"""
|
|
||||||
self.mapping_config = mapping_config or {}
|
|
||||||
self.custom_mappings = {}
|
|
||||||
self._build_reverse_mapping()
|
|
||||||
|
|
||||||
def _build_reverse_mapping(self):
|
|
||||||
"""构建反向映射表"""
|
|
||||||
self.reverse_mapping = {}
|
|
||||||
|
|
||||||
# 添加标准列的反向映射
|
|
||||||
for standard_name, variations in self.STANDARD_COLUMNS.items():
|
|
||||||
for variation in variations:
|
|
||||||
self.reverse_mapping[variation.lower()] = standard_name
|
|
||||||
|
|
||||||
# 添加自定义映射
|
|
||||||
for standard_name, custom_names in self.mapping_config.items():
|
|
||||||
if isinstance(custom_names, str):
|
|
||||||
custom_names = [custom_names]
|
|
||||||
|
|
||||||
for custom_name in custom_names:
|
|
||||||
self.reverse_mapping[custom_name.lower()] = standard_name
|
|
||||||
self.custom_mappings[custom_name.lower()] = standard_name
|
|
||||||
|
|
||||||
def map_columns(self, df: pd.DataFrame, target_columns: Optional[List[str]] = None) -> pd.DataFrame:
|
|
||||||
"""映射列名
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 输入数据
|
|
||||||
target_columns: 目标列名列表,如果为None则使用所有标准列
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
列名映射后的数据
|
|
||||||
"""
|
|
||||||
if target_columns is None:
|
|
||||||
target_columns = list(self.STANDARD_COLUMNS.keys())
|
|
||||||
|
|
||||||
logger.info(f"开始列名映射,目标列: {target_columns}")
|
|
||||||
logger.info(f"原始列名: {list(df.columns)}")
|
|
||||||
|
|
||||||
# 创建列名映射
|
|
||||||
column_mapping = {}
|
|
||||||
used_columns = set()
|
|
||||||
|
|
||||||
for target_col in target_columns:
|
|
||||||
# 查找匹配的原始列名
|
|
||||||
matched_column = self._find_matching_column(df.columns, target_col)
|
|
||||||
if matched_column:
|
|
||||||
column_mapping[matched_column] = target_col
|
|
||||||
used_columns.add(matched_column)
|
|
||||||
logger.debug(f"列名映射: {matched_column} -> {target_col}")
|
|
||||||
|
|
||||||
# 重命名列
|
|
||||||
if column_mapping:
|
|
||||||
df_mapped = df.rename(columns=column_mapping)
|
|
||||||
|
|
||||||
# 添加缺失的目标列
|
|
||||||
for target_col in target_columns:
|
|
||||||
if target_col not in df_mapped.columns:
|
|
||||||
df_mapped[target_col] = self._get_default_value(target_col)
|
|
||||||
logger.debug(f"添加缺失列: {target_col}")
|
|
||||||
|
|
||||||
# 只保留目标列
|
|
||||||
existing_target_columns = [col for col in target_columns if col in df_mapped.columns]
|
|
||||||
df_result = df_mapped[existing_target_columns]
|
|
||||||
|
|
||||||
logger.info(f"列名映射完成,结果列名: {list(df_result.columns)}")
|
|
||||||
return df_result
|
|
||||||
else:
|
|
||||||
logger.warning("没有找到可映射的列名")
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _find_matching_column(self, columns: List[str], target_column: str) -> Optional[str]:
|
|
||||||
"""查找匹配的列名
|
|
||||||
|
|
||||||
Args:
|
|
||||||
columns: 原始列名列表
|
|
||||||
target_column: 目标标准列名
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
匹配的原始列名或None
|
|
||||||
"""
|
|
||||||
# 获取目标列的所有可能变体
|
|
||||||
possible_names = []
|
|
||||||
|
|
||||||
# 标准列名变体
|
|
||||||
if target_column in self.STANDARD_COLUMNS:
|
|
||||||
possible_names.extend(self.STANDARD_COLUMNS[target_column])
|
|
||||||
|
|
||||||
# 自定义映射
|
|
||||||
for standard_name, custom_names in self.mapping_config.items():
|
|
||||||
if standard_name == target_column:
|
|
||||||
if isinstance(custom_names, str):
|
|
||||||
possible_names.append(custom_names)
|
|
||||||
else:
|
|
||||||
possible_names.extend(custom_names)
|
|
||||||
|
|
||||||
# 查找匹配
|
|
||||||
for possible_name in possible_names:
|
|
||||||
# 精确匹配(忽略大小写)
|
|
||||||
for column in columns:
|
|
||||||
if column.lower() == possible_name.lower():
|
|
||||||
return column
|
|
||||||
|
|
||||||
# 模糊匹配
|
|
||||||
for column in columns:
|
|
||||||
if possible_name.lower() in column.lower() or column.lower() in possible_name.lower():
|
|
||||||
return column
|
|
||||||
|
|
||||||
return None
|
|
||||||
|
|
||||||
def _get_default_value(self, column_name: str) -> Any:
|
|
||||||
"""获取列的默认值
|
|
||||||
|
|
||||||
Args:
|
|
||||||
column_name: 列名
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
默认值
|
|
||||||
"""
|
|
||||||
# 根据列名类型返回合适的默认值
|
|
||||||
if column_name in ['quantity', 'unit_price', 'total_price']:
|
|
||||||
return 0
|
|
||||||
elif column_name in ['barcode', 'name', 'specification', 'unit', 'category', 'brand', 'supplier']:
|
|
||||||
return ''
|
|
||||||
else:
|
|
||||||
return None
|
|
||||||
|
|
||||||
def add_custom_mapping(self, standard_name: str, custom_names: Union[str, List[str]]):
|
|
||||||
"""添加自定义列名映射
|
|
||||||
|
|
||||||
Args:
|
|
||||||
standard_name: 标准列名
|
|
||||||
custom_names: 自定义列名或列名列表
|
|
||||||
"""
|
|
||||||
if isinstance(custom_names, str):
|
|
||||||
custom_names = [custom_names]
|
|
||||||
|
|
||||||
# 更新配置
|
|
||||||
self.mapping_config[standard_name] = custom_names
|
|
||||||
|
|
||||||
# 更新反向映射
|
|
||||||
for custom_name in custom_names:
|
|
||||||
self.reverse_mapping[custom_name.lower()] = standard_name
|
|
||||||
self.custom_mappings[custom_name.lower()] = standard_name
|
|
||||||
|
|
||||||
logger.info(f"添加自定义映射: {standard_name} <- {custom_names}")
|
|
||||||
|
|
||||||
def detect_column_types(self, df: pd.DataFrame) -> Dict[str, str]:
|
|
||||||
"""检测列的数据类型
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
列类型字典
|
|
||||||
"""
|
|
||||||
column_types = {}
|
|
||||||
|
|
||||||
for column in df.columns:
|
|
||||||
if pd.api.types.is_numeric_dtype(df[column]):
|
|
||||||
column_types[column] = 'numeric'
|
|
||||||
elif pd.api.types.is_datetime64_any_dtype(df[column]):
|
|
||||||
column_types[column] = 'datetime'
|
|
||||||
elif pd.api.types.is_bool_dtype(df[column]):
|
|
||||||
column_types[column] = 'boolean'
|
|
||||||
else:
|
|
||||||
column_types[column] = 'text'
|
|
||||||
|
|
||||||
return column_types
|
|
||||||
|
|
||||||
def suggest_column_mapping(self, df: pd.DataFrame) -> Dict[str, List[str]]:
|
|
||||||
"""建议列名映射
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
建议的映射关系
|
|
||||||
"""
|
|
||||||
suggestions = {}
|
|
||||||
|
|
||||||
for column in df.columns:
|
|
||||||
column_lower = column.lower()
|
|
||||||
suggestions[column] = []
|
|
||||||
|
|
||||||
# 检查标准列名
|
|
||||||
for standard_name, variations in self.STANDARD_COLUMNS.items():
|
|
||||||
for variation in variations:
|
|
||||||
if column_lower in variation.lower() or variation.lower() in column_lower:
|
|
||||||
suggestions[column].append(standard_name)
|
|
||||||
|
|
||||||
# 检查自定义映射
|
|
||||||
for custom_name, standard_name in self.custom_mappings.items():
|
|
||||||
if column_lower in custom_name or custom_name in column_lower:
|
|
||||||
suggestions[column].append(standard_name)
|
|
||||||
|
|
||||||
# 去重
|
|
||||||
suggestions[column] = list(set(suggestions[column]))
|
|
||||||
|
|
||||||
# 只返回有建议的列
|
|
||||||
return {k: v for k, v in suggestions.items() if v}
|
|
||||||
|
|
||||||
def validate_mapping(self, df: pd.DataFrame, required_columns: List[str]) -> Dict[str, Any]:
|
|
||||||
"""验证列映射结果
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 映射后的数据
|
|
||||||
required_columns: 必需的列名列表
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
验证结果
|
|
||||||
"""
|
|
||||||
result = {
|
|
||||||
'valid': True,
|
|
||||||
'missing_columns': [],
|
|
||||||
'empty_columns': [],
|
|
||||||
'warnings': []
|
|
||||||
}
|
|
||||||
|
|
||||||
# 检查缺失列
|
|
||||||
for col in required_columns:
|
|
||||||
if col not in df.columns:
|
|
||||||
result['missing_columns'].append(col)
|
|
||||||
result['valid'] = False
|
|
||||||
|
|
||||||
# 检查空列
|
|
||||||
for col in df.columns:
|
|
||||||
if df[col].isnull().all():
|
|
||||||
result['empty_columns'].append(col)
|
|
||||||
result['warnings'].append(f"列 '{col}' 全部为空值")
|
|
||||||
|
|
||||||
# 检查数值列
|
|
||||||
numeric_columns = ['quantity', 'unit_price', 'total_price']
|
|
||||||
for col in numeric_columns:
|
|
||||||
if col in df.columns and not pd.api.types.is_numeric_dtype(df[col]):
|
|
||||||
result['warnings'].append(f"列 '{col}' 不是数值类型")
|
|
||||||
|
|
||||||
return result
|
|
||||||
|
|
||||||
    @classmethod
    def find_column(cls, columns: List[str], standard_name: str) -> Optional[str]:
        """Find the column in a list of column names that matches a standard column name.

        Matching strategy: exact match → whitespace-insensitive match → substring match

        Args:
            columns: Actual column names.
            standard_name: Standard column name (a key of STANDARD_COLUMNS).

        Returns:
            The matching actual column name, or None if nothing matches.
        """
        candidates = cls.STANDARD_COLUMNS.get(standard_name, [])
        if not candidates:
            return None

        columns_str = [str(c) for c in columns]

        # Exact match
        for col in columns_str:
            col_clean = col.strip()
            for candidate in candidates:
                if col_clean == candidate:
                    return col

        # Whitespace-insensitive match
        for col in columns_str:
            col_clean = re.sub(r'\s+', '', col.strip())
            for candidate in candidates:
                if col_clean == re.sub(r'\s+', '', candidate):
                    return col

        # Substring match (a candidate name is contained in the column name)
        for col in columns_str:
            col_lower = col.strip().lower()
            for candidate in candidates:
                if candidate.lower() in col_lower:
                    return col

        return None
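
To make the three matching tiers of `find_column` concrete, here is a small standalone sketch that can be run without pandas. It mirrors the tier order described in the docstring; the `ALIASES` table is a shortened, hypothetical stand-in for `STANDARD_COLUMNS`, not the real alias list.

```python
import re
from typing import List, Optional

# Hypothetical, shortened alias table used only for this illustration.
ALIASES = {"name": ["商品名称", "名称", "name"]}


def find_column(columns: List[str], standard_name: str) -> Optional[str]:
    candidates = ALIASES.get(standard_name, [])
    cols = [str(c) for c in columns]

    # Tier 1: exact match after trimming the column name
    for col in cols:
        if col.strip() in candidates:
            return col

    # Tier 2: match after removing all whitespace on both sides
    squashed = {re.sub(r"\s+", "", c) for c in candidates}
    for col in cols:
        if re.sub(r"\s+", "", col) in squashed:
            return col

    # Tier 3: a candidate is a case-insensitive substring of the column name
    for col in cols:
        lowered = col.strip().lower()
        if any(c.lower() in lowered for c in candidates):
            return col

    return None


spaced = find_column(["条码", "名 称"], "name")        # resolved by tier 2
embedded = find_column(["Product Name (EN)"], "name")  # resolved by tier 3
missing = find_column(["foo"], "name")
```

Because the looser tiers only run when the stricter ones find nothing, an exact header always wins over a header that merely contains an alias.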
|
|
||||||
|
|
||||||
    @staticmethod
    def detect_header_row(df: pd.DataFrame, max_rows: int = 10, min_matches: int = 3) -> int:
        """Detect which row holds the table header.

        Scans the first max_rows rows and returns the index of the row with the most keyword matches.

        Args:
            df: Data frame.
            max_rows: Maximum number of rows to scan.
            min_matches: Minimum number of keyword matches required.

        Returns:
            The header row index, or -1 if no row qualifies.
        """
        header_keywords = [
            '条码', '条形码', '商品条码', '商品名称', '名称', '规格',
            '单价', '数量', '金额', '单位', '必填', '编码',
        ]

        best_row = -1
        best_matches = 0

        for row_idx in range(min(max_rows, len(df))):
            row_values = df.iloc[row_idx].astype(str)
            matches = sum(
                1 for kw in header_keywords
                if any(kw in str(val) for val in row_values.values)
            )
            if matches >= min_matches and matches > best_matches:
                best_matches = matches
                best_row = row_idx

        return best_row
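
The header detection above counts, for each of the first rows, how many keywords appear in at least one cell, and keeps the best row that reaches the threshold. A minimal sketch of the same scoring on plain lists (a stand-in for the DataFrame, with a shortened, illustrative keyword list) might look like this:

```python
from typing import List, Sequence

# Shortened keyword list for illustration; the original list is longer.
HEADER_KEYWORDS = ["条码", "商品名称", "名称", "规格", "单价", "数量", "金额", "单位"]


def detect_header_row(rows: List[Sequence[object]], max_rows: int = 10, min_matches: int = 3) -> int:
    best_row, best_matches = -1, 0
    for idx, row in enumerate(rows[:max_rows]):
        cells = [str(value) for value in row]
        # A keyword counts once per row, however many cells contain it.
        matches = sum(1 for kw in HEADER_KEYWORDS if any(kw in cell for cell in cells))
        if matches >= min_matches and matches > best_matches:
            best_row, best_matches = idx, matches
    return best_row


sheet = [
    ["某某供应商送货单", None, None, None],              # title row, no keywords
    ["商品条码", "商品名称", "数量", "单价"],            # header row
    ["6901234567890", "矿泉水550ml*24", 2, 30.0],        # data row
]
header_index = detect_header_row(sheet)
```

Ties are resolved in favour of the earliest row, because a later row only replaces the current best when it has strictly more matches.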
|
|
||||||
@@ -1,401 +0,0 @@
|
|||||||
"""
|
|
||||||
Data cleaning processor
|
|
||||||
|
|
||||||
Provides data cleaning features such as missing-value handling, duplicate handling, and data type conversion.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import pandas as pd
|
|
||||||
from typing import Dict, Any, Optional, List, Union
|
|
||||||
from ...core.utils.log_utils import get_logger
|
|
||||||
|
|
||||||
logger = get_logger(__name__)
|
|
||||||
|
|
||||||
|
|
||||||
class DataCleaner:
"""Data cleaning processor

Provides standardized data cleaning, with support for method chaining and rule-based configuration.
|
|
||||||
"""
|
|
||||||
|
|
||||||
def __init__(self, config: Optional[Dict[str, Any]] = None):
|
|
||||||
"""初始化数据清洗器
|
|
||||||
|
|
||||||
Args:
|
|
||||||
config: 清洗配置
|
|
||||||
"""
|
|
||||||
self.config = config or {}
|
|
||||||
self.cleaning_rules = []
|
|
||||||
|
|
||||||
def add_rule(self, rule_type: str, **kwargs):
|
|
||||||
"""添加清洗规则
|
|
||||||
|
|
||||||
Args:
|
|
||||||
rule_type: 规则类型
|
|
||||||
**kwargs: 规则参数
|
|
||||||
"""
|
|
||||||
rule = {'type': rule_type, **kwargs}
|
|
||||||
self.cleaning_rules.append(rule)
|
|
||||||
logger.debug(f"添加清洗规则: {rule_type}")
|
|
||||||
|
|
||||||
def clean(self, df: pd.DataFrame) -> pd.DataFrame:
|
|
||||||
"""执行数据清洗
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 输入数据
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
清洗后的数据
|
|
||||||
"""
|
|
||||||
logger.info(f"开始数据清洗,原始数据形状: {df.shape}")
|
|
||||||
|
|
||||||
result_df = df.copy()
|
|
||||||
|
|
||||||
for i, rule in enumerate(self.cleaning_rules):
|
|
||||||
try:
|
|
||||||
logger.debug(f"执行清洗规则 {i+1}/{len(self.cleaning_rules)}: {rule['type']}")
|
|
||||||
result_df = self._apply_rule(result_df, rule)
|
|
||||||
logger.debug(f"规则执行完成,数据形状: {result_df.shape}")
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"清洗规则执行失败: {rule}, 错误: {e}")
|
|
||||||
# Continue with the next rule instead of aborting the whole run
|
|
||||||
continue
|
|
||||||
|
|
||||||
logger.info(f"数据清洗完成,最终数据形状: {result_df.shape}")
|
|
||||||
return result_df
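
The `clean` loop above applies rules in order and, when one rule raises, logs the error and carries on with the data as it was before that rule. A compact sketch of that fault-tolerant loop, using plain functions as rules and dict rows instead of a DataFrame (`run_rules` and the three sample rules are hypothetical names for this illustration), could be:

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Rule = Callable[[List[Row]], List[Row]]


def run_rules(rows: List[Row], rules: List[Rule]) -> List[Row]:
    result = [dict(r) for r in rows]  # copy so the input is not mutated
    for rule in rules:
        try:
            result = rule(result)
        except Exception as exc:
            # Report the failure and keep the last good result.
            print(f"rule {getattr(rule, '__name__', rule)!r} failed: {exc}")
            continue
    return result


def drop_empty_names(rows: List[Row]) -> List[Row]:
    return [r for r in rows if r.get("name")]


def broken_rule(rows: List[Row]) -> List[Row]:
    raise ValueError("boom")


def strip_names(rows: List[Row]) -> List[Row]:
    return [{**r, "name": r["name"].strip()} for r in rows]


cleaned = run_rules([{"name": " 可乐 "}, {"name": ""}],
                    [drop_empty_names, broken_rule, strip_names])
```

The trade-off of this design is that a failing rule is skipped silently apart from the log entry, so callers who need strict guarantees would have to inspect the logs or re-validate the output.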
|
|
||||||
|
|
||||||
def _apply_rule(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""应用单个清洗规则
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
rule_type = rule.get('type')
|
|
||||||
|
|
||||||
if rule_type == 'remove_duplicates':
|
|
||||||
return self._remove_duplicates(df, rule)
|
|
||||||
elif rule_type == 'fill_na':
|
|
||||||
return self._fill_na(df, rule)
|
|
||||||
elif rule_type == 'remove_rows':
|
|
||||||
return self._remove_rows(df, rule)
|
|
||||||
elif rule_type == 'convert_type':
|
|
||||||
return self._convert_type(df, rule)
|
|
||||||
elif rule_type == 'strip_whitespace':
|
|
||||||
return self._strip_whitespace(df, rule)
|
|
||||||
elif rule_type == 'normalize_text':
|
|
||||||
return self._normalize_text(df, rule)
|
|
||||||
elif rule_type == 'validate_data':
|
|
||||||
return self._validate_data(df, rule)
|
|
||||||
else:
|
|
||||||
logger.warning(f"未知的清洗规则类型: {rule_type}")
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _remove_duplicates(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""移除重复项
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
subset = rule.get('subset') # 用于判断重复的列
|
|
||||||
keep = rule.get('keep', 'first') # 保留哪个重复项
|
|
||||||
|
|
||||||
before_count = len(df)
|
|
||||||
df_cleaned = df.drop_duplicates(subset=subset, keep=keep)
|
|
||||||
after_count = len(df_cleaned)
|
|
||||||
|
|
||||||
logger.info(f"移除重复项: {before_count - after_count} 行被移除")
|
|
||||||
return df_cleaned
|
|
||||||
|
|
||||||
    def _fill_na(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
        """Fill missing values.

        Args:
            df: Input data.
            rule: Rule configuration.

        Returns:
            The processed data.
        """
        columns = rule.get('columns')  # columns to process
        value = rule.get('value', 0)  # fill value
        method = rule.get('method')  # fill method ('ffill', 'bfill', 'mean', 'median')

        # Series.ffill()/bfill() are used instead of the deprecated fillna(method=...)
        if columns:
            # Process the specified columns
            if isinstance(columns, str):
                columns = [columns]

            for col in columns:
                if col in df.columns:
                    if method == 'ffill':
                        df[col] = df[col].ffill()
                    elif method == 'bfill':
                        df[col] = df[col].bfill()
                    elif method == 'mean':
                        df[col] = df[col].fillna(df[col].mean())
                    elif method == 'median':
                        df[col] = df[col].fillna(df[col].median())
                    else:
                        df[col] = df[col].fillna(value)

                    logger.debug(f"Filled missing values in column {col}: {method or value}")
        else:
            # Process all columns ('mean' and 'median' fall back to the fill value here)
            if method == 'ffill':
                df = df.ffill()
            elif method == 'bfill':
                df = df.bfill()
            else:
                df = df.fillna(value)

            logger.debug(f"Filled missing values in all columns: {method or value}")

        return df
|
|
||||||
|
|
||||||
def _remove_rows(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""移除行
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
condition = rule.get('condition') # 条件表达式
|
|
||||||
columns = rule.get('columns') # 要检查的列
|
|
||||||
values = rule.get('values') # 要移除的值
|
|
||||||
|
|
||||||
if condition:
|
|
||||||
# 使用条件表达式
|
|
||||||
try:
|
|
||||||
before_count = len(df)
|
|
||||||
df_filtered = df.query(condition)
|
|
||||||
after_count = len(df_filtered)
|
|
||||||
logger.info(f"条件过滤: {condition}, 移除了 {before_count - after_count} 行")
|
|
||||||
return df_filtered
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"条件表达式执行失败: {condition}, 错误: {e}")
|
|
||||||
return df
|
|
||||||
|
|
||||||
if columns and values:
|
|
||||||
# 基于列值过滤
|
|
||||||
if isinstance(columns, str):
|
|
||||||
columns = [columns]
|
|
||||||
if not isinstance(values, list):
|
|
||||||
values = [values]
|
|
||||||
|
|
||||||
df_filtered = df.copy()
|
|
||||||
for col in columns:
|
|
||||||
if col in df_filtered.columns:
|
|
||||||
mask = ~df_filtered[col].isin(values)
|
|
||||||
df_filtered = df_filtered[mask]
|
|
||||||
logger.debug(f"列 {col} 过滤值 {values}")
|
|
||||||
|
|
||||||
return df_filtered
|
|
||||||
|
|
||||||
logger.warning("移除行规则缺少条件或列配置")
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _convert_type(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""类型转换
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
columns = rule.get('columns')
|
|
||||||
target_type = rule.get('target_type', 'float')
|
|
||||||
errors = rule.get('errors', 'coerce') # 错误处理方式
|
|
||||||
|
|
||||||
if isinstance(columns, str):
|
|
||||||
columns = [columns]
|
|
||||||
|
|
||||||
for col in columns:
|
|
||||||
if col in df.columns:
|
|
||||||
try:
|
|
||||||
if target_type == 'int':
|
|
||||||
df[col] = pd.to_numeric(df[col], errors=errors).astype('Int64')
|
|
||||||
elif target_type == 'float':
|
|
||||||
df[col] = pd.to_numeric(df[col], errors=errors)
|
|
||||||
elif target_type == 'datetime':
|
|
||||||
df[col] = pd.to_datetime(df[col], errors=errors)
|
|
||||||
elif target_type == 'string':
|
|
||||||
df[col] = df[col].astype(str)
|
|
||||||
else:
|
|
||||||
df[col] = df[col].astype(target_type)
|
|
||||||
|
|
||||||
logger.debug(f"列 {col} 类型转换: {target_type}")
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"列 {col} 类型转换失败: {e}")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _strip_whitespace(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""去除空白字符
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
columns = rule.get('columns')
|
|
||||||
|
|
||||||
if columns:
|
|
||||||
if isinstance(columns, str):
|
|
||||||
columns = [columns]
|
|
||||||
|
|
||||||
for col in columns:
|
|
||||||
if col in df.columns and df[col].dtype == 'object':
|
|
||||||
df[col] = df[col].str.strip()
|
|
||||||
logger.debug(f"列 {col} 去除空白字符")
|
|
||||||
else:
|
|
||||||
# 处理所有文本列
|
|
||||||
text_columns = df.select_dtypes(include=['object']).columns
|
|
||||||
for col in text_columns:
|
|
||||||
df[col] = df[col].str.strip()
|
|
||||||
|
|
||||||
logger.debug(f"所有文本列去除空白字符: {list(text_columns)}")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _normalize_text(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""文本标准化
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
columns = rule.get('columns')
|
|
||||||
lowercase = rule.get('lowercase', False)
|
|
||||||
uppercase = rule.get('uppercase', False)
|
|
||||||
replace_map = rule.get('replace_map', {}) # 替换映射
|
|
||||||
|
|
||||||
if isinstance(columns, str):
|
|
||||||
columns = [columns]
|
|
||||||
|
|
||||||
target_columns = columns or df.select_dtypes(include=['object']).columns
|
|
||||||
|
|
||||||
for col in target_columns:
|
|
||||||
if col in df.columns and df[col].dtype == 'object':
|
|
||||||
if lowercase:
|
|
||||||
df[col] = df[col].str.lower()
|
|
||||||
elif uppercase:
|
|
||||||
df[col] = df[col].str.upper()
|
|
||||||
|
|
||||||
# 应用替换映射
|
|
||||||
for old, new in replace_map.items():
|
|
||||||
df[col] = df[col].str.replace(old, new, regex=False)  # literal replacement, independent of the pandas default
|
|
||||||
|
|
||||||
logger.debug(f"列 {col} 文本标准化完成")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
def _validate_data(self, df: pd.DataFrame, rule: Dict[str, Any]) -> pd.DataFrame:
|
|
||||||
"""数据验证
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 数据
|
|
||||||
rule: 规则配置
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
处理后的数据
|
|
||||||
"""
|
|
||||||
columns = rule.get('columns')
|
|
||||||
min_value = rule.get('min_value')
|
|
||||||
max_value = rule.get('max_value')
|
|
||||||
required = rule.get('required', False)
|
|
||||||
|
|
||||||
if isinstance(columns, str):
|
|
||||||
columns = [columns]
|
|
||||||
|
|
||||||
validation_results = []
|
|
||||||
|
|
||||||
for col in columns:
|
|
||||||
if col in df.columns:
|
|
||||||
# 检查必需值
|
|
||||||
if required:
|
|
||||||
null_count = df[col].isnull().sum()
|
|
||||||
if null_count > 0:
|
|
||||||
validation_results.append(f"{col}: {null_count} 个空值")
|
|
||||||
|
|
||||||
# 检查数值范围
|
|
||||||
if min_value is not None or max_value is not None:
|
|
||||||
if pd.api.types.is_numeric_dtype(df[col]):
|
|
||||||
invalid_mask = pd.Series(False, index=df.index)
|
|
||||||
if min_value is not None:
|
|
||||||
invalid_mask |= df[col] < min_value
|
|
||||||
if max_value is not None:
|
|
||||||
invalid_mask |= df[col] > max_value
|
|
||||||
|
|
||||||
invalid_count = invalid_mask.sum()
|
|
||||||
if invalid_count > 0:
|
|
||||||
validation_results.append(f"{col}: {invalid_count} 个值超出范围")
|
|
||||||
|
|
||||||
if validation_results:
|
|
||||||
logger.warning(f"数据验证发现问题: {', '.join(validation_results)}")
|
|
||||||
else:
|
|
||||||
logger.debug("数据验证通过")
|
|
||||||
|
|
||||||
return df
|
|
||||||
|
|
||||||
    # Convenience methods (each records a rule and returns self for chaining)
    def remove_duplicates(self, subset: Optional[List[str]] = None, keep: str = 'first'):
        """Add a duplicate-removal rule."""
        self.add_rule('remove_duplicates', subset=subset, keep=keep)
        return self

    def fill_na(self, columns: Optional[Union[str, List[str]]] = None,
                value: Any = 0, method: Optional[str] = None):
        """Add a missing-value fill rule."""
        self.add_rule('fill_na', columns=columns, value=value, method=method)
        return self

    def remove_rows(self, condition: Optional[str] = None,
                    columns: Optional[Union[str, List[str]]] = None,
                    values: Optional[Any] = None):
        """Add a row-removal rule."""
        self.add_rule('remove_rows', condition=condition, columns=columns, values=values)
        return self

    def convert_type(self, columns: Union[str, List[str]], target_type: str, errors: str = 'coerce'):
        """Add a type-conversion rule."""
        self.add_rule('convert_type', columns=columns, target_type=target_type, errors=errors)
        return self

    def strip_whitespace(self, columns: Optional[Union[str, List[str]]] = None):
        """Add a whitespace-stripping rule."""
        self.add_rule('strip_whitespace', columns=columns)
        return self

    def normalize_text(self, columns: Optional[Union[str, List[str]]] = None,
                       lowercase: bool = False, uppercase: bool = False,
                       replace_map: Optional[Dict[str, str]] = None):
        """Add a text-normalization rule."""
        self.add_rule('normalize_text', columns=columns, lowercase=lowercase,
                      uppercase=uppercase, replace_map=replace_map or {})
        return self

    def validate_data(self, columns: Union[str, List[str]],
                      min_value: Optional[float] = None,
                      max_value: Optional[float] = None,
                      required: bool = False):
        """Add a data-validation rule."""
        self.add_rule('validate_data', columns=columns, min_value=min_value,
                      max_value=max_value, required=required)
        return self
|
|
||||||
@@ -1,150 +0,0 @@
|
|||||||
import re
|
|
||||||
import pandas as pd
|
|
||||||
from typing import List, Dict, Any, Optional
|
|
||||||
|
|
||||||
def _split_quantity_unit(df: pd.DataFrame, source: str, dictionary: Optional[Dict[str, Any]] = None) -> pd.DataFrame:
    # Split values such as "2箱" into a numeric quantity and a unit.
    if source in df.columns:
        # Fill missing values before casting: astype(str) first would turn NaN into the string "nan"
        vals = df[source].fillna("").astype(str)
        nums = []
        units = []
        default_unit = (dictionary or {}).get("default_unit", "")
        unit_synonyms = (dictionary or {}).get("unit_synonyms", {})
        for v in vals:
            m = re.search(r"(\d+(?:\.\d+)?)(箱|件|提|盒|瓶)", v)
            if m:
                nums.append(float(m.group(1)))
                u = unit_synonyms.get(m.group(2), m.group(2))
                units.append(u)
            else:
                # No recognized unit: treat the value as a bare number, else fall back to 0
                try:
                    nums.append(float(v))
                    units.append(unit_synonyms.get(default_unit, default_unit))
                except Exception:
                    nums.append(0.0)
                    units.append(unit_synonyms.get(default_unit, default_unit))
        df["quantity"] = nums
        df["unit"] = units
    return df
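
The core of `_split_quantity_unit` is a single regular expression that captures a number followed by one of five packaging units. The sketch below isolates that step for one value so the behavior is easy to check; `split_quantity_unit` is a hypothetical helper written for this illustration, and it leaves out the `unit_synonyms` lookup.

```python
import re
from typing import Tuple

# Same pattern as in the function above: a number, then one of the known units.
QTY_UNIT = re.compile(r"(\d+(?:\.\d+)?)(箱|件|提|盒|瓶)")


def split_quantity_unit(value: str, default_unit: str = "") -> Tuple[float, str]:
    match = QTY_UNIT.search(value)
    if match:
        return float(match.group(1)), match.group(2)
    try:
        # A bare number keeps the default unit.
        return float(value), default_unit
    except ValueError:
        # Anything unparseable becomes quantity 0 with the default unit.
        return 0.0, default_unit


boxed = split_quantity_unit("2箱")
fractional = split_quantity_unit("1.5件")
bare = split_quantity_unit("12", default_unit="瓶")
invalid = split_quantity_unit("abc")
```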
|
|
||||||
|
|
||||||
def _extract_spec_from_name(df: pd.DataFrame, source: str, dictionary: Optional[Dict[str, Any]] = None) -> pd.DataFrame:
|
|
||||||
if source in df.columns:
|
|
||||||
names = df[source].fillna("").astype(str)  # fill before casting so NaN does not become "nan"
|
|
||||||
specs = []
|
|
||||||
packs = []
|
|
||||||
ignore_words = (dictionary or {}).get("ignore_words", [])
|
|
||||||
name_patterns = (dictionary or {}).get("name_patterns", [])
|
|
||||||
for s in names:
|
|
||||||
if ignore_words:
|
|
||||||
for w in ignore_words:
|
|
||||||
s = s.replace(w, "")
|
|
||||||
matched = False
|
|
||||||
for pat in name_patterns:
|
|
||||||
try:
|
|
||||||
m = re.search(pat, s)
|
|
||||||
if m and len(m.groups()) >= 2:
|
|
||||||
try:
|
|
||||||
qty = int(m.group(len(m.groups())))
|
|
||||||
except Exception:
|
|
||||||
qty = None
|
|
||||||
specs.append(s)
|
|
||||||
packs.append(qty)
|
|
||||||
matched = True
|
|
||||||
break
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
if matched:
|
|
||||||
continue
|
|
||||||
m = re.search(r"(\d+(?:\.\d+)?)(ml|l|升|毫升)[*×xX](\d+)", s, re.IGNORECASE)
|
|
||||||
if m:
|
|
||||||
specs.append(f"{m.group(1)}{m.group(2)}*{m.group(3)}")
|
|
||||||
packs.append(int(m.group(3)))
|
|
||||||
continue
|
|
||||||
m2 = re.search(r"(\d+)[*×xX](\d+)", s)
|
|
||||||
if m2:
|
|
||||||
specs.append(f"1*{m2.group(2)}")
|
|
||||||
packs.append(int(m2.group(2)))
|
|
||||||
continue
|
|
||||||
m3 = re.search(r"(\d{2,3})\D*(\d{1,3})\D*", s)
|
|
||||||
if m3:
|
|
||||||
specs.append(f"1*{m3.group(2)}")
|
|
||||||
packs.append(int(m3.group(2)))
|
|
||||||
continue
|
|
||||||
specs.append("")
|
|
||||||
packs.append(None)
|
|
||||||
df["specification"] = df.get("specification", pd.Series(specs))
|
|
||||||
df["package_quantity"] = packs
|
|
||||||
return df
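
When no configured `name_patterns` entry matches, `_extract_spec_from_name` falls back to built-in patterns. The sketch below reproduces the first two fallbacks (volume with pack count, then a plain `a*b` form) for a single product name; `extract_spec` is a hypothetical helper for illustration and omits the third, looser fallback and the dictionary handling.

```python
import re
from typing import Optional, Tuple


def extract_spec(name: str) -> Tuple[str, Optional[int]]:
    # Fallback 1: volume plus pack count, e.g. "550ml*24"
    m = re.search(r"(\d+(?:\.\d+)?)(ml|l|升|毫升)[*×xX](\d+)", name, re.IGNORECASE)
    if m:
        return f"{m.group(1)}{m.group(2)}*{m.group(3)}", int(m.group(3))

    # Fallback 2: "a*b" without a volume unit; the spec is normalized to "1*b"
    m = re.search(r"(\d+)[*×xX](\d+)", name)
    if m:
        return f"1*{m.group(2)}", int(m.group(2))

    return "", None


water = extract_spec("矿泉水550ml*24")
cola = extract_spec("可乐330ML×12")
chips = extract_spec("薯片1*20")
plain = extract_spec("苹果")
```

The second element of each result corresponds to the `package_quantity` column, which `_normalize_unit` later uses as the multiplier when converting box-level quantities.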
|
|
||||||
|
|
||||||
-def _normalize_unit(df: pd.DataFrame, target: str, unit_map: Dict[str, str], dictionary: Optional[Dict[str, Any]] = None) -> pd.DataFrame:
-    if target in df.columns:
-        df[target] = df[target].astype(str)
-        df[target] = df[target].apply(lambda u: unit_map.get(u, u))
-        pack_multipliers = (dictionary or {}).get("pack_multipliers", {})
-        default_pq = (dictionary or {}).get("default_package_quantity", 1)
-        try:
-            if "quantity" in df.columns:
-                def convert_qty(row):
-                    u = row.get(target)
-                    q = row.get("quantity")
-                    pq = row.get("package_quantity")
-                    if u in ("件", "箱", "提", "盒"):
-                        mult = pq or pack_multipliers.get(u, default_pq)
-                        if pd.notna(q) and pd.notna(mult) and float(mult) > 0:
-                            return float(q) * float(mult)
-                    return q
-                df["quantity"] = df.apply(convert_qty, axis=1)
-                df[target] = df[target].apply(lambda u: "瓶" if u in ("件","箱","提","盒") else u)
-        except Exception:
-            pass
-    return df
-
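The quantity conversion inside `_normalize_unit` reduces to a small per-row rule: for a pack-level unit, multiply by the row's own package quantity if present, otherwise by a configured multiplier, otherwise by the default. A rough pure-Python sketch of that rule follows; `to_base_units` is a hypothetical name, and the sketch assumes plain numbers or `None` rather than pandas values (so it does not reproduce the `pd.notna` handling of NaN).

```python
from typing import Dict, Optional, Tuple, Union

Number = Union[int, float]

# Pack-level units from the diff (件, 箱, 提, 盒); they are all relabeled to 瓶.
PACK_UNITS = ("件", "箱", "提", "盒")
BASE_UNIT = "瓶"


def to_base_units(
    quantity: Number,
    unit: str,
    package_quantity: Optional[Number] = None,
    pack_multipliers: Optional[Dict[str, Number]] = None,
    default_pq: Number = 1,
) -> Tuple[Number, str]:
    """Convert a (quantity, unit) pair to base units (hypothetical helper)."""
    if unit not in PACK_UNITS:
        return quantity, unit
    # `or` treats a package quantity of 0 or None as "not provided".
    mult = package_quantity or (pack_multipliers or {}).get(unit, default_pq)
    if mult and float(mult) > 0:
        return float(quantity) * float(mult), BASE_UNIT
    return quantity, BASE_UNIT


if __name__ == "__main__":
    print(to_base_units(2, "箱", 24))               # row-level pack size wins
    print(to_base_units(3, "提", None, {"提": 6}))  # falls back to the multiplier table
    print(to_base_units(5, "瓶"))                   # already a base unit
```

One consequence of the `pq or ...` expression is that a package quantity of 0 is silently replaced by the configured multiplier.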
-def _compute_quantity_from_total(df: pd.DataFrame) -> pd.DataFrame:
-    if "quantity" in df.columns and "unit_price" in df.columns:
-        qty = df["quantity"].fillna(0)
-        up = pd.to_numeric(df.get("unit_price", 0), errors="coerce").fillna(0)
-        tp = pd.to_numeric(df.get("total_price", 0), errors="coerce").fillna(0)
-        need = (qty <= 0) & (up > 0) & (tp > 0)
-        df.loc[need, "quantity"] = (tp[need] / up[need]).round(6)
-    return df
-
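The back-fill rule in `_compute_quantity_from_total` is: only when the quantity is missing or non-positive, and both prices are positive, set quantity to total price divided by unit price, rounded to six decimals. A scalar sketch of that rule, assuming plain Python numbers instead of pandas columns (`infer_quantity` is a hypothetical name):

```python
from typing import Optional, Union

Number = Union[int, float]


def infer_quantity(
    quantity: Optional[Number],
    unit_price: Optional[Number],
    total_price: Optional[Number],
) -> Optional[Number]:
    """Back-fill a missing quantity from total_price / unit_price."""
    qty = quantity or 0      # missing quantity counts as 0, like fillna(0)
    up = unit_price or 0
    tp = total_price or 0
    if qty <= 0 and up > 0 and tp > 0:
        return round(tp / up, 6)
    return quantity          # otherwise leave the original value untouched


if __name__ == "__main__":
    print(infer_quantity(0, 2.5, 10.0))   # quantity derived from the prices
    print(infer_quantity(5, 2, 100))      # existing quantity is kept
```

An existing positive quantity is never overwritten, even when it disagrees with the two prices.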
-def _fill_missing(df: pd.DataFrame, fills: Dict[str, Any]) -> pd.DataFrame:
-    for k, v in fills.items():
-        if k in df.columns:
-            df[k] = df[k].fillna(v)
-        else:
-            df[k] = v
-    return df
-
-def _mark_gift(df: pd.DataFrame) -> pd.DataFrame:
-    df["is_gift"] = False
-    tp = df.get("total_price")
-    up = df.get("unit_price")
-    flags = pd.Series([False]*len(df))
-    if tp is not None:
-        tpn = pd.to_numeric(tp, errors="coerce").fillna(0)
-        flags = flags | (tpn == 0)
-    if up is not None:
-        upn = pd.to_numeric(up, errors="coerce").fillna(0)
-        flags = flags | (upn == 0)
-    if "name" in df.columns:
-        flags = flags | df["name"].astype(str).str.contains(r"赠品|^o$|^O$", regex=True)
-    df.loc[flags, "is_gift"] = True
-    return df
-
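`_mark_gift` flags a row as a gift when any one of three conditions holds: zero (or unparseable) total price, zero unit price, or a name that contains 赠品 or is exactly `o`/`O`. The sketch below restates that as a per-row predicate; `is_gift` is a hypothetical helper, and it assumes both price fields are present (the removed function skips a price check when its column is absent).

```python
import re
from typing import Optional, Union

Number = Union[int, float]

# Same pattern as in the diff: contains 赠品 ("gift"), or the whole name is "o" / "O".
GIFT_NAME = re.compile(r"赠品|^o$|^O$")


def is_gift(name: str, unit_price: Optional[Number], total_price: Optional[Number]) -> bool:
    """Return True when the row looks like a free gift item."""
    if (total_price or 0) == 0:
        return True
    if (unit_price or 0) == 0:
        return True
    return bool(GIFT_NAME.search(name))


if __name__ == "__main__":
    print(is_gift("Cola", 3.0, 72.0))
    print(is_gift("赠品 cup", 3.0, 72.0))
    print(is_gift("Cola", 0, 0))
```

Since a missing price is treated as 0, rows whose price cell failed to parse are also flagged as gifts, which may or may not be the intent.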
-def apply_rules(df: pd.DataFrame, rules: List[Dict[str, Any]], dictionary: Optional[Dict[str, Any]] = None) -> pd.DataFrame:
-    out = df.copy()
-    for r in rules or []:
-        t = r.get("type")
-        if t == "split_quantity_unit":
-            out = _split_quantity_unit(out, r.get("source", "quantity"), dictionary)
-        elif t == "extract_spec_from_name":
-            out = _extract_spec_from_name(out, r.get("source", "name"), dictionary)
-        elif t == "normalize_unit":
-            out = _normalize_unit(out, r.get("target", "unit"), r.get("map", {}), dictionary)
-        elif t == "compute_quantity_from_total":
-            out = _compute_quantity_from_total(out)
-        elif t == "fill_missing":
-            out = _fill_missing(out, r.get("fills", {}))
-        elif t == "mark_gift":
-            out = _mark_gift(out)
-    return out
-
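`apply_rules` is a sequential dispatcher: each rule dict carries a `type`, rules run in list order on a copy of the input, and unknown types are skipped. The same shape can be sketched with a lookup table instead of an `if`/`elif` chain. This is an illustration only: it works on lists of dicts rather than DataFrames, and `HANDLERS`, `apply_rules_sketch`, and the two simplified handlers are hypothetical names, not code from the repository.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]
Rows = List[Row]
Rule = Dict[str, Any]


def fill_missing(rows: Rows, rule: Rule) -> Rows:
    """Fill keys that are absent or None with the configured defaults."""
    fills = rule.get("fills", {})
    out = []
    for row in rows:
        new_row = dict(row)
        for key, value in fills.items():
            if new_row.get(key) is None:
                new_row[key] = value
        out.append(new_row)
    return out


def mark_gift(rows: Rows, rule: Rule) -> Rows:
    """Simplified gift flag: a missing or zero total price marks a gift."""
    return [{**row, "is_gift": not row.get("total_price")} for row in rows]


# Rule type -> handler; a table lookup replaces the if/elif chain.
HANDLERS: Dict[str, Callable[[Rows, Rule], Rows]] = {
    "fill_missing": fill_missing,
    "mark_gift": mark_gift,
}


def apply_rules_sketch(rows: Rows, rules: Optional[List[Rule]]) -> Rows:
    out = [dict(r) for r in rows]            # work on a copy, like df.copy()
    for rule in rules or []:
        handler = HANDLERS.get(rule.get("type"))
        if handler is not None:              # unknown rule types are skipped
            out = handler(out, rule)
    return out


if __name__ == "__main__":
    rules = [{"type": "fill_missing", "fills": {"unit": "瓶"}}, {"type": "mark_gift"}]
    print(apply_rules_sketch([{"name": "cola", "total_price": 72.0}], rules))
```

Rule order matters in both versions: `normalize_unit`, for example, reads `package_quantity`, so it only benefits from `extract_spec_from_name` if that rule is listed first.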
@@ -4,25 +4,24 @@
 Provides access to and invocation of the Baidu OCR API.
 """
 
+import os
 import time
 import base64
 import requests
-from typing import Dict, Optional, Union
+import logging
+from typing import Dict, Optional, Any, Union
 
+from ...config.settings import ConfigManager
 from ..utils.log_utils import get_logger
 
 logger = get_logger(__name__)
 
-# Token expiry constants
-_DEFAULT_TOKEN_LIFETIME = 30 * 24 * 3600  # 30 days (seconds)
-_TOKEN_EARLY_EXPIRY = 3600  # refresh 1 hour early (seconds)
-
 class TokenManager:
     """
     Token manager: obtains and refreshes the Baidu API access token.
     """
 
-    def __init__(self, api_key: str, secret_key: str, max_retries: int = 3, retry_delay: int = 2, token_url: str = None):
+    def __init__(self, api_key: str, secret_key: str, max_retries: int = 3, retry_delay: int = 2):
         """
         Initialize the token manager.
 
@@ -31,13 +30,11 @@ class TokenManager:
             secret_key: Baidu Secret Key
             max_retries: maximum number of retries
             retry_delay: delay between retries (seconds)
-            token_url: token endpoint URL
         """
         self.api_key = api_key
         self.secret_key = secret_key
         self.max_retries = max_retries
         self.retry_delay = retry_delay
-        self.token_url = token_url or 'https://aip.baidubce.com/oauth/2.0/token'
         self.access_token = None
         self.token_expiry = 0
 
@@ -72,7 +69,7 @@ class TokenManager:
         Returns:
             A new access token, or None if the request fails
         """
-        url = self.token_url
+        url = "https://aip.baidubce.com/oauth/2.0/token"
         params = {
             "grant_type": "client_credentials",
             "client_id": self.api_key,
@@ -87,7 +84,7 @@ class TokenManager:
                     if "access_token" in result:
                         self.access_token = result["access_token"]
                         # Set the token expiry (default 30 days; expire 1 hour early as a safety margin)
-                        self.token_expiry = time.time() + result.get("expires_in", _DEFAULT_TOKEN_LIFETIME) - _TOKEN_EARLY_EXPIRY
+                        self.token_expiry = time.time() + result.get("expires_in", 2592000) - 3600
                         logger.info("Access token obtained successfully")
                         return self.access_token
 
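Both sides of this hunk compute the same expiry: the current time plus the lifetime reported by the server (falling back to 30 days, i.e. 2592000 seconds), minus a one-hour safety margin. A small sketch of that arithmetic and of the freshness check it enables is shown below. The function names and the injectable `now` parameter are assumptions added for testability; they are not part of `TokenManager`.

```python
import time
from typing import Any, Dict, Optional

DEFAULT_TOKEN_LIFETIME = 30 * 24 * 3600  # 30 days in seconds (2592000)
TOKEN_EARLY_EXPIRY = 3600                # treat the token as expired 1 hour early


def compute_token_expiry(result: Dict[str, Any], now: Optional[float] = None) -> float:
    """Return the local timestamp after which the token should be refreshed."""
    now = time.time() if now is None else now
    return now + result.get("expires_in", DEFAULT_TOKEN_LIFETIME) - TOKEN_EARLY_EXPIRY


def token_is_fresh(access_token: Optional[str], token_expiry: float,
                   now: Optional[float] = None) -> bool:
    """A cached token is usable only if it exists and has not reached its expiry."""
    now = time.time() if now is None else now
    return access_token is not None and now < token_expiry


if __name__ == "__main__":
    expiry = compute_token_expiry({"expires_in": 7200}, now=1000.0)
    print(expiry, token_is_fresh("abc", expiry, now=2000.0))
```

Named constants and the literals `2592000` and `3600` are equivalent here; the constants only make the intent easier to read.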
@@ -147,8 +144,7 @@ class BaiduOCRClient:
             self.api_key,
             self.secret_key,
             self.max_retries,
-            self.retry_delay,
-            token_url=config.get('API', 'token_url', fallback='https://aip.baidubce.com/oauth/2.0/token')
+            self.retry_delay
         )
 
         # Validate the API configuration
@@ -303,8 +299,7 @@ class BaiduOCRClient:
             logger.error(f"Could not extract a valid request_id from the result: {request_id_or_result}")
             return None
 
-        base_url = self.config.get('API', 'form_ocr_url', fallback='https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/get_request_result')
-        url = f"{base_url}?access_token={access_token}"
+        url = f"https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/get_request_result?access_token={access_token}"
 
         payload = {
             'request_id': request_id,
+15
-23
@@ -5,11 +5,15 @@
 """
 
 import os
+import sys
 import time
+import json
 import base64
+from datetime import datetime
 from concurrent.futures import ThreadPoolExecutor
-from typing import Dict, List, Optional, Tuple, Callable
+from typing import Dict, List, Optional, Tuple, Union, Any
 
+from ...config.settings import ConfigManager
 from ..utils.log_utils import get_logger
 from ..utils.file_utils import (
     ensure_dir,
@@ -114,9 +118,9 @@ class OCRProcessor:
         # Work around ConfigParser objects lacking a get_path method
         try:
             # Get the input and output directories
-            self.input_folder = config.get_path('Paths', 'input_folder', fallback='data/input', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/input')
-            self.output_folder = config.get_path('Paths', 'output_folder', fallback='data/output', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/output')
-            self.temp_folder = config.get_path('Paths', 'temp_folder', fallback='data/temp', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/temp')
+            self.input_folder = config.get('Paths', 'input_folder', fallback='data/input')
+            self.output_folder = config.get('Paths', 'output_folder', fallback='data/output')
+            self.temp_folder = config.get('Paths', 'temp_folder', fallback='data/temp')
 
             # Make sure the directories exist
             os.makedirs(self.input_folder, exist_ok=True)
@@ -173,7 +177,7 @@ class OCRProcessor:
         skip_existing = True
         try:
             skip_existing = self.config.getboolean('Performance', 'skip_existing', fallback=True)
-        except Exception:
+        except:
             pass
 
         if skip_existing:
@@ -210,7 +214,7 @@ class OCRProcessor:
         max_size_mb = 4.0
         try:
             max_size_mb = float(self.config.get('File', 'max_file_size_mb', fallback='4.0'))
-        except Exception:
+        except:
             pass
 
         if not is_file_size_valid(image_path, max_size_mb):
@@ -237,7 +241,7 @@ class OCRProcessor:
         skip_existing = True
         try:
             skip_existing = self.config.getboolean('Performance', 'skip_existing', fallback=True)
-        except Exception:
+        except:
             pass
 
         # If already-processed files should be skipped
@@ -253,7 +257,7 @@ class OCRProcessor:
         excel_extension = '.xlsx'
         try:
             excel_extension = self.config.get('File', 'excel_extension', fallback='.xlsx')
-        except Exception:
+        except:
             pass
 
         # Build the output file path
@@ -328,7 +332,7 @@ class OCRProcessor:
             logger.error(f"Error processing image: {image_path}, error: {e}")
             return None
 
-    def process_images_batch(self, batch_size: int = None, max_workers: int = None, progress_cb: Optional[Callable[[int], None]] = None) -> Tuple[int, int]:
+    def process_images_batch(self, batch_size: int = None, max_workers: int = None) -> Tuple[int, int]:
         """
         Process images in batches.
 
@@ -343,13 +347,13 @@ class OCRProcessor:
         if batch_size is None:
             try:
                 batch_size = self.config.getint('Performance', 'batch_size', fallback=5)
-            except Exception:
+            except:
                 batch_size = 5
 
         if max_workers is None:
             try:
                 max_workers = self.config.getint('Performance', 'max_workers', fallback=4)
-            except Exception:
+            except:
                 max_workers = 4
 
         # Get the unprocessed images
@@ -365,13 +369,6 @@ class OCRProcessor:
         for i in range(0, total, batch_size):
             batch = unprocessed_images[i:i+batch_size]
             logger.info(f"Processing batch {i//batch_size+1}/{(total+batch_size-1)//batch_size}: {len(batch)} files")
-            try:
-                if progress_cb:
-                    # Estimate progress per batch (0-90%), leaving 10% for later stages
-                    percent = int(10 + (i / max(total, 1)) * 80)
-                    progress_cb(min(percent, 90))
-            except Exception:
-                pass
 
             # Process the batch with multiple threads
             with ThreadPoolExecutor(max_workers=max_workers) as executor:
@@ -381,9 +378,4 @@ class OCRProcessor:
                 success_count += sum(1 for result in results if result is not None)
 
         logger.info(f"All images processed, total: {total}, succeeded: {success_count}")
-        try:
-            if progress_cb:
-                progress_cb(90)
-        except Exception:
-            pass
         return total, success_count
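The `progress_cb` code removed in these two hunks reports one value per batch using `int(10 + (i / max(total, 1)) * 80)`, capped at 90, and then reports 90 once all batches finish. Note that as written the values start at 10, not 0. The sketch below reproduces only the reporting logic without any OCR work; `batch_progress` and `run_batches` are hypothetical names.

```python
from typing import Callable, List, Optional


def batch_progress(done: int, total: int) -> int:
    """Map the number of items already handled to a percentage in [10, 90]."""
    return min(int(10 + (done / max(total, 1)) * 80), 90)


def run_batches(items: List[object], batch_size: int,
                progress_cb: Optional[Callable[[int], None]] = None) -> int:
    """Walk the items in batches, reporting progress before each batch.

    Returns the number of batches. Callback errors are swallowed so that a
    faulty UI callback cannot abort processing, as in the removed code.
    """
    total = len(items)
    batches = 0
    for i in range(0, total, batch_size):
        try:
            if progress_cb:
                progress_cb(batch_progress(i, total))
        except Exception:
            pass
        batches += 1  # the real method would process items[i:i + batch_size] here
    try:
        if progress_cb:
            progress_cb(90)  # the remaining 10% is left for later stages
    except Exception:
        pass
    return batches


if __name__ == "__main__":
    run_batches(list(range(10)), 5, lambda pct: print(f"{pct}%"))
```

Wrapping the callback in `try`/`except` trades visibility for robustness: a broken callback never stops the batch loop, but its error is also never logged.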
@@ -1,9 +0,0 @@
-"""
-Processor package initialization.
-"""
-
-from .base import BaseProcessor
-from .ocr_processor import OCRProcessor
-from .tobacco_processor import TobaccoProcessor
-
-__all__ = ['BaseProcessor', 'OCRProcessor', 'TobaccoProcessor']
@@ -1,167 +0,0 @@
-"""
-Base processor interface module.
-
-Defines the base class for all processors and provides a unified processing interface.
-"""
-
-from abc import ABC, abstractmethod
-from typing import Dict, Any, Optional, List
-from pathlib import Path
-import logging
-import pandas as pd
-
-from ...core.utils.log_utils import get_logger
-
-logger = get_logger(__name__)
-
-
-class BaseProcessor(ABC):
-    """Base processor interface: the base class for all processors.
-
-    Follows the strategy pattern; each processor handles one specific type of file.
-    """
-
-    def __init__(self, config: Dict[str, Any]):
-        """Initialize the processor.
-
-        Args:
-            config: processor configuration dictionary
-        """
-        self.config = config
-        self.name = self.__class__.__name__
-        self.description = ""
-        self._setup_logging()
-
-    def _setup_logging(self):
-        """Set up the processor logger."""
-        self.logger = logging.getLogger(f"{__name__}.{self.name}")
-
-    @abstractmethod
-    def can_process(self, file_path: Path) -> bool:
-        """Return whether this processor can handle the file.
-
-        Args:
-            file_path: path to the file
-
-        Returns:
-            True if the file can be processed
-        """
-        pass
-
-    @abstractmethod
-    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
-        """Process the file and return the output file path.
-
-        Args:
-            input_file: path to the input file
-            output_dir: path to the output directory
-
-        Returns:
-            The output file path, or None if processing fails
-        """
-        pass
-
-    @abstractmethod
-    def get_required_columns(self) -> List[str]:
-        """Return the list of required column names.
-
-        Returns:
-            List of column names
-        """
-        pass
-
-    def validate_input(self, file_path: Path) -> bool:
-        """Validate the input file.
-
-        Args:
-            file_path: path to the file
-
-        Returns:
-            True if the file is valid
-        """
-        try:
-            if not file_path.exists():
-                self.logger.warning(f"File does not exist: {file_path}")
-                return False
-
-            if not file_path.is_file():
-                self.logger.warning(f"Not a file: {file_path}")
-                return False
-
-            supported_extensions = self.get_supported_extensions()
-            if supported_extensions and file_path.suffix.lower() not in supported_extensions:
-                self.logger.warning(f"Unsupported file type: {file_path.suffix}, supported types: {supported_extensions}")
-                return False
-
-            return True
-
-        except Exception as e:
-            self.logger.error(f"Error while validating file: {e}")
-            return False
-
-    def get_supported_extensions(self) -> List[str]:
-        """Return the supported file extensions.
-
-        Returns:
-            List of supported extensions; an empty list means all types are supported
-        """
-        return []
-
-    def get_output_filename(self, input_file: Path, suffix: str = "_processed") -> str:
-        """Build the output file name.
-
-        Args:
-            input_file: path to the input file
-            suffix: suffix appended to the file stem
-
-        Returns:
-            The output file name
-        """
-        return f"{input_file.stem}{suffix}{input_file.suffix}"
-
-    def _read_excel_safely(self, file_path: Path, **kwargs) -> pd.DataFrame:
-        """Pick a suitable read engine based on the file extension.
-
-        Args:
-            file_path: path to the file
-            **kwargs: arguments passed through to pd.read_excel
-
-        Returns:
-            DataFrame
-
-        Raises:
-            Exception: raised when reading fails
-        """
-        suffix = file_path.suffix.lower()
-        if suffix == '.xlsx':
-            return pd.read_excel(file_path, engine='openpyxl', **kwargs)
-        elif suffix == '.xls':
-            try:
-                return pd.read_excel(file_path, engine='xlrd', **kwargs)
-            except Exception as e:
-                self.logger.warning(f"Failed to read .xls file; xlrd may be missing: {e}")
-                raise
-        else:
-            return pd.read_excel(file_path, **kwargs)
-
-    def log_processing_start(self, input_file: Path):
-        """Log the start of processing."""
-        self.logger.info(f"Start processing file: {input_file}")
-        self.logger.info(f"Processor: {self.name} - {self.description}")
-
-    def log_processing_end(self, input_file: Path, output_file: Optional[Path] = None, success: bool = True):
-        """Log the end of processing."""
-        if success:
-            self.logger.info(f"Processing finished: {input_file}")
-            if output_file:
-                self.logger.info(f"Output file: {output_file}")
-        else:
-            self.logger.error(f"Processing failed: {input_file}")
-
-    def __str__(self) -> str:
-        """String representation."""
-        return f"{self.name}({self.description})"
-
-    def __repr__(self) -> str:
-        """Detailed string representation."""
-        return f"{self.__class__.__module__}.{self.__class__.__name__}(name='{self.name}', description='{self.description}')"
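The `BaseProcessor` docstring describes a strategy-pattern design in which each processor decides for itself whether it can handle a file. The diff does not show how a processor is chosen, so the dispatcher below is an assumption: a minimal sketch in which the first processor whose `can_process` returns `True` wins. `Processor`, `ImageProcessor`, `SpreadsheetProcessor`, and `pick_processor` are hypothetical names, and no files are read or written.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List, Optional


class Processor(ABC):
    """Cut-down stand-in for BaseProcessor (two abstract methods only)."""

    @abstractmethod
    def can_process(self, file_path: Path) -> bool:
        ...

    @abstractmethod
    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
        ...


class ImageProcessor(Processor):
    def can_process(self, file_path: Path) -> bool:
        return file_path.suffix.lower() in {".jpg", ".jpeg", ".png", ".bmp"}

    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
        # Placeholder: only computes where the result would be written.
        return output_dir / f"{input_file.stem}_processed.xlsx"


class SpreadsheetProcessor(Processor):
    def can_process(self, file_path: Path) -> bool:
        return file_path.suffix.lower() in {".xls", ".xlsx"}

    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
        return output_dir / f"{input_file.stem}_processed{input_file.suffix}"


def pick_processor(processors: List[Processor], file_path: Path) -> Optional[Processor]:
    """Return the first processor that claims the file, or None."""
    for processor in processors:
        if processor.can_process(file_path):
            return processor
    return None


if __name__ == "__main__":
    registry = [ImageProcessor(), SpreadsheetProcessor()]
    chosen = pick_processor(registry, Path("order.PNG"))
    print(type(chosen).__name__, chosen.process(Path("order.PNG"), Path("out")))
```

With a first-match dispatcher, registration order acts as priority, so more specific processors would need to be listed before generic ones.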
@@ -1,192 +0,0 @@
-"""
-OCR processor.
-
-Runs the full OCR pipeline for image files: image recognition → Excel processing → standard purchase order generation.
-"""
-
-import os
-from pathlib import Path
-from typing import Optional, Dict, Any, List
-
-from .base import BaseProcessor
-from ...services.ocr_service import OCRService
-from ...services.order_service import OrderService
-from ...core.utils.log_utils import get_logger
-
-logger = get_logger(__name__)
-
-
-class OCRProcessor(BaseProcessor):
-    """OCR processor.
-
-    Full OCR pipeline for image files:
-    1. Recognize the table data in the image via OCR
-    2. Turn the recognition result into an Excel file
-    3. Convert it to the standard purchase order format
-    """
-
-    def __init__(self, config: Dict[str, Any]):
-        """Initialize the OCR processor.
-
-        Args:
-            config: configuration
-        """
-        super().__init__(config)
-        self.description = "Full OCR pipeline (image → OCR → Excel → purchase order)"
-
-        # Initialize services
-        self.ocr_service = OCRService(config)
-        self.order_service = OrderService(config)
-
-    def can_process(self, file_path: Path) -> bool:
-        """Return whether the file is a supported image.
-
-        Args:
-            file_path: path to the file
-
-        Returns:
-            True if the file can be processed
-        """
-        if not self.validate_input(file_path):
-            return False
-
-        # Supported image formats
-        supported_extensions = ['.jpg', '.jpeg', '.png', '.bmp']
-
-        if file_path.suffix.lower() in supported_extensions:
-            self.logger.info(f"Detected image file: {file_path.name}")
-            return True
-
-        return False
-
-    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
-        """Run the full OCR pipeline on an image file.
-
-        Args:
-            input_file: path to the input image
-            output_dir: path to the output directory
-
-        Returns:
-            The output file path, or None if processing fails
-        """
-        self.log_processing_start(input_file)
-
-        try:
-            self.logger.info("Starting OCR pipeline...")
-
-            # Step 1: OCR
-            self.logger.info("Step 1/3: running OCR on the image...")
-            ocr_result = self._perform_ocr(input_file, output_dir)
-            if not ocr_result:
-                self.logger.error("OCR failed")
-                self.log_processing_end(input_file, success=False)
-                return None
-
-            # Step 2: Excel processing
-            self.logger.info("Step 2/3: processing the Excel file...")
-            excel_result = self._process_excel(ocr_result, output_dir)
-            if not excel_result:
-                self.logger.error("Excel processing failed")
-                self.log_processing_end(input_file, success=False)
-                return None
-
-            # Step 3: generate the standard purchase order
-            self.logger.info("Step 3/3: generating the standard purchase order...")
-            final_result = self._generate_purchase_order(excel_result, output_dir)
-
-            if final_result:
-                self.logger.info(f"OCR pipeline finished, output file: {final_result}")
-                self.log_processing_end(input_file, final_result, success=True)
-                return final_result
-            else:
-                self.logger.error("Failed to generate the purchase order")
-                self.log_processing_end(input_file, success=False)
-                return None
-
-        except Exception as e:
-            self.logger.error(f"Error in OCR pipeline: {e}", exc_info=True)
-            self.log_processing_end(input_file, success=False)
-            return None
-
-    def get_required_columns(self) -> List[str]:
-        """Return the list of required column names."""
-        # OCR does not depend on column names directly; later steps decide them
-        return []
-
-    def get_supported_extensions(self) -> List[str]:
-        """Supported file extensions."""
-        return ['.jpg', '.jpeg', '.png', '.bmp']
-
-    def _perform_ocr(self, input_file: Path, output_dir: Path) -> Optional[Path]:
-        """Run OCR.
-
-        Args:
-            input_file: input image file
-            output_dir: output directory
-
-        Returns:
-            Path of the Excel file produced by OCR, or None on failure
-        """
-        try:
-            self.logger.info(f"Starting OCR: {input_file}")
-
-            # Process the image with the OCR service
-            result_path = self.ocr_service.process_image(str(input_file))
-
-            if result_path:
-                # Make sure the result file is in the output directory
-                result_path = Path(result_path)
-                if result_path.exists():
-                    self.logger.info(f"OCR succeeded, output file: {result_path}")
-                    return result_path
-                else:
-                    self.logger.error(f"OCR result file does not exist: {result_path}")
-                    return None
-            else:
-                self.logger.error("OCR service returned None")
-                return None
-
-        except Exception as e:
-            self.logger.error(f"OCR failed: {e}", exc_info=True)
-            return None
-
-    def _process_excel(self, excel_file: Path, output_dir: Path) -> Optional[Path]:
-        """Process the Excel file.
-
-        Args:
-            excel_file: path to the Excel file
-            output_dir: output directory
-
-        Returns:
-            Path of the processed Excel file, or None on failure
-        """
-        try:
-            self.logger.info(f"Start processing Excel file: {excel_file}")
-
-            # Process the Excel file with the order service (generates the purchase order)
-            result_path = self.order_service.process_excel(str(excel_file))
-
-            if result_path:
-                result_path = Path(result_path)
-                if result_path.exists():
-                    self.logger.info(f"Excel processing succeeded, output file: {result_path}")
-                    return result_path
-                else:
-                    self.logger.error(f"Excel processing result file does not exist: {result_path}")
-                    return None
-            else:
-                self.logger.error("Excel processing service returned None")
-                return None
-
-        except Exception as e:
-            self.logger.error(f"Excel processing failed: {e}", exc_info=True)
-            return None
-
-    def _generate_purchase_order(self, processed_file: Path, output_dir: Path) -> Optional[Path]:
-        """OrderService already generates the purchase order; simply return the processed file."""
-        try:
-            if processed_file and processed_file.exists():
-                return processed_file
-            return None
-        except Exception:
-            return None
@@ -1,7 +0,0 @@
-"""
-Supplier processor package initialization.
-"""
-
-from .generic_supplier_processor import GenericSupplierProcessor
-
-__all__ = ['GenericSupplierProcessor']
@@ -1,340 +0,0 @@
-"""
-Generic supplier processor.
-
-A configurable supplier processor whose processing rules are defined in a configuration file.
-"""
-
-import fnmatch
-import pandas as pd
-from typing import Optional, Dict, Any, List
-from pathlib import Path
-
-from ..base import BaseProcessor
-from ...utils.log_utils import get_logger
-from ...handlers.rule_engine import apply_rules
-from ...handlers.column_mapper import ColumnMapper
-from ...handlers.data_cleaner import DataCleaner
-from ...handlers.calculator import DataCalculator
-
-logger = get_logger(__name__)
-
-
-class GenericSupplierProcessor(BaseProcessor):
-    """Generic supplier processor.
-
-    Processes Excel files from different suppliers based on configuration. Supports:
-    - file name pattern matching
-    - content indicator detection
-    - column mapping configuration
-    - data cleaning rules
-    - calculation rules
-    """
-
-    def __init__(self, config: Dict[str, Any], supplier_config: Dict[str, Any]):
-        """Initialize the generic supplier processor.
-
-        Args:
-            config: system configuration
-            supplier_config: supplier-specific configuration
-        """
-        super().__init__(config)
-        self.supplier_config = supplier_config
-
-        # Extract basic information from the configuration
-        self.name = supplier_config.get('name', 'GenericSupplier')
-        self.description = supplier_config.get('description', 'Generic supplier processor')
-
-        # Processing rule configuration
-        self.filename_patterns = supplier_config.get('filename_patterns', [])
-        self.content_indicators = supplier_config.get('content_indicators', [])
-        self.column_mapping = supplier_config.get('column_mapping', {})
-        self.cleaning_rules = supplier_config.get('cleaning_rules', [])
-        self.calculations = supplier_config.get('calculations', [])
-
-        # Output configuration
-        self.output_template = supplier_config.get('output_template', 'templates/银豹-采购单模板.xls')
-        self.output_suffix = supplier_config.get('output_suffix', '_银豹采购单')
-
-    def can_process(self, file_path: Path) -> bool:
-        """Return whether this processor can handle the file.
-
-        Args:
-            file_path: path to the file
-
-        Returns:
-            True if the file can be processed
-        """
-        if not self.validate_input(file_path):
-            return False
-
-        # Check file name patterns
-        if self.filename_patterns:
-            filename_match = self._check_filename_patterns(file_path)
-            if filename_match:
-                return True
-
-        # Check file content indicators
-        if self.content_indicators:
-            content_match = self._check_content_indicators(file_path)
-            if content_match:
-                return True
-
-        # With neither configured, no decision can be made
-        if not self.filename_patterns and not self.content_indicators:
-            self.logger.warning(f"Processor {self.name} has no detection rules configured")
-            return False
-
-        return False
-
-    def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
-        """Process the file.
-
-        Args:
-            input_file: path to the input file
-            output_dir: path to the output directory
-
-        Returns:
-            The output file path, or None if processing fails
-        """
-        self.log_processing_start(input_file)
-
-        try:
-            # Step 1: read the data
-            self.logger.info("Step 1/4: reading data...")
-            df = self._read_supplier_data(input_file)
-            if df is None or df.empty:
-                self.logger.error("Failed to read data, or the data is empty")
-                self.log_processing_end(input_file, success=False)
-                return None
-
-            # Step 2: apply the column mapping
-            self.logger.info("Step 2/4: applying column mapping...")
-            mapped_df = self._apply_column_mapping(df)
-            if mapped_df is None:
-                self.logger.error("Column mapping failed")
-                self.log_processing_end(input_file, success=False)
-                return None
-
-            # Step 3: clean the data
-            self.logger.info("Step 3/4: cleaning data...")
-            cleaned_df = self._apply_data_cleaning(mapped_df)
-            if cleaned_df is None:
-                self.logger.error("Data cleaning failed")
-                self.log_processing_end(input_file, success=False)
-                return None
-            try:
-                rules = self.supplier_config.get('rules', [])
-                dictionary = self.supplier_config.get('dictionary')
-                standardized_df = apply_rules(cleaned_df, rules, dictionary)
-            except Exception as e:
-                self.logger.warning(f"Rule execution failed: {e}")
-                standardized_df = cleaned_df
-
-            # Step 4: run the calculations
-            self.logger.info("Step 4/4: running calculations...")
-            calculated_df = self._apply_calculations(standardized_df)
-            if calculated_df is None:
-                self.logger.error("Calculation failed")
-                self.log_processing_end(input_file, success=False)
-                return None
-
-            # Generate the output file
-            output_file = self._generate_output(calculated_df, input_file, output_dir)
-
-            if output_file and output_file.exists():
-                self.logger.info(f"Processing finished, output file: {output_file}")
-                self.log_processing_end(input_file, output_file, success=True)
-                return output_file
-            else:
-                self.logger.error("Failed to generate the output file")
-                self.log_processing_end(input_file, success=False)
-                return None
-
-        except Exception as e:
-            self.logger.error(f"Error while processing file: {e}", exc_info=True)
-            self.log_processing_end(input_file, success=False)
-            return None
-
-    def get_required_columns(self) -> List[str]:
-        """Return the list of required column names."""
-        # Take the target column names from the column mapping
-        return list(self.column_mapping.values()) if self.column_mapping else []
-
-    def _check_filename_patterns(self, file_path: Path) -> bool:
-        """Check the file name patterns.
-
-        Args:
-            file_path: path to the file
-
-        Returns:
-            True if a pattern matches
-        """
-        try:
-            filename = file_path.name
-            for pattern in self.filename_patterns:
-                if fnmatch.fnmatch(filename.lower(), pattern.lower()):
-                    self.logger.info(f"File name matched: {filename} -> {pattern}")
-                    return True
-            return False
-        except Exception as e:
-            self.logger.error(f"Error while checking file name patterns: {e}")
-            return False
-
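`_check_filename_patterns` relies on shell-style wildcards from the standard-library `fnmatch` module and lowercases both the file name and the pattern so that matching is case-insensitive on every platform. A minimal sketch of the same check follows; `match_filename` is a hypothetical helper that returns the matching pattern instead of a boolean, and the sample patterns are invented for illustration.

```python
import fnmatch
from typing import List, Optional


def match_filename(filename: str, patterns: List[str]) -> Optional[str]:
    """Return the first shell-style pattern that matches, ignoring case."""
    lowered = filename.lower()
    for pattern in patterns:
        # fnmatch supports *, ?, [seq] and [!seq]; it is not a regular expression.
        if fnmatch.fnmatch(lowered, pattern.lower()):
            return pattern
    return None


if __name__ == "__main__":
    patterns = ["*order*.xlsx", "inv*.xls"]
    for name in ("Supplier_Order_0412.XLSX", "invoice.xls", "photo.png"):
        print(name, "->", match_filename(name, patterns))
```

Because patterns are tried in list order, a broad pattern such as `*` placed first would shadow every later entry.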
-    def _check_content_indicators(self, file_path: Path) -> bool:
-        """Check the file content indicators.
-
-        Args:
-            file_path: path to the file
-
-        Returns:
-            True if an indicator matches
-        """
-        try:
-            df = self._read_excel_safely(file_path, nrows=5)
-
-            # Check whether the column names contain any configured keyword
-            columns_str = str(list(df.columns)).lower()
-
-            for indicator in self.content_indicators:
-                if indicator.lower() in columns_str:
-                    self.logger.info(f"Content indicator matched: {indicator}")
-                    return True
-
-            return False
-
-        except Exception as e:
-            self.logger.error(f"Error while checking content indicators: {e}")
-            return False
-
def _read_supplier_data(self, file_path: Path) -> Optional[pd.DataFrame]:
|
|
||||||
"""读取供应商数据
|
|
||||||
|
|
||||||
Args:
|
|
||||||
file_path: 文件路径
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
数据DataFrame或None
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
specified = self.supplier_config.get('header_row')
|
|
||||||
if specified is not None:
|
|
||||||
try:
|
|
||||||
df = self._read_excel_safely(file_path, header=int(specified))
|
|
||||||
except Exception:
|
|
||||||
df = self._read_excel_safely(file_path)
|
|
||||||
else:
|
|
||||||
df0 = self._read_excel_safely(file_path, header=None)
|
|
||||||
if df0 is None:
|
|
||||||
return None
|
|
||||||
header_row = self._find_header_row(df0)
|
|
||||||
if header_row is not None:
|
|
||||||
df = self._read_excel_safely(file_path, header=header_row)
|
|
||||||
else:
|
|
||||||
df = self._read_excel_safely(file_path)
|
|
||||||
if df is None or df.empty:
|
|
||||||
self.logger.warning("数据文件为空")
|
|
||||||
return None
|
|
||||||
self.logger.info(f"成功读取数据,形状: {df.shape}")
|
|
||||||
return df
|
|
||||||
except Exception as e:
|
|
||||||
self.logger.error(f"读取数据失败: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
    def _find_header_row(self, df: pd.DataFrame) -> Optional[int]:
        """Detect the header row via ColumnMapper; a negative result is mapped to None."""
        result = ColumnMapper.detect_header_row(df, max_rows=30)
        return result if result >= 0 else None
|
|
||||||
|
|
||||||
def _apply_column_mapping(self, df: pd.DataFrame) -> Optional[pd.DataFrame]:
|
|
||||||
"""应用列映射
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 原始数据
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
映射后的数据或None
|
|
||||||
"""
|
|
||||||
if not self.column_mapping:
|
|
||||||
self.logger.info("没有列映射配置")
|
|
||||||
return df
|
|
||||||
|
|
||||||
try:
|
|
||||||
# 应用列重命名
|
|
||||||
df_renamed = df.rename(columns=self.column_mapping)
|
|
||||||
|
|
||||||
# 检查必需的列是否存在
|
|
||||||
required_columns = self.get_required_columns()
|
|
||||||
missing_columns = [col for col in required_columns if col not in df_renamed.columns]
|
|
||||||
|
|
||||||
if missing_columns:
|
|
||||||
self.logger.warning(f"缺少必需的列: {missing_columns}")
|
|
||||||
# 创建缺失的列并填充默认值
|
|
||||||
for col in missing_columns:
|
|
||||||
df_renamed[col] = 0 if '量' in col or '价' in col else ''
|
|
||||||
self.logger.info(f"创建缺失列: {col},默认值: {df_renamed[col].iloc[0] if len(df_renamed) > 0 else 'N/A'}")
|
|
||||||
|
|
||||||
self.logger.info(f"列映射完成,列名: {list(df_renamed.columns)}")
|
|
||||||
return df_renamed
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
self.logger.error(f"列映射失败: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
def _apply_data_cleaning(self, df: pd.DataFrame) -> Optional[pd.DataFrame]:
|
|
||||||
"""应用数据清洗规则,委托给 DataCleaner"""
|
|
||||||
if not self.cleaning_rules:
|
|
||||||
self.logger.info("没有数据清洗规则")
|
|
||||||
return df
|
|
||||||
try:
|
|
||||||
cleaner = DataCleaner()
|
|
||||||
for rule in self.cleaning_rules:
|
|
||||||
cleaner.add_rule(rule.get('type'), **{k: v for k, v in rule.items() if k != 'type'})
|
|
||||||
result = cleaner.clean(df)
|
|
||||||
self.logger.info(f"数据清洗完成,数据形状: {result.shape}")
|
|
||||||
return result
|
|
||||||
except Exception as e:
|
|
||||||
self.logger.error(f"数据清洗失败: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
def _apply_calculations(self, df: pd.DataFrame) -> Optional[pd.DataFrame]:
|
|
||||||
"""应用计算处理,委托给 DataCalculator"""
|
|
||||||
if not self.calculations:
|
|
||||||
self.logger.info("没有计算规则")
|
|
||||||
return df
|
|
||||||
try:
|
|
||||||
calculator = DataCalculator()
|
|
||||||
for calc in self.calculations:
|
|
||||||
calculator.add_rule(calc.get('type'), **{k: v for k, v in calc.items() if k != 'type'})
|
|
||||||
result = calculator.calculate(df)
|
|
||||||
self.logger.info(f"计算处理完成,数据形状: {result.shape}")
|
|
||||||
return result
|
|
||||||
except Exception as e:
|
|
||||||
self.logger.error(f"计算处理失败: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
def _generate_output(self, df: pd.DataFrame, input_file: Path, output_dir: Path) -> Optional[Path]:
|
|
||||||
"""生成输出文件
|
|
||||||
|
|
||||||
Args:
|
|
||||||
df: 最终数据
|
|
||||||
input_file: 输入文件路径
|
|
||||||
output_dir: 输出目录
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
输出文件路径或None
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
# 生成输出文件名
|
|
||||||
timestamp = pd.Timestamp.now().strftime("%Y%m%d_%H%M%S")
|
|
||||||
output_filename = f"{input_file.stem}{self.output_suffix}_{timestamp}.xls"
|
|
||||||
output_file = output_dir / output_filename
|
|
||||||
|
|
||||||
# 这里应该使用实际的模板生成逻辑
|
|
||||||
# 暂时直接保存为Excel文件
|
|
||||||
df.to_excel(output_file, index=False)
|
|
||||||
|
|
||||||
self.logger.info(f"输出文件生成成功: {output_file}")
|
|
||||||
return output_file
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
self.logger.error(f"生成输出文件失败: {e}")
|
|
||||||
return None
|
|
||||||
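The filename check in `_check_filename_patterns` lowercases both the file name and the glob pattern before matching, and `_apply_column_mapping` fills missing columns with `0` for quantity/price columns and `''` otherwise. The following is only a minimal standalone sketch of those two rules. The helper names `matches_any_pattern` and `default_for_missing_column` and the sample patterns are hypothetical, and `fnmatchcase` is swapped in for `fnmatch.fnmatch` so the result does not depend on the host OS's case handling.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Union


def matches_any_pattern(filename: str, patterns: Iterable[str]) -> bool:
    """Return True if filename matches any glob pattern, ignoring case."""
    lowered = filename.lower()
    # Lowercase both sides so "*.XLSX" and "*.xlsx" behave the same everywhere.
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


def default_for_missing_column(column: str) -> Union[int, str]:
    """Mirror the rule in the diff: quantity/price columns get 0, others ''."""
    return 0 if ("量" in column or "价" in column) else ""


if __name__ == "__main__":
    # Hypothetical sample inputs, for illustration only.
    print(matches_any_pattern("Supplier_A.XLSX", ["supplier_*.xlsx"]))
    print(repr(default_for_missing_column("条码")))
```

Keeping the matching rule in a small pure function like this would make it testable without an Excel file on disk.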
@@ -1,347 +0,0 @@
|
|||||||
"""
Tobacco order processor

Processes order-detail files in the tobacco company's specific format and generates a Pospal (银豹) purchase order
"""
|
|
||||||
|
|
||||||
import os
|
|
||||||
import datetime
|
|
||||||
import pandas as pd
|
|
||||||
import xlrd
|
|
||||||
import xlwt
|
|
||||||
from xlutils.copy import copy
|
|
||||||
from openpyxl import load_workbook
|
|
||||||
from typing import Optional, Dict, Any, List, Tuple
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
from .base import BaseProcessor
|
|
||||||
from ...core.utils.log_utils import get_logger
|
|
||||||
from ...core.utils.string_utils import parse_monetary_string
|
|
||||||
from ...core.utils.dialog_utils import show_custom_dialog
|
|
||||||
|
|
||||||
logger = get_logger(__name__)
|
|
||||||
|
|
||||||
|
|
||||||
class TobaccoProcessor(BaseProcessor):
|
|
||||||
"""烟草订单处理器
|
|
||||||
|
|
||||||
处理烟草公司订单明细文件,提取商品信息并生成标准银豹采购单格式
|
|
||||||
"""
|
|
||||||
|
|
||||||
def __init__(self, config: Dict[str, Any]):
|
|
||||||
"""初始化烟草订单处理器
|
|
||||||
|
|
||||||
Args:
|
|
||||||
config: 配置信息
|
|
||||||
"""
|
|
||||||
super().__init__(config)
|
|
||||||
self.description = "处理烟草公司订单明细文件"
|
|
||||||
self.template_file = config.get('Paths', 'template_file', fallback='templates/银豹-采购单模板.xls')
|
|
||||||
|
|
||||||
# 输出目录配置
|
|
||||||
self.result_dir = Path(config.get_path('Paths', 'result_folder', fallback='data/result', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/result'))
|
|
||||||
self.result_dir.mkdir(parents=True, exist_ok=True)
|
|
||||||
|
|
||||||
# 默认输出文件名
|
|
||||||
self.default_output_name = "银豹采购单_烟草公司.xls"
|
|
||||||
|
|
||||||
def can_process(self, file_path: Path) -> bool:
|
|
||||||
"""判断是否为烟草订单文件
|
|
||||||
|
|
||||||
Args:
|
|
||||||
file_path: 文件路径
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
是否能处理该文件
|
|
||||||
"""
|
|
||||||
if not self.validate_input(file_path):
|
|
||||||
return False
|
|
||||||
|
|
||||||
# 检查文件名特征
|
|
||||||
filename = file_path.name
|
|
||||||
tobacco_keywords = ['烟草', '卷烟', '订单明细', 'tobacco', '烟']
|
|
||||||
|
|
||||||
# 检查文件内容特征
|
|
||||||
try:
|
|
||||||
df = self._read_excel_safely(file_path, nrows=5)
|
|
||||||
required_columns = ['商品', '盒码', '订单量']
|
|
||||||
|
|
||||||
# 检查文件名或内容特征
|
|
||||||
filename_match = any(keyword in filename for keyword in tobacco_keywords)
|
|
||||||
content_match = all(col in df.columns for col in required_columns)
|
|
||||||
|
|
||||||
if filename_match or content_match:
|
|
||||||
self.logger.info(f"识别为烟草订单文件: {filename}")
|
|
||||||
return True
|
|
||||||
|
|
||||||
return False
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
self.logger.warning(f"检查文件内容时出错: {e}")
|
|
||||||
# 如果无法读取内容,仅基于文件名判断
|
|
||||||
return any(keyword in filename for keyword in tobacco_keywords)
|
|
||||||
|
|
||||||
def process(self, input_file: Path, output_dir: Path) -> Optional[Path]:
|
|
||||||
"""处理烟草订单
|
|
||||||
|
|
||||||
Args:
|
|
||||||
input_file: 输入文件路径
|
|
||||||
output_dir: 输出目录路径
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
输出文件路径,处理失败返回None
|
|
||||||
"""
|
|
||||||
self.log_processing_start(input_file)
|
|
||||||
|
|
||||||
try:
|
|
||||||
# 读取订单信息(时间和总金额)
|
|
||||||
order_info = self._read_order_info(input_file)
|
|
||||||
if not order_info:
|
|
||||||
self.logger.error(f"读取订单信息失败: {input_file}")
|
|
||||||
self.log_processing_end(input_file, success=False)
|
|
||||||
return None
|
|
||||||
|
|
||||||
order_time, total_amount = order_info
|
|
||||||
self.logger.info(f"订单信息 - 时间: {order_time}, 总金额: {total_amount}")
|
|
||||||
|
|
||||||
# 读取订单数据
|
|
||||||
order_data = self._read_order_data(input_file)
|
|
||||||
if order_data is None or order_data.empty:
|
|
||||||
self.logger.error(f"读取订单数据失败或数据为空: {input_file}")
|
|
||||||
self.log_processing_end(input_file, success=False)
|
|
||||||
return None
|
|
||||||
|
|
||||||
self.logger.info(f"成功读取订单数据,共{len(order_data)}条记录")
|
|
||||||
|
|
||||||
# 生成输出文件路径
|
|
||||||
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
|
|
||||||
output_filename = f"银豹采购单_烟草公司_{timestamp}.xls"
|
|
||||||
output_file = output_dir / output_filename
|
|
||||||
|
|
||||||
# 确保输出目录存在
|
|
||||||
output_file.parent.mkdir(parents=True, exist_ok=True)
|
|
||||||
|
|
||||||
# 生成银豹采购单
|
|
||||||
result = self._generate_pospal_order(order_data, order_time, output_file)
|
|
||||||
|
|
||||||
if result:
|
|
||||||
self.logger.info(f"采购单生成成功: {output_file}")
|
|
||||||
self.log_processing_end(input_file, output_file, success=True)
|
|
||||||
|
|
||||||
# 显示处理结果
|
|
||||||
self._show_processing_result(output_file, order_time, len(order_data), total_amount)
|
|
||||||
|
|
||||||
return output_file
|
|
||||||
else:
|
|
||||||
self.logger.error("生成银豹采购单失败")
|
|
||||||
self.log_processing_end(input_file, success=False)
|
|
||||||
return None
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
self.logger.error(f"处理烟草订单时发生错误: {e}", exc_info=True)
|
|
||||||
self.log_processing_end(input_file, success=False)
|
|
||||||
return None
|
|
||||||
|
|
||||||
def get_required_columns(self) -> List[str]:
|
|
||||||
"""返回需要的列名列表"""
|
|
||||||
return ['商品', '盒码', '条码', '建议零售价', '批发价', '需求量', '订单量', '金额']
|
|
||||||
|
|
||||||
def get_supported_extensions(self) -> List[str]:
|
|
||||||
"""支持的文件扩展名"""
|
|
||||||
return ['.xlsx', '.xls']
|
|
||||||
|
|
||||||
    def _read_order_info(self, file_path: Path) -> Optional[Tuple[str, float]]:
        """Read the order information (time and total amount).

        Args:
            file_path: Path of the order file.

        Returns:
            A tuple of (order time, total amount), or None on failure.
        """
        try:
            wb_info = load_workbook(file_path, data_only=True)
            ws_info = wb_info.active

            # Read the order information from fixed cells
            order_time = ws_info["H1"].value or "(空)"
            total_amount = ws_info["H3"].value or 0.0

            self.logger.info(f"成功读取订单信息: 时间={order_time}, 总金额={total_amount}")
            return (order_time, total_amount)

        except Exception as e:
            self.logger.error(f"读取订单信息出错: {e}")
            return None
|
|
||||||
|
|
||||||
    def _read_order_data(self, file_path: Path) -> Optional[pd.DataFrame]:
        """Read the order data.

        Args:
            file_path: Path of the order file.

        Returns:
            A DataFrame of order rows, or None if reading fails or no row
            has a non-zero order quantity.
        """
        columns = ['商品', '盒码', '条码', '建议零售价', '批发价', '需求量', '订单量', '金额']

        try:
            df_old = self._read_excel_safely(file_path, header=None, skiprows=3, names=columns)

            # Keep only rows whose order quantity is non-zero
            df_filtered = df_old[df_old['订单量'] != 0].copy()

            if df_filtered.empty:
                self.logger.warning("没有订单量不为0的记录")
                return None

            # Compute the purchase quantity and unit price
            df_filtered['采购量'] = df_filtered['订单量'] * 10  # tobacco orders usually need to be multiplied by 10
            df_filtered['采购单价'] = df_filtered['金额'] / df_filtered['采购量']
            df_filtered = df_filtered.reset_index(drop=True)

            self.logger.info(f"成功处理订单数据,有效记录数: {len(df_filtered)}")
            return df_filtered

        except Exception as e:
            self.logger.error(f"读取订单数据失败: {e}")
            return None
|
|
||||||
|
|
||||||
def _generate_pospal_order(self, order_data: pd.DataFrame, order_time: str, output_file: Path) -> bool:
|
|
||||||
"""生成银豹采购单
|
|
||||||
|
|
||||||
Args:
|
|
||||||
order_data: 订单数据
|
|
||||||
order_time: 订单时间
|
|
||||||
output_file: 输出文件路径
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
是否生成成功
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
# 检查模板文件是否存在
|
|
||||||
template_path = Path(self.template_file)
|
|
||||||
if not template_path.exists():
|
|
||||||
self.logger.error(f"采购单模板文件不存在: {template_path}")
|
|
||||||
return False
|
|
||||||
|
|
||||||
self.logger.info(f"使用模板文件: {template_path}")
|
|
||||||
|
|
||||||
# 打开模板,准备写入
|
|
||||||
template_rd = xlrd.open_workbook(str(template_path), formatting_info=True)
|
|
||||||
template_wb = copy(template_rd)
|
|
||||||
template_ws = template_wb.get_sheet(0)
|
|
||||||
|
|
||||||
# 获取模板中的表头列索引
|
|
||||||
header_row = template_rd.sheet_by_index(0).row_values(0)
|
|
||||||
|
|
||||||
# 查找需要的列索引
|
|
||||||
try:
|
|
||||||
barcode_col = header_row.index("条码(必填)")
|
|
||||||
amount_col = header_row.index("采购量(必填)")
|
|
||||||
gift_col = header_row.index("赠送量")
|
|
||||||
price_col = header_row.index("采购单价(必填)")
|
|
||||||
except ValueError as e:
|
|
||||||
self.logger.error(f"模板列查找失败: {e}")
|
|
||||||
return False
|
|
||||||
|
|
||||||
self.logger.info(f"模板列索引 - 条码:{barcode_col}, 采购量:{amount_col}, 赠送量:{gift_col}, 单价:{price_col}")
|
|
||||||
|
|
||||||
# 写入数据到模板
|
|
||||||
for i, row in order_data.iterrows():
|
|
||||||
template_ws.write(i + 1, barcode_col, row['盒码']) # 商品条码
|
|
||||||
template_ws.write(i + 1, amount_col, int(row['采购量'])) # 采购量
|
|
||||||
template_ws.write(i + 1, gift_col, "") # 赠送量为空
|
|
||||||
template_ws.write(i + 1, price_col, round(row['采购单价'], 2)) # 采购单价保留两位小数
|
|
||||||
|
|
||||||
# 确保输出目录存在
|
|
||||||
output_file.parent.mkdir(parents=True, exist_ok=True)
|
|
||||||
|
|
||||||
# 保存输出文件
|
|
||||||
template_wb.save(str(output_file))
|
|
||||||
|
|
||||||
self.logger.info(f"采购单生成成功: {output_file}")
|
|
||||||
return True
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
self.logger.error(f"生成银豹采购单失败: {e}", exc_info=True)
|
|
||||||
return False
|
|
||||||
|
|
||||||
def _show_processing_result(self, output_file: Path, order_time: str, total_count: int, total_amount: float):
|
|
||||||
"""显示处理结果
|
|
||||||
|
|
||||||
Args:
|
|
||||||
output_file: 输出文件路径
|
|
||||||
order_time: 订单时间
|
|
||||||
total_count: 处理条目数
|
|
||||||
total_amount: 总金额
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
# 创建附加信息
|
|
||||||
additional_info = {
|
|
||||||
"订单来源": "烟草公司",
|
|
||||||
"处理时间": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
|
|
||||||
}
|
|
||||||
|
|
||||||
# 格式化金额显示
|
|
||||||
parsed = parse_monetary_string(total_amount)
|
|
||||||
total_amount = parsed if parsed is not None else 0.0
|
|
||||||
amount_display = f"¥{total_amount:.2f}"
|
|
||||||
|
|
||||||
# 显示自定义对话框
|
|
||||||
show_custom_dialog(
|
|
||||||
title="烟草订单处理结果",
|
|
||||||
message="烟草订单处理完成",
|
|
||||||
result_file=str(output_file),
|
|
||||||
time_info=order_time,
|
|
||||||
count_info=f"{total_count}个商品",
|
|
||||||
amount_info=amount_display,
|
|
||||||
additional_info=additional_info
|
|
||||||
)
|
|
||||||
|
|
||||||
self.logger.info(f"显示处理结果 - 文件:{output_file}, 时间:{order_time}, 数量:{total_count}, 金额:{total_amount}")
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
self.logger.error(f"显示处理结果时出错: {e}")
|
|
||||||
|
|
||||||
def get_latest_tobacco_order(self) -> Optional[Path]:
|
|
||||||
"""获取最新的烟草订单明细文件(兼容旧接口)
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
文件路径或None
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
# 获取今日开始时间戳
|
|
||||||
today = datetime.date.today()
|
|
||||||
today_start = datetime.datetime.combine(today, datetime.time.min).timestamp()
|
|
||||||
|
|
||||||
# 查找订单明细文件
|
|
||||||
result_dir = Path(self.config.get_path('Paths', 'output_folder', fallback='data/output') if hasattr(self.config, 'get_path') else os.path.abspath('data/output'))
|
|
||||||
if not result_dir.exists():
|
|
||||||
return None
|
|
||||||
|
|
||||||
# 查找符合条件的文件
|
|
||||||
candidates = []
|
|
||||||
for file_path in result_dir.glob("订单明细*.xlsx"):
|
|
||||||
if file_path.stat().st_ctime >= today_start:
|
|
||||||
candidates.append(file_path)
|
|
||||||
|
|
||||||
if not candidates:
|
|
||||||
self.logger.warning("未找到今天创建的烟草订单明细文件")
|
|
||||||
# 返回最新的文件
|
|
||||||
all_files = list(result_dir.glob("订单明细*.xlsx"))
|
|
||||||
if all_files:
|
|
||||||
all_files.sort(key=lambda x: x.stat().st_ctime, reverse=True)
|
|
||||||
return all_files[0]
|
|
||||||
return None
|
|
||||||
|
|
||||||
# 返回最新的文件
|
|
||||||
candidates.sort(key=lambda x: x.stat().st_ctime, reverse=True)
|
|
||||||
latest_file = candidates[0]
|
|
||||||
|
|
||||||
self.logger.info(f"找到最新烟草订单明细文件: {latest_file}")
|
|
||||||
return latest_file
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
self.logger.error(f"获取最新烟草订单文件时出错: {e}")
|
|
||||||
return None
|
|
||||||
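The arithmetic in `_read_order_data` and `_generate_pospal_order` is easy to lose among the Excel handling: rows with a zero order quantity are dropped, the purchase quantity is the order quantity times 10, and the unit price is the line amount divided by that purchase quantity, rounded to two decimals when written. A rough sketch of just that calculation, without pandas, might look like the following. The `PurchaseLine` type, the `to_purchase_lines` helper, and the sample box codes are hypothetical and not part of the original module.

```python
from typing import List, NamedTuple, Tuple


class PurchaseLine(NamedTuple):
    barcode: str
    quantity: int
    unit_price: float


def to_purchase_lines(
    rows: List[Tuple[str, float, float]], factor: int = 10
) -> List[PurchaseLine]:
    """Convert (box_code, order_qty, amount) rows into purchase lines."""
    lines = []
    for box_code, order_qty, amount in rows:
        if order_qty == 0:
            continue  # rows with a zero order quantity are dropped
        quantity = int(order_qty * factor)  # purchase qty = order qty * factor
        unit_price = round(amount / quantity, 2)  # two decimals, as in the template write
        lines.append(PurchaseLine(box_code, quantity, unit_price))
    return lines


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [("6901028000001", 2, 500.0), ("6901028000002", 0, 0.0)]
    for line in to_purchase_lines(sample):
        print(line)
```

Dropping zero-quantity rows before dividing also avoids a division by zero in the unit-price step.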
@@ -1,247 +0,0 @@
|
|||||||
"""Cloud sync module: file synchronization based on the Gitea REST API"""
|
|
||||||
|
|
||||||
import base64
|
|
||||||
import json
|
|
||||||
from typing import Optional, Tuple
|
|
||||||
|
|
||||||
import requests
|
|
||||||
|
|
||||||
from .log_utils import get_logger
|
|
||||||
|
|
||||||
logger = get_logger(__name__)
|
|
||||||
|
|
||||||
|
|
||||||
class GiteaSync:
|
|
||||||
"""Read and write repository files through the Gitea REST API"""
|
|
||||||
|
|
||||||
def __init__(self, base_url: str, owner: str, repo: str, token: str, timeout: int = 15):
|
|
||||||
self.base_url = base_url.rstrip("/")
|
|
||||||
self.owner = owner
|
|
||||||
self.repo = repo
|
|
||||||
self.token = token
|
|
||||||
self.timeout = timeout
|
|
||||||
|
|
||||||
@property
|
|
||||||
def _headers(self) -> dict:
|
|
||||||
return {"Authorization": f"token {self.token}"}
|
|
||||||
|
|
||||||
def _api_url(self, path: str) -> str:
|
|
||||||
return f"{self.base_url}/api/v1/repos/{self.owner}/{self.repo}/contents/{path}"
|
|
||||||
|
|
||||||
    def pull_file(self, remote_path: str) -> Optional[Tuple[bytes, str]]:
        """Download a file from the repository.

        Returns:
            (content_bytes, sha), or None if the file does not exist or the request fails.
        """
        try:
            resp = requests.get(
                self._api_url(remote_path),
                headers=self._headers,
                timeout=self.timeout,
            )
            if resp.status_code == 404:
                logger.info(f"云端文件不存在: {remote_path}")
                return None
            if resp.status_code != 200:
                logger.warning(f"拉取文件失败: {resp.status_code} {resp.text[:200]}")
                return None

            data = resp.json()
            sha = data.get("sha", "")
            content_b64 = data.get("content", "")
            # The base64 returned by Gitea may contain newlines
            content_bytes = base64.b64decode(content_b64.replace("\n", ""))
            logger.info(f"拉取文件成功: {remote_path} ({len(content_bytes)} bytes)")
            return content_bytes, sha

        except requests.RequestException as e:
            logger.error(f"拉取文件网络错误: {e}")
            return None
|
|
||||||
|
|
||||||
def push_file(
|
|
||||||
self,
|
|
||||||
remote_path: str,
|
|
||||||
content: bytes,
|
|
||||||
message: str,
|
|
||||||
sha: Optional[str] = None,
|
|
||||||
) -> Optional[str]:
|
|
||||||
"""Upload or update a file in the repository.

Args:
    remote_path: File path within the repository.
    content: File content (bytes).
    message: Commit message.
    sha: Current sha of the file (required when updating, omitted when creating).

Returns:
    The new sha, or None on failure.
"""
|
|
||||||
payload = {
|
|
||||||
"message": message,
|
|
||||||
"content": base64.b64encode(content).decode("ascii"),
|
|
||||||
}
|
|
||||||
if sha:
|
|
||||||
payload["sha"] = sha
|
|
||||||
|
|
||||||
try:
|
|
||||||
resp = requests.put(
|
|
||||||
self._api_url(remote_path),
|
|
||||||
headers={**self._headers, "Content-Type": "application/json"},
|
|
||||||
json=payload,
|
|
||||||
timeout=self.timeout,
|
|
||||||
)
|
|
||||||
if resp.status_code not in (200, 201):
|
|
||||||
logger.warning(f"推送文件失败: {resp.status_code} {resp.text[:200]}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
new_sha = resp.json().get("content", {}).get("sha", "")
|
|
||||||
logger.info(f"推送文件成功: {remote_path} (sha={new_sha[:12]})")
|
|
||||||
return new_sha
|
|
||||||
|
|
||||||
except requests.RequestException as e:
|
|
||||||
logger.error(f"推送文件网络错误: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
    def file_exists(self, remote_path: str) -> Optional[str]:
        """Check whether a file exists in the repository.

        Returns:
            The file's sha if it exists, otherwise None.
        """
        try:
            resp = requests.head(
                self._api_url(remote_path),
                headers=self._headers,
                timeout=self.timeout,
            )
            if resp.status_code == 200:
                # HEAD returns no body, so a GET is needed to obtain the sha
                result = self.pull_file(remote_path)
                return result[1] if result else None
            return None
        except requests.RequestException:
            return None
|
|
||||||
|
|
||||||
def pull_json(self, remote_path: str) -> Optional[Tuple[dict, str]]:
|
|
||||||
"""拉取并解析 JSON 文件
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
(parsed_dict, sha) 或 None
|
|
||||||
"""
|
|
||||||
result = self.pull_file(remote_path)
|
|
||||||
if result is None:
|
|
||||||
return None
|
|
||||||
content_bytes, sha = result
|
|
||||||
try:
|
|
||||||
data = json.loads(content_bytes)
|
|
||||||
return data, sha
|
|
||||||
except json.JSONDecodeError as e:
|
|
||||||
logger.error(f"解析 JSON 失败: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
def push_json(self, remote_path: str, data: dict, message: str, sha: Optional[str] = None) -> Optional[str]:
|
|
||||||
"""将 dict 序列化为 JSON 并推送
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
新的 sha,失败返回 None
|
|
||||||
"""
|
|
||||||
content = json.dumps(data, ensure_ascii=False, indent=2).encode("utf-8")
|
|
||||||
return self.push_file(remote_path, content, message, sha)
|
|
||||||
|
|
||||||
def push_binary(self, remote_path: str, local_path: str, message: str) -> Optional[str]:
|
|
||||||
"""读取本地二进制文件并推送到云端
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
新的 sha,失败返回 None
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
with open(local_path, "rb") as f:
|
|
||||||
content = f.read()
|
|
||||||
except OSError as e:
|
|
||||||
logger.error(f"读取本地文件失败: {local_path} — {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
existing_sha = self.file_exists(remote_path)
|
|
||||||
return self.push_file(remote_path, content, message, sha=existing_sha)
|
|
||||||
|
|
||||||
def push(self) -> str:
|
|
||||||
"""Push local data to the cloud: product_cache.json + barcode_mappings.json"""
|
|
||||||
import os
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
project_root = Path(__file__).resolve().parent.parent.parent.parent
|
|
||||||
|
|
||||||
results = []
|
|
||||||
# 1. Product cache
|
|
||||||
from app.core.db.product_db import ProductDatabase
|
|
||||||
excel_source = str(project_root / "templates" / "商品资料.xlsx")
|
|
||||||
db_path = str(project_root / "data" / "product_cache.db")
|
|
||||||
product_db = ProductDatabase(db_path, excel_source)
|
|
||||||
product_data = product_db.export_for_sync()
|
|
||||||
sha = self.push_json("product_cache.json", product_data, "sync: update product cache")
|
|
||||||
results.append(f"product_cache: {'ok' if sha else 'skip'}")
|
|
||||||
|
|
||||||
# 2. Barcode mappings
|
|
||||||
barcode_path = project_root / "config" / "barcode_mappings.json"
|
|
||||||
if barcode_path.exists():
|
|
||||||
with open(barcode_path, "r", encoding="utf-8") as f:
|
|
||||||
barcode_data = json.loads(f.read())
|
|
||||||
sha = self.push_json("barcode_mappings.json", barcode_data, "sync: update barcode mappings")
|
|
||||||
results.append(f"barcode_mappings: {'ok' if sha else 'skip'}")
|
|
||||||
|
|
||||||
return "; ".join(results) if results else "无数据需要同步"
|
|
||||||
|
|
||||||
def pull(self) -> str:
|
|
||||||
"""Pull data from the cloud and write it to local files"""
|
|
||||||
import os
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
project_root = Path(__file__).resolve().parent.parent.parent.parent
|
|
||||||
|
|
||||||
results = []
|
|
||||||
# 1. Product cache
|
|
||||||
result = self.pull_json("product_cache.json")
|
|
||||||
if result is not None:
|
|
||||||
data, sha = result
|
|
||||||
from app.core.db.product_db import ProductDatabase
|
|
||||||
excel_source = str(project_root / "templates" / "商品资料.xlsx")
|
|
||||||
db_path = str(project_root / "data" / "product_cache.db")
|
|
||||||
os.makedirs(os.path.dirname(db_path), exist_ok=True)
|
|
||||||
product_db = ProductDatabase(db_path, excel_source)
|
|
||||||
count = product_db.import_from_sync(data)
|
|
||||||
results.append(f"product_cache: 导入 {count} 条")
|
|
||||||
else:
|
|
||||||
results.append("product_cache: 云端无数据")
|
|
||||||
|
|
||||||
# 2. Barcode mappings
|
|
||||||
barcode_result = self.pull_json("barcode_mappings.json")
|
|
||||||
if barcode_result is not None:
|
|
||||||
barcode_data, sha = barcode_result
|
|
||||||
barcode_path = project_root / "config" / "barcode_mappings.json"
|
|
||||||
barcode_path.parent.mkdir(parents=True, exist_ok=True)
|
|
||||||
with open(barcode_path, "w", encoding="utf-8") as f:
|
|
||||||
json.dump(barcode_data, f, ensure_ascii=False, indent=2)
|
|
||||||
results.append("barcode_mappings: 已更新")
|
|
||||||
else:
|
|
||||||
results.append("barcode_mappings: 云端无数据")
|
|
||||||
|
|
||||||
return "; ".join(results) if results else "无数据需要同步"
|
|
||||||
|
|
||||||
@classmethod
|
|
||||||
def from_config(cls, config) -> Optional["GiteaSync"]:
|
|
||||||
"""从 ConfigManager 创建实例
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
GiteaSync 实例,配置不完整时返回 None
|
|
||||||
"""
|
|
||||||
base_url = config.get("Gitea", "base_url", fallback="").strip()
|
|
||||||
owner = config.get("Gitea", "owner", fallback="").strip()
|
|
||||||
repo = config.get("Gitea", "repo", fallback="").strip()
|
|
||||||
token = config.get("Gitea", "token", fallback="").strip()
|
|
||||||
|
|
||||||
if not all([base_url, owner, repo, token]):
|
|
||||||
logger.debug("Gitea 配置不完整,跳过云端同步")
|
|
||||||
return None
|
|
||||||
|
|
||||||
return cls(base_url=base_url, owner=owner, repo=repo, token=token)
|
|
||||||
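`GiteaSync.push_file` and `pull_file` agree on a simple encoding convention: outgoing content is base64-encoded into the JSON body, the `sha` field is included only when an existing file is being updated, and incoming base64 has its newlines stripped before decoding. The sketch below isolates that convention with no network access. The helper names `build_contents_payload` and `decode_contents_field` are hypothetical and do not appear in the original class.

```python
import base64
from typing import Dict, Optional


def build_contents_payload(
    content: bytes, message: str, sha: Optional[str] = None
) -> Dict[str, str]:
    """Build the JSON body for a create-or-update request."""
    payload = {
        "message": message,
        "content": base64.b64encode(content).decode("ascii"),
    }
    if sha:
        payload["sha"] = sha  # only sent when updating an existing file
    return payload


def decode_contents_field(content_b64: str) -> bytes:
    """Decode a base64 content field that may be wrapped across lines."""
    return base64.b64decode(content_b64.replace("\n", ""))


if __name__ == "__main__":
    body = build_contents_payload(b'{"a": 1}', "sync: example")
    print(body)
    print(decode_contents_field(body["content"]))
```

Separating payload construction from the HTTP call would let the encoding be checked in a unit test while the request itself is mocked or skipped.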
@@ -8,14 +8,10 @@
|
|||||||
"""
|
"""
|
||||||
|
|
||||||
import os
|
import os
|
||||||
import json
|
|
||||||
import tkinter as tk
|
import tkinter as tk
|
||||||
from tkinter import messagebox, ttk, simpledialog
|
from tkinter import messagebox, ttk, simpledialog
|
||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
|
|
||||||
from .cloud_sync import GiteaSync
|
|
||||||
from app.config.settings import ConfigManager
|
|
||||||
|
|
||||||
def create_custom_dialog(title="提示", message="", result_file=None, time_info=None,
|
def create_custom_dialog(title="提示", message="", result_file=None, time_info=None,
|
||||||
count_info=None, amount_info=None, additional_info=None):
|
count_info=None, amount_info=None, additional_info=None):
|
||||||
"""
|
"""
|
||||||
@@ -82,12 +78,11 @@ def create_custom_dialog(title="提示", message="", result_file=None, time_info
|
|||||||
file_size = os.path.getsize(result_file)
|
file_size = os.path.getsize(result_file)
|
||||||
file_time = datetime.fromtimestamp(os.path.getmtime(result_file))
|
file_time = datetime.fromtimestamp(os.path.getmtime(result_file))
|
||||||
|
|
||||||
from .file_utils import format_file_size
|
size_text = f"{file_size / 1024:.1f} KB" if file_size < 1024*1024 else f"{file_size / (1024*1024):.1f} MB"
|
||||||
size_text = format_file_size(file_size)
|
|
||||||
|
|
||||||
tk.Label(file_frame, text=f"文件大小: {size_text}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
|
tk.Label(file_frame, text=f"文件大小: {size_text}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
|
||||||
tk.Label(file_frame, text=f"创建时间: {file_time.strftime('%Y-%m-%d %H:%M:%S')}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
|
tk.Label(file_frame, text=f"创建时间: {file_time.strftime('%Y-%m-%d %H:%M:%S')}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
|
||||||
except Exception:
|
except:
|
||||||
tk.Label(file_frame, text="无法获取文件信息", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
|
tk.Label(file_frame, text="无法获取文件信息", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
|
||||||
|
|
||||||
# 添加按钮
|
# 添加按钮
|
||||||
@@ -107,7 +102,7 @@ def create_custom_dialog(title="提示", message="", result_file=None, time_info
|
|||||||
button_frame = tk.Frame(dialog)
|
button_frame = tk.Frame(dialog)
|
||||||
button_frame.pack(pady=10)
|
button_frame.pack(pady=10)
|
||||||
|
|
||||||
tk.Button(button_frame, text="打开输出目录", command=lambda: os.startfile(ConfigManager().get_path('Paths', 'output_folder', fallback='data/output', create=True))).pack(side=tk.LEFT, padx=5)
|
tk.Button(button_frame, text="打开输出目录", command=lambda: os.startfile(os.path.abspath("data/output"))).pack(side=tk.LEFT, padx=5)
|
||||||
tk.Button(button_frame, text="关闭", command=dialog.destroy).pack(side=tk.LEFT, padx=5)
|
tk.Button(button_frame, text="关闭", command=dialog.destroy).pack(side=tk.LEFT, padx=5)
|
||||||
|
|
||||||
# 确保窗口显示在最前
|
# 确保窗口显示在最前
|
||||||
@@ -488,135 +483,6 @@ def create_barcode_mapping_dialog(parent=None, on_save=None, current_mappings=No
|
|||||||
cancel_btn = tk.Button(bottom_frame, text="取消", command=cancel)
|
cancel_btn = tk.Button(bottom_frame, text="取消", command=cancel)
|
||||||
cancel_btn.pack(side=tk.RIGHT, padx=5)
|
cancel_btn.pack(side=tk.RIGHT, padx=5)
|
||||||
|
|
||||||
# ---- Cloud sync buttons ----
|
|
||||||
def _build_current_mappings():
|
|
||||||
"""Build the mappings dict from the dialog's current data (same logic as save_mappings)"""
|
|
||||||
mappings = {}
|
|
||||||
for source, target in mapping_list:
|
|
||||||
mappings[source] = {
|
|
||||||
'map_to': target,
|
|
||||||
'description': f'条码映射:{source} -> {target}'
|
|
||||||
}
|
|
||||||
for barcode, multiplier, unit, price, spec, desc in special_list:
|
|
||||||
if barcode not in mappings:
|
|
||||||
mappings[barcode] = {}
|
|
||||||
if multiplier:
|
|
||||||
try:
|
|
||||||
if isinstance(multiplier, str):
|
|
||||||
mappings[barcode]['multiplier'] = float(multiplier) if '.' in multiplier else int(multiplier)
|
|
||||||
else:
|
|
||||||
mappings[barcode]['multiplier'] = multiplier
|
|
||||||
except ValueError:
|
|
||||||
mappings[barcode]['multiplier'] = multiplier
|
|
||||||
if unit:
|
|
||||||
mappings[barcode]['target_unit'] = unit
|
|
||||||
if price:
|
|
||||||
try:
|
|
||||||
mappings[barcode]['fixed_price'] = float(price)
|
|
||||||
except ValueError:
|
|
||||||
mappings[barcode]['fixed_price'] = price
|
|
||||||
if spec:
|
|
||||||
mappings[barcode]['specification'] = spec
|
|
||||||
if desc and "映射到:" in desc:
|
|
||||||
parts = desc.split("映射到:")
|
|
||||||
base_desc = parts[0].strip()
|
|
||||||
target_barcode = parts[1].strip()
|
|
||||||
if base_desc:
|
|
||||||
mappings[barcode]['description'] = base_desc
|
|
||||||
mappings[barcode]['map_to'] = target_barcode
|
|
||||||
elif desc:
|
|
||||||
mappings[barcode]['description'] = desc
|
|
||||||
return mappings
|
|
||||||
|
|
||||||
def _get_sync():
|
|
||||||
"""Get a GiteaSync instance; prompt the user when the configuration is incomplete"""
|
|
||||||
sync = GiteaSync.from_config(ConfigManager())
|
|
||||||
if sync is None:
|
|
||||||
messagebox.showwarning("云端同步", "请先在「系统设置」中配置 Gitea 云端同步参数(token)")
|
|
||||||
return sync
|
|
||||||
|
|
||||||
def _refresh_trees(new_mappings):
|
|
||||||
"""Refresh both Treeviews with the new data"""
|
|
||||||
# Clear
|
|
||||||
for item in mapping_tree.get_children():
|
|
||||||
mapping_tree.delete(item)
|
|
||||||
mapping_list.clear()
|
|
||||||
for item in special_tree.get_children():
|
|
||||||
special_tree.delete(item)
|
|
||||||
special_list.clear()
|
|
||||||
# Repopulate
|
|
||||||
if new_mappings:
|
|
||||||
for barcode, data in new_mappings.items():
|
|
||||||
if 'map_to' in data and 'multiplier' not in data:
|
|
||||||
mapping_list.append((barcode, data['map_to']))
|
|
||||||
mapping_tree.insert('', 'end', values=(barcode, data['map_to']))
|
|
||||||
else:
|
|
||||||
mult = data.get('multiplier', '')
|
|
||||||
unit = data.get('target_unit', '')
|
|
||||||
price = data.get('fixed_price', '')
|
|
||||||
spec = data.get('specification', '')
|
|
||||||
desc = data.get('description', '')
|
|
||||||
if 'map_to' in data:
|
|
||||||
desc = f"{desc} 映射到: {data['map_to']}" if desc else f"映射到: {data['map_to']}"
|
|
||||||
special_list.append((barcode, mult, unit, price, spec, desc))
|
|
||||||
tags = ("mapped",) if 'map_to' in data else ()
|
|
||||||
special_tree.insert('', 'end', values=(barcode, mult, unit, price, spec, desc), tags=tags)
|
|
||||||
if any('map_to' in d for d in new_mappings.values()):
|
|
||||||
special_tree.tag_configure("mapped", foreground="blue")
|
|
||||||
|
|
||||||
-    def push_to_cloud():
-        sync = _get_sync()
-        if not sync:
-            return
-        mappings = _build_current_mappings()
-        if not mappings:
-            messagebox.showwarning("同步到云端", "当前没有映射数据可同步")
-            return
-        # Fetch the current sha first (if the file already exists)
-        sha = None
-        existing = sync.pull_file("barcode_mappings.json")
-        if existing:
-            sha = existing[1]
-        new_sha = sync.push_json(
-            "barcode_mappings.json",
-            mappings,
-            f"同步条码映射 ({len(mappings)} 条)",
-            sha=sha,
-        )
-        if new_sha:
-            messagebox.showinfo("同步成功", f"已推送 {len(mappings)} 条映射到云端")
-        else:
-            messagebox.showerror("同步失败", "推送到云端失败,请检查网络和 Gitea 配置")
-
-    def pull_from_cloud():
-        sync = _get_sync()
-        if not sync:
-            return
-        result = sync.pull_json("barcode_mappings.json")
-        if result is None:
-            messagebox.showwarning("拉取失败", "云端没有找到条码映射文件,或网络错误")
-            return
-        data, sha = result
-        if not isinstance(data, dict) or len(data) == 0:
-            messagebox.showwarning("拉取失败", "云端数据格式异常")
-            return
-        # Also save locally
-        from app.core.excel.converter import UnitConverter
-        uc = UnitConverter()
-        uc.update_barcode_mappings(data)
-        # Refresh the dialog
-        _refresh_trees(data)
-        messagebox.showinfo("拉取成功", f"已从云端拉取 {len(data)} 条映射,本地已同步更新")
-
-    sync_frame = tk.Frame(bottom_frame)
-    sync_frame.pack(side=tk.LEFT, padx=5)
-
-    push_btn = tk.Button(sync_frame, text="同步到云端", command=push_to_cloud, fg="white", bg="#4a90d9")
-    push_btn.pack(side=tk.LEFT, padx=3)
-
-    pull_btn = tk.Button(sync_frame, text="从云端拉取", command=pull_from_cloud, fg="white", bg="#5cb85c")
-    pull_btn.pack(side=tk.LEFT, padx=3)
-
     # Load the current mapping data
     if current_mappings:
         for barcode, data in current_mappings.items():
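The removed dialog code stores a redirect target inside the description column as `映射到: <barcode>` and later splits it back out when rebuilding the mapping dict. A minimal sketch of that round trip, assuming only what this hunk shows; `encode_description` and `decode_description` are hypothetical helper names, not functions from the repository:

```python
from typing import Optional, Tuple

MARKER = "映射到:"  # literal used by the dialog code in this hunk


def encode_description(desc: str, map_to: Optional[str]) -> str:
    """Fold the redirect target into the description text (display direction)."""
    if not map_to:
        return desc
    return f"{desc} {MARKER} {map_to}" if desc else f"{MARKER} {map_to}"


def decode_description(text: str) -> Tuple[str, Optional[str]]:
    """Split display text back into (description, map_to) (save direction)."""
    if MARKER not in text:
        return text, None
    parts = text.split(MARKER)
    return parts[0].strip(), parts[1].strip()
```

One consequence of this encoding is that a free-text description which already contains `映射到:` would not survive the round trip, because the split takes the piece after the first marker as the target.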
@@ -800,362 +666,5 @@ def show_config_dialog(parent, config_manager, on_save=None):
     dialog.transient(parent)
     dialog.grab_set()
 
-
-# ──────────────────────────────────────────────────────────────
+    # Wait for the window to close
+    parent.wait_window(dialog)
-# Cloud sync management dialog
-# ──────────────────────────────────────────────────────────────
-
-SYNC_FILES = [
-    {
-        "name": "条码映射",
-        "remote": "barcode_mappings.json",
-        "local": "config/barcode_mappings.json",
-        "type": "json",
-    },
-    {
-        "name": "供应商配置",
-        "remote": "suppliers_config.json",
-        "local": "config/suppliers_config.json",
-        "type": "json",
-    },
-    {
-        "name": "商品资料",
-        "remote": "templates/商品资料.xlsx",
-        "local": "templates/商品资料.xlsx",
-        "type": "binary",
-    },
-    {
-        "name": "采购单模板",
-        "remote": "templates/银豹-采购单模板.xls",
-        "local": "templates/银豹-采购单模板.xls",
-        "type": "binary",
-    },
-    {
-        "name": "商品记忆库",
-        "remote": "product_memory.json",
-        "local": "data/product_memory.json",
-        "type": "json",
-    },
-]
-
-
-def _format_size(path: str) -> str:
-    try:
-        size = os.path.getsize(path)
-        if size < 1024 * 1024:
-            return f"{size / 1024:.1f} KB"
-        return f"{size / (1024 * 1024):.1f} MB"
-    except OSError:
-        return "—"
-
-
-def show_cloud_sync_dialog(parent=None):
-    """Unified cloud sync management dialog"""
-
-    sync = GiteaSync.from_config(ConfigManager())
-    if sync is None:
-        messagebox.showwarning(
-            "配置不完整",
-            "请先在「系统设置」中配置 Gitea 地址和 Access Token",
-        )
-        return
-
-    dlg = tk.Toplevel(parent)
-    dlg.title("云端同步管理")
-    dlg.geometry("620x440")
-    dlg.resizable(False, False)
-
-    # Center on screen
-    dlg.update_idletasks()
-    x = (dlg.winfo_screenwidth() - 620) // 2
-    y = (dlg.winfo_screenheight() - 440) // 2
-    dlg.geometry(f"620x440+{x}+{y}")
-
-    # ── Treeview ──
-    columns = ("name", "local_status", "cloud_status")
-    tree = ttk.Treeview(dlg, columns=columns, show="headings", height=6)
-    tree.heading("name", text="文件")
-    tree.heading("local_status", text="本地状态")
-    tree.heading("cloud_status", text="云端状态")
-    tree.column("name", width=140)
-    tree.column("local_status", width=220)
-    tree.column("cloud_status", width=220)
-    tree.pack(fill=tk.BOTH, expand=True, padx=16, pady=(16, 8))
-
-    # Tag colours
-    tree.tag_configure("synced", foreground="#2e7d32")
-    tree.tag_configure("cloud_only", foreground="#e65100")
-    tree.tag_configure("local_only", foreground="#1565c0")
-    tree.tag_configure("missing", foreground="#999999")
-
-    # Each row is identified by iid = remote_path
-    cloud_sha_cache: dict = {}  # remote_path -> sha
-
-    def _load_local_status():
-        """Load local status only; no network requests"""
-        for item in tree.get_children():
-            tree.delete(item)
-        for entry in SYNC_FILES:
-            local = entry["local"]
-            if os.path.exists(local):
-                if entry["type"] == "json":
-                    try:
-                        with open(local, "r", encoding="utf-8") as f:
-                            data = json.load(f)
-                        if isinstance(data, dict):
-                            local_text = f"{len(data)} 项"
-                        elif isinstance(data, list):
-                            local_text = f"{len(data)} 条记录"
-                        else:
-                            local_text = "已存在"
-                    except Exception:
-                        local_text = "已存在(解析异常)"
-                else:
-                    local_text = _format_size(local)
-                tag = "local_only"
-            else:
-                local_text = "不存在"
-                tag = "missing"
-            tree.insert(
-                "", tk.END,
-                iid=entry["remote"],
-                values=(entry["name"], local_text, "点「刷新状态」检查"),
-                tags=(tag,),
-            )
-
-    def refresh_status():
-        """Refresh the local/cloud status of every row"""
-        cloud_sha_cache.clear()
-        for item in tree.get_children():
-            tree.delete(item)
-
-        for entry in SYNC_FILES:
-            remote = entry["remote"]
-            local = entry["local"]
-
-            # Local status
-            if os.path.exists(local):
-                if entry["type"] == "json":
-                    try:
-                        with open(local, "r", encoding="utf-8") as f:
-                            data = json.load(f)
-                        if isinstance(data, dict):
-                            local_text = f"{len(data)} 项"
-                        elif isinstance(data, list):
-                            local_text = f"{len(data)} 条记录"
-                        else:
-                            local_text = "已存在"
-                    except Exception:
-                        local_text = "已存在(解析异常)"
-                else:
-                    local_text = _format_size(local)
-            else:
-                local_text = "不存在"
-
-            # Cloud status: network request, may be slow
-            sha = sync.file_exists(remote)
-            if sha:
-                cloud_sha_cache[remote] = sha
-                cloud_text = "已存在"
-            else:
-                cloud_text = "未上传"
-
-            # tag
-            local_ok = os.path.exists(local)
-            cloud_ok = sha is not None
-            if local_ok and cloud_ok:
-                tag = "synced"
-            elif cloud_ok and not local_ok:
-                tag = "cloud_only"
-            elif local_ok and not cloud_ok:
-                tag = "local_only"
-            else:
-                tag = "missing"
-
-            tree.insert(
-                "", tk.END,
-                iid=remote,
-                values=(entry["name"], local_text, cloud_text),
-                tags=(tag,),
-            )
-
-    # ── Actions ──
-    def _get_selected_entries():
-        """Return the list of selected file entries"""
-        selected = tree.selection()
-        if not selected:
-            messagebox.showinfo("提示", "请先选中要操作的文件")
-            return []
-        return [e for e in SYNC_FILES if e["remote"] in selected]
-
-    def push_selected():
-        entries = _get_selected_entries()
-        if not entries:
-            return
-        ok, fail = 0, 0
-        for entry in entries:
-            local, remote = entry["local"], entry["remote"]
-            if not os.path.exists(local):
-                messagebox.showwarning("跳过", f"本地文件不存在: {local}")
-                fail += 1
-                continue
-
-            if entry["type"] == "json":
-                try:
-                    with open(local, "r", encoding="utf-8") as f:
-                        data = json.load(f)
-                    sha = cloud_sha_cache.get(remote)
-                    result = sync.push_json(remote, data, f"同步 {entry['name']}", sha=sha)
-                except Exception as e:
-                    messagebox.showerror("推送失败", f"{entry['name']}: {e}")
-                    fail += 1
-                    continue
-            else:
-                result = sync.push_binary(remote, local, f"同步 {entry['name']}")
-
-            if result:
-                ok += 1
-            else:
-                fail += 1
-
-        if ok:
-            messagebox.showinfo("推送完成", f"成功 {ok} 个" + (f",失败 {fail} 个" if fail else ""))
-            refresh_status()
-
-    def pull_selected():
-        entries = _get_selected_entries()
-        if not entries:
-            return
-        ok, fail = 0, 0
-        for entry in entries:
-            remote, local = entry["remote"], entry["local"]
-
-            if entry["type"] == "json":
-                result = sync.pull_json(remote)
-                if result is None:
-                    messagebox.showwarning("拉取失败", f"云端文件不存在: {entry['name']}")
-                    fail += 1
-                    continue
-                content, sha = result
-                # Write to the local file
-                os.makedirs(os.path.dirname(local) or ".", exist_ok=True)
-                with open(local, "w", encoding="utf-8") as f:
-                    json.dump(content, f, ensure_ascii=False, indent=2)
-                # Special post-processing
-                _post_pull(entry, content)
-            else:
-                result = sync.pull_file(remote)
-                if result is None:
-                    messagebox.showwarning("拉取失败", f"云端文件不存在: {entry['name']}")
-                    fail += 1
-                    continue
-                content, sha = result
-                os.makedirs(os.path.dirname(local) or ".", exist_ok=True)
-                with open(local, "wb") as f:
-                    f.write(content)
-
-            ok += 1
-
-        if ok:
-            messagebox.showinfo("拉取完成", f"成功 {ok} 个" + (f",失败 {fail} 个" if fail else ""))
-            refresh_status()
-
-    def _post_pull(entry, data):
-        """Special handling after pulling a JSON file"""
-        if entry["remote"] == "barcode_mappings.json":
-            try:
-                from app.core.excel.converter import UnitConverter
-                UnitConverter().update_barcode_mappings(data)
-            except Exception:
-                pass
-        elif entry["remote"] == "suppliers_config.json":
-            try:
-                from app.services.processor_service import ProcessorService
-                ProcessorService(ConfigManager()).reload_processors()
-            except Exception:
-                pass
-        elif entry["remote"] == "product_memory.json":
-            try:
-                from app.core.db.product_db import ProductDatabase
-                cfg = ConfigManager()
-                db_path = cfg.get_path('Paths', 'product_db', fallback='data/product_cache.db') if hasattr(cfg, 'get_path') else 'data/product_cache.db'
-                tpl_folder = cfg.get('Paths', 'template_folder', fallback='templates')
-                item_data = cfg.get('Templates', 'item_data', fallback='商品资料.xlsx')
-                tpl_path = os.path.join(tpl_folder, item_data)
-                db = ProductDatabase(db_path, tpl_path)
-                count = db.import_from_sync(data)
-                logger.info(f"从云端导入商品记忆: {count} 条")
-            except Exception:
-                pass
-
-    def push_all():
-        ok, fail = 0, 0
-        for entry in SYNC_FILES:
-            local, remote = entry["local"], entry["remote"]
-            if not os.path.exists(local):
-                fail += 1
-                continue
-            if entry["type"] == "json":
-                try:
-                    with open(local, "r", encoding="utf-8") as f:
-                        data = json.load(f)
-                    sha = cloud_sha_cache.get(remote)
-                    result = sync.push_json(remote, data, f"批量同步 {entry['name']}", sha=sha)
-                except Exception:
-                    fail += 1
-                    continue
-            else:
-                result = sync.push_binary(remote, local, f"批量同步 {entry['name']}")
-            if result:
-                ok += 1
-            else:
-                fail += 1
-        messagebox.showinfo("批量推送完成", f"成功 {ok} 个,失败 {fail} 个")
-        refresh_status()
-
-    def pull_all():
-        ok, fail = 0, 0
-        for entry in SYNC_FILES:
-            remote, local = entry["remote"], entry["local"]
-            if entry["type"] == "json":
-                result = sync.pull_json(remote)
-                if result is None:
-                    fail += 1
-                    continue
-                content, sha = result
-                os.makedirs(os.path.dirname(local) or ".", exist_ok=True)
-                with open(local, "w", encoding="utf-8") as f:
-                    json.dump(content, f, ensure_ascii=False, indent=2)
-                _post_pull(entry, content)
-            else:
-                result = sync.pull_file(remote)
-                if result is None:
-                    fail += 1
-                    continue
-                content, sha = result
-                os.makedirs(os.path.dirname(local) or ".", exist_ok=True)
-                with open(local, "wb") as f:
-                    f.write(content)
-            ok += 1
-        messagebox.showinfo("批量拉取完成", f"成功 {ok} 个,失败 {fail} 个")
-        refresh_status()
-
-    # ── Button area ──
-    btn_frame = ttk.Frame(dlg)
-    btn_frame.pack(fill=tk.X, padx=16, pady=(4, 16))
-
-    # Left: bulk actions
-    ttk.Button(btn_frame, text="全部推送到云端", command=push_all).pack(side=tk.LEFT, padx=4)
-    ttk.Button(btn_frame, text="全部从云端拉取", command=pull_all).pack(side=tk.LEFT, padx=4)
-
-    # Right: actions on the selection + refresh + close
-    ttk.Button(btn_frame, text="关闭", command=dlg.destroy).pack(side=tk.RIGHT, padx=4)
-    ttk.Button(btn_frame, text="刷新状态", command=refresh_status).pack(side=tk.RIGHT, padx=4)
-    tk.Button(btn_frame, text="推送到云端", command=push_selected, fg="white", bg="#4a90d9").pack(side=tk.RIGHT, padx=4)
-    tk.Button(btn_frame, text="从云端拉取", command=pull_selected, fg="white", bg="#5cb85c").pack(side=tk.RIGHT, padx=4)
-
-    # Show local status only; cloud status requires clicking "刷新状态" manually
-    _load_local_status()
-
-    dlg.transient(parent)
-    dlg.grab_set()
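The removed `refresh_status` derives each row's colour tag from two booleans: whether the local file exists and whether the remote lookup returned a sha. A minimal sketch of that decision as a pure function; `classify_sync_state` is a hypothetical name introduced here for illustration, and the tag strings are the ones used in this hunk:

```python
def classify_sync_state(local_ok: bool, cloud_ok: bool) -> str:
    """Map (local exists, cloud exists) to the Treeview tag used by the dialog."""
    if local_ok and cloud_ok:
        return "synced"
    if cloud_ok:
        return "cloud_only"
    if local_ok:
        return "local_only"
    return "missing"


# Enumerate all four combinations for a quick visual check.
for local_ok in (True, False):
    for cloud_ok in (True, False):
        print(local_ok, cloud_ok, classify_sync_state(local_ok, cloud_ok))
```

Note that `synced` here only means both copies exist. Nothing in the hunk compares contents or hashes, so a stale local file would still be shown as synced.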
@@ -219,34 +219,6 @@ def save_json(data: Any, file_path: str, ensure_ascii: bool = False, indent: int
         logger.error(f"保存JSON文件失败: {file_path}, 错误: {e}")
         return False
 
-def smart_read_excel(file_path: Union[str, Path], **kwargs) -> Any:
-    """
-    Smart-read an Excel file: pick the engine automatically and handle common errors
-
-    Args:
-        file_path: Path to the Excel file
-        **kwargs: Extra arguments passed through to pd.read_excel
-
-    Returns:
-        A pandas.DataFrame object
-    """
-    import pandas as pd
-
-    path_str = str(file_path)
-    ext = os.path.splitext(path_str)[1].lower()
-
-    # Pick the engine automatically
-    if ext == '.xlsx':
-        kwargs.setdefault('engine', 'openpyxl')
-    elif ext == '.xls':
-        kwargs.setdefault('engine', 'xlrd')
-
-    try:
-        return pd.read_excel(path_str, **kwargs)
-    except Exception as e:
-        logger.error(f"读取 Excel 文件失败: {path_str}, 错误: {e}")
-        raise
-
 def get_file_size(file_path: str) -> int:
     """
     Get the file size in bytes
@@ -277,10 +249,3 @@ def is_file_size_valid(file_path: str, max_size_mb: float) -> bool:
     size_bytes = get_file_size(file_path)
     max_size_bytes = max_size_mb * 1024 * 1024
     return size_bytes <= max_size_bytes
-
-
-def format_file_size(size_bytes: int) -> str:
-    """Format a byte count as a human-readable size string (KB/MB)"""
-    if size_bytes < 1024 * 1024:
-        return f"{size_bytes / 1024:.1f} KB"
-    return f"{size_bytes / (1024 * 1024):.1f} MB"
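For reference, the removed `format_file_size` helper switches from KB to MB at exactly 1 MiB and always prints one decimal place. The function body below is copied from the hunk; the sample values are assumptions chosen to show the boundary behaviour:

```python
def format_file_size(size_bytes: int) -> str:
    """Format a byte count as KB below 1 MiB, otherwise as MB."""
    if size_bytes < 1024 * 1024:
        return f"{size_bytes / 1024:.1f} KB"
    return f"{size_bytes / (1024 * 1024):.1f} MB"


for sample in (0, 512, 1536, 1024 * 1024 - 1, 5 * 1024 * 1024):
    print(sample, "->", format_file_size(sample))
```

Two details follow from the formula: sizes under 1 KiB are shown as fractional KB (512 bytes becomes `0.5 KB`), and a size just below the threshold rounds up to `1024.0 KB` rather than switching to MB.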
@@ -7,7 +7,6 @@
 import os
 import sys
 import logging
-from logging.handlers import RotatingFileHandler
 from datetime import datetime
 from pathlib import Path
 from typing import Optional, Dict
@@ -59,8 +58,7 @@ def setup_logger(name: str,
 
     # Create the file handler
     try:
-        # Use a rotating log to cap the size of each log file and the number of backups
-        file_handler = RotatingFileHandler(log_file, maxBytes=5 * 1024 * 1024, backupCount=3, encoding='utf-8')
+        file_handler = logging.FileHandler(log_file, encoding='utf-8')
         file_handler.setFormatter(formatter)
         file_handler.setLevel(level)
         logger.addHandler(file_handler)
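The only behavioural difference in this `setup_logger` hunk is the handler class: the removed line builds a `RotatingFileHandler` capped at 5 MiB with three backups, while the added line builds a plain `logging.FileHandler`. A small sketch contrasting the two, using only standard-library calls; `make_file_handler` and its `rotate` flag are hypothetical names, not part of the repository:

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def make_file_handler(log_file: str, rotate: bool = True) -> logging.Handler:
    """Build the file handler for either side of the hunk (illustrative helper)."""
    if rotate:
        # Removed side: roll over at 5 MiB and keep up to three backup files.
        return RotatingFileHandler(
            log_file, maxBytes=5 * 1024 * 1024, backupCount=3, encoding="utf-8"
        )
    # Added side: a single log file that grows without a size limit.
    return logging.FileHandler(log_file, encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        handler = make_file_handler(os.path.join(tmp, "app.log"))
        print(type(handler).__name__, handler.maxBytes, handler.backupCount)
        handler.close()  # release the file before the directory is removed
```

With rotation, disk usage for one logger is bounded at roughly four files of 5 MiB each; with the plain handler, cleanup has to happen elsewhere.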
@@ -5,7 +5,7 @@
 """
 
 import re
-from typing import Dict, List, Optional, Tuple, Any
+from typing import Dict, List, Optional, Tuple, Any, Match, Pattern
 
 def clean_string(text: str) -> str:
     """
@@ -192,51 +192,6 @@ def is_scientific_notation(value: str) -> bool:
     """
     return bool(re.match(r'^-?\d+(\.\d+)?[eE][+-]?\d+$', str(value)))
 
-def parse_monetary_string(value: Any) -> Optional[float]:
-    """
-    Parse an amount/quantity string into a float.
-    Handles: currency symbols (¥/$), comma as decimal point, comma as thousands separator, the Chinese "元" suffix, etc.
-
-    Args:
-        value: Amount value (string, number, or another type)
-
-    Returns:
-        The parsed float, or None if it cannot be parsed
-    """
-    if value is None:
-        return None
-    if isinstance(value, (int, float)):
-        return float(value)
-    if not isinstance(value, str):
-        return None
-
-    s = value.strip()
-    if not s or s.lower() in ('o', 'none', 'null', '-', '--'):
-        return None
-
-    # Strip non-numeric characters, keeping digits, decimal point, comma and minus sign
-    cleaned = re.sub(r'[^\d\.\-,]', '', s)
-    if not cleaned or cleaned in ('-', '.', '-.', ','):
-        return None
-
-    # Comma handling strategy:
-    # several commas -> thousands separators, remove all (e.g. "1,234,567" = 1234567)
-    # one comma + no decimal point -> comma is the decimal point (e.g. "1,5" = 1.5)
-    # one comma + decimal point -> thousands separator, remove it (e.g. "1,234.56" = 1234.56)
-    comma_count = cleaned.count(',')
-    if comma_count > 1:
-        cleaned = cleaned.replace(',', '')
-    elif comma_count == 1 and '.' not in cleaned:
-        cleaned = cleaned.replace(',', '.')
-    elif comma_count == 1 and '.' in cleaned:
-        cleaned = cleaned.replace(',', '')
-
-    try:
-        return float(cleaned)
-    except (ValueError, TypeError):
-        return None
-
-
 def format_barcode(barcode: Any) -> str:
     """
     Format a barcode, handling scientific notation
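The comma rule in the removed `parse_monetary_string` is easiest to judge on concrete inputs. The sketch below restates only the string branch of that rule in condensed form; `parse_amount` is a hypothetical name, and the two-way `if`/`else` is an equivalent rewrite of the three branches in the hunk, not the original code:

```python
import re
from typing import Optional


def parse_amount(text: str) -> Optional[float]:
    """Condensed restatement of the comma strategy from parse_monetary_string."""
    # Keep only digits, decimal point, minus sign and comma.
    cleaned = re.sub(r"[^\d.\-,]", "", text.strip())
    if not cleaned or cleaned in ("-", ".", "-.", ","):
        return None
    if cleaned.count(",") == 1 and "." not in cleaned:
        cleaned = cleaned.replace(",", ".")  # lone comma acts as the decimal point
    else:
        cleaned = cleaned.replace(",", "")   # otherwise commas are thousands separators
    try:
        return float(cleaned)
    except ValueError:
        return None


for sample in ("¥1,234.56", "1,234,567", "1,5", "1,234", "12.5元"):
    print(repr(sample), "->", parse_amount(sample))
```

The `"1,234"` case is the one to watch: with a single comma and no decimal point it is read as 1.234, not 1234, so thousands-grouped integers below one million are interpreted as decimals under this rule.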
@@ -4,7 +4,7 @@ OCR service module
 Provides OCR recognition services and coordinates the OCR workflow.
 """
 
-from typing import Dict, List, Optional, Tuple, Union, Any, Callable
+from typing import Dict, List, Optional, Tuple, Union, Any
 import os
 
 from ..config.settings import ConfigManager
@@ -88,7 +88,7 @@ class OCRService:
             logger.error(f"处理图片时发生错误: {e}", exc_info=True)
             return None
 
-    def process_images_batch(self, batch_size: int = None, max_workers: int = None, progress_cb: Optional[Callable[[int], None]] = None) -> Tuple[int, int]:
+    def process_images_batch(self, batch_size: int = None, max_workers: int = None) -> Tuple[int, int]:
         """
         Process images in batches
 
@@ -100,10 +100,10 @@ class OCRService:
             Tuple of (total processed, successfully processed)
         """
         logger.info(f"OCRService开始批量处理图片, batch_size={batch_size}, max_workers={max_workers}")
-        return self.ocr_processor.process_images_batch(batch_size, max_workers, progress_cb)
+        return self.ocr_processor.process_images_batch(batch_size, max_workers)
 
     # Add batch_process as an alias of process_images_batch for compatibility
-    def batch_process(self, batch_size: int = None, max_workers: int = None, progress_cb: Optional[Callable[[int], None]] = None) -> Tuple[int, int]:
+    def batch_process(self, batch_size: int = None, max_workers: int = None) -> Tuple[int, int]:
         """
         Process images in batches (alias method, same behaviour as process_images_batch)
 
@@ -115,7 +115,7 @@ class OCRService:
             Tuple of (total processed, successfully processed)
         """
         logger.info(f"OCRService.batch_process被调用,转发到process_images_batch")
-        return self.process_images_batch(batch_size, max_workers, progress_cb)
+        return self.process_images_batch(batch_size, max_workers)
 
     def validate_image(self, image_path: str) -> bool:
         """
@@ -154,7 +154,7 @@ class OCRService:
         # Get the file name (without extension)
         base_name = os.path.splitext(os.path.basename(image_path))[0]
         # Build the Excel file path
-        output_dir = self.config.get_path('Paths', 'output_folder', fallback='data/output', create=True) if hasattr(self.config, 'get_path') else os.path.abspath('data/output')
+        output_dir = self.config.get('Paths', 'output_folder', fallback='data/output')
         excel_path = os.path.join(output_dir, f"{base_name}.xlsx")
         return excel_path
 
+15
-150
@@ -4,14 +4,12 @@
 Provides order-processing services and coordinates Excel processing and order merging.
 """
 
-import os
-from typing import Dict, List, Optional, Tuple, Union, Any, Callable
+from typing import Dict, List, Optional, Tuple, Union, Any
 
 from ..config.settings import ConfigManager
 from ..core.utils.log_utils import get_logger
 from ..core.excel.processor import ExcelProcessor
 from ..core.excel.merger import PurchaseOrderMerger
-from ..core.db.product_db import ProductDatabase
 
 logger = get_logger(__name__)
 
@@ -30,15 +28,8 @@ class OrderService:
         logger.info("初始化OrderService")
         self.config = config or ConfigManager()
 
-        # Create the shared product database instance
-        db_path = self.config.get_path('Paths', 'product_db', fallback='data/product_cache.db') if hasattr(self.config, 'get_path') else 'data/product_cache.db'
-        tpl_folder = self.config.get('Paths', 'template_folder', fallback='templates')
-        item_data = self.config.get('Templates', 'item_data', fallback='商品资料.xlsx')
-        tpl_path = os.path.join(tpl_folder, item_data)
-        self.product_db = ProductDatabase(db_path, tpl_path)
-
         # Create the Excel processor and the purchase-order merger
-        self.excel_processor = ExcelProcessor(self.config, product_db=self.product_db)
+        self.excel_processor = ExcelProcessor(self.config)
         self.order_merger = PurchaseOrderMerger(self.config)
 
         logger.info("OrderService初始化完成")
@@ -52,9 +43,9 @@ class OrderService:
         """
         return self.excel_processor.get_latest_excel()
 
-    def process_excel(self, file_path: Optional[str] = None, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def process_excel(self, file_path: Optional[str] = None) -> Optional[str]:
         """
-        Process an Excel order file and generate a standard purchase order
+        Process an Excel file and generate a purchase order
 
         Args:
             file_path: Path to the Excel file; if None, the latest file is processed
@@ -62,84 +53,12 @@ class OrderService:
         Returns:
             Path of the generated purchase-order file, or None if processing fails
         """
-        if not file_path:
-            file_path = self.excel_processor.get_latest_excel()
-            if not file_path:
-                logger.warning("未找到可处理的Excel文件")
-                return None
-            logger.info("OrderService开始处理最新Excel文件")
-        else:
+        if file_path:
             logger.info(f"OrderService开始处理指定Excel文件: {file_path}")
-
-        # Check whether supplier-specific preprocessing is needed (e.g. 杨碧月)
-        try:
-            from .special_suppliers_service import SpecialSuppliersService
-            special_service = SpecialSuppliersService(self.config)
-
-            # Try to detect and preprocess (note: progress_cb is no longer passed here, to avoid infinite recursion or a duplicated progress bar,
-            # or else special_service finishes its own processing and returns the result directly)
-            # To avoid a call cycle, SpecialSuppliersService no longer calls process_excel internally;
-            # instead, process_excel does the detection and decides whether to process the preprocessed file.
-
-            # A new check_and_preprocess method is added for this
-            preprocessed_path = self._check_special_preprocess(file_path)
-            if preprocessed_path:
-                logger.info(f"检测到特殊供应商,已生成预处理文件: {preprocessed_path}")
-                file_path = preprocessed_path
-        except Exception as e:
-            logger.error(f"检查特殊预处理时出错: {e}")
-
-        return self.excel_processor.process_specific_file(file_path, progress_cb=progress_cb)
-
-    def _check_special_preprocess(self, file_path: str) -> Optional[str]:
-        """Detect and run supplier-specific preprocessing (supports 杨碧月, the tobacco company, and 蓉城易购)"""
-        try:
-            from app.core.utils.file_utils import smart_read_excel
-            import pandas as pd
-            import re
-
-            # Read only the first 50 rows for detection (header=None ensures the first row is read as data)
-            df_head = smart_read_excel(file_path, nrows=50, header=None)
-            df_str = df_head.astype(str)
-
-            # 1. Detect: tobacco company (烟草公司)
-            # Signature: content contains "专卖证号" or the specific licence number "510109104938"
-            is_tobacco = df_str.apply(lambda x: x.str.contains('专卖证号|510109104938')).any().any()
-            if is_tobacco:
-                logger.info("识别到烟草公司订单,执行专用预处理...")
-                from .tobacco_service import TobaccoService
-                tobacco_svc = TobaccoService(self.config)
-                return tobacco_svc.preprocess_tobacco_order(file_path)
-
-            # 2. Detect: 蓉城易购 (Rongcheng Yigou)
-            # Signature: content contains the order-number marker "RCDH"
-            is_rongcheng = df_str.apply(lambda x: x.str.contains('RCDH')).any().any()
-            if is_rongcheng:
-                logger.info("识别到蓉城易购订单,执行专用预处理...")
-                from .special_suppliers_service import SpecialSuppliersService
-                special_svc = SpecialSuppliersService(self.config)
-                return special_svc.preprocess_rongcheng_yigou(file_path)
-
-            # 3. Detect: 杨碧月 (Yang Biyue)
-            # Signature: the 经手人 (handler) column contains "杨碧月"
-            handler_col = None
-            for col in df_head.columns:
-                # Search the first 50 rows for the keyword "经手人"
-                if df_head[col].astype(str).str.contains('经手人').any():
-                    handler_col = col
-                    break
-
-            if handler_col is not None:
-                # Check whether that column contains "杨碧月"
-                if df_head[handler_col].astype(str).str.contains('杨碧月').any():
-                    logger.info("识别到杨碧月订单,执行专用预处理...")
-                    from .special_suppliers_service import SpecialSuppliersService
-                    special_svc = SpecialSuppliersService(self.config)
-                    return special_svc.process_yang_biyue_only(file_path)
-
-        except Exception as e:
-            logger.warning(f"智能预处理识别失败: {e}")
-        return None
+            return self.excel_processor.process_specific_file(file_path)
+        else:
+            logger.info("OrderService开始处理最新Excel文件")
+            return self.excel_processor.process_latest_file()
 
     def get_purchase_orders(self) -> List[str]:
         """
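The removed `_check_special_preprocess` routes a file by scanning its first 50 rows for supplier signatures, in a fixed priority order. A pandas-free sketch of the same decision order, assuming the rows are already loaded as lists of cells; `detect_supplier` and its return labels are hypothetical names used for illustration only:

```python
import re
from typing import Optional, Sequence

TOBACCO = re.compile(r"专卖证号|510109104938")  # signature from the hunk


def detect_supplier(rows: Sequence[Sequence[object]], limit: int = 50) -> Optional[str]:
    """Return 'tobacco', 'rongcheng', 'yang_biyue' or None for the first `limit` rows."""
    head = [[str(cell) for cell in row] for row in rows[:limit]]
    cells = [cell for row in head for cell in row]

    # 1. Tobacco company: any cell matches the licence signature.
    if any(TOBACCO.search(cell) for cell in cells):
        return "tobacco"
    # 2. Rongcheng Yigou: any cell contains the order-number marker.
    if any("RCDH" in cell for cell in cells):
        return "rongcheng"
    # 3. Yang Biyue: first column mentioning 经手人 must also contain 杨碧月.
    width = max((len(row) for row in head), default=0)
    for col in range(width):
        column = [row[col] for row in head if col < len(row)]
        if any("经手人" in cell for cell in column):
            return "yang_biyue" if any("杨碧月" in cell for cell in column) else None
    return None
```

Because the checks run in order, a sheet that happens to contain both a tobacco signature and `RCDH` is treated as a tobacco order, which mirrors the early returns in the removed method.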
@@ -150,7 +69,7 @@ class OrderService:
         """
         return self.order_merger.get_purchase_orders()
 
-    def merge_purchase_orders(self, file_paths: List[str], progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def merge_purchase_orders(self, file_paths: List[str]) -> Optional[str]:
         """
         Merge the specified purchase-order files
 
@@ -161,9 +80,9 @@ class OrderService:
             Path of the merged purchase-order file, or None if merging fails
         """
         logger.info(f"OrderService开始合并指定采购单: {file_paths}")
-        return self.merge_orders(file_paths, progress_cb)
+        return self.merge_orders(file_paths)
 
-    def merge_all_purchase_orders(self, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
+    def merge_all_purchase_orders(self) -> Optional[str]:
         """
         Merge all available purchase-order files
 
@@ -171,9 +90,9 @@ class OrderService:
|
|||||||
合并后的采购单文件路径,如果合并失败则返回None
|
合并后的采购单文件路径,如果合并失败则返回None
|
||||||
"""
|
"""
|
||||||
logger.info("OrderService开始合并所有采购单")
|
logger.info("OrderService开始合并所有采购单")
|
||||||
return self.merge_orders(None, progress_cb)
|
return self.merge_orders(None)
|
||||||
|
|
||||||
def merge_orders(self, file_paths: Optional[List[str]] = None, progress_cb: Optional[Callable[[int], None]] = None) -> Optional[str]:
|
def merge_orders(self, file_paths: Optional[List[str]] = None) -> Optional[str]:
|
||||||
"""
|
"""
|
||||||
Merge purchase orders
|
Merge purchase orders
|
||||||
|
|
||||||
@@ -188,58 +107,4 @@ class OrderService:
|
|||||||
else:
|
else:
|
||||||
logger.info("OrderService开始合并所有采购单")
|
logger.info("OrderService开始合并所有采购单")
|
||||||
|
|
||||||
return self.order_merger.process(file_paths, progress_cb)
|
return self.order_merger.process(file_paths)
|
||||||
|
|
||||||
def validate_unit_price(self, result_path: str) -> List[str]:
|
|
||||||
"""
|
|
||||||
Check purchase order unit prices against the purchase prices in the product data
|
|
||||||
|
|
||||||
Args:
|
|
||||||
result_path: path to the purchase order to validate
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
List of discrepancy messages; an empty list if there are none
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
import pandas as pd
|
|
||||||
from app.core.utils.file_utils import smart_read_excel
|
|
||||||
from app.core.handlers.column_mapper import ColumnMapper as CM
|
|
||||||
|
|
||||||
# Use the shared product database instance
|
|
||||||
product_db = self.product_db
|
|
||||||
|
|
||||||
# Read the purchase order to validate
|
|
||||||
df_res = smart_read_excel(result_path)
|
|
||||||
|
|
||||||
res_barcode_col = CM.find_column(list(df_res.columns), 'barcode')
|
|
||||||
res_price_col = CM.find_column(list(df_res.columns), 'unit_price')
|
|
||||||
|
|
||||||
if not res_barcode_col or not res_price_col:
|
|
||||||
logger.warning("未能在采购单中找到条码或单价列")
|
|
||||||
return []
|
|
||||||
|
|
||||||
# Look up purchase prices in bulk
|
|
||||||
barcodes = df_res[res_barcode_col].astype(str).str.strip().tolist()
|
|
||||||
item_prices = product_db.get_prices(barcodes)
|
|
||||||
|
|
||||||
results = []
|
|
||||||
for _, row in df_res.iterrows():
|
|
||||||
bc = str(row[res_barcode_col]).strip()
|
|
||||||
if bc not in item_prices:
|
|
||||||
continue
|
|
||||||
|
|
||||||
try:
|
|
||||||
res_price = float(row[res_price_col])
|
|
||||||
except (ValueError, TypeError):
|
|
||||||
continue
|
|
||||||
|
|
||||||
item_price = item_prices[bc]
|
|
||||||
diff = abs(res_price - item_price)
|
|
||||||
if diff > 1.0:
|
|
||||||
results.append(f"条码 {bc}: 采购单价={res_price} vs 进货价={item_price} 差异={diff:.2f}")
|
|
||||||
|
|
||||||
return results
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"单价校验过程中发生错误: {e}")
|
|
||||||
return []
|
|
||||||
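The removed `validate_unit_price` method compares each order row's unit price with a reference purchase price and reports rows that differ by more than 1.0. A minimal standalone sketch of that comparison follows, assuming plain tuples and a dict in place of the DataFrame and `product_db`; the function name, the `tolerance` parameter, and the English message text are illustrative and not part of the repository.

```python
from typing import Dict, List, Tuple


def find_price_discrepancies(
    order_rows: List[Tuple[str, object]],
    reference_prices: Dict[str, float],
    tolerance: float = 1.0,
) -> List[str]:
    """Return one message per row whose price differs from the reference by more than `tolerance`."""
    messages = []
    for barcode, raw_price in order_rows:
        barcode = str(barcode).strip()
        if barcode not in reference_prices:
            continue  # unknown barcode: nothing to compare against
        try:
            price = float(raw_price)
        except (ValueError, TypeError):
            continue  # non-numeric price cell: skip the row
        reference = reference_prices[barcode]
        diff = abs(price - reference)
        if diff > tolerance:
            messages.append(
                f"barcode {barcode}: order price={price} vs reference={reference} diff={diff:.2f}"
            )
    return messages


# Hypothetical sample data
rows = [("6901", "12.5"), ("6902", 3.0), ("6903", "n/a"), ("6999", 1.0)]
prices = {"6901": 10.0, "6902": 3.5, "6903": 8.0}
discrepancies = find_price_discrepancies(rows, prices)
```

As in the original method, rows with an unknown barcode or an unparseable price are skipped silently rather than reported.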
@@ -1,297 +0,0 @@
|
|||||||
"""
|
|
||||||
Processor dispatch service
|
|
||||||
|
|
||||||
Manages and dispatches the file processors, with automatic file type detection and processor selection
|
|
||||||
"""
|
|
||||||
|
|
||||||
import logging
|
|
||||||
from typing import Dict, Any, Optional, List
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
from ..core.processors.base import BaseProcessor
|
|
||||||
from ..core.processors.tobacco_processor import TobaccoProcessor
|
|
||||||
from ..core.processors.ocr_processor import OCRProcessor
|
|
||||||
from ..core.utils.log_utils import get_logger
|
|
||||||
|
|
||||||
logger = get_logger(__name__)
|
|
||||||
|
|
||||||
|
|
||||||
class ProcessorService:
|
|
||||||
"""Processor dispatch service
|
|
||||||
|
|
||||||
Manages all processor instances and exposes a single file-processing interface
|
|
||||||
"""
|
|
||||||
|
|
||||||
def __init__(self, config: Dict[str, Any]):
|
|
||||||
"""Initialize the processor service
|
|
||||||
|
|
||||||
Args:
|
|
||||||
config: system configuration dict
|
|
||||||
"""
|
|
||||||
self.config = config
|
|
||||||
self.processors: List[BaseProcessor] = []
|
|
||||||
self._load_processors()
|
|
||||||
logger.info(f"处理器服务初始化完成,加载了{len(self.processors)}个处理器")
|
|
||||||
|
|
||||||
def _load_processors(self):
|
|
||||||
"""Load all processors"""
|
|
||||||
try:
|
|
||||||
self.processors = [
|
|
||||||
TobaccoProcessor(self.config),
|
|
||||||
OCRProcessor(self.config),
|
|
||||||
]
|
|
||||||
|
|
||||||
supplier_configs = []
|
|
||||||
try:
|
|
||||||
import json
|
|
||||||
from pathlib import Path
|
|
||||||
# Prefer loading from `config/suppliers_config.json`
|
|
||||||
config_path = Path("config/suppliers_config.json")
|
|
||||||
if not config_path.exists():
|
|
||||||
# Fall back to an alternate path for compatibility
|
|
||||||
config_path = Path("./suppliers_config.json")
|
|
||||||
if config_path.exists():
|
|
||||||
with open(config_path, 'r', encoding='utf-8') as f:
|
|
||||||
data = json.load(f)
|
|
||||||
ok, errs, supplier_configs = self._validate_suppliers_config(data)
|
|
||||||
if not ok:
|
|
||||||
logger.error("供应商配置校验失败:\n" + "\n".join([f"- {e}" for e in errs]))
|
|
||||||
else:
|
|
||||||
logger.info(f"从 {config_path} 加载供应商配置,共 {len(supplier_configs)} 项")
|
|
||||||
else:
|
|
||||||
logger.info("未找到供应商配置文件,跳过供应商处理器加载")
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"读取供应商配置失败: {e}")
|
|
||||||
|
|
||||||
for supplier_config in supplier_configs:
|
|
||||||
try:
|
|
||||||
from ..core.processors.supplier_processors.generic_supplier_processor import GenericSupplierProcessor
|
|
||||||
processor = GenericSupplierProcessor(self.config, supplier_config)
|
|
||||||
self.processors.append(processor)
|
|
||||||
logger.info(f"加载供应商处理器: {processor.name}")
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"加载供应商处理器失败: {e}")
|
|
||||||
|
|
||||||
logger.info(f"成功加载{len(self.processors)}个处理器")
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"加载处理器时出错: {e}", exc_info=True)
|
|
||||||
self.processors = [
|
|
||||||
TobaccoProcessor(self.config),
|
|
||||||
OCRProcessor(self.config),
|
|
||||||
]
|
|
||||||
|
|
||||||
def _validate_suppliers_config(self, data):
|
|
||||||
try:
|
|
||||||
suppliers = data.get('suppliers')
|
|
||||||
errors = []
|
|
||||||
valid = []
|
|
||||||
if not isinstance(suppliers, list) or not suppliers:
|
|
||||||
errors.append('suppliers必须是非空数组')
|
|
||||||
return False, errors, []
|
|
||||||
for idx, s in enumerate(suppliers):
|
|
||||||
e = self._validate_single_supplier(s, idx)
|
|
||||||
if e:
|
|
||||||
errors.extend(e)
|
|
||||||
else:
|
|
||||||
valid.append(s)
|
|
||||||
return len(errors) == 0, errors, valid
|
|
||||||
except Exception as e:
|
|
||||||
return False, [f'配置解析异常: {e}'], []
|
|
||||||
|
|
||||||
def _validate_single_supplier(self, s, idx):
|
|
||||||
errs = []
|
|
||||||
prefix = f'suppliers[{idx}]'
|
|
||||||
name = s.get('name')
|
|
||||||
if not name or not isinstance(name, str):
|
|
||||||
errs.append(f'{prefix}.name 必须为字符串')
|
|
||||||
fp = s.get('filename_patterns', [])
|
|
||||||
ci = s.get('content_indicators', [])
|
|
||||||
if not fp and not ci:
|
|
||||||
errs.append(f'{prefix} 必须至少提供 filename_patterns 或 content_indicators 之一')
|
|
||||||
cm = s.get('column_mapping', {})
|
|
||||||
if cm and not isinstance(cm, dict):
|
|
||||||
errs.append(f'{prefix}.column_mapping 必须为对象')
|
|
||||||
cr = s.get('cleaning_rules', [])
|
|
||||||
if cr and not isinstance(cr, list):
|
|
||||||
errs.append(f'{prefix}.cleaning_rules 必须为数组')
|
|
||||||
else:
|
|
||||||
for i, rule in enumerate(cr):
|
|
||||||
rtype = rule.get('type')
|
|
||||||
if rtype not in ('remove_rows','fill_na','convert_type'):
|
|
||||||
errs.append(f'{prefix}.cleaning_rules[{i}].type 非法: {rtype}')
|
|
||||||
if rtype == 'remove_rows' and not rule.get('condition'):
|
|
||||||
errs.append(f'{prefix}.cleaning_rules[{i}].condition 必填')
|
|
||||||
if rtype in ('fill_na','convert_type'):
|
|
||||||
if not rule.get('columns') and not rule.get('column'):
|
|
||||||
errs.append(f'{prefix}.cleaning_rules[{i}] 需提供 columns 或 column')
|
|
||||||
calc = s.get('calculations', [])
|
|
||||||
if calc and not isinstance(calc, list):
|
|
||||||
errs.append(f'{prefix}.calculations 必须为数组')
|
|
||||||
else:
|
|
||||||
for i, c in enumerate(calc):
|
|
||||||
ctype = c.get('type')
|
|
||||||
if ctype not in ('multiply','divide','formula'):
|
|
||||||
errs.append(f'{prefix}.calculations[{i}].type 非法: {ctype}')
|
|
||||||
if ctype in ('multiply','divide'):
|
|
||||||
if not c.get('source_column') or not c.get('target_column'):
|
|
||||||
errs.append(f'{prefix}.calculations[{i}] 需提供 source_column 与 target_column')
|
|
||||||
if ctype == 'formula' and (not c.get('formula') or not c.get('target_column')):
|
|
||||||
errs.append(f'{prefix}.calculations[{i}] 需提供 formula 与 target_column')
|
|
||||||
return errs
|
|
||||||
|
|
||||||
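The two validation methods above imply a shape for each entry in `suppliers_config.json`. The sketch below condenses those rules into one standalone function and runs it on a passing and a failing entry. The key names and allowed `type` values come from the removed code; the sample values (`example_supplier`, the file pattern, the column names) and the English error texts are hypothetical.

```python
from typing import Any, Dict, List

ALLOWED_RULE_TYPES = ("remove_rows", "fill_na", "convert_type")
ALLOWED_CALC_TYPES = ("multiply", "divide", "formula")


def check_supplier(supplier: Dict[str, Any], idx: int) -> List[str]:
    """Return a list of problems for one supplier entry (empty list means valid)."""
    prefix = f"suppliers[{idx}]"
    errors: List[str] = []
    name = supplier.get("name")
    if not name or not isinstance(name, str):
        errors.append(f"{prefix}.name must be a non-empty string")
    if not supplier.get("filename_patterns") and not supplier.get("content_indicators"):
        errors.append(f"{prefix} needs filename_patterns or content_indicators")
    for i, rule in enumerate(supplier.get("cleaning_rules", [])):
        rtype = rule.get("type")
        if rtype not in ALLOWED_RULE_TYPES:
            errors.append(f"{prefix}.cleaning_rules[{i}].type is invalid: {rtype}")
        elif rtype == "remove_rows":
            if not rule.get("condition"):
                errors.append(f"{prefix}.cleaning_rules[{i}].condition is required")
        elif not rule.get("columns") and not rule.get("column"):
            errors.append(f"{prefix}.cleaning_rules[{i}] needs columns or column")
    for i, calc in enumerate(supplier.get("calculations", [])):
        ctype = calc.get("type")
        if ctype not in ALLOWED_CALC_TYPES:
            errors.append(f"{prefix}.calculations[{i}].type is invalid: {ctype}")
        elif not calc.get("target_column"):
            errors.append(f"{prefix}.calculations[{i}] needs target_column")
        elif ctype == "formula":
            if not calc.get("formula"):
                errors.append(f"{prefix}.calculations[{i}] needs formula")
        elif not calc.get("source_column"):
            errors.append(f"{prefix}.calculations[{i}] needs source_column")
    return errors


# Hypothetical entries: one that satisfies the rules, one that does not
sample = {
    "name": "example_supplier",
    "filename_patterns": ["example_*.xlsx"],
    "column_mapping": {"条码": "barcode"},
    "cleaning_rules": [{"type": "fill_na", "columns": ["quantity"]}],
    "calculations": [
        {"type": "multiply", "source_column": "quantity", "target_column": "total"}
    ],
}
broken = {"name": "", "cleaning_rules": [{"type": "drop"}]}

sample_errors = check_supplier(sample, 0)
broken_errors = check_supplier(broken, 1)
```

Unlike the original, this sketch does not type-check `column_mapping`, `cleaning_rules`, or `calculations`, and it reports at most one problem per rule.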
def process_file(self, input_file: Path, output_dir: Path,
|
|
||||||
preferred_processor: Optional[str] = None) -> Optional[Path]:
|
|
||||||
"""Process a file, automatically selecting a suitable processor
|
|
||||||
|
|
||||||
Args:
|
|
||||||
input_file: path to the input file
|
|
||||||
output_dir: path to the output directory
|
|
||||||
preferred_processor: name of the processor to try first (optional)
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Path to the output file, or None if processing fails
|
|
||||||
"""
|
|
||||||
if not input_file.exists():
|
|
||||||
logger.error(f"输入文件不存在: {input_file}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
if not output_dir.exists():
|
|
||||||
output_dir.mkdir(parents=True, exist_ok=True)
|
|
||||||
|
|
||||||
try:
|
|
||||||
# If a preferred processor was given, try it first
|
|
||||||
if preferred_processor:
|
|
||||||
processor = self._get_processor_by_name(preferred_processor)
|
|
||||||
if processor and processor.can_process(input_file):
|
|
||||||
logger.info(f"使用指定的处理器: {processor.name}")
|
|
||||||
return processor.process(input_file, output_dir)
|
|
||||||
else:
|
|
||||||
logger.warning(f"指定的处理器不可用或无法处理该文件: {preferred_processor}")
|
|
||||||
|
|
||||||
# Automatically select a suitable processor
|
|
||||||
suitable_processors = [p for p in self.processors if p.can_process(input_file)]
|
|
||||||
|
|
||||||
if not suitable_processors:
|
|
||||||
logger.warning(f"未找到适合处理文件的处理器: {input_file}")
|
|
||||||
logger.info(f"支持的文件类型: {self.get_supported_types()}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
# Use the first suitable processor
|
|
||||||
processor = suitable_processors[0]
|
|
||||||
logger.info(f"使用处理器 {processor.name} 处理文件: {input_file}")
|
|
||||||
|
|
||||||
return processor.process(input_file, output_dir)
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"处理文件时出错: {e}", exc_info=True)
|
|
||||||
return None
|
|
||||||
|
|
||||||
def _get_processor_by_name(self, name: str) -> Optional[BaseProcessor]:
|
|
||||||
"""Look up a processor by name
|
|
||||||
|
|
||||||
Args:
|
|
||||||
name: processor name
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
The processor instance, or None
|
|
||||||
"""
|
|
||||||
for processor in self.processors:
|
|
||||||
if processor.name == name or processor.__class__.__name__ == name:
|
|
||||||
return processor
|
|
||||||
return None
|
|
||||||
|
|
||||||
def get_supported_types(self) -> List[Dict[str, Any]]:
|
|
||||||
"""Get information about the supported file types
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
List of processor type information
|
|
||||||
"""
|
|
||||||
return [
|
|
||||||
{
|
|
||||||
'name': processor.name,
|
|
||||||
'description': processor.description,
|
|
||||||
'extensions': processor.get_supported_extensions(),
|
|
||||||
'class_name': processor.__class__.__name__
|
|
||||||
}
|
|
||||||
for processor in self.processors
|
|
||||||
]
|
|
||||||
|
|
||||||
def get_processor_info(self) -> List[Dict[str, Any]]:
|
|
||||||
"""Get detailed processor information
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
List of detailed processor information
|
|
||||||
"""
|
|
||||||
return [
|
|
||||||
{
|
|
||||||
'name': processor.name,
|
|
||||||
'description': processor.description,
|
|
||||||
'extensions': processor.get_supported_extensions(),
|
|
||||||
'required_columns': processor.get_required_columns(),
|
|
||||||
'class_name': processor.__class__.__name__,
|
|
||||||
'module': processor.__class__.__module__
|
|
||||||
}
|
|
||||||
for processor in self.processors
|
|
||||||
]
|
|
||||||
|
|
||||||
def can_process_file(self, file_path: Path) -> bool:
|
|
||||||
"""Check whether any processor can handle the file
|
|
||||||
|
|
||||||
Args:
|
|
||||||
file_path: path to the file
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
True if at least one processor can handle it
|
|
||||||
"""
|
|
||||||
if not file_path.exists():
|
|
||||||
return False
|
|
||||||
|
|
||||||
return any(processor.can_process(file_path) for processor in self.processors)
|
|
||||||
|
|
||||||
def get_suitable_processors(self, file_path: Path) -> List[BaseProcessor]:
|
|
||||||
"""Get every processor that can handle the file
|
|
||||||
|
|
||||||
Args:
|
|
||||||
file_path: path to the file
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
List of suitable processors
|
|
||||||
"""
|
|
||||||
if not file_path.exists():
|
|
||||||
return []
|
|
||||||
|
|
||||||
return [p for p in self.processors if p.can_process(file_path)]
|
|
||||||
|
|
||||||
def reload_processors(self):
|
|
||||||
"""Reload the processors"""
|
|
||||||
logger.info("重新加载处理器...")
|
|
||||||
self.processors.clear()
|
|
||||||
self._load_processors()
|
|
||||||
logger.info(f"重新加载完成,共{len(self.processors)}个处理器")
|
|
||||||
|
|
||||||
def add_processor(self, processor: BaseProcessor):
|
|
||||||
"""Add a processor
|
|
||||||
|
|
||||||
Args:
|
|
||||||
processor: processor instance
|
|
||||||
"""
|
|
||||||
self.processors.append(processor)
|
|
||||||
logger.info(f"添加处理器: {processor.name}")
|
|
||||||
|
|
||||||
def remove_processor(self, processor_name: str) -> bool:
|
|
||||||
"""Remove a processor
|
|
||||||
|
|
||||||
Args:
|
|
||||||
processor_name: processor name
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
True if the processor was removed
|
|
||||||
"""
|
|
||||||
for i, processor in enumerate(self.processors):
|
|
||||||
if processor.name == processor_name or processor.__class__.__name__ == processor_name:
|
|
||||||
del self.processors[i]
|
|
||||||
logger.info(f"移除处理器: {processor_name}")
|
|
||||||
return True
|
|
||||||
logger.warning(f"未找到要移除的处理器: {processor_name}")
|
|
||||||
return False
|
|
||||||
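The dispatch rule in the removed `ProcessorService.process_file` is: try the preferred processor if one is named and it accepts the file, otherwise take the first registered processor whose `can_process` returns true. A minimal sketch of that selection follows. `SuffixProcessor` is a hypothetical stand-in for `BaseProcessor` that decides by file extension only, and `pick_processor` is an illustrative name.

```python
from pathlib import Path
from typing import List, Optional, Set


class SuffixProcessor:
    """Hypothetical processor that accepts files by extension."""

    def __init__(self, name: str, suffixes: Set[str]):
        self.name = name
        self.suffixes = suffixes

    def can_process(self, path: Path) -> bool:
        return path.suffix.lower() in self.suffixes


def pick_processor(
    processors: List[SuffixProcessor],
    path: Path,
    preferred: Optional[str] = None,
) -> Optional[SuffixProcessor]:
    """Return the processor to use for `path`, or None if nothing fits."""
    if preferred:
        for proc in processors:
            # Match by display name or by class name, as the original lookup does
            if proc.name == preferred or type(proc).__name__ == preferred:
                if proc.can_process(path):
                    return proc
                break  # preferred processor cannot handle the file: fall through
    for proc in processors:
        if proc.can_process(path):
            return proc  # registration order decides ties
    return None


registry = [
    SuffixProcessor("excel", {".xls", ".xlsx"}),
    SuffixProcessor("image", {".png", ".jpg"}),
]
chosen = pick_processor(registry, Path("order.PNG"), preferred="excel")
```

Because the first match wins, the order in which processors are registered acts as a priority list.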
@@ -1,237 +0,0 @@
|
|||||||
#!/usr/bin/env python
|
|
||||||
# -*- coding: utf-8 -*-
|
|
||||||
|
|
||||||
import os
|
|
||||||
import re
|
|
||||||
import time
|
|
||||||
import pandas as pd
|
|
||||||
from typing import Optional, Callable
|
|
||||||
|
|
||||||
from ..core.utils.log_utils import get_logger
|
|
||||||
|
|
||||||
logger = get_logger(__name__)
|
|
||||||
|
|
||||||
class SpecialSuppliersService:
|
|
||||||
"""
|
|
||||||
Service class for supplier-specific handling, such as 蓉城易购
|
|
||||||
"""
|
|
||||||
|
|
||||||
def __init__(self, config_manager=None):
|
|
||||||
self.config_manager = config_manager
|
|
||||||
|
|
||||||
def process_yang_biyue_only(self, src_path: str) -> Optional[str]:
|
|
||||||
"""
|
|
||||||
Run only the preprocessing step for 杨碧月 orders and return the path to the preprocessed file
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
from app.core.utils.file_utils import smart_read_excel
|
|
||||||
# Read the raw data
|
|
||||||
df = smart_read_excel(src_path)
|
|
||||||
|
|
||||||
# Check whether the file mentions "杨碧月"
|
|
||||||
handler_col = None
|
|
||||||
for col in df.columns:
|
|
||||||
if '经手人' in str(col):
|
|
||||||
handler_col = col
|
|
||||||
break
|
|
||||||
|
|
||||||
if handler_col is None or not df[handler_col].astype(str).str.contains('杨碧月').any():
|
|
||||||
return None
|
|
||||||
|
|
||||||
# A 杨碧月 order was detected; run the dedicated cleaning
|
|
||||||
logger.info("识别到杨碧月订单,正在执行专用清洗...")
|
|
||||||
|
|
||||||
# Define the column mapping (to the Chinese column names ExcelProcessor expects)
|
|
||||||
# Prefer exact matches so that "结算单位" is not matched as "单位"
|
|
||||||
column_map = {
|
|
||||||
'商品条码': '商品条码',
|
|
||||||
'商品名称': '商品名称',
|
|
||||||
'商品规格': '规格',
|
|
||||||
'单位': '单位',
|
|
||||||
'数量': '数量',
|
|
||||||
'含税单价': '单价',
|
|
||||||
'含税金额': '金额'
|
|
||||||
}
|
|
||||||
|
|
||||||
found_cols = {}
|
|
||||||
# 1. First pass: try exact matches
|
|
||||||
for target_zh, std_name in column_map.items():
|
|
||||||
for col in df.columns:
|
|
||||||
if str(col).strip() == target_zh:
|
|
||||||
found_cols[col] = std_name
|
|
||||||
break
|
|
||||||
|
|
||||||
# 2. Second pass: fuzzy-match the targets still unmatched (excluding known look-alike columns)
|
|
||||||
for target_zh, std_name in column_map.items():
|
|
||||||
if std_name in found_cols.values():
|
|
||||||
continue
|
|
||||||
for col in df.columns:
|
|
||||||
col_str = str(col)
|
|
||||||
if target_zh in col_str:
|
|
||||||
# Skip look-alike columns
|
|
||||||
if target_zh == '单位' and '结算单位' in col_str:
|
|
||||||
continue
|
|
||||||
if target_zh == '数量' and '基本单位数量' in col_str:
|
|
||||||
continue
|
|
||||||
found_cols[col] = std_name
|
|
||||||
break
|
|
||||||
|
|
||||||
if len(found_cols) < 4:
|
|
||||||
logger.error(f"杨碧月订单列匹配不足: 找到 {list(found_cols.values())}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
df_clean = df[list(found_cols.keys())].copy()
|
|
||||||
df_clean = df_clean.rename(columns=found_cols)
|
|
||||||
|
|
||||||
# Drop rows with an empty barcode
|
|
||||||
df_clean = df_clean.dropna(subset=['商品条码'])
|
|
||||||
|
|
||||||
# Save the preprocessed file to the output directory (not the source directory)
|
|
||||||
if self.config_manager and hasattr(self.config_manager, 'get_path'):
|
|
||||||
out_dir = self.config_manager.get_path('Paths', 'output_folder', fallback='data/output', create=True)
|
|
||||||
else:
|
|
||||||
from app.config.settings import ConfigManager
|
|
||||||
out_dir = ConfigManager().get_path('Paths', 'output_folder', fallback='data/output', create=True)
|
|
||||||
os.makedirs(out_dir, exist_ok=True)
|
|
||||||
base = os.path.basename(src_path)
|
|
||||||
final_path = os.path.join(out_dir, f"预处理之后_{base}")
|
|
||||||
df_clean.to_excel(final_path, index=False)
|
|
||||||
|
|
||||||
return final_path
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"预处理杨碧月订单出错: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
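The two-pass header matching used in `process_yang_biyue_only` can be sketched without pandas as follows. The function name, the `excluded` parameter, and the sample headers are illustrative. The guard that stops the second pass from reusing a column already claimed in the first pass is an addition for this sketch and is not in the removed code.

```python
from typing import Dict, List


def match_columns(
    columns: List[str],
    column_map: Dict[str, str],
    excluded: Dict[str, List[str]],
) -> Dict[str, str]:
    """Map source column headers to standard names: exact matches first, then substring matches."""
    found: Dict[str, str] = {}
    # Pass 1: exact header matches
    for target, std_name in column_map.items():
        for col in columns:
            if str(col).strip() == target:
                found[col] = std_name
                break
    # Pass 2: substring matches for targets still unmatched
    for target, std_name in column_map.items():
        if std_name in found.values():
            continue
        for col in columns:
            col_str = str(col)
            if target not in col_str or col in found:
                continue  # `col in found` guard is an addition in this sketch
            if any(bad in col_str for bad in excluded.get(target, [])):
                continue  # look-alike header such as "结算单位" for "单位"
            found[col] = std_name
            break
    return found


# Hypothetical header row
headers = ["商品条码", "结算单位", "单位(箱)", "基本单位数量", "数量", "含税单价(元)"]
mapping = {"商品条码": "商品条码", "单位": "单位", "数量": "数量", "含税单价": "单价"}
look_alikes = {"单位": ["结算单位"], "数量": ["基本单位数量"]}
matched = match_columns(headers, mapping, look_alikes)
```

Running exact matches first means a header that is literally `数量` wins over `基本单位数量` even without the exclusion list; the exclusions matter only when no exact header exists.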
def process_yang_biyue(self, src_path: str, progress_cb: Optional[Callable[[int, str], None]] = None) -> Optional[str]:
|
|
||||||
"""
|
|
||||||
Process orders handled by 杨碧月 (preprocessing plus processing)
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
if progress_cb: progress_cb(10, "正在进行杨碧月订单预处理...")
|
|
||||||
preprocessed_path = self.process_yang_biyue_only(src_path)
|
|
||||||
|
|
||||||
if not preprocessed_path:
|
|
||||||
return None
|
|
||||||
|
|
||||||
if progress_cb: progress_cb(60, "预处理文件已保存,开始标准转换流程...")
|
|
||||||
|
|
||||||
# Import lazily to avoid a circular dependency
|
|
||||||
from app.services.order_service import OrderService
|
|
||||||
order_service = OrderService(self.config_manager)
|
|
||||||
result = order_service.process_excel(preprocessed_path, progress_cb=lambda p: progress_cb(60 + int(p*0.4), "生成采购单中...") if progress_cb else None)
|
|
||||||
return result
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"处理杨碧月订单出错: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
def preprocess_rongcheng_yigou(self, src_path: str, progress_cb: Optional[Callable[[int, str], None]] = None) -> Optional[str]:
|
|
||||||
"""
|
|
||||||
Preprocess 蓉城易购 orders: force-clean using the user-specified column positions E, N, Q, S
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
if progress_cb: progress_cb(10, "正在处理蓉城易购预处理...")
|
|
||||||
|
|
||||||
from app.core.utils.file_utils import smart_read_excel
|
|
||||||
# 蓉城易购 layout: row 0 is the order number, row 1 the contact, row 2 the header, data starts at row 3
|
|
||||||
df_raw = smart_read_excel(src_path, header=None)
|
|
||||||
|
|
||||||
# Check the number of data rows
|
|
||||||
if len(df_raw) <= 3:
|
|
||||||
logger.error("蓉城易购文件数据行数不足")
|
|
||||||
return None
|
|
||||||
|
|
||||||
# Extract the data section (from row 3)
|
|
||||||
df_data = df_raw.iloc[3:].reset_index(drop=True)
|
|
||||||
|
|
||||||
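# User-specified column mapping:
|
|
||||||
# Column E (index 4) -> 商品条码 (barcode)
|
|
||||||
# Column N (index 13) -> 数量 (quantity)
|
|
||||||
# Column Q (index 16) -> 单价 (unit price)
|
|
||||||
# Column S (index 18) -> 金额 (amount)
|
|
||||||
# Column C (index 2) -> 商品名称 (product name; general requirement)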
|
|
||||||
|
|
||||||
idx_map = {
|
|
||||||
2: '商品名称',
|
|
||||||
4: '商品条码',
|
|
||||||
13: '数量',
|
|
||||||
16: '单价',
|
|
||||||
18: '金额'
|
|
||||||
}
|
|
||||||
|
|
||||||
# Make sure the column indices are within bounds
|
|
||||||
available_indices = [i for i in idx_map.keys() if i < df_data.shape[1]]
|
|
||||||
df2 = df_data.iloc[:, available_indices].copy()
|
|
||||||
df2.columns = [idx_map[i] for i in available_indices]
|
|
||||||
|
|
||||||
# Coerce to numeric types
|
|
||||||
for c in ['数量', '单价', '金额']:
|
|
||||||
if c in df2.columns:
|
|
||||||
df2[c] = pd.to_numeric(df2[c], errors='coerce').fillna(0)
|
|
||||||
|
|
||||||
# Drop rows with an empty barcode
|
|
||||||
df2 = df2.dropna(subset=['商品条码'])
|
|
||||||
df2['商品条码'] = df2['商品条码'].astype(str).str.strip()
|
|
||||||
df2 = df2[df2['商品条码'] != '']
|
|
||||||
|
|
||||||
# Core logic: split rows holding several barcodes and divide the quantity evenly
|
|
||||||
if '商品条码' in df2.columns and '数量' in df2.columns:
|
|
||||||
rows = []
|
|
||||||
for _, row in df2.iterrows():
|
|
||||||
bc_val = str(row.get('商品条码', '')).strip()
|
|
||||||
# Detect separators: / , , 、
|
|
||||||
if any(sep in bc_val for sep in ['/', ',', ',', '、']):
|
|
||||||
parts = re.split(r'[/,,、]+', bc_val)
|
|
||||||
parts = [p.strip() for p in parts if p.strip()]
|
|
||||||
|
|
||||||
if len(parts) >= 2:
|
|
||||||
q_total = float(row.get('数量', 0) or 0)
|
|
||||||
if q_total > 0:
|
|
||||||
n = len(parts)
|
|
||||||
base_qty = int(q_total // n)
|
|
||||||
remainder = int(q_total % n)
|
|
||||||
|
|
||||||
for i, p_bc in enumerate(parts):
|
|
||||||
new_row = row.copy()
|
|
||||||
new_row['商品条码'] = p_bc
|
|
||||||
current_qty = base_qty + (1 if i < remainder else 0)
|
|
||||||
new_row['数量'] = current_qty
|
|
||||||
if '单价' in new_row:
|
|
||||||
try:
|
|
||||||
up = float(new_row['单价'] or 0)
|
|
||||||
new_row['金额'] = up * current_qty
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
rows.append(new_row)
|
|
||||||
continue
|
|
||||||
rows.append(row)
|
|
||||||
df2 = pd.DataFrame(rows)
|
|
||||||
|
|
||||||
# Save the preprocessed file to the output directory (not the source directory)
|
|
||||||
if self.config_manager and hasattr(self.config_manager, 'get_path'):
|
|
||||||
out_dir = self.config_manager.get_path('Paths', 'output_folder', fallback='data/output', create=True)
|
|
||||||
else:
|
|
||||||
from app.config.settings import ConfigManager
|
|
||||||
out_dir = ConfigManager().get_path('Paths', 'output_folder', fallback='data/output', create=True)
|
|
||||||
os.makedirs(out_dir, exist_ok=True)
|
|
||||||
base = os.path.basename(src_path)
|
|
||||||
final_path = os.path.join(out_dir, f"预处理之后_{base}")
|
|
||||||
df2.to_excel(final_path, index=False)
|
|
||||||
|
|
||||||
if progress_cb: progress_cb(100, "蓉城易购预处理完成")
|
|
||||||
return final_path
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"预处理蓉城易购订单出错: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
def process_rongcheng_yigou(self, src_path: str, progress_cb: Optional[Callable[[int, str], None]] = None) -> Optional[str]:
|
|
||||||
"""
|
|
||||||
Compatibility method: preprocess a 蓉城易购 order and run the follow-up conversion
|
|
||||||
"""
|
|
||||||
cleaned_path = self.preprocess_rongcheng_yigou(src_path, progress_cb)
|
|
||||||
if cleaned_path:
|
|
||||||
from app.services.order_service import OrderService
|
|
||||||
order_service = OrderService(self.config_manager)
|
|
||||||
return order_service.process_excel(cleaned_path, progress_cb=lambda p: progress_cb(60 + int(p*0.4), "生成采购单中...") if progress_cb else None)
|
|
||||||
return None
|
|
||||||
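The core of `preprocess_rongcheng_yigou` is the step that splits a cell holding several barcodes into one row per barcode and spreads the quantity as evenly as possible, giving the leftover units to the first barcodes. A minimal sketch of that arithmetic follows. The function name is illustrative, and including the fullwidth comma `,` among the separators is an assumption based on the four separators the code comment lists.

```python
import re
from typing import List, Tuple

# Separators: slash, ASCII comma, fullwidth comma (assumed), ideographic comma
SEPARATORS = re.compile(r"[/,,、]+")


def split_multi_barcode(barcode_cell: str, quantity: float) -> List[Tuple[str, float]]:
    """Split one multi-barcode cell into (barcode, quantity) pairs."""
    parts = [p.strip() for p in SEPARATORS.split(barcode_cell) if p.strip()]
    if len(parts) < 2 or quantity <= 0:
        # Single barcode or no positive quantity: keep the row unchanged
        return [(barcode_cell.strip(), quantity)]
    n = len(parts)
    base_qty = int(quantity // n)
    remainder = int(quantity % n)
    # The first `remainder` barcodes each receive one extra unit
    return [
        (barcode, base_qty + (1 if i < remainder else 0))
        for i, barcode in enumerate(parts)
    ]


split_rows = split_multi_barcode("6901/6902,6903", 10)
```

In the removed method the row amount is then recomputed as unit price times the new quantity, so the per-barcode amounts still add up to unit price times the original total for whole-number quantities.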
@@ -18,7 +18,6 @@ from xlutils.copy import copy
|
|||||||
from openpyxl import load_workbook
|
from openpyxl import load_workbook
|
||||||
from typing import Optional, Dict, Any, List, Tuple
|
from typing import Optional, Dict, Any, List, Tuple
|
||||||
from app.core.utils.log_utils import get_logger
|
from app.core.utils.log_utils import get_logger
|
||||||
from app.core.utils.string_utils import parse_monetary_string
|
|
||||||
from app.core.utils.dialog_utils import show_custom_dialog # Import the custom dialog helper
|
from app.core.utils.dialog_utils import show_custom_dialog # Import the custom dialog helper
|
||||||
from ..config.settings import ConfigManager
|
from ..config.settings import ConfigManager
|
||||||
|
|
||||||
@@ -36,10 +35,10 @@ class TobaccoService:
|
|||||||
"""
|
"""
|
||||||
self.config = config
|
self.config = config
|
||||||
# Fix how the config is read: use the fallback mechanism
|
# Fix how the config is read: use the fallback mechanism
|
||||||
self.output_dir = config.get_path('Paths', 'output_folder', fallback='data/output', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/output')
|
self.output_dir = config.get('Paths', 'output_folder', fallback='data/output')
|
||||||
self.template_file = config.get('Paths', 'template_file', fallback='templates/银豹-采购单模板.xls')
|
self.template_file = config.get('Paths', 'template_file', fallback='templates/银豹-采购单模板.xls')
|
||||||
# Save the tobacco order to the result directory
|
# Save the tobacco order to the result directory
|
||||||
result_dir = config.get_path('Paths', 'result_folder', fallback='data/result', create=True) if hasattr(config, 'get_path') else os.path.abspath('data/result')
|
result_dir = "data/result"
|
||||||
os.makedirs(result_dir, exist_ok=True)
|
os.makedirs(result_dir, exist_ok=True)
|
||||||
self.output_file = os.path.join(result_dir, '银豹采购单_烟草公司.xls')
|
self.output_file = os.path.join(result_dir, '银豹采购单_烟草公司.xls')
|
||||||
|
|
||||||
@@ -74,78 +73,6 @@ class TobaccoService:
|
|||||||
logger.warning(f"找到的烟草订单明细文件不是今天创建的: {latest_file}")
|
logger.warning(f"找到的烟草订单明细文件不是今天创建的: {latest_file}")
|
||||||
return latest_file # Still return the latest file, but with a warning
|
return latest_file # Still return the latest file, but with a warning
|
||||||
|
|
||||||
def preprocess_tobacco_order(self, file_path: str) -> Optional[str]:
|
|
||||||
"""
|
|
||||||
Preprocess tobacco orders: force-clean using the user-specified column positions B, E, G, H
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
logger.info(f"执行烟草订单专用预处理: {file_path}")
|
|
||||||
from app.core.utils.file_utils import smart_read_excel
|
|
||||||
|
|
||||||
# Tobacco layout: row 0 is the monopoly licence number, row 1 the header, row 2 the totals, data starts at row 3
|
|
||||||
df_raw = smart_read_excel(file_path, header=None)
|
|
||||||
|
|
||||||
if len(df_raw) <= 3:
|
|
||||||
logger.error("烟草订单文件数据行数不足")
|
|
||||||
return None
|
|
||||||
|
|
||||||
# Extract the data section (from row 3)
|
|
||||||
df_data = df_raw.iloc[3:].reset_index(drop=True)
|
|
||||||
|
|
||||||
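# User-specified column mapping:
|
|
||||||
# Column A (index 0) -> 商品名称 (product name)
|
|
||||||
# Column B (index 1) -> 商品条码 (barcode; pack code)
|
|
||||||
# Column E (index 4) -> 批发价 (wholesale price, used for the unit price)
|
|
||||||
# Column G (index 6) -> 订单量 (order quantity, used for the quantity)
|
|
||||||
# Column H (index 7) -> 金额 (amount)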
|
|
||||||
|
|
||||||
idx_map = {
|
|
||||||
0: '商品名称',
|
|
||||||
1: '商品条码',
|
|
||||||
4: '批发价',
|
|
||||||
6: '数量',
|
|
||||||
7: '金额'
|
|
||||||
}
|
|
||||||
|
|
||||||
available_indices = [i for i in idx_map.keys() if i < df_data.shape[1]]
|
|
||||||
df = df_data.iloc[:, available_indices].copy()
|
|
||||||
df.columns = [idx_map[i] for i in available_indices]
|
|
||||||
|
|
||||||
# 1. Keep only rows with a non-zero order quantity
|
|
||||||
df['数量'] = pd.to_numeric(df['数量'], errors='coerce').fillna(0)
|
|
||||||
df = df[df['数量'] != 0].copy()
|
|
||||||
|
|
||||||
if df.empty:
|
|
||||||
logger.warning("烟草订单无有效订单量记录")
|
|
||||||
return None
|
|
||||||
|
|
||||||
# 2. Core cleaning logic:
|
|
||||||
# quantity = order quantity * 10 (column G)
|
|
||||||
# unit price = wholesale price / 10 (column E)
|
|
||||||
df['单价'] = pd.to_numeric(df['批发价'], errors='coerce').fillna(0) / 10
|
|
||||||
df['数量'] = df['数量'] * 10
|
|
||||||
|
|
||||||
# 3. Validate the amount (column H)
|
|
||||||
df['金额'] = pd.to_numeric(df['金额'], errors='coerce').fillna(0)
|
|
||||||
|
|
||||||
# 4. Keep only the required columns
|
|
||||||
final_cols = ['商品条码', '商品名称', '数量', '单价', '金额']
|
|
||||||
df_final = df[final_cols].copy()
|
|
||||||
|
|
||||||
# Save the preprocessed file to the output directory (not the source directory)
|
|
||||||
out_dir = self.output_dir
|
|
||||||
os.makedirs(out_dir, exist_ok=True)
|
|
||||||
base = os.path.basename(file_path)
|
|
||||||
final_path = os.path.join(out_dir, f"预处理之后_{base}")
|
|
||||||
df_final.to_excel(final_path, index=False)
|
|
||||||
|
|
||||||
logger.info(f"烟草订单预处理完成: {final_path}")
|
|
||||||
return final_path
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"烟草订单预处理失败: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
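The cleaning step in `preprocess_tobacco_order` drops rows whose order quantity is zero, multiplies the quantity by 10, and divides the wholesale price by 10. A plain-Python sketch of that conversion follows. Reading the factor of 10 as packs per carton is an assumption; the source only states the arithmetic. The function name, the `factor` parameter, and the sample rows are hypothetical.

```python
from typing import Dict, List, Tuple

# (product name, barcode, wholesale price, order quantity, amount)
TobaccoRow = Tuple[str, str, float, float, float]


def convert_tobacco_rows(rows: List[TobaccoRow], factor: int = 10) -> List[Dict[str, object]]:
    """Drop zero-quantity rows and rescale quantity and unit price by `factor`."""
    converted = []
    for name, barcode, wholesale_price, order_qty, amount in rows:
        if order_qty == 0:
            continue  # nothing ordered for this product
        converted.append(
            {
                "barcode": barcode,
                "name": name,
                "quantity": order_qty * factor,          # quantity = order quantity * 10
                "unit_price": wholesale_price / factor,  # unit price = wholesale price / 10
                "amount": amount,                        # carried over unchanged
            }
        )
    return converted


sample_rows = [
    ("Brand A", "6901028000001", 250.0, 2, 500.0),
    ("Brand B", "6901028000002", 100.0, 0, 0.0),
]
converted_rows = convert_tobacco_rows(sample_rows)
```

Because the quantity and the unit price are scaled by the same factor in opposite directions, quantity times unit price still equals the original amount.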
def process_tobacco_order(self, input_file=None):
|
def process_tobacco_order(self, input_file=None):
|
||||||
"""
|
"""
|
||||||
Process a tobacco order
|
Process a tobacco order
|
||||||
@@ -238,9 +165,8 @@ class TobaccoService:
|
|||||||
columns = ['商品', '盒码', '条码', '建议零售价', '批发价', '需求量', '订单量', '金额']
|
columns = ['商品', '盒码', '条码', '建议零售价', '批发价', '需求量', '订单量', '金额']
|
||||||
|
|
||||||
try:
|
try:
|
||||||
from app.core.utils.file_utils import smart_read_excel
|
|
||||||
# Read the Excel file
|
# Read the Excel file
|
||||||
df_old = smart_read_excel(file_path, header=None, skiprows=3, names=columns)
|
df_old = pd.read_excel(file_path, header=None, skiprows=3, names=columns)
|
||||||
|
|
||||||
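# Keep rows with a non-zero order quantity, then compute the purchase quantity and unit price
|
# Keep rows with a non-zero order quantity, then compute the purchase quantity and unit price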
|
||||||
df_filtered = df_old[df_old['订单量'] != 0].copy()
|
df_filtered = df_old[df_old['订单量'] != 0].copy()
|
||||||
@@ -318,9 +244,13 @@ class TobaccoService:
|
|||||||
}
|
}
|
||||||
|
|
||||||
# Make sure total_amount is numeric
|
# Make sure total_amount is numeric
|
||||||
parsed = parse_monetary_string(total_amount)
|
try:
|
||||||
total_amount = parsed if parsed is not None else 0.0
|
if isinstance(total_amount, str):
|
||||||
amount_display = f"¥{total_amount:.2f}"
|
total_amount = float(total_amount.replace(',', ''))
|
||||||
|
amount_display = f"¥{total_amount:.2f}"
|
||||||
|
except (ValueError, TypeError):
|
||||||
|
# If the conversion fails, use the original value as-is
|
||||||
|
amount_display = f"¥{total_amount}"
|
||||||
|
|
||||||
# Show the custom dialog
|
# Show the custom dialog
|
||||||
show_custom_dialog(
|
show_custom_dialog(
|
||||||
|
|||||||
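The hunk above shows two ways of turning `total_amount` into a display string: a call to `parse_monetary_string` with a `0.0` default on one side, and an inline `float(total_amount.replace(',', ''))` with a raw-value fallback on the other. The body of `parse_monetary_string` is not shown in this diff, so the sketch below is a hypothetical equivalent, not the repository's implementation. The names `parse_amount` and `format_amount` are invented for illustration.

```python
from typing import Optional


def parse_amount(value: object) -> Optional[float]:
    """Parse a number or a string such as "1,234.50" or "¥88"; return None if it is not numeric."""
    if isinstance(value, bool):
        return None  # bools are ints in Python; treat them as non-monetary
    if isinstance(value, (int, float)):
        return float(value)
    if not isinstance(value, str):
        return None
    cleaned = value.strip().replace(",", "").lstrip("¥").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def format_amount(value: object) -> str:
    """Format for display, defaulting to 0.0 when parsing fails."""
    parsed = parse_amount(value)
    amount = parsed if parsed is not None else 0.0
    return f"¥{amount:.2f}"


display = format_amount("1,234.5")
```

Returning `None` on failure keeps the "could not parse" case separate from a genuine zero, and the caller decides what default to display.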
@@ -1,2 +0,0 @@
|
|||||||
# -*- coding: utf-8 -*-
|
|
||||||
"""益选 OCR order processing system: UI module"""
|
|
||||||
@@ -1,565 +0,0 @@
|
|||||||
#!/usr/bin/env python
|
|
||||||
# -*- coding: utf-8 -*-
|
|
||||||
"""Business operation handlers"""
|
|
||||||
|
|
||||||
import os
|
|
||||||
import time
|
|
||||||
import datetime
|
|
||||||
import json
|
|
||||||
import logging
|
|
||||||
import tkinter as tk
|
|
||||||
from tkinter import messagebox
|
|
||||||
from threading import Thread
|
|
||||||
|
|
||||||
from app.config.settings import ConfigManager
|
|
||||||
from app.services.ocr_service import OCRService
|
|
||||||
from app.services.order_service import OrderService
|
|
||||||
from app.core.utils.log_utils import get_logger
|
|
||||||
|
|
||||||
from .logging_ui import add_to_log, init_gui_logger, dispose_gui_logger, GUILogHandler
|
|
||||||
from .ui_widgets import ProgressReporter
|
|
||||||
from .error_utils import show_error_dialog, get_error_suggestion
|
|
||||||
|
|
||||||
logger = get_logger(__name__)
|
|
||||||
from .result_previews import show_ocr_result_preview, show_excel_result_preview, show_merge_result_preview
|
|
||||||
from .user_settings import add_recent_file
|
|
||||||
from .command_runner import get_running_task, set_running_task
|
|
||||||
from .file_operations import select_file, select_excel_file, validate_unit_price_against_item_data
|
|
||||||
|
|
||||||
|
|
||||||
def _ask_and_merge_purchase_orders(order_service, log_widget, add_to_recent=False):
|
|
||||||
"""Ask in a dialog whether to merge the purchase orders; return the merged file path or None.
|
|
||||||
|
|
||||||
Shared logic used by run_pipeline_directly and batch_process_orders_with_status.
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
purchase_orders = order_service.get_purchase_orders()
|
|
||||||
|
|
||||||
if len(purchase_orders) == 0:
|
|
||||||
add_to_log(log_widget, "没有找到采购单文件,跳过合并步骤\n", "info")
|
|
||||||
elif len(purchase_orders) == 1:
|
|
||||||
add_to_log(log_widget, f"只有1个采购单文件,无需合并: {os.path.basename(purchase_orders[0])}\n", "info")
|
|
||||||
else:
|
|
||||||
add_to_log(log_widget, f"找到{len(purchase_orders)}个采购单文件\n", "info")
|
|
||||||
|
|
||||||
file_list = "\n".join([f"• {os.path.basename(f)}" for f in purchase_orders])
|
|
||||||
merge_choice = messagebox.askyesnocancel(
|
|
||||||
"采购单合并选择",
|
|
||||||
f"发现{len(purchase_orders)}个采购单文件:\n\n{file_list}\n\n是否需要合并这些采购单?\n\n• 选择'是':合并所有采购单\n• 选择'否':保持文件分离\n• 选择'取消':跳过此步骤",
|
|
||||||
icon='question'
|
|
||||||
)
|
|
||||||
|
|
||||||
if merge_choice is True:
|
|
||||||
add_to_log(log_widget, "开始合并采购单...\n", "info")
|
|
||||||
merge_result = order_service.merge_all_purchase_orders()
|
|
||||||
if merge_result:
|
|
||||||
add_to_log(log_widget, "采购单合并完成\n", "success")
|
|
||||||
if add_to_recent:
|
|
||||||
try:
|
|
||||||
add_recent_file(merge_result)
|
|
||||||
except Exception as e:
|
|
||||||
logger.debug(f"添加最近文件失败: {e}")
|
|
||||||
return merge_result
|
|
||||||
else:
|
|
||||||
add_to_log(log_widget, "合并失败\n", "warning")
|
|
||||||
elif merge_choice is False:
|
|
||||||
add_to_log(log_widget, "用户选择不合并采购单,保持文件分离\n", "info")
|
|
||||||
else:
|
|
||||||
add_to_log(log_widget, "用户取消合并操作\n", "info")
|
|
||||||
except Exception as e:
|
|
||||||
add_to_log(log_widget, f"合并过程出现问题: {str(e)}\n", "warning")
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
def process_single_image_with_status(log_widget, status_bar):
    status_bar.set_status("选择图片中...")
    file_path = select_file(log_widget, [("图片文件", "*.jpg *.jpeg *.png *.bmp"), ("所有文件", "*.*")], "选择图片")
    if not file_path:
        status_bar.set_status("操作已取消")
        add_to_log(log_widget, "未选择文件,操作已取消\n", "warning")
        return

    def run_in_thread():
        try:
            status_bar.set_running(True)
            status_bar.set_status("开始处理图片...")

            gui_handler = GUILogHandler(log_widget)
            gui_handler.setLevel(logging.INFO)
            formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
            gui_handler.setFormatter(formatter)

            root_logger = logging.getLogger()
            for handler in root_logger.handlers[:]:
                if isinstance(handler, logging.StreamHandler):
                    root_logger.removeHandler(handler)
            root_logger.addHandler(gui_handler)
            root_logger.setLevel(logging.INFO)

            ocr_service = OCRService()
            add_to_log(log_widget, f"开始处理图片: {file_path}\n", "info")
            try:
                add_recent_file(file_path)
            except Exception as e:
                logger.debug(f"添加最近文件失败: {e}")
            excel_path = ocr_service.process_image(file_path)

            if excel_path:
                add_to_log(log_widget, "图片OCR处理完成\n", "success")
                preview_output = f"采购单已保存到: {excel_path}\n"
                show_excel_result_preview(preview_output)
                try:
                    add_recent_file(excel_path)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
            else:
                add_to_log(log_widget, "图片OCR处理失败\n", "error")

        except Exception as e:
            add_to_log(log_widget, f"处理单个图片时出错: {str(e)}\n", "error")
            sugg = get_error_suggestion(str(e))
            if sugg:
                show_error_dialog("OCR处理错误", str(e), sugg)
        finally:
            try:
                root_logger = logging.getLogger()
                for handler in root_logger.handlers[:]:
                    if isinstance(handler, GUILogHandler):
                        root_logger.removeHandler(handler)
                        handler.close()
            except Exception as e:
                logger.debug(f"清理日志处理器失败: {e}")
            status_bar.set_running(False)
            status_bar.set_status("就绪")

    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()


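Review note: `process_single_image_with_status` strips every `logging.StreamHandler` from the root logger and never puts them back. Since `logging.FileHandler` is a subclass of `StreamHandler`, any file handler on the root logger is dropped as well. A minimal sketch of a scoped alternative follows; `temporary_handler` and `ListHandler` are hypothetical names used only for illustration (with `ListHandler` standing in for `GUILogHandler`), not part of this codebase.

```python
import logging
from contextlib import contextmanager


@contextmanager
def temporary_handler(handler, logger=None):
    """Attach `handler` for the duration of the block, then detach and close it.

    Existing handlers on the logger are left untouched.
    """
    target = logger if logger is not None else logging.getLogger()
    target.addHandler(handler)
    try:
        yield handler
    finally:
        target.removeHandler(handler)
        handler.close()


class ListHandler(logging.Handler):
    """Hypothetical stand-in for GUILogHandler: collects formatted records."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


demo_logger = logging.getLogger("demo.temporary_handler")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo isolated from the root logger

with temporary_handler(ListHandler(), demo_logger) as h:
    demo_logger.info("inside")   # captured by the temporary handler
demo_logger.info("outside")      # handler already detached, not captured
```

Because the context manager only adds and removes its own handler, console and file handlers configured elsewhere keep working after the task finishes.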
def run_pipeline_directly(log_widget, status_bar):
    """Run the complete processing pipeline directly."""
    if get_running_task() is not None:
        messagebox.showinfo("任务进行中", "请等待当前任务完成后再执行新的操作。")
        return

    def run_in_thread():
        set_running_task("pipeline")

        if status_bar:
            status_bar.set_running(True)
            status_bar.set_status("开始完整处理流程...")

        start_time = datetime.datetime.now()
        start_perf = time.perf_counter()
        log_widget.configure(state=tk.NORMAL)
        log_widget.delete(1.0, tk.END)
        log_widget.insert(tk.END, "执行命令: 完整处理流程\n", "command")
        log_widget.insert(tk.END, f"开始时间: {start_time.strftime('%Y-%m-%d %H:%M:%S')}\n", "time")
        log_widget.insert(tk.END, "=" * 50 + "\n\n", "separator")
        log_widget.configure(state=tk.DISABLED)

        # Create the reporter before the try block so the finally clause can
        # always call reporter.done(), even if service construction fails.
        reporter = ProgressReporter(status_bar)

        try:
            config = ConfigManager()

            gui_handler = init_gui_logger(log_widget)

            ocr_service = OCRService(config)
            order_service = OrderService(config)

            reporter.running()
            reporter.set("开始OCR批量处理...", 10)

            total, success = ocr_service.batch_process(progress_cb=lambda p: reporter.set("OCR处理中...", p))
            if total == 0:
                add_to_log(log_widget, "没有找到需要处理的图片\n", "warning")
                if status_bar:
                    status_bar.set_status("未找到图片文件")
                return
            elif success == 0:
                add_to_log(log_widget, "OCR处理没有成功处理任何新文件\n", "warning")
            else:
                add_to_log(log_widget, f"OCR处理完成,共处理 {success}/{total} 个文件\n", "success")
                try:
                    processed_map = {}
                    config = ConfigManager()
                    pjson = config.get('Paths', 'processed_record', fallback='data/processed_files.json')
                    if os.path.exists(pjson):
                        with open(pjson, 'r', encoding='utf-8') as f:
                            processed_map = json.load(f)
                    outputs = list(processed_map.values())
                    for p in outputs[-10:]:
                        if p:
                            add_recent_file(os.path.abspath(p))
                except Exception as e:
                    logger.debug(f"加载已处理文件记录失败: {e}")
            reporter.set("开始Excel处理...", 92)

            add_to_log(log_widget, "开始Excel处理...\n", "info")
            result = order_service.process_excel()

            if not result:
                add_to_log(log_widget, "Excel处理失败\n", "error")
            else:
                add_to_log(log_widget, "Excel处理完成\n", "success")
                try:
                    add_recent_file(result)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
                try:
                    validate_unit_price_against_item_data(result, log_widget)
                except Exception as e:
                    logger.debug(f"单价校验失败: {e}")

            # NOTE: this reports 80 after the 92 reported above, so the
            # progress bar moves backwards at this step.
            reporter.set("检查是否需要合并采购单...", 80)
            _ask_and_merge_purchase_orders(order_service, log_widget, add_to_recent=True)

            end_time = datetime.datetime.now()
            duration_sec = max(0.0, time.perf_counter() - start_perf)

            add_to_log(log_widget, f"\n{'=' * 50}\n", "separator")
            add_to_log(log_widget, "完整处理流程执行完毕!\n", "success")
            add_to_log(log_widget, f"结束时间: {end_time.strftime('%Y-%m-%d %H:%M:%S')}\n", "time")
            add_to_log(log_widget, f"耗时: {duration_sec:.2f} 秒\n", "time")
            reporter.set("处理完成", 100)

        except Exception as e:
            add_to_log(log_widget, f"执行过程中发生错误: {str(e)}\n", "error")
            import traceback
            add_to_log(log_widget, f"详细错误信息: {traceback.format_exc()}\n", "error")
        finally:
            dispose_gui_logger()
            reporter.done()

            set_running_task(None)
            if status_bar:
                status_bar.set_running(False)
                status_bar.set_status("就绪")

    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()


def batch_ocr_with_status(log_widget, status_bar):
    """Batch OCR recognition."""
    def run_in_thread():
        try:
            reporter = ProgressReporter(status_bar)
            reporter.running()
            reporter.set("正在进行OCR批量识别...", 10)
            add_to_log(log_widget, "开始OCR批量识别\n", "info")

            init_gui_logger(log_widget)

            ocr_service = OCRService()

            # batch_process() returns a (total, success) pair, as unpacked in
            # run_pipeline_directly. A non-empty tuple is always truthy, so test
            # the success count rather than the tuple itself.
            total, success = ocr_service.batch_process()

            if success:
                add_to_log(log_widget, "OCR批量识别完成\n", "success")
                show_ocr_result_preview("OCR批量识别成功完成")
                reporter.set("批量识别完成", 100)
                try:
                    processed_map = {}
                    config = ConfigManager()
                    pjson = config.get('Paths', 'processed_record', fallback='data/processed_files.json')
                    if os.path.exists(pjson):
                        with open(pjson, 'r', encoding='utf-8') as f:
                            processed_map = json.load(f)
                    outputs = list(processed_map.values())
                    for p in outputs[-10:]:
                        if p:
                            add_recent_file(p)
                    inputs = list(processed_map.keys())
                    for p in inputs[-10:]:
                        if p:
                            add_recent_file(p)
                except Exception as e:
                    logger.debug(f"加载已处理文件记录失败: {e}")
            else:
                add_to_log(log_widget, "OCR批量识别失败\n", "error")

        except Exception as e:
            add_to_log(log_widget, f"OCR批量识别出错: {str(e)}\n", "error")
            sugg = get_error_suggestion(str(e))
            if sugg:
                show_error_dialog("OCR处理错误", str(e), sugg)
        finally:
            dispose_gui_logger()
            reporter.done()

    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()


def batch_process_orders_with_status(log_widget, status_bar):
    """Batch-process orders (Excel processing only, including the merge confirmation)."""
    def run_in_thread():
        try:
            reporter = ProgressReporter(status_bar)
            reporter.running()
            reporter.set("正在批量处理订单...", 10)
            add_to_log(log_widget, "开始批量处理订单\n", "info")

            init_gui_logger(log_widget)

            order_service = OrderService()

            add_to_log(log_widget, "开始Excel处理...\n", "info")
            try:
                latest_input = order_service.get_latest_excel()
                if latest_input:
                    add_recent_file(latest_input)
            except Exception as e:
                logger.debug(f"获取最新Excel失败: {e}")
            result = order_service.process_excel(progress_cb=lambda p: reporter.set("Excel处理中...", p))

            if result:
                add_to_log(log_widget, "Excel处理完成\n", "success")
                try:
                    validate_unit_price_against_item_data(result, log_widget)
                except Exception as e:
                    logger.debug(f"单价校验失败: {e}")

                reporter.set("检查是否需要合并采购单...", 70)
                add_to_log(log_widget, "检查是否需要合并采购单...\n", "info")
                _ask_and_merge_purchase_orders(order_service, log_widget)

                add_to_log(log_widget, "批量处理订单完成\n", "success")
                reporter.set("批量处理订单完成", 100)
                show_excel_result_preview(f"采购单已保存到: {result}\n")
                try:
                    add_recent_file(result)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
            else:
                add_to_log(log_widget, "批量处理订单失败\n", "error")

        except Exception as e:
            add_to_log(log_widget, f"批量处理订单时出错: {str(e)}\n", "error")
            sugg = get_error_suggestion(str(e))
            if sugg:
                show_error_dialog("Excel处理错误", str(e), sugg)
        finally:
            dispose_gui_logger()
            reporter.done()

    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()


def merge_orders_with_status(log_widget, status_bar):
    """Merge purchase orders."""
    def run_in_thread():
        try:
            reporter = ProgressReporter(status_bar)
            reporter.running()
            reporter.set("正在合并采购单...", 10)
            add_to_log(log_widget, "开始合并采购单\n", "info")

            init_gui_logger(log_widget)

            order_service = OrderService()

            result = order_service.merge_all_purchase_orders(progress_cb=lambda p: reporter.set("合并处理中...", p))

            if result:
                add_to_log(log_widget, "采购单合并完成\n", "success")
                show_merge_result_preview(f"已保存到: {result}\n")
                try:
                    add_recent_file(result)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
                try:
                    validate_unit_price_against_item_data(result, log_widget)
                except Exception as e:
                    logger.debug(f"单价校验失败: {e}")
            else:
                add_to_log(log_widget, "采购单合并失败\n", "error")

        except Exception as e:
            add_to_log(log_widget, f"采购单合并出错: {str(e)}\n", "error")
            sugg = get_error_suggestion(str(e))
            if sugg:
                show_error_dialog("合并错误", str(e), sugg)
        finally:
            dispose_gui_logger()
            reporter.done()

    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()


def process_excel_file_with_status(log_widget, status_bar):
    """Process an Excel file."""
    def run_in_thread():
        try:
            status_bar.set_running(True)
            status_bar.set_status("选择Excel文件中...")
            file_path = select_excel_file(log_widget)

            if file_path:
                status_bar.set_status("开始处理Excel文件...")
                add_to_log(log_widget, f"开始处理Excel文件: {file_path}\n", "info")
            else:
                status_bar.set_status("操作已取消")
                add_to_log(log_widget, "未选择文件,操作已取消\n", "warning")
                return

            init_gui_logger(log_widget)

            order_service = OrderService()

            if file_path:
                try:
                    add_recent_file(file_path)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
                result = order_service.process_excel(file_path, progress_cb=lambda p: status_bar.set_status("Excel处理中...", p))
            else:
                # Unreachable: the early return above already handles an empty file_path.
                try:
                    latest_input = order_service.get_latest_excel()
                    if latest_input:
                        add_recent_file(latest_input)
                except Exception as e:
                    logger.debug(f"获取最新Excel失败: {e}")
                result = order_service.process_excel(progress_cb=lambda p: status_bar.set_status("Excel处理中...", p))

            if result:
                add_to_log(log_widget, "Excel文件处理完成\n", "success")
                show_excel_result_preview(f"采购单已保存到: {result}\n")
                try:
                    add_recent_file(result)
                except Exception as e:
                    logger.debug(f"添加最近文件失败: {e}")
                try:
                    validate_unit_price_against_item_data(result, log_widget)
                except Exception as e:
                    logger.debug(f"单价校验失败: {e}")
            else:
                add_to_log(log_widget, "Excel文件处理失败\n", "error")

        except Exception as e:
            add_to_log(log_widget, f"Excel文件处理出错: {str(e)}\n", "error")
            msg = str(e)
            suggestion = None
            if 'openpyxl' in msg or 'engine' in msg:
                suggestion = "安装依赖:pip install openpyxl"
            elif 'xlrd' in msg:
                suggestion = "安装依赖:pip install xlrd"
            if suggestion:
                show_error_dialog("Excel处理错误", msg, suggestion)
        finally:
            dispose_gui_logger()

            status_bar.set_running(False)
            status_bar.set_status("就绪")

    thread = Thread(target=run_in_thread)
    thread.daemon = True
    thread.start()


def process_dropped_file(log_widget, status_bar, file_path):
    try:
        ext = os.path.splitext(file_path)[1].lower()
        if ext in ['.jpg', '.jpeg', '.png', '.bmp']:
            def _run_img():
                try:
                    reporter = ProgressReporter(status_bar)
                    reporter.running()
                    init_gui_logger(log_widget)
                    add_to_log(log_widget, f"开始一键处理图片: {file_path}\n", "info")
                    try:
                        add_recent_file(file_path)
                    except Exception as e:
                        logger.debug(f"添加最近文件失败: {e}")

                    # Step 1: OCR recognition
                    reporter.set("OCR识别中...", 10)
                    ocr_service = OCRService()
                    excel_path = ocr_service.process_image(file_path)
                    if not excel_path:
                        add_to_log(log_widget, "图片OCR处理失败\n", "error")
                        return
                    add_to_log(log_widget, f"OCR识别完成: {excel_path}\n", "success")

                    # Step 2: Excel processing
                    reporter.set("Excel处理中...", 40)
                    order_service = OrderService()
                    result = order_service.process_excel(excel_path, progress_cb=lambda p: reporter.set("Excel处理中...", p))
                    if not result:
                        add_to_log(log_widget, "Excel处理失败\n", "error")
                        return
                    add_to_log(log_widget, f"Excel处理完成: {result}\n", "success")
                    try:
                        add_recent_file(result)
                    except Exception as e:
                        logger.debug(f"添加最近文件失败: {e}")
                    try:
                        validate_unit_price_against_item_data(result, log_widget)
                    except Exception as e:
                        logger.debug(f"单价校验失败: {e}")

                    # Step 3: merge purchase orders
                    reporter.set("检查合并采购单...", 80)
                    _ask_and_merge_purchase_orders(order_service, log_widget, add_to_recent=True)

                    reporter.set("处理完成", 100)
                    add_to_log(log_widget, "一键处理完成!\n", "success")
                finally:
                    dispose_gui_logger()
                    reporter.done()
            t = Thread(target=_run_img)
            t.daemon = True
            t.start()
        elif ext in ['.xlsx', '.xls']:
            def _run_xls():
                try:
                    reporter = ProgressReporter(status_bar)
                    reporter.running()
                    init_gui_logger(log_widget)
                    order_service = OrderService()
                    add_to_log(log_widget, f"开始一键处理Excel文件: {file_path}\n", "info")
                    try:
                        add_recent_file(file_path)
                    except Exception as e:
                        logger.debug(f"添加最近文件失败: {e}")

                    # Step 1: Excel processing
                    reporter.set("Excel处理中...", 20)
                    result = order_service.process_excel(file_path, progress_cb=lambda p: reporter.set("Excel处理中...", p))
                    if not result:
                        add_to_log(log_widget, "Excel文件处理失败\n", "error")
                        return
                    add_to_log(log_widget, f"Excel处理完成: {result}\n", "success")
                    try:
                        add_recent_file(result)
                    except Exception as e:
                        logger.debug(f"添加最近文件失败: {e}")
                    try:
                        validate_unit_price_against_item_data(result, log_widget)
                    except Exception as e:
                        logger.debug(f"单价校验失败: {e}")

                    # Step 2: merge purchase orders
                    reporter.set("检查合并采购单...", 80)
                    _ask_and_merge_purchase_orders(order_service, log_widget, add_to_recent=True)

                    reporter.set("处理完成", 100)
                    add_to_log(log_widget, "一键处理完成!\n", "success")
                finally:
                    dispose_gui_logger()
                    reporter.done()
            t = Thread(target=_run_xls)
            t.daemon = True
            t.start()
        else:
            add_to_log(log_widget, f"不支持的文件类型: {file_path}\n", "warning")
    except Exception as e:
        add_to_log(log_widget, f"处理拖拽文件失败: {str(e)}\n", "error")
@@ -1,33 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Barcode mapping editor module."""

from tkinter import messagebox

from app.core.excel.converter import UnitConverter
from app.core.utils.dialog_utils import show_barcode_mapping_dialog

from .logging_ui import add_to_log


def edit_barcode_mappings(log_widget):
    """Edit the barcode mapping configuration."""
    try:
        add_to_log(log_widget, "正在加载条码映射配置...\n", "info")

        unit_converter = UnitConverter()

        current_mappings = unit_converter.special_barcodes

        def save_mappings(new_mappings):
            success = unit_converter.update_barcode_mappings(new_mappings)
            if success:
                add_to_log(log_widget, f"成功保存条码映射配置,共{len(new_mappings)}项\n", "success")
            else:
                add_to_log(log_widget, "保存条码映射配置失败\n", "error")

        show_barcode_mapping_dialog(None, save_mappings, current_mappings)

    except Exception as e:
        add_to_log(log_widget, f"编辑条码映射时出错: {str(e)}\n", "error")
        messagebox.showerror("错误", f"编辑条码映射时出错: {str(e)}")
@@ -1,160 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Command runner module."""

import os
import sys
import time
import subprocess
import datetime
import re
import tkinter as tk
from tkinter import messagebox
from threading import Thread

from .logging_ui import LogRedirector
from .result_previews import show_result_preview

# Task state tracking
_RUNNING_TASK = None


def get_running_task():
    return _RUNNING_TASK


def set_running_task(val):
    global _RUNNING_TASK
    _RUNNING_TASK = val


def run_command_with_logging(command, log_widget, status_bar=None, on_complete=None):
    """Run a command and redirect its output to the log window."""
    if _RUNNING_TASK is not None:
        messagebox.showinfo("任务进行中", "请等待当前任务完成后再执行新的操作。")
        return

    def run_in_thread():
        global _RUNNING_TASK
        _RUNNING_TASK = command

        if status_bar:
            status_bar.set_running(True)

        start_time = datetime.datetime.now()
        start_perf = time.perf_counter()
        log_widget.configure(state=tk.NORMAL)
        log_widget.delete(1.0, tk.END)
        log_widget.insert(tk.END, f"执行命令: {' '.join(command)}\n", "command")
        log_widget.insert(tk.END, f"开始时间: {start_time.strftime('%Y-%m-%d %H:%M:%S')}\n", "time")
        log_widget.insert(tk.END, "=" * 50 + "\n\n", "separator")
        log_widget.configure(state=tk.DISABLED)

        old_stdout = sys.stdout
        old_stderr = sys.stderr

        log_redirector = LogRedirector(log_widget)

        env = os.environ.copy()
        try:
            from app.config.settings import ConfigManager
            cfg = ConfigManager()
            env["OCR_OUTPUT_DIR"] = cfg.get_path('Paths', 'output_folder', fallback='data/output', create=True)
            env["OCR_INPUT_DIR"] = cfg.get_path('Paths', 'input_folder', fallback='data/input', create=True)
            env["OCR_TEMP_DIR"] = cfg.get_path('Paths', 'temp_folder', fallback='data/temp', create=True)
        except Exception:
            # Fallback: use the directory containing the exe/script
            app_root = os.path.dirname(sys.executable) if getattr(sys, 'frozen', False) else os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
            env["OCR_OUTPUT_DIR"] = os.path.join(app_root, "data", "output")
            env["OCR_INPUT_DIR"] = os.path.join(app_root, "data", "input")
            env["OCR_TEMP_DIR"] = os.path.join(app_root, "data", "temp")
        env["OCR_LOG_LEVEL"] = "DEBUG"

        try:
            sys.stdout = log_redirector
            sys.stderr = log_redirector

            print("日志重定向已启动,现在同时输出到终端和GUI")

            process = subprocess.Popen(
                command,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                text=True,
                bufsize=1,
                universal_newlines=True,
                env=env
            )

            output_data = []
            for line in process.stdout:
                output_data.append(line)
                print(line.rstrip())

                if status_bar:
                    progress = extract_progress_from_log(line)
                    if progress is not None:
                        log_widget.after(0, lambda p=progress: status_bar.set_status(f"处理中: {p}%完成", p))

            process.wait()

            end_time = datetime.datetime.now()
            duration_sec = max(0.0, time.perf_counter() - start_perf)

            print(f"\n{'=' * 50}")
            print(f"执行完毕!返回码: {process.returncode}")
            print(f"结束时间: {end_time.strftime('%Y-%m-%d %H:%M:%S')}")
            print(f"耗时: {duration_sec:.2f} 秒")

            output_text = ''.join(output_data)

            is_pipeline = "pipeline" in command
            no_merge_files = "未找到采购单文件" in output_text
            single_file = "只有1个采购单文件" in output_text

            if is_pipeline and (no_merge_files or single_file):
                print("完整流程中没有需要合并的文件,但其他步骤执行成功,视为成功完成")
                if status_bar:
                    log_widget.after(0, lambda: status_bar.set_status("处理完成", 100))
                log_widget.after(0, lambda: show_result_preview(command, output_text))
            else:
                if on_complete:
                    log_widget.after(0, lambda: on_complete(process.returncode, output_text))
                elif process.returncode == 0:
                    if status_bar:
                        log_widget.after(0, lambda: status_bar.set_status("处理完成", 100))
                    log_widget.after(0, lambda: show_result_preview(command, output_text))
                else:
                    if status_bar:
                        log_widget.after(0, lambda: status_bar.set_status(f"处理失败 (返回码: {process.returncode})", 0))
                    log_widget.after(0, lambda: messagebox.showerror("操作失败", f"处理失败,返回码:{process.returncode}"))

        except Exception as e:
            print(f"\n执行出错: {str(e)}")
            # Capture the message now: `e` is unbound once the except block
            # exits, which is before the scheduled callbacks below run.
            err = str(e)
            if status_bar:
                log_widget.after(0, lambda: status_bar.set_status(f"执行出错: {err}", 0))
            log_widget.after(0, lambda: messagebox.showerror("执行错误", f"执行命令时出错: {err}"))
        finally:
            sys.stdout = old_stdout
            sys.stderr = old_stderr

            _RUNNING_TASK = None
            if status_bar:
                log_widget.after(0, lambda: status_bar.set_running(False))

    Thread(target=run_in_thread).start()


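Review note: callbacks handed to `after(...)` run only after the `except` block has exited, and Python unbinds the `except ... as e` name at that point. A minimal sketch of the difference between reading `e` late and capturing the message up front, with no Tk involved; `make_callbacks` is a hypothetical helper used only for this demonstration.

```python
def make_callbacks():
    """Return two deferred callbacks created inside an except block."""
    try:
        raise ValueError("boom")
    except Exception as e:
        # Reads `e` when called, i.e. after the except block has finished.
        late = lambda: str(e)
        # Captures the message immediately as a default argument.
        bound = lambda msg=str(e): msg
    return late, bound


late, bound = make_callbacks()

captured = bound()          # the message was stored before `e` was unbound

try:
    late()
    late_failed = False
except NameError:
    # `e` no longer exists in the enclosing scope once the handler exits.
    late_failed = True
```

Binding the value through a default argument (or a plain local assigned inside the handler) keeps the deferred callback independent of the exception variable's lifetime.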
def extract_progress_from_log(log_line):
    """Extract progress information from a log line."""
    batch_match = re.search(r'处理批次 (\d+)/(\d+)', log_line)
    if batch_match:
        current = int(batch_match.group(1))
        total = int(batch_match.group(2))
        # Guard against a "0/0" batch line, which would otherwise raise ZeroDivisionError.
        if total > 0:
            return int(current / total * 100)

    percent_match = re.search(r'(\d+)%', log_line)
    if percent_match:
        return int(percent_match.group(1))

    return None
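Review note: a standalone sketch of the same log-line parsing idea, with the result clamped to 0..100 so a status bar never receives an out-of-range value. `parse_progress` and the two compiled patterns are hypothetical names for illustration; the batch pattern is the one used by `extract_progress_from_log`.

```python
import re
from typing import Optional

# Same batch pattern as extract_progress_from_log; the names are illustrative.
BATCH_RE = re.compile(r"处理批次 (\d+)/(\d+)")
PERCENT_RE = re.compile(r"(\d+)%")


def parse_progress(line: str) -> Optional[int]:
    """Return a 0..100 progress value parsed from a log line, or None."""
    m = BATCH_RE.search(line)
    if m:
        current, total = int(m.group(1)), int(m.group(2))
        if total > 0:  # skip a degenerate "0/0" batch line
            return min(100, current * 100 // total)
    m = PERCENT_RE.search(line)
    if m:
        return min(100, int(m.group(1)))
    return None


samples = {
    "batch": parse_progress("处理批次 3/4"),
    "empty_batch": parse_progress("处理批次 0/0"),
    "percent": parse_progress("done 42%"),
    "overflow": parse_progress("处理批次 5/4"),
    "no_match": parse_progress("nothing to report"),
}
```

Integer arithmetic (`current * 100 // total`) avoids float rounding, and the clamp covers log lines that report more batches than expected.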
@@ -1,205 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""System settings dialog module."""

import os
import tkinter as tk
from tkinter import messagebox, filedialog, ttk

from app.config.settings import ConfigManager

from .user_settings import load_user_settings, save_user_settings
from .ui_widgets import center_window
from app.core.utils.dialog_utils import show_cloud_sync_dialog


def show_config_dialog(root, cfg: ConfigManager):

    settings = load_user_settings()
    dlg = tk.Toplevel(root)
    dlg.title("系统设置")
    dlg.geometry("700x460")
    center_window(dlg)

    content = ttk.Frame(dlg)
    content.pack(fill=tk.BOTH, expand=True, padx=12, pady=12)
    content.columnconfigure(0, weight=1)

    # ── Helper functions ──
    def _add_pair(parent, row, col, label_text, widget, label_width=None):
        """Place the label at (row, col*2) and the widget at (row, col*2+1) in parent."""
        lbl = ttk.Label(parent, text=label_text)
        if label_width:
            lbl.configure(width=label_width)
        lbl.grid(row=row, column=col * 2, sticky='w', padx=(6, 2), pady=3)
        widget.grid(row=row, column=col * 2 + 1, sticky='ew', padx=(2, 6), pady=3)

    def _make_dir_widget(parent, var, label):
        f = ttk.Frame(parent)
        e = ttk.Entry(f, textvariable=var)
        e.pack(side=tk.LEFT, fill=tk.X, expand=True)

        def _select_dir():
            d = filedialog.askdirectory(title=f"选择{label}")
            if d:
                try:
                    var.set(os.path.relpath(d, os.getcwd()))
                except Exception:
                    var.set(d)

        ttk.Button(f, text="选择", command=_select_dir).pack(side=tk.LEFT, padx=4)
        return f

    # ── Current values ──
    log_level_val = tk.StringVar(value=settings.get('log_level', 'INFO'))
    max_workers_val = tk.StringVar(value=str(settings.get('concurrency_max_workers', cfg.getint('Performance', 'max_workers', 4))))
    batch_size_val = tk.StringVar(value=str(settings.get('concurrency_batch_size', cfg.getint('Performance', 'batch_size', 5))))
    template_path_val = tk.StringVar(value=settings.get('template_path', os.path.join(cfg.get('Paths', 'template_folder', 'templates'), cfg.get('Templates', 'purchase_order', '银豹-采购单模板.xls'))))
    input_dir_val = tk.StringVar(value=settings.get('input_folder', cfg.get('Paths', 'input_folder', 'data/input')))
    output_dir_val = tk.StringVar(value=settings.get('output_folder', cfg.get('Paths', 'output_folder', 'data/output')))
    result_dir_val = tk.StringVar(value=settings.get('result_folder', 'data/result'))
    api_key_val = tk.StringVar(value=settings.get('api_key', cfg.get('API', 'api_key', '')))
    secret_key_val = tk.StringVar(value=settings.get('secret_key', cfg.get('API', 'secret_key', '')))
    timeout_val = tk.StringVar(value=str(settings.get('timeout', cfg.getint('API', 'timeout', 30))))
    max_retries_val = tk.StringVar(value=str(settings.get('max_retries', cfg.getint('API', 'max_retries', 3))))
    retry_delay_val = tk.StringVar(value=str(settings.get('retry_delay', cfg.getint('API', 'retry_delay', 2))))
    api_url_val = tk.StringVar(value=settings.get('api_url', cfg.get('API', 'api_url', '')))
    gitea_url_val = tk.StringVar(value=cfg.get('Gitea', 'base_url', fallback='https://gitea.94kan.cn'))
    gitea_owner_val = tk.StringVar(value=cfg.get('Gitea', 'owner', fallback='houhuan'))
    gitea_repo_val = tk.StringVar(value=cfg.get('Gitea', 'repo', fallback='yixuan-sync-data'))
    gitea_token_val = tk.StringVar(value=cfg.get('Gitea', 'token', fallback=''))

    # ═══════════════════════════════════════════════════
    # Section 1: Basic settings
    # ═══════════════════════════════════════════════════
    f1 = ttk.LabelFrame(content, text=" 基本设置 ", padding=(8, 4))
    f1.pack(fill=tk.X, pady=(0, 6))
    for c in range(4):
        f1.columnconfigure(c, weight=1 if c % 2 == 1 else 0)

    lvl = ttk.Combobox(f1, textvariable=log_level_val, values=['DEBUG', 'INFO', 'WARNING', 'ERROR'], state='readonly', width=12)
    _add_pair(f1, 0, 0, "日志级别", lvl)
    _add_pair(f1, 0, 1, "最大并发", ttk.Entry(f1, textvariable=max_workers_val, width=6))
    _add_pair(f1, 1, 0, "批次大小", ttk.Entry(f1, textvariable=batch_size_val, width=6))

    # Template path (with a browse button, occupies the right column)
    tpl_frame = ttk.Frame(f1)
    tpl_entry = ttk.Entry(tpl_frame, textvariable=template_path_val)
    tpl_entry.pack(side=tk.LEFT, fill=tk.X, expand=True)

    def _select_template():
        p = filedialog.askopenfilename(title="选择模板文件", filetypes=[("Excel模板", "*.xls *.xlsx"), ("所有文件", "*.*")])
        if p:
            try:
                template_path_val.set(os.path.relpath(p, os.getcwd()))
            except Exception:
                template_path_val.set(p)

    ttk.Button(tpl_frame, text="选择", command=_select_template).pack(side=tk.LEFT, padx=4)
    _add_pair(f1, 1, 1, "采购模板", tpl_frame)

    _add_pair(f1, 2, 0, "输入目录", _make_dir_widget(f1, input_dir_val, "输入目录"))
    _add_pair(f1, 2, 1, "输出目录", _make_dir_widget(f1, output_dir_val, "输出目录"))
    _add_pair(f1, 3, 0, "结果目录", _make_dir_widget(f1, result_dir_val, "结果目录"))

    # ═══════════════════════════════════════════════════
    # Section 2: API settings
    # ═══════════════════════════════════════════════════
    f2 = ttk.LabelFrame(content, text=" API 设置 ", padding=(8, 4))
    f2.pack(fill=tk.X, pady=(0, 6))
    for c in range(4):
        f2.columnconfigure(c, weight=1 if c % 2 == 1 else 0)

    _add_pair(f2, 0, 0, "API Key", ttk.Entry(f2, textvariable=api_key_val))
    secret_entry = ttk.Entry(f2, textvariable=secret_key_val, show='*')
    _add_pair(f2, 0, 1, "Secret Key", secret_entry)
    _add_pair(f2, 1, 0, "Timeout", ttk.Entry(f2, textvariable=timeout_val, width=6))
    _add_pair(f2, 1, 1, "Max Retries", ttk.Entry(f2, textvariable=max_retries_val, width=6))
    _add_pair(f2, 2, 0, "Retry Delay", ttk.Entry(f2, textvariable=retry_delay_val, width=6))
    _add_pair(f2, 2, 1, "API URL", ttk.Entry(f2, textvariable=api_url_val))

    # ═══════════════════════════════════════════════════
    # Section 3: Cloud sync (Gitea)
    # ═══════════════════════════════════════════════════
    f3 = ttk.LabelFrame(content, text=" 云端同步 (Gitea) ", padding=(8, 4))
    f3.pack(fill=tk.X, pady=(0, 8))
    for c in range(4):
        f3.columnconfigure(c, weight=1 if c % 2 == 1 else 0)

    _add_pair(f3, 0, 0, "Gitea 地址", ttk.Entry(f3, textvariable=gitea_url_val))
    _add_pair(f3, 0, 1, "仓库所有者", ttk.Entry(f3, textvariable=gitea_owner_val))
    _add_pair(f3, 1, 0, "仓库名称", ttk.Entry(f3, textvariable=gitea_repo_val))
    _add_pair(f3, 1, 1, "Access Token", ttk.Entry(f3, textvariable=gitea_token_val, show='*'))

    # ═══════════════════════════════════════════════════
    # Button area
    # ═══════════════════════════════════════════════════
    btns = ttk.Frame(content)
    btns.pack(fill=tk.X, pady=(4, 0))

    def save_settings():
        try:
            s = load_user_settings()
            s['log_level'] = log_level_val.get()
            s['concurrency_max_workers'] = int(max_workers_val.get() or '4')
            s['concurrency_batch_size'] = int(batch_size_val.get() or '5')
            tp = template_path_val.get()
            inp = input_dir_val.get()
            outp = output_dir_val.get()
            resp = result_dir_val.get()
            # Store absolute paths relative to the working directory when possible.
            try:
                if tp:
                    tp = os.path.relpath(tp, os.getcwd()) if os.path.isabs(tp) else tp
                if inp:
                    inp = os.path.relpath(inp, os.getcwd()) if os.path.isabs(inp) else inp
                if outp:
                    outp = os.path.relpath(outp, os.getcwd()) if os.path.isabs(outp) else outp
                if resp:
                    resp = os.path.relpath(resp, os.getcwd()) if os.path.isabs(resp) else resp
            except Exception:
                pass
            s['template_path'] = tp
            s['input_folder'] = inp
            s['output_folder'] = outp
            s['result_folder'] = resp
            save_user_settings(s)
            try:
                from app.core.utils.log_utils import set_log_level
                set_log_level(s['log_level'])
            except Exception:
                pass
            try:
                tpl_path = s['template_path']
                tpl_dir = os.path.dirname(tpl_path)
                tpl_name = os.path.basename(tpl_path)
                cfg.update('Paths', 'template_folder', tpl_dir)
                cfg.update('Templates', 'purchase_order', tpl_name)
                try:
                    cfg.update('Paths', 'template_file', os.path.join(tpl_dir, tpl_name))
                except Exception:
                    pass
                cfg.update('Paths', 'input_folder', s['input_folder'])
                cfg.update('Paths', 'output_folder', s['output_folder'])
                # Numeric settings are passed as strings, matching the other values written here.
                cfg.update('Performance', 'max_workers', str(s['concurrency_max_workers']))
                cfg.update('Performance', 'batch_size', str(s['concurrency_batch_size']))
                cfg.update('API', 'api_key', api_key_val.get())
                cfg.update('API', 'secret_key', secret_key_val.get())
                cfg.update('API', 'timeout', timeout_val.get())
                cfg.update('API', 'max_retries', max_retries_val.get())
                cfg.update('API', 'retry_delay', retry_delay_val.get())
                cfg.update('API', 'api_url', api_url_val.get())
                cfg.update('Gitea', 'base_url', gitea_url_val.get())
                cfg.update('Gitea', 'owner', gitea_owner_val.get())
                cfg.update('Gitea', 'repo', gitea_repo_val.get())
                cfg.update('Gitea', 'token', gitea_token_val.get())
                cfg.save_config()
            except Exception:
                pass
            messagebox.showinfo("设置已保存", "系统设置已更新并保存")
            dlg.destroy()
        except Exception as e:
            messagebox.showerror("保存失败", str(e))

    ttk.Button(btns, text="云端同步", command=lambda: show_cloud_sync_dialog(dlg)).pack(side=tk.LEFT)
    ttk.Button(btns, text="取消", command=dlg.destroy).pack(side=tk.RIGHT)
    ttk.Button(btns, text="保存", command=save_settings).pack(side=tk.RIGHT, padx=6)
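The path normalization inside `save_settings` can be read as one small rule applied four times: an absolute path is stored relative to the working directory, and anything else is stored as entered. A minimal sketch of that rule follows. The helper name `to_relative` and its signature are assumptions made for illustration and are not part of the original module. It also handles the `ValueError` that `os.path.relpath` raises on Windows when the path and the base directory are on different drives.

```python
import os


def to_relative(path: str, base: str) -> str:
    """Return `path` relative to `base` if it is absolute, otherwise unchanged.

    Hypothetical helper for illustration only.
    """
    if not path or not os.path.isabs(path):
        return path
    try:
        return os.path.relpath(path, base)
    except ValueError:
        # On Windows, relpath fails when path and base are on different drives.
        return path


base = os.path.abspath(os.getcwd())
print(to_relative(os.path.join(base, "data", "input"), base))
print(to_relative("templates/order.xlsx", base))
```

Handling each path separately means one failing conversion does not affect the other paths.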
@@ -1,41 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Error handling utilities."""

from tkinter import messagebox
from typing import Optional

from app.core.utils.log_utils import get_logger

logger = get_logger(__name__)


def show_error_dialog(title: str, message: str, suggestion: Optional[str] = None):
    try:
        full_msg = message
        if suggestion:
            full_msg = f"{message}\n\n建议操作:\n- {suggestion}"
        messagebox.showerror(title, full_msg)
    except Exception as e:
        logger.debug(f"显示错误对话框失败: {e}")


def get_error_suggestion(message: str) -> Optional[str]:
    msg = (message or "").lower()
    if 'openpyxl' in msg or ('engine' in msg and 'xlsx' in msg):
        return '安装依赖:pip install openpyxl'
    if 'xlrd' in msg or ('engine' in msg and 'xls' in msg):
        return '安装依赖:pip install xlrd'
    if 'timeout' in msg or 'timed out' in msg:
        return '检查网络,增大API超时时间或稍后重试'
    if 'invalid access_token' in msg or 'access token' in msg:
        return '刷新百度OCR令牌或检查api_key/secret_key'
    if '429' in msg or 'too many requests' in msg:
        return '降低识别频率或稍后重试'
    if '模板文件不存在' in msg or ('no such file' in msg and '模板' in msg):
        return '在系统设置中选择正确的模板文件路径'
    if '没有找到采购单' in msg or ('未在' in msg and '找到采购单' in msg):
        return '确认result目录内存在采购单文件'
    if 'permission denied' in msg:
        return '以管理员权限运行或更改目录写入权限'
    return None
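The keyword checks in `get_error_suggestion` all follow one pattern: lowercase the message, test whether a set of keywords all appear, and return the first matching hint. As a sketch of that pattern, the same lookup can be written as an ordered rule table. The names `RULES` and `suggest` and the English hint strings are assumptions made for this illustration. They are not the application's actual rules.

```python
from typing import Optional, Sequence, Tuple

# Each rule: keywords that must ALL appear in the message, then the hint to return.
# Rules are checked in order, so earlier rules take priority.
# The contents are illustrative, not the application's full list.
RULES: Sequence[Tuple[Tuple[str, ...], str]] = (
    (("openpyxl",), "pip install openpyxl"),
    (("engine", "xlsx"), "pip install openpyxl"),
    (("timeout",), "check the network or raise the API timeout"),
    (("timed out",), "check the network or raise the API timeout"),
    (("429",), "lower the request rate or retry later"),
)


def suggest(message: Optional[str]) -> Optional[str]:
    """Return the first hint whose keywords all occur in the message."""
    msg = (message or "").lower()
    for keywords, hint in RULES:
        if all(keyword in msg for keyword in keywords):
            return hint
    return None


print(suggest("Missing optional dependency 'openpyxl'"))
print(suggest("HTTP 429 Too Many Requests"))
```

A table keeps the matching order explicit, which matters when a broad keyword such as `xls` would otherwise also match a more specific message about `xlsx`.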
@@ -1,205 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""File and directory operations."""

import os
import json
import tkinter as tk
from tkinter import messagebox, filedialog, scrolledtext

from .logging_ui import add_to_log
from .ui_widgets import center_window
from app.config.settings import ConfigManager


def select_file(log_widget, file_types=None, title="选择文件"):
    """Generic file-selection dialog."""
    if file_types is None:
        file_types = [("所有文件", "*.*")]
    file_path = filedialog.askopenfilename(title=title, filetypes=file_types)
    if file_path:
        add_to_log(log_widget, f"已选择文件: {file_path}\n", "info")
    return file_path


def select_excel_file(log_widget):
    """Select an Excel file."""
    return select_file(
        log_widget,
        [("Excel文件", "*.xlsx *.xls"), ("所有文件", "*.*")],
        "选择Excel文件"
    )


def ensure_directories():
    """Make sure the required directory structure exists."""
    config = ConfigManager()
    directories = [
        config.get_path('Paths', 'input_folder', fallback='data/input', create=True),
        config.get_path('Paths', 'output_folder', fallback='data/output', create=True),
        config.get_path('Paths', 'result_folder', fallback='data/result', create=True),
        config.get_path('Paths', 'temp_folder', fallback='data/temp', create=True),
        os.path.join(config.app_root, 'logs')
    ]
    for directory in directories:
        if not os.path.exists(directory):
            os.makedirs(directory, exist_ok=True)
            print(f"创建目录: {directory}")


def clean_cache(log_widget):
    """Clear the processing cache."""
    from .command_runner import set_running_task
    try:
        config = ConfigManager()
        processed_record = config.get_path('Paths', 'processed_record', fallback='data/processed_files.json')
        output_folder = config.get_path('Paths', 'output_folder', fallback='data/output')
        cache_files = [
            processed_record,
            os.path.join(output_folder, "processed_files.json"),
            os.path.join(output_folder, "merged_files.json")
        ]

        for cache_file in cache_files:
            if os.path.exists(cache_file):
                os.remove(cache_file)
                add_to_log(log_widget, f"已清除缓存文件: {cache_file}\n", "success")

        temp_dir = config.get_path('Paths', 'temp_folder', fallback='data/temp')
        if os.path.exists(temp_dir):
            for file in os.listdir(temp_dir):
                file_path = os.path.join(temp_dir, file)
                try:
                    if os.path.isfile(file_path):
                        os.remove(file_path)
                        add_to_log(log_widget, f"已清除临时文件: {file_path}\n", "info")
                except Exception as e:
                    add_to_log(log_widget, f"清除文件时出错: {file_path}, 错误: {str(e)}\n", "error")

        log_dir = os.path.join(config.app_root, 'logs')
        if os.path.exists(log_dir):
            for file in os.listdir(log_dir):
                if file.endswith(".active"):
                    file_path = os.path.join(log_dir, file)
                    try:
                        os.remove(file_path)
                        add_to_log(log_widget, f"已清除活动日志标记: {file_path}\n", "info")
                    except Exception as e:
                        add_to_log(log_widget, f"清除文件时出错: {file_path}, 错误: {str(e)}\n", "error")

        set_running_task(None)

        add_to_log(log_widget, "缓存清除完成,系统将重新处理所有文件\n", "success")
        messagebox.showinfo("缓存清除", "缓存已清除,系统将重新处理所有文件。")
    except Exception as e:
        add_to_log(log_widget, f"清除缓存时出错: {str(e)}\n", "error")
        messagebox.showerror("错误", f"清除缓存时出错: {str(e)}")


def open_result_directory():
    try:
        config = ConfigManager()
        result_dir = config.get_path('Paths', 'result_folder', fallback='data/result', create=True)
        os.startfile(result_dir)
    except Exception as e:
        messagebox.showerror("错误", f"无法打开结果目录: {str(e)}")


def _open_directory_from_settings(config_key, default_path, label):
    """Read a path from the configuration and open that directory."""
    try:
        config = ConfigManager()
        path = config.get_path('Paths', config_key, fallback=default_path, create=True)
        os.startfile(path)
    except Exception as e:
        messagebox.showerror("错误", f"无法打开{label}: {str(e)}")


def open_input_directory_from_settings():
    _open_directory_from_settings('input_folder', 'data/input', '输入目录')


def open_output_directory_from_settings():
    _open_directory_from_settings('output_folder', 'data/output', '输出目录')


def open_result_directory_from_settings():
    _open_directory_from_settings('result_folder', 'data/result', '结果目录')


def clean_data_files(log_widget):
    """Clean data files (input and output directories only)."""
    try:
        if not messagebox.askyesno("确认清理", "确定要清理input和output目录的文件吗?这将删除所有输入和输出数据。"):
            add_to_log(log_widget, "操作已取消\n", "info")
            return

        config = ConfigManager()
        files_cleaned = 0

        input_dir = config.get_path('Paths', 'input_folder', fallback='data/input')
        if os.path.exists(input_dir):
            for file in os.listdir(input_dir):
                file_path = os.path.join(input_dir, file)
                if os.path.isfile(file_path):
                    os.remove(file_path)
                    files_cleaned += 1
            add_to_log(log_widget, "已清理input目录\n", "info")

        output_dir = config.get_path('Paths', 'output_folder', fallback='data/output')
        if os.path.exists(output_dir):
            for file in os.listdir(output_dir):
                file_path = os.path.join(output_dir, file)
                if os.path.isfile(file_path):
                    os.remove(file_path)
                    files_cleaned += 1
            add_to_log(log_widget, "已清理output目录\n", "info")

        add_to_log(log_widget, f"清理完成,共清理 {files_cleaned} 个文件\n", "success")
        messagebox.showinfo("清理完成", f"已成功清理 {files_cleaned} 个文件")
    except Exception as e:
        add_to_log(log_widget, f"清理数据文件时出错: {str(e)}\n", "error")
        messagebox.showerror("错误", f"清理数据文件时出错: {str(e)}")


def clean_result_files(log_widget):
    try:
        if not messagebox.askyesno("确认清理", "确定要清理result目录的文件吗?这将删除所有已生成的采购单文件。"):
            add_to_log(log_widget, "操作已取消\n", "info")
            return
        config = ConfigManager()
        count = 0
        result_dir = config.get_path('Paths', 'result_folder', fallback='data/result')
        if os.path.exists(result_dir):
            for file in os.listdir(result_dir):
                file_path = os.path.join(result_dir, file)
                if os.path.isfile(file_path):
                    os.remove(file_path)
                    count += 1
        add_to_log(log_widget, f"已清理result目录,共 {count} 个文件\n", "success")
        messagebox.showinfo("清理完成", f"已清理result目录 {count} 个文件")
    except Exception as e:
        add_to_log(log_widget, f"清理result目录时出错: {str(e)}\n", "error")
        messagebox.showerror("错误", f"清理result目录时出错: {str(e)}")


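`clean_data_files` and `clean_result_files` repeat the same loop: list a directory, delete the regular files directly inside it, and count them. A minimal sketch of that loop as a standalone function follows. The name `remove_files_in` is hypothetical and not part of the original module. Unlike the original, which checks `os.path.exists`, this version checks `os.path.isdir`.

```python
import os
import tempfile


def remove_files_in(directory: str) -> int:
    """Delete regular files directly inside `directory`; return how many were removed."""
    if not os.path.isdir(directory):
        return 0
    removed = 0
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):  # subdirectories are left untouched
            os.remove(path)
            removed += 1
    return removed


# Demonstration on a throwaway directory so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.xlsx", "b.jpg"):
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "keep"))
    print(remove_files_in(tmp))
    print(os.listdir(tmp))
```

Returning the count lets the caller build the log line and the confirmation dialog without repeating the loop.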
def validate_unit_price_against_item_data(result_path: str, log_widget=None):
    try:
        from app.services.order_service import OrderService
        service = OrderService()
        bad_results = service.validate_unit_price(result_path)

        if bad_results:
            display_count = min(len(bad_results), 10)
            msg = f"存在{len(bad_results)}条单价与商品资料进货价差异超过1元:\n" + "\n".join(bad_results[:display_count])
            if len(bad_results) > 10:
                msg += f"\n...(其余 {len(bad_results) - 10} 条已省略)"
            messagebox.showwarning("单价校验提示", msg)
            if log_widget is not None:
                add_to_log(log_widget, f"单价校验发现{len(bad_results)}条差异>1元\n", "warning")
        else:
            if log_widget is not None:
                add_to_log(log_widget, "单价校验通过(差异<=1元)\n", "success")
    except Exception as e:
        if log_widget is not None:
            add_to_log(log_widget, f"单价校验出错: {str(e)}\n", "error")
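The warning text built in `validate_unit_price_against_item_data` shows at most ten entries and then notes how many were left out. A small sketch of that truncation step on its own follows. The function name `summarize` and the English wording are assumptions made for illustration.

```python
from typing import List


def summarize(items: List[str], limit: int = 10) -> str:
    """Join up to `limit` items, then note how many were left out."""
    shown = items[:limit]
    text = "\n".join(shown)
    hidden = len(items) - len(shown)
    if hidden > 0:
        text += f"\n... ({hidden} more omitted)"
    return text


print(summarize([f"row {i}" for i in range(1, 13)]))
```

Computing the omitted count from the slice keeps the message correct if the limit is changed.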
@@ -1,126 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""GUI log handling."""

import logging
import queue
import sys
import tkinter as tk

# Global log queue used to update the UI asynchronously
LOG_QUEUE = queue.Queue()


class LogRedirector:
    """Redirects captured command output to the UI."""
    def __init__(self, text_widget):
        self.text_widget = text_widget
        self.buffer = ""
        self.terminal = sys.__stdout__

    def write(self, string):
        self.buffer += string
        self.terminal.write(string)
        self.text_widget.after(0, self.update_text_widget)

    def update_text_widget(self):
        self.text_widget.configure(state=tk.NORMAL)

        if self.buffer.strip():
            if any(marker in self.buffer.lower() for marker in ["错误", "error", "失败", "异常", "exception"]):
                self.text_widget.insert(tk.END, self.buffer, "error")
            elif any(marker in self.buffer.lower() for marker in ["警告", "warning"]):
                self.text_widget.insert(tk.END, self.buffer, "warning")
            elif any(marker in self.buffer.lower() for marker in ["成功", "success", "完成", "成功处理"]):
                self.text_widget.insert(tk.END, self.buffer, "success")
            elif any(marker in self.buffer.lower() for marker in ["info", "信息", "开始", "处理中"]):
                self.text_widget.insert(tk.END, self.buffer, "info")
            else:
                self.text_widget.insert(tk.END, self.buffer, "normal")
        else:
            self.text_widget.insert(tk.END, self.buffer)

        self.text_widget.see(tk.END)
        self.text_widget.configure(state=tk.DISABLED)
        self.buffer = ""

    def flush(self):
        self.terminal.flush()


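`LogRedirector.update_text_widget` picks a display tag by scanning the buffered text for marker words, checking error markers first, then warning, success, and info. The classification can be sketched separately from the widget code as follows. The names `TAG_MARKERS` and `classify` are hypothetical. The marker words are taken from the original checks, except that the redundant "成功处理" is dropped because any text containing it also contains "成功".

```python
# Checked in order, so an earlier tag wins when several markers match.
TAG_MARKERS = (
    ("error", ("错误", "error", "失败", "异常", "exception")),
    ("warning", ("警告", "warning")),
    ("success", ("成功", "success", "完成")),
    ("info", ("info", "信息", "开始", "处理中")),
)


def classify(text: str) -> str:
    """Return the display tag for a chunk of output, or "normal" if nothing matches."""
    lowered = text.lower()
    for tag, markers in TAG_MARKERS:
        if any(marker in lowered for marker in markers):
            return tag
    return "normal"


print(classify("Error: file missing"))
print(classify("处理完成"))
```

Because matching is by substring, a line such as "no errors found" is still tagged as an error; this is a limitation of the keyword approach itself.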
class GUILogHandler(logging.Handler):
    """Custom log handler that queues records for the GUI main thread to consume periodically."""
    def __init__(self, text_widget):
        super().__init__()
        self.text_widget = text_widget

    def emit(self, record):
        try:
            msg = self.format(record)
            if record.levelno >= logging.ERROR:
                tag = "error"
            elif record.levelno >= logging.WARNING:
                tag = "warning"
            elif record.levelno >= logging.INFO:
                tag = "info"
            else:
                tag = "normal"

            LOG_QUEUE.put((msg + "\n", tag))
        except Exception:
            self.handleError(record)


def poll_log_queue(text_widget):
    """Periodically read log entries from the queue and update the UI."""
    try:
        updated = False
        while not LOG_QUEUE.empty():
            msg, tag = LOG_QUEUE.get_nowait()
            text_widget.configure(state=tk.NORMAL)
            text_widget.insert(tk.END, msg, tag)
            updated = True

        if updated:
            text_widget.see(tk.END)
            text_widget.configure(state=tk.DISABLED)

    except Exception:
        pass
    finally:
        text_widget.after(100, lambda: poll_log_queue(text_widget))


def init_gui_logger(text_widget, level=logging.INFO):
    handler = GUILogHandler(text_widget)
    handler.setLevel(level)
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    handler.setFormatter(formatter)
    root_logger = logging.getLogger()
    # Note: logging.FileHandler subclasses StreamHandler, so this loop also
    # removes file handlers attached to the root logger.
    for h in root_logger.handlers[:]:
        if isinstance(h, logging.StreamHandler):
            root_logger.removeHandler(h)
    if not any(isinstance(h, GUILogHandler) for h in root_logger.handlers):
        root_logger.addHandler(handler)
    root_logger.setLevel(level)
    return handler


def dispose_gui_logger():
    root_logger = logging.getLogger()
    for handler in root_logger.handlers[:]:
        if isinstance(handler, GUILogHandler):
            root_logger.removeHandler(handler)
            try:
                handler.close()
            except Exception:
                pass


def add_to_log(log_widget, text, tag="normal"):
    """Queue text for poll_log_queue to consume and display in the UI."""
    if log_widget is None:
        print(f"[{tag}] {text}", end="")
        return

    LOG_QUEUE.put((text, tag))
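The module's design is a producer/consumer split: any thread may log, the handler only puts a `(message, tag)` pair on a queue, and the GUI thread drains the queue on a timer. A headless sketch of that split, with no tkinter involved, follows. The names `QueueTagHandler`, `log_queue`, and `drain` are assumptions made for illustration. In the real module the draining is done by `poll_log_queue` through `text_widget.after`.

```python
import logging
import queue

log_queue = queue.Queue()


class QueueTagHandler(logging.Handler):
    """Put (message, tag) pairs on a queue instead of touching any widget."""

    def emit(self, record: logging.LogRecord) -> None:
        try:
            if record.levelno >= logging.ERROR:
                tag = "error"
            elif record.levelno >= logging.WARNING:
                tag = "warning"
            elif record.levelno >= logging.INFO:
                tag = "info"
            else:
                tag = "normal"
            log_queue.put((self.format(record) + "\n", tag))
        except Exception:
            self.handleError(record)


def drain(q: queue.Queue) -> list:
    """Return everything currently queued; a GUI would call this on its main thread."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


logger = logging.getLogger("demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output away from the root logger
logger.addHandler(QueueTagHandler())
logger.warning("low disk space")
logger.info("started")
print(drain(log_queue))
```

Catching `queue.Empty` in `drain` avoids relying on `Queue.empty()`, whose result can be stale by the time `get_nowait()` runs if more than one consumer exists.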
@@ -1,501 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Main window."""

import os
import sys
import subprocess
import tkinter as tk
from tkinter import messagebox, filedialog, scrolledtext

from app.config.settings import ConfigManager
from app.core.utils.log_utils import set_log_level

from .theme import THEMES, get_theme_mode, set_theme_mode, create_modern_button, create_card_frame
from .logging_ui import add_to_log, poll_log_queue
from .ui_widgets import StatusBar
from .user_settings import (
    load_user_settings, save_user_settings, refresh_recent_list_widget,
    _extract_path_from_recent_item, clear_recent_files, RECENT_LIST_WIDGET,
)
from .file_operations import (
    ensure_directories, open_result_directory, clean_cache,
    clean_data_files, clean_result_files,
)
from .action_handlers import (
    process_single_image_with_status, run_pipeline_directly,
    batch_ocr_with_status, batch_process_orders_with_status,
    merge_orders_with_status, process_excel_file_with_status,
    process_dropped_file,
)
from .memory_editor import show_memory_editor
from .config_dialog import show_config_dialog
from .barcode_editor import edit_barcode_mappings
from .shortcuts import bind_keyboard_shortcuts
from app.core.utils.dialog_utils import show_cloud_sync_dialog


def _init_window():
    """Initialize the window, theme, and settings; return (root, theme, settings, dnd_supported)."""
    ensure_directories()

    dnd_supported = False
    try:
        from tkinterdnd2 import TkinterDnD, DND_FILES
        root = TkinterDnD.Tk()
        dnd_supported = True
    except Exception:
        root = tk.Tk()

    settings = load_user_settings()
    theme_mode = settings.get('theme_mode', get_theme_mode())
    set_theme_mode(theme_mode)

    try:
        cfg_for_title = ConfigManager()
        ver = cfg_for_title.get('App', 'version', fallback='dev')
        root.title(f"益选-OCR订单处理系统 v{ver} by 欢欢欢")
    except Exception:
        root.title("益选-OCR订单处理系统 by 欢欢欢")

    root.geometry("900x600")
    settings['window_size'] = "900x600"
    theme = THEMES[get_theme_mode()]
    root.configure(bg=theme["bg"])

    try:
        log_level = settings.get('log_level')
        if log_level:
            set_log_level(log_level)
        concurrency = settings.get('concurrency_max_workers')
        if concurrency:
            cfg = ConfigManager()
            cfg.update('Performance', 'max_workers', str(concurrency))
            cfg.save_config()
    except Exception:
        pass

    try:
        root.iconbitmap(default="")
    except Exception:
        pass

    return root, theme, settings, dnd_supported


def _create_left_panel(content_frame, theme, log_text, status_bar):
    """Create the left panel: full pipeline, OCR, Excel processing, recent files."""
    left_panel = create_card_frame(content_frame)
    left_panel.pack(side=tk.LEFT, fill=tk.BOTH, expand=False, padx=(0, 5), pady=5)
    left_panel.configure(width=160)

    panel_content = tk.Frame(left_panel, bg=theme["card_bg"])
    panel_content.pack(fill=tk.BOTH, expand=True, padx=10, pady=(5, 10))

    # Full pipeline section
    pipeline_section = tk.LabelFrame(
        panel_content, text="完整流程", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    pipeline_section.pack(fill=tk.X, pady=(0, 8))
    pipeline_frame = tk.Frame(pipeline_section, bg=theme["card_bg"])
    pipeline_frame.pack(fill=tk.X, padx=8, pady=6)
    create_modern_button(pipeline_frame, "一键处理", lambda: run_pipeline_directly(log_text, status_bar), "primary", px_width=150, px_height=32).pack(anchor='w', pady=3)

    # OCR section
    core_section = tk.LabelFrame(
        panel_content, text="OCR处理", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    core_section.pack(fill=tk.X, pady=(0, 8))
    core_buttons_frame = tk.Frame(core_section, bg=theme["card_bg"])
    core_buttons_frame.pack(fill=tk.X, padx=8, pady=6)
    core_row1 = tk.Frame(core_buttons_frame, bg=theme["card_bg"])
    core_row1.pack(fill=tk.X, pady=3)
    create_modern_button(core_row1, "批量识别", lambda: batch_ocr_with_status(log_text, status_bar), "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(0, 3))
    create_modern_button(core_row1, "单个识别", lambda: process_single_image_with_status(log_text, status_bar), "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(3, 0))

    # Excel processing section
    ocr_section = tk.LabelFrame(
        panel_content, text="Excel处理", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    ocr_section.pack(fill=tk.X, pady=(0, 8))
    ocr_buttons_frame = tk.Frame(ocr_section, bg=theme["card_bg"])
    ocr_buttons_frame.pack(fill=tk.X, padx=8, pady=6)
    ocr_row1 = tk.Frame(ocr_buttons_frame, bg=theme["card_bg"])
    ocr_row1.pack(fill=tk.X, pady=3)
    create_modern_button(ocr_row1, "批量处理", lambda: batch_process_orders_with_status(log_text, status_bar), "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(0, 3))
    create_modern_button(ocr_row1, "单个处理", lambda: process_excel_file_with_status(log_text, status_bar), "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(3, 0))

    # Recent files section
    _create_recent_files_section(panel_content, theme, log_text)


def _create_recent_files_section(parent, theme, log_text):
    """Create the recent-files list area."""
    recent_section = tk.LabelFrame(
        parent, text="最近文件", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    recent_section.pack(fill=tk.BOTH, pady=(0, 12))
    recent_frame = tk.Frame(recent_section, bg=theme["card_bg"])
    recent_frame.pack(fill=tk.BOTH, padx=8, pady=6)
    recent_top = tk.Frame(recent_frame, bg=theme["card_bg"])
    recent_top.pack(fill=tk.X)

    def _resize_recent_top(e):
        try:
            h = max(int(e.height * 0.85), 180)
            recent_top.configure(height=h)
        except Exception:
            pass

    try:
        recent_top.pack_propagate(False)
    except Exception:
        pass
    recent_frame.bind('<Configure>', _resize_recent_top)

    recent_rect = tk.Frame(recent_top, bg=theme["card_bg"], highlightbackground=theme["border"], highlightthickness=1)
    recent_rect.pack(fill=tk.BOTH, expand=True)
    recent_list = tk.Listbox(recent_rect, height=20)
    recent_scrollbar = tk.Scrollbar(recent_rect)
    recent_list.configure(yscrollcommand=recent_scrollbar.set)
    recent_scrollbar.configure(command=recent_list.yview)
    recent_list.pack(side=tk.LEFT, fill=tk.BOTH, expand=True)
    recent_scrollbar.pack(side=tk.RIGHT, fill=tk.Y)

    import app.ui.user_settings as _us_mod
    _us_mod.RECENT_LIST_WIDGET = recent_list

    def _open_selected_event(evt=None):
        try:
            idxs = recent_list.curselection()
            if not idxs:
                return
            p = _extract_path_from_recent_item(recent_list.get(idxs[0]))
            if os.path.exists(p):
                os.startfile(p)
            else:
                messagebox.showwarning("文件不存在", p)
        except Exception as e:
            messagebox.showerror("打开失败", str(e))

    recent_list.bind('<Double-Button-1>', _open_selected_event)
    refresh_recent_list_widget()
    rf_btns = tk.Frame(recent_frame, bg=theme["card_bg"])
    rf_btns.pack(fill=tk.X, pady=6)

    def clear_list():
        clear_recent_files()
        recent_list.delete(0, tk.END)

    create_modern_button(rf_btns, "清空列表", clear_list, "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(3, 0))

    def purge_invalid():
        try:
            kept = []
            for i in range(recent_list.size()):
                item = recent_list.get(i)
                p = _extract_path_from_recent_item(item)
                if os.path.exists(p):
                    kept.append(p)
            try:
                kept_sorted = sorted(kept, key=lambda p: os.path.getmtime(p), reverse=True)
            except Exception:
                kept_sorted = kept
            s = load_user_settings()
            s['recent_files'] = kept_sorted
            save_user_settings(s)
            recent_list.delete(0, tk.END)
            # The list was just emptied, so every kept entry is re-inserted.
            for i, p in enumerate(s['recent_files'], start=1):
                recent_list.insert(tk.END, f"{i}. {p}")
            refresh_recent_list_widget()
            add_to_log(log_text, "已清理无效的最近文件条目\n", "success")
        except Exception as e:
            messagebox.showerror("清理失败", str(e))

    create_modern_button(rf_btns, "清理无效", purge_invalid, "primary", px_width=72, px_height=32).pack(side=tk.LEFT, padx=(3, 0))


def _create_right_panel(content_frame, theme, log_text, root):
    """Create the right panel: quick actions and system settings."""
    right_panel = create_card_frame(content_frame)
    right_panel.pack(side=tk.RIGHT, fill=tk.BOTH, expand=False, padx=(5, 0), pady=5)
    right_panel.configure(width=380)

    right_panel_content = tk.Frame(right_panel, bg=theme["card_bg"])
    right_panel_content.pack(fill=tk.BOTH, expand=True, padx=10, pady=(5, 10))

    # Tools section
    tools_section = tk.LabelFrame(
        right_panel_content, text="快捷操作", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    tools_section.pack(fill=tk.X, pady=(0, 8))
    tools_buttons_frame = tk.Frame(tools_section, bg=theme["card_bg"])
    tools_buttons_frame.pack(fill=tk.X, padx=8, pady=6)
    tk.Frame(tools_buttons_frame, bg=theme["card_bg"]).pack(fill=tk.X, pady=3)

    create_modern_button(tools_buttons_frame, "打开结果目录", lambda: open_result_directory(), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(tools_buttons_frame, "打开输出目录", lambda: os.startfile(ConfigManager().get_path('Paths', 'output_folder', fallback='data/output', create=True)), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(tools_buttons_frame, "打开输入目录", lambda: os.startfile(ConfigManager().get_path('Paths', 'input_folder', fallback='data/input', create=True)), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(tools_buttons_frame, "合并订单", lambda: merge_orders_with_status(log_text, StatusBar(root)), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(tools_buttons_frame, "清除缓存", lambda: clean_cache(log_text), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(tools_buttons_frame, "清理input/out文件", lambda: clean_data_files(log_text), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(tools_buttons_frame, "清理result文件", lambda: clean_result_files(log_text), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)

    # System settings section
    settings_section = tk.LabelFrame(
        right_panel_content, text="系统设置", bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    settings_section.pack(fill=tk.X, pady=(0, 8))
    settings_buttons_frame = tk.Frame(settings_section, bg=theme["card_bg"])
    settings_buttons_frame.pack(fill=tk.X, padx=8, pady=6)
    create_modern_button(settings_buttons_frame, "系统设置", lambda: show_config_dialog(root, ConfigManager()), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(settings_buttons_frame, "条码映射", lambda: edit_barcode_mappings(log_text), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(settings_buttons_frame, "云端同步", lambda: show_cloud_sync_dialog(root), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)
    create_modern_button(settings_buttons_frame, "商品记忆库", lambda: show_memory_editor(root), "primary", px_width=132, px_height=32).pack(anchor='w', pady=3)


def _setup_drag_area(mid_container, theme, dnd_supported, log_text, status_bar):
    """Create the drag-and-drop / click-to-select file area."""
    drag_panel = create_card_frame(mid_container)
    drag_panel.pack(side=tk.TOP, fill=tk.X, padx=(5, 5), pady=(0, 5))
    drag_panel_content = tk.Frame(drag_panel, bg=theme["card_bg"])
    drag_panel_content.pack(fill=tk.X, padx=10, pady=6)

    dnd_section = tk.LabelFrame(
        drag_panel_content, bg=theme["card_bg"], fg=theme["fg"],
        font=("Microsoft YaHei UI", 10, "bold"), relief="flat", borderwidth=0
    )
    dnd_section.pack(fill=tk.X, pady=(0, 0))
    dnd_frame = tk.Frame(dnd_section, bg=theme["card_bg"], highlightthickness=1, highlightbackground=theme["border"])
    dnd_frame.configure(height=60)
    dnd_frame.pack(fill=tk.X, padx=8, pady=6)
    try:
        dnd_frame.pack_propagate(False)
    except Exception:
        pass

    def _set_highlight(active: bool):
        try:
            dnd_frame.configure(highlightbackground=theme["info"] if active else theme["border"])
        except Exception:
            pass

    dnd_frame.bind('<Enter>', lambda e: _set_highlight(True))
    dnd_frame.bind('<Leave>', lambda e: _set_highlight(False))

    msg_row = tk.Frame(dnd_frame, bg=theme["card_bg"])
    msg_row.pack(fill=tk.X)
    if dnd_supported:
        tk.Label(
            msg_row, text="拖拽已启用:拖拽或点击此区域选择文件",
            bg=theme["card_bg"], fg="#999999", justify="center"
        ).pack(fill=tk.X)
    else:
        tk.Label(
            msg_row, text="点击此区域选择文件;可安装拖拽支持",
            bg=theme["card_bg"], fg="#999999", justify="center"
        ).pack(fill=tk.X)

if not dnd_supported:
|
|
||||||
btn_row = tk.Frame(dnd_frame, bg=theme["card_bg"])
|
|
||||||
btn_row.pack(fill=tk.X)
|
|
||||||
|
|
||||||
is_frozen = getattr(sys, 'frozen', False)
|
|
||||||
|
|
||||||
def copy_install():
|
|
||||||
try:
|
|
||||||
mid_container.winfo_toplevel().clipboard_clear()
|
|
||||||
mid_container.winfo_toplevel().clipboard_append("pip install tkinterdnd2")
|
|
||||||
messagebox.showinfo("已复制", "已复制安装命令:pip install tkinterdnd2")
|
|
||||||
except Exception as e:
|
|
||||||
messagebox.showwarning("复制失败", str(e))
|
|
||||||
|
|
||||||
if is_frozen:
|
|
||||||
tk.Label(
|
|
||||||
btn_row, text="EXE版不支持运行时安装,请用源码版安装后重新打包",
|
|
||||||
bg=theme["card_bg"], fg="#999999", font=("Microsoft YaHei UI", 8)
|
|
||||||
).pack(side=tk.RIGHT, padx=4)
|
|
||||||
else:
|
|
||||||
def install_and_restart():
|
|
||||||
try:
|
|
||||||
add_to_log(log_text, "开始安装拖拽支持库 tkinterdnd2...\n", "info")
|
|
||||||
cmd = [sys.executable, "-m", "pip", "install", "tkinterdnd2"]
|
|
||||||
result = subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
|
|
||||||
add_to_log(log_text, result.stdout + "\n", "info")
|
|
||||||
add_to_log(log_text, "安装成功,准备重启程序以启用拖拽...\n", "success")
|
|
||||||
if messagebox.askyesno("安装完成", "已安装拖拽支持,是否立即重启应用?"):
|
|
||||||
os.execl(sys.executable, sys.executable, *sys.argv)
|
|
||||||
except subprocess.CalledProcessError as e:
|
|
||||||
add_to_log(log_text, f"安装失败: {e.stderr}\n", "error")
|
|
||||||
messagebox.showerror("安装失败", f"安装输出:\n{e.stderr}")
|
|
||||||
except Exception as e:
|
|
||||||
add_to_log(log_text, f"安装失败: {str(e)}\n", "error")
|
|
||||||
messagebox.showerror("安装失败", str(e))
|
|
||||||
|
|
||||||
create_modern_button(btn_row, "一键安装拖拽", install_and_restart, "primary", px_width=132, px_height=28).pack(side=tk.RIGHT, padx=(3, 0))
|
|
||||||
|
|
||||||
create_modern_button(btn_row, "复制安装命令", copy_install, "primary", px_width=132, px_height=28).pack(side=tk.RIGHT)
|
|
||||||
|
|
||||||
# 点击拖拽框选择文件
|
|
||||||
def _click_select(evt=None):
|
|
||||||
try:
|
|
||||||
files = filedialog.askopenfilenames(
|
|
||||||
title="选择图片或Excel文件",
|
|
||||||
filetypes=[
|
|
||||||
("支持文件", "*.xlsx *.xls *.jpg *.jpeg *.png *.bmp"),
|
|
||||||
("Excel", "*.xlsx *.xls"),
|
|
||||||
("图片", "*.jpg *.jpeg *.png *.bmp"),
|
|
||||||
("所有文件", "*.*"),
|
|
||||||
]
|
|
||||||
)
|
|
||||||
if not files:
|
|
||||||
return
|
|
||||||
for p in files:
|
|
||||||
process_dropped_file(log_text, status_bar, p)
|
|
||||||
except Exception as e:
|
|
||||||
messagebox.showerror("选择失败", str(e))
|
|
||||||
|
|
||||||
dnd_frame.bind('<Button-1>', _click_select)
|
|
||||||
msg_row.bind('<Button-1>', _click_select)
|
|
||||||
|
|
||||||
if dnd_supported:
|
|
||||||
def _on_drop(event):
|
|
||||||
try:
|
|
||||||
data = event.data
|
|
||||||
paths = []
|
|
||||||
buf = ""
|
|
||||||
in_brace = False
|
|
||||||
for ch in data:
|
|
||||||
if ch == '{':
|
|
||||||
in_brace = True
|
|
||||||
buf = ""
|
|
||||||
elif ch == '}':
|
|
||||||
in_brace = False
|
|
||||||
paths.append(buf)
|
|
||||||
buf = ""
|
|
||||||
elif ch == ' ' and not in_brace:
|
|
||||||
if buf:
|
|
||||||
paths.append(buf)
|
|
||||||
buf = ""
|
|
||||||
else:
|
|
||||||
buf += ch
|
|
||||||
if buf:
|
|
||||||
paths.append(buf)
|
|
||||||
for p in paths:
|
|
||||||
process_dropped_file(log_text, status_bar, p)
|
|
||||||
except Exception as e:
|
|
||||||
add_to_log(log_text, f"拖拽处理失败: {str(e)}\n", "error")
|
|
||||||
|
|
||||||
try:
|
|
||||||
from tkinterdnd2 import DND_FILES
|
|
||||||
dnd_frame.drop_target_register(DND_FILES)
|
|
||||||
dnd_frame.dnd_bind('<<Drop>>', _on_drop)
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
|
|
||||||
|
|
||||||
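The drop handler in `_setup_drag_area` splits `event.data` by hand: dropped paths arrive as one space-separated string, and a path that itself contains spaces is wrapped in braces. The following is a minimal sketch of that same splitting logic as a standalone function, so it can be exercised without a Tk window. The name `split_drop_data` is hypothetical and not part of the application or of tkinterdnd2.

```python
from typing import List


def split_drop_data(data: str) -> List[str]:
    """Split a brace-quoted, space-separated drop payload into paths.

    Mirrors the character loop in _on_drop: braces group a path that
    contains spaces, and bare spaces separate unbraced paths.
    """
    paths = []
    buf = ""
    in_brace = False
    for ch in data:
        if ch == "{":
            # Start of a braced path: discard anything buffered so far.
            in_brace = True
            buf = ""
        elif ch == "}":
            # End of a braced path: emit it as one entry.
            in_brace = False
            paths.append(buf)
            buf = ""
        elif ch == " " and not in_brace:
            # Separator between unbraced paths.
            if buf:
                paths.append(buf)
                buf = ""
        else:
            buf += ch
    if buf:
        paths.append(buf)
    return paths
```

For example, `split_drop_data("{C:/orders/my order.xlsx} C:/orders/a.png")` returns `['C:/orders/my order.xlsx', 'C:/orders/a.png']`. This sketch does not handle backslash escapes or nested braces; Tk's own list parser (reachable as `widget.tk.splitlist(...)`) is likely the more robust choice if such paths can occur.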
def _create_log_panel(mid_container, theme):
    """Create the central log panel and return the log_text widget."""
    log_panel = create_card_frame(mid_container, "处理日志")
    log_panel.pack(side=tk.TOP, fill=tk.BOTH, expand=True, padx=(5, 5), pady=5)

    log_text = scrolledtext.ScrolledText(
        log_panel, wrap=tk.WORD, width=68, height=26,
        bg=theme["log_bg"], fg=theme["log_fg"],
        font=("Consolas", 9), state=tk.DISABLED,
        relief="flat", borderwidth=0
    )
    log_text.pack(fill=tk.BOTH, expand=True, padx=10, pady=(5, 10))

    log_text.tag_configure("command", foreground=theme["info"], font=("Consolas", 9, "bold"))
    log_text.tag_configure("time", foreground=theme["secondary_bg"], font=("Consolas", 8))
    log_text.tag_configure("separator", foreground=theme["border"])
    log_text.tag_configure("success", foreground=theme["success"], font=("Consolas", 9, "bold"))
    log_text.tag_configure("error", foreground=theme["error"], font=("Consolas", 9, "bold"))
    log_text.tag_configure("warning", foreground=theme["warning"], font=("Consolas", 9, "bold"))
    log_text.tag_configure("info", foreground=theme["info"], font=("Consolas", 9))

    poll_log_queue(log_text)

    try:
        _ver = ConfigManager().get('App', 'version', fallback='')
        _ver_str = f" v{_ver}" if _ver else ""
    except Exception:
        _ver_str = ""
    add_to_log(log_text, f"欢迎使用 益选-OCR订单处理系统{_ver_str}\n", "success")
    add_to_log(log_text, "系统已就绪,请选择相应功能进行操作。\n\n", "info")
    add_to_log(log_text, "功能说明:\n", "command")
    add_to_log(log_text, "• 完整处理流程:一键完成OCR识别和Excel处理\n", "info")
    add_to_log(log_text, "• 批量处理订单:批量处理多个订单文件\n", "info")
    add_to_log(log_text, "• 处理烟草订单:专门处理烟草类订单\n", "info")
    add_to_log(log_text, "• 合并订单:将多个订单合并为一个文件\n\n", "info")
    cfg = ConfigManager()
    add_to_log(log_text, f"请将需要处理的图片文件放入 {cfg.get_path('Paths', 'input_folder', fallback='data/input')} 目录中。\n", "warning")
    add_to_log(log_text, f"OCR识别结果保存在 {cfg.get_path('Paths', 'output_folder', fallback='data/output')} 目录,处理完成的订单保存在 {cfg.get_path('Paths', 'result_folder', fallback='data/result')} 目录中。\n\n", "warning")
    add_to_log(log_text, "=" * 50 + "\n\n", "separator")

    return log_text


def main():
    """Main entry point."""
    try:
        root, theme, settings, dnd_supported = _init_window()

        # Main container
        main_container = tk.Frame(root, bg=theme["bg"])
        main_container.pack(fill=tk.BOTH, expand=True, padx=10, pady=10)
        content_frame = tk.Frame(main_container, bg=theme["bg"])
        content_frame.pack(fill=tk.BOTH, expand=True)

        # Middle container (drop area + log area)
        mid_container = tk.Frame(content_frame, bg=theme["bg"])
        mid_container.pack(side=tk.LEFT, fill=tk.BOTH, expand=True, padx=(5, 5), pady=5)

        log_text = _create_log_panel(mid_container, theme)

        # Status bar
        status_bar = StatusBar(root)
        status_bar.pack(side=tk.BOTTOM, fill=tk.X)

        # Left panel
        _create_left_panel(content_frame, theme, log_text, status_bar)

        # Right panel
        _create_right_panel(content_frame, theme, log_text, root)

        # Drop area
        _setup_drag_area(mid_container, theme, dnd_supported, log_text, status_bar)

        # Keyboard shortcuts and window-close handler
        def on_close():
            try:
                # Persist the window size and theme before exiting.
                w = root.winfo_width()
                h = root.winfo_height()
                settings['window_size'] = f"{w}x{h}"
                settings['theme_mode'] = get_theme_mode()
                save_user_settings(settings)
            except Exception:
                pass
            root.destroy()

        root.protocol("WM_DELETE_WINDOW", on_close)
        bind_keyboard_shortcuts(root, log_text, status_bar)

        root.mainloop()

    except Exception as e:
        import traceback
        error_msg = f"程序启动失败: {str(e)}\n详细错误信息:\n{traceback.format_exc()}"
        print(error_msg)
        try:
            import tkinter.messagebox as mb
            mb.showerror("启动错误", f"程序启动失败:\n{str(e)}")
        except Exception:
            pass
@@ -1,198 +0,0 @@
"""Product memory library viewer/editor dialog."""

import os
import tkinter as tk
from tkinter import ttk, messagebox, simpledialog

from app.config.settings import ConfigManager
from app.core.db.product_db import ProductDatabase
from .ui_widgets import center_window


def _get_product_db():
    cfg = ConfigManager()
    db_path = cfg.get_path('Paths', 'product_db', fallback='data/product_cache.db') if hasattr(cfg, 'get_path') else 'data/product_cache.db'
    tpl_folder = cfg.get('Paths', 'template_folder', fallback='templates')
    item_data = cfg.get('Templates', 'item_data', fallback='商品资料.xlsx')
    tpl_path = os.path.join(tpl_folder, item_data)
    return ProductDatabase(db_path, tpl_path)


def show_memory_editor(root):
    """Show the product memory library editor."""
    db = _get_product_db()

    dlg = tk.Toplevel(root)
    dlg.title("商品记忆库")
    dlg.geometry("950x520")
    center_window(dlg)

    # ── Top search bar ──
    top = ttk.Frame(dlg)
    top.pack(fill=tk.X, padx=8, pady=(8, 4))

    ttk.Label(top, text="搜索:").pack(side=tk.LEFT)
    search_var = tk.StringVar()
    search_entry = ttk.Entry(top, textvariable=search_var, width=30)
    search_entry.pack(side=tk.LEFT, padx=4)

    # ── Statistics label ──
    stats_label = ttk.Label(top, text="")
    stats_label.pack(side=tk.RIGHT)

    # ── Treeview ──
    columns = ("barcode", "name", "specification", "unit", "price", "source", "confidence", "usage_count", "last_seen")
    tree = ttk.Treeview(dlg, columns=columns, show="headings", height=18)

    headers = {
        "barcode": ("条码", 120),
        "name": ("名称", 180),
        "specification": ("规格", 80),
        "unit": ("单位", 50),
        "price": ("单价", 70),
        "source": ("来源", 80),
        "confidence": ("置信度", 60),
        "usage_count": ("使用次数", 70),
        "last_seen": ("最后使用", 140),
    }
    for col, (text, width) in headers.items():
        tree.heading(col, text=text)
        tree.column(col, width=width, anchor="center")

    # Confidence colour tags
    tree.tag_configure("high", foreground="#28a745")  # >= 80: green
    tree.tag_configure("medium", foreground="#ffc107")  # 50-79: yellow
    tree.tag_configure("low", foreground="#dc3545")  # < 50: red

    scrollbar = ttk.Scrollbar(dlg, orient=tk.VERTICAL, command=tree.yview)
    tree.configure(yscrollcommand=scrollbar.set)

    tree.pack(side=tk.LEFT, fill=tk.BOTH, expand=True, padx=(8, 0), pady=4)
    scrollbar.pack(side=tk.LEFT, fill=tk.Y, padx=(0, 8), pady=4)

    # ── Data loading ──
    all_records = []

    def load_data(filter_text=""):
        nonlocal all_records
        all_records = db.get_all_memories()

        tree.delete(*tree.get_children())

        # Case-insensitive substring match on barcode or name.
        filtered = all_records
        if filter_text:
            ft = filter_text.lower()
            filtered = [r for r in all_records
                        if ft in str(r.get('barcode', '')).lower()
                        or ft in str(r.get('name', '')).lower()]

        for r in filtered:
            conf = r.get('confidence', 0) or 0
            tag = "high" if conf >= 80 else ("medium" if conf >= 50 else "low")

            # Trim the timestamp to minute precision for display.
            last_seen = r.get('last_seen', '') or ''
            if last_seen and len(last_seen) > 16:
                last_seen = last_seen[:16]

            source_display = {
                'template': '模板',
                'ocr': 'OCR',
                'user_confirmed': '手动',
            }.get(r.get('source', ''), r.get('source', ''))

            tree.insert("", tk.END, values=(
                r.get('barcode', ''),
                r.get('name', ''),
                r.get('specification', ''),
                r.get('unit', ''),
                f"{r.get('price', 0):.2f}" if r.get('price') else '',
                source_display,
                conf,
                r.get('usage_count', 0) or 0,
                last_seen,
            ), tags=(tag,))

        stats_label.config(text=f"共 {len(filtered)} / {len(all_records)} 条")

    def on_search(*_):
        load_data(search_var.get())

    search_var.trace_add("write", on_search)

    # ── Button row ──
    btn_frame = ttk.Frame(dlg)
    btn_frame.pack(fill=tk.X, padx=8, pady=(0, 8))

    def edit_selected():
        sel = tree.selection()
        if not sel:
            messagebox.showwarning("提示", "请先选择一条记录")
            return
        item = tree.item(sel[0])
        vals = item['values']
        barcode = vals[0]

        # Open the edit dialog
        edit_dlg = tk.Toplevel(dlg)
        edit_dlg.title(f"编辑: {barcode}")
        edit_dlg.geometry("380x260")
        center_window(edit_dlg)

        fields = [
            ("名称", "name", vals[1]),
            ("规格", "specification", vals[2]),
            ("单位", "unit", vals[3]),
            ("单价", "price", vals[4]),
        ]
        entries = {}
        for i, (label, key, val) in enumerate(fields):
            ttk.Label(edit_dlg, text=label).grid(row=i, column=0, sticky='w', padx=8, pady=4)
            var = tk.StringVar(value=str(val) if val else '')
            ttk.Entry(edit_dlg, textvariable=var, width=30).grid(row=i, column=1, padx=8, pady=4)
            entries[key] = var

        def save_edit():
            updates = {}
            for key, var in entries.items():
                v = var.get().strip()
                if key == 'price':
                    # An empty or non-numeric price is stored as 0.
                    try:
                        updates[key] = float(v) if v else 0
                    except ValueError:
                        updates[key] = 0
                else:
                    updates[key] = v
            db.update_memory(barcode, updates)
            edit_dlg.destroy()
            load_data(search_var.get())

        ttk.Button(edit_dlg, text="保存", command=save_edit).grid(row=len(fields), column=0, columnspan=2, pady=12)

    def delete_selected():
        sel = tree.selection()
        if not sel:
            messagebox.showwarning("提示", "请先选择一条记录")
            return
        item = tree.item(sel[0])
        barcode = item['values'][0]
        if messagebox.askyesno("确认删除", f"确定要删除条码 {barcode} 的记忆记录吗?"):
            db.delete_memory(barcode)
            load_data(search_var.get())

    def reimport_template():
        if messagebox.askyesno("确认", "重新从商品资料导入将重置所有模板商品的置信度为100,确定继续吗?"):
            count = db.reimport()
            messagebox.showinfo("完成", f"已重新导入 {count} 条记录")
            load_data(search_var.get())

    ttk.Button(btn_frame, text="编辑", command=edit_selected).pack(side=tk.LEFT, padx=4)
    ttk.Button(btn_frame, text="删除", command=delete_selected).pack(side=tk.LEFT, padx=4)
    ttk.Button(btn_frame, text="重新导入模板", command=reimport_template).pack(side=tk.LEFT, padx=4)
    ttk.Button(btn_frame, text="刷新", command=lambda: load_data(search_var.get())).pack(side=tk.LEFT, padx=4)
    ttk.Button(btn_frame, text="关闭", command=dlg.destroy).pack(side=tk.RIGHT, padx=4)

    # Double-click to edit
    tree.bind("<Double-1>", lambda e: edit_selected())

    # Initial load
    load_data()
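The memory editor's `load_data` combines two small rules: a case-insensitive substring filter over barcode and name, and a three-band mapping from confidence score to a colour tag (80 and above, 50 to 79, below 50). As a rough sketch, the two rules can be pulled out into pure functions that are testable without a database or a Tk window. The names `confidence_tag` and `filter_records` are hypothetical, and records are assumed to be plain dicts as the `r.get(...)` calls in the dialog suggest.

```python
def confidence_tag(confidence):
    """Map a confidence score to the Treeview tag name used by the dialog."""
    conf = confidence or 0  # None or a missing value is treated as 0
    if conf >= 80:
        return "high"    # rendered green
    if conf >= 50:
        return "medium"  # rendered yellow
    return "low"         # rendered red


def filter_records(records, filter_text=""):
    """Keep records whose barcode or name contains filter_text.

    Matching is case-insensitive; an empty filter keeps everything.
    """
    if not filter_text:
        return list(records)
    ft = filter_text.lower()
    return [
        r for r in records
        if ft in str(r.get("barcode", "")).lower()
        or ft in str(r.get("name", "")).lower()
    ]
```

Barcodes are converted with `str(...)` before matching, so the filter works whether the database returns them as text or as integers.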
@@ -1,377 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Result preview dialogs."""

import os
import re
import datetime
import tkinter as tk
from tkinter import messagebox, scrolledtext

from .theme import THEMES, get_theme_mode, apply_theme
from .ui_widgets import center_window
from app.core.utils.file_utils import format_file_size
from app.config.settings import ConfigManager

TOBACCO_PREVIEW_WINDOW = None


def _get_output_dir():
    """Return the absolute path of the output directory."""
    return ConfigManager().get_path('Paths', 'output_folder', fallback='data/output', create=True)


def show_result_preview(command, output):
    """Show the result preview that matches the executed command."""
    if "ocr" in command:
        show_ocr_result_preview(output)
    elif "excel" in command:
        show_excel_result_preview(output)
    elif "merge" in command:
        show_merge_result_preview(output)
    elif "pipeline" in command:
        show_pipeline_result_preview(output)
    else:
        messagebox.showinfo("处理完成", f"操作已成功完成!\n请在{_get_output_dir()}目录查看结果。")


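`show_result_preview` picks a dialog by testing keywords against `command` in a fixed order, so the first keyword that matches wins. A small sketch of that precedence rule follows; `PREVIEW_ORDER` and `select_preview_kind` are hypothetical names introduced only for illustration.

```python
# Keywords in the same order the if/elif chain tests them.
PREVIEW_ORDER = ("ocr", "excel", "merge", "pipeline")


def select_preview_kind(command):
    """Return the first keyword found in command, or "default" if none match.

    `command` may be a string (substring test) or a list of arguments
    (membership test); the `in` operator covers both, as in the original.
    """
    for kind in PREVIEW_ORDER:
        if kind in command:
            return kind
    return "default"
```

One consequence worth knowing: a command string that contains both `ocr` and `pipeline` would be routed to the OCR preview, because `ocr` is tested first.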
def show_ocr_result_preview(output):
    """Show a preview of the OCR processing result."""
    files_match = re.search(r'找到 (\d+) 个图片文件,其中 (\d+) 个未处理', output)
    processed_match = re.search(r'所有图片处理完成, 总计: (\d+), 成功: (\d+)', output)

    if processed_match:
        total = int(processed_match.group(1))
        success = int(processed_match.group(2))

        preview = tk.Toplevel()
        preview.title("OCR处理结果")
        preview.geometry("400x300")
        preview.resizable(False, False)
        center_window(preview)

        tk.Label(preview, text="OCR处理完成", font=("Arial", 16, "bold")).pack(pady=10)

        result_frame = tk.Frame(preview)
        result_frame.pack(pady=10, fill=tk.BOTH, expand=True)

        tk.Label(result_frame, text=f"总共处理: {total} 个文件", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"成功处理: {success} 个文件", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"失败数量: {total - success} 个文件", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)

        # Green when everything succeeded, yellow above 80%, red otherwise.
        if success == total:
            result_text = "全部处理成功!"
            result_color = "#28a745"
        elif success > total * 0.8:
            result_text = "大部分处理成功。"
            result_color = "#ffc107"
        else:
            result_text = "处理失败较多,请检查日志。"
            result_color = "#dc3545"

        tk.Label(result_frame, text=result_text, font=("Arial", 12, "bold"), fg=result_color).pack(pady=10)

        button_frame = tk.Frame(preview)
        button_frame.pack(pady=10)

        tk.Button(button_frame, text="查看输出文件", command=lambda: os.startfile(_get_output_dir())).pack(side=tk.LEFT, padx=10)
        tk.Button(button_frame, text="关闭", command=preview.destroy).pack(side=tk.LEFT, padx=10)
    else:
        messagebox.showinfo("OCR处理完成", f"OCR处理已完成,请在{_get_output_dir()}目录查看结果。")


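The OCR preview derives everything it displays from one summary line in the captured log output. The sketch below isolates that step: it reuses the same regular expression and the same thresholds as `show_ocr_result_preview`, but returns plain data instead of building a window. The function name `summarize_ocr_output` and the verdict labels are hypothetical.

```python
import re

# Same pattern the dialog searches for in the captured log output.
SUMMARY_RE = re.compile(r'所有图片处理完成, 总计: (\d+), 成功: (\d+)')


def summarize_ocr_output(output):
    """Parse the OCR summary line; return None if the line is absent."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    total = int(match.group(1))
    success = int(match.group(2))
    if success == total:
        verdict = "all_succeeded"
    elif success > total * 0.8:
        verdict = "mostly_succeeded"
    else:
        verdict = "many_failed"
    return {
        "total": total,
        "success": success,
        "failed": total - success,
        "verdict": verdict,
    }
```

The middle band uses a strict comparison, so a run with exactly 80% success (for instance 4 of 5) falls into the "many failed" band rather than the "mostly succeeded" one.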
def show_excel_result_preview(output):
    """Show a preview of the Excel processing result."""
    extract_match = re.search(r'提取到 (\d+) 个商品信息', output)
    file_match = re.search(r'采购单已保存到: (.+?)(?:\n|$)', output)

    if extract_match and file_match:
        products_count = int(extract_match.group(1))
        output_file = file_match.group(1)

        preview = tk.Toplevel()
        preview.title("Excel处理结果")
        preview.geometry("450x320")
        preview.resizable(False, False)
        center_window(preview)

        tk.Label(preview, text="Excel处理完成", font=("Arial", 16, "bold")).pack(pady=10)

        result_frame = tk.Frame(preview)
        result_frame.pack(pady=10, fill=tk.BOTH, expand=True)

        tk.Label(result_frame, text=f"提取商品数量: {products_count} 个", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"输出文件: {os.path.basename(output_file)}", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)

        tk.Label(result_frame, text="采购单已成功生成!", font=("Arial", 12, "bold"), fg="#28a745").pack(pady=10)

        file_frame = tk.Frame(result_frame, relief=tk.GROOVE, borderwidth=1)
        file_frame.pack(fill=tk.X, padx=15, pady=5)

        tk.Label(file_frame, text="文件信息", font=("Arial", 10, "bold")).pack(anchor=tk.W, padx=10, pady=5)

        try:
            file_size = os.path.getsize(output_file)
            file_time = datetime.datetime.fromtimestamp(os.path.getmtime(output_file))
            size_text = format_file_size(file_size)
            tk.Label(file_frame, text=f"文件大小: {size_text}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
            tk.Label(file_frame, text=f"创建时间: {file_time.strftime('%Y-%m-%d %H:%M:%S')}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
        except Exception:
            tk.Label(file_frame, text="无法获取文件信息", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)

        button_frame = tk.Frame(preview)
        button_frame.pack(pady=10)

        tk.Button(button_frame, text="打开文件", command=lambda: os.startfile(output_file)).pack(side=tk.LEFT, padx=5)
        tk.Button(button_frame, text="打开所在文件夹", command=lambda: os.startfile(os.path.dirname(output_file))).pack(side=tk.LEFT, padx=5)
        tk.Button(button_frame, text="关闭", command=preview.destroy).pack(side=tk.LEFT, padx=5)
    else:
        messagebox.showinfo("Excel处理完成", f"Excel处理已完成,请在{_get_output_dir()}目录查看结果。")


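The Excel preview recovers the output file path from the log with the pattern `采购单已保存到: (.+?)(?:\n|$)`. Because `.` also matches a carriage return, the captured path would keep a trailing `\r` if the log ever contained Windows line endings. Whether that can happen depends on how the output is captured, which this diff does not show, so the following is only a defensive sketch. The helper name `extract_saved_path` is hypothetical; stripping the capture mirrors what the tobacco preview already does with its own path match.

```python
import re

# Same pattern the Excel and pipeline previews use to find the saved file.
SAVED_PATH_RE = re.compile(r'采购单已保存到: (.+?)(?:\n|$)')


def extract_saved_path(output):
    """Return the saved purchase-order path from the log, or None."""
    match = SAVED_PATH_RE.search(output)
    if match is None:
        return None
    # strip() removes a stray "\r" and any surrounding spaces.
    return match.group(1).strip()
```

With the stripped value, calls such as `os.path.getsize` and `os.startfile` receive the path exactly as it was written to disk.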
def show_merge_result_preview(output):
    """Show a preview of the purchase-order merge result."""
    merged_match = re.search(r'合并了 (\d+) 个采购单', output)
    product_match = re.search(r'共处理 (\d+) 个商品', output)
    output_match = re.search(r'已保存到: (.+?)(?:\n|$)', output)

    if merged_match and output_match:
        merged_count = int(merged_match.group(1))
        product_count = int(product_match.group(1)) if product_match else 0
        output_file = output_match.group(1)

        preview = tk.Toplevel()
        preview.title("采购单合并结果")
        preview.geometry("450x300")
        preview.resizable(False, False)
        apply_theme(preview)

        tk.Label(preview, text="采购单合并完成", font=("Arial", 16, "bold")).pack(pady=10)

        result_frame = tk.Frame(preview)
        result_frame.pack(pady=10, fill=tk.BOTH, expand=True)

        tk.Label(result_frame, text=f"合并采购单数量: {merged_count} 个", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"处理商品数量: {product_count} 个", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"输出文件: {os.path.basename(output_file)}", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)

        theme = THEMES[get_theme_mode()]
        tk.Label(result_frame, text="采购单已成功合并!", font=("Arial", 12, "bold"), fg=theme["success"]).pack(pady=10)

        button_frame = tk.Frame(preview)
        button_frame.pack(pady=10)

        tk.Button(button_frame, text="打开文件", command=lambda: os.startfile(output_file)).pack(side=tk.LEFT, padx=10)
        tk.Button(button_frame, text="打开所在文件夹", command=lambda: os.startfile(os.path.dirname(output_file))).pack(side=tk.LEFT, padx=10)
        tk.Button(button_frame, text="关闭", command=preview.destroy).pack(side=tk.LEFT, padx=10)
    else:
        messagebox.showinfo("采购单合并完成", f"采购单合并已完成,请在{_get_output_dir()}目录查看结果。")


def show_pipeline_result_preview(output):
    """Show a preview of the full pipeline result."""
    ocr_match = re.search(r'所有图片处理完成, 总计: (\d+), 成功: (\d+)', output)
    excel_match = re.search(r'提取到 (\d+) 个商品信息', output)
    output_file_match = re.search(r'采购单已保存到: (.+?)(?:\n|$)', output)

    preview = tk.Toplevel()
    preview.title("完整流程处理结果")
    preview.geometry("500x400")
    preview.resizable(False, False)
    center_window(preview)

    tk.Label(preview, text="完整处理流程已完成", font=("Arial", 16, "bold")).pack(pady=10)

    no_files_match = re.search(r'未找到可合并的文件', output)
    if no_files_match:
        tk.Label(preview, text="未找到可合并的文件,但其他步骤已成功执行", font=("Arial", 12)).pack(pady=0)

    result_frame = tk.Frame(preview)
    result_frame.pack(pady=10, fill=tk.BOTH, expand=True)

    result_text = scrolledtext.ScrolledText(result_frame, wrap=tk.WORD, height=15, width=60)
    result_text.pack(fill=tk.BOTH, expand=True, padx=15, pady=5)
    result_text.configure(state=tk.NORMAL)

    result_text.insert(tk.END, "===== 流程执行结果 =====\n\n", "title")

    # Step 1: OCR summary
    result_text.insert(tk.END, "步骤1: OCR识别\n", "step")
    if ocr_match:
        total = int(ocr_match.group(1))
        success = int(ocr_match.group(2))
        result_text.insert(tk.END, f" 处理图片: {total} 个\n", "info")
        result_text.insert(tk.END, f" 成功识别: {success} 个\n", "info")
        if success == total:
            result_text.insert(tk.END, " 结果: 全部识别成功\n", "success")
        else:
            result_text.insert(tk.END, f" 结果: 部分识别成功 ({success}/{total})\n", "warning")
    else:
        result_text.insert(tk.END, " 结果: 无OCR处理或处理信息不完整\n", "warning")

    # Step 2: Excel summary
    result_text.insert(tk.END, "\n步骤2: Excel处理\n", "step")
    if excel_match:
        products = int(excel_match.group(1))
        result_text.insert(tk.END, f" 提取商品: {products} 个\n", "info")
        result_text.insert(tk.END, " 结果: 成功生成采购单\n", "success")
        if output_file_match:
            output_file = output_file_match.group(1)
            result_text.insert(tk.END, f" 输出文件: {os.path.basename(output_file)}\n", "info")
    else:
        result_text.insert(tk.END, " 结果: 无Excel处理或处理信息不完整\n", "warning")

    result_text.insert(tk.END, "\n===== 整体评估 =====\n", "title")

    # Any "error"/"failure" keyword in the log downgrades the overall verdict.
    has_errors = "错误" in output or "失败" in output

    no_files_match2 = re.search(r'未找到采购单文件', output)
    single_file_match = re.search(r'只有1个采购单文件', output)

    if no_files_match2:
        result_text.insert(tk.END, "没有找到可合并的文件,但处理流程已成功完成。\n", "warning")
        result_text.insert(tk.END, "可以选择打开Excel文件或查看输出文件夹。\n", "info")
    elif single_file_match:
        result_text.insert(tk.END, "只有一个采购单文件,无需合并,处理流程已成功完成。\n", "warning")
        result_text.insert(tk.END, "可以选择打开生成的Excel文件。\n", "info")
    elif ocr_match and excel_match and not has_errors:
        result_text.insert(tk.END, "流程完整执行成功!\n", "success")
    elif ocr_match or excel_match:
        result_text.insert(tk.END, "流程部分执行成功,请检查日志获取详情。\n", "warning")
    else:
        result_text.insert(tk.END, "流程执行可能存在问题,请查看详细日志。\n", "error")

    result_text.tag_configure("title", font=("Arial", 12, "bold"))
    result_text.tag_configure("step", font=("Arial", 11, "bold"))
    result_text.tag_configure("info", font=("Arial", 10))
    result_text.tag_configure("success", font=("Arial", 10, "bold"), foreground="#28a745")
    result_text.tag_configure("warning", font=("Arial", 10, "bold"), foreground="#ffc107")
    result_text.tag_configure("error", font=("Arial", 10, "bold"), foreground="#dc3545")

    result_text.configure(state=tk.DISABLED)

    button_frame = tk.Frame(preview)
    button_frame.pack(pady=10)

    if output_file_match:
        output_file = output_file_match.group(1)
        tk.Button(button_frame, text="打开Excel文件", command=lambda: os.startfile(output_file)).pack(side=tk.LEFT, padx=10)
    else:
        if excel_match or no_files_match or single_file_match:
            # Fall back to the most recently modified purchase-order file.
            output_dir = _get_output_dir()
            excel_files = [f for f in os.listdir(output_dir) if f.startswith('采购单_') and (f.endswith('.xls') or f.endswith('.xlsx'))]
            if excel_files:
                excel_files.sort(key=lambda x: os.path.getmtime(os.path.join(output_dir, x)), reverse=True)
                latest_file = os.path.join(output_dir, excel_files[0])
                tk.Button(button_frame, text="打开最新Excel文件",
                          command=lambda: os.startfile(latest_file)).pack(side=tk.LEFT, padx=10)

    tk.Button(button_frame, text="查看输出文件夹", command=lambda: os.startfile(_get_output_dir())).pack(side=tk.LEFT, padx=10)
    tk.Button(button_frame, text="关闭", command=preview.destroy).pack(side=tk.LEFT, padx=10)


def show_tobacco_result_preview(returncode, output):
    """Show a preview of the tobacco order processing result."""
    global TOBACCO_PREVIEW_WINDOW
    if returncode != 0:
        return

    try:
        try:
            if TOBACCO_PREVIEW_WINDOW and TOBACCO_PREVIEW_WINDOW.winfo_exists():
                TOBACCO_PREVIEW_WINDOW.lift()
                return
        except Exception:
            TOBACCO_PREVIEW_WINDOW = None

        result_file = None
        order_time = "(未知)"
        total_amount = "(未知)"
        items_count = 0

        abs_path_match = re.search(r'烟草订单处理完成,绝对路径: (.+)(?:\n|$)', output)
        if abs_path_match:
            result_file = abs_path_match.group(1).strip()

        for line in output.split('\n'):
            if "烟草公司订单处理成功" in line and "订单时间" in line:
                time_match = re.search(r'订单时间: ([^,]+)', line)
                amount_match = re.search(r'总金额: ([^,]+)', line)
                items_match = re.search(r'处理条目: (\d+)', line)

                if time_match:
                    order_time = time_match.group(1).strip()
                if amount_match:
                    total_amount = amount_match.group(1).strip()
                if items_match:
                    items_count = int(items_match.group(1).strip())

        if not result_file or not os.path.exists(result_file):
            default_path = os.path.join(_get_output_dir(), "银豹采购单_烟草公司.xls")
            if os.path.exists(default_path):
                result_file = default_path

        preview = tk.Toplevel()
        preview.title("烟草订单处理结果")
        preview.geometry("450x320")
        preview.resizable(False, False)
        TOBACCO_PREVIEW_WINDOW = preview

        def _close_preview():
            global TOBACCO_PREVIEW_WINDOW
            TOBACCO_PREVIEW_WINDOW = None
            try:
                preview.destroy()
            except Exception:
                pass

        preview.protocol("WM_DELETE_WINDOW", _close_preview)
        center_window(preview)

        tk.Label(preview, text="烟草订单处理完成", font=("Arial", 16, "bold")).pack(pady=10)

        result_frame = tk.Frame(preview)
        result_frame.pack(pady=10, fill=tk.BOTH, expand=True)

        tk.Label(result_frame, text=f"订单时间: {order_time}", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"订单总金额: {total_amount}", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
        tk.Label(result_frame, text=f"处理商品数量: {items_count} 个", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)

        if result_file and os.path.exists(result_file):
            tk.Label(result_frame, text=f"输出文件: {os.path.basename(result_file)}", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
            tk.Label(result_frame, text="银豹采购单已成功生成!", font=("Arial", 12, "bold"), fg="#28a745").pack(pady=10)

            file_frame = tk.Frame(result_frame, relief=tk.GROOVE, borderwidth=1)
            file_frame.pack(fill=tk.X, padx=15, pady=5)
            tk.Label(file_frame, text="文件信息", font=("Arial", 10, "bold")).pack(anchor=tk.W, padx=10, pady=5)

            try:
                file_size = os.path.getsize(result_file)
                file_time = datetime.datetime.fromtimestamp(os.path.getmtime(result_file))
                size_text = format_file_size(file_size)
                tk.Label(file_frame, text=f"文件大小: {size_text}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
                tk.Label(file_frame, text=f"创建时间: {file_time.strftime('%Y-%m-%d %H:%M:%S')}", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)
            except Exception:
                tk.Label(file_frame, text="无法获取文件信息", font=("Arial", 10)).pack(anchor=tk.W, padx=10, pady=2)

            button_frame = tk.Frame(preview)
            button_frame.pack(pady=10)
            tk.Button(button_frame, text="打开文件", command=lambda: os.startfile(result_file)).pack(side=tk.LEFT, padx=5)
            tk.Button(button_frame, text="打开所在文件夹", command=lambda: os.startfile(os.path.dirname(result_file))).pack(side=tk.LEFT, padx=5)
            tk.Button(button_frame, text="关闭", command=_close_preview).pack(side=tk.LEFT, padx=5)
        else:
            tk.Label(result_frame, text="未找到输出文件", font=("Arial", 12)).pack(anchor=tk.W, padx=20, pady=5)
            tk.Label(result_frame, text=f"请检查{_get_output_dir()}目录", font=("Arial", 12, "bold"), fg="#dc3545").pack(pady=10)

            button_frame = tk.Frame(preview)
            button_frame.pack(pady=10)
            tk.Button(button_frame, text="打开输出目录", command=lambda: os.startfile(_get_output_dir())).pack(side=tk.LEFT, padx=5)
            tk.Button(button_frame, text="关闭", command=_close_preview).pack(side=tk.LEFT, padx=5)

        preview.lift()
        preview.attributes('-topmost', True)
        preview.after_idle(lambda: preview.attributes('-topmost', False))

    except Exception as e:
        messagebox.showerror(
            "处理异常",
            f"显示预览时发生错误: {e}\n请检查日志了解详细信息。"
        )
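Reviewer note: `show_tobacco_result_preview` mixes log parsing with window construction, which makes the parsing hard to test without a display. A minimal sketch of the parsing step as a pure function is given below. The function name `parse_tobacco_summary` and the returned dictionary keys are hypothetical; the regular expressions and default values are copied from the function above, and the exact log line layout used in the usage note is an assumption.

```python
import re
from typing import Dict, Optional, Union


def parse_tobacco_summary(output: str) -> Dict[str, Union[str, int, None]]:
    """Extract result path, order time, total amount and item count from log text.

    Hypothetical helper: fields that are not found keep the same defaults
    the preview window uses.
    """
    summary: Dict[str, Union[str, int, None]] = {
        "result_file": None,
        "order_time": "(未知)",
        "total_amount": "(未知)",
        "items_count": 0,
    }

    path_match = re.search(r'烟草订单处理完成,绝对路径: (.+)(?:\n|$)', output)
    if path_match:
        summary["result_file"] = path_match.group(1).strip()

    for line in output.split('\n'):
        if "烟草公司订单处理成功" in line and "订单时间" in line:
            # Text fields share one shape: a label followed by a comma-free value.
            for key, pattern in (("order_time", r'订单时间: ([^,]+)'),
                                 ("total_amount", r'总金额: ([^,]+)')):
                match = re.search(pattern, line)
                if match:
                    summary[key] = match.group(1).strip()
            items_match = re.search(r'处理条目: (\d+)', line)
            if items_match:
                summary["items_count"] = int(items_match.group(1))
    return summary
```

With this split, the Tk code would only need to read the four values from the returned dictionary, and the parsing can be exercised with plain strings.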
@@ -1,60 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Keyboard shortcuts module."""

import tkinter as tk
from tkinter import messagebox

from .ui_widgets import center_window
from .action_handlers import (
    process_single_image_with_status,
    process_excel_file_with_status,
    batch_ocr_with_status,
    run_pipeline_directly,
    merge_orders_with_status,
)
from .file_operations import clean_cache


def bind_keyboard_shortcuts(root, log_widget, status_bar):
    """Bind keyboard shortcuts to the root window."""
    root.bind('<Control-o>', lambda e: process_single_image_with_status(log_widget, status_bar))
    root.bind('<Control-e>', lambda e: process_excel_file_with_status(log_widget, status_bar))
    root.bind('<Control-b>', lambda e: batch_ocr_with_status(log_widget, status_bar))
    root.bind('<Control-p>', lambda e: run_pipeline_directly(log_widget, status_bar))
    root.bind('<Control-m>', lambda e: merge_orders_with_status(log_widget, status_bar))
    root.bind('<F5>', lambda e: clean_cache(log_widget))
    root.bind('<Escape>', lambda e: root.quit() if messagebox.askyesno("确认退出", "确定要退出程序吗?") else None)
    root.bind('<F1>', lambda e: show_shortcuts_help())


def show_shortcuts_help():
    """Show the keyboard shortcuts help dialog."""
    help_dialog = tk.Toplevel()
    help_dialog.title("快捷键帮助")
    help_dialog.geometry("400x450")
    center_window(help_dialog)

    tk.Label(help_dialog, text="键盘快捷键", font=("Arial", 16, "bold")).pack(pady=10)

    help_text = tk.Text(help_dialog, wrap=tk.WORD, width=50, height=20)
    help_text.pack(padx=20, pady=10, fill=tk.BOTH, expand=True)

    shortcuts = """
    Ctrl+O: 处理单个图片
    Ctrl+E: 处理Excel文件
    Ctrl+B: OCR批量识别
    Ctrl+P: 完整处理流程
    Ctrl+M: 合并采购单
    F5: 清除处理缓存
    Esc: 退出程序
    """

    help_text.insert(tk.END, shortcuts)
    help_text.configure(state=tk.DISABLED)

    tk.Button(help_dialog, text="确定", command=help_dialog.destroy).pack(pady=10)

    help_dialog.lift()
    help_dialog.attributes('-topmost', True)
    help_dialog.after_idle(lambda: help_dialog.attributes('-topmost', False))
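Reviewer note: `bind_keyboard_shortcuts` repeats the same `root.bind(..., lambda e: ...)` shape eight times. One possible alternative is a table-driven binder, sketched below. The name `bind_shortcut_table` is hypothetical; the only Tk API it relies on is `bind(sequence, callback)`, which the module above already uses.

```python
from typing import Callable, Dict


def bind_shortcut_table(root, table: Dict[str, Callable[[], None]]) -> None:
    """Bind each Tk event sequence in `table` to its zero-argument action.

    Hypothetical helper: `root` is any object exposing Tk's bind(sequence, func).
    """
    for sequence, action in table.items():
        # Passing `action` as a default argument freezes its current value;
        # a bare closure would make every binding call the last action.
        root.bind(sequence, lambda event, action=action: action())
```

A call site could then look like `bind_shortcut_table(root, {'<F5>': lambda: clean_cache(log_widget)})`, and the same table could also be used to generate the help text so the two never drift apart.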
@@ -1,193 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Theme management module."""

import tkinter as tk
from tkinter import scrolledtext, ttk

# Private theme mode variable
_theme_mode = "light"

# Light and dark theme colors
THEMES = {
    "light": {
        "bg": "#f8f9fa",
        "fg": "#212529",
        "button_bg": "#ffffff",
        "button_fg": "#495057",
        "button_hover": "#e9ecef",
        "primary_bg": "#007bff",
        "primary_fg": "#ffffff",
        "secondary_bg": "#6c757d",
        "secondary_fg": "#ffffff",
        "log_bg": "#ffffff",
        "log_fg": "#212529",
        "highlight_bg": "#007bff",
        "highlight_fg": "#ffffff",
        "border": "#dee2e6",
        "success": "#28a745",
        "error": "#dc3545",
        "warning": "#ffc107",
        "info": "#17a2b8",
        "card_bg": "#ffffff",
        "shadow": "#00000010"
    },
    "dark": {
        "bg": "#1a1a1a",
        "fg": "#e9ecef",
        "button_bg": "#343a40",
        "button_fg": "#e9ecef",
        "button_hover": "#495057",
        "primary_bg": "#0d6efd",
        "primary_fg": "#ffffff",
        "secondary_bg": "#6c757d",
        "secondary_fg": "#ffffff",
        "log_bg": "#212529",
        "log_fg": "#e9ecef",
        "highlight_bg": "#0d6efd",
        "highlight_fg": "#ffffff",
        "border": "#495057",
        "success": "#198754",
        "error": "#dc3545",
        "warning": "#ffc107",
        "info": "#0dcaf0",
        "card_bg": "#2d3748",
        "shadow": "#00000030"
    }
}


def get_theme_mode() -> str:
    return _theme_mode


def set_theme_mode(mode: str):
    global _theme_mode
    _theme_mode = mode


def create_modern_button(parent, text, command, style="primary", width=None, height=None, px_width=None, px_height=None):
    """Create a modern-styled button wrapped in a one-pixel border frame."""
    theme = THEMES[_theme_mode]

    if style == "primary":
        bg_color = "white"
        fg_color = theme["primary_bg"]
        hover_color = "#f0f8ff"
        border_color = theme["primary_bg"]
    elif style == "secondary":
        bg_color = theme["secondary_bg"]
        fg_color = theme["secondary_fg"]
        hover_color = theme["button_hover"]
        border_color = theme["secondary_bg"]
    else:
        bg_color = "white"
        fg_color = theme["primary_bg"]
        hover_color = "#f0f8ff"
        border_color = theme["primary_bg"]

    button_frame = tk.Frame(parent, bg=border_color, highlightthickness=0)
    button_frame.configure(relief="flat", bd=0)
    if px_width or px_height:
        try:
            w = px_width if px_width else button_frame.winfo_reqwidth()
            h = px_height if px_height else 32
            button_frame.configure(width=w, height=h)
            button_frame.pack_propagate(False)
        except Exception:
            pass

    button = tk.Button(
        button_frame,
        text=text,
        command=command,
        bg=bg_color,
        fg=fg_color,
        font=("Microsoft YaHei UI", 8),
        relief="flat",
        bd=0,
        padx=14,
        pady=4,
        anchor="center",
        cursor="hand2",
        activebackground=hover_color,
        activeforeground=fg_color
    )

    if width:
        button.configure(width=width)
    else:
        button.configure(width=12)
    # The original repeated the height assignment in a second `if height:`
    # block; this single branch has the same effect.
    if height is not None:
        button.configure(height=height)
    else:
        button.configure(height=1)

    # Hover effect
    def on_enter(e):
        button.configure(bg=hover_color)

    def on_leave(e):
        button.configure(bg=bg_color)

    button.bind("<Enter>", on_enter)
    button.bind("<Leave>", on_leave)
    button_frame.bind("<Enter>", on_enter)
    button_frame.bind("<Leave>", on_leave)

    button.pack(fill=tk.BOTH, expand=True, padx=1, pady=1)
    return button_frame


def create_card_frame(parent, title=None):
    """Create a card-styled frame with an optional title."""
    theme = THEMES[_theme_mode]

    card = tk.Frame(
        parent,
        bg=theme["card_bg"],
        relief="flat",
        borderwidth=1,
        highlightbackground=theme["border"],
        highlightthickness=1
    )

    if title:
        title_label = tk.Label(
            card,
            text=title,
            bg=theme["card_bg"],
            fg=theme["fg"],
            font=("Microsoft YaHei UI", 10, "bold")
        )
        title_label.pack(pady=(6, 3))

    return card


def apply_theme(widget, theme_mode=None):
    """Apply the theme to a widget and, recursively, to its children."""
    if theme_mode is None:
        theme_mode = _theme_mode

    theme = THEMES[theme_mode]

    try:
        widget.configure(bg=theme["bg"], fg=theme["fg"])
    except Exception:
        pass

    for child in widget.winfo_children():
        if isinstance(child, tk.Button) and not isinstance(child, ttk.Button):
            child.configure(bg=theme["button_bg"], fg=theme["button_fg"])
        elif isinstance(child, scrolledtext.ScrolledText):
            child.configure(bg=theme["log_bg"], fg=theme["log_fg"])
        else:
            try:
                child.configure(bg=theme["bg"], fg=theme["fg"])
            except Exception:
                pass

        apply_theme(child, theme_mode)
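Reviewer note: `set_theme_mode` accepts any string, so a typo such as "Dark" only surfaces later as a `KeyError` inside `THEMES[_theme_mode]`. One possible guard is to validate at assignment time, sketched below. All names here (`_PALETTES`, `set_mode_checked`, `current_palette`) are hypothetical, and the palette is trimmed to two keys taken from the `THEMES` dictionary above.

```python
from typing import Dict

# Trimmed-down palette for illustration; the real THEMES dict has more keys.
_PALETTES: Dict[str, Dict[str, str]] = {
    "light": {"bg": "#f8f9fa", "fg": "#212529"},
    "dark": {"bg": "#1a1a1a", "fg": "#e9ecef"},
}
_current_mode = "light"


def set_mode_checked(mode: str) -> None:
    """Switch the active mode, rejecting names that have no palette."""
    global _current_mode
    if mode not in _PALETTES:
        raise ValueError(f"unknown theme mode: {mode!r}")
    _current_mode = mode


def current_palette() -> Dict[str, str]:
    """Return the colour table for the active mode."""
    return _PALETTES[_current_mode]
```

Failing at the setter keeps the error next to the call that caused it and leaves the previously active mode untouched.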
@@ -1,121 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""UI widgets module: StatusBar, ProgressReporter, collapsible frames, etc."""

import tkinter as tk
from tkinter import ttk

from .theme import THEMES, get_theme_mode


class StatusBar(tk.Frame):
    """Status bar showing the current system status and progress."""

    def __init__(self, master, **kwargs):
        super().__init__(master, **kwargs)
        self.configure(height=25, relief=tk.SUNKEN, borderwidth=1)

        self.status_label = tk.Label(self, text="就绪", anchor=tk.W, padx=5)
        self.status_label.pack(side=tk.LEFT, fill=tk.X, expand=True)

        self.progress = ttk.Progressbar(self, orient=tk.HORIZONTAL, length=200, mode='determinate')
        self.progress.pack(side=tk.RIGHT, padx=5, pady=2)

        self.progress.pack_forget()

    def set_status(self, text, progress=None):
        """Set the status bar text and, optionally, a 0-100 progress value."""
        self.status_label.config(text=text)

        if progress is not None and 0 <= progress <= 100:
            self.progress.pack(side=tk.RIGHT, padx=5, pady=2)
            # set_running(True) leaves the bar animating in indeterminate
            # mode; switch back so the explicit value is actually displayed.
            self.progress.stop()
            self.progress.config(mode='determinate', value=progress)
        else:
            self.progress.pack_forget()

    def set_running(self, is_running=True):
        """Toggle the running state."""
        theme = THEMES[get_theme_mode()]
        if is_running:
            self.status_label.config(text="处理中...", foreground=theme["info"])
            self.progress.pack(side=tk.RIGHT, padx=5, pady=2)
            self.progress.config(mode='indeterminate')
            self.progress.start()
        else:
            self.status_label.config(text="就绪", foreground=theme["fg"])
            self.progress.stop()
            self.progress.pack_forget()


class ProgressReporter:
    def __init__(self, status_bar: StatusBar):
        self.status_bar = status_bar

    def set(self, text: str, percent: int = None):
        try:
            if percent is not None:
                self.status_bar.set_status(text, percent)
            else:
                self.status_bar.set_status(text)
        except Exception:
            pass

    def running(self):
        try:
            self.status_bar.set_running(True)
        except Exception:
            pass

    def done(self):
        try:
            self.status_bar.set_running(False)
            self.status_bar.set_status("就绪")
        except Exception:
            pass


def create_collapsible_frame(parent, title, initial_state=True):
    """Create a collapsible panel; returns (content_frame, state_var)."""
    frame = tk.Frame(parent)
    frame.pack(fill=tk.X, pady=5)

    title_frame = tk.Frame(frame)
    title_frame.pack(fill=tk.X)

    state_var = tk.BooleanVar(value=initial_state)
    indicator = "▼" if initial_state else "►"
    state_label = tk.Label(title_frame, text=indicator, font=("Arial", 10, "bold"))
    state_label.pack(side=tk.LEFT, padx=5)

    title_label = tk.Label(title_frame, text=title, font=("Arial", 11, "bold"))
    title_label.pack(side=tk.LEFT, padx=5)

    content_frame = tk.Frame(frame)
    if initial_state:
        content_frame.pack(fill=tk.X, padx=20, pady=5)

    def toggle_collapse(event=None):
        current_state = state_var.get()
        new_state = not current_state
        state_var.set(new_state)
        state_label.config(text="▼" if new_state else "►")
        if new_state:
            content_frame.pack(fill=tk.X, padx=20, pady=5)
        else:
            content_frame.pack_forget()

    title_frame.bind("<Button-1>", toggle_collapse)
    state_label.bind("<Button-1>", toggle_collapse)
    title_label.bind("<Button-1>", toggle_collapse)

    return content_frame, state_var


def center_window(window):
    """Center the window on the screen."""
    window.update_idletasks()
    width = window.winfo_width()
    height = window.winfo_height()
    x = (window.winfo_screenwidth() // 2) - (width // 2)
    y = (window.winfo_screenheight() // 2) - (height // 2)
    window.geometry('{}x{}+{}+{}'.format(width, height, x, y))
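Reviewer note: the arithmetic in `center_window` can be checked without a display by separating it from the Tk calls. A minimal sketch follows; the function name `centered_geometry` is hypothetical, and the formula is the one used above.

```python
def centered_geometry(width: int, height: int,
                      screen_width: int, screen_height: int) -> str:
    """Build a Tk geometry string that centers a width x height window.

    Hypothetical helper mirroring center_window's integer arithmetic.
    """
    x = (screen_width // 2) - (width // 2)
    y = (screen_height // 2) - (height // 2)
    return f"{width}x{height}+{x}+{y}"
```

For the 450x320 preview window on a 1920x1080 screen this gives an offset of 735 pixels from the left and 380 from the top.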
@@ -1,130 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""User settings and recent files management module."""

import os
import json
import re
import tkinter as tk
from typing import Dict, List, Any

from app.core.utils.log_utils import get_logger
from app.config.settings import ConfigManager

logger = get_logger(__name__)

RECENT_LIST_WIDGET = None


def load_user_settings():
    """Load data/user_settings.json; return {} if it is missing or unreadable."""
    try:
        path = os.path.abspath(os.path.join('data', 'user_settings.json'))
        if os.path.exists(path):
            with open(path, 'r', encoding='utf-8') as f:
                return json.load(f)
    except Exception as e:
        logger.debug(f"加载用户设置失败: {e}")
    return {}


def save_user_settings(settings: Dict[str, Any]):
    """Write the settings to data/user_settings.json, creating data/ if needed."""
    try:
        os.makedirs('data', exist_ok=True)
        path = os.path.abspath(os.path.join('data', 'user_settings.json'))
        with open(path, 'w', encoding='utf-8') as f:
            json.dump(settings, f, ensure_ascii=False, indent=2)
    except Exception as e:
        logger.debug(f"保存用户设置失败: {e}")


def get_recent_files() -> List[str]:
    """Return up to 10 existing recent files, newest modification time first."""
    s = load_user_settings()
    items = s.get('recent_files', [])
    if not isinstance(items, list):
        return []

    def _allowed(p: str) -> bool:
        try:
            if not isinstance(p, str) or not os.path.isfile(p):
                return False
            ext = os.path.splitext(p)[1].lower()
            return ext in {'.xlsx', '.xls', '.jpg', '.jpeg', '.png', '.bmp'}
        except Exception:
            return False

    kept = [p for p in items if _allowed(p)]
    if not kept:
        # Nothing usable is stored: fall back to scanning the output folders.
        candidates = []
        cfg = ConfigManager()
        for d in [cfg.get_path('Paths', 'output_folder', fallback='data/output'), cfg.get_path('Paths', 'result_folder', fallback='data/result')]:
            try:
                if os.path.exists(d):
                    for name in os.listdir(d):
                        p = os.path.join(d, name)
                        if _allowed(p):
                            candidates.append(p)
            except Exception:
                pass
        if candidates:
            kept = candidates
    try:
        kept_sorted = sorted(kept, key=lambda p: os.path.getmtime(p), reverse=True)
    except Exception:
        kept_sorted = kept
    # List inequality already covers a difference in length.
    if kept_sorted != items:
        s['recent_files'] = kept_sorted[:20]
        save_user_settings(s)
    return kept_sorted[:10]


def refresh_recent_list_widget():
    try:
        global RECENT_LIST_WIDGET
        if RECENT_LIST_WIDGET is None:
            return
        RECENT_LIST_WIDGET.delete(0, tk.END)
        for i, p in enumerate(get_recent_files(), start=1):
            RECENT_LIST_WIDGET.insert(tk.END, f"{i}. {p}")
    except Exception as e:
        logger.debug(f"刷新最近文件列表失败: {e}")


def _extract_path_from_recent_item(s: str) -> str:
    try:
        m = re.match(r'^(\d+)\.\s+(.*)$', s)
        p = m.group(2) if m else s
        return p.strip().strip('"')
    except Exception:
        return s.strip().strip('"')


def add_recent_file(path: str) -> None:
    try:
        if not path:
            return
        try:
            if not os.path.isfile(path):
                return
            ext = os.path.splitext(path)[1].lower()
            if ext not in {'.xlsx', '.xls', '.jpg', '.jpeg', '.png', '.bmp'}:
                return
        except Exception:
            return
        s = load_user_settings()
        items = s.get('recent_files', [])
        items = [p for p in items if p != path]
        items.insert(0, path)
        s['recent_files'] = items[:20]
        save_user_settings(s)
        refresh_recent_list_widget()
    except Exception as e:
        logger.debug(f"添加最近文件失败: {e}")


def clear_recent_files():
    try:
        s = load_user_settings()
        s['recent_files'] = []
        save_user_settings(s)
    except Exception as e:
        logger.debug(f"清空最近文件失败: {e}")
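Reviewer note: the list update inside `add_recent_file` (drop any earlier copy of the path, put it first, cap the length) is independent of the JSON file and the list widget. A minimal sketch of that step as a pure function is shown below. The name `push_recent` is hypothetical; the default limit of 20 matches the slice used above.

```python
from typing import List


def push_recent(items: List[str], path: str, limit: int = 20) -> List[str]:
    """Return a new most-recently-used list with `path` moved to the front.

    Hypothetical helper: the input list is left unmodified.
    """
    deduped = [p for p in items if p != path]
    return ([path] + deduped)[:limit]
```

Returning a new list rather than mutating the argument keeps the stored settings unchanged until the caller explicitly saves them.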
File diff suppressed because it is too large
File diff suppressed because it is too large
Binary file not shown.
File diff suppressed because it is too large
Binary file not shown.
File diff suppressed because it is too large
Binary file not shown.
@@ -0,0 +1,316 @@

This file lists modules PyInstaller was not able to find. This does not
necessarily mean this module is required for running your program. Python and
Python 3rd-party packages include a lot of conditional or optional modules. For
example the module 'ntpath' only exists on Windows, whereas the module
'posixpath' only exists on Posix systems.

Types if import:
* top-level: imported at the top-level - look at these first
* conditional: imported within an if-statement
* delayed: imported within a function
* optional: imported within a try-except-statement

IMPORTANT: Do NOT post this list to the issue-tracker. Use it as a basis for
tracking down the missing module yourself. Thanks!

missing module named _posixshmem - imported by multiprocessing.resource_tracker (conditional), multiprocessing.shared_memory (conditional)
missing module named 'org.python' - imported by copy (optional), xml.sax (delayed, conditional)
missing module named _scproxy - imported by urllib.request (conditional)
missing module named termios - imported by getpass (optional), tty (top-level)
missing module named pwd - imported by posixpath (delayed, conditional), shutil (optional), tarfile (optional), pathlib (delayed, conditional, optional), subprocess (optional), netrc (delayed, conditional), getpass (delayed), http.server (delayed, optional), webbrowser (delayed)
missing module named 'java.lang' - imported by platform (delayed, optional), xml.sax._exceptions (conditional)
missing module named multiprocessing.BufferTooShort - imported by multiprocessing (top-level), multiprocessing.connection (top-level)
missing module named multiprocessing.AuthenticationError - imported by multiprocessing (top-level), multiprocessing.connection (top-level)
missing module named _posixsubprocess - imported by subprocess (optional), multiprocessing.util (delayed)
missing module named multiprocessing.get_context - imported by multiprocessing (top-level), multiprocessing.pool (top-level), multiprocessing.managers (top-level), multiprocessing.sharedctypes (top-level)
missing module named multiprocessing.TimeoutError - imported by multiprocessing (top-level), multiprocessing.pool (top-level)
missing module named org - imported by pickle (optional)
missing module named multiprocessing.set_start_method - imported by multiprocessing (top-level), multiprocessing.spawn (top-level)
missing module named multiprocessing.get_start_method - imported by multiprocessing (top-level), multiprocessing.spawn (top-level)
missing module named grp - imported by shutil (optional), tarfile (optional), pathlib (delayed, optional), subprocess (optional)
missing module named pep517 - imported by importlib.metadata (delayed)
missing module named posix - imported by os (conditional, optional), shutil (conditional), importlib._bootstrap_external (conditional)
missing module named resource - imported by posix (top-level)
excluded module named _frozen_importlib - imported by importlib (optional), importlib.abc (optional), zipimport (top-level)
missing module named _frozen_importlib_external - imported by importlib._bootstrap (delayed), importlib (optional), importlib.abc (optional), zipimport (top-level)
missing module named pyimod02_importers - imported by C:\Program Files\Python39\Lib\site-packages\PyInstaller\hooks\rthooks\pyi_rth_pkgutil.py (delayed)
missing module named simplejson - imported by requests.compat (conditional, optional)
missing module named dummy_threading - imported by requests.cookies (optional)
missing module named typing_extensions - imported by urllib3.util.retry (conditional), urllib3._collections (conditional), urllib3.util.ssltransport (conditional), urllib3.connectionpool (conditional), urllib3.poolmanager (conditional), urllib3.contrib.emscripten.fetch (conditional), charset_normalizer.legacy (conditional), pandas._typing (conditional)
missing module named zstandard - imported by urllib3.util.request (optional), urllib3.response (optional)
missing module named compression - imported by urllib3.util.request (optional), urllib3.response (optional)
missing module named 'h2.events' - imported by urllib3.http2.connection (top-level)
missing module named 'h2.connection' - imported by urllib3.http2.connection (top-level)
missing module named h2 - imported by urllib3.http2.connection (top-level)
missing module named brotli - imported by urllib3.util.request (optional), urllib3.response (optional)
missing module named brotlicffi - imported by urllib3.util.request (optional), urllib3.response (optional)
missing module named socks - imported by urllib3.contrib.socks (optional)
missing module named 'typing.io' - imported by importlib.resources (top-level)
missing module named cryptography - imported by urllib3.contrib.pyopenssl (top-level), requests (conditional, optional)
missing module named 'OpenSSL.crypto' - imported by urllib3.contrib.pyopenssl (delayed, conditional)
missing module named 'cryptography.x509' - imported by urllib3.contrib.pyopenssl (delayed, optional)
missing module named OpenSSL - imported by urllib3.contrib.pyopenssl (top-level)
missing module named chardet - imported by requests (optional)
missing module named 'pyodide.ffi' - imported by urllib3.contrib.emscripten.fetch (delayed, optional)
missing module named pyodide - imported by urllib3.contrib.emscripten.fetch (top-level)
missing module named js - imported by urllib3.contrib.emscripten.fetch (top-level)
missing module named cStringIO - imported by xlrd.timemachine (conditional)
missing module named PIL - imported by openpyxl.drawing.image (optional)
missing module named 'defusedxml.ElementTree' - imported by openpyxl.xml.functions (conditional)
missing module named 'lxml.etree' - imported by openpyxl.xml.functions (conditional), pandas.io.xml (delayed), pandas.io.formats.xml (delayed), pandas.io.html (delayed)
missing module named openpyxl.tests - imported by openpyxl.reader.excel (optional)
missing module named defusedxml - imported by openpyxl.xml (delayed, optional)
missing module named lxml - imported by openpyxl.xml (delayed, optional), pandas.io.xml (conditional)
missing module named _dummy_thread - imported by numpy._core.arrayprint (optional)
missing module named numpy._typing._ufunc - imported by numpy._typing (conditional)
missing module named 'numpy_distutils.cpuinfo' - imported by numpy.f2py.diagnose (delayed, conditional, optional)
missing module named 'numpy_distutils.fcompiler' - imported by numpy.f2py.diagnose (delayed, conditional, optional)
missing module named 'numpy_distutils.command' - imported by numpy.f2py.diagnose (delayed, conditional, optional)
missing module named numpy_distutils - imported by numpy.f2py.diagnose (delayed, optional)
missing module named psutil - imported by numpy.testing._private.utils (delayed, optional)
missing module named readline - imported by cmd (delayed, conditional, optional), code (delayed, conditional, optional), pdb (delayed, optional)
missing module named win32pdh - imported by numpy.testing._private.utils (delayed, conditional)
missing module named asyncio.DefaultEventLoopPolicy - imported by asyncio (delayed, conditional), asyncio.events (delayed, conditional)
missing module named threadpoolctl - imported by numpy.lib._utils_impl (delayed, optional)
missing module named numpy._core.zeros - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
missing module named numpy._core.vstack - imported by numpy._core (top-level), numpy.lib._shape_base_impl (top-level), numpy (conditional)
missing module named numpy._core.void - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.vecdot - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
missing module named numpy._core.ushort - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.unsignedinteger - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.ulonglong - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.ulong - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.uintp - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.uintc - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.uint64 - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.uint32 - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.uint16 - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.uint - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.ubyte - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.trunc - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.true_divide - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.transpose - imported by numpy._core (top-level), numpy.lib._function_base_impl (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
missing module named numpy._core.trace - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
missing module named numpy._core.timedelta64 - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.tensordot - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
missing module named numpy._core.tanh - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.tan - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.swapaxes - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
missing module named numpy._core.sum - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
missing module named numpy._core.subtract - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.str_ - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.square - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.sqrt - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional), numpy.fft._pocketfft (top-level)
missing module named numpy._core.spacing - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.sort - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
missing module named numpy._core.sinh - imported by numpy._core (conditional), numpy (conditional)
missing module named numpy._core.single - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
missing module named numpy._core.signedinteger - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.signbit - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy (conditional)
|
||||||
|
missing module named numpy._core.sign - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.short - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.rint - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.right_shift - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.result_type - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy (conditional), numpy.fft._pocketfft (top-level)
|
||||||
|
missing module named numpy._core.remainder - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.reciprocal - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional), numpy.fft._pocketfft (top-level)
|
||||||
|
missing module named numpy._core.radians - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.rad2deg - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.prod - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.power - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.positive - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.pi - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.outer - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.ones - imported by numpy._core (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.object_ - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (delayed), numpy (conditional)
|
||||||
|
missing module named numpy._core.number - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy (conditional)
|
||||||
|
missing module named numpy._core.not_equal - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.newaxis - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.negative - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.ndarray - imported by numpy._core (top-level), numpy.lib._utils_impl (top-level), numpy.testing._private.utils (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.multiply - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.moveaxis - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.modf - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.mod - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.minimum - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.maximum - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.max - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy (conditional)
|
||||||
|
missing module named numpy._core.matrix_transpose - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.matmul - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.longdouble - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.long - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.logical_xor - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.logical_or - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.logical_not - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.logical_and - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.logaddexp2 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.logaddexp - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.log2 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.log1p - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.log - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.linspace - imported by numpy._core (top-level), numpy.lib._index_tricks_impl (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.less_equal - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.less - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.left_shift - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.ldexp - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.lcm - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.isscalar - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy.lib._polynomial_impl (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.isnat - imported by numpy._core (top-level), numpy.testing._private.utils (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.isnan - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (delayed), numpy (conditional)
|
||||||
|
missing module named numpy._core.isfinite - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.intp - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.integer - imported by numpy._core (conditional), numpy (conditional), numpy.fft._helper (top-level)
|
||||||
|
missing module named numpy._core.intc - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.int8 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.int64 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.int32 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.int16 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.inf - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (delayed), numpy (conditional)
|
||||||
|
missing module named numpy._core.inexact - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.iinfo - imported by numpy._core (top-level), numpy.lib._twodim_base_impl (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.hypot - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.hstack - imported by numpy._core (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.heaviside - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.half - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.greater_equal - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.greater - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.gcd - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.frompyfunc - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.frexp - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.fmod - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.fmin - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.fmax - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.floor_divide - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.floor - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.floating - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.float_power - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.float32 - imported by numpy._core (top-level), numpy.testing._private.utils (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.float16 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.finfo - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.fabs - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.expm1 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.exp - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.euler_gamma - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.errstate - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (delayed), numpy (conditional)
|
||||||
|
missing module named numpy._core.equal - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.empty_like - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional), numpy.fft._pocketfft (top-level)
|
||||||
|
missing module named numpy._core.empty - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (top-level), numpy (conditional), numpy.fft._helper (top-level)
|
||||||
|
missing module named numpy._core.e - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.double - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.dot - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.divmod - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.divide - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.diagonal - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.degrees - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.deg2rad - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.datetime64 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.csingle - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.cross - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.count_nonzero - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.cosh - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.cos - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.copysign - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.conjugate - imported by numpy._core (conditional), numpy (conditional), numpy.fft._pocketfft (top-level)
|
||||||
|
missing module named numpy._core.conj - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.complexfloating - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.complex64 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.clongdouble - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.character - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.ceil - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.cdouble - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.cbrt - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.bytes_ - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.byte - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.bool_ - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.bitwise_xor - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.bitwise_or - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.bitwise_count - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.bitwise_and - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.atleast_3d - imported by numpy._core (top-level), numpy.lib._shape_base_impl (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.atleast_2d - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.atleast_1d - imported by numpy._core (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.asarray - imported by numpy._core (top-level), numpy.lib._array_utils_impl (top-level), numpy.linalg._linalg (top-level), numpy (conditional), numpy.fft._pocketfft (top-level), numpy.fft._helper (top-level)
|
||||||
|
missing module named numpy._core.asanyarray - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.array_repr - imported by numpy._core (top-level), numpy.testing._private.utils (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.array2string - imported by numpy._core (delayed), numpy.testing._private.utils (delayed), numpy (conditional)
|
||||||
|
missing module named numpy._core.array - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (top-level), numpy.lib._polynomial_impl (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.argsort - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.arctanh - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.arctan2 - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.arctan - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.arcsinh - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.arcsin - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.arccosh - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.arccos - imported by numpy._core (conditional), numpy (conditional)
|
||||||
|
missing module named numpy._core.arange - imported by numpy._core (top-level), numpy.testing._private.utils (top-level), numpy (conditional), numpy.fft._helper (top-level)
|
||||||
|
missing module named numpy._core.amin - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.amax - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named numpy._core.all - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy.testing._private.utils (delayed), numpy (conditional)
|
||||||
|
missing module named numpy._core.add - imported by numpy._core (top-level), numpy.linalg._linalg (top-level), numpy (conditional)
|
||||||
|
missing module named yaml - imported by numpy.__config__ (delayed)
|
||||||
|
missing module named numpy._distributor_init_local - imported by numpy (optional), numpy._distributor_init (optional)
|
||||||
|
missing module named vms_lib - imported by platform (delayed, optional)
|
||||||
|
missing module named java - imported by platform (delayed)
|
||||||
|
missing module named _winreg - imported by platform (delayed, optional)
|
||||||
|
missing module named six.moves.range - imported by six.moves (top-level), dateutil.rrule (top-level)
|
||||||
|
runtime module named six.moves - imported by dateutil.tz.tz (top-level), dateutil.tz._factories (top-level), dateutil.tz.win (top-level), dateutil.rrule (top-level)
|
||||||
|
missing module named dateutil.tz.tzfile - imported by dateutil.tz (top-level), dateutil.zoneinfo (top-level)
|
||||||
|
missing module named StringIO - imported by six (conditional), xlutils.compat (conditional)
|
||||||
|
missing module named numexpr - imported by pandas.core.computation.expressions (conditional), pandas.core.computation.engines (delayed)
|
||||||
|
missing module named numba - imported by pandas.core._numba.executor (delayed, conditional), pandas.core.util.numba_ (delayed, conditional), pandas.core.window.numba_ (delayed, conditional), pandas.core.window.online (delayed, conditional), pandas.core._numba.kernels.mean_ (top-level), pandas.core._numba.kernels.shared (top-level), pandas.core._numba.kernels.sum_ (top-level), pandas.core._numba.kernels.min_max_ (top-level), pandas.core._numba.kernels.var_ (top-level), pandas.core.groupby.numba_ (delayed, conditional), pandas.core._numba.extensions (top-level)
|
||||||
|
missing module named 'numba.extending' - imported by pandas.core._numba.kernels.sum_ (top-level)
|
||||||
|
missing module named 'pyarrow.compute' - imported by pandas.core.arrays._arrow_string_mixins (conditional), pandas.core.arrays.string_arrow (conditional), pandas.core.reshape.merge (delayed, conditional), pandas.core.arrays.arrow.array (conditional), pandas.core.arrays.arrow.accessors (conditional)
|
||||||
|
missing module named 'numba.typed' - imported by pandas.core._numba.extensions (delayed)
|
||||||
|
missing module named 'numba.core' - imported by pandas.core._numba.extensions (top-level)
|
||||||
|
missing module named pyarrow - imported by pandas.core.arrays._arrow_string_mixins (conditional), pandas.core.arrays.masked (delayed), pandas.core.arrays.boolean (delayed, conditional), pandas.core.arrays.numeric (delayed, conditional), pandas.core.arrays.arrow._arrow_utils (top-level), pandas.core.interchange.utils (delayed, conditional), pandas.core.strings.accessor (delayed, conditional), pandas.io._util (conditional), pandas.io.parsers.base_parser (delayed, conditional), pandas.core.arrays.interval (delayed), pandas.core.arrays.arrow.extension_types (top-level), pandas.core.arrays.period (delayed), pandas.core.methods.describe (delayed, conditional), pandas.io.sql (delayed, conditional), pandas.core.arrays.string_arrow (conditional), pandas.core.reshape.merge (delayed, conditional), pandas.core.arrays.arrow.array (conditional), pandas.core.interchange.buffer (conditional), pandas.io.feather_format (delayed), pandas.core.indexes.base (delayed, conditional), pandas.core.dtypes.cast (delayed, conditional), pandas.core.arrays.string_ (delayed, conditional), pandas.core.arrays.arrow.accessors (conditional), pandas.core.dtypes.dtypes (delayed, conditional), pandas.compat.pyarrow (optional), pandas.core.reshape.encoding (delayed, conditional), pandas._testing (conditional)
|
||||||
|
missing module named 'scipy.stats' - imported by pandas.core.nanops (delayed, conditional)
|
||||||
|
missing module named scipy - imported by pandas.core.dtypes.common (delayed, conditional, optional), pandas.core.missing (delayed)
|
||||||
|
missing module named traitlets - imported by pandas.io.formats.printing (delayed, conditional)
|
||||||
|
missing module named 'IPython.core' - imported by pandas.io.formats.printing (delayed, conditional)
|
||||||
|
missing module named IPython - imported by pandas.io.formats.printing (delayed)
|
||||||
|
missing module named xlsxwriter - imported by pandas.io.excel._xlsxwriter (delayed)
|
||||||
|
missing module named 'odf.config' - imported by pandas.io.excel._odswriter (delayed)
|
||||||
|
missing module named 'odf.style' - imported by pandas.io.excel._odswriter (delayed)
|
||||||
|
missing module named 'odf.text' - imported by pandas.io.excel._odfreader (delayed), pandas.io.excel._odswriter (delayed)
|
||||||
|
missing module named 'odf.table' - imported by pandas.io.excel._odfreader (delayed), pandas.io.excel._odswriter (delayed)
|
||||||
|
missing module named 'odf.opendocument' - imported by pandas.io.excel._odfreader (delayed), pandas.io.excel._odswriter (delayed)
|
||||||
|
missing module named pyxlsb - imported by pandas.io.excel._pyxlsb (delayed, conditional)
|
||||||
|
missing module named 'odf.office' - imported by pandas.io.excel._odfreader (delayed)
|
||||||
|
missing module named 'odf.element' - imported by pandas.io.excel._odfreader (delayed)
|
||||||
|
missing module named 'odf.namespaces' - imported by pandas.io.excel._odfreader (delayed)
|
||||||
|
missing module named odf - imported by pandas.io.excel._odfreader (conditional)
|
||||||
|
missing module named python_calamine - imported by pandas.io.excel._calamine (delayed, conditional)
|
||||||
|
missing module named 'matplotlib.pyplot' - imported by pandas.io.formats.style (optional)
|
||||||
|
missing module named matplotlib - imported by pandas.plotting._core (conditional), pandas.io.formats.style (optional)
|
||||||
|
missing module named 'matplotlib.colors' - imported by pandas.plotting._misc (conditional), pandas.io.formats.style (conditional)
|
||||||
|
missing module named markupsafe - imported by pandas.io.formats.style_render (top-level)
|
||||||
|
missing module named botocore - imported by pandas.io.common (delayed, conditional, optional)
|
||||||
|
missing module named sets - imported by pytz.tzinfo (optional)
|
||||||
|
missing module named collections.Mapping - imported by collections (optional), pytz.lazy (optional)
|
||||||
|
missing module named UserDict - imported by pytz.lazy (optional)
|
||||||
|
missing module named 'scipy.sparse' - imported by pandas.core.arrays.sparse.array (conditional), pandas.core.arrays.sparse.scipy_sparse (delayed, conditional), pandas.core.arrays.sparse.accessor (delayed)
|
||||||
|
missing module named pandas.core.internals.Block - imported by pandas.core.internals (conditional), pandas.io.pytables (conditional)
|
||||||
|
missing module named Foundation - imported by pandas.io.clipboard (delayed, conditional, optional)
|
||||||
|
missing module named AppKit - imported by pandas.io.clipboard (delayed, conditional, optional)
|
||||||
|
missing module named PyQt4 - imported by pandas.io.clipboard (delayed, conditional, optional)
|
||||||
|
missing module named qtpy - imported by pandas.io.clipboard (delayed, conditional, optional)
|
||||||
|
missing module named 'sqlalchemy.engine' - imported by pandas.io.sql (delayed)
|
||||||
|
missing module named 'sqlalchemy.types' - imported by pandas.io.sql (delayed, conditional)
|
||||||
|
missing module named 'sqlalchemy.schema' - imported by pandas.io.sql (delayed)
|
||||||
|
missing module named 'sqlalchemy.sql' - imported by pandas.io.sql (conditional)
|
||||||
|
missing module named sqlalchemy - imported by pandas.io.sql (delayed, conditional)
|
||||||
|
missing module named tables - imported by pandas.io.pytables (delayed, conditional)
|
||||||
|
missing module named 'pyarrow.fs' - imported by pandas.io.orc (conditional)
|
||||||
|
missing module named fsspec - imported by pandas.io.orc (conditional)
|
||||||
|
missing module named 'pyarrow.parquet' - imported by pandas.io.parquet (delayed)
|
||||||
|
missing module named google - imported by pandas.io.gbq (conditional)
|
||||||
|
missing module named 'lxml.html' - imported by pandas.io.html (delayed)
|
||||||
|
missing module named bs4 - imported by pandas.io.html (delayed)
|
||||||
|
missing module named pytest - imported by pandas._testing._io (delayed), pandas._testing (delayed)
|
||||||
|
missing module named 'matplotlib.axes' - imported by pandas.plotting._misc (conditional), pandas._testing.asserters (delayed)
|
||||||
|
missing module named 'matplotlib.artist' - imported by pandas._testing.asserters (delayed)
|
||||||
|
missing module named 'matplotlib.table' - imported by pandas.plotting._misc (conditional)
|
||||||
|
missing module named 'matplotlib.figure' - imported by pandas.plotting._misc (conditional)
|
||||||
|
missing module named errorhandler - imported by xlutils.filter (delayed)
|
||||||
|
missing module named guppy - imported by xlutils.filter (optional)
|
||||||
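The warn-file entries above are easier to triage when grouped by package. PyInstaller itself notes that a "missing module" entry does not necessarily mean the module is needed at run time, and many of the `numpy._core.*` names listed here look like attributes imported with `from numpy._core import ...` rather than real submodules. The following is a minimal sketch, not part of this repository: `summarize_missing` and `LINE_RE` are hypothetical names, and the regular expression assumes the line format shown above.

```python
import re
from collections import Counter

# Assumed line format, taken from the entries above:
#   missing module named numpy._core.add - imported by numpy._core (top-level), ...
#   missing module named 'numba.core' - imported by pandas.core._numba.extensions (top-level)
LINE_RE = re.compile(r"^missing module named '?([\w.]+)'? - imported by (.+)$")


def summarize_missing(lines):
    """Count 'missing module' entries per top-level package (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            # Keep only the first dotted component, e.g. 'numpy' for 'numpy._core.add'.
            counts[match.group(1).split(".")[0]] += 1
    return counts


sample = [
    "missing module named numpy._core.add - imported by numpy._core (top-level), numpy (conditional)",
    "missing module named 'numba.core' - imported by pandas.core._numba.extensions (top-level)",
    "missing module named yaml - imported by numpy.__config__ (delayed)",
    "|",  # table residue and other non-matching lines are ignored
]
print(summarize_missing(sample))
```

Packages with a high count and only `conditional` or `optional` importers are usually the first candidates to ignore; a missing module that the application imports directly deserves a closer look.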
File diff suppressed because it is too large
+2
-82
@@ -57,8 +57,6 @@ hidden_imports = [
|
|||||||
'xlwt',
|
'xlwt',
|
||||||
'xlutils',
|
'xlutils',
|
||||||
'requests',
|
'requests',
|
||||||
'dotenv',
|
|
||||||
'tkinterdnd2',
|
|
||||||
'configparser',
|
'configparser',
|
||||||
'threading',
|
'threading',
|
||||||
'datetime',
|
'datetime',
|
||||||
@@ -70,28 +68,8 @@ hidden_imports = [
|
|||||||
'app.services.ocr_service',
|
'app.services.ocr_service',
|
||||||
'app.services.order_service',
|
'app.services.order_service',
|
||||||
'app.services.tobacco_service',
|
'app.services.tobacco_service',
|
||||||
'app.services.processor_service',
|
|
||||||
'app.core.utils.dialog_utils',
|
'app.core.utils.dialog_utils',
|
||||||
'app.core.utils.file_utils',
|
|
||||||
'app.core.utils.log_utils',
|
|
||||||
'app.core.utils.string_utils',
|
|
||||||
'app.core.handlers.column_mapper',
|
|
||||||
'app.core.excel.converter',
|
'app.core.excel.converter',
|
||||||
'app.core.db.product_db',
|
|
||||||
'app.ui.error_utils',
|
|
||||||
'app.ui.theme',
|
|
||||||
'app.ui.logging_ui',
|
|
||||||
'app.ui.ui_widgets',
|
|
||||||
'app.ui.user_settings',
|
|
||||||
'app.ui.result_previews',
|
|
||||||
'app.ui.command_runner',
|
|
||||||
'app.ui.file_operations',
|
|
||||||
'app.ui.action_handlers',
|
|
||||||
'app.ui.barcode_editor',
|
|
||||||
'app.ui.config_dialog',
|
|
||||||
'app.ui.shortcuts',
|
|
||||||
'app.ui.main_window',
|
|
||||||
'app.ui.memory_editor',
|
|
||||||
]
|
]
|
||||||
|
|
||||||
a = Analysis(
|
a = Analysis(
|
||||||
@@ -143,40 +121,6 @@ def build_exe():
|
|||||||
"""Build the EXE file."""
|
"""Build the EXE file."""
|
||||||
print("开始构建EXE文件...")
|
print("开始构建EXE文件...")
|
||||||
try:
|
try:
|
||||||
# Inject version info into the root config.ini
|
|
||||||
try:
|
|
||||||
root_cfg = Path('config.ini')
|
|
||||||
from datetime import datetime
|
|
||||||
version_str = datetime.now().strftime('%Y.%m.%d.%H%M')
|
|
||||||
if root_cfg.exists():
|
|
||||||
lines = root_cfg.read_text(encoding='utf-8').splitlines()
|
|
||||||
has_app = any(l.strip().lower() == '[app]' for l in lines)
|
|
||||||
if not has_app:
|
|
||||||
lines.append('[App]')
|
|
||||||
lines.append(f'version = {version_str}')
|
|
||||||
else:
|
|
||||||
# Update or append the version entry
|
|
||||||
new_lines = []
|
|
||||||
in_app = False
|
|
||||||
app_written = False
|
|
||||||
for l in lines:
|
|
||||||
if l.strip().lower() == '[app]':
|
|
||||||
in_app = True
|
|
||||||
new_lines.append(l)
|
|
||||||
continue
|
|
||||||
if in_app and l.strip().lower().startswith('version'):
|
|
||||||
new_lines.append(f'version = {version_str}')
|
|
||||||
app_written = True
|
|
||||||
in_app = True
|
|
||||||
continue
|
|
||||||
new_lines.append(l)
|
|
||||||
if not app_written:
|
|
||||||
new_lines.append('version = ' + version_str)
|
|
||||||
lines = new_lines
|
|
||||||
root_cfg.write_text('\n'.join(lines), encoding='utf-8')
|
|
||||||
print(f"已写入版本号: {version_str}")
|
|
||||||
except Exception as e:
|
|
||||||
print(f"版本信息注入失败: {e}")
|
|
||||||
result = subprocess.run([
|
result = subprocess.run([
|
||||||
'pyinstaller',
|
'pyinstaller',
|
||||||
'OCR订单处理系统.spec'
|
'OCR订单处理系统.spec'
|
||||||
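Review note on the removed version-injection block: it edits `config.ini` line by line, never resets `in_app` when a later section starts, and appends `version = ...` to the end of the file when no version key was found, so the key can land under whichever section happens to be last. A minimal sketch of the same idea using the standard-library `configparser` follows. The function name `inject_version` is hypothetical and not part of this repository. Two trade-offs to be aware of: `configparser` drops comments when it rewrites a file, and it treats section names as case-sensitive, whereas the removed code matched `[app]` case-insensitively.

```python
import configparser
from pathlib import Path


def inject_version(cfg_path: Path, version: str) -> None:
    """Set [App] version in an INI file, creating the section if needed.

    Hypothetical helper: a sketch, not the project's implementation.
    """
    # interpolation=None so values containing '%' are read and written verbatim
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    if cfg_path.exists():
        parser.read(cfg_path, encoding="utf-8")
    if not parser.has_section("App"):
        parser.add_section("App")
    parser.set("App", "version", version)
    with cfg_path.open("w", encoding="utf-8") as fh:
        parser.write(fh)
```

A caller could pass `datetime.now().strftime('%Y.%m.%d.%H%M')` as `version`, which is the format the removed code generated.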
@@ -206,9 +150,6 @@ def build_exe():
|
|||||||
if root_config_file.exists():
|
if root_config_file.exists():
|
||||||
shutil.copy2(root_config_file, dist_dir)
|
shutil.copy2(root_config_file, dist_dir)
|
||||||
print(f"已复制根配置文件到dist: {root_config_file} -> {dist_dir}")
|
print(f"已复制根配置文件到dist: {root_config_file} -> {dist_dir}")
|
||||||
else:
|
|
||||||
print("警告: 根配置文件不存在,将创建缺省版本")
|
|
||||||
(dist_dir / 'config.ini').write_text('[App]\nversion = dev\n', encoding='utf-8')
|
|
||||||
|
|
||||||
except subprocess.CalledProcessError as e:
|
except subprocess.CalledProcessError as e:
|
||||||
print(f"构建失败: {e}")
|
print(f"构建失败: {e}")
|
||||||
@@ -223,18 +164,8 @@ def create_portable_package():
|
|||||||
# Create the release directory
|
# Create the release directory
|
||||||
release_dir = Path('release')
|
release_dir = Path('release')
|
||||||
if release_dir.exists():
|
if release_dir.exists():
|
||||||
try:
|
shutil.rmtree(release_dir)
|
||||||
shutil.rmtree(release_dir)
|
release_dir.mkdir()
|
||||||
except Exception as e:
|
|
||||||
print(f"警告: 无法完全清理发布目录 (可能文件被占用): {e}")
|
|
||||||
# If the directory still exists, clean up whatever can be removed
|
|
||||||
for item in release_dir.iterdir():
|
|
||||||
try:
|
|
||||||
if item.is_dir(): shutil.rmtree(item)
|
|
||||||
else: item.unlink()
|
|
||||||
except Exception: pass
|
|
||||||
|
|
||||||
release_dir.mkdir(exist_ok=True)
|
|
||||||
|
|
||||||
# Copy the exe file
|
# Copy the exe file
|
||||||
exe_file = Path('dist/OCR订单处理系统.exe')
|
exe_file = Path('dist/OCR订单处理系统.exe')
|
||||||
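Review note on the release-directory hunk: one side of the diff calls `shutil.rmtree(release_dir)` followed by `release_dir.mkdir()`, which raises if any file in `release/` is locked (for example, an `.exe` that is still running on Windows). The other side tolerates that case but silently swallows per-item errors. A possible middle ground, sketched below, empties the directory item by item and reports what could not be removed. `reset_dir` is a hypothetical name, not a function in this repository.

```python
import shutil
from pathlib import Path
from typing import List


def reset_dir(path: Path) -> List[Path]:
    """Empty `path` (creating it if missing); return entries that could not be removed."""
    leftovers: List[Path] = []
    path.mkdir(parents=True, exist_ok=True)
    for item in path.iterdir():
        try:
            if item.is_dir() and not item.is_symlink():
                shutil.rmtree(item)
            else:
                item.unlink()
        except OSError:
            # Typically a file held open by another process; keep going.
            leftovers.append(item)
    return leftovers
```

The caller can then print a warning listing `leftovers` instead of either aborting the build or hiding the failure.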
@@ -279,17 +210,6 @@ def create_portable_package():
|
|||||||
print(f"已复制模板文件: {template_file} -> {release_dir / 'templates'}")
|
print(f"已复制模板文件: {template_file} -> {release_dir / 'templates'}")
|
||||||
else:
|
else:
|
||||||
print(f"警告: 模板文件不存在: {template_file}")
|
print(f"警告: 模板文件不存在: {template_file}")
|
||||||
item_file = Path('templates/商品资料.xlsx')
|
|
||||||
if item_file.exists():
|
|
||||||
try:
|
|
||||||
(Path('dist') / 'templates').mkdir(exist_ok=True)
|
|
||||||
shutil.copy2(item_file, Path('dist') / 'templates')
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
shutil.copy2(item_file, release_dir / 'templates')
|
|
||||||
print(f"已复制商品资料: {item_file} -> {release_dir / 'templates'}")
|
|
||||||
else:
|
|
||||||
print(f"警告: 商品资料文件不存在: {item_file}")
|
|
||||||
|
|
||||||
# Create the README file
|
# Create the README file
|
||||||
readme_content = '''
|
readme_content = '''
|
||||||
|
|||||||
@@ -0,0 +1,88 @@
|
|||||||
|
#!/usr/bin/env python
|
||||||
|
# -*- coding: utf-8 -*-
|
||||||
|
|
||||||
|
"""
|
||||||
|
Cleanup script - deletes unneeded files and logs
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
import shutil
|
||||||
|
import glob
|
||||||
|
|
||||||
|
def clean_logs():
|
||||||
|
"""Clean up log files."""
|
||||||
|
print("清理日志文件...")
|
||||||
|
|
||||||
|
# Delete .active files
|
||||||
|
active_files = glob.glob("logs/*.active")
|
||||||
|
for file in active_files:
|
||||||
|
try:
|
||||||
|
os.remove(file)
|
||||||
|
print(f"已删除: {file}")
|
||||||
|
except Exception as e:
|
||||||
|
print(f"删除文件时出错 {file}: {e}")
|
||||||
|
|
||||||
|
# Keep the current logs; delete rotated backups (*.log.*)
|
||||||
|
log_files = glob.glob("logs/*.log.*")
|
||||||
|
for file in log_files:
|
||||||
|
try:
|
||||||
|
os.remove(file)
|
||||||
|
print(f"已删除: {file}")
|
||||||
|
except Exception as e:
|
||||||
|
print(f"删除文件时出错 {file}: {e}")
|
||||||
|
|
||||||
|
def clean_temp_files():
|
||||||
|
"""Clean up temporary files."""
|
||||||
|
print("清理临时文件...")
|
||||||
|
|
||||||
|
# Empty the temp directory
|
||||||
|
temp_dir = "data/temp"
|
||||||
|
if os.path.exists(temp_dir):
|
||||||
|
for file in os.listdir(temp_dir):
|
||||||
|
file_path = os.path.join(temp_dir, file)
|
||||||
|
try:
|
||||||
|
if os.path.isfile(file_path):
|
||||||
|
os.remove(file_path)
|
||||||
|
print(f"已删除: {file_path}")
|
||||||
|
elif os.path.isdir(file_path):
|
||||||
|
shutil.rmtree(file_path)
|
||||||
|
print(f"已删除目录: {file_path}")
|
||||||
|
except Exception as e:
|
||||||
|
print(f"删除文件时出错 {file_path}: {e}")
|
||||||
|
|
||||||
|
# Delete backup files
|
||||||
|
backup_files = glob.glob("data/*.bak") + glob.glob("config/*.bak")
|
||||||
|
for file in backup_files:
|
||||||
|
try:
|
||||||
|
os.remove(file)
|
||||||
|
print(f"已删除: {file}")
|
||||||
|
except Exception as e:
|
||||||
|
print(f"删除文件时出错 {file}: {e}")
|
||||||
|
|
||||||
|
def clean_pycache():
|
||||||
|
"""Clean up Python cache files."""
|
||||||
|
print("清理Python缓存文件...")
|
||||||
|
|
||||||
|
# Find and delete every __pycache__ directory
|
||||||
|
for root, dirs, files in os.walk("."):
|
||||||
|
for dir in dirs:
|
||||||
|
if dir == "__pycache__":
|
||||||
|
cache_dir = os.path.join(root, dir)
|
||||||
|
try:
|
||||||
|
shutil.rmtree(cache_dir)
|
||||||
|
print(f"已删除目录: {cache_dir}")
|
||||||
|
except Exception as e:
|
||||||
|
print(f"删除目录时出错 {cache_dir}: {e}")
|
||||||
|
|
||||||
|
def main():
|
||||||
|
"""Main entry point."""
|
||||||
|
print("开始清理无关文件...")
|
||||||
|
|
||||||
|
clean_logs()
|
||||||
|
clean_temp_files()
|
||||||
|
clean_pycache()
|
||||||
|
|
||||||
|
print("清理完成!")
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
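Review note on the cleanup script above: it deletes as it goes, so there is no way to preview what a run will remove. The sketch below lists the same targets (`logs/*.active`, `logs/*.log.*`, the contents of `data/temp`, `data/*.bak`, `config/*.bak`, and every `__pycache__` directory) without deleting anything. The name `iter_cleanup_targets` and the `root` parameter are assumptions for illustration.

```python
from pathlib import Path
from typing import Iterator, List


def _glob(directory: Path, pattern: str) -> List[Path]:
    """Glob inside `directory`, returning nothing if it does not exist."""
    return sorted(directory.glob(pattern)) if directory.is_dir() else []


def iter_cleanup_targets(root: Path) -> Iterator[Path]:
    """Yield the paths the cleanup script would delete, without deleting them."""
    yield from _glob(root / "logs", "*.active")
    yield from _glob(root / "logs", "*.log.*")
    yield from _glob(root / "data" / "temp", "*")
    yield from _glob(root / "data", "*.bak")
    yield from _glob(root / "config", "*.bak")
    # rglob walks the whole tree, like the os.walk loop in the script
    yield from sorted(root.rglob("__pycache__"))
```

Printing the result of `iter_cleanup_targets(Path("."))` before deleting gives a simple dry-run mode.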
+3
-23
@@ -1,23 +1,17 @@
|
|||||||
[API]
|
[API]
|
||||||
api_key =
|
api_key = O0Fgk3o69RWJ86eAX8BTHRaB
|
||||||
secret_key =
|
secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
|
||||||
timeout = 30
|
timeout = 30
|
||||||
max_retries = 3
|
max_retries = 3
|
||||||
retry_delay = 2
|
retry_delay = 2
|
||||||
api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
|
api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
|
||||||
token_url = https://aip.baidubce.com/oauth/2.0/token
|
|
||||||
form_ocr_url = https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/get_request_result
|
|
||||||
|
|
||||||
[Paths]
|
[Paths]
|
||||||
input_folder = data/input
|
input_folder = data/input
|
||||||
output_folder = data/output
|
output_folder = data/output
|
||||||
temp_folder = data/temp
|
temp_folder = data/temp
|
||||||
template_folder = templates
|
template_folder = templates
|
||||||
template_file = templates\银豹-采购单模板.xls
|
|
||||||
processed_record = data/processed_files.json
|
processed_record = data/processed_files.json
|
||||||
data_dir = data
|
|
||||||
product_db = data/product_cache.db
|
|
||||||
result_folder = data/result
|
|
||||||
|
|
||||||
[Performance]
|
[Performance]
|
||||||
max_workers = 4
|
max_workers = 4
|
||||||
@@ -27,22 +21,8 @@ skip_existing = true
|
|||||||
[File]
|
[File]
|
||||||
allowed_extensions = .jpg,.jpeg,.png,.bmp
|
allowed_extensions = .jpg,.jpeg,.png,.bmp
|
||||||
excel_extension = .xlsx
|
excel_extension = .xlsx
|
||||||
max_file_size_mb = 5
|
max_file_size_mb = 4
|
||||||
|
|
||||||
[Templates]
|
[Templates]
|
||||||
purchase_order = 银豹-采购单模板.xls
|
purchase_order = 银豹-采购单模板.xls
|
||||||
item_data = 商品资料.xlsx
|
|
||||||
|
|
||||||
[App]
|
|
||||||
version = 2026.05.05.0239
|
|
||||||
|
|
||||||
[Gitea]
|
|
||||||
base_url = https://gitea.94kan.cn
|
|
||||||
owner = houhuan
|
|
||||||
repo = yixuan-sync-data
|
|
||||||
token = 50b61e43a141d606ae2529cd1755bc666d800e08
|
|
||||||
|
|
||||||
[WebAuth]
|
|
||||||
username = admin
|
|
||||||
password_hash = $2b$12$nllT8o1QIMfWKuTlpQI3G./E2NS.gqf0EHZyNkJ8gMpVa9grTXRoC
|
|
||||||
|
|
||||||
|
|||||||
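Review note on the `config.ini` hunk above: one side of the diff leaves `api_key` and `secret_key` empty while the other carries literal values. If the empty form is the intended one, the credentials have to come from somewhere else at runtime. The sketch below prefers the environment variables `BAIDU_API_KEY` and `BAIDU_SECRET_KEY` (the names used in the `.env` example elsewhere in this comparison) and falls back to the `[API]` section. `resolve_credentials` is a hypothetical helper; how the application actually loads its keys is not shown in this diff.

```python
import configparser
import os
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    parser: configparser.ConfigParser,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (api_key, secret_key), preferring environment variables over config.ini."""
    env = os.environ if env is None else env
    api_key = env.get("BAIDU_API_KEY") or parser.get("API", "api_key", fallback="")
    secret_key = env.get("BAIDU_SECRET_KEY") or parser.get("API", "secret_key", fallback="")
    if not api_key or not secret_key:
        # Fail early rather than sending an unauthenticated OCR request
        raise RuntimeError("Baidu OCR credentials are not configured")
    return api_key, secret_key
```

Passing `env` explicitly keeps the function easy to test; production code would call it with the default.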
@@ -179,62 +179,6 @@
|
|||||||
"map_to": "69021343",
|
"map_to": "69021343",
|
||||||
"description": "条码映射:6923450653012 -> 69021343"
|
"description": "条码映射:6923450653012 -> 69021343"
|
||||||
},
|
},
|
||||||
"6923644295844": {
|
|
||||||
"map_to": "6923644285036",
|
|
||||||
"description": "条码映射:6923644295844 -> 6923644285036"
|
|
||||||
},
|
|
||||||
"6907992513157": {
|
|
||||||
"map_to": "6907992513195",
|
|
||||||
"description": "条码映射:6907992513157 -> 6907992513195"
|
|
||||||
},
|
|
||||||
"6902083893842": {
|
|
||||||
"map_to": "6902083907150",
|
|
||||||
"description": "条码映射:6902083893842 -> 6902083907150"
|
|
||||||
},
|
|
||||||
"6902083904685": {
|
|
||||||
"map_to": "6902083905217",
|
|
||||||
"description": "条码映射:6902083904685 -> 6902083905217"
|
|
||||||
},
|
|
||||||
"6917878036849": {
|
|
||||||
"map_to": "6917878036847",
|
|
||||||
"description": "条码映射:6917878036849 -> 6917878036847"
|
|
||||||
},
|
|
||||||
"6903979000078": {
|
|
||||||
"map_to": "6903979000061",
|
|
||||||
"description": "条码映射:6903979000078 -> 6903979000061"
|
|
||||||
},
|
|
||||||
"6937003706353": {
|
|
||||||
"map_to": "6937003706360",
|
|
||||||
"description": "条码映射:6937003706353 -> 6937003706360"
|
|
||||||
},
|
|
||||||
"6923644242961": {
|
|
||||||
"map_to": "6907992100043",
|
|
||||||
"description": "条码映射:6923644242961 -> 6907992100043"
|
|
||||||
},
|
|
||||||
"6923644258382": {
|
|
||||||
"map_to": "6923644252823",
|
|
||||||
"description": "条码映射:6923644258382 -> 6923644252823"
|
|
||||||
},
|
|
||||||
"6923450657430": {
|
|
||||||
"map_to": "69029110",
|
|
||||||
"description": "条码映射:6923450657430 -> 69029110"
|
|
||||||
},
|
|
||||||
"6923450660232": {
|
|
||||||
"map_to": "6923450690123",
|
|
||||||
"description": "条码映射:6923450660232 -> 6923450690123"
|
|
||||||
},
|
|
||||||
"6923450657614": {
|
|
||||||
"map_to": "6923450657607",
|
|
||||||
"description": "条码映射:6923450657614 -> 6923450657607"
|
|
||||||
},
|
|
||||||
"6972556000022": {
|
|
||||||
"map_to": "6977826050028",
|
|
||||||
"description": "条码映射:6972556000022 -> 6977826050028"
|
|
||||||
},
|
|
||||||
"6949352266280": {
|
|
||||||
"map_to": "6949352266273",
|
|
||||||
"description": "条码映射:6949352266280 -> 6949352266273"
|
|
||||||
},
|
|
||||||
"6925019900087": {
|
"6925019900087": {
|
||||||
"multiplier": 10,
|
"multiplier": 10,
|
||||||
"target_unit": "瓶",
|
"target_unit": "瓶",
|
||||||
@@ -257,17 +201,5 @@
|
|||||||
"target_unit": "个",
|
"target_unit": "个",
|
||||||
"specification": "1*14",
|
"specification": "1*14",
|
||||||
"description": "友臣肉松,1盒14个"
|
"description": "友臣肉松,1盒14个"
|
||||||
},
|
|
||||||
"6921734933485": {
|
|
||||||
"multiplier": 12,
|
|
||||||
"target_unit": "支",
|
|
||||||
"specification": "1*12",
|
|
||||||
"description": "得力铅笔"
|
|
||||||
},
|
|
||||||
"6901826888244": {
|
|
||||||
"multiplier": 30,
|
|
||||||
"target_unit": "对",
|
|
||||||
"specification": "1*30",
|
|
||||||
"description": "南孚电池"
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
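Review note on the barcode mapping file above: entries come in two shapes, a plain `map_to` redirect and a unit conversion with `multiplier`, `target_unit`, and optionally `fixed_price` and `specification`. The sketch below shows one way such a rule could be applied to an order line. The semantics are assumptions inferred from the `description` strings, not taken from the application code: in particular, dividing the unit price by the multiplier when no `fixed_price` is given, and not following a `map_to` target to a second rule, are guesses.

```python
from typing import Any, Dict, Tuple

Rule = Dict[str, Any]


def apply_barcode_rule(
    rules: Dict[str, Rule],
    barcode: str,
    quantity: float,
    unit: str,
    unit_price: float,
) -> Tuple[str, float, str, float]:
    """Apply a single mapping rule; return (barcode, quantity, unit, unit_price)."""
    rule = rules.get(barcode, {})
    if "map_to" in rule:
        barcode = rule["map_to"]  # assumption: redirects are not chained
    multiplier = rule.get("multiplier")
    if multiplier:
        quantity = quantity * multiplier
        # assumption: keep the line total constant unless a fixed price is given
        unit_price = rule.get("fixed_price", unit_price / multiplier)
        unit = rule.get("target_unit", unit)
    return barcode, quantity, unit, unit_price
```

With the `6925019900087` entry, for example, an order line of 2 units would become 20 units of `瓶` under these assumptions.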
+2
-15
@@ -1,23 +1,17 @@
|
|||||||
[API]
|
[API]
|
||||||
api_key =
|
api_key = O0Fgk3o69RWJ86eAX8BTHRaB
|
||||||
secret_key =
|
secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
|
||||||
timeout = 30
|
timeout = 30
|
||||||
max_retries = 3
|
max_retries = 3
|
||||||
retry_delay = 2
|
retry_delay = 2
|
||||||
api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
|
api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
|
||||||
token_url = https://aip.baidubce.com/oauth/2.0/token
|
|
||||||
form_ocr_url = https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/get_request_result
|
|
||||||
|
|
||||||
[Paths]
|
[Paths]
|
||||||
input_folder = data/input
|
input_folder = data/input
|
||||||
output_folder = data/output
|
output_folder = data/output
|
||||||
result_folder = data/result
|
|
||||||
temp_folder = data/temp
|
temp_folder = data/temp
|
||||||
template_folder = templates
|
template_folder = templates
|
||||||
template_file = 银豹-采购单模板.xls
|
|
||||||
processed_record = data/processed_files.json
|
processed_record = data/processed_files.json
|
||||||
data_dir = data
|
|
||||||
product_db = data/product_cache.db
|
|
||||||
|
|
||||||
[Performance]
|
[Performance]
|
||||||
max_workers = 4
|
max_workers = 4
|
||||||
@@ -31,11 +25,4 @@ max_file_size_mb = 4
|
|||||||
|
|
||||||
[Templates]
|
[Templates]
|
||||||
purchase_order = 银豹-采购单模板.xls
|
purchase_order = 银豹-采购单模板.xls
|
||||||
item_data = 商品资料.xlsx
|
|
||||||
|
|
||||||
[Gitea]
|
|
||||||
base_url = https://gitea.94kan.cn
|
|
||||||
owner = houhuan
|
|
||||||
repo = yixuan-sync-data
|
|
||||||
token =
|
|
||||||
|
|
||||||
|
|||||||
@@ -1,237 +0,0 @@
|
|||||||
{
|
|
||||||
"suppliers": [
|
|
||||||
{
|
|
||||||
"name": "蓉城易购",
|
|
||||||
"description": "蓉城易购供应商订单处理",
|
|
||||||
"filename_patterns": [
|
|
||||||
"*蓉城*",
|
|
||||||
"*rongcheng*",
|
|
||||||
"*易*"
|
|
||||||
],
|
|
||||||
"content_indicators": [
|
|
||||||
"蓉城易购",
|
|
||||||
"商品编码",
|
|
||||||
"订货数量"
|
|
||||||
],
|
|
||||||
"column_mapping": {
|
|
||||||
"商品条码(小条码)": "barcode",
|
|
||||||
"商品名称": "name",
|
|
||||||
"规格": "specification",
|
|
||||||
"订购数量(小单位)": "quantity",
|
|
||||||
"单位": "unit",
|
|
||||||
"单价(小单位)": "unit_price",
|
|
||||||
"优惠后金额(小单位)": "total_price",
|
|
||||||
"备注": "category",
|
|
||||||
"行号": "supplier"
|
|
||||||
},
|
|
||||||
"cleaning_rules": [
|
|
||||||
{
|
|
||||||
"type": "remove_rows",
|
|
||||||
"condition": "订货数量 == 0 or 订货数量.isna()"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "fill_na",
|
|
||||||
"columns": [
|
|
||||||
"unit_price"
|
|
||||||
],
|
|
||||||
"value": 0
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"calculations": [
|
|
||||||
{
|
|
||||||
"type": "multiply",
|
|
||||||
"source_column": "quantity",
|
|
||||||
"target_column": "quantity",
|
|
||||||
"factor": 1
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"output_suffix": "_蓉城易购_银豹采购单",
|
|
||||||
"header_row": 2,
|
|
||||||
"rules": [
|
|
||||||
{
|
|
||||||
"type": "split_quantity_unit",
|
|
||||||
"source": "订购数量(小单位)"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "extract_spec_from_name",
|
|
||||||
"source": "商品名称"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "normalize_unit",
|
|
||||||
"target": "unit",
|
|
||||||
"map": {
|
|
||||||
"箱": "件",
|
|
||||||
"提": "件",
|
|
||||||
"盒": "件"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "compute_quantity_from_total"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "mark_gift"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "fill_missing",
|
|
||||||
"fills": {
|
|
||||||
"unit": "瓶"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"output_templates": [
|
|
||||||
"templates/银豹-采购单模板.xls"
|
|
||||||
],
|
|
||||||
"current_template_index": 0
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "通用食品供应商",
|
|
||||||
"description": "通用食品类供应商订单",
|
|
||||||
"filename_patterns": [
|
|
||||||
"*食品*",
|
|
||||||
"*配送*",
|
|
||||||
"*供货*"
|
|
||||||
],
|
|
||||||
"content_indicators": [
|
|
||||||
"产品条码",
|
|
||||||
"订购量",
|
|
||||||
"进货价"
|
|
||||||
],
|
|
||||||
"column_mapping": {
|
|
||||||
"产品条码": "barcode",
|
|
||||||
"产品名称": "name",
|
|
||||||
"订购量": "quantity",
|
|
||||||
"进货价": "unit_price"
|
|
||||||
},
|
|
||||||
"cleaning_rules": [
|
|
||||||
{
|
|
||||||
"type": "convert_type",
|
|
||||||
"columns": [
|
|
||||||
"unit_price"
|
|
||||||
],
|
|
||||||
"target_type": "float"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "fill_na",
|
|
||||||
"columns": [
|
|
||||||
"barcode",
|
|
||||||
"name",
|
|
||||||
"quantity"
|
|
||||||
],
|
|
||||||
"value": 0
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"output_suffix": "_食品供应商_银豹采购单",
|
|
||||||
"rules": [
|
|
||||||
{
|
|
||||||
"type": "split_quantity_unit",
|
|
||||||
"source": "订购量"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "extract_spec_from_name",
|
|
||||||
"source": "产品名称"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "normalize_unit",
|
|
||||||
"target": "unit",
|
|
||||||
"map": {
|
|
||||||
"箱": "件",
|
|
||||||
"提": "件",
|
|
||||||
"盒": "件"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "compute_quantity_from_total"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "mark_gift"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "fill_missing",
|
|
||||||
"fills": {
|
|
||||||
"unit": "瓶"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"output_templates": [
|
|
||||||
"templates/银豹-采购单模板.xls"
|
|
||||||
],
|
|
||||||
"current_template_index": 0
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "农夫山泉",
|
|
||||||
"description": "",
|
|
||||||
"filename_patterns": [],
|
|
||||||
"content_indicators": [],
|
|
||||||
"column_mapping": {
|
|
||||||
"条形码": "barcode",
|
|
||||||
"商品名称": "name",
|
|
||||||
"销售价": "unit_price",
|
|
||||||
"订单金额": "total_price",
|
|
||||||
"Unnamed: 0": "supplier",
|
|
||||||
"备注": "brand"
|
|
||||||
},
|
|
||||||
"header_row": 0,
|
|
||||||
"rules": [
|
|
||||||
{
|
|
||||||
"type": "split_quantity_unit",
|
|
||||||
"source": "订单数量"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "extract_spec_from_name",
|
|
||||||
"source": "name"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "normalize_unit",
|
|
||||||
"target": "unit",
|
|
||||||
"map": {
|
|
||||||
"箱": "件",
|
|
||||||
"提": "件",
|
|
||||||
"盒": "件"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "compute_quantity_from_total"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "mark_gift"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"type": "fill_missing",
|
|
||||||
"fills": {
|
|
||||||
"unit": "瓶"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"dictionary": {
|
|
||||||
"ignore_words": [
|
|
||||||
"白膜",
|
|
||||||
"彩膜",
|
|
||||||
"赠品"
|
|
||||||
],
|
|
||||||
"unit_synonyms": {
|
|
||||||
"箱": "件",
|
|
||||||
"提": "件",
|
|
||||||
"盒": "件",
|
|
||||||
"瓶": "瓶"
|
|
||||||
},
|
|
||||||
"pack_multipliers": {
|
|
||||||
"件": 24,
|
|
||||||
"箱": 24,
|
|
||||||
"提": 12,
|
|
||||||
"盒": 10
|
|
||||||
},
|
|
||||||
"name_patterns": [
|
|
||||||
"(\\d+(?:\\.\\d+)?)(ml|mL|ML|l|L|升|毫升)[*×xX](\\d+)",
|
|
||||||
"(\\d+)[*×xX](\\d+)瓶",
|
|
||||||
"(\\d{2,3}).*?(\\d{1,3})"
|
|
||||||
],
|
|
||||||
"default_unit": "瓶",
|
|
||||||
"default_package_quantity": 1
|
|
||||||
},
|
|
||||||
"output_templates": [
|
|
||||||
"templates/银豹-采购单模板.xls"
|
|
||||||
],
|
|
||||||
"current_template_index": 0
|
|
||||||
}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
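Review note on the supplier configuration above: each supplier lists `filename_patterns` such as `*蓉城*` or `*rongcheng*`. The sketch below shows how these could be used to pick a supplier from a file name. It assumes the patterns are shell-style globs, that matching is case-insensitive, and that the first matching supplier wins; none of this is confirmed by the diff, and `match_supplier` is a hypothetical name. Note that the third supplier has an empty pattern list and would never match this way.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, Iterable, Optional


def match_supplier(filename: str, suppliers: Iterable[Dict[str, Any]]) -> Optional[str]:
    """Return the name of the first supplier whose filename pattern matches."""
    lowered = filename.lower()
    for supplier in suppliers:
        for pattern in supplier.get("filename_patterns", []):
            # Lowercase both sides so matching does not depend on the OS
            if fnmatchcase(lowered, pattern.lower()):
                return supplier["name"]
    return None
```

A broad pattern such as `*易*` matches any file name containing that character, so the order of the `suppliers` list matters under a first-match rule.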
Binary file not shown.
|
After Width: | Height: | Size: 173 KiB |
Binary file not shown.
@@ -0,0 +1,3 @@
|
|||||||
|
{
|
||||||
|
"data/output\\7a3a78a02fcf6ccef5daad31bd50bdf2.xlsx": "data/result\\采购单_7a3a78a02fcf6ccef5daad31bd50bdf2.xls"
|
||||||
|
}
|
||||||
Binary file not shown.
@@ -0,0 +1 @@
|
|||||||
|
{"theme": "light"}
|
||||||
Vendored
BIN
Binary file not shown.
Vendored
+28
@@ -0,0 +1,28 @@
|
|||||||
|
[API]
|
||||||
|
api_key = O0Fgk3o69RWJ86eAX8BTHRaB
|
||||||
|
secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
|
||||||
|
timeout = 30
|
||||||
|
max_retries = 3
|
||||||
|
retry_delay = 2
|
||||||
|
api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
|
||||||
|
|
||||||
|
[Paths]
|
||||||
|
input_folder = data/input
|
||||||
|
output_folder = data/output
|
||||||
|
temp_folder = data/temp
|
||||||
|
template_folder = templates
|
||||||
|
processed_record = data/processed_files.json
|
||||||
|
|
||||||
|
[Performance]
|
||||||
|
max_workers = 4
|
||||||
|
batch_size = 5
|
||||||
|
skip_existing = true
|
||||||
|
|
||||||
|
[File]
|
||||||
|
allowed_extensions = .jpg,.jpeg,.png,.bmp
|
||||||
|
excel_extension = .xlsx
|
||||||
|
max_file_size_mb = 4
|
||||||
|
|
||||||
|
[Templates]
|
||||||
|
purchase_order = 银豹-采购单模板.xls
|
||||||
|
|
||||||
Vendored
+205
@@ -0,0 +1,205 @@
|
|||||||
|
{
|
||||||
|
"6920584471055": {
|
||||||
|
"map_to": "6920584471017",
|
||||||
|
"description": "条码映射:6920584471055 -> 6920584471017"
|
||||||
|
},
|
||||||
|
"6925861571159": {
|
||||||
|
"map_to": "69021824",
|
||||||
|
"description": "条码映射:6925861571159 -> 69021824"
|
||||||
|
},
|
||||||
|
"6923644268923": {
|
||||||
|
"map_to": "6923644268480",
|
||||||
|
"description": "条码映射:6923644268923 -> 6923644268480"
|
||||||
|
},
|
||||||
|
"6925861571466": {
|
||||||
|
"map_to": "6925861571459",
|
||||||
|
"description": "条码映射:6925861571466 -> 6925861571459"
|
||||||
|
},
|
||||||
|
"6907992508344": {
|
||||||
|
"map_to": "6907992508191",
|
||||||
|
"description": "条码映射:6907992508344 -> 6907992508191"
|
||||||
|
},
|
||||||
|
"6903979000979": {
|
||||||
|
"map_to": "6903979000962",
|
||||||
|
"description": "条码映射:6903979000979 -> 6903979000962"
|
||||||
|
},
|
||||||
|
"6923644283582": {
|
||||||
|
"map_to": "6923644283575",
|
||||||
|
"description": "条码映射:6923644283582 -> 6923644283575"
|
||||||
|
},
|
||||||
|
"6923644268930": {
|
||||||
|
"map_to": "6923644268497",
|
||||||
|
"description": "条码映射:6923644268930 -> 6923644268497"
|
||||||
|
},
|
||||||
|
"6923644268916": {
|
||||||
|
"map_to": "6923644268503",
|
||||||
|
"description": "条码映射:6923644268916 -> 6923644268503"
|
||||||
|
},
|
||||||
|
"6923644268909": {
|
||||||
|
"map_to": "6923644268510",
|
||||||
|
"description": "条码映射:6923644268909 -> 6923644268510"
|
||||||
|
},
|
||||||
|
"6923644299804": {
|
||||||
|
"map_to": "6923644299774",
|
||||||
|
"description": "条码映射:6923644299804 -> 6923644299774"
|
||||||
|
},
|
||||||
|
"6923644266318": {
|
||||||
|
"map_to": "6923644266066",
|
||||||
|
"description": "条码映射:6923644266318 -> 6923644266066"
|
||||||
|
},
|
||||||
|
"6923644210151": {
|
||||||
|
"map_to": "6923644223458",
|
||||||
|
"description": "条码映射:6923644210151 -> 6923644223458"
|
||||||
|
},
|
||||||
|
"6907992501819": {
|
||||||
|
"map_to": "6907992500133",
|
||||||
|
"description": "条码映射:6907992501819 -> 6907992500133"
|
||||||
|
},
|
||||||
|
"6907992502052": {
|
||||||
|
"map_to": "6907992100272",
|
||||||
|
"description": "条码映射:6907992502052 -> 6907992100272"
|
||||||
|
},
|
||||||
|
"6907992507385": {
|
||||||
|
"map_to": "6907992507095",
|
||||||
|
"description": "条码映射:6907992507385 -> 6907992507095"
|
||||||
|
},
|
||||||
|
"6973726149671": {
|
||||||
|
"map_to": "6973726149657",
|
||||||
|
"description": "条码映射:6973726149671 -> 6973726149657"
|
||||||
|
},
|
||||||
|
"6977426410574": {
|
||||||
|
"map_to": "6977426410567",
|
||||||
|
"description": "条码映射:6977426410574 -> 6977426410567"
|
||||||
|
},
|
||||||
|
"6973726149688": {
|
||||||
|
"map_to": "6973726149664",
|
||||||
|
"description": "条码映射:6973726149688 -> 6973726149664"
|
||||||
|
},
|
||||||
|
"6935205322012": {
|
||||||
|
"map_to": "6935205320018",
|
||||||
|
"description": "条码映射:6935205322012 -> 6935205320018"
|
||||||
|
},
|
||||||
|
"6943497411024": {
|
||||||
|
"map_to": "6943497411017",
|
||||||
|
"description": "条码映射:6943497411024 -> 6943497411017"
|
||||||
|
},
|
||||||
|
"6921734968821": {
|
||||||
|
"map_to": "6921734968814",
|
||||||
|
"description": "条码映射:6921734968821 -> 6921734968814"
|
||||||
|
},
|
||||||
|
"6921734968258": {
|
||||||
|
"map_to": "6921734968241",
|
||||||
|
"description": "条码映射:6921734968258 -> 6921734968241"
|
||||||
|
},
|
||||||
|
"6921734968180": {
|
||||||
|
"map_to": "6921734968173",
|
||||||
|
"description": "条码映射:6921734968180 -> 6921734968173"
|
||||||
|
},
|
||||||
|
"6921734908735": {
|
||||||
|
"map_to": "6935205372772",
|
||||||
|
"description": "条码映射:6921734908735 -> 6935205372772"
|
||||||
|
},
|
||||||
|
"6923644248222": {
|
||||||
|
"map_to": "6923644248208",
|
||||||
|
"description": "条码映射:6923644248222 -> 6923644248208"
|
||||||
|
},
|
||||||
|
"6902083881122": {
|
||||||
|
"map_to": "6902083881085",
|
||||||
|
"description": "条码映射:6902083881122 -> 6902083881085"
|
||||||
|
},
|
||||||
|
"6907992501857": {
|
||||||
|
"map_to": "6907992500010",
|
||||||
|
"description": "条码映射:6907992501857 -> 6907992500010"
|
||||||
|
},
|
||||||
|
"6902083891015": {
|
||||||
|
"map_to": "6902083890636",
|
||||||
|
"description": "条码映射:6902083891015 -> 6902083890636"
|
||||||
|
},
|
||||||
|
"6923450605240": {
|
||||||
|
"map_to": "6923450605226",
|
||||||
|
"description": "条码映射:6923450605240 -> 6923450605226"
|
||||||
|
},
|
||||||
|
"6923450605196": {
|
||||||
|
"map_to": "6923450614624",
|
||||||
|
"description": "条码映射:6923450605196 -> 6923450614624"
|
||||||
|
},
|
||||||
|
"6923450665213": {
|
||||||
|
"map_to": "6923450665206",
|
||||||
|
"description": "条码映射:6923450665213 -> 6923450665206"
|
||||||
|
},
|
||||||
|
"6923450666821": {
|
||||||
|
"map_to": "6923450666838",
|
||||||
|
"description": "条码映射:6923450666821 -> 6923450666838"
|
||||||
|
},
|
||||||
|
"6923450661505": {
|
||||||
|
"map_to": "6923450661499",
|
||||||
|
"description": "条码映射:6923450661505 -> 6923450661499"
|
||||||
|
},
|
||||||
|
"6923450676103": {
|
||||||
|
"map_to": "6923450676097",
|
||||||
|
"description": "条码映射:6923450676103 -> 6923450676097"
|
||||||
|
},
|
||||||
|
"6923450614631": {
|
||||||
|
"map_to": "6923450614624",
|
||||||
|
"description": "条码映射:6923450614631 -> 6923450614624"
|
||||||
|
},
|
||||||
|
"6901424334174": {
|
||||||
|
"map_to": "6973730760015",
|
||||||
|
"description": "条码映射:6901424334174 -> 6973730760015"
|
||||||
|
},
|
||||||
|
"6958620703716": {
|
||||||
|
"map_to": "6958620703907",
|
||||||
|
"description": "条码映射:6958620703716 -> 6958620703907"
|
||||||
|
},
|
||||||
|
"6937003706322": {
|
||||||
|
"map_to": "6937003703833",
|
||||||
|
"description": "条码映射:6937003706322 -> 6937003703833"
|
||||||
|
},
|
||||||
|
"6950783203494": {
|
||||||
|
"map_to": "6950873203494",
|
||||||
|
"description": "条码映射:6950783203494 -> 6950873203494"
|
||||||
|
},
|
||||||
|
"6907992501871": {
|
||||||
|
"map_to": "6907992500010",
|
||||||
|
"description": "条码映射:6907992501871 -> 6907992500010"
|
||||||
|
},
|
||||||
|
"6907992501864": {
|
||||||
|
"map_to": "6907992100012",
|
||||||
|
"description": "条码映射:6907992501864 -> 6907992100012"
|
||||||
|
},
|
||||||
|
"6923644264192": {
|
||||||
|
"map_to": "6923644264116",
|
||||||
|
"description": "条码映射:6923644264192 -> 6923644264116"
|
||||||
|
},
|
||||||
|
"6923450667316": {
|
||||||
|
"map_to": "69042386",
|
||||||
|
"description": "条码映射:6923450667316 -> 69042386"
|
||||||
|
},
|
||||||
|
"6923450653012": {
|
||||||
|
"map_to": "69021343",
|
||||||
|
"description": "条码映射:6923450653012 -> 69021343"
|
||||||
|
},
|
||||||
|
"6925019900087": {
|
||||||
|
"multiplier": 10,
|
||||||
|
"target_unit": "瓶",
|
||||||
|
"description": "特殊处理:数量*10,单位转换为瓶"
|
||||||
|
},
|
||||||
|
"6921168593804": {
|
||||||
|
"multiplier": 30,
|
||||||
|
"target_unit": "瓶",
|
||||||
|
"description": "NFC产品特殊处理:每箱30瓶"
|
||||||
|
},
|
||||||
|
"6901826888138": {
|
||||||
|
"multiplier": 30,
|
||||||
|
"target_unit": "瓶",
|
||||||
|
"fixed_price": 3.7333333333333334,
|
||||||
|
"specification": "1*30",
|
||||||
|
"description": "特殊处理: 规格1*30,数量*30,单价=112/30"
|
||||||
|
},
|
||||||
|
"6958620703907": {
|
||||||
|
"multiplier": 14,
|
||||||
|
"target_unit": "个",
|
||||||
|
"specification": "1*14",
|
||||||
|
"description": "友臣肉松,1盒14个"
|
||||||
|
}
|
||||||
|
}
|
||||||
Vendored
+28
@@ -0,0 +1,28 @@
|
|||||||
|
[API]
|
||||||
|
api_key = O0Fgk3o69RWJ86eAX8BTHRaB
|
||||||
|
secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
|
||||||
|
timeout = 30
|
||||||
|
max_retries = 3
|
||||||
|
retry_delay = 2
|
||||||
|
api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
|
||||||
|
|
||||||
|
[Paths]
|
||||||
|
input_folder = data/input
|
||||||
|
output_folder = data/output
|
||||||
|
temp_folder = data/temp
|
||||||
|
template_folder = templates
|
||||||
|
processed_record = data/processed_files.json
|
||||||
|
|
||||||
|
[Performance]
|
||||||
|
max_workers = 4
|
||||||
|
batch_size = 5
|
||||||
|
skip_existing = true
|
||||||
|
|
||||||
|
[File]
|
||||||
|
allowed_extensions = .jpg,.jpeg,.png,.bmp
|
||||||
|
excel_extension = .xlsx
|
||||||
|
max_file_size_mb = 4
|
||||||
|
|
||||||
|
[Templates]
|
||||||
|
purchase_order = 银豹-采购单模板.xls
|
||||||
|
|
||||||
-212
@@ -1,212 +0,0 @@
|
|||||||
#!/usr/bin/env python
|
|
||||||
# -*- coding: utf-8 -*-
|
|
||||||
|
|
||||||
"""
|
|
||||||
OCR订单处理系统 (OCR order processing system) - headless automation interface
|
|
||||||
-----------------------------
|
|
||||||
Designed for integration with automation platforms such as openclaw.
|
|
||||||
Pipeline: input image -> OCR recognition -> data cleaning -> price validation -> output result path.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import os
|
|
||||||
import sys
|
|
||||||
import time
|
|
||||||
import argparse
|
|
||||||
import json
|
|
||||||
from pathlib import Path
|
|
||||||
from typing import Optional, List, Dict
|
|
||||||
|
|
||||||
# Add the current directory to sys.path
|
|
||||||
sys.path.append(os.path.dirname(os.path.abspath(__file__)))
|
|
||||||
|
|
||||||
from app.config.settings import ConfigManager
|
|
||||||
from app.services.ocr_service import OCRService
|
|
||||||
from app.services.order_service import OrderService
|
|
||||||
from app.services.tobacco_service import TobaccoService
|
|
||||||
from app.services.special_suppliers_service import SpecialSuppliersService
|
|
||||||
from app.core.utils.log_utils import get_logger, set_log_level
|
|
||||||
|
|
||||||
logger = get_logger("HeadlessAPI")
|
|
||||||
|
|
||||||
def get_latest_file(directory: str, extensions: List[str]) -> Optional[str]:
|
|
||||||
"""Return the newest file in the directory that has one of the given extensions."""
|
|
||||||
dir_path = Path(directory)
|
|
||||||
if not dir_path.exists():
|
|
||||||
return None
|
|
||||||
|
|
||||||
files = []
|
|
||||||
for ext in extensions:
|
|
||||||
files.extend(dir_path.glob(f"*{ext}"))
|
|
||||||
files.extend(dir_path.glob(f"*{ext.upper()}"))
|
|
||||||
|
|
||||||
if not files:
|
|
||||||
return None
|
|
||||||
|
|
||||||
latest_file = max(files, key=lambda p: p.stat().st_mtime)
|
|
||||||
return str(latest_file)
|
|
||||||
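Review note on `get_latest_file` above: it globs each extension twice (as given and upper-cased), which misses mixed-case suffixes such as `.Jpg` on case-sensitive file systems and lists the same file twice on case-insensitive ones. A single pass that compares the lower-cased suffix avoids both, as sketched below. `latest_file` is a hypothetical replacement name, and unlike the original it returns a `Path` rather than a `str`.

```python
from pathlib import Path
from typing import Iterable, Optional, Union


def latest_file(directory: Union[str, Path], extensions: Iterable[str]) -> Optional[Path]:
    """Return the most recently modified file whose suffix is in `extensions`."""
    wanted = {ext.lower() for ext in extensions}
    dir_path = Path(directory)
    if not dir_path.is_dir():
        return None
    candidates = [
        p for p in dir_path.iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```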
|
|
||||||
def update_barcode_mapping(barcode: str, target_barcode: str = None, multiplier: float = None, unit: str = None, price: float = None, spec: str = None):
|
|
||||||
"""Update a barcode mapping or special-handling config."""
|
|
||||||
try:
|
|
||||||
config_path = os.path.join("config", "barcode_mappings.json")
|
|
||||||
mappings = {}
|
|
||||||
if os.path.exists(config_path):
|
|
||||||
with open(config_path, 'r', encoding='utf-8') as f:
|
|
||||||
mappings = json.load(f)
|
|
||||||
|
|
||||||
# Get or create the config for this barcode
|
|
||||||
config = mappings.get(barcode, {})
|
|
||||||
|
|
||||||
if target_barcode:
|
|
||||||
config["map_to"] = target_barcode
|
|
||||||
config["description"] = config.get("description", "") + f" 条码映射 -> {target_barcode}"
|
|
||||||
|
|
||||||
if multiplier is not None:
|
|
||||||
config["multiplier"] = multiplier
|
|
||||||
config["description"] = config.get("description", "") + f" 数量倍数*{multiplier}"
|
|
||||||
|
|
||||||
if unit:
|
|
||||||
config["target_unit"] = unit
|
|
||||||
|
|
||||||
if price is not None:
|
|
||||||
config["fixed_price"] = price
|
|
||||||
|
|
||||||
if spec:
|
|
||||||
config["specification"] = spec
|
|
||||||
|
|
||||||
if not config.get("description"):
|
|
||||||
config["description"] = f"特殊条码配置: {barcode}"
|
|
||||||
|
|
||||||
mappings[barcode] = config
|
|
||||||
|
|
||||||
with open(config_path, 'w', encoding='utf-8') as f:
|
|
||||||
json.dump(mappings, f, ensure_ascii=False, indent=2)
|
|
||||||
|
|
||||||
logger.info(f"成功更新条码配置: {barcode} -> {config}")
|
|
||||||
return True
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"更新条码配置失败: {e}")
|
|
||||||
return False
|
|
||||||
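Review note on `update_barcode_mapping` above: it opens `config/barcode_mappings.json` with mode `'w'` and dumps into it directly, so a crash or a serialization error partway through leaves a truncated file. A common remedy is to write to a temporary file in the same directory and then swap it in with `os.replace`, as in the sketch below. `write_json_atomic` is a hypothetical helper, not part of this repository.

```python
import json
import os
import tempfile
from typing import Any


def write_json_atomic(path: str, data: Any) -> None:
    """Write `data` as JSON to `path`, replacing the old file only on success."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # Same directory as the target, so os.replace stays on one file system
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, ensure_ascii=False, indent=2)
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original file untouched and clean up the partial write
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

This does not protect against two processes updating the mapping at the same time; that would need a lock, which is out of scope for this sketch.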
|
|
||||||
def run_pipeline(args):
|
|
||||||
"""Run the processing pipeline."""
|
|
||||||
try:
|
|
||||||
config_manager = ConfigManager()
|
|
||||||
order_service = OrderService(config_manager)
|
|
||||||
start_time = time.perf_counter()
|
|
||||||
final_excel = None
|
|
||||||
|
|
||||||
input_folder = config_manager.get('Paths', 'input_folder', fallback='data/input')
|
|
||||||
output_folder = config_manager.get('Paths', 'output_folder', fallback='data/output')
|
|
||||||
|
|
||||||
# 1. Handle barcode mapping updates
|
|
||||||
if args.update_mapping:
|
|
||||||
if not args.barcode:
|
|
||||||
print("ERROR: --barcode is required for --update-mapping", file=sys.stderr)
|
|
||||||
return None
|
|
||||||
|
|
||||||
# At least one update option is required
|
|
||||||
if not any([args.target, args.multiplier, args.unit, args.price, args.spec]):
|
|
||||||
print("ERROR: At least one update option (--target, --multiplier, --unit, --price, --spec) is required", file=sys.stderr)
|
|
||||||
return None
|
|
||||||
|
|
||||||
if update_barcode_mapping(args.barcode, args.target, args.multiplier, args.unit, args.price, args.spec):
|
|
||||||
print(f"SUCCESS: Barcode configuration updated for {args.barcode}")
|
|
||||||
return "MAPPING_UPDATED"
|
|
||||||
return None
|
|
||||||
|
|
||||||
# 2. Tobacco company processing (explicitly requested)
|
|
||||||
if args.tobacco:
|
|
||||||
input_path = args.input or get_latest_file(output_folder, [".xlsx", ".xls"])
|
|
||||||
if not input_path:
|
|
||||||
print("ERROR: No tobacco order file found.", file=sys.stderr)
|
|
||||||
return None
|
|
||||||
logger.info(f"开始显式处理烟草订单: {input_path}")
|
|
||||||
# 这里的 process_tobacco_order 会调用 preprocess 并生成银豹格式
|
|
||||||
tobacco_service = TobaccoService(config_manager)
|
|
||||||
final_excel = tobacco_service.process_tobacco_order(input_path)
|
|
||||||
|
|
||||||
# 3. 蓉城易购处理 (显式指定)
|
|
||||||
elif args.rongcheng:
|
|
||||||
input_path = args.input or get_latest_file(output_folder, [".xlsx", ".xls"])
|
|
||||||
if not input_path:
|
|
||||||
print("ERROR: No Rongcheng Yigou order file found.", file=sys.stderr)
|
|
||||||
return None
|
|
||||||
logger.info(f"开始显式处理蓉城易购订单: {input_path}")
|
|
||||||
special_service = SpecialSuppliersService(config_manager)
|
|
||||||
final_excel = special_service.process_rongcheng_yigou(input_path)
|
|
||||||
|
|
||||||
# 4. 普通 Excel 处理 (支持自动识别烟草/蓉城/杨碧月)
|
|
||||||
elif args.excel:
|
|
||||||
input_path = args.input or get_latest_file(input_folder, [".xlsx", ".xls"])
|
|
||||||
if not input_path:
|
|
||||||
print("ERROR: No Excel file found in input.", file=sys.stderr)
|
|
||||||
return None
|
|
||||||
logger.info(f"开始处理 Excel (支持智能识别): {input_path}")
|
|
||||||
# OrderService.process_excel 内部会自动调用 _check_special_preprocess
|
|
||||||
final_excel = order_service.process_excel(input_path)
|
|
||||||
|
|
||||||
# 5. 智能处理 (默认逻辑:自动判断图片还是 Excel)
|
|
||||||
else:
|
|
||||||
input_path = args.input or get_latest_file(input_folder, [".jpg", ".jpeg", ".png", ".bmp", ".xlsx", ".xls"])
|
|
||||||
if not input_path:
|
|
||||||
print(f"ERROR: No input file found in {input_folder}.", file=sys.stderr)
|
|
||||||
return None
|
|
||||||
|
|
||||||
ext = os.path.splitext(input_path)[1].lower()
|
|
||||||
if ext in [".xlsx", ".xls"]:
|
|
||||||
logger.info(f"智能识别为 Excel 文件,开始处理: {input_path}")
|
|
||||||
final_excel = order_service.process_excel(input_path)
|
|
||||||
else:
|
|
||||||
logger.info(f"智能识别为图片文件,开始 OCR 处理: {input_path}")
|
|
||||||
ocr_service = OCRService(config_manager)
|
|
||||||
excel_intermediate = ocr_service.process_image(input_path)
|
|
||||||
if excel_intermediate:
|
|
||||||
final_excel = order_service.process_excel(excel_intermediate)
|
|
||||||
|
|
||||||
# 6. 后续处理 (校验与输出)
|
|
||||||
if final_excel:
|
|
||||||
# 单价校验
|
|
||||||
discrepancies = order_service.validate_unit_price(final_excel)
|
|
||||||
if discrepancies:
|
|
||||||
print(f"WARNING: Price validation found {len(discrepancies)} issues:", file=sys.stderr)
|
|
||||||
for d in discrepancies:
|
|
||||||
print(f" - {d}", file=sys.stderr)
|
|
||||||
|
|
||||||
duration = time.perf_counter() - start_time
|
|
||||||
logger.info(f"处理完成,耗时: {duration:.2f}s")
|
|
||||||
|
|
||||||
# 输出最终路径
|
|
||||||
abs_path = os.path.abspath(final_excel)
|
|
||||||
print(abs_path)
|
|
||||||
return abs_path
|
|
||||||
else:
|
|
||||||
print("ERROR: Processing failed.", file=sys.stderr)
|
|
||||||
return None
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
import traceback
|
|
||||||
print(f"CRITICAL ERROR: {str(e)}", file=sys.stderr)
|
|
||||||
traceback.print_exc(file=sys.stderr)
|
|
||||||
return None
|
|
||||||
|
|
||||||
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="OCR订单处理系统 - 无界面自动化接口")
    parser.add_argument('input', nargs='?', help='输入文件路径 (图片或Excel)')

    group = parser.add_mutually_exclusive_group()
    group.add_argument('--excel', action='store_true', help='处理普通 Excel 文件')
    group.add_argument('--tobacco', action='store_true', help='处理烟草公司订单')
    group.add_argument('--rongcheng', action='store_true', help='处理蓉城易购订单')
    group.add_argument('--update-mapping', action='store_true', help='更新条码映射')

    parser.add_argument('--barcode', help='待映射的原始条码 (用于 --update-mapping)')
    parser.add_argument('--target', help='目标条码 (用于 --update-mapping)')
    parser.add_argument('--multiplier', type=float, help='数量倍数 (例如箱转瓶填写30)')
    parser.add_argument('--unit', help='目标单位 (例如"瓶")')
    parser.add_argument('--price', type=float, help='固定单价')
    parser.add_argument('--spec', help='固定规格 (例如"1*30")')

    args = parser.parse_args()
    result = run_pipeline(args)
    sys.exit(0 if result else 1)
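The `--update-mapping` path requires at least one of `--target`, `--multiplier`, `--unit`, `--price` or `--spec`. The sketch below illustrates one way to express that check so that a numeric value of 0 still counts as provided, which is what the `price is not None` test in `update_barcode_mapping` suggests. The helper names `build_parser` and `has_update_option` are hypothetical and not part of the repository; only the option names come from the script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild only the mapping-related options of the script (illustrative subset)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--barcode')
    parser.add_argument('--target')
    parser.add_argument('--multiplier', type=float)
    parser.add_argument('--unit')
    parser.add_argument('--price', type=float)
    parser.add_argument('--spec')
    return parser


def has_update_option(args: argparse.Namespace) -> bool:
    """Hypothetical helper: True if any update option was given on the command line.

    argparse leaves omitted options as None, so comparing against None
    distinguishes "not given" from a legitimate falsy value such as 0.0.
    """
    options = (args.target, args.multiplier, args.unit, args.price, args.spec)
    return any(value is not None for value in options)
```

With this check, `--barcode 6901826888138 --price 0` is accepted, whereas a plain truthiness test would treat the price as missing.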
@@ -0,0 +1,9 @@
|
|||||||
|
2025-08-16 00:52:16,840 - app.core.excel.converter - INFO - 成功加载条码映射配置,共49项
|
||||||
|
2025-08-16 00:52:17,144 - app.core.excel.converter - INFO - 解析容量(ml)规格: 500ml*15 -> 1*15
|
||||||
|
2025-08-16 00:52:17,217 - app.core.excel.converter - INFO - 解析容量(ml)规格: 600mL*15 -> 1*15
|
||||||
|
2025-08-16 00:52:17,283 - app.core.excel.converter - INFO - 解析容量(ml)规格: 600ml*15 -> 1*15
|
||||||
|
2025-08-16 00:52:17,346 - app.core.excel.converter - INFO - 解析容量(ml)规格: 900ml*12 -> 1*12
|
||||||
|
2025-08-16 00:52:17,399 - app.core.excel.converter - INFO - 解析容量(ml)规格: 900ml*12 -> 1*12
|
||||||
|
2025-08-16 00:52:17,462 - app.core.excel.converter - INFO - 解析容量(ml)规格: 900ml*12 -> 1*12
|
||||||
|
2025-08-16 00:52:17,515 - app.core.excel.converter - INFO - 解析容量(ml)规格: 950ml*12 -> 1*12
|
||||||
|
2025-08-16 00:52:17,579 - app.core.excel.converter - INFO - 解析容量(ml)规格: 480ml*15 -> 1*15
|
||||||
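The converter log above shows volume-based specifications such as `500ml*15` and `600mL*15` being normalized to `1*15`. The following is a minimal sketch of such a normalization, assuming the input always has the shape "volume in ml, `*`, pack count". The function name `normalize_volume_spec` and the regular expression are illustrative and are not taken from `app.core.excel.converter`.

```python
import re
from typing import Optional

# Assumed shape: a number, the unit "ml" in any letter case, "*", then the pack count.
_VOLUME_SPEC = re.compile(r'^\s*\d+(?:\.\d+)?\s*ml\s*\*\s*(\d+)\s*$', re.IGNORECASE)


def normalize_volume_spec(spec: str) -> Optional[str]:
    """Return "1*<pack count>" for a volume spec, or None if the shape does not match."""
    match = _VOLUME_SPEC.match(spec)
    if match is None:
        return None
    return f"1*{int(match.group(1))}"
```

Returning `None` for unrecognized input leaves the caller free to fall back to other parsing rules.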
@@ -0,0 +1 @@
|
|||||||
|
2025-08-16 00:52:17,210 - app.core.excel.handlers.barcode_mapper - INFO - 条码映射: 6937003706322 -> 6937003703833
|
||||||
@@ -0,0 +1,8 @@
|
|||||||
|
2025-08-16 00:52:17,160 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 1.0 -> 15.0, 单价: 68.0 -> 4.533333333333333, 单位: 件 -> 瓶
|
||||||
|
2025-08-16 00:52:17,236 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 1.0 -> 15.0, 单价: 68.0 -> 4.533333333333333, 单位: 件 -> 瓶
|
||||||
|
2025-08-16 00:52:17,298 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 1.0 -> 15.0, 单价: 68.0 -> 4.533333333333333, 单位: 件 -> 瓶
|
||||||
|
2025-08-16 00:52:17,366 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 2.0 -> 24.0, 单价: 45.0 -> 3.75, 单位: 件 -> 瓶
|
||||||
|
2025-08-16 00:52:17,415 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 2.0 -> 24.0, 单价: 45.0 -> 3.75, 单位: 件 -> 瓶
|
||||||
|
2025-08-16 00:52:17,477 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 1.0 -> 12.0, 单价: 45.0 -> 3.75, 单位: 件 -> 瓶
|
||||||
|
2025-08-16 00:52:17,533 - app.core.excel.handlers.unit_converter_handlers - INFO - 件单位处理: 数量: 1.0 -> 12.0, 单价: 73.0 -> 6.083333333333333, 单位: 件 -> 瓶
|
||||||
|
2025-08-16 00:52:17,594 - app.core.excel.handlers.unit_converter_handlers - INFO - 赠品瓶单位处理: 保持原样 数量: 5.0, 单价: 0, 单位: 瓶
|
||||||
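The unit converter log above shows lines counted in cases ("件") being converted to bottles ("瓶"): the quantity is multiplied by the package quantity and the unit price is divided by it, while a gift line already counted in bottles is left unchanged. A minimal sketch of that arithmetic follows. The `Line` dataclass and `convert_case_to_bottles` are hypothetical names, and the fixed target unit "瓶" is an assumption based on these log lines only.

```python
from dataclasses import dataclass


@dataclass
class Line:
    quantity: float
    price: float
    unit: str


def convert_case_to_bottles(line: Line, package_quantity: int) -> Line:
    """Convert a line counted in cases ("件") to bottles ("瓶")."""
    if line.unit != "件" or package_quantity <= 0:
        # Lines in other units (for example gift lines in bottles) are returned as-is.
        return line
    return Line(
        quantity=line.quantity * package_quantity,
        price=line.price / package_quantity,
        unit="瓶",
    )
```

For the logged line with quantity 2.0, price 45.0 and a package quantity of 12, this yields quantity 24.0 and price 3.75.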
@@ -0,0 +1,2 @@
|
|||||||
|
2025-08-16 00:52:16,853 - app.core.excel.merger - INFO - 使用输出目录: E:\2025Code\python\orc-order-v2\data\output
|
||||||
|
2025-08-16 00:52:16,861 - app.core.excel.merger - INFO - 初始化PurchaseOrderMerger完成,模板文件: templates\银豹-采购单模板.xls
|
||||||
@@ -0,0 +1,57 @@
|
|||||||
|
2025-08-16 00:52:16,835 - app.core.excel.processor - INFO - 使用输出目录: E:\2025Code\python\orc-order-v2\data\output
|
||||||
|
2025-08-16 00:52:16,839 - app.core.excel.processor - INFO - 使用临时目录: E:\2025Code\python\orc-order-v2\data\temp
|
||||||
|
2025-08-16 00:52:16,847 - app.core.excel.processor - INFO - 初始化ExcelProcessor完成,模板文件: templates/银豹-采购单模板.xls
|
||||||
|
2025-08-16 00:52:16,877 - app.core.excel.processor - INFO - 搜索目录 data/output 中的Excel文件
|
||||||
|
2025-08-16 00:52:16,886 - app.core.excel.processor - INFO - 找到最新的Excel文件: data/output\7a3a78a02fcf6ccef5daad31bd50bdf2.xlsx
|
||||||
|
2025-08-16 00:52:16,895 - app.core.excel.processor - INFO - 开始处理Excel文件: data/output\7a3a78a02fcf6ccef5daad31bd50bdf2.xlsx
|
||||||
|
2025-08-16 00:52:16,934 - app.core.excel.processor - INFO - 成功读取Excel文件: data/output\7a3a78a02fcf6ccef5daad31bd50bdf2.xlsx, 共 10 行
|
||||||
|
2025-08-16 00:52:16,935 - app.core.excel.processor - INFO - 找到可能的表头行: 第1行,评分: 60
|
||||||
|
2025-08-16 00:52:16,941 - app.core.excel.processor - INFO - 识别到表头在第 1 行
|
||||||
|
2025-08-16 00:52:16,965 - app.core.excel.processor - INFO - 使用表头行重新读取数据,共 9 行有效数据
|
||||||
|
2025-08-16 00:52:16,974 - app.core.excel.processor - INFO - 找到精确匹配的条码列: 商品条码
|
||||||
|
2025-08-16 00:52:16,988 - app.core.excel.processor - INFO - 使用条码列: 商品条码
|
||||||
|
2025-08-16 00:52:17,003 - app.core.excel.processor - INFO - 找到name列: 商品名称
|
||||||
|
2025-08-16 00:52:17,016 - app.core.excel.processor - INFO - 找到specification列: 规格型号
|
||||||
|
2025-08-16 00:52:17,032 - app.core.excel.processor - INFO - 找到quantity列: 数量
|
||||||
|
2025-08-16 00:52:17,049 - app.core.excel.processor - INFO - 找到unit列: 单位
|
||||||
|
2025-08-16 00:52:17,064 - app.core.excel.processor - INFO - 找到price列: 单价
|
||||||
|
2025-08-16 00:52:17,079 - app.core.excel.processor - INFO - 找到amount列: 金额
|
||||||
|
2025-08-16 00:52:17,094 - app.core.excel.processor - INFO - 检测到列映射: {'barcode': '商品条码', 'name': '商品名称', 'specification': '规格型号', 'quantity': '数量', 'unit': '单位', 'price': '单价', 'amount': '金额'}
|
||||||
|
2025-08-16 00:52:17,110 - app.core.excel.processor - INFO - 从映射列解析规格: 500ml*15 -> 包装数量=15
|
||||||
|
2025-08-16 00:52:17,177 - app.core.excel.processor - INFO - 从映射列解析规格: 600mL*15 -> 包装数量=15
|
||||||
|
2025-08-16 00:52:17,252 - app.core.excel.processor - INFO - 从映射列解析规格: 600ml*15 -> 包装数量=15
|
||||||
|
2025-08-16 00:52:17,314 - app.core.excel.processor - INFO - 从映射列解析规格: 900ml*12 -> 包装数量=12
|
||||||
|
2025-08-16 00:52:17,373 - app.core.excel.processor - INFO - 从映射列解析规格: 900ml*12 -> 包装数量=12
|
||||||
|
2025-08-16 00:52:17,431 - app.core.excel.processor - INFO - 从映射列解析规格: 900ml*12 -> 包装数量=12
|
||||||
|
2025-08-16 00:52:17,493 - app.core.excel.processor - INFO - 从映射列解析规格: 950ml*12 -> 包装数量=12
|
||||||
|
2025-08-16 00:52:17,549 - app.core.excel.processor - INFO - 从映射列解析规格: 480ml*15 -> 包装数量=15
|
||||||
|
2025-08-16 00:52:17,610 - app.core.excel.processor - INFO - 提取到 8 个商品信息
|
||||||
|
2025-08-16 00:52:17,634 - app.core.excel.processor - INFO - 开始处理8 个产品信息
|
||||||
|
2025-08-16 00:52:17,644 - app.core.excel.processor - INFO - 处理商品: 条码=6970399922365, 数量=15.0, 单价=4.533333333333333, 是否赠品=False
|
||||||
|
2025-08-16 00:52:17,659 - app.core.excel.processor - INFO - 发现正常商品:条码6970399922365, 数量=15.0, 单价=4.533333333333333
|
||||||
|
2025-08-16 00:52:17,675 - app.core.excel.processor - INFO - 处理商品: 条码=6937003703833, 数量=15.0, 单价=4.533333333333333, 是否赠品=False
|
||||||
|
2025-08-16 00:52:17,692 - app.core.excel.processor - INFO - 发现正常商品:条码6937003703833, 数量=15.0, 单价=4.533333333333333
|
||||||
|
2025-08-16 00:52:17,707 - app.core.excel.processor - INFO - 处理商品: 条码=6937003706346, 数量=15.0, 单价=4.533333333333333, 是否赠品=False
|
||||||
|
2025-08-16 00:52:17,723 - app.core.excel.processor - INFO - 发现正常商品:条码6937003706346, 数量=15.0, 单价=4.533333333333333
|
||||||
|
2025-08-16 00:52:17,738 - app.core.excel.processor - INFO - 处理商品: 条码=6973003703413, 数量=24.0, 单价=3.75, 是否赠品=False
|
||||||
|
2025-08-16 00:52:17,753 - app.core.excel.processor - INFO - 发现正常商品:条码6973003703413, 数量=24.0, 单价=3.75
|
||||||
|
2025-08-16 00:52:17,768 - app.core.excel.processor - INFO - 处理商品: 条码=6975176784785, 数量=24.0, 单价=3.75, 是否赠品=False
|
||||||
|
2025-08-16 00:52:17,784 - app.core.excel.processor - INFO - 发现正常商品:条码6975176784785, 数量=24.0, 单价=3.75
|
||||||
|
2025-08-16 00:52:17,800 - app.core.excel.processor - INFO - 处理商品: 条码=6937003708876, 数量=12.0, 单价=3.75, 是否赠品=False
|
||||||
|
2025-08-16 00:52:17,815 - app.core.excel.processor - INFO - 发现正常商品:条码6937003708876, 数量=12.0, 单价=3.75
|
||||||
|
2025-08-16 00:52:17,830 - app.core.excel.processor - INFO - 处理商品: 条码=6937003703826, 数量=12.0, 单价=6.083333333333333, 是否赠品=False
|
||||||
|
2025-08-16 00:52:17,845 - app.core.excel.processor - INFO - 发现正常商品:条码6937003703826, 数量=12.0, 单价=6.083333333333333
|
||||||
|
2025-08-16 00:52:17,859 - app.core.excel.processor - INFO - 处理商品: 条码=6970399920415, 数量=5.0, 单价=0, 是否赠品=True
|
||||||
|
2025-08-16 00:52:17,876 - app.core.excel.processor - INFO - 发现赠品:条码6970399920415, 数量=5.0
|
||||||
|
2025-08-16 00:52:17,891 - app.core.excel.processor - INFO - 分组后共8 个不同条码的商品
|
||||||
|
2025-08-16 00:52:17,906 - app.core.excel.processor - INFO - 条码 6970399922365 处理结果:正常商品数量15.0,单价4.533333333333333,赠品数量0
|
||||||
|
2025-08-16 00:52:17,923 - app.core.excel.processor - INFO - 条码 6937003703833 处理结果:正常商品数量15.0,单价4.533333333333333,赠品数量0
|
||||||
|
2025-08-16 00:52:17,939 - app.core.excel.processor - INFO - 条码 6937003706346 处理结果:正常商品数量15.0,单价4.533333333333333,赠品数量0
|
||||||
|
2025-08-16 00:52:17,955 - app.core.excel.processor - INFO - 条码 6973003703413 处理结果:正常商品数量24.0,单价3.75,赠品数量0
|
||||||
|
2025-08-16 00:52:17,970 - app.core.excel.processor - INFO - 条码 6975176784785 处理结果:正常商品数量24.0,单价3.75,赠品数量0
|
||||||
|
2025-08-16 00:52:17,987 - app.core.excel.processor - INFO - 条码 6937003708876 处理结果:正常商品数量12.0,单价3.75,赠品数量0
|
||||||
|
2025-08-16 00:52:17,994 - app.core.excel.processor - INFO - 条码 6937003703826 处理结果:正常商品数量12.0,单价6.083333333333333,赠品数量0
|
||||||
|
2025-08-16 00:52:18,012 - app.core.excel.processor - INFO - 条码 6970399920415 处理结果:只有赠品,数量=5.0
|
||||||
|
2025-08-16 00:52:18,029 - app.core.excel.processor - INFO - 条码 6970399920415 填充:仅有赠品,采购量=0,赠品数量=5.0
|
||||||
|
2025-08-16 00:52:18,051 - app.core.excel.processor - INFO - 采购单已保存到: data/result\采购单_7a3a78a02fcf6ccef5daad31bd50bdf2.xls
|
||||||
|
2025-08-16 00:52:18,082 - app.core.excel.processor - INFO - 采购单已保存到: data/result\采购单_7a3a78a02fcf6ccef5daad31bd50bdf2.xls
|
||||||
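The processor log above groups the extracted items by barcode and tracks regular and gift quantities separately. The sketch below shows one way such grouping could work, assuming that a unit price of 0 marks a gift (the only gift in the log has price 0). The function name `group_by_barcode` and the result layout are illustrative, not the actual `ExcelProcessor` implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def group_by_barcode(items: Iterable[Tuple[str, float, float]]) -> Dict[str, dict]:
    """Group (barcode, quantity, unit_price) tuples by barcode.

    Assumption: a unit price of 0 marks a gift line.
    """
    grouped: Dict[str, dict] = defaultdict(
        lambda: {"quantity": 0.0, "price": 0.0, "gift_quantity": 0.0}
    )
    for barcode, quantity, price in items:
        entry = grouped[barcode]
        if price == 0:
            entry["gift_quantity"] += quantity
        else:
            entry["quantity"] += quantity
            entry["price"] = price  # keep the price of the last regular line seen
    return dict(grouped)
```

A barcode that only appears as a gift ends up with a purchase quantity of 0 and a non-zero gift quantity, matching the last entries of the log.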
@@ -0,0 +1,8 @@
|
|||||||
|
2025-08-16 00:52:17,127 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
|
||||||
|
2025-08-16 00:52:17,193 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
|
||||||
|
2025-08-16 00:52:17,268 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
|
||||||
|
2025-08-16 00:52:17,329 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
|
||||||
|
2025-08-16 00:52:17,381 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
|
||||||
|
2025-08-16 00:52:17,446 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
|
||||||
|
2025-08-16 00:52:17,508 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
|
||||||
|
2025-08-16 00:52:17,565 - app.core.excel.validators - INFO - 修正条码长度: 从14位截断到13位
|
||||||
@@ -0,0 +1,3 @@
|
|||||||
|
2025-08-16 00:52:16,815 - app.services.order_service - INFO - 初始化OrderService
|
||||||
|
2025-08-16 00:52:16,863 - app.services.order_service - INFO - OrderService初始化完成
|
||||||
|
2025-08-16 00:52:16,867 - app.services.order_service - INFO - OrderService开始处理最新Excel文件
|
||||||
Binary file not shown.
@@ -0,0 +1,19 @@

# OCR Order Processing System (OCR订单处理系统) - Portable Edition

## Usage
1. Double-click "OCR订单处理系统.exe" to start the program
2. Place the image files to be processed in the data/input directory
3. Results are saved to the data/output directory
4. Log files are saved to the logs directory

## Notes
- The Baidu OCR API key must be configured on first run
- Supported image formats: jpg, jpeg, png, bmp
- Each file must not exceed 4MB

## Directory Structure
- OCR订单处理系统.exe - main program
- data/input/ - input image directory
- data/output/ - output directory
- logs/ - log directory
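The README limits input to jpg, jpeg, png and bmp images of at most 4MB each. A minimal sketch of such a pre-check is shown below. The function name `is_acceptable_image` is hypothetical, and treating 1MB as 1024 * 1024 bytes is an assumption; the README does not say how the limit is measured.

```python
import os

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}
MAX_FILE_SIZE_MB = 4


def is_acceptable_image(path: str, size_bytes: int) -> bool:
    """Check extension (case-insensitive) and size against the README limits."""
    ext = os.path.splitext(path)[1].lower()
    within_limit = size_bytes <= MAX_FILE_SIZE_MB * 1024 * 1024
    return ext in ALLOWED_EXTENSIONS and within_limit
```

For a file on disk, the size can be supplied with `os.path.getsize(path)`.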
@@ -0,0 +1,28 @@
|
|||||||
|
[API]
|
||||||
|
api_key = O0Fgk3o69RWJ86eAX8BTHRaB
|
||||||
|
secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
|
||||||
|
timeout = 30
|
||||||
|
max_retries = 3
|
||||||
|
retry_delay = 2
|
||||||
|
api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
|
||||||
|
|
||||||
|
[Paths]
|
||||||
|
input_folder = data/input
|
||||||
|
output_folder = data/output
|
||||||
|
temp_folder = data/temp
|
||||||
|
template_folder = templates
|
||||||
|
processed_record = data/processed_files.json
|
||||||
|
|
||||||
|
[Performance]
|
||||||
|
max_workers = 4
|
||||||
|
batch_size = 5
|
||||||
|
skip_existing = true
|
||||||
|
|
||||||
|
[File]
|
||||||
|
allowed_extensions = .jpg,.jpeg,.png,.bmp
|
||||||
|
excel_extension = .xlsx
|
||||||
|
max_file_size_mb = 4
|
||||||
|
|
||||||
|
[Templates]
|
||||||
|
purchase_order = 银豹-采购单模板.xls
|
||||||
|
|
||||||
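The INI file above can be read with Python's standard `configparser`. The sketch below parses a shortened copy of the non-secret sections and applies fallbacks in the same style as the `config_manager.get('Paths', 'input_folder', fallback=...)` calls in the script. Whether the repository's `ConfigManager` wraps `configparser` is an assumption, and `load_settings` is a hypothetical helper.

```python
import configparser

SAMPLE = """
[Paths]
input_folder = data/input
output_folder = data/output

[Performance]
max_workers = 4
skip_existing = true

[File]
allowed_extensions = .jpg,.jpeg,.png,.bmp
max_file_size_mb = 4
"""


def load_settings(text: str) -> dict:
    """Parse INI text and return typed settings, using fallbacks for missing keys."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw_extensions = parser.get("File", "allowed_extensions", fallback="")
    return {
        "input_folder": parser.get("Paths", "input_folder", fallback="data/input"),
        "output_folder": parser.get("Paths", "output_folder", fallback="data/output"),
        "max_workers": parser.getint("Performance", "max_workers", fallback=1),
        "skip_existing": parser.getboolean("Performance", "skip_existing", fallback=False),
        "allowed_extensions": [e.strip() for e in raw_extensions.split(",") if e.strip()],
        "max_file_size_mb": parser.getint("File", "max_file_size_mb", fallback=4),
    }
```

The `getint` and `getboolean` accessors convert values such as `4` and `true` to native types, so callers do not need to parse strings themselves.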
@@ -0,0 +1,205 @@
|
|||||||
|
{
|
||||||
|
"6920584471055": {
|
||||||
|
"map_to": "6920584471017",
|
||||||
|
"description": "条码映射:6920584471055 -> 6920584471017"
|
||||||
|
},
|
||||||
|
"6925861571159": {
|
||||||
|
"map_to": "69021824",
|
||||||
|
"description": "条码映射:6925861571159 -> 69021824"
|
||||||
|
},
|
||||||
|
"6923644268923": {
|
||||||
|
"map_to": "6923644268480",
|
||||||
|
"description": "条码映射:6923644268923 -> 6923644268480"
|
||||||
|
},
|
||||||
|
"6925861571466": {
|
||||||
|
"map_to": "6925861571459",
|
||||||
|
"description": "条码映射:6925861571466 -> 6925861571459"
|
||||||
|
},
|
||||||
|
"6907992508344": {
|
||||||
|
"map_to": "6907992508191",
|
||||||
|
"description": "条码映射:6907992508344 -> 6907992508191"
|
||||||
|
},
|
||||||
|
"6903979000979": {
|
||||||
|
"map_to": "6903979000962",
|
||||||
|
"description": "条码映射:6903979000979 -> 6903979000962"
|
||||||
|
},
|
||||||
|
"6923644283582": {
|
||||||
|
"map_to": "6923644283575",
|
||||||
|
"description": "条码映射:6923644283582 -> 6923644283575"
|
||||||
|
},
|
||||||
|
"6923644268930": {
|
||||||
|
"map_to": "6923644268497",
|
||||||
|
"description": "条码映射:6923644268930 -> 6923644268497"
|
||||||
|
},
|
||||||
|
"6923644268916": {
|
||||||
|
"map_to": "6923644268503",
|
||||||
|
"description": "条码映射:6923644268916 -> 6923644268503"
|
||||||
|
},
|
||||||
|
"6923644268909": {
|
||||||
|
"map_to": "6923644268510",
|
||||||
|
"description": "条码映射:6923644268909 -> 6923644268510"
|
||||||
|
},
|
||||||
|
"6923644299804": {
|
||||||
|
"map_to": "6923644299774",
|
||||||
|
"description": "条码映射:6923644299804 -> 6923644299774"
|
||||||
|
},
|
||||||
|
"6923644266318": {
|
||||||
|
"map_to": "6923644266066",
|
||||||
|
"description": "条码映射:6923644266318 -> 6923644266066"
|
||||||
|
},
|
||||||
|
"6923644210151": {
|
||||||
|
"map_to": "6923644223458",
|
||||||
|
"description": "条码映射:6923644210151 -> 6923644223458"
|
||||||
|
},
|
||||||
|
"6907992501819": {
|
||||||
|
"map_to": "6907992500133",
|
||||||
|
"description": "条码映射:6907992501819 -> 6907992500133"
|
||||||
|
},
|
||||||
|
"6907992502052": {
|
||||||
|
"map_to": "6907992100272",
|
||||||
|
"description": "条码映射:6907992502052 -> 6907992100272"
|
||||||
|
},
|
||||||
|
"6907992507385": {
|
||||||
|
"map_to": "6907992507095",
|
||||||
|
"description": "条码映射:6907992507385 -> 6907992507095"
|
||||||
|
},
|
||||||
|
"6973726149671": {
|
||||||
|
"map_to": "6973726149657",
|
||||||
|
"description": "条码映射:6973726149671 -> 6973726149657"
|
||||||
|
},
|
||||||
|
"6977426410574": {
|
||||||
|
"map_to": "6977426410567",
|
||||||
|
"description": "条码映射:6977426410574 -> 6977426410567"
|
||||||
|
},
|
||||||
|
"6973726149688": {
|
||||||
|
"map_to": "6973726149664",
|
||||||
|
"description": "条码映射:6973726149688 -> 6973726149664"
|
||||||
|
},
|
||||||
|
"6935205322012": {
|
||||||
|
"map_to": "6935205320018",
|
||||||
|
"description": "条码映射:6935205322012 -> 6935205320018"
|
||||||
|
},
|
||||||
|
"6943497411024": {
|
||||||
|
"map_to": "6943497411017",
|
||||||
|
"description": "条码映射:6943497411024 -> 6943497411017"
|
||||||
|
},
|
||||||
|
"6921734968821": {
|
||||||
|
"map_to": "6921734968814",
|
||||||
|
"description": "条码映射:6921734968821 -> 6921734968814"
|
||||||
|
},
|
||||||
|
"6921734968258": {
|
||||||
|
"map_to": "6921734968241",
|
||||||
|
"description": "条码映射:6921734968258 -> 6921734968241"
|
||||||
|
},
|
||||||
|
"6921734968180": {
|
||||||
|
"map_to": "6921734968173",
|
||||||
|
"description": "条码映射:6921734968180 -> 6921734968173"
|
||||||
|
},
|
||||||
|
"6921734908735": {
|
||||||
|
"map_to": "6935205372772",
|
||||||
|
"description": "条码映射:6921734908735 -> 6935205372772"
|
||||||
|
},
|
||||||
|
"6923644248222": {
|
||||||
|
"map_to": "6923644248208",
|
||||||
|
"description": "条码映射:6923644248222 -> 6923644248208"
|
||||||
|
},
|
||||||
|
"6902083881122": {
|
||||||
|
"map_to": "6902083881085",
|
||||||
|
"description": "条码映射:6902083881122 -> 6902083881085"
|
||||||
|
},
|
||||||
|
"6907992501857": {
|
||||||
|
"map_to": "6907992500010",
|
||||||
|
"description": "条码映射:6907992501857 -> 6907992500010"
|
||||||
|
},
|
||||||
|
"6902083891015": {
|
||||||
|
"map_to": "6902083890636",
|
||||||
|
"description": "条码映射:6902083891015 -> 6902083890636"
|
||||||
|
},
|
||||||
|
"6923450605240": {
|
||||||
|
"map_to": "6923450605226",
|
||||||
|
"description": "条码映射:6923450605240 -> 6923450605226"
|
||||||
|
},
|
||||||
|
"6923450605196": {
|
||||||
|
"map_to": "6923450614624",
|
||||||
|
"description": "条码映射:6923450605196 -> 6923450614624"
|
||||||
|
},
|
||||||
|
"6923450665213": {
|
||||||
|
"map_to": "6923450665206",
|
||||||
|
"description": "条码映射:6923450665213 -> 6923450665206"
|
||||||
|
},
|
||||||
|
"6923450666821": {
|
||||||
|
"map_to": "6923450666838",
|
||||||
|
"description": "条码映射:6923450666821 -> 6923450666838"
|
||||||
|
},
|
||||||
|
"6923450661505": {
|
||||||
|
"map_to": "6923450661499",
|
||||||
|
"description": "条码映射:6923450661505 -> 6923450661499"
|
||||||
|
},
|
||||||
|
"6923450676103": {
|
||||||
|
"map_to": "6923450676097",
|
||||||
|
"description": "条码映射:6923450676103 -> 6923450676097"
|
||||||
|
},
|
||||||
|
"6923450614631": {
|
||||||
|
"map_to": "6923450614624",
|
||||||
|
"description": "条码映射:6923450614631 -> 6923450614624"
|
||||||
|
},
|
||||||
|
"6901424334174": {
|
||||||
|
"map_to": "6973730760015",
|
||||||
|
"description": "条码映射:6901424334174 -> 6973730760015"
|
||||||
|
},
|
||||||
|
"6958620703716": {
|
||||||
|
"map_to": "6958620703907",
|
||||||
|
"description": "条码映射:6958620703716 -> 6958620703907"
|
||||||
|
},
|
||||||
|
"6937003706322": {
|
||||||
|
"map_to": "6937003703833",
|
||||||
|
"description": "条码映射:6937003706322 -> 6937003703833"
|
||||||
|
},
|
||||||
|
"6950783203494": {
|
||||||
|
"map_to": "6950873203494",
|
||||||
|
"description": "条码映射:6950783203494 -> 6950873203494"
|
||||||
|
},
|
||||||
|
"6907992501871": {
|
||||||
|
"map_to": "6907992500010",
|
||||||
|
"description": "条码映射:6907992501871 -> 6907992500010"
|
||||||
|
},
|
||||||
|
"6907992501864": {
|
||||||
|
"map_to": "6907992100012",
|
||||||
|
"description": "条码映射:6907992501864 -> 6907992100012"
|
||||||
|
},
|
||||||
|
"6923644264192": {
|
||||||
|
"map_to": "6923644264116",
|
||||||
|
"description": "条码映射:6923644264192 -> 6923644264116"
|
||||||
|
},
|
||||||
|
"6923450667316": {
|
||||||
|
"map_to": "69042386",
|
||||||
|
"description": "条码映射:6923450667316 -> 69042386"
|
||||||
|
},
|
||||||
|
"6923450653012": {
|
||||||
|
"map_to": "69021343",
|
||||||
|
"description": "条码映射:6923450653012 -> 69021343"
|
||||||
|
},
|
||||||
|
"6925019900087": {
|
||||||
|
"multiplier": 10,
|
||||||
|
"target_unit": "瓶",
|
||||||
|
"description": "特殊处理:数量*10,单位转换为瓶"
|
||||||
|
},
|
||||||
|
"6921168593804": {
|
||||||
|
"multiplier": 30,
|
||||||
|
"target_unit": "瓶",
|
||||||
|
"description": "NFC产品特殊处理:每箱30瓶"
|
||||||
|
},
|
||||||
|
"6901826888138": {
|
||||||
|
"multiplier": 30,
|
||||||
|
"target_unit": "瓶",
|
||||||
|
"fixed_price": 3.7333333333333334,
|
||||||
|
"specification": "1*30",
|
||||||
|
"description": "特殊处理: 规格1*30,数量*30,单价=112/30"
|
||||||
|
},
|
||||||
|
"6958620703907": {
|
||||||
|
"multiplier": 14,
|
||||||
|
"target_unit": "个",
|
||||||
|
"specification": "1*14",
|
||||||
|
"description": "友臣肉松,1盒14个"
|
||||||
|
}
|
||||||
|
}
|
||||||
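The JSON above mixes two kinds of rules: plain barcode remapping (`map_to`) and special handling (`multiplier`, `target_unit`, `fixed_price`, `specification`). The sketch below shows one way such a rule could be applied to a single order item. The function name `apply_barcode_rule` and the item layout are hypothetical. Dividing the price by the multiplier when no `fixed_price` is given is an assumption, and chained rules (a `map_to` target that has its own rule) are not followed.

```python
from typing import Any, Dict


def apply_barcode_rule(item: Dict[str, Any],
                       mappings: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of item with the rule for its barcode applied, if any."""
    rule = mappings.get(item["barcode"])
    if rule is None:
        return item
    result = dict(item)
    if "map_to" in rule:
        result["barcode"] = rule["map_to"]
    if rule.get("multiplier"):
        result["quantity"] = item["quantity"] * rule["multiplier"]
        # Assumption: the per-unit price shrinks by the same factor.
        result["price"] = item["price"] / rule["multiplier"]
    if "target_unit" in rule:
        result["unit"] = rule["target_unit"]
    if "fixed_price" in rule:
        result["price"] = rule["fixed_price"]  # overrides the derived price
    if "specification" in rule:
        result["specification"] = rule["specification"]
    return result
```

Applied to the entry for `6901826888138`, one case at 112 becomes 30 bottles at the configured fixed price with specification `1*30`.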
@@ -0,0 +1,28 @@
|
|||||||
|
[API]
|
||||||
|
api_key = O0Fgk3o69RWJ86eAX8BTHRaB
|
||||||
|
secret_key = VyZD5lzcIMgsup1uuD6Cw0pfzS20IGPZ
|
||||||
|
timeout = 30
|
||||||
|
max_retries = 3
|
||||||
|
retry_delay = 2
|
||||||
|
api_url = https://aip.baidubce.com/rest/2.0/ocr/v1/table
|
||||||
|
|
||||||
|
[Paths]
|
||||||
|
input_folder = data/input
|
||||||
|
output_folder = data/output
|
||||||
|
temp_folder = data/temp
|
||||||
|
template_folder = templates
|
||||||
|
processed_record = data/processed_files.json
|
||||||
|
|
||||||
|
[Performance]
|
||||||
|
max_workers = 4
|
||||||
|
batch_size = 5
|
||||||
|
skip_existing = true
|
||||||
|
|
||||||
|
[File]
|
||||||
|
allowed_extensions = .jpg,.jpeg,.png,.bmp
|
||||||
|
excel_extension = .xlsx
|
||||||
|
max_file_size_mb = 4
|
||||||
|
|
||||||
|
[Templates]
|
||||||
|
purchase_order = 银豹-采购单模板.xls
|
||||||
|
|
||||||
Binary file not shown.
Some files were not shown because too many files have changed in this diff