Compare commits
6 Commits
4132226fae..main
| Author | SHA1 | Date | |
|---|---|---|---|
| c6f81b6f6e | |||
| 70f293db96 | |||
| e95a1723e5 | |||
| b402612641 | |||
| 975f9e5887 | |||
| 505e5ca895 |
@@ -0,0 +1,19 @@
.git
.gitignore
.env.example
*.md
.vscode
.idea
__pycache__
*.pyc
.pytest_cache
.coverage
*.egg-info
dist
build
.DS_Store
*.log
uploads/*
downloads/*
.claude
CLAUDE.md
@@ -0,0 +1,59 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

SaleShow is a monolithic Flask web application for analyzing sales data from Excel files. It supports manual Excel uploads and automated daily downloads from secsion.com via Playwright browser automation. There is no database; all data lives as Excel files on disk in `uploads/`.

## Commands

```bash
# Install dependencies
pip install -r requirements.txt

# Run development server (Flask on port 5000, debug via FLASK_DEBUG env var)
python app.py

# Run with Docker (builds image, installs Playwright Chromium, port 5000)
docker-compose up -d

# Production
gunicorn -w 4 -b 0.0.0.0:8000 app:app

# CLI automation: download reports from secsion.com
python -m automation.secsion --start 2026-04-28 --end 2026-04-28
python -m automation.secsion --start 2026-05-15 --end 2026-05-17 --username 15682076681 --password yourpassword
```

No test framework or linter is configured in this project.

## Architecture

**Backend (Flask, single `app.py`):**
- Routes handle file upload (`/upload`), file listing (`/files`), data loading/processing (`/load/<filename>`), deletion, and cleanup.
- `process_sales_data()` (~lines 371-575) is the core logic. It uses a state-machine approach to handle two Excel formats: "flat tables" (each row has code + product) and "hierarchical tables" (code row is a header, product rows are children). It outputs daily summaries with per-product breakdowns.
- `find_header_row()` dynamically detects the header row by scanning the first 20 rows for keyword matches.
- Auto-download routes use a global `download_status` dict and run Playwright in daemon threads via `threading.Thread`.
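
The header detection described in the list above can be illustrated with a small sketch. This is not the code from `app.py`: the function name `find_header_row_sketch`, the keyword list, and the `min_matches` threshold are assumptions made for illustration. Only the idea of scanning the first 20 rows for keyword matches comes from the description.

```python
from typing import Optional, Sequence

# Hypothetical keywords; the real list lives in app.py and is not shown here.
HEADER_KEYWORDS = ("code", "product", "quantity", "amount")


def find_header_row_sketch(
    rows: Sequence[Sequence[object]],
    keywords: Sequence[str] = HEADER_KEYWORDS,
    max_rows: int = 20,
    min_matches: int = 2,
) -> Optional[int]:
    """Return the index of the first row (within max_rows) that matches enough keywords."""
    for index, row in enumerate(rows[:max_rows]):
        # Normalize non-empty cells to lowercase text for matching.
        cells = [str(cell).strip().lower() for cell in row if cell is not None]
        # Count how many distinct keywords appear in at least one cell.
        matches = sum(1 for kw in keywords if any(kw in cell for cell in cells))
        if matches >= min_matches:
            return index
    return None  # no plausible header row in the scanned window


sheet = [
    ["Daily sales report"],
    [None, None],
    ["Code", "Product", "Quantity"],
    ["A1", "Widget", 3],
]
header_index = find_header_row_sketch(sheet)
```

Requiring more than one keyword match is a design choice in this sketch: it keeps a title row that happens to contain a single keyword from being mistaken for the header.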

**Automation module (`automation/`):**
- `secsion.py`: `SecsionDownloader` uses Playwright headless Chromium to log into secsion.com, navigate to reports, set the date range via the TDesign date picker (requires a click → select day → Enter confirm → Escape close sequence), optionally inject `shop_id` via route interception on `**/api/bill/export`, and download exports. It makes up to 3 attempts with a linearly increasing wait between them (`(attempt + 1) * 5` seconds, i.e. 5 s and then 10 s).
- `uploader.py`: copies downloaded files into `uploads/` with `YYYYMMDD_HHMMSS_` prefix naming (same convention as manual uploads).
- `scheduler.py`: APScheduler `BackgroundScheduler` with `CronTrigger` runs the daily auto-download (default 01:00). Uses `misfire_grace_time=3600`.
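
The retry behaviour of `download_report` can be sketched in isolation. The wait formula `(attempt + 1) * 5` and the default of 3 attempts come from the `secsion.py` diff. The helper name `retry_with_linear_backoff` and the injectable `sleep` parameter are assumptions added here so the sketch can run without a browser or real delays.

```python
import asyncio
import logging
from typing import Awaitable, Callable, List, Optional

logger = logging.getLogger(__name__)


async def retry_with_linear_backoff(
    operation: Callable[[], Awaitable[str]],
    retry_count: int = 3,
    base_delay: float = 5.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Optional[str]:
    """Run `operation` up to `retry_count` times, waiting 5 s, 10 s, ... between attempts."""
    for attempt in range(retry_count):
        try:
            return await operation()
        except Exception as exc:
            logger.error("attempt %d/%d failed: %s", attempt + 1, retry_count, exc)
            if attempt < retry_count - 1:
                # Linear backoff: the wait grows by base_delay per attempt.
                await sleep((attempt + 1) * base_delay)
    return None  # every attempt failed


# Usage with a fake operation that fails twice before succeeding.
calls = {"n": 0}
delays: List[float] = []


async def flaky_download() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated timeout")
    return "downloads/report.xlsx"


async def fake_sleep(seconds: float) -> None:
    delays.append(seconds)  # record the delay instead of actually waiting


result = asyncio.run(retry_with_linear_backoff(flaky_download, sleep=fake_sleep))
```

Because the delay is `(attempt + 1) * base_delay`, the waits grow linearly rather than doubling.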

**Configuration (`config.py`):**
- Three-tier priority: Web UI settings (`data/config.json`) > environment variables (`.env` / system env) > defaults.
- `Config` class provides static methods for reading/writing secsion credentials, shop ID, and scheduler settings.
- Passwords are masked (`******`) when returned via the API.
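
A minimal sketch of the three-tier lookup and the password masking follows. The function names `resolve_setting` and `mask_password`, the key `SECSION_USERNAME`, and the rule that empty strings count as "unset" are assumptions; the actual `Config` class is not shown in this diff.

```python
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    ui_settings: Mapping[str, object],
    env: Mapping[str, str],
    defaults: Mapping[str, str],
) -> Optional[str]:
    """Resolve a setting with priority: Web UI settings > environment > defaults."""
    ui_value = ui_settings.get(key)
    if ui_value not in (None, ""):
        return str(ui_value)  # highest priority: value saved from the settings page
    env_value = env.get(key)
    if env_value not in (None, ""):
        return env_value  # second priority: .env / system environment
    return defaults.get(key)  # fall back to the built-in default, if any


def mask_password(password: Optional[str]) -> str:
    """Return a fixed mask so the real password never leaves the server."""
    return "******" if password else ""


# Hypothetical key name, used only for this example.
ui = {"SECSION_USERNAME": "from-ui"}
env = {"SECSION_USERNAME": "from-env"}
defaults = {"SECSION_USERNAME": "from-default"}
chosen = resolve_setting("SECSION_USERNAME", ui, env, defaults)
```

Passing the three sources in as plain mappings keeps the lookup easy to test; in an application, `env` would typically be `os.environ` and `ui_settings` the parsed contents of `data/config.json`.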

**Frontend (vanilla JS/CSS, no build step):**
- `main.js`: all client-side interactivity: file upload (drag-and-drop), AJAX to the API, data rendering (card/table view), client-side filtering, sorting, pagination (50 items/page), export.
- `style.css`: Glassmorphism design with CSS custom properties.
- `settings.html`: self-contained settings page with inline `<script>` (no separate JS file).

## Key Design Decisions

- No database: Excel files on disk are the data store.
- No frontend build step: vanilla JS/CSS is served directly via Flask static files.
- Playwright automation runs in daemon threads; status is tracked via the module-level `download_status` dict in `app.py`.
- The secsion.com date picker uses TDesign's `needconfirm="true"` mode, so simply calling `.fill()` won't work; the automation must click the day cell and then press Enter.
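
The daemon-thread pattern with a module-level status dict can be sketched as follows. Only the `download_status` name and the use of `threading.Thread` with `daemon=True` come from the description above. The dict keys, the lock, and the helpers `start_download` and `_run_download` are assumptions for illustration, not the code in `app.py`.

```python
import threading
from typing import Callable, Optional

# Module-level status shared between request handlers and the worker thread.
# The keys are hypothetical; the real dict layout is not shown in this diff.
download_status = {"running": False, "message": "", "file": None}
_status_lock = threading.Lock()  # assumption: guards concurrent updates


def _run_download(job: Callable[[], str]) -> None:
    """Worker body: run the job and record the outcome in download_status."""
    try:
        path = job()
        with _status_lock:
            download_status.update(running=False, message="done", file=path)
    except Exception as exc:
        with _status_lock:
            download_status.update(running=False, message=f"failed: {exc}", file=None)


def start_download(job: Callable[[], str]) -> Optional[threading.Thread]:
    """Start the job in a daemon thread unless one is already running."""
    with _status_lock:
        if download_status["running"]:
            return None  # refuse to start a second concurrent download
        download_status.update(running=True, message="downloading", file=None)
    thread = threading.Thread(target=_run_download, args=(job,), daemon=True)
    thread.start()
    return thread


# Usage: a stand-in job replaces the real Playwright download.
worker = start_download(lambda: "uploads/20260517_010000_report.xlsx")
if worker is not None:
    worker.join()  # a web route would return immediately and poll the dict instead
```

Marking the thread as a daemon means it will not keep the process alive on shutdown, at the cost that an in-flight download can be cut off when the server exits.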
+7
-3
@@ -9,8 +9,10 @@ WORKDIR /app
 ENV PYTHONDONTWRITEBYTECODE=1
 # Ensure Python output is not buffered
 ENV PYTHONUNBUFFERED=1
+# Pin Python's hash seed (disables hash randomization for reproducible runs)
+ENV PYTHONHASHSEED=0

-# Install the system dependencies required by Playwright
+# Install the system dependencies required by Playwright + network optimization
 RUN apt-get update && apt-get install -y --no-install-recommends \
     wget \
     ca-certificates \
@@ -29,6 +31,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
     libxdamage1 \
     libxrandr2 \
     xdg-utils \
+    dnsmasq \
     && rm -rf /var/lib/apt/lists/*

 # Copy the dependency file
@@ -38,8 +41,8 @@ COPY requirements.txt .
 # Use the Aliyun PyPI mirror to speed up installation
 RUN pip install --no-cache-dir -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/

-# Install the Playwright Chromium browser
-RUN playwright install --with-deps chromium
+# Install the Playwright Chromium browser (explicitly leave browser download enabled)
+RUN PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=0 playwright install --with-deps chromium

 # Copy the contents of the current directory into /app in the container
 COPY . .
@@ -50,5 +53,6 @@ RUN mkdir -p uploads data downloads && chmod 777 uploads data downloads
 # Expose port 5000
 EXPOSE 5000

+# Add container memory limits and process management
 # Run app.py
 CMD ["python", "app.py"]
@@ -104,13 +104,20 @@ python app.py
 3. **Scheduled fetch** - Enable the scheduled task on the settings page; the system automatically downloads the previous day's data in the early morning each day
 4. **CLI mode** - Run from the command line:
 ```bash
-# Download data for the specified dates
-python -m automation.secsion --start 2026-04-28 --end 2026-04-28
+# Download data for the specified dates (recommended)
+python -m automation.secsion --start 2026-05-15 --end 2026-05-17

 # Specify username and password
-python -m automation.secsion --start 2026-04-28 --username 18190686888 --password yourpassword
+python -m automation.secsion --start 2026-05-15 --end 2026-05-17 --username 15682076681 --password yourpassword
 ```

+**⚡ Download performance optimizations**:
+- ✅ Automatic retry (3-attempt retry mechanism)
+- ✅ Smart data-load detection
+- ✅ Tuned timeout control (180 seconds)
+- ✅ Supports downloads over large date ranges and large data volumes
+- 📊 Typical case: downloading 3 days of data takes 20-35 seconds
+
 > **Configuration priority**: Web UI settings page > environment variables (.env) > defaults

 ## 🏗️ Deployment
@@ -262,6 +269,6 @@ SaleShow/

 ---

-**Last updated:** April 29, 2026
+**Last updated:** May 17, 2026

 *Enjoy the fun of data analysis! 📊✨*
+334
-70
@@ -25,49 +25,71 @@ class SecsionDownloader:
         self.download_dir = download_dir or os.path.join(os.getcwd(), "downloads")
         os.makedirs(self.download_dir, exist_ok=True)

-    async def download_report(self, start_date, end_date):
+    async def download_report(self, start_date, end_date, retry_count=3):
         """
         Download the sales report for the given date range

         Args:
             start_date: start date (YYYY-MM-DD)
             end_date: end date (YYYY-MM-DD)
+            retry_count: number of retries (default 3)

         Returns:
             str: local path of the downloaded file, or None on failure
         """
-        logger.info(f"Starting report download: {start_date} ~ {end_date}")
-
-        async with async_playwright() as p:
-            browser = await p.chromium.launch(headless=True)
-            context = await browser.new_context(
-                ignore_https_errors=True,
-                viewport={'width': 1280, 'height': 800}
-            )
-            page = await context.new_page()
-
-            try:
-                await self._login(page)
-                file_path = await self._export_report(page, start_date, end_date)
-                logger.info(f"Report download complete: {file_path}")
-                return file_path
+        for attempt in range(retry_count):
+            try:
+                logger.info(f"Starting report download: {start_date} ~ {end_date} (attempt {attempt + 1}/{retry_count})")
+
+                async with async_playwright() as p:
+                    # Docker optimization: add --disable-dev-shm-usage to avoid running out of shared memory
+                    # Note: do not use --single-process; it breaks Chromium's download mechanism
+                    browser = await p.chromium.launch(
+                        headless=True,
+                        downloads_path=self.download_dir,
+                        args=[
+                            "--disable-dev-shm-usage",
+                            "--disable-gpu",
+                            "--no-sandbox",
+                            "--ignore-certificate-errors"
+                        ]
+                    )
+                    context = await browser.new_context(
+                        ignore_https_errors=True,
+                        viewport={'width': 1280, 'height': 800},
+                        accept_downloads=True
+                    )
+                    page = await context.new_page()
+
+                    try:
+                        await self._login(page)
+                        file_path = await self._export_report(page, start_date, end_date)
+                        logger.info(f"Report download complete: {file_path}")
+                        return file_path
+                    finally:
+                        await browser.close()
             except Exception as e:
-                logger.error(f"Report download failed: {e}")
-                return None
-            finally:
-                await browser.close()
+                logger.error(f"Report download failed (attempt {attempt + 1}/{retry_count}): {e}")
+                if attempt < retry_count - 1:
+                    wait_time = (attempt + 1) * 5
+                    logger.info(f"Retrying in {wait_time} seconds...")
+                    await asyncio.sleep(wait_time)
+                    continue
+
+        logger.error(f"Report download failed permanently (all {retry_count} attempts failed)")
+        return None

     async def _login(self, page):
         """Log in to secsion.com"""
         logger.info(f"Opening login page: {self.LOGIN_URL}")
-        await page.goto(self.LOGIN_URL)
+        await page.goto(self.LOGIN_URL, timeout=30000)

         # Select the "店铺" (shop) role
         logger.info("Selecting role: 店铺 (shop)")
         try:
-            await page.get_by_text("店铺", exact=True).click()
+            await page.get_by_text("店铺", exact=True).click(timeout=10000)
         except Exception:
-            await page.click("text=店铺")
+            await page.click("text=店铺", timeout=10000)

         # Enter account and password
         logger.info(f"Entering account: {self.username}")
@@ -83,21 +105,21 @@ class SecsionDownloader:
         try:
             await page.click("button:has-text('登录')", timeout=5000)
         except Exception:
-            await page.click("button[type='submit']")
+            await page.click("button[type='submit']", timeout=5000)

-        # Wait for the redirect
+        # Wait for the redirect (takes longer in Docker)
         logger.info("Waiting for login redirect...")
-        await page.wait_for_url("**/homePage", timeout=20000)
+        await page.wait_for_url("**/homePage", timeout=30000)
         logger.info("Login successful")

     async def _export_report(self, page, start_date, end_date):
         """Open the statistics page and export the report"""
         logger.info(f"Opening statistics page: {self.STATS_URL}")
-        await page.goto(self.STATS_URL)
-        await page.wait_for_load_state("networkidle")
+        await page.goto(self.STATS_URL, timeout=30000)
+        await page.wait_for_load_state("networkidle", timeout=30000)

         export_btn = page.get_by_role("button", name="导出报表")
-        await export_btn.wait_for(state="visible", timeout=20000)
+        await export_btn.wait_for(state="visible", timeout=30000)

         logger.info(f"Setting query date range: {start_date} ~ {end_date}")

@@ -117,14 +139,35 @@ class SecsionDownloader:
         end_val = await end_input.input_value()
         logger.info(f"Date setting result: start={start_val}, end={end_val}")

-        # Wait for the data request to finish
+        # Wait for the data request to finish and the table to render (longer wait in Docker)
         logger.info("Waiting for the data request to finish...")
         await asyncio.sleep(3)

-        # If shop_id is configured, intercept the export request and inject shop_id
-        if self.shop_id:
-            import json
-
+        # Check whether the data has loaded (wait for the loading indicator to disappear or for actual data)
+        try:
+            # Wait for the loading indicator to disappear or the data table to appear
+            await page.wait_for_function(
+                """() => {
+                    // Check for a loading indicator
+                    const loading = document.querySelector('[class*="loading"]');
+                    if (loading && loading.style.display !== 'none') return false;
+                    // Check for data rows
+                    const rows = document.querySelectorAll('table tbody tr');
+                    return rows.length > 0;
+                }""",
+                timeout=30000
+            )
+            logger.info("Data table loaded")
+        except Exception as e:
+            logger.warning(f"Table load check failed: {e}; continuing...")
+
+        await asyncio.sleep(3)
+
+        # If shop_id is configured, intercept the export request to inject shop_id, and capture the server response
+        import json
+        export_response = {'status': None, 'body': None, 'content_type': None}
+
+        if self.shop_id:
             async def inject_shop_id(route):
                 request = route.request
                 body = json.loads(request.post_data)
@@ -135,17 +178,226 @@ class SecsionDownloader:
             await page.route('**/api/bill/export', inject_shop_id)
             logger.info(f"shop_id interception configured: {self.shop_id}")

+        # Capture the export endpoint's response (for debugging)
+        async def on_response(response):
+            if '/api/bill/export' in response.url:
+                export_response['status'] = response.status
+                export_response['content_type'] = response.headers.get('content-type', '')
+                try:
+                    body = await response.text()
+                    export_response['body'] = body[:2000] if body else ''
+                except Exception:
+                    export_response['body'] = '(binary or empty)'
+                logger.info(f"Export endpoint response: status={response.status}, content-type={export_response['content_type']}, body length={len(export_response['body'] or '')}")
+
+        page.on("response", on_response)
+
+        # Monitor popups (the export may open the download link in a new window)
+        popup_pages = []
+
+        async def handle_popup(popup):
+            popup_pages.append(popup)
+            logger.info(f"Popup detected: {popup.url}")
+            # SSL interstitials in popups must be handled as well
+            await popup.wait_for_load_state("domcontentloaded", timeout=15000)
+            await self._bypass_ssl_interstitial(popup)
+
+        page.on("popup", handle_popup)
+
+        # Record the files already in the download directory (for fallback detection)
+        existing_files = set(os.listdir(self.download_dir)) if os.path.exists(self.download_dir) else set()
+
         # Click export report and capture the download
         logger.info("Clicking export report...")
-        async with page.expect_download(timeout=60000) as download_info:
-            await export_btn.click()
-
-        download = await download_info.value
-        filename = download.suggested_filename
-        save_path = os.path.join(self.download_dir, filename)
-        await download.save_as(save_path)
-        logger.info(f"Report saved to: {save_path}")
-        return save_path
+        download_timeout = 120000  # 2 minutes, leaving ample time for the SSL bypass
+
+        try:
+            async with page.expect_download(timeout=download_timeout) as download_info:
+                await export_btn.click()
+                logger.info("Waiting for the file download...")
+
+                # Wait for popups to appear
+                await asyncio.sleep(2)
+
+                # Handle the SSL interstitial on the main page first
+                await self._bypass_ssl_interstitial(page)
+
+                # Then handle all popups
+                for popup in popup_pages:
+                    await self._bypass_ssl_interstitial(popup)
+
+            download = await download_info.value
+            filename = download.suggested_filename
+            save_path = os.path.join(self.download_dir, filename)
+            await download.save_as(save_path)
+            logger.info(f"Report saved to: {save_path}")
+            return save_path
+
+        except Exception as download_err:
+            logger.warning(f"Playwright download event capture failed: {download_err}")
+
+            # Handle SSL interstitials on all known pages
+            all_pages = [page] + popup_pages + [p for p in page.context.pages if p != page]
+            for p in all_pages:
+                try:
+                    await self._bypass_ssl_interstitial(p)
+                except Exception:
+                    pass
+
+            # Wait for the download again after bypassing SSL
+            for p in all_pages:
+                try:
+                    async with p.expect_download(timeout=15000) as dl_info:
+                        pass
+                    download = await dl_info.value
+                    filename = download.suggested_filename
+                    save_path = os.path.join(self.download_dir, filename)
+                    await download.save_as(save_path)
+                    logger.info(f"Download from page succeeded after SSL bypass: {save_path}")
+                    return save_path
+                except Exception:
+                    continue
+
+            logger.info("Trying filesystem fallback detection...")
+
+            # Wait briefly so any in-progress download can finish
+            await asyncio.sleep(5)
+
+            new_files = self._find_new_files(existing_files)
+            if new_files:
+                # Pick the most recently modified file
+                latest = max(new_files, key=lambda f: os.path.getmtime(os.path.join(self.download_dir, f)))
+                save_path = os.path.join(self.download_dir, latest)
+                logger.info(f"Filesystem fallback found a new file: {save_path}")
+                return save_path
+
+            logger.warning("Filesystem fallback found no new files either")
+
+            # Save a debug screenshot
+            try:
+                screenshot_path = os.path.join(self.download_dir, f"debug_export_{datetime.now().strftime('%Y%m%d_%H%M%S')}.png")
+                await page.screenshot(path=screenshot_path, full_page=True)
+                logger.info(f"Debug screenshot saved: {screenshot_path}")
+            except Exception as ss_err:
+                logger.warning(f"Failed to save screenshot: {ss_err}")
+
+            # Log the captured response
+            if export_response['status']:
+                logger.info(f"Actual server response: status={export_response['status']}, content-type={export_response['content_type']}")
+                if export_response['body']:
+                    logger.info(f"Response body (first 500 chars): {export_response['body'][:500]}")
+            else:
+                logger.warning("No /api/bill/export response captured; the request may have been blocked or never sent")
+
+            # Check the page for error messages
+            try:
+                error_text = await page.evaluate("""() => {
+                    const msgs = document.querySelectorAll('.t-message--error, .t-notification--error, [class*="error"], .el-message--error');
+                    return Array.from(msgs).map(el => el.textContent.trim()).filter(Boolean).join(' | ');
+                }""")
+                if error_text:
+                    logger.error(f"Page error message: {error_text}")
+            except Exception:
+                pass
+
+            # Check for newly opened tabs (some sites download via window.open)
+            try:
+                pages = page.context.pages
+                if len(pages) > 1:
+                    logger.info(f"Detected {len(pages)} tabs; checking new tabs...")
+                    for p in pages[1:]:
+                        url = p.url
+                        logger.info(f"New tab URL: {url}")
+                        if url.startswith('blob:') or 'download' in url.lower() or 'export' in url.lower():
+                            # Try downloading from the new tab
+                            try:
+                                async with p.expect_download(timeout=30000) as dl_info:
+                                    pass
+                                download = await dl_info.value
+                                filename = download.suggested_filename
+                                save_path = os.path.join(self.download_dir, filename)
+                                await download.save_as(save_path)
+                                logger.info(f"Download from new tab succeeded: {save_path}")
+                                return save_path
+                            except Exception:
+                                pass
+            except Exception:
+                pass
+
+            raise
+
+    def _find_new_files(self, existing_files):
+        """Detect files newly added to the download directory"""
+        if not os.path.exists(self.download_dir):
+            return []
+        current_files = set(os.listdir(self.download_dir))
+        new_files = current_files - existing_files
+        # Filter out temporary files and debug screenshots
+        return [f for f in new_files if not f.endswith(('.crdownload', '.tmp')) and not f.startswith('debug_')]
+
+    async def _bypass_ssl_interstitial(self, page):
+        """
+        Bypass Chrome's SSL certificate error interstitial page
+
+        The SSL certificate of secsion.com's export download link has expired, so Chrome shows the
+        "您的连接不是私密连接" (Your connection is not private) warning page. Click "高级" (Advanced) → "继续前往" (Proceed).
+
+        Returns:
+            bool: True if the interstitial was bypassed; False otherwise (including when no interstitial was present)
+        """
+        try:
+            await page.wait_for_timeout(2000)
+            current_url = page.url
+            logger.debug(f"SSL bypass check: current URL={current_url}")
+
+            # Check whether this is an SSL error page
+            is_ssl_error_page = (
+                'chrome-error' in current_url or
+                'security' in current_url.lower() or
+                await page.evaluate(
+                    """() => {
+                        return document.querySelector('#details-button') !== null ||
+                               document.querySelector('#proceed-link') !== null ||
+                               document.body?.innerText?.includes('您的连接不是私密连接') ||
+                               document.body?.innerText?.includes('NET::ERR_CERT');
+                    }"""
+                )
+            )
+
+            if not is_ssl_error_page:
+                return False
+
+            logger.info("SSL certificate error interstitial detected; attempting bypass...")
+
+            # Click the "高级" (Advanced) button to expand the details
+            details_btn = page.locator('#details-button')
+            if await details_btn.count() > 0:
+                await details_btn.click()
+                await page.wait_for_timeout(500)
+                logger.info("Clicked 'Advanced'")
+
+            # Click "继续前往 xxx(不安全)" (Proceed to xxx (unsafe))
+            proceed_link = page.locator('#proceed-link')
+            if await proceed_link.count() > 0:
+                await proceed_link.click()
+                await page.wait_for_timeout(2000)
+                logger.info("Clicked 'Proceed (unsafe)'; SSL bypass succeeded")
+                return True
+
+            # Fallback: Chinese button text
+            unsafe_link = page.get_by_text('继续前往')
+            if await unsafe_link.count() > 0:
+                await unsafe_link.click()
+                await page.wait_for_timeout(2000)
+                logger.info("Clicked '继续前往' (Proceed); SSL bypass succeeded")
+                return True
+
+            logger.warning("SSL interstitial detected but no bypass button found")
+            return False
+
+        except Exception as e:
+            logger.warning(f"SSL bypass check error: {e}")
+            return False
+
     async def _set_date(self, page, input_box, date_str):
         """
@@ -157,42 +409,54 @@ class SecsionDownloader:
         3. Press Enter on the input box to confirm (critical! without confirmation the value is rolled back on close)
         4. Press Escape to close the calendar
         """
-        for attempt in range(3):
-            logger.info(f"Setting date: {date_str} (attempt {attempt + 1})")
-
-            # 1. Click the input box to open the calendar
-            await input_box.click()
-            await page.wait_for_timeout(500)
-
-            # 2. Click the target day cell
-            target_day = str(int(date_str.split("-")[2]))
-            day_cells = page.get_by_role("cell", name=target_day)
-            cell_count = await day_cells.count()
-
-            if cell_count > 0:
-                await day_cells.first.click()
-                await page.wait_for_timeout(500)
-            else:
-                logger.warning(f"Day cell not found: {target_day}")
-                continue
-
-            # 3. Press Enter to confirm (needconfirm="true" requires explicit confirmation)
-            await input_box.press("Enter")
-            await page.wait_for_timeout(500)
-
-            # 4. Press Escape to close the calendar
-            await page.keyboard.press("Escape")
-            await page.wait_for_timeout(500)
-
-            # 5. Verify
-            val = await input_box.input_value()
-            if date_str in val:
-                logger.info(f"Date set successfully: {val}")
-                return
-
-            logger.warning(f"Date verification failed: expected to contain '{date_str}', got '{val}'")
-
-        logger.error(f"Failed to set date (after 3 attempts): {date_str}")
+        max_attempts = 5
+        for attempt in range(max_attempts):
+            try:
+                logger.info(f"Setting date: {date_str} (attempt {attempt + 1}/{max_attempts})")
+
+                # 1. Click the input box to open the calendar
+                await input_box.click()
+                await page.wait_for_timeout(800)
+
+                # 2. Click the target day cell
+                target_day = str(int(date_str.split("-")[2]))
+                day_cells = page.get_by_role("cell", name=target_day)
+                cell_count = await day_cells.count()
+
+                if cell_count > 0:
+                    logger.debug(f"Found {cell_count} day cell(s); clicking the first")
+                    await day_cells.first.click()
+                    await page.wait_for_timeout(800)
+                else:
+                    logger.warning(f"Day cell not found: {target_day}; retrying...")
+                    await page.keyboard.press("Escape")
+                    await page.wait_for_timeout(500)
+                    continue
+
+                # 3. Press Enter to confirm (needconfirm="true" requires explicit confirmation)
+                await input_box.press("Enter")
+                await page.wait_for_timeout(800)
+
+                # 4. Press Escape to close the calendar
+                await page.keyboard.press("Escape")
+                await page.wait_for_timeout(800)
+
+                # 5. Verify
+                val = await input_box.input_value()
+                if date_str in val:
+                    logger.info(f"Date set successfully: {val}")
+                    return
+
+                logger.warning(f"Date verification failed: expected to contain '{date_str}', got '{val}'; retrying...")
+                await page.wait_for_timeout(500)
+
+            except Exception as e:
+                logger.warning(f"Date setting error (attempt {attempt + 1}/{max_attempts}): {e}")
+                await page.keyboard.press("Escape")
+                await page.wait_for_timeout(500)
+                continue
+
+        logger.error(f"Failed to set date (after {max_attempts} attempts): {date_str}")


 async def download_report(start_date, end_date, username=None, password=None, download_dir=None, shop_id=None):
@@ -19,3 +19,28 @@ services:
       - ./data:/app/data
       - ./downloads:/app/downloads
     restart: unless-stopped
+    # Docker resource tuning
+    deploy:
+      resources:
+        limits:
+          cpus: '2'
+          memory: 2G
+        reservations:
+          cpus: '1'
+          memory: 1G
+    # Network tuning
+    networks:
+      - default
+    dns:
+      - 8.8.8.8
+      - 1.1.1.1
+    # Add logging configuration
+    logging:
+      driver: "json-file"
+      options:
+        max-size: "10m"
+        max-file: "3"
+
+networks:
+  default:
+    driver: bridge
@@ -0,0 +1,52 @@
#!/bin/bash
# Docker debug script - for testing the automatic download feature

set -e

echo "🐳 Docker auto-download debug script"
echo "=============================="

# Argument check
if [ "$#" -lt 3 ]; then
    echo "Usage: ./docker-debug.sh <username> <password> <start_date> [end_date]"
    echo "Example: ./docker-debug.sh 15682076681 123456 2026-05-15 2026-05-17"
    exit 1
fi

USERNAME=$1
PASSWORD=$2
START_DATE=$3
END_DATE=${4:-$START_DATE}

echo "📝 Parameters:"
echo "  Username: $USERNAME"
echo "  Start date: $START_DATE"
echo "  End date: $END_DATE"
echo ""

# Check whether the container is running
if ! docker ps | grep -q saleshow-app; then
    echo "❌ Container is not running, starting it..."
    docker-compose up -d
    echo "⏳ Waiting for the container to start..."
    sleep 5
fi

echo "📊 Container status:"
docker ps --filter name=saleshow-app --format "table {{.ID}}\t{{.Status}}\t{{.Ports}}"
echo ""

echo "🔍 Running download test..."
docker exec saleshow-app python -m automation.secsion \
    --username "$USERNAME" \
    --password "$PASSWORD" \
    --start "$START_DATE" \
    --end "$END_DATE"

echo ""
echo "✅ Checking download results:"
docker exec saleshow-app ls -lh downloads/

echo ""
echo "📋 Recent logs:"
docker logs --tail 20 saleshow-app