mirror of
https://gitee.com/houhuan/TrendRadar.git
synced 2025-12-21 08:17:17 +08:00
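feat: v2.0.0 major refactor
- Configuration management refactor: everything now goes through config/config.yaml
- Run mode upgrade: three modes are supported, daily/current/incremental
- Docker support: a complete containerized deployment setup
- New configuration files: config/config.yaml and config/frequency_words.txt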
parent
ae64109002
commit
3fbbfeff10
35
.dockerignore
Normal file
@@ -0,0 +1,35 @@
.git/
.gitignore
*.md
README.md

output/

__pycache__/
*.pyc
*.pyo
*.pyd
.Python
*.so
.pytest_cache/

.vscode/
.idea/
*.swp
*.swo
*~

.DS_Store
Thumbs.db

docker/.env

_image/

.github/

*.log
.env.local
.env.*.local
version
index.html
123
.github/ISSUE_TEMPLATE/01-bug-report.yml
vendored
Normal file
@@ -0,0 +1,123 @@
# yaml-language-server: $schema=https://json.schemastore.org/github-issue-forms.json

name: 🐛 Bug Report
description: Report a program error or abnormal behavior
title: "[Bug] "
labels: ["bug"]
assignees:
  - sansan0
body:
  - type: markdown
    attributes:
      value: |
        Thanks for your feedback! A detailed bug report helps locate and fix the problem quickly.

  - type: dropdown
    id: bug-category
    attributes:
      label: 🏷️ Problem Type
      options:
        - Data fetching failure
        - Frequency word matching anomaly
        - Notification delivery failure
        - Configuration parsing error
        - Docker deployment problem
        - Report generation error
        - Performance problem
        - Other
    validations:
      required: true

  - type: textarea
    id: bug-description
    attributes:
      label: 📝 Problem Description
      description: Describe the problem you ran into in detail
      placeholder: |
        Please describe clearly:
        - What went wrong
        - How the problem shows up
        - The scope of impact
    validations:
      required: true

  - type: textarea
    id: reproduction-steps
    attributes:
      label: 🔄 Steps to Reproduce
      description: How can this problem be reproduced?
      placeholder: |
        1. Command run: python main.py
        2. Configuration used: ...
        3. Observed behavior: ...
    validations:
      required: true

  - type: textarea
    id: expected-behavior
    attributes:
      label: ✅ Expected Behavior
      description: What should happen under normal conditions?
      placeholder: Describe the expected normal behavior...
    validations:
      required: true

  - type: dropdown
    id: environment
    attributes:
      label: 🖥️ Runtime Environment
      options:
        - Local Python environment
        - Docker container
        - GitHub Actions
        - Other
    validations:
      required: true

  - type: input
    id: python-version
    attributes:
      label: 🐍 Python Version
      description: The Python version in use
      placeholder: e.g. 3.10.0
    validations:
      required: true

  - type: textarea
    id: error-logs
    attributes:
      label: 📋 Error Logs
      description: Relevant error messages or logs
      placeholder: |
        Please paste the full error stack trace or the relevant logs:

        ```
        Error message...
        ```
    validations:
      required: true

  - type: textarea
    id: config-info
    attributes:
      label: ⚙️ Configuration
      description: Relevant configuration settings (please hide sensitive information)
      placeholder: |
        Relevant configuration snippet (please remove sensitive information such as webhook URLs):

        ```yaml
        crawler:
          request_interval: 1000
          enable_crawler: true
        ```

  - type: textarea
    id: additional-context
    attributes:
      label: 📎 Additional Information
      description: Any other information that might be useful
      placeholder: |
        - Network environment
        - Platform configuration in use
        - Frequency word configuration
        - Other relevant information
98
.github/ISSUE_TEMPLATE/02-feature-request.yml
vendored
Normal file
@@ -0,0 +1,98 @@
# yaml-language-server: $schema=https://json.schemastore.org/github-issue-forms.json

name: 💡 Feature Request
description: Propose a new feature idea or an improvement
title: "[Feature] "
labels: ["enhancement"]
assignees:
  - sansan0
body:
  - type: markdown
    attributes:
      value: |
        Thanks for your idea! Good ideas make this project more useful.

  - type: dropdown
    id: feature-category
    attributes:
      label: 🏷️ Feature Category
      options:
        - Data fetching enhancement
        - Analysis algorithm improvement
        - Additional notification channels
        - Configuration management improvement
        - Ease of deployment
        - Data visualization
        - Monitoring and alerting
        - Other
    validations:
      required: true

  - type: textarea
    id: feature-description
    attributes:
      label: 💭 Feature Description
      description: Describe the feature you would like added in detail
      placeholder: |
        Please describe in detail:
        - What exactly the feature does
        - How the feature would be used
        - The expected result
    validations:
      required: true

  - type: textarea
    id: use-case
    attributes:
      label: 🎯 Use Case
      description: In what situations would this feature be used?
      placeholder: |
        For example:
        - When the user needs to ...
        - It would be helpful in the ... scenario
        - It can solve the ... problem
    validations:
      required: true

  - type: textarea
    id: current-workaround
    attributes:
      label: 🔄 Current Workaround
      description: How is this need handled at the moment?
      placeholder: |
        - The current alternative
        - Shortcomings of the current approach
        - Why a new feature is needed

  - type: dropdown
    id: feature-priority
    attributes:
      label: ⭐ Priority
      description: How important you think this feature is
      options:
        - High - very important, strongly recommended
        - Medium - would be useful, worth considering
        - Low - nice to have, can be done when time allows
    validations:
      required: true

  - type: textarea
    id: implementation-ideas
    attributes:
      label: 🛠️ Implementation Ideas (optional)
      description: If you have an idea of how to implement it, feel free to share
      placeholder: |
        - Technical implementation approach
        - Configuration design suggestions
        - User interface design
        - Reference projects or tools

  - type: textarea
    id: additional-context
    attributes:
      label: 📎 Additional Information
      description: Anything else you would like to add
      placeholder: |
        - Related screenshots or examples
        - References to similar features
        - Other related ideas
108
.github/ISSUE_TEMPLATE/03-config-help.yml
vendored
Normal file
@@ -0,0 +1,108 @@
# yaml-language-server: $schema=https://json.schemastore.org/github-issue-forms.json

name: ⚙️ Configuration Help
description: Configuration-related questions or optimization suggestions
title: "[Config] "
labels: ["配置", "帮助"]
assignees:
  - sansan0
body:
  - type: markdown
    attributes:
      value: |
        Having trouble with configuration? We will help you sort it out!

        📖 **We recommend checking the README and the comments in the config files first**

  - type: dropdown
    id: config-type
    attributes:
      label: 🏷️ Configuration Type
      options:
        - Basic configuration problem
        - Webhook configuration
        - Docker deployment configuration
        - Frequency word configuration
        - Platform configuration
        - Scheduled task configuration
        - Other
    validations:
      required: true

  - type: dropdown
    id: problem-type
    attributes:
      label: 🔍 Problem Type
      options:
        - Configuration does not take effect
        - Not sure how to configure
        - Configuration optimization suggestion
        - Configuration documentation improvement
        - Configuration validation problem
    validations:
      required: true

  - type: textarea
    id: config-content
    attributes:
      label: 📄 Relevant Configuration
      description: Please provide the relevant configuration (hide sensitive information)
      placeholder: |
        Please paste the relevant configuration snippet (remember to hide sensitive information):

        ```yaml
        notification:
          enable_notification: true
          webhooks:
            feishu_url: "***"
        ```
    validations:
      required: true

  - type: textarea
    id: problem-description
    attributes:
      label: 📝 Problem Description
      description: Describe the configuration problem in detail
      placeholder: |
        Please describe in detail:
        - The specific problem you ran into
        - The expected result
        - What you have already tried
    validations:
      required: true

  - type: textarea
    id: error-messages
    attributes:
      label: ❌ Error Messages (if any)
      description: Relevant error messages or logs
      placeholder: |
        If there are error messages, please paste them:

        ```
        Error message...
        ```

  - type: dropdown
    id: environment
    attributes:
      label: 🖥️ Runtime Environment
      options:
        - Local Python environment
        - Docker container
        - GitHub Actions
        - Other
    validations:
      required: true

  - type: textarea
    id: additional-info
    attributes:
      label: 📎 Additional Information
      description: Any other information that might be useful
      placeholder: |
        - Operating system version
        - Network environment
        - Use case
        - Other relevant information
3
.github/ISSUE_TEMPLATE/config.yml
vendored
Normal file
@@ -0,0 +1,3 @@
# yaml-language-server: $schema=https://json.schemastore.org/github-issue-config.json

blank_issues_enabled: false
87
.github/workflows/crawler.yml
vendored
@@ -2,7 +2,7 @@ name: Hot News Crawler

 on:
   schedule:
-    - cron: '*/50 * * * *' # Run every 50 minutes
+    - cron: "*/30 * * * *" # Run every 30 minutes
   workflow_dispatch:

 # Add permission settings
@@ -12,41 +12,52 @@ permissions:
 jobs:
   crawl:
     runs-on: ubuntu-latest
-
+
     steps:
-    - name: Checkout repository
-      uses: actions/checkout@v3
-
-    - name: Set up Python
-      uses: actions/setup-python@v4
-      with:
-        python-version: '3.9'
-
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install requests pytz
-
-    - name: Create frequency_words.txt if not exists
-      run: |
-        if [ ! -f frequency_words.txt ]; then
-          echo "Creating empty frequency_words.txt file"
-          touch frequency_words.txt
-        fi
-
-    - name: Run crawler
-      env:
-        FEISHU_WEBHOOK_URL: ${{ secrets.FEISHU_WEBHOOK_URL }}
-        TELEGRAM_BOT_TOKEN: ${{ secrets.TELEGRAM_BOT_TOKEN }}
-        TELEGRAM_CHAT_ID: ${{ secrets.TELEGRAM_CHAT_ID }}
-        DINGTALK_WEBHOOK_URL: ${{ secrets.DINGTALK_WEBHOOK_URL }}
-        WEWORK_WEBHOOK_URL: ${{ secrets.WEWORK_WEBHOOK_URL }}
-        GITHUB_ACTIONS: true
-      run: python main.py
-
-    - name: Commit and push if changes
-      run: |
-        git config --global user.name 'GitHub Actions'
-        git config --global user.email 'actions@github.com'
-        git add -A
-        git diff --quiet && git diff --staged --quiet || (git commit -m "Auto update by GitHub Actions at $(TZ=Asia/Shanghai date)" && git push)
+      - name: Checkout repository
+        uses: actions/checkout@v3
+
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.9"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+
+      - name: Verify required files
+        run: |
+          echo "🔍 Checking required configuration files..."
+
+          if [ ! -f config/config.yaml ]; then
+            echo "❌ Error: config/config.yaml does not exist"
+            echo "Please create the configuration file by following the project documentation"
+            exit 1
+          fi
+
+          if [ ! -f config/frequency_words.txt ]; then
+            echo "❌ Error: config/frequency_words.txt does not exist"
+            echo "Please create the frequency word file by following the project documentation"
+            exit 1
+          fi
+
+          echo "✅ Configuration files check passed"
+
+      - name: Run crawler
+        env:
+          FEISHU_WEBHOOK_URL: ${{ secrets.FEISHU_WEBHOOK_URL }}
+          TELEGRAM_BOT_TOKEN: ${{ secrets.TELEGRAM_BOT_TOKEN }}
+          TELEGRAM_CHAT_ID: ${{ secrets.TELEGRAM_CHAT_ID }}
+          DINGTALK_WEBHOOK_URL: ${{ secrets.DINGTALK_WEBHOOK_URL }}
+          WEWORK_WEBHOOK_URL: ${{ secrets.WEWORK_WEBHOOK_URL }}
+          GITHUB_ACTIONS: true
+        run: python main.py
+
+      - name: Commit and push if changes
+        run: |
+          git config --global user.name 'GitHub Actions'
+          git config --global user.email 'actions@github.com'
+          git add -A
+          git diff --quiet && git diff --staged --quiet || (git commit -m "Auto update by GitHub Actions at $(TZ=Asia/Shanghai date)" && git push)
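The schedule change in this workflow is easier to judge once the `*/N` step syntax is expanded. The sketch below is only an illustration: `step_minutes` is a hypothetical helper, not part of TrendRadar, and it assumes the standard cron reading in which a step on the minute field matches every minute of the hour that is divisible by N.

```python
def step_minutes(step: int) -> list[int]:
    """Minutes of the hour matched by a cron minute field of the form */step."""
    if not 1 <= step <= 59:
        raise ValueError("step must be between 1 and 59")
    # The step restarts at minute 0 of every hour; it does not carry over.
    return [minute for minute in range(60) if minute % step == 0]


if __name__ == "__main__":
    print("*/30 ->", step_minutes(30))
    print("*/50 ->", step_minutes(50))
```

Under that reading the old `*/50` expression fires at minute 0 and minute 50 of each hour, so the gaps alternate between 50 and 10 minutes, while `*/30` is evenly spaced. Scheduled GitHub Actions runs can also start late when the service is busy, so the actual spacing may drift.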
72
config/config.yaml
Normal file
@@ -0,0 +1,72 @@
app:
  version: "2.0.0"
  version_check_url: "https://raw.githubusercontent.com/sansan0/TrendRadar/refs/heads/master/version"
  show_version_update: true # Controls the version update notice; set to false to stop receiving new-version notices

crawler:
  request_interval: 1000 # Request interval (milliseconds)
  enable_crawler: true # Whether to crawl news; when false the program stops immediately
  use_proxy: false # Whether to use a proxy; false turns it off
  default_proxy: "http://127.0.0.1:10086"

# 🔸 daily (daily summary mode)
#    • Push timing: on schedule
#    • Content shown: all matched news of the day + a newly added news section
#    • Use case: daily digest, a full view of the day's trending topics
#
# 🔸 current (current ranking mode)
#    • Push timing: on schedule
#    • Content shown: matched news on the current ranking + a newly added news section
#    • Use case: real-time trend tracking, seeing what is hottest right now
#
# 🔸 incremental (incremental monitoring mode)
#    • Push timing: only when there is something new
#    • Content shown: newly appeared news that matches the frequency words
#    • Use case: avoiding repeated information

report:
  mode: "daily" # Options: "daily"|"incremental"|"current"
  rank_threshold: 5 # Rank highlight threshold

notification:
  enable_notification: true # Whether to enable notifications; when false no phone notifications are sent
  message_batch_size: 4000 # Message batch size (bytes) (do not change this setting)
  batch_send_interval: 1 # Interval between batches (seconds)
  feishu_message_separator: "━━━━━━━━━━━━━━━━━━━" # Feishu message separator line

  webhooks:
    feishu_url: "" # Webhook URL of the Feishu bot
    dingtalk_url: "" # Webhook URL of the DingTalk bot
    wework_url: "" # Webhook URL of the WeCom (企业微信) bot
    telegram_bot_token: "" # Telegram Bot Token
    telegram_chat_id: "" # Telegram Chat ID

# Used to show higher-attention news closer to the top; the three weights only need to sum to 1
weight:
  rank_weight: 0.6 # Rank weight
  frequency_weight: 0.3 # Frequency weight
  hotness_weight: 0.1 # Hotness weight

platforms:
  - id: "toutiao"
    name: "今日头条"
  - id: "baidu"
    name: "百度热搜"
  - id: "wallstreetcn-hot"
    name: "华尔街见闻"
  - id: "thepaper"
    name: "澎湃新闻"
  - id: "bilibili-hot-search"
    name: "bilibili 热搜"
  - id: "cls-hot"
    name: "财联社热门"
  - id: "ifeng"
    name: "凤凰网"
  - id: "tieba"
    name: "贴吧"
  - id: "weibo"
    name: "微博"
  - id: "douyin"
    name: "抖音"
  - id: "zhihu"
    name: "知乎"
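The `weight` section of config/config.yaml only says that the three weights should sum to 1. A minimal sketch of what that constraint implies is shown below. How `main.py` actually combines the weights is not visible in this diff, so the linear combination, the 0 to 100 component scores, and the names `check_weights` and `combined_score` are all assumptions made for illustration.

```python
import math

# Values copied from the weight section of config/config.yaml.
WEIGHTS = {"rank_weight": 0.6, "frequency_weight": 0.3, "hotness_weight": 0.1}


def check_weights(weights: dict[str, float]) -> None:
    """Raise ValueError unless the weights sum to 1 (within float tolerance)."""
    total = sum(weights.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights sum to {total}, expected 1")


def combined_score(rank: float, frequency: float, hotness: float,
                   weights: dict[str, float] = WEIGHTS) -> float:
    """Assumed linear blend of three component scores on the same scale."""
    return (weights["rank_weight"] * rank
            + weights["frequency_weight"] * frequency
            + weights["hotness_weight"] * hotness)


if __name__ == "__main__":
    check_weights(WEIGHTS)
    print(combined_score(rank=100, frequency=50, hotness=0))
```

With weights that sum to 1, the blended score stays on the same scale as its inputs, which is one plausible reason for the constraint.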
@@ -41,6 +41,10 @@ HarmonyOS
字节
张一鸣

小米
雷军
xiaomi

马斯克
特斯拉

@@ -59,11 +63,14 @@ gemini
deepmind

chatgpt
Sam Altman
openai

claude
Anthropic

苹果
库克
iphone
ipad
mac
@@ -75,18 +82,25 @@ ai

汽车
自动驾驶
l3

手机

机器人

国产
中国

美国

芯片
半导体
光刻机

科技

核能

新质生产力

月球
登月
火星
11
docker/.env
Normal file
@@ -0,0 +1,11 @@
# Webhook configuration
FEISHU_WEBHOOK_URL=
TELEGRAM_BOT_TOKEN=
TELEGRAM_CHAT_ID=
DINGTALK_WEBHOOK_URL=
WEWORK_WEBHOOK_URL=

# Runtime configuration
CRON_SCHEDULE=*/30 * * * * # Cron expression; runs every 30 minutes (for example at 8:00, 8:30, 9:00, 9:30 and so on)
RUN_MODE=cron # Run mode: cron/once
IMMEDIATE_RUN=true # Run once immediately on startup
36
docker/Dockerfile
Normal file
@@ -0,0 +1,36 @@
FROM python:3.10-slim

WORKDIR /app

# https://github.com/aptible/supercronic/releases
ENV SUPERCRONIC_URL=https://github.com/aptible/supercronic/releases/download/v0.2.34/supercronic-linux-amd64 \
    SUPERCRONIC_SHA1SUM=e8631edc1775000d119b70fd40339a7238eece14 \
    SUPERCRONIC=supercronic-linux-amd64

RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    curl -fsSLO "$SUPERCRONIC_URL" && \
    echo "${SUPERCRONIC_SHA1SUM}  ${SUPERCRONIC}" | sha1sum -c - && \
    chmod +x "$SUPERCRONIC" && \
    mv "$SUPERCRONIC" "/usr/local/bin/${SUPERCRONIC}" && \
    ln -s "/usr/local/bin/${SUPERCRONIC}" /usr/local/bin/supercronic && \
    apt-get remove -y curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY main.py .
COPY docker/manage.py .
COPY docker/entrypoint.sh /entrypoint.sh

RUN chmod +x /entrypoint.sh && \
    chmod +x manage.py && \
    mkdir -p /app/config /app/output

ENV PYTHONUNBUFFERED=1 \
    CONFIG_PATH=/app/config/config.yaml \
    FREQUENCY_WORDS_PATH=/app/config/frequency_words.txt

ENTRYPOINT ["/entrypoint.sh"]
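The Dockerfile pins the supercronic binary to a SHA-1 digest and aborts the build if `sha1sum -c` reports a mismatch. The same idea can be sketched in Python as follows; `sha1_matches` is a hypothetical helper written for illustration and is not something the image itself uses.

```python
import hashlib
import hmac


def sha1_matches(data: bytes, expected_hex: str) -> bool:
    """Return True when the SHA-1 digest of data equals the pinned hex digest."""
    actual_hex = hashlib.sha1(data).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual_hex, expected_hex.strip().lower())


if __name__ == "__main__":
    # SHA-1 of the empty byte string, used here only as a well-known test vector.
    empty_digest = "da39a3ee5e6b4b0d3255bfef95601890afd80709"
    print(sha1_matches(b"", empty_digest))
    print(sha1_matches(b"tampered", empty_digest))
```

Pinning a digest next to the download URL means a silently replaced release asset fails the build instead of ending up in the image.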
22
docker/docker-compose.yml
Normal file
@@ -0,0 +1,22 @@
services:
  trend-radar:
    build:
      context: ..
      dockerfile: docker/Dockerfile
    container_name: trend-radar
    restart: unless-stopped

    volumes:
      - ../config:/app/config:ro
      - ../output:/app/output

    environment:
      - TZ=Asia/Shanghai
      - FEISHU_WEBHOOK_URL=${FEISHU_WEBHOOK_URL:-}
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN:-}
      - TELEGRAM_CHAT_ID=${TELEGRAM_CHAT_ID:-}
      - DINGTALK_WEBHOOK_URL=${DINGTALK_WEBHOOK_URL:-}
      - WEWORK_WEBHOOK_URL=${WEWORK_WEBHOOK_URL:-}
      - CRON_SCHEDULE=${CRON_SCHEDULE:-*/5 * * * *}
      - RUN_MODE=${RUN_MODE:-cron}
      - IMMEDIATE_RUN=${IMMEDIATE_RUN:-true}
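The `${VAR:-default}` entries in docker-compose.yml fall back to the default when the variable is unset or empty. A small sketch of that rule, with `resolve` as a hypothetical stand-in for Compose's interpolation, makes the effective values easier to see:

```python
def resolve(env: dict[str, str], name: str, default: str = "") -> str:
    """Mimic ${NAME:-default}: use the default when NAME is unset or empty."""
    value = env.get(name, "")
    return value if value else default


if __name__ == "__main__":
    compose_default = "*/5 * * * *"  # default written in docker-compose.yml

    # With docker/.env loaded, its value wins.
    with_env_file = {"CRON_SCHEDULE": "*/30 * * * *"}
    print(resolve(with_env_file, "CRON_SCHEDULE", compose_default))

    # Without it, the compose default applies.
    print(resolve({}, "CRON_SCHEDULE", compose_default))
```

One thing this highlights: as written in this commit, the compose fallback is `*/5 * * * *`, while docker/.env and docker/entrypoint.sh both use `*/30 * * * *`. If the `.env` file is missing, the container would therefore run every 5 minutes, not every 30.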
44
docker/entrypoint.sh
Normal file
@@ -0,0 +1,44 @@
#!/bin/bash
set -e

# Check the configuration files
if [ ! -f "/app/config/config.yaml" ] || [ ! -f "/app/config/frequency_words.txt" ]; then
    echo "❌ Configuration files are missing"
    exit 1
fi

# Save the environment variables
env >> /etc/environment

case "${RUN_MODE:-cron}" in
"once")
    echo "🔄 Single run"
    exec /usr/local/bin/python main.py
    ;;
"cron")
    # Generate the crontab
    echo "${CRON_SCHEDULE:-*/30 * * * *} cd /app && /usr/local/bin/python main.py" > /tmp/crontab

    echo "📅 Generated crontab:"
    cat /tmp/crontab

    if ! /usr/local/bin/supercronic-linux-amd64 -test /tmp/crontab; then
        echo "❌ crontab format validation failed"
        exit 1
    fi

    # Run once immediately (if configured)
    if [ "${IMMEDIATE_RUN:-false}" = "true" ]; then
        echo "▶️ Running once immediately"
        /usr/local/bin/python main.py
    fi

    echo "⏰ Starting supercronic: ${CRON_SCHEDULE:-*/30 * * * *}"
    echo "🎯 supercronic will run as PID 1"

    exec /usr/local/bin/supercronic-linux-amd64 -passthrough-logs /tmp/crontab
    ;;
*)
    exec "$@"
    ;;
esac
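The control flow of docker/entrypoint.sh can be summarized in a few lines. The sketch below mirrors the script's defaults and branches in Python purely for illustration; `build_crontab_line` and `startup_steps` are hypothetical names, and the step labels are descriptive strings, not real commands.

```python
DEFAULT_SCHEDULE = "*/30 * * * *"  # fallback used by entrypoint.sh
RUN_COMMAND = "cd /app && /usr/local/bin/python main.py"


def build_crontab_line(env: dict[str, str]) -> str:
    """Build the single crontab line the script writes to /tmp/crontab."""
    schedule = env.get("CRON_SCHEDULE") or DEFAULT_SCHEDULE
    return f"{schedule} {RUN_COMMAND}"


def startup_steps(env: dict[str, str]) -> list[str]:
    """List, in order, what the entrypoint does for a given environment."""
    mode = env.get("RUN_MODE") or "cron"
    if mode == "once":
        return ["run main.py"]
    if mode == "cron":
        steps = ["write crontab", "validate crontab"]
        if (env.get("IMMEDIATE_RUN") or "false") == "true":
            steps.append("run main.py")
        steps.append("exec supercronic")
        return steps
    return ["exec passed command"]


if __name__ == "__main__":
    print(build_crontab_line({}))
    print(startup_steps({"RUN_MODE": "cron", "IMMEDIATE_RUN": "true"}))
```

Because both branches end in `exec`, the crawler (in `once` mode) or supercronic (in `cron` mode) replaces the shell and becomes PID 1, which is what the `status` command in docker/manage.py checks for.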
472
docker/manage.py
Normal file
@@ -0,0 +1,472 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
News crawler container management tool - supercronic
"""

import os
import sys
import subprocess
import time
from pathlib import Path


def run_command(cmd, shell=True, capture_output=True):
    """Run a system command"""
    try:
        result = subprocess.run(
            cmd, shell=shell, capture_output=capture_output, text=True
        )
        return result.returncode == 0, result.stdout, result.stderr
    except Exception as e:
        return False, "", str(e)


def manual_run():
    """Run the crawler once by hand"""
    print("🔄 Running the crawler manually...")
    try:
        result = subprocess.run(
            ["python", "main.py"], cwd="/app", capture_output=False, text=True
        )
        if result.returncode == 0:
            print("✅ Run finished")
        else:
            print(f"❌ Run failed, exit code: {result.returncode}")
    except Exception as e:
        print(f"❌ Error while running: {e}")


def parse_cron_schedule(cron_expr):
    """Parse a cron expression and return a human-readable description"""
    if not cron_expr or cron_expr == "not set":
        return "not set"

    try:
        parts = cron_expr.strip().split()
        if len(parts) != 5:
            return f"raw expression: {cron_expr}"

        minute, hour, day, month, weekday = parts

        # Minute field
        if minute == "*":
            minute_desc = "every minute"
        elif minute.startswith("*/"):
            interval = minute[2:]
            minute_desc = f"every {interval} minutes"
        else:
            # Covers both single values and comma-separated lists
            minute_desc = f"at minute {minute}"

        # Hour field
        if hour == "*":
            hour_desc = "every hour"
        elif hour.startswith("*/"):
            interval = hour[2:]
            hour_desc = f"every {interval} hours"
        else:
            # Covers both single values and comma-separated lists
            hour_desc = f"at {hour} o'clock"

        # Day-of-month field
        if day == "*":
            day_desc = "every day"
        elif day.startswith("*/"):
            interval = day[2:]
            day_desc = f"every {interval} days"
        else:
            day_desc = f"on day {day} of the month"

        # Month field
        if month == "*":
            month_desc = "every month"
        else:
            month_desc = f"in month {month}"

        # Weekday field
        weekday_names = {
            "0": "Sunday", "1": "Monday", "2": "Tuesday", "3": "Wednesday",
            "4": "Thursday", "5": "Friday", "6": "Saturday", "7": "Sunday"
        }
        if weekday == "*":
            weekday_desc = ""
        else:
            weekday_desc = f"on {weekday_names.get(weekday, weekday)}"

        # Combine the description
        if minute.startswith("*/") and hour == "*" and day == "*" and month == "*" and weekday == "*":
            # Simple interval pattern, e.g. */30 * * * *
            return f"runs every {minute[2:]} minutes"
        elif hour != "*" and minute != "*" and day == "*" and month == "*" and weekday == "*":
            # A fixed time every day, e.g. 0 9 * * *
            return f"runs every day at {hour}:{minute.zfill(2)}"
        elif weekday != "*" and day == "*":
            # A fixed time every week
            return f"runs {weekday_desc} at {hour}:{minute.zfill(2)}"
        else:
            # Complex pattern, show the details
            desc_parts = [part for part in [month_desc, day_desc, weekday_desc, hour_desc, minute_desc] if part and part != "every month" and part != "every day" and part != "every hour"]
            if desc_parts:
                return "runs " + ", ".join(desc_parts)
            else:
                return f"complex expression: {cron_expr}"

    except Exception:
        return f"failed to parse: {cron_expr}"


def show_status():
    """Show the container status"""
    print("📊 Container status:")

    # Check the PID 1 state
    supercronic_is_pid1 = False
    pid1_cmdline = ""
    try:
        with open('/proc/1/cmdline', 'r') as f:
            pid1_cmdline = f.read().replace('\x00', ' ').strip()
        print(f" 🔍 PID 1 process: {pid1_cmdline}")

        if "supercronic" in pid1_cmdline.lower():
            print(" ✅ supercronic is correctly running as PID 1")
            supercronic_is_pid1 = True
        else:
            print(" ❌ PID 1 is not supercronic")
            print(f" 📋 Actual PID 1: {pid1_cmdline}")
    except Exception as e:
        print(f" ❌ Unable to read PID 1 info: {e}")

    # Check the environment variables
    cron_schedule = os.environ.get("CRON_SCHEDULE", "not set")
    run_mode = os.environ.get("RUN_MODE", "not set")
    immediate_run = os.environ.get("IMMEDIATE_RUN", "not set")

    print(" ⚙️ Runtime configuration:")
    print(f" CRON_SCHEDULE: {cron_schedule}")

    # Parse the cron expression and show what it means
    cron_description = parse_cron_schedule(cron_schedule)
    print(f" ⏰ Run frequency: {cron_description}")

    print(f" RUN_MODE: {run_mode}")
    print(f" IMMEDIATE_RUN: {immediate_run}")

    # Check the configuration files
    config_files = ["/app/config/config.yaml", "/app/config/frequency_words.txt"]
    print(" 📁 Configuration files:")
    for file_path in config_files:
        if Path(file_path).exists():
            print(f" ✅ {Path(file_path).name}")
        else:
            print(f" ❌ {Path(file_path).name} missing")

    # Check the key files
    key_files = [
        ("/usr/local/bin/supercronic-linux-amd64", "supercronic binary"),
        ("/usr/local/bin/supercronic", "supercronic symlink"),
        ("/tmp/crontab", "crontab file"),
        ("/entrypoint.sh", "entrypoint script")
    ]

    print(" 📂 Key file check:")
    for file_path, description in key_files:
        if Path(file_path).exists():
            print(f" ✅ {description}: present")
            # For the crontab file, show its content
            if file_path == "/tmp/crontab":
                try:
                    with open(file_path, 'r') as f:
                        crontab_content = f.read().strip()
                    print(f" Content: {crontab_content}")
                except Exception:
                    pass
        else:
            print(f" ❌ {description}: not found")

    # Check the container uptime
    print(" ⏱️ Container time info:")
    try:
        # Check the start time of PID 1
        with open('/proc/1/stat', 'r') as f:
            stat_content = f.read().strip().split()
            if len(stat_content) >= 22:
                # starttime is the 22nd field (index 21)
                starttime_ticks = int(stat_content[21])

                # Read the system boot time
                with open('/proc/stat', 'r') as stat_f:
                    for line in stat_f:
                        if line.startswith('btime'):
                            boot_time = int(line.split()[1])
                            break
                    else:
                        boot_time = 0

                # Read the system clock tick rate
                clock_ticks = os.sysconf(os.sysconf_names['SC_CLK_TCK'])

                if boot_time > 0:
                    pid1_start_time = boot_time + (starttime_ticks / clock_ticks)
                    current_time = time.time()
                    uptime_seconds = int(current_time - pid1_start_time)
                    uptime_minutes = uptime_seconds // 60
                    uptime_hours = uptime_minutes // 60

                    if uptime_hours > 0:
                        print(f" PID 1 uptime: {uptime_hours} h {uptime_minutes % 60} min")
                    else:
                        print(f" PID 1 uptime: {uptime_minutes} min ({uptime_seconds} s)")
                else:
                    print(" PID 1 uptime: cannot be computed precisely")
            else:
                print(" ❌ Unable to parse PID 1 stat info")
    except Exception as e:
        print(f" ❌ Time check failed: {e}")

    # Status summary and suggestions
    print(" 📊 Status summary:")
    if supercronic_is_pid1:
        print(" ✅ supercronic is correctly running as PID 1")
        print(" ✅ Scheduled tasks should work normally")

        # Show the current schedule
        if cron_schedule != "not set":
            print(f" ⏰ Current schedule: {cron_description}")

            # Offer a few common scheduling hints
            if "minute" in cron_description and "every 30 minutes" not in cron_description and "every 60 minutes" not in cron_description:
                print(" 💡 Frequent-run mode, suited to real-time monitoring")
            elif "hour" in cron_description:
                print(" 💡 Hourly-run mode, suited to periodic summaries")
            elif "every day" in cron_description or " days" in cron_description:
                print(" 💡 Daily-run mode, suited to daily reports")

        print(" 💡 If scheduled tasks do not run, check:")
        print(" • whether the crontab format is correct")
        print(" • whether the time zone is set correctly")
        print(" • whether the application has errors")
    else:
        print(" ❌ supercronic is in an abnormal state")
        if pid1_cmdline:
            print(f" 📋 Current PID 1: {pid1_cmdline}")
        print(" 💡 Suggested actions:")
        print(" • Restart the container: docker restart trend-radar")
        print(" • Check the container logs: docker logs trend-radar")

    # Show log-checking suggestions
    print(" 📋 Runtime status checks:")
    print(" • View the full container logs: docker logs trend-radar")
    print(" • Follow the logs live: docker logs -f trend-radar")
    print(" • Run a manual test: python manage.py run")
    print(" • Restart the container service: docker restart trend-radar")


def show_config():
    """Show the current configuration"""
    print("⚙️ Current configuration:")

    env_vars = [
        "CRON_SCHEDULE",
        "RUN_MODE",
        "IMMEDIATE_RUN",
        "FEISHU_WEBHOOK_URL",
        "DINGTALK_WEBHOOK_URL",
        "WEWORK_WEBHOOK_URL",
        "TELEGRAM_BOT_TOKEN",
        "TELEGRAM_CHAT_ID",
        "CONFIG_PATH",
        "FREQUENCY_WORDS_PATH",
    ]

    for var in env_vars:
        value = os.environ.get(var, "not set")
        # Hide sensitive information
        if any(sensitive in var for sensitive in ["WEBHOOK", "TOKEN", "KEY"]):
            if value and value != "not set":
                masked_value = value[:10] + "***" if len(value) > 10 else "***"
                print(f" {var}: {masked_value}")
            else:
                print(f" {var}: {value}")
        else:
            print(f" {var}: {value}")

    crontab_file = "/tmp/crontab"
    if Path(crontab_file).exists():
        print(" 📅 Crontab content:")
        try:
            with open(crontab_file, "r") as f:
                content = f.read().strip()
            print(f" {content}")
        except Exception as e:
            print(f" Read failed: {e}")
    else:
        print(" 📅 Crontab file does not exist")


def show_files():
    """Show the output files"""
    print("📁 Output files:")

    output_dir = Path("/app/output")
    if not output_dir.exists():
        print(" 📭 Output directory does not exist")
        return

    # Show the most recent files
    date_dirs = sorted([d for d in output_dir.iterdir() if d.is_dir()], reverse=True)

    if not date_dirs:
        print(" 📭 Output directory is empty")
        return

    # Show the files from the last 2 days
    for date_dir in date_dirs[:2]:
        print(f" 📅 {date_dir.name}:")
        for subdir in ["html", "txt"]:
            sub_path = date_dir / subdir
            if sub_path.exists():
                files = list(sub_path.glob("*"))
                if files:
                    recent_files = sorted(
                        files, key=lambda x: x.stat().st_mtime, reverse=True
                    )[:3]
                    print(f" 📂 {subdir}: {len(files)} files")
                    for file in recent_files:
                        mtime = time.ctime(file.stat().st_mtime)
                        size_kb = file.stat().st_size // 1024
                        print(
                            f" 📄 {file.name} ({size_kb}KB, {mtime.split()[3][:5]})"
                        )
                else:
                    print(f" 📂 {subdir}: empty")


def show_logs():
    """Show live logs"""
    print("📋 Live logs (press Ctrl+C to exit):")
    print("💡 Note: this shows the output of the PID 1 process")
    try:
        # Try several ways of reading the logs
        log_files = [
            "/proc/1/fd/1",  # standard output of PID 1
            "/proc/1/fd/2",  # standard error of PID 1
        ]

        for log_file in log_files:
            if Path(log_file).exists():
                print(f"📄 Trying to read: {log_file}")
                subprocess.run(["tail", "-f", log_file], check=True)
                break
        else:
            print("📋 No standard log file found, try: docker logs trend-radar")

    except KeyboardInterrupt:
        print("\n👋 Leaving log view")
    except Exception as e:
        print(f"❌ Failed to view logs: {e}")
        print("💡 Try: docker logs trend-radar")


def restart_supercronic():
    """Restart the supercronic process"""
    print("🔄 Restarting supercronic...")
    print("⚠️ Note: supercronic is PID 1 and cannot be restarted directly")

    # Check the current PID 1
    try:
        with open('/proc/1/cmdline', 'r') as f:
            pid1_cmdline = f.read().replace('\x00', ' ').strip()
        print(f" 🔍 Current PID 1: {pid1_cmdline}")

        if "supercronic" in pid1_cmdline.lower():
            print(" ✅ PID 1 is supercronic")
            print(" 💡 To restart supercronic, restart the whole container:")
            print(" docker restart trend-radar")
        else:
            print(" ❌ PID 1 is not supercronic, which is an abnormal state")
            print(" 💡 Restarting the container is recommended to fix this:")
            print(" docker restart trend-radar")
    except Exception as e:
        print(f" ❌ Unable to check PID 1: {e}")
        print(" 💡 Try restarting the container: docker restart trend-radar")


def show_help():
    """Show the help text"""
    help_text = """
🐳 TrendRadar container management tool

📋 Commands:
  run     - run the crawler once by hand
  status  - show the container status
  config  - show the current configuration
  files   - show the output files
  logs    - follow the logs live
  restart - restart instructions
  help    - show this help

📖 Usage examples:
  # Inside the container
  python manage.py run
  python manage.py status
  python manage.py logs

  # On the host
  docker exec -it trend-radar python manage.py run
  docker exec -it trend-radar python manage.py status
  docker logs trend-radar

💡 Common operations:
  1. Check the running state: status
     - see whether supercronic is PID 1
     - check the configuration files and key files
     - see the cron schedule settings

  2. Run a manual test: run
     - crawl the news once immediately
     - test whether the program works

  3. View the logs: logs
     - monitor the run in real time
     - alternatively: docker logs trend-radar

  4. Restart the service: restart
     - supercronic is PID 1, so the whole container has to be restarted
     - use: docker restart trend-radar
"""
    print(help_text)


def main():
    if len(sys.argv) < 2:
        show_help()
        return

    command = sys.argv[1]
    commands = {
        "run": manual_run,
        "status": show_status,
        "config": show_config,
        "files": show_files,
        "logs": show_logs,
        "restart": restart_supercronic,
        "help": show_help,
    }

    if command in commands:
        try:
            commands[command]()
        except KeyboardInterrupt:
            print("\n👋 Operation cancelled")
        except Exception as e:
            print(f"❌ Error while running: {e}")
    else:
        print(f"❌ Unknown command: {command}")
        print("Run 'python manage.py help' to see the available commands")


if __name__ == "__main__":
    main()
245
readme.md
@ -1,6 +1,6 @@
|
||||
<div align="center">
|
||||
|
||||
# 🎯 TrendRadar
|
||||
# 🎯TrendRadar
|
||||
|
||||
**你的专属热点助手 —— 让手机只推送你真正关心的新闻**
|
||||
|
||||
@ -10,7 +10,7 @@
[](https://github.com/sansan0/TrendRadar/network/members)
[](LICENSE)
[](https://www.python.org/)
[](https://github.com/sansan0/TrendRadar)
[](https://github.com/sansan0/TrendRadar)

[](https://work.weixin.qq.com/)
[](https://telegram.org/)
@ -18,34 +18,50 @@
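[](https://www.feishu.cn/)
[](https://github.com/sansan0/TrendRadar)
[](https://sansan0.github.io/TrendRadar)
[](https://hub.docker.com/)

</div>

---

> If this project helps you, **giving it a Star ⭐** is the best support you can give me.
>
> If you run into a problem, feel free to open an issue; I reply carefully to every one. You can also follow the 【硅基茶水间】 WeChat official account, where I share the ideas and experience behind my open-source projects.
>
> I started open-sourcing projects regularly at the end of April 2025. Before that I mostly used other developers' open-source projects, and although I starred them diligently, I was mostly freeloading. Now I want to give back to the open-source community through my own projects, and to record what I learn and think about along the way.
>
> This account is dedicated to maintaining open-source projects. If you also work on open source, or are interested in it, **you are welcome to get in touch**.
> If you run into a problem, feel free to open an issue. You can also leave a message on the 【硅基茶水间】 WeChat official account.


---
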
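## ✨ Core features

### **Trending topics from across the web**

Monitor 11+ mainstream platforms at once (Weibo, Zhihu, Douyin, Baidu, and others), so you no longer have to check each app for news.
Monitors 11 mainstream platforms by default

> _In theory about 35 platforms are supported; I only include 11 by default. If you want to add stock or other finance-related feeds, you need to add them manually. If you need this, see **Advanced usage** at the bottom_
- Toutiao
- Baidu Hot Search
- Wallstreetcn
- The Paper
- bilibili Hot Search
- CLS Hot
- ifeng
- Tieba
- Weibo
- Douyin
- Zhihu

> _In theory about 35 platforms are supported. If you want extra feeds such as finance news, see **Custom monitored platforms** at the bottom_
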
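### **Smart push strategy**

Two working modes cover different needs: **default mode** continuously tracks changes in trending rank, appearance count, and time span, and pushes a full analysis report on schedule (including newly added items); **incremental mode** only detects new trending items and sends nothing when there is no new content, which avoids noise
Three working modes cover different needs:

> _Default mode suits a full overview of trends; incremental mode suits high-frequency monitoring (≤30 minutes), focusing on the latest developments rather than sustained popularity_

| Mode | Push timing | Content shown | Use case |
|------|----------|----------|----------|
| **Daily summary mode**<br/>`daily` | On schedule | All matched news for the day<br/>+ a new-items section | Daily digest<br/>Full view of the day's trending topics |
| **Current rankings mode**<br/>`current` | On schedule | Matched news on the current rankings<br/>+ a new-items section | Real-time trend tracking<br/>See what is hottest right now |
| **Incremental monitoring mode**<br/>`incremental` | Only when new items appear | Newly appearing news that matches the frequency words | Avoids repeated information<br/>High-frequency monitoring |

- **📈 Investors/traders** → choose `incremental` to get new items promptly (self-hosting with Docker on your own server is recommended, so you can check more often)
- **📰 Self-media writers/content creators** → choose `current` to keep up with real-time trends
- **📋 Managers/general users** → choose `daily` to get a complete daily report on schedule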
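
The mode table can be read as a small decision rule. The sketch below is only an illustration under stated assumptions, not the project's actual code: `plan_push` and its parameter names are hypothetical, and it assumes that `daily` and `current` always push on schedule while `incremental` stays silent when nothing new has matched.

```python
VALID_MODES = ("daily", "current", "incremental")


def plan_push(mode, matched_today, matched_current, new_matched):
    """Return (should_push, sections) for one scheduled run.

    Hypothetical helper: every argument after `mode` is a list of news
    titles that already passed keyword matching.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    if mode == "incremental":
        # Push only when new matching items appeared since the last run.
        if not new_matched:
            return False, {}
        return True, {"new": new_matched}

    # daily and current both push on schedule; they differ in the main list.
    main = matched_today if mode == "daily" else matched_current
    return True, {"main": main, "new": new_matched}


if __name__ == "__main__":
    print(plan_push("incremental", ["a", "b"], ["b"], []))
    print(plan_push("current", ["a", "b"], ["b"], ["b"]))
```

Returning the sections as a dictionary keeps the "main list plus new-items section" layout of `daily` and `current` separate from the new-items-only payload of `incremental`.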
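
### **Precise content filtering**

@ -59,9 +75,9 @@

### **Zero-barrier deployment**

Fork the project on GitHub with one click and it is ready to use: no server and no programming background required. If you want notifications on your phone and a one-minute setup, choose **WeCom**
Fork the project on GitHub with one click and it is ready to use; no programming background required.

> _It really is that fast_
> _If you want notifications on your phone and a one-minute setup, choose **WeCom**_

### **Less reliance on apps**

@ -73,7 +89,7 @@ Fork the project on GitHub with one click and it is ready to use: no server and no programming background required. If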
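
**Typical scenarios:** stock-market monitoring, brand sentiment tracking, following industry news, everyday-life information

**💡 Tip:** Want a web version? Enable GitHub Pages to get a news page that updates automatically. Go to your repository's Settings → Pages; once it is enabled you get your own link, for example mine: [TrendRadar](https://sansan0.github.io/TrendRadar/).
**💡 Tip:** Want a web version? Go to your repository's Settings → Pages and enable GitHub Pages to get a link to a news page that updates automatically. For example, mine: [TrendRadar](https://sansan0.github.io/TrendRadar/).

<p align="center">
<img src="_image/2025-06-14.jpg" alt="Latest push result" width="400"/>
@ -81,7 +97,24 @@ Fork the project on GitHub with one click and it is ready to use: no server and no programming background required. If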
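
## 📝 Changelog

**Upgrade note:** If you have already **forked** the project and want the latest features, just replace your old code with all the code from this project's main.py (you can edit it directly on the GitHub web page).
**Upgrade notes:**
- **Minor updates**: in the GitHub web editor, replace the corresponding file in your fork with this project's `main.py`
- **Major upgrades**: to go from v1.x to v2.0, it is recommended to delete your existing fork and fork again, which takes less effort and avoids configuration conflicts

### 2025/07/17 - v2.0.0

**Major refactor**:
- Configuration management refactor: all configuration is now managed through the `config/config.yaml` file (I still have not split up main.py, so that it stays easy for you to copy when upgrading)
- Run mode upgrade: three modes are supported: `daily` (daily summary), `current` (current rankings), and `incremental` (incremental monitoring)
- Docker support: a complete Docker deployment setup for running in a container

**Configuration files**:
- `config/config.yaml` - main configuration file (application settings, crawler configuration, notification configuration, platform configuration, and so on)
- `config/frequency_words.txt` - keyword configuration (the words to monitor)


<details>
<summary><strong>📝 Click to view earlier updates</strong></summary>

### 2025/07/09 - v1.4.1

@ -89,9 +122,6 @@ Fork the project on GitHub with one click and it is ready to use: no server and no programming background required. If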

**Bug fix**: occasional layout glitches that occurred when a news item itself contained special characters.

<details>
<summary><strong>📝 View earlier updates</strong></summary>

### 2025/06/23 - v1.3.0

Push messages for WeCom and Telegram have a length limit, so I split long messages and send them in parts. See the developer documentation for [WeCom](https://developer.work.weixin.qq.com/document/path/91770) and [Telegram](https://core.telegram.org/bots/api)
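
One way to picture the splitting is the sketch below. It is a minimal illustration, not the code in main.py: `split_message` is a hypothetical helper, `max_bytes=4000` is a placeholder rather than either platform's documented limit (check the linked documentation), and it assumes the message can be cut at line boundaries.

```python
from typing import List


def split_message(text: str, max_bytes: int = 4000) -> List[str]:
    """Split text at line boundaries into chunks of at most max_bytes (UTF-8)."""
    chunks: List[str] = []
    current: List[str] = []
    current_size = 0

    for line in text.splitlines(keepends=True):
        size = len(line.encode("utf-8"))
        if size > max_bytes:
            # This sketch does not split inside a line.
            raise ValueError("a single line exceeds max_bytes")
        if current and current_size + size > max_bytes:
            # The next line would overflow: close the current chunk first.
            chunks.append("".join(current))
            current, current_size = [], 0
        current.append(line)
        current_size += size

    if current:
        chunks.append("".join(current))
    return chunks


if __name__ == "__main__":
    message = "".join(f"line {i}\n" for i in range(10))
    for part in split_message(message, max_bytes=20):
        print(repr(part))
```

Measuring the encoded size rather than `len(text)` matters for Chinese titles, where one character takes several bytes in UTF-8.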
@ -168,21 +198,6 @@ The frequency_words.txt file gained a [must-have word] feature, using the + sign

</details>

## 🔍 Supported platforms

The following 11 popular platforms are currently supported:

- Toutiao
- Baidu Hot Search
- Wallstreetcn
- The Paper
- bilibili Hot Search
- CLS Hot
- ifeng
- Tieba
- Weibo
- Douyin
- Zhihu

## 🚀 How to use

@ -286,7 +301,7 @@ The frequency_words.txt file gained a [must-have word] feature, using the + sign
</details>

<details>
<summary>🟣 <strong>Telegram Bot</strong> (the most complex to configure)</summary>
<summary>🟣 <strong>Telegram Bot</strong></summary>

**GitHub Secret configuration:**
- Name: `TELEGRAM_BOT_TOKEN` - your Telegram Bot Token
@ -316,12 +331,16 @@ The frequency_words.txt file gained a [must-have word] feature, using the + sign
- `TELEGRAM_CHAT_ID`: enter the Chat ID obtained in step 2
</details>

3. **Customize keywords**:
3. **Configure keywords and settings**:

   - Edit the `frequency_words.txt` file and add the keywords you care about
   - **Keyword configuration**: edit the `config/frequency_words.txt` file and add the keywords you care about
   - **Run mode configuration**: change the `report.mode` setting in `config/config.yaml`:
     - `"daily"` - daily summary mode (default)
     - `"current"` - current rankings mode
     - `"incremental"` - incremental monitoring mode

<details>
<summary><strong>⚙️ Click to view the complete frequency_words.txt configuration guide</strong></summary>
<summary><strong>⚙️ Click to view the frequency_words.txt configuration guide</strong></summary>

Configure the keywords to monitor in the `frequency_words.txt` file. Three kinds of syntax and a word-group feature are supported.
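
The full syntax lives in the collapsed guide and is not reproduced in this diff; the only syntax visible here is the `+` prefix for must-have words, mentioned in the hunk headers. The sketch below illustrates just that part under loud assumptions: `parse_group` and `title_matches` are hypothetical names, and the matching rule (case-insensitive substring match, every must-have word plus at least one ordinary word) is an assumption rather than the confirmed behaviour of main.py.

```python
def parse_group(lines):
    """Split one word group into ordinary words and '+'-prefixed must-have words."""
    normal, required = [], []
    for raw in lines:
        word = raw.strip()
        if not word:
            continue
        if word.startswith("+"):
            required.append(word[1:])
        else:
            normal.append(word)
    return normal, required


def title_matches(title, normal, required):
    """Assumed rule: all must-have words and at least one ordinary word appear."""
    text = title.lower()
    if any(word.lower() not in text for word in required):
        return False
    if normal:
        return any(word.lower() in text for word in normal)
    # A group with only must-have words matches once they are all present;
    # an empty group matches nothing.
    return bool(required)


if __name__ == "__main__":
    normal, required = parse_group(["Huawei", "OPPO", "+phone"])
    for title in ("Huawei phone launch", "Huawei car", "New phone"):
        print(title, title_matches(title, normal, required))
```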
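
@ -472,7 +491,7 @@ The frequency_words.txt file gained a [must-have word] feature, using the + sign

4. **Automatic runs**:

   - The project already includes the `.github/workflows/crawler.yml` configuration file, which runs automatically every 50 minutes by default
   - The project already includes the `.github/workflows/crawler.yml` configuration file, which runs automatically every 30 minutes by default
   - You can also trigger a run manually from the Actions page of your GitHub repository

5. **View the results**:
@ -508,64 +527,128 @@ The frequency_words.txt file gained a [must-have word] feature, using the + sign
| [time ~ time] | [12 时 30 分 ~ 14 时 00 分] | Time range | The time range over which the title appeared (first ~ last) |
| (N 次) | (4 次) | Occurrence count | The total number of times the title appeared during the monitoring period |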
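
## 🔧 Advanced usage

### Custom monitored platforms
### 🔧 Custom monitored platforms

This project's data comes from the [newsnow](https://github.com/ourongxing/newsnow) project. Open [this site](https://newsnow.busiyi.world/) and click [More] to see every supported platform. If you want to support more platforms, or to hide some, visit https://github.com/ourongxing/newsnow/tree/main/server/sources and, based on the file names there, adjust the following code in main.py (if you are not sure how to read that code, you can also leave a message on my official account):
This project's news data comes from [newsnow](https://github.com/ourongxing/newsnow). Open the [site](https://newsnow.busiyi.world/) and click [More] to check whether the platform you want is available.
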
```python
ids = [
    ("toutiao", "今日头条"),
    ("baidu", "百度热搜"),
    ("wallstreetcn-hot", "华尔街见闻"),
    ("thepaper", "澎湃新闻"),
    ("bilibili-hot-search", "bilibili 热搜"),
    ("cls-hot", "财联社热门"),
    ("ifeng", "凤凰网"),
    "tieba",
    "weibo",
    "douyin",
    "zhihu",
]
```
To add a platform, browse the [project source](https://github.com/ourongxing/newsnow/tree/main/server/sources) and, based on the file names there, edit the `platforms` setting in the `config/config.yaml` file:

```yaml
platforms:
  - id: "toutiao"
    name: "今日头条"
  - id: "baidu"
    name: "百度热搜"
  - id: "wallstreetcn-hot"
    name: "华尔街见闻"
  # Add more platforms...
```
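
If you are moving from the old `ids` list in main.py to the new `platforms` setting, the mapping can be sketched as below. `normalize_platforms` is a hypothetical helper written for illustration, and reusing a bare id as its own display name is an assumption about how entries without a name should be treated.

```python
def normalize_platforms(entries):
    """Turn mixed (id, name) tuples and bare ids into {'id', 'name'} dicts."""
    platforms = []
    for entry in entries:
        if isinstance(entry, str):
            # Bare id: fall back to the id as the display name (assumption).
            platform_id, name = entry, entry
        else:
            platform_id, name = entry
        platforms.append({"id": platform_id, "name": name})
    return platforms


if __name__ == "__main__":
    old_ids = [("toutiao", "今日头条"), ("baidu", "百度热搜"), "tieba"]
    for platform in normalize_platforms(old_ids):
        # Print each entry in the same shape as the YAML list items.
        print(f'  - id: "{platform["id"]}"')
        print(f'    name: "{platform["name"]}"')
```

The printed lines can be pasted under `platforms:` in `config/config.yaml`.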

## ❓ FAQ
### 🐳 Docker deployment

1. **What if GitHub Actions does not run?**
1. **Clone the project and enter its directory**:
```bash
git clone https://github.com/sansan0/TrendRadar.git
cd TrendRadar
```

- Check that the `.github/workflows/crawler.yml` file exists
- Trigger the workflow manually once from the Actions page
- Make sure you have enough free GitHub Actions minutes left
2. **Configuration files**:
- Edit `config/config.yaml` and `config/frequency_words.txt`
- The **push webhook URLs** and the **push schedule** can be configured through .env

2. **What if I do not receive Feishu notifications?**
3. **Start the service**:
```bash
cd docker
docker-compose up -d
```

- Check that `FEISHU_WEBHOOK_URL` is set correctly (in the environment variable or in CONFIG)
- Check that the Feishu bot is still in the group and enabled
- Look for send-failure errors in the program output
- Confirm that the parameter structure in the Feishu flow configuration is correct
4. **Manage the service**:
```bash
# Check the running status
docker exec -it trend-radar python manage.py status

# Manually run the crawler once
docker exec -it trend-radar python manage.py run

# Follow the logs in real time
docker exec -it trend-radar python manage.py logs

# Show the current configuration
docker exec -it trend-radar python manage.py config

# Show output files
docker exec -it trend-radar python manage.py files

# Restart the scheduler service
docker exec -it trend-radar python manage.py restart

# Show help
docker exec -it trend-radar python manage.py help
```
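
If you call these commands from a script on the host, it can help to build the argument list in one place. The sketch below is a hypothetical wrapper, not part of the project: `build_manage_command` is an illustrative name, and the container name `trend-radar` and the subcommand list are taken from the commands above.

```python
import shlex

CONTAINER = "trend-radar"
COMMANDS = ("run", "status", "config", "files", "logs", "restart", "help")


def build_manage_command(command, container=CONTAINER, interactive=True):
    """Build the argv list for `docker exec ... python manage.py <command>`."""
    if command not in COMMANDS:
        raise ValueError(f"unknown manage.py command: {command!r}")
    argv = ["docker", "exec"]
    if interactive:
        # -i keeps stdin open and -t allocates a TTY, as in the commands above.
        argv.append("-it")
    argv += [container, "python", "manage.py", command]
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Docker.
    print(shlex.join(build_manage_command("status")))
```

To actually run a command, pass the list to `subprocess.run(argv, check=True)`. When the script itself runs without a terminal (for example from cron or CI), call it with `interactive=False`, since `-t` expects a TTY.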
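
3. **Want to stop receiving news notifications on your phone?**

- Set `ENABLE_NOTIFICATION` to `False`

4. **Want to stop generating news data locally?**

- Set `ENABLE_CRAWLER` to `False`

## 📧 Learning and discussion

Scan the QR code to follow my WeChat official account. Some of the articles there cover the projects I have written. If they help you a little, a like, a share, and a recommendation all count as support for my development work, and you can also report usage problems there:
## ☕ Sponsorship makes the code smell better.jpg
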
<div align="center">


| WeChat | Alipay |
|:---:|:---:|
| <img src="https://cdn-1258574687.cos.ap-shanghai.myqcloud.com/img/%2F2025%2F07%2F17%2F2ae0a88d98079f7e876c2b4dc85233c6-9e8025.JPG" width="300" title="WeChat Pay"/> | <img src="https://cdn-1258574687.cos.ap-shanghai.myqcloud.com/img/%2F2025%2F07%2F17%2Fed4f20ab8e35be51f8e84c94e6e239b4-fe4947.JPG" width="300" title="Alipay"/> |

</div>


## Star History
## 📧 Learning and discussion

<div align="center">

<img src="_image/support.jpg" title="Leave a message at 硅基茶水间"/>
<br>
<img src="_image/weixin.png" width="600" title="硅基茶水间"/>

</div>


```mermaid
flowchart TD
    A[👤 User starts] --> B[🍴 Fork the project]
    B --> C[⚙️ Choose a notification channel]

    C --> D1[📱 WeCom group bot<br/>Simplest and fastest]
    C --> D2[💬 Feishu bot<br/>Best display quality]
    C --> D3[🔔 DingTalk bot<br/>]
    C --> D4[📟 Telegram Bot<br/>]

    D1 --> E[🔑 Configure GitHub Secrets<br/>Enter the bot webhook URL]
    D2 --> E
    D3 --> E
    D4 --> E

    E --> F[📝 Edit the keyword configuration<br/>config/frequency_words.txt<br/>Add the words you care about]
    F --> G[🎯 Choose a run mode<br/>config/config.yaml<br/>daily/current/incremental]

    G --> H[✅ Configuration complete]
    H --> I[🤖 System runs automatically every 30 minutes]

    I --> J[📊 Crawl trending topics from each platform]
    J --> K[🔍 Filter by keyword]
    K --> L[📱 Push to your phone]

    L --> M[📈 Review the pushed results]
    M --> N{Happy with the results?}
    N -->|No| F
    N -->|Yes| O[🎉 Keep receiving targeted pushes]

    style A fill:#e1f5fe
    style B fill:#f3e5f5
    style E fill:#fff3e0
    style F fill:#e8f5e8
    style G fill:#e8f5e8
    style L fill:#ffebee
    style O fill:#e8f5e8
```


[](https://www.star-history.com/#sansan0/TrendRadar&Date)

@ -583,6 +666,6 @@ GPL-3.0 License

**⭐ If this tool helps you, please give the project a Star to support development!**

[🔝 Back to top](#TrendRadar-多平台热点资讯监控分析系统)
[🔝 Back to top](#trendradar)

</div>

3
requirements.txt
Normal file
@ -0,0 +1,3 @@
requests==2.32.4
pytz==2025.2
PyYAML==6.0.2