# InfoRadar-v5 Pipeline Verification
This document provides the acceptance test commands as specified in the claudecode-spec.
## Prerequisites

1. Environment Setup
2. Install Dependencies
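The setup commands are not spelled out in this document; a minimal sketch, assuming a standard Python project layout with a `requirements.txt` and the `.env` file mentioned under Troubleshooting (`requirements.txt` and `.env.example` are assumed filenames, not confirmed here):

```shell
# Create and activate a virtual environment (assumed layout)
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies -- requirements.txt is an assumed filename
pip install -r requirements.txt

# Provide the API key via .env (see Troubleshooting: Missing API Key)
cp .env.example .env   # hypothetical template; edit it to add a valid API key
```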
## Acceptance Tests

### Test A: Ingest Command
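This document does not repeat the exact command; based on the entries in the Cron Schedule section below, the ingest step is presumably invoked as:

```shell
cd /path/to/daily-report
TZ=Asia/Shanghai bash scripts/run-pipeline.sh ingest
```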
Expected behavior:

- Fetches articles from RSS feeds and static URLs
- Filters by date (`max_age_days=3`)
- Deduplicates by URL (`seen_urls` + SQLite)
- Scores articles (two-round scoring with smoothing)
- Generates summaries (~400 Chinese characters)
- Saves to the SQLite database and article markdown files
- Output: `data/articles/{date}/{grade}_{score}_{title}.md`
Success criteria:

- No errors during execution
- Articles saved to `data/radar.db`
- Article files created in `data/articles/{date}/`
### Test B: Report Command
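As with Test A, the command itself is not restated here; judging from the cron entries below, the report step is presumably:

```shell
cd /path/to/daily-report
TZ=Asia/Shanghai bash scripts/run-pipeline.sh report
# The 2026-02-10 filenames below suggest a date may be passed explicitly;
# the exact argument is not documented here, so check scripts/run-pipeline.sh.
```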
Expected behavior:

- Reads articles from the database for the specified date
- Performs embedding deduplication (cosine similarity >= 0.8)
- Generates summary file: `data/articles/summary/2026-02-10_S.md`
- Makes a single smart LLM call to generate the full report
- Creates report: `data/dailyReport/industry_radar_2026-02-10.md`
Success criteria:

- Report file created at `data/dailyReport/industry_radar_2026-02-10.md`
- Report contains Chinese content with proper structure
- Summary file created at `data/articles/summary/2026-02-10_S.md`
### Test C: Add Report Script
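The script's path is not named anywhere in this document; a hypothetical invocation, assuming it lives under `scripts/` alongside `run-pipeline.sh`:

```shell
# Hypothetical script name -- check the scripts/ directory for the real one
bash scripts/add-report.sh 2026-02-10
```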
Expected behavior:

- Copies the report from `data/dailyReport/` to `output/{date}.md`
- Generates the static website (runs `web/generate.js`)
- Executes `hooks/post_gen.sh` (failure is a warning only)
- Stages and commits changes to git
- Attempts to push to the remote (failure is allowed, but manual steps are printed)
Success criteria:

- Report copied to `output/2026-02-10.md`
- Website generated in `web/dist/`
- Hook executed (or warning shown)
- Git commit created
- Clear manual push instructions shown if the push fails
## Additional Verification

### Check Database
```shell
sqlite3 data/radar.db "SELECT COUNT(*) FROM articles;"
sqlite3 data/radar.db "SELECT url, title, final_score, grade FROM articles LIMIT 5;"
```
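As an extra sanity check, the `grade` column can be tested against the thresholds listed under Implementation Notes (these WHERE clauses are illustrative queries, not part of the spec; both should return 0):

```shell
# Count rows whose grade falls outside the documented A/B/C set
sqlite3 data/radar.db \
  "SELECT COUNT(*) FROM articles WHERE grade NOT IN ('A','B','C');"

# Count rows whose final_score contradicts an A grade (threshold: A >= 80)
sqlite3 data/radar.db \
  "SELECT COUNT(*) FROM articles WHERE grade = 'A' AND final_score < 80;"
```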
### Check File Structure
```shell
# Check article files
ls -la data/articles/$(date +%Y-%m-%d)/

# Check daily reports
ls -la data/dailyReport/

# Check summary files
ls -la data/articles/summary/
```
### Verify No Mocks
```shell
# Ensure radar.py is deleted
test ! -f pipeline/radar.py && echo "✓ No radar.py mock" || echo "✗ radar.py still exists"

# Check for MOCK_NEWS_ITEMS or random.sample in the main path
grep -r "MOCK_NEWS_ITEMS\|random\.sample" pipeline/*.py pipeline/core/*.py && echo "✗ Found mocks" || echo "✓ No mocks found"
```
## Cron Schedule
The pipeline is designed to run on the following schedule (Asia/Shanghai timezone):
```shell
# Hourly ingest at :00 minutes
0 * * * * cd /path/to/daily-report && TZ=Asia/Shanghai bash scripts/run-pipeline.sh ingest >> logs/cron-ingest.log 2>&1

# Daily report at 23:50 (generates report for TODAY)
50 23 * * * cd /path/to/daily-report && TZ=Asia/Shanghai bash scripts/run-pipeline.sh report >> logs/cron-report.log 2>&1
```
To install:
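One way to install the entries (standard crontab usage; the `/path/to/daily-report` placeholder is as in the entries above):

```shell
# Make sure the log directory exists before cron writes to it
mkdir -p /path/to/daily-report/logs

# Open the crontab editor and paste the two entries above...
crontab -e

# ...or append them non-interactively, preserving any existing entries
( crontab -l 2>/dev/null
  echo '0 * * * * cd /path/to/daily-report && TZ=Asia/Shanghai bash scripts/run-pipeline.sh ingest >> logs/cron-ingest.log 2>&1'
  echo '50 23 * * * cd /path/to/daily-report && TZ=Asia/Shanghai bash scripts/run-pipeline.sh report >> logs/cron-report.log 2>&1'
) | crontab -
```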
## Troubleshooting

### Missing API Key

Solution: Configure the `.env` file with a valid API key.

### No Articles Found

Solution: Check that the RSS feeds are accessible, and verify the date filter (`max_age_days`).

### Report Generation Failed

Solution: Run ingest first, or check whether articles exist for that date in the database.

### Git Push Failed

Solution: This is expected if no remote is configured. Follow the manual push instructions printed by the script.

## Implementation Notes
- No mocks: All code uses real RSS feeds, real LLM calls, real database
- Output path: Reports are generated in `data/dailyReport/`, NOT `output/`
- Grade thresholds: A >= 80, B >= 60, C >= 51
- Smoothing logic: Applied when first_score >= 70 but second_score < 50
- Embedding dedupe: Cosine similarity threshold >= 0.8
- Summary length: ~400 Chinese characters
- Time filter: max_age_days = 3
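The grade thresholds above can be sketched as a small shell function (illustrative only; the real logic lives in the Python pipeline, and how sub-51 scores are handled there is not documented, so they are shown as `-` here):

```shell
# Map a final_score to a grade per the documented thresholds:
# A >= 80, B >= 60, C >= 51; anything below falls through ("-")
grade_for() {
  score=$1
  if   [ "$score" -ge 80 ]; then echo "A"
  elif [ "$score" -ge 60 ]; then echo "B"
  elif [ "$score" -ge 51 ]; then echo "C"
  else echo "-"
  fi
}

grade_for 85   # A
grade_for 60   # B
grade_for 51   # C
grade_for 50   # -
```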