leonifrazao/Parallax


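An open-source geopolitical narrative intelligence pipeline that scrapes news, deduplicates it, and uses a local LLM to break each story's narrative down into structured form, with all analysis running on your own machine.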


[![Contributors](https://img.shields.io/github/contributors/leonifrazao/Parallax.svg?style=for-the-badge)][contributors-url] [![Forks](https://img.shields.io/github/forks/leonifrazao/Parallax.svg?style=for-the-badge)][forks-url] [![Stargazers](https://img.shields.io/github/stars/leonifrazao/Parallax.svg?style=for-the-badge)][stars-url] [![Issues](https://img.shields.io/github/issues/leonifrazao/Parallax.svg?style=for-the-badge)][issues-url] [![Unlicense License](https://img.shields.io/github/license/leonifrazao/Parallax.svg?style=for-the-badge)][license-url] [![LinkedIn](https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555)][linkedin-url]

Parallax

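An open-source geopolitical intelligence pipeline that scrapes global news, eliminates duplicate coverage, and uses a local large language model to analyze narrative bias, emotional framing, and hidden motives, giving analysts actionable, structured intelligence reports.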
Explore the docs »

View Demo · Report Bug · Request Feature

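Table of Contents

About the Project
Intelligence Pipeline
Getting Started
- Prerequisites
- Installation
Usage & API Reference
Data Models
Roadmap
Contributing
License
Contact
Acknowledgments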

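## About the Project

In any geopolitical crisis, the same event is covered by dozens of outlets, each with its own framing, bias, and agenda. **How** a story is told matters as much as **what** is told. Understanding whether a conflict is described as a "liberation" or an "invasion", who is amplifying fear and who stays neutral, and which entities consistently sit at the center of the narrative is what separates consuming news from producing intelligence.

**Parallax** is an automated geopolitical narrative intelligence system. It scrapes global news coverage, removes duplicates, and feeds the unique headlines to a local large language model that acts as a **senior intelligence analyst specializing in geopolitical narrative analysis**. The output is not just a summary but a structured breakdown of each story's stance, emotional framing, intensity, key actors, narrative motives, and ideological leaning.

All analysis runs **100% locally**. No cloud analysis APIs are involved, and your data never leaves your machine. Ollama runs the model (such as `llama3.1`) on your own hardware, which keeps the whole operation private.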

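### Why "Parallax"?

In optics, **parallax** is the apparent displacement of an object when it is viewed from two different positions. The object has not moved; only the observer's angle has changed.

News works the same way. A single geopolitical event (a military strike, a trade sanction, a diplomatic summit) looks completely different depending on the outlet, country, or ideology reporting it. Parallax exposes those shifts. It does not tell you what happened; it tells you **how each observer chose to describe what happened**.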

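### Key Capabilities

| Capability | Description |
|---|---|
| 🌐 **Global News Collection** | Scrapes articles from international outlets through NewsAPI. Target specific sources (e.g. Al Jazeera, BBC, Reuters) or pull broadly from all available publishers. |
| 🧹 **Cross-Outlet Deduplication** | Uses RapidFuzz fuzzy matching (`token_set_ratio`, 84% threshold) to eliminate duplicate coverage of the same event across outlets, keeping the most complete version. |
| 🧠 **Narrative Deconstruction** | A local LLM (Ollama) acts as a senior geopolitical analyst and extracts from each article the **stance** (e.g. `pro_iran`, `anti_nato`, `neutral`), **emotional tone** (`alarmist`, `factual`, `sympathetic`, `triumphalist`, `defeatist`), **emotional intensity**, **key entities**, **narrative summary**, and **underlying motives**. |
| 📊 **Stance & Bias Mapping** | Stances follow a constrained `neutral / pro_* / anti_*` pattern, which makes it possible to compare systematically how different outlets position themselves on the same conflict. |
| 🎨 **Intelligence Reports** | Automatically generates dark-themed HTML intelligence reports with one card per article: stance badge, intensity bar, entity tags, motive tags, and a link to the original article. Open them directly in a browser. |
| 📁 **Multi-Format Export** | Exports analysis results as JSON, CSV, or XML for further processing in your own tools or databases. |
| 🏗️ **Microservices Architecture** | Four independent FastAPI services (Pipeline, Scraper, Analysis, Render) orchestrated with Docker Compose. Each can be scaled, replaced, or tested on its own. |
| 🔒 **Fully Local & Private** | Data never leaves your machine. Ollama runs LLM inference locally, which suits sensitive geopolitical research. |
| 💻 **CLI Mode** | A standalone terminal application with Rich-formatted table output, for quick analysis without a browser. |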

### Built With

* [![Python](https://img.shields.io/badge/Python-3776AB?style=for-the-badge&logo=python&logoColor=white)][Python-url] Python >= 3.12
* [![FastAPI](https://img.shields.io/badge/FastAPI-009688?style=for-the-badge&logo=fastapi&logoColor=white)][FastAPI-url] FastAPI (async REST framework)
* [![Docker](https://img.shields.io/badge/Docker-2496ED?style=for-the-badge&logo=docker&logoColor=white)][Docker-url] Docker & Docker Compose
* [![Ollama](https://img.shields.io/badge/Ollama-000000?style=for-the-badge&logo=ollama&logoColor=white)][Ollama-url] Ollama (local LLM inference)
* **UV**: fast Python package installer and resolver ([astral-sh/uv](https://github.com/astral-sh/uv))
* **RapidFuzz**: high-performance fuzzy string matching, used for deduplication
* **Pydantic v2**: data validation and LLM output schema enforcement
* **dependency-injector**: IoC container for a clean architecture
* **HTTPX**: async HTTP client for inter-service communication
* **Loguru**: structured logging across the whole pipeline
* **Rich**: readable terminal output (CLI mode)
* **BeautifulSoup4**: HTML parsing for article content extraction
* **Dacite**: dataclass deserialization utility

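### Architecture Overview

Parallax splits its intelligence pipeline into **four microservices orchestrated with Docker Compose**. They communicate through async REST calls inside the Docker network and also expose their endpoints externally so each one can be tested directly.

```mermaid
graph TD
    Client(("Client")) -->|"POST /pipeline"| Pipeline
    Pipeline["Pipeline Service :8000"] -->|"POST /scrape"| Scraper["Scraper Service :8001"]
    Pipeline --> Dedup["HeadlineDeduplicator"]
    Pipeline -->|"POST /analyze"| Analysis["Analysis Service :8002"]
    Pipeline -->|"POST /render/save"| Render["Render Service :8003"]
    Scraper --> NewsAPI["NewsAPI"]
    Analysis --> Ollama["Ollama - llama3.1"]
    Render --> HTML["./output/render/*.html"]
```

**Service breakdown:**

| Service | Container | External Port | Responsibility |
|---|---|---|---|
| **Pipeline** | `pipeline_service` | `8000` | Orchestrates the full intelligence cycle. Validates the query, coordinates all services, deduplicates headlines, and triggers exports. The single entry point for clients. |
| **Scraper** | `scraper_service` | `8001` | Talks to the news provider (NewsAPI). Fetches articles, parses them into `Headline` models, and filters by query relevance and source. |
| **Analysis** | `analysis_service` | `8002` | The intelligence core. Builds the geopolitical analysis prompt, dispatches it to Ollama, and parses the structured LLM output into `Narrative` models (stance, tone, entities, motives). |
| **Render** | `render_service` | `8003` | Converts `Narrative` data into a dark-themed HTML intelligence report. Writes timestamped `.html` files to a mounted Docker volume (`./output/render`). |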
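The control flow in the diagram can be sketched in a few lines. This is a minimal illustration, not the project's `executor_service.py`: the function name `run_pipeline` and the injected callables are hypothetical, and plain dicts stand in for the `Headline` and `Narrative` models. In the real system each callable would presumably wrap an HTTP call to the corresponding service (`POST /scrape`, `POST /analyze`, `POST /render/save`).

```python
from typing import Callable

Headline = dict   # hypothetical stand-in for the Headline model
Narrative = dict  # hypothetical stand-in for the Narrative model


def run_pipeline(
    query: str,
    limit: int,
    scrape: Callable[[str], list[Headline]],
    dedupe: Callable[[list[Headline]], list[Headline]],
    analyze: Callable[[list[Headline]], list[Narrative]],
    render: Callable[[list[Narrative]], str],
) -> tuple[list[Narrative], str]:
    """Run scrape -> dedupe -> analyze -> render; return (narratives, report path)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    headlines = scrape(query)
    unique = dedupe(headlines)[:limit]  # the limit is applied after deduplication
    narratives = analyze(unique)
    report_path = render(narratives)
    return narratives, report_path


if __name__ == "__main__":
    # Fake stages so the flow can be exercised without any running service.
    result, path = run_pipeline(
        "NATO expansion",
        limit=2,
        scrape=lambda q: [{"text": "A"}, {"text": "B"}, {"text": "C"}],
        dedupe=lambda hs: hs,
        analyze=lambda hs: [{"headline": h["text"], "stance": "neutral"} for h in hs],
        render=lambda ns: f"report-{len(ns)}.html",
    )
    print(path, [n["headline"] for n in result])
```

Passing the stages in as callables mirrors the dependency-injection style the project describes and makes the orchestration testable with fakes.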

### Project Structure

```
Parallax/
├── docker-compose.yml               # Orchestrates all 4 microservices
├── Dockerfile                       # Shared build (Python 3.12-slim + UV)
├── pyproject.toml                   # Project metadata & dependencies
├── uv.lock                          # Deterministic dependency lock
├── .env                             # NEWSAPI_KEY, OLLAMA_HOST
│
├── src/parallax/
│   ├── main.py                      # CLI entry point (standalone mode)
│   ├── container.py                 # Root DI container (CLI mode)
│   │
│   ├── models/
│   │   └── enter/
│   │       ├── headline.py          # Headline: scraped article data
│   │       ├── narrative.py         # Narrative: analyzed intelligence output
│   │       ├── narrativeLLM.py      # LLM schema (constrained stance, tone literals)
│   │       └── web/
│   │           ├── pipelinerequest.py   # Pipeline API request body
│   │           ├── scraperequest.py     # Scraper API request body
│   │           └── renderrequest.py     # Render API request body
│   │
│   ├── interfaces/                  # Abstract contracts (Clean Architecture)
│   │   ├── enter/
│   │   │   ├── IScraper.py              # Scraper provider interface
│   │   │   ├── IWebScraper.py           # Scraper orchestrator interface
│   │   │   ├── INarrativeAnalysis.py    # Analysis engine interface
│   │   │   ├── IAnalysisMetrics.py      # Aggregate metrics interface
│   │   │   └── usecases/
│   │   │       ├── IExecutorUseCase.py  # Pipeline executor contract
│   │   │       ├── IScraperUseCase.py   # Scraper use case contract
│   │   │       ├── IAnalysisUseCase.py  # Analysis use case contract
│   │   │       └── IRenderUseCase.py    # Render use case contract
│   │   └── out/
│   │       └── ICliApp.py           # CLI output interface
│   │
│   ├── scrapers/
│   │   ├── web.py                   # WebScraper: aggregates all providers
│   │   ├── telegram.py              # Telegram scraper (planned)
│   │   └── websites/
│   │       ├── base.py              # BaseScraper: fetcher/parser pattern
│   │       └── newsapi.py           # NewsAPI: get_everything + query filter
│   │
│   ├── analysis/
│   │   ├── engine.py                # NarrativeAnalysis: prompt → Ollama → Narrative
│   │   └── metrics.py               # AnalysisMetrics: stance/tone distributions
│   │
│   ├── helpers/
│   │   ├── deduplicator.py          # HeadlineDeduplicator (RapidFuzz)
│   │   └── modeltofile.py           # ModelToFile: export JSON / CSV / XML
│   │
│   ├── ui/
│   │   ├── app.py                   # CliApp: Rich table display
│   │   └── components.py            # UI components (planned)
│   │
│   ├── database/                    # Persistence layer (planned)
│   │   ├── entities.py
│   │   └── repository.py
│   │
│   └── services/                    # Microservice layer
│       ├── pipeline_service/
│       │   ├── container.py         # DI: wires clients → executor
│       │   └── app/
│       │       ├── main.py          # FastAPI app
│       │       ├── controllers/
│       │       │   └── pipeline_controller.py
│       │       ├── clients/
│       │       │   ├── scraper_client.py    # HTTP → Scraper Service
│       │       │   ├── analysis_client.py   # HTTP → Analysis Service
│       │       │   └── render_client.py     # HTTP → Render Service
│       │       └── services/
│       │           └── executor_service.py  # Full pipeline orchestration
│       │
│       ├── scraper_service/
│       │   ├── container.py
│       │   └── app/ (main, controllers, services)
│       │
│       ├── analysis_service/
│       │   ├── container.py
│       │   └── app/ (main, controllers, services)
│       │
│       └── render_service/
│           ├── container.py
│           └── app/
│               └── services/
│                   └── render_service.py    # HTML generation engine
│
└── output/
    └── render/                      # Generated HTML intelligence reports
```

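## Intelligence Pipeline

A single `POST /pipeline` call triggers the full intelligence cycle. Behind the scenes, Parallax does the following:

### 1. Collection & Scraping

The **Scraper service** is your collection unit. It talks to the news provider to fetch raw coverage:

- Queries [NewsAPI](https://newsapi.org/) through `get_everything()`, with support for boolean operators (`AND`, `OR`), language filtering (English by default), relevance sorting, and a maximum batch of 20 articles.
- **Source targeting**: focus the analysis on specific outlets. For geopolitical work, for example, you can compare how `al-jazeera-english`, `bbc-news`, and `reuters` frame the same event.
- Parses each raw article into a `Headline` domain model: title, source, URL, description, author, publication date, and a UUID used to track the item across the pipeline.
- Applies a secondary, local query filter so that every returned article actually contains the search terms in its title or description.
- Discards invalid content: articles without a title or URL, and articles with empty descriptions.

The scraper architecture is **extensible**: `WebScraper` can aggregate results from any number of `IScraper` implementations. A `telegram.py` module is reserved for future Telegram channel collection.

### 2. Deduplication

In geopolitical coverage, the same wire story (AP, Reuters) is republished by dozens of outlets with minimal changes. Analyzing the same event 15 times wastes LLM compute and pollutes the intelligence product.

`HeadlineDeduplicator` addresses this:

- Combines `headline + description` into a single string for comparison.
- Normalizes the text: lowercases it, removes special characters, collapses whitespace, and unifies quotation marks.
- Performs fuzzy matching with `fuzz.token_set_ratio`, which is robust to word-order changes (critical for news headlines).
- **Threshold: 84%**. Items at or above this similarity are flagged as duplicates, and the version with the more complete text (the longer `text + description`) is kept.
- Output: a reduced set of unique perspectives, each representing a genuinely different reading of the event.

### 3. Narrative Analysis

This is the core of Parallax. The **Analysis service** uses the LLM as a geopolitical intelligence analyst: the system prompt positions Ollama as a senior intelligence analyst specializing in geopolitical narrative analysis.

For each batch of deduplicated headlines, the engine:

1. Builds a JSON payload containing each headline's ID, title, description, and source.
2. Sends it to Ollama (default model `llama3.1`, `temperature: 0`) so the analysis is deterministic and reproducible.
3. Uses **structured output**: the LLM is forced to return JSON that conforms to a Pydantic schema (`NarrativeListResponse`), which eliminates hallucinated formats.

**What is extracted for each headline:**

| Field | Type | Geopolitical Purpose |
|---|---|---|
| `stance` | `neutral`, `pro_*`, `anti_*` | Maps ideological alignment. Pattern: `pro_iran`, `anti_nato`, `pro_ukraine`, and so on; regex validation keeps the values consistent. |
| `emotional_tone` | `alarmist` · `factual` · `sympathetic` · `triumphalist` · `defeatist` | Classifies the emotional framing strategy. Is the story meant to provoke fear? Evoke sympathy? Project strength? |
| `emotional_intensity` | `0.0 – 10.0` | Quantifies how intense the coverage is. Factual wire reports typically score 1–3; inflammatory coverage can reach 7–10. |
| `key_entities` | `list[str]` | The people, organizations, nations, and concepts the story focuses on. Reveals who each outlet treats as the main actors. |
| `narrative_summary` | `str` | A condensed description of the story: not the event itself, but how it is *framed*. |
| `motives` | `list[str]` | Detected underlying agendas: "economic pressure", "territorial legitimization", "humanitarian framing", and so on. |

### 4. Report Rendering

The **Render service** turns the raw intelligence into a visual HTML report:

- **Dark intelligence theme**: `#020617` background and the Inter font, suited to long reading sessions.
- Each narrative is a **card** containing:
  - **Headline and source**: what was reported and by whom.
  - **Stance badge**: a pill-shaped label (e.g. `pro_russia`) for quick scanning.
  - **Narrative summary**: the analyst's distillation.
  - **Emotional tone and intensity**: a text label plus a progress bar showing how intense the coverage is.
  - **Key entities**: pill tags for the parties the story focuses on.
  - **Motives**: pill tags for the detected underlying agendas.
  - **Source link**: a direct link to the original article for verification.
- Reports are saved as timestamped `.html` files in `./output/render/` (a mounted Docker volume) and open in any browser.

### 5. Data Export

The `ModelToFile` utility supports downstream processing:

- **JSON**: a pretty-printed UTF-8 array, suitable as input for visualization dashboards or further NLP processing.
- **CSV**: a flat table with automatically detected headers. Nested fields (entities, motives) are serialized as JSON strings for spreadsheet analysis.
- **XML**: a structured tree, suitable for integration with OSINT tools.

All exports are timestamped and saved to the `output/` directory.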
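The deduplication stage (step 2 above) can be sketched with the standard library alone. This is an approximation under stated assumptions, not the project's `HeadlineDeduplicator`: `difflib.SequenceMatcher` stands in for RapidFuzz's `fuzz.token_set_ratio`, so scores will not match RapidFuzz exactly; the helper names are hypothetical; and normalization is ASCII-only for brevity, with quotation marks dropped along with other punctuation rather than unified.

```python
import re
from difflib import SequenceMatcher

THRESHOLD = 84.0  # similarity (0-100) at or above which two items count as duplicates


def normalize(text: str) -> str:
    """Lowercase, replace non-alphanumeric characters with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def _ratio(a: str, b: str) -> float:
    return 100.0 * SequenceMatcher(None, a, b, autojunk=False).ratio()


def token_set_similarity(a: str, b: str) -> float:
    """Word-order-insensitive similarity in the spirit of token_set_ratio."""
    ta, tb = set(a.split()), set(b.split())
    common = " ".join(sorted(ta & tb))
    full_a = " ".join(filter(None, [common, " ".join(sorted(ta - tb))]))
    full_b = " ".join(filter(None, [common, " ".join(sorted(tb - ta))]))
    return max(_ratio(common, full_a), _ratio(common, full_b), _ratio(full_a, full_b))


def deduplicate(headlines: list[dict], threshold: float = THRESHOLD) -> list[dict]:
    """Keep one item per near-duplicate group, preferring the longest raw text."""
    kept: list[tuple[str, int, dict]] = []  # (normalized key, raw length, item)
    for item in headlines:
        raw = f"{item['text']} {item.get('description') or ''}"
        key = normalize(raw)
        for i, (other_key, other_len, _) in enumerate(kept):
            if token_set_similarity(key, other_key) >= threshold:
                if len(raw) > other_len:  # the more complete version wins
                    kept[i] = (key, len(raw), item)
                break
        else:
            kept.append((key, len(raw), item))
    return [item for _, _, item in kept]


if __name__ == "__main__":
    sample = [
        {"text": "Iran resumes uranium enrichment", "description": "Talks stall."},
        {"text": "Uranium enrichment resumes: Iran",
         "description": "Talks stall in Vienna, officials say."},
        {"text": "NATO summit opens in Brussels", "description": "Leaders gather."},
    ]
    for h in deduplicate(sample):
        print(h["text"])
```

Comparing token sets rather than raw strings is what makes the match robust to reordered words, which is why a reworded copy of a wire headline collapses into the original while an unrelated story survives.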

## Getting Started

### Prerequisites

1. **Docker & Docker Compose**: install [Docker Desktop](https://www.docker.com/products/docker-desktop).
2. **Ollama**: install [Ollama](https://ollama.com/) and pull the model:

   ```bash
   ollama pull llama3.1
   ```

3. **NewsAPI key**: get a free API key at [newsapi.org](https://newsapi.org/).

### Installation

1. **Clone the repository**

   ```bash
   git clone https://github.com/leonifrazao/Parallax.git
   cd Parallax
   ```

2. **Configure environment variables**

   Create a `.env` file in the project root:

   ```env
   NEWSAPI_KEY=your_newsapi_key_here
   OLLAMA_HOST=http://host.docker.internal:11434
   ```

3. **Build and start all services**

   ```bash
   docker compose up --build
   ```

   Four containers will start:

   | Container | URL |
   |---|---|
   | Pipeline (gateway) | `http://localhost:8000` |
   | Scraper | `http://localhost:8001` |
   | Analysis | `http://localhost:8002` |
   | Render | `http://localhost:8003` |

4. **Verify** that all services are healthy:

   ```bash
   curl http://localhost:8000/health
   ```

   It should return:

   ```json
   {"status":"ok","service":"pipeline-service","version":"0.1.0"}
   ```

## Usage & API Reference

### Running an Analysis

Send a single `POST` request to trigger the full intelligence cycle: collection, deduplication, narrative analysis, rendering, and export.

**Example: analyze coverage of Iran's nuclear program across several major outlets:**

```bash
curl -X POST http://localhost:8000/pipeline \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "Iran nuclear deal",
    "limit": 5,
    "sources": ["al-jazeera-english", "bbc-news", "reuters", "the-washington-post"],
    "tojson": true
  }'
```

**Example: compare how Western and non-Western outlets frame NATO coverage:**

```bash
curl -X POST http://localhost:8000/pipeline \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "NATO expansion",
    "limit": 10,
    "sources": ["bbc-news", "cnn", "al-jazeera-english", "the-hindu"],
    "tojson": true
  }'
```

### Request Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | `string` | *required* | Search query. Supports boolean operators: `"Iran AND sanctions"`, `"NATO OR OTAN"`. |
| `limit` | `integer` | `10` | Maximum number of headlines to analyze (after deduplication). Higher values give broader coverage but take the LLM longer to process. |
| `sources` | `string[]` | `[]` | Filter by NewsAPI source ID. An empty array means all sources. Use it to compare framing across specific outlets. |
| `tojson` | `boolean` | `false` | Also export the results as a `.json` file in the `output/` directory. |

### Intelligence Output

The API returns an array of `Narrative` objects, one per analyzed headline:

```json
[
  {
    "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "headline": "Iran Resumes Uranium Enrichment Amid Stalled Talks",
    "stance": "anti_iran",
    "emotional_tone": "alarmist",
    "emotional_intensity": 7.2,
    "key_entities": ["Iran", "IAEA", "United States", "Uranium Enrichment"],
    "narrative_summary": "Western-aligned framing that positions Iran as the aggressor breaking international norms, while omitting context about sanctions relief failures.",
    "motives": ["Security Threat Amplification", "Diplomatic Pressure", "Nuclear Proliferation Fear"],
    "url": "https://www.bbc.com/news/...",
    "source": "BBC News"
  },
  {
    "id": "f9e8d7c6-b5a4-3210-fedc-ba0987654321",
    "headline": "Iran Exercises Sovereign Right to Nuclear Energy",
    "stance": "pro_iran",
    "emotional_tone": "sympathetic",
    "emotional_intensity": 4.1,
    "key_entities": ["Iran", "NPT", "Sovereignty", "Western Powers"],
    "narrative_summary": "Frames Iran's nuclear program as a legitimate exercise of sovereignty under the NPT, emphasizing Western hypocrisy in selective enforcement.",
    "motives": ["Sovereignty Defense", "Anti-Western Framing", "Historical Grievance"],
    "url": "https://www.aljazeera.com/news/...",
    "source": "Al Jazeera English"
  }
]
```

### Individual Service Endpoints

Each microservice can be called directly to support custom workflows:

| Service | Endpoint | Method | Request Body | Purpose |
|---|---|---|---|---|
| Scraper | `localhost:8001/scrape` | POST | `{ "query": "...", "sources": [] }` | Collect raw articles only |
| Analysis | `localhost:8002/analyze` | POST | Array of `Headline[]` | Run narrative analysis only |
| Render | `localhost:8003/render/save` | POST | `{ "title": "...", "filename": "...", "items": [...] }` | Generate the HTML report only |

### CLI Mode

Parallax includes a standalone terminal application that runs locally without Docker:

```bash
uv run parallax
```

It prints a formatted **Rich** table in the terminal with the columns ID, Headline, Stance, Tone, Intensity, Source, and URL.

### Interactive Docs & Health Checks

Every service exposes an auto-generated Swagger UI and a health check endpoint:

| Service | Swagger UI | Health Check |
|---|---|---|
| Pipeline | [`localhost:8000/docs`](http://localhost:8000/docs) | `GET /health` |
| Scraper | [`localhost:8001/docs`](http://localhost:8001/docs) | `GET /health` |
| Analysis | [`localhost:8002/docs`](http://localhost:8002/docs) | `GET /health` |
| Render | [`localhost:8003/docs`](http://localhost:8003/docs) | `GET /health` |

## Data Models

### Headline (Collection Input)

```python
from datetime import datetime

from pydantic import BaseModel


class Headline(BaseModel):
    text: str                       # Article title
    source: str                     # Publisher name (e.g., "BBC News")
    url: str | None                 # Article URL
    description: str | None         # Subtitle / lead paragraph
    author: str | None              # Author name
    published_at: datetime | None   # Publication timestamp
    scraped_at: datetime            # When Parallax collected it (auto)
    id: str                         # UUID for pipeline tracking (auto)
```

### Narrative (Intelligence Output)

```python
from pydantic import BaseModel


class Narrative(BaseModel):
    id: str                      # Matches original Headline UUID
    headline: str                # LLM-reformulated headline
    stance: str                  # neutral / pro_* / anti_*
    emotional_tone: str          # alarmist / factual / sympathetic / triumphalist / defeatist
    emotional_intensity: float   # 0.0 – 10.0
    key_entities: list[str]      # People, orgs, nations, concepts
    narrative_summary: str       # Analyst's distillation of the framing
    motives: list[str]           # Detected underlying agendas
    url: str                     # Original source URL
    source: str                  # Publisher name
```

### NarrativeLLM (LLM Schema Constraints)

The schema sent to Ollama enforces strict types:

- `stance` is validated with a regex: `^(neutral|pro_[a-z0-9_]+|anti_[a-z0-9_]+)$`
- `emotional_tone` is a `Literal` enum (only 5 values allowed)
- `emotional_intensity` is range-constrained: `0.0 ≤ x ≤ 10.0`

This prevents the LLM from returning hallucinated or inconsistent field values.
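To make the three constraints concrete, here is a standard-library sketch that applies them to a plain dict. It is an illustration only: the project enforces these rules through its Pydantic schema, and the function name `validation_errors` is hypothetical. The regex and the five tone values are taken from the list above.

```python
import re

STANCE_RE = re.compile(r"^(neutral|pro_[a-z0-9_]+|anti_[a-z0-9_]+)$")
TONES = {"alarmist", "factual", "sympathetic", "triumphalist", "defeatist"}


def validation_errors(item: dict) -> list[str]:
    """Return the names of the fields that violate the NarrativeLLM constraints."""
    errors = []
    if not STANCE_RE.fullmatch(str(item.get("stance", ""))):
        errors.append("stance")
    if item.get("emotional_tone") not in TONES:
        errors.append("emotional_tone")
    intensity = item.get("emotional_intensity")
    is_number = isinstance(intensity, (int, float)) and not isinstance(intensity, bool)
    if not is_number or not 0.0 <= intensity <= 10.0:
        errors.append("emotional_intensity")
    return errors


if __name__ == "__main__":
    good = {"stance": "pro_iran", "emotional_tone": "sympathetic", "emotional_intensity": 4.1}
    bad = {"stance": "Pro-Iran", "emotional_tone": "angry", "emotional_intensity": 11}
    print(validation_errors(good))
    print(validation_errors(bad))
```

Note that a bare prefix such as `pro_` is rejected, because the pattern requires at least one character after the underscore.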

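## Roadmap

- [x] NewsAPI collection integration
- [x] Cross-outlet deduplication with RapidFuzz
- [x] Narrative analysis with Ollama (structured output)
- [x] Geopolitical stance and bias extraction (`pro_*/anti_*` pattern)
- [x] Emotional tone classification (5-category taxonomy)
- [x] Automatic HTML intelligence report generation
- [x] Source-based outlet filtering
- [x] Multi-format export (JSON, CSV, XML)
- [x] CLI mode (Rich tables)
- [x] Dependency injection architecture
- [x] Docker Compose microservices orchestration
- [x] Health check endpoints
- [ ] Additional collection sources (Telegram channels, Reddit, Twitter/X)
- [ ] Interactive dashboard (Next.js / React) with stance comparison charts
- [ ] Multi-language support (PT-BR, AR, RU for collection and analysis)
- [ ] Database persistence for historical narrative tracking
- [ ] Aggregate metrics dashboard (stance distribution, entity frequency, tone trends over time)
- [ ] Comparative analysis mode (same event, two sets of outlets, side by side)

See the [open issues](https://github.com/leonifrazao/Parallax/issues) for the full list of proposed features and known issues.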

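## Contributing

Contributions are what make the open-source community such a great place to learn, inspire, and create. Any contribution you make is **greatly appreciated**.

If you have a suggestion that would improve the project, please fork the repository and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

1. Fork the project
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request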

## License

Distributed under the Unlicense. See `LICENSE.txt` for details.

## Contact

Leoni Frazão - leoni.frazao.oliveira@gmail.com

Project link: [https://github.com/leonifrazao/Parallax](https://github.com/leonifrazao/Parallax)

## Acknowledgments

* [FastAPI](https://fastapi.tiangolo.com/): high-performance async web framework
* [Ollama](https://ollama.com/): runs LLMs locally for full privacy
* [RapidFuzz](https://github.com/maxbachmann/RapidFuzz): fast fuzzy string matching
* [UV](https://github.com/astral-sh/uv): fast Python package manager
* [Loguru](https://github.com/Delgan/loguru): structured logging made easy
* [Rich](https://github.com/Textualize/rich): readable terminal formatting
* [dependency-injector](https://github.com/ets-labs/python-dependency-injector): IoC container for Python
* [Pydantic](https://docs.pydantic.dev/): data validation based on Python type annotations
* [HTTPX](https://www.python-httpx.org/): async HTTP client
* [NewsAPI](https://newsapi.org/): international news data provider
