GitHub: ashioyajotham/web_research_agent

An AI web research agent built on the ReAct paradigm, with live streaming of its reasoning, standard and deep research modes, session memory, and sandboxed code execution.

# Web Research Agent [![CI](https://github.com/ashioyajotham/web_research_agent/actions/workflows/ci.yml/badge.svg)](https://github.com/ashioyajotham/web_research_agent/actions/workflows/ci.yml) [![PyPI](https://img.shields.io/pypi/v/web-research-agent)](https://pypi.org/project/web-research-agent/) [![Python](https://img.shields.io/pypi/pyversions/web-research-agent)](https://pypi.org/project/web-research-agent/)

An AI agent that uses the **ReAct** (Reasoning and Acting) methodology to complete complex research tasks by searching the web, scraping pages, and running code, with every step visible in real time.

## Features

- **Live ReAct streaming** — watch every Thought → Action → Observation step as the agent works, live in your terminal
- **Standard and Deep Research modes** — a single sequential loop, or a parallel fan-out over 4 sub-queries for exhaustive investigations
- **Session memory** — follow-up queries automatically inherit the current session's context
- **Playwright browser tool** — headless Chromium fallback for JavaScript-rendered pages
- **Code execution sandbox** — an AST pre-check blocks dangerous imports before user-generated code runs
- **Query history** — persisted to `~/.webresearch/history.json`, with arrow-key navigation and re-run
- **Rate-limit display** — shows monthly Serper API usage after each query
- **Prompt-injection protection** — scraped content is sanitized before it reaches the LLM
- **Powered by Gemini 2.5 Flash**

## Research Acknowledgements

This project implements the **ReAct** paradigm, which originates from the resources listed in the References section at the end of this README.

## Installation

```
pip install web-research-agent
```

Optional — install Playwright for JS-heavy pages:

```
pip install "web-research-agent[browser]"
playwright install chromium
```

### API Keys

You need two keys (both have free tiers):

| Key | Where to get it |
|-----|-----------------|
| **Gemini API key** | [Google AI Studio](https://makersuite.google.com/app/apikey) |
| **Serper API key** | [serper.dev](https://serper.dev) — 2,500 free searches per month |

The CLI prompts for these on first run and stores them in `~/.webresearch/config.env`.

### Windows PATH fix

If `webresearch` is not recognized after `pip install`:

```
# Permanent fix
[Environment]::SetEnvironmentVariable(
  "Path",
  [Environment]::GetEnvironmentVariable("Path", "User") + ";$env:APPDATA\Python\Python313\Scripts",
  "User"
)
# Then restart your terminal
```

Or run without relying on PATH: `python -m cli`

## Quick Start

```
webresearch
```

Interactive menu:

```
1. 🔍 Run a research query
2. 📁 Process tasks from file
3. 📚 View query history
4. 📋 View recent logs
5. 🔧 Reconfigure API keys
6. 🧹 Clear session memory
7.
👋 Exit
```

Choose **1**, enter your question, then pick a research mode:

- **Standard** — sequential ReAct loop, up to 15 iterations (about 1 minute)
- **Deep Research** — parallel fan-out: 4 sub-queries × 5 iterations each (about 3 minutes, more thorough)

## How It Works

### Standard mode (sequential ReAct)

```
Thought: I need to search for current Bitcoin price
Action: search {"query": "Bitcoin price USD 2025"}
Observation: [search results…]
Thought: I have enough information
Final Answer: Bitcoin is currently trading at $67,709.
```

### Deep Research mode (parallel fan-out)

```
Original question: "What is the state of fusion energy in 2025?"

Sub-query 1: What recent breakthroughs in fusion energy occurred?    ⟳ running
Sub-query 2: Which companies are leading commercial fusion?          ⟳ running
Sub-query 3: What is the current timeline for viable fusion power?   ⟳ running
Sub-query 4: What are the remaining engineering challenges?          ⟳ running

→ All results synthesised into one comprehensive answer
```

### Session memory

```
Q1: "Who is the CEO of OpenAI?"          → Sam Altman
Q2: "What is his background?"            → agent knows "his" = Sam Altman
Q3: "What companies has he co-founded?"
→ agent retains full context
```

Choose option **6** from the menu to reset the session.

## Architecture

```
webresearch/
├── agent.py             # Sequential ReAct loop (step_callback for live streaming)
├── parallel.py          # Parallel fan-out agent (ThreadPoolExecutor)
├── memory.py            # ConversationMemory for multi-turn sessions
├── llm.py               # Gemini API interface (retry + backoff)
├── config.py            # Configuration from env / ~/.webresearch/config.env
├── tools/
│   ├── base.py          # Abstract Tool class
│   ├── search.py        # Google search via Serper.dev + usage tracking
│   ├── scrape.py        # requests-based HTML scraper + injection sanitizer
│   ├── browser.py       # Playwright headless scraper (scrape_js)
│   ├── code_executor.py # Python sandbox (AST check + isolated temp dir)
│   └── file_ops.py      # Read / write files
└── cli.py               # Interactive terminal UI (rich + questionary)
```

## Configuration

Set these in `.env` or `~/.webresearch/config.env`:

| Variable | Default | Description |
|----------|---------|-------------|
| `GEMINI_API_KEY` | — | Required |
| `SERPER_API_KEY` | — | Required |
| `MODEL_NAME` | `gemini-2.5-flash` | Gemini model |
| `MAX_ITERATIONS` | `15` | ReAct loop cap |
| `TEMPERATURE` | `0.1` | LLM temperature |
| `MAX_TOOL_OUTPUT_LENGTH` | `5000` | Characters fed back to the LLM per tool call |
| `WEB_REQUEST_TIMEOUT` | `30` | Scraper timeout (seconds) |
| `CODE_EXECUTION_TIMEOUT` | `60` | Code sandbox timeout (seconds) |

## Batch Processing

```
webresearch   # choose option 2, enter path to tasks file
```

Task file format (blank-line separated):

```
Find the name of the COO of the organization that mediated secret talks between US and Chinese AI companies in Geneva in 2023.

By what percentage did Volkswagen reduce their Scope 1 + Scope 2 greenhouse gas emissions in 2023 compared to 2021?
```

## Programmatic Usage

```
from webresearch import ReActAgent, ParallelResearchAgent
from webresearch.config import Config
from webresearch.llm import LLMInterface
from webresearch.tools import ToolManager, SearchTool, ScrapeTool, CodeExecutorTool, FileOpsTool

cfg = Config()
cfg.validate()

llm = LLMInterface(api_key=cfg.gemini_api_key, model_name=cfg.model_name)

tools = ToolManager()
tools.register_tool(SearchTool(cfg.serper_api_key))
tools.register_tool(ScrapeTool())
tools.register_tool(CodeExecutorTool())
tools.register_tool(FileOpsTool())

# Sequential
agent = ReActAgent(llm=llm, tool_manager=tools)
answer = agent.run("What is the capital of France?")

# Parallel fan-out
deep = ParallelResearchAgent(llm=llm, tool_manager=tools)
answer = deep.run("Explain the state of nuclear fusion energy in 2025")
```

## Adding a Custom Tool

```
from webresearch.tools.base import Tool

class MyTool(Tool):
    @property
    def name(self) -> str:
        return "my_tool"

    @property
    def description(self) -> str:
        return """One-line summary.

Parameters:
- param (str, required): what it does

Use this tool when you need to…"""

    def execute(self, param: str) -> str:
        return f"Result for {param}"

# Register it
tools.register_tool(MyTool())
```

## Security

- **Prompt injection**: 9 regex patterns strip instruction-like content from scraped page text before it reaches the LLM
- **Code sandbox**: an AST pre-check blocks `subprocess`, `socket`, `ctypes`, `os.system/fork/exec`, and 10+ other dangerous calls; code runs in an isolated `TemporaryDirectory` with API keys scrubbed from the child-process environment
- **API keys**: stored in `~/.webresearch/config.env`, never written to logs

## Development

```
git clone https://github.com/ashioyajotham/web_research_agent.git
cd web_research_agent
pip install -e ".[dev]"
pytest tests/ -v   # 56 tests, no API keys required
```

CI runs on Python 3.9 · 3.10 · 3.11 · 3.12 on every push and pull request.

## Limitations

- Cannot access paywalled or login-gated content (even with Playwright)
- No PDF parsing; PDF URLs are flagged for manual download
- Serper free tier allows 2,500 searches per month; a complex query may consume 5-8 calls
- Parallel mode issues several concurrent Gemini requests; heavy use may hit rate limits

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md). Bug reports and feature requests are welcome via GitHub Issues.

## License

MIT — see [LICENSE](LICENSE).

## References

- [ReAct: Synergizing Reasoning
and Acting in Language Models](https://arxiv.org/abs/2210.03629)
- [Google Gemini API](https://ai.google.dev/docs)
- [Serper.dev API](https://serper.dev/docs)
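
## Appendix: Sandbox Pre-check Sketch

The AST pre-check described under **Security** can be sketched roughly as follows. This is a minimal illustration under assumed names (`BLOCKED_MODULES`, `precheck`), not the project's actual `tools/code_executor.py`, whose blocklist covers more modules and calls:

```python
import ast

# Illustrative blocklists; the real sandbox blocks 10+ additional calls.
BLOCKED_MODULES = {"subprocess", "socket", "ctypes"}
BLOCKED_OS_CALLS = {"system", "fork", "execv", "popen"}

def precheck(source: str) -> list:
    """Return a list of violations found in the source, empty if none."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                violations.append(f"import from {node.module}")
        elif isinstance(node, ast.Attribute):
            # Catch attribute access such as os.system, os.fork, os.popen
            if (isinstance(node.value, ast.Name) and node.value.id == "os"
                    and node.attr in BLOCKED_OS_CALLS):
                violations.append(f"call to os.{node.attr}")
    return violations
```

Only code that passes a check like this would then run in the isolated `TemporaryDirectory` the Security section mentions.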
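
## Appendix: Injection Sanitizer Sketch

The prompt-injection protection described under **Security** runs scraped text through nine regex patterns before it reaches the LLM. The patterns below are invented for illustration; the project's actual rules are not listed in this README:

```python
import re

# Hypothetical patterns; the project itself ships nine such rules.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"you are now [^.\n]*", re.IGNORECASE),
    re.compile(r"system prompt\s*:", re.IGNORECASE),
]

def sanitize(scraped: str) -> str:
    """Replace instruction-like phrases in scraped page text with a marker."""
    for pattern in INJECTION_PATTERNS:
        scraped = pattern.sub("[filtered]", scraped)
    return scraped
```

Replacing matches with a visible marker (rather than deleting them) keeps the surrounding page text intact for the agent while neutralising the instruction-like span.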