Deaxu/ArchGraph

GitHub: Deaxu/ArchGraph

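A security-first code intelligence tool for AI agents. It supports taint analysis, CVE detection, and knowledge graph construction, and uses the MCP protocol to give AI agents a deep understanding of code security.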

Stars: 0 | Forks: 0

ArchGraph

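A security-first code intelligence tool for AI agents.
It parses 10 languages and builds a knowledge graph with taint analysis, CVE detection, and clustering.
It connects to any AI agent over MCP: Cursor, Claude Code, Windsurf, and others.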

## Why ArchGraph?

Other tools help you *understand* code. **ArchGraph helps you *secure* it.**

| | **ArchGraph** | **Code Search** | **AST Parsers** | **SAST Tools** |
|--|---------------|-----------------|-----------------|----------------|
| **Taint Analysis** | ✅ Input → Sink | ❌ | ❌ | ✅ |
| **CVE Detection** | ✅ Auto via OSV | ❌ | ❌ | Partial |
| **CFG / Data Flow** | ✅ libclang + tree-sitter | ❌ | Partial | ✅ |
| **MCP for AI Agents** | ✅ 7 tools | ❌ | ❌ | ❌ |
| **Functional Clustering** | ✅ Community detection | ❌ | ❌ | ❌ |
| **Execution Tracing** | ✅ Entry → Sink flows | ❌ | ❌ | ❌ |
| **Export (JSON/GraphML)** | ✅ | ❌ | ❌ | Partial |
| **Local-first** | ✅ Neo4j | Varies | ✅ | Varies |

## Quick Start

```
# Install
pip install archgraph

# Extract (languages are auto-detected)
archgraph extract /path/to/repo -w 4

# Query the graph
archgraph query "MATCH (f:Function {is_input_source: true}) RETURN f.name, f.file"

# Start the web dashboard
archgraph serve --port 8080

# Generate an HTML security report
archgraph report /path/to/repo
```

**With Docker (includes Neo4j):**

```
docker compose up -d neo4j    # password: archgraph
archgraph extract /path/to/repo --neo4j-password archgraph
```

## 🤖 AI Agent Integration (MCP)

ArchGraph exposes 7 tools and 4 resources to any MCP-compatible agent.

### Setup

```
# Index your repo
archgraph extract . --include-cve --include-clustering

# Start the MCP server
archgraph mcp
```

**Connect your agent:**

| Agent | Command |
|-------|---------|
| Claude Code | `claude mcp add archgraph -- archgraph mcp` |
| Cursor | Add to `~/.cursor/mcp.json` |
| Windsurf | Add to MCP config |
| OpenCode | Add to `~/.config/opencode/config.json` |

### What Your Agent Gets

**Tools:** `query`, `impact`, `context`, `detect_changes`, `find_vulnerabilities`, `cypher`, `stats`

**Resources:** `archgraph://schema`, `archgraph://security`, `archgraph://clusters`, `archgraph://processes`

### Example Conversation

```
You: "Are there any buffer overflow risks in the network code?"

Agent: 1. Queries input sources in network files
       2. Traces taint paths to dangerous sinks
       3. Reports: "Found 2 paths:
          - net_recv() → memcpy() in src/net/handler.c (depth: 3)
          - read_packet() → strcpy() in src/net/parser.c (depth: 4)
          Both reach dangerous sinks without validation."
```

## 🔒 Security Features

**Automatic labeling:** every function receives security labels:

- `is_input_source`: reads external data (recv, read, fetch, ...)
- `is_dangerous_sink`: performs a dangerous operation (memcpy, exec, eval, ...)
- `is_allocator`, `is_crypto`, `is_parser`: additional categories
- `risk_score`: a 0-100 risk score derived from the labels

**Taint path detection:**

```
MATCH path = (src:Function {is_input_source: true})-[:CALLS*1..8]->(sink:Function {is_dangerous_sink: true})
RETURN src.name, sink.name, length(path) AS depth
```

**CVE enrichment:**

```
archgraph extract . --include-cve  # Queries OSV API automatically
```

## All Commands

| Command | Description |
|---------|-------------|
| `extract` | Extract code graph from repository |
| `query` | Run Cypher queries against the graph |
| `stats` | Show node/edge statistics |
| `schema` | Show graph schema |
| `diff` | Compare repo state vs stored graph |
| `impact` | Blast radius analysis for a function |
| `export` | Export to JSON, GraphML, or CSV |
| `report` | Generate HTML security report |
| `serve` | Start web dashboard |
| `mcp` | Start MCP server for AI agents |
| `skills` | Generate AI agent skill files |
| `repos` | List indexed repositories |

## Use Cases

### Security Audit

```
archgraph extract /target -l c,cpp --include-cve --include-clang
archgraph query "MATCH path = (src:Function {is_input_source: true})-[:CALLS*1..5]->(sink:Function {is_dangerous_sink: true}) RETURN src.name, sink.name"
```

### Code Review

```
archgraph diff /path/to/repo
archgraph impact "func:src/api.c:handle:42" --direction both
```

### Reverse Engineering

```
archgraph extract /binary/project -l c,cpp,rust --include-clang --include-deep
archgraph query "MATCH (f:Function) WHERE f.is_exported = true RETURN f.name, f.file"
```

## Architecture

```
                ┌──────────────────────────────────────────────────┐
                │         GraphBuilder Pipeline (11 steps)         │
                │                                                  │
Local Path ─────┤  1. Tree-sitter structural extraction            │
    or          │  2. Git history                                  │
GitHub URL ─────┤  3. Dependency extraction                        │──── Neo4j
(auto clone)    │  4. Annotation scanning                          │     Store
                │  5. Security labeling                            │       │
                │  6. Clang deep analysis (C/C++)                  │       ├── MCP Server
                │  7. Tree-sitter deep analysis (Rust/Java/Go/…)   │       ├── Web Dashboard
                │  8. Churn enrichment                             │       └── Export/Report
                │  9. CVE enrichment (OSV API)                     │
                │ 10. Clustering (community detection)             │
                │ 11. Process tracing (execution flows)            │
                └──────────────────────────────────────────────────┘
```

## Benchmarks

| Project | Language | Files | Nodes | Edges | Time |
|---------|----------|-------|-------|-------|------|
| [zlib](https://github.com/madler/zlib) (~50K LOC) | C | 79 | 2,389 | 3,968 | 6.6s |
| [fastify](https://github.com/fastify/fastify) (~30K LOC) | JavaScript | 487 | 2,810 | 18,472 | 10.5s |
| Linux `drivers/usb` (~500K LOC) | C | 892 | 62,812 | 122,746 | 12.7s |

*Benchmark environment: Windows 11, Python 3.13, single-threaded. Parallel mode (`-w 4`) is 2-3x faster.*

## Documentation

| Document | Description |
|----------|-------------|
| [Architecture & Schema](docs/ARCHITECTURE.md) | Graph schema, node/edge types, pipeline |
| [CLI Reference](docs/CLI.md) | All commands and options |
| [AI Agent Integration](docs/AGENT.md) | MCP setup, tools, examples |
| [Security Analysis](docs/SECURITY.md) | Security labeling, Cypher queries |
| [Deep Analysis](docs/DEEP_ANALYSIS.md) | CFG, data flow, taint tracking |
| [Roadmap](docs/ROADMAP.md) | Development phases |

## Testing

```
pytest tests/ -v    # 137 passed, 22 skipped
```

No external services are required. The tests run in temporary directories with real tree-sitter parsing and git operations.

## License

[MIT](LICENSE)
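## Sketch: bounded taint path search

The taint path query shown under the security features matches call chains of 1 to 8 `CALLS` hops from an input source to a dangerous sink. The following is a minimal, self-contained sketch of that idea over an in-memory call graph. It is not ArchGraph's implementation: `taint_paths`, the dictionary-based graph, and the intermediate function names (`handle_packet`, `copy_payload`, and so on) are hypothetical and exist only for illustration. It also only follows simple paths (no repeated functions), which is a simplification of how Cypher matches variable-length patterns.

```python
from typing import Dict, Iterator, List, Set


def taint_paths(
    calls: Dict[str, List[str]],
    sources: Set[str],
    sinks: Set[str],
    max_depth: int = 8,
) -> Iterator[List[str]]:
    """Yield call chains from a source to a sink using 1..max_depth hops."""

    def walk(path: List[str]) -> Iterator[List[str]]:
        hops = len(path) - 1
        # A path counts once it has at least one hop and ends at a sink.
        if hops >= 1 and path[-1] in sinks:
            yield list(path)
        if hops == max_depth:
            return
        for callee in calls.get(path[-1], ()):
            if callee not in path:  # skip cycles: simple paths only
                path.append(callee)
                yield from walk(path)
                path.pop()

    for src in sorted(sources):
        yield from walk([src])


# Hypothetical call graph; only the endpoints echo the example conversation.
calls = {
    "net_recv": ["handle_packet"],
    "handle_packet": ["copy_payload"],
    "copy_payload": ["memcpy"],
    "read_packet": ["parse_header"],
    "parse_header": ["parse_field"],
    "parse_field": ["store_field"],
    "store_field": ["strcpy"],
}
sources = {"net_recv", "read_packet"}
sinks = {"memcpy", "strcpy"}

results = [(p[0], p[-1], len(p) - 1) for p in taint_paths(calls, sources, sinks)]
shallow = [(p[0], p[-1], len(p) - 1) for p in taint_paths(calls, sources, sinks, max_depth=3)]
print(results)  # [('net_recv', 'memcpy', 3), ('read_packet', 'strcpy', 4)]
```

Lowering `max_depth` narrows the search in the same way that changing `*1..8` to `*1..5` does in the security audit query: with `max_depth=3`, only the `net_recv` chain is reported.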

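Tags: AI security, AST parsing, Chat Copilot, CISA project, Claude, Claude Code, Cursor, CVE detection, DNS rebinding attack, Function Clustering, MCP integration, Neo4j, Python, SAST, Tree-sitter, code security, code intelligence, LLM tools, control flow graph, no backdoor, blind injection attack, request interception, software supply chain security, remote method invocation, reverse engineering tools, error-based detection, static code analysis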