Deaxu/ArchGraph
GitHub: Deaxu/ArchGraph
A security-first code intelligence tool for AI agents.
It parses 10 languages and builds a knowledge graph with taint analysis, CVE detection, and clustering.
It connects to any AI agent over MCP (Cursor, Claude Code, Windsurf, and others), giving the agent a deep understanding of code security.
## Why ArchGraph?
Other tools help you *understand* code. **ArchGraph helps you *secure* it.**
| | **ArchGraph** | **Code Search** | **AST Parsers** | **SAST Tools** |
|--|---------------|-----------------|-----------------|----------------|
| **Taint Analysis** | ✅ Input → Sink | ❌ | ❌ | ✅ |
| **CVE Detection** | ✅ Auto via OSV | ❌ | ❌ | Partial |
| **CFG / Data Flow** | ✅ libclang + tree-sitter | ❌ | Partial | ✅ |
| **MCP for AI Agents** | ✅ 7 tools | ❌ | ❌ | ❌ |
| **Functional Clustering** | ✅ Community detection | ❌ | ❌ | ❌ |
| **Execution Tracing** | ✅ Entry → Sink flows | ❌ | ❌ | ❌ |
| **Export (JSON/GraphML)** | ✅ | ❌ | ❌ | Partial |
| **Local-first** | ✅ Neo4j | Varies | ✅ | Varies |
## Quick Start
```
# Install
pip install archgraph
# Extract the code graph (languages are auto-detected)
archgraph extract /path/to/repo -w 4
# Query the graph
archgraph query "MATCH (f:Function {is_input_source: true}) RETURN f.name, f.file"
# Start the web dashboard
archgraph serve --port 8080
# Generate an HTML security report
archgraph report /path/to/repo
```
**With Docker (Neo4j included):**
```
docker compose up -d neo4j # password: archgraph
archgraph extract /path/to/repo --neo4j-password archgraph
```
## 🤖 AI Agent Integration (MCP)
ArchGraph exposes 7 tools and 4 resources to any MCP-compatible agent.
### Setup
```
# Index your repo
archgraph extract . --include-cve --include-clustering
# Start the MCP server
archgraph mcp
```
**Connect your agent:**
| Agent | Command |
|-------|---------|
| Claude Code | `claude mcp add archgraph -- archgraph mcp` |
| Cursor | Add to `~/.cursor/mcp.json` |
| Windsurf | Add to MCP config |
| OpenCode | Add to `~/.config/opencode/config.json` |
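
For agents configured through a JSON file, the entry usually just names the command to launch. The following is a minimal sketch, assuming the `mcpServers` layout that Cursor uses for stdio servers and that `archgraph` is on your `PATH`. The helper names `add_archgraph_server` and `update_config_file` are hypothetical, and other agents may expect a different schema.

```python
import json
from pathlib import Path


def add_archgraph_server(config: dict) -> dict:
    """Return a copy of an MCP config with an `archgraph` stdio server entry."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    # Same command the Claude Code row runs: `archgraph mcp`.
    servers["archgraph"] = {"command": "archgraph", "args": ["mcp"]}
    updated["mcpServers"] = servers
    return updated


def update_config_file(path: Path) -> None:
    """Merge the entry into the JSON file at `path`, creating it if needed."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_archgraph_server(existing), indent=2) + "\n")


# Print the entry rather than modifying ~/.cursor/mcp.json directly.
print(json.dumps(add_archgraph_server({}), indent=2))
```

Existing server entries are preserved, so the merge can be re-run safely. Call `update_config_file(Path.home() / ".cursor" / "mcp.json")` only after checking the printed entry against your agent's documentation.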
### What your agent gets
**Tools:** `query`, `impact`, `context`, `detect_changes`, `find_vulnerabilities`, `cypher`, `stats`
**Resources:** `archgraph://schema`, `archgraph://security`, `archgraph://clusters`, `archgraph://processes`
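
Under the hood, an MCP client reaches these tools and resources with JSON-RPC 2.0 messages. The sketch below builds the two request shapes defined by the MCP specification (`tools/call` and `resources/read`). The tool argument names are not documented in this README, so the `{"query": ...}` argument in the usage note is an assumption; the authoritative input schemas come from the server's `tools/list` response.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP `tools/call` request for one of the tools listed above."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


def resource_read(uri: str) -> dict:
    """Build an MCP `resources/read` request for an `archgraph://` URI."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "resources/read",
        "params": {"uri": uri},
    }


# Show the wire format of a resource read.
print(json.dumps(resource_read("archgraph://schema"), indent=2))
```

A call such as `tool_call("cypher", {"query": "MATCH (f:Function) RETURN count(f)"})` would then be written to the server's stdin by the client library. In practice the agent does this for you.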
### Example conversation
```
You: "Are there any buffer overflow risks in the network code?"
Agent:
1. Queries input sources in network files
2. Traces taint paths to dangerous sinks
3. Reports: "Found 2 paths:
- net_recv() → memcpy() in src/net/handler.c (depth: 3)
- read_packet() → strcpy() in src/net/parser.c (depth: 4)
Both reach dangerous sinks without validation."
```
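## 🔒 Security Features
**Automatic labeling:** every function receives security labels:
- `is_input_source`: reads external data (recv, read, fetch, ...)
- `is_dangerous_sink`: performs a dangerous operation (memcpy, exec, eval, ...)
- `is_allocator`, `is_crypto`, `is_parser`: additional categories
- `risk_score`: a 0-100 risk score derived from the labels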
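
To make the labeling idea concrete, here is a rough sketch of how labels could be derived and folded into a score. It assumes a function is labeled when it directly calls one of the example names listed above. The weights and the helpers `label_function` and `risk_score` are invented for illustration; ArchGraph's actual rules and scoring formula are not described in this README.

```python
INPUT_SOURCES = {"recv", "read", "fetch"}      # examples from the list above
DANGEROUS_SINKS = {"memcpy", "exec", "eval"}   # examples from the list above

# Hypothetical weights: the real scoring formula is not documented here.
WEIGHTS = {"is_input_source": 40, "is_dangerous_sink": 50}


def label_function(callees: set) -> dict:
    """Assign boolean security labels from the names a function calls."""
    return {
        "is_input_source": bool(callees & INPUT_SOURCES),
        "is_dangerous_sink": bool(callees & DANGEROUS_SINKS),
    }


def risk_score(labels: dict) -> int:
    """Sum the weights of the active labels, capped at 100."""
    total = sum(weight for name, weight in WEIGHTS.items() if labels.get(name))
    return min(total, 100)


labels = label_function({"recv", "memcpy", "strlen"})
print(labels, risk_score(labels))
```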
**Taint path detection:**
```
MATCH path = (src:Function {is_input_source: true})-[:CALLS*1..8]->(sink:Function {is_dangerous_sink: true})
RETURN src.name, sink.name, length(path) AS depth
```
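
Conceptually, the query above is a bounded-depth search along `CALLS` edges from every input source to every dangerous sink. The sketch below reproduces that idea on an in-memory call graph. It is a simplification: it reports only the shortest depth per source/sink pair, whereas the Cypher pattern returns every matching path. The function `taint_paths` and the intermediate function names in the sample graph are hypothetical.

```python
from collections import deque


def taint_paths(calls, sources, sinks, max_depth=8):
    """Shortest call depth from each source to each reachable sink.

    calls: dict mapping a function name to the list of functions it calls.
    Returns (source, sink, depth) tuples with 1 <= depth <= max_depth.
    """
    results = []
    for src in sources:
        seen = {src}
        queue = deque([(src, 0)])
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue  # mirrors the upper bound in CALLS*1..8
            for callee in calls.get(node, []):
                if callee in seen:
                    continue
                seen.add(callee)
                if callee in sinks:
                    results.append((src, callee, depth + 1))
                queue.append((callee, depth + 1))
    return results


# Hypothetical call graph: net_recv -> parse -> copy_buf -> memcpy
calls = {"net_recv": ["parse"], "parse": ["copy_buf"], "copy_buf": ["memcpy"]}
print(taint_paths(calls, {"net_recv"}, {"memcpy"}))  # [('net_recv', 'memcpy', 3)]
```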
**CVE enrichment:**
```
archgraph extract . --include-cve # Queries OSV API automatically
```
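
For readers unfamiliar with OSV, this is roughly what a lookup against its public query endpoint looks like. It is a sketch of the OSV API itself, not of ArchGraph's internals: how ArchGraph batches, caches, or maps results onto graph nodes is not covered here, and the helper names `osv_payload` and `query_osv` are hypothetical.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def osv_payload(name: str, ecosystem: str, version: str) -> dict:
    """Request body for querying one package version."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def query_osv(name: str, ecosystem: str, version: str, timeout: float = 10.0) -> list:
    """Return the ids of known vulnerabilities for a package version."""
    body = json.dumps(osv_payload(name, ecosystem, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    # OSV returns an empty object when nothing matches.
    return [vuln["id"] for vuln in data.get("vulns", [])]
```

Calling `query_osv("jinja2", "PyPI", "2.4.1")` performs a live network request and returns a list of advisory identifiers.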
## All Commands
| Command | Description |
|---------|-------------|
| `extract` | Extract code graph from repository |
| `query` | Run Cypher queries against the graph |
| `stats` | Show node/edge statistics |
| `schema` | Show graph schema |
| `diff` | Compare repo state vs stored graph |
| `impact` | Blast radius analysis for a function |
| `export` | Export to JSON, GraphML, or CSV |
| `report` | Generate HTML security report |
| `serve` | Start web dashboard |
| `mcp` | Start MCP server for AI agents |
| `skills` | Generate AI agent skill files |
| `repos` | List indexed repositories |
## Use Cases
### Security audit
```
archgraph extract /target -l c,cpp --include-cve --include-clang
archgraph query "MATCH path = (src:Function {is_input_source: true})-[:CALLS*1..5]->(sink:Function {is_dangerous_sink: true}) RETURN src.name, sink.name"
```
### Code review
```
archgraph diff /path/to/repo
archgraph impact "func:src/api.c:handle:42" --direction both
```
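
Blast-radius analysis can be thought of as transitive reachability over the call graph: callers in one direction, callees in the other. The sketch below illustrates that idea under stated assumptions. Only `both` appears as a direction value in this README, so the names `upstream` and `downstream`, the `impact` and `reachable` helpers, and the sample graph are all hypothetical.

```python
def reachable(start, edges):
    """All nodes reachable from `start` by following `edges`, excluding `start`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def impact(calls, target, direction="both"):
    """Transitive callers (upstream) and callees (downstream) of `target`."""
    # Invert the caller -> callees map to get callee -> callers.
    reverse = {}
    for caller, callees in calls.items():
        for callee in callees:
            reverse.setdefault(callee, []).append(caller)
    result = {}
    if direction in ("upstream", "both"):
        result["upstream"] = reachable(target, reverse)
    if direction in ("downstream", "both"):
        result["downstream"] = reachable(target, calls)
    return result


calls = {"main": ["handle"], "handle": ["parse", "log"], "parse": ["memcpy"]}
print(impact(calls, "handle"))
```

Here `handle` has one transitive caller (`main`) and three transitive callees (`parse`, `log`, `memcpy`), which is the set of functions a reviewer would want to re-check after changing it.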
### Reverse engineering
```
archgraph extract /binary/project -l c,cpp,rust --include-clang --include-deep
archgraph query "MATCH (f:Function) WHERE f.is_exported = true RETURN f.name, f.file"
```
## Architecture
```
┌──────────────────────────────────────────────────┐
│ GraphBuilder Pipeline (11 steps) │
│ │
Local Path ─────┤ 1. Tree-sitter structural extraction │
or │ 2. Git history │
GitHub URL ─────┤ 3. Dependency extraction │──── Neo4j
(auto clone) │ 4. Annotation scanning │ Store
│ 5. Security labeling │ │
│ 6. Clang deep analysis (C/C++) │ ├── MCP Server
│ 7. Tree-sitter deep analysis (Rust/Java/Go/…) │ ├── Web Dashboard
│ 8. Churn enrichment │ └── Export/Report
│ 9. CVE enrichment (OSV API) │
│ 10. Clustering (community detection) │
│ 11. Process tracing (execution flows) │
└──────────────────────────────────────────────────┘
```
## Benchmarks
| Project | Language | Files | Nodes | Edges | Time |
|---------|----------|-------|-------|-------|------|
| [zlib](https://github.com/madler/zlib) (~50K LOC) | C | 79 | 2,389 | 3,968 | 6.6s |
| [fastify](https://github.com/fastify/fastify) (~30K LOC) | JavaScript | 487 | 2,810 | 18,472 | 10.5s |
| Linux `drivers/usb` (~500K LOC) | C | 892 | 62,812 | 122,746 | 12.7s |
*Benchmark environment: Windows 11, Python 3.13, single-threaded. Parallel mode (`-w 4`) is 2-3x faster.*
## Documentation
| Document | Description |
|----------|-------------|
| [Architecture & Schema](docs/ARCHITECTURE.md) | Graph schema, node/edge types, pipeline |
| [CLI Reference](docs/CLI.md) | All commands and options |
| [AI Agent Integration](docs/AGENT.md) | MCP setup, tools, examples |
| [Security Analysis](docs/SECURITY.md) | Security labeling, Cypher queries |
| [Deep Analysis](docs/DEEP_ANALYSIS.md) | CFG, data flow, taint tracking |
| [Roadmap](docs/ROADMAP.md) | Development phases |
## Tests
```
pytest tests/ -v # 137 passed, 22 skipped
```
No external services are required. The tests use temporary directories with real tree-sitter parsing and git operations.
## License
[MIT](LICENSE)