Panda - Agentic AI Code Verification
An autonomous AI agent that inspects backend code, detects bugs, proposes fixes, and iterates until all tests pass.
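## Overview
Panda is an **agentic AI system** that autonomously verifies and repairs backend code. Unlike a simple LLM wrapper, Panda implements a complete **agent loop**:
- **Observe**: run the tests and capture failures
- **Reason**: analyze the errors and identify the root cause
- **Act**: generate and apply code fixes
- **Iterate**: repeat until the tests pass or the maximum number of retries is reached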
```
graph LR
A[Broken Code] --> B[Panda Agent]
B --> C{Tests Pass?}
C -->|No| D[AI Analysis]
D --> E[Generate Fix]
E --> F[Apply Fix]
F --> B
C -->|Yes| G[Fixed Code]
style A fill:#ff6b6b,color:#fff
style G fill:#51cf66,color:#fff
style B fill:#339af0,color:#fff
```
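The observe, reason, act, iterate cycle can be sketched in a few lines of TypeScript. This is a minimal illustration, not Panda's actual orchestrator: `runAgentLoop`, `TestResult`, `LoopResult`, and the two callbacks are hypothetical names, and the loop is synchronous for brevity, whereas a real agent would await test runs and LLM calls.

```typescript
// Hypothetical shapes for illustration; none of these names come from the Panda codebase.
interface TestResult {
  passed: boolean;
  errors: string[];
}

interface LoopResult {
  status: "passed" | "failed";
  iterationsUsed: number;
  code: string;
}

function runAgentLoop(
  initialCode: string,
  runTests: (code: string) => TestResult, // Observe
  proposeFix: (code: string, errors: string[]) => string, // Reason + Act
  maxIterations: number = 5,
): LoopResult {
  let code = initialCode;
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const result = runTests(code);
    if (result.passed) {
      return { status: "passed", iterationsUsed: iteration, code };
    }
    // Skip the fix on the last iteration so unverified code is never returned.
    if (iteration < maxIterations) {
      code = proposeFix(code, result.errors);
    }
  }
  return { status: "failed", iterationsUsed: maxIterations, code };
}
```

In this sketch an iteration is one verification run, so code that fails once and passes after a single fix reports `iterationsUsed: 2`.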
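## Features
| Feature | Description |
|---------|-------------|
| **Autonomous fixing** | Submit broken code and get working code back, with no manual intervention |
| **Agent Loop** | An iterative verify → analyze → fix → re-verify cycle |
| **Static analysis** | Detects missing imports and undefined references before runtime |
| **Missing-import auto-fix** | Automatically restores imports that were removed from the suggested code |
| **Real-time updates** | Live progress tracking over WebSocket |
| **Smart model selection** | Uses GPT-4o for code tasks, with a fallback mechanism |
| **Sandboxed execution** | Isolated test environment powered by Vitest |
| **REST API** | Simple HTTP API for integrations |
| **Web UI** | Visual interface for submitting and monitoring tasks |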
## Live Demo
The Panda Agent is deployed and available to use:
| Service | URL |
|---------|-----|
| **API** | https://panda-api-production-c8e2.up.railway.app |
| **Web UI** | https://panda-web-production.up.railway.app |
| **Health check** | https://panda-api-production-c8e2.up.railway.app/health |
### Quick Test
```
# Check that the API is healthy
curl https://panda-api-production-c8e2.up.railway.app/health
```
## Quick Start
### Prerequisites
- Node.js 20+
- pnpm 8+
- OpenAI API Key
### Installation
```
# Clone the repository
git clone https://github.com/ChanMeng666/panda-agent.git
cd panda-agent
# Install dependencies
pnpm install
# Configure the environment
cp .env.example .env.local
# Edit .env.local and add your OPENAI_API_KEY
```
### Development
```
# Start the API and web servers together
pnpm dev
# Or start them individually
pnpm dev:api # API server on http://localhost:3000
pnpm dev:web # Web UI on http://localhost:3001
```
### Production Build
```
pnpm build
pnpm start
```
## Architecture
### System Overview
```
flowchart TB
subgraph Client["Client Layer"]
WEB[Web Interface]
API_CLIENT[API Client]
end
subgraph API["API Server (Express)"]
ROUTES[Routes]
WS[WebSocket Server]
end
subgraph Core["Agent Core"]
ORCH[Orchestrator]
STATIC[Static Analyzer]
VERIFY[Verifier]
SANDBOX[Code Sandbox]
end
subgraph LLM["LLM Layer"]
PROVIDER[OpenAI Provider]
STRATEGY[Model Strategy]
end
subgraph External["External Services"]
OPENAI[OpenAI API]
VITEST[Vitest Runner]
end
WEB --> ROUTES
API_CLIENT --> ROUTES
ROUTES --> ORCH
ROUTES --> WS
ORCH --> STATIC
ORCH --> VERIFY
ORCH --> SANDBOX
ORCH --> PROVIDER
PROVIDER --> STRATEGY
PROVIDER --> OPENAI
SANDBOX --> VITEST
ORCH -.-> WS
style ORCH fill:#339af0,color:#fff
style OPENAI fill:#412991,color:#fff
```
### Agent Loop Flow
```
sequenceDiagram
participant C as Client
participant O as Orchestrator
participant SA as Static Analyzer
participant V as Verifier
participant A as AI Analyzer
participant F as Fix Generator
participant S as Sandbox
C->>O: Submit Task (code + tests)
O->>SA: Run Static Analysis
alt Missing Imports Found
SA-->>O: Suggest Import Fixes
O->>SA: Apply Auto-fixes
SA-->>O: Re-analyze (passed)
else Unfixable Errors
SA-->>O: Analysis Failed
O-->>C: Return Error
end
O->>S: Initialize Sandbox
loop Until Pass or Max Iterations
O->>V: Run Verification
V->>S: Execute Tests
S-->>V: Test Results
alt Tests Pass
V-->>O: Success
O-->>C: Return Fixed Code
else Tests Fail
V-->>O: Failure + Errors
O->>A: Analyze Failure
A-->>O: Root Cause Analysis
O->>F: Generate Fix
F-->>O: Code Patches
O->>S: Apply Fixes
end
end
O->>S: Cleanup Sandbox
O-->>C: Final Result
```
### Component Details
| Component | File | Responsibility |
|-----------|------|----------------|
| **Orchestrator** | `core/orchestrator.ts` | Manages the agent loop and coordinates all phases |
| **Verifier** | `core/verifier.ts` | Runs tests, linting, and type checking |
| **Sandbox** | `core/sandbox.ts` | Isolated execution environment |
| **Static Analyzer** | `core/static-analyzer.ts` | Detects missing imports before runtime and generates auto-fixes |
| **OpenAI Provider** | `llm/openai-provider.ts` | LLM integration and prompt management |
| **Model Strategy** | `llm/model-strategy.ts` | Smart model selection |
## API Documentation
### Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/api/verify` | Submit a verification task |
| `GET` | `/api/task/:taskId` | Get task status |
| `GET` | `/api/task/:taskId/result` | Get the final result |
| `GET` | `/health` | Health check |
| `GET` | `/health/ready` | Readiness check |
| `WS` | `/ws?taskId=xxx` | Real-time updates |
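For client code, it can help to build these URLs in one place. The helper below is a sketch under assumptions: `pandaEndpoints` is a hypothetical name, only the paths come from the table above, and the WebSocket origin is assumed to be the API origin with `http` swapped for `ws` (which matches the `wss://` URL returned by the deployed API).

```typescript
// Hypothetical helper; only the paths themselves come from the endpoint table.
function pandaEndpoints(baseUrl: string) {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const wsBase = base.replace(/^http/, "ws"); // http -> ws, https -> wss
  return {
    verify: `${base}/api/verify`,
    task: (taskId: string) => `${base}/api/task/${encodeURIComponent(taskId)}`,
    result: (taskId: string) => `${base}/api/task/${encodeURIComponent(taskId)}/result`,
    health: `${base}/health`,
    ready: `${base}/health/ready`,
    ws: (taskId: string) => `${wsBase}/ws?taskId=${encodeURIComponent(taskId)}`,
  };
}
```

Task IDs are URL-encoded because they are supplied by the caller and end up in both a path segment and a query string.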
### Submit a Verification Task
**Request:**
```
curl -X POST https://panda-api-production-c8e2.up.railway.app/api/verify \
-H "Content-Type: application/json" \
-d '{
"task_id": "my-task-001",
"files": [
{
"path": "src/calculator.js",
"current_code": "export function divide(a, b) {\n return a / b;\n}",
"suggested_code": "export function divide(a, b) {\n return a / b;\n}"
},
{
"path": "src/calculator.test.js",
"current_code": "import { divide } from \"./calculator.js\";\nimport { describe, test, expect } from \"vitest\";\n\ndescribe(\"calculator\", () => {\n test(\"division by zero should throw error\", () => {\n expect(() => divide(10, 0)).toThrow();\n });\n});",
"suggested_code": "..."
}
],
"instructions": {
"goal": "Fix the code to handle division by zero properly"
},
"config": {
"max_iterations": 5,
"model_strategy": "balanced"
}
}'
```
**Response (queued):**
```
{
"task_id": "my-task-001",
"status": "queued",
"websocket_url": "wss://panda-api-production-c8e2.up.railway.app/ws?taskId=my-task-001",
"message": "Task queued. Connect to WebSocket for real-time updates."
}
```
**Final result (success):**
```
{
"task_id": "my-task-001",
"status": "passed",
"summary": "Code verification passed after 2 iteration(s) with AI-assisted fixes.",
"files": [
{
"path": "src/calculator.js",
"applied": true,
"final_code": "export function divide(a, b) {\n if (b === 0) {\n throw new Error('Division by zero');\n }\n return a / b;\n}"
},
{
"path": "src/calculator.test.js",
"applied": true,
"final_code": "..."
}
],
"confidence_score": 0.67,
"iterations_used": 2,
"total_tokens_used": 1500,
"duration_ms": 16085
}
```
### WebSocket Events
Connect to receive real-time updates:
```
const ws = new WebSocket('wss://panda-api-production-c8e2.up.railway.app/ws?taskId=my-task-001');
ws.onmessage = (event) => {
const message = JSON.parse(event.data);
switch (message.type) {
case 'state':
console.log('Phase:', message.data.phase);
console.log('Iteration:', message.data.iteration);
break;
case 'log':
console.log(`[${message.data.type}] ${message.data.message}`);
break;
case 'iteration':
console.log('Iteration complete:', message.data);
break;
case 'complete':
console.log('Task complete:', message.data.status);
break;
}
};
```
### Input/Output Schema
```
classDiagram
class TaskInput {
+string task_id
+FileInput[] files
+Instructions instructions
+TaskConfig config
}
class FileInput {
+string path
+string current_code
+string suggested_code
}
class TaskOutput {
+string task_id
+string status
+string summary
+FileOutput[] files
+number confidence_score
+number iterations_used
+number duration_ms
}
class FileOutput {
+string path
+boolean applied
+string final_code
}
TaskInput --> FileInput
TaskOutput --> FileOutput
```
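The input side of this schema maps naturally onto TypeScript interfaces plus a small runtime check. The sketch below is illustrative rather than Panda's own `types.ts`: the fields of `instructions` and `config` are inferred from the request example, and which fields are required is an assumption.

```typescript
interface FileInput {
  path: string;
  current_code: string;
  suggested_code: string;
}

interface TaskInput {
  task_id: string;
  files: FileInput[];
  instructions: { goal: string }; // field list inferred from the request example
  config?: { max_iterations?: number; model_strategy?: "quality" | "balanced" | "economy" };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a list of problems; an empty list means the payload looks usable.
function validateTaskInput(input: unknown): string[] {
  if (!isRecord(input)) return ["payload must be a JSON object"];
  const errors: string[] = [];

  const taskId = input["task_id"];
  if (typeof taskId !== "string" || taskId.length === 0) {
    errors.push("task_id must be a non-empty string");
  }

  const files = input["files"];
  if (!Array.isArray(files) || files.length === 0) {
    errors.push("files must be a non-empty array");
  } else {
    files.forEach((file: unknown, i: number) => {
      if (!isRecord(file)) {
        errors.push(`files[${i}] must be an object`);
        return;
      }
      for (const key of ["path", "current_code", "suggested_code"]) {
        if (typeof file[key] !== "string") errors.push(`files[${i}].${key} must be a string`);
      }
    });
  }

  const instructions = input["instructions"];
  if (!isRecord(instructions) || typeof instructions["goal"] !== "string") {
    errors.push("instructions.goal must be a string");
  }
  return errors;
}
```

Returning every problem at once, instead of throwing on the first, lets a client correct a malformed payload in a single round trip.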
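## Model Strategy
The agent selects a model based on task complexity and escalates to GPT-4-turbo once a fix attempt reaches its third iteration: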
```
flowchart TD
START[Task Received] --> CHECK{Assess Complexity}
CHECK -->|Simple| MINI[GPT-4o-mini]
CHECK -->|Medium| PRIMARY[GPT-4o]
CHECK -->|Complex| PRIMARY
PRIMARY --> RETRY{Iteration >= 3?}
RETRY -->|Yes| TURBO[GPT-4-turbo]
RETRY -->|No| PRIMARY
MINI --> DONE[Generate Fix]
TURBO --> DONE
style PRIMARY fill:#339af0,color:#fff
style MINI fill:#51cf66,color:#fff
style TURBO fill:#fab005,color:#fff
```
| Strategy | Primary model | Use case |
|----------|---------------|----------|
| `quality` | GPT-4o | Best results, complex bugs |
| `balanced` | GPT-4o | Recommended for most scenarios |
| `economy` | GPT-4o-mini | High volume, simple fixes |
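The flowchart above can be read as a small pure function. The sketch below follows only the flowchart; `selectModel` and the `Complexity` type are hypothetical names, how complexity is assessed is out of scope, and how the `quality`, `balanced`, and `economy` presets feed into the decision is not covered here.

```typescript
type Complexity = "simple" | "medium" | "complex";
type Model = "gpt-4o-mini" | "gpt-4o" | "gpt-4-turbo";

// Mirrors the flowchart: simple tasks use the mini model; everything else
// starts on GPT-4o and escalates once the iteration count reaches 3.
function selectModel(complexity: Complexity, iteration: number): Model {
  if (complexity === "simple") return "gpt-4o-mini";
  return iteration >= 3 ? "gpt-4-turbo" : "gpt-4o";
}
```

For example, `selectModel("medium", 1)` returns `"gpt-4o"`, while the same task on its third iteration returns `"gpt-4-turbo"`.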
## Project Structure
```
panda-agent/
├── apps/
│ ├── api/ # Backend API Server
│ │ ├── src/
│ │ │ ├── core/
│ │ │ │ ├── orchestrator.ts # Main agent loop controller
│ │ │ │ ├── verifier.ts # Test/lint/typecheck runner
│ │ │ │ ├── sandbox.ts # Isolated code execution
│ │ │ │ ├── static-analyzer.ts # Missing imports detection & auto-fix
│ │ │ │ └── types.ts # Type definitions
│ │ │ ├── llm/
│ │ │ │ ├── openai-provider.ts # OpenAI integration
│ │ │ │ ├── config.ts # Model configurations
│ │ │ │ └── model-strategy.ts # Smart model selection
│ │ │ ├── routes/
│ │ │ │ ├── verify.ts # Verification endpoints
│ │ │ │ └── health.ts # Health check endpoints
│ │ │ ├── utils/
│ │ │ │ └── logger.ts # Logging utilities
│ │ │ └── index.ts # Server entry point
│ │ ├── Dockerfile
│ │ ├── package.json
│ │ └── tsconfig.json
│ │
│ └── web/ # Frontend Web UI
│ ├── src/
│ │ ├── app/ # Next.js App Router
│ │ │ ├── page.tsx # Main page
│ │ │ └── layout.tsx # Root layout
│ │ ├── components/
│ │ │ ├── CodeEditor.tsx # Monaco editor wrapper
│ │ │ ├── TaskStatus.tsx # Real-time status display
│ │ │ └── ResultViewer.tsx # Final results view
│ │ └── lib/
│ │ └── api.ts # API client
│ ├── package.json
│ └── tsconfig.json
│
├── packages/
│ └── shared/ # Shared Types & Utilities
│ └── src/
│ └── index.ts
│
├── Dockerfile.api # API production Dockerfile
├── Dockerfile.web # Web production Dockerfile
├── pnpm-workspace.yaml # Monorepo configuration
├── package.json # Root package.json
└── README.md # This file
```
## Deployment
### Railway (Recommended)
The project is deployed on Railway as two services:
```
flowchart LR
subgraph Railway["Railway Platform"]
API["panda-api<br/>Port 3000"]
WEB["panda-web<br/>Port 3001"]
end
subgraph External
OPENAI[OpenAI API]
USER[Users]
end
USER --> WEB
USER --> API
WEB --> API
API --> OPENAI
style API fill:#339af0,color:#fff
style WEB fill:#51cf66,color:#fff
```
#### Deploy Your Own Instance
1. **Install the Railway CLI:**

       npm install -g @railway/cli

2. **Log in and initialize:**

       railway login
       cd panda-agent
       railway init

3. **Create the services:**

       # Create the API service
       railway add --service panda-api
       railway variables --set "RAILWAY_DOCKERFILE_PATH=Dockerfile.api" --service panda-api
       railway variables --set "OPENAI_API_KEY=sk-xxx" --service panda-api
       railway variables --set "PORT=3000" --service panda-api
       # Create the web service
       railway add --service panda-web
       railway variables --set "RAILWAY_DOCKERFILE_PATH=Dockerfile.web" --service panda-web
       railway variables --set "PORT=3001" --service panda-web

4. **Deploy:**

       railway up --service panda-api
       railway up --service panda-web

5. **Generate domains:**

       railway domain --service panda-api
       railway domain --service panda-web

### Environment Variables
| Variable | Required | Description | Default |
|----------|----------|-------------|---------|
| `OPENAI_API_KEY` | Yes | OpenAI API key | - |
| `PORT` | No | Server port | `3000` |
| `NODE_ENV` | No | Environment | `development` |
| `MAX_ITERATIONS` | No | Maximum number of retries | `5` |
| `EXECUTION_TIMEOUT` | No | Test timeout (ms) | `60000` |
| `NEXT_PUBLIC_API_URL` | No | API URL used by the frontend | `http://localhost:3000` |
| `NEXT_PUBLIC_WS_URL` | No | WebSocket URL | `ws://localhost:3000` |
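One way to apply the server-side defaults from this table is a loader that fails fast on a missing key or a malformed number. This is a sketch, not the project's actual configuration code: `loadConfig`, `PandaConfig`, and `positiveInt` are hypothetical names, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
interface PandaConfig {
  openaiApiKey: string;
  port: number;
  nodeEnv: string;
  maxIterations: number;
  executionTimeoutMs: number;
}

type Env = Record<string, string | undefined>;

// Reads a positive integer from env, falling back to the documented default.
function positiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!isFinite(value) || Math.floor(value) !== value || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): PandaConfig {
  const openaiApiKey = env["OPENAI_API_KEY"];
  if (!openaiApiKey) throw new Error("OPENAI_API_KEY is required");
  return {
    openaiApiKey,
    port: positiveInt(env, "PORT", 3000),
    nodeEnv: env["NODE_ENV"] || "development",
    maxIterations: positiveInt(env, "MAX_ITERATIONS", 5),
    executionTimeoutMs: positiveInt(env, "EXECUTION_TIMEOUT", 60000),
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`.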
### Docker
```
# Build the API image
docker build -f Dockerfile.api -t panda-api .
# Build the web image
docker build -f Dockerfile.web -t panda-web .
# Run locally
docker run -p 3000:3000 -e OPENAI_API_KEY=sk-xxx panda-api
docker run -p 3001:3001 panda-web
```
## Test Results
The agent has been tested across several scenarios:
### Test Case 1: Division-by-Zero Bug
**Input (buggy code):**
```
// src/calculator.js
export function divide(a, b) {
return a / b; // Bug: No handling for division by zero
}
// src/calculator.test.js
import { test, expect } from 'vitest';
import { divide } from './calculator.js';
test('division by zero should throw error', () => {
expect(() => divide(10, 0)).toThrow(); // This test will fail
});
```
**Output (AI-fixed code):**
```
// src/calculator.js (after AI fix)
export function divide(a, b) {
if (b === 0) {
throw new Error('Division by zero');
}
return a / b;
}
```
**Performance:**
| Metric | Value |
|--------|-------|
| Fix iterations | 2 |
| Total duration | ~16 seconds |
| Confidence score | 0.67 |
| Status | Passed |
### Test Case 2: Missing Import (Auto-Fix)
**Input (suggested code is missing an import):**
```
// src/mathUtils.js - current_code (has import)
import { isNumber } from './validators.js';
export function divide(a, b) {
if (!isNumber(a) || !isNumber(b)) {
throw new TypeError('Inputs must be numbers');
}
return a / b;
}
// src/mathUtils.js - suggested_code (missing import)
export function divide(a, b) {
if (!isNumber(a) || !isNumber(b)) { // Bug: isNumber is not defined!
throw new TypeError('Inputs must be numbers');
}
return a / b;
}
```
**Output (code after auto-fix):**
```
// src/mathUtils.js (after static analysis auto-fix)
import { isNumber } from './validators.js'; // Import restored automatically!
export function divide(a, b) {
if (!isNumber(a) || !isNumber(b)) {
throw new TypeError('Inputs must be numbers');
}
return a / b;
}
```
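**How it works:**
1. The static analyzer detects that `isNumber` is used but not imported
2. It looks up `isNumber` in `current_code` and finds the import statement
3. It restores the missing import into `suggested_code`
4. It re-runs static analysis to verify the fix
5. It proceeds to test verification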
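The restore step can be approximated with regular expressions. The sketch below is a simplification, not Panda's static analyzer: `restoreMissingImports` and its helpers are hypothetical names, it only understands single-line named imports, and instead of real undefined-reference analysis (which would need a parser) it checks whether a name imported in `current_code` still appears in `suggested_code` without being imported there.

```typescript
interface NamedImport {
  statement: string;
  names: string[];
}

// Finds single-line named imports such as: import { a, b as c } from './x.js';
function parseNamedImports(code: string): NamedImport[] {
  const pattern = /^import\s*\{([^}]*)\}\s*from\s*(['"])([^'"]+)\2;?[ \t]*$/gm;
  const found: NamedImport[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(code)) !== null) {
    const names = match[1]
      .split(",")
      .map((part) => part.trim().split(/\s+as\s+/).pop() || "") // keep the local name
      .filter((name) => name.length > 0);
    found.push({ statement: match[0].trim(), names });
  }
  return found;
}

// Assumes plain alphanumeric identifiers (no `$`), which keeps the regex simple.
function usesIdentifier(code: string, name: string): boolean {
  return new RegExp(`\\b${name}\\b`).test(code);
}

function restoreMissingImports(currentCode: string, suggestedCode: string): string {
  let alreadyImported: string[] = [];
  for (const imp of parseNamedImports(suggestedCode)) {
    alreadyImported = alreadyImported.concat(imp.names);
  }
  // Keep imports from current_code whose names are still used but no longer imported.
  const toRestore = parseNamedImports(currentCode).filter((imp) =>
    imp.names.some(
      (name) => alreadyImported.indexOf(name) === -1 && usesIdentifier(suggestedCode, name),
    ),
  );
  if (toRestore.length === 0) return suggestedCode;
  return toRestore.map((imp) => imp.statement).join("\n") + "\n" + suggestedCode;
}
```

Because no LLM call is involved, a fix like this costs no tokens. A text-based check can still be fooled by names that appear only in strings or comments, which is one reason to re-run analysis after applying it.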
**Performance:**
| Metric | Value |
|--------|-------|
| Auto-fix applied | Yes |
| Iterations saved | 1-2 (import fixes need no LLM call) |
| Status | Passed (after auto-fix) |
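## Agent Capabilities
### What Panda Can Fix
- **Missing imports** (detected and fixed automatically before tests run)
- Missing error handling
- Logic bugs
- Edge-case failures
- Type mismatches
- Validation issues
- Missing null checks
- Undefined reference errors
### Agent Loop Phases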
```
stateDiagram-v2
[*] --> Initializing: Task Received
Initializing --> StaticAnalysis: Input Validated
StaticAnalysis --> Verifying: Analysis Passed
StaticAnalysis --> AutoFix: Missing Imports Found
AutoFix --> StaticAnalysis: Imports Restored
StaticAnalysis --> Failed: Unfixable Errors
Verifying --> Completed: Tests Pass
Verifying --> Analyzing: Tests Fail
Analyzing --> GeneratingFix: Root Cause Found
GeneratingFix --> ApplyingFix: Fix Generated
ApplyingFix --> Verifying: Fix Applied
Verifying --> Failed: Max Iterations
Completed --> [*]
Failed --> [*]
note right of StaticAnalysis: Pre-runtime detection of\nmissing imports & undefined refs
note right of AutoFix: Automatically restores imports\nfrom current_code to suggested_code
note right of Analyzing: AI analyzes error messages\nand identifies root cause
note right of GeneratingFix: AI generates code patches\nfor affected files
```
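The state diagram translates directly into a transition table, which is handy for rejecting impossible phase changes in a client or in tests. The phase names below are taken from the diagram; `TRANSITIONS`, `canTransition`, and `isValidPath` are hypothetical helpers and are not claimed to exist in the Panda codebase.

```typescript
type Phase =
  | "Initializing"
  | "StaticAnalysis"
  | "AutoFix"
  | "Verifying"
  | "Analyzing"
  | "GeneratingFix"
  | "ApplyingFix"
  | "Completed"
  | "Failed";

// Each phase maps to the phases it may move to, as drawn in the diagram.
const TRANSITIONS: Record<Phase, Phase[]> = {
  Initializing: ["StaticAnalysis"],
  StaticAnalysis: ["Verifying", "AutoFix", "Failed"],
  AutoFix: ["StaticAnalysis"],
  Verifying: ["Completed", "Analyzing", "Failed"],
  Analyzing: ["GeneratingFix"],
  GeneratingFix: ["ApplyingFix"],
  ApplyingFix: ["Verifying"],
  Completed: [], // terminal
  Failed: [], // terminal
};

function canTransition(from: Phase, to: Phase): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Checks that every consecutive pair in a recorded phase history is allowed.
function isValidPath(path: Phase[]): boolean {
  for (let i = 1; i < path.length; i++) {
    if (!canTransition(path[i - 1], path[i])) return false;
  }
  return true;
}
```

Keeping the table as data makes it easy to compare against the diagram whenever a phase is added.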
## Tech Stack
| Layer | Technology |
|-------|------------|
| Runtime | Node.js 20 |
| Language | TypeScript 5.3 |
| API framework | Express.js |
| WebSocket | ws |
| Frontend | Next.js 14 |
| Styling | Tailwind CSS |
| Code editor | Monaco Editor |
| AI Provider | OpenAI (GPT-4o) |
| Test runner | Vitest |
| Package manager | pnpm |
| Deployment | Railway |
## License
MIT License. See [LICENSE](LICENSE) for details.
Built with AI for developers who want to ship faster