GitHub: thesoze/feature-crew
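A tiered-LLM framework for autonomous code generation, testing, and review, aimed at sharply reducing the cost of automated development and improving its efficiency.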
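# Feature Crew: Autonomous Development Pipeline
A production-tested, reusable framework for autonomous code generation, testing, and review. Originally built for the Argus project, it is now extracted as a standalone template for any Python project.
**Cost:** About $0.0005 per task (local LLM + low-cost API review), versus $0.15 or more with the conventional approach.
**Speed:** 30-60 seconds per task (code generation + self-audit + review).
**Reliability:** Reviews run only on fresh context (no session-context bloat).
## Quick Start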
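To put the cost figures above in perspective, here is a back-of-the-envelope calculation. The two per-task prices come from the text; the batch size of 1,000 tasks is an arbitrary assumption chosen for illustration.

```python
# Per-task costs quoted above (USD).
TIERED_COST_PER_TASK = 0.0005    # local LLM + low-cost API review
BASELINE_COST_PER_TASK = 0.15    # lower bound quoted for the conventional approach


def batch_cost(cost_per_task: float, n_tasks: int) -> float:
    """Total cost in USD for n_tasks at a flat per-task price."""
    return cost_per_task * n_tasks


n = 1_000  # assumed batch size, for illustration only
print(f"tiered:   ${batch_cost(TIERED_COST_PER_TASK, n):.2f}")
print(f"baseline: ${batch_cost(BASELINE_COST_PER_TASK, n):.2f}")
print(f"ratio:    {BASELINE_COST_PER_TASK / TIERED_COST_PER_TASK:.0f}x")
```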
```
# Clone this repo
git clone https://github.com/thesoze/feature-crew.git myproject-crew
cd myproject-crew
# Configure for your project
cp config.example.toml config.toml
# Edit config.toml: set LLM models, Redis URL, project paths
# Start the infrastructure
docker-compose up -d # Redis + optional Spark service
# Run the pipeline
python -m feature_crew tier1 orchestrator
# Enqueue a task
python -m feature_crew enqueue \
--spec "Add a function named greet() that returns 'Hello'" \
--spec-slug "test-greet-function"
```
## Architecture
```
Task Queue (Redis)
↓
Spark Service (tier-specific task dispatch)
├─ Tier 1 (XS, low-risk): Local LLM (Qwen, Llama, etc.)
├─ Tier 2 (S/M, standard): Claude Sonnet
└─ Tier 3 (L/XL, high-risk): Claude Opus + security review
↓
Code Generation (via Aider or direct LLM)
↓
Self-Audit (pytest, ruff, mypy, compile)
↓
Fresh-Context Review (read diff + audit only, never conversation history)
↓
Merge & Deploy (or re-queue with feedback, up to 3 attempts)
```
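The flow above, including the re-queue loop, can be sketched as a small control loop. This is a minimal illustration and not the project's orchestrator: `generate`, `audit`, and `review` are hypothetical callables standing in for the code-generation, self-audit, and fresh-context review stages, and the `Outcome` type is invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # the diagram allows up to 3 attempts per task


@dataclass
class Outcome:
    merged: bool
    attempts: int
    last_feedback: Optional[str] = None


def run_task(
    spec: str,
    generate: Callable[[str, Optional[str]], str],  # (spec, feedback) -> diff
    audit: Callable[[str], dict],                   # diff -> audit report
    review: Callable[[str, str, dict], dict],       # (spec, diff, report) -> decision
) -> Outcome:
    """Generate, audit, and review a task, retrying with feedback on failure."""
    feedback: Optional[str] = None
    for attempt in range(1, MAX_ATTEMPTS + 1):
        diff = generate(spec, feedback)
        report = audit(diff)
        if report.get("verdict") != "ready_for_review":
            feedback = "self-audit failed"
            continue
        # The reviewer sees only spec + diff + audit report, never chat history.
        decision = review(spec, diff, report)
        if decision["action"] == "approve":
            return Outcome(merged=True, attempts=attempt)
        feedback = decision.get("feedback")
    return Outcome(merged=False, attempts=MAX_ATTEMPTS, last_feedback=feedback)
```

Passing the stages in as callables keeps the loop independent of any particular tier or model, which is one plausible way to make the per-tier orchestrators pluggable.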
## Core Components
### 1. Task Router
- **File:** `feature_crew/router.py`
- **Classification:** XS/S/M/L/XL, based on lines of code, complexity, and risk
- **Output:** a task payload containing `tier`, `model`, and `crew_plan`
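As a rough sketch of how such a triage step could work: the size thresholds and the contents of `crew_plan` below are hypothetical (the README does not give the real cut-offs), complexity scoring is left out for brevity, and the model names are taken from the example configuration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical thresholds on estimated changed lines; not the project's values.
SIZE_LIMITS = [("XS", 20), ("S", 100), ("M", 300), ("L", 800)]
TIER_MODELS = {1: "qwen3-235b", 2: "claude-sonnet-4-6", 3: "claude-opus-4-7"}


@dataclass
class TaskPayload:
    tier: int
    model: str
    size: str
    crew_plan: List[str] = field(default_factory=list)


def classify(estimated_lines: int, high_risk: bool = False) -> TaskPayload:
    """Map a size estimate and a risk flag to a tier, model, and stage plan."""
    size = next((label for label, limit in SIZE_LIMITS if estimated_lines <= limit), "XL")
    if high_risk or size in ("L", "XL"):
        tier = 3
    elif size in ("S", "M"):
        tier = 2
    else:
        tier = 1
    plan = ["generate", "self_audit", "review"]  # assumed stage names
    if tier == 3:
        plan.insert(2, "security_review")
    return TaskPayload(tier=tier, model=TIER_MODELS[tier], size=size, crew_plan=plan)


print(classify(12))
print(classify(250))
print(classify(40, high_risk=True))
```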
### 2. Spark Service
- **File:** `spark/tier1_service.py`
- **Responsibilities:** manages the task queue, creates worktrees, tracks state
- **Protocol:** REST API (`GET /next`, `POST /complete`, `POST /feedback`)
- **Storage:** Redis queue + task metadata
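A client for this protocol might look like the sketch below, which uses only the standard library. The three endpoint paths come from the list above and the base URL from the example configuration; the JSON field names in the usage comment (`id`, `task_id`, `status`) are assumptions, since the README does not document the payload schema.

```python
import json
import urllib.request
from typing import Optional

SPARK_URL = "http://localhost:30001"  # spark_url from the example config


def build_request(path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET request (no payload) or a JSON POST request (with payload)."""
    url = f"{SPARK_URL}{path}"
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def call(path: str, payload: Optional[dict] = None, timeout: float = 10.0) -> dict:
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage against a running service (field names are assumptions):
#   task = call("/next")
#   call("/complete", {"task_id": task["id"], "status": "done"})
```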
### 3. Orchestrator
- **File:** `feature_crew/orchestrator.py`
- **Tiers:** a pluggable orchestrator for each tier
  - `Tier1Orchestrator`: local LLM dispatch (Qwen, Llama)
  - `Tier2Orchestrator`: Claude Sonnet (via API)
  - `Tier3Orchestrator`: Claude Opus + security review
- **Method:** poll the Spark service, dispatch, wait for the audit, review, merge
### 4. Self-Audit
- **File:** `feature_crew/audit/`
- **Tools:** pytest, ruff, mypy, py_compile (configurable)
- **Verdict:** `ready_for_review` | `needs_retry`
- **Output:** a JSON report containing the results of every check
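The audit step described above can be approximated in a few lines: run each tool, record its exit code, and derive the verdict. This is a sketch under stated assumptions; the report layout (`verdict`, `checks`) is illustrative rather than the project's actual schema, and the demo uses stand-in commands so that it runs without pytest or ruff installed.

```python
import json
import subprocess
import sys
from typing import Dict, List


def run_checks(commands: Dict[str, List[str]], cwd: str = ".") -> dict:
    """Run each command, record its exit code, and derive an overall verdict."""
    results = {}
    for name, cmd in commands.items():
        try:
            proc = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
            results[name] = {"exit_code": proc.returncode, "passed": proc.returncode == 0}
        except FileNotFoundError:
            # A missing tool is reported explicitly rather than failing silently.
            results[name] = {"exit_code": None, "passed": False, "error": "tool not installed"}
    all_passed = all(r["passed"] for r in results.values())
    return {"verdict": "ready_for_review" if all_passed else "needs_retry", "checks": results}


# Stand-in commands; a real setup would use e.g. ["pytest", "tests/"].
demo = {
    "compile": [sys.executable, "-c", "compile('x = 1', '<demo>', 'exec')"],
    "always_fails": [sys.executable, "-c", "raise SystemExit(1)"],
}
print(json.dumps(run_checks(demo), indent=2))
```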
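### 5. Fresh-Context Review
- **File:** `feature_crew/review.py`
- **Reads only:** spec + diff + audit report
- **Never reads:** session history, task metadata, previous attempts
- **Checks:** a custom validation chain (5+ checks per tier)
- **Output:** `{action: approve|reject, reason, feedback}`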
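One way to picture the validation chain is as an ordered list of functions that each receive only the spec, the diff, and the audit report, with the first failure producing a rejection. The sketch below assumes that shape; the `max_lines` threshold and the feedback wording are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

CheckResult = Tuple[bool, Optional[str]]
Check = Callable[[str, str, dict], CheckResult]  # (spec, diff, audit_report)


def check_audit_verdict(spec: str, diff: str, audit_report: dict) -> CheckResult:
    if audit_report.get("verdict") != "ready_for_review":
        return False, "self-audit did not pass"
    return True, None


def check_diff_size(spec: str, diff: str, audit_report: dict, max_lines: int = 400) -> CheckResult:
    # Count added/removed lines, ignoring the "+++"/"---" file headers.
    changed = sum(
        1
        for line in diff.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )
    if changed > max_lines:
        return False, f"diff too large: {changed} changed lines"
    return True, None


def review(spec: str, diff: str, audit_report: dict, checks: List[Check]) -> dict:
    """Run checks in order; the first failure rejects the diff."""
    for check in checks:
        ok, reason = check(spec, diff, audit_report)
        if not ok:
            return {"action": "reject", "reason": reason, "feedback": f"Fix and resubmit: {reason}"}
    return {"action": "approve", "reason": "all checks passed", "feedback": None}
```

Because `review` takes no conversation or task history as input, the fresh-context property is enforced by the function signature itself.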
### 6. CLI
- **Commands:**
  - `enqueue`: add a task to the queue
  - `tier1 orchestrator`: start the tier-1 polling loop
  - `tier2 orchestrator`: start the tier-2 polling loop
  - `status`: check the queue status
  - `config`: show/test the configuration
## Configuration
### config.toml
```
[project]
name = "myproject"
repo_path = "/path/to/myproject"
description = "My awesome project"
[infrastructure]
redis_url = "redis://localhost:6379"
spark_url = "http://localhost:30001"
[tier1]
enabled = true
model = "qwen3-235b" # or "llama3-70b", "mistral-7b", etc.
guardrail_file = "guardrails/tier1-f2.md" # optional: teach-to-ask guardrail
timeout_seconds = 300
[tier2]
enabled = true
model = "claude-sonnet-4-6" # Claude API
api_key = "sk-..." # or read from .env
cost_budget = 5.0 # USD per day
[tier3]
enabled = false # tier-3 for high-risk only
model = "claude-opus-4-7"
security_review = true
[self_audit]
tools = ["pytest", "ruff", "mypy", "compile"]
pytest_args = ["tests/", "-v", "--tb=short"]
ruff_args = ["check", "src/", "--select=E9,F63,F7,F82"]
[review]
# Validation checks (ordered, stop on first failure)
checks = [
"audit_verdict",
"diff_size",
"file_count",
"spec_adherence",
"suspicious_patterns",
]
max_files_per_tier1 = 3
max_files_per_tier2 = 10
```
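Once the file is parsed into a dictionary (for example with `tomllib.load` on Python 3.11+), a loader might resolve secrets and enabled tiers along these lines. This is a sketch: the helper names are invented, and the environment variable name `ANTHROPIC_API_KEY` is an assumption standing in for the "or read from .env" note in the config.

```python
import os
from typing import List, Optional


def resolve_api_key(tier_cfg: dict, env_var: str = "ANTHROPIC_API_KEY") -> Optional[str]:
    """Prefer an environment variable over a key stored in config.toml."""
    return os.environ.get(env_var) or tier_cfg.get("api_key")


def enabled_tiers(config: dict) -> List[str]:
    """Return the names of the tier sections whose `enabled` flag is true."""
    return [
        name
        for name in ("tier1", "tier2", "tier3")
        if config.get(name, {}).get("enabled", False)
    ]


# A dict mirroring part of the example config above.
example = {
    "tier1": {"enabled": True, "model": "qwen3-235b"},
    "tier2": {"enabled": True, "model": "claude-sonnet-4-6", "api_key": "sk-..."},
    "tier3": {"enabled": False, "model": "claude-opus-4-7"},
}
print(enabled_tiers(example))
```

Reading the key from the environment first keeps real credentials out of a file that is easy to commit by accident.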
### guardrails/tier1-f2.md
An optional guardrail that teaches the tier-1 LLM to ask a question when the spec is vague:
```
# Tier-1 F2 Guardrail
When given a vague spec, respond with:
[ASK]:
Vague spec indicators:
- "improve", "clean up", "refactor" (without specifics)
- "make it better", "optimize", "simplify"
- "handle edge cases" (which ones?)
Example:
Spec: "Improve the login flow"
Response: [ASK]: Do you want to (a) add rate limiting, (b) fix a specific bug, (c) add 2FA?
```
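On the orchestrator side, a response that follows this guardrail has to be recognized and turned back into a question for the task author. The sketch below shows one way to do that, plus a rough pre-check for vague specs using the indicator phrases listed above. Both helper names are hypothetical, and the substring heuristic is deliberately crude.

```python
import re
from typing import Optional

# Indicator phrases taken from the guardrail text above.
VAGUE_INDICATORS = (
    "improve", "clean up", "refactor", "make it better",
    "optimize", "simplify", "handle edge cases",
)

# Matches a line that starts with "[ASK]:" and captures the question text.
ASK_PATTERN = re.compile(r"^\s*\[ASK\]:\s*(.+)", re.MULTILINE)


def looks_vague(spec: str) -> bool:
    """Rough heuristic: does the spec contain any vague-indicator phrase?"""
    lowered = spec.lower()
    return any(phrase in lowered for phrase in VAGUE_INDICATORS)


def extract_question(llm_response: str) -> Optional[str]:
    """Return the clarifying question if the model answered with [ASK]:, else None."""
    match = ASK_PATTERN.search(llm_response)
    return match.group(1).strip() if match else None


print(looks_vague("Improve the login flow"))
print(extract_question("[ASK]: Do you want to (a) add rate limiting, (b) fix a specific bug, (c) add 2FA?"))
```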
## Usage Patterns
### Pattern 1: Fast Tier 1 (Development)
```
# Single command for dev/test features
python -m feature_crew enqueue \
--spec "Add validation to email field in UserForm" \
--spec-slug "validate-email-form"
# Orchestrator auto-routes to tier-1 (Qwen3), audits locally, reviews cheaply
# Cost: ~$0.0005, Time: ~40s
```
### Pattern 2: Tier 2 (Standard Features)
```
python -m feature_crew enqueue \
--spec "Add OAuth2 login endpoint with database migration" \
--spec-slug "oauth2-login" \
--tier 2
```
### Pattern 3: Tier 3 (High Risk)
```
python -m feature_crew enqueue \
--spec "Implement HMAC verification for webhook signatures" \
--spec-slug "webhook-hmac" \
--tier 3
```
## Extensibility
### Adding a New LLM Model
1. **Create a tier orchestrator:**
   ```
   class MistralTier1Orchestrator(BaseTier1Orchestrator):
       async def dispatch_to_llm(self, spec, worktree_path, feedback):
           # Call the Mistral API or a local vLLM endpoint
           pass
   ```
2. **Register it in the config:**
   ```
   [tier1]
   model = "mistral-7b"
   api_endpoint = "http://localhost:8000"
   ```
3. **Optional guardrail:**
   ```
   # guardrails/mistral-tier1.md
   # Custom guardrail for Mistral-specific quirks
   ```
### Adding a Self-Audit Tool
1. **Create an audit plugin:**
   ```
   # feature_crew/audit/plugins/bandit.py
   import subprocess

   def run_bandit(task_id, worktree_path, files_modified):
       # Run from the task's worktree so "src/" resolves to the generated code
       result = subprocess.run(["bandit", "-r", "src/"], cwd=worktree_path, capture_output=True, text=True)
       return {"exit_code": result.returncode, "passed": result.returncode == 0}
   ```
2. **Register it in the config:**
   ```
   [self_audit]
   tools = ["pytest", "ruff", "mypy", "bandit"]
   ```
### Adding a Review Check
1. **Create a custom validator:**
   ```
   # feature_crew/review/checks/security.py
   def check_no_hardcoded_secrets(spec, diff, audit_report):
       # Naive substring match: flags any diff that mentions these words at all
       if "password" in diff or "secret" in diff:
           return False, "Found hardcoded secrets in diff"
       return True, None
   ```
2. **Register it:**
   ```
   # feature_crew/review.py
   CHECKS = [
       check_audit_verdict,
       check_diff_size,
       check_no_hardcoded_secrets,  # custom
   ]
   ```
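The substring test in the validator above will also fire on removed lines and on code that merely mentions the word "password". A somewhat stricter variant, offered as a sketch rather than a replacement for a dedicated secret scanner, inspects only added lines and looks for a secret-like name assigned a string literal. The regular expression is a hypothetical heuristic and will miss many real cases.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a secret-like name assigned a non-empty string literal.
SECRET_ASSIGNMENT = re.compile(
    r"""(?i)\b(password|passwd|secret|api_key|token)\b\s*[:=]\s*["'][^"']+["']"""
)


def check_no_hardcoded_secrets(spec: str, diff: str, audit_report: dict) -> Tuple[bool, Optional[str]]:
    """Reject the diff if an added line assigns a literal to a secret-like name."""
    for line in diff.splitlines():
        # Only inspect added lines; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        if SECRET_ASSIGNMENT.search(line):
            # The offending line is not echoed, to avoid leaking it into logs.
            return False, "Possible hardcoded secret in an added line"
    return True, None
```

It keeps the same `(spec, diff, audit_report)` signature and `(passed, reason)` return shape as the original validator, so it could be registered in `CHECKS` the same way.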
## Deployment
### Local Development
```
# Terminal 1: Redis
docker run -d -p 6379:6379 redis:latest
# Terminal 2: Spark service
python spark/tier1_service.py
# Terminal 3: Orchestrator
python -m feature_crew tier1 orchestrator
```
### Production
```
# Using docker-compose
docker-compose -f docker-compose.prod.yml up -d
# Using systemd (Linux)
sudo systemctl start feature-crew-orchestrator.service
# Using launchd (macOS)
launchctl load ~/Library/LaunchAgents/com.feature-crew.plist
```
## Monitoring
### Health Checks
```
# Spark service health
curl http://localhost:30001/health
# Queue status
python -m feature_crew status
# Logs
tail -f logs/orchestrator.log
tail -f logs/spark.log
```
### Metrics
```
# Task success rate
redis-cli keys "task:*:status" | wc -l
# Cost tracking
python -m feature_crew metrics --duration 7d
# Latency per tier
python -m feature_crew metrics --tier 1 --metric latency
```
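For readers who want these three numbers without the CLI, the aggregation itself is simple. The sketch below assumes each finished task is available as a record with `tier`, `status`, `cost_usd`, and `latency_s` fields; those field names and the `"merged"` status value are hypothetical, since the README does not describe the stored task schema.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable


def summarize(tasks: Iterable[dict]) -> Dict[int, dict]:
    """Group task records by tier and compute success rate, cost, and mean latency."""
    by_tier = defaultdict(list)
    for task in tasks:
        by_tier[task["tier"]].append(task)
    summary = {}
    for tier, items in sorted(by_tier.items()):
        summary[tier] = {
            "tasks": len(items),
            "success_rate": sum(t["status"] == "merged" for t in items) / len(items),
            "total_cost_usd": sum(t["cost_usd"] for t in items),
            "mean_latency_s": mean(t["latency_s"] for t in items),
        }
    return summary


# Made-up records, for illustration only.
sample = [
    {"tier": 1, "status": "merged", "cost_usd": 0.0005, "latency_s": 30.0},
    {"tier": 1, "status": "rejected", "cost_usd": 0.0005, "latency_s": 50.0},
    {"tier": 2, "status": "merged", "cost_usd": 0.02, "latency_s": 60.0},
]
print(summarize(sample))
```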
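## Troubleshooting
| Problem | Cause | Fix |
|---------|-------|-----|
| Task hangs in the "dispatching" state | LLM endpoint unreachable | Check the vLLM/API endpoint URL in the config |
| Self-audit fails silently | Tool not installed | `pip install pytest ruff mypy` |
| Review rejects every diff | Checks are too strict | Adjust the thresholds in config + review.py |
| Merge conflicts | New commits on the main branch | Implement smart merging (rebase + retry) |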
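The "fails silently" row can be caught before any task runs with a preflight check that looks up each configured tool on `PATH`. This is a minimal sketch; `missing_tools` is a hypothetical helper, and it assumes the `compile` entry maps to Python's built-in `py_compile`, which has no separate executable.

```python
import shutil
from typing import Iterable, List

# Assumed to be handled by the standard library, so there is no binary to look up.
BUILTIN_TOOLS = {"compile"}


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the configured tools that cannot be found on PATH."""
    return [t for t in tools if t not in BUILTIN_TOOLS and shutil.which(t) is None]


configured = ["pytest", "ruff", "mypy", "compile"]  # from the example [self_audit] section
absent = missing_tools(configured)
if absent:
    print("Install before running the self-audit:", ", ".join(absent))
```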
## Project Structure
```
feature-crew/
├── feature_crew/
│   ├── __init__.py
│   ├── cli.py                  # CLI commands
│   ├── router.py               # Task triage (tier classification)
│   ├── orchestrators/          # Pluggable per-tier orchestrators
│   │   ├── base.py
│   │   ├── tier1_qwen.py
│   │   ├── tier1_mistral.py
│   │   ├── tier2_sonnet.py
│   │   └── tier3_opus.py
│   ├── audit/                  # Self-audit framework
│   │   ├── runner.py
│   │   └── plugins/
│   │       ├── pytest.py
│   │       ├── ruff.py
│   │       ├── mypy.py
│   │       └── compile.py
│   ├── review/                 # Fresh-context review
│   │   ├── checks.py
│   │   └── validators/
│   ├── models/                 # Data classes (Task, AuditReport, etc.)
│   └── config.py               # Configuration loader
├── spark/                      # Spark service (tier-specific dispatch)
│   ├── tier1_service.py
│   ├── requirements.txt
│   └── README.md
├── guardrails/                 # LLM guardrails (optional)
│   ├── tier1-f2.md
│   └── tier2-instructions.md
├── scripts/
│   ├── enqueue.py
│   ├── status.py
│   └── metrics.py
├── tests/
│   ├── test_router.py
│   ├── test_orchestrator.py
│   ├── test_audit.py
│   └── test_review.py
├── docker-compose.yml          # Local dev
├── docker-compose.prod.yml     # Production
├── config.example.toml         # Template config
├── pyproject.toml              # Python dependencies
└── README.md                   # This file
```
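## Contributing
Feature Crew is designed to be extended to fit your specific project:
1. **Fork or clone** this repo
2. **Customize** config.toml, guardrails/, and the review checks
3. **Test** with a few tier-1 tasks before adopting tier 2/3
4. **Monitor** success rate, cost, and latency
5. **Iterate:** collect failure cases, refine the guardrails, tune the checks
## License
MIT License. Free to use in your own projects.
## Acknowledgments
Built and battle-tested in the **Argus** project (an always-on personal AI agent with 35+ OSINT sources). Now extracted as a reusable template that brings the same autonomous development workflow to any Python project.
## Support
- **Issues:** open an issue on GitHub
- **Docs:** read the `docs/` folder for a deeper look at the architecture
- **Examples:** see `examples/` for project-specific configurations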