thesoze/feature-crew

GitHub: thesoze/feature-crew

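A framework for autonomous code generation, testing, and review built on tiered LLMs, designed to sharply reduce the cost of automated development and improve efficiency.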


# Feature Crew: Autonomous Development Pipeline

A production-proven, reusable framework for autonomous code generation, testing, and review. Originally built for the Argus project, it is now extracted as a standalone template for any Python project.

**Cost:** about $0.0005 per task (local LLM + low-cost API review), versus $0.15 or more for conventional approaches.

**Speed:** 30-60 seconds per task (code generation + self-audit + review).

**Reliability:** reviews run on fresh context only (no session-context bloat).

## Quick Start

```
# Clone this repo
git clone https://github.com/thesoze/feature-crew.git myproject-crew
cd myproject-crew

# Configure for your project
cp config.example.toml config.toml
# Edit config.toml: set LLM models, Redis URL, project paths

# Start the infrastructure
docker-compose up -d  # Redis + optional Spark service

# Run the pipeline
python -m feature_crew tier1 orchestrator

# Enqueue a task
python -m feature_crew enqueue \
  --spec "Add a function named greet() that returns 'Hello'" \
  --spec-slug "test-greet-function"
```

## Architecture

```
Task Queue (Redis)
    ↓
Spark Service (tier-specific task dispatch)
    ├─ Tier 1 (XS, low-risk): Local LLM (Qwen, Llama, etc.)
    ├─ Tier 2 (S/M, standard): Claude Sonnet
    └─ Tier 3 (L/XL, high-risk): Claude Opus + security review
    ↓
Code Generation (via Aider or direct LLM)
    ↓
Self-Audit (pytest, ruff, mypy, compile)
    ↓
Fresh-Context Review (read diff + audit only, never conversation history)
    ↓
Merge & Deploy (or re-queue with feedback, up to 3 attempts)
```

## Core Components

### 1. Task Router

- **File:** `feature_crew/router.py`
- **Triage:** classifies tasks as XS/S/M/L/XL by lines of code, complexity, and risk
- **Output:** a task payload containing `tier`, `model`, and `crew_plan`

### 2. Spark Service

- **File:** `spark/tier1_service.py`
- **Responsibilities:** manages the task queue, creates worktrees, tracks state
- **Protocol:** REST API (GET /next, POST /complete, POST /feedback)
- **Storage:** Redis queue + task metadata

### 3. Orchestrator

- **File:** `feature_crew/orchestrator.py`
- **Tiers:** one pluggable orchestrator per tier
  - `Tier1Orchestrator`: local LLM dispatch (Qwen, Llama)
  - `Tier2Orchestrator`: Claude Sonnet (via API)
  - `Tier3Orchestrator`: Claude Opus + security review
- **Method:** poll the Spark service, dispatch, wait for the audit, review, merge

### 4. Self-Audit

- **File:** `feature_crew/audit/`
- **Tools:** pytest, ruff, mypy, py_compile (configurable)
- **Verdict:** `ready_for_review` | `needs_retry`
- **Output:** a JSON report with the results of every check

### 5. Fresh-Context Review

- **File:** `feature_crew/review.py`
- **Reads only:** spec + diff + audit report
- **Never reads:** conversation history, task metadata, prior attempts
- **Checks:** a custom validation chain (5+ checks per tier)
- **Output:** `{action: approve|reject, reason, feedback}`

### 6. Command-Line Interface

- **Commands:**
  - `enqueue`: add a task to the queue
  - `tier1 orchestrator`: start the tier-1 polling loop
  - `tier2 orchestrator`: start the tier-2 polling loop
  - `status`: check queue status
  - `config`: show/test the configuration

## Configuration

### config.toml

```
[project]
name = "myproject"
repo_path = "/path/to/myproject"
description = "My awesome project"

[infrastructure]
redis_url = "redis://localhost:6379"
spark_url = "http://localhost:30001"

[tier1]
enabled = true
model = "qwen3-235b"  # or "llama3-70b", "mistral-7b", etc.
guardrail_file = "guardrails/tier1-f2.md"  # optional: teach-to-ask guardrail
timeout_seconds = 300

[tier2]
enabled = true
model = "claude-sonnet-4-6"  # Claude API
api_key = "sk-..."  # or read from .env
cost_budget = 5.0  # USD per day

[tier3]
enabled = false  # tier-3 for high-risk only
model = "claude-opus-4-7"
security_review = true

[self_audit]
tools = ["pytest", "ruff", "mypy", "compile"]
pytest_args = ["tests/", "-v", "--tb=short"]
ruff_args = ["check", "src/", "--select=E9,F63,F7,F82"]

[review]
# Validation checks (ordered, stop on first failure)
checks = [
  "audit_verdict",
  "diff_size",
  "file_count",
  "spec_adherence",
  "suspicious_patterns",
]
max_files_per_tier1 = 3
max_files_per_tier2 = 10
```

### guardrails/tier1-f2.md

An optional guardrail that teaches the tier-1 LLM to ask a question when the spec is vague:

```
# Tier-1 F2 Guardrail

When given a vague spec, respond with:
[ASK]:

Vague spec indicators:
- "improve", "clean up", "refactor" (without specifics)
- "make it better", "optimize", "simplify"
- "handle edge cases" (which ones?)

Example:
  Spec: "Improve the login flow"
  Response: [ASK]: Do you want to (a) add rate limiting, (b) fix a specific bug, (c) add 2FA?
```

## Usage Patterns

### Pattern 1: Quick Tier 1 (development)

```
# Single command for dev/test features
python -m feature_crew enqueue \
  --spec "Add validation to email field in UserForm" \
  --spec-slug "validate-email-form"

# Orchestrator auto-routes to tier-1 (Qwen3), audits locally, reviews cheaply
# Cost: ~$0.0005, Time: ~40s
```

### Pattern 2: Tier 2 (standard features)

```
python -m feature_crew enqueue \
  --spec "Add OAuth2 login endpoint with database migration" \
  --spec-slug "oauth2-login" \
  --tier 2
```

### Pattern 3: Tier 3 (high-risk)

```
python -m feature_crew enqueue \
  --spec "Implement HMAC verification for webhook signatures" \
  --spec-slug "webhook-hmac" \
  --tier 3
```

## Extensibility

### Adding a New LLM Model

1. **Create a tier orchestrator:**

       # ...stral_tier1.py
       class MistralTier1Orchestrator(BaseTier1Orchestrator):
           async def dispatch_to_llm(self, spec, worktree_path, feedback):
               # Call the Mistral API or a local vLLM endpoint
               pass

2. **Register it in the config:**

       [tier1]
       model = "mistral-7b"
       api_endpoint = "http://localhost:8000"

3. **Optional guardrail:**

       # guardrails/mistral-tier1.md
       # Custom guardrail rules for Mistral-specific quirks

### Adding a Self-Audit Tool

1. **Create an audit plugin:**

       # feature_crew/audit/plugins/bandit.py
       import subprocess

       def run_bandit(task_id, worktree_path, files_modified):
           # Run bandit from the task's worktree so that "src/" resolves there
           result = subprocess.run(["bandit", "-r", "src/"], cwd=worktree_path)
           return {"exit_code": result.returncode, "passed": result.returncode == 0}

2. **Register it in the config:**

       [self_audit]
       tools = ["pytest", "ruff", "mypy", "bandit"]

### Adding a Review Check

1. **Create a custom validator:**

       # feature_crew/review/checks/security.py
       def check_no_hardcoded_secrets(spec, diff, audit_report):
           # Naive keyword match: flags any diff that mentions these words
           if "password" in diff or "secret" in diff:
               return False, "Found hardcoded secrets in diff"
           return True, None

2. **Register it:**

       # feature_crew/review.py
       CHECKS = [
           check_audit_verdict,
           check_diff_size,
           check_no_hardcoded_secrets,  # custom
       ]

## Deployment

### Local Development

```
# Terminal 1: Redis
docker run -d -p 6379:6379 redis:latest

# Terminal 2: Spark service
python spark/tier1_service.py

# Terminal 3: Orchestrator
python -m feature_crew tier1 orchestrator
```

### Production

```
# Using docker-compose
docker-compose -f docker-compose.prod.yml up -d

# Using systemd (Linux)
sudo systemctl start feature-crew-orchestrator.service

# Using launchd (macOS)
launchctl load ~/Library/LaunchAgents/com.feature-crew.plist
```

## Monitoring

### Health Checks

```
# Spark service health
curl http://localhost:30001/health

# Queue status
python -m feature_crew status

# Logs
tail -f logs/orchestrator.log
tail -f logs/spark.log
```

### Metrics

```
# Task success rate
redis-cli keys "task:*:status" | wc -l

# Cost tracking
python -m feature_crew metrics --duration 7d

# Latency per tier
python -m feature_crew metrics --tier 1 --metric latency
```

## Troubleshooting

| Problem | Cause | Fix |
|---------|-------|-----|
| Task hangs in the "dispatch" state | LLM endpoint is unreachable | Check the vLLM/API endpoint URL in the config |
| Self-audit fails silently | Tools are not installed | `pip install pytest ruff mypy` |
| Review rejects every diff | Checks are too strict | Adjust the thresholds in config + review.py |
| Merge conflicts | New commits on the main branch | Implement a smart merge (rebase + retry) |

## Project Structure

```
feature-crew/
├── feature_crew/
│   ├── __init__.py
│   ├── cli.py                  # CLI commands
│   ├── router.py               # Task triage (tier classification)
│   ├── orchestrators/          # Pluggable per-tier orchestrators
│   │   ├── base.py
│   │   ├── tier1_qwen.py
│   │   ├── tier1_mistral.py
│   │   ├── tier2_sonnet.py
│   │   └── tier3_opus.py
│   ├── audit/                  # Self-audit framework
│   │   ├── runner.py
│   │   └── plugins/
│   │       ├── pytest.py
│   │       ├── ruff.py
│   │       ├── mypy.py
│   │       └── compile.py
│   ├── review/                 # Fresh-context review
│   │   ├── checks.py
│   │   └── validators/
│   ├── models/                 # Data classes (Task, AuditReport, etc.)
│   └── config.py               # Configuration loader
├── spark/                      # Spark service (tier-specific dispatch)
│   ├── tier1_service.py
│   ├── requirements.txt
│   └── README.md
├── guardrails/                 # LLM guardrails (optional)
│   ├── tier1-f2.md
│   └── tier2-instructions.md
├── scripts/
│   ├── enqueue.py
│   ├── status.py
│   └── metrics.py
├── tests/
│   ├── test_router.py
│   ├── test_orchestrator.py
│   ├── test_audit.py
│   └── test_review.py
├── docker-compose.yml          # Local dev
├── docker-compose.prod.yml     # Production
├── config.example.toml         # Template config
├── pyproject.toml              # Python dependencies
└── README.md                   # This file
```

## Contributing

Feature Crew is designed to be extended to fit your specific project:

1. **Fork or clone** this repo
2. **Customize** config.toml, guardrails/, and the review checks
3. **Test** with a few tier-1 tasks before adopting tier 2/3
4. **Monitor** success rate, cost, and latency
5. **Iterate:** collect failure cases, refine the guardrails, tune the checks

## License

MIT License. Free to use in your own projects.

## Acknowledgments

Built and battle-tested in the **Argus** project (an always-on personal AI agent with 35+ OSINT sources). Now extracted as a reusable template that brings the same autonomous development workflow to any Python project.

## Support

- **Issues:** open an issue on GitHub
- **Docs:** read the `docs/` folder for a deep dive into the architecture
- **Examples:** see `examples/` for project-specific configurations
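
## Example: an ordered review chain

The configuration describes the review checks as ordered and stopping on the first failure, and the custom validator shown earlier returns a `(passed, reason)` pair. The sketch below shows one way such a chain could be driven to produce the `{action, reason, feedback}` result. It is not the project's actual `review.py`: the `run_checks` helper, the `MAX_DIFF_LINES` threshold, and the `verdict` key on the audit report are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A check takes (spec, diff, audit_report) and returns (passed, reason),
# the same shape as check_no_hardcoded_secrets in the README.
CheckResult = Tuple[bool, Optional[str]]
Check = Callable[[str, str, dict], CheckResult]

MAX_DIFF_LINES = 200  # hypothetical threshold, not taken from the project


def check_audit_verdict(spec: str, diff: str, audit_report: dict) -> CheckResult:
    # Assumes the audit report stores its verdict under a "verdict" key.
    if audit_report.get("verdict") != "ready_for_review":
        return False, "Self-audit verdict is not ready_for_review"
    return True, None


def check_diff_size(spec: str, diff: str, audit_report: dict) -> CheckResult:
    # Rough count of added/removed lines, skipping the "+++"/"---" file headers.
    changed = sum(
        1
        for line in diff.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )
    if changed > MAX_DIFF_LINES:
        return False, f"Diff changes {changed} lines (limit {MAX_DIFF_LINES})"
    return True, None


def run_checks(
    checks: List[Check], spec: str, diff: str, audit_report: dict
) -> Dict[str, Optional[str]]:
    # Run the checks in order and reject on the first one that fails.
    for check in checks:
        passed, reason = check(spec, diff, audit_report)
        if not passed:
            return {
                "action": "reject",
                "reason": reason,
                "feedback": f"{check.__name__}: {reason}",
            }
    return {"action": "approve", "reason": None, "feedback": None}
```

Because the chain stops at the first failure, putting inexpensive checks such as the audit verdict first avoids running costlier ones on a task that will be re-queued anyway.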