rexcoleman/llm-patch-correctness

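Analyzes the correctness and regression risk of LLM-generated security patches, reporting quantified fix effectiveness per CWE category.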

# LLM-Generated Patch Correctness

**LLM security patches achieve a 42% fix rate and a 10% regression rate, but the CWE category determines everything. Weak-cryptography fixes are 100% correct. SQL injection patches are net negative: a 0% fix rate and a 50% regression rate, with half of the attempts introducing a new injection vector.**

**Blog post:** [LLM Patches Fix Crypto Flaws Perfectly, but Make SQL Injection Worse](https://rexcoleman.dev/posts/llm-patch-correctness/)

![govML](https://img.shields.io/badge/govML-v3.3-blue) ![Quality](https://img.shields.io/badge/quality-8.0-brightgreen) ![License](https://img.shields.io/badge/license-MIT-green)

![Key results](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/960a0e0e74042240.png)

## Key Results

| CWE category | Fix rate | Regression rate | Verdict |
|---|---|---|---|
| CWE-327 (weak cryptography) | **100%** | 0% | Safe to automate |
| CWE-120 (buffer overflow) | 50% | 0% | Needs review |
| CWE-79 (XSS) | 50% | 0% | Needs review |
| CWE-22 (path traversal) | 10% | 0% | Rarely fixed |
| CWE-89 (SQL injection) | **0%** | **50%** | **Net negative: makes things worse** |

## Quick Start

```
git clone https://github.com/rexcoleman/llm-patch-correctness
cd llm-patch-correctness
pip install -e .
bash reproduce.sh
```

## Project Structure

```
FINDINGS.md              # Research findings with pre-registered hypotheses and full results
EXPERIMENTAL_DESIGN.md   # Pre-registered experimental design and methodology
HYPOTHESIS_REGISTRY.md   # Hypothesis predictions, results, and verdicts
reproduce.sh             # One-command reproduction of all experiments
governance.yaml          # govML governance configuration
LICENSE                  # MIT License
pyproject.toml           # Python project configuration
scripts/                 # Experiment and analysis scripts
src/                     # Source code
tests/                   # Test suite
outputs/                 # Experiment outputs and results
data/                    # Data files and datasets
docs/                    # Documentation and decision records
```

## Methodology

See [FINDINGS.md](FINDINGS.md) and [EXPERIMENTAL_DESIGN.md](EXPERIMENTAL_DESIGN.md) for the pre-registered hypotheses and the full experimental results with multi-seed validation.

## License

[MIT](LICENSE) 2026 Rex Coleman

Governed by [govML](https://rexcoleman.dev/posts/govml-methodology/) v3.3
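
The headline 42% fix rate and 10% regression rate are consistent with an unweighted mean of the per-CWE rows in the Key Results table. The sketch below reproduces that arithmetic and flags categories whose regression rate exceeds their fix rate. It is an illustration only: the names `RESULTS`, `macro_average`, and `net_effect` are hypothetical and not part of this repository, and "net effect" as fix rate minus regression rate is an assumed definition, since the README does not state one. The repository's own aggregation (for example, pooling over individual patch attempts) may differ.

```python
# Per-CWE rates in percent, copied from the Key Results table.
RESULTS = {
    "CWE-327": {"fix": 100, "regression": 0},  # weak cryptography
    "CWE-120": {"fix": 50, "regression": 0},   # buffer overflow
    "CWE-79": {"fix": 50, "regression": 0},    # XSS
    "CWE-22": {"fix": 10, "regression": 0},    # path traversal
    "CWE-89": {"fix": 0, "regression": 50},    # SQL injection
}


def macro_average(results, key):
    """Unweighted mean of one rate across CWE categories (assumed aggregation)."""
    return sum(row[key] for row in results.values()) / len(results)


def net_effect(row):
    """Fix rate minus regression rate, in percentage points (assumed metric)."""
    return row["fix"] - row["regression"]


if __name__ == "__main__":
    print("mean fix rate:", macro_average(RESULTS, "fix"))                # 42.0
    print("mean regression rate:", macro_average(RESULTS, "regression"))  # 10.0
    for cwe, row in RESULTS.items():
        flag = "  <- net negative" if net_effect(row) < 0 else ""
        print(f"{cwe}: net {net_effect(row):+d} pp{flag}")
```

Under these assumptions only CWE-89 comes out negative (-50 percentage points), which matches the table's verdict for SQL injection.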