rexcoleman/llm-patch-correctness
Analyzes the correctness and regression risk of LLM-generated security patches, reporting quantified fix outcomes by CWE category.
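# LLM-Generated Patch Correctness
**LLM security patches have a 42% fix rate and a 10% regression rate, but the CWE category determines everything. Weak-crypto fixes are 100% correct. SQL injection patches are net negative: a 0% fix rate and a 50% regression rate, with half of the attempts introducing new injection vectors.**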
**Blog post:** [LLM Patches Fix Crypto Flaws Perfectly, but Make SQL Injection Worse](https://rexcoleman.dev/posts/llm-patch-correctness/)

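## Key Results
| CWE Category | Fix Rate | Regression Rate | Verdict |
|---|---|---|---|
| CWE-327 (weak crypto) | **100%** | 0% | Suitable for automation |
| CWE-120 (buffer overflow) | 50% | 0% | Needs review |
| CWE-79 (XSS) | 50% | 0% | Needs review |
| CWE-22 (path traversal) | 10% | 0% | Rarely fixes |
| CWE-89 (SQL injection) | **0%** | **50%** | **Net negative: makes the problem worse** |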
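To make the "net negative" reading of the table concrete, here is a minimal Python sketch. It assumes that net effect means fix rate minus regression rate in percentage points; that definition, the `net_effect` and `triage` helpers, and the three policy labels are illustrative assumptions, not code or terminology from this repository. The rates themselves are copied from the table above.

```python
# Fix and regression rates (percent) copied from the Key Results table.
RESULTS = {
    "CWE-327": (100, 0),  # weak crypto
    "CWE-120": (50, 0),   # buffer overflow
    "CWE-79": (50, 0),    # XSS
    "CWE-22": (10, 0),    # path traversal
    "CWE-89": (0, 50),    # SQL injection
}


def net_effect(fix_rate: int, regression_rate: int) -> int:
    """Assumed metric: fix rate minus regression rate, in percentage points."""
    return fix_rate - regression_rate


def triage(fix_rate: int, regression_rate: int) -> str:
    """Hypothetical policy, coarser than the table's own verdict column."""
    if net_effect(fix_rate, regression_rate) < 0:
        return "block"      # patches do more harm than good
    if fix_rate == 100 and regression_rate == 0:
        return "automate"   # every attempt fixed, none regressed
    return "review"         # partial success: a human must check each patch


for cwe, (fix, reg) in RESULTS.items():
    print(f"{cwe}: net {net_effect(fix, reg):+d} pp -> {triage(fix, reg)}")
```

Under this assumed definition, CWE-89 is the only category with a negative net effect, which matches the table's verdict for SQL injection.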
## Quick Start
```
git clone https://github.com/rexcoleman/llm-patch-correctness
cd llm-patch-correctness
pip install -e .
bash reproduce.sh
```
## Project Structure
```
FINDINGS.md # Research findings with pre-registered hypotheses and full results
EXPERIMENTAL_DESIGN.md # Pre-registered experimental design and methodology
HYPOTHESIS_REGISTRY.md # Hypothesis predictions, results, and verdicts
reproduce.sh # One-command reproduction of all experiments
governance.yaml # govML governance configuration
LICENSE # MIT License
pyproject.toml # Python project configuration
scripts/ # Experiment and analysis scripts
src/ # Source code
tests/ # Test suite
outputs/ # Experiment outputs and results
data/ # Data files and datasets
docs/ # Documentation and decision records
```
## Methodology
See [FINDINGS.md](FINDINGS.md) and [EXPERIMENTAL_DESIGN.md](EXPERIMENTAL_DESIGN.md) for the pre-registered hypotheses and the full experimental results with multi-seed validation.
## License
[MIT](LICENSE) 2026 Rex Coleman
Governed by [govML](https://rexcoleman.dev/posts/govml-methodology/) v3.3