Denis-Krueger-labs/llm-prompt-injection-paper
GitHub: Denis-Krueger-labs/llm-prompt-injection-paper
该论文系统分析了大语言模型系统中提示注入攻击的防御机制,评估了结构化查询、检测型防御和系统级方案在自适应攻击下的有效性及其局限性。
Stars: 0 | Forks: 0
# LLM Prompt Injection 防御
**专业技能与沟通**课程研讨会论文
THWS 维尔茨堡-施韦因富特 · 2026年夏季学期
**作者:** Denis Krüger · Sebastian Mautner
## 主题
本文分析了 Large Language Model (LLM) 系统中的 **prompt injection 攻击**,并评估了当前的防御机制。LLM 正越来越多地部署在处理外部不可信数据(如电子邮件、文档、网页、API 响应)的应用程序中,这种集成引入了一个关键的漏洞:即在结构上无法将指令与数据分离。
本文重点关注:
- 由指令/数据边界问题引起的底层漏洞
- 结构化查询方法,如 **StruQ** 和 **SecAlign**
- 基于检测的防御和系统级防御
- 使用最新 benchmark 进行评估(AgentDojo, OpenPromptInjection, AlpacaFarm)
- 当前保护机制在自适应攻击下的局限性
## 研究问题
## 来源
### 核心论文
| 关键词 | 标题 | 作者 | 发表会议/期刊 |
|-----|-------|---------|-------|
| StruQ | *StruQ: Defending Against Prompt Injection with Structured Queries* | Chen et al. | USENIX Security 2025 |
| SecAlign | *SecAlign: Defending Against Prompt Injection with Preference Optimization* | Chen et al. | ACM CCS 2025 |
| AgentDojo | *AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents* | Debenedetti et al. | NeurIPS 2024 |
| Critical Evaluation | *A Critical Evaluation of Defenses against Prompt Injection Attacks* | Jia et al. | arXiv 2025 |
| Checkpoint-GCG | *Checkpoint-GCG: Auditing and Attacking Fine-Tuning-Based Prompt Injection Defenses* | Yang et al. | arXiv 2025 |
| ASTRA | *May I Have Your Attention? Breaking Fine-Tuning Based Prompt Injection Defenses Using Architecture-Aware Attacks* | Pandya et al. | arXiv 2025 |
### 基础文献
| 关键词 | 标题 | 作者 | 发表会议/期刊 |
|-----|-------|---------|-------|
| Perez & Ribeiro | *Ignore Previous Prompt: Attack Techniques for Language Models* | Perez & Ribeiro | arXiv 2022 |
| Greshake et al. | *Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection* | Greshake et al. | AISec 2023 |
| Zou et al. | *Universal and Transferable Adversarial Attacks on Aligned Language Models* | Zou et al. | arXiv 2023 |
| Wallace et al. | *The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions* | Wallace et al. | arXiv 2024 |
| Liu et al. | *Prompt Injection Attack against LLM-Integrated Applications* | Liu et al. | arXiv 2023 |
### 防御机制
| 关键词 | 标题 | 作者 | 发表会议/期刊 |
|-----|-------|---------|-------|
| PromptArmor | *PromptArmor: Simple yet Effective Prompt Injection Defenses* | Shi et al. | arXiv 2025 |
| RTBAS | *RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage* | Zhong et al. | arXiv 2025 |
| f-secure / IFC | *System-Level Defense against Indirect Prompt Injection Attacks: An Information Flow Control Perspective* | Wu et al. | arXiv 2024 |
| Spotlighting | *Defending Against Indirect Prompt Injection Attacks With Spotlighting* | Hines et al. | arXiv 2024 |
| Defeating by Design | *Defeating Prompt Injections by Design* | Debenedetti et al. | arXiv 2025 |
| A-MemGuard | *A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory* | Wei et al. | arXiv 2025 |
### Benchmark 与评估
| 关键词 | 标题 | 作者 | 发表会议/期刊 |
|-----|-------|---------|-------|
| InjecAgent | *InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated LLM Agents* | Zhan et al. | arXiv 2024 |
| InjectBench | *InjectBench: An Indirect Prompt Injection Benchmarking Framework* | Kong | Master's Thesis, Virginia Tech 2024 |
| FSPIB | *Prompt Injection Benchmark for Foundation Model Integrated Systems* | Anonymous | Under review, ICLR 2025 |
| MPIB | *MPIB: A Benchmark for Medical Prompt Injection Attacks and Clinical Safety in LLMs* | Lee et al. | arXiv 2025 |
### 实际案例与综述
| 关键词 | 标题 | 作者 | 发表会议/期刊 |
|-----|-------|---------|-------|
| P2SQL | *Prompt-to-SQL Injections in LLM-Integrated Web Applications: Risks and Defenses* | Pedro et al. | ICSE 2025 |
| Survey | *Prompting for LLM Security and RAG: A Survey* | Ruparel et al. | IEEE ICAIC 2026 |
| Mathew | *Enhancing Security in LLMs: A Comprehensive Review of Prompt Injection Attacks and Defenses* | Mathew | Preprint 2024 |
## 项目结构
```
paper/
├── main.tex
├── sections/
│ ├── 00_titlepage.tex
│ ├── 01_einleitung.tex
│ ├── 02_llms_und_prompting.tex
│ ├── 03_prompt_injection.tex
│ ├── 04_bedrohungsmodell.tex
│ ├── 05_struq.tex
│ ├── 06_weitere_defences.tex
│ ├── 07_benchmarks_und_metriken.tex
│ ├── 08_grenzen_aktueller_defences.tex
│ ├── 09_praxis_und_management.tex
│ └── 10_fazit.tex
├── figures/
└── references.bib
```
## 编译论文
需要包含 `biber` 和 `pdflatex` 的 LaTeX 发行版。
```
pdflatex main.tex
biber main
pdflatex main.tex
pdflatex main.tex
```
标签:DLL 劫持, TruffleHog, 大语言模型, 学术论文, 提示注入, 防御机制, 集群管理