# Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation
## Introduction
This repository is the official implementation of the papers `Who Taught the Lie? Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation` (IEEE Symposium on Security and Privacy 2026) and `Traceback of Poisoning Attacks to Retrieval-Augmented Generation` (The Web Conference 2025).
## Setup
1. Please run the following commands to set up the environment:
conda create -n my_custom_env python=3.12
conda activate my_custom_env
pip install -r requirements.txt
2. Collect misgeneration events
Each misgeneration event is stored in a specific JSON format, typically generated by an attack simulation. An event should contain the question, the retrieved contexts, the RAG response, and the retrieval scores. Examples are provided in `attack_feedback/PRAGB/*.json`.
3. Set up the OpenAI API key
Ensure your OpenAI API key is set as an environment variable.
export OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
export OPENAI_API_URL="YOUR_OPENAI_BASE_URL" # Optional, if using a custom endpoint
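Steps 2 and 3 can be sanity-checked before launching a run. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and the field names in `EXPECTED_KEYS` are assumptions derived from the description above, so compare them against the actual example files before relying on them.

```python
import glob
import json
import os

# Assumed field names (hypothetical); check the files in
# attack_feedback/PRAGB/ for the keys the repository actually uses.
EXPECTED_KEYS = ("question", "contexts", "response", "retrieval_scores")


def check_environment(env=None):
    """Fail early if OPENAI_API_KEY is unset; return the optional base URL."""
    env = os.environ if env is None else env
    if not env.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")
    # OPENAI_API_URL is optional, so None means "use the default endpoint".
    return env.get("OPENAI_API_URL")


def missing_keys(event, expected=EXPECTED_KEYS):
    """List the expected keys that are absent from one event dictionary."""
    return [key for key in expected if key not in event]


def load_events(pattern):
    """Parse every JSON file matching the glob pattern, keyed by file path."""
    events = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            events[path] = json.load(handle)
    return events
```

Calling `check_environment()` followed by `load_events("attack_feedback/PRAGB/*.json")` confirms that the key is exported and that the example files parse as JSON. `missing_keys` can then be applied to each event dictionary once the real key names are known.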
## Usage
### RAGOrigin
python RAGOrigin/main.py \
--dataset "NQ" \
--attack_retriever "e5" \
--attack_LLM "gpt-4o-mini" \
--judge_LLM "gpt-4o-mini" \
--attack_method "PRAGB" \
--attack_M 5 \
--top_K 5 \
--trace_method "RAGOrigin" \
--proxy_model "meta-llama/Llama-3.1-8B" \
--variant 0 \
--normalize_method "z_score_normalize" \
--feedback_root_dir "attack_feedback" \
--feedback_scope_dir "attack_feedback_scope" \
--result_root_dir "result" \
--test_version "v1" \
--cuda_device 0
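The value `z_score_normalize` passed to `--normalize_method` suggests standard z-score normalization of scores. How the repository applies it internally is not shown here, so the following is only a sketch of the generic technique, using the population standard deviation as an assumption.

```python
import math


def z_score_normalize(scores):
    """Rescale scores to zero mean and unit (population) standard deviation."""
    n = len(scores)
    if n == 0:
        return []
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / n
    std = math.sqrt(variance)
    if std == 0:
        # All scores are identical, so none stands out from the others.
        return [0.0] * n
    return [(s - mean) / std for s in scores]
```

After normalization, scores measured on different scales become comparable, because each value is expressed as a number of standard deviations from the mean.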
### RAGForensics
python RAGForensics/main.py \
--dataset "NQ" \
--attack_retriever "e5" \
--attack_LLM "gpt-4o-mini" \
--trace_LLM "gpt-4o-mini" \
--attack_method "PRAGB" \
--attack_M 5 \
--top_K 5 \
--feedback_root_dir "attack_feedback" \
--result_root_dir "result" \
--test_version "v1"
## Citation
The citations for our attribution framework:
@inproceedings{zhang2026ragorigin,
title={Who Taught the Lie? Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation},
author={Zhang, Baolei and Xin, Haoran and Chen, Yuxi and Liu, Zhuqing and Yi, Biao and Li, Tong and Nie, Lihai and Liu, Zheli and Fang, Minghong},
booktitle={IEEE Symposium on Security and Privacy},
year={2026}
}
@inproceedings{zhang2025traceback,
title={Traceback of Poisoning Attacks to Retrieval-Augmented Generation},
author={Zhang, Baolei and Xin, Haoran and Fang, Minghong and Liu, Zhuqing and Yi, Biao and Li, Tong and Liu, Zheli},
booktitle={The Web Conference},
year={2025}
}