Kim-Hammar/llm_incident_response_ndss26

GitHub: Kim-Hammar/llm_incident_response_ndss26

Incident response planning using a lightweight large language model.


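# Incident Response Planning Using a Lightweight Large Language Model with Reduced Hallucination

This repository contains the artifacts related to the paper *"Incident Response Planning Using a Lightweight Large Language Model with Reduced Hallucination"*, which has been accepted to The Network and Distributed System Security (NDSS) Symposium 2026.

We present a novel method that enables large language models (LLMs) to be used effectively for decision support in incident response planning. Our method uses an LLM to translate system logs into effective response plans, and it addresses the limitations of LLMs through fine-tuning, information retrieval, and decision-theoretic planning. Unlike prior work, which relies on prompt engineering of frontier models, our method is lightweight and can run on commodity hardware.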
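The overview mentions decision-theoretic planning without describing it in this README. As a rough illustration only, and not the algorithm from the paper or the code in this repository, the sketch below shows one generic way to choose among several candidate response actions: estimate the expected cost of each candidate from sampled outcomes and pick the cheapest. The `CandidateAction` class, the `select_action` function, the cost function, and the toy cost values are all hypothetical names and numbers introduced for this example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass(frozen=True)
class CandidateAction:
    """A response action proposed by a language model (hypothetical structure)."""
    description: str


def select_action(
    candidates: Sequence[CandidateAction],
    sample_cost: Callable[[CandidateAction], float],
    num_samples: int = 5,
) -> CandidateAction:
    """Return the candidate whose average sampled cost is lowest.

    `sample_cost` is an assumed stand-in for any procedure that estimates
    the cost of taking an action, for example a predicted recovery time.
    """
    if not candidates:
        raise ValueError("at least one candidate action is required")

    def expected_cost(action: CandidateAction) -> float:
        # Monte Carlo estimate: average the cost over several samples.
        return mean(sample_cost(action) for _ in range(num_samples))

    return min(candidates, key=expected_cost)


# Toy usage with made-up, deterministic costs (illustrative values only).
toy_costs = {"Isolate host": 3.0, "Reboot host": 7.0, "Do nothing": 12.0}
candidates = [CandidateAction(description) for description in toy_costs]
best = select_action(candidates, lambda action: toy_costs[action.description])
print(best.description)  # Isolate host
```

In practice the cost estimate would be noisy, which is why the sketch averages several samples per candidate rather than trusting a single evaluation.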

## NDSS 2026 paper and presentation

- Paper: [NDSS 2026 proceedings](https://www.ndss-symposium.org/ndss-paper/incident-response-planning-using-a-lightweight-large-language-model-with-reduced-hallucination/).
- Video: [NDSS 2026 presentation](https://www.youtube.com/watch?v=TGuNgPEFnwk).

## Artifacts

- The first public fine-tuning dataset of incidents and response actions. This is the dataset we used to produce the results in the paper. It can be downloaded [here](https://huggingface.co/datasets/kimhammar/CSLE-IncidentResponse-V1).
- The weights of the fine-tuned model, which can be downloaded [here](https://huggingface.co/kimhammar/LLMIncidentResponse).
- Python code for downloading the training dataset (`load_training_dataset.py`).
- Python code for downloading the fine-tuned model and using it to generate incident response plans (`load_fine_tuned_llm.py`).
- Python code for generating incident response plans (`response_generation.py`).
- Python code for fine-tuning a new model on our dataset (`fine_tune_llm.py`).
- [Demonstration video of our LLM-based decision-support system for incident response](https://www.youtube.com/watch?v=SCxq2ye-R4Y&).

## Requirements

- Python 3.8+
- `load_training_dataset.py`: requires 1 GB of storage and a commodity CPU.
- `load_fine_tuned_llm.py`: requires 15 GB of storage and a commodity CPU.
- `response_generation.py`: requires a commodity GPU, e.g., RTX 8000.
- `fine_tune_llm.py`: requires a commodity GPU, e.g., RTX 8000.

We have tested the Python scripts on the following platforms:

- macOS Sequoia with Python 3.9, 3.10, 3.11, 3.12, and 3.13.
- Ubuntu 22.04 with Python 3.9, 3.10, 3.11, 3.12, and 3.13.

## Installation

To download this repository, run the following command:

```
git clone https://github.com/Limmen/llm_incident_response_ndss26
```

To install the required Python libraries, run the following command:

```
pip install llm_recovery==0.0.13
```

**Note:** If you are using Python 3.9 or earlier, run the following command instead:

```
pip install llm_recovery==0.0.7
```

## Execution

### Load the fine-tuned LLM

Command:

```
python load_fine_tuned_llm.py
```

Expected output:

```
⋊> kim@gpu1 ⋊> ~/llm_incident_response_ndss26 on main ◦ python load_fine_tuned_llm.py (base) 19:25:01
Loading the fine-tuned incident response LLM.
adapter_config.json: 100%|████████████████████████████████████████████████████████████| 797/797 [00:00<00:00, 4.08MB/s]
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 4/4 [00:59<00:00, 14.78s/it]
/home/kim/anaconda3/lib/python3.11/site-packages/torch/cuda/__init__.py:734: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
adapter_model.safetensors: 0%| | 0.00/201M [00:00
```

### Load the training dataset

Command:

```
python load_training_dataset.py
```

Expected output:

```
⋊> kim@gpu1 ⋊> ~/llm_incident_response_ndss26 on main ◦ python load_training_dataset.py (base) 19:26:24
Loading training dataset.
README.md: 100%|█████████████████████████████████████████████████████████████████████| 33.0/33.0 [00:00<00:00, 187kB/s]
examples_16_june.json: 100%|█████████████████████████████████████████████████████████| 536M/536M [00:03<00:00, 145MB/s]
Generating train split: 1 examples [00:06, 6.32s/ examples]
Training dataset loaded successfully.
```

### Response generation

Command:

```
python response_generation.py
```

Expected output (example):

```
⋊> kim@gpu1 ⋊> ~/llm_incident_response_ndss26 on main ◦ python response_generation.py (base) 19:28:50
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 4/4 [00:05<00:00, 1.47s/it]
/home/kim/anaconda3/lib/python3.11/site-packages/torch/cuda/__init__.py:734: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
/home/kim/anaconda3/lib/python3.11/site-packages/torch/cuda/__init__.py:734: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.
I recognize that while the attack is contained, I do not yet have enough information to fully understand or eradicate it. Therefore, I choose to acquire full disk and memory images along with relevant logs, preserving evidence in a forensically sound manner to support analysis.
{
    "Action": "Acquire full disk and memory images of 10.20.11.42 and export DNS, firewall, and NetFlow logs to write-protected storage.",
    "Explanation": "Capturing images and logs secures evidence for later analysis and legal requirements."
}⏎
```

### Fine-tune DeepSeek-R1-Distill-Qwen-14B on our incident response dataset

Command:

```
python fine_tune_llm.py
```

Expected output:

```
⋊> kim@gpu1 ⋊> ~/llm_incident_response_ndss26 on main ⨯ python fine_tune_llm.py (base) 20:14:06
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 4/4 [00:35<00:00, 8.85s/it]
Trainable parameters: 50331648
No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
Step: 1, Epoch: 0.5000, Progress: 50.0%, Avg_loss=0.5480, LR=0.00095000, Grad_norm=0.3354, minutes: 0.7959
Step: 2, Epoch: 1.0000, Progress: 100.0%, Avg_loss=0.7036, LR=0.00047500, Grad_norm=0.4813, minutes: 1.2205
⋊> kim@gpu2 ⋊> ~/llm_incident_response_ndss26 on main ⨯
```

## DOI: Digital Object Identifier

https://doi.org/10.5281/zenodo.17459636

## Authors

Kim Hammar, Tansu Alpcan, and Emil Lupu. Contact: kimham@kth.se

## 🔖 Copyright and license

