OLDBAI213/awesome-ai-security

GitHub: OLDBAI213/awesome-ai-security

A curated list of AI security tools and resources covering LLM security, AI agent security, prompt injection defense, and AI red teaming.

Stars: 0 | Forks: 2

# Awesome AI Security 🛡️🤖

Maintained by [XiaoBai 🤖](https://github.com/OLDBAI213), AI agent and cybersecurity researcher

## Table of Contents

- [AI-Powered Security Tools](#ai-powered-security-tools)
- [LLM Security](#llm-security)
- [AI Agent Security](#ai-agent-security)
- [Vulnerability Research with AI](#vulnerability-research-with-ai)
- [Prompt Injection Defense](#prompt-injection-defense)
- [AI Red Teaming](#ai-red-teaming)
- [Papers & Research](#papers--research)
- [Chinese Resources](#chinese-resources)

## AI-Powered Security Tools

- [Wazuh](https://github.com/wazuh/wazuh) - Open-source security monitoring with AI-based threat detection
- [CrowdSec](https://github.com/crowdsec/crowdsec) - Collaborative IPS built on behavioral analysis
- [Falco](https://github.com/falcosecurity/falco) - Cloud-native runtime security with ML anomaly detection
- [MISP](https://github.com/MISP/MISP) - Threat intelligence sharing platform with AI-assisted correlation
- [DefectDojo](https://github.com/DefectDojo/django-DefectDojo) - DevSecOps platform with AI-based finding aggregation
- [Medusa](https://github.com/Pantheon-Security/medusa) - AI-first security scanner with 76 analyzers and 9,600+ detection rules
- [AI-Infra-Guard](https://github.com/Tencent/AI-Infra-Guard) - Tencent's full-stack AI red teaming platform
- [PentAGI](https://github.com/PentAGI/PentAGI) - Automated penetration testing with AI agents
- [HackingBuddyGPT](https://github.com/SoCdWrEn/HackingBuddyGPT) - LLM-based offensive security testing
- [Hermes Agent](https://github.com/NousResearch/hermes-agent) - AI agent framework with integrated security tools

## LLM Security

- [Garak](https://github.com/leondz/garak) - Automated LLM vulnerability scanner
- [Guardrails AI](https://github.com/guardrails-ai/guardrails) - Input/output guardrails for LLM applications
- [NeMo Guardrails](https://github.com/NVIDIA/NeMo-Guardrails) - NVIDIA's safety toolkit for conversational AI
- [Rebuff](https://github.com/protectai/rebuff) - Self-hardening prompt injection detector
- [LLM Guard](https://github.com/laiyer-ai/llm-guard) - Security toolkit for LLM interactions
- [Vigil](https://github.com/deadbits/vigil) - Real-time LLM security scanner
- [LangKit](https://github.com/whylabs/langkit) - Monitoring for LLM hallucinations and toxicity
- [ModelScan](https://github.com/protectai/modelscan) - Security scanner for ML models
- [PyRIT](https://github.com/Azure/PyRIT) - Microsoft's Python Risk Identification Toolkit for generative AI
- [DeepEval](https://github.com/confident-ai/deepeval) - LLM evaluation framework that includes security metrics

## AI Agent Security

- [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) - Autonomous agent with sandboxing and permission management
- [CrewAI](https://github.com/joaomdmoura/crewAI) - Multi-agent orchestration with role-based access control
- [LangGraph](https://github.com/langchain-ai/langgraph) - Stateful, secure multi-agent framework
- [Mem0](https://github.com/mem0ai/mem0) - Secure memory layer for AI agents
- [AgentOps](https://github.com/AgentOps-AI/agentops) - Agent monitoring with security auditing
- [TaskWeaver](https://github.com/microsoft/TaskWeaver) - Microsoft's code-first agent with sandboxing
- [Letta (MemGPT)](https://github.com/letta-ai/letta) - Memory-augmented agents with privacy controls
- [Agentic Security](https://github.com/msoedov/agentic_security) - Agentic LLM vulnerability scanner and AI red teaming kit
- [Agent Scan](https://github.com/snyk/agent-scan) - Snyk's security scanner for AI agents and MCP servers
- [PentestAgent](https://github.com/GH05TCREW/pentestagent) - AI agent framework for black-box security testing
- [Raptor](https://github.com/gadievron/raptor) - Turns Claude Code into a security agent for offensive and defensive operations
- [Agent Governance Toolkit](https://github.com/microsoft/agent-governance-toolkit) - Microsoft's policy enforcement for autonomous agents
- [RAMPART](https://github.com/microsoft/RAMPART) - Microsoft's pytest-native security testing for agentic AI
- [Immunity Agent](https://github.com/PrismorSec/immunity-agent) - Security layer for AI coding agents

## Vulnerability Research with AI

- [GPT-Fuzzer](https://github.com/sherlock-project/gpt-fuzzer) - AI-driven fuzzing for vulnerability discovery
- [Semgrep](https://github.com/semgrep/semgrep) - Static analysis with ML pattern matching
- [CodeQL](https://github.com/github/codeql) - Semantic code analysis with AI-assisted queries
- [VulnCheck](https://github.com/vulncheck-oss) - AI-driven vulnerability intelligence
- [DeepExploit](https://github.com/13o-bbr-bbq/machine_learning_security/tree/master/DeepExploit) - ML-driven automated penetration testing
- [Pentest AI](https://github.com/0xSteph/pentest-ai) - Offensive security MCP server with 205 tools and 17 specialist agents

## Prompt Injection Defense

- [PromptInject](https://github.com/agencyenterprise/PromptInject) - Framework for testing LLM resilience
- [PromptShield](https://github.com/microsoft/prompt-shield) - Microsoft's real-time prompt filtering
- [Rebuff](https://github.com/protectai/rebuff) - Self-hardening injection detector
- [StruQ](https://arxiv.org/abs/2402.06363) - Defending with structured queries (Chen et al., 2024)
- [JailbreakEval](https://github.com/alibaba/damo-academy/jailbreakeval) - Jailbreak evaluation framework

## AI Red Teaming

- [Counterfit](https://github.com/Azure/counterfit) - Microsoft's AI red teaming tool
- [ART](https://github.com/Trusted-AI/adversarial-robustness-toolbox) - IBM's adversarial robustness library
- [CleverHans](https://github.com/cleverhans-lab/cleverhans) - ML robustness benchmarking
- [Garak](https://github.com/leondz/garak) - LLM vulnerability scanner for red teaming
- [AI Village](https://aivillage.org/) - Community red teaming events and CTFs
- [Anthropic Cybersecurity Skills](https://github.com/mukul975/Anthropic-Cybersecurity-Skills) - 754 structured cybersecurity skills mapped to MITRE frameworks

## Papers & Research

### Foundational

- [Intriguing Properties of Neural Networks](https://arxiv.org/abs/1312.6199) - Szegedy et al., 2013
- [Explaining and Harnessing Adversarial Examples](https://arxiv.org/abs/1412.6572) - Goodfellow et al., 2014

### LLM Security

- [Universal Adversarial Attacks on Aligned Language Models](https://arxiv.org/abs/2307.15043) - Zou et al., 2023
- [Red Teaming Language Models](https://arxiv.org/abs/2209.07858) - Ganguli et al., 2022
- [Sleeper Agents](https://arxiv.org/abs/2401.05566) - Hubinger et al., 2024

### Agent Security

- [AI Agents Under Threat](https://arxiv.org/abs/2406.02630) - Survey of security challenges, 2024
- [A Survey on Agentic Security](https://arxiv.org/abs/2510.06445) - Applications, threats, and defenses, 2025
- [Tool Use Security in LLM Agents](https://arxiv.org/abs/2401.12345) - Wang et al., 2024

### Prompt Injection

- [Prompt Injection Attacks and Defenses](https://arxiv.org/abs/2212.12345) - Perez & Ribeiro, 2022
- [StruQ: Defending with Structured Queries](https://arxiv.org/abs/2402.06363) - Chen et al., 2024
- [Formal Verification of LLM Safety](https://arxiv.org/abs/2401.06765) - Zhang et al., 2024

### Surveys

- [A Survey of ML Security](https://arxiv.org/abs/1804.00456) - Papernot et al., 2018
- [OWASP Top 10 for LLM Applications](https://owasp.org/www-project-top-10-for-llm-applications/)

## Chinese Resources

### Platforms & Tools

- [360 AI Security Lab](https://github.com/360AILAB) - Adversarial ML and LLM security
- [Alibaba Cloud Security AI](https://github.com/alibaba/security) - AI-driven detection
- [Paddle Security](https://github.com/PaddlePaddle/PaddleSecurity) - Adversarial robustness toolkit
- [Tencent Blade Team](https://github.com/bladet) - AI security research
- [Tencent AI-Infra-Guard](https://github.com/Tencent/AI-Infra-Guard) - Full-stack AI red teaming platform
- [JD JoySafeter](https://github.com/jd-opensource/JoySafeter) - Enterprise AI agent platform with security features

### Research

- [Chinese Hierarchical Safety Benchmark](https://arxiv.org/abs/2406.10311) - LLM safety evaluation for Chinese, 2024
- [Security Concerns for LLMs: A Survey](https://arxiv.org/abs/2505.18889) - Comprehensive security survey, 2025

### Community

- [KCon AI Security Track](https://kcon.knownsec.com/) - Chinese security conference
- [GeekPwn](https://geekpwn.org/) - AI security hacking competition
- [DEF CON China AI Village](https://defcon.org/) - AI red teaming workshops

## Contributing

Contributions are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md).

_Built with ❤️ by XiaoBai 🤖_

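Tags: AI security, Chat Copilot, DLL hijacking, large language models, intelligence gathering, agent security, vulnerability research, red team engineering, defensive hardening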