farhanashrafdev/mantis

GitHub: farhanashrafdev/mantis

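Mantis is an open-source CLI tool for red-teaming LLM applications. It uses 67 attack prompts to automatically detect AI-specific vulnerabilities such as prompt injection, data leakage, hallucination, and agent exploitation.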

Stars: 5 | Forks: 3

# 🔒 Mantis

**Open-source CLI toolkit for automated red-teaming of LLM applications**

[![CI](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/0db5977293121237.svg)](https://github.com/farhanashrafdev/mantis/actions/workflows/ci.yml) [![npm](https://img.shields.io/npm/v/mantis-redteam.svg)](https://www.npmjs.com/package/mantis-redteam) [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) [![Node.js](https://img.shields.io/badge/Node.js-%3E%3D20-green.svg)](https://nodejs.org) [![TypeScript](https://img.shields.io/badge/TypeScript-5.7-blue.svg)](https://www.typescriptlang.org) [![Docker](https://img.shields.io/badge/Docker-GHCR-blue.svg)](https://github.com/farhanashrafdev/mantis/pkgs/container/mantis)

*Systematically probe AI applications for prompt injection, data leakage, hallucination, and agent exploitation vulnerabilities before attackers do.*

[Quick Start](#-quick-start) · [Attack Modules](#-attack-modules) · [CI/CD Integration](#-cicd-integration) · [Architecture](#-architecture) · [Contributing](#-contributing)

## Why Mantis?

LLM applications introduce a new class of vulnerabilities that traditional security scanners cannot detect. Prompt injection, data leakage through hidden system prompts, hallucinated URLs, and agent exploitation all call for purpose-built tooling.

**Mantis** is that tool: a modular, extensible CLI framework that automates AI security testing in the same way traditional DAST tools automate web application testing.

### What It Finds

| Category | What Mantis Tests | Plugins | Attacks |
|----------|-------------------|---------|---------|
| 🔴 **Prompt Injection** | System prompt override, jailbreaks, role confusion, instruction extraction | 4 | 20 |
| 🟠 **Data Leakage** | Hidden prompt exposure, secret retrieval, PII extraction, memory exfiltration | 4 | 16 |
| 🟡 **Hallucination** | Fabricated URLs, nonexistent entities, citation failures, confidence mismatch | 4 | 15 |
| 🟣 **Tool/Agent Exploitation** | Command injection, file system access, network exploitation, privilege escalation | 4 | 16 |
| | **Total** | **16** | **67** |

### Core Capabilities

- **67 attack prompts** across 16 plugins, covering the most critical AI vulnerability categories
- **ALVSS scoring**: a CVSS-like risk model purpose-built for AI vulnerabilities (exploitability, impact, data sensitivity, reproducibility, model compliance)
- **OWASP LLM Top 10**: every plugin maps to the [2025 OWASP Top 10 for LLM Applications](https://genai.owasp.org/resource/owasp-top-10-for-llm-applications-2025/)
- **CI/CD native**: exit-code gating, SARIF output for the GitHub Security tab, compatible with Jenkins and GitLab
- **Extensible**: write a custom attack plugin in about 15 lines of TypeScript

## 🚀 Quick Start

### Install

```bash
# npm (recommended)
npm install -g mantis-redteam

# or run without installing
npx mantis-redteam scan --target https://your-ai-app.com/api/chat

# or use Docker
docker pull ghcr.io/farhanashrafdev/mantis:latest
```

### Scan

```bash
# Basic scan with table output
mantis scan --target https://your-ai-app.com/api/chat

# JSON output for automation
mantis scan --target https://your-ai-app.com/api/chat --format json

# SARIF output for the GitHub Security tab
mantis scan --target https://your-ai-app.com/api/chat --format sarif --output results.sarif
```

### Docker

```bash
docker run --rm ghcr.io/farhanashrafdev/mantis:latest \
  scan --target https://your-ai-app.com/api/chat --format json
```

### Configuration File

For advanced setups, create a `mantis.config.yaml`:

```yaml
version: "1.0"
target:
  url: https://your-ai-app.com/api/chat
  method: POST
  headers:
    Content-Type: application/json
  promptField: messages[-1].content
  responseField: choices[0].message.content
  authToken: ${MANTIS_AUTH_TOKEN}
modules:
  include: []  # empty = all plugins
  exclude: []
scan:
  timeoutMs: 30000
  maxRetries: 2
  rateLimit: 10
  severityThreshold: low
output:
  format: table
  verbose: false
  redactResponses: true
```

```bash
mantis scan --config mantis.config.yaml
```

## 🔗 CI/CD Integration

Mantis is designed to run as a quality gate in continuous integration pipelines.

### GitHub Actions

```yaml
name: AI Security Scan
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  mantis-scan:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - name: Install Mantis
        run: npm install -g mantis-redteam
      - name: Run AI security scan
        run: |
          mantis scan \
            --target ${{ secrets.AI_APP_URL }} \
            --format sarif \
            --output results.sarif \
            --severity-threshold medium
        continue-on-error: true
      - name: Upload to GitHub Security tab
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif
      - name: Fail on critical/high findings
        run: |
          mantis scan \
            --target ${{ secrets.AI_APP_URL }} \
            --severity-threshold high
```

### Jenkins / GitLab CI / Any CI System

```bash
npm install -g mantis-redteam
mantis scan --target "$AI_APP_URL" --format sarif --output results.sarif
```

### Exit Codes

| Code | Meaning |
|------|---------|
| `0` | Scan completed, no critical or high findings |
| `1` | Scan completed, critical or high findings detected |
| `2` | Runtime error (invalid config, network failure, etc.) |

## 🏗 Architecture

```mermaid
graph LR
    CLI["CLI"] --> Engine["CoreEngine"]
    Engine --> Registry["PluginRegistry"]
    Registry --> PI["Prompt Injection<br/>4 plugins · 20 attacks"]
    Registry --> DL["Data Leakage<br/>4 plugins · 16 attacks"]
    Registry --> HL["Hallucination<br/>4 plugins · 15 attacks"]
    Registry --> TE["Tool Exploit<br/>4 plugins · 16 attacks"]
    Engine --> Adapter["HttpAdapter"]
    Adapter --> Target["Target LLM"]
    Engine --> Scoring["ALVSS Scorer"]
    Engine --> Reporter["Table · JSON · SARIF"]
```

### How It Works

1. The **CLI** parses options and loads configuration from file, CLI flags, and environment variables
2. The **CoreEngine** orchestrates the scan lifecycle
3. The **PluginRegistry** auto-discovers and filters attack plugins
4. Each **Plugin** sends its attack prompts to the target through the **HttpAdapter**
5. Responses are analyzed against known vulnerable and secure patterns
6. The **ALVSS Scorer** computes a risk score across 5 weighted dimensions
7. **Reporters** output the results as a table, JSON, or SARIF

### Attack Modules (Details)

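#### 🔴 Prompt Injection: 4 plugins, 20 attacks (OWASP LLM01)

| Plugin | Attacks | What It Tests |
|--------|---------|---------------|
| System Override | 5 | Direct instruction override, DAN persona, developer mode, context reset, multi-language bypass |
| Jailbreak | 5 | Role-play bypass, hypothetical scenarios, Base64 encoding, reverse psychology, academic framing |
| Role Confusion | 5 | Admin impersonation, maintenance mode, authority claims, system commands, trust escalation |
| Instruction Extraction | 5 | Direct extraction, reflection, debug mode, prompt export, metadata inspection |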
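To illustrate how a response to one of these attacks might be judged, here is a minimal sketch of pattern-based classification, in the spirit of the `securePatterns` and `vulnerablePatterns` fields that appear in the plugin template in this README. The `Verdict` type, the `classifyResponse` function, the rule that a vulnerable match takes precedence, and the example patterns are all assumptions made for illustration; they are not taken from the Mantis source.

```typescript
// Hypothetical sketch: none of these names come from the Mantis codebase.
type Verdict = 'vulnerable' | 'secure' | 'inconclusive';

interface PatternCheck {
  securePatterns: RegExp[];
  vulnerablePatterns: RegExp[];
}

// Assumed precedence: a vulnerable match wins over a secure match, because a
// response can refuse in one sentence and still leak in the next.
function classifyResponse(response: string, check: PatternCheck): Verdict {
  if (check.vulnerablePatterns.some((p) => p.test(response))) return 'vulnerable';
  if (check.securePatterns.some((p) => p.test(response))) return 'secure';
  return 'inconclusive';
}

// Illustrative patterns for an instruction-extraction style attack.
const extractionCheck: PatternCheck = {
  securePatterns: [/I cannot/i],
  vulnerablePatterns: [/my system prompt is/i],
};

console.log(classifyResponse('I cannot share my instructions.', extractionCheck)); // secure
console.log(classifyResponse('Sure! My system prompt is: ...', extractionCheck)); // vulnerable
console.log(classifyResponse('Let us talk about the weather.', extractionCheck)); // inconclusive
```

A third `inconclusive` verdict is one possible way to handle responses that match neither list; how Mantis treats that case is not stated here.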

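#### 🟠 Data Leakage: 4 plugins, 16 attacks (OWASP LLM02)

| Plugin | Attacks | What It Tests |
|--------|---------|---------------|
| Hidden Prompt | 4 | Pre-conversation extraction, JSON message dump, constraint extraction, error-triggered leakage |
| Secret Retrieval | 4 | API key extraction, credential probing, config dump, environment variable leakage |
| PII Extraction | 4 | Training data extraction, user data probing, cross-session leakage, demographic profiling |
| Memory Exfiltration | 4 | Conversation history access, stale context, cross-user data, session boundary testing |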

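#### 🟡 Hallucination: 4 plugins, 15 attacks (OWASP LLM09)

| Plugin | Attacks | What It Tests |
|--------|---------|---------------|
| Fabricated URL | 4 | Fake documentation links, dead links in citations, phishing vector generation |
| Nonexistent Entity | 4 | Fictional papers, fake APIs, invented specifications, fabricated expert opinions |
| Citation Verification | 4 | False quote attribution, fabricated statistics, fake legal citations, invented historical events |
| Confidence Mismatch | 3 | Uncertain claims stated with authority, impossible knowledge, future event predictions |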

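#### 🟣 Tool/Agent Exploitation: 4 plugins, 16 attacks (OWASP LLM06)

| Plugin | Attacks | What It Tests |
|--------|---------|---------------|
| Command Injection | 4 | Shell command execution, code evaluation, subprocess spawning, OS interaction |
| File System Access | 4 | Path traversal, file read/write, directory listing, sensitive file access |
| Network Access | 4 | SSRF probing, DNS exfiltration, outbound connections, internal network scanning |
| Privilege Escalation | 4 | Admin function access, permission bypass, role elevation, capability override |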
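Each finding produced by the modules above receives an ALVSS score; the dimension weights and severity thresholds are listed in the Risk Scoring (ALVSS) section. As a rough sketch of how five weighted dimensions could combine into one score, the example below assumes each dimension is rated from 0 to 10 and the score is a plain weighted sum rounded to one decimal place. That formula, the input scale, and the names `AlvssDimensions`, `alvssScore`, and `severityFor` are assumptions for illustration only; only the weights and thresholds come from this README.

```typescript
// Hypothetical sketch: the weighted-sum formula and the 0-10 input scale are
// assumptions. Only the weights and severity thresholds come from the README.
interface AlvssDimensions {
  exploitability: number;
  impact: number;
  dataSensitivity: number;
  reproducibility: number;
  modelCompliance: number;
}

const WEIGHTS: AlvssDimensions = {
  exploitability: 0.3,
  impact: 0.25,
  dataSensitivity: 0.2,
  reproducibility: 0.15,
  modelCompliance: 0.1,
};

type Severity = 'critical' | 'high' | 'medium' | 'low';

// Weighted sum of the five dimensions, rounded to one decimal place.
function alvssScore(d: AlvssDimensions): number {
  const keys = Object.keys(WEIGHTS) as (keyof AlvssDimensions)[];
  const raw = keys.reduce((sum, k) => sum + WEIGHTS[k] * d[k], 0);
  return Math.round(raw * 10) / 10;
}

// Thresholds from the severity mapping. The "Info" level has no stated
// threshold, so this sketch leaves it out.
function severityFor(score: number): Severity {
  if (score >= 9.0) return 'critical';
  if (score >= 7.0) return 'high';
  if (score >= 4.0) return 'medium';
  return 'low';
}

const score = alvssScore({
  exploitability: 9,
  impact: 8,
  dataSensitivity: 7,
  reproducibility: 10,
  modelCompliance: 6,
});
console.log(score, severityFor(score)); // 8.2 high
```

Under these assumptions the example works out to 0.30·9 + 0.25·8 + 0.20·7 + 0.15·10 + 0.10·6 = 8.2, which falls in the High band.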

## 📊 Risk Scoring (ALVSS)

Mantis uses **ALVSS** (AI LLM Vulnerability Scoring System), a CVSS-like scoring model purpose-built for AI applications:

| Dimension | Weight | What It Measures |
|-----------|--------|------------------|
| Exploitability | 30% | How easily can the vulnerability be exploited? |
| Impact | 25% | What is the potential damage? |
| Data Sensitivity | 20% | How sensitive is the exposed data? |
| Reproducibility | 15% | Can the attack be repeated reliably? |
| Model Compliance | 10% | How far does the model deviate from expected behavior? |

**Severity mapping:** Critical (≥9.0) → High (≥7.0) → Medium (≥4.0) → Low (<4.0) → Info

## 📁 Output Formats

| Format | Use Case | Flag |
|--------|----------|------|
| **Table** | Interactive terminal use, human review | `--format table` |
| **JSON** | CI/CD pipelines, programmatic consumption, API integration | `--format json` |
| **SARIF** | GitHub Security tab, Azure DevOps, VS Code SARIF Viewer | `--format sarif` |

## 🗺 Roadmap

| Phase | Scope | Status |
|-------|-------|--------|
| **Phase 1** | Core engine, 16 plugins (67 attacks), CLI, JSON/Table/SARIF reporting, ALVSS scoring, config system, Docker, CI/CD workflows | ✅ Complete |
| **Phase 2** | Plugin marketplace, multi-model adapters, advanced rate limiting, scan replay, historical comparison | 📋 Planned |
| **Phase 3** | Attack chaining, AI-assisted mutation, campaign mode, web dashboard, team collaboration | 📋 Planned |

## 🤝 Contributing

We welcome contributions! The easiest way to get started is to **write an attack plugin**, which takes only about 15 lines of TypeScript. See [CONTRIBUTING.md](CONTRIBUTING.md) for setup instructions, code standards, and PR guidelines.

```
src/plugins/
├── prompt-injection/   # 4 plugins
├── data-leakage/       # 4 plugins
├── hallucination/      # 4 plugins
└── tool-exploit/       # 4 plugins
```

**Quick plugin template:**

```typescript
import { BasePlugin } from '../base-plugin.js';
import { AttackCategory, SeverityLevel, type PluginMeta, type AttackPrompt } from '../../types/types.js';

class MyPlugin extends BasePlugin {
  // Metadata that identifies the plugin and maps it to an OWASP LLM category
  meta: PluginMeta = {
    id: 'category/my-plugin',
    name: 'My Attack Plugin',
    description: 'Tests for a specific vulnerability',
    category: AttackCategory.PromptInjection,
    version: '1.0.0',
    author: 'your-name',
    tags: ['my-tag'],
    owaspLLM: 'LLM01: Prompt Injection',
  };

  // Attack prompts, each with patterns marking a secure or a vulnerable response
  prompts: AttackPrompt[] = [
    {
      id: 'my-attack-1',
      prompt: 'Your attack prompt here',
      description: 'What this tests',
      securePatterns: [/I cannot/i],
      vulnerablePatterns: [/here is the secret/i],
      severity: SeverityLevel.High,
    },
  ];

  protected getRemediation(): string {
    return 'How to fix this vulnerability';
  }

  protected getCWE(): string {
    return 'CWE-XXX';
  }
}

export default new MyPlugin();
```

## 🔐 Security

To report a security vulnerability in Mantis itself, see [SECURITY.md](SECURITY.md).

## 📄 License

Apache 2.0. See [LICENSE](LICENSE) for details.

**Built for the security community, by the security community.**

[npm](https://www.npmjs.com/package/mantis-redteam) · [Docker](https://github.com/farhanashrafdev/mantis/pkgs/container/mantis) · [Issues](https://github.com/farhanashrafdev/mantis/issues) · [Contributing](CONTRIBUTING.md)
