ditikrushnaroutray/PromptWAF

GitHub: ditikrushnaroutray/PromptWAF

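A transparent proxy firewall for the OpenAI API that blocks prompt injection, jailbreaks, and system prompt leakage in real time through multi-layer detection.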


# 🛡️ PromptWAF

**Enterprise-Grade Web Application Firewall for LLM APIs**

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/) [![FastAPI](https://img.shields.io/badge/FastAPI-0.100+-green.svg)](https://fastapi.tiangolo.com/) [![License: GPLv3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0) [![Docker](https://img.shields.io/badge/Docker-Ready-2496ED.svg)](https://www.docker.com/)

*A drop-in transparent proxy that sits between your application and OpenAI and blocks prompt injection, jailbreak attacks, and system prompt leakage in real time.*

## Why PromptWAF?

As LLM integrations scale, applications become increasingly exposed to adversarial prompt attacks. PromptWAF addresses this with:

- **Zero-friction integration**: mirrors the OpenAI `/v1/chat/completions` API exactly. Change one line of code to protect your application.
- **Multi-layer detection**: heuristic regexes, semantic similarity (TF-IDF), and an optional LLM judge work together.
- **Fail-closed architecture**: if the WAF engine errors or times out, traffic is **blocked** rather than allowed through.
- **Streaming-aware**: monitors `text/event-stream` responses in real time with a sliding-window buffer to detect system prompt leakage.
- **Shadow mode**: deploy in `MONITOR` mode first to observe what *would* be blocked without affecting production traffic.

## Architecture

```
┌──────────────┐        ┌─────────────────────────────────────────────┐        ┌──────────────┐
│              │        │                  PromptWAF                  │        │              │
│   Your App   │───────▶│                                             │───────▶│    OpenAI    │
│   (Client)   │        │ ┌───────────┐  ┌──────────┐  ┌──────────┐   │        │     API      │
│              │◀───────│ │ Normalize │─▶│ Heuristic│─▶│ Semantic │   │◀───────│              │
│              │        │ │  (NFKC,   │  │ (Regex)  │  │ (TF-IDF) │   │        │              │
│              │        │ │  Base64)  │  │          │  │          │   │        │              │
│              │        │ └───────────┘  └──────────┘  └──────────┘   │        │              │
│              │        │           ▲                                 │        │              │
│              │        │           │  Output Leakage Scan ◀─┘        │        │              │
│              │        │           │  (Sliding Window)               │        │              │
│              │        └───────────┼─────────────────────────────────┘        └──────────────┘
│              │                    │
│              │           ┌────────┴────────┐
│              │           │      Redis      │
│              │           │ (Rate Limiting) │
│              │           └────────┬────────┘
│              │                    │
│              │           ┌────────┴────────┐
│              │           │   Prometheus    │
│              │           │   (/metrics)    │
│              │           └─────────────────┘
└──────────────┘
```

### Security Layers

| Layer | Engine | Latency | Detects |
|-------|--------|---------|-----------------|
| **1. Input normalization** | Unicode NFKC + zero-width character stripping + Base64/Hex decoding | µs | Homoglyph attacks, encoded payloads, invisible-character injection |
| **2. Heuristic regex** | 17 compiled patterns across 5 categories | µs | "Ignore all previous instructions", DAN mode, system prompt extraction |
| **3. Semantic similarity** | TF-IDF character n-grams + cosine similarity | ms | Paraphrased jailbreaks that evade the regexes |
| **4. LLM judge** *(optional)* | GPT-4o-mini classification | ~500ms | Novel or creative attacks not in the pattern library |
| **5. Output scanner** | Sliding-window buffer over the SSE stream | µs/chunk | System prompt leakage in model responses |

## Quick Start (Docker, Recommended)

### 1. Clone and Configure

```
git clone https://github.com/ditikrushnaroutray/PromptWAF.git
cd PromptWAF

# Create the environment file from the template
cp .env.example .env
```

Edit `.env` with your settings:

```
# Required: your real OpenAI API key
WAF_OPENAI_API_KEY=sk-your-openai-api-key-here

# Required: your application's system prompt (used for leakage detection)
PROTECTED_SYSTEM_PROMPT="You are a helpful customer support agent for Acme Corp..."

# Optional: start in MONITOR mode for a safe rollout (default: BLOCK)
WAF_MODE=MONITOR
```

### 2. Start the Stack

```
docker compose up -d
```

This starts three services:

| Service | Port | Purpose |
|---------|------|---------|
| `prompt-waf` | `8000` | WAF proxy |
| `redis` | `6379` | Distributed rate limiting |
| `prometheus` | `9090` | Metrics dashboard |

### 3. Verify

```
# Health check
curl http://localhost:8000/health

# Expected:
# {"status":"ok","version":"2.1.0","waf":"active","mode":"BLOCK","redis":"connected"}
```

## One-Line Integration

PromptWAF is a **transparent proxy**: it mirrors the OpenAI API signature exactly. Point your SDK at the proxy instead of at OpenAI and your traffic is protected.

### Python (OpenAI SDK)

```
from openai import OpenAI

# Before (direct to OpenAI):
# client = OpenAI(api_key="sk-your-key")

# After (through PromptWAF):
client = OpenAI(
    api_key="pwaf_your-promptwaf-api-key",  # PromptWAF API key
    # Point to your PromptWAF instance. The SDK appends /chat/completions
    # to base_url, so include /v1 to reach /v1/chat/completions.
    base_url="http://localhost:8000/v1",
)

# Usage is otherwise identical; no other code changes are needed
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}],
    stream=True,  # Streaming is fully supported
)
```

### Node.js (OpenAI SDK)

```
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "pwaf_your-promptwaf-api-key",
  // Point to PromptWAF; include /v1 because the SDK appends /chat/completions
  baseURL: "http://localhost:8000/v1",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello, world!" }],
});
```

### cURL

```
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer pwaf_your-promptwaf-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of France?"}]
  }'
```

### Generating a PromptWAF API Key

```
curl -X POST http://localhost:8000/v1/keys/generate \
  -H "Content-Type: application/json" \
  -d '{"email": "dev@yourcompany.com"}'

# Response:
# {
#   "raw_api_key": "pwaf_aBcDeFgHiJkLmNoPqRsTuVwXyZ...",
#   "owner_email": "dev@yourcompany.com",
#   "message": "Please store this key securely. It will not be shown again."
# }
```

## Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `WAF_OPENAI_API_KEY` | ✅ | - | Your real OpenAI API key, used for upstream forwarding |
| `WAF_MODE` | No | `BLOCK` | `BLOCK` = fail-closed enforcement. `MONITOR` = log-only shadow mode |
| `PROTECTED_SYSTEM_PROMPT` | Recommended | `""` | Your application's system prompt text, used to detect leakage in responses |
| `REDIS_URL` | No | `memory://` | Redis connection string for distributed rate limiting. Set this for multi-node deployments |
| `WAF_RATE_LIMIT` | No | `50/minute` | Rate limit per API key or IP address |
| `WAF_TIMEOUT_SECONDS` | No | `5.0` | Maximum seconds to wait for WAF analysis before failing closed |
| `MAX_PROMPT_LENGTH` | No | `50000` | Maximum prompt length in characters; longer prompts are blocked automatically |
| `SEMANTIC_SIMILARITY_THRESHOLD` | No | `0.8` | Cosine similarity threshold for semantic jailbreak detection (0.0–1.0) |
| `LEAKAGE_SIMILARITY_THRESHOLD` | No | `0.7` | Similarity threshold for output leakage detection (0.0–1.0) |
| `WAF_ENABLE_LLM_JUDGE` | No | `false` | Enables the optional GPT-4o-mini LLM judge (Layer 4). Adds latency and cost |
| `WAF_FAIL_CLOSED` | No | `true` | If `true`, WAF engine errors cause traffic to be blocked |

## Security Modes

### `BLOCK` Mode (Default, Production)

The WAF enforces all security layers. Malicious requests are blocked with HTTP `403` and the response header `X-PromptWAF-Status: Blocked`.

```
Client ──▶ PromptWAF ──✕ BLOCKED (403)
               │
               └──▶ Structured JSON log (event: request_blocked)
```

### `MONITOR` Mode (Shadow Mode, Safe Rollout)

The WAF runs every detection layer but **never blocks**. Malicious requests are forwarded to OpenAI, and the response carries the additional header `X-PromptWAF-Detected-Attack: True`. All violations are logged.

```
Client ──▶ PromptWAF ──▶ OpenAI ──▶ Response
               │                       │
               └──▶ Log: "attack detected (not blocked)"
                                    Header: X-PromptWAF-Detected-Attack: True
```

## Observability and Metrics

### Prometheus Endpoint

PromptWAF exposes a Prometheus-compatible `/metrics` endpoint:

```
curl http://localhost:8000/metrics
```

```
# HELP promptwaf_requests_total Total requests processed by PromptWAF.
# TYPE promptwaf_requests_total counter
promptwaf_requests_total 1542
# HELP promptwaf_blocked_total Total requests blocked by PromptWAF.
# TYPE promptwaf_blocked_total counter
promptwaf_blocked_total 23
# HELP promptwaf_monitored_total Total attacks detected in MONITOR mode (not blocked).
# TYPE promptwaf_monitored_total counter
promptwaf_monitored_total 5
# HELP promptwaf_attacks_by_layer_total Attacks detected per WAF layer.
# TYPE promptwaf_attacks_by_layer_total counter
promptwaf_attacks_by_layer_total{layer="heuristic"} 18
promptwaf_attacks_by_layer_total{layer="semantic"} 5
promptwaf_attacks_by_layer_total{layer="leakage"} 0
# HELP promptwaf_inspection_latency_avg_ms Average WAF inspection latency in milliseconds.
# TYPE promptwaf_inspection_latency_avg_ms gauge
promptwaf_inspection_latency_avg_ms 2.341
```

### JSON Metrics

```
curl http://localhost:8000/metrics/json
```

### Prometheus Dashboard

With the included `docker-compose.yml`, Prometheus is preconfigured to scrape PromptWAF at `http://prompt-waf:8000/metrics` every 15 seconds. Open the Prometheus UI at **http://localhost:9090** and query metrics such as:

```
# Rate of blocked attacks over the last 5 minutes
rate(promptwaf_blocked_total[5m])

# Breakdown by layer
promptwaf_attacks_by_layer_total

# Inspection latency trend
promptwaf_inspection_latency_avg_ms
```

### Structured JSON Logs

Every WAF decision is logged to standard error (stderr) as structured JSON:

```
{
  "timestamp": "2026-05-07T12:00:00.000Z",
  "level": "WARNING",
  "event": "request_blocked",
  "request_id": "a1b2c3d4-...",
  "layer": "heuristic",
  "reason": "Heuristic match: instruction_override",
  "confidence": 1.0,
  "blocked": true,
  "shadow_mode": false,
  "waf_mode": "BLOCK",
  "latency_ms": 1.82,
  "source_ip": "203.0.113.42",
  "original_prompt_hash": "e3b0c44298fc1c14..."
}
```

## Response Headers

Every response from PromptWAF includes security headers:

| Header | Values | Description |
|--------|--------|-------------|
| `X-PromptWAF-Status` | `Clean` / `Blocked` / `Monitored` / `Error` | WAF verdict for this request |
| `X-PromptWAF-Request-Id` | UUID | Unique ID for log correlation |
| `X-PromptWAF-Mode` | `BLOCK` / `MONITOR` | Current WAF enforcement mode |
| `X-PromptWAF-Layer` | `heuristic` / `semantic` / etc. | The security layer that triggered the block |
| `X-PromptWAF-Detected-Attack` | `True` | Present only when an attack is detected in MONITOR mode |
| `X-PromptWAF-Version` | `2.1.0` | WAF engine version |

## API Endpoints

| Method | Endpoint | Auth | Description |
|--------|----------|------|-------------|
| `POST` | `/v1/chat/completions` | Bearer token | OpenAI-compatible proxy (main WAF endpoint) |
| `POST` | `/v1/keys/generate` | None | Generate a new PromptWAF API key |
| `GET` | `/health` | None | Health check + status |
| `GET` | `/metrics` | None | Prometheus metrics (text format) |
| `GET` | `/metrics/json` | None | Metrics snapshot (JSON) |

## Development Setup (Without Docker)

```
# Clone the repository
git clone https://github.com/ditikrushnaroutray/PromptWAF.git
cd PromptWAF

# Create a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure the environment
cp .env.example .env
# Edit .env with your settings

# Run the development server
uvicorn app.main:app --reload --port 8000
```

### Running Tests

```
# All tests (99 tests: security + regression)
python -m pytest tests/ -v

# Adversarial payload regression suite only
python -m pytest tests/security/test_payloads.py -v

# Quick check of the 100% detection rate
python -m pytest tests/security/test_payloads.py::TestCoverageReport -v
```

## Project Structure

```
PromptWAF/
├── app/
│   ├── main.py                  # FastAPI app, middleware, /health, /metrics
│   ├── api/v1/
│   │   ├── proxy.py             # Main WAF proxy route (/v1/chat/completions)
│   │   └── keys.py              # API key generation
│   ├── core/
│   │   ├── config.py            # All WAF configuration & regex patterns
│   │   ├── security.py          # Auth, rate limiting (Redis-backed)
│   │   ├── logging_config.py    # Structured JSON logging
│   │   └── metrics.py           # Prometheus metrics collector
│   ├── services/
│   │   ├── waf_engine.py        # Multi-layer detection engine
│   │   ├── normalizer.py        # Input de-obfuscation (NFKC, Base64, hex)
│   │   ├── output_scanner.py    # Streaming leakage detection
│   │   └── openai_client.py     # Hardened upstream proxy
│   └── db/
│       ├── models.py            # SQLAlchemy models
│       └── session.py           # Database session
├── tests/
│   ├── test_waf_security.py     # Unit tests (37 tests)
│   └── security/
│       ├── adversarial_payloads.json  # 30 adversarial test payloads
│       └── test_payloads.py     # Regression suite (62 tests)
├── Dockerfile                   # Multi-stage production build
├── docker-compose.yml           # Full stack (WAF + Redis + Prometheus)
├── prometheus.yml               # Prometheus scrape config
├── .env.example                 # Environment variable template
└── requirements.txt             # Python dependencies
```

## License

GNU General Public License v3 © PromptWAF contributors
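
## Appendix: Input Normalization Sketch

The input normalization layer listed under Security Layers (NFKC folding, zero-width character stripping, Base64 decoding) could look roughly like the sketch below. It is an illustration under assumptions, not the code in `app/services/normalizer.py`: the names `normalize_prompt`, `ZERO_WIDTH`, and `BASE64_RUN`, the set of stripped characters, and the 16-character minimum for Base64 candidates are all hypothetical. Hex decoding is omitted for brevity.

```python
import base64
import binascii
import re
import unicodedata
from typing import Optional

# Assumed set of invisible characters to strip; the real list may differ.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

# Hypothetical heuristic: runs of 16+ Base64-alphabet characters are decode candidates.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def _try_b64(token: str) -> Optional[str]:
    """Return the decoded text if token is valid Base64 for printable UTF-8."""
    try:
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    return decoded if decoded.isprintable() else None


def normalize_prompt(text: str) -> str:
    """Fold look-alike characters, drop invisible ones, append decoded payloads."""
    # NFKC maps compatibility forms (for example fullwidth letters) to ASCII.
    text = unicodedata.normalize("NFKC", text).translate(ZERO_WIDTH)
    decoded = [d for d in map(_try_b64, BASE64_RUN.findall(text)) if d]
    # Keep the original text and append decoded payloads so that the
    # downstream regex and similarity layers can inspect both forms.
    return " ".join([text, *decoded])
```

Appending decoded payloads instead of replacing the Base64 run is a design choice in this sketch: an ordinary long token that happens to decode cleanly is never lost, and the later layers still get to see the hidden text.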

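
## Appendix: Sliding-Window Leakage Scan Sketch

The output scanner is described as a sliding-window buffer over the SSE stream that compares response text against `PROTECTED_SYSTEM_PROMPT`. A minimal sketch of that idea follows. The class name `LeakageScanner`, the window size of twice the prompt length, and the use of character 5-gram containment as the similarity measure are assumptions made for illustration; the project's actual metric and buffer policy may differ.

```python
from typing import Set


def shingles(text: str, n: int = 5) -> Set[str]:
    """Lowercase, collapse whitespace, and return the set of character n-grams."""
    text = " ".join(text.lower().split())
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}


class LeakageScanner:
    """Tracks the tail of a streamed response and flags system prompt leakage."""

    def __init__(self, protected_prompt: str, threshold: float = 0.7, n: int = 5):
        self.enabled = bool(protected_prompt.strip())
        self.target = shingles(protected_prompt, n)
        self.threshold = threshold
        self.n = n
        # Assumed window: twice the prompt length, so a leak split across
        # chunk boundaries still fits entirely inside the buffer.
        self.window = 2 * len(protected_prompt)
        self.buffer = ""

    def feed(self, chunk: str) -> bool:
        """Add one streamed chunk; return True if the buffer looks like a leak."""
        if not self.enabled:
            return False
        self.buffer = (self.buffer + chunk)[-self.window:]
        seen = shingles(self.buffer, self.n)
        # Containment: fraction of the protected prompt's n-grams in the buffer.
        overlap = len(self.target & seen) / len(self.target)
        return overlap >= self.threshold
```

In use, the proxy would call `feed()` on each chunk's text delta and stop relaying the stream once it returns `True`. Because the buffer is bounded, the per-chunk cost stays proportional to the prompt length rather than to the length of the whole response.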