SergioYin/agent-instruction-guard

GitHub: SergioYin/agent-instruction-guard

一款零依赖的 Python CLI 工具，专为扫描 AI 编程代理指令文件中的风险命令、提示注入模式和意外泄露密钥而设计。

Stars: 0 | Forks: 0

# Agent 指令防护在 Codex、Claude Code、Copilot、Cursor、Gemini 或其他编程代理使用 AI 代理指令文件之前对其进行扫描。 `agent-instruction-guard` 是一个小型的零依赖 Python CLI 工具，用于在 `AGENTS.md`、`CLAUDE.md`、`GEMINI.md`、`.cursorrules`、`.windsurfrules` 和 GitHub Copilot 指令/代理文件等文件中查找风险模式。它专为代码库维护者设计，旨在对指令层进行快速的防御性审查，而非提供一个重量级的安全平台。 ## 为什么需要它 AI 编程代理越来越依赖代码库级别的自然语言指令。这些文件可以提高交付质量，但也可能包含不安全的命令、意外泄露的密钥、隐藏的指令或提示注入风格的内容。该工具让团队在将代码库交给自主代理之前，能够进行简单的本地检查。 ## 检测内容 - 远程下载执行模式，例如 `curl ... | bash`。 - 破坏性命令模式，例如 `sudo rm -rf`。 - 涉及 `.env`、SSH 密钥、token 或密码的潜在凭证窃取指令。 - 绕过安全控制、审批、护栏或沙箱的请求。 - 隐藏/编码的指令提示，例如 `base64 -d`、`eval(` 或 “不要告诉用户”。 - 代理可读指令文件中疑似秘密材料的内容，并附有脱敏摘录。该扫描器有意设计得较为保守：发现项意味着“请审查此行”，而不是“此代码库是恶意的”。 ## 安装从源码安装： ``` git clone https://github.com/SergioYin/agent-instruction-guard.git cd agent-instruction-guard python -m pip install -e . ``` 不进行安装，直接作为模块运行： ``` python -m agent_instruction_guard --help ``` ## 用法扫描当前代码库： ``` agent-instruction-guard . ``` 仅在遇到高严重性发现项时失败，这是默认行为： ``` agent-instruction-guard . --fail-on high ``` 为下游工具输出 JSON： ``` agent-instruction-guard . --format json ``` 选择策略配置文件： ``` agent-instruction-guard . --profile lenient agent-instruction-guard . --profile default agent-instruction-guard . --profile strict ``` 配置文件允许团队在不修改扫描器规则的情况下调整严格程度： - `lenient`：通过降低选定的非关键审查发现项的严重性来减少干扰。 - `default`：保留标准扫描器行为。 - `strict`：针对加固环境，提升存在风险的模糊指令和范围越界发现项的严重性。 JSON 输出包含所选的 `profile`。严重性发生变化的发现项也会包含 `original_severity`。比较同一次扫描中所有的内置配置文件： ``` agent-instruction-guard . --compare-profiles --format json --fail-on none ``` 比较报告在 `profiles.lenient`、`profiles.default` 和 `profiles.strict` 下包含每个配置文件的一份标准 JSON 报告。该命令将相同的代码库扫描、可选配置和可选基线应用于每个配置文件，使得严重性差异变得确定且易于比较。使用代码库本地规则覆盖： ``` agent-instruction-guard . --config agent-instruction-guard.toml ``` 当未提供 `--config` 时，CLI 会在当前工作目录中查找 `agent-instruction-guard.toml`，然后查找 `.agent-instruction-guard.toml`。配置是可选的；当未找到配置文件时，扫描器行为保持不变。配置示例： ``` [rules.network_side_effect] severity = "medium" [rules.hidden_instruction] severity = "ignore" ``` 支持的严重性级别为 `low`、`medium`、`high` 和 `ignore`。`ignore` 会抑制该规则的发现项。无效的规则 ID 或严重性会快速失败并显示清晰的错误和非零退出码。JSON 输出在使用配置文件时会包含 `config_path`，严重性发生变化的发现项会包含 `original_severity`。配置是为了方便代码库维护者使用，而不是一道安全边界。降低严重性或使用 `ignore` 可能会隐藏使扫描失败的风险指令，因此对覆盖项的审查应与对代理指令文件的审查保持同等谨慎。扫描摘录在打印前仍会进行脱敏处理。示例配置覆盖的凭证已检入到 `examples/policy-override/` 目录下： ``` python -m agent_instruction_guard examples/policy-override \ --config examples/policy-override/agent-instruction-guard.toml \ --format json --fail-on none ``` 匹配的可移植测试样本是 `examples/policy-override-report.json`。它演示了针对由配置提升或降低严重性的规则的 `config_path` 和 `original_severity`。该测试样本将 `config_path` 存储为相对于代码库的路径，以便可以跨机器进行比较；实时的 CLI 输出使用绝对路径。配置文件比较的凭证已检入到 `examples/profile-comparison/` 目录下，可移植测试样本为 `examples/profile-comparison-report.json`： ``` python -m agent_instruction_guard examples/profile-comparison \ --compare-profiles --format json --fail-on none ``` 针对策略覆盖的基线/报告互操作性凭证也已检入： - `examples/policy-override-baseline.json`：从配置覆盖后的策略发现项生成的基线。 - `examples/policy-override-baselined-report.json`：显示被该基线抑制的相同配置覆盖发现项的 JSON 报告。在更改策略输出行为后重新生成测试样本： ``` python scripts/generate_policy_report_fixtures.py ``` 在文本或 JSON 报告中包含特定规则的修复指导： ``` agent-instruction-guard . --include-guidance agent-instruction-guard . --format json --explain --fail-on none ``` 为代码扫描工具输出 SARIF 2.1.0： ``` agent-instruction-guard . --format sarif --fail-on none > agent-instruction-guard.sarif ``` 包含额外的类指令文件： ``` agent-instruction-guard . --include "docs/agent-prompts/*.md" ``` 列出将被扫描的文件： ``` agent-instruction-guard . --list-files ``` 列出规则文档而不进行扫描： ``` agent-instruction-guard --list-rules agent-instruction-guard --list-rules --format json ``` 从当前发现项创建基线： ``` agent-instruction-guard . --write-baseline .agent-instruction-guard-baseline.json --fail-on none ``` 在后续扫描中使用该基线： ``` agent-instruction-guard . --baseline .agent-instruction-guard-baseline.json --fail-on high ``` 这有助于在现有代码库中逐步采用：已知发现项在基线文件中仍可供人工审查，但在团队修复期间，它们不会在每次扫描时都导致失败。新增或更改的发现项仍会出现在正常报告中，并可能根据 `--fail-on` 的设置导致命令失败。 JSON 输出在使用基线时会包含 `baseline_path`。在审查完当前发现项后刷新基线： ``` agent-instruction-guard . --write-baseline .agent-instruction-guard-baseline.json --fail-on none ``` ## 示例安全示例： ``` python -m agent_instruction_guard examples/safe ``` 预期结果： ``` Agent Instruction Guard Summary: high=0 medium=0 low=0 No risky instruction patterns found. ``` 风险示例： ``` python -m agent_instruction_guard examples/risky --fail-on high ``` 预期结果：退出码为 `2`，并发现有关隐藏/绕过指令、远程代码执行和潜在凭证暴露的发现项。 ## 支持的文件默认情况下，CLI 会扫描： - `AGENTS.md` 和嵌套的 `**/AGENTS.md` - `AGENTS.override.md` - `CLAUDE.md` - `GEMINI.md` - `.cursorrules` - `.windsurfrules` - `.github/copilot-instructions.md` - `.github/instructions/*.instructions.md` - `.github/agents/*.agent.md` 它会跳过庞大/生成的目录，例如 `.git`、`node_modules`、`dist`、`build`、虚拟环境和 Python 缓存。 ## 退出码 - `0`：扫描完成，且没有达到或超过 `--fail-on` 阈值的发现项。 - `2`：发现项达到了失败阈值。 - 被 `--baseline` 抑制的发现项不计入失败阈值。 - argparse 错误使用 Python 标准的非零 CLI 行为。 ## 本地验证 ``` python -m unittest discover -s tests -v python scripts/selfcheck.py git diff --check python -m compileall agent_instruction_guard tests scripts python -m agent_instruction_guard examples/risky --format json --fail-on none python -m agent_instruction_guard examples/risky --format json --profile strict --fail-on none python -m agent_instruction_guard examples/profile-comparison --compare-profiles --format json --fail-on none python scripts/generate_policy_report_fixtures.py python -m agent_instruction_guard examples/risky --format json --explain --fail-on none python -m agent_instruction_guard examples/risky --format sarif --fail-on none python -m agent_instruction_guard --list-rules ``` ## GitHub 代码扫描 SARIF 输出与 GitHub 代码扫描上传工作流兼容。此代码库有意不包含 `.github/workflows/*` 示例，因为创建或更新工作流文件需要具有 `workflow` 范围的 GitHub token。 ## 非目标 - 它不会执行代理指令或将其沙箱化。 - 它不能替代人工审查、依赖项扫描或密钥扫描工具。 - 它并不声称每一行被标记的内容都是恶意的。 - 它不会自动修改文件。 ## 许可证 MIT

标签：AI Agent安全, AI安全工具, AI编程助手, Claude Code, Codex, Cursor, DevSecOps, DNS 反向解析, GitHub Copilot, Homebrew安装, IP 地址批量处理, Python, Python安全, StruQ, 上游代理, 云安全监控, 凭据泄露, 命令行工具(CLI), 对抗攻击, 恶意命令扫描, 提示词注入检测, 敏感信息检测, 文档安全, 文档结构分析, 无后门, 规则文件审查, 逆向工具, 防御规避检测, 零依赖, 零日漏洞检测, 静态分析