Carlos-Projects/agentgate

GitHub: Carlos-Projects/agentgate

A policy-based middleware that detects, evaluates, and applies graduated controls to how AI agents and automated crawlers access websites.

Stars: 1 | Forks: 0

# AgentGate 🔥

[![CI](https://img.shields.io/github/actions/workflow/status/Carlos-Projects/agentgate/ci.yml?branch=main&logo=github)](https://github.com/Carlos-Projects/agentgate/actions) [![npm version](https://img.shields.io/npm/v/agentgate-firewall?logo=npm)](https://www.npmjs.com/package/agentgate-firewall) [![TypeScript](https://img.shields.io/badge/types-TypeScript-blue?logo=typescript)](https://www.typescriptlang.org) [![License](https://img.shields.io/github/license/Carlos-Projects/agentgate?logo=opensourceinitiative)](LICENSE) [![GitHub stars](https://img.shields.io/github/stars/Carlos-Projects/agentgate?style=social)](https://github.com/Carlos-Projects/agentgate) [![Star History](https://img.shields.io/badge/Star-History-blue?style=social)](https://api.star-history.com/svg?repos=Carlos-Projects/agentgate&type=Date)

**Policy-based firewall and honeypot middleware for controlling how AI agents access your website**

AgentGate provides a programmable boundary for controlling how AI agents, crawlers, and automated systems access your web content. It detects automated traffic, scores risk, enforces policy, and provides observability, all without expensive infrastructure.

**Use case:** You run a website. AI agents are scraping it, browsing it, and interacting with it. You want to allow well-behaved agents, sandbox suspicious ones, and block malicious automation, without degrading the experience for human visitors.

## What makes AgentGate different

| Feature | What it does | Why it matters |
|---|---|---|
| **Multi-signal detection** | Combines user-agent, headers, behavior, and rate limiting | Avoids a single point of failure that can be bypassed |
| **Graduated response** | allow → limited → challenge → sandbox → block | Responds in proportion to risk |
| **Native framework support** | Adapters for Next.js, Express, Cloudflare Workers | Plugs into your existing stack |
| **Privacy first** | Hashes IPs by default (GDPR-friendly) | No PII stored by default |

## Features

### Core protection

- **Multi-signal detection**: combines user-agent, headers, behavioral analysis, and rate limiting
- **Policy-driven control**: YAML-based configuration for approving or denying tasks
- **Risk scoring**: a 0-100 score based on configurable signal weights
- **Graduated response**: allow, limited, challenge, sandbox, block
- **True rate limiting**: sliding window with multi-key checks (IP, path, session)
- **Session tracking**: behavioral analysis with a fingerprint fallback

### Advanced features

- **Honeypot system**: static and dynamic trap URLs for bot detection
- **Privacy first**: hashes IPs by default (GDPR-friendly)
- **Webhook notifications**: real-time alerts for critical events
- **JSONL logging**: a portable, queryable audit trail
- **Dashboard**: built-in analytics with authentication
- **Framework adapters**: Next.js, Express, Cloudflare Workers

## Quick start

### Installation

```
npm install agentgate-firewall
```

### Basic setup (Next.js)

1. Create `agent-policy.yaml` in your project root:

```
mode: log_only

defaults:
  action: allow
  expose_debug_headers: true

privacy:
  hash_ip: true
  log_raw_ip: false

rate_limit:
  enabled: true
  store: memory  # Use "redis" for production
  failure_mode: open
  rules:
    default:
      window_ms: 60000
      max_requests: 60
      action: limited

session:
  enabled: true
  ttl_ms: 1800000
  fallback_ttl_ms: 600000
  cookie_name: "agentgate_sid"
  cookie_secure: false  # Set true in production
  track_paths: true
  max_paths: 50

dashboard:
  enabled: true
  require_auth: true

known_ai_agents:
  - GPTBot
  - ClaudeBot
  - PerplexityBot
```

2. Add the middleware:

```
// middleware.ts
import { NextRequest, NextResponse } from 'next/server'
import { createAgentGate, loadPolicy, createJsonlLogger } from 'agentgate-firewall'

const policy = loadPolicy('./agent-policy.yaml')
const agentGate = createAgentGate({
  policy,
  logger: createJsonlLogger(),
})

export async function middleware(request: NextRequest) {
  const result = await agentGate.processRequest({
    ip: request.headers.get('x-forwarded-for') || 'unknown',
    path: request.nextUrl.pathname,
    method: request.method,
    userAgent: request.headers.get('user-agent') || '',
    cookies: Object.fromEntries(request.cookies),
    headers: Object.fromEntries(request.headers),
  })

  if (result.action === 'block') {
    return new NextResponse('Blocked', { status: 403 })
  }
  if (result.redirectPath) {
    // Next.js middleware redirects require an absolute URL
    return NextResponse.redirect(new URL(result.redirectPath, request.url))
  }
  return NextResponse.next()
}
```

3. Run your Next.js app and visit `/agentgate-dashboard` to view the analytics.

## Configuration

### Policy format

See [config/agent-policy.example.yaml](config/agent-policy.example.yaml) for all options.

```
# Mode: log_only (observe) or enforce (intercept)
mode: log_only

# Privacy settings (GDPR-friendly)
privacy:
  hash_ip: true
  log_raw_ip: false

# Rate limiting
rate_limit:
  enabled: true
  store: memory  # or "redis" (requires @upstash/redis)
  failure_mode: open  # open | challenge | block
  rules:
    default:
      window_ms: 60000
      max_requests: 60
      action: limited
    suspected_agent:
      window_ms: 60000
      max_requests: 20
      action: sandbox
    honeypot_hit:
      window_ms: 60000
      max_requests: 1
      action: block
  paths:
    "/api/*":
      window_ms: 60000
      max_requests: 20
      action: challenge

# Session tracking
session:
  enabled: true
  ttl_ms: 1800000  # 30 min for cookie sessions
  fallback_ttl_ms: 600000  # 10 min for fingerprint fallback
  cookie_name: "agentgate_sid"
  cookie_secure: true
  cookie_same_site: "Lax"
  track_paths: true
  max_paths: 50

# Dashboard authentication
dashboard:
  enabled: true
  require_auth: true

# Webhook notifications
webhooks:
  enabled: true
  targets:
    - name: "security-alerts"
      url: "${AGENTGATE_WEBHOOK_URL}"
      events:
        - "honeypot_hit"
        - "critical_score"
        - "blocked"
      secret: "${AGENTGATE_WEBHOOK_SECRET}"
      timeout_ms: 3000

# Scoring configuration
scoring:
  weights:
    known_ai_user_agent: 25
    honeypot_hit: 50
    high_request_rate: 20
  thresholds:
    allow: 0
    limited: 30
    challenge: 55
    sandbox: 70
    block: 90
```

## Actions

| Action | Description |
|--------|-------------|
| `allow` | Normal access |
| `limited` | Access with restrictions/headers |
| `challenge` | Redirect to a declaration page |
| `sandbox` | Redirect to a controlled environment |
| `block` | Return 403 |
| `log_only` | Log only, without intervening |

## Rate limiting

AgentGate implements **true sliding-window** rate limiting with multi-key checks:

### Multi-key strategy

- `ip:{hash}` - global per-IP limit
- `ip_path:{ip}:{path}` - per-path limit
- `session:{id}` - per-session limit
- `ua:{hash}` - per-User-Agent limit

### Storage options

**Memory (development)**

```
rate_limit:
  enabled: true
  store: memory
```

⚠️ **Warning**: The memory store is for development and demos only. It is not suitable for high-concurrency production workloads.

**Redis (production)**

```
rate_limit:
  enabled: true
  store: redis

# Set the environment variables:
# AGENTGATE_REDIS_URL=your-redis-url
# AGENTGATE_REDIS_TOKEN=your-redis-token
```

Install the Redis adapter:

```
npm install @upstash/redis
```

### Failure modes

```
rate_limit:
  failure_mode: open  # or "challenge" or "block"
```

- `open`: allow the request if the store fails (development-friendly)
- `challenge`: require a challenge if the store fails (recommended for production)
- `block`: block the request if the store fails (maximum security)

**Default**: `open` in development, `challenge` in production

## Session tracking

AgentGate tracks user sessions to detect behavioral patterns:

### Features

- **Cookie-based sessions**: 30-minute TTL (configurable)
- **Fingerprint fallback**: 10-minute TTL for users without cookies
- **Path tracking**: detects repetitive patterns (for example /product/1, /product/2...)
- **Cumulative scoring**: builds a risk profile over time

### Privacy

- IP addresses are hashed by default
- Raw IPs are never logged unless `log_raw_ip: true` is set
- GDPR-friendly out of the box

## Dashboard

Visit `/agentgate-dashboard` to see:

- Total requests and suspected agents
- Score distribution (low/medium/high/critical)
- Actions taken
- Top user agents and paths
- Honeypot hits
- Recent events

### Authentication

**Development**:

```
# Access via query param
/agentgate-dashboard?token=your-token
```

**Production**:

```
# Set the environment variable
export AGENTGATE_DASHBOARD_TOKEN=your-secure-token

# Access via Bearer header
curl -H "Authorization: Bearer your-token" \
  https://yoursite.com/agentgate-dashboard
```

If `AGENTGATE_DASHBOARD_TOKEN` is not set, the dashboard returns 503 in production.

## Webhooks

Real-time notifications for critical events:

```
webhooks:
  enabled: true
  targets:
    - name: "slack-security"
      url: "https://hooks.slack.com/..."
      events:
        - "honeypot_hit"
        - "critical_score"
        - "blocked"
      secret: "your-webhook-secret"  # For HMAC-SHA256 signing
      timeout_ms: 3000
```

### Events

- `honeypot_hit`: a bot visited a honeypot URL
- `critical_score`: score >= 90
- `blocked`: a request was blocked
- `rate_limit_exceeded`: a rate limit was triggered
- `session_violation`: a session pattern violation was detected

### Security

Webhooks are signed with HMAC-SHA256 using the Web Crypto API (Edge-compatible).

## Framework adapters

### Next.js

```
import type { NextRequest } from 'next/server'
import { createAgentGate, loadPolicy } from 'agentgate-firewall'

const agentGate = createAgentGate({ policy: loadPolicy('./agent-policy.yaml') })

// normalizeNextRequest and handleNextMiddleware are the Next.js adapter helpers
export async function middleware(request: NextRequest) {
  const result = await agentGate.processRequest(normalizeNextRequest(request))
  return handleNextMiddleware(request, result)
}
```

### Express

```
import { createAgentGate, createExpressMiddleware } from 'agentgate-firewall'

// `policy` is a loaded policy object and `app` is your Express application
const agentGate = createAgentGate({ policy })

app.use(createExpressMiddleware(async (req) => {
  return await agentGate.processRequest(normalizeExpressRequest(req))
}))
```

### Cloudflare Workers

```
import { handleCloudflareRequest, createAgentGateForCloudflare } from 'agentgate-firewall'

export default {
  async fetch(request: Request, env: CloudflareEnv, ctx: ExecutionContext) {
    const agentGate = await createAgentGateForCloudflare(env)
    return handleCloudflareRequest(request, env, ctx, agentGate)
  }
}
```

## Loggers

AgentGate supports multiple loggers:

```
import { createJsonlLogger, createConsoleLogger } from 'agentgate-firewall'

// JSONL (production)
const jsonlLogger = createJsonlLogger({ filePath: './logs.jsonl' })

// Console (development)
const consoleLogger = createConsoleLogger({ colors: true, verbose: true })
```

Log entry format:

```
{
  "timestamp": "2026-05-25T10:00:00Z",
  "ip": "a1b2c3d4...",          // Hashed by default
  "ipRaw": "192.168.1.1",       // Only if log_raw_ip: true
  "path": "/pricing",
  "userAgent": "GPTBot/1.0",
  "score": 72,
  "action": "sandbox",
  "signals": ["known_ai_user_agent", "high_request_rate"]
}
```

## Security considerations

### Trusted proxies

When running behind a reverse proxy (Cloudflare, AWS ALB, Nginx, and so on), never trust `X-Forwarded-For` headers without configuring trusted proxies:

```
import { extractClientIP } from 'agentgate-firewall'

// ❌ Insecure: trusts any incoming header
const untrustedIp = extractClientIP(request.headers)

// ✅ Secure: only trusts headers from known proxies
const clientIp = extractClientIP(request.headers, ['203.0.113.1', '198.51.100.1'])
```

**Recommendation**: when not behind a proxy, use `socket.remoteAddress` directly.

### Webhook SSRF protection

Webhooks enforce:

- **HTTPS only**: HTTP targets are rejected
- **Private IP blocking**: blocks 10.x, 172.16-31.x, 192.168.x, and localhost
- **DNS resolution validation**: rejects hostnames that resolve to private IPs
- **Configurable timeout**: 5 seconds by default; set the maximum with `timeout_ms`
- **TLS verification**: `rejectUnauthorized: true` by default

### Session security

- IPs are **SHA-256 hashed** before storage (private by default)
- Session IDs are **cryptographically random** (`crypto.randomUUID()`)
- Sessions expire after a configurable TTL (30 minutes by default)
- Cookie attributes: `SameSite=Lax`, with a configurable `Secure` flag

### Content Security Policy

AgentGate automatically injects security headers into responses:

| Action | Headers |
|--------|---------|
| **block** | `Content-Security-Policy: default-src 'none'`, `X-Frame-Options: DENY`, `X-Content-Type-Options: nosniff` |
| **challenge/sandbox** | `X-Frame-Options: DENY`, `X-Content-Type-Options: nosniff` |

### Privacy

- **IP hashing**: enabled by default (`hash_ip: true`). Uses HMAC-SHA256 with a random per-process salt.
- **Raw IP logging**: disabled by default (`log_raw_ip: false`). Enable it only when compliance requires it.
- **No PII**: AgentGate does not log cookies, authorization headers, or request bodies.
- **GDPR-friendly**: the out-of-the-box configuration follows the data minimization principle.

### Configuration security

- **YAML/JSON policies**: both formats are supported. For production, use JSON with `loadPolicyFromJson()` to reduce the attack surface. Both run `stripProto()` to prevent prototype pollution.
- **Environment variables**: sensitive settings (Redis token, webhook secret, dashboard token) should come from environment variables rather than being inlined in the policy file.
- **Log file paths**: relative paths containing `..` are rejected to prevent path traversal.

### Production security checklist

- [ ] Configure **trusted proxies** in `extractClientIP()`
- [ ] Use **Redis** for rate limiting (`store: redis`)
- [ ] Set the `AGENTGATE_DASHBOARD_TOKEN` environment variable
- [ ] Enable **webhooks** with HTTPS targets and an HMAC secret
- [ ] Set `cookie_secure: true` (requires HTTPS)
- [ ] Disable `expose_debug_headers` (set it to `false`)
- [ ] Use `failure_mode: challenge` (not `open`)
- [ ] Use `loadPolicyFromJson()` to reduce the attack surface
- [ ] Run in **`log_only` mode** for 1-2 weeks before enforcing
- [ ] Review the audit log for `[AgentGate Audit]` warnings (block/sandbox/challenge events)

## Philosophy

- **No perfect detection**: we use composable signals, not fingerprinting
- **Policy-driven**: site owners declare which tasks they accept
- **Graduated response**: not all bots are the same; the response scales with risk
- **Observability**: everything is logged for analysis
- **Human-friendly**: never impact real users

## Production deployment

### Environment variables

```
# Redis (optional, for production rate limiting)
AGENTGATE_REDIS_URL=your-redis-url
AGENTGATE_REDIS_TOKEN=your-redis-token

# Dashboard authentication
AGENTGATE_DASHBOARD_TOKEN=your-secure-token

# Webhooks (optional)
AGENTGATE_WEBHOOK_URL=https://your-webhook-endpoint.com
AGENTGATE_WEBHOOK_SECRET=your-signing-secret
```

### Recommendations

1. **Start in `log_only` mode** for 1-2 weeks
2. **Review the dashboard** to understand traffic patterns
3. **Enable Redis** for production rate limiting
4. **Set `cookie_secure: true`** in production
5. **Disable debug headers** in production
6. **Configure webhooks** to receive security alerts
7. **Use `failure_mode: challenge`** in production

## Roadmap

### Phase 2 (current) ✅

- [x] True rate limiting (sliding window)
- [x] Session tracking
- [x] Dashboard authentication
- [x] Cloudflare Workers adapter
- [x] Webhook notifications
- [x] Privacy-first logging

### Phase 3 (future)

- [ ] IP reputation provider interface
- [ ] Qwen task classifier
- [ ] Canary tokens
- [ ] CLI tool
- [ ] SQLite logger adapter

### Not planned (for now)

- SaaS dashboard
- Billing/monetization
- ML anomaly detection
- Browser fingerprinting

## Ecosystem

AgentGate is part of the **Carlos-Projects** security infrastructure for AI agents:

```
Palisade Scanner → Scan content before agents consume it.
MCPwn            → Attack MCP servers before attackers do.
AgentGate        → Control how agents access your website.  ← you are here
MCPscop          → Centralize scanner results and security posture.
MCPGuard         → Runtime security proxy for MCP/A2A protocols.
```

- [Palisade Scanner](https://github.com/Carlos-Projects/palisade-scanner): scans web content for prompt injection and adversarial content
- [MCPwn](https://github.com/Carlos-Projects/mcpwn): offensive security testing framework for MCP servers
- [MCPscop](https://github.com/Carlos-Projects/mcpscope): unified security dashboard for MCP/A2A scanner results
- [MCPGuard](https://github.com/Carlos-Projects/mcpguard): runtime security proxy for MCP/A2A protocols

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development guidelines.

## Security

Found a vulnerability? See [SECURITY.md](SECURITY.md).

## License

MIT. See [LICENSE](LICENSE).
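
The `scoring` block in the policy format pairs signal weights with per-action thresholds. As a rough illustration of how a 0-100 score could be turned into an action, here is a minimal sketch. It assumes that the weights of triggered signals are simply summed and capped at 100, and that the highest threshold not exceeding the score wins. `scoreRequest` and `actionForScore` are hypothetical names for this example, not part of the AgentGate API, and the real scoring engine may combine signals differently.

```typescript
type Action = 'allow' | 'limited' | 'challenge' | 'sandbox' | 'block'

// Weights copied from the example policy; how they combine is an assumption.
const weights: Record<string, number> = {
  known_ai_user_agent: 25,
  honeypot_hit: 50,
  high_request_rate: 20,
}

// Thresholds from the example policy, ordered from strictest to most lenient.
const thresholds: Array<[Action, number]> = [
  ['block', 90],
  ['sandbox', 70],
  ['challenge', 55],
  ['limited', 30],
  ['allow', 0],
]

// Hypothetical helper: sum the weights of triggered signals, capped at 100.
function scoreRequest(signals: string[]): number {
  const total = signals.reduce((sum, signal) => sum + (weights[signal] ?? 0), 0)
  return Math.min(100, Math.max(0, total))
}

// Hypothetical helper: pick the strictest action whose threshold is met.
function actionForScore(score: number): Action {
  for (const [action, minScore] of thresholds) {
    if (score >= minScore) return action
  }
  return 'allow'
}
```

Under these assumptions, a request that hits a honeypot while identifying as a known AI user agent scores 75 and lands in `sandbox`; adding a high request rate pushes it to 95 and `block`.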

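The rate limiting section describes a sliding window checked against several keys at once (`ip:{hash}`, `session:{id}`, and so on). The sketch below shows one way such a check could look with an in-memory store. It assumes a request is allowed only when every key is under the limit, and that denied requests are not counted. `SlidingWindowLimiter` and its `check` method are illustrative names, not AgentGate's implementation.

```typescript
interface Rule {
  windowMs: number
  maxRequests: number
}

// Illustrative in-memory limiter; not suitable for multi-instance deployments.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>()
  private rule: Rule

  constructor(rule: Rule) {
    this.rule = rule
  }

  // Returns true and records the hit when all (distinct) keys are under the limit.
  check(keys: string[], now: number = Date.now()): boolean {
    const cutoff = now - this.rule.windowMs

    // Drop timestamps that have slid out of the window for each key.
    const recentPerKey = keys.map((key) => {
      const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff)
      this.hits.set(key, recent)
      return recent
    })

    // Any single key at its limit denies the whole request.
    if (recentPerKey.some((recent) => recent.length >= this.rule.maxRequests)) {
      return false
    }

    recentPerKey.forEach((recent) => recent.push(now))
    return true
  }
}

// Usage: check the IP key and the session key together.
const exampleLimiter = new SlidingWindowLimiter({ windowMs: 60000, maxRequests: 60 })
const exampleAllowed = exampleLimiter.check(['ip:a1b2c3', 'session:abc'])
```

Storing timestamps per key keeps the window exact, at the cost of memory proportional to `maxRequests` per key; that trade-off is one reason a shared store such as Redis is preferred in production.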
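Tags: AI agents, AppImage, CISA projects, macOS forensics, TypeScript, web application firewall, middleware, security plugins, search engine queries, traffic control, developer tools, cybersecurity, automated attacks, honeypots, certificate exploitation, configuration auditing, privacy protection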