khasinski/ai_bouncer

GitHub: khasinski/ai_bouncer

AI-powered HTTP attack detection middleware for Rails. It uses ML embeddings to identify SQL injection, XSS, and other web attacks in real time.

Stars: 10 | Forks: 1

# AiBouncer

[![CI](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/1a964098e7220227.svg)](https://github.com/khasinski/ai_bouncer/actions/workflows/ci.yml) [![Gem Version](https://badge.fury.io/rb/ai_bouncer.svg)](https://badge.fury.io/rb/ai_bouncer)

AI-powered HTTP request classifier for Ruby on Rails. Uses ML embeddings to detect credential stuffing, SQL injection, XSS, and other attacks.

## Features

- **Fast**: ~2ms inference time (memory mode)
- **Lightweight**: ~32MB total model size
- **Accurate**: 92%+ detection rate on common attacks
- **Flexible storage**: in-memory or PostgreSQL + pgvector
- **Easy integration**: drop-in middleware or controller concern
- **Configurable**: protect specific paths, customize responses

## Attack Types Detected

- SQL injection (SQLi)
- Cross-site scripting (XSS)
- Path traversal
- Command injection
- Credential stuffing
- Spam bots
- Vulnerability scanners
- Server-side request forgery (SSRF)
- XML external entity injection (XXE)
- NoSQL injection
- Server-side template injection (SSTI)
- Log4Shell (JNDI injection)
- Open redirect
- LDAP injection

## Requirements

- Ruby >= 3.2 (required by onnxruntime)
- Rails 6.1+ (optional, for middleware/concern integration)

## Installation

Add to your Gemfile:

```
gem 'ai_bouncer'

# Optional: for database storage mode
gem 'neighbor'
```

Then run the installer:

```
bundle install
rails generate ai_bouncer:install
```

This creates `config/initializers/ai_bouncer.rb`. Model files (~32MB) are **downloaded automatically** on the first request.

### Manual Download (Optional)

If you prefer to bundle the model files with your application:

```
# Download from HuggingFace
pip install huggingface_hub
huggingface-cli download khasinski/ai-bouncer --local-dir vendor/ai_bouncer

# Disable auto-download in the initializer
config.auto_download = false
```

## Storage Modes

### Memory Mode (Default)

Vectors are kept in memory. Fast and simple.

```
config.storage = :memory
```

**Pros**: ~2ms latency, no database required

**Cons**: ~32MB RAM usage, patterns are fixed at deploy time

### Database Mode

Vectors are stored in PostgreSQL using pgvector.

```
config.storage = :database
```

**Pros**: scalable, custom patterns can be added at runtime, persistent

**Cons**: ~5ms latency, requires pgvector

#### Database Setup

1. Install pgvector: https://github.com/pgvector/pgvector

2. Generate and run the migration:

```
rails generate ai_bouncer:migration
rails db:migrate
```

3. Seed the built-in patterns:

```
rails ai_bouncer:seed
```

4. Verify:

```
rails ai_bouncer:stats
```

## Configuration

```
# config/initializers/ai_bouncer.rb
AiBouncer.configure do |config|
  config.enabled = Rails.env.production?
  config.storage = :memory # or :database

  # Paths to protect (for middleware)
  config.protected_paths = [
    "/login",
    "/register",
    "/api/*",
  ]

  # Action when attack detected
  config.action = :block # :block, :challenge, or :log
  config.threshold = 0.3

  # Model files location
  config.model_path = Rails.root.join("vendor", "ai_bouncer")

  # Callback for monitoring
  config.on_attack_detected = ->(request:, classification:, action:) {
    Rails.logger.warn "Attack: #{classification[:label]} from #{request.ip}"
  }
end
```

## Usage

### Option 1: Middleware (Automatic)

The middleware protects the configured paths automatically. It extracts the method, path, body, user agent, and params from the Rails request, so no manual formatting is needed:

```
# A request like this:
# POST /login HTTP/1.1
# User-Agent: Mozilla/5.0...
# Content-Type: application/x-www-form-urlencoded
#
# username=admin'--&password=x

# is automatically classified as:
# => { label: "sqli", confidence: 0.94, is_attack: true }
```

### Option 2: Controller Concern (Fine-Grained)

For more control, use the controller concern:

```
class SessionsController < ApplicationController
  include AiBouncer::ControllerConcern

  # Protect all actions
  protect_from_attacks

  # Or protect specific actions with custom options
  protect_from_attacks only: [:create], threshold: 0.5, action: :block
end
```

Or check manually:

```
class PaymentsController < ApplicationController
  include AiBouncer::ControllerConcern

  def create
    check_for_attack # Blocks if attack detected

    # Normal flow continues...
  end
end
```

### Option 3: Manual Classification

```
result = AiBouncer.classify(
  AiBouncer.request_to_text(
    method: "POST",
    path: "/login",
    body: "username=admin'--&password=x",
    user_agent: "python-requests/2.28"
  )
)

result
# => {
#   label: "sqli",
#   confidence: 0.94,
#   is_attack: true,
#   latency_ms: 2.1
# }
```

## Adding Custom Patterns (Database Mode)

```
# Add a pattern for a specific attack you have seen
embedding = AiBouncer.model.embed("POST /admin.php?cmd=wget...")

AiBouncer::AttackPattern.create!(
  label: "scanner",
  severity: "high",
  embedding: embedding,
  sample_text: "POST /admin.php?cmd=wget...",
  source: "incident_2024_01"
)
```

## Rake Tasks

```
# Download model files manually (auto-download is enabled by default)
rails ai_bouncer:download

# Seed the built-in patterns into the database (database mode only)
rails ai_bouncer:seed

# Show statistics
rails ai_bouncer:stats

# Test classification
rails ai_bouncer:test

# Benchmark performance
rails ai_bouncer:benchmark
```

## Real-World Examples

### SQL Injection

```
# Authentication bypass
AiBouncer.classify("POST /login username=admin' OR '1'='1 password=x")
# => { label: "sqli", confidence: 0.94, is_attack: true }

# UNION-based data extraction
AiBouncer.classify("GET /users?id=1 UNION SELECT username,password FROM users--")
# => { label: "sqli", confidence: 0.96, is_attack: true }

# Blind SQL injection
AiBouncer.classify("GET /products?id=1 AND SLEEP(5)")
# => { label: "sqli", confidence: 0.91, is_attack: true }
```

### Cross-Site Scripting (XSS)

```
# Script injection in a comment
AiBouncer.classify("POST /comments body=")
# => { label: "xss", confidence: 0.96, is_attack: true }

# Event handler injection
AiBouncer.classify("POST /profile bio=")
# => { label: "xss", confidence: 0.93, is_attack: true }

# SVG-based XSS
AiBouncer.classify("POST /upload filename=
```

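The `protected_paths` setting in the configuration mixes exact paths (`"/login"`) with a wildcard entry (`"/api/*"`). The README does not say how the wildcard is matched, so the following is only a minimal sketch under the assumption that `*` matches any run of characters. `pattern_to_regexp` and `protected_path?` are hypothetical helpers for illustration, not part of AiBouncer's API.

```ruby
# Assumption: "*" in a protected path matches any sequence of characters.
# These helpers are illustrative only and are not provided by the gem.
PROTECTED_PATHS = ["/login", "/register", "/api/*"].freeze

# Turn a path pattern into an anchored regular expression.
def pattern_to_regexp(pattern)
  escaped = Regexp.escape(pattern).gsub("\\*", ".*")
  Regexp.new("\\A#{escaped}\\z")
end

# True when the request path matches at least one configured pattern.
def protected_path?(path, patterns = PROTECTED_PATHS)
  patterns.any? { |pattern| pattern_to_regexp(pattern).match?(path) }
end

protected_path?("/login")        # => true
protected_path?("/api/v1/users") # => true
protected_path?("/about")        # => false
```

Anchoring the expression with `\A` and `\z` keeps `"/login"` from also matching `"/login/extra"`; whether the real middleware behaves this way should be checked against the gem itself.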
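The classification results shown in this README carry a `label`, a `confidence`, and an `is_attack` flag, and the configuration exposes a `threshold`. The README does not describe how these values are computed. Given the pgvector and `neighbor` dependencies, one plausible reading is a nearest-neighbor lookup over embeddings; the sketch below illustrates that idea with toy three-dimensional vectors and cosine similarity. Everything here (`PATTERNS`, `cosine_similarity`, `classify_embedding`, the `"clean"` label, and the way the threshold is applied) is an assumption for illustration, not the gem's implementation.

```ruby
# Toy reference vectors. Real embeddings would come from the model and
# have many more dimensions; the "clean" label is an assumption.
PATTERNS = [
  { label: "sqli",  embedding: [0.9, 0.1, 0.0] },
  { label: "xss",   embedding: [0.1, 0.9, 0.0] },
  { label: "clean", embedding: [0.0, 0.1, 0.9] }
].freeze

# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Pick the most similar reference pattern and flag it as an attack when
# it is not "clean" and its similarity reaches the threshold (assumed rule).
def classify_embedding(embedding, patterns: PATTERNS, threshold: 0.3)
  best = patterns.max_by { |pattern| cosine_similarity(embedding, pattern[:embedding]) }
  confidence = cosine_similarity(embedding, best[:embedding])
  is_attack = best[:label] != "clean" && confidence >= threshold
  { label: best[:label], confidence: confidence.round(2), is_attack: is_attack }
end

result = classify_embedding([0.8, 0.2, 0.0])
result[:label]     # => "sqli"
result[:is_attack] # => true
```

In database mode the linear scan over `PATTERNS` would presumably be replaced by a pgvector similarity query, which is what makes adding patterns at runtime practical.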
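Tags: Apex, CISA project, credential stuffing, DOE collaboration, Gem, HTTP request analysis, ONNX, RCE, Ruby on Rails, SSRF, WAF, web security, XSS, middleware, artificial intelligence, memory dump, vector database, application firewall, anomaly detection, bulk scanning, attack detection, machine learning, test cases, vulnerability intelligence, user-mode hook bypass, cybersecurity, blue team analysis, privacy protection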