ujjawalranjan09/AI-powered-honeypot-for-scam-detection-and-threat-intelligence-extraction

GitHub: ujjawalranjan09/AI-powered-honeypot-for-scam-detection-and-threat-intelligence-extraction

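An AI-driven honeypot system for social-engineering scams. It automatically detects scam intent, plays the role of a victim to keep scammers talking, and extracts threat intelligence from the conversation.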

Stars: 0 | Forks: 0

# 🛡️ Honeypot: AI-Powered Scam Detection System

**Version 5.3 "Neural Sentinel"**

[![Python](https://img.shields.io/badge/Python-3.10+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://python.org) [![FastAPI](https://img.shields.io/badge/FastAPI-0.100+-009688?style=for-the-badge&logo=fastapi&logoColor=white)](https://fastapi.tiangolo.com) [![ML](https://img.shields.io/badge/ML-scikit--learn-F7931E?style=for-the-badge&logo=scikit-learn&logoColor=white)](https://scikit-learn.org) [![Vercel](https://img.shields.io/badge/Deployed-Vercel-000000?style=for-the-badge&logo=vercel&logoColor=white)](https://vercel.com) [![License](https://img.shields.io/badge/License-MIT-green?style=for-the-badge)](LICENSE)

*An intelligent honeypot API that autonomously engages scammers, extracts threat intelligence, and protects users. Built specifically for the Indian cybercrime landscape.*

## 🎯 What It Does

Honeypot is a **production-deployed FastAPI backend** that acts as a decoy to lure and analyze online scammers and extract intelligence from them. When a scammer sends a message, the system:

1. **Detects scam intent** using a hybrid ML + rule-based engine (97.8% accuracy on 158,740 training samples)
2. **Responds like a victim** using an AI persona to keep the scammer engaged
3. **Extracts criminal intelligence**: UPI IDs, phone numbers, phishing links, bank accounts
4. **Profiles repeat offenders**, tracking them across sessions
5. **Generates visual threat reports** through a cyberpunk-themed HTML dashboard

## ✨ Key Features

- **32 kill switches**: detection rules written specifically for English, Hindi, and Hinglish scams
- **Hybrid detection**: TF-IDF + Gradient Boosting ML + rule-based engine
- **Multi-model AI agent**: 12 personas, 68+ fallback replies, rotating API keys
- **Intelligence extraction**: captures UPI IDs, phone numbers, phishing URLs, crypto wallets
- **Strategy shift detection**: recognizes when a scammer changes tactics mid-session
- **Global scammer profiling**: tracks repeat offenders across sessions
- **Production ready**: deploys to Vercel Serverless (<250 MB)
- **Novel scam combinations**: detects tactics such as authority traps, double bait, and isolation pressure

## 🛠️ Tech Stack

| Layer | Technology |
|-------|------------|
| API framework | FastAPI + Uvicorn |
| ML engine | scikit-learn (TF-IDF + Gradient Boosting) |
| LLM integration | OpenRouter API (Llama 3.3 70B) |
| Deployment | Vercel Serverless / Render.com |
| Frontend dashboard | Vanilla HTML/JS (cyberpunk theme) |
| Testing | pytest |

## 📦 Installation

```
# Clone the repository
git clone https://github.com/ujjawalranjan09/Honeypot.git
cd Honeypot

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or: venv\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

# Configure the environment
cp .env.example .env
# Edit .env and add your API keys
```

## ⚙️ Configuration

Edit `.env` with your keys:

```
HONEYPOT_API_KEY=your-secret-api-key-here
OPENROUTER_API_KEY=your-openrouter-api-key-here
OPENROUTER_MODEL=meta-llama/llama-3.3-70b-instruct:free
```

## 🏃 Running the API

```
# Start the server
python main.py

# Or run uvicorn directly
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

The API will be available at `http://localhost:8000`.

## 📡 API Endpoints

### Health Check

```
GET /api/health
```

### Process Message (Main Endpoint)

```
POST /api/message
X-API-Key: YOUR_SECRET_API_KEY
Content-Type: application/json

{
  "sessionId": "unique-session-id",
  "message": {
    "sender": "scammer",
    "text": "Your bank account will be blocked. Verify immediately.",
    "timestamp": "2026-01-29T10:15:30Z"
  },
  "conversationHistory": [],
  "metadata": {
    "channel": "SMS",
    "language": "English",
    "locale": "IN"
  }
}
```

### Get Session Status

```
GET /api/session/{session_id}
X-API-Key: YOUR_SECRET_API_KEY
```

### Get Statistics

```
GET /api/stats
X-API-Key: YOUR_SECRET_API_KEY
```

## 📊 Sample Response

```
{
  "status": "success",
  "scamDetected": true,
  "reply": "Oh dear, what happened to my account? Please help me.",
  "engagementMetrics": {
    "engagementDurationSeconds": 120,
    "totalMessagesExchanged": 4,
    "currentPhase": "compliance"
  },
  "extractedIntelligence": {
    "upiIds": [],
    "phishingLinks": ["http://bank-secure.com"],
    "phoneNumbers": ["+919876543210"],
    "suspiciousKeywords": ["urgent", "verify", "blocked"]
  },
  "agentNotes": "Scammer using urgency tactics",
  "engagementComplete": false
}
```

## 📈 Performance Metrics

- **Model accuracy**: 97.8% on the training set (158,740 samples)
- **Test coverage**: 71.4% on diverse real-world test cases
- **Model size**: 0.33 MB (well under Vercel's 250 MB limit)
- **Scam categories**: 33 categories covering the Indian cybercrime landscape
- **Kill switches**: 32 specialized high-confidence detection rules

## 🏗️ Project Structure

```
honeypot/
├── main.py                    # FastAPI application entry point
├── config.py                  # Configuration & detection thresholds
├── models.py                  # Pydantic data models
├── scam_detector.py           # ML + rule-based detection engine
├── intelligence_extractor.py  # Extract UPI IDs, phishing links, etc.
├── ai_agent.py                # AI persona engine with multi-model fallback
├── session_manager.py         # Session lifecycle & state management
├── api/index.py               # Vercel serverless handler
├── frontend/                  # Visual dashboard (HTML/JS)
├── tests/                     # pytest test suite
├── requirements.txt
├── vercel.json
└── .env.example
```

## 🚀 Deployment

### Vercel (Recommended)

1. Push to GitHub
2. Import the repository in the Vercel dashboard
3. Set the environment variables (`HONEYPOT_API_KEY`, `OPENROUTER_API_KEY`)
4. Deploy

### Render.com

See [DEPLOYMENT.md](./DEPLOYMENT.md) for detailed instructions.

## 🧪 Testing

```
python final_validation_test.py  # End-to-end validation
python quick_test.py             # Quick smoke test
pytest tests/                    # Full test suite
```

## 📄 License

MIT License. See [LICENSE](LICENSE) for details.
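As a rough illustration of calling the main `POST /api/message` endpoint documented in this README, here is a minimal client sketch using only the Python standard library. The helper names `build_message_request` and `send_message` are hypothetical and not part of the project. The base URL assumes a local server on port 8000, and the payload fields mirror the request example in the README.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Assumption: the API is running locally on port 8000.
BASE_URL = "http://localhost:8000"


def build_message_request(session_id, text, api_key, base_url=BASE_URL):
    """Build (but do not send) a POST /api/message request (hypothetical helper)."""
    payload = {
        "sessionId": session_id,
        "message": {
            "sender": "scammer",
            "text": text,
            # UTC timestamp in the same format as the README example
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        },
        "conversationHistory": [],
        "metadata": {"channel": "SMS", "language": "English", "locale": "IN"},
    }
    return urllib.request.Request(
        url=f"{base_url}/api/message",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send_message(request, timeout=30):
    """Send a prepared request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires a running server and a valid key):
# req = build_message_request("unique-session-id",
#                             "Your bank account will be blocked. Verify immediately.",
#                             "YOUR_SECRET_API_KEY")
# print(send_message(req)["reply"])
```

Separating request construction from sending keeps the payload easy to inspect or log before anything goes over the network.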

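The README does not show how `intelligence_extractor.py` is implemented, so the following is only a minimal sketch of regex-based extraction for three of the fields in the sample response (`upiIds`, `phishingLinks`, `phoneNumbers`). The patterns and the function name `extract_intelligence` are assumptions made for illustration. The project's actual rules are likely more thorough.

```python
import re

# Illustrative patterns only (assumptions, not the project's real rules).
# UPI-style handle: local part, "@", alphabetic provider, not followed by ".tld"
# (so ordinary e-mail addresses are skipped).
UPI_ID = re.compile(r"\b[a-zA-Z0-9._-]{2,}@[a-zA-Z]{2,}\b(?!\.[a-zA-Z])")
# Indian mobile number: optional +91 prefix, then 10 digits starting with 6-9.
PHONE = re.compile(r"(?<!\d)(?:\+91[\s-]?)?[6-9]\d{9}(?!\d)")
URL = re.compile(r"https?://[^\s\"'<>]+")


def _dedupe(items):
    """Remove duplicates while preserving first-seen order."""
    return list(dict.fromkeys(items))


def extract_intelligence(text):
    """Return UPI IDs, links, and phone numbers found in a message (sketch)."""
    urls = [u.rstrip(".,;:!?)") for u in URL.findall(text)]
    # Blank out URLs first so their contents are not misread as handles or numbers.
    remainder = URL.sub(" ", text)
    upi_ids = UPI_ID.findall(remainder)
    phones = []
    for match in PHONE.findall(remainder):
        digits = re.sub(r"\D", "", match)
        phones.append("+91" + digits[-10:])  # normalize to +91XXXXXXXXXX
    return {
        "upiIds": _dedupe(upi_ids),
        "phishingLinks": _dedupe(urls),
        "phoneNumbers": _dedupe(phones),
    }
```

Every URL found is reported under `phishingLinks` here. Deciding whether a link is actually malicious would need additional checks that this sketch does not attempt.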
**Built with ❤️ by [Ujjawal Ranjan](https://github.com/ujjawalranjan09) | RTU, Jaipur**

*Fighting cybercrime, one scammer at a time.*

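Tags: Apex, API key management, AV bypass, FastAPI, IPv6 support, OSV, Python, scikit-learn, TF-IDF, UPI fraud detection, Vercel, web application deployment, artificial intelligence, anti-scam, multilingual detection, threat intelligence, security rule engine, developer tools, search queries (dorks), backdoor-free, machine learning, gradient boosting, criminal profiling, user-mode hook bypass, cybersecurity, English/Hindi detection, honeypot system, scam detection, reverse-engineering tools, phishing link detection, privacy protection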