Dhairya-Bhansali/riskguard-fraud-detection

GitHub: Dhairya-Bhansali/riskguard-fraud-detection

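A real-time transaction fraud detection platform built on XGBoost and a rule engine, with integrated SHAP explainability and an analyst review workflow.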


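# RiskGuard: Real-Time Transaction Fraud Detection Platform

![Dashboard](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/d951bb36bf021146.png)

## Key Metrics

| Metric | Value |
|--------|-------|
| AUC-ROC | **0.9847** |
| F1 Score | **0.9612** |
| Precision | **0.9741** |
| Recall | **0.9489** |
| False Positive Rate | **0.80%** |
| P50 API Latency | **< 2ms** |
| Training Dataset | **284,807 transactions** |

## Architecture

```
[Next.js Dashboard] ←──── auto-refresh (3s) ────→ [FastAPI Backend]
                                                          │
                                           ┌──────────────┴──────────────┐
                                     [Rule Engine]                 [XGBoost ML]
                                           │                             │
                                           └────────── combined ─────────┘
                                                          │
                                                  [SHAP Explainer]
                                                          │
                                                 [SQLite Audit Log]
```

**Scoring pipeline:** Rule engine (fast heuristics, 30% weight) → XGBoost (ML probability, 70% weight) → SHAP features → APPROVE / REVIEW / BLOCK

## Screenshots

### Dashboard: Live Overview

Live KPI cards show total transaction volume, block rate, blocked fraud amount, and API latency. The view auto-refreshes every 3 seconds.

![Dashboard overview](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/d951bb36bf021146.png)

### Live Transaction Feed

A rolling feed of the 20 most recently scored transactions, with risk bars, decision labels, and timestamps.

![Live feed](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/53bbd9591d021147.png)

### Transaction History: Full Audit Log

An append-only compliance log that can be filtered by decision (All / Blocked / Review / Approved). Each row shows the TXN ID, user, amount, merchant, location, device, risk score, and decision.

![Transaction history](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/6a9a89dbed021150.png)

**Blocked transactions:** high-risk merchants (Binance, Gambling, Wire Transfer), unknown locations, unrecognized devices.

![Blocked transactions](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/164e2d9dd3021152.png)

**Approved transactions:** low-risk merchants (Grocery, Healthcare, Entertainment), known devices, recognized cities.

![Approved transactions](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/f4614df847021153.png)

### Analytics: Fraud Detection Metrics

A chart of transaction volume over time (fraud vs. legitimate), a decision-distribution donut chart, and a time-window selector (6h / 1d / 2d / 7d).

![Analytics dashboard](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/0846e069d6021155.png)

### Risk Score Distribution + Model Performance

A histogram of risk scores across all transactions. Performance metrics for the XGBoost model (v2.4.1) are rendered live from the trained model.

![Risk distribution](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/e35b67b37f021156.png)

### Analyst Alert Queue: Pending Alerts

Fraud analyst workflow: every BLOCK/REVIEW case enters the queue and waits for manual review, with analyst notes attached.

![Pending alerts](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/1ddb1e1468021157.png)

### Analyst Alert Queue: Confirmed Fraud

Resolved cases that were confirmed as fraud. They are stored in the database and used for monthly model retraining.
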
![Confirmed fraud](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/bc5ef81eb7021200.png)

### Analyst Alert Queue: False Positives

Cases resolved as false positives. They are fed back into the system to lower the false positive rate in future model versions.

![False positives](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/7b1dfe211c021201.png)

## Tech Stack

**Backend**
- Python 3.13, FastAPI (async), SQLAlchemy + SQLite
- XGBoost 3.x, scikit-learn, SHAP, imbalanced-learn (SMOTE)
- Uvicorn, Pydantic v2, httpx

**Frontend**
- Next.js 14, TypeScript, Tailwind CSS
- SWR (auto-refresh), Recharts, Axios

**ML Pipeline**
- Dataset: [Kaggle Credit Card Fraud Detection](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud) (284,807 transactions)
- 34 engineered features (V1–V28 PCA + Amount/Time transforms)
- SMOTE oversampling to handle the 0.17% fraud class imbalance
- 5-fold stratified cross-validation
- SHAP TreeExplainer for per-transaction explainability

## Scoring Mechanism

Every incoming transaction passes through three stages:

**1. Rule engine:** instant heuristic rules that flag known patterns:
- High-risk merchant categories (crypto exchanges, gambling, wire transfers)
- Unknown geolocation or high-risk country (NG, XX, ZZ)
- Anonymizing network detected
- Amount exceeds the velocity threshold ($5,000+)
- Unrecognized device fingerprint

**2. XGBoost ML:** trained on roughly 280,000 real transactions; outputs a fraud probability (0–1) from 34 features.

**3. Combined score:** `risk = ML_score × 0.7 + rule_boost × 0.3`

| Score | Decision |
|-------|----------|
| ≥ 0.72 | 🔴 BLOCK |
| 0.70 – 0.72 | 🟡 REVIEW |
| < 0.70 | 🟢 APPROVE |

## Project Structure

```
riskguard/
├── api/
│   ├── main.py                  # FastAPI app with lifespan startup
│   ├── core/
│   │   ├── config.py            # Settings (thresholds, DB URL)
│   │   └── database.py          # Async SQLAlchemy
│   ├── models/
│   │   ├── schemas.py           # Pydantic request/response models
│   │   └── db_models.py         # TransactionLog + FraudAlert ORM
│   ├── routers/
│   │   ├── transactions.py      # POST /score, POST /batch, GET /
│   │   ├── analytics.py         # GET /summary, /timeseries, /alerts
│   │   └── health.py            # GET /health
│   └── services/
│       └── fraud_scorer.py      # Core scoring pipeline
├── ml/
│   ├── train.py                 # XGBoost training script
│   ├── features.py              # Feature engineering
│   ├── model.py                 # Singleton model loader
│   ├── explainability.py        # SHAP TreeExplainer wrapper
│   └── model_artifacts/         # Saved model + explainer (post-training)
├── tests/
│   └── test_api.py
├── requirements.txt
└── docker-compose.yml

dashboard/
├── app/
│   ├── dashboard/page.tsx       # KPI cards + live feed
│   ├── transactions/page.tsx    # Audit log + detail panel
│   ├── analytics/page.tsx       # Charts + model metrics
│   └── alerts/page.tsx          # Analyst case queue
├── components/
│   ├── layout/Sidebar.tsx       # Nav + API status indicator
│   ├── ui/index.tsx             # DecisionBadge, RiskBar, KPICard
│   └── charts/index.tsx         # FraudAreaChart, RiskDistChart
├── hooks/useRiskGuard.ts        # SWR data hooks
├── lib/api.ts                   # Axios API client
└── types/index.ts               # TypeScript types
```

## Setup

### Prerequisites

- Python 3.10+
- Node.js 18+
- Git

### Backend

```
# Clone the repository and enter it
git clone https://github.com/your-username/riskguard.git
cd riskguard

# Create and activate a virtual environment
python -m venv venv
venv\Scripts\activate          # Windows
# source venv/bin/activate     # Mac/Linux

# Install dependencies
pip install -r requirements.txt

# Download the dataset:
# place creditcard.csv from https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud
# at ml/data/creditcard.csv

# Train the model (3–5 minutes)
python ml/train.py

# Start the API
uvicorn api.main:app --host 0.0.0.0 --port 8000
```

API docs: `http://localhost:8000/docs`

### Frontend

```
cd dashboard
cp .env.example .env.local
# Set NEXT_PUBLIC_API_URL=http://localhost:8000
npm install
npm run dev
```

Dashboard: `http://localhost:3000`

### Seed Demo Data

```
python ml/seed_data.py      # 200 mixed transactions
python ml/approve_seed.py   # 80 clearly legitimate transactions
```

## API Reference

### POST `/api/v1/transactions/score`

Scores a single transaction in real time.

```
{
  "transaction_id": "TXN-000001",
  "user_id": "USR-001",
  "amount": 9500.00,
  "merchant_name": "Crypto Exchange",
  "merchant_category": "crypto exchange",
  "location": { "country": "XX", "city": "Unknown Location" },
  "device_type": "Unknown Device",
  "device_id": null
}
```

**Response:**

```
{
  "transaction_id": "TXN-000001",
  "risk_score": 0.8798,
  "risk_percent": 87,
  "decision": "BLOCK",
  "reasons": [
    "High-risk merchant category",
    "Unrecognized geolocation",
    "Amount exceeds velocity threshold",
    "New/unrecognized device",
    "High-risk country"
  ],
  "shap_features": [...],
  "model_version": "v2.4.1",
  "latency_ms": 7.24,
  "timestamp": "2026-03-07T10:04:55.732962"
}
```

### GET `/api/v1/analytics/summary`

Returns aggregate fraud metrics across all scored transactions.

### GET `/api/v1/analytics/timeseries`

Returns time-bucketed transaction volume, broken down by decision.

### PATCH `/api/v1/analytics/alerts/{alert_id}`

Resolves an alert as `confirmed_fraud` or `false_positive`, with analyst notes attached.

## Author

**Dhairya Bhansali**
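
## Appendix: Scoring Sketch

The combined-score formula and decision thresholds from the scoring section can be illustrated with a minimal Python sketch. It is not the code in `api/services/fraud_scorer.py`. The function names `combine_scores` and `decide` are hypothetical, and the sketch assumes `rule_boost` is already normalized to the range 0–1 and that the boundary values 0.70 and 0.72 fall into REVIEW and BLOCK respectively, which this README does not spell out.

```python
# Weights and thresholds as stated in this README.
ML_WEIGHT = 0.7          # share of the XGBoost fraud probability
RULE_WEIGHT = 0.3        # share of the rule-engine boost
BLOCK_THRESHOLD = 0.72   # risk >= 0.72 -> BLOCK
REVIEW_THRESHOLD = 0.70  # 0.70 <= risk < 0.72 -> REVIEW (boundary handling assumed)


def combine_scores(ml_score: float, rule_boost: float) -> float:
    """Hypothetical helper: risk = ML_score * 0.7 + rule_boost * 0.3.

    Assumes both inputs are already in the range [0, 1].
    """
    return ML_WEIGHT * ml_score + RULE_WEIGHT * rule_boost


def decide(risk: float) -> str:
    """Hypothetical helper: map a combined risk score to a decision label."""
    if risk >= BLOCK_THRESHOLD:
        return "BLOCK"
    if risk >= REVIEW_THRESHOLD:
        return "REVIEW"
    return "APPROVE"


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the trained model.
    for ml_score, rule_boost in [(0.90, 1.00), (0.80, 0.50), (0.10, 0.00)]:
        risk = combine_scores(ml_score, rule_boost)
        print(f"ml={ml_score:.2f} rules={rule_boost:.2f} -> {decide(risk)}")
```

Because the REVIEW band is only 0.02 wide, most transactions under these thresholds end up as either APPROVE or BLOCK; widening the band in `api/core/config.py` (where the README says thresholds live) would route more cases to analysts.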