guymoyo/fineract-aml
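A real-time anti-money laundering and fraud detection service for Apache Fineract, combining a rule engine, anomaly detection, and an XGBoost classifier, with a compliance dashboard and a credit scoring module.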
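# Fineract AML: Anti-Money Laundering Detection Service
A real-time anti-money laundering (AML) and fraud detection service for [Apache Fineract](https://fineract.apache.org/). It consumes transaction events via webhooks, applies rule-based and ML-driven analysis, and provides a compliance analyst dashboard for investigation and case management.
## Architecture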
```
┌─────────────┐     Webhook      ┌──────────────┐     Celery       ┌────────────────┐
│  Fineract   │ ──────────────▶  │   FastAPI    │ ─────────────▶   │  Analysis      │
│  (Core      │  POST /webhook   │   (API)      │   async task     │  Pipeline      │
│  Banking)   │                  │              │                  │                │
└─────────────┘                  └──────┬───────┘                  │  1. Rules      │
                                        │                          │  2. Anomaly    │
                                        │ REST API                 │  3. XGBoost    │
                                        ▼                          └───────┬────────┘
                                 ┌──────────────┐                          │
                                 │  Compliance  │        Alerts            │
                                 │  Dashboard   │ ◀────────────────────────┘
                                 │  (React)     │
                                 └──────────────┘
                                        │
                                        │ Analyst reviews alerts
                                        │ (human-in-the-loop)
                                        ▼
                         Labeled data → Model retraining
```
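## Key Features
- **Webhook Consumer**: receives deposit/withdrawal events from Fineract in real time
- **Rule Engine**: deterministic AML rules (large amounts, structuring, transaction frequency, unusual hours)
- **Anomaly Detection**: unsupervised ML (Isolation Forest) that works without labeled data
- **Fraud Classifier**: supervised ML (XGBoost) trained over time on analyst-labeled data
- **Compliance Dashboard**: review alerts, investigate cases, label transactions
- **Human-in-the-Loop**: analyst decisions become training data for continuous ML improvement
- **Case Management**: group related suspicious transactions into investigation cases
- **Credit Scoring**: rule-based + ML customer credit scoring with tier segmentation
- **Credit Request Review**: compliance workflow for loan applications with automatic recommendations
- **Transfer Fraud Detection**: circular transfer, new counterparty, and rapid pair detection
- **MLflow Integration**: model versioning, experiment tracking, metrics logging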
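
As an illustration of what the deterministic rules might look like, here is a minimal sketch covering large amounts, structuring, transaction frequency, and unusual hours. The `Txn` type, the `evaluate_rules` function, the rule names, and every threshold are assumptions made for this example; they are not taken from the project's code or configuration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# All thresholds below are hypothetical placeholders, not project defaults.
LARGE_AMOUNT = 10_000.0      # flag single transactions at or above this
STRUCTURING_BAND = 0.9       # "just under the limit" means 90%..100% of it
STRUCTURING_COUNT = 3        # near-limit transactions within 24 hours
MAX_TXNS_PER_HOUR = 10       # more than this in one hour is "high frequency"
UNUSUAL_HOURS = range(0, 5)  # 00:00 to 04:59


@dataclass(frozen=True)
class Txn:
    account_id: str
    amount: float
    timestamp: datetime


def evaluate_rules(txn: Txn, history: list[Txn]) -> list[str]:
    """Return the names of the rules triggered by `txn`.

    `history` holds earlier transactions for the same account.
    """
    triggered = []

    if txn.amount >= LARGE_AMOUNT:
        triggered.append("large_amount")

    # Structuring: several amounts just below the large-amount limit in 24h.
    day_ago = txn.timestamp - timedelta(days=1)
    near_limit = [
        t for t in history + [txn]
        if t.timestamp >= day_ago
        and STRUCTURING_BAND * LARGE_AMOUNT <= t.amount < LARGE_AMOUNT
    ]
    if len(near_limit) >= STRUCTURING_COUNT:
        triggered.append("structuring")

    # Frequency: count this transaction plus those in the preceding hour.
    hour_ago = txn.timestamp - timedelta(hours=1)
    recent = [t for t in history if t.timestamp >= hour_ago]
    if len(recent) + 1 > MAX_TXNS_PER_HOUR:
        triggered.append("high_frequency")

    if txn.timestamp.hour in UNUSUAL_HOURS:
        triggered.append("unusual_hours")

    return triggered
```

Returning rule names rather than a single boolean keeps the result explainable, which matters when an analyst later has to see why a transaction was flagged.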
## Quick Start
### Prerequisites
- Docker and Docker Compose
- Python 3.12+ (for local development)
- Node.js 20+ (for dashboard development)
### Running with Docker Compose
```
# Clone the repository
git clone https://github.com/ADORSYS-GIS/fineract-aml.git
cd fineract-aml
# Copy environment file
cp .env.example .env
# Start all services
docker compose up -d
# Run database migrations
docker compose exec api alembic upgrade head
# Create initial admin user (interactive)
docker compose exec api python -m app.scripts.create_admin
```
The services will be available at:
| Service | URL |
|---------|-----|
| AML API | http://localhost:8000 |
| API Docs (Swagger) | http://localhost:8000/docs |
| API Docs (ReDoc) | http://localhost:8000/redoc |
| Compliance Dashboard | http://localhost:3000 |
| MLflow UI | http://localhost:5000 |
### Local Development
```
# Backend
cd backend
python -m venv .venv
source .venv/bin/activate # or .venv\Scripts\activate on Windows
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000
# Celery worker (separate terminal)
celery -A app.tasks.celery_app worker --loglevel=info
# Celery beat (separate terminal)
celery -A app.tasks.celery_app beat --loglevel=info
```
### Running Tests
```
cd backend
pytest -v
pytest --cov=app --cov-report=html
```
## Project Structure
```
fineract-aml/
├── backend/                    # Python FastAPI backend
│   ├── app/
│   │   ├── api/                # REST API endpoints
│   │   │   ├── webhook.py      # Fineract webhook consumer
│   │   │   ├── alerts.py       # Alert management
│   │   │   ├── transactions.py # Transaction queries
│   │   │   ├── cases.py        # Case management
│   │   │   ├── credit.py       # Credit scoring & requests
│   │   │   └── auth.py         # Authentication
│   │   ├── core/               # Config, database, security
│   │   ├── features/           # Feature engineering for ML
│   │   ├── ml/                 # ML models
│   │   │   ├── anomaly_detector.py   # Isolation Forest (unsupervised)
│   │   │   ├── fraud_classifier.py   # XGBoost (supervised)
│   │   │   └── credit_scorer.py      # Credit scoring + K-Means clustering
│   │   ├── models/             # SQLAlchemy database models
│   │   ├── rules/              # Deterministic rule engine
│   │   ├── schemas/            # Pydantic request/response schemas
│   │   ├── services/           # Business logic layer
│   │   └── tasks/              # Celery async tasks
│   ├── alembic/                # Database migrations
│   ├── tests/                  # Test suite
│   ├── Dockerfile
│   └── requirements.txt
├── dashboard/                  # React compliance dashboard
├── ml/                         # ML experiments & training notebooks
├── k8s/                        # Kubernetes deployment manifests
├── docs/                       # Documentation
├── docker-compose.yml
└── .env.example
```
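## Credit Scoring
The system computes a credit score from each customer's transaction behavior, segments customers into tiers (A-E), and provides a compliance-reviewed credit request workflow.
**How it works:**
1. Every night, the system looks at each customer's transactions over the past 180 days and computes 19 behavioral features (deposit stability, savings rate, loan repayment rate, fraud history, and so on).
2. A **weighted formula** converts these features into a single credit score (0–1). The most heavily weighted factors are deposit stability (20%), net cash flow (20%), loan repayment (15%), and savings rate (15%).
3. The score determines the customer's **tier** and maximum borrowing amount: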
| Score | Tier | Max Credit (XAF) |
|-------|------|-------------------|
| ≥ 80% | A - Excellent | 50,000 |
| ≥ 65% | B - Good | 20,000 |
| ≥ 50% | C - Fair | 10,000 |
| ≥ 35% | D - Poor | 1,000 |
| < 35% | E - Very Poor | 0 |
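4. A **K-Means ML model** (trained weekly) groups customers into 5 clusters to validate the rule-based tiers.
5. When a customer applies for a loan, the system **re-scores in real time** and generates a recommendation (approve / review carefully / reject).
6. A **compliance analyst** reviews the request and makes the final decision; the system never auto-approves.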
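
The weighted formula and the tier table can be sketched as follows. The four named weights and the tier thresholds come from the description above; the remaining 30% of weight is not broken down there, so this sketch lumps it into a single hypothetical `other` feature. The function names and the assumption that every feature is already normalized to the range 0 to 1 are likewise illustrative.

```python
# Weights named in the text. The remaining 30% covers features the text does
# not enumerate, so it is lumped into one hypothetical "other" component.
WEIGHTS = {
    "deposit_stability": 0.20,
    "net_cash_flow": 0.20,
    "loan_repayment": 0.15,
    "savings_rate": 0.15,
    "other": 0.30,  # assumption, not from the source
}

# (minimum score, tier, max credit in XAF), taken from the tier table.
TIERS = [
    (0.80, "A", 50_000),
    (0.65, "B", 20_000),
    (0.50, "C", 10_000),
    (0.35, "D", 1_000),
    (0.00, "E", 0),
]


def credit_score(features: dict[str, float]) -> float:
    """Weighted sum of features, each assumed pre-normalized to [0, 1]."""
    score = sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return min(max(score, 0.0), 1.0)  # clamp against rounding drift


def tier_for(score: float) -> tuple[str, int]:
    """Map a score in [0, 1] to (tier, max credit in XAF)."""
    for floor, tier, max_credit in TIERS:
        if score >= floor:
            return tier, max_credit
    return "E", 0


profile = {
    "deposit_stability": 0.9,
    "net_cash_flow": 0.8,
    "loan_repayment": 1.0,
    "savings_rate": 0.6,
    "other": 0.5,
}
score = credit_score(profile)   # roughly 0.73
tier, limit = tier_for(score)   # tier "B", limit 20000
```

Listing the tiers from highest to lowest lets the first matching floor win, which mirrors how the table is read from top to bottom.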
**Quick start:**
```
# Run nightly scoring
docker compose exec api python -c "from app.tasks.credit_scoring import compute_all_credit_scores; compute_all_credit_scores.delay()"
# Submit a credit request
curl -X POST http://localhost:8000/api/v1/credit/request \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"fineract_client_id":"CLI-001","requested_amount":15000}'
```
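
The same credit request can be built from Python with only the standard library. This is a sketch: the endpoint path, headers, and payload fields mirror the `curl` command above, while the helper name `build_credit_request` is made up for illustration. The response format is not described here, so the sketch stops at constructing the request.

```python
import json
import urllib.request


def build_credit_request(base_url: str, token: str,
                         client_id: str, amount: int) -> urllib.request.Request:
    """Build (but do not send) the POST request for a credit application."""
    payload = {"fineract_client_id": client_id, "requested_amount": amount}
    return urllib.request.Request(
        url=f"{base_url}/api/v1/credit/request",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send it against a running stack:
#   req = build_credit_request("http://localhost:8000", token, "CLI-001", 15000)
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status, resp.read().decode())
```

Separating request construction from sending makes the payload easy to inspect or unit-test without a live API.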
**Detailed documentation:**
- [Credit Scoring API Reference](backend/docs/credit-scoring-api.md)
- [Architecture & Scoring Methodology](backend/docs/credit-scoring-architecture.md)
- [Operations Guide & Configuration](backend/docs/credit-scoring-operations.md)
## Documentation
- [Architecture Overview](docs/architecture/overview.md)
- [API Reference](docs/api/endpoints.md)
- [Credit Scoring API](backend/docs/credit-scoring-api.md)
- [Fineract Webhook Setup](docs/guides/fineract-webhook-setup.md)
- [ML Pipeline Guide](docs/ml/pipeline.md)
- [Deployment Guide](docs/guides/deployment.md)
- [Contributing Guide](docs/guides/contributing.md)
## Tech Stack
| Component | Technology | Purpose |
|-----------|-----------|---------|
| API | Python 3.12, FastAPI | REST API with async support, auto-generated Swagger docs |
| ORM | SQLAlchemy 2.0 (async) | Database models with async PostgreSQL driver (asyncpg) |
| Database | PostgreSQL 16 | Permanent transaction storage, credit profiles, alerts, cases |
| Task Queue | Celery + Redis | Background analysis, nightly scoring, weekly ML retraining |
| ML | scikit-learn, XGBoost, MLflow | 4 ML models (see below) with experiment tracking |
| Dashboard | React, TanStack Router, TanStack Query | Compliance analyst UI with file-based routing |
| Auth | JWT (PyJWT) | Token-based authentication for API and dashboard |
| Validation | Pydantic v2 | Request/response schemas and config management |
| Containers | Docker, Docker Compose, Kubernetes | Development and production deployment |
### ML Models
The system uses 4 complementary models, 3 for AML fraud detection and 1 for credit scoring:
| Model | Type | Library | Purpose | Training |
|-------|------|---------|---------|----------|
| **Rule Engine** | Deterministic | Custom Python | Checks 7 suspicious patterns (large amounts, structuring, rapid transactions, unusual hours, circular transfers, new counterparties, rapid pairs) | No training — rules are configured via environment variables |
| **Isolation Forest** | Unsupervised | scikit-learn | Anomaly detection — flags statistically unusual transactions without needing labeled data | Retrains automatically as data grows |
| **XGBoost Classifier** | Supervised | XGBoost | Fraud classification — learns from analyst decisions (fraud vs. legitimate) to predict fraud probability | Retrains when analysts label enough new data |
| **K-Means Clustering** | Unsupervised | scikit-learn | Credit tier validation — groups customers into 5 clusters to validate rule-based credit tiers | Retrains weekly via Celery Beat |
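
Among the Rule Engine patterns in the table, circular transfers are the least obvious to check. One possible approach, assuming transfers are modeled as directed sender-to-receiver edges, is a depth-limited search for a path that returns to the originating account. The function name `find_cycle` and the `max_hops` limit are hypothetical; the project's actual rule may work differently.

```python
from collections import defaultdict
from typing import Optional


def find_cycle(transfers: list[tuple[str, str]], start: str,
               max_hops: int = 4) -> Optional[list[str]]:
    """Return a path start -> ... -> start using at most `max_hops`
    transfers, or None if no such cycle exists."""
    graph = defaultdict(set)
    for sender, receiver in transfers:
        graph[sender].add(receiver)

    def dfs(node: str, path: list[str]) -> Optional[list[str]]:
        # len(path) - 1 is the number of transfers followed so far.
        if len(path) - 1 >= max_hops:
            return None
        for nxt in sorted(graph.get(node, ())):
            if nxt == start:
                return path + [start]
            if nxt not in path:  # do not revisit accounts on this path
                found = dfs(nxt, path + [nxt])
                if found:
                    return found
        return None

    return dfs(start, [start])


transfers = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
cycle = find_cycle(transfers, "A")  # ["A", "B", "C", "A"]
```

Capping the hop count bounds the search cost and reflects the assumption that only short loops are suspicious enough to flag.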
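**How they work together:**
- For **AML**: every transaction passes through Rule Engine → Isolation Forest → XGBoost (if trained). Each model produces a risk score, and the highest score is used. High-risk transactions generate alerts for compliance analysts.
- For **Credit**: the Rule Engine scores customers on 19 behavioral features. K-Means Clustering independently groups customers to validate those scores. When the two agree, confidence is higher (hybrid scoring).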
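
The "highest score wins" step of the AML pipeline can be sketched in a few lines. The function names and the alert threshold of 0.7 are assumptions for illustration, as is the premise that all three scores share a 0 to 1 scale.

```python
from typing import Optional

ALERT_THRESHOLD = 0.7  # hypothetical cut-off, not a project default


def combine_risk(rule_score: float, anomaly_score: float,
                 classifier_score: Optional[float] = None) -> float:
    """Final risk is the highest of the available scores (assumed in [0, 1])."""
    scores = [rule_score, anomaly_score]
    if classifier_score is not None:  # XGBoost contributes only once trained
        scores.append(classifier_score)
    return max(scores)


def should_alert(risk: float, threshold: float = ALERT_THRESHOLD) -> bool:
    """Decide whether a transaction should raise an analyst alert."""
    return risk >= threshold
```

Taking the maximum rather than an average means a single strong signal is enough to raise an alert, at the cost of more false positives for analysts to dismiss.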
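### Transaction Storage
All transactions are stored **permanently** in PostgreSQL. The credit scoring system uses a **180-day sliding window**: it looks at each customer's transactions from the past 6 months to compute their credit score. Scores therefore reflect recent changes in behavior automatically, without any historical data being lost.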
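
The sliding window amounts to a date filter applied at scoring time, with nothing deleted. A minimal in-memory sketch, assuming transactions are `(timestamp, amount)` tuples and that both window boundaries are inclusive (the function name and those details are illustrative, and a real implementation would more likely filter in the SQL query):

```python
from datetime import datetime, timedelta

WINDOW_DAYS = 180  # window length stated in the text


def window_transactions(transactions: list[tuple[datetime, float]],
                        as_of: datetime,
                        window_days: int = WINDOW_DAYS) -> list[tuple[datetime, float]]:
    """Keep transactions dated within `window_days` before `as_of`.

    The input list is left untouched, so older history stays available.
    """
    cutoff = as_of - timedelta(days=window_days)
    return [(ts, amt) for ts, amt in transactions if cutoff <= ts <= as_of]
```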
## License
Apache License 2.0