guymoyo/fineract-aml

GitHub: ADORSYS-GIS/fineract-aml

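A real-time anti-money-laundering and fraud detection service for Apache Fineract. It combines a rule engine, anomaly detection, and an XGBoost classifier, and ships with a compliance dashboard and a credit scoring module.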

Stars: 0 | Forks: 0

# Fineract AML: Anti-Money-Laundering Detection Service

A real-time anti-money-laundering (AML) and fraud detection service for [Apache Fineract](https://fineract.apache.org/). It consumes transaction events via webhooks, applies rule-based and ML-driven analysis, and provides a compliance analyst dashboard for investigation and case management.

## Architecture

```
┌─────────────┐     Webhook     ┌──────────────┐     Celery     ┌────────────────┐
│ Fineract    │ ──────────────▶ │ FastAPI      │ ─────────────▶ │ Analysis       │
│ (Core       │ POST /webhook   │ (API)        │   async task   │ Pipeline       │
│ Banking)    │                 │              │                │                │
└─────────────┘                 └──────┬───────┘                │ 1. Rules       │
                                       │                        │ 2. Anomaly     │
                                       │ REST API               │ 3. XGBoost     │
                                       ▼                        └───────┬────────┘
                                ┌──────────────┐                        │
                                │ Compliance   │        Alerts          │
                                │ Dashboard    │ ◀──────────────────────┘
                                │ (React)      │
                                └──────────────┘
                                       │
                                       │ Analyst reviews alerts
                                       │ (human-in-the-loop)
                                       ▼
                        Labeled data → Model retraining
```

## Key Features

- **Webhook Consumer**: receives deposit and withdrawal events from Fineract in real time
- **Rule Engine**: deterministic AML rules (large amounts, structuring, transaction frequency, unusual hours)
- **Anomaly Detection**: unsupervised ML (Isolation Forest) that works without labeled data
- **Fraud Classifier**: supervised ML (XGBoost), trained over time on analyst-labeled data
- **Compliance Dashboard**: review alerts, investigate cases, label transactions
- **Human-in-the-Loop**: analyst decisions become training data for continuous ML improvement
- **Case Management**: group related suspicious transactions into investigation cases
- **Credit Scoring**: rule-based + ML customer credit scoring with tier segmentation
- **Credit Request Review**: compliance workflow for loan applications, with automated recommendations
- **Transfer Fraud Detection**: detection of circular transfers, new counterparties, and rapid pairs
- **MLflow Integration**: model versioning, experiment tracking, metrics logging

## Quick Start

### Prerequisites

- Docker and Docker Compose
- Python 3.12+ (for local development)
- Node.js 20+ (for dashboard development)

### Run with Docker Compose

```
# Clone the repository
git clone https://github.com/ADORSYS-GIS/fineract-aml.git
cd fineract-aml

# Copy environment file
cp .env.example .env

# Start all services
docker compose up -d

# Run database migrations
docker compose exec api alembic upgrade head

# Create initial admin user (interactive)
docker compose exec api python -m app.scripts.create_admin
```

The services are then available at:

| Service | URL |
|---------|-----|
| AML API | http://localhost:8000 |
| API Docs (Swagger) | http://localhost:8000/docs |
| API Docs (ReDoc) | http://localhost:8000/redoc |
| Compliance Dashboard | http://localhost:3000 |
| MLflow UI | http://localhost:5000 |

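
Once the containers are up, a quick reachability check against the URLs in the table can confirm that each service answers. The following is a minimal sketch using only the Python standard library. The helper names `probe` and `report` are illustrative and not part of the repository, and the URLs assume the default ports listed above.

```python
import urllib.error
import urllib.request

# URLs copied from the services table; adjust them if the ports were changed.
SERVICES = {
    "AML API docs (Swagger)": "http://localhost:8000/docs",
    "Compliance Dashboard": "http://localhost:3000",
    "MLflow UI": "http://localhost:5000",
}


def probe(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, just not with a 2xx status
    except OSError:
        return None  # connection refused, timeout, DNS failure, ...


def report(services=SERVICES, opener=urllib.request.urlopen):
    """Map each service name to "up (<status>)" or "unreachable"."""
    results = {}
    for name, url in services.items():
        status = probe(url, opener=opener)
        results[name] = f"up ({status})" if status is not None else "unreachable"
    return results
```

Calling `report()` returns one entry per service, with `"unreachable"` for anything that has not finished starting. The `opener` parameter exists only so the check can be exercised without a network connection.
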
### Local Development

```
# Backend
cd backend
python -m venv .venv
source .venv/bin/activate  # or .venv\Scripts\activate on Windows
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000

# Celery worker (separate terminal)
celery -A app.tasks.celery_app worker --loglevel=info

# Celery beat (separate terminal)
celery -A app.tasks.celery_app beat --loglevel=info
```

### Running Tests

```
cd backend
pytest -v
pytest --cov=app --cov-report=html
```

## Project Structure

```
fineract-aml/
├── backend/                          # Python FastAPI backend
│   ├── app/
│   │   ├── api/                      # REST API endpoints
│   │   │   ├── webhook.py            # Fineract webhook consumer
│   │   │   ├── alerts.py             # Alert management
│   │   │   ├── transactions.py       # Transaction queries
│   │   │   ├── cases.py              # Case management
│   │   │   ├── credit.py             # Credit scoring & requests
│   │   │   └── auth.py               # Authentication
│   │   ├── core/                     # Config, database, security
│   │   ├── features/                 # Feature engineering for ML
│   │   ├── ml/                       # ML models
│   │   │   ├── anomaly_detector.py   # Isolation Forest (unsupervised)
│   │   │   ├── fraud_classifier.py   # XGBoost (supervised)
│   │   │   └── credit_scorer.py      # Credit scoring + K-Means clustering
│   │   ├── models/                   # SQLAlchemy database models
│   │   ├── rules/                    # Deterministic rule engine
│   │   ├── schemas/                  # Pydantic request/response schemas
│   │   ├── services/                 # Business logic layer
│   │   └── tasks/                    # Celery async tasks
│   ├── alembic/                      # Database migrations
│   ├── tests/                        # Test suite
│   ├── Dockerfile
│   └── requirements.txt
├── dashboard/                        # React compliance dashboard
├── ml/                               # ML experiments & training notebooks
├── k8s/                              # Kubernetes deployment manifests
├── docs/                             # Documentation
├── docker-compose.yml
└── .env.example
```

## Credit Scoring System

The system computes a credit score from each customer's transaction behavior, segments customers into tiers (A–E), and provides a compliance-reviewed workflow for credit requests.

**How it works:**

1. Every night, the system looks at each customer's transactions from the past 180 days and computes 19 behavioral features (deposit consistency, savings rate, loan repayment rate, fraud history, and so on).
2. A **weighted formula** converts these features into a single credit score between 0 and 1. The most heavily weighted factors are deposit consistency (20%), net cash flow (20%), loan repayment (15%), and savings rate (15%).
3. The score, shown here as a percentage, determines the customer's **tier** and maximum credit amount:

   | Score | Tier | Max Credit (XAF) |
   |-------|------|------------------|
   | ≥ 80% | A - Excellent | 50,000 |
   | ≥ 65% | B - Good | 20,000 |
   | ≥ 50% | C - Fair | 10,000 |
   | ≥ 35% | D - Poor | 1,000 |
   | < 35% | E - Very Poor | 0 |

4. A **K-Means ML model** (trained weekly) groups customers into 5 clusters to validate the rule-based tiers.
5. When a customer applies for a loan, the system **re-scores in real time** and generates a recommendation (approve / review carefully / reject).
6. A **compliance analyst** reviews the request and makes the final decision. The system never approves automatically.

**Quick start:**

```
# Run nightly scoring
docker compose exec api python -c "from app.tasks.credit_scoring import compute_all_credit_scores; compute_all_credit_scores.delay()"

# Submit a credit request
curl -X POST http://localhost:8000/api/v1/credit/request \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"fineract_client_id":"CLI-001","requested_amount":15000}'
```

**Detailed documentation:**

- [Credit Scoring API Reference](backend/docs/credit-scoring-api.md)
- [Architecture & Scoring Methodology](backend/docs/credit-scoring-architecture.md)
- [Operations Guide & Configuration](backend/docs/credit-scoring-operations.md)

## Documentation

- [Architecture Overview](docs/architecture/overview.md)
- [API Reference](docs/api/endpoints.md)
- [Credit Scoring API](backend/docs/credit-scoring-api.md)
- [Fineract Webhook Setup](docs/guides/fineract-webhook-setup.md)
- [ML Pipeline Guide](docs/ml/pipeline.md)
- [Deployment Guide](docs/guides/deployment.md)
- [Contributing Guide](docs/guides/contributing.md)

## Tech Stack

| Component | Technology | Purpose |
|-----------|-----------|---------|
| API | Python 3.12, FastAPI | REST API with async support, auto-generated Swagger docs |
| ORM | SQLAlchemy 2.0 (async) | Database models with async PostgreSQL driver (asyncpg) |
| Database | PostgreSQL 16 | Permanent transaction storage, credit profiles, alerts, cases |
| Task Queue | Celery + Redis | Background analysis, nightly scoring, weekly ML retraining |
| ML | scikit-learn, XGBoost, MLflow | 4 ML models (see below) with experiment tracking |
| Dashboard | React, TanStack Router, TanStack Query | Compliance analyst UI with file-based routing |
| Auth | JWT (PyJWT) | Token-based authentication for API and dashboard |
| Validation | Pydantic v2 | Request/response schemas and config management |
| Containers | Docker, Docker Compose, Kubernetes | Development and production deployment |

### ML Models

The system uses 4 complementary models: 3 for AML fraud detection and 1 for credit scoring.

| Model | Type | Library | Purpose | Training |
|-------|------|---------|---------|----------|
| **Rule Engine** | Deterministic | Custom Python | Checks 7 suspicious patterns (large amounts, structuring, rapid transactions, unusual hours, circular transfers, new counterparties, rapid pairs) | No training; rules are configured via environment variables |
| **Isolation Forest** | Unsupervised | scikit-learn | Anomaly detection: flags statistically unusual transactions without needing labeled data | Retrains automatically as data grows |
| **XGBoost Classifier** | Supervised | XGBoost | Fraud classification: learns from analyst decisions (fraud vs. legitimate) to predict fraud probability | Retrains when analysts label enough new data |
| **K-Means Clustering** | Unsupervised | scikit-learn | Credit tier validation: groups customers into 5 clusters to validate rule-based credit tiers | Retrains weekly via Celery Beat |

**How they work together:**

- For **AML**: every transaction passes through Rule Engine → Isolation Forest → XGBoost (if trained). Each model produces a risk score, and the highest score is used. High-risk transactions generate alerts for compliance analysts.
- For **Credit**: the Rule Engine scores customers on 19 behavioral features. K-Means Clustering independently groups customers to validate those scores. When the two agree, confidence is higher (hybrid scoring).

### Transaction Storage

All transactions are stored **permanently** in PostgreSQL. The credit scoring system uses a **180-day sliding window**: it looks at each customer's transactions from the past 6 months to compute the credit score. Scores therefore reflect recent changes in behavior automatically, while no historical data is lost.

## License

Apache License 2.0
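
Two of the decision rules described in this README are simple enough to sketch in a few lines: taking the highest of the AML risk scores, and mapping a credit score to a tier and credit limit. The sketch below is an illustration under stated assumptions, not the project's implementation. The function names are hypothetical, and all scores are assumed to be normalised to the range 0 to 1. Only the tier boundaries and limits come from the tier table.

```python
from typing import Optional, Tuple

# (minimum score, tier, max credit in XAF), copied from the tier table.
TIERS = [
    (0.80, "A", 50_000),
    (0.65, "B", 20_000),
    (0.50, "C", 10_000),
    (0.35, "D", 1_000),
    (0.00, "E", 0),
]


def combine_risk_scores(rule_score: float,
                        anomaly_score: float,
                        classifier_score: Optional[float] = None) -> float:
    """Return the highest available AML risk score.

    classifier_score is None while the XGBoost model has not been trained yet,
    in which case only the rule and anomaly scores are compared.
    """
    scores = [rule_score, anomaly_score]
    if classifier_score is not None:
        scores.append(classifier_score)
    return max(scores)


def tier_for(credit_score: float) -> Tuple[str, int]:
    """Map a credit score in [0, 1] to (tier, max credit in XAF)."""
    if not 0.0 <= credit_score <= 1.0:
        raise ValueError("credit_score must be between 0 and 1")
    for minimum, tier, max_credit in TIERS:
        if credit_score >= minimum:
            return tier, max_credit
    raise AssertionError("unreachable: the last tier starts at 0.0")
```

For example, `tier_for(0.72)` falls in tier B with a limit of 20,000 XAF, and `combine_risk_scores(0.2, 0.9)` returns `0.9` because the anomaly score is the higher of the two available scores.
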
