Alireza-Foroughi-uk/FraudGuard-ML

GitHub: Alireza-Foroughi-uk/FraudGuard-ML

A machine learning project for real-time credit card fraud detection, aimed at accurate identification on highly imbalanced data.

Stars: 17 | Forks: 2

# FraudGuard-ML 💳

Welcome to FraudGuard-ML, a machine learning project built to detect credit card fraud with precision and style! 🎯 Created by a group of passionate developers from Ulster University, the project uses data science to protect financial transactions and strengthen trust in the digital economy. 💸

# Project Overview 📋

```
What’s the Deal? 🤔
FraudGuard-ML uses the Kaggle Credit Card Fraud Detection dataset (284,807 transactions!) to identify fraudulent activity in real time. With only 0.172% of transactions being fraud, we tackled the class imbalance head-on using advanced ML techniques. 🎉

Tech Stack: 🛠️
Python 🐍
Libraries: Pandas, Scikit-learn, XGBoost, Seaborn, Matplotlib, Imbalanced-learn
Tools: Google Colab, Kaggle API

Key Features: 🌟
Data cleaning and preprocessing (no missing values, duplicates gone! ✅)
Feature engineering with PCA and Random Forest for top-notch insights 📊
Models: Logistic Regression, Random Forest, SVM, and XGBoost (winner with 0.94 AUC-PR! 🏆)
Real-time fraud detection simulation 🎮
Stunning visualizations (heatmaps, violin plots, scatter plots) 📈
```

# How It Works ⚙️

```
Data Import & Cleaning: 📥
Grabbed the dataset from Kaggle, cleaned it with Pandas, and sampled 10,000 records for efficiency. No dirty data here! 🧹

Data Wrangling: 🔧
Scaled features like Time and Amount with StandardScaler, balanced classes with SMOTE. Balanced datasets = happy models! ⚖️

Analysis & Visualization: 📊
Plotted class distributions, correlation heatmaps, and feature distributions to uncover fraud patterns. Eye-candy for data lovers! 👀

Modeling: 🤖
Trained multiple classifiers, with XGBoost shining brightest (F1-Score: 0.86, AUC-PR: 0.94). Confusion matrices? Check! ROC curves? Double check! 📉

Deployment: 🚀
Built a real-time prediction function and saved the Random Forest model for deployment. Ready to catch fraudsters in action! 🕵️‍♂️
```

# Results and Impact 🌍

```
Performance: 📈
XGBoost flagged a sample transaction (Transaction ID: 172001, Amount: €149.23) with a 93.6% fraud probability. False positives? Minimized! 💪

Impact: 💡
With global card fraud losses hitting $32.34 billion in 2022, FraudGuard-ML is a step toward safer online banking. Let’s protect those wallets! 🛡️

Limitations: ⚠️
Class imbalance, feature interpretability (thanks, PCA!), and overfitting risks are noted. Future work: fairness audits and real-world testing! 🔍
```
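To make the class-imbalance point concrete, here is a small worked example in plain Python. It uses only the two figures quoted in the overview (284,807 transactions, 0.172% fraud); the confusion-matrix counts are hypothetical numbers chosen for illustration, not results from this project. It shows why plain accuracy says little on this dataset and why the README reports F1 and AUC-PR instead.

```python
# Why accuracy is a poor yardstick on this dataset, and what F1 measures instead.

TOTAL = 284_807        # transactions, as quoted in the overview
FRAUD_RATE = 0.00172   # 0.172% fraud, as quoted in the overview

fraud = round(TOTAL * FRAUD_RATE)   # roughly 490 fraudulent transactions
legit = TOTAL - fraud

# A "model" that labels every transaction as legitimate is right on every
# legitimate row and wrong on every fraud row.
baseline_accuracy = legit / TOTAL


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (not the project's results).
p, r, f1 = precision_recall_f1(tp=80, fp=10, fn=20)

print(f"fraud rows: {fraud}, always-legit accuracy: {baseline_accuracy:.4%}")
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

The always-legitimate baseline scores above 99.8% accuracy while catching zero fraud, which is why precision, recall, and their combinations are the more informative numbers here.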
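The project balances classes with SMOTE from Imbalanced-learn. The sketch below is not that library's implementation; it is a standard-library-only illustration of the core idea, which is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The function name `smote_like`, its parameters, and the toy data are assumptions introduced for illustration.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    minority: list of equal-length feature lists (the fraud rows).
    k: how many nearest minority neighbours to choose from.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy 2-D data: 4 fraud rows versus 20 legitimate rows (made up for the demo).
fraud_rows = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
legit_count = 20

new_rows = smote_like(fraud_rows, n_new=legit_count - len(fraud_rows))
balanced_fraud = fraud_rows + new_rows
print(len(balanced_fraud), "fraud rows after oversampling vs", legit_count, "legitimate")
```

In practice, oversampling of this kind is usually applied to the training split only, after the train/test split, so that synthetic rows do not leak into the evaluation data.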
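The README mentions a real-time prediction function but does not show it. The following is a minimal sketch of what such a function could look like, not the project's actual code. It assumes a fitted model that follows scikit-learn's `predict_proba` convention (one row of `[p_legit, p_fraud]` per input row); `score_transaction`, `FEATURES`, `Verdict`, the 0.5 threshold, and the stand-in `ConstantModel` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Hypothetical feature order; a real model would use the columns it was trained on.
FEATURES = ["Time", "Amount", "V1", "V2"]


@dataclass
class Verdict:
    transaction_id: int
    fraud_probability: float
    is_fraud: bool


def score_transaction(model, transaction: dict, threshold: float = 0.5) -> Verdict:
    """Score one transaction with any model exposing predict_proba."""
    missing = [name for name in FEATURES if name not in transaction]
    if missing:
        raise KeyError(f"transaction is missing features: {missing}")
    row = [[float(transaction[name]) for name in FEATURES]]
    # Column 1 is the positive (fraud) class under the scikit-learn convention.
    prob = float(model.predict_proba(row)[0][1])
    return Verdict(transaction["id"], prob, prob >= threshold)


class ConstantModel:
    """Stand-in for a fitted classifier so the sketch runs without a trained model."""

    def __init__(self, p_fraud: float):
        self.p_fraud = p_fraud

    def predict_proba(self, rows):
        return [[1.0 - self.p_fraud, self.p_fraud] for _ in rows]


# The ID and amount echo the sample quoted in the Results section; the other
# feature values and the 0.936 output are hard-coded for the demo.
tx = {"id": 172001, "Time": 0.0, "Amount": 149.23, "V1": 0.0, "V2": 0.0}
verdict = score_transaction(ConstantModel(0.936), tx)
print(verdict)
```

A real deployment would also need to apply the same scaling that was fitted during training before calling the model, and the decision threshold would be tuned against the cost of false positives rather than fixed at 0.5.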
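Tags: Apex, AUC-PR, F1-Score, Google Colab, Imbalanced-learn, Kaggle, Kaggle API, Matplotlib, PCA, Scikit-learn, Seaborn, SMOTE, SVM, XGBoost, credit card fraud, credit fraud detection, real-time detection, violin plots, open-source security, scatter plots, digital economy development, data cleaning, data sampling, data preprocessing, machine learning, standard scaling, model deployment, fraud detection, heatmaps, feature engineering, class imbalance, reverse-engineering tools, logistic regression, ethical considerations, financial security, random forest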