aparnna-c/malware-detection

GitHub: aparnna-c/malware-detection

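Classifies Android malware through static call-graph analysis, using an ensemble of a graph neural network and a Random Forest, with an interactive Streamlit dashboard.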

Stars: 0 | Forks: 0

# 🛡️ MalDetect: Android Malware Detection

![Python](https://img.shields.io/badge/Python-3.10-blue?style=flat-square&logo=python&logoColor=white) ![Streamlit](https://img.shields.io/badge/Streamlit-App-FF4B4B?style=flat-square&logo=streamlit&logoColor=white) ![PyTorch](https://img.shields.io/badge/PyTorch-GNN-EE4C2C?style=flat-square&logo=pytorch&logoColor=white) ![scikit-learn](https://img.shields.io/badge/scikit--learn-Random%20Forest-F7931E?style=flat-square&logo=scikit-learn&logoColor=white) ![License](https://img.shields.io/badge/License-MIT-green?style=flat-square)

## 📸 Interface

The app runs in the browser as a dark-themed dashboard. You can browse the dataset, upload an `.edgelist` graph file, or upload a real `.apk` file for live analysis.

## 🧠 How It Works

Each Android app is converted into a **call graph**: a map of how the functions inside the app call one another. Malware tends to show different graph-structure patterns than benign apps.

MalDetect classifies these graphs with two different approaches and then combines them:

| Model | Approach |
|---|---|
| **GIN** (Graph Isomorphism Network) | Deep learning: learns structural patterns directly from the graph |
| **Random Forest** | Classical machine learning: uses hand-crafted graph features (nodes, edges, degree, density) |
| **Ensemble** | Averages the probability outputs of both models to give a more robust verdict |

### The 5 Detected Classes

`Addisplay` · `Adware` · `Benign` · `Downloader` · `Trojan`

## ✨ Features

- **Browse the dataset**: pick any graph file from the test/train sets and analyze it immediately
- **Upload .edgelist**: drag and drop your own call-graph file
- **Upload APK**: upload a real Android APK; the app extracts the call graph with Androguard and runs the full ensemble analysis
- **Side-by-side comparison**: GIN and Random Forest results are shown together with probability charts
- **Model evaluation**: run on the whole dataset and compute accuracy, precision, recall, and F1 score

## 🗂️ Project Structure

```
malware-detection/
├── app.py                    # Main Streamlit app
├── main.py                   # Training entry point
├── rf_model.py               # Random Forest training
├── apk_to_graph.py           # APK → call graph (Androguard)
├── requirements.txt          # Dependencies
├── models/
│   ├── best_gin_model.pth    # Trained GIN model
│   ├── random_forest.pkl     # Trained RF model
│   └── rf_scaler.pkl         # Feature scaler
├── src/
│   ├── gin_model.py          # GIN architecture
│   ├── data_loader.py        # Dataset loading
│   ├── feature_extractor.py  # Graph feature extraction
│   └── train.py              # GIN training loop
├── dataset/                  # MalNet-Tiny graph dataset
├── split_info/               # Train/test file lists
└── graphs/                   # Processed graph files
```

## 🚀 Running Locally

**1. Clone the repository**

```
git clone https://github.com/aparnna-c/malware-detection.git
cd malware-detection
```

**2. Create a virtual environment**

```
python -m venv venv
source venv/bin/activate
```

**3. Install dependencies**

```
pip install -r requirements.txt
```

**4. Run the app**

```
streamlit run app.py
```

Open `http://localhost:8501` in your browser.

## 🛠️ Tech Stack

- **Python 3.10**
- **Streamlit**: web app framework
- **PyTorch + PyTorch Geometric**: GIN model
- **scikit-learn**: Random Forest classifier
- **Androguard**: APK static analysis and call-graph extraction
- **Plotly**: interactive probability charts
- **MalNet-Tiny**: Android app call-graph dataset

## 📊 Dataset

This project uses the [MalNet-Tiny](https://mal-net.org/) dataset, a collection of Android app call graphs covering multiple malware families.

## 👩‍💻 Author

**Aparnna C**
MCA student, Government Engineering College, Thrissur

[LinkedIn](https://linkedin.com/in/aparnna-c) · [GitHub](https://github.com/aparnna-c)

## 📄 License

This project is licensed under the MIT License.
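
## 🧪 Illustrative Sketch

The hand-crafted graph features and the probability-averaging ensemble described under "How It Works" can be sketched roughly as follows. This is not the repository's code: the function names (`parse_edgelist`, `graph_features`, `ensemble`), the exact feature set, the class ordering, and the example probabilities are all assumptions made for illustration. Only the idea of node/edge/degree/density features and of averaging the two models' probabilities comes from the README. The sketch uses the standard library only.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

# Class names as listed in the README; the index order is an assumption.
CLASSES = ["Addisplay", "Adware", "Benign", "Downloader", "Trojan"]


def parse_edgelist(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse 'src dst' pairs, skipping blank lines and '#' comments."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        edges.append((src, dst))
    return edges


def graph_features(edges: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute simple structural features of a directed call graph.

    Hypothetical feature set: the README only names nodes, edges,
    degree, and density.
    """
    unique_edges = set(edges)  # ignore repeated call edges
    nodes = {node for edge in unique_edges for node in edge}
    n, m = len(nodes), len(unique_edges)
    out_degree: Dict[str, int] = {}
    for src, _ in unique_edges:
        out_degree[src] = out_degree.get(src, 0) + 1
    return {
        "num_nodes": n,
        "num_edges": m,
        "avg_out_degree": m / n if n else 0.0,
        "max_out_degree": max(out_degree.values(), default=0),
        # Directed density: edges present / edges possible (no self-loops).
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
    }


def ensemble(p_gin: Sequence[float], p_rf: Sequence[float]) -> Tuple[str, List[float]]:
    """Average two per-class probability vectors and pick the top class."""
    if len(p_gin) != len(CLASSES) or len(p_rf) != len(CLASSES):
        raise ValueError("expected one probability per class")
    avg = [(a + b) / 2 for a, b in zip(p_gin, p_rf)]
    best = max(range(len(avg)), key=avg.__getitem__)
    return CLASSES[best], avg


# Toy call graph with four functions and four call edges.
features = graph_features(parse_edgelist(["# toy graph", "0 1", "0 2", "1 2", "2 3"]))

# Made-up model outputs: the two models disagree, and averaging settles it.
label, probs = ensemble(
    [0.10, 0.20, 0.10, 0.10, 0.50],  # hypothetical GIN probabilities
    [0.05, 0.15, 0.50, 0.10, 0.20],  # hypothetical Random Forest probabilities
)
```

In this toy case the GIN output favors `Trojan` and the Random Forest output favors `Benign`; the averaged vector decides between them. Equal weighting is the simplest reading of "averaging"; a weighted average would be a different design choice.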

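Tags: Android malware detection, Apex, Kubernetes, Python, Streamlit, artificial intelligence, credential scanning, graph neural networks, no backdoor, machine learning, user-mode hook bypass, access control, reverse-engineering tools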