anpa1200/Android-Malware-Analysis
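An AI-driven static analysis framework for Android APKs that combines static feature extraction, YARA rules, and multi-model LLM classification to automatically generate detailed threat reports with MITRE mappings and Frida scripts.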
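# Android Malware Analysis Tool
An AI-driven static analysis framework for Android APK files. It combines YARA rule matching, semantic component analysis, threat indicator scoring, and multi-provider LLM classification (Claude, OpenAI, Google Gemini, or local Ollama) in a single terminal pipeline that produces detailed malware reports, MITRE ATT&CK mappings, VirusTotal cross-validation, and Frida dynamic instrumentation scripts.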
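## Features
| Layer | Capability |
|-------|-------------|
| **Static analysis** | Extracts metadata, permissions, components, API calls, certificates, strings, and network IOCs from the APK/DEX via androguard |
| **Threat scoring** | Weighted risk score (0–100) across four dimensions: permissions, behavior, network, obfuscation |
| **Semantic analysis** | Decodes malware capabilities directly from Service/Activity names (`ServiceRAT` → RAT, `EncryptionService` → ransomware) |
| **YARA scanning** | 20 rules covering banking trojans (Anubis, Cerberus), ransomware, spyware, RATs (Metasploit), Joker, stalkerware, and evasion techniques |
| **AI classification** | Claude, OpenAI (GPT-4o), Google Gemini, or local Ollama, selected automatically based on the available API keys; produces a family name, MITRE techniques, an IOC list, and Frida hooks |
| **VirusTotal** | Optional hash lookup for cross-validation (requires `VT_API_KEY`) |
| **Frida hooks** | AI-generated JavaScript targeting the specific APIs found in the sample |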
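As a rough illustration of the semantic analysis layer, the sketch below maps component names to capabilities with regular expressions. Only the two example mappings (`ServiceRAT`, `EncryptionService`) come from the table above; the pattern table, the `infer_capabilities` helper, and the matching rules are assumptions made for this example, and the project's own `core/semantic_analyzer.py` may work differently.

```python
import re

# Hypothetical pattern table. Only the two mappings named in the features
# table (ServiceRAT -> RAT, EncryptionService -> ransomware) come from the README.
CAPABILITY_PATTERNS = [
    (re.compile(r"RAT$"), "RAT"),                            # case-sensitive suffix
    (re.compile(r"encrypt", re.IGNORECASE), "Ransomware"),
]


def infer_capabilities(component_names):
    """Return {component name: capability} for every name that matches a pattern."""
    found = {}
    for name in component_names:
        # Match on the class name only, e.g. "com.x.ServiceRAT" -> "ServiceRAT".
        short = name.rsplit(".", 1)[-1]
        for pattern, capability in CAPABILITY_PATTERNS:
            if pattern.search(short):
                found[name] = capability
                break  # first matching pattern wins
    return found


if __name__ == "__main__":
    names = ["com.x.ServiceRAT", "com.x.EncryptionService", "com.x.MainActivity"]
    print(infer_capabilities(names))
```

Name-based matching is cheap but easy to evade through renaming, which is presumably why the tool pairs it with YARA rules and API-level indicators.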
## Quick Start
```
git clone https://github.com/anpa1200/Android-Malware-Analysis.git
cd Android-Malware-Analysis
./setup.sh
# Set one or more API keys (the tool picks the best available provider automatically):
export ANTHROPIC_API_KEY="sk-ant-..." # Claude (best results)
export OPENAI_API_KEY="sk-..." # OpenAI GPT-4o
export GOOGLE_API_KEY="..." # Google Gemini Flash
export VT_API_KEY="..." # VirusTotal (optional)
# Analyze a single APK
python analyzer.py analyze malware.apk
# Choose a provider explicitly
python analyzer.py analyze malware.apk -p claude
python analyzer.py analyze malware.apk -p openai
python analyzer.py analyze malware.apk -p google
# Skip AI and run static analysis only (about 3 seconds)
python analyzer.py analyze malware.apk --no-ai
# Override the model
python analyzer.py analyze malware.apk --model claude-sonnet-4-6
# Batch-analyze a directory
python analyzer.py batch /path/to/apks/
# Quick file info (no analysis)
python analyzer.py info malware.apk
```
## Requirements
- Python 3.10+
- [Ollama](https://ollama.ai) with `qwen3:8b` pulled (for local AI, no API key required)
- Anthropic API key (optional; for faster, higher-quality analysis with Claude)
- VirusTotal API key (optional)
Install the dependencies:
```
pip install -r requirements.txt
```
Or use the setup script, which creates a venv automatically:
```
./setup.sh
```
## Analysis Pipeline
```
APK file
│
├─ Phase 1: Static Analysis (androguard)
│   ├─ Package metadata, SDK versions, signing certificate
│   ├─ Permissions (31 dangerous permissions scored)
│   ├─ Components: activities, services, receivers, providers
│   ├─ Suspicious API calls (DexClassLoader, SmsManager, etc.)
│   ├─ Network IOCs: URLs, IPs, domains
│   └─ Obfuscation detection (ProGuard, fddo pattern, gibberish names)
│
├─ Phase 2: Threat Indicator Scoring (0–100)
│   ├─ Permission scoring (BIND_ACCESSIBILITY +8, READ_SMS +5, ...)
│   ├─ Dangerous permission combos (SMS+accessibility+overlay = banking trojan)
│   ├─ Behavioral API patterns
│   └─ MITRE ATT&CK mapping per indicator
│
├─ Phase 2b: Semantic Analysis
│   ├─ Component name → capability mapping (25+ patterns)
│   │     ServiceRAT → Remote Access Trojan
│   │     EncryptionService → Ransomware (File Encryptor)
│   │     ServiceVNC → VNC Remote Control
│   │     ActivityStartUSSD → Bank Account Wipe
│   │     NMSGService → Covert C2 Messaging
│   │     SyncTGData → Telegram C2 Exfil
│   ├─ String-based semantic detection (Telegram bot, Discord webhook, USSD, emulator checks)
│   ├─ Shannon entropy analysis (encrypted payloads / C2 keys)
│   └─ Rule-based family inference (Banking Trojan, Stalkerware, Ransomware, RAT, SMS Fraud, Dropper)
│
├─ Phase 2c: VirusTotal Lookup (optional)
│   └─ Detection ratio, threat label, family names, top AV detections
│
├─ Phase 3: YARA Scanning (20 rules)
│   └─ APK binary + all embedded DEX/SO files
│
└─ Phase 4: AI Analysis (Claude, OpenAI, Google Gemini, or Ollama)
    ├─ Threat classification + specific family name
    ├─ Executive summary + technical analysis
    ├─ MITRE ATT&CK techniques with evidence
    ├─ IOC list (hashes, packages, services, certificates)
    ├─ Remediation steps
    └─ Complete Frida instrumentation script
```
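To make Phase 2 more concrete, here is a minimal sketch of weighted permission scoring with a combo rule. The weights for `BIND_ACCESSIBILITY` (+8) and `READ_SMS` (+5) are taken from the diagram above; the `SYSTEM_ALERT_WINDOW` weight, the combo bonus, and the `score_permissions` function are illustrative assumptions, not the values or API of `core/indicators.py`.

```python
# Weights for BIND_ACCESSIBILITY and READ_SMS follow the pipeline diagram.
# Everything else in these tables is an assumption made for illustration.
PERMISSION_WEIGHTS = {
    "BIND_ACCESSIBILITY": 8,
    "READ_SMS": 5,
    "SYSTEM_ALERT_WINDOW": 4,  # overlay permission; weight assumed
}

# Hypothetical combo rule: SMS + accessibility + overlay suggests a banking trojan.
COMBOS = [
    ({"READ_SMS", "BIND_ACCESSIBILITY", "SYSTEM_ALERT_WINDOW"}, 20, "banking trojan combo"),
]


def score_permissions(permissions):
    """Return (score capped at 100, labels of the combo rules that fired)."""
    granted = set(permissions)
    score = sum(PERMISSION_WEIGHTS.get(p, 0) for p in granted)
    indicators = []
    for required, bonus, label in COMBOS:
        if required <= granted:  # every permission in the combo is present
            score += bonus
            indicators.append(label)
    return min(score, 100), indicators


if __name__ == "__main__":
    print(score_permissions(["READ_SMS", "BIND_ACCESSIBILITY", "SYSTEM_ALERT_WINDOW"]))
```

Scoring combinations separately from individual permissions reflects the idea in the diagram that some permissions are far more suspicious together than alone.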
## YARA Rules
| Rule | Family | Severity |
|------|--------|----------|
| `BankingTrojan_Anubis` | Anubis/BankBot (ServiceRAT + USSD + overlay combination) | CRITICAL |
| `BankingTrojan_Obfuscated_fddo` | Anubis builder fddo obfuscation pattern | HIGH |
| `BankingTrojan_Cerberus` | Cerberus/Alien (accessibility + clipboard hijacking + cryptocurrency) | CRITICAL |
| `BankingTrojan_Accessibility` | Generic banking trojan (accessibility + overlay attack) | CRITICAL |
| `Ransomware_FileEncryption` | File-encrypting ransomware (EncryptionService/DecryptionService) | CRITICAL |
| `Ransomware_DeviceAdmin` | Screen-locking ransomware (lockNow + ransom strings) | CRITICAL |
| `RAT_Metasploit` | Metasploit Android Meterpreter stage | CRITICAL |
| `Spyware_MultiVector` | Call log + SMS + network exfiltration | CRITICAL |
| `Spyware_AudioLocation` | Audio recording + location tracking | CRITICAL |
| `Spyware_SMSStealer` | SMS/contact-stealing spyware | HIGH |
| `Stalkerware_Covert` | Covert surveillance with a hidden icon | CRITICAL |
| `SMSFraud_PremiumRate` | Premium-rate SMS fraud | HIGH |
| `SMSFraud_Joker` | Joker/Bread WebView subscription fraud | HIGH |
| `NotificationStealer_OTP` | OTP theft via notification listener | CRITICAL |
| `Overlay_PhishingActivity` | Overlay phishing targeting banking apps | CRITICAL |
| `Dropper_DexClassLoader` | Dynamic DEX loading (dropper/loader) | HIGH |
| `HiddenInstaller_Silent` | Silent APK installation | CRITICAL |
| `AntiAnalysis_EmulatorCheck` | Emulator/debugger detection | MEDIUM |
| `C2_TelegramBot` | Telegram bot as C2 channel | HIGH |
| `C2_DiscordWebhook` | Discord webhook for exfiltration | HIGH |
## AI Backends
Select a backend with `--provider` / `-p`: `auto` (default), `claude`, `openai`, `google`, or `ollama`.
`auto` tries providers in order (Claude → OpenAI → Google → Ollama), based on which API keys are present in the environment variables or in `config.yml`.
| Provider | Default model | Key variable |
|---|---|---|
| Claude | `claude-opus-4-6` | `ANTHROPIC_API_KEY` |
| OpenAI | `gpt-4o` | `OPENAI_API_KEY` |
| Google | `gemini-2.0-flash` | `GOOGLE_API_KEY` |
| Ollama | `qwen3:8b` (auto-detected) | none |
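The `auto` selection order can be sketched as a walk over the key variables in the table. The `pick_provider` function below is a hypothetical illustration: it reads environment variables only and ignores `config.yml`, so it is not the tool's actual selection logic.

```python
import os

# Provider order and variable names follow the README; the function is a sketch.
PROVIDER_KEYS = [
    ("claude", "ANTHROPIC_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("google", "GOOGLE_API_KEY"),
]


def pick_provider(env=None):
    """Return the first cloud provider whose key is set, else fall back to 'ollama'."""
    env = os.environ if env is None else env
    for provider, key_var in PROVIDER_KEYS:
        if env.get(key_var):  # unset and empty values are both treated as missing
            return provider
    return "ollama"  # the local fallback needs no key


if __name__ == "__main__":
    print(pick_provider())
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.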
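**Cloud providers** receive the full evidence prompt (about 5,000 tokens) for rich technical analysis and detailed Frida hooks.
**Ollama** (fully offline, no API key required):
- Automatically selects the best available local model
- Compact prompt mode (about 450 tokens) for CPU inference
- Roughly 110–170 seconds per sample on CPU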
```
# Install Ollama and pull a model (fully offline fallback)
curl -fsSL https://ollama.ai/install.sh | sh
ollama pull qwen3:8b
# Run without any API key: Ollama starts automatically
python analyzer.py analyze malware.apk
```
**Config file** (`config.yml` in the project directory, or `~/.android_malware_analyzer.yml`):
```
provider: claude # auto, claude, openai, google, ollama
# model: claude-opus-4-6
# api_key: sk-ant-...
# vt_key: your-vt-key
output_dir: reports
```
## Output
**Terminal output** (rich tables, color-coded severity):
```
Phase 1: Static Analysis
Package: naqsl.ebxcb.exu App: pandemidestek SDK: 15→28
SHA256: 041ccba5... Size: 859 KB
Phase 2: Threat Indicator Scoring
Risk Score: 71/100 Level: HIGH
┌─ CRITICAL: Accessibility + SMS interception + overlay (T1417)
├─ CRITICAL: USSD execution capability (T1582)
└─ HIGH: fddo obfuscation pattern — Anubis/BankBot builder
Phase 3: YARA Scanning
4 rule(s) matched: BankingTrojan_Anubis, BankingTrojan_Obfuscated_fddo, ...
Phase 4: AI Analysis
Classification: Banking Trojan
Family: Anubis Confidence: High
"This Android Banking Trojan is a variant of the Anubis family..."
```
**JSON report** (`reports/_.json`):
```
{
  "apk_metadata": { "package": "naqsl.ebxcb.exu", "sha256": "041c...", ... },
  "risk_assessment": { "risk_score": 71, "risk_level": "HIGH", ... },
  "yara_matches": [{ "rule": "BankingTrojan_Anubis", ... }],
  "ai_analysis": {
    "classification": "Banking Trojan",
    "family": "Anubis",
    "confidence": "High",
    "mitre_techniques": [{ "id": "T1417", "name": "Input Injection", ... }],
    "frida_hooks": "Java.perform(function() { ... });",
    ...
  }
}
```
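A report with this shape can be post-processed by other tooling. The sketch below builds a one-line summary from a parsed report; it relies only on the keys visible in the excerpt above, and the `summarize_report` helper and its fallback values are assumptions for illustration, since the full schema is not shown here.

```python
import json


def summarize_report(report):
    """Build a one-line summary from a parsed report dictionary."""
    meta = report.get("apk_metadata", {})
    risk = report.get("risk_assessment", {})
    ai = report.get("ai_analysis", {})
    rules = [match.get("rule", "?") for match in report.get("yara_matches", [])]
    return (
        f"{meta.get('package', 'unknown')}: "
        f"risk {risk.get('risk_score', '?')}/100 ({risk.get('risk_level', '?')}), "
        f"family {ai.get('family', 'unknown')}, "
        f"YARA: {', '.join(rules) or 'none'}"
    )


if __name__ == "__main__":
    # Inline sample shaped like the excerpt; a real run would json.load() a report file.
    sample = json.loads(
        '{"apk_metadata": {"package": "naqsl.ebxcb.exu"},'
        ' "risk_assessment": {"risk_score": 71, "risk_level": "HIGH"},'
        ' "yara_matches": [{"rule": "BankingTrojan_Anubis"}],'
        ' "ai_analysis": {"family": "Anubis"}}'
    )
    print(summarize_report(sample))
```

Using `.get()` with defaults keeps the summary usable for runs made with `--no-ai`, where the AI section may be absent (an assumption about how such reports look).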
**Frida script** (`reports/.js`):
```
// Auto-generated Frida hooks for naqsl.ebxcb.exu
Java.perform(function() {
    // Hook SMS interception
    var SmsManager = Java.use("android.telephony.SmsManager");
    SmsManager.sendTextMessage.overload(...).implementation = function(...) { ... };
    // Hook overlay injection
    ...
});
```
## Validation Results
Tested against 12 real-world malware samples. The tool classified all of them correctly with high confidence; nine are listed below:
| Sample | Official family | AI classification | YARA | Semantic |
|--------|----------------|-------------------|------|----------|
| pandemidestek.apk | Anubis banking trojan | Banking Trojan / Anubis | ✅ `BankingTrojan_Anubis` | Banking Trojan/RAT (95%) |
| sep_cerberus.apk | Cerberus banking trojan | Banking Trojan / Cerberus | ✅ `BankingTrojan_Cerberus` | Banking Trojan/RAT (50%) |
| RansomwareCryDroid.apk | CryDroid ransomware | Ransomware / File Encryptor | ✅ `Ransomware_FileEncryption` | Ransomware (80%) |
| dec_sextortionistSpyware.apk | Sextortion spyware | Spyware / SMS Stealer | ✅ `Spyware_SMSStealer` | SMS Fraud/Stealer (90%) |
| nov_jokerNew.apk | Joker premium-rate dialer | SMS Fraud / Joker | ✅ `SMSFraud_Joker` | none (obfuscated payload) |
| projectSpy.apk | Stalkerware | Stalkerware / Spyware | ✅ `Spyware_AudioLocation` | Stalkerware (80%) |
| mar_CovidMetasploit.apk | Metasploit RAT | RAT / Metasploit | ✅ `RAT_Metasploit` | none |
| roamingMantis1.apk | Roaming Mantis | RAT / C2 Messaging | none (encrypted) | RAT (40%) |
| mysteryBot.apk | Banking trojan | Banking Trojan / RAT | none | Banking Trojan (80%) |
## Project Structure
```
android-malware-analysis/
├── analyzer.py # CLI entry point (click)
├── setup.sh # One-command setup script
├── requirements.txt
├── core/
│ ├── apk_analyzer.py # Static analysis via androguard
│ ├── indicators.py # Weighted threat scoring engine
│ ├── semantic_analyzer.py # Component name → capability mapping
│ ├── yara_scanner.py # YARA rule execution
│ ├── ai_engine.py # Claude API + Ollama with lite prompt mode
│ ├── ollama_engine.py # Local LLM streaming client
│ ├── vt_lookup.py # VirusTotal API v3
│ └── reporter.py # Rich terminal UI + JSON + Frida output
├── rules/
│ ├── malware.yar # 20 YARA rules
│ ├── permissions.yaml # 31 dangerous Android permissions with scores
│ └── suspicious_patterns.yaml # API call and string patterns
└── templates/ # Report templates
```
## Environment Variables
| Variable | Description |
|----------|-------------|
| `ANTHROPIC_API_KEY` | Anthropic API key, used for Claude |
| `OPENAI_API_KEY` | OpenAI API key, used for GPT-4o |
| `GOOGLE_API_KEY` | Google API key, used for Gemini |
| `VT_API_KEY` | VirusTotal API key (`VIRUSTOTAL_API_KEY` is also accepted) |
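For the VirusTotal key, the sketch below shows one way a hash lookup could be prepared: hash the APK with SHA-256, read the key from `VT_API_KEY` with a fallback to `VIRUSTOTAL_API_KEY`, and build a request for the VirusTotal API v3 file-report endpoint. The helper names are hypothetical and the request is only constructed, not sent; `core/vt_lookup.py` may be implemented differently.

```python
import hashlib
import os
import urllib.request


def sha256_of_file(path, chunk_size=65536):
    """Hash the file in chunks so large APKs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_vt_request(sha256, env=None):
    """Return a prepared VirusTotal v3 file-report request, or None if no key is set."""
    env = os.environ if env is None else env
    api_key = env.get("VT_API_KEY") or env.get("VIRUSTOTAL_API_KEY")
    if not api_key:
        return None  # the lookup is optional, so a missing key is not an error
    url = f"https://www.virustotal.com/api/v3/files/{sha256}"
    return urllib.request.Request(url, headers={"x-apikey": api_key})


if __name__ == "__main__":
    request = build_vt_request("0" * 64)
    print("VirusTotal lookup skipped" if request is None else request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) and handling rate limits or unknown hashes is left out to keep the sketch short.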
## Adding Custom YARA Rules
Add rules to `rules/malware.yar`. Required metadata fields:
```
rule MyFamily_Variant {
    meta:
        description = "Description of what this detects"
        severity = "CRITICAL" // CRITICAL / HIGH / MEDIUM / LOW
        category = "Banking Trojan"
        mitre = "T1417"
        family = "FamilyName" // optional
    strings:
        $s1 = "SomeService" ascii
        $s2 = "SomeAPI" ascii
    condition:
        $s1 and $s2
}
```
The tool picks up new rules automatically on the next run; no recompilation is needed.
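Before adding a rule, it can help to check that its metadata is complete. The sketch below is a hypothetical lint helper, not part of the tool: it uses a simple regular expression rather than a real YARA parser, and it assumes `description`, `severity`, `category`, and `mitre` are the required fields, with `family` optional as in the template above.

```python
import re

REQUIRED_META = ("description", "severity", "category", "mitre")
ALLOWED_SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW"}


def check_rule_meta(rule_text):
    """Return a list of problems found in one rule's meta section (empty if none)."""
    # Collect simple `key = "value"` assignments. String identifiers such as
    # `$s1` are skipped because `$` is not matched by \w.
    pairs = re.findall(r'^\s*(\w+)\s*=\s*"([^"]*)"', rule_text, flags=re.MULTILINE)
    meta = dict(pairs)
    problems = [f"missing meta field: {name}" for name in REQUIRED_META if name not in meta]
    severity = meta.get("severity")
    if severity is not None and severity not in ALLOWED_SEVERITIES:
        problems.append(f"unknown severity: {severity}")
    return problems


if __name__ == "__main__":
    rule = '''
    rule MyFamily_Variant {
        meta:
            description = "Description of what this detects"
            severity = "CRITICAL"
            category = "Banking Trojan"
        strings:
            $s1 = "SomeService" ascii
        condition:
            $s1
    }
    '''
    print(check_rule_meta(rule))
```

A check like this only covers metadata conventions; syntax errors in the rule body would still surface when YARA compiles the file.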
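## License
MIT License. For educational and authorized security research purposes only.
**Do not use this tool on systems you do not own or do not have explicit permission to analyze.**
Malware samples are not included in this repository. Sources for research samples: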
- [MalwareBazaar](https://bazaar.abuse.ch/)
- [sk3ptre/AndroidMalware_2020](https://github.com/sk3ptre/AndroidMalware_2020) (password: `infected`)
- [ashishb/android-malware](https://github.com/ashishb/android-malware)