ChandraVerse/malware-analysis-sandbox
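An automated dynamic malware analysis platform built on CAPE/Cuckoo. It automates the full workflow from sample detonation and IOC extraction through MITRE ATT&CK mapping to multi-format report generation.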
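🦠 Automated Malware Analysis Sandbox
Dynamic Malware Analysis · Automated IOC Extraction · Custom Severity Scoring Engine · Threat Report Generation
📌 Overview · 🏗️ Architecture · 🔧 Tech Stack · 📁 Structure · ⚡ Quick Start · 🦹 IOCs · 📊 Scoring · 📝 Reports · ⚠️ Ethics · 🤝 Contributing · 📜 License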
## 📌 Project Overview

This project builds a **Python-based dynamic malware analysis system** integrated with **CAPE Sandbox** (or Cuckoo). It automatically detonates malware samples in an isolated environment, extracts every observable **indicator of compromise (IOC)**, classifies behavior against **MITRE ATT&CK**, scores threat severity with a **custom rule engine**, and generates **structured threat reports** in multiple formats.

Every output derives from real dynamic execution, not just static signature matching. The system observes what the malware **actually does** at runtime: file system writes, registry modifications, network connections, process injection, and API call sequences.

### Why This Project Matters

| Audience | Value Delivered |
|---|---|
| **Malware Analysts** | End-to-end automated pipeline from sample ingestion to report delivery |
| **SOC Teams** | Rapid IOC extraction from live detonation, reducing Mean Time to Identify (MTTI) |
| **Threat Intelligence Teams** | Structured STIX/JSON output ready for MISP, TIP, and SIEM enrichment |
| **Incident Responders** | Fast severity scoring lets analysts prioritize the samples that need deep investigation |
| **Students & Researchers** | A documented codebase that teaches malware analysis concepts through a working implementation |

## 🏗️ Architecture

```
+---------------------------------------------------------------+
|                      SAMPLE INTAKE LAYER                      |
|     Manual Upload . Watchdog Folder . REST API Submission     |
+------------------------------+--------------------------------+
                               | Sample file (.exe/.dll/.doc)
               +---------------v------------------+
               |       Pre-Analysis Module        |
               |  SHA256/MD5/SHA1 Hashing         |
               |  File Type Detection (magic)     |
               |  VirusTotal Pre-Check (optional) |
               |  Packer/Obfuscation Detection    |
               +---------------+------------------+
                               | Detonation request
               +---------------v------------------+
               |       CAPE Sandbox Engine        |
               |  Isolated Windows VM (KVM/QEMU)  |
               |  Network: INetSim / FakeNet-NG   |
               |  Monitoring: API hooks, syscalls |
               |  Duration: Configurable timeout  |
               +---------------+------------------+
                               | Raw JSON behavioral report
       +-----------------------v--------------------------+
       |             IOC Extraction Pipeline              |
       |  File Hashes           Registry Keys             |
       |  IP Addresses          DNS Queries               |
       |  Mutexes               Dropped Files             |
       |  Network Calls         Injected Processes        |
       |  API Call Chains       YARA Signature Matches    |
       +-----------------------+--------------------------+
                               | Enriched IOC dataset
       +-----------------------v--------------------------+
       |         Threat Enrichment & Correlation          |
       |  VirusTotal API . AbuseIPDB                      |
       |  Shodan API . MalwareBazaar                      |
       |  MITRE ATT&CK Mapping (technique IDs)            |
       +-----------------------+--------------------------+
                               | Classified behavior profile
       +-----------------------v--------------------------+
       |          Custom Severity Scoring Engine          |
       |  Rule-Based: weighted_score() per IOC type       |
       |  Behavioral Multipliers (anti-AV, injection)     |
       |  Final Score: LOW / MEDIUM / HIGH / CRITICAL     |
       +-----------------------+--------------------------+
                               | Scored & tagged results
       +-----------------------v--------------------------+
       |             Threat Report Generator              |
       |  JSON (machine-readable, SIEM-ready)             |
       |  PDF (human analyst report, LaTeX/WeasyPrint)    |
       |  HTML (interactive, embedded charts)             |
       |  STIX 2.1 (TIP-compatible export)                |
       +--------------------------------------------------+
```

## 🔧 Tech Stack

| Layer | Tool / Technology | Purpose |
|---|---|---|
| **Sandbox Engine** | CAPE Sandbox v2.x (or Cuckoo 2.0.7) | Dynamic malware detonation and behavioral logging |
| **Guest VM** | Windows 10/11 LTSC (KVM/QEMU) | Isolated execution environment |
| **Network Simulation** | INetSim / FakeNet-NG | Fake internet services for capturing C2 traffic |
| **Static Analysis** | YARA 4.x, pefile, python-magic, ssdeep | File classification, packer detection, fuzzy hashing |
| **IOC Extraction** | Python 3.10+ (custom pipeline) | Parses CAPE JSON reports and extracts all observables |
| **Threat Enrichment** | VirusTotal API, AbuseIPDB, MalwareBazaar | Hash reputation, IP reputation, sample lookup |
| **ATT&CK Mapping** | mitreattack-python, CAPE built-in tags | Maps behaviors to MITRE technique IDs |
| **Severity Scoring** | Custom Python rule engine (`scorer.py`) | Weighted rule-based scoring with behavioral multipliers |
| **Report Generation** | Jinja2, WeasyPrint, fpdf2, Plotly | PDF, HTML, JSON, and STIX 2.1 report output |
| **Data Storage** | MongoDB / SQLite (configurable) | Stores analysis results and IOC history |
| **Orchestration** | Celery + Redis (optional) | Asynchronous sample queue for batch processing |
| **Host OS** | Ubuntu 22.04 LTS | Analysis host platform |

## 📁 Repository Structure

```
malware-analysis-sandbox/
|
+-- sandbox/                    # Sandbox integration layer
|   +-- cape_client.py          # CAPE REST API client (submit, poll, fetch)
|   +-- cuckoo_client.py        # Cuckoo API client (legacy support)
|   +-- submission.py           # Sample intake, hashing, dedup check
|   +-- vm_manager.py           # Snapshot revert automation (KVM)
|   +-- network_config.py       # INetSim/FakeNet-NG configuration helpers
|
+-- analysis/                   # Core analysis pipeline
|   +-- ioc_extractor.py        # Full IOC extraction from CAPE JSON reports
|   +-- static_analyzer.py      # YARA, pefile, magic, ssdeep static checks
|   +-- behavior_parser.py      # API call chain parsing, process tree analysis
|   +-- enrichment.py           # VT / AbuseIPDB / MalwareBazaar API integration
|   +-- mitre_mapper.py         # CAPE behavior tags -> MITRE ATT&CK technique IDs
|
+-- scoring/                    # Severity scoring engine
|   +-- scorer.py               # Main scoring engine: rules + weights + multipliers
|   +-- rules/                  # Rule definition files (YAML)
|   |   +-- network_rules.yml   # Network IOC scoring rules
|   |   +-- process_rules.yml   # Process injection / hollowing rules
|   |   +-- registry_rules.yml  # Persistence mechanism registry rules
|   |   +-- file_rules.yml      # Dropped file and ransomware behavior rules
|   +-- thresholds.py           # Configurable LOW/MEDIUM/HIGH/CRITICAL thresholds
|
+-- reporting/                  # Threat report generators
|   +-- report_generator.py     # Orchestrates all output formats
|   +-- templates/              # Jinja2 HTML and PDF report templates
|   |   +-- report.html.j2
|   |   +-- report.pdf.j2
|   +-- stix_exporter.py        # STIX 2.1 bundle generator from analysis results
|   +-- json_exporter.py        # Machine-readable JSON export
|
+-- storage/                    # Results persistence
|   +-- db.py                   # MongoDB / SQLite abstraction layer
|   +-- models.py               # Data models for samples, IOCs, reports
|
+-- api/                        # Optional REST API for programmatic access
|   +-- app.py                  # Flask/FastAPI endpoint definitions
|   +-- routes.py               # /submit, /status, /report, /iocs endpoints
|
+-- samples/                    # Sample input directory (gitignored)
|   +-- .gitkeep
|
+-- reports/                    # Generated report output (gitignored)
|   +-- .gitkeep
|
+-- tests/                      # Unit and integration tests
|   +-- test_ioc_extractor.py
|   +-- test_scorer.py
|   +-- test_stix_exporter.py
+-- requirements.txt            # Python dependencies
+-- config.yml                  # Central configuration file
+-- .env.example                # API keys template
+-- docker-compose.yml          # Full stack (CAPE + Redis + MongoDB + API)
+-- CONTRIBUTING.md
+-- LICENSE
+-- README.md
```

## ⚡ Quick Start

### Prerequisites

Before setup, make sure you have:

- **Host OS:** Ubuntu 22.04 LTS (bare metal, or a VM with nested KVM support)
  - Minimum: **16 GB RAM · 8 vCPU · 200 GB SSD** (sandbox VMs are memory-hungry)
  - Recommended: **32 GB RAM · 16 vCPU · 500 GB SSD** for concurrent analysis
- **Virtualization:** KVM/QEMU with `libvirt` (confirm the `vmx`/`svm` CPU flags are present: `grep -E 'vmx|svm' /proc/cpuinfo`)
- **Guest VM ISO:** Windows 10/11 LTSC (you must obtain it legally)
- **Python:** 3.10 or later
- **Docker & Docker Compose v2** (for the Redis and MongoDB services)
- **API keys** (free tiers are sufficient for development):
  - [VirusTotal](https://developers.virustotal.com/): hash and IP reputation
  - [AbuseIPDB](https://www.abuseipdb.com/api): IP abuse scores
  - [MalwareBazaar](https://bazaar.abuse.ch/api/): sample lookup (free, no key required)

### Step 1: Clone the Repository

```
git clone https://github.com/ChandraVerse/malware-analysis-sandbox.git
cd malware-analysis-sandbox
```

### Step 2: Install CAPE Sandbox

```
# Install CAPE dependencies
sudo apt update && sudo apt install -y python3-pip python3-venv \
    libvirt-dev pkg-config virtualbox qemu-kvm libvirt-daemon-system

# Clone and install CAPE
git clone https://github.com/kevoreilly/CAPEv2.git /opt/CAPEv2
cd /opt/CAPEv2
pip install -r requirements.txt

# Create and configure the guest VM snapshot (see the CAPE docs for full VM setup)
# https://capev2.readthedocs.io/en/latest/installation/host/
```

### Step 3: Configure the Project

```
cd /path/to/malware-analysis-sandbox
cp .env.example .env
nano .env
```

Fill in your API credentials and CAPE endpoint:

```
# CAPE / Cuckoo
CAPE_HOST=http://localhost:8000
CAPE_API_TOKEN=your_cape_token_here

# Threat Intelligence APIs
VT_API_KEY=your_virustotal_key_here
ABUSEIPDB_API_KEY=your_abuseipdb_key_here

# Storage
MONGO_URI=mongodb://localhost:27017/malware_sandbox
REDIS_URL=redis://localhost:6379/0

# Report Output
REPORT_OUTPUT_DIR=./reports/
STIX_OUTPUT_DIR=./reports/stix/
```

### Step 4: Install Python Dependencies

```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

### Step 5: Start Supporting Services

```
# Start MongoDB and Redis via Docker Compose
docker-compose up -d mongodb redis
```

### Step 6: Submit a Sample for Analysis

```
# Submit a sample file
python sandbox/submission.py --file /path/to/sample.exe

# Or use the REST API (if it is running)
curl -X POST http://localhost:5000/submit \
    -F "file=@sample.exe" \
    -F "timeout=120"
```

### Step 7: Run the Full Analysis Pipeline

```
# After the CAPE analysis completes, run the full flow against the task ID
python analysis/ioc_extractor.py --task-id 1 --output ./data/
python analysis/enrichment.py --input ./data/task_1_iocs.json
python analysis/mitre_mapper.py --input ./data/task_1_iocs.json
python scoring/scorer.py --input ./data/task_1_iocs.json

# Generate the threat report in all formats
python reporting/report_generator.py \
    --task-id 1 \
    --formats pdf html json stix \
    --output ./reports/
```

## 🦹 IOC Extraction

The extraction pipeline parses CAPE's raw JSON behavioral report and pulls out every observable artifact:

| IOC Category | Extracted Artifacts |
|---|---|
| **File System** | Dropped file paths, hashes (MD5/SHA1/SHA256), SSDEEP fuzzy hashes |
| **Network** | Contacted IPs, domains, URLs, DNS queries, HTTP request headers |
| **Registry** | Created/modified/deleted registry keys (Run, RunOnce, Services, etc.) |
| **Processes** | Spawned processes, injected PIDs, hollowed processes, parent-child chains |
| **Mutexes** | Named mutex objects used for single-instance enforcement |
| **API Calls** | Suspicious call sequences (e.g. `VirtualAllocEx` → `WriteProcessMemory` → `CreateRemoteThread`) |
| **YARA Matches** | Custom and community YARA rule hits, with rule names and metadata |
| **Signatures** | CAPE behavioral signature matches (e.g. `ransomware_file_modifications`) |

### IOC Output Format

```
{
  "task_id": 42,
  "sample_sha256": "a1b2c3d4...",
  "analysis_timestamp": "2025-06-01T14:30:00Z",
  "iocs": {
    "file_hashes": [
      {
        "md5": "abc123",
        "sha256": "def456",
        "path": "C:\\Users\\..\\dropped.exe",
        "ssdeep": "..."
      }
    ],
    "network": [
      {
        "ip": "1.2.3.4",
        "port": 443,
        "protocol": "tcp",
        "domain": "evil.example.com"
      }
    ],
    "registry_keys": [
      {
        "action": "write",
        "key": "HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Run",
        "value": "malware.exe"
      }
    ],
    "mutexes": ["Global\\MalwareMutex123"],
    "api_sequences": [
      ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"]
    ]
  },
  "yara_matches": ["Ransomware_WannaCry_variant", "Generic_Packer_UPX"]
}
```

## 📊 Severity Scoring Engine

The custom rule engine in `scoring/scorer.py` assigns a numeric severity score based on observed behavior. Each IOC type and behavioral pattern carries a configurable weight.

### Scoring Model

```
Base Score = SUM(weight_i * count_i) for all triggered rules

Behavioral Multipliers applied on top of base score:
  - Process injection detected:             x 2.0
  - Anti-sandbox / anti-AV evasion:         x 1.8
  - Ransomware file modification pattern:   x 2.5
  - Persistence via Run key:                x 1.5
  - C2 communication detected:              x 2.0
  - Lateral movement indicators:            x 1.7

Final Score = Base Score * Product(active multipliers)
```

### Severity Thresholds

| Score Range | Severity Level | Recommended Action |
|---|---|---|
| 0 – 25 | **LOW** | Auto-archive, low-priority review |
| 26 – 50 | **MEDIUM** | Analyst review within 24 hours |
| 51 – 75 | **HIGH** | Immediate escalation to an analyst, IOC blocking |
| 76 – 100+ | **CRITICAL** | Activate the incident response team |

All thresholds and rule weights are configurable in `scoring/rules/*.yml` and `scoring/thresholds.py`, without changing the scoring engine itself.

### Example Rule Definition (YAML)

```
# scoring/rules/process_rules.yml
rules:
  - id: PROC_001
    name: "Process Injection via WriteProcessMemory"
    description: "Classic process injection API sequence detected"
    api_sequence:
      - VirtualAllocEx
      - WriteProcessMemory
      - CreateRemoteThread
    weight: 15
    mitre_technique: T1055

  - id: PROC_002
    name: "Process Hollowing"
    description: "Process unmapping followed by new image mapping"
    api_sequence:
      - NtUnmapViewOfSection
      - VirtualAllocEx
      - WriteProcessMemory
    weight: 20
    mitre_technique: T1055.012
```

## 📝 Threat Report Output

The reporting pipeline generates comprehensive reports in four formats from a single analysis run:

### Report Formats

| Format | File | Purpose |
|---|---|---|
| **PDF** | `report_
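
The scoring model and severity thresholds described in the Severity Scoring Engine section can be sketched in a few lines of Python. This is an illustrative reading of the published formula, not the contents of `scoring/scorer.py`: the multiplier values and threshold bounds are taken from the tables in this README, while the dictionary keys, function names, and the choice to treat each range's upper bound as inclusive (so a fractional score such as 25.5 falls into MEDIUM) are assumptions.

```python
from math import prod

# Multiplier values come from the scoring model in this README;
# the key names are hypothetical labels chosen for this sketch.
MULTIPLIERS = {
    "process_injection": 2.0,
    "anti_sandbox_evasion": 1.8,
    "ransomware_file_modification": 2.5,
    "run_key_persistence": 1.5,
    "c2_communication": 2.0,
    "lateral_movement": 1.7,
}

# (inclusive upper bound, label) pairs from the severity threshold table.
# Anything above the last bound is CRITICAL.
THRESHOLDS = [(25, "LOW"), (50, "MEDIUM"), (75, "HIGH")]


def base_score(triggered_rules):
    """Sum weight * count over (weight, count) pairs of triggered rules."""
    return sum(weight * count for weight, count in triggered_rules)


def final_score(triggered_rules, active_multipliers):
    """Base score times the product of all active behavioral multipliers."""
    factor = prod(MULTIPLIERS[name] for name in active_multipliers)
    return base_score(triggered_rules) * factor


def severity(score):
    """Map a numeric score to LOW / MEDIUM / HIGH / CRITICAL."""
    for upper_bound, label in THRESHOLDS:
        if score <= upper_bound:
            return label
    return "CRITICAL"
```

For example, a single hit on rule `PROC_001` (weight 15) with the process injection multiplier active gives `final_score([(15, 1)], ["process_injection"])` = 30.0, which `severity` maps to `MEDIUM`. Because multipliers compound, two or three active behaviors push even a modest base score into the CRITICAL band quickly.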
Made with 🛡️ by Chandra Sekhar Chakraborty
Malware Analyst · Blue Team Member · Aspiring SOC Analyst
🌐 Portfolio ·
💻 GitHub ·
🔗 LinkedIn
If this project helped you, please consider giving it a ⭐