narmadanm/pdf-malware-analysis-toolkit
GitHub: narmadanm/pdf-malware-analysis-toolkit
A Python-based static PDF malware analysis toolkit that identifies hidden threats through metadata parsing, object extraction, and risk scoring.
# 📄 PDF Malware Analysis Toolkit
## 🚀 Overview
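This project is a Python-based toolkit for analyzing potentially malicious PDF files using static analysis techniques. It helps identify hidden threats such as embedded JavaScript, obfuscated payloads, and suspicious structures used in real-world attacks.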
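As a rough illustration of how static analysis can surface embedded JavaScript and suspicious structures, the sketch below counts risky PDF name tokens in the raw file bytes, in the spirit of pdfid. The keyword list and the function names `normalize_names` and `count_keywords` are assumptions made for this example; they are not taken from the toolkit's source.

```python
import re

# Name tokens that pdfid-style scanners commonly count. This list is an
# illustrative assumption, not the toolkit's actual keyword set.
SUSPICIOUS_NAMES = ["/JavaScript", "/JS", "/OpenAction", "/AA", "/Launch", "/EmbeddedFile"]


def normalize_names(data: bytes) -> bytes:
    """Decode #xx hex escapes, which PDF names allow and which can hide keywords.

    Simplification: this is applied to the whole file, so it may also rewrite
    bytes inside binary streams. That is acceptable for a rough keyword count.
    """
    return re.sub(rb"#([0-9A-Fa-f]{2})", lambda m: bytes([int(m.group(1), 16)]), data)


def count_keywords(data: bytes) -> dict:
    """Return how many times each suspicious name token appears in the data."""
    normalized = normalize_names(data)
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The lookahead keeps a short token such as /AA from matching
        # inside a longer name such as /AAPL.
        pattern = re.escape(name.encode()) + rb"(?![A-Za-z0-9])"
        counts[name] = len(re.findall(pattern, normalized))
    return counts


if __name__ == "__main__":
    sample = b"<< /OpenAction 5 0 R /J#53 (app.alert(1)) /JavaScript 6 0 R >>"
    print(count_keywords(sample))
```

In the sample, `/J#53` is an obfuscated spelling of `/JS`, which is why the names are normalized before counting. A non-zero count is only a hint that the file deserves a closer look; it is not proof of malicious intent.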
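## 🎯 Features
- ✅ Metadata analysis
- ✅ Keyword-based detection (pdfid style)
- ✅ Object extraction (pdf-parser style)
- ✅ Stream decoding (FlateDecode)
- ✅ JavaScript and obfuscation detection
- ✅ IOC extraction (URLs, IPs, domains)
- ✅ False-positive reduction (smart filtering)
- ✅ Automated risk scoring
- ✅ SOC-style report generation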
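Two of the features above, FlateDecode stream decoding and IOC extraction, could be combined roughly as follows. This is a minimal sketch using only the standard library. The regular expressions and the function names `decode_streams` and `extract_iocs` are assumptions for illustration and may differ from what `stream_decoder.py` and `ioc_extractor.py` actually do. Domain extraction and false-positive filtering are left out.

```python
import re
import zlib

# Capture everything between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# Deliberately simple IOC patterns; real-world use needs stricter validation.
URL_RE = re.compile(rb"https?://[^\s\"'<>()]+")
IP_RE = re.compile(rb"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def decode_streams(data: bytes) -> list:
    """Return every stream body, zlib-inflated when possible, raw otherwise."""
    decoded = []
    for match in STREAM_RE.finditer(data):
        raw = match.group(1)
        try:
            # decompressobj ignores the end-of-line bytes that trail the
            # compressed data just before "endstream".
            decoded.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not FlateDecode (or corrupted): keep the raw bytes for scanning.
            decoded.append(raw)
    return decoded


def extract_iocs(blobs) -> dict:
    """Collect unique URLs and IPv4-looking strings from decoded stream data."""
    urls, ips = set(), set()
    for blob in blobs:
        urls.update(u.decode("latin-1") for u in URL_RE.findall(blob))
        ips.update(i.decode("latin-1") for i in IP_RE.findall(blob))
    return {"urls": sorted(urls), "ips": sorted(ips)}


if __name__ == "__main__":
    payload = b"app.launchURL('http://evil.example/payload.js'); var c2 = '10.0.0.5';"
    pdf = (b"5 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
           + zlib.compress(payload) + b"\nendstream\nendobj\n")
    print(extract_iocs(decode_streams(pdf)))
```

The IP pattern also matches strings such as version numbers, so some form of filtering is needed before the results are reported.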
## 🧱 Project Structure
pdfanalysis/
│
├── modules/
│ ├── pdf_loader.py
│ ├── keyword_scanner.py
│ ├── object_extractor.py
│ ├── stream_decoder.py
│ ├── ioc_extractor.py
│ └── report_generator.py
│
├── samples/
│ └── test.pdf
│
├── main.py
└── requirements.txt
## ⚙️ Installation
```
git clone https://github.com/your-username/pdf-malware-analysis-toolkit.git
cd pdf-malware-analysis-toolkit
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
## ▶️ Usage
```
python3 main.py samples/test.pdf
```
## 📊 Sample Output
```
PDF MALWARE ANALYSIS REPORT
Risk Level: LOW
Risk Score: 2.0
--- Metadata Findings ---
[!] Suspicious or empty Author field
--- Keyword Findings ---
[!] Additional Actions present
--- Extracted IOCs ---
urls: []
ips: []
domains: []
--- Mitigation ---
File appears safe
```
## 🔬 Techniques Used
- Static Malware Analysis
- PDF Structure Parsing
- Regular Expression-based Detection
- Stream Decompression (zlib)
- Threat Intelligence Extraction
- Risk Scoring Models
## 🛡️ Use Cases
- SOC Analysis
- Incident Response
- Malware Research
- Threat Hunting
- Email Attachment Inspection
## ⚠️ Disclaimer
This tool is intended for educational and defensive security purposes only.
Always analyze suspicious files in a sandbox or isolated environment.