GitHub: faizzy69/PDF_Malware_Analysis_Toolkit
# PDF Malware Analysis Toolkit
A SOC-style PDF Malware Analysis Toolkit developed using Python and Flask for detecting suspicious indicators, JavaScript behavior, embedded actions, obfuscation, and Indicators of Compromise (IOCs) inside PDF files.
## Project Overview
Malicious PDF files are commonly used for:
- Phishing attacks
- Malware delivery
- Credential theft
- Exploit execution
- Embedded payload delivery
This project analyzes PDF documents for suspicious behavior and generates a professional malware analysis report.
The system performs:
- Static PDF Analysis
- JavaScript Detection
- IOC Extraction
- Threat Scoring
- Obfuscation Detection
- Metadata Analysis
- Risk Classification
- Threat Intelligence Reporting
## Features
### 1. Metadata Analysis
Extracts:
- Author
- Creator
- Producer
- Creation date
- Page count
- PDF metadata
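A minimal sketch of how these fields could be collected with pypdf, which the project lists as a dependency. The function names `summarize_metadata` and `summarize_file` are hypothetical and not taken from the project's source; the sketch assumes a reader object exposing `metadata` and `pages`, as pypdf's `PdfReader` does.

```python
def summarize_metadata(reader) -> dict:
    """Collect document-information fields from a pypdf-style reader object."""
    # pypdf returns None for `metadata` when the PDF has no Info dictionary,
    # so getattr with a default keeps every field present in the result.
    meta = reader.metadata
    fields = ("author", "creator", "producer", "creation_date")
    summary = {name: getattr(meta, name, None) for name in fields}
    if summary["creation_date"] is not None:
        # Store the date as text so the summary is easy to render or serialize.
        summary["creation_date"] = str(summary["creation_date"])
    summary["page_count"] = len(reader.pages)
    return summary


def summarize_file(path: str) -> dict:
    """Hypothetical convenience wrapper: open a PDF from disk and summarize it."""
    # Imported lazily so the helper above stays usable without pypdf installed.
    from pypdf import PdfReader
    return summarize_metadata(PdfReader(path))
```

Missing fields come back as `None` rather than raising, which suits malformed or deliberately stripped documents.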
### 2. Suspicious Keyword Detection
Detects malicious PDF indicators such as:
- `/JS`
- `/JavaScript`
- `/OpenAction`
- `/Launch`
- `/EmbeddedFile`
- `/URI`
- `/ObjStm`
- `/SubmitForm`
- `/XFA`
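One simple way to detect the names listed above is to scan the raw file bytes with regular expressions. The sketch below is an illustration under that assumption, not the project's actual implementation; `count_suspicious_names` is a hypothetical name. It does not decode hex-escaped names (for example `/J#53`) or look inside compressed object streams.

```python
import re

# PDF name objects from the list above, as they appear in raw file bytes.
SUSPICIOUS_NAMES = [
    b"/JS", b"/JavaScript", b"/OpenAction", b"/Launch", b"/EmbeddedFile",
    b"/URI", b"/ObjStm", b"/SubmitForm", b"/XFA",
]


def count_suspicious_names(data: bytes) -> dict:
    """Return {name: occurrences} for every listed name found in the raw bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The negative lookahead stops /JS from matching longer names such as /JSON.
        pattern = re.compile(re.escape(name) + rb"(?![A-Za-z0-9])")
        hits = len(pattern.findall(data))
        if hits:
            counts[name.decode("ascii")] = hits
    return counts
```

For example, `count_suspicious_names(open("sample.pdf", "rb").read())` would return only the names that actually occur, where `sample.pdf` is a placeholder path.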
### 3. JavaScript Analysis
Detects suspicious JavaScript patterns:
- `eval()`
- `unescape()`
- `atob()`
- `fromCharCode()`
- `base64`
- `submitForm()`
- `launchURL()`
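The patterns above could be checked against extracted script text roughly as follows. This is a sketch: the regular expressions and the name `find_js_patterns` are assumptions, and the project may match these patterns differently.

```python
import re

# One regular expression per pattern from the list above.
# Function-style entries require an opening parenthesis to reduce false positives.
JS_PATTERNS = {
    "eval": r"\beval\s*\(",
    "unescape": r"\bunescape\s*\(",
    "atob": r"\batob\s*\(",
    "fromCharCode": r"\bfromCharCode\s*\(",
    "base64": r"base64",
    "submitForm": r"\bsubmitForm\s*\(",
    "launchURL": r"\blaunchURL\s*\(",
}


def find_js_patterns(script: str) -> list:
    """Return the labels of all suspicious patterns present in the script text."""
    return [
        label
        for label, pattern in JS_PATTERNS.items()
        if re.search(pattern, script, re.IGNORECASE)
    ]
```

The word-boundary anchors keep `eval(` from matching inside identifiers such as `medieval(`.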
### 4. IOC Extraction
Extracts:
- URLs
- IP Addresses
- Emails
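A minimal sketch of regex-based IOC extraction with the standard-library `re` module. The patterns and the function name `extract_iocs` are illustrative assumptions; they cover only `http`/`https` URLs and IPv4 addresses, and real documents usually need extra cleanup such as stripping trailing punctuation.

```python
import re

# Deliberately simple patterns: http(s) URLs, dotted-quad IPv4, and common email shapes.
URL_RE = re.compile(r"https?://[^\s<>()\"']+", re.IGNORECASE)
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def extract_iocs(text: str) -> dict:
    """Return de-duplicated, sorted URLs, IPv4 addresses, and emails found in text."""
    return {
        "urls": sorted(set(URL_RE.findall(text))),
        "ips": sorted(set(IPV4_RE.findall(text))),
        "emails": sorted(set(EMAIL_RE.findall(text))),
    }
```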
### 5. Threat Intelligence
Produces:
- Risk score
- Severity level
- Threat confidence
- Verdict
- Threat timeline
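The README does not state how the risk score is computed. The sketch below shows one plausible additive scheme; every weight, the 0 to 100 scale, and the name `risk_score` are hypothetical values chosen for illustration only.

```python
# Hypothetical weights: higher values for names that can trigger execution.
KEYWORD_WEIGHTS = {
    "/JS": 15, "/JavaScript": 15, "/OpenAction": 20, "/Launch": 30,
    "/EmbeddedFile": 20, "/URI": 5, "/ObjStm": 10, "/SubmitForm": 15, "/XFA": 10,
}
JS_PATTERN_WEIGHT = 10  # assumed points per distinct suspicious JavaScript pattern
IOC_WEIGHT = 2          # assumed points per extracted IOC


def risk_score(keywords, js_patterns, ioc_count: int) -> int:
    """Combine findings into a score capped at 100 (assumed scale)."""
    # Each keyword counts once, however many times it occurs in the file.
    score = sum(KEYWORD_WEIGHTS.get(name, 0) for name in set(keywords))
    score += JS_PATTERN_WEIGHT * len(set(js_patterns))
    score += IOC_WEIGHT * ioc_count
    return min(score, 100)
```

With these assumed weights, a file containing `/JS` and `/OpenAction`, one `eval` hit, and three IOCs scores 15 + 20 + 10 + 6 = 51.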
### 6. Professional Dashboard
Includes:
- Modern Flask UI
- Threat Cards
- PDF Preview
- Threat Timeline
- Analyst Recommendation
- Downloadable PDF Report
## Tech Stack
### Frontend
- HTML
- CSS
- JavaScript
### Backend
- Python
- Flask
### Libraries Used
- Flask
- pypdf
- reportlab
- `re`, `hashlib`, `os`, `math` (Python standard library)
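The README does not say how `hashlib` and `math` are used. A common pattern in static analysis tools, shown here purely as an assumption, is to hash the uploaded file for identification and to compute Shannon entropy as a rough obfuscation signal. Both function names are hypothetical.

```python
import hashlib
import math
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the file contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def shannon_entropy(data: bytes) -> float:
    """Return the entropy of the data in bits per byte, between 0.0 and 8.0."""
    if not data:
        return 0.0
    total = len(data)
    # Sum -p * log2(p) over the observed frequency p of each byte value.
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )
```

Values close to 8 bits per byte indicate compressed, encrypted, or encoded data. Since legitimate PDFs routinely contain compressed streams, entropy alone is a weak signal and is best combined with other indicators.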
## Project Structure
PDF_Malware_Analyzer/
├── app.py
├── analyzer.py
├── threat_engine.py
├── report_generator.py
├── requirements.txt
├── uploads/
├── reports/
├── templates/
│   └── index.html
└── static/
    ├── style.css
    └── script.js
## Installation
### 1. Clone Project
Run `git clone <repository-url>`, substituting the repository's URL, then `cd PDF_Malware_Analyzer`.
### 2. Install Dependencies
pip install -r requirements.txt
### 3. Run Application
Run `python app.py`, then open http://127.0.0.1:5000 in a browser.
## Workflow
1. Upload PDF File
2. Static Analysis Starts
3. Suspicious Keywords Detected
4. JavaScript Patterns Analyzed
5. IOCs Extracted
6. Threat Score Calculated
7. Report Generated
8. Results Displayed
## Threat Severity Levels
### LOW
No suspicious activity, or only minimal indicators.
### MEDIUM
Potentially suspicious behavior.
### HIGH
Likely malicious indicators found.
### CRITICAL
Strong malicious indicators detected.
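The README does not give the score ranges behind the four levels above. Assuming a 0 to 100 risk score, the mapping could look like this sketch; the thresholds and the name `classify_severity` are hypothetical.

```python
def classify_severity(score: int) -> str:
    """Map a 0-100 risk score to a severity level using assumed cut-offs."""
    if score >= 75:
        return "CRITICAL"
    if score >= 50:
        return "HIGH"
    if score >= 25:
        return "MEDIUM"
    return "LOW"
```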
## Sample Output
The dashboard provides:
- Risk Score
- Threat Verdict
- Confidence Level
- IOC Extraction
- Metadata Viewer
- PDF Preview
- Downloadable Report
## Future Enhancements
- VirusTotal API integration
- ML-based malware classification
- Real-time sandbox analysis
- Multi-file scanning
- Threat intelligence feeds
- Database logging
## Limitations
- Static analysis only
- No dynamic execution
- No sandbox detonation
- Signature-based detection
## Author
Faisal Khan
Cybersecurity / Python Project
## License
Educational Purpose Only