Fahimuz/threat-intel-rag
# Cybersecurity Threat Intelligence Tool
A retrieval-augmented generation (RAG) tool that lets security analysts query real government cybersecurity threat reports in natural language: like Google, but only for threat intelligence documents.
Built by **Fahim Uzzaman** | Minnesota State University, Mankato | B.S. Computer Information Technology
## What It Does
Instead of manually reading hundreds of pages of threat reports, analysts can type a question in plain English and get a cited answer drawn from real CISA and FBI reports.
## Tech Stack
| Layer | Technology |
|---|---|
| AI / LLM | Anthropic Claude (claude-haiku) |
| RAG Framework | LangChain |
| Vector Database | ChromaDB |
| Embeddings | SentenceTransformers (all-MiniLM-L6-v2) |
| Frontend | Streamlit |
| PDF Processing | pypdf |
## Architecture
- Indexing: PDF Reports --> Text Extraction --> Chunking (500 tokens) --> Embeddings --> ChromaDB
- Query: User Question --> Embedding --> Similarity Search --> Top 3 Chunks --> Claude AI --> Answer
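The two flows above can be sketched in plain Python. This is a minimal illustration only, not the code in this repo: it swaps in a toy bag-of-words vector for the all-MiniLM-L6-v2 embeddings, an in-memory list for ChromaDB, and whitespace-separated words for tokens. The names `chunk_words`, `embed`, `cosine`, and `top_k` are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_words(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words.

    Assumption: words stand in for tokens; the overlap value is illustrative.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    starts = range(0, max(len(words) - overlap, 1), step)
    return [" ".join(words[i:i + size]) for i in starts]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase term counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most similar to the question, best first."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In the real pipeline the top-ranked chunks would then be passed to Claude as context; here `top_k(question, chunks)` simply returns them.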
## Features
- Natural language search across 3 real threat intelligence reports
- Threat category tagging (Ransomware, Phishing, Malware, Fraud, etc.)
- MITRE ATT&CK framework references linked automatically
- Multi-document filtering
- Chat history for follow-up questions
- Source citations for every answer
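As a rough illustration of how multi-document filtering and source citations could fit together, the sketch below filters retrieved chunks by report and numbers them so the model can cite them as `[n]`. The `Chunk` fields, the `build_prompt` helper, and the prompt wording are assumptions made for this example, not the app's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # report title (hypothetical metadata field)
    page: int    # page number the excerpt came from (hypothetical)
    text: str


def build_prompt(question, chunks, allowed_sources=None):
    """Build an LLM prompt whose context excerpts are numbered for citation.

    If `allowed_sources` is given, only chunks from those reports are kept.
    """
    if allowed_sources is not None:
        chunks = [c for c in chunks if c.source in allowed_sources]
    context = "\n\n".join(
        f"[{i}] {c.source}, p. {c.page}\n{c.text}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the numbered excerpts below and cite them as [n].\n"
        "If the excerpts do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Numbering the excerpts in the prompt gives the model a stable label to cite, and the same numbers can be mapped back to report and page when the answer is displayed.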
## Threat Reports Loaded
- CISA Ransomware Guide (CISA / MS-ISAC)
- IC3 Internet Crime Report 2022 (FBI)
- IC3 Internet Crime Report 2023 (FBI)
## How to Run
1. Clone the repo: `git clone https://github.com/Fahimuz/threat-intel-rag.git`, then `cd threat-intel-rag`
2. Create and activate a virtual environment: `python -m venv venv`, then `venv\Scripts\activate` on Windows or `source venv/bin/activate` on macOS/Linux
3. Install dependencies: `pip install -r requirements.txt`
4. Add your API key by creating a `.env` file containing `ANTHROPIC_API_KEY=your_key_here`
5. Build the vector database: `python build_vectordb.py`
6. Run the app: `streamlit run app.py`
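Before running steps 5 and 6, it can help to confirm that the `.env` file from step 4 holds a real key rather than the placeholder. The check below is a standard-library sketch; `read_env_file` and `api_key_is_set` are hypothetical helpers and are not part of this repo, which may load the file differently.

```python
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def api_key_is_set(path: str = ".env") -> bool:
    """True if ANTHROPIC_API_KEY is present and not the placeholder."""
    key = read_env_file(path).get("ANTHROPIC_API_KEY", "")
    return bool(key) and key != "your_key_here"


if __name__ == "__main__":
    print("API key found" if api_key_is_set() else "ANTHROPIC_API_KEY missing")
```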
## Test Results
Accuracy: 70% (7 of 10 test questions answered correctly).
Adding more threat reports should broaden coverage and may improve accuracy, though this has not yet been measured.
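For reference, the accuracy figure is just correct answers divided by total questions. The snippet below is a small sketch of that arithmetic; the pass/fail list is a stand-in that only reproduces the 7/10 count reported above, not the actual test questions.

```python
def accuracy(results: list[bool]) -> float:
    """Fraction of test questions marked correct."""
    if not results:
        raise ValueError("no test results to score")
    return sum(results) / len(results)


# Stand-in outcomes matching the reported count: 7 correct, 3 incorrect.
results = [True] * 7 + [False] * 3
print(f"Accuracy: {accuracy(results):.0%} ({sum(results)}/{len(results)})")
# prints: Accuracy: 70% (7/10)
```

With only 10 questions, each answer moves the score by 10 percentage points, so a larger test set would give a steadier estimate.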
## Future Improvements
- Add more threat reports (Mandiant M-Trends, Microsoft MSTIC)
- Real-time CVE database integration
- User authentication for enterprise use
- Docker containerization
## Contact
- GitHub: https://github.com/Fahimuz
- LinkedIn: https://www.linkedin.com/in/fahimuzzam/
- Portfolio: https://bold.pro/my/fnu-fahimuzzaman-260212134518