Kodjocares/osint
GitHub: Kodjocares/osint
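A modular, Python-based open-source intelligence (OSINT) gathering framework that combines 27 intelligence modules, covering the full OSINT workflow from username enumeration and breach detection to threat intelligence and blockchain tracing.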
# osint
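A modular, extensible OSINT framework with 27 intelligence modules: username enumeration, breach detection, threat intelligence, blockchain tracing, cloud asset discovery, entity relationship graph visualization, and more.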
```
██████╗ ███████╗██╗███╗ ██╗████████╗
██╔═══██╗██╔════╝██║████╗ ██║╚══██╔══╝
██║ ██║███████╗██║██╔██╗ ██║ ██║
██║ ██║╚════██║██║██║╚██╗██║ ██║
╚██████╔╝███████║██║██║ ╚████║ ██║
╚═════╝ ╚══════╝╚═╝╚═╝ ╚═══╝ ╚═╝
v2.0 — 27 Modules
```
**Open-Source Intelligence Framework · Python 3.9+**
[Installation](#installation) · [Usage](#usage) · [Modules](#modules) · [API Keys](#api-keys) · [Contributing](#contributing)
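## What Is OSINT Tool?
OSINT Tool is a comprehensive, modular Python framework for open-source intelligence gathering. It brings 27 specialized modules together under one interactive CLI and produces professional HTML reports, JSON exports, and interactive entity relationship graphs.
Whether you are an independent security researcher, penetration tester, digital forensics analyst, or a journalist investigating public records, OSINT Tool provides a structured, repeatable workflow from a single terminal command.
**Key design principles:**
- Every module works on its own or as part of a full investigation
- Graceful degradation: all API keys are optional, and free sources work without registration
- Privacy-respecting: passwords are checked using k-anonymity, and the plaintext is never transmitted
- Tor-compatible: a single configuration line routes all traffic through Tor
- Report-first: every run produces structured JSON plus styled HTML output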
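The k-anonymity principle above can be sketched as follows. This is a minimal illustration, not the code in `breach_check.py`: the function names are hypothetical, and the sample response body is made up. It assumes the HaveIBeenPwned Pwned Passwords range endpoint, which takes the first five hex characters of the SHA-1 digest and returns `SUFFIX:COUNT` lines that are matched locally.

```python
import hashlib


def split_sha1(password: str) -> tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_from_range(suffix: str, range_body: str) -> int:
    """Find `suffix` in a 'SUFFIX:COUNT' range response; return 0 if absent."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


prefix, suffix = split_sha1("password")
# Only `prefix` would leave the machine, e.g.
#   GET https://api.pwnedpasswords.com/range/<prefix>
# The body below is a made-up stand-in for that response.
sample_body = f"0018A45C4D1DEF81644B54AB7F969B88D65:3\n{suffix}:42\n"
print(prefix, count_from_range(suffix, sample_body))
```

Because the comparison against the suffix happens locally, the remote service only ever learns the five-character prefix, which is shared by many unrelated passwords.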
## Modules
### Original Modules
| # | Module | File | Capabilities |
|---|--------|------|-------------|
| 1 | **Username / Email** | `username_lookup.py` | Parallel enumeration across 30+ platforms; email MX, Gravatar, Hunter.io |
| 2 | **Domain & IP Intel** | `domain_intel.py` | WHOIS, full DNS, subdomain enumeration (crt.sh + brute force), SSL cert, Shodan, VirusTotal, technology fingerprinting |
| 3 | **Phone Tracking** | `phone_lookup.py` | E.164 parsing, carrier, line type, region; AbstractAPI + NumVerify |
| 4 | **Breach Check** | `breach_check.py` | HaveIBeenPwned email breach lookup, paste exposure |
| 5 | **Password Exposure** | `breach_check.py` | k-anonymity SHA-1 check; the password is never transmitted |
| 6 | **Social Media** | `social_media.py` | GitHub full API (repos/orgs/events), Reddit API, generic scraper |
| 7 | **Metadata Extraction** | `metadata_extractor.py` | Image EXIF/GPS, PDF author info, DOCX user metadata |
| 8 | **Google Dorking** | `google_dorking.py` | 15+ dork categories, custom operator builder, DDG execution |
| 9 | **Geolocation** | `geolocation.py` | IP/domain/GPS geolocation, reverse geocoding, Folium HTML maps |
| 10 | **Monitoring & Alerts** | `monitoring.py` | SHA-256 change detection, email alerts, background scheduler |
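As an illustration of the EXIF/GPS step in the metadata module: EXIF stores coordinates as degrees, minutes, and seconds plus a hemisphere reference, so they need converting to signed decimal degrees before they can be mapped or passed to `--geo`. The helper below is a hypothetical sketch that is independent of any EXIF library; it is not taken from `metadata_extractor.py`.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and West are negative in the signed decimal convention.
    return -decimal if ref.upper() in ("S", "W") else decimal


lat = dms_to_decimal(37, 46, 29.64, "N")
lon = dms_to_decimal(122, 25, 9.84, "W")
print(f"{lat:.4f},{lon:.4f}")
```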
### New Modules (v2.0)
| # | Module | File | Capabilities |
|---|--------|------|-------------|
| 11 | **Web Archive** | `web_archive.py` | Wayback Machine snapshots, domain timeline, deleted content extraction, snapshot diffing |
| 12 | **GitHub Recon** | `github_recon.py` | Secret pattern scanning (30+ regex), commit email harvesting, domain/org code exposure |
| 13 | **Paste Monitor** | `paste_monitor.py` | Searches Pastebin/Ghostbin/Gist/Rentry; auto-classifies results as credential dumps, hash dumps, or financial data |
| 14 | **Company Intel** | `company_intel.py` | OpenCorporates registration info + officers, SEC EDGAR filings, LinkedIn job posting scraping |
| 15 | **Threat Intelligence** | `threat_intel.py` | AlienVault OTX, VirusTotal, AbuseIPDB, MalwareBazaar, URLhaus: IP/domain/hash/URL |
| 16 | **Email Header Analysis** | `email_header.py` | Hop chain reconstruction, spoofing detection, SPF/DKIM/DMARC, sender IP geolocation |
| 17 | **Reverse Image Search** | `reverse_image.py` | Search URL generation (Google/Yandex/TinEye/Bing), OCR text extraction, image hashing |
| 18 | **Crypto Tracer** | `crypto_tracer.py` | Bitcoin & Ethereum balances/transactions, Blockchair multi-chain, wallet risk classification |
| 19 | **DNS History** | `dns_history.py` | HackerTarget passive DNS (free), SecurityTrails history, ViewDNS IP history, reverse IP |
| 20 | **ASN / Network Intel** | `network_intel.py` | BGPView ASN details, IP→ASN, organization IP ranges, RDAP registration, quick port check |
| 21 | **Cloud Asset Discovery** | `cloud_discovery.py` | S3/Azure Blob/GCS bucket enumeration (30+ permutations), Firebase exposure check |
| 22 | **Web Crawler** | `web_crawler.py` | Full-site crawling, form extraction, login/admin page detection, bulk email/phone harvesting |
| 23 | **IP Classifier** | `ip_classifier.py` | Tor exit node list, VPN/proxy/datacenter/residential IP detection (IPQualityScore + AbuseIPDB) |
| 24 | **Graph Visualization** | `graph_viz.py` | Interactive D3.js entity relationship graph, built automatically from any OSINT result, pyvis/NetworkX export |
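To make the "secret pattern scanning" entry concrete, here is a minimal sketch of regex-based scanning. The three patterns are common, illustrative examples chosen for this sketch; they are an assumption and not the 30+ patterns shipped in `github_recon.py`, and `scan_text` is a hypothetical helper name.

```python
import re

# Illustrative patterns only (assumed for this sketch, not the project's list).
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) for every pattern hit in `text`."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = "region = us-east-1\naws_key = AKIAIOSFODNN7EXAMPLE\n"
print(scan_text(sample))
```

Reporting the line number and pattern name rather than the matched value keeps the secret itself out of logs and reports.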
### Support Modules
| Module | File | Purpose |
|--------|------|---------|
| **Anonymity** | `utils/anonymity.py` | Tor via stem, proxy rotation, DNS leak check, identity verification |
| **HTTP Client** | `utils/helpers.py` | Rate limiting, retry logic, user-agent rotation |
| **Report Generator** | `reporting/report_generator.py` | Dark-themed HTML reports, JSON export, matplotlib charts |
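The retry logic mentioned for the HTTP client could look roughly like the sketch below. It assumes exponential backoff, which the README does not specify; `with_retries` and its parameters are hypothetical names and not the API of `utils/helpers.py`. The `sleep` argument is injectable so the behaviour can be exercised without real delays.

```python
import time


def with_retries(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n and retry, re-raising the last error."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)


calls = {"n": 0}


def flaky():
    """Stand-in for a request that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []
result = with_retries(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```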
## Project Structure
```
osint-tool/
├── main.py # CLI entry point & interactive menu (27 options)
├── config.py # All API keys, settings, platform lists
├── .env.example # Environment variables template
├── requirements.txt # Runtime dependencies
├── requirements-dev.txt # Dev/test dependencies
├── pyproject.toml # Package metadata & build config
│
├── modules/ # Intelligence modules
│ ├── username_lookup.py
│ ├── domain_intel.py
│ ├── phone_lookup.py
│ ├── breach_check.py
│ ├── social_media.py
│ ├── metadata_extractor.py
│ ├── google_dorking.py
│ ├── geolocation.py
│ ├── monitoring.py
│ ├── web_archive.py # NEW
│ ├── github_recon.py # NEW
│ ├── paste_monitor.py # NEW
│ ├── company_intel.py # NEW
│ ├── threat_intel.py # NEW
│ ├── email_header.py # NEW
│ ├── reverse_image.py # NEW
│ ├── crypto_tracer.py # NEW
│ ├── dns_history.py # NEW
│ ├── network_intel.py # NEW
│ ├── cloud_discovery.py # NEW
│ ├── web_crawler.py # NEW
│ ├── ip_classifier.py # NEW
│ └── graph_viz.py # NEW
│
├── reporting/
│ └── report_generator.py # HTML/JSON reports + charts
│
├── utils/
│ ├── helpers.py # HTTP client, rate limiting
│ └── anonymity.py # Tor & proxy management
│
├── tests/
│ └── test_modules.py # pytest test suite
│
├── output/ # Generated reports (git-ignored)
├── wordlists/ # Custom wordlists (git-ignored)
│
└── .github/
├── workflows/
│ ├── ci.yml # Lint, test, security scan
│ └── release.yml # Auto-release on tag push
├── ISSUE_TEMPLATE/
│ ├── bug_report.md
│ └── feature_request.md
└── PULL_REQUEST_TEMPLATE.md
```
## Installation
### Requirements
- Python **3.9** or later
- `pip` + `venv`
- Tor *(optional, for anonymous routing)*
- Tesseract OCR *(optional, for image text extraction)*
### Step 1: Clone
```
git clone https://github.com/Kodjocares/osint-tool.git
cd osint-tool
```
### Step 2: Virtual Environment
```
# Linux / macOS
python3 -m venv venv
source venv/bin/activate
# Windows (PowerShell)
python -m venv venv
venv\Scripts\Activate.ps1
```
### Step 3: Install Dependencies
```
pip install -r requirements.txt
```
### Step 4: Configure API Keys
```
cp .env.example .env
# Open .env and add your API keys (all are optional)
```
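Because every key is optional, a module can check the environment and skip keyed sources that are not configured. The sketch below illustrates that idea only; the variable names are hypothetical (the real ones are listed in `.env.example`), and this is not the logic in `config.py`.

```python
import os

# Hypothetical variable names; check .env.example for the real ones.
OPTIONAL_KEYS = {
    "SHODAN_API_KEY": "Shodan lookups",
    "VIRUSTOTAL_API_KEY": "VirusTotal lookups",
    "HIBP_API_KEY": "HaveIBeenPwned breach search",
}


def available_sources(env=os.environ):
    """Map each keyed source to True when its key is set and non-empty."""
    return {label: bool(env.get(name, "").strip()) for name, label in OPTIONAL_KEYS.items()}


# A plain dict stands in for the process environment here.
for label, enabled in available_sources({"SHODAN_API_KEY": "abc123"}).items():
    print(f"{label}: {'enabled' if enabled else 'skipped (no key)'}")
```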
### Step 5: Verify
```
python main.py --anonymity
```
## Usage
### Interactive Menu
```
python main.py
```
```
╔══════════════════════════════════════════════════════════╗
║ OSINT INTELLIGENCE TOOL v2.0 ║
╠══════════════════════════════════════════════════════════╣
║ ORIGINAL MODULES ║
║ [1] Username / Email Lookup [2] Domain & IP ║
║ [3] Phone Tracking [4] Breach Check ║
║ [5] Password Exposure [6] Social Media ║
║ [7] Metadata Extraction [8] Google Dorks ║
║ [9] Geolocation [10] Monitoring ║
╠══════════════════════════════════════════════════════════╣
║ NEW MODULES ║
║ [11] Web Archive / Wayback [12] GitHub Recon ║
║ [13] Paste Site Monitor [14] Company Intel ║
║ [15] Threat Intelligence / IOC [16] Email Header ║
║ [17] Reverse Image Search [18] Crypto Trace ║
║ [19] DNS History [20] ASN / Network ║
║ [21] Cloud Asset Discovery [22] Web Crawler ║
║ [23] IP Classifier [24] Graph Viz ║
╠══════════════════════════════════════════════════════════╣
║ [25] Full Target Investigation [26] Anonymity ║
╚══════════════════════════════════════════════════════════╝
```
## Command-Line Reference
### Original Modules
```
# Username: scan 30+ platforms
python main.py --username johndoe
# Email: MX records, Gravatar, Hunter.io
python main.py --email target@example.com
# Domain: WHOIS + DNS + subdomains + SSL + tech stack + geo
python main.py --domain example.com
# IP: reverse DNS + Shodan + VirusTotal + geolocation
python main.py --ip 8.8.8.8
# Phone: carrier, line type, region (E.164 format)
python main.py --phone "+14155552671"
# Email breach check (HaveIBeenPwned)
python main.py --breach user@example.com
# Password exposure: k-anonymity, password NEVER transmitted
python main.py --password-check
# Social media scraping (GitHub + Reddit public data)
python main.py --social johndoe
# Metadata extraction: images, PDFs, DOCX; local path or URL
python main.py --metadata /path/to/photo.jpg
python main.py --metadata https://example.com/doc.pdf
# Google dork generation
python main.py --dork example.com
python main.py --dork example.com --dork-execute
# Geolocation: IP, domain, or GPS coords
python main.py --geo 8.8.8.8
python main.py --geo "37.7749,-122.4194"
```
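As a rough picture of what dork generation for a domain involves, the sketch below fills a target into a few query templates. The categories and templates are illustrative assumptions using standard search operators (`site:`, `filetype:`, `inurl:`, `intitle:`); they are not the 15+ categories in `google_dorking.py`, and `build_dorks` is a hypothetical name.

```python
def build_dorks(domain: str) -> dict[str, str]:
    """Fill a target domain into a few illustrative dork templates."""
    templates = {
        "exposed_documents": "site:{d} filetype:pdf OR filetype:docx OR filetype:xlsx",
        "login_pages": "site:{d} inurl:login OR inurl:admin",
        "directory_listings": 'site:{d} intitle:"index of"',
        "subdomains": "site:*.{d} -site:www.{d}",
    }
    return {name: template.format(d=domain) for name, template in templates.items()}


for name, query in build_dorks("example.com").items():
    print(f"{name}: {query}")
```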
### New Modules (v2.0)
```
# Web Archive: Wayback Machine timeline
python main.py --archive example.com
python main.py --archive https://example.com/page --archive-snapshot
python main.py --archive https://example.com --archive-ts 20200101120000
# GitHub Recon — profile, repos, commit emails
python main.py --github johndoe
python main.py --github johndoe --scan-secrets
python main.py --github org/repository --scan-secrets # deep secret scan
# Paste Site Monitor: search Pastebin, Gist, Ghostbin, and others
python main.py --paste "user@example.com"
python main.py --paste "example.com"
python main.py --paste-url https://pastebin.com/AbCdEfGh
# Company Intelligence: registration info, SEC filings, job postings
python main.py --company "Acme Corporation"
python main.py --company "Tesla Inc"
# Threat Intelligence — IP, domain, file hash, URL
python main.py --threat 8.8.8.8
python main.py --threat example.com
python main.py --threat d41d8cd98f00b204e9800998ecf8427e # MD5 hash
python main.py --threat https://malicious-site.example.com --ioc-type url
# Email Header analysis: detect spoofing, trace the origin
python main.py --email-header /path/to/raw_headers.txt
# Reverse Image Search: generate search URLs + OCR
python main.py --image https://example.com/photo.jpg
python main.py --image /path/to/local/image.jpg
# Cryptocurrency tracing: Bitcoin and Ethereum
python main.py --crypto 1A1zP1eP5QGefi2DMPTfTL5SLmv7Divf
python main.py --crypto 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045
# DNS History — passive DNS, IP history, reverse IP
python main.py --dns-history example.com
# ASN / Network Intelligence
python main.py --asn AS15169 # Google's ASN
python main.py --asn 8.8.8.8 # Auto-resolve IP to ASN
python main.py --asn "Cloudflare Inc" # Org name search
# Cloud Asset Discovery — S3, Azure, GCS, Firebase
python main.py --cloud example.com
# Web Crawler: spider an entire site
python main.py --crawl https://example.com
python main.py --crawl https://example.com --max-pages 100
python main.py --crawl https://example.com/page --quick-scrape
# IP Classifier — VPN, Tor, proxy, datacenter, residential
python main.py --classify-ip 8.8.8.8
python main.py --classify-ip 185.220.101.1
# Entity Relationship Graph
python main.py --graph target@example.com
python main.py --graph target@example.com --graph-data output/results.json
# Full Automated Investigation (all relevant modules + graph)
python main.py --full target@example.com
python main.py --full example.com --output html,json
python main.py --full 1.2.3.4
python main.py --full johndoe
# Anonymity / Tor status
python main.py --anonymity
```
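The `--full` examples accept an email address, a domain, an IP address, or a username, which implies some form of target type detection. One plausible way to do that is sketched below; the regular expressions and the `classify_target` name are assumptions for illustration, not the logic in `main.py`.

```python
import ipaddress
import re

# Deliberately loose patterns: good enough to route a target, not to validate it.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
DOMAIN_RE = re.compile(r"^(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}$")


def classify_target(target: str) -> str:
    """Return 'ip', 'email', 'domain', or 'username' for a raw target string."""
    target = target.strip()
    try:
        ipaddress.ip_address(target)  # accepts both IPv4 and IPv6
        return "ip"
    except ValueError:
        pass
    if EMAIL_RE.match(target):
        return "email"
    if DOMAIN_RE.match(target):
        return "domain"
    return "username"  # fallback when nothing more specific matches


for t in ("target@example.com", "example.com", "1.2.3.4", "johndoe"):
    print(t, "->", classify_target(t))
```

Checking for an IP address first matters here, since the remaining checks are ordered from most to least specific and anything unmatched falls through to a username.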
### Output Files
Every run saves its results to `output/`:
```
output/
├── example_com_20240315_143022.html # Styled HTML report
├── example_com_20240315_143022.json # Full JSON data export
├── graph_example_com_20240315.html # Interactive entity graph
├── breach_chart.png # Matplotlib breach chart
├── geo_map.html # Folium geolocation map
├── osint_tool.log # Execution log
└── alerts/ # Monitoring change alerts
```
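The report names above follow a `<target>_<YYYYMMDD>_<HHMMSS>` pattern. A minimal sketch of producing such a name is shown below; the sanitizing rule and the `report_basename` helper are assumptions inferred from the listing, not code from `report_generator.py`.

```python
import re
from datetime import datetime


def report_basename(target: str, when: datetime) -> str:
    """Build '<sanitized target>_<YYYYMMDD>_<HHMMSS>' for report file names."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    safe = re.sub(r"[^A-Za-z0-9]+", "_", target).strip("_")
    return f"{safe}_{when:%Y%m%d_%H%M%S}"


print(report_basename("example.com", datetime(2024, 3, 15, 14, 30, 22)) + ".html")
```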
## API Keys Reference
| Service | Module | Free Tier | Sign Up |
|---------|--------|-----------|---------|
| [Shodan](https://shodan.io) | Domain & IP | Yes (limited) | [account.shodan.io](https://account.shodan.io/register) |
| [VirusTotal](https://virustotal.com) | Domain, IP, Threat Intel | Yes (4 requests/min) | [virustotal.com](https://www.virustotal.com/gui/join-us) |
| [Hunter.io](https://hunter.io) | Email lookup | Yes (25 requests/month) | [hunter.io](https://hunter.io/users/sign_up) |
| [IPInfo](https://ipinfo.io) | Geolocation | Yes (50k requests/month) | [ipinfo.io](https://ipinfo.io/signup) |
| [HaveIBeenPwned](https://haveibeenpwned.com/API/Key) | Breach check | Paid ($3.50/month) | [haveibeenpwned.com](https://haveibeenpwned.com/API/Key) |
| [AbstractAPI](https://abstractapi.com) | Phone | Yes | [app.abstractapi.com](https://app.abstractapi.com/users/signup) |
| [NumVerify](https://numverify.com) | Phone | Yes (250 requests/month) | [numverify.com](https://numverify.com) |
| [Google CSE](https://programmablesearchengine.google.com) | Dork execution | Yes (100 requests/day) | [programmablesearchengine.google.com](https://programmablesearchengine.google.com) |
| [GitHub](https://github.com/settings/tokens) | GitHub Recon | Yes (free) | [github.com/settings/tokens](https://github.com/settings/tokens) |
| [AlienVault OTX](https://otx.alienvault.com) | Threat Intel | Yes (free) | [otx.alienvault.com](https://otx.alienvault.com) |
| [AbuseIPDB](https://www.abuseipdb.com/register) | Threat Intel, IP Classifier | Yes (1k requests/day) | [abuseipdb.com](https://www.abuseipdb.com/register) |
| [SecurityTrails](https://securitytrails.com) | DNS History | Yes (50 requests/month) | [securitytrails.com](https://securitytrails.com/corp/api) |
| [Etherscan](https://etherscan.io/apis) | Crypto Tracer | Yes (free) | [etherscan.io/apis](https://etherscan.io/apis) |
| [IPQualityScore](https://ipqualityscore.com) | IP Classifier | Yes (200 requests/day) | [ipqualityscore.com](https://www.ipqualityscore.com/create-account) |
| [TinEye](https://tineye.com/api) | Reverse Image | Paid | [tineye.com/api](https://services.tineye.com/TinEyeAPI) |
**Works without an API key:** password check (HIBP k-anonymity), crt.sh subdomain enumeration, ip-api.com geolocation, DNS lookups, WHOIS, SSL checks, GitHub/Reddit public APIs, DuckDuckGo search, Wayback Machine, OpenCorporates (basic), HackerTarget passive DNS, BGPView ASN lookups, Tor exit node list, Blockchain.info BTC lookups, AlienVault OTX (limited), URLhaus, MalwareBazaar.
## Anonymity via Tor
### Install Tor
```
# Ubuntu / Debian
sudo apt update && sudo apt install tor
sudo service tor start
# macOS
brew install tor
brew services start tor
# Verify
curl --socks5-hostname 127.0.0.1:9050 https://check.torproject.org/api/ip
```
### Enable in `.env`
```
USE_TOR=true
TOR_PROXY=socks5h://127.0.0.1:9050
TOR_CONTROL_PORT=9051
TOR_CONTROL_PASSWORD=your_control_password # optional
```
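For context on how these settings might be consumed: the sketch below turns `USE_TOR` and `TOR_PROXY` into a proxies mapping in the shape the `requests` library accepts. It is an assumption about how such a setting could be wired up, not the code in `utils/anonymity.py`, and `proxies_from_env` is a hypothetical name.

```python
import os


def proxies_from_env(env=os.environ):
    """Return a requests-style proxies dict when USE_TOR is 'true', else None."""
    if env.get("USE_TOR", "false").strip().lower() != "true":
        return None
    # socks5h (rather than socks5) asks the proxy to resolve hostnames,
    # which keeps DNS lookups inside Tor as well.
    proxy = env.get("TOR_PROXY", "socks5h://127.0.0.1:9050")
    return {"http": proxy, "https": proxy}


print(proxies_from_env({"USE_TOR": "true", "TOR_PROXY": "socks5h://127.0.0.1:9050"}))
print(proxies_from_env({}))
```

The result can be passed as `requests.get(url, proxies=...)`; SOCKS support in `requests` needs the optional extra installed with `pip install "requests[socks]"`.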
### Verify Anonymity
```
python main.py --anonymity
```
## Continuous Monitoring
Register targets for automatic change detection:
```
from modules.monitoring import Monitor
from modules.domain_intel import DomainIntel
import threading
monitor = Monitor()
intel = DomainIntel()
# Register
monitor.register_target("corp-domain", "domain", "example.com",
description="Watch for DNS/WHOIS changes")
# One-off check
result = monitor.check_target("corp-domain", intel.whois_lookup,
domain="example.com")
print(f"Changed: {result['changed']}")
# Background scheduler (every 24h)
t = threading.Thread(
target=monitor.start_scheduler,
args=("corp-domain", intel.whois_lookup),
kwargs={"domain": "example.com"},
daemon=True
)
t.start()
```
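The module table describes the monitor's change detection as SHA-256 based. The idea can be sketched as hashing a canonical serialization of each result and comparing it with the stored hash. The helpers below are hypothetical and only illustrate the technique; they are not the internals of `Monitor.check_target`, and the sample WHOIS data is made up.

```python
import hashlib
import json


def fingerprint(result) -> str:
    """SHA-256 over a canonical JSON form, so key order does not affect the hash."""
    canonical = json.dumps(result, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def has_changed(previous_hash: str, result) -> tuple[bool, str]:
    """Return (changed?, new hash) for a fresh lookup result."""
    current = fingerprint(result)
    return current != previous_hash, current


baseline = fingerprint({"registrar": "Example Registrar", "ns": ["ns1.example.com"]})
# Same data with keys in a different order: not a change.
same, _ = has_changed(baseline, {"ns": ["ns1.example.com"], "registrar": "Example Registrar"})
# A different name server: a change.
moved, _ = has_changed(baseline, {"registrar": "Example Registrar", "ns": ["ns2.example.com"]})
print(same, moved)
```

Storing only the hash keeps the state file small, at the cost of not being able to show what changed without also keeping the previous result.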
Configure email alerts in `.env`:
```
ALERT_EMAIL=you@example.com
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USER=your_gmail@gmail.com
SMTP_PASS=your_app_password
MONITOR_INTERVAL_HOURS=24
```
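For reference, an alert built from these settings could be sent with the standard library alone. This is a sketch under the assumption of STARTTLS on port 587, as the values above suggest; `build_alert` and `send_alert` are hypothetical names and not the functions in `monitoring.py`.

```python
import smtplib
from email.message import EmailMessage


def build_alert(sender, recipient, target, old_hash, new_hash):
    """Compose a plain-text change alert for one monitored target."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[OSINT monitor] change detected: {target}"
    msg.set_content(f"Target: {target}\nPrevious hash: {old_hash}\nCurrent hash: {new_hash}\n")
    return msg


def send_alert(msg, host, port, user, password):
    """Send over SMTP with STARTTLS (defined here but not called in this sketch)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(user, password)
        server.send_message(msg)


alert = build_alert("your_gmail@gmail.com", "you@example.com", "example.com", "abc", "def")
print(alert["Subject"])
```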
## Running Tests
```
# Install dev dependencies
pip install -r requirements-dev.txt
# Run all tests
pytest tests/ -v
# With coverage
pytest tests/ -v --cov=modules --cov=utils --cov=reporting --cov-report=term-missing
# Single test class
pytest tests/test_modules.py::TestBreachCheck -v
# Lint
flake8 . --max-line-length=100
black --check .
# Security scan
bandit -r modules/ utils/ reporting/ main.py
```
## Optional System Dependencies
```
# Tesseract OCR (for image text extraction in the reverse_image module)
# Ubuntu / Debian
sudo apt install tesseract-ocr
# macOS
brew install tesseract
# Then install the Python binding
pip install pytesseract
```
## Publishing to GitHub
```
cd osint-tool
git init
git add .
git commit -m "feat: initial release — OSINT Tool v2.0 (27 modules)"
# Create repo at github.com → New Repository
git remote add origin https://github.com/YOUR_USERNAME/osint-tool.git
git branch -M main
git push -u origin main
# Tag the first release
git tag -a v2.0.0 -m "v2.0.0 — 27 modules"
git push origin v2.0.0
```
GitHub Actions runs lint and tests automatically on every push and PR.
## Contributing
Please read [CONTRIBUTING.md](CONTRIBUTING.md) before submitting a PR.
```
# Create a feature branch
git checkout -b feat/my-new-module
# Follow the module pattern in modules/
# Add CLI flags in main.py
# Update this README
# Run tests & lint
pytest tests/ -v
flake8 . --max-line-length=100
# Commit with conventional commits
git commit -m "feat(modules): add LinkedIn public profile scraper"
git push origin feat/my-new-module
```
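## Security
Found a vulnerability in the tool itself? See [SECURITY.md](SECURITY.md). Please disclose it privately rather than through a public issue.
## Changelog
See [CHANGELOG.md](CHANGELOG.md) for the full version history.
## License
[MIT License](LICENSE) with additional ethical-use terms.
By using this tool, you agree to investigate only targets for which you have explicit written authorization, to comply with all applicable laws, and not to use the tool to harm individuals or organizations.
Built for the security research community · Use responsibly · Star the repo if you find it useful ⭐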