thalha-a9/helix
██╗ ██╗███████╗██╗ ██╗██╗ ██╗
██║ ██║██╔════╝██║ ██║╚██╗██╔╝
███████║█████╗ ██║ ██║ ╚███╔╝
██╔══██║██╔══╝ ██║ ██║ ██╔██╗
██║ ██║███████╗███████╗██║██╔╝ ██╗
╚═╝ ╚═╝╚══════╝╚══════╝╚═╝╚═╝ ╚═╝
### Decode the digital DNA of any identity
**Helix** is a next-generation open-source OSINT framework that goes far beyond username checking.
It maps the *actual connections* between a target's online identities, then renders them as a
live, interactive D3.js relational graph you can explore, filter, and export.

[**Quick Start**](#-quick-start) · [**Features**](#-what-makes-helix-different) · [**Modules**](#-intelligence-modules) · [**Graph**](#-the-graph) · [**Install**](#-installation)
## Why Helix?
Most OSINT tools answer one question: *"Does this username exist on Platform X?"*
Helix answers a harder one: **"How do all these accounts connect to the same person?"**
It extracts cross-platform links from bios, matches profile pictures by perceptual hash, infers timezone from commit patterns, discovers domains via certificate transparency, and plots every relationship as a glowing edge in a browser-based network graph, all in a single command:

    python helix.py -u johndoe --wayback --crt --paste --pivot --phash
## ⚡ What Makes Helix Different
| Capability | Sherlock | SpiderFoot | Maltego | **Helix** |
|---|:---:|:---:|:---:|:---:|
| Username enumeration | ✓ | ✓ | ✓ | ✓ |
| Relational bio-link graph | ✗ | ✗ | Partial | **✓** |
| Recursive alias pivot | ✗ | ✗ | Manual | **✓ auto** |
| Perceptual avatar matching | ✗ | ✗ | ✗ | **✓** |
| Timezone inference | ✗ | ✗ | ✗ | **✓** |
| Wayback identity timeline | ✗ | Partial | ✗ | **✓** |
| Certificate transparency | ✗ | ✓ | ✓ | **✓** |
| GitHub commit email extraction | ✗ | ✗ | ✗ | **✓** |
| Local heuristic verifier | ✗ | ✗ | ✗ | **✓ always-on** |
| Multi-AI false-positive filter | ✗ | ✗ | ✗ | **✓ 3 providers** |
| Async speed | ✗ | ✗ | ✗ | **✓** |
| 100% free & open source | ✓ | ✓ | ✗ | **✓** |
## 🔍 Intelligence Modules
### Always On
- **Local Heuristic Verifier**: Zero-dependency false-positive engine. Scores every result across 8 signals, including WAF pages, generic titles, login redirects, and homepage redirects. Runs before anything else, on every single scan.
### Core Flags
| Flag | What it does |
|---|---|
| `--wmn` | Loads WhatsMyName database at runtime — **700+ platforms**, community-maintained |
| `--maigret` | Loads **Maigret** database at runtime — sophisticated detection with `presenceStrs`/`absenceStrs`, 24h cached |
| `--sherlock` | Loads Sherlock's database at runtime — **400+ platforms**, cached 24h locally |
| `--pivot` | **Recursive bio pivot** — finds aliases in bios and auto-scans them, up to 4 hops deep |
| `--phash` | **Perceptual avatar hash** — downloads profile pics, hashes them, cross-matches across platforms. Finds the same person even if they changed their username |
| `--wayback` | **Wayback Machine** — fetches snapshot history + parses archived HTML for old usernames, historic emails, and past bios |
| `--crt` | **Certificate Transparency** — queries crt.sh for SSL certs containing the target's name or email. Finds personal domains that never appeared in any bio |
| `--paste` | **Paste Intelligence** — searches GitHub Gists and public Pastebin index for mentions |
| `--breach` | **Breach check** — queries XposedOrNot for breach metadata (names, dates, data types exposed). No credentials returned |
| `--holehe` | **Deep email scan** — hands off to holehe for 120+ platform email-registration checks |
| `--ai` | **AI false-positive filter** — second verification pass via Claude, OpenRouter (free), or NVIDIA NIM (free) |
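To make the `--phash` row concrete, here is a minimal sketch of difference hashing (dHash), one common perceptual hash. This README does not say which algorithm Helix uses, so treat the choice of dHash, the function names, and the `max_distance` cutoff as assumptions. The sketch also assumes each avatar has already been decoded and downscaled to a 9x8 grayscale grid (for example with an imaging library), which yields a 64-bit hash.

```python
from typing import List

Grid = List[List[int]]  # rows of grayscale values in the range 0-255


def dhash(grid: Grid) -> int:
    """Difference hash: one bit per pair of horizontally adjacent pixels."""
    bits = 0
    for row in grid:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness falls from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def same_avatar(a: Grid, b: Grid, max_distance: int = 10) -> bool:
    """Treat two avatars as a match when their hashes are close.

    max_distance is a hypothetical cutoff, not a value taken from Helix.
    """
    return hamming(dhash(a), dhash(b)) <= max_distance
```

Because the hash only records whether brightness rises or falls between neighbours, uniform brightness changes and mild recompression tend to leave it unchanged, which is why it can link accounts whose usernames differ.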
### Auto-Triggered
- **GitHub Deep Recon** — runs automatically when a GitHub profile is found. Extracts real emails from public commits (filters noreply), org memberships, language stats, npm packages, and infers timezone from commit timestamp distribution (requires ≥15 commits for confidence)
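The timezone inference step can be sketched as follows. This is an illustration under stated assumptions, not Helix's actual algorithm: it assumes most commits happen between 09:00 and 23:59 local time and picks the UTC offset that places the most commits inside that window. Only the 15-commit minimum comes from this README; `infer_utc_offset` and `AWAKE_HOURS` are hypothetical names.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, Optional

MIN_COMMITS = 15            # from the README: fewer commits give no confident answer
AWAKE_HOURS = range(9, 24)  # assumption: commits cluster in 09:00-23:59 local time


def infer_utc_offset(commits: Iterable[datetime]) -> Optional[int]:
    """Guess a whole-hour UTC offset from timezone-aware commit timestamps."""
    # Histogram of commits per UTC hour of day.
    hours = Counter(c.astimezone(timezone.utc).hour for c in commits)
    if sum(hours.values()) < MIN_COMMITS:
        return None

    def awake_count(offset: int) -> int:
        # Commits that would fall inside the awake window at this offset.
        return sum(n for h, n in hours.items() if (h + offset) % 24 in AWAKE_HOURS)

    # Ties resolve to the lowest offset in the range -12..+14.
    return max(range(-12, 15), key=awake_count)
```

A result from this kind of heuristic is a hint rather than proof: night-shift workers, CI bots, and rebased commits all skew the distribution.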
## 🕸 The Graph
The HTML output is a standalone zero-dependency interactive network — no server needed, just open in a browser.
- White pulsing node → Username root
- Amber pulsing node → Email root
- Amber/orange nodes → Pivot-discovered aliases
- Green solid edges → Bio-extracted cross-links (proven connections)
- Pink dashed edges → Avatar hash matches (same person across accounts)
- Amber dashed edges → Email-matched platforms
- Green ring on node → High confidence (OG meta validated)
- Blue ring on node → Medium confidence

**Controls:** drag nodes · scroll to zoom · click node to open profile · hover for tooltip (confidence, og:title, cross-link partners, bio-extracted alias details) · ⌕ search · ◌ not-found overlay · ☰ labels · ↓ SVG export · filter by confidence
## 🚀 Quick Start

    git clone https://github.com/thalha-a9/helix.git
    cd helix
    pip install -r requirements.txt
    python helix.py -u johndoe
## 📦 Installation
**Required**

    pip install aiohttp

**Optional: unlock more power**

**Or install everything at once**

    pip install "helix-osint[full]"

**Set `GITHUB_TOKEN` for 5000 req/hr on the GitHub API** (optional; the default is 60/hr):

    export GITHUB_TOKEN=ghp_yourtoken
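For context, this is roughly how a client can turn that environment variable into request headers. It is a sketch, not Helix's code: `github_headers` is a hypothetical helper, and it assumes the token is sent as a standard `Authorization: Bearer` header.

```python
import os
from typing import Dict, Optional


def github_headers(token: Optional[str] = None) -> Dict[str, str]:
    """Build GitHub API headers, authenticating only when a token is available."""
    if token is None:
        # Fall back to the environment variable described above.
        token = os.environ.get("GITHUB_TOKEN")
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Without a token the same requests still work, just at the lower unauthenticated rate limit.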
## 💻 Usage

    # Basic scan: opens the interactive graph automatically
    python helix.py -u johndoe

    # Full power: all intelligence modules
    python helix.py -u johndoe --wayback --crt --paste --pivot --phash

    # Username + email: two root nodes, cross-matched in the graph
    python helix.py -u johndoe -e johndoe@gmail.com --breach --holehe

    # Massive scan: 1100+ platforms
    python helix.py -u johndoe --wmn --sherlock

    # AI-verified scan (free tier, no API cost)
    python helix.py -u johndoe --ai openrouter

    # Recursive pivot: auto-scan aliases up to 4 hops deep
    python helix.py -u johndoe --pivot --pivot-depth 4

    # Permutations: scan johndoe1, john.doe, realjohndoe, etc.
    python helix.py -u johndoe --permutations

    # Everything, saved to a custom dir, no browser
    python helix.py -u johndoe -e johndoe@gmail.com \
        --wmn --sherlock --wayback --crt --paste \
        --pivot --phash --breach --holehe \
        --ai openrouter --format all --no-browser --output ~/Desktop/report

    # Check which AI providers are configured
    python helix.py --providers
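To illustrate what `--permutations` expands a username into, here is a small sketch that reproduces the three examples above (`johndoe1`, `john.doe`, `realjohndoe`). The suffix, prefix, and separator lists and the `split_at` parameter are hypothetical; the rules Helix actually applies live in `osint/permutations.py` and are not documented here.

```python
from typing import List, Optional

SUFFIXES = ("1", "123", "_")    # hypothetical
PREFIXES = ("real", "the")      # hypothetical
SEPARATORS = (".", "_", "-")    # hypothetical


def variations(username: str, split_at: Optional[int] = None) -> List[str]:
    """Generate simple username variations.

    split_at marks where a separator may be inserted (e.g. 4 for "john|doe").
    """
    out = [username + s for s in SUFFIXES]
    out += [p + username for p in PREFIXES]
    if split_at is not None and 0 < split_at < len(username):
        head, tail = username[:split_at], username[split_at:]
        out += [head + sep + tail for sep in SEPARATORS]
    # De-duplicate while preserving order; never return the input itself.
    return [v for v in dict.fromkeys(out) if v != username]
```

For example, `variations("johndoe", split_at=4)` includes `johndoe1`, `realjohndoe`, and `john.doe`.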
## 🤖 AI Verification
Helix has a two-layer false-positive filter:
**Layer 1 — Local heuristic verifier** (always on, zero cost)
Scores every result across 8 signals. A single generic title (e.g. `"Pinterest"` instead of a username) instantly purges the result. WAF/Cloudflare pages are scored separately at 80 points. The threshold is 60 for normal results and 85 for OG-validated high-confidence results.
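The scoring rule above might look something like the sketch below. The 80-point WAF score, the two thresholds, and the instant purge on a generic title come from this README; everything else is an assumption, including the idea that a result is purged once its score reaches the threshold, the two redirect weights, and the `Signals` / `is_false_positive` names. Only four of the eight signals are modelled.

```python
from dataclasses import dataclass

WAF_POINTS = 80                # from the README
NORMAL_THRESHOLD = 60          # from the README
OG_VALIDATED_THRESHOLD = 85    # from the README
LOGIN_REDIRECT_POINTS = 40     # hypothetical weight
HOMEPAGE_REDIRECT_POINTS = 40  # hypothetical weight


@dataclass
class Signals:
    generic_title: bool = False
    waf_page: bool = False
    login_redirect: bool = False
    homepage_redirect: bool = False
    og_validated: bool = False


def is_false_positive(s: Signals) -> bool:
    """Return True when a result should be purged (assumed: score >= threshold)."""
    if s.generic_title:
        return True  # a generic title purges the result on its own
    score = 0
    if s.waf_page:
        score += WAF_POINTS
    if s.login_redirect:
        score += LOGIN_REDIRECT_POINTS
    if s.homepage_redirect:
        score += HOMEPAGE_REDIRECT_POINTS
    # OG-validated results get the higher bar, so they are harder to purge.
    threshold = OG_VALIDATED_THRESHOLD if s.og_validated else NORMAL_THRESHOLD
    return score >= threshold
```

Under these assumptions a WAF page alone (80 points) purges a normal result but not an OG-validated one, which matches the two thresholds quoted above.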
**Layer 2 — AI verifier** (`--ai`, optional)
Sends uncertain results to an LLM with a strict system prompt. Three providers:
| Provider | Flag | Cost | Setup |
|---|---|---|---|
| Anthropic Claude | `--ai claude` | Paid | `export ANTHROPIC_API_KEY=...` |
| OpenRouter Llama 3.1 | `--ai openrouter` | **Free tier** | `export OPENROUTER_API_KEY=...` → [openrouter.ai](https://openrouter.ai) |
| NVIDIA NIM Llama 3.1 | `--ai nvidia` | **Free tier** | `export NVIDIA_API_KEY=...` → [build.nvidia.com](https://build.nvidia.com) |
## 🏗 Architecture

    helix/
    ├── helix.py                        ← CLI entry point + orchestrator
    ├── pyproject.toml                  ← pip installable (helix-osint)
    ├── osint/
    │   ├── checker.py                  ← Async engine (aiohttp + optional curl_cffi)
    │   ├── platforms.py                ← 70+ platform definitions with OG/API detection
    │   ├── verifier.py                 ← Local heuristic false-positive engine
    │   ├── graph.py                    ← D3.js relational graph generator
    │   ├── report.py                   ← JSON / CSV / TXT exporters
    │   ├── permutations.py             ← Username variation generator
    │   ├── pivot.py                    ← Concurrent BFS alias pivot engine
    │   ├── phash.py                    ← Perceptual avatar hash matcher
    │   ├── modules/
    │   │   ├── wayback.py              ← Archive.org CDX API + archived HTML parser
    │   │   ├── github_deep.py          ← GitHub API deep recon + timezone inference
    │   │   ├── crt.py                  ← Certificate transparency (crt.sh)
    │   │   └── paste.py                ← Gist + Pastebin intelligence
    │   └── adapters/
    │       ├── sherlock_adapter.py     ← Sherlock data.json loader (24h cached)
    │       ├── wmn_adapter.py          ← WhatsMyName loader
    │       ├── holehe_adapter.py       ← holehe email scanner wrapper
    │       ├── breach_adapter.py       ← XposedOrNot breach metadata
    │       └── ai_verifier.py          ← Multi-provider async AI verification
    └── results/                        ← Output (git-ignored)
        └── username/
            ├── username_graph.html     ← Interactive D3.js network graph
            ├── username_TIMESTAMP.json ← Full structured report
            ├── username_TIMESTAMP.csv
            └── username_TIMESTAMP.txt
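As an illustration of what the pivot engine in `pivot.py` does, here is a depth-limited breadth-first search over aliases. It is a sequential sketch, whereas the tree above describes a concurrent engine. The `find_aliases` callback is hypothetical and stands in for "scan this username and return the aliases found in its bios".

```python
from collections import deque
from typing import Callable, Dict, Iterable


def pivot(
    root: str,
    find_aliases: Callable[[str], Iterable[str]],
    max_depth: int = 4,
) -> Dict[str, int]:
    """Return every username reachable from root, mapped to its hop count."""
    depth_of = {root: 0}
    queue = deque([root])
    while queue:
        user = queue.popleft()
        depth = depth_of[user]
        if depth == max_depth:
            continue  # hop limit reached: record the alias but do not expand it
        for alias in find_aliases(user):
            if alias not in depth_of:  # skip aliases that were already seen
                depth_of[alias] = depth + 1
                queue.append(alias)
    return depth_of
```

Tracking visited aliases is what keeps mutual bio links (A points to B, B points back to A) from looping forever, and the hop count gives the graph a natural way to label how far each alias sits from the root.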
## 🔬 How False Positive Prevention Works
Helix uses the right detection method per platform instead of naive HTTP 200 checks:
| Platform | Method | Why |
|---|---|---|
| Reddit | `reddit.com/user/{u}/about.json` → `"is_employee"` field | JSON API; field only exists for valid users |
| Bluesky | AT Protocol API | SPA — static HTML is useless |
| Chess.com | `api.chess.com/pub/player/{u}` | Official public API |
| Lichess | `lichess.org/api/user/{u}` | Official public API |
| GitHub | `og:title` parsed + validated against known error strings | Server-side rendered, reliable |
| Medium | `og:title` rejects homepage redirect string | Catches "Where good ideas find you" |
| Twitter/X | `curl_cffi` TLS impersonation | Requires the optional `curl_cffi` package; skipped gracefully without it |
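The `og:title` rows can be illustrated with a short standard-library sketch. It is not Helix's implementation: the `profile_exists` name, the second error string, and the rule that the username must appear in the title are all assumptions. Only the Medium homepage string comes from the table above.

```python
from html.parser import HTMLParser


class _OGTitleParser(HTMLParser):
    """Collects the content of the first <meta property="og:title"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.og_title = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.og_title is not None:
            return
        attributes = dict(attrs)
        if attributes.get("property") == "og:title":
            self.og_title = attributes.get("content")


# First entry is the Medium homepage string from the table; the second is hypothetical.
ERROR_STRINGS = ("where good ideas find you", "page not found")


def profile_exists(html: str, username: str) -> bool:
    """Accept a profile only if og:title looks like a real profile title."""
    parser = _OGTitleParser()
    parser.feed(html)
    title = (parser.og_title or "").lower()
    if not title or any(err in title for err in ERROR_STRINGS):
        return False
    # Assumption: a genuine profile page mentions the username in its title.
    return username.lower() in title
```

The point of this approach is that a redirect to a homepage or error page still returns HTTP 200, so the status code alone cannot distinguish it from a real profile, while the parsed title can.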
## 📋 Output Formats
| Format | Contents |
|---|---|
| `.html` | Standalone interactive D3.js graph — no server needed |
| `.json` | Full structured report including intel bundle (wayback, GitHub deep, CRT, paste) |
| `.csv` | Spreadsheet-friendly, all platforms |
| `.txt` | Clean terminal-style summary |
## ⚠️ Legal & Ethics
Helix is built for **security research, bug bounty reconnaissance, and OSINT education**.
All data sources used are publicly accessible. Always ensure you have proper authorization
before running reconnaissance on any target. The author is not responsible for misuse.
## 📎 Related Projects
- [esp32-iot-audit](https://github.com/thalha-a9/esp32-iot-audit) — ESP32 IoT security scanner
- [esp-pentest-toolkit](https://github.com/thalha-a9/esp-pentest-toolkit) — Wireless ESP32/8266 pentest toolkit
Built by **Thalha Ahmed** · [@thalha-a9](https://github.com/thalha-a9)