aharwelik/credsweep

GitHub: aharwelik/credsweep

![credsweep](https://static.pigsec.cn/wp-content/uploads/repos/2026/06/39db3d7b8f205231.svg)

# credsweep

**Zero-dependency secret & cloud-key scanner: one tool, two native runtimes.**

[![CI](https://static.pigsec.cn/wp-content/uploads/repos/2026/06/23143051cc205233.svg)](https://github.com/aharwelik/credsweep/actions/workflows/ci.yml) [![License: MIT](https://img.shields.io/badge/License-MIT-FF7A18.svg)](LICENSE) [![Bash](https://img.shields.io/badge/bash-3.2%2B-1A73E8.svg)]() [![PowerShell](https://img.shields.io/badge/PowerShell-7%2B-1A73E8.svg)]() [![SARIF](https://img.shields.io/badge/output-SARIF%202.1.0-444.svg)]()

`credsweep` finds leaked credentials, cloud keys, and private keys before they reach a commit, a CI log, or a public repo. It ships as **two functionally equivalent implementations**, a POSIX-friendly `bash` script and a cross-platform `PowerShell` script, so the *same* scan runs on a developer's Mac, a Linux CI runner, and a Windows build agent with **nothing to install**. No Python, no Go binary, no `npm install`, no network calls.

Most secret scanners are heavy (Docker images, language runtimes, cloud accounts). `credsweep` is two files you can drop into any repo and read top to bottom in five minutes.

## Why this exists

- **Two runtimes, one ruleset.** Security teams live in PowerShell; app teams live in bash. `credsweep` gives both the identical 15-rule detection engine and identical JSON/SARIF output, so findings are comparable no matter who ran the scan.
- **CI-native.** Emits **SARIF 2.1.0**, so findings show up inline in the GitHub *Security → Code scanning* tab with zero extra glue.
- **Pre-commit by design.** Non-zero exit on any finding; a ready-to-symlink git hook is included.
- **Offline & auditable.** No telemetry, no API keys required. The optional AI triage step is strictly opt-in.

## Detections (15 rules)

| Provider / type | Severity | Provider / type | Severity |
|---|---|---|---|
| AWS Access Key ID (`AKIA…`) | HIGH | AWS Secret Access Key | CRITICAL |
| GCP API key (`AIza…`) | HIGH | Google OAuth secret (`GOCSPX-`) | HIGH |
| GitHub token (`ghp_/gho_/…`) | HIGH | GitHub fine-grained PAT | HIGH |
| Slack token (`xox…`) | HIGH | Slack webhook URL | MEDIUM |
| Stripe secret key (`sk_live_`) | CRITICAL | OpenAI key (`sk-…`) | HIGH |
| Anthropic key (`sk-ant-…`) | HIGH | npm token (`npm_…`) | HIGH |
| Azure Storage connection key | CRITICAL | Azure client secret | HIGH |
| PEM private key block | CRITICAL | JWT | MEDIUM |

Plus an **opt-in entropy rule** (`--entropy` / `-Entropy`) that flags generic `password=`/`token=`/`secret=` assignments only when the value's Shannon entropy clears a threshold, cutting the false positives that make most generic scanners unusable.

## Install

    git clone https://github.com/aharwelik/credsweep.git
    cd credsweep
    chmod +x credsweep.sh

That's it. Optionally symlink onto your `PATH`:

    ln -s "$PWD/credsweep.sh" /usr/local/bin/credsweep

## Usage

**Bash**

    ./credsweep.sh .                     # scan current tree (human table)
    ./credsweep.sh src --format json     # machine-readable JSON
    ./credsweep.sh . --format sarif > r.sarif
    ./credsweep.sh . --entropy           # add high-entropy generic detection
    ./credsweep.sh . --no-fail           # report but never break the build
    ./credsweep.sh . --exclude-dir fixtures --exclude-dir testdata

**PowerShell** (macOS / Linux / Windows)

    ./credsweep.ps1 .                    # scan current tree
    ./credsweep.ps1 src -Format json
    ./credsweep.ps1 . -Format sarif > r.sarif
    ./credsweep.ps1 . -Entropy
    ./credsweep.ps1 . -NoFail

### Try it on the demo fixtures

The repo ships a generator that writes a throwaway project full of (fake) secrets, so no secret-shaped literal is ever committed to git:

    bash examples/generate-fixtures.sh   # creates examples/leaky-project/ (gitignored)
    ./credsweep.sh examples              # → 10 findings

### Example output

    credsweep 1.0.0 — 10 finding(s) in examples
    CRITICAL  examples/leaky-project/config.example.env:3  aws-secret-access-key  aws_…****…EY
    HIGH      examples/leaky-project/config.example.env:2  aws-access-key-id      AKIA…****…LE
    MEDIUM    examples/leaky-project/config.example.env:9  slack-webhook          http…****…XX

Matched secrets are **redacted by default** (`first4…****…last2`). Pass `--show-secrets` only when you truly need the raw value.

## CI integration (GitHub Actions)

`credsweep` ships a workflow (`.github/workflows/ci.yml`) that lints both scripts and runs the scanner against generated fixtures on every push. To gate *your* repo on secret leaks and surface them in the Security tab:

    - name: Scan for secrets
      run: ./credsweep.sh . --format sarif > credsweep.sarif
    - uses: github/codeql-action/upload-sarif@v3
      if: always()   # the scan step exits non-zero on findings; upload the report anyway
      with:
        sarif_file: credsweep.sarif

## Pre-commit hook

    ln -s ../../hooks/pre-commit .git/hooks/pre-commit

Now any `git commit` that would introduce a secret is blocked locally.

## Optional AI triage

For teams that want a written risk summary, `credsweep` can pipe its JSON to an LLM for plain-English triage (which keys are highest risk, suggested rotation order). This is **opt-in and offline-by-default**; see [`docs/ai-triage.md`](docs/ai-triage.md). The core scanner never needs an API key.

## How it works

- A single rule table (`name | severity | case-flag | regex`) drives detection in both implementations, so adding a provider means editing one row in each file.
- Binary files and noisy directories (`.git`, `node_modules`, `vendor`, `dist`, `target`, `.terraform`, …) are skipped automatically.
- The entropy gate uses a from-scratch Shannon-entropy calculation (no libraries) so the generic rule fires on real secrets, not on long-but-low-entropy strings like URLs.

## Limitations (honest notes)

- Regex + entropy detection is high-signal but not exhaustive; treat a clean scan as "no *known patterns* found," not a proof of safety.
- Provider prefixes overlap (e.g. `sk-ant-…` also matches the broad `sk-…` OpenAI rule), so a single Anthropic key may be reported by two rules. Both say the same thing: rotate it.
- The entropy threshold (3.5 bits/char) is tuned for typical secrets; lowering it makes the generic rule flag more values, which catches weaker secrets at the cost of more false positives.

## Author

**Anthony Harwelik**, founder, **Sole Priority LLC** / **BlueTech Green**. Security & AI tooling, automation, and cloud engineering.

- Email: **aharwelik@gmail.com**
- Web: **https://bluetechgreen.com**
- GitHub: **[@aharwelik](https://github.com/aharwelik)**

Open to consulting and collaboration on security automation and AI-assisted DevSecOps.

## License

[MIT](LICENSE) © Anthony Harwelik
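
## Appendix: entropy gate sketch

The entropy gate described under "How it works" can be illustrated with a short shell sketch. This is not code from `credsweep.sh`: the function names `shannon_entropy` and `passes_entropy_gate` are hypothetical, and the sketch assumes a POSIX `awk` is available for the logarithm, which the real script may well avoid. It computes bits per character as the sum of `-p * log2(p)` over each distinct character's frequency `p`, then compares the result to the 3.5 bits/char threshold mentioned in the limitations.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; these helpers are hypothetical, not credsweep's code.

# Print the Shannon entropy of "$1" in bits per character (three decimals).
shannon_entropy() {
  printf '%s' "$1" | awk '
    {
      # Count every character of the input and track the total length.
      len = length($0)
      for (i = 1; i <= len; i++) count[substr($0, i, 1)]++
      n += len
    }
    END {
      if (n == 0) { print "0.000"; exit }
      h = 0
      for (c in count) {
        p = count[c] / n
        h -= p * log(p) / log(2)   # awk log() is natural log; divide to get log2
      }
      printf "%.3f\n", h
    }'
}

# Succeed (exit 0) when the entropy of "$1" is at least the threshold "$2".
# The default of 3.5 is the bits/char figure quoted in the README.
passes_entropy_gate() {
  awk -v h="$(shannon_entropy "$1")" -v t="${2:-3.5}" 'BEGIN { exit !(h >= t) }'
}

shannon_entropy "aaaaaaaaaaaaaaaa"   # prints 0.000 (one repeated character)
shannon_entropy "0123456789abcdef"   # prints 4.000 (16 equally likely characters)
```

A repeated-character value scores 0 bits/char and fails the gate, while a value drawn evenly from 16 distinct characters scores 4 bits/char and passes the default threshold. That gap is what lets a generic `password=` rule skip low-variety strings.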