Lazarus-AI/clearwing

GitHub: Lazarus-AI/clearwing

一个基于 LangGraph 的双模式攻防工具,兼具网络渗透代理与源码安全猎手能力。

Stars: 616 | Forks: 86

# Clearwing image By Eric Hartford, Lazarus AI Inspired by Anthropic's Glasswing. The challenge: Produce similar results as Glasswing - using models everyone has access to. **Autonomous vulnerability scanner and source-code hunter built on LangGraph.** Clearwing is a dual-mode offensive-security tool: - **Network-pentest agent** — a ReAct-loop agent with 63 bind-tools that scans live targets, detects services and vulnerabilities, runs sandboxed Kali tools, attempts exploits (gated through a human-approval guardrail), and writes reports to a persistent knowledge graph. - **Source-code hunter** — a file-parallel agent-driven pipeline that ranks source files, fans out per-file hunter agents, uses ASan/UBSan crashes as ground truth, verifies findings with an adversarial second-pass agent, optionally generates validated patches, and emits SARIF/markdown/JSON reports with explicit evidence levels (`suspicion → static_corroboration → crash_reproduced → root_cause_explained → exploit_demonstrated → patch_validated`). **Authorized use only.** Clearwing is a dual-use offensive-security tool. Run it only against targets you own or have explicit written authorization to test. Operators are responsible for scope, legal authorization, and disclosure. See `SECURITY.md`. ## Install **End users** — install the tagged release straight from GitHub: ``` git clone https://github.com/Lazarus-AI/clearwing.git cd clearwing # uv sync is recommended because Clearwing pins genai-pyo3 through # tool.uv.sources in pyproject.toml. uv sync --all-extras source .venv/bin/activate # fish: source .venv/bin/activate.fish # Interactive setup wizard — menu-driven provider selection, # credential entry, optional live test, persists to ~/.clearwing/config.yaml clearwing setup # Environment check — verifies Python, credentials, Docker daemon, # external tools, optional extras, and network reachability clearwing doctor clearwing --version # 1.0.0 clearwing --help ``` Or skip the wizard and configure directly: ``` # Anthropic direct export ANTHROPIC_API_KEY=sk-ant-... # Or any OpenAI-compatible endpoint — OpenRouter, Ollama, LM Studio, # vLLM, Together, Groq, DeepSeek, OpenAI: export CLEARWING_BASE_URL=https://openrouter.ai/api/v1 export CLEARWING_API_KEY=sk-or-... export CLEARWING_MODEL=anthropic/claude-opus-4 ``` See [`docs/providers.md`](docs/providers.md) for provider-specific recipes and per-task routing. **Developers** — clone and install the locked development environment: ``` git clone https://github.com/Lazarus-AI/clearwing.git cd clearwing uv sync --all-extras source .venv/bin/activate # fish: source .venv/bin/activate.fish clearwing --help ``` Requirements: Python 3.10+, a recent Rust toolchain for the native `genai-pyo3` bridge, and optionally Docker for the Kali container and sanitizer-image sandbox features. If the install fails with a Rust version error, run `rustup update stable`. ## Quickstart ``` # Network scan a single target clearwing scan 192.168.1.10 -p 22,80,443 --detect-services # Source-code hunt a repo (standard depth — sandboxed LLM hunters, # adversarial verifier, mechanism memory, variant loop) clearwing sourcehunt https://github.com/example/project \ --depth standard # Interactive ReAct chat with the full tool set clearwing interactive # Non-interactive CI mode with SARIF output for GitHub Code Scanning clearwing ci --config .clearwing.ci.yaml --sarif results.sarif ``` See [`docs/quickstart.md`](docs/quickstart.md) for a fuller walkthrough including credentials, session resume, and mission-mode operation. ## Running sourcehunt on a local repo (FFmpeg example) The `clearwing sourcehunt ` CLI clones a remote URL. To hunt an already-cloned tree (e.g. FFmpeg) with the native-async pipeline and a self-hosted OpenAI-compatible backend, drive `SourceHuntRunner` directly: ``` # 1. Clone the target once git clone https://github.com/FFmpeg/FFmpeg.git # 2. Run sourcehunt against the local checkout uv run python -u - <<'PY' import logging from clearwing.llm.native import AsyncLLMClient from clearwing.sourcehunt.runner import SourceHuntRunner logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(name)s: %(message)s') REPO = './FFmpeg' RUN_DIR = './sourcehunt-results-ffmpeg' COMMON = dict( provider_name='openai_resp', # or 'openai' for /v1/chat/completions api_key='YOUR_KEY', base_url='http://localhost:8183/v1', # any OpenAI-compatible endpoint max_concurrency=15, ) # One client per stage — routes each stage to a different model ranker_llm = AsyncLLMClient(model_name='gpt-5.4-mini', **COMMON) hunter_llm = AsyncLLMClient(model_name='gpt-5.4', **COMMON) verifier_llm = AsyncLLMClient(model_name='gpt-5.4-mini', **COMMON) exploiter_llm = AsyncLLMClient(model_name='gpt-5.3-codex', **COMMON) runner = SourceHuntRunner( repo_url=REPO, local_path=REPO, depth='standard', budget_usd=1000.0, max_parallel=15, output_dir=RUN_DIR, output_formats=['json', 'markdown'], ranker_llm=ranker_llm, hunter_llm=hunter_llm, verifier_llm=verifier_llm, exploiter_llm=exploiter_llm, enable_patch_oracle=True, ) print(runner.run()) # sync wrapper; internally drives SourceHuntRunner.arun() PY ``` Findings land in `./sourcehunt-results-ffmpeg/sh-/` as JSON + markdown once the run completes. FFmpeg is ~10k source files, so expect the large-repo ranker to preselect candidates and the tier-A hunter pool to run for hours. Redirect stdout/stderr to a file if you plan to detach the process — the runner's own artifacts are only written at the end. `AsyncLLMClient` accepts `provider_name` values `openai_resp` (the streaming `/v1/responses` shape) or `openai` (standard `/v1/chat/completions`); point `base_url` at any OpenAI-compatible server. See [`docs/providers.md`](docs/providers.md) for the managed-provider paths. ## Architecture at a glance ``` ┌──────────────────────┐ ┌────────────────────────────────┐ │ Network-pentest agent│ │ Source-code hunter │ │ clearwing.agent.graph│ │ clearwing.sourcehunt.runner │ │ (63 tools, ReAct) │ │ │ │ │ │ preprocess → rank → pool → │ │ │ │ hunter → verify → exploit → │ │ │ │ variant loop → auto-patch → │ │ │ │ report │ └─────────┬────────────┘ └────────┬───────────────────────┘ │ │ └───────────┬─────────────────┘ ▼ ┌───────────────────────────────────────────────────────────────┐ │ Shared substrate │ │ Finding dataclass │ capabilities probe │ sandbox layer │ │ knowledge graph │ episodic memory │ event bus │ │ telemetry │ guardrails + audit │ CVSS scoring │ └───────────────────────────────────────────────────────────────┘ ``` Deep dives live in [`docs/`](docs/): | Doc | What it covers | |---|---| | [`docs/index.md`](docs/index.md) | Landing page + table of contents | | [`docs/quickstart.md`](docs/quickstart.md) | Full install + first run walkthrough | | [`docs/providers.md`](docs/providers.md) | OpenRouter / Ollama / LM Studio / vLLM / Together / Groq recipes, per-task routing, env-var precedence | | [`docs/architecture.md`](docs/architecture.md) | Both pipelines, substrate, capability gating, tool layout | | [`docs/cli.md`](docs/cli.md) | Every subcommand flag, grouped by workflow | | [`docs/api.md`](docs/api.md) | API reference (mkdocstrings autogen) | Once the GitHub Pages workflow ships, docs will be hosted at . ## Development ``` uv sync --all-extras source .venv/bin/activate # fish: source .venv/bin/activate.fish pytest -q ruff check clearwing/ tests/ ruff format --check clearwing/ tests/ mypy --follow-imports=silent \ clearwing/findings \ clearwing/sourcehunt \ clearwing/capabilities.py \ clearwing/agent/tools \ clearwing/core python -m mkdocs serve --dev-addr 127.0.0.1:8000 ``` See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the full dev-setup guide and PR checklist. ## Reporting vulnerabilities There are two lanes, and they go to different places: - **Vulnerabilities *in* Clearwing** → GitHub Security Advisories (). See [`SECURITY.md`](SECURITY.md) for scope, SLA, and safe-harbor. - **Vulnerabilities Clearwing *finds* in someone else's software** → that vendor's disclosure channel. `clearwing sourcehunt --export-disclosures` generates pre-filled MITRE CVE-request and HackerOne templates for every finding at `evidence_level >= root_cause_explained`. ## License MIT. See [`LICENSE`](LICENSE).
标签:AI 安全工具, ASan, Clearwing, Glasswing 启发, LangGraph, Maven, PyRIT, ReAct 框架, SARIF, UBSan, 人类审批护栏, 代码狩猎, 双模式, 可视化界面, 多智能体系统, 安全测试, 对抗验证, 持久化存储, 攻击性安全, 服务发现, 沙箱工具, 源代码审计, 漏洞验证, 结构化查询, 自主智能体, 自动化安全, 补丁生成, 证据等级, 请求拦截, 进攻性安全, 逆向工具, 靶机