pq-cybarg/digger

GitHub: pq-cybarg/digger

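Cross-platform endpoint forensics suite that locks in evidence integrity with a dual hash chain and post-quantum signatures, assists analysis with local AI triage, and enforces an ethics contract in code. Suited to incident response and compliance audits.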


# digger

**Cross-platform endpoint forensics suite. Runs entirely on your machine. Codified ethics, post-quantum signed evidence, 32 detectors, including a defensive mirror of every offensive-tool kill-chain stage.**

[![Docs](https://img.shields.io/badge/docs-pq--cybarg.github.io%2Fdigger-2ea44f)](https://pq-cybarg.github.io/digger/)
[![License](https://img.shields.io/badge/license-MIT-blue)](#license)
[![Tests](https://img.shields.io/badge/tests-337%20passing-2ea44f)]()
[![Platform](https://img.shields.io/badge/platform-macOS%20%7C%20Linux%20%7C%20Windows-lightgrey)]()
[![Python](https://img.shields.io/badge/python-3.11%2B-blue)]()
[![Ethics](https://img.shields.io/badge/ethics-load--bearing-orange)](ETHICS.md)
[![PQC](https://img.shields.io/badge/PQC-ML--DSA--65%20%7C%20ML--KEM--768-purple)]()

```
┌────────────────────────────────────────────────────────────────┐
│ collectors → Artifacts → EvidenceStore → detectors → Findings  │
│                   ▲                                     │      │
│                   │                                     ▼      │
│ chain-of-custody  │                          AI triage (local) │
│ PQC signature     │                                            │
│                   │ live intel feeds → detector inputs  ▼      │
│ signature-base / LOKI corpus                 reports • exports │
└────────────────────────────────────────────────────────────────┘
```

## What it does, in one minute

`digger` walks a host, pulls hundreds of forensic artifacts into an **append-only, dual-hash-chained, post-quantum-signable SQLite evidence store**, runs a stack of 32 detectors over it, and produces a case report suitable for legal disclosure, SOC handoff, or internal incident-response review. Every step is observe-only: it never modifies the system under investigation without explicit, auditable consent.

```
digger collect --case-dir ./case-2026-05-24    # gather artifacts
digger scan --case-dir ./case-2026-05-24    # run detectors
digger triage --case-dir ./case-2026-05-24 --llm-base-url http://127.0.0.1:8080/v1    # local LLM grading
digger report --case-dir ./case-2026-05-24 --format html --out report.html    # shareable report
digger pqc sign --case-dir ./case-2026-05-24 --key ./op.sk    # ML-DSA-65 signed chain
```

Or in one step: `digger investigate --case-dir ./case --report report.html`.

## What makes it different

| Capability | Why it matters |
|---|---|
| **Code-enforced ethics contract** | 10 principles (`digger.ethics.contract`) raise `EthicsViolation` rather than warn. 19 load-bearing tests fail if any guardrail is removed. See [ETHICS.md](ETHICS.md). |
| **Dual hash chain on every record** | SHA-256 *and* SHA3-256 are chained in parallel through the artifact and finding tables. Forging a tampered record requires breaking both algorithm families (Merkle-Damgård + Keccak sponge) at once. |
| **Post-quantum signed evidence** | ML-DSA-65 (FIPS 204) signatures over the chain endpoints via liboqs. FIPS 140-3 mode with KATs for SHA-256, AES-256-GCM, ML-DSA-65, ML-KEM-768. |
| **12 Decepticon countermeasures** | One defensive detector for each offensive kill-chain stage (recon → exploitation → privesc → lateral movement → AD → cloud → counter-RE → persistent sessions → attacker tooling → anti-forensics → exfiltration → impact). |
| **15 live threat-intel feeds** | CISA KEV, abuse.ch (URLhaus/ThreatFox/MalwareBazaar), Spamhaus, OpenSSF, GitHub Advisory DB, NVD CPE-correlated CVEs, SigmaHQ rule library, MITRE ATT&CK STIX, Aikido Shai-Hulud IOCs. A "live-first" convention is statically enforced: no detector relies solely on hand-entered seed data. |
| **AI triage grounded in IC tradecraft** | OpenAI-compatible local LLM (llama.cpp / ollama / vllm). Schema-enforced output under ICD 203 estimative probability + NATO Admiralty source/information reliability + TLP. The LLM never sees raw file contents. |
| **18 compliance frameworks** | NIST 800-53, NIST 800-171, SOC 2, ISO 27001, ISO 27037, CIS, CMMC L1/L2, PCI-DSS, HIPAA, GDPR, FedRAMP, FFIEC, NIS2. Adding a framework is one YAML file. |
| **Comprehensive browser scanner** | Chromium + Firefox: cookies (counts only, no values), saved-password summary (counts only), IndexedDB, Local Storage, PWAs, profile defaults, Service Workers. Every origin is cross-referenced live against URLhaus + ThreatFox. Tracks unpatched Chromium vulnerability classes (e.g. [crbug-40062121](https://issues.chromium.org/issues/40062121) service-worker persistence). |
| **Firewall audit + remediation** | Unified audit of pf / nftables / iptables / ufw / firewalld / WFP. Generates copy-pasteable, platform-specific remediation commands, routed through `redact_dangerous_command`. Never applies changes automatically. |
| **Automatic Sigma export per detector** | `digger generate sigma --from-detectors` writes one SIEM-deployable Sigma rule per detector. Case-specific Sigma signatures can also be generated per finding. |
| **ATT&CK coverage heatmap** | `digger generate heatmap --format html --out cov.html` renders a Navigator-style matrix showing which MITRE techniques are covered by which detectors, derived statically from detector tags. Text / JSON / HTML output. |

## Install

```
# Base install
pip install -e .

# With YARA, Windows registry parsing, GeoIP, etc.
pip install -e ".[all]"

# Dev (tests, linting)
pip install -e ".[dev]"
```

The base install needs no build-time C extensions. Optional features fall back gracefully when their libraries are missing.

## Quick start

```
# 1. Collect into ./case-2026-05-24/
digger collect --case-dir ./case-2026-05-24

# 2. Run detectors against the collected evidence
digger scan --case-dir ./case-2026-05-24

# 3. Optional: triage findings through a local LLM (llama.cpp on :8080)
digger triage --case-dir ./case-2026-05-24 \
  --llm-base-url http://127.0.0.1:8080/v1 \
  --llm-model GLM-4.6

# 4. Render a shareable HTML report
digger report --case-dir ./case-2026-05-24 --format html --out report.html

# 5. PQC-sign the evidence chain; tampering breaks the signature
digger pqc sign --case-dir ./case-2026-05-24 --key ./op.sk

# 6. Anyone with the public key can verify later
digger pqc verify --case-dir ./case-2026-05-24
```

In one step:

```
digger investigate --case-dir ./case --report report.html
```

Run with elevated privileges (`sudo` / `runas administrator`) for full coverage. Most collectors degrade gracefully without root, but some artifacts (audit logs, EVTX, TCC database, unified logs, kernel modules, firewall rules) require elevation.

## Decepticon countermeasure suite

12 detectors that mirror and counter each stage of the autonomous red-team kill chain ([PurpleAILAB/Decepticon](https://github.com/PurpleAILAB/Decepticon)):

| Stage | digger detector | MITRE |
|---|---|---|
| Recon | `recon` | T1595.001 / T1110.001 / T1592.002 |
| Exploitation | `exploitation` | T1190 / T1059 / T1203 |
| Privilege escalation | `privesc` | T1548 / T1068 / T1547.006 |
| Lateral movement | `lateral` | T1021 / T1550 / T1570 |
| C2 frameworks (extended) | `c2` | T1071 / T1573 / T1055 |
| Active Directory | `ad_attacks` | T1558.003 / T1003.006 / T1484.001 |
| Cloud | `cloud_attacks` | T1552.005 / T1078.004 / T1611 |
| Counter-RE (debugger attached to us) | `counter_re` | T1622 / T1057 |
| Persistent sessions | `persistent_sessions` | T1546 / T1543.002 |
| Attacker tooling on host | `attacker_tooling` | T1588.002 |
| Anti-forensics / track clearing | `anti_forensics` | T1070 / T1070.001-006 |
| Exfiltration | `exfiltration` | T1041 / T1048 / T1567 / T1572 |
| Impact (ransomware / destruction) | `impact` | T1485 / T1486 / T1489 / T1490 / T1529 / T1561 |

All detectors observe only: digger never sends payloads to validate exploitability (ethics contract P3). Each detector ships with a generic, SIEM-deployable Sigma rule via `digger generate sigma --from-detectors`. The full guide is at [docs/decepticon-counter](https://pq-cybarg.github.io/digger/decepticon-counter.html).

## Local LLM setup (optional)

`digger` talks to any OpenAI-compatible `/v1/chat/completions` endpoint.

### llama.cpp

```
huggingface-cli download zai-org/GLM-4.6-GGUF GLM-4.6-Q4_K_M.gguf --local-dir ./models
llama-server -m ./models/GLM-4.6-Q4_K_M.gguf --host 127.0.0.1 --port 8080 -c 32768 --jinja
```

### ollama

```
ollama serve
ollama pull qwen2.5:14b-instruct
digger triage --case-dir … --llm-base-url http://127.0.0.1:11434/v1 --llm-model qwen2.5:14b-instruct
```

The LLM never receives raw file contents unless you opt in: it gets only metadata, detector findings, and a short context window. See [`digger/ai/triage.py`](digger/ai/triage.py).

## Architecture at a glance

```
digger/
├── core/         Evidence store, platform detection, hashing, runner
├── collectors/   common/, windows/, macos/, linux/ artifact collectors
├── detectors/    Behavioral + YARA + IOC + Sigma + C2 + supply-chain + 9 counter-offensive
├── memory/       VM-region anomaly detection (RWX, anonymous-exec, drop-loaded modules)
├── signing/      Code-signature verification (codesign / dpkg -V / rpm -V)
├── firewall/     Unified pf / nftables / iptables / ufw / firewalld / WFP audit
├── ethics/       The 10-principle contract; engagement scope; remediation gating
├── opsec/        Air-gap mode, PQC bundle encrypt, PII redaction, watchers, self-id
├── intel/        15 live threat-intel feeds + scheduler + composite multi-URL fetchers
├── ai/           OpenAI-compatible client, ICD-203-compliant triage prompts + schema
├── crypto/       liboqs-backed NIST PQC (sign, verify, hybrid KEM + AES-256-GCM)
├── fips/         FIPS 140-3 mode + KAT self-test + algorithm gating
├── compliance/   18 framework catalogs + control assessor + reports
├── tradecraft/   ICD 203 estimative probability, NATO Admiralty, TLP, ACH
├── exchange/     STIX 2.1, MISP, ATT&CK Navigator, TAXII 2.1, Sigma loader
├── coc/          ISO/IEC 27037 + NIST SP 800-86 chain-of-custody record
├── loki/         Bridge to LOKI/signature-base (Neo23x0/signature-base)
├── genrule/      Generate Sigma YAML from findings or per-detector class templates
├── hunts/        17-query threat-hunting library
├── diff/         Stable-identity case-to-case diffing
├── rules/        Bundled YARA, IOC lists, Sigma-style rules, framework catalogs
└── report/       JSON, Markdown, HTML report renderers
```

Module-level documentation lives at [**pq-cybarg.github.io/digger**](https://pq-cybarg.github.io/digger/).

## Ethics & safety

This tool is for analyzing **your own** machines, or machines you are explicitly authorized to inspect (your fleet, your client's fleet under contract, CTF challenge boxes, and so on).

The 10-principle ethics contract is enforced **programmatically**, not by docstrings:

| # | Principle | Enforced by |
|---|---|---|
| P1 | Localhost only | `assert_target_is_localhost` raises `EthicsViolation` |
| P2 | Observe by default | `confirm_remediation_intent` refuses non-interactive sessions |
| P3 | No exploitation | `assert_not_exploitation` blocks msfvenom / sqlmap / exploit phrasing |
| P4 | No credential attacks | `assert_not_credential_attack` blocks john / hashcat / brute-force |
| P5 | No third-party surveillance | `assert_no_third_party_surveillance` requires a consent flag |
| P6 | No egress without opt-in | `DIGGER_AIRGAP=1` blocks all outbound HTTP at the source |
| P7 | Calibrated findings | Triage schema enforces ICD 203 estimative probability |
| P8 | No biometric collection | No camera / microphone / fingerprint / face capture interfaces exist |
| P9 | Refuse tampered configuration | Preflight self-check aborts on detected tampering |
| P10 | Audit visibility | Findings about digger itself are emitted with self-attribution and never silently filtered |

Full text in [ETHICS.md](ETHICS.md). The 19 tests in `tests/test_ethics.py` fail if any guardrail is removed.

Do not send the evidence database anywhere it should not go: by design it contains process command lines, browser history, and other sensitive data. See `digger opsec redact` for a safely shareable copy and `digger opsec encrypt` for a hybrid PQC-KEM + AES-256-GCM bundle.

## Multi-identity tooling (`tools/identity/`)

The repo ships a small companion toolchain for hosts that juggle multiple GitHub accounts. It addresses the pitfall where a Sourcetree-generated SSH config silently routes every push through the first identity.

- `ghid`: CLI to switch / lock / verify / rotate per-repo identities
- `ghidbar`: macOS menu-bar app that shows `🔑 <identity> 🔒` while you are inside a repo
- `install-hooks.sh`: unified pre-push hook (identity lock + gh-pages auto-sync)

Install: `./tools/identity/install.sh --launchd`. Docs are in [`tools/identity/README.md`](tools/identity/README.md).

## Documentation

Full documentation lives at [**pq-cybarg.github.io/digger**](https://pq-cybarg.github.io/digger/), with 30+ pages covering:

- [Getting started](https://pq-cybarg.github.io/digger/getting-started.html)
- [CLI reference](https://pq-cybarg.github.io/digger/cli.html)
- [Architecture](https://pq-cybarg.github.io/digger/architecture.html)
- [Detectors](https://pq-cybarg.github.io/digger/detectors.html) (all 28)
- [Decepticon countermeasures](https://pq-cybarg.github.io/digger/decepticon-counter.html)
- [Browser scanner](https://pq-cybarg.github.io/digger/browser-scanner.html)
- [Unpatched Chromium vulnerability corpus](https://pq-cybarg.github.io/digger/chromium-unpatched.html)
- [Firewall audit + remediation](https://pq-cybarg.github.io/digger/firewall-audit.html)
- [Live threat-intel feeds](https://pq-cybarg.github.io/digger/intel.html)
- [Post-quantum cryptography](https://pq-cybarg.github.io/digger/pqc.html)
- [FIPS 140-3 mode](https://pq-cybarg.github.io/digger/fips.html)
- [Compliance frameworks](https://pq-cybarg.github.io/digger/compliance.html)
- [Ethics contract](https://pq-cybarg.github.io/digger/ethics.html)
- [Extending digger](https://pq-cybarg.github.io/digger/extending.html)

Run locally: `./docs.sh` → http://127.0.0.1:8765/.

## Security

If you find a vulnerability in `digger` itself, follow the disclosure process in [SECURITY.md](SECURITY.md).

## License

MIT. See [LICENSE](LICENSE).
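
## Appendix: dual-hash chain sketch

The README describes an append-only store in which SHA-256 and SHA3-256 are chained in parallel over every record. The block below is a minimal, standard-library sketch of that general idea, not digger's actual implementation: the names `GENESIS`, `link`, `build_chain`, and `verify_chain`, the JSON canonicalization, and the record fields are all assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical all-zero value standing in for "the link before the first record".
GENESIS = "0" * 64


def link(prev_sha256: str, prev_sha3: str, payload: dict) -> tuple[str, str]:
    """Compute the (SHA-256, SHA3-256) link for one record.

    Each digest covers the previous link of its own chain plus a canonical
    JSON encoding of the payload, so the two chains advance in parallel.
    """
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    h2 = hashlib.sha256(prev_sha256.encode() + body).hexdigest()
    h3 = hashlib.sha3_256(prev_sha3.encode() + body).hexdigest()
    return h2, h3


def build_chain(records: list[dict]) -> list[dict]:
    """Append records in order, storing both digests alongside each payload."""
    chain = []
    prev2, prev3 = GENESIS, GENESIS
    for payload in records:
        prev2, prev3 = link(prev2, prev3, payload)
        chain.append({"payload": payload, "sha256": prev2, "sha3_256": prev3})
    return chain


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited payload or digest breaks verification."""
    prev2, prev3 = GENESIS, GENESIS
    for row in chain:
        h2, h3 = link(prev2, prev3, row["payload"])
        if (h2, h3) != (row["sha256"], row["sha3_256"]):
            return False
        prev2, prev3 = h2, h3
    return True


# Illustrative records only; these are not digger's artifact schema.
records = [
    {"kind": "process", "pid": 4242, "cmdline": "/usr/bin/example"},
    {"kind": "file", "path": "/tmp/example.bin", "size": 1024},
]
chain = build_chain(records)
print(verify_chain(chain))  # True

chain[0]["payload"]["pid"] = 1  # simulate tampering with a stored record
print(verify_chain(chain))  # False
```

In a scheme like this, signing only the final pair of digests (the chain endpoints) is enough to commit to every earlier record, which is presumably why the README signs endpoints rather than individual rows.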

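Tags: AI triage, EDR, ML-DSA-65, PFX certificates, Python, SHA-256, SHA3-256, SQLite, forensics suite, post-quantum cryptography, hash chain, open source, attack kill chain, attack detection, digital forensics, no backdoors, local processing, detectors, endpoint forensics, cybersecurity, vulnerability assessment, automation scripts, evidence store, evidence signing, chain of evidence, reverse-engineering tools, privacy protection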