ydarwish1/glyphhound

GitHub: ydarwish1/glyphhound

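A deterministic security scanner that statically detects potential remote code execution vulnerabilities in Jinja2 chat templates before a model file is loaded.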


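# GlyphHound

[![CI](https://github.com/ydarwish1/glyphhound/actions/workflows/ci.yml/badge.svg)](https://github.com/ydarwish1/glyphhound/actions/workflows/ci.yml) [![PyPI](https://img.shields.io/pypi/v/glyphhound)](https://pypi.org/project/glyphhound/) [![License: Apache-2.0](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](LICENSE) [![Python: 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](pyproject.toml)

A deterministic scanner that detects code-executing chat templates inside model files (GGUF / Ollama / Hugging Face) before the model is loaded.

## The problem

When you download an open-weight LLM, the file ships with a chat template: a small [Jinja2](https://jinja.palletsprojects.com/) program that formats your messages before the model reads them. Some runtimes render that template with a non-sandboxed Jinja engine, so a malicious template can execute code on your machine the moment the model is loaded, before you type anything. This is a real, patched bug class:

- [CVE-2024-34359](https://nvd.nist.gov/vuln/detail/CVE-2024-34359) (llama-cpp-python)
- [CVE-2026-5760](https://nvd.nist.gov/vuln/detail/CVE-2026-5760) (SGLang, CVSS 9.8 - RCE from rendering a malicious `tokenizer.chat_template` through a non-sandboxed `jinja2.Environment()`)

## What GlyphHound does

It reads the chat template straight out of the model file without downloading the multi-gigabyte weights, parses it into a syntax tree, traces whether it can reach a code-execution operation, optionally confirms the result in a locked-down sandbox, and reports through a CI exit code. It is a program-analysis tool, not a model: the same input always produces the same findings, it makes no API calls or LLM calls at scan time, and it runs fully offline.

Pipeline (five stages):

1. Acquire: extracts the template from the GGUF metadata header, a local Ollama blob, or a Hugging Face repo's `tokenizer_config.json` / `chat_template.jinja` / safetensors metadata, using HTTP range requests. A hard cap on bytes read, plus rejection of any server that ignores range requests, keeps the fetch far smaller than the file itself and never touches the weights.
2. Parse: parses the template into a Jinja2 AST (no rendering).
3. De-obfuscate: folds obfuscation back into the identifier it hides, including string concatenation, `str.format` / `%` / `|format` printf, slicing, `|join` / `|replace`, case-conversion filters, string repetition, `{% set %}` constant propagation, and `getattr` / `|attr` reflection.
4. Analyze: walks the AST for code-execution sinks (dunder chains into the Python object model, code-execution names, reflection) and flags a sink only when a dangerous expression actually reaches it (taint / reachability), rather than raising a false positive whenever a benign template merely names a variable.
5. Report: supports human-readable, JSON, and SARIF 2.1.0 output, with a configurable severity threshold that drives the exit code.

An optional, off-by-default sandbox stage can render the template in a controlled subprocess to confirm a finding.

## Examples

These examples scan template files from this repo (clone it first). You can also substitute your own template file, or scan a model by ID with `glyphhound scan owner/name`.

Scan a malicious template (an obfuscated `__import__` to `os.system` chain). It fails CI with a non-zero exit code:

```
$ glyphhound scan fixtures/malicious/cve_2024_34359_marker.jinja
GlyphHound scan report
======================
threshold: fail CI on reachable findings of severity >= high
exit code: 1
findings (5):
[GH-S002] CRITICAL reachable tokenizer.chat_template:17 [GATES CI]
  code-exec-name: .system
  reason: reference to a code-execution or dangerous-capability name (eval/exec/compile/os/subprocess/importlib/pickle/open/...)
[GH-S001] CRITICAL reachable tokenizer.chat_template:17 [GATES CI]
  dunder-attribute: .__import__
  reason: attribute/subscript/|attr access to a Python dunder used for sandbox escape
... (3 more reachable dunder findings: .__builtins__, .__globals__, .__init__)
summary: 5 finding(s), 5 reachable; critical=5 high=0; 5 gating -> exit 1
```

Scan a real benign template. It stays silent and passes:

```
$ glyphhound scan fixtures/benign/Qwen__Qwen2.5-0.5B-Instruct-GGUF.jinja
GlyphHound scan report
======================
threshold: fail CI on reachable findings of severity >= high
exit code: 0
findings: 0 (nothing flagged at or above the detection threshold)
summary: 0 finding(s), 0 reachable; critical=0 high=0; 0 gating -> exit 0
```

## How it compares

Promptfoo's [ModelAudit](https://www.promptfoo.dev/docs/model-audit/) already ships a pip-installable, SARIF-emitting chat-template scanner based on string/regex matching.

GlyphHound's contribution is narrow and specific: AST + taint analysis + de-obfuscation catches obfuscated payloads that string matching misses. That difference is measured, not asserted. See `benchmark/`: it is an engineering artifact, not a research claim. GlyphHound does not catch everything, and it is not the only tool in this space.

## False-positive rate

A scanner is only as useful as it is quiet on safe templates, so the rate is measured on real benign chat templates rather than asserted:

- 0 / 120 on a curated corpus of real, distinct Hugging Face chat templates (`corpus/`).
- 0 / 241 on a separate, wider audit of additional real templates (`study/wider_fp_audit.json`).

These are rates measured on the specific datasets above; they do not guarantee that no template will ever be falsely flagged. Both can be reproduced offline with `scripts/verify_phase7.py`.

## Independent validation

Beyond its own test cases, GlyphHound has been checked against real third-party attack payloads and real production chat templates:

- Detected 23 / 23 real Jinja2 remote code execution payloads (100% recall). The payloads are taken verbatim from two widely used public references: PayloadsAllTheThings and HackTricks.
- 0 / 16 false positives on chat templates pulled live from popular public Hugging Face models (100% precision on that dataset).

Most of these public payloads are not obfuscated, and string-matching scanners handle such cases well; GlyphHound's specific advantage on obfuscated payloads is measured separately in `benchmark/`. The full method, exact payloads, model list, and limitations are in `VALIDATION.md`.

You can reproduce this yourself. The script lives in this repo, so run it from a checkout (static analysis only; nothing is rendered or executed):

```
git clone https://github.com/ydarwish1/glyphhound
cd glyphhound
pip install -e .
python scripts/verify_real_payloads.py
```

## Install

```
pip install glyphhound
```

Or install from source (for development, the test suite, or the reproduction scripts):

```
git clone https://github.com/ydarwish1/glyphhound
cd glyphhound
python -m venv .venv
. .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
```

The only runtime dependency is jinja2; pytest and jsonschema are development-only.

## Usage

```
# Scan a local GGUF, a .gguf URL, a Hugging Face repo, or an Ollama model:
python -m glyphhound scan path/to/model.gguf
python -m glyphhound scan owner/name                          # canonical HF template (no weights)
python -m glyphhound scan owner/name --file auto              # smallest .gguf quant in the repo
python -m glyphhound scan owner/name --file model.Q4.gguf
python -m glyphhound scan ollama-model-name
python -m glyphhound scan template.jinja                      # a local template file
cat template.jinja | python -m glyphhound scan -              # stdin
```

Options:

```
--format human|json|sarif     output format (default: human)
--threshold critical|high     minimum severity that gates CI (default: high)
--confirm                     render in the locked-down sandbox to confirm a finding
--revision                    pin a Hugging Face commit for reproducibility
```

Set `HF_TOKEN` for gated/private repos and for higher Hub rate limits.

Exit codes (for CI): 0 = clean, 1 = a reachable finding gated the build, 2 = the scan could not run.

## Testing and verification

```
pip install -e ".[dev]"
python -m pytest                  # offline test suite
python scripts/verify_phase2.py   # per-stage verification scripts (verify_phase*.py)
```

The `verify_phase*.py` scripts re-verify each stage against real output: a flagged fixture, a schema-valid SARIF file, the measured false-positive rate, the head-to-head benchmark, and the sandbox isolation proof. A few scripts need network access (`verify_phase0/9/14`) or a separate ModelAudit environment (`verify_phase8`); the rest run offline.

## Safety

- MARKER payloads only. The test fixtures mimic the attack chain, but the "payload" is a harmless marker (sentinel); this repo contains no working exploit and no poisoned model.
- Weights are never loaded. The acquirer fetches only the metadata that contains the template.
- The sandbox isolates, or it stays off. The optional `--confirm` stage renders the template only inside a locked-down subprocess (a `sys.addaudithook` policy that blocks network access, process spawning, `ctypes`, and out-of-scratch / symlink / hardlink writes; on Linux it also adds a seccomp syscall filter, resource limits, and a privilege drop). It does not block reading or deleting host files; blocking network egress is what prevents read-then-exfiltrate. The isolation is tested. This is a best-effort sandbox, not a formally verified jail. See `ARCHITECTURE.md`.

## Documentation

- `ARCHITECTURE.md`: the five-stage pipeline and the exact inputs and outputs of each stage.
- `CHANGELOG.md`: the full build history, recorded phase by phase.
- `benchmark/`: the head-to-head methodology, payloads, and reproducible measurements (`benchmark/RELEASE.md`).
- `study/wider_fp_audit.json`: the wider false-positive audit behind the 0/241 figure.
- `action/`: a GitHub Action wrapper that runs the scan in CI and uploads SARIF to code scanning.
- `SECURITY.md`: how to report a vulnerability. `CONTRIBUTING.md`: how to build and test.

## License

Apache-2.0. See `LICENSE`. Third-party attributions (Jinja2, the bundled SARIF schema, and the benign-template corpus) are in `NOTICE`.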

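
The exit-code contract from the Usage section (0 = clean, 1 = a reachable finding at or above the threshold, 2 = the scan could not run) can be sketched as a small gating function. This is an illustration only: the `Finding` dataclass, `exit_code`, `run_gate`, and the severity ranking are hypothetical names for this example, not GlyphHound's internal API.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed ranking; the documented --threshold values are "critical" and "high".
SEVERITY_RANK = {"high": 1, "critical": 2}


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record used only in this sketch."""
    rule_id: str
    severity: str
    reachable: bool


def exit_code(findings: list[Finding], threshold: str = "high") -> int:
    """Return 1 if any reachable finding meets the threshold, else 0."""
    floor = SEVERITY_RANK[threshold]
    gating = [
        f for f in findings
        if f.reachable and SEVERITY_RANK.get(f.severity, 0) >= floor
    ]
    return 1 if gating else 0


def run_gate(scan: Callable[[], list[Finding]], threshold: str = "high") -> int:
    """Run a scan callable and map the outcome to a CI exit code."""
    try:
        findings = scan()
    except Exception:
        return 2  # the scan itself could not run
    return exit_code(findings, threshold)


# Usage: one reachable critical finding gates the build at the default threshold.
demo = [Finding("GH-S001", "critical", True)]
print(run_gate(lambda: demo))
```

Unreachable findings never gate in this sketch, which mirrors the report's distinction between findings that are merely present and findings marked `[GATES CI]`.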
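Tags: DLL hijacking, GGUF, Python, XSS injection, artificial intelligence, large language models, search queries (dork), no backdoor, user-mode hook bypass, automated payload embedding, reverse-engineering tools, error-based detection, static code analysis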