TDharm/ai-context-guard

# AI Context Guard

[![tests](https://static.pigsec.cn/wp-content/uploads/repos/2026/06/f075bf7045112054.svg)](https://github.com/TDharm/ai-context-guard/actions/workflows/test.yml) [![license: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

This tool makes hidden Unicode characters in AI agent context files visible, so they show up in code review and fail in CI.

It is the companion to the article ["Attackers Are Hiding Instructions Your AI Coding Agent Will Obey"](https://tarkar.substack.com/) on The Hidden Layer.

## What it does

- Scans `CLAUDE.md`, `AGENTS.md`, `GEMINI.md`, `.cursorrules`, `.windsurfrules`, `.clinerules`, `.github/copilot-instructions.md`, and rule directories like `.cursor/rules/` by default. Or point it at any file or folder.
- Flags zero-width characters (for example U+200B, U+200C, U+200D, U+2060, U+FEFF), bidirectional controls (the "Trojan Source" class, U+202A to U+202E and U+2066 to U+2069), Unicode tag characters (U+E0000 to U+E007F, used to smuggle invisible ASCII), soft hyphens, and other control and format codepoints.
- Reports file, line, column, codepoint, and the official Unicode name.
- Exits non-zero when it finds something, so it works as a CI gate or a pre-commit hook.
- Detection-first: by default it changes nothing. An optional `--strip` mode removes flagged characters after you have looked at them, writing a `.bak` backup.

It detects hidden characters, not malicious intent. A visible-but-malicious instruction is still your job to catch by reading the file. This narrows the blind spot; it does not replace review.

## Requirements

Python 3.8 or newer. No third-party packages. That is the whole dependency list.
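The detection described under "What it does" needs nothing beyond the standard library. The following is a minimal sketch of the idea, not the code in `ai_context_guard.py`: the names `find_hidden_chars` and `Finding` are hypothetical, and the rule used here (flag every `Cf` format character, plus `Cc` control characters other than tab, newline, and carriage return) is an assumption that may be broader or narrower than what the real scanner flags.

```python
import unicodedata
from typing import List, NamedTuple


class Finding(NamedTuple):
    line: int       # 1-based line number
    col: int        # 1-based column, counted in code points
    codepoint: str  # for example "U+200B"
    name: str       # official Unicode name, or "<unnamed>" if there is none
    category: str   # Unicode general category, for example "Cf"


# Ordinary whitespace controls that should not be reported (assumption).
ALLOWED_CONTROLS = {"\t", "\n", "\r"}


def find_hidden_chars(text: str) -> List[Finding]:
    """Return every format (Cf) or control (Cc) character found in text."""
    findings = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in ALLOWED_CONTROLS:
                continue
            category = unicodedata.category(ch)
            if category in ("Cf", "Cc"):
                findings.append(
                    Finding(
                        line_no,
                        col_no,
                        f"U+{ord(ch):04X}",
                        unicodedata.name(ch, "<unnamed>"),
                        category,
                    )
                )
    return findings


if __name__ == "__main__":
    # A zero-width space on line 1 and a bidirectional override pair on line 2.
    sample = "Run the tests\u200b first.\nUse \u202emain\u202c only.\n"
    for f in find_hidden_chars(sample):
        print(f"line {f.line}, col {f.col}: {f.codepoint} {f.name} [{f.category}]")
```

Keying on the general category rather than a fixed list of codepoints keeps the sketch short, at the cost of flagging characters such as the zero-width joiner that are sometimes legitimate.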
## Quick start

    git clone https://github.com/TDharm/ai-context-guard.git
    cd ai-context-guard

    # See it catch a poisoned file
    python ai_context_guard.py examples/poisoned-CLAUDE.md

    # See a clean file pass
    python ai_context_guard.py examples/clean-CLAUDE.md

## Usage

    # Scan the known agent context files under the current directory
    python ai_context_guard.py

    # Scan specific files or folders
    python ai_context_guard.py CLAUDE.md .cursor/rules/

    # Machine-readable output
    python ai_context_guard.py --json

    # Also flag variation selectors (stricter, a little noisier)
    python ai_context_guard.py --strict

    # Allow a codepoint you have decided is legitimate (for example an emoji joiner)
    python ai_context_guard.py --allow 200D,FE0F

    # Remove flagged characters in place, keeping a .bak backup
    python ai_context_guard.py --strip CLAUDE.md

Exit codes: `0` clean, `1` hidden characters found, `2` usage or IO error.

## What the output looks like

Running it on a poisoned context file:

    ai-context-guard: found 5 hidden character(s) in 1 file(s).

    examples/poisoned-CLAUDE.md
      line 3, col 20: U+200B ZERO WIDTH SPACE [Cf]
      line 3, col 34: U+2060 WORD JOINER [Cf]
      line 4, col 12: U+202E RIGHT-TO-LEFT OVERRIDE [Cf]
      line 4, col 21: U+202C POP DIRECTIONAL FORMATTING [Cf]
      line 5, col 13: U+E0041 TAG LATIN CAPITAL LETTER A [Cf]

    These characters are invisible or non-printing in a normal editor and diff.
    Review the file, then re-run with --strip to remove them, or --allow
    HEX[,HEX...] if a flagged character is legitimate (for example an emoji joiner).

## Run it in CI (GitHub Actions)

Copy [`.github/workflows/context-scan.yml`](.github/workflows/context-scan.yml) into the repository you want to protect. It scans your agent context files on every pull request and push and fails the job if a hidden character appears in one. That turns "review the diff" (which cannot show invisible characters) into a check the build enforces for you.
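On CI systems other than GitHub Actions, the same gate can be built on the exit codes listed under Usage. This is a sketch under stated assumptions: the helper `run_gate` and its status strings are hypothetical, and only the meanings of exit codes `0`, `1`, and `2` come from this README.

```python
import subprocess
import sys
from typing import List

# Exit-code meanings as documented in this README.
EXIT_MEANINGS = {
    0: "clean",
    1: "hidden characters found",
    2: "usage or IO error",
}


def run_gate(command: List[str]) -> str:
    """Run a scanner command and translate its exit code into a status string.

    `command` is the full argument list, so the caller decides how the
    scanner is invoked (hypothetical helper, not part of ai-context-guard).
    """
    result = subprocess.run(command, capture_output=True, text=True)
    return EXIT_MEANINGS.get(
        result.returncode, f"unexpected exit code {result.returncode}"
    )
```

A build script could call `run_gate([sys.executable, "ai_context_guard.py"])` and fail the build on any status other than `"clean"`, so that a usage or IO error is never mistaken for a passing scan.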
## Run it as a pre-commit hook

If you use [pre-commit](https://pre-commit.com/), add this to your `.pre-commit-config.yaml`:

    repos:
      - repo: https://github.com/TDharm/ai-context-guard
        rev: v0.1.0
        hooks:
          - id: ai-context-guard

The script must be committed with its executable bit set for the `script` hook to run (this repo already does this):

    git update-index --chmod=+x ai_context_guard.py

There is also a ready-to-use [`.pre-commit-config.yaml`](.pre-commit-config.yaml) in this repo that runs the scanner as a local hook, for when you have copied `ai_context_guard.py` into your own project.

## Examples

The [`examples/`](examples/) folder has a `poisoned-CLAUDE.md` (carrying zero-width, bidirectional, and tag characters) and a `clean-CLAUDE.md`, so you can watch the tool catch one and pass the other. See [`examples/README.md`](examples/README.md).

## Tests

    python -m unittest discover -s tests -v

## False positives

Some flagged characters are legitimate in the right context. A zero-width joiner (U+200D) is used in emoji sequences, and variation selectors shape emoji presentation. They almost never belong in a `CLAUDE.md`, which is why the default flags them, but if you have a real reason to keep one, allowlist it with `--allow`.

## Limitations

- It finds hidden characters. It does not judge whether visible text is malicious. Read your context files too.
- It scans UTF-8 text. Files that are not valid UTF-8 are reported and skipped.
- `--strip` removes every flagged character it finds. Look at the report first, and rely on the `.bak` backup if you need to undo.

## License

MIT. See [LICENSE](LICENSE).