# AI Context Guard
This tool makes hidden Unicode characters in AI agent context files visible, so they show up in code review and fail in CI.
It is the companion to the article ["Attackers Are Hiding Instructions Your AI Coding Agent Will Obey"](https://tarkar.substack.com/) on The Hidden Layer.
## What it does
- Scans `CLAUDE.md`, `AGENTS.md`, `GEMINI.md`, `.cursorrules`, `.windsurfrules`, `.clinerules`, `.github/copilot-instructions.md`, and rule directories like `.cursor/rules/` by default. Or point it at any file or folder.
- Flags zero-width characters (for example U+200B, U+200C, U+200D, U+2060, U+FEFF), bidirectional controls (the "Trojan Source" class, U+202A to U+202E and U+2066 to U+2069), Unicode tag characters (U+E0000 to U+E007F, used to smuggle invisible ASCII), soft hyphens, and other control and format codepoints.
- Reports file, line, column, codepoint, and the official Unicode name.
- Exits non-zero when it finds something, so it works as a CI gate or a pre-commit hook.
- Detection-first: by default it changes nothing. An optional `--strip` mode removes flagged characters after you have looked at them, writing a `.bak` backup.
It detects hidden characters, not malicious intent. A visible-but-malicious instruction is still your job to catch by reading the file. This narrows the blind spot; it does not replace review.
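To make the detection idea concrete, here is a minimal sketch of one way such a scan could work. It is not the code in `ai_context_guard.py`. It assumes that "hidden" means Unicode general category `Cf` (format), which covers the zero-width, bidirectional, tag, and soft-hyphen characters listed above, and it ignores other control codepoints. The function name `find_hidden` and the 1-based column counting are assumptions made for illustration.

```python
import unicodedata


def find_hidden(text):
    """Return (line, col, codepoint, name) for each format-category character.

    Assumption: only general category Cf is treated as hidden here.
    Lines and columns are 1-based and counted in codepoints.
    """
    findings = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cf":
                name = unicodedata.name(ch, "<unnamed>")
                findings.append((line_no, col, f"U+{ord(ch):04X}", name))
    return findings


# A zero-width space on line 1 and a right-to-left override on line 2.
sample = "Run the tests\u200b before merging.\nDone\u202e."
for finding in find_hidden(sample):
    print(finding)
```

The standard library's `unicodedata` module supplies both the category and the official name, which is consistent with the tool needing no third-party packages.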
## Requirements
Python 3.8 or newer. No third-party packages. That is the whole dependency list.
## Quick start
```bash
git clone https://github.com/TDharm/ai-context-guard.git
cd ai-context-guard

# See it catch a poisoned file
python ai_context_guard.py examples/poisoned-CLAUDE.md

# See a clean file pass
python ai_context_guard.py examples/clean-CLAUDE.md
```
## Usage
```bash
# Scan the known agent context files under the current directory
python ai_context_guard.py

# Scan specific files or folders
python ai_context_guard.py CLAUDE.md .cursor/rules/

# Machine-readable output
python ai_context_guard.py --json

# Also flag variation selectors (stricter, a little noisier)
python ai_context_guard.py --strict

# Allow a codepoint you have decided is legitimate (for example an emoji joiner)
python ai_context_guard.py --allow 200D,FE0F

# Remove flagged characters in place, keeping a .bak backup
python ai_context_guard.py --strip CLAUDE.md
```
Exit codes: `0` clean, `1` hidden characters found, `2` usage or IO error.
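If you drive the scanner from another Python script, the exit codes can be mapped back to their meaning. The sketch below is hypothetical wrapper code, not part of this repo: `describe_exit` and `run_guard` are illustrative names, and the default script path assumes `ai_context_guard.py` sits in the working directory.

```python
import subprocess
import sys

# Meanings taken from the exit codes documented above.
EXIT_MEANINGS = {
    0: "clean",
    1: "hidden characters found",
    2: "usage or IO error",
}


def describe_exit(code):
    """Translate a scanner exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_guard(paths, script="ai_context_guard.py"):
    """Run the scanner on the given paths; return (exit_code, description).

    Assumption: the script lives at `script` relative to the working directory.
    """
    result = subprocess.run([sys.executable, script, *paths])
    return result.returncode, describe_exit(result.returncode)
```

Calling `run_guard(["CLAUDE.md"])` would then return a pair such as `(1, "hidden characters found")` when the file contains flagged characters.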
## What the output looks like
Running it on a poisoned context file:
```text
ai-context-guard: found 5 hidden character(s) in 1 file(s).
examples/poisoned-CLAUDE.md
line 3, col 20: U+200B ZERO WIDTH SPACE [Cf]
line 3, col 34: U+2060 WORD JOINER [Cf]
line 4, col 12: U+202E RIGHT-TO-LEFT OVERRIDE [Cf]
line 4, col 21: U+202C POP DIRECTIONAL FORMATTING [Cf]
line 5, col 13: U+E0041 TAG LATIN CAPITAL LETTER A [Cf]
These characters are invisible or non-printing in a normal editor and diff.
Review the file, then re-run with --strip to remove them, or --allow
HEX[,HEX...] if a flagged character is legitimate (for example an emoji joiner).
```
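The `U+E0041 TAG LATIN CAPITAL LETTER A` finding shows why tag characters matter: the tag codepoints U+E0020 to U+E007E mirror printable ASCII at a fixed offset of 0xE0000, so a run of them can spell out readable text that no editor displays. The sketch below shows how to recover that text when reviewing a finding. It is an illustration only, and `decode_tags` is a hypothetical helper, not a function of this tool.

```python
TAG_OFFSET = 0xE0000  # tag characters mirror ASCII at this offset


def decode_tags(text):
    """Return the ASCII text spelled out by tag characters in `text`."""
    return "".join(
        chr(ord(ch) - TAG_OFFSET)
        for ch in text
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


# Build a sample: visible text followed by an invisible payload.
payload = "".join(chr(TAG_OFFSET + ord(c)) for c in "hello")
sample = "Use tabs for indentation." + payload

print(repr(decode_tags(sample)))
```

Decoding the payload before stripping it tells you what the hidden text was trying to say, which is useful context for an incident review.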
## Run it in CI (GitHub Actions)
Copy [`.github/workflows/context-scan.yml`](.github/workflows/context-scan.yml) into the repository you want to protect. It scans your agent context files on every pull request and push and fails the job if a hidden character appears in one. That turns "review the diff" (which cannot show invisible characters) into a check the build enforces for you.
## Run it as a pre-commit hook
If you use [pre-commit](https://pre-commit.com/), add this to your `.pre-commit-config.yaml`:
```yaml
repos:
  - repo: https://github.com/TDharm/ai-context-guard
    rev: v0.1.0
    hooks:
      - id: ai-context-guard
```
The script must be committed with its executable bit set for the `script` hook to run (this repo already does this):
```bash
git update-index --chmod=+x ai_context_guard.py
```
There is also a ready-to-use [`.pre-commit-config.yaml`](.pre-commit-config.yaml) in this repo that runs the scanner as a local hook, for when you have copied `ai_context_guard.py` into your own project.
## Examples
The [`examples/`](examples/) folder has a `poisoned-CLAUDE.md` (carrying zero-width, bidirectional, and tag characters) and a `clean-CLAUDE.md`, so you can watch the tool catch one and pass the other. See [`examples/README.md`](examples/README.md).
## Tests
```bash
python -m unittest discover -s tests -v
```
## False positives
Some flagged characters are legitimate in the right context. A zero-width joiner (U+200D) is used in emoji sequences, and variation selectors shape emoji presentation. They almost never belong in a `CLAUDE.md`, which is why the default flags them, but if you have a real reason to keep one, allowlist it with `--allow`.
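As a rough illustration of how an allowlist like `--allow 200D,FE0F` could be applied, the sketch below parses the comma-separated hex values into a set of codepoints and consults it before flagging a character. The helper names `parse_allow` and `is_flagged` are hypothetical, and the check against category `Cf` is a simplifying assumption rather than the tool's actual rule set.

```python
import unicodedata


def parse_allow(value):
    """Parse 'HEX[,HEX...]' into a set of integer codepoints."""
    return {int(part, 16) for part in value.split(",") if part.strip()}


def is_flagged(ch, allowed=frozenset()):
    """Flag format-category characters unless they are allowlisted.

    Assumption: only general category Cf counts as hidden in this sketch.
    """
    return unicodedata.category(ch) == "Cf" and ord(ch) not in allowed


allowed = parse_allow("200D,FE0F")
print(is_flagged("\u200d"))           # zero-width joiner, no allowlist
print(is_flagged("\u200d", allowed))  # same character, allowlisted
print(is_flagged("\u200b", allowed))  # zero-width space is still flagged
```

Keeping the allowlist per codepoint, rather than per file, means an allowed emoji joiner does not also let a zero-width space through.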
## Limitations
- It finds hidden characters. It does not judge whether visible text is malicious. Read your context files too.
- It scans UTF-8 text. Files that are not valid UTF-8 are reported and skipped.
- `--strip` removes every flagged character it finds. Look at the report first, and rely on the `.bak` backup if you need to undo.
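The strip-with-backup behaviour described above can be sketched as follows. This is a hedged illustration, not the tool's implementation: `strip_hidden` is a hypothetical name, the backup is assumed to be written next to the original as `<name>.bak`, and only category `Cf` characters are removed. Reading and writing bytes keeps line endings untouched, and the decode step raises `UnicodeDecodeError` for files that are not valid UTF-8.

```python
import shutil
import tempfile
import unicodedata
from pathlib import Path


def strip_hidden(path):
    """Remove Cf characters from a UTF-8 file in place; return the count removed."""
    path = Path(path)
    text = path.read_bytes().decode("utf-8")  # fails loudly on invalid UTF-8
    cleaned = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    removed = len(text) - len(cleaned)
    if removed:
        # Write the backup before touching the original (assumed naming).
        shutil.copy2(path, path.with_name(path.name + ".bak"))
        path.write_bytes(cleaned.encode("utf-8"))
    return removed


# Demonstrate on a throwaway file so nothing real is modified.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "CLAUDE.md"
    target.write_bytes("Use tabs\u200b.\n".encode("utf-8"))
    removed = strip_hidden(target)
    after = target.read_bytes().decode("utf-8")
    backup = (Path(tmp) / "CLAUDE.md.bak").read_bytes().decode("utf-8")

print(removed, repr(after), repr(backup))
```

To undo a strip under these assumptions, copy the `.bak` file back over the original.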
## License
MIT. See [LICENSE](LICENSE).