unattributed/ai-browser-security-test-suite

GitHub: unattributed/ai-browser-security-test-suite


# AI Browser Security Test Suite

AI Browser Security Test Suite is a Python-based validation framework for browser-based AI ecosystems.

The public test suite is now centered on one supported local target:

https://github.com/unattributed/ollama-webui

`ollama-webui` is used as the suite's deliberately weak, locally runnable browser-based LLM app for testing, prototyping, and demonstrating browser-AI security weaknesses safely.

The goal is to give blue teams, security engineers, product security teams, penetration testers, and organizations a reproducible target that does not require testing against third-party systems.

## Research basis

This repository is the executable validation layer for the Browser-Safe AI Systems research series:

https://unattributed.blog/ai-security/browser-security/security-operations/red-team/2026/05/09/browser-safe-ai-systems-00-series-index.html

The series argues that browser-based AI systems must treat webpage content, rendered text, hidden DOM, metadata, screenshots, QR handoffs, delayed content, user feedback, and exception requests as adversarial inputs.

This repository turns those claims into repeatable tests against a controlled local target. The public workflow intentionally uses `unattributed/ollama-webui` as a deliberately weak local browser-based LLM app so the test suite can demonstrate risk patterns without encouraging testing against third-party systems.

The intended review model is:

```text
research claim -> safe synthetic probe -> browser evidence -> model response -> structured report -> analyst review
```

## Core thesis

Browser-based AI systems should be treated as controlled security pipelines, not as magic models.
This project follows four principles:

```text
browser content = untrusted input
AI verdict = advisory signal
policy decision = deterministic control
evidence = mandatory output
```

Hostile browser content can include visible text, hidden DOM, metadata, screenshots, QR codes, Unicode lookalikes, delayed content, file names, user feedback, and exception requests. The model must not become the policy authority.

## Supported target model

The default and recommended test target is:

```text
unattributed/ollama-webui
```

Expected local target:

```text
http://127.0.0.1:11435/
```

Expected local Ollama backend:

```text
http://127.0.0.1:11434
```

Why this target is used:

- public repository
- local-only execution
- browser-based AI workflow
- Ollama-backed LLM behavior
- stable selectors for Playwright testing
- safe environment for repeatable proof-of-concept validation
- no third-party target required

All public scripts are written to focus on this local target or on local generated lab pages. Any broader black-box testing must be driven by an explicit client-provided scope file and written authorization.

## Safety boundary

This repository is for authorized testing, local validation, defensive research, and professional due diligence.

Do not use this suite for:

- unauthorized scanning
- credential theft
- cookie theft
- token extraction
- browser C2
- MFA bypass tooling
- destructive tests
- exploit automation
- third-party testing without written authorization

The included probes use synthetic markers and local test cases. They are designed to demonstrate weakness patterns without collecting real credentials, real tokens, real cookies, or real personal data.
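The local-only boundary described above is the kind of rule that can be enforced deterministically before any request is sent. The following is a minimal sketch, not the suite's actual implementation: `is_loopback_url` is a hypothetical helper name, and the choice to accept only literal loopback IP addresses (rejecting every DNS name) is an assumption made to keep the check strict.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host is a literal loopback IP.

    Hypothetical helper for illustration; not part of the suite's documented API.
    """
    parts = urlsplit(url)
    if parts.scheme not in {"http", "https"}:
        return False
    host = parts.hostname
    if host is None:
        return False
    try:
        # Accepts 127.0.0.0/8 and ::1; raises ValueError for DNS names.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Assumption: any DNS name is out of scope for the local workflow.
        return False
```

A caller could reject a target or a redirect location whenever this returns `False`, which keeps the decision outside the model and inside a deterministic control.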
## What this project provides

The current release provides:

- safe browser-AI test case definitions
- local HTML lab generation
- playable local browser-safe AI examples mapped to the series
- Ollama Web UI local target validation
- Playwright browser evidence capture
- structured JSONL evidence
- deterministic artifact manifest with SHA256 hashing
- explicit evidence and artifact manifest schema contracts
- Markdown reporting
- article-series mapping
- coverage auditing against the research series
- guided lab manifest validation for professional lab exercises
- free and open source tooling requirement for guided labs with purpose-built Python fallback
- artifact-backed browser tests for visual deception, DOM/render mismatch, QR handoff, delayed DOM mutation, iframe frame-tree evidence, and storage-state boundary evidence
- deterministic uploaded-file analysis tests for the local Ollama Web UI target
- deterministic Project Agent tests for local project guardrails, file reads, search, model type controls, copy controls, and allowlisted tool execution
- playground files for safe local ollama-webui upload practice
- authorized scope-file structure for exceptional client engagements

The suite validates:

- local generated test pages
- the local ollama-webui target
- browser-based AI behavior against synthetic unsafe markers
- artifact-backed browser evidence for selected attack classes
- uploaded-file prompt construction, untrusted-content boundaries, redaction risk, and size-limit handling
- local project context handling, tool-output boundaries, model type filtering, and chat copy-control placement
- evidence quality for analysts and product-security review
- browser storage boundary evidence for cookies, localStorage, sessionStorage, cache-like state, and model-bound context separation
- artifact path, size, SHA256, timestamp, source tool, and source test traceability
- evidence record and artifact manifest contract validation

## Guided redirect-chain lab

The toolkit includes an implemented Guided Lab Mode slice for
local redirect-chain evidence.

Guided lab id:

```text
guided.redirect_chain_evidence
```

Target scenario:

```text
browser.redirect_chain
```

Purpose-built free and open-source helper:

```bash
python tools/run_redirect_chain_lab.py \
  --base-url http://127.0.0.1:11435 \
  --variant all \
  --out-dir /tmp/browser-safe-redirect-chain-lab
```

The helper captures the local redirect path, HTTP status sequence, final URL, final page HTML, model-bound context artifact, model-response placeholder, `evidence.jsonl`, `artifact-manifest.json`, and lab report. It refuses non-loopback redirect locations and is intended only for local synthetic testing against the `ollama-webui` target.

## Guided iframe/frame-tree lab

The toolkit includes an implemented local-only Guided Lab Mode slice for browser-observed iframe/frame-tree evidence.

Guided lab id:

```text
guided.iframe_frame_tree_evidence
```

Target scenario:

```text
browser.iframe_frame_tree
```

Purpose-built free and open-source helper:

```bash
python tools/run_iframe_frame_tree_lab.py \
  --base-url http://127.0.0.1:11435 \
  --variant all \
  --out-dir /tmp/browser-safe-iframe-frame-tree-lab
```

The helper captures `frame-tree.json`, `frame-url-list.txt`, `top-page-dom-snapshot.html`, child frame DOM snapshots, sandbox findings, srcdoc findings, cross-frame rendered text, model-bound context, model-response placeholder, `evidence.jsonl`, `artifact-manifest.json`, and an analyst-readable report.

Browser rendering and frame-tree observation are required. Static HTML parsing alone is not sufficient.

## Guided storage state boundary lab

The toolkit includes an implemented local-only Guided Lab Mode slice for browser storage-state boundary evidence.
Guided lab id:

```text
guided.storage_state_boundary_evidence
```

Target scenario:

```text
browser.storage_state_boundary
```

Purpose-built free and open-source helper:

```bash
python tools/run_storage_state_boundary_lab.py \
  --base-url http://127.0.0.1:11435 \
  --variant all \
  --out-dir /tmp/browser-safe-storage-state-boundary-lab
```

The helper captures `browser-state-before.json`, `browser-state-after.json`, cookie findings, localStorage findings, sessionStorage findings, cache-like state findings, `model-bound-context.txt`, `model-response.json`, `state-boundary-findings.json`, `evidence.jsonl`, `artifact-manifest.json`, and an analyst-readable report.

Browser rendering and browser storage observation are required. Static HTML parsing alone is not sufficient. Protected browser state is synthetic and is preserved as bounded evidence while remaining outside model-bound context.

### Storage-state-boundary evidence chain

The storage-state-boundary evidence chain is indexed for reviewer navigation in:

```text
docs/validation/README.md
```

The index links the v8.9.9 evidence closure document, the v8.10.0 reviewer workflow document, and the v8.10.1 reviewer acceptance gate document. It records the evidence archive name, SHA256, guided lab id, target scenario id, five validated variants, evidence record count of 5, and manifest artifact count of 70.

This is a documentation index for the existing local-only, synthetic-only, authorized-only evidence set. It does not rerun the lab, modify runtime code, modify the target app, or make a production security claim.
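The boundary property described for the storage-state lab (protected synthetic state stays out of model-bound context) can be spot-checked by a reviewer. The sketch below is illustrative only: the function names are hypothetical, and it assumes the captured state is JSON whose string values are the synthetic markers of interest. The real layout of `browser-state-after.json` is not documented here.

```python
def collect_strings(node) -> set[str]:
    """Recursively collect every non-empty string value in a JSON-like structure.

    Dictionary keys are ignored; only values are collected.
    """
    found: set[str] = set()
    if isinstance(node, str):
        if node:
            found.add(node)
    elif isinstance(node, dict):
        for value in node.values():
            found |= collect_strings(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_strings(item)
    return found


def leaked_values(state, model_context: str, min_length: int = 8) -> list[str]:
    """Return state values that appear verbatim in the model-bound context.

    min_length is an assumed threshold that skips short strings such as
    cookie names, which would otherwise match ordinary page text.
    """
    candidates = {v for v in collect_strings(state) if len(v) >= min_length}
    return sorted(v for v in candidates if v in model_context)
```

In practice the inputs would be the parsed contents of `browser-state-after.json` and the text of `model-bound-context.txt` from the lab output directory; an empty result is consistent with the boundary holding for verbatim values, though it does not rule out encoded or partial leakage.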
## Attack classes covered

Supporting series areas:

## Requirements

Tested development environment:

- OS family: Debian-derived Linux
- Distribution: Parrot OS
- Python: 3.13 tested locally
- Browser automation: Playwright Chromium

Base packages for Parrot OS and Debian-family systems:

```bash
sudo apt update
sudo apt install -y git python3 python3-venv python3-pip gh findutils coreutils grep gawk curl
```

Python dependencies are defined in:

```text
pyproject.toml
requirements.txt
```

## Environment workflow

Create or reuse the repository virtual environment:

```bash
cd ai-browser-security-test-suite
test -d .venv || python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e .
```

Install development dependencies when you want to run tests:

```bash
python -m pip install -e ".[dev]"
```

## Start the supported local target

Start `ollama-webui` in a separate terminal:

```bash
cd ../ollama-webui
source .venv/bin/activate
python scripts/pull_model.py
```

Verify the target:

```bash
curl -fsS http://127.0.0.1:11435/health
curl -fsS http://127.0.0.1:11434/api/version
```

## Ollama Web UI service preflight

The supported target suite requires `ollama-webui` to be running before validation starts.

Start the target in a separate terminal:

```bash
cd ../ollama-webui
source .venv/bin/activate
python scripts/pull_model.py
```

Then run:

```bash
cd ai-browser-security-test-suite
scripts/run_supported_local_target_suite.sh
```

If the service is not running, the suite exits early and prints these startup instructions.
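For readers who prefer to script the same check in Python rather than `curl`, the preflight idea can be sketched as follows. This is an illustration under stated assumptions, not the suite's own preflight code: the two URLs are the ones shown above, while `endpoint_is_up` and `preflight` are hypothetical names.

```python
import urllib.error
import urllib.request

# URLs taken from the "Verify the target" commands above.
TARGET_HEALTH_URL = "http://127.0.0.1:11435/health"
OLLAMA_VERSION_URL = "http://127.0.0.1:11434/api/version"


def endpoint_is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # URLError covers refused connections and HTTP error statuses;
        # ValueError covers malformed URLs.
        return False


def preflight(urls: list[str]) -> list[str]:
    """Return the subset of `urls` that did not respond successfully."""
    return [url for url in urls if not endpoint_is_up(url)]
```

Calling `preflight([TARGET_HEALTH_URL, OLLAMA_VERSION_URL])` returns an empty list when both services answer; any returned URL indicates a service that should be started before validation.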
## Run the supported target suite

From this repository:

```bash
cd ai-browser-security-test-suite
scripts/run_supported_local_target_suite.sh
```

Run the full repository verification plus supported local target validation:

```bash
scripts/test_series_coverage_against_ollama_webui.sh
```

For local repository checks without the running target:

```bash
RUN_OLLAMA_TARGET=0 scripts/test_series_coverage_against_ollama_webui.sh
```

Optional model override:

```bash
OLLAMA_MODEL=deepseek-r1 scripts/run_supported_local_target_suite.sh
```

## CLI overview

```bash
python -m ai_browser_security_suite --help
```

Available commands:

```text
init-scope
case-list
lab-build
lab-serve
recon
capture
report
ollama-validate
ollama-upload-validate
ollama-project-agent-validate
```

## Development checks

```bash
python -m compileall -q src tools
pytest
python tools/audit_series_coverage.py \
  --payload payloads/ollama_webui_safe_prompts.yaml \
  --out-dir /tmp/ai-browser-coverage
```

## CI gates

The toolkit includes a GitHub Actions workflow that runs the repository's core integrity checks on pull requests and pushes to `main`:

```text
.github/workflows/security-ci.yml
```

The CI gate runs compile checks, pytest, schema validation, target-contract snapshot validation, guided lab manifest validation, the default coverage audit, and the target-contract coverage audit.

This does not claim full browser-AI penetration-testing coverage. It prevents regressions and overclaiming while new browser evidence parsers and tests are added.

See:

```text
docs/ci-gates.md
```

## Guided Lab Mode

Guided Lab Mode defines how the toolkit turns the Browser-Safe AI Systems series into structured, local, repeatable lab exercises.

The guided lab model is documented in:

```text
docs/guided-lab-mode.md
docs/guided-lab-template.md
docs/guided-lab-execution-plan.md
```

The current lab manifest is:

Validate it locally:

```bash
python tools/validate_guided_labs.py
```

Guided labs are designed for users on Parrot OS, Kali Linux, or similar penetration-testing Linux distributions.
A lab tells the user which open-source tool to open, how to conduct the test, what to observe, how to vary the input safely, what evidence should be produced, and which Browser-Safe AI Systems series parts the lab demonstrates.

Current guided lab implementation status:

| Guided lab id | Target scenario id | Helper status | Evidence maturity | Workshop readiness |
|---|---|---|---|---|
| `guided.redirect_chain_evidence` | `browser.redirect_chain` | implemented | tested helper, workshop integrated, pending full guided evidence closure | workshop lab integrated component |
| `guided.dom_render_mismatch` | `browser.dom_render_mismatch` | implemented | tested helper, workshop integrated, pending full guided evidence closure | workshop lab integrated component |
| `guided.iframe_frame_tree_evidence` | `browser.iframe_frame_tree` | implemented | tested helper, workshop integrated, pending full guided evidence closure | workshop lab integrated component |
| `guided.storage_state_boundary_evidence` | `browser.storage_state_boundary` | implemented | full guided evidence closure and reviewer gate | reviewer-gated lab component |

The canonical matrix for current maturity, tooling, model mode, provisioning notes, and next gaps is:

```text
docs/lab-track-coverage-matrix.md
```

## Target contract ingestion

The toolkit can ingest the Browser-Safe AI target scenario contract published by the local `ollama-webui` vulnerable app.
The current local snapshot is:

```text
docs/target-contracts/ollama-webui-target-scenario-contract-v0.2.json
```

Run the coverage audit with the target contract gate enabled:

```bash
python tools/audit_series_coverage.py \
  --payload payloads/ollama_webui_safe_prompts.yaml \
  --target-payload payloads/ollama_webui_file_upload_cases.yaml \
  --target-payload payloads/ollama_webui_project_agent_cases.yaml \
  --target-payload payloads/ollama_webui_redirect_chain_cases.yaml \
  --target-payload payloads/ollama_webui_dom_render_cases.yaml \
  --target-payload payloads/ollama_webui_iframe_frame_tree_cases.yaml \
  --target-payload payloads/ollama_webui_storage_state_boundary_cases.yaml \
  --target-contract docs/target-contracts/ollama-webui-target-scenario-contract-v0.2.json \
  --out-dir /tmp/ai-browser-target-contract-coverage
```

This command must stay aligned with the `Run target-contract coverage audit` step in `.github/workflows/security-ci.yml` so local reviewer validation and GitHub Actions evaluate the same target payload families.

When the target contract gate is enabled, the audit fails if an active target scenario is not represented by toolkit payload mappings or if a payload references an unknown scenario id. This keeps toolkit coverage claims aligned with the intentionally vulnerable local target.

## Evidence artifact manifest

Every evidence directory written through the shared evidence writer now includes:

```text
artifact-manifest.json
```

The manifest is a deterministic review index for generated evidence artifacts. Each entry records:

```text
path
artifact_type
size_bytes
sha256
created_utc
source_tool
source_test_id
```

The Markdown report references the manifest, its SHA256 hash, and the number of manifested artifacts. Missing files or declared SHA256 mismatches fail before new evidence is written, so incomplete evidence does not silently become a passing report.
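A reviewer can re-check a manifest independently of the suite. The sketch below is a hedged illustration, not the shared evidence writer's code: it uses the `path`, `size_bytes`, and `sha256` entry fields listed above, but the top-level `"artifacts"` key and the function names are assumptions, since the manifest's overall layout is defined by `docs/schemas/artifact-manifest.schema.json` rather than by this README.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large artifacts such as HAR files stay cheap."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(evidence_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means every entry matched."""
    manifest_path = evidence_dir / "artifact-manifest.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    problems = []
    for entry in manifest["artifacts"]:  # assumed top-level key
        artifact = evidence_dir / entry["path"]
        if not artifact.is_file():
            problems.append(f"missing: {entry['path']}")
            continue
        if artifact.stat().st_size != entry["size_bytes"]:
            problems.append(f"size mismatch: {entry['path']}")
        if sha256_of(artifact) != entry["sha256"]:
            problems.append(f"sha256 mismatch: {entry['path']}")
    return problems
```

Returning a list of problems, rather than raising on the first one, lets a review script report every missing or altered artifact in a single pass.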
Current scope of this slice:

- proven: shared evidence manifest and SHA256 verification
- planned: OCR parser, QR decoder, iframe tree parser, ARIA tree parser, DOM/render diff engine, and visual diff engine

## Evidence schema contracts

The shared evidence layer now includes explicit runtime and documentation contracts for evidence records and artifact manifests.

Contract implementation:

```text
src/ai_browser_security_suite/evidence_schema.py
```

Published schema files:

```text
docs/schemas/evidence-record.schema.json
docs/schemas/artifact-manifest.schema.json
```

The runtime validator checks required fields, rejects unexpected evidence record fields, validates ISO-8601 timestamps, validates artifact hash field format, and confirms manifest artifact counts match the artifact list.

This makes the evidence pipeline safer for future OCR, QR, iframe, ARIA, DOM/render, and visual-diff slices without claiming those parsers exist yet.

## Ollama Web UI validation through the CLI

```bash
cd ai-browser-security-test-suite
source .venv/bin/activate
python -m ai_browser_security_suite ollama-validate --base-url http://127.0.0.1:11435/ --cases payloads/ollama_webui_safe_prompts.yaml --out reports/ollama-webui-validation --i-have-authorization
```

Optional model override:

Generated local evidence:

```text
reports/ollama-webui-validation/evidence.jsonl
reports/ollama-webui-validation/artifact-manifest.json
reports/ollama-webui-validation/ollama-webui-validation-results.json
reports/ollama-webui-validation/ollama-webui-validation-report.md
reports/ollama-webui-validation/target-metadata.json
reports/ollama-webui-validation/cases//console.log
reports/ollama-webui-validation/cases//dom.html
reports/ollama-webui-validation/cases//network-events.json
reports/ollama-webui-validation/cases//network.har
reports/ollama-webui-validation/cases//screenshot.png
reports/ollama-webui-validation/cases//case-result.json
```

Generated evidence is ignored by Git.
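Once a run has produced `evidence.jsonl`, the kinds of checks listed under the evidence schema contracts (required fields, unexpected fields, ISO-8601 timestamps, hash format) can be approximated in a few lines. This is a sketch only: the field names below are hypothetical placeholders, and the authoritative contract is the published `docs/schemas/evidence-record.schema.json`, which this example does not reproduce.

```python
import json
import re
from datetime import datetime

# Hypothetical field names for illustration; see the published schema for the real ones.
REQUIRED_FIELDS = {"test_id", "created_utc", "sha256"}
ALLOWED_FIELDS = REQUIRED_FIELDS | {"notes"}
SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def validate_record(record: dict) -> list[str]:
    """Return human-readable contract violations for one evidence record."""
    errors = []
    errors += [f"missing field: {n}" for n in sorted(REQUIRED_FIELDS - set(record))]
    errors += [f"unexpected field: {n}" for n in sorted(set(record) - ALLOWED_FIELDS)]
    if "created_utc" in record:
        try:
            # fromisoformat is a pragmatic ISO-8601 subset check, not a full validator.
            datetime.fromisoformat(str(record["created_utc"]).replace("Z", "+00:00"))
        except ValueError:
            errors.append("created_utc is not ISO-8601")
    if "sha256" in record and not SHA256_HEX.fullmatch(str(record["sha256"])):
        errors.append("sha256 is not 64 lowercase hex characters")
    return errors


def validate_jsonl(text: str) -> dict[int, list[str]]:
    """Map 1-based line numbers to errors for each failing, non-empty JSONL line."""
    results = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if line.strip():
            errors = validate_record(json.loads(line))
            if errors:
                results[number] = errors
    return results
```

Rejecting unexpected fields, rather than ignoring them, mirrors the strictness described above: a record that drifts from the contract fails loudly instead of passing review unnoticed.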
## Ollama Web UI upload analysis validation

The upload validation path exercises the target's uploaded file analysis feature directly. It uploads controlled local files into the real UI, intercepts `/api/generate`, and saves the exact model-bound prompt for review.

Convenience wrapper:

```bash
scripts/test_upload_analysis_against_ollama_webui.sh
```

Generated upload evidence:

```text
reports/ollama-webui-upload-validation/evidence.jsonl
reports/ollama-webui-upload-validation/artifact-manifest.json
reports/ollama-webui-upload-validation/ollama-webui-upload-validation-results.json
reports/ollama-webui-upload-validation/ollama-webui-upload-validation-report.md
reports/ollama-webui-upload-validation/target-metadata.json
reports/ollama-webui-upload-validation/cases//captured-model-prompt.txt
reports/ollama-webui-upload-validation/cases//generate-requests.json
reports/ollama-webui-upload-validation/cases//upload-files/
```

## Ollama Web UI Project Agent validation

The Project Agent validation path exercises the updated target's local project context surface directly. It creates a synthetic project under the local report directory, calls the Project Agent APIs, verifies the model type selector, confirms `Cloud` is not exposed as a pullable model type, checks chat copy controls, intercepts `/api/generate`, and saves the exact model-bound prompt.
Generated Project Agent evidence:

```text
reports/ollama-webui-project-agent-validation/evidence.jsonl
reports/ollama-webui-project-agent-validation/artifact-manifest.json
reports/ollama-webui-project-agent-validation/ollama-webui-project-agent-validation-results.json
reports/ollama-webui-project-agent-validation/ollama-webui-project-agent-validation-report.md
reports/ollama-webui-project-agent-validation/target-metadata.json
reports/ollama-webui-project-agent-validation/cases//captured-model-prompt.txt
reports/ollama-webui-project-agent-validation/cases//project-agent-api-responses.json
reports/ollama-webui-project-agent-validation/cases//model-controls.json
reports/ollama-webui-project-agent-validation/cases//synthetic-project/
```

## Local browser-AI lab

The local generated lab provides playable browser-safe AI examples for evidence demonstrations, training, and authorized penetration-test practice.

List cases:

```bash
cd ai-browser-security-test-suite
source .venv/bin/activate
python -m ai_browser_security_suite case-list --cases payloads/safe_browser_ai_cases.yaml
```

Build local lab pages:

Serve the lab:

```bash
python -m ai_browser_security_suite lab-serve --directory local_lab --host 127.0.0.1 --port 8088
```

Capture evidence:

```bash
python -m ai_browser_security_suite capture --url http://127.0.0.1:8088/bai-002-hidden-dom.html --out reports/example-capture
```

The lab includes controlled examples for visible prompt injection, hidden DOM, CSS-hidden text, SVG metadata, accessibility mismatch, DOM/render mismatch, visual overlays, QR handoff, delayed mutation, Unicode spoofing, synthetic DLP, seeded login, inert file-sharing lure, metadata contradiction, fail-open pressure, exception abuse, oversized DOM stress, calendar promptware, fake IdP login, document-share lures, QR MFA reset, fake browser updates, OAuth consent lures, helpdesk support-bundle collection, and invoice payment-change deception.
Guide:

```text
docs/playable-browser-safe-ai-examples.md
docs/real-world-browser-ai-attack-scenarios.md
```

Uploadable playground files for local `ollama-webui` practice:

```text
examples/ollama-webui-playground/
```

## Authorized client scope files

The default public workflow is local `ollama-webui` testing.

Client-provided FQDNs, IP addresses, ports, paths, and test credentials are supported only for explicit authorized engagements. Those workflows must use a scope file and explicit authorization.

Create a local-first scope template:

```bash
python -m ai_browser_security_suite init-scope --out examples/client-scope.local.yaml
```

Run passive reconnaissance only:

```bash
python -m ai_browser_security_suite recon --scope examples/scope.example.yaml --out reports/example-recon --passive-only
```

Active checks require the explicit authorization flag:

```bash
python -m ai_browser_security_suite recon --scope examples/client-scope.local.yaml --out reports/client-recon --i-have-authorization
```

## Documentation

Current documentation:

```text
docs/artifact-backed-browser-cases.md
docs/authorized-black-box-testing.md
docs/coverage-audit.md
docs/coverage/browser-safe-ai-series-coverage.md
docs/ci-github-actions.md
docs/ollama-webui-local-target.md
docs/ollama-webui-service-preflight.md
docs/ollama-webui-upload-analysis-testing-review.md
docs/ollama-webui-project-agent-testing-review.md
docs/playable-browser-safe-ai-examples.md
docs/quickstart.md
docs/real-world-browser-ai-attack-scenarios.md
docs/supported-target-policy.md
docs/tooling-map-to-series.md
```

## Repository structure

```text
ai-browser-security-test-suite/
├── docs/
├── examples/
│   └── ollama-webui-playground/
├── payloads/
├── reports/
├── scripts/
├── src/
│   └── ai_browser_security_suite/
│       ├── recon/
│       └── targets/
├── pyproject.toml
├── requirements.txt
└── README.md
```

Generated directories and evidence are intentionally ignored:

```text
local_lab/
reports/*
```

The placeholder file remains tracked:

```text
reports/.gitkeep
```

## Technical depth demonstrated

This repository demonstrates:

- local vulnerable
target design
- safe synthetic marker strategy
- Playwright browser automation
- DOM and rendered-page evidence capture
- HAR and console-log collection
- Ollama-backed browser-AI validation
- coverage auditing against the research series
- artifact-backed tests for visual deception, DOM/render mismatch, QR handoff, and delayed DOM mutation
- uploaded-file analysis tests that capture exact model-bound prompts
- Project Agent tests that capture local project context and allowlisted tool output boundaries
- JSONL evidence suitable for later SIEM or SOC enrichment
- artifact manifests with SHA256 hashes for reproducibility review
- Markdown reports suitable for human review
- explicit safety boundaries to reduce misuse

## Professional use cases

This suite can help teams:

- demonstrate indirect prompt injection risks safely
- compare DOM evidence with rendered browser evidence
- validate whether local browser-AI workflows repeat unsafe synthetic markers
- test a reproducible browser-based LLM app
- document evidence for SOC and product-security review
- prototype mitigations against a controlled local target
- build repeatable due-diligence demonstrations for browser-based AI ecosystems

## GitHub workflow

Use pull requests for changes to protected branches.

Recommended development pattern:

```bash
git switch main
git pull --ff-only origin main
git switch -c work/
git status --short
git add
git commit -m ""
git push -u origin work/
gh pr create --base main --head work/ --title "" --body ""
```

Do not force-push protected branches.

## License

GNU Affero General Public License v3.0 or later.

See:

```text
LICENSE
```

## Disclaimer

These tools are intended for authorized security testing and research purposes only. Users are responsible for complying with all applicable laws, organizational rules, and written authorization requirements before testing any system. The author assumes no liability for misuse.

## Guided lab tooling policy

All guided lab tooling must be free and open source.
Labs may use tools available from Parrot OS, Kali Linux, Debian-derived repositories, upstream project source, or project-managed Python code. If a suitable free and open source tool is not available for a lab, this project provides a purpose-built Python tool for that lab.

Guided labs must not require commercial-only, paid-only, proprietary-only, trialware, or closed-source tooling.

### Guided DOM/render mismatch lab

The toolkit includes an implemented local-only Guided Lab Mode exercise for DOM versus browser-rendered content mismatch.

```text
tools/run_dom_render_lab.py
payloads/ollama_webui_dom_render_cases.yaml
src/ai_browser_security_suite/dom_render.py
```

The lab maps to `guided.dom_render_mismatch` and target scenario `browser.dom_render_mismatch`. It captures raw DOM state, browser-rendered visible text, computed style findings, screenshot evidence, model-bound context, model-response placeholder evidence, `evidence.jsonl`, `artifact-manifest.json`, and an analyst-readable `report.md`.

Static HTML parsing alone is not sufficient for this lab. The intended live capture path uses Playwright. Tests use deterministic purpose-built Python renderers so CI remains reproducible without requiring external targets.

### Guided iframe/frame-tree lab

The toolkit includes an implemented local-only Guided Lab Mode exercise for iframe and nested browsing context evidence.

```text
tools/run_iframe_frame_tree_lab.py
payloads/ollama_webui_iframe_frame_tree_cases.yaml
src/ai_browser_security_suite/iframe_frame_tree.py
```

The lab maps to `guided.iframe_frame_tree_evidence` and target scenario `browser.iframe_frame_tree`. It captures browser-observed frame relationships, frame URLs, top-page DOM, child-frame DOM snapshots, sandbox findings, srcdoc findings, cross-frame rendered text, model-bound context, model-response placeholder evidence, `evidence.jsonl`, `artifact-manifest.json`, and an analyst-readable `report.md`.

Browser rendering and frame-tree observation are required.
Static HTML parsing alone is not sufficient for this lab. The intended live capture path uses Playwright. Tests use deterministic purpose-built Python renderers so CI remains reproducible without requiring external targets.
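To make the frame-tree evidence concrete, the sketch below walks a nested frame structure and derives a flat URL list plus a simple sandbox finding. It is an illustration under assumptions, not the code in `iframe_frame_tree.py`: the `url`, `sandbox`, and `children` keys, the example paths, and the function names are all hypothetical, and the real `frame-tree.json` layout may differ.

```python
from collections.abc import Iterator


def walk_frames(frame: dict, depth: int = 0) -> Iterator[tuple[int, dict]]:
    """Yield (depth, frame) pairs in document order, parent before children."""
    yield depth, frame
    for child in frame.get("children", []):
        yield from walk_frames(child, depth + 1)


def frame_url_list(tree: dict) -> list[str]:
    """Flatten the tree into the ordered list of frame URLs."""
    return [frame["url"] for _, frame in walk_frames(tree)]


def unsandboxed_child_frames(tree: dict) -> list[str]:
    """URLs of child frames (depth >= 1) with no sandbox attribute recorded."""
    return [
        frame["url"]
        for depth, frame in walk_frames(tree)
        if depth >= 1 and frame.get("sandbox") is None
    ]


# Hypothetical tree shape: one sandboxed child containing a srcdoc frame,
# and one unsandboxed sibling.
example_tree = {
    "url": "http://127.0.0.1:11435/",
    "children": [
        {
            "url": "http://127.0.0.1:11435/a",
            "sandbox": "allow-scripts",
            "children": [{"url": "about:srcdoc", "children": []}],
        },
        {"url": "http://127.0.0.1:11435/b"},
    ],
}
```

The tree itself still has to come from a real browser observation, as the lab requires; this walk only post-processes what the browser reported.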