nmicic/compartment

GitHub: nmicic/compartment

Kernel-enforced sandboxing toolkit with zero-dependency core tools and a shared profile format.

Stars: 3 | Forks: 0

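# Compartment: Linux Process Isolation Toolkit

Kernel-enforced sandboxing for untrusted processes. Two zero-dependency core tools, one shared profile format, and an optional BPF-LSM module.

## What It Is

| Tool | Purpose | Root? | Dependencies |
|------|---------|-------|--------------|
| **compartment-user** | Landlock + seccomp + environment sanitization + audit | No | None |
| **compartment-root** | Full namespace container + seccomp + audit | Yes | None |
| **sandbox.sh** | Network namespace + proxy bridge | No | unshare, socat, newuidmap |
| **compartment-bpf** | Optional BPF LSM inode sealing (kernel-side denial, even for root) | Yes (CAP_BPF + CAP_SYS_ADMIN) | clang ≥ 12, libbpf, bpftool, libsodium, BTF, kernel ≥ 6.6 with `lsm=...,bpf` |

`compartment-user` / `compartment-root` use Landlock + seccomp (userspace policy) and are the default zero-dependency path. `compartment-bpf` seals individual inodes at the BPF LSM level; enforcement happens in the kernel and survives even root. The two approaches are complementary. See `compartment-bpf/HOWTO.md` for the BPF tool and its requirements.

## Quick Start

```
make
./compartment-user -- /bin/sh            # sandboxed shell in 2 commands
./compartment-user --dry-run -- /bin/sh  # see what would be applied
```

## Build

```
make                    # builds the zero-dependency core tools
make test               # run core tests (Landlock + seccomp + env + inheritance)
make test-integration   # run all tests (includes Claude CLI smoke test)
make hardened           # build with randomized shell stash path
```

Optional BPF module:

```
cd compartment-bpf
make vmlinux.h && make
```

Requires Linux kernel 6.6+ with BPF LSM enabled, plus clang, libbpf-dev, bpftool, libsodium-dev, and BTF for the running kernel.

## Usage

```
# Sandbox an AI agent (Landlock + seccomp, no root)
./compartment-user -- claude --model claude-opus-4-6

# Use a profile
./compartment-user --profile strict -- codex --full-auto

# See what would be applied without running anything
./compartment-user --dry-run -- claude

# Full namespace container (requires root)
./compartment-root --profile examples/container.conf -- /bin/sh
./compartment-root --rootdir /srv/jail -U svc --audit -- /usr/bin/myapp

# Full isolation (network namespace + proxy + Landlock + seccomp)
./sandbox.sh claude --model claude-opus-4-6
```

## Example: Hardened SSH (Privilege Separation for Network Clients)

Compartment can lock down any network client, not just AI agents. Below is a working example using SSH that shows how to split a process into privilege-separated components so that no single compromise can both access secrets and exfiltrate them.

### The Problem

If a remote SSH server is compromised, it can reverse-exploit the SSH client. A trojaned client can:

- Write stolen credentials to `~/.ssh/exfil.txt`
- Log keystrokes to hidden files
- Exfiltrate data over the network to a third-party host

### Solution 1: Read-Only SSH (`ssh.conf`)

Lock the SSH client to read-only filesystem access. It can read keys for authentication but cannot write anything to disk:

```
# One-liner: SSH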
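# with no filesystem writes
./compartment-user --profile examples/ssh.conf -- ssh user@host

# What happens if the SSH binary tries to write:
#   touch /tmp/exfil.txt       → EACCES (blocked by Landlock)
#   echo x > ~/.ssh/log.txt    → EACCES (blocked by Landlock)
#   cat ~/.ssh/id_ed25519      → OK (read allowed)
```

### Solution 2: Paranoid SSH (`paranoid-ssh.sh`)

Split SSH into two sandboxed processes with complementary restrictions:

```
┌──────────────────────────┐     ┌──────────────────────────┐
│ SSH (read-only fs)       │────▶│ socat (no user files)    │────▶ remote:PORT
│ • can read keys          │     │ • no $HOME access        │
│ • cannot write anywhere  │     │ • cannot read SSH keys   │
│ • Landlock + seccomp     │     │ • Landlock + seccomp     │
└──────────────────────────┘     └──────────────────────────┘
        localhost:RANDOM_PORT
```

```
# Paranoid SSH to a remote server
./examples/paranoid-ssh.sh user@remote-host

# With a custom port
./examples/paranoid-ssh.sh user@remote-host -p 2222

# Run a command
./examples/paranoid-ssh.sh user@remote-host "uptime"
```

**Security properties:**

- **The SSH process** can read `~/.ssh/` keys but cannot write to disk → a reverse-exploited SSH cannot save stolen data locally
- **The socat process** has network access but cannot read any user files → even if socat is exploited, the attacker cannot reach credentials
- **No single process** can both access secrets and exfiltrate them

This is the same principle as OpenSSH's own privilege separation, but it is enforced at the OS level with Landlock + seccomp rather than by trusting the application to separate itself.

### Why This Matters (the 2026 Paradigm)

Traditional sysadmin thinking: "SSH is trusted, the network is untrusted."

Compartment thinking: "Nothing is fully trusted. Split every process so that compromising any single component cannot achieve both data access and data exfiltration."

This pattern applies to any network client:

- **curl/wget**: a read-only profile prevents saving downloaded malware
- **git**: a read-only profile for fetch, a write profile for the workdir
- **Database clients**: prevent credentials from being logged to disk
- **AI agents**: the primary use case (see `sandbox.sh`)

## How It Works

**compartment-user** applies kernel-enforced restrictions before exec:

1. `PR_SET_NO_NEW_PRIVS`: prevents privilege escalation
2. **Landlock**: filesystem path restrictions (read-only system paths, writable working directory)
3. **seccomp BPF**: blocks dangerous syscalls (ptrace, mount, kexec, bpf, io_uring, ...)
4. **Environment sanitization**: removes LD_PRELOAD, LD_LIBRARY_PATH, etc.
5. **Audit logging**: daily log files with the PPID chain

All restrictions are inherited by child processes and cannot be removed.

**compartment-root** creates a fully isolated container:

1. `clone()` with new UTS, mount, PID, IPC, net, and user namespaces
2. **pivot_root**: the old root is fully unmounted (stronger than chroot)
3. Minimal `/dev`, hidden `/proc`, isolated hostname
4. **Capability dropping**: raw prctl + capset, no libcap. `cap-allow` retains the named capabilities for the service user via `PR_SET_KEEPCAPS` + `capset()`
5. **seccomp BPF**: raw BPF, no libseccomp
6. **Environment sanitization** + **audit logging** (same as compartment-user)

**sandbox.sh** wraps a command in user + mount namespaces:

1.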
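   `unshare --user --mount --net`: hard mode is loopback only (no external interfaces); the soft fallback is slirp4netns with `--disable-host-loopback`
2. Unix socket proxy bridge: API traffic is routed through the corporate proxy
3. Bind-mount shell replacement: every `/bin/bash` child process is sandboxed (requires the mount namespace, which sandbox.sh creates)

In hard mode, if a proxy is configured, the namespace has no external interfaces, so network traffic flows only through the Unix socket proxy bridge. (Automated tests verify proxy reachability but do not yet include direct bypass-resistance tests.)

## Profiles

Both tools share the same `.conf` format:

```
# Filesystem (compartment-user: Landlock)
ro /usr
rw $HOME

# Filesystem (compartment-root: namespaces)
rootdir /srv/containers/default
uid 1000
gid 1000
username svc
loopback on

# Syscalls
block ptrace
block mount
# Or allowlist mode:
# allow read
# allow write

# Environment
env-deny LD_PRELOAD

# Features
seccomp on
no-new-privs on
env-sanitize on
audit on
```

Search order: `--profile /path/file.conf` → `~/.config/compartment/.conf` → `/etc/compartment/.conf` → built-in. See [HOWTO.md](HOWTO.md) for the full format reference.

## Examples

Each profile targets a different threat model. Pick the one that matches what you are protecting.

| File | Use when | Protects against |
|------|----------|------------------|
| `ai-agent.conf` | Running Claude, Codex, or Gemini CLI | The agent reading/writing outside the working directory or spawning unexpected processes |
| `strict.conf` | Running untrusted code; stricter than ai-agent | Same as above, with a smaller syscall surface |
| `ssh.conf` | Running an SSH client to connect to machines you don't fully trust | A compromised SSH binary writing credentials to disk |
| `socat-proxy.conf` | Used internally by `paranoid-ssh.sh` | socat having access to your SSH keys |
| `container.conf` | Full namespace isolation via compartment-root | A process escaping its root directory |
| `dev.conf` | Development and debugging | Nothing; it is intentionally relaxed |

**Which one should I use?**

- **Sandboxing an AI agent** → `ai-agent.conf` (default) or `strict.conf` (stricter)
- **Connecting to a remote server** → `paranoid-ssh.sh`, which combines `ssh.conf` + `socat-proxy.conf`. The SSH process can read your keys but cannot write anywhere. The socat process handles the network connection but cannot read your keys. No single process can both steal credentials and exfiltrate them.
- **Running an untrusted service** → `container.conf` with `compartment-root`
- **Finding out why something is blocked** → `dev.conf`, then tighten from there

## Profiling Any Program

Don't write profiles by hand. Use `tools/syscall.py` to generate one automatically for any program:

```
# Step 1: Check whether the default profile would break your program
python3 tools/syscall.py check --profile ai-agent -- wget -q -O /dev/null https://example.com

# Step 2: If it breaks, generate a custom profile
python3 tools/syscall.py profile -o examples/wget.conf -- wget -q -O /dev/null https://example.com

# Step 3: Use it
./compartment-user --profile examples/wget.conf -- wget https://example.com
```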
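A generated profile is plain text in the `.conf` format shown under Profiles, so it can be reviewed before it is used. The following is a minimal sketch of such a review helper, not part of the repository: `parse_profile` and `SAMPLE` are hypothetical names. It assumes one `directive value` pair per line and that `#` starts a comment anywhere on a line. The project's real parser may handle edge cases differently.

```python
from collections import defaultdict


def parse_profile(text):
    """Group profile lines into {directive: [values, ...]}."""
    directives = defaultdict(list)
    for raw in text.splitlines():
        # Assumption: '#' starts a comment anywhere on the line.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)  # directive, then the rest of the line
        value = parts[1].strip() if len(parts) > 1 else ""
        directives[parts[0]].append(value)
    return dict(directives)


# Hypothetical sample using directives that appear in the Profiles section.
SAMPLE = """\
# Filesystem
ro /usr
rw $HOME
# Syscalls
block ptrace
block mount
# allow read
env-deny LD_PRELOAD
seccomp on
"""

if __name__ == "__main__":
    # Print a one-line summary per directive for a quick manual review.
    for directive, values in parse_profile(SAMPLE).items():
        print(f"{directive}: {', '.join(values)}")
```

To review a real file, pass its contents instead of `SAMPLE`, for example `parse_profile(open("examples/wget.conf").read())`, and check that the `rw` and `block` entries match what you expect the program to need.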
This works with anything: `curl`, `git`, `ssh`, `rsync`, `python3`, database clients. Any program you can run, you can profile and sandbox.

```
# More examples
python3 tools/syscall.py profile -o curl.conf -- curl -s https://example.com
python3 tools/syscall.py profile -o git.conf -- git clone https://github.com/user/repo
python3 tools/syscall.py profile -o psql.conf -- psql -c "SELECT 1"

# Strict allowlist (allow only the observed syscalls, deny everything else)
python3 tools/syscall.py profile --seccomp-mode allow -o strict-curl.conf -- curl https://example.com

# See which syscalls a program actually uses
python3 tools/syscall.py trace -- ssh user@host "echo hello"
```

Requires `strace` (`apt install strace`). See [tools/HOWTO-syscall-profiling.md](tools/HOWTO-syscall-profiling.md) for the full guide.

## Shell Replacement

compartment-user can transparently intercept `/bin/bash`, so every child process spawned by an AI agent is sandboxed:

```
/bin/bash (bind mount) → compartment-user
    → Landlock + seccomp applied
    → exec /bin/shells/bash (the real shell)
```

When compartment-user is compiled and available, this happens automatically inside `sandbox.sh`. See [HOWTO.md](HOWTO.md) for manual setup options.

## Advanced Deployment: Compartment as a Login Shell

Compartment can be deployed as the login shell for non-admin users, so that every interactive session and every `execve("/bin/sh", ...)`, including remote exploit payloads, automatically lands in a sandboxed shell.

This is an opinionated setup for controlled environments (hardened servers, jump hosts, CI runners), not a general recommendation.

### Setup

```
# Build with a randomized shell stash path
make hardened
# Output: REAL_SHELL_DIR=/bin/.shells_a1b2c3d4e5f6

# Keep the real shells in the stash directory
sudo mkdir -p /bin/.shells_a1b2c3d4e5f6
sudo mv /bin/bash /bin/.shells_a1b2c3d4e5f6/bash
sudo mv /bin/sh /bin/.shells_a1b2c3d4e5f6/sh

# Install compartment-user as the system shell
sudo cp compartment-user /bin/bash
sudo ln -sf /bin/bash /bin/sh

# Keep a normal shell for designated admin accounts
sudo chsh -s /bin/.shells_a1b2c3d4e5f6/bash root
sudo chsh -s /bin/.shells_a1b2c3d4e5f6/bash your-admin-user
```

When invoked as `bash` or `sh` (detected via `argv[0]`), compartment-user applies the `ai-agent` profile and execs the real shell from the hidden directory.

### Permission Model

```
root / admin  → /bin/.shells_.../bash  (real shell, no sandbox)
all others    → /bin/bash              (compartment → sandboxed shell)
                                       Landlock + seccomp + env sanitize + audit
```

### What This Stops

A remote exploit payload that calls `execve("/bin/sh", ...)` gets Compartment instead of bash. The payload hits the Landlock filesystem restrictions and the seccomp syscall filter before it can run a single attacker-controlled command. The sandboxed shell cannot `ptrace`, cannot load kernel modules, cannot mount filesystems, and can write only to permitted paths.

### Caveats

-
  **Compatibility**: Some workflows expect an unrestricted interactive shell and may break. Test thoroughly before deploying to production.
- **Not a replacement**: This does not replace proper host hardening, patching, and privilege separation. It is a defense-in-depth layer.
- **Bypass paths exist**: An attacker who can write an ELF binary to an executable path and invoke it directly (rather than through `/bin/sh`) bypasses the shell-replacement layer. Landlock on the parent process limits where they can write, but this is not airtight.
- **Recovery**: Always keep at least one admin account with a real shell. If compartment-user has a bug, you need a way back in.

## Requirements

- Linux >= 5.13 (Landlock): compartment-user
- Linux >= 4.6 (cgroup namespaces): compartment-root
- Linux >= 3.8 (user namespaces): sandbox.sh
- No external libraries. compartment-user does not require root.

## Files

```
compartment.h         - shared code: profiles, audit, seccomp BPF, env sanitize
compartment-user.c    - Landlock + seccomp + audit (zero deps, rootless)
compartment-root.c    - Full namespace container (zero deps, requires root)
sandbox.sh            - Network namespace + proxy bridge
Makefile              - Build targets
HOWTO.md              - Detailed setup guide
DESIGN.md             - Architecture, security review, lineage from shell-guard
SECURITY.md           - Vulnerability reporting policy
examples/
  ai-agent.conf       - Profile for Claude/Codex/Gemini
  strict.conf         - Locked-down profile (inherits ai-agent)
  container.conf      - Full namespace isolation profile
  dev.conf            - Relaxed profile for development
  ssh.conf            - Read-only SSH client (no filesystem writes)
  socat-proxy.conf    - Network-only socat bridge (no user file access)
  paranoid-ssh.sh     - Privilege-separated SSH (SSH+socat split)
tools/
  syscall.py          - Profile generator: trace any program, emit .conf
  HOWTO-syscall-profiling.md - Full guide to syscall profiling
man/
  compartment-user.1  - Man page (section 1: user commands)
  compartment-root.8  - Man page (section 8: system administration)
tests/
  probes/deny_probe.c - Sandbox validation probe (machine-parseable output)
  profiles/           - Test-specific .conf profiles
  scripts/run_all.sh  - Top-level test runner (52 tests across 4 suites)
  README.md           - Test documentation
archive/
  shell-guard/        - Archived shell-replacement tool (~2003, self-contained)
```

## Comparison with Alternatives

```
                       root required?
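                     no              yes
              ┌───────────────┬───────────────┐
filesystem    │ compartment-  │ compartment-  │
restriction   │ user          │ root          │
mechanism     │ (Landlock)    │ (pivot_root)  │
              │               │               │
              │ Firejail      │ bwrap (setuid)│
              │ bwrap (userns)│ Docker/Podman │
              ├───────────────┼───────────────┤
no filesystem │ seccomp-only  │ AppArmor      │
restriction   │ wrappers      │ SELinux       │
              └───────────────┴───────────────┘
```

- **Firejail** (~100K lines): the closest comparison. It has a mature profile ecosystem for desktop applications, but a large attack surface and a history of CVEs. compartment-user is 100x smaller and can be audited in one sitting.
- **bwrap** (~3K lines): mount/PID/network namespaces. The architecture is different (namespaces vs. Landlock). Use bwrap when you need full mount isolation or run a kernel < 5.13; use compartment-user when you need profiles, shell replacement, or operation inside containers where user namespaces are disabled.
- **Minijail** (Google): expressive seccomp argument filtering, but it requires libminijail. compartment-user trades argument filtering for zero-dependency deployment.
- **AppArmor/SELinux**: system-wide MAC, finer-grained, but they require admin access and system policy installation. compartment-user is user-deployable with no system configuration changes.

No existing tool combines zero dependencies, profiles with inheritance, a shell-replacement mode, and PPID-chain audit logging in about 1,600 lines.

## Related

- [bubblewrap](https://github.com/containers/bubblewrap): namespace-based sandboxing (complementary)
- [firejail](https://github.com/netblue30/firejail): namespaces + seccomp (setuid, profiles)

## Development

This project was developed with AI assistance:

- **[Claude Code](https://claude.ai/code)** (Anthropic): primary coding, testing, debugging, and implementation of all C source code, shell scripts, profiles, and test infrastructure
- **ChatGPT** (OpenAI), **Gemini** (Google), **Codex** (OpenAI): independent code review rounds that identified 18 security vulnerabilities, all of which were fixed before release
- **Human**: architecture, design decisions, review coordination, and final approval

## License

Apache-2.0. See [LICENSE](LICENSE).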

Tags: application security