Justin0504/federated-agent-audit

GitHub: Justin0504/federated-agent-audit

Federated agent audit: protecting privacy and compliance in multi-agent AI systems.

Stars: 3 | Forks: 3

# Federated Agent Audit

**Trace and audit any multi-agent system for privacy and compliance risks, without the central auditor ever seeing raw content.**

```
pip install federated-agent-audit
```

[![CI](https://static.pigsec.cn/wp-content/uploads/repos/2026/06/b058f8e39f093218.svg)](https://github.com/Justin0504/federated-agent-audit/actions/workflows/ci.yml)
[![PyPI version](https://img.shields.io/pypi/v/federated-agent-audit.svg)](https://pypi.org/project/federated-agent-audit/)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/license-Apache%202.0-green.svg)](LICENSE)
[![Tests](https://img.shields.io/badge/tests-723%20passing-brightgreen.svg)](tests/)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)

Think of it as **LangSmith/Langfuse for multi-agent systems, but federated**: your prompts and outputs never leave the agent's own environment. It rests on two pillars and is framework- and scenario-agnostic:

1. **Behavior tracing.** Capture the real agent-to-agent interaction graph (who sent what to whom, tool calls, hand-offs) from CrewAI · LangGraph · AutoGen · OpenAI Agents · LlamaIndex, or any custom orchestration.
2. **Federated desensitized audit.** Each agent audits locally; the central auditor sees only hashed, anonymized, DP-noised metadata, yet still detects the compositional privacy/compliance risks that emerge across agents.

**Who is this for?** Anyone running a multi-agent system who needs to observe and govern its behavior. The need is most acute for teams that cannot send raw prompts to a third-party vendor (regulated data, on-premises deployment, data residency). A single-LLM-app entry point is built in (see the firewall below).

## 30-second quickstart

```
from federated_agent_audit import scan

result = scan("Zhang Wei's SSN is 123-45-6789, salary $185,000")
print(result["clean"])     # False
print(result["detected"])  # ['SSN', 'salary']
print(result["text"])      # "Zhang Wei's [REDACTED] is [SSN], [REDACTED] [DOLLAR_AMOUNT]"
```

```
echo "credit card 4532-1234-5678-9012" | federated-audit scan
# REDACTED Detected: credit card
```

## Protect your LLM calls

Automatically intercept every OpenAI/Anthropic response: this is the single-app entry point. It is production-hardened: it fails open (the firewall will not crash your app), streaming is blocked once a violation has accumulated, and tool-call arguments are inspected for sensitive content.

```
from federated_agent_audit import firewall

fw = firewall(["salary", "SSN", "diagnosis"])
fw.patch_openai()  # done: every response (incl. streaming + tool calls) is now checked

# `client` is your existing OpenAI client instance
response = client.chat.completions.create(model="gpt-4o", messages=[...])
# Sensitive content in the response is already redacted
```

## The problem

Multi-agent systems create **compound privacy risks** that single-agent tools cannot detect:

- Agent A shares salary data with Agent B (allowed by A's policy)
- Agent B forwards a "summary" to an external partner (allowed by B's policy)
- **Result**: salary data leaks outside the company, yet neither agent violated its own rules

Existing observability tools (LangSmith, Langfuse) require uploading raw prompts to their servers. This framework audits agent interactions **without the central auditor ever seeing raw content**.

📖 **[Case study](docs/CASE_STUDY.md)**: a leak that emerges only from *composing* two policy-compliant agents, caught while the raw PHI/PII never leaves the agents (`python examples/case_study_healthcare_leak.py`).

```
               +---------------+
               |    Central    |   Phase 2: Network audit
               |    Auditor    |   (desensitized metadata only)
               +-------+-------+
                       |
       +---------------+---------------+
       |               |               |
+------+------+   +----+----+ +--------+------+
| Local Audit |   |  Local  | |  Local Audit  |   Phase 1
|  (Agent A)  |   | (Agt B) | |   (Agent C)   |
+-------------+   +---------+ +---------------+
  raw content     raw content    raw content
  stays here      stays here     stays here
```

See **[ARCHITECTURE.md](docs/ARCHITECTURE.md)** for the two-sided model (edge vs. center), deployment topologies, and tamper-evidence guarantees.

## Multi-agent tracing and auditing

The integrations capture the **real agent-to-agent interaction graph**, which is exactly what the compositional, cascade, and cross-domain detectors analyze. Everything builds on `MultiAgentTracer`, which works with any framework (or none):

```
from federated_agent_audit import MultiAgentTracer, PrivacyPolicy

tracer = MultiAgentTracer()
tracer.register_agent("hr_bot", PrivacyPolicy(agent_id="hr_bot", must_not_share=["salary"]))

# Each call is a real directed edge; taint (domains, sensitivity, origin,
# hop count) propagates across hops automatically.
tracer.record_handoff("hr_bot", "summary_bot", "Zhang Wei earns $185k", origin="zhang_wei")
tracer.record_handoff("summary_bot", "external_bot", "candidate compensation summary")

result = tracer.network_audit()  # Phase-2 central audit
incidents = tracer.aggregated()  # denoised, actionable alerts
```

**Tracing, not just auditing.** See what your agents did, in chronological order and desensitized, whether or not anything went wrong. Raw content is never included:

```
tracer.timeline()  # [{seq, agent, to, action, domains, sensitivity, local_action, timestamp}, ...]
tracer.summary()   # per-agent sent/received/internal counts + domains touched
tracer.export()    # full interaction graph as a JSON-able dict (hashes + metadata, no raw text)
```

It catches compound leaks that no single agent's policy can see, while the central auditor still never touches raw data (`python examples/multiagent_trace_demo.py`):

```
Incidents: 5 alert_summary={'critical': 3, 'high': 2}
  [CRITICAL] cross_domain_leak — Sensitive health data reaches social domain via 2-agent chain
  [CRITICAL] cross_domain_leak — Sensitive finance data reaches social domain via 2-agent chain
  [CRITICAL] taint_spreading — Data from origin 'zhang_wei' spread to 4 agents across the network
  [HIGH] inference_accumulation — external_bot accumulated high inference risk (77%)
  [HIGH] compound_scope_escalation — 3 agent pairs exceed authorized scope
Privacy verification (central reports):
  hr_bot → clean
  health_bot → clean
  summary_bot → clean
```

## Framework integrations

```
# CrewAI: captures agent delegation (Delegate/Ask coworker) as A→B edges
from federated_agent_audit.sdk import crew_audit
crew = crew_audit(crew, default_policy=policy)
crew.kickoff()
result = crew._federated_tracer.network_audit()

# LangChain / LangGraph: per-node identity + node-to-node hand-offs
from federated_agent_audit.sdk import langchain_callback
handler = langchain_callback(default_policy=policy)  # asynchronous=True for async graphs
graph.invoke(input, config={"callbacks": [handler]})
result = handler.tracer.network_audit()

# AutoGen / AG2: hooks every agent-to-agent message
from federated_agent_audit.sdk import autogen_audit
tracer = autogen_audit([assistant, user_proxy, critic], default_policy=policy)
user_proxy.initiate_chat(assistant, message="...")
result = tracer.network_audit()

# OpenAI Agents SDK: captures first-class handoffs (run inside an async function)
from federated_agent_audit.sdk import openai_agents_hooks
hooks = openai_agents_hooks(default_policy=policy)
await Runner.run(triage_agent, input="...", hooks=hooks)
result = hooks.tracer.network_audit()

# LlamaIndex AgentWorkflow: captures hand-offs from the event stream
# (run inside an async function)
from federated_agent_audit.sdk import llamaindex_handler
h = llamaindex_handler(default_policy=policy)
async for event in workflow.run(user_msg="...").stream_events():
    h.handle_event(event)
result = h.tracer.network_audit()

# Generic Python: single-agent decorator
from federated_agent_audit import audited

@audited(policy, to_agent="downstream")
def my_agent(text: str) -> str: ...
```

## What it detects

| Risk | What happens | How we catch it |
|------|-------------|-----------------|
| **Cross-domain leak** | Health data reaches a social/external agent | Domain-boundary analysis on metadata |
| **Cross-owner leak** | My agent leaks *my* private data to another user's agent | Owner-boundary analysis (taint origin vs. recipient owner) |
| **Compositional inference** | An agent collects health + identity = re-identification | Quasi-identifier assembly detection |
| **Aggregation attack** | 3 agents each share one fragment → a hub reconstructs the profile | Multi-source convergence analysis |
| **Cascade injection** | A prompt injection spreads between agents like a worm | Infection tree + patient-zero attribution |
| **Collusion** | Two agents exchange complementary data to reconstruct a profile | Bidirectional complementary-flow detection |
| **Behavioral drift** | An agent suddenly changes behavior (possibly compromised) | Cross-session z-score monitoring |
| **Negative inference** | "I can't share that" confirms the data exists | Refusal-pattern detection |
| **Regulatory gaps** | EU AI Act / GDPR / CA SB 243 / COPPA requirements unmet | Per-article compliance scoring |

## Detection effectiveness

A labeled benchmark (real compositional leaks vs. benign traffic) measures detection quality, not just speed:

```
python benchmarks/detection_eval.py  # precision / recall / F1 (clean desensitized data)
python benchmarks/dp_eval.py         # accuracy under full desensitizer + DP
```

**On clean desensitized data** (33 scenarios: 19 leaks + 14 benign, including adversarial cases: noise-buried leaks, diamond multipath, same-domain laundering, injection worms, sensitivity under-reporting evasion, multi-source aggregation, slow-drip identity assembly, cross-owner group leaks, collusion): **precision 1.0 / recall 1.0 / F1 1.0**, zero raw-content leaks, stable across thresholds 0.3–0.8. Locked in by `tests/test_detection_benchmark.py`.

**Under full desensitization + differential privacy** (averaged over trials):

| DP epsilon | Recall | Specificity | F1 | Raw leaks |
|---|---|---|---|---|
| 3.0 | 0.89 | 0.93 | 0.91 | **0** |
| 1.0 | 0.89 | 0.93 | 0.91 | **0** |
| 0.5 | 0.89 | 0.94 | 0.92 | **0** |

Domains are protected **structurally** (k-anonymity generalization) rather than by per-domain randomized response, which would fabricate spurious sensitive edges and drop precision to ~0.17, and taint is preserved through DP. Locked in by `tests/test_dp_robustness.py`. Field-verified with **LangGraph** (in the suite) and with **CrewAI** + **OpenAI streaming** (opt-in examples).

## Forced embedding, attestation, and integrity

For deployments where the auditor ships inside every downloaded agent (a mandatory compliance SDK), the central auditor can verify that the edge is not cheating. **Tampering is evident**, and the auditor still never sees raw content:

- **Attestation.** Build-fingerprint pinning + HMAC + a per-agent sequence/hash chain catch modified builds and altered or omitted reports. A pluggable backend leaves a **TEE upgrade path** (`CallableBackend` + `evidence_validator`) toward tamper-*resistance*.
- **Cross-corroboration.** Recipients record desensitized receipts, so a sender that drops an edge is caught; a single malicious actor cannot hide.
- **Challenge/reveal.** The center can demand one committed entry with a Merkle proof, without browsing the rest.

Full loop: `python examples/marketplace_forced_embed.py`.

## Compliance engine

Built-in regulatory mappings for the EU AI Act, GDPR, CA SB 243, and COPPA:

```
from federated_agent_audit import ComplianceEngine

engine = ComplianceEngine(eu_users=True, california_users=True, involves_children=False)
report = engine.evaluate(audit_result)
print(report.overall_score, report.status)  # 0.0–1.0 · compliant / partial / non_compliant
for gap in report.gaps():
    print(f"{gap.regulation} {gap.article}: {gap.remediation}")
```

## CLI & YAML policies

```
federated-audit scan "Patient SSN is 123-45-6789"  # scan text (or pipe via stdin)
federated-audit validate policies/*.yaml           # validate policy files
federated-audit demo                               # quick multi-agent demo
federated-audit server --port 8000                 # start the central audit server
```

```
# policies/hr_bot.yaml
agent_id: hr_bot
must_not_share: [salary, SSN, performance review]
acceptable_abstractions: {salary: compensation level, SSN: employee identifier}
sensitivity_threshold: 3
```

## Installation

```
pip install federated-agent-audit               # core
pip install "federated-agent-audit[crewai]"     # + a framework adapter (or langchain/
                                                #   langgraph/autogen/openai-agents/llamaindex)
pip install "federated-agent-audit[transport]"  # + the central audit server
pip install "federated-agent-audit[all]"        # everything
```

## How it works

```
49 modules · 723 tests · 0 external API calls required

Local (Phase 1, at the edge):      Network (Phase 2, at the center):
  PrivacyGate (regex + PII)          Cross-domain / cross-owner detection
  SemanticDetector (4-tier)          Compositional leak (quasi-id assembly)
  TaintTracker (info flow)           Cascade infection (patient-zero)
  Desensitizer (6-layer)             Aggregation / collusion / multihop
  MemoryAuditor (write audit)        Topology + blame attribution
  Attestor (tamper-evidence)         Compliance engine + risk aggregation
```

**Privacy guarantee**: the central auditor is architecturally unable to reconstruct raw content. Data is hashed, anonymized, and DP-noised before it leaves the local agent; Merkle-tree commitments make the audit trail **tamper-evident** without revealing entries.

## Development

```
git clone https://github.com/Justin0504/federated-agent-audit
cd federated-agent-audit
pip install -e ".[dev,langchain,langgraph,transport,yaml]"
pytest                              # 723 tests
ruff check src/ tests/ benchmarks/  # lint
python examples/multiagent_trace_demo.py
```

Contributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md), the [roadmap](ROADMAP.md), and issues labeled **good first issue**.

## License

Apache 2.0

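## Illustrative sketch: a compound leak seen through metadata only

The README's core claim is that a leak can emerge from two policy-compliant hand-offs, and that a central auditor can still spot it from desensitized metadata. The following is a minimal, standard-library sketch of that idea, not the project's implementation. Every name in it (`EdgeReport`, `local_report`, `find_cross_domain_leaks`, the `agent_zone` mapping, the domain labels) is hypothetical and chosen for illustration. It assumes each edge reports only a content hash plus the domains its local auditor observed, and it leaves out the anonymization and DP-noise layers the project describes.

```python
import hashlib
from dataclasses import dataclass


# Hypothetical types and functions: this is NOT the federated-agent-audit API.
@dataclass(frozen=True)
class EdgeReport:
    """Metadata an edge auditor might emit for one hand-off (no raw text)."""
    sender: str
    receiver: str
    content_hash: str   # SHA-256 of the message; the text itself stays local
    domains: frozenset  # domains the local auditor saw in this message


def local_report(sender, receiver, raw_text, domains):
    """Phase 1 (edge): reduce a raw message to hash + domain labels."""
    digest = hashlib.sha256(raw_text.encode("utf-8")).hexdigest()
    return EdgeReport(sender, receiver, digest, frozenset(domains))


def find_cross_domain_leaks(reports, agent_zone,
                            sensitive=frozenset({"finance", "health"})):
    """Phase 2 (center): propagate taint hop by hop over the reported edges.

    `reports` is assumed to be in chronological order. A receiver inherits
    the sensitive domains of the message and of everything its sender has
    already received, which is a deliberately conservative assumption.
    """
    taint = {}      # agent -> sensitive domains it has been exposed to
    incidents = []  # (domain, external agent that it reached)
    for r in reports:
        carried = (r.domains | taint.get(r.sender, frozenset())) & sensitive
        taint.setdefault(r.receiver, set()).update(carried)
        if agent_zone.get(r.receiver) == "external":
            incidents.extend((d, r.receiver) for d in sorted(carried))
    return incidents


# Two hand-offs, each fine under its own agent's policy.
reports = [
    local_report("hr_bot", "summary_bot", "Zhang Wei earns $185k", {"finance"}),
    local_report("summary_bot", "external_bot",
                 "candidate compensation summary", {"general"}),
]
zones = {"hr_bot": "internal", "summary_bot": "internal",
         "external_bot": "external"}

incidents = find_cross_domain_leaks(reports, zones)
print(incidents)  # [('finance', 'external_bot')]
```

The second hop looks harmless in isolation because its own domain label is `general`; the incident appears only once the center combines both edges. Treating every message from a tainted agent as tainted will over-report in practice, so a real detector would need finer-grained flow tracking than this sketch attempts.
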
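Tags: AI security, Apache 2.0 license, Atomic Red Team, AutoGen, Chat Copilot, CrewAI, Langfuse, LangGraph, LangSmith, LlamaIndex, OpenAI Agents, ProjectDiscovery, PyRIT, Python, code review, on-premises deployment, compliance risk, multi-agent systems, security compliance, security testing, offensive security, data residency, data protection, data leak detection, data privacy, no backdoors, regulated data, network proxy, federated learning, federated audit, custom orchestration, behavior tracing, reverse engineering tools, privacy audit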