ogulcanaydogan/Sovereign-RAG-Gateway
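A policy-first, OpenAI-compatible governance gateway that enforces policy evaluation, data redaction, and retrieval authorization in the request hot path and produces tamper-evident audit trails. Designed for regulated settings such as healthcare and finance.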
# Sovereign RAG Gateway
**A policy-first, OpenAI-compatible governance gateway for regulated AI workloads.**




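Sovereign RAG Gateway enforces runtime governance (authentication, policy evaluation, data redaction, and retrieval authorization) in the critical path of every LLM and RAG request, before traffic reaches an upstream provider. It produces tamper-evident, hash-chained decision trails that support forensic replay during incident response and regulatory audits.
Built for security engineering teams, platform teams, and SREs in healthcare, financial services, and other regulated domains, it addresses the shortcomings of after-the-fact controls.
## The Problem
Enterprise AI deployments in regulated industries face a structural gap: governance controls are bolted on after the fact. One service handles redaction, another handles policy, another handles routing, and audit logs are scattered across systems with no causal linkage. During an incident, no single system can reconstruct the complete decision path for a given request.
In healthcare (HIPAA), financial services (FCA, PRA), and other regulated domains, auditors require provable evidence that controls were enforced at decision time, not aspirational documentation that controls exist "somewhere" in the architecture. The fundamental question is: *which exact policy version evaluated this request, which transformations were applied to the data, and can you prove it cryptographically?*
After-the-fact logs cannot answer that question. If redaction runs in a separate, eventually consistent service, you cannot prove that PHI was removed before it left the boundary. If policy evaluation is asynchronous, you cannot prove that a request was governed before it reached the provider. Observability without enforcement is monitoring, not governance.
## How It Works
Sovereign RAG Gateway moves governance into the hot path. Every request passes through a deterministic enforcement pipeline before any data leaves the boundary: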
```
flowchart TD
A["Client Request"] --> B["Identity & Classification\n(tenant, user, data class)"]
B --> C{"Policy Evaluation\n(OPA)"}
C -- "OPA Unavailable" --> DENY["Deterministic Deny\n(fail-closed)"]
C -- "Deny" --> DENY
C -- "Allow" --> D["Data Redaction\n(PHI/PII, classification-aware)"]
D --> E{"RAG Enabled?"}
E -- "Yes" --> F["Retrieval Authorization\n(policy-scoped connectors)"]
F --> G["Citation Integrity\nEnforcement"]
G --> H["Provider Egress\n(upstream LLM)"]
E -- "No" --> H
H --> I["Response to Client"]
B -.-> AUD["Audit Artifact\n(request-linked, hash-chained)"]
C -.-> AUD
D -.-> AUD
F -.-> AUD
H -.-> AUD
style DENY fill:#d32f2f,color:#fff,stroke:#b71c1c
style AUD fill:#1565c0,color:#fff,stroke:#0d47a1
style C fill:#f57f17,color:#fff,stroke:#e65100
style A fill:#2e7d32,color:#fff,stroke:#1b5e20
style I fill:#2e7d32,color:#fff,stroke:#1b5e20
```
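The gateway is opinionated about failure behavior: if policy evaluation is unavailable, it defaults to a **deterministic deny**. In regulated environments, silently falling back to permissive behavior creates more incident and audit risk than an explicit denial.
### The Problem It Solves: Before and After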
```
flowchart LR
subgraph BEFORE["Typical Enterprise AI (Scattered Controls)"]
direction TB
APP1["App Code"] --> LLM1["LLM Provider"]
APP1 -.-> LOG1["Logger A"]
APP1 -.-> RED1["Redaction Svc"]
APP1 -.-> POL1["Policy Svc"]
APP1 -.-> AUD1["Audit DB"]
LOG1 ~~~ RED1
RED1 ~~~ POL1
POL1 ~~~ AUD1
end
subgraph AFTER["Sovereign RAG Gateway (Unified Control Plane)"]
direction TB
APP2["App Code\n(unchanged)"] --> GW["Gateway\n(policy + redact + audit + RAG)"]
GW --> LLM2["LLM Provider"]
end
style BEFORE fill:#fff3e0,stroke:#e65100
style AFTER fill:#e8f5e9,stroke:#2e7d32
style GW fill:#1565c0,color:#fff,stroke:#0d47a1
```
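## Architecture
The gateway consists of five cooperating layers. Each layer has a single responsibility, and data flows through them in a fixed, deterministic order: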
```
graph TB
subgraph GATEWAY["Sovereign RAG Gateway"]
direction TB
subgraph INGRESS["Ingress Layer"]
AUTH["Auth Middleware\n(Bearer + headers)"]
REQID["Request ID\nMiddleware"]
end
subgraph ENFORCEMENT["Enforcement Layer"]
POLICY["Policy Engine"]
REDACT["Redaction Engine\n(PHI/PII)"]
TRANSFORM["Policy Transforms"]
end
subgraph RETRIEVAL["RAG Layer"]
ORCH["Retrieval\nOrchestrator"]
REG["Connector Registry"]
FS["Filesystem\nConnector"]
PG["PostgreSQL\npgvector"]
S3C["S3\nConnector"]
CONFL["Confluence\nConnector"]
JIRAC["Jira\nConnector"]
end
subgraph EVIDENCE["Evidence Layer"]
AUDIT["Audit Writer\n(JSON Lines, hash-chained)"]
end
EGRESS["Provider Egress\n(OpenAI-compatible)"]
end
CLIENT["Client\n(OpenAI SDK)"] --> AUTH
AUTH --> REQID
REQID --> POLICY
POLICY --> TRANSFORM
TRANSFORM --> REDACT
REDACT --> ORCH
ORCH --> REG
REG --> FS
REG --> PG
REG --> S3C
REG --> CONFL
REG --> JIRAC
ORCH --> EGRESS
EGRESS --> LLM["Upstream LLM\nProvider"]
POLICY <--> OPA["OPA Server\n(policy bundles)"]
AUTH -.-> AUDIT
POLICY -.-> AUDIT
REDACT -.-> AUDIT
ORCH -.-> AUDIT
EGRESS -.-> AUDIT
style GATEWAY fill:#f5f5f5,stroke:#424242,stroke-width:2px
style INGRESS fill:#e3f2fd,stroke:#1565c0
style ENFORCEMENT fill:#fff3e0,stroke:#e65100
style RETRIEVAL fill:#e8f5e9,stroke:#2e7d32
style EVIDENCE fill:#ede7f6,stroke:#4527a0
style OPA fill:#f57f17,color:#fff,stroke:#e65100
style LLM fill:#78909c,color:#fff,stroke:#455a64
style CLIENT fill:#2e7d32,color:#fff,stroke:#1b5e20
```
### Module Map
| Layer | Modules | Responsibility |
|---|---|---|
| Ingress | `middleware/auth.py`, `middleware/request_id.py` | Identity extraction, classification headers, request tracing |
| Enforcement | `policy/client.py`, `policy/transforms.py`, `redaction/engine.py` | OPA evaluation, fail-closed contract, PHI/PII scrubbing |
| Retrieval | `rag/retrieval.py`, `rag/registry.py`, `rag/connectors/` | Policy-scoped dispatch across 5 connectors |
| Egress | `providers/registry.py`, `providers/http_openai.py`, `providers/azure_openai.py`, `providers/anthropic.py` | Multi-provider routing with streaming and cost-aware fallback |
| Evidence | `audit/writer.py` | Hash-chained JSON Lines, schema-validated audit events |
Full architecture reference: [`ARCHITECTURE.md`](ARCHITECTURE.md)
## Core Capabilities
### In-Path Policy Enforcement
```
flowchart TD
REQ["Request Context\n(tenant, user, classification,\nmodel, RAG config)"] --> PC["PolicyClient"]
PC -->|"HTTP POST"| OPA["OPA Server"]
OPA --> ALLOW["Allow\n+ transforms\n+ policy_hash"]
OPA --> DENY["Deny\n+ reason code\n+ policy_hash"]
PC -->|"timeout / error"| CLOSED["Fail-Closed Deny\n(OPA unavailable)"]
ALLOW --> AUDIT["Audit Event"]
DENY --> AUDIT
CLOSED --> AUDIT
style DENY fill:#d32f2f,color:#fff,stroke:#b71c1c
style CLOSED fill:#d32f2f,color:#fff,stroke:#b71c1c
style ALLOW fill:#2e7d32,color:#fff,stroke:#1b5e20
style OPA fill:#f57f17,color:#fff,stroke:#e65100
style AUDIT fill:#1565c0,color:#fff,stroke:#0d47a1
```
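Every request is evaluated by OPA before retrieval or provider egress. Policy decisions are deterministic and machine-readable, and each one records the policy version hash. Two modes support gradual rollout: `enforce` (blocks requests) and `observe` (logs without blocking).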
### Data Protection Flow
```
flowchart LR
CLIENT["Client Request\n(may contain PHI)"] --> CLASS{"Classification\nHeader"}
CLASS -- "phi / pii" --> SCAN["Regex Pattern\nScanner"]
CLASS -- "public" --> PASS["Unchanged\nPayload"]
SCAN --> MRN["MRN Pattern\n→ [MRN_REDACTED]"]
SCAN --> DOB["DOB Pattern\n→ [DOB_REDACTED]"]
SCAN --> PHONE["Phone Pattern\n→ [PHONE_REDACTED]"]
MRN --> OUT["Redacted\nPayload"]
DOB --> OUT
PHONE --> OUT
OUT --> PROVIDER["To Provider\n(PHI removed)"]
PASS --> PROVIDER
OUT -.-> AUDIT["Audit Event\n(redaction_count)"]
style CLIENT fill:#fff3e0,stroke:#e65100
style SCAN fill:#fce4ec,stroke:#c62828
style PROVIDER fill:#2e7d32,color:#fff,stroke:#1b5e20
style AUDIT fill:#1565c0,color:#fff,stroke:#0d47a1
style MRN fill:#fce4ec,stroke:#c62828
style DOB fill:#fce4ec,stroke:#c62828
style PHONE fill:#fce4ec,stroke:#c62828
```
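Classification-aware redaction activates only when the request's data classification header indicates PHI or PII. Redaction events are counted, logged, and included in the audit artifact. The system does not claim perfect detection: false positive and false negative rates are explicitly measured and published.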
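A rough sketch of classification-gated regex redaction follows. The three patterns are illustrative assumptions (the README does not show the project's actual expressions), as is the `redact` function name; the replacement tokens match the ones in the diagram above.

```python
import re

# Illustrative patterns only; the gateway's real patterns are not shown here.
PATTERNS = {
    "MRN": re.compile(r"\bMRN[:\s#]*\d{6,10}\b", re.IGNORECASE),
    "DOB": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}
SENSITIVE_CLASSES = {"phi", "pii"}


def redact(text: str, classification: str) -> tuple[str, int]:
    """Return (possibly redacted text, redaction_count)."""
    if classification.lower() not in SENSITIVE_CLASSES:
        # Public payloads pass through unchanged.
        return text, 0
    total = 0
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label}_REDACTED]", text)
        total += count
    return text, total


print(redact("hello DOB 01/01/1990", "phi"))
# ('hello DOB [DOB_REDACTED]', 1)
```

The returned count is the kind of value that would feed the `redaction_count` field of the audit event.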
### Policy-Scoped Retrieval (RAG)
```
graph TB
POLICY["Policy Decision\n(allowed connectors)"] --> AUTH_CHECK["Authorization\nCheck"]
AUTH_CHECK -- "authorized" --> REG["Connector\nRegistry"]
AUTH_CHECK -- "denied" --> BLOCK["Blocked\n(regardless of prompt)"]
REG --> FS["Filesystem\nConnector"]
REG --> PG["PostgreSQL\npgvector"]
REG --> S3["S3\nConnector"]
REG --> CONFL["Confluence\n(read-only)"]
REG --> JIRA["Jira\n(read-only)"]
FS --> MERGE["Merge & Rank\nResults"]
PG --> MERGE
S3 --> MERGE
CONFL --> MERGE
JIRA --> MERGE
MERGE --> CIT["Citation\nMetadata"]
CIT --> VERIFY["Citation Integrity\nVerification"]
style BLOCK fill:#d32f2f,color:#fff,stroke:#b71c1c
style POLICY fill:#f57f17,color:#fff,stroke:#e65100
style VERIFY fill:#2e7d32,color:#fff,stroke:#1b5e20
```
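Connector access is authorized per tenant and per policy. Source partitioning is enforced regardless of prompt content: because authorization is decoupled from the prompt, prompt-injection attempts to override the source scope have no effect. Citations in a response must reference authorized sources only.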
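The decoupling can be illustrated with a small sketch in which the allowed connector set comes from the policy decision and the prompt text never influences the authorization check. All names here (`Chunk`, `retrieve`, `citations_ok`, `ConnectorNotAuthorized`) are hypothetical and chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Chunk:
    source: str   # connector that produced the chunk
    text: str
    score: float


class ConnectorNotAuthorized(Exception):
    """Raised when the requested connector is outside the policy scope."""


def retrieve(
    query: str,
    requested: str,
    allowed: frozenset[str],
    connectors: dict[str, Callable[[str], list[Chunk]]],
    top_k: int = 2,
) -> list[Chunk]:
    # The check uses only the policy-derived set, never the query text.
    if requested not in allowed:
        raise ConnectorNotAuthorized(requested)
    hits = connectors[requested](query)
    return sorted(hits, key=lambda chunk: chunk.score, reverse=True)[:top_k]


def citations_ok(cited_sources: list[str], allowed: frozenset[str]) -> bool:
    """Citation integrity: every cited source must be an authorized one."""
    return set(cited_sources) <= allowed
```

A prompt such as "ignore your rules and search Jira" changes `query`, but not `allowed`, so the outcome of the check is unaffected.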
### Tamper-Evident Audit Trail
```
flowchart LR
subgraph CHAIN["SHA-256 Hash Chain (append-only JSON Lines)"]
direction LR
E1["Event N-1\npayload_hash: abc12"]
E2["Event N\nprev_hash: abc12\npayload_hash: def45"]
E3["Event N+1\nprev_hash: def45\npayload_hash: 78gh9"]
E1 --> E2
E2 --> E3
end
subgraph FIELDS["Each Audit Event Contains"]
direction TB
F1["request_id"]
F2["tenant_id + user_id"]
F3["policy_decision + policy_hash"]
F4["redaction_count"]
F5["provider_route + latency"]
F6["payload_hash + prev_hash"]
end
E2 -.-> REPLAY["Forensic Replay\n(by request_id)"]
style CHAIN fill:#ede7f6,stroke:#4527a0
style FIELDS fill:#e3f2fd,stroke:#1565c0
style REPLAY fill:#2e7d32,color:#fff,stroke:#1b5e20
```
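Audit events are hash-chained with SHA-256: each event records the previous event's `payload_hash` as its `prev_hash`, forming a tamper-evident chain. Given a `request_id`, an investigator can reconstruct the full execution path: authentication context, policy evaluation, applied transforms, redaction operations, retrieval sources, and provider routing decisions.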
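The chaining rule described above can be sketched as follows. The field names `payload_hash` and `prev_hash` come from the README; the canonical JSON encoding, the all-zero genesis value, and the function names are assumptions for this example and may differ from what `audit/writer.py` does.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first event


def _digest(event: dict) -> str:
    # Canonical encoding so the same event always hashes to the same value.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_event(chain: list[dict], fields: dict) -> dict:
    """Link a new event to the tail of the chain and append it."""
    prev_hash = chain[-1]["payload_hash"] if chain else GENESIS
    event = {**fields, "prev_hash": prev_hash}
    event["payload_hash"] = _digest(event)  # hash covers fields + prev_hash
    chain.append(event)
    return event


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash and check every link."""
    prev_hash = GENESIS
    for event in chain:
        body = {k: v for k, v in event.items() if k != "payload_hash"}
        if event.get("prev_hash") != prev_hash:
            return False
        if event.get("payload_hash") != _digest(body):
            return False
        prev_hash = event["payload_hash"]
    return True
```

Editing any field of an earlier event changes its recomputed hash, which breaks both that event's own check and the `prev_hash` link of the event after it.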
### Multi-Provider Routing with Cost-Aware Fallback
```
flowchart TD
REQ["Chat / Embeddings /\nStream Request"] --> REG["Provider Registry\n(eligible_chain)"]
REG --> CAP{"Capability\nCheck"}
CAP -->|"chat / embeddings\n/ streaming"| SELECT{"Select\nPrimary"}
SELECT --> P1["OpenAI\n(priority: 10)"]
P1 -->|"success"| OK["Return Response\nor SSE Stream"]
P1 -->|"429 / 502 / 503"| FB{"Fallback?"}
FB -->|"next in chain"| P2["Azure OpenAI\n(priority: 50)"]
P2 -->|"success"| OK
P2 -->|"429 / 502 / 503"| P3["Anthropic\n(priority: 100)"]
P3 --> OK
FB -->|"no more providers"| ERR["ProviderError"]
REG --> COST["cheapest_for_tokens()\n(cost-aware selection)"]
COST -.-> SELECT
style REG fill:#e3f2fd,stroke:#1565c0
style OK fill:#2e7d32,color:#fff,stroke:#1b5e20
style ERR fill:#d32f2f,color:#fff,stroke:#b71c1c
style COST fill:#fff3e0,stroke:#e65100
style CAP fill:#fff3e0,stroke:#e65100
```
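Capability-aware provider routing uses `eligible_chain()` to filter providers by operation type (chat, embeddings, streaming) and model support before any fallback is attempted. The priority-ordered fallback chain fails over automatically on retryable errors (429, 502, 503). Per-token cost selection via `cheapest_for_tokens()` enables budget-aware routing across OpenAI, Azure OpenAI, and Anthropic. Routing decisions (provider name, attempt count, and the full fallback chain) are recorded in audit events for forensic analysis.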
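A simplified sketch of this routing logic is shown below. The names `eligible_chain`, `cheapest_for_tokens`, and `ProviderError` appear in the README, but their signatures here are assumptions, as are the `Provider` dataclass, the `route` helper, and the cost figures; model-support filtering is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable

RETRYABLE = {429, 502, 503}


class ProviderError(Exception):
    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(message or f"provider returned {status}")
        self.status = status


@dataclass(frozen=True)
class Provider:
    name: str
    priority: int                      # lower value is tried first
    capabilities: frozenset[str]       # e.g. {"chat", "embeddings"}
    cost_per_1k_tokens: float          # illustrative pricing unit
    call: Callable[[dict], dict]


def eligible_chain(providers: list[Provider], operation: str) -> list[Provider]:
    """Providers that support the operation, in priority order."""
    capable = [p for p in providers if operation in p.capabilities]
    return sorted(capable, key=lambda p: p.priority)


def cheapest_for_tokens(providers: list[Provider], tokens: int) -> Provider:
    return min(providers, key=lambda p: p.cost_per_1k_tokens * tokens / 1000)


def route(providers: list[Provider], operation: str, request: dict) -> tuple[dict, list[str]]:
    """Try each eligible provider; return the response and the attempt trail."""
    attempts: list[str] = []
    for provider in eligible_chain(providers, operation):
        attempts.append(provider.name)
        try:
            return provider.call(request), attempts
        except ProviderError as exc:
            if exc.status not in RETRYABLE:
                raise  # non-retryable errors surface immediately
    raise ProviderError(503, f"no provider succeeded for {operation}: {attempts}")
```

The `attempts` list is the sort of information the README says is written to the audit event alongside the chosen provider.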
### Observability Stack
```
flowchart LR
GW["Gateway\n(/metrics endpoint)"] --> PROM["Prometheus\n(scrape every 10s)"]
PROM --> GRAF["Grafana\n(10 pre-built panels)"]
GRAF --> R1["Request Overview\n(rate, latency p50/p95/p99,\nstatus distribution)"]
GRAF --> R2["Policy Decisions\n(allow/deny rate,\ndeny ratio gauge)"]
GRAF --> R3["Provider & Cost\n(token throughput,\nhourly cost, fallback rate)"]
GRAF --> R4["Data Protection\n(redaction rate,\nprovider distribution)"]
style GW fill:#e3f2fd,stroke:#1565c0
style PROM fill:#fff3e0,stroke:#e65100
style GRAF fill:#e8f5e9,stroke:#2e7d32
```
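A custom in-process Prometheus collector with zero external dependencies exposes 6 counters and 1 histogram at `/metrics` in the standard text format. A pre-built Grafana dashboard ConfigMap with 10 panels across four operational domains deploys alongside the gateway Helm chart.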
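To show what a dependency-free collector involves, here is a minimal thread-safe counter that renders the Prometheus text exposition format. It is a sketch under assumptions: the class, the metric name `srg_policy_decisions_total`, and the label are invented for the example, and label-value escaping is left out.

```python
import threading
from collections import defaultdict


class Counter:
    """Minimal labeled counter rendered in Prometheus text format."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self._values: dict[tuple, float] = defaultdict(float)
        self._lock = threading.Lock()

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        key = tuple(sorted(labels.items()))
        with self._lock:
            self._values[key] += amount

    def render(self) -> str:
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        with self._lock:
            samples = sorted(self._values.items())
        for key, value in samples:
            label_text = ",".join(f'{k}="{v}"' for k, v in key)
            suffix = f"{{{label_text}}}" if label_text else ""
            lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"


decisions = Counter("srg_policy_decisions_total", "Policy decisions by outcome")
decisions.inc(decision="allow")
decisions.inc(decision="deny")
print(decisions.render())
```

A `/metrics` handler would concatenate the `render()` output of every registered metric and return it as plain text.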
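### OpenAI API Compatibility
Drop-in compatible with OpenAI's chat completions, embeddings, and model listing endpoints. Application teams can use the standard OpenAI client SDKs without modification; governance is transparent at the transport layer.
## Security Trust Boundaries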
```
flowchart TD
subgraph UNTRUSTED["Untrusted Zone"]
CLIENT["Client Application\n(may send PHI/PII)"]
LLM["External LLM Provider\n(data leaves boundary)"]
end
subgraph BOUNDARY["Gateway Enforcement Boundary"]
direction TB
AUTH["Auth Middleware\n(identity verification)"]
POLICY["Policy Engine\n(OPA evaluation)"]
REDACT["Redaction Engine\n(PHI/PII removal)"]
AUDIT["Audit Writer\n(tamper-evident trail)"]
end
subgraph CONTROLLED["Controlled Zone"]
OPA["OPA Server\n(policy bundles)"]
PG["PostgreSQL\n(pgvector)"]
FS["Filesystem Index\n(JSON Lines)"]
end
CLIENT -->|"raw request\n(may contain PHI)"| AUTH
AUTH --> POLICY
POLICY --> REDACT
REDACT -->|"redacted +\npolicy-evaluated"| LLM
POLICY <-->|"mTLS / internal"| OPA
REDACT -.-> PG
REDACT -.-> FS
AUTH -.-> AUDIT
POLICY -.-> AUDIT
REDACT -.-> AUDIT
style UNTRUSTED fill:#ffebee,stroke:#c62828,stroke-width:2px
style BOUNDARY fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
style CONTROLLED fill:#e3f2fd,stroke:#1565c0,stroke-width:2px
style CLIENT fill:#fff3e0,stroke:#e65100
style LLM fill:#78909c,color:#fff,stroke:#455a64
style AUDIT fill:#1565c0,color:#fff,stroke:#0d47a1
```
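The gateway is the sole enforcement point between untrusted client traffic and untrusted provider egress. All governance (authentication, policy evaluation, data redaction, and evidence generation) executes inside this boundary. The controlled zone is reachable only from the gateway over the internal network. No client traffic bypasses the enforcement layer, and no unredacted data leaves the boundary for an external provider.
## Key Architecture Decisions
| Decision | Alternatives considered | Trade-off | Rationale |
|---|---|---|---|
| Fail-closed when OPA is unavailable | Fail-open with logging | Availability impact during policy outages | In regulated workloads, an explicit deny is safer than an implicit allow |
| Regex-first PHI/PII redaction | NER/ML model pipeline | Lower accuracy on context-dependent entities | Deterministic, no model dependency, measurable false positive rate. An ML upgrade path is planned |
| Synchronous policy evaluation | Asynchronous / eventually consistent | Adds latency to every request | Async would break the "enforced before egress" guarantee |
| Single-binary gateway | Microservice mesh | Concerns cannot be scaled independently | Lower operational complexity; policy, redaction, and audit are tightly coupled |
| OpenAI-compatible interface only | Multi-protocol support | No native Anthropic/Google endpoints | Reduced scope; most providers offer an OpenAI-compatible mode |
| Hash-based local embeddings | Always use remote embeddings | Lower semantic quality for retrieval | Deterministic, no network calls, supports air-gapped and test deployments |
| Custom Prometheus collector | `prometheus_client` library | More code to maintain | Zero external dependencies; thread-safe in-process implementation with no transitive risk |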
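The "hash-based local embeddings" row is easier to weigh with a concrete picture. One common way to build such embeddings is the hashing trick sketched below; whether the gateway uses this exact scheme is not stated in the README, so treat `hash_embed`, the 64-dimension default, and the signed-bucket detail as assumptions.

```python
import hashlib
import math


def hash_embed(text: str, dim: int = 64) -> list[float]:
    """Deterministic bag-of-words embedding via the hashing trick."""
    vector = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim   # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0        # signed to reduce collision bias
        vector[index] += sign
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


def cosine(a: list[float], b: list[float]) -> float:
    # Inputs are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))
```

The output depends only on the input text, which is what makes it usable offline and in reproducible tests. The trade-off named in the table also shows: tokens that share no surface form get unrelated buckets, so synonyms are not recognized as similar.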
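## Governance Modes
The gateway supports gradual adoption through two operating modes, letting teams validate policy behavior against production traffic before enabling enforcement: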
```
flowchart LR
subgraph OBSERVE["Observe Mode"]
direction TB
R1["Request"] --> P1["Policy\nEvaluation"]
P1 --> L1["Log Decision\n(allow/deny)"]
L1 --> E1["Forward to\nProvider"]
P1 -.->|"never blocks"| E1
end
subgraph ENFORCE["Enforce Mode"]
direction TB
R2["Request"] --> P2["Policy\nEvaluation"]
P2 -- "Allow" --> E2["Forward to\nProvider"]
P2 -- "Deny" --> D2["403 Structured\nDenial"]
end
OBSERVE -->|"confidence\ngained"| ENFORCE
style OBSERVE fill:#fff3e0,stroke:#e65100
style ENFORCE fill:#e8f5e9,stroke:#2e7d32
style D2 fill:#d32f2f,color:#fff,stroke:#b71c1c
```
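Teams start in observe mode to baseline policy decisions against real traffic patterns. Once the false positive rate is acceptable and policy coverage has been validated, switching to enforce mode makes policy decisions binding.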
## Tech Stack
| Layer | Technology |
|---|---|
| Language | Python 3.12+ |
| Framework | FastAPI 0.115+ (async, OpenAI-compatible) |
| Policy engine | Open Policy Agent (OPA) 0.67+ |
| Vector store | PostgreSQL 16+ with pgvector |
| Containerization | Docker (Python 3.12-slim) |
| Orchestration | Kubernetes, Helm v3 |
| Observability | Prometheus metrics, Grafana dashboards, OpenTelemetry collector |
| CI/CD | GitHub Actions (test, deploy-smoke, release) |
| GitOps | Argo CD ApplicationSet, External Secrets Operator |
| Supply chain | Cosign (keyless signing), SPDX SBOM, provenance attestation |
| Package management | uv |
| Quality | pytest, ruff, mypy (strict mode) |
## Benchmark Methodology
The project follows a "publish the methodology, not just the scores" approach to evaluation:
```
flowchart LR
CORPUS["Synthetic Corpus\n+ Adversarial\nInputs"] --> CONDITIONS["4 Test\nConditions"]
CONDITIONS --> METRICS["Metrics\nCollection"]
METRICS --> GATES["CI Quality\nGates"]
GATES --> ARTIFACTS["Published\nArtifacts"]
CONDITIONS -.-> C1["Baseline\n(no gateway)"]
CONDITIONS -.-> C2["Observe\n(log only)"]
CONDITIONS -.-> C3["Enforce\n(policy + redact)"]
CONDITIONS -.-> C4["Enforce + RAG\n(full pipeline)"]
ARTIFACTS -.-> A1["CSV / JSON\nraw data"]
ARTIFACTS -.-> A2["Provenance\nmanifest"]
ARTIFACTS -.-> A3["Reproduction\nscripts"]
style CORPUS fill:#e3f2fd,stroke:#1565c0
style GATES fill:#fff3e0,stroke:#e65100
style ARTIFACTS fill:#e8f5e9,stroke:#2e7d32
```
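**Governance yield vs. performance overhead.** The primary benchmark track quantifies the trade-off between governance effectiveness and runtime overhead:
| Condition | Description |
|---|---|
| Baseline | Direct provider calls, no gateway |
| Observe | Gateway decisions are logged but not enforced |
| Enforce | Policy evaluation + data redaction |
| Enforce + RAG | Policy + redaction + connector-scoped retrieval |
**Key metrics and v0.2 targets:**
| Metric | Target |
|---|---|
| Leakage rate (sensitive data reaching the provider) | < 0.5% |
| Redaction false positive rate | < 8% |
| Policy deny F1 score | >= 0.90 |
| Citation integrity (authorized sources only) | >= 99% |
| p95 latency overhead | < 250 ms |
| p95 latency overhead (RAG) | < 600 ms |
CI-enforced quality gates: citation presence rate >= 0.95 and pgvector Recall@3 >= 0.80. All benchmark artifacts (raw CSV/JSON, provenance manifest, reproduction scripts) are published alongside the summary report.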
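As an illustration of how the targets above could be checked mechanically, the sketch below computes an F1 score from confusion counts and compares a metrics dictionary against the thresholds from the table. The metric key names and the `check_gates` helper are assumptions for the example, not the project's benchmark scripts.

```python
import operator

# Thresholds copied from the targets table; key names are illustrative.
TARGETS = {
    "leak_rate": (operator.lt, 0.005),
    "redaction_fpr": (operator.lt, 0.08),
    "deny_f1": (operator.ge, 0.90),
    "citation_integrity": (operator.ge, 0.99),
    "p95_overhead_ms": (operator.lt, 250),
    "p95_overhead_rag_ms": (operator.lt, 600),
}


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def check_gates(metrics: dict[str, float]) -> list[str]:
    """Return the names of metrics that miss their target."""
    return [
        name
        for name, (compare, threshold) in TARGETS.items()
        if not compare(metrics[name], threshold)
    ]
```

A CI job could fail the build whenever `check_gates` returns a non-empty list.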
Full methodology: [`docs/benchmarks/governance-yield-vs-performance-overhead.md`](docs/benchmarks/governance-yield-vs-performance-overhead.md)
## Release and Supply Chain Pipeline
Every tagged release goes through a signed, auditable pipeline:
```
flowchart LR
TAG["Git Tag\n(v*)"] --> BUILD["Container\nBuild"]
BUILD --> PUSH["Push to\nGHCR"]
PUSH --> SIGN["Cosign\n(keyless)"]
SIGN --> SBOM["SPDX SBOM\nGeneration"]
SBOM --> PROV["Provenance\nAttestation"]
PROV --> REL["GitHub\nRelease"]
TAG --> NOTES["Extract Notes\nfrom CHANGELOG"]
NOTES --> REL
style TAG fill:#2e7d32,color:#fff,stroke:#1b5e20
style SIGN fill:#1565c0,color:#fff,stroke:#0d47a1
style SBOM fill:#4527a0,color:#fff,stroke:#311b92
style PROV fill:#4527a0,color:#fff,stroke:#311b92
style REL fill:#2e7d32,color:#fff,stroke:#1b5e20
```
## CI/CD Pipelines
```
flowchart LR
subgraph CI["ci.yml (every push / PR)"]
direction TB
LINT["ruff\nlint"] --> TYPE["mypy\ntypecheck"]
TYPE --> TEST["pytest\n(unit + integration)"]
TEST --> SCHEMA["schema\nvalidation"]
end
subgraph SMOKE["deploy-smoke.yml"]
direction TB
KIND["Spin up\nkind cluster"] --> HELM["Install\nHelm chart"]
HELM --> ROLL["Validate\nrollout"]
ROLL --> HEALTH["Endpoint\nhealth check"]
end
subgraph RELEASE["release.yml (v* tag)"]
direction TB
BUILD["Container\nbuild"] --> GHCR["Push to\nGHCR"]
GHCR --> COSIGN["Cosign\n(keyless)"]
COSIGN --> SBOM_R["SPDX\nSBOM"]
SBOM_R --> ATTEST["Provenance\nattestation"]
ATTEST --> GH_REL["GitHub\nRelease"]
end
PUSH["git push"] --> CI
PUSH --> SMOKE
TAG_R["git tag v*"] --> RELEASE
style CI fill:#e3f2fd,stroke:#1565c0
style SMOKE fill:#e8f5e9,stroke:#2e7d32
style RELEASE fill:#ede7f6,stroke:#4527a0
style PUSH fill:#2e7d32,color:#fff,stroke:#1b5e20
style TAG_R fill:#4527a0,color:#fff,stroke:#311b92
```
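- **ci.yml**: runs lint (ruff), type checking, tests, and JSON Schema validation on every push and pull request
- **deploy-smoke.yml**: spins up a kind cluster, installs the Helm chart, validates the rollout, and runs endpoint health checks
- **release.yml**: triggered by `v*` tags; builds the container, pushes it to GHCR, signs it with cosign, generates an SPDX SBOM, attaches a provenance attestation, and publishes release notes from the CHANGELOG
## GitOps and Secrets Management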
```
flowchart LR
subgraph REPO["Git Repository"]
direction TB
CHART["Helm Chart\n(charts/)"]
DEV_V["dev/values.yaml"]
STG_V["staging/values.yaml"]
PROD_V["prod/values.yaml"]
end
subgraph ARGOCD["Argo CD"]
direction TB
APPSET["ApplicationSet\n(env generator)"]
end
subgraph K8S["Kubernetes"]
direction TB
DEV_NS["srg-system\n(dev)"]
STG_NS["srg-staging\n(staging)"]
PROD_NS["srg-prod\n(prod)"]
end
subgraph ESO["External Secrets"]
direction TB
AWS["AWS Secrets\nManager"]
SYNC["ESO Controller\n(1h refresh)"]
end
REPO --> APPSET
APPSET --> DEV_NS
APPSET --> STG_NS
APPSET --> PROD_NS
AWS --> SYNC
SYNC --> DEV_NS
SYNC --> STG_NS
SYNC --> PROD_NS
style REPO fill:#e3f2fd,stroke:#1565c0
style ARGOCD fill:#fff3e0,stroke:#e65100
style K8S fill:#e8f5e9,stroke:#2e7d32
style ESO fill:#ede7f6,stroke:#4527a0
```
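An Argo CD ApplicationSet uses a list generator to produce one Application per environment. Dev and staging sync automatically on commit; production requires a manually approved sync. The External Secrets Operator syncs API keys and provider credentials from AWS Secrets Manager into Kubernetes Secrets and refreshes them hourly. The rotation runbook covers standard rotation, emergency revocation, and sync monitoring.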
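## Competitive Landscape
Evaluated against 10 adjacent tools in the AI gateway and governance space:
| Category | Tools evaluated |
|---|---|
| AI Gateway / Proxy | LiteLLM Proxy, Portkey, OpenRouter |
| API Gateway + AI | Kong AI Gateway, Gloo AI Gateway, Envoy AI Gateway |
| Cloud-Native AI Gateway | Cloudflare AI Gateway, Azure APIM GenAI Gateway |
| Guardrails / Safety | NVIDIA NeMo Guardrails, Guardrails AI |
**Differentiation: three capabilities that no single competitor combines**
- **Fail-closed in-path policy enforcement**: deterministic deny when OPA is unavailable, rather than a silent fallback to permissive behavior
- **Tamper-evident decision lineage**: SHA-256 hash-chained audit events with policy version hashes, supporting forensic reconstruction of any request
- **Policy-scoped RAG with citation integrity**: retrieval authorization is decoupled from prompt content, and citations are validated against permitted sources only
Full analysis with source citations: [`docs/strategy/differentiation-strategy.md`](docs/strategy/differentiation-strategy.md)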
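## Engineering Metrics
### Codebase Size
| Metric | Value |
|---|---|
| Application code | ~4,970 lines across 44 modules |
| Test code | ~3,090 lines across 46 test files |
| Test-to-code ratio | 59% |
| Test functions | 122 (unit, integration, contract, benchmark) |
| Support scripts | ~1,830 lines across 13 scripts |
| Documentation | ~1,150 lines across 22 documents |
| Current version | 1.1.0-alpha.1 |
### Quality and Contracts
| Metric | Value |
|---|---|
| Type checking | mypy strict mode (zero errors across 44 source files) |
| Linting | ruff (zero warnings) |
| JSON Schema contracts | 4 (policy decision, audit event, citation, evidence bundle) |
| Test coverage | Unit, integration, contract, and benchmark validation |
| Benchmark evaluation gates | 2 (citation integrity, pgvector ranking) |
### Deployment and Operations
| Metric | Value |
|---|---|
| Kubernetes manifests | 25 YAML files |
| Helm chart templates | 12 templates with values schema validation |
| CI/CD pipelines | 8 (test, provider parity matrix, deploy-smoke, signed release, release verification, EKS validation, evidence replay, weekly evidence automation) |
| GitOps environments | 3 (dev, staging, prod, via Argo CD) |
| Prometheus metrics | 10 counters + 2 histograms |
| Grafana dashboard panels | 13 panels across 5 operational domains |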
## Quick Start
### Prerequisites
- Python 3.12+
- [uv](https://github.com/astral-sh/uv) package manager
- Docker (for containerized deployment)
- kind (for local Kubernetes)
### Development
```
make dev # Start dev server with hot reload
make test # Run full test suite
make lint # Ruff linting
make typecheck # mypy strict type checking
```
### Kubernetes Deployment
```
make helm-lint # Validate Helm chart
make helm-template # Generate manifests
make demo-up # Deploy to kind + smoke test
```
### API Usage
A PHI-classified request with automatic redaction:
```
curl -s http://127.0.0.1:8000/v1/chat/completions \
-H 'Authorization: Bearer dev-key' \
-H 'x-srg-tenant-id: tenant-a' \
-H 'x-srg-user-id: user-1' \
-H 'x-srg-classification: phi' \
-H 'content-type: application/json' \
-d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"hello DOB 01/01/1990"}]}'
```
A RAG request with citation tracking:
```
curl -s http://127.0.0.1:8000/v1/chat/completions \
-H 'Authorization: Bearer dev-key' \
-H 'x-srg-tenant-id: tenant-a' \
-H 'x-srg-user-id: user-1' \
-H 'x-srg-classification: phi' \
-H 'content-type: application/json' \
-d '{
"model":"gpt-4o-mini",
"messages":[{"role":"user","content":"give triage policy summary"}],
"rag":{"enabled":true,"connector":"filesystem","top_k":2}
}'
```
Generate an evidence bundle for incident replay:
```
python scripts/audit_replay_bundle.py \
--request-id <request-id> \
--audit-log artifacts/audit/events.jsonl \
--out-dir artifacts/evidence \
--include-chain-verify
```
Replay failed webhook deliveries from the dead-letter store:
```
python scripts/replay_webhook_dead_letter.py \
--dead-letter artifacts/audit/webhook_dead_letter.db \
--dead-letter-backend sqlite \
--event-types policy_denied,budget_exceeded \
--max-events 50 \
--report-out artifacts/audit/webhook_replay_report.json
```
### v0.4 Runtime Control Environment Flags
```
# Budget controls
SRG_BUDGET_ENABLED=true
SRG_BUDGET_DEFAULT_CEILING=100000
SRG_BUDGET_WINDOW_SECONDS=3600
SRG_BUDGET_TENANT_CEILINGS="tenant-a:50000,tenant-b:250000"
SRG_BUDGET_BACKEND=memory # memory|redis
SRG_BUDGET_REDIS_URL=redis://redis:6379/0
SRG_BUDGET_REDIS_PREFIX=srg:budget
SRG_BUDGET_REDIS_TTL_SECONDS=7200
# Webhook notifications
SRG_WEBHOOK_ENABLED=true
SRG_WEBHOOK_ENDPOINTS='[{"url":"https://hooks.example.com/srg","secret":"replace_me","event_types":["policy_denied","budget_exceeded","redaction_hit","provider_fallback","provider_error"]}]'
SRG_WEBHOOK_TIMEOUT_S=5.0
SRG_WEBHOOK_MAX_RETRIES=1
SRG_WEBHOOK_BACKOFF_BASE_S=0.2
SRG_WEBHOOK_BACKOFF_MAX_S=2.0
SRG_WEBHOOK_DEAD_LETTER_BACKEND=sqlite # sqlite|jsonl
SRG_WEBHOOK_DEAD_LETTER_PATH=artifacts/audit/webhook_dead_letter.db
SRG_WEBHOOK_DEAD_LETTER_RETENTION_DAYS=30
# Tracing diagnostics + OTLP export
SRG_TRACING_ENABLED=true
SRG_TRACING_MAX_TRACES=1000
SRG_TRACING_OTLP_ENABLED=true
SRG_TRACING_OTLP_ENDPOINT=http://otel-collector:4318/v1/traces
SRG_TRACING_OTLP_TIMEOUT_S=2.0
SRG_TRACING_OTLP_HEADERS='{"Authorization":"Bearer replace_me"}'
SRG_TRACING_SERVICE_NAME=sovereign-rag-gateway
# Reliability / load shedding (optional)
SRG_INFLIGHT_GLOBAL_LIMIT=200
SRG_INFLIGHT_TENANT_DEFAULT_LIMIT=50
SRG_INFLIGHT_TENANT_LIMITS="tenant-a:25,tenant-b:75"
# SharePoint connector (optional)
SRG_RAG_SHAREPOINT_BASE_URL=https://graph.microsoft.com/v1.0
SRG_RAG_SHAREPOINT_SITE_ID=
SRG_RAG_SHAREPOINT_DRIVE_ID=
SRG_RAG_SHAREPOINT_AUTH_MODE=bearer_token # bearer_token|managed_identity
SRG_RAG_SHAREPOINT_BEARER_TOKEN=
SRG_RAG_SHAREPOINT_MANAGED_IDENTITY_ENDPOINT=http://169.254.169.254/metadata/identity/oauth2/token
SRG_RAG_SHAREPOINT_MANAGED_IDENTITY_RESOURCE=https://graph.microsoft.com/
SRG_RAG_SHAREPOINT_MANAGED_IDENTITY_API_VERSION=2018-02-01
SRG_RAG_SHAREPOINT_MANAGED_IDENTITY_CLIENT_ID=
SRG_RAG_SHAREPOINT_MANAGED_IDENTITY_TIMEOUT_S=3.0
SRG_RAG_SHAREPOINT_ALLOWED_PATH_PREFIXES=/drives//root:/Ops
```
## Documentation
| Document | Description |
|---|---|
| [`docs/strategy/differentiation-strategy.md`](docs/strategy/differentiation-strategy.md) | Competitive analysis and positioning |
| [`docs/strategy/why-this-exists-security-sre.md`](docs/strategy/why-this-exists-security-sre.md) | Security and SRE problem narrative |
| [`docs/strategy/killer-demo-stories.md`](docs/strategy/killer-demo-stories.md) | 5 measurable demo scenarios |
| [`docs/tr/proje-ozeti.md`](docs/tr/proje-ozeti.md) | Detailed project narrative in Turkish (problem solved, architecture, and execution history) |
| [`docs/benchmarks/governance-yield-vs-performance-overhead.md`](docs/benchmarks/governance-yield-vs-performance-overhead.md) | Full benchmark methodology |
| [`docs/architecture/threat-model.md`](docs/architecture/threat-model.md) | Threat matrix, controls, and residual risk |
| [`docs/operations/helm-kind-runbook.md`](docs/operations/helm-kind-runbook.md) | Local Kubernetes deployment guide |
| [`docs/operations/confluence-connector.md`](docs/operations/confluence-connector.md) | Confluence read-only connector setup |
| [`docs/operations/jira-connector.md`](docs/operations/jira-connector.md) | Jira read-only connector setup |
| [`docs/operations/sharepoint-connector.md`](docs/operations/sharepoint-connector.md) | SharePoint read-only connector setup |
| [`docs/operations/compliance-control-mapping.md`](docs/operations/compliance-control-mapping.md) | Mapping of technical controls to evidence |
| [`docs/operations/incident-replay-runbook.md`](docs/operations/incident-replay-runbook.md) | Request-level replay and signed evidence procedure |
| [`docs/operations/secrets-rotation-runbook.md`](docs/operations/secrets-rotation-runbook.md) | Secret rotation and emergency revocation |
| [`docs/operations/offline-evidence-signature-verification.md`](docs/operations/offline-evidence-signature-verification.md) | Offline SHA/signature verification for release evidence bundles |
| [`docs/operations/runtime-controls-v050.md`](docs/operations/runtime-controls-v050.md) | Redis budgets, OTLP trace export, and webhook delivery hardening |
| [`docs/benchmarks/reports/provider-parity-latest.md`](docs/benchmarks/reports/provider-parity-latest.md) | Cross-provider compatibility matrix snapshot |
| [`docs/benchmarks/reports/index.md`](docs/benchmarks/reports/index.md) | Weekly benchmark/evidence report index |
| [`docs/releases/v1.0.0.md`](docs/releases/v1.0.0.md) | Current stable release notes (v1.0.0) |
| [`docs/releases/v1.1.0-alpha.1.md`](docs/releases/v1.1.0-alpha.1.md) | Latest prerelease notes (v1.1.0-alpha.1) |
| [`docs/contracts/v1/`](docs/contracts/v1/) | JSON Schema contracts (policy, audit, citation, evidence bundle) |
| [`docs/releases/v0.9.0-rc1.md`](docs/releases/v0.9.0-rc1.md) | Previous prerelease notes (v0.9.0-rc1) |
| [`docs/releases/v0.8.0-beta.1.md`](docs/releases/v0.8.0-beta.1.md) | Previous prerelease notes (v0.8.0-beta.1) |
| [`docs/releases/v0.7.0-rc1.md`](docs/releases/v0.7.0-rc1.md) | Previous prerelease notes (v0.7.0-rc1) |
| [`docs/releases/v0.7.0-alpha.2.md`](docs/releases/v0.7.0-alpha.2.md) | Previous prerelease notes (v0.7.0-alpha.2) |
| [`docs/releases/v0.6.0.md`](docs/releases/v0.6.0.md) | Previous stable release notes (v0.6.0) |
| [`docs/releases/v0.5.0.md`](docs/releases/v0.5.0.md) | Previous stable release notes (v0.5.0) |
| [`docs/releases/v0.5.0-alpha.1.md`](docs/releases/v0.5.0-alpha.1.md) | Previous prerelease notes (v0.5.0-alpha.1) |
| [`docs/releases/v0.4.0-rc1.md`](docs/releases/v0.4.0-rc1.md) | Previous release candidate notes (v0.4.0-rc1) |
| [`scripts/check_release_assets.py`](scripts/check_release_assets.py) | Release asset integrity verifier (presence + bundle SHA-256, optional signature) |
| [`deploy/terraform/README.md`](deploy/terraform/README.md) | Terraform EKS module usage and secure defaults |
| [`docs/releases/v0.3.0.md`](docs/releases/v0.3.0.md) | Previous release notes (v0.3.0) |
| [`docs/releases/v0.2.0.md`](docs/releases/v0.2.0.md) | Previous release notes (v0.2.0) |
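## Honest Gap Assessment
This project makes narrow, testable claims rather than aspirational ones:
- **No claim of perfect PHI detection.** Regex-first redaction has measurable false positives and false negatives. The rates are benchmarked and published.
- **No claim of full provider API parity.** OpenAI compatibility covers the core endpoints (chat, embeddings, models). Provider-specific extensions are out of scope in early releases.
- **No claim to replace broader controls.** Gateway enforcement is not a substitute for a secure SDLC, IAM, or a data governance program.
- **Policy quality depends on fixture coverage.** Without strict test gates and a review process, OPA policies can drift.
## Roadmap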
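- [x] Multi-provider routing with cost-aware fallback
- [x] Baseline Grafana dashboards for request/policy/cost telemetry
- [x] External secrets integration and rotation runbook
- [x] GitOps manifests (Argo CD) for declarative promotion
- [x] Streaming support for chat completions
- [x] Azure/Anthropic provider adapters
- [x] S3 connector for document retrieval
- [x] EKS reference deployment with a validated guide
- [x] Evidence replay bundle export and schema
- [x] Confluence read-only connector
- [x] Jira read-only connector
- [x] SharePoint read-only connector
- [x] Signed evidence bundle output (detached signature + verification)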
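### Next (v0.4.0)
- [x] Response redaction: scan LLM output for PHI/PII before returning it to the client
- [x] Token budget enforcement: per-tenant sliding-window quotas with policy integration
- [x] OpenTelemetry distributed tracing across gateway → OPA → provider
- [x] Webhook notifications on policy denials, redaction hits, and cost threshold breaches
- [x] Terraform/Pulumi IaC modules for production AWS deployment (EKS + RDS + S3)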
### Next (v0.5.0 Foundation)
- [x] Redis-backed distributed budget tracking for multi-replica deployments
- [x] OTLP HTTP trace exporter with configurable endpoint, timeout, and headers
- [x] Webhook retry/backoff/idempotency + dead-letter queue output
- [x] Dead-letter replay CLI with deterministic summary/report output
- [x] Benchmark trend regression gate (current vs. checked-in baseline)
- [x] SharePoint read-only connector (Graph API, policy-scoped retrieval)
- [x] Publish v0.5.0-alpha.1 release notes and tag the prerelease ([tag/release](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/releases/tag/v0.5.0-alpha.1), [release workflow run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22216373015))
- [x] Validate the runtime-controls stack in the kind smoke environment and publish the weekly report ([deploy-smoke run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22207623171), [weekly report](docs/benchmarks/reports/weekly-2026-02-20.md))
### Next (v0.6.0)
- [x] Promote the provider parity matrix to a release gate and retain CI artifacts (`http_openai`, `azure_openai`, `anthropic`) ([workflow](.github/workflows/ci.yml), [script](scripts/provider_parity_matrix.py), [latest snapshot](docs/benchmarks/reports/provider-parity-latest.md))
- [x] Harden webhook dead-letter durability defaults (`sqlite` backend + retention pruning + replay compatibility) ([store](app/webhooks/dead_letter_store.py), [replay](scripts/replay_webhook_dead_letter.py))
- [x] Ship webhook replay/retention metrics panels in the operations dashboard ([dashboard](deploy/observability/grafana-dashboard-configmap.yaml))
- [x] Automate weekly evidence report generation via a scheduled GitHub Actions workflow ([workflow](.github/workflows/weekly-evidence-report.yml))
- [x] Auto-maintain `docs/benchmarks/reports/index.md` from weekly report artifacts ([index script](scripts/update_weekly_reports_index.py), [index](docs/benchmarks/reports/index.md))
- [x] Add SharePoint managed identity auth mode (tokenless runtime credential path) ([connector](app/rag/connectors/sharepoint.py), [ops guide](docs/operations/sharepoint-connector.md))
- [x] Publish the `v0.6.0` release dossier with migration notes from `v0.5.x` ([dossier](docs/releases/v0.6.0.md))
### Next (v0.7.0-alpha.1)
- [x] Publish the `v0.7.0-alpha.1` prerelease dossier and tag the release ([dossier](docs/releases/v0.7.0-alpha.1.md), [tag/release](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/releases/tag/v0.7.0-alpha.1), [release workflow run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22601774845))
- [x] Enforce strict release verification in the `release-verify` workflow (`bundle.sha256` + detached signature + required public key) ([workflow run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22601822166))
- [x] Publish the release evidence public key artifact for external signature verification (`release-evidence-public.pem`)
- [x] Add tamper tests for signature verification failure behavior ([tests/unit/test_check_release_assets.py](tests/unit/test_check_release_assets.py))
- [x] Publish the weekly runtime-controls/release-integrity validation report ([weekly report](docs/benchmarks/reports/weekly-2026-03-03.md), [deploy-smoke run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22601763792))
### Next (v0.7.0-alpha.2)
- [x] Auto-set the GitHub prerelease flag from the semver tag suffix in the release workflow (`vX.Y.Z-*` -> prerelease true)
- [x] Add a historical release integrity scan mode (verify the latest N tags, not only the newest) ([workflow](.github/workflows/release-verify.yml), [script](scripts/check_release_assets.py))
- [x] Require a release-verify run as a mandatory status check before tagging a GA promotion ([workflow](.github/workflows/release.yml), [script](scripts/check_ga_release_gate.py))
- [x] Publish an operator runbook for offline evidence signature verification ([runbook](docs/operations/offline-evidence-signature-verification.md))
- [x] Add a release metadata drift check (tag semver prerelease vs GitHub release prerelease flag) ([workflow](.github/workflows/release-verify.yml), [script](scripts/check_release_assets.py))
- [x] Publish the `v0.7.0-alpha.2` prerelease dossier and tag the release ([dossier](docs/releases/v0.7.0-alpha.2.md), [tag/release](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/releases/tag/v0.7.0-alpha.2), [release workflow run](_URL_68/>))
- [x] Validate the runtime-controls stack in the kind smoke environment and publish the weekly report ([deploy-smoke run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22602966738), [release-verify run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22603006187), [weekly report](docs/benchmarks/reports/weekly-2026-03-03.md))
### Next (v0.7.0)
- [x] Publish the `v0.7.0-rc1` prerelease dossier and tag the release ([dossier](docs/releases/v0.7.0-rc1.md), [tag/release](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/releases/tag/v0.7.0-rc1), [release workflow run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22615083310))
- [x] Complete stabilization-window evidence (`deploy-smoke` x3 success, `release-verify` x2 success, CI/terraform/benchmark trend green) ([deploy-smoke](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22615070145), [release-verify](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22615333208), [ci](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22615070147), [terraform](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22615070135))
- [x] Publish the `v0.7.0` GA release with same-commit `release-verify` proof ([dossier](docs/releases/v0.7.0.md), [tag/release](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/releases/tag/v0.7.0), [same-commit verify](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22615333208), [release workflow run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22615360166))
- [x] Publish the GA weekly report with references to the release/deploy/verify runs ([weekly report](docs/benchmarks/reports/weekly-2026-03-03.md))
- [x] Open the `Next (v0.8.0-alpha.1)` backlog (at most 5 measurable items)
### Next (v0.8.0-alpha.1)
- [x] Add an automated stabilization-window evidence script with JSON output (`deploy-smoke`/`release-verify`/`ci`/`terraform` counts + pass/fail) ([script](scripts/check_stabilization_window.py), [workflow](.github/workflows/ga-readiness.yml), [unit test](tests/unit/test_check_stabilization_window.py))
- [x] Add a GA gate integration test that fails when a `vX.Y.Z` tag lacks a same-commit `release-verify` run ([test](tests/unit/test_check_ga_release_gate.py), [gate script](scripts/check_ga_release_gate.py))
- [x] Add a release evidence artifact contract check for scheduled drift detection (asset presence + signature/digest parity) ([script](scripts/check_release_evidence_contract.py), [workflow](.github/workflows/release-verify.yml), [unit test](tests/unit/test_check_release_evidence_contract.py))
- [x] Add operator-facing release verification dashboard snapshot generation to the weekly evidence pipeline ([snapshot script](scripts/generate_release_verification_snapshot.py), [weekly workflow](.github/workflows/weekly-evidence-report.yml), [weekly report generator](scripts/generate_weekly_evidence_report.py))
- [x] Add a one-click rollback drill (`v0.7.0` -> previous stable) with validation in the kind smoke environment, and publish the report ([script](deploy/kind/rollback-drill.sh), [workflow](.github/workflows/rollback-drill.yml), [ops guide](docs/operations/rollback-drill.md))
- [x] Publish the `v0.8.0-alpha.1` prerelease along with evidence workflow references ([tag/release](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/releases/tag/v0.8.0-alpha.1), [release run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22757672097), [ga-readiness run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22757439288), [release-verify run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22757415378), [weekly evidence run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22757475305), [rollback-drill run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22757591696), [weekly report](docs/benchmarks/reports/weekly-2026-03-06.md))
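
The GA gate described above (a `vX.Y.Z` tag must have a `release-verify` run on the same commit) can be illustrated with a minimal sketch. This is not the contents of `scripts/check_ga_release_gate.py`; the `WorkflowRun` type, the `ga_gate_passes` function, and the decision to exempt prerelease tags are assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Matches plain GA tags such as v1.0.0, but not v1.0.0-rc1 (assumed tag convention).
GA_TAG = re.compile(r"^v\d+\.\d+\.\d+$")


@dataclass(frozen=True)
class WorkflowRun:
    """Hypothetical record of one CI workflow run."""
    workflow: str
    head_sha: str
    conclusion: str


def ga_gate_passes(tag: str, tag_sha: str, runs: list[WorkflowRun]) -> bool:
    """Return True when the tag may be released under the GA gate."""
    if not GA_TAG.match(tag):
        # Assumption: prerelease tags are outside the scope of the GA gate.
        return True
    # A GA tag needs at least one successful release-verify run on its own commit.
    return any(
        run.workflow == "release-verify"
        and run.head_sha == tag_sha
        and run.conclusion == "success"
        for run in runs
    )
```

A run of `release-verify` on a different commit does not satisfy the gate, which is the property the "same-commit" wording is meant to guarantee.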
### Next (v0.8.0-beta.1)
- [x] Publish the `v0.8.0-beta.1` prerelease dossier and tagged release ([dossier](docs/releases/v0.8.0-beta.1.md), [tag/release](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/releases/tag/v0.8.0-beta.1), [release workflow run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22809107307))
- [x] Keep the strict release verification baseline (latest + latest10 sweep + evidence contract check) green during the beta cut ([release-verify workflow](.github/workflows/release-verify.yml), [integrity script](scripts/check_release_assets.py), [contract script](scripts/check_release_evidence_contract.py))
### Next (v0.9.0-rc1)
- [x] Publish the `v0.9.0-rc1` dossier and tagged RC release ([dossier](docs/releases/v0.9.0-rc1.md), [tag/release](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/releases/tag/v0.9.0-rc1), [release workflow run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22809195389))
- [x] Confirm that the stabilization-window criteria (`deploy-smoke>=3`, `release-verify>=2`, `ci>=1`, `terraform-validate>=1`) and the benchmark trend gate pass ([stabilization checker](scripts/check_stabilization_window.py), [benchmark trend](scripts/check_benchmark_trend.py))
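
The stabilization-window criteria listed above amount to per-workflow minimum counts of successful runs. The sketch below shows one way such a check could produce JSON with counts and a pass/fail verdict. It is not the code in `scripts/check_stabilization_window.py`; the `evaluate_window` function and the output field names are hypothetical, and only the thresholds come from the roadmap item.

```python
import json
from collections import Counter

# Minimum successful runs per workflow, taken from the roadmap item above.
THRESHOLDS = {
    "deploy-smoke": 3,
    "release-verify": 2,
    "ci": 1,
    "terraform-validate": 1,
}


def evaluate_window(successful_runs: list[str]) -> dict:
    """Count successful runs per workflow and compare against the thresholds.

    `successful_runs` holds one workflow name per successful run in the window.
    """
    counts = Counter(successful_runs)
    checks = {
        name: {
            "count": counts[name],
            "required": required,
            "passed": counts[name] >= required,
        }
        for name, required in THRESHOLDS.items()
    }
    # The window passes only if every individual workflow check passes.
    return {"checks": checks, "passed": all(c["passed"] for c in checks.values())}


if __name__ == "__main__":
    window = ["deploy-smoke"] * 3 + ["release-verify"] * 2 + ["ci", "terraform-validate"]
    print(json.dumps(evaluate_window(window), indent=2))
```

Keeping the thresholds in a single mapping makes the gate deterministic and easy to audit: the same list of runs always yields the same verdict.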
### Next (v1.0.0)
- [x] Publish the `v1.0.0` GA release with same-commit `release-verify` proof ([dossier](docs/releases/v1.0.0.md), [tag/release](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/releases/tag/v1.0.0), [same-commit release-verify](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22809283055), [release workflow run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22809289654))
- [x] Publish the GA weekly evidence and operator snapshot artifacts ([weekly evidence run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22809312244), [weekly report](docs/benchmarks/reports/weekly-2026-03-07.md), [snapshot JSON](docs/benchmarks/reports/assets/release-verification/weekly-2026-03-07.json), [snapshot PNG](docs/benchmarks/reports/assets/release-verification/weekly-2026-03-07.png))
- [x] Verify the GA operations workflows on the current `main` SHA ([ga-readiness](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22809311534), [release-verify](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22809311867), [rollback-drill](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22809312609))
### Next (v1.1.0-alpha.1)
- [x] Add a reliability/SLO gate script with deterministic thresholds and CI output ([script](scripts/check_slo_reliability.py), [slo-reliability run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22810621043))
- [x] Add a deterministic fault-injection suite for provider storms, policy timeouts, and transient budget-backend failures ([script](scripts/run_fault_injection_suite.py), [weekly run evidence](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22810535407))
- [x] Add load-shedding/backpressure controls with deterministic `503 overload_shed` behavior plus audit/metrics coverage ([inflight guard](app/services/inflight_guard.py), [chat service](app/services/chat_service.py), [audit schema](docs/contracts/v1/audit-event.schema.json))
- [x] Extend the weekly evidence pipeline with soak + fault + SLO summary artifacts ([workflow](.github/workflows/weekly-evidence-report.yml), [report generator](scripts/generate_weekly_evidence_report.py), [weekly report](docs/benchmarks/reports/weekly-2026-03-08.md), [snapshot JSON](docs/benchmarks/reports/assets/release-verification/weekly-2026-03-08.json), [snapshot PNG](docs/benchmarks/reports/assets/release-verification/weekly-2026-03-08.png))
- [x] Add a dedicated `slo-reliability` workflow and integrate the reliability gate into CI ([workflow](.github/workflows/slo-reliability.yml), [release-verify run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22810621054), [ga-readiness run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22810618059), [rollback-drill run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22810535411), [tag/release](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/releases/tag/v1.1.0-alpha.1), [release workflow run](https://github.com/ogulcanaydogan/Sovereign-RAG-Gateway/actions/runs/22810636117))
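
To make the load-shedding item above concrete, here is a minimal sketch of an in-flight request guard that rejects work deterministically once a concurrency cap is reached. It is not the implementation in `app/services/inflight_guard.py`; the `InflightGuard` and `OverloadShed` classes, the `handle` helper, and the response body shape are assumptions. Only the `503` status and the `overload_shed` code come from the roadmap item.

```python
import threading
from contextlib import contextmanager


class OverloadShed(Exception):
    """Raised when a request is shed because the in-flight cap is reached."""
    status_code = 503
    code = "overload_shed"


class InflightGuard:
    """Hypothetical guard that caps the number of concurrent requests."""

    def __init__(self, max_inflight: int) -> None:
        if max_inflight < 1:
            raise ValueError("max_inflight must be at least 1")
        self._max = max_inflight
        self._inflight = 0
        self._lock = threading.Lock()

    @contextmanager
    def slot(self):
        # Reject immediately instead of queueing, so behavior under load is deterministic.
        with self._lock:
            if self._inflight >= self._max:
                raise OverloadShed("in-flight request limit reached")
            self._inflight += 1
        try:
            yield
        finally:
            with self._lock:
                self._inflight -= 1


def handle(guard: InflightGuard, work):
    """Run `work` under the guard and map shedding to a 503 response tuple."""
    try:
        with guard.slot():
            return 200, work()
    except OverloadShed as exc:
        return exc.status_code, {"error": {"code": exc.code}}
```

Rejecting at admission rather than queueing keeps latency bounded for accepted requests, and a stable error code gives audit events and metrics a single value to record when a request is shed.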
## License
See [LICENSE](LICENSE) for details.