paperclipinc/hermes-operator
GitHub: paperclipinc/hermes-operator
Hermes Operator is a production-grade Kubernetes operator for managing Hermes Agent, providing declarative configuration, secure defaults, and automated updates.
Stars: 6 | Forks: 1
# Hermes Operator
Kubernetes operator for [nousresearch/hermes-agent](https://github.com/nousresearch/hermes-agent): a Python-based self-improving multi-platform AI agent. Declarative spec,
opinionated security defaults, S3 backups, OCI-registry auto-update,
SSA-based GitOps coexistence, and a one-shot migration path from
openclaw-operator.
`hermes-operator` ships as v1.0.0 with [v1 stability commitments](docs/api-versioning.md)
in place from day one: no v0.x grind.
## Quick start
```
# 1. Install the CRDs and the operator via Helm (OCI chart; Helm 3.8+).
#    Omit --version to get the latest release, or add --version X.Y.Z to pin one.
helm install hermes-operator \
  oci://ghcr.io/paperclipinc/charts/hermes-operator \
  -n hermes-operator --create-namespace

# 2. Apply a minimal instance.
kubectl apply -n agents -f - <<'YAML'
apiVersion: hermes.agent/v1
kind: HermesInstance
metadata:
  name: my-hermes
spec:
  image:
    repository: ghcr.io/paperclipinc/hermes-agent
    tag: "v2026.5.29.2"
  storage:
    persistence:
      enabled: true
      size: 10Gi
YAML

# 3. Watch it converge.
kubectl get hi -n agents -w
# NAME        READY   PHASE   IMAGE                                            AGE
# my-hermes   True    Ready   ghcr.io/paperclipinc/hermes-agent:v2026.5.29.2   30s
```
For more involved scenarios, see [`examples/`](examples/).
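Step 3 above watches the instance interactively. For scripts or CI, the same readiness check can be sketched in Python. This is a minimal sketch, not part of the operator: it assumes the instance exposes the conventional Kubernetes `.status.conditions` list with a condition of type `Ready` (the type name is an assumption inferred from the `READY` column), and the `is_ready` helper and sample payload are hypothetical.

```python
import json


def is_ready(instance: dict) -> bool:
    """Return True when the object reports a Ready condition with status "True".

    Assumes the conventional Kubernetes condition shape; the condition type
    "Ready" is an assumption, not confirmed by the operator's API reference.
    """
    status = instance.get("status") or {}
    conditions = status.get("conditions") or []
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


# Hypothetical payload, shaped like the output of
# `kubectl get hi my-hermes -n agents -o json`.
sample = json.loads("""
{
  "apiVersion": "hermes.agent/v1",
  "kind": "HermesInstance",
  "metadata": {"name": "my-hermes", "namespace": "agents"},
  "status": {
    "phase": "Ready",
    "conditions": [{"type": "Ready", "status": "True"}]
  }
}
""")

print(is_ready(sample))  # True
```

In practice the dictionary would come from parsing the JSON that `kubectl get hi ... -o json` prints; an object with no status yet simply reports not ready.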
## Architecture
```mermaid
flowchart LR
    subgraph User
        GitOps[FluxCD / Argo]
        Kubectl[kubectl apply]
    end

    subgraph ControlPlane["Kubernetes control plane"]
        APIServer[(kube-apiserver)]
        HInstance["HermesInstance"]
        HSelfConfig["HermesSelfConfig"]
        HClusterDefaults["HermesClusterDefaults<br/>(singleton)"]
    end

    subgraph Operator["hermes-operator pod"]
        DefaulterWebhook[Defaulter]
        ValidatorWebhook[Validator]
        InstanceCtrl[HermesInstance<br/>controller]
        SelfConfigCtrl[HermesSelfConfig<br/>controller<br/>SSA: hermes.agent/selfconfig]
        ClusterDefaultsCtrl[ClusterDefaults<br/>controller]
    end

    subgraph Workload["agent workload (per HermesInstance)"]
        STS[StatefulSet]
        Svc[Service]
        NetPol[NetworkPolicy default-deny]
        PVC[PVC ~/.hermes]
        Honcho[Honcho Deploy<br/>profile store]
        CronJob[Backup CronJob]
    end

    S3[(S3-compatible<br/>backup target)]
    OCI[(OCI registry<br/>hermes-agent tags)]

    GitOps --> APIServer
    Kubectl --> APIServer
    APIServer <-->|admission| DefaulterWebhook
    APIServer <-->|admission| ValidatorWebhook
    APIServer --> HInstance
    APIServer --> HSelfConfig
    APIServer --> HClusterDefaults
    HInstance --> InstanceCtrl
    HSelfConfig --> SelfConfigCtrl
    HClusterDefaults --> ClusterDefaultsCtrl
    InstanceCtrl --> STS
    InstanceCtrl --> Svc
    InstanceCtrl --> NetPol
    InstanceCtrl --> PVC
    InstanceCtrl --> Honcho
    InstanceCtrl --> CronJob
    SelfConfigCtrl -.SSA patch.-> HInstance
    CronJob --> S3
    InstanceCtrl -.poll.-> OCI
```

The agent runs as a StatefulSet (single replica by default) under a default-deny
NetworkPolicy. The `HermesSelfConfig` controller uses Server-Side Apply under
field manager `hermes.agent/selfconfig`, so FluxCD/Argo can own the parent
`HermesInstance` for other fields without flapping. `HermesClusterDefaults` is a
cluster-scoped singleton (name **must** be `cluster`) that fills `nil` fields
only: explicit values on the instance always win.

## Features

| Area | Feature | Notes |
|---|---|---|
| **Declarative** | Single `HermesInstance` CR drives the whole stack | StatefulSet, Service, PVC, NetworkPolicy, ConfigMap, PDB, HPA, ServiceMonitor, Honcho deploy, backup CronJob: all owned and reconciled. |
| **Declarative** | `HermesClusterDefaults` for cluster-wide defaults | Defaulting webhook fills `nil` fields only. |
| **Adaptive** | `HermesSelfConfig` for audited agent-initiated mutations | SSA under field manager `hermes.agent/selfconfig`. Policy-gated by `spec.selfConfigure.protectedKeys`. |
| **Adaptive** | OCI-registry-driven auto-update | Channel-pinned polling, pre-update backup, probe-failure rollback. |
| **Security** | Default-deny NetworkPolicy + per-gateway allow rules | Derived from `spec.gateways` and `spec.networking.egress`. |
| **Security** | Read-only root filesystem | Writable `emptyDir`s for `/tmp` and `~/.config` subPaths. |
| **Security** | Per-CRD validating + defaulting webhooks | Plus warnings on unknown config keys and unresolvable gateway tokens. |
| **Security** | RBAC aggregation labels | `kubectl auth can-i create hermesinstances --as=jane` works out of the box. |
| **Security** | Image signing + SBOM | Cosign keyless OIDC, SPDX SBOM on every release. |
| **Observable** | Prometheus metrics + ServiceMonitor | Per-controller, per-instance, per-subsystem. `metrics.secure` consistent. |
| **Observable** | [Grafana dashboard](docs/grafana/) | Ships as JSON. Variables: `namespace`, `instance`. |
| **Observable** | Exhaustive [condition catalogue](docs/conditions.md) | Every condition × every reason code, documented and stable. |
| **Multi-platform** | Telegram / Discord / Slack / WhatsApp / Signal gateways | First-class `spec.gateways.*` sections, secret-rotation-friendly. |
| **Python runtime** | `uv`-installable agent runtime | Init container runs `uv sync` against a lockfile bundled in the agent image. |
| **Python runtime** | FFmpeg + ripgrep available out of the box | Hard dependencies of hermes-agent. |
| **Scalable** | Optional HPA via `spec.availability.hpa` | StatefulSet retained for identity through restarts. |
| **Scalable** | Optional `topologySpreadConstraints` | Sane defaults plus `spec.availability.topologySpreadConstraints` override. |
| **Resilient** | PodDisruptionBudget auto-managed when `replicas > 1` | |
| **Resilient** | Finalizer-driven backup-on-delete | `r.Patch` (JSON patch) for finalizer mutations, never `r.Update`. |
| **Resilient** | Zombie-process reaper | `tini` as PID 1; `shareProcessNamespace: false` by default. |
| **Backup / restore** | S3-compatible backups | Scheduled, on-delete, pre-update. `tar.zst` snapshots + `meta.json`. |
| **Backup / restore** | Declarative one-shot restore | `spec.restoreFrom` is immutable once applied. |
| **Migration** | One-shot OpenClaw → Hermes migration | From sibling `OpenClawInstance` or S3 backup. Uses hermes-agent's importer. |
| **Profile store** | Optional Honcho companion | Deployment + Service + PVC + secret, fully managed. |
| **Gateway auth** | Per-platform `secretRef` for tokens | Rotate independently, audited via webhook warnings. |
| **Cloud-native** | Helm chart, OLM bundle, plain kustomize manifests | All three are first-class. CRDs templated under the Helm chart. |
| **Cloud-native** | Multi-arch (`amd64`+`arm64`), Cosign-signed, SBOM-attested | |
| **GitOps** | SSA-based SelfConfig coexists with Argo/Flux | No flap on shared instances. |
| **Stability** | v1.0 ships with [versioning](docs/api-versioning.md) + [deprecation](docs/deprecations.md) policies | Conversion-webhook scaffolding in place for future v2. |

## Worked example: self-configuration

The agent can persist a learned skill, env var, config patch, workspace file, or
Honcho profile by creating a `HermesSelfConfig` in its namespace. The operator
validates against the parent instance's `selfConfigure.protectedKeys` allowlist
and applies via SSA:

```
apiVersion: hermes.agent/v1
kind: HermesSelfConfig
metadata:
  name: install-finance-skill
  namespace: agents
spec:
  instanceRef: my-hermes
  addSkills:
    - source: "git+https://github.com/foo/finance-skill@v1.2.0"
  patchConfig:
    schedules:
      morning-brief: "0 8 * * *"
  addEnvVars:
    - name: FINANCE_TZ
      value: Europe/Berlin
```

Apply, then watch:

```
kubectl get hsc -n agents
# NAME                    PHASE     INSTANCE    AGE
# install-finance-skill   Applied   my-hermes   3s
```

The audit trail lives in `kubectl describe hsc install-finance-skill` and on the
instance via the per-field SSA field manager `hermes.agent/selfconfig`:
`kubectl get hi my-hermes -o jsonpath='{.metadata.managedFields}'` shows exactly
which fields the agent owns, which Flux owns, and which you own.

See [`examples/`](examples/) for end-to-end recipes.

## Supported Kubernetes versions

| Operator | Kubernetes |
|---|---|
| v1.x | 1.28, 1.29, 1.30, 1.31, 1.32 |

We drop the oldest Kubernetes minor when Kubernetes EOLs it, on the *next*
operator minor release. Patch releases never change the supported matrix.
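The support matrix above can be turned into a small pre-flight check. The sketch below is illustrative only: it hard-codes the minors listed for operator v1.x, and it assumes the input is the `serverVersion` object from `kubectl version -o json`, whose `major` and `minor` fields are strings (some distributions append a suffix such as `+` to the minor). The `parse_minor` and `is_supported` helpers are hypothetical names, not part of the operator.

```python
import re

# Minors copied from the v1.x row of the support matrix.
SUPPORTED = {(1, minor) for minor in (28, 29, 30, 31, 32)}


def parse_minor(raw: str) -> int:
    """Extract the leading digits of a minor version string such as "29" or "29+"."""
    match = re.match(r"\d+", raw)
    if match is None:
        raise ValueError(f"unrecognised minor version: {raw!r}")
    return int(match.group())


def is_supported(server_version: dict) -> bool:
    """Check a kubectl-style serverVersion object against the matrix."""
    key = (int(server_version["major"]), parse_minor(server_version["minor"]))
    return key in SUPPORTED


print(is_supported({"major": "1", "minor": "29"}))   # True
print(is_supported({"major": "1", "minor": "27+"}))  # False
```

Because patch releases never change the matrix, a check like this only needs updating when moving to a new operator minor.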
## Distribution

| Channel | What |
|---|---|
| Helm (OCI) | `helm install hermes-operator oci://ghcr.io/paperclipinc/charts/hermes-operator` |
| OLM / OperatorHub | `kubectl operator install hermes-operator` (pending first OperatorHub release) |
| Plain manifests | `kubectl apply -f https://github.com/paperclipinc/hermes-operator/releases/latest/download/install.yaml` |
| Container image | `ghcr.io/paperclipinc/hermes-operator:v0.1.9` (multi-arch, Cosign-signed, SBOM attested) |

## Documentation

- [Design spec](docs/superpowers/specs/2026-05-12-hermes-operator-design.md): the canonical product/architecture doc.
- [API reference](docs/api-reference.md): every field on every CR.
- [Condition catalogue](docs/conditions.md): every status condition, reason code, troubleshooting hint.
- [API versioning policy](docs/api-versioning.md): what is and is not a breaking change.
- [Deprecation policy](docs/deprecations.md): the 3-step flow + active deprecations.
- [Roadmap](ROADMAP.md): shipped, planned, future, non-goals.
- [Examples](examples/): 9 worked YAML recipes.
- [Grafana dashboard](docs/grafana/): operator-overview dashboard JSON.

## Contributing

See [`CONTRIBUTING.md`](CONTRIBUTING.md). Pull requests follow
[Conventional Commits](https://www.conventionalcommits.org/) (`feat:`, `fix:`,
`docs:`, `ci:`, `chore:`, `refactor:`, `test:`); release-please drives the
release-PR loop from `feat:`/`fix:`.

## Security

See [`SECURITY.md`](SECURITY.md). Report vulnerabilities via the GitHub security
advisory flow; do not file public issues for security bugs.

## License

Apache-2.0. See [`LICENSE`](LICENSE).