devcarlosfigueiredo/Harbor-registry
A Harbor-based private container registry that uses IaC and CI/CD to provide secure, auditable image management and vulnerability control on AWS.
# 🏗️ harbor-registry
## 📋 Table of Contents
- [What This Demonstrates](#what-this-demonstrates)
- [Architecture](#architecture)
- [Project Structure](#project-structure)
- [Prerequisites](#prerequisites)
- [Quick Start](#quick-start)
- [Security Model](#security-model)
- [Pipeline Flow](#pipeline-flow)
- [Cost Estimate](#cost-estimate)
- [Operations Runbook](#operations-runbook)
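## What This Demonstrates
| Skill | Evidence |
|-------|---------|
| **Harbor (CNCF)** | Full installation, project configuration, RBAC, retention policies, webhooks |
| **Vulnerability scanning** | Trivy in CI (pre-push) + Harbor server-side (post-push) |
| **Terraform IaC** | EC2, EBS (encrypted), VPC, IAM, S3, AWS Backup, CloudWatch |
| **Security-first design** | Robot accounts, Secrets Manager, IMDSv2, non-root containers |
| **GitHub Actions CI/CD** | Multi-job pipeline with SARIF upload, PR comments, gating |
| **Production operations** | Health checks, backup scripts, monitoring, log rotation |
**Target audience:** Organizations that cannot use Docker Hub or ECR/ACR for compliance reasons, such as banks, government bodies, and healthcare providers in Portugal and across Europe.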
## Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│ Developer Workstation │
│ git push → GitHub │
└───────────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ GitHub Actions CI/CD Pipeline │
│ │
│ [1] Trivy fs scan ──→ Block if CRITICAL in source/deps │
│ [2] docker build ──→ Multi-stage, non-root, hardened │
│ [3] Trivy image scan ──→ HARD GATE: block push if CRITICAL CVEs │
│ [4] docker push ──→ Harbor (Robot Account auth) │
│ [5] Harbor scan poll ──→ Verify server-side scan result │
└───────────────────────────────┬────────────────────────────────────┘
│ HTTPS (443)
▼
┌─────────────────────────── AWS ─────────────────────────────────────┐
│ │
│ VPC 10.50.0.0/16 │
│ └── Public Subnet 10.50.1.0/24 │
│ └── EC2 t3.small (Amazon Linux 2023) │
│ ├── Elastic IP ──→ harbor.yourcompany.com (DNS A record) │
│ ├── Security Group: HTTPS (443), HTTP (80) → redirect │
│ ├── EBS Root: 20GB encrypted gp3 (OS) │
│ └── EBS Data: 50GB encrypted gp3 (Harbor data) │
│ └── /opt/harbor-data/ │
│ ├── data/ ← Docker images (layers) │
│ ├── database/ ← PostgreSQL data │
│ ├── logs/ ← Harbor logs │
│ ├── certs/ ← TLS certificates │
│ └── backups/ ← DB + config backups │
│ │
│ Docker Containers (via Docker Compose): │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ nginx │ │ harbor- │ │ harbor- │ │ trivy- │ │
│ │ (proxy) │ │ core │ │ db │ │ adapter │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ registry │ │ harbor- │ │ harbor- │ │ harbor- │ │
│ │ (v2) │ │jobservice│ │ portal │ │ redis │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ S3 Bucket ──→ Daily DB backups (7-day retention) │
│ AWS Backup ──→ EBS snapshots (14-day retention) │
│ Secrets Manager ──→ Admin + DB passwords │
│ IAM Role ──→ EC2 access to S3, Secrets Manager (no keys!) │
└──────────────────────────────────────────────────────────────────────┘
```
## Project Structure
```
harbor-registry/
├── terraform/
│ ├── main.tf # EC2, VPC, EBS, IAM, S3, AWS Backup
│ ├── variables.tf # All configurable parameters with validation
│ ├── outputs.tf # IPs, URLs, SSM connect command
│ ├── user_data.sh # Bootstrap: Docker, EBS mount, Harbor install
│ └── terraform.tfvars.example # Copy → terraform.tfvars (never commit)
│
├── harbor/
│ ├── harbor.yml # Harbor configuration reference/template
│ ├── nginx/
│ │ └── nginx.conf # Nginx with TLS 1.2+, rate limiting, security headers
│ └── certs/
│ └── .gitkeep # TLS certs generated here (gitignored)
│
├── scripts/
│ ├── install-harbor.sh # Full Harbor setup: TLS, config, install, wait
│ ├── create-project.sh # Post-install: project, robot accounts, policies
│ ├── health-check.sh # Comprehensive health check + CloudWatch metrics
│ └── backup-harbor.sh # PostgreSQL + config backup to S3
│
├── app/
│ └── main.py # Demo Flask app (shows what's in the registry)
│
├── Dockerfile # Multi-stage, non-root, OCI labels
├── requirements.txt # Pinned Python deps
│
├── .github/workflows/
│ ├── build-and-push-harbor.yml # Main CI/CD: fs scan → build → image scan → push
│ └── terraform-harbor.yml # Terraform validate/plan/apply pipeline
│
├── docs/
│ ├── harbor-rbac.md # RBAC model, robot accounts, rotation guide
│ └── trivy-scanning.md # Scanning strategy, remediation, compliance
│
└── README.md
```
## Prerequisites
### Tools
```
# Required
terraform >= 1.6.0
docker >= 24.0
aws cli >= 2.0
# Optional (for local linting)
pip install checkov
brew install tflint
```
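A quick preflight check can confirm that locally installed tools meet these minimums. The sketch below is not part of the repository: the helper names are hypothetical, and it assumes you pass in the version strings yourself (for example the output of `terraform version` or `docker --version`).

```python
import re

# Minimum versions taken from the tool list above.
MINIMUMS = {"terraform": "1.6.0", "docker": "24.0", "aws": "2.0"}


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted number (e.g. '1.6.3') from arbitrary CLI output."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(found: str, minimum: str) -> bool:
    """Compare versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def outdated_tools(installed: dict[str, str]) -> list[str]:
    """Return the tools that are missing or older than the required minimum."""
    return [
        tool
        for tool, minimum in MINIMUMS.items()
        if tool not in installed or not meets_minimum(installed[tool], minimum)
    ]
```

Comparing parsed integer tuples rather than raw strings avoids the classic mistake of treating `1.10.0` as older than `1.6.0`.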
### AWS Requirements
- An IAM user or role with permission to create: EC2, VPC, EBS, IAM, S3, Secrets Manager, AWS Backup
- An existing Route 53 hosted zone (or external DNS) for `harbor.yourcompany.com`
### Required GitHub Secrets
| Secret | Description |
|--------|-------------|
| `HARBOR_HOST` | Harbor FQDN (e.g. `harbor.yourcompany.com`) |
| `HARBOR_ROBOT_USERNAME` | Robot account name (from `scripts/create-project.sh`) |
| `HARBOR_ROBOT_PASSWORD` | Robot account secret |
| `HARBOR_CA_CERT` | CA certificate (needed with a self-signed certificate; omit with a trusted CA) |
| `HARBOR_ADMIN_PASSWORD` | Harbor admin password (used by Terraform) |
| `HARBOR_DB_PASSWORD` | Database password (used by Terraform) |
| `AWS_PLAN_ROLE_ARN` | IAM role ARN for the Terraform plan stage (read-only) |
| `AWS_APPLY_ROLE_ARN` | IAM role ARN for the Terraform apply stage (write) |
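
A workflow step can fail fast when one of these secrets is unset, instead of failing later with an opaque login or Terraform error. The following is a minimal sketch, not code from the repository. It assumes each secret is exposed to the job as an environment variable of the same name, and `missing_secrets` is a hypothetical helper.

```python
import os
from typing import Mapping

# Names from the table above. HARBOR_CA_CERT is only needed with a
# self-signed certificate, so it is treated as optional here.
REQUIRED_SECRETS = [
    "HARBOR_HOST",
    "HARBOR_ROBOT_USERNAME",
    "HARBOR_ROBOT_PASSWORD",
    "HARBOR_ADMIN_PASSWORD",
    "HARBOR_DB_PASSWORD",
    "AWS_PLAN_ROLE_ARN",
    "AWS_APPLY_ROLE_ARN",
]


def missing_secrets(env: Mapping[str, str]) -> list[str]:
    """Return required secret names that are absent or blank in env."""
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_secrets(os.environ)
    if missing:
        raise SystemExit("Missing secrets: " + ", ".join(missing))
    print("All required secrets are set.")
```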
## Quick Start
### Step 1: Provision the Infrastructure
```
cd terraform
# Copy and edit the variables
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars: set harbor_hostname, allowed_cidr
# Set secrets via environment variables (never in the tfvars file)
export TF_VAR_harbor_admin_password="YourStrongPassword16+"
export TF_VAR_harbor_db_password="YourStrongDBPassword16+"
# Apply
terraform init
terraform plan
terraform apply
# Note the outputs
terraform output harbor_public_ip # → Point DNS A record here
terraform output ssm_connect_command # → Access without SSH
```
### Step 2: Configure DNS
```
# Point the DNS A record at the Elastic IP
harbor.yourcompany.com → $(terraform output -raw harbor_public_ip)
```
Wait for DNS to propagate (usually under 60 seconds with Route 53).
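
Rather than guessing when propagation has finished, the wait can be scripted. The sketch below is illustrative only: `resolve_ipv4` and `wait_for_dns` are hypothetical helpers, the retry count and delay are arbitrary defaults, and the result reflects your local resolver (which may cache answers), not global propagation.

```python
import socket
import time
from typing import Callable


def resolve_ipv4(hostname: str) -> set[str]:
    """Return the IPv4 addresses the local resolver currently gives for hostname."""
    try:
        infos = socket.getaddrinfo(
            hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return set()  # not resolvable yet
    return {info[4][0] for info in infos}


def wait_for_dns(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], set[str]] = resolve_ipv4,
    attempts: int = 30,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until hostname resolves to expected_ip; False if attempts run out."""
    for attempt in range(attempts):
        if expected_ip in resolver(hostname):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A typical call would pass the value of `terraform output -raw harbor_public_ip` as `expected_ip` before moving on to the Harbor installation.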
### Step 3: Install Harbor
```
# Connect to the EC2 instance via SSM (no SSH key required)
eval "$(terraform output -raw ssm_connect_command)"
# Check bootstrap progress
sudo tail -f /var/log/user-data.log
# If needed, run the installer manually
sudo bash /opt/harbor-install/harbor/scripts/install-harbor.sh \
  harbor.yourcompany.com \
  /opt/harbor-data
# Verify that Harbor is running
sudo docker compose -f /opt/harbor/docker-compose.yml ps
```
### Step 4: Create the Project and Robot Accounts
```
# Run from the EC2 instance (its IAM role can read Secrets Manager)
# or from any machine with HARBOR_ADMIN_PASS set
export HARBOR_HOST="harbor.yourcompany.com"
export HARBOR_ADMIN_PASS="your-admin-password"
bash scripts/create-project.sh
# Note the robot account credentials printed to stdout
# and add them to GitHub Secrets:
# HARBOR_ROBOT_USERNAME
# HARBOR_ROBOT_PASSWORD
```
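
For readers curious what a least-privilege robot account request might look like, here is a hedged sketch of a request body for Harbor's `POST /api/v2.0/robots` endpoint. It is not taken from `scripts/create-project.sh`, whose contents are not shown here. The field layout follows Harbor's v2.0 API as I understand it, and the 90-day lifetime, the description text, and the choice of a system-level robot scoped to one project are assumptions. Verify against the API explorer of your Harbor version before relying on it.

```python
import json


def robot_payload(name: str, project: str, duration_days: int = 90) -> dict:
    """Build a request body for a robot that may only pull and push in one project.

    Assumption: a system-level robot limited to a single project namespace,
    which is why the resulting account name has the form robot$<name>.
    """
    return {
        "name": name,
        "description": "CI account limited to pull/push",  # illustrative text
        "level": "system",
        "duration": duration_days,  # days until expiry; assumed default
        "permissions": [
            {
                "kind": "project",
                "namespace": project,
                "access": [
                    {"resource": "repository", "action": "pull"},
                    {"resource": "repository", "action": "push"},
                ],
            }
        ],
    }


if __name__ == "__main__":
    # "myapp" is the project name shown in the Harbor UI steps of this README.
    print(json.dumps(robot_payload("github-actions-push", "myapp"), indent=2))
```

Limiting the robot to `pull` and `push` on a single project means leaked CI credentials cannot delete images or reach other projects.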
### Step 5: Test the Pipeline
```
# Push a code change to trigger the pipeline
echo "# trigger build" >> app/main.py
git add . && git commit -m "test: trigger Harbor pipeline"
git push origin main
# Watch the pipeline in GitHub Actions
# Expected flow:
# Trivy FS scan → build → Trivy image scan → push to Harbor → verify
```
### Step 6: Verify in the Harbor UI
```
https://harbor.yourcompany.com
→ Projects → myapp → harbor-demo
→ Click on the image tag
→ Vulnerabilities tab → scan results
→ Should show 0 CRITICAL CVEs
```
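
The same check can be made without the UI by asking the Harbor API for the artifact's scan overview. The sketch below is an approximation, not the pipeline's actual verification step. It assumes the artifact endpoint accepts `with_scan_overview=true` and returns a `scan_overview` object keyed by report MIME type, each with a `scan_status` and a nested severity `summary`. All function names are hypothetical.

```python
import base64
import json
import time
import urllib.request
from typing import Callable


def fetch_artifact(host, project, repository, reference, username, password) -> dict:
    """GET one artifact including its scan overview (response shape is assumed)."""
    url = (
        f"https://{host}/api/v2.0/projects/{project}"
        f"/repositories/{repository}/artifacts/{reference}?with_scan_overview=true"
    )
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def scan_verdict(artifact: dict) -> str:
    """Map an artifact document to 'pending', 'error', 'blocked' or 'clean'."""
    overview = artifact.get("scan_overview") or {}
    if not overview:
        return "pending"  # scan has not been reported yet
    report = next(iter(overview.values()))
    status = report.get("scan_status")
    if status == "Error":
        return "error"
    if status != "Success":
        return "pending"  # still queued or running
    severities = (report.get("summary") or {}).get("summary") or {}
    return "blocked" if severities.get("Critical", 0) > 0 else "clean"


def wait_for_scan(
    fetch: Callable[[], dict],
    attempts: int = 30,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch() until the verdict is no longer 'pending' or attempts run out."""
    for attempt in range(attempts):
        verdict = scan_verdict(fetch())
        if verdict != "pending":
            return verdict
        if attempt < attempts - 1:
            sleep(delay)
    return "pending"
```

In practice `fetch` would be something like `lambda: fetch_artifact(host, "myapp", "harbor-demo", tag, user, secret)`, with `myapp` and `harbor-demo` being the project and repository shown above.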
## Security Model
### Authentication: No Passwords in Code
```
GitHub Actions
├── AWS: OIDC → IAM Role (no access keys)
└── Harbor: Robot Account (not human credentials)
├── Username: robot$github-actions-push
└── Password: stored in GitHub Secrets (from Secrets Manager)
EC2 Instance
├── No SSH key required (AWS Systems Manager Session Manager)
└── IAM Role → Secrets Manager → Harbor passwords fetched at boot
```
### Network Security
```
Security Group:
Inbound:
✓ 443 (HTTPS) — from allowed_cidr (restrict to office/VPN!)
✓ 80 (HTTP) — nginx redirects to HTTPS
✗ 22 (SSH) — disabled by default (use SSM)
✗ All others — denied
Outbound:
✓ All — OS updates, Trivy DB, Docker Hub pulls
```
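
Because the HTTPS rule depends entirely on `allowed_cidr`, it is worth validating that value before `terraform apply`. The sketch below uses Python's standard `ipaddress` module. The helper names are hypothetical, and rejecting `0.0.0.0/0` is a policy assumption based on the "restrict to office/VPN" note, not a check the repository is known to perform.

```python
import ipaddress


def validate_allowed_cidr(cidr: str) -> ipaddress.IPv4Network:
    """Parse allowed_cidr and refuse values that open the registry to everyone."""
    network = ipaddress.ip_network(cidr, strict=True)  # rejects host bits, e.g. 10.0.0.5/24
    if network.version != 4:
        raise ValueError("expected an IPv4 CIDR block")
    if network.prefixlen == 0:
        raise ValueError("0.0.0.0/0 exposes the registry to the whole internet")
    return network


def is_allowed(client_ip: str, allowed_cidr: str) -> bool:
    """Return True if client_ip would match the inbound HTTPS rule."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(allowed_cidr)
```

For example, with `allowed_cidr = "198.51.100.0/24"` a client at `198.51.100.23` matches the rule, while `203.0.113.5` does not.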
### Encryption at Rest
```
EBS Root volume: AES-256 (AWS managed key)
EBS Data volume: AES-256 (AWS managed key)
S3 Backup bucket: SSE-S3 (AES-256)
Secrets Manager: AES-256 KMS
```
### Container Security
```
# Dockerfile security features:
✓ Non-root user (uid 1000)
✓ Multi-stage build (no build tools in runtime)
✓ No shell in runtime (harder to exec into)
✓ HEALTHCHECK defined
✓ OCI labels for traceability
✓ No --privileged, no capabilities
```
## Pipeline Flow
```
git push (main)
│
▼
[Trivy FS Scan] ──── CRITICAL in source? ──→ ❌ BLOCKED
│ clean
▼
[Docker Build] ── multi-stage, non-root
│
▼
[Trivy Image Scan] ── CRITICAL CVE? ──→ ❌ BLOCKED (never pushed)
│ clean
▼
[Trust Harbor CA] ── add self-signed cert to Docker daemon
│
▼
[Login to Harbor] ── robot$github-actions-push (not admin!)
│
▼
[docker push] ── tags: sha-a1b2c3d, main, latest
│
▼
[Harbor Trivy Scan] ── automatic on arrival
│
├── CRITICAL found → image flagged, pulls return 403
└── Clean → image available for deployment
│
▼
[Verify Harbor Scan] ── poll API until scan completes
│
▼
[Pipeline Summary] ── post to PR / GitHub Summary
```
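
The hard gate in the flow above comes down to one decision: does the Trivy report contain any finding at a blocking severity? Trivy can enforce this itself with `--exit-code 1 --severity CRITICAL`. The sketch below shows the same logic applied to a JSON report (produced with `--format json`), which is useful when the counts are also needed for a PR comment. The function names and the `report.json` path are hypothetical, and this is not the workflow's actual implementation.

```python
import json
from collections import Counter


def count_by_severity(report: dict) -> Counter:
    """Count findings per severity across all targets in a Trivy JSON report."""
    counts: Counter = Counter()
    for result in report.get("Results") or []:
        # "Vulnerabilities" is absent or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def gate_exit_code(report: dict, blocking: tuple[str, ...] = ("CRITICAL",)) -> int:
    """Return 1 (block the push) if any blocking severity is present, else 0."""
    counts = count_by_severity(report)
    return 1 if any(counts[severity] > 0 for severity in blocking) else 0


if __name__ == "__main__":
    # Assumed to have been written by: trivy image --format json --output report.json IMAGE
    with open("report.json", encoding="utf-8") as handle:
        report = json.load(handle)
    print(dict(count_by_severity(report)))
    raise SystemExit(gate_exit_code(report))
```

Returning a process exit code keeps the gate usable as a plain workflow step: a non-zero exit stops the job before `docker push` runs.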
## Cost Estimate
| Resource | Type | Monthly cost |
|----------|------|-------------|
| EC2 t3.small | 730 hours | ~$16.80 |
| EBS root volume (20 GB gp3) | Storage | ~$1.60 |
| EBS data volume (50 GB gp3) | Storage | ~$4.00 |
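| Elastic IP (attached) | Static IP (public IPv4 charge, $0.005/hour since February 2024) | ~$3.65 |
| S3 backups (~5 GB) | Storage | ~$0.12 |
| AWS Backup (EBS snapshots) | Snapshot storage | ~$2.00 |
| Secrets Manager (2 secrets) | Per secret | ~$0.80 |
| CloudWatch (logs + alarms) | Usage | ~$1.50 |
| **Total** | | **~$30/month** |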
## Operations Runbook
### Connect to the EC2 Instance
```
# Via SSM (recommended, no SSH key required)
aws ssm start-session --target $(terraform output -raw ec2_instance_id) --region eu-west-1
# Harbor logs
sudo docker compose -f /opt/harbor/docker-compose.yml logs -f --tail=100
# Harbor status
sudo docker compose -f /opt/harbor/docker-compose.yml ps
```
### Restart Harbor
```
sudo systemctl restart harbor
# Or manually:
cd /opt/harbor
sudo docker compose down && sudo docker compose up -d
```
### Health Check
```
# Run from the EC2 instance or from any machine with HARBOR_ADMIN_PASS set
export HARBOR_HOST="harbor.yourcompany.com"
bash scripts/health-check.sh
# Exit code: 0 = healthy, 1 = degraded, 2 = critical
```
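
The three exit codes make the script easy to wrap in other tooling, such as a cron job or a chat notifier. A minimal sketch of such a wrapper follows. The function names are hypothetical, and treating any other exit code as `unknown` is an assumption about how callers would want to handle unexpected failures.

```python
import subprocess

# Exit codes documented for scripts/health-check.sh.
STATUS_BY_EXIT_CODE = {0: "healthy", 1: "degraded", 2: "critical"}


def interpret_exit_code(code: int) -> str:
    """Translate the script's exit code into a status label."""
    return STATUS_BY_EXIT_CODE.get(code, "unknown")


def run_health_check(script: str = "scripts/health-check.sh") -> str:
    """Run the health check and return its status label.

    HARBOR_HOST (and HARBOR_ADMIN_PASS when run off the instance) must
    already be set in the environment, as in the shell example.
    """
    completed = subprocess.run(["bash", script], check=False)
    return interpret_exit_code(completed.returncode)


if __name__ == "__main__":
    print(run_health_check())
```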
### Manual Backup
```
export S3_BUCKET=$(terraform output -raw backup_bucket)
export AWS_REGION="eu-west-1"
sudo bash scripts/backup-harbor.sh
```
### Expand the EBS Volume (No Downtime)
```
# 1. Modify the volume in the AWS console or via the CLI
aws ec2 modify-volume --volume-id $(terraform output -raw ebs_data_volume_id) --size 100
# 2. Grow the filesystem on the EC2 instance (no reboot needed for gp3)
sudo growpart /dev/xvdf 1
sudo xfs_growfs /opt/harbor-data
df -h /opt/harbor-data # Verify new size
```
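### Rotate a Robot Account
```
# See docs/harbor-rbac.md for the full rotation procedure
# Quick version:
ROBOT_ID=12  # example value: numeric ID of the robot account to rotate
NEW_SECRET=$(openssl rand -base64 32 | tr -d '=+/')
# Set the new secret in Harbor (uses HARBOR_ADMIN_PASS, as in Step 4)
curl -sk -u "admin:$HARBOR_ADMIN_PASS" \
  -X PATCH \
  -H "Content-Type: application/json" \
  -d "{\"secret\": \"$NEW_SECRET\"}" \
  "https://harbor.yourcompany.com/api/v2.0/robots/$ROBOT_ID"
# Update the GitHub secret used by the pipeline
gh secret set HARBOR_ROBOT_PASSWORD --body "$NEW_SECRET"
```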
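
The `openssl | tr` one-liner almost always yields a usable secret, but it does not guarantee any particular character mix. If your Harbor version enforces a complexity rule on robot secrets (assumed here to be 8 to 128 characters with at least one uppercase letter, one lowercase letter, and one digit; check this against your installation), a generator that verifies the rule removes the guesswork. `generate_robot_secret` is a hypothetical helper, not part of the repository.

```python
import secrets
import string


def generate_robot_secret(length: int = 32) -> str:
    """Return a random alphanumeric secret with upper, lower and digit characters.

    The length bounds and character rule are assumptions about Harbor's
    validation; adjust them to match your installation.
    """
    if not 8 <= length <= 128:
        raise ValueError("length must be between 8 and 128")
    alphabet = string.ascii_letters + string.digits
    while True:
        # secrets.choice draws from the OS CSPRNG, unlike random.choice.
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if (
            any(c.islower() for c in candidate)
            and any(c.isupper() for c in candidate)
            and any(c.isdigit() for c in candidate)
        ):
            return candidate


if __name__ == "__main__":
    print(generate_robot_secret())
```

Sticking to letters and digits also sidesteps shell and JSON quoting problems when the value is passed to `curl -d` or `gh secret set`.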
### Garbage Collection (Free Disk Space)
```
# Via the Harbor UI: Administration → Garbage Collection → GC Now
# Or via the API (uses HARBOR_ADMIN_PASS, as in Step 4):
curl -sk -u "admin:$HARBOR_ADMIN_PASS" \
  -X POST \
  "https://harbor.yourcompany.com/api/v2.0/system/gc/schedule" \
  -H "Content-Type: application/json" \
  -d '{"schedule":{"type":"Manual"}}'
```
## Further Reading
- [`docs/harbor-rbac.md`](docs/harbor-rbac.md): RBAC model, robot accounts, audit logging
- [`docs/trivy-scanning.md`](docs/trivy-scanning.md): scanning strategy, remediation, air-gapped environments
- [Harbor documentation](https://goharbor.io/docs/latest/)
- [Trivy documentation](https://aquasecurity.github.io/trivy/)
- [Harbor CNCF project](https://www.cncf.io/projects/harbor/)
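## Author
This demo is intended to showcase:
- **Harbor**: a CNCF container registry that goes beyond ECR/ACR
- **DevSecOps**: shift-left scanning with Trivy, security gates
- **Terraform**: EC2, EBS, IAM, S3, AWS Backup, Secrets Manager
- **Operational maturity**: health checks, backups, rotation, monitoring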