replicate/cog

GitHub: replicate/cog

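An open-source tool that packages machine learning models into production-ready containers, handling environment configuration and API generation automatically.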

Stars: 9444 | Forks: 696

# Cog: Containers for machine learning

Cog is an open-source tool that lets you package machine learning models in a standard, production-ready container.

You can deploy your packaged model to your own infrastructure, or to [Replicate](https://replicate.com/).

## Highlights

- 📦 **Docker containers without the pain.** Writing your own `Dockerfile` can be a bewildering process. With Cog, you define your environment with a [simple configuration file](#how-it-works), and it generates a Docker image that follows best practices: Nvidia base images, efficient caching of dependencies, installation of a specific Python version, sensible environment variable defaults, and so on.
- 🤬️ **No more CUDA hell.** Cog knows which CUDA/cuDNN/PyTorch/Tensorflow/Python combinations are compatible and sets everything up correctly for you.
- ✅ **Define the inputs and outputs for your model with standard Python.** Cog then generates an OpenAPI schema and validates the inputs and outputs.
- 🎁 **Automatic HTTP prediction server.** Your model's types are used to dynamically generate a RESTful HTTP API, served by a high-performance Rust/Axum server.
- 🚀 **Ready for production.** Deploy your model anywhere that Docker images run: your own infrastructure, or [Replicate](https://replicate.com).

## How it works

Define the Docker environment your model runs in with `cog.yaml`:

```
build:
  gpu: true
  system_packages:
    - "libgl1-mesa-glx"
    - "libglib2.0-0"
  python_version: "3.13"
  python_requirements: requirements.txt
predict: "predict.py:Predictor"
```

Define how predictions are run on your model with `predict.py`:

```
from cog import BasePredictor, Input, Path
import torch

class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        self.model = torch.load("./weights.pth")

    # The arguments and types the model takes as input
    def predict(self,
          image: Path = Input(description="Grayscale input image")
    ) -> Path:
        """Run a single prediction on the model"""
        # preprocess() and postprocess() stand in for your own model-specific code
        processed_image = preprocess(image)
        output = self.model(processed_image)
        return postprocess(output)
```

In the code above, the predictor accepts a path to an image as input and, after running it through the model, returns a path to the transformed image.

Now you can run predictions on this model:

```
$ cog predict -i image=@input.jpg
--> Building Docker image...
--> Running Prediction...
--> Output written to output.jpg
```

Or, build a Docker image for deployment:

```
$ cog build -t my-classification-model
--> Building Docker image...
--> Built my-classification-model:latest

$ docker run -d -p 5000:5000 --gpus all my-classification-model

$ curl http://localhost:5000/predictions -X POST \
    -H 'Content-Type: application/json' \
    -d '{"input": {"image": "https://.../input.jpg"}}'
```

Or, combine build and run with the `serve` command:

```
$ cog serve -p 8080

$ curl http://localhost:8080/predictions -X POST \
    -H 'Content-Type: application/json' \
    -d '{"input": {"image": "https://.../input.jpg"}}'
```
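
The `curl` calls above can also be made from Python. The following is a minimal sketch using only the standard library, assuming a Cog container is already listening on the host and port from the `docker run` example. The helper names `build_prediction_request` and `run_prediction` are illustrative and not part of Cog.

```python
import json
import urllib.request


def build_prediction_request(base_url: str, inputs: dict) -> urllib.request.Request:
    """Build a POST request for the /predictions endpoint used in the curl examples.

    Hypothetical helper for illustration; not part of Cog.
    """
    # Wrap the model inputs under the "input" key, mirroring the curl payload.
    body = json.dumps({"input": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predictions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_prediction(base_url: str, inputs: dict) -> dict:
    """Send the request and return the decoded JSON response.

    Assumes a Cog server is reachable at base_url and replies with JSON.
    """
    request = build_prediction_request(base_url, inputs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

With the container from the `docker run` example running, calling `run_prediction("http://localhost:5000", {"image": image_url})` with an image URL of your choice sends the same request as the first `curl` command. Building the request separately from sending it lets you inspect the payload without a running server.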