tensorzero/tensorzero

GitHub: tensorzero/tensorzero

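An open-source, all-in-one stack for industrial-grade LLM applications that unifies a gateway, observability, evaluation, optimization, and experimentation.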

Stars: 11641 | Forks: 919


# TensorZero

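**TensorZero is an open-source stack for _industrial-grade LLM applications_:**

- **Gateway:** access every LLM provider through a unified API, built for performance (<1ms p99 latency overhead)
- **Observability:** store inferences and feedback in your database, available programmatically or in the UI
- **Optimization:** collect metrics and human feedback to optimize prompts, models, and inference strategies
- **Evaluations:** benchmark individual inferences or end-to-end workflows using heuristics, LLM judges, etc.
- **Experimentation:** ship with confidence with built-in A/B testing, routing, fallbacks, retries, etc.

Take what you need, adopt incrementally, and complement with other tools.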

Website · Docs · Twitter · Slack · Discord

Quick Start (5min) · Deployment Guide · API Reference · Configuration Reference

## Features

### 🌐 LLM Gateway

- [x] **[Call any LLM](https://www.tensorzero.com/docs/gateway/call-any-llm)** (API or self-hosted) through a single unified API
- [x] Infer with **[streaming](https://www.tensorzero.com/docs/gateway/guides/streaming-inference)**, **[tool use](https://www.tensorzero.com/docs/gateway/guides/tool-use)**, **[structured outputs (JSON)](https://www.tensorzero.com/docs/gateway/generate-structured-outputs)**, **[batch](https://www.tensorzero.com/docs/gateway/guides/batch-inference)**, **[embeddings](https://www.tensorzero.com/docs/gateway/generate-embeddings)**, **[multimodal (images, files)](https://www.tensorzero.com/docs/gateway/guides/multimodal-inference)**, **[caching](https://www.tensorzero.com/docs/gateway/guides/inference-caching)**, etc.
- [x] **[Create prompt templates and schemas](https://www.tensorzero.com/docs/gateway/create-a-prompt-template)** to enforce a structured interface between your application and the LLMs
- [x] Satisfy extreme throughput and latency needs, thanks to 🦀 Rust: **[<1ms p99 latency overhead at 10k+ QPS](https://www.tensorzero.com/docs/gateway/benchmarks)**
- [x] Use any programming language: **[integrate via our Python SDK, any OpenAI SDK, or our HTTP API](https://www.tensorzero.com/docs/gateway/clients)**
- [x] **[Ensure high availability](https://www.tensorzero.com/docs/gateway/guides/retries-fallbacks)** with routing, retries, fallbacks, load balancing, granular timeouts, etc.
- [x] **[Track usage and cost](https://www.tensorzero.com/docs/operations/track-usage-and-cost)** and **[enforce custom rate limits](https://www.tensorzero.com/docs/operations/enforce-custom-rate-limits)** with granular scopes (e.g. tags)
- [x] **[Set up auth for TensorZero](https://www.tensorzero.com/docs/operations/set-up-auth-for-tensorzero)** to allow clients to access models without sharing provider API keys

#### Supported Model Providers

**[Anthropic](https://www.tensorzero.com/docs/gateway/guides/providers/anthropic)**, **[AWS Bedrock](https://www.tensorzero.com/docs/gateway/guides/providers/aws-bedrock)**, **[AWS SageMaker](https://www.tensorzero.com/docs/gateway/guides/providers/aws-sagemaker)**, **[Azure](https://www.tensorzero.com/docs/gateway/guides/providers/azure)**, **[DeepSeek](https://www.tensorzero.com/docs/gateway/guides/providers/deepseek)**, **[Fireworks](https://www.tensorzero.com/docs/gateway/guides/providers/fireworks)**, **[GCP Vertex AI Anthropic](https://www.tensorzero.com/docs/gateway/guides/providers/gcp-vertex-ai-anthropic)**, **[GCP Vertex AI Gemini](https://www.tensorzero.com/docs/gateway/guides/providers/gcp-vertex-ai-gemini)**, **[Google AI Studio (Gemini API)](https://www.tensorzero.com/docs/gateway/guides/providers/google-ai-studio-gemini)**, **[Groq](https://www.tensorzero.com/docs/gateway/guides/providers/groq)**, **[Hyperbolic](https://www.tensorzero.com/docs/gateway/guides/providers/hyperbolic)**, **[Mistral](https://www.tensorzero.com/docs/gateway/guides/providers/mistral)**, **[OpenAI](https://www.tensorzero.com/docs/gateway/guides/providers/openai)**, **[OpenRouter](https://www.tensorzero.com/docs/gateway/guides/providers/openrouter)**, **[SGLang](https://www.tensorzero.com/docs/gateway/guides/providers/sglang)**, **[TGI](https://www.tensorzero.com/docs/gateway/guides/providers/tgi)**, **[Together AI](https://www.tensorzero.com/docs/gateway/guides/providers/together)**, **[vLLM](https://www.tensorzero.com/docs/gateway/guides/providers/vllm)**, and **[xAI (Grok)](https://www.tensorzero.com/docs/gateway/guides/providers/xai)**.

Need something else? TensorZero also supports **[any OpenAI-compatible API (e.g. Ollama)](https://www.tensorzero.com/docs/gateway/guides/providers/openai-compatible)**.

#### Usage Example

You can use TensorZero with any OpenAI SDK (Python, Node, Go, etc.) or OpenAI-compatible client.

1. **[Deploy the TensorZero Gateway](https://www.tensorzero.com/docs/deployment/tensorzero-gateway)** (one Docker container).
2. Update the `base_url` and `model` in your OpenAI-compatible client.
3. Run inference:

```python
from openai import OpenAI

# Point the client to the TensorZero Gateway
client = OpenAI(base_url="http://localhost:3000/openai/v1", api_key="not-used")

response = client.chat.completions.create(
    # Call any model provider (or TensorZero function)
    model="tensorzero::model_name::anthropic::claude-sonnet-4-6",
    messages=[
        {
            "role": "user",
            "content": "Write a haiku about TensorZero.",
        }
    ],
)
```

See the **[Quick Start](https://www.tensorzero.com/docs/quickstart)** for more information.

### 🔍 LLM Observability

- [x] Store inferences and **[feedback (metrics, human edits, etc.)](https://www.tensorzero.com/docs/gateway/guides/metrics-feedback)** in your own database
- [x] Dive into individual inferences or high-level aggregate patterns using the TensorZero UI or programmatically
- [x] **[Build datasets](https://www.tensorzero.com/docs/gateway/api-reference/datasets-datapoints)** for optimization, evaluation, and other workflows
- [x] Replay historical inferences with new prompts, models, inference strategies, etc.
- [x] **[Export OpenTelemetry traces (OTLP)](https://www.tensorzero.com/docs/operations/export-opentelemetry-traces)** and **[export Prometheus metrics](https://www.tensorzero.com/docs/observability/export-prometheus-metrics)** to your favorite application observability tools
- [ ] Soon: AI-assisted debugging and root cause analysis; AI-assisted data labeling

### 📈 LLM Optimization

- [x] Optimize your models with supervised fine-tuning, RLHF, and other techniques
- [x] Optimize your prompts with automated prompt engineering algorithms like **[GEPA](https://www.tensorzero.com/docs/optimization/gepa)** and MIPROv2
- [x] Optimize your **[inference strategy](https://www.tensorzero.com/docs/gateway/guides/inference-time-optimizations)** with dynamic in-context learning, best/mixture-of-N sampling, etc.
- [x] Enable a feedback loop for your LLMs: a data & learning flywheel that turns production data into smarter, faster, and cheaper models
- [ ] Soon: synthetic data generation

### 📊 LLM Evaluations

- [x] **[Evaluate individual inferences](https://www.tensorzero.com/docs/evaluations/inference-evaluations/tutorial)** with _inference evaluations_ powered by heuristics or LLM judges (≈ unit tests for LLMs)
- [x] **[Evaluate end-to-end workflows](https://www.tensorzero.com/docs/evaluations/workflow-evaluations/tutorial)** with _workflow evaluations_ that offer complete flexibility (≈ integration tests for LLMs)
- [x] Optimize LLM judges just like any other TensorZero function to align them with human preferences
- [ ] Soon: more built-in evaluators; headless evaluations
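To make the idea of heuristic-driven inference evaluations (the "unit tests for LLMs" above) more concrete, here is a minimal sketch in plain Python. It is not TensorZero's implementation: the `exact_match` and `item_count` functions, the `evaluate` helper, and the sample datapoints are all hypothetical and only illustrate scoring each inference with a heuristic and averaging the results.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical datapoints: (model output, reference answer) pairs.
Datapoint = Tuple[str, str]
Evaluator = Callable[[str, str], float]


def exact_match(output: str, reference: str) -> float:
    """Score 1.0 when output and reference match, ignoring case and outer whitespace."""
    return float(output.strip().lower() == reference.strip().lower())


def item_count(output: str, reference: str) -> float:
    """Count the non-empty lines in the output (the reference is unused)."""
    return float(sum(1 for line in output.splitlines() if line.strip()))


def evaluate(datapoints: List[Datapoint], evaluators: Dict[str, Evaluator]) -> Dict[str, float]:
    """Apply every evaluator to every datapoint and return the mean score per evaluator."""
    return {
        name: mean(fn(output, reference) for output, reference in datapoints)
        for name, fn in evaluators.items()
    }


# Illustrative data only; a real evaluation would read from a dataset.
datapoints = [
    ("Paris", "paris"),
    ("Lyon", "Paris"),
    ("Berlin\n", "Berlin"),
    ("Rome", "Rome"),
]
scores = evaluate(datapoints, {"exact_match": exact_match, "item_count": item_count})
print(scores)
```

In this sketch each evaluator is a pure function of one inference, which is what makes the unit-test analogy apt: scores can be recomputed for any variant against the same dataset.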

Evaluation » CLI

    docker compose run --rm evaluations \
      --evaluation-name extract_data \
      --dataset-name hard_test_cases \
      --variant-name gpt_4o \
      --concurrency 5

    Run ID: 01961de9-c8a4-7c60-ab8d-15491a9708e4
    Number of datapoints: 100
    ██████████████████████████████████████ 100/100
    exact_match: 0.83 ± 0.03 (n=100)
    semantic_match: 0.98 ± 0.01 (n=100)
    item_count: 7.15 ± 0.39 (n=100)
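If you want to gate a CI job on results like the ones above, one option is to parse the summary lines into structured records. The sketch below assumes the `name: mean ± error (n=count)` line format shown in the sample output stays stable, which this README does not guarantee. The `MetricSummary` class, `parse_summary` function, and the `0.80` threshold are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict


@dataclass
class MetricSummary:
    name: str
    mean: float
    error: float  # the value printed after "±"
    n: int


# Matches lines such as "exact_match: 0.83 ± 0.03 (n=100)".
LINE_RE = re.compile(
    r"^(?P<name>\w+):\s+(?P<mean>-?\d+(?:\.\d+)?)\s+±\s+"
    r"(?P<error>\d+(?:\.\d+)?)\s+\(n=(?P<n>\d+)\)$"
)


def parse_summary(text: str) -> Dict[str, MetricSummary]:
    """Extract metric summaries from CLI output, skipping lines that do not match."""
    results: Dict[str, MetricSummary] = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results[match["name"]] = MetricSummary(
                name=match["name"],
                mean=float(match["mean"]),
                error=float(match["error"]),
                n=int(match["n"]),
            )
    return results


sample = """Run ID: 01961de9-c8a4-7c60-ab8d-15491a9708e4
Number of datapoints: 100
exact_match: 0.83 ± 0.03 (n=100)
semantic_match: 0.98 ± 0.01 (n=100)
item_count: 7.15 ± 0.39 (n=100)
"""

metrics = parse_summary(sample)
THRESHOLD = 0.80  # hypothetical quality bar for this example
passed = metrics["exact_match"].mean >= THRESHOLD
print(metrics["exact_match"], "passed" if passed else "failed")
```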

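### 🧪 LLM Experimentation

- [x] **[Run adaptive A/B tests](https://www.tensorzero.com/docs/experimentation/run-adaptive-ab-tests)** to ship with confidence and identify the best prompts and models for your use cases.
- [x] Enforce principled experiments in complex workflows, including support for multi-turn LLM systems, sequential testing, and more.

### & more!

- [x] Build simple applications or massive deployments with GitOps-friendly orchestration
- [x] **[Extend TensorZero](https://www.tensorzero.com/docs/operations/extend-tensorzero)** with built-in escape hatches, programmatic-first usage, direct database access, and more
- [x] Integrate with third-party tools: specialized observability and evaluation tools, model providers, agent orchestration frameworks, etc.
- [x] Iterate quickly by experimenting with prompts interactively using the Playground UI

## Frequently Asked Questions

**How is TensorZero different from other LLM frameworks?**

1. TensorZero enables you to optimize complex LLM applications based on production metrics and human feedback.
2. TensorZero supports the needs of industrial-grade LLM applications: low latency, high throughput, type safety, self-hosting, GitOps, customizability, etc.
3. TensorZero unifies the entire LLMOps stack, creating compounding benefits. For example, LLM evaluations can be used for fine-tuning models as well as AI judges.

**Can I use TensorZero with \_\_\_?**

Yes. Every major programming language is supported. It plays nicely with the **[OpenAI SDK](https://www.tensorzero.com/docs/gateway/clients/)**, **[OpenTelemetry](https://www.tensorzero.com/docs/operations/export-opentelemetry-traces/)**, and **[every major LLM](https://www.tensorzero.com/docs/integrations/model-providers/)**.

**Is TensorZero production-ready?**

Yes. TensorZero is used by companies ranging from frontier AI startups to the Fortune 50. Here's a case study: **[Automating Code Changelogs at a Large Bank with LLMs](https://www.tensorzero.com/blog/case-study-automating-code-changelogs-at-a-large-bank-with-llms)**

**How much does TensorZero cost?**

TensorZero Stack (LLMOps platform) is 100% self-hosted and open-source. TensorZero Autopilot (automated AI engineer) is a complementary paid product built on the TensorZero Stack.

**Who is building TensorZero?**

Our technical team includes a former Rust compiler maintainer, machine learning researchers (Stanford, CMU, Oxford, Columbia) with thousands of citations, and the chief product officer of a decacorn startup. We're backed by the same investors as leading open-source projects (e.g. ClickHouse, CockroachDB) and AI labs (e.g. OpenAI, Anthropic). See our **[$7.3M seed round announcement](https://www.tensorzero.com/blog/tensorzero-raises-7-3m-seed-round-to-build-an-open-source-stack-for-industrial-grade-llm-applications/)** and **[coverage from VentureBeat](https://venturebeat.com/ai/tensorzero-nabs-7-3m-seed-to-solve-the-messy-world-of-enterprise-llm-development/)**. We're **[hiring in NYC](https://www.tensorzero.com/jobs)**.

**How do I get started?**

You can adopt TensorZero incrementally. Our **[Quick Start](https://www.tensorzero.com/docs/quickstart)** goes from a vanilla OpenAI wrapper to a production-ready LLM application with observability and fine-tuning in just 5 minutes.

## Demo

https://github.com/user-attachments/assets/4df1022e-886e-48c2-8f79-6af3cdad79cb

## Get Started

**Start building today.** The **[Quick Start](https://www.tensorzero.com/docs/quickstart)** shows how easy it is to set up an LLM application with TensorZero.

**Questions?** Ask us on **[Slack](https://www.tensorzero.com/slack)** or **[Discord](https://www.tensorzero.com/discord)**.

**Using TensorZero at work?** Email us at **[hello@tensorzero.com](mailto:hello@tensorzero.com)** to set up a Slack or Teams channel with your team (free).

## Examples

We are working on a series of **complete runnable examples** illustrating TensorZero's data & learning flywheel.

## Blog Posts

We write about LLM engineering on the **[TensorZero Blog](https://www.tensorzero.com/blog)**. Here are some of our favorite posts:

- **[Bandits in your LLM Gateway: Improve LLM Applications Faster with Adaptive Experimentation (A/B Testing)](https://www.tensorzero.com/blog/bandits-in-your-llm-gateway/)**
- **[Is OpenAI's Reinforcement Fine-Tuning (RFT) Worth It?](https://www.tensorzero.com/blog/is-openai-reinforcement-fine-tuning-rft-worth-it/)**
- **[Distillation with Programmatic Data Curation: Smarter LLMs, 5-30x Cheaper Inference](https://www.tensorzero.com/blog/distillation-programmatic-data-curation-smarter-llms-5-30x-cheaper-inference/)**
- **[From NER to Agents: Does Automated Prompt Engineering Scale to Complex Tasks?](https://www.tensorzero.com/blog/from-ner-to-agents-does-automated-prompt-engineering-scale-to-complex-tasks/)**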
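For readers new to the bandit-style adaptive A/B testing mentioned in the experimentation features and the first blog post above, the following is a minimal simulation of Thompson sampling over two prompt variants. It is a textbook technique swapped in for illustration and should not be read as the algorithm TensorZero actually uses. The `ThompsonSampler` class, the variant names, and the success rates are hypothetical.

```python
import random
from typing import Dict, List, Optional


class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over a fixed set of variants."""

    def __init__(self, variants: List[str], seed: Optional[int] = None) -> None:
        self.rng = random.Random(seed)
        self.successes = {v: 0 for v in variants}
        self.failures = {v: 0 for v in variants}

    def choose(self) -> str:
        # Draw a plausible success rate per variant from its Beta posterior
        # (uniform Beta(1, 1) prior) and pick the variant with the highest draw.
        draws = {
            v: self.rng.betavariate(self.successes[v] + 1, self.failures[v] + 1)
            for v in self.successes
        }
        return max(draws, key=draws.get)

    def record(self, variant: str, success: bool) -> None:
        # Update the posterior counts with one piece of binary feedback.
        if success:
            self.successes[variant] += 1
        else:
            self.failures[variant] += 1


def simulate(true_rates: Dict[str, float], rounds: int, seed: int = 0) -> Dict[str, int]:
    """Route simulated traffic adaptively and return how often each variant was chosen."""
    sampler = ThompsonSampler(list(true_rates), seed=seed)
    environment = random.Random(seed + 1)  # simulates user feedback
    pulls = {v: 0 for v in true_rates}
    for _ in range(rounds):
        variant = sampler.choose()
        pulls[variant] += 1
        sampler.record(variant, environment.random() < true_rates[variant])
    return pulls


# Hypothetical success rates: prompt_b is the better variant.
pulls = simulate({"prompt_a": 0.3, "prompt_b": 0.7}, rounds=2000)
print(pulls)
```

Unlike a fixed 50/50 split, this approach shifts traffic toward the better-performing variant as feedback accumulates, which is the general motivation behind adaptive experimentation.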

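Tags: A/B testing, AI development tools, API integration, LLMOps, LLM application stack, MLOps, OpenAI proxy, RESTful API, Rust, full-text search, package manager, observability, visual interface, LLM gateway, experimentation platform, industrial-grade AI, hallucination mitigation, open-source framework, performance optimization, continuous integration, prompt optimization, detection bypass, model evaluation, model routing, user agent, unified API, network traffic auditing, custom request headers, request interception, reverse engineering tools, notification system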