BerriAI/litellm


A unified LLM gateway and Python SDK: call 100+ models in the OpenAI format, with built-in cost tracking, load balancing, and access control.

Stars: 37428 | Forks: 6068

🚅 LiteLLM

Call 100+ LLMs in OpenAI format. [Bedrock, Azure, OpenAI, VertexAI, Anthropic, Groq, etc.]


LiteLLM Proxy Server (AI Gateway) | Hosted Proxy | Enterprise Tier


## What LiteLLM Does
**LLMs** - Call 100+ LLMs (Python SDK + AI Gateway)

[**All supported endpoints**](https://docs.litellm.ai/docs/supported_endpoints) - `/chat/completions`, `/responses`, `/embeddings`, `/images`, `/audio`, `/batches`, `/rerank`, `/a2a`, `/messages`, and more.

### Python SDK

```shell
pip install litellm
```

```python
from litellm import completion
import os

os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

# OpenAI
response = completion(model="openai/gpt-4o", messages=[{"role": "user", "content": "Hello!"}])

# Anthropic
response = completion(model="anthropic/claude-sonnet-4-20250514", messages=[{"role": "user", "content": "Hello!"}])
```

### AI Gateway (Proxy Server)

[**Quick start - E2E tutorial**](https://docs.litellm.ai/docs/proxy/docker_quick_start) - set up virtual keys and make your first request

```shell
pip install 'litellm[proxy]'
litellm --model gpt-4o
```

```python
import openai

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

[**Docs: LLM Providers**](https://docs.litellm.ai/docs/providers)
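The `provider/model` naming convention above is what lets a single `completion()` call fan out to different providers. A minimal, purely illustrative sketch of that dispatch idea (not LiteLLM's actual implementation; `route_model` is a hypothetical helper):

```python
def route_model(model: str) -> tuple[str, str]:
    """Split a LiteLLM-style model string into (provider, model name).

    Hypothetical helper for illustration only -- LiteLLM's real routing
    handles many more cases (aliases, provider-specific defaults, etc.).
    """
    if "/" in model:
        # Split at the first "/" so model names containing "/" survive intact.
        provider, _, name = model.partition("/")
        return provider, name
    # No explicit prefix: treat the bare name as an OpenAI model.
    return "openai", model

print(route_model("anthropic/claude-sonnet-4-20250514"))
# -> ('anthropic', 'claude-sonnet-4-20250514')
```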
**Agents** - Call A2A Agents (Python SDK + AI Gateway)

[**Supported providers**](https://docs.litellm.ai/docs/a2a#add-a2a-agents) - LangGraph, Vertex AI Agent Engine, Azure AI Foundry, Bedrock AgentCore, Pydantic AI

### Python SDK - A2A Protocol

```python
from litellm.a2a_protocol import A2AClient
from a2a.types import SendMessageRequest, MessageSendParams
from uuid import uuid4

client = A2AClient(base_url="http://localhost:10001")

request = SendMessageRequest(
    id=str(uuid4()),
    params=MessageSendParams(
        message={
            "role": "user",
            "parts": [{"kind": "text", "text": "Hello!"}],
            "messageId": uuid4().hex,
        }
    )
)

response = await client.send_message(request)
```

### AI Gateway (Proxy Server)

**Step 1.** [Add your agent to the AI Gateway](https://docs.litellm.ai/docs/a2a#adding-your-agent)

**Step 2.** Call the agent via the A2A SDK

```python
from a2a.client import A2ACardResolver, A2AClient
from a2a.types import MessageSendParams, SendMessageRequest
from uuid import uuid4
import httpx

base_url = "http://localhost:4000/a2a/my-agent"  # LiteLLM proxy + agent name
headers = {"Authorization": "Bearer sk-1234"}  # LiteLLM Virtual Key

async with httpx.AsyncClient(headers=headers) as httpx_client:
    resolver = A2ACardResolver(httpx_client=httpx_client, base_url=base_url)
    agent_card = await resolver.get_agent_card()

    client = A2AClient(httpx_client=httpx_client, agent_card=agent_card)
    request = SendMessageRequest(
        id=str(uuid4()),
        params=MessageSendParams(
            message={
                "role": "user",
                "parts": [{"kind": "text", "text": "Hello!"}],
                "messageId": uuid4().hex,
            }
        )
    )
    response = await client.send_message(request)
```

[**Docs: A2A Agent Gateway**](https://docs.litellm.ai/docs/a2a)
**MCP Tools** - Connect MCP servers to any LLM (Python SDK + AI Gateway)

### Python SDK - MCP Bridge

```python
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from litellm import experimental_mcp_client
import litellm

server_params = StdioServerParameters(command="python", args=["mcp_server.py"])

async with stdio_client(server_params) as (read, write):
    async with ClientSession(read, write) as session:
        await session.initialize()

        # Load MCP tools in OpenAI format
        tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")

        # Use with any LiteLLM model
        response = await litellm.acompletion(
            model="gpt-4o",
            messages=[{"role": "user", "content": "What's 3 + 5?"}],
            tools=tools
        )
```

### AI Gateway - MCP Gateway

**Step 1.** [Add your MCP server to the AI Gateway](https://docs.litellm.ai/docs/mcp#adding-your-mcp)

**Step 2.** Call MCP tools via `/chat/completions`

```bash
curl -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Summarize the latest open PR"}],
    "tools": [{
      "type": "mcp",
      "server_url": "litellm_proxy/mcp/github",
      "server_label": "github_mcp",
      "require_approval": "never"
    }]
  }'
```

### Use from the Cursor IDE

```json
{
  "mcpServers": {
    "LiteLLM": {
      "url": "http://localhost:4000/mcp/",
      "headers": {
        "x-litellm-api-key": "Bearer sk-1234"
      }
    }
  }
}
```

[**Docs: MCP Gateway**](https://docs.litellm.ai/docs/mcp)
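The MCP bridge above works because MCP tool definitions map cleanly onto OpenAI's function-calling schema. A rough sketch of that conversion, assuming the MCP tool is given as a plain dict (the real `load_mcp_tools` does this against a live MCP session; `mcp_tool_to_openai` is a hypothetical helper):

```python
def mcp_tool_to_openai(mcp_tool: dict) -> dict:
    """Convert one MCP tool definition into an OpenAI `tools` list entry.

    Illustrative only: assumes the MCP tool is a plain dict with
    `name`, `description`, and `inputSchema` (a JSON Schema object).
    """
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            # MCP's inputSchema is already JSON Schema, which is what
            # OpenAI expects under `parameters`.
            "parameters": mcp_tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }

# Example MCP tool definition (hypothetical):
tool = {
    "name": "add",
    "description": "Add two numbers",
    "inputSchema": {
        "type": "object",
        "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
        "required": ["a", "b"],
    },
}
openai_tool = mcp_tool_to_openai(tool)
```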
## How to Use LiteLLM

You can use LiteLLM through either the AI Gateway (proxy server) or the Python SDK. Both give you a unified interface to 100+ LLMs. Pick the option that fits your needs:
| | LiteLLM AI Gateway | LiteLLM Python SDK |
|---|---|---|
| Use case | Central service (LLM Gateway) for accessing multiple LLMs | Use LiteLLM directly in your Python code |
| Who it's for | Gen AI Enablement / ML Platform teams | Developers building LLM projects |
| Key features | Centralized API gateway with authentication and authorization; multi-tenant cost tracking and spend management per project/user; per-project customization (logging, guardrails, caching); virtual keys for secure access control; admin dashboard UI for monitoring and management | Python library integrated directly into your codebase; Router with retry/fallback logic across multiple deployments (e.g. Azure/OpenAI); application-level load balancing and cost tracking; exception handling with OpenAI-compatible errors; observability callbacks (Lunary, MLflow, Langfuse, etc.) |
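The Router's retry and fallback behavior can be sketched in pure Python. This is a conceptual sketch only, not LiteLLM's Router API: `call_with_fallbacks`, `make_call`, and the deployment list are all hypothetical names.

```python
import random

def call_with_fallbacks(deployments, make_call, max_retries=2):
    """Conceptual sketch of retry + fallback across deployments.

    `deployments` is a list of identifiers (e.g. ["azure/gpt-4o", "openai/gpt-4o"]);
    `make_call(deployment)` performs the actual request and may raise.
    """
    # Simple-shuffle load balancing: try deployments in random order.
    order = random.sample(deployments, k=len(deployments))
    last_error = None
    for deployment in order:
        for _attempt in range(max_retries + 1):
            try:
                return make_call(deployment)
            except Exception as err:  # real code would only retry transient errors
                last_error = err
    raise RuntimeError(f"all deployments failed: {last_error}")

# Usage with a fake call that always fails on the Azure deployment:
def fake_call(deployment):
    if deployment.startswith("azure/"):
        raise TimeoutError("azure deployment down")
    return f"response from {deployment}"

result = call_with_fallbacks(["azure/gpt-4o", "openai/gpt-4o"], fake_call)
# -> "response from openai/gpt-4o"
```

The real Router adds cooldowns, health checks, and routing strategies beyond simple shuffle, but the core loop is the same shape: pick a deployment, retry on failure, then fall back to the next one.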
LiteLLM performance: **8ms P95 latency** at 1k RPS (see the benchmarks [here](https://docs.litellm.ai/docs/benchmarks))

[**Jump to the LiteLLM Proxy (LLM Gateway) docs**](https://docs.litellm.ai/docs/simple_proxy)
[**Jump to supported LLM providers**](https://docs.litellm.ai/docs/providers)

**Stable release:** use docker images tagged `-stable`. These images go through a 12-hour load test before being published. [Learn more about the release cycle here](https://docs.litellm.ai/docs/proxy/release_cycle)

If a provider or LLM platform is missing, open a [feature request](https://github.com/BerriAI/litellm/issues/new?assignees=&labels=enhancement&projects=&template=feature_request.yml&title=%5BFeature%5D%3A+).

## OSS Adopters
Stripe Google ADK Greptile OpenHands

Netflix

OpenAI Agents SDK
## Supported Providers ([Models on the website](https://models.litellm.ai/) | [Docs](https://docs.litellm.ai/docs/providers))

| Provider | `/chat/completions` | `/messages` | `/responses` | `/embeddings` | `/image/generations` | `/audio/transcriptions` | `/audio/speech` | `/moderations` | `/batches` | `/rerank` |
|---|---|---|---|---|---|---|---|---|---|---|
| [Abliteration (`abliteration`)](https://docs.litellm.ai/docs/providers/abliteration) | ✅ | | | | | | | | | |
| [AI/ML API (`aiml`)](https://docs.litellm.ai/docs/providers/aiml) | ✅ | ✅ | ✅ | ✅ | ✅ | | | | | |
| [AI21 (`ai21`)](https://docs.litellm.ai/docs/providers/ai21) | ✅ | ✅ | ✅ | | | | | | | |
| [AI21 Chat (`ai21_chat`)](https://docs.litellm.ai/docs/providers/ai21) | ✅ | ✅ | ✅ | | | | | | | |
| [Aleph Alpha](https://docs.litellm.ai/docs/providers/aleph_alpha) | ✅ | ✅ | ✅ | | | | | | | |
| [Amazon Nova](https://docs.litellm.ai/docs/providers/amazon_nova) | ✅ | ✅ | ✅ | | | | | | | |
| [Anthropic (`anthropic`)](https://docs.litellm.ai/docs/providers/anthropic) | ✅ | ✅ | ✅ | | | | | | ✅ | |
| [Anthropic Text (`anthropic_text`)](https://docs.litellm.ai/docs/providers/anthropic) | ✅ | ✅ | ✅ | | | | | | ✅ | |
| [Anyscale](https://docs.litellm.ai/docs/providers/anyscale) | ✅ | ✅ | ✅ | | | | | | | |
| [AssemblyAI (`assemblyai`)](https://docs.litellm.ai/docs/pass_through/assembly_ai) | ✅ | ✅ | ✅ | | | ✅ | | | | |
| [Auto Router (`auto_router`)](https://docs.litellm.ai/docs/proxy/auto_routing) | ✅ | ✅ | ✅ | | | | | | | |
| [AWS - Bedrock (`bedrock`)](https://docs.litellm.ai/docs/providers/bedrock) | ✅ | ✅ | ✅ | ✅ | | | | | | ✅ |
| [AWS - Sagemaker (`sagemaker`)](https://docs.litellm.ai/docs/providers/aws_sagemaker) | ✅ | ✅ | ✅ | ✅ | | | | | | |
| [Azure (`azure`)](https://docs.litellm.ai/docs/providers/azure) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| [Azure AI (`azure_ai`)](https://docs.litellm.ai/docs/providers/azure_ai) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| [Azure Text (`azure_text`)](https://docs.litellm.ai/docs/providers/azure) | ✅ | ✅ | ✅ | | | ✅ | ✅ | ✅ | ✅ | |
| [Baseten (`baseten`)](https://docs.litellm.ai/docs/providers/baseten) | ✅ | ✅ | ✅ | | | | | | | |
| [Bytez (`bytez`)](https://docs.litellm.ai/docs/providers/bytez) | ✅ | ✅ | ✅ | | | | | | | |
| [Cerebras (`cerebras`)](https://docs.litellm.ai/docs/providers/cerebras) | ✅ | ✅ | ✅ | | | | | | | |
| [Clarifai (`clarifai`)](https://docs.litellm.ai/docs/providers/clarifai) | ✅ | ✅ | ✅ | | | | | | | |
| [Cloudflare AI Workers (`cloudflare`)](https://docs.litellm.ai/docs/providers/cloudflare_workers) | ✅ | ✅ | ✅ | | | | | | | |
| [Codestral (`codestral`)](https://docs.litellm.ai/docs/providers/codestral) | ✅ | ✅ | ✅ | | | | | | | |
| [Cohere (`cohere`)](https://docs.litellm.ai/docs/providers/cohere) | ✅ | ✅ | ✅ | ✅ | | | | | | ✅ |
| [Cohere Chat (`cohere_chat`)](https://docs.litellm.ai/docs/providers/cohere) | ✅ | ✅ | ✅ | | | | | | | |
| [CometAPI (`cometapi`)](https://docs.litellm.ai/docs/providers/cometapi) | ✅ | ✅ | ✅ | ✅ | | | | | | |
| [CompactifAI (`compactifai`)](https://docs.litellm.ai/docs/providers/compactifai) | ✅ | ✅ | ✅ | | | | | | | |
| [Custom (`custom`)](https://docs.litellm.ai/docs/providers/custom_llm_server) | ✅ | ✅ | ✅ | | | | | | | |
| [Custom OpenAI (`custom_openai`)](https://docs.litellm.ai/docs/providers/openai_compatible) | ✅ | ✅ | ✅ | | | ✅ | ✅ | ✅ | ✅ | |
| [Dashscope (`dashscope`)](https://docs.litellm.ai/docs/providers/dashscope) | ✅ | ✅ | ✅ | | | | | | | |
| [Databricks (`databricks`)](https://docs.litellm.ai/docs/providers/databricks) | ✅ | ✅ | ✅ | | | | | | | |
| [DataRobot (`datarobot`)](https://docs.litellm.ai/docs/providers/datarobot) | ✅ | ✅ | ✅ | | | | | | | |
| [Deepgram (`deepgram`)](https://docs.litellm.ai/docs/providers/deepgram) | ✅ | ✅ | ✅ | | | ✅ | | | | |
| [DeepInfra (`deepinfra`)](https://docs.litellm.ai/docs/providers/deepinfra) | ✅ | ✅ | ✅ | | | | | | | |
| [Deepseek (`deepseek`)](https://docs.litellm.ai/docs/providers/deepseek) | ✅ | ✅ | ✅ | | | | | | | |
| [ElevenLabs (`elevenlabs`)](https://docs.litellm.ai/docs/providers/elevenlabs) | ✅ | ✅ | ✅ | | | ✅ | ✅ | | | |
| [Empower (`empower`)](https://docs.litellm.ai/docs/providers/empower) | ✅ | ✅ | ✅ | | | | | | | |
| [Fal AI (`fal_ai`)](https://docs.litellm.ai/docs/providers/fal_ai) | ✅ | ✅ | ✅ | | ✅ | | | | | |
| [Featherless AI (`featherless_ai`)](https://docs.litellm.ai/docs/providers/featherless_ai) | ✅ | ✅ | ✅ | | | | | | | |
| [Fireworks AI (`fireworks_ai`)](https://docs.litellm.ai/docs/providers/fireworks_ai) | ✅ | ✅ | ✅ | | | | | | | |
| [FriendliAI (`friendliai`)](https://docs.litellm.ai/docs/providers/friendliai) | ✅ | ✅ | ✅ | | | | | | | |
| [Galadriel (`galadriel`)](https://docs.litellm.ai/docs/providers/galadriel) | ✅ | ✅ | ✅ | | | | | | | |
| [GitHub Copilot (`github_copilot`)](https://docs.litellm.ai/docs/providers/github_copilot) | ✅ | ✅ | ✅ | | | | | | | |
| [GitHub Models (`github`)](https://docs.litellm.ai/docs/providers/github) | ✅ | ✅ | ✅ | | | | | | | |
| [Google - PaLM](https://docs.litellm.ai/docs/providers/palm) | ✅ | ✅ | ✅ | | | | | | | |
| [Google - Vertex AI (`vertex_ai`)](https://docs.litellm.ai/docs/providers/vertex) | ✅ | ✅ | ✅ | ✅ | ✅ | | | | | |
| [Google AI Studio - Gemini (`gemini`)](https://docs.litellm.ai/docs/providers/gemini) | ✅ | ✅ | ✅ | | | | | | | |
| [GradientAI (`gradient_ai`)](https://docs.litellm.ai/docs/providers/gradient_ai) | ✅ | ✅ | ✅ | | | | | | | |
| [Groq AI (`groq`)](https://docs.litellm.ai/docs/providers/groq) | ✅ | ✅ | ✅ | | | | | | | |
| [Heroku (`heroku`)](https://docs.litellm.ai/docs/providers/heroku) | ✅ | ✅ | ✅ | | | | | | | |
| [Hosted VLLM (`hosted_vllm`)](https://docs.litellm.ai/docs/providers/vllm) | ✅ | ✅ | ✅ | | | | | | | |
| [Huggingface (`huggingface`)](https://docs.litellm.ai/docs/providers/huggingface) | ✅ | ✅ | ✅ | ✅ | | | | | | ✅ |
| [Hyperbolic (`hyperbolic`)](https://docs.litellm.ai/docs/providers/hyperbolic) | ✅ | ✅ | ✅ | | | | | | | |
| [IBM - Watsonx.ai (`watsonx`)](https://docs.litellm.ai/docs/providers/watsonx) | ✅ | ✅ | ✅ | ✅ | | | | | | |
| [Infinity (`infinity`)](https://docs.litellm.ai/docs/providers/infinity) | | | | ✅ | | | | | | |
| [Jina AI (`jina_ai`)](https://docs.litellm.ai/docs/providers/jina_ai) | | | | ✅ | | | | | | |
| [Lambda AI (`lambda_ai`)](https://docs.litellm.ai/docs/providers/lambda_ai) | ✅ | ✅ | ✅ | | | | | | | |
| [Lemonade (`lemonade`)](https://docs.litellm.ai/docs/providers/lemonade) | ✅ | ✅ | ✅ | | | | | | | |
| [LiteLLM Proxy (`litellm_proxy`)](https://docs.litellm.ai/docs/providers/litellm_proxy) | ✅ | ✅ | ✅ | ✅ | ✅ | | | | | |
| [Llamafile (`llamafile`)](https://docs.litellm.ai/docs/providers/llamafile) | ✅ | ✅ | ✅ | | | | | | | |
| [LM Studio (`lm_studio`)](https://docs.litellm.ai/docs/providers/lm_studio) | ✅ | ✅ | ✅ | | | | | | | |
| [Maritalk (`maritalk`)](https://docs.litellm.ai/docs/providers/maritalk) | ✅ | ✅ | ✅ | | | | | | | |
| [Meta - Llama API (`meta_llama`)](https://docs.litellm.ai/docs/providers/meta_llama) | ✅ | ✅ | ✅ | | | | | | | |
| [Mistral AI API (`mistral`)](https://docs.litellm.ai/docs/providers/mistral) | ✅ | ✅ | ✅ | ✅ | | | | | | |
| [Moonshot (`moonshot`)](https://docs.litellm.ai/docs/providers/moonshot) | ✅ | ✅ | ✅ | | | | | | | |
| [Morph (`morph`)](https://docs.litellm.ai/docs/providers/morph) | ✅ | ✅ | ✅ | | | | | | | |
| [Nebius AI Studio (`nebius`)](https://docs.litellm.ai/docs/providers/nebius) | ✅ | ✅ | ✅ | ✅ | | | | | | |
| [NLP Cloud (`nlp_cloud`)](https://docs.litellm.ai/docs/providers/nlp_cloud) | ✅ | ✅ | ✅ | | | | | | | |
| [Novita AI (`novita`)](https://novita.ai/models/llm?utm_source=github_litellm&utm_medium=github_readme&utm_campaign=github_link) | ✅ | ✅ | ✅ | | | | | | | |
| [Nscale (`nscale`)](https://docs.litellm.ai/docs/providers/nscale) | ✅ | ✅ | ✅ | | | | | | | |
| [Nvidia NIM (`nvidia_nim`)](https://docs.litellm.ai/docs/providers/nvidia_nim) | ✅ | ✅ | ✅ | | | | | | | |
| [OCI (`oci`)](https://docs.litellm.ai/docs/providers/oci) | ✅ | ✅ | ✅ | | | | | | | |
| [Ollama (`ollama`)](https://docs.litellm.ai/docs/providers/ollama) | ✅ | ✅ | ✅ | ✅ | | | | | | |
| [Ollama Chat (`ollama_chat`)](https://docs.litellm.ai/docs/providers/ollama) | ✅ | ✅ | ✅ | | | | | | | |
| [Oobabooga (`oobabooga`)](https://docs.litellm.ai/docs/providers/openai_compatible) | ✅ | ✅ | ✅ | | | ✅ | ✅ | ✅ | ✅ | |
| [OpenAI (`openai`)](https://docs.litellm.ai/docs/providers/openai) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| [OpenAI-like (`openai_like`)](https://docs.litellm.ai/docs/providers/openai_compatible) | | | | ✅ | | | | | | |
| [OpenRouter (`openrouter`)](https://docs.litellm.ai/docs/providers/openrouter) | ✅ | ✅ | ✅ | | | | | | | |
| [OVHCloud AI Endpoints (`ovhcloud`)](https://docs.litellm.ai/docs/providers/ovhcloud) | ✅ | ✅ | ✅ | | | | | | | |
| [Perplexity AI (`perplexity`)](https://docs.litellm.ai/docs/providers/perplexity) | ✅ | ✅ | ✅ | | | | | | | |
| [Petals (`petals`)](https://docs.litellm.ai/docs/providers/petals) | ✅ | ✅ | ✅ | | | | | | | |
| [Predibase (`predibase`)](https://docs.litellm.ai/docs/providers/predibase) | ✅ | ✅ | ✅ | | | | | | | |
| [Recraft (`recraft`)](https://docs.litellm.ai/docs/providers/recraft) | | | | | ✅ | | | | | |
| [Replicate (`replicate`)](https://docs.litellm.ai/docs/providers/replicate) | ✅ | ✅ | ✅ | | | | | | | |
| [Sagemaker Chat (`sagemaker_chat`)](https://docs.litellm.ai/docs/providers/aws_sagemaker) | ✅ | ✅ | ✅ | | | | | | | |
| [Sambanova (`sambanova`)](https://docs.litellm.ai/docs/providers/sambanova) | ✅ | ✅ | ✅ | | | | | | | |
| [Snowflake (`snowflake`)](https://docs.litellm.ai/docs/providers/snowflake) | ✅ | ✅ | ✅ | | | | | | | |
| [Text Completion Codestral (`text-completion-codestral`)](https://docs.litellm.ai/docs/providers/codestral) | ✅ | ✅ | ✅ | | | | | | | |
| [Text Completion OpenAI (`text-completion-openai`)](https://docs.litellm.ai/docs/providers/text_completion_openai) | ✅ | ✅ | ✅ | | | ✅ | ✅ | ✅ | ✅ | |
| [Together AI (`together_ai`)](https://docs.litellm.ai/docs/providers/togetherai) | ✅ | ✅ | ✅ | | | | | | | |
| [Topaz (`topaz`)](https://docs.litellm.ai/docs/providers/topaz) | ✅ | ✅ | ✅ | | | | | | | |
| [Triton (`triton`)](https://docs.litellm.ai/docs/providers/triton-inference-server) | ✅ | ✅ | ✅ | | | | | | | |
| [V0 (`v0`)](https://docs.litellm.ai/docs/providers/v0) | ✅ | ✅ | ✅ | | | | | | | |
| [Vercel AI Gateway (`vercel_ai_gateway`)](https://docs.litellm.ai/docs/providers/vercel_ai_gateway) | ✅ | ✅ | ✅ | | | | | | | |
| [VLLM (`vllm`)](https://docs.litellm.ai/docs/providers/vllm) | ✅ | ✅ | ✅ | | | | | | | |
| [Volcengine (`volcengine`)](https://docs.litellm.ai/docs/providers/volcano) | ✅ | ✅ | ✅ | | | | | | | |
| [Voyage AI (`voyage`)](https://docs.litellm.ai/docs/providers/voyage) | | | | ✅ | | | | | | |
| [WandB Inference (`wandb`)](https://docs.litellm.ai/docs/providers/wandb_inference) | ✅ | ✅ | ✅ | | | | | | | |
| [Watsonx Text (`watsonx_text`)](https://docs.litellm.ai/docs/providers/watsonx) | ✅ | ✅ | ✅ | | | | | | | |
| [xAI (`xai`)](https://docs.litellm.ai/docs/providers/xai) | ✅ | ✅ | ✅ | | | | | | | |
| [Xinference (`xinference`)](https://docs.litellm.ai/docs/providers/xinference) | | | | ✅ | | | | | | |

[**Read the Docs**](https://docs.litellm.ai/docs/)

## Running in Developer Mode

### Services

1. Set up a .env file in the project root
2. Run the dependency services: `docker-compose up db prometheus`

### Backend

1. (In the project root) create a virtual environment: `python -m venv .venv`
2. Activate it: `source .venv/bin/activate`
3. Install dependencies: `pip install -e ".[all]"`
4. `pip install prisma`
5. `prisma generate`
6. Start the proxy backend: `python litellm/proxy/proxy_cli.py`

### Frontend

1. Navigate to `ui/litellm-dashboard`
2. Install dependencies: `npm install`
3. Run `npm run dev` to start the dashboard

# Why we built this

- **Need for simplicity**: our code started getting extremely complicated managing and translating calls between Azure, OpenAI, and Cohere.
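Beyond the single-model `litellm --model gpt-4o` quick start, the proxy is usually driven by a config file passed as `litellm --config config.yaml`. A minimal sketch of that config (the model names, the Azure deployment name, and the environment variable names are placeholders; see the proxy docs for the full schema):

```yaml
model_list:
  - model_name: gpt-4o              # public name clients send in requests
    litellm_params:
      model: openai/gpt-4o          # provider/model LiteLLM actually calls
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gpt-4o              # same public name -> load balanced
    litellm_params:
      model: azure/my-gpt4o-deployment   # hypothetical Azure deployment name
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
```

Giving two deployments the same `model_name` is how the proxy load-balances and fails over between them; `os.environ/...` values are resolved from environment variables at startup so secrets stay out of the file.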