huggingface/hf-hub

GitHub: huggingface/hf-hub

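The official Rust client for the Hugging Face Hub. It covers model and dataset repository management, file upload and download, and version control, with both asynchronous and synchronous calling modes.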

Stars: 311 | Forks: 124

# hf-hub

A Rust client for the [Hugging Face Hub API](https://huggingface.co/docs/hub/api).

`hf-hub` provides a typed, ergonomic interface for interacting with the Hugging Face Hub from Rust. It is the Rust counterpart of the Python [`huggingface_hub`](https://github.com/huggingface/huggingface_hub) library.

It ships both an **async** interface (`HFClient`, enabled by default) and a **sync** interface (`HFClientSync`, enabled through the `blocking` feature). The methods of the two interfaces mirror each other.

## Features

- **Repository operations**: query model, dataset, and space metadata; create, delete, update, and move repositories
- **File operations**: upload files and folders, download files, list the repository tree, check whether a file exists
- **Commit operations**: create commits containing multiple file operations, list commit history, view diffs between revisions
- **Branch and tag management**: create and delete branches and tags, list refs
- **User and organization info**: whoami, user profiles, organization details, followers
- **Streaming pagination**: async list endpoints return an `impl Stream` of `Result` items for lazy, memory-efficient iteration; their sync counterparts collect into a `Vec`
- **Bucket operations**: create, delete, list, and move buckets; upload, download, and delete files inside a bucket
- **Xet high-performance transfer**: support for Hugging Face's Xet storage backend
- **Async or sync**: use `HFClient` inside your own tokio runtime, or use `HFClientSync` for synchronous callers (requires the `blocking` feature)

## Installation

Add the crate to your `Cargo.toml`:

```
[dependencies]
hf-hub = "1.0.0"
```

To use the sync interface, enable the `blocking` feature:

```
[dependencies]
hf-hub = { version = "1.0.0", features = ["blocking"] }
```

## CLI Installation

The `hfrs` command-line tool provides a terminal interface to the Hub. Install it with:

```
cargo install --git https://github.com/huggingface/hf-hub.git hfrs
```

This builds in release mode by default. Once installed, run `hfrs --help` to see the available commands.

## Quick Start

### Async

```
use hf_hub::HFClient;
use hf_hub::repository::RepoInfo;

#[tokio::main]
async fn main() -> hf_hub::HFResult<()> {
    let client = HFClient::new()?;

    // Get model info
    let RepoInfo::Model(info) = client
        .model("openai-community", "gpt2")
        .info()
        .send()
        .await?
    else {
        unreachable!("handle type guarantees the Model variant");
    };

    println!("Model: {} (downloads: {:?})", info.id, info.downloads);
    Ok(())
}
```

### Sync

Requires the `blocking` feature. `HFClientSync` manages a dedicated tokio runtime internally, so callers do not need to provide one.

```
use hf_hub::HFClientSync;
use hf_hub::repository::RepoInfo;

fn main() -> hf_hub::HFResult<()> {
    let client = HFClientSync::new()?;

    // Same call chain as the async version, without `.await`
    let RepoInfo::Model(info) = client
        .model("openai-community", "gpt2")
        .info()
        .send()?
    else {
        unreachable!("handle type guarantees the Model variant");
    };

    println!("Model: {} (downloads: {:?})", info.id, info.downloads);
    Ok(())
}
```

The methods of the sync handles (`HFClientSync`, `HFRepositorySync`, `HFSpaceSync`, `HFBucketSync`) map one-to-one onto their async counterparts. See the `blocking_*` examples in the `examples/` directory for runnable programs.

## Usage Examples

### List models by author

```
use futures::StreamExt;
use hf_hub::HFClient;

#[tokio::main]
async fn main() -> hf_hub::HFResult<()> {
    let client = HFClient::new()?;

    let stream = client
        .list_models()
        .author("meta-llama")
        .limit(5_usize)
        .send()?;

    // Pin the stream so it can be polled with `next()`
    futures::pin_mut!(stream);
    while let Some(model) = stream.next().await {
        // Each item is a `Result`, so errors surface per page or entry
        let model = model?;
        println!("{}", model.id);
    }

    Ok(())
}
```

### Work with a repository handle

```
use hf_hub::HFClient;
use hf_hub::repository::RepoInfo;

#[tokio::main]
async fn main() -> hf_hub::HFResult<()> {
    let client = HFClient::new()?;
    let repo = client.model("openai-community", "gpt2");

    let RepoInfo::Model(model_info) = repo.info().send().await? else {
        println!("error, not a model");
        return Ok(());
    };
    println!("Model: {}", model_info.id);

    // Check for a single file without downloading it
    let exists = repo
        .file_exists()
        .filename("config.json")
        .send()
        .await?;
    println!("config.json exists: {exists}");

    Ok(())
}
```

### Download a file

```
use std::path::PathBuf;

use hf_hub::HFClient;

#[tokio::main]
async fn main() -> hf_hub::HFResult<()> {
    let client = HFClient::new()?;
    let repo = client.model("openai-community", "gpt2");

    // Download `config.json` into a local directory
    let path = repo
        .download_file()
        .filename("config.json")
        .local_dir(PathBuf::from("/tmp/hf-downloads"))
        .send()
        .await?;
    println!("Downloaded to: {}", path.display());

    Ok(())
}
```

### Upload a file

```
use hf_hub::HFClient;
use hf_hub::repository::AddSource;

#[tokio::main]
async fn main() -> hf_hub::HFResult<()> {
    let client = HFClient::new()?;
    let repo = client.model("your-username", "your-repo");

    // Upload in-memory bytes as `greeting.txt` in a single commit
    let commit = repo
        .upload_file()
        .source(AddSource::Bytes(b"Hello, world!".to_vec()))
        .path_in_repo("greeting.txt")
        .commit_message("Add greeting file")
        .send()
        .await?;
    println!("Committed: {:?}", commit.oid);

    Ok(())
}
```

### Create a repository

```
use hf_hub::HFClient;

#[tokio::main]
async fn main() -> hf_hub::HFResult<()> {
    let client = HFClient::new()?;

    let url = client
        .create_repo()
        .repo_id("your-username/new-model")
        .private(true)
        .exist_ok(true)
        .send()
        .await?;
    println!("Repository URL: {}", url.url);

    Ok(())
}
```

## Authentication

The client resolves the authentication token in the following order:

1. A token provided explicitly through `HFClientBuilder::token()`
2. The `HF_TOKEN` environment variable
3. The token file at the path given by `HF_TOKEN_PATH`
4. The default token file at `~/.cache/huggingface/token`

Set `HF_HUB_DISABLE_IMPLICIT_TOKEN` to any non-empty value to disable automatic token resolution.

## Configuration

| Environment variable            | Description                                             |
|---------------------------------|---------------------------------------------------------|
| `HF_ENDPOINT`                   | Hub API endpoint (default: `https://huggingface.co`)    |
| `HF_TOKEN`                      | Authentication token                                    |
| `HF_TOKEN_PATH`                 | Path to the token file                                  |
| `HF_HOME`                       | Cache directory root (default: `~/.cache/huggingface`)  |
| `HF_HUB_DISABLE_IMPLICIT_TOKEN` | Disable automatic token loading                         |
| `HF_HUB_USER_AGENT_ORIGIN`      | Custom User-Agent origin string                         |

## Error Handling

All fallible operations return a `Result`. The `HFError` enum provides structured variants for common failure cases:

- `HFError::AuthRequired`: 401 response, token missing or invalid
- `HFError::RepoNotFound`: the repository does not exist or is not accessible
- `HFError::BucketNotFound`: the bucket does not exist or is not accessible
- `HFError::EntryNotFound`: the file or path does not exist in the repository or bucket
- `HFError::RevisionNotFound`: the branch, tag, or commit does not exist
- `HFError::Forbidden`: 403 response, insufficient permissions
- `HFError::Conflict`: 409 response, the resource already exists or conflicts
- `HFError::RateLimited`: 429 response, too many requests
- `HFError::Http`: any other HTTP error, carrying the status code, URL, and response body

## License

Apache-2.0
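
## Appendix: token resolution order, sketched

The lookup order listed under Authentication can be restated as a small standalone function. This is only an illustrative sketch using the standard library. `TokenSource` and `resolve_token_source` are hypothetical names that are not part of `hf-hub`, and the crate's real implementation may differ. The sketch also makes two assumptions the README does not state: an explicitly supplied token is still honored when `HF_HUB_DISABLE_IMPLICIT_TOKEN` is set, and empty environment variables are treated as unset.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Outcome of the lookup: a token value, a file to read it from, or nothing.
/// Hypothetical type for illustration; not part of `hf-hub`.
#[derive(Debug, PartialEq)]
enum TokenSource {
    /// Token passed explicitly (for example through a builder).
    Explicit(String),
    /// Token taken from the `HF_TOKEN` environment variable.
    EnvVar(String),
    /// Token should be read from this file.
    File(PathBuf),
    /// Implicit lookup is disabled and no explicit token was given.
    Disabled,
}

/// Walks the documented resolution order against a snapshot of the
/// environment. Hypothetical helper; the real client may behave differently.
fn resolve_token_source(
    explicit: Option<&str>,
    env: &HashMap<&str, &str>,
    home_dir: &Path,
) -> TokenSource {
    // Assumption of this sketch: empty variables count as unset.
    let var = |name: &str| env.get(name).copied().filter(|v| !v.is_empty());

    // 1. An explicitly provided token wins.
    if let Some(token) = explicit {
        return TokenSource::Explicit(token.to_string());
    }

    // Any non-empty value switches off the implicit sources below.
    if var("HF_HUB_DISABLE_IMPLICIT_TOKEN").is_some() {
        return TokenSource::Disabled;
    }

    // 2. The `HF_TOKEN` environment variable.
    if let Some(token) = var("HF_TOKEN") {
        return TokenSource::EnvVar(token.to_string());
    }

    // 3. A token file at the path named by `HF_TOKEN_PATH`.
    if let Some(path) = var("HF_TOKEN_PATH") {
        return TokenSource::File(PathBuf::from(path));
    }

    // 4. The default token file under the home directory.
    TokenSource::File(home_dir.join(".cache/huggingface/token"))
}
```

Passing the environment as a map instead of reading `std::env` directly keeps the function deterministic. A caller could build the map from `std::env::vars()` and then read the file named by `TokenSource::File` itself.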

Tags: AI model management, API client, Hugging Face, Rust, SOC Prime, visual interface, developer tools, file transfer, network traffic auditing, notification system