Ts-Pytham/TiktokExplode

GitHub: Ts-Pytham/TiktokExplode

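TiktokExplode is a .NET library for programmatically fetching TikTok video metadata and downloading content while working around the platform's anti-scraping measures.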

Stars: 2 | Forks: 0

# TiktokExplode [![NuGet](https://img.shields.io/nuget/v/TiktokExplode.svg?label=TiktokExplode)](https://www.nuget.org/packages/TiktokExplode) [![NuGet](https://img.shields.io/nuget/v/TiktokExplode.Infrastructure.svg?label=TiktokExplode.Infrastructure)](https://www.nuget.org/packages/TiktokExplode.Infrastructure) [![NuGet](https://img.shields.io/nuget/v/TiktokExplode.All.svg?label=TiktokExplode.All)](https://www.nuget.org/packages/TiktokExplode.All) [![NuGet](https://img.shields.io/nuget/v/TiktokExplode.Extensions.DependencyInjection.svg?label=TiktokExplode.Extensions.DependencyInjection)](https://www.nuget.org/packages/TiktokExplode.Extensions.DependencyInjection) [![.NET](https://img.shields.io/badge/.NET-8.0%20%7C%209.0-512BD4)](https://dotnet.microsoft.com) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)


**TiktokExplode** is a .NET library that lets you programmatically fetch metadata and download videos from TikTok. It handles session management, cookie injection, and WAF/bot-detection evasion behind a simple, clean API, so you can focus on using the data instead of fighting the platform.

The library follows **Clean Architecture**: the domain layer (`TiktokExplode`) has zero external dependencies and exposes immutable, strongly typed models, while the infrastructure layer (`TiktokExplode.Infrastructure`) handles all HTTP and browser concerns.

## Features

- Fetch complete video metadata: author, statistics, duration, language, location, bitrate, and more
- Download videos with or without a watermark
- Download the static cover image (JPEG) or the animated cover image (WebP)
- Report progress during downloads through `IProgress<T>`
- Automatic retry on WAF/bot detection, with a configurable backoff
- **Strategy pattern**: choose between Playwright (reliable) and plain HTTP (lightweight) page fetching
- Clean Architecture: a pure domain layer with zero external dependencies
- Targets **net8.0** and **net9.0**

## Installation

**Recommended: install both packages with one command:**

```
dotnet add package TiktokExplode.All
```

**Or install the infrastructure package directly:**

```
dotnet add package TiktokExplode.Infrastructure
```

**Using Microsoft.Extensions.DependencyInjection?**

```
dotnet add package TiktokExplode.Extensions.DependencyInjection
```

## Quick start

```
using TiktokExplode.Infrastructure.Clients;
using TiktokExplode.Infrastructure.Common;

await using var client = TiktokClient.CreateWithBrowser();

var video = await client.GetVideoAsync("https://www.tiktok.com/@user/video/1234567890");

Console.WriteLine($"ID: {video.Id}");
Console.WriteLine($"Author: {video.Author.Name} (@{video.Author.UniqueId})");
Console.WriteLine($"Duration: {video.Duration.Seconds}s");
Console.WriteLine($"Views: {video.Stats.Views}");
Console.WriteLine($"Likes: {video.Stats.Likes}");

// Download without watermark (stream + content length)
await using var streamInfo = await client.DownloadAsync(video);
await using var file = File.Create($"{video.Id}.mp4");
await streamInfo.Stream.CopyToAsync(file);

// Or use the extension to download directly to a file with progress.
// The type argument was lost in the source; double is assumed because the value is formatted with P0.
IProgress<double> progress = new Progress<double>(p => Console.Write($"\rProgress: {p:P0}"));
await client.DownloadAsync(video, $"{video.Id}.mp4", progress);
```

## API reference

### `TiktokClient`

`TiktokClient` uses the **strategy pattern** to decouple page fetching from downloading. Pick a strategy through the factory methods:

```
// Each statement below is an alternative; use one per scope.

// Playwright: uses a real browser to bypass the WAF (recommended)
await using var client = TiktokClient.CreateWithBrowser();

// Playwright with custom options
await using var client = TiktokClient.CreateWithBrowser(
    new PlaywrightFetcherOptions { BrowserChannel = "msedge", Headless = true },
    new TikTokOptions { MaxWafRetries = 5 });

// HTTP-only: lightweight, may be blocked by the WAF
await using var client = TiktokClient.CreateWithHttp();

// Inject your own IPageFetcher implementation
await using var client = new TiktokClient(myFetcher, new TikTokOptions());
```

#### Methods

| Method | Returns | Description |
| --- | --- | --- |
| `GetVideoAsync(string url, CancellationToken)` | `Video` | Fetches the complete video metadata |
| `DownloadAsync(Video, CancellationToken)` | `StreamInfo` | Downloads the video without a watermark |
| `DownloadWatermarkedAsync(Video, CancellationToken)` | `StreamInfo` | Downloads the video with a watermark |

`TiktokClient` implements `IAsyncDisposable`, so always use `await using`.

#### Extension methods (via `TiktokClientExtensions`)

| Method | Description |
| --- | --- |
| `DownloadAsync(video, filePath, progress?, ct)` | Downloads the video without a watermark to a file, with optional progress reporting |
| `DownloadWatermarkedAsync(video, filePath, progress?, ct)` | Downloads the watermarked video to a file, with optional progress reporting |
| `DownloadImageAsync(video, filePath, ct)` | Downloads the static cover image (JPEG) |
| `DownloadAnimatedImageAsync(video, filePath, ct)` | Downloads the animated cover image (WebP) |

```
// Download to file path with optional progress
await client.DownloadAsync(video, "output.mp4", progress, cancellationToken);
await client.DownloadWatermarkedAsync(video, "output_wm.mp4", progress, cancellationToken);

// Download cover images
await client.DownloadImageAsync(video, "cover.jpg");
await client.DownloadAnimatedImageAsync(video, "cover.webp");
```

`ContentLength` is taken from the CDN response headers, so it is the exact size rather than an estimate derived from metadata.

### `StreamInfo`

Returned by `DownloadAsync` and `DownloadWatermarkedAsync`. Implements `IAsyncDisposable`.

| Property | Type | Description |
| --- | --- | --- |
| `Stream` | `Stream` | Video content stream |
| `ContentLength` | `long` | Exact file size in bytes, as reported by the CDN |

### `TikTokOptions`

| Property | Default | Description |
| --- | --- | --- |
| `MaxWafRetries` | `3` | Maximum number of retries when WAF detection is triggered |
| `RetryBaseDelay` | `2s` | Base delay between retries (grows linearly) |

### `PlaywrightFetcherOptions`

| Property | Default | Description |
| --- | --- | --- |
| `BrowserChannel` | `null` | Browser channel (e.g. `"msedge"`, `"chrome"`). `null` uses Playwright's bundled Chromium |
| `Headless` | `true` | Runs the browser in headless mode |
| `PageTimeoutMs` | `30000` | Navigation timeout in milliseconds |

### `HttpFetcherOptions`

| Property | Default | Description |
| --- | --- | --- |
| `UserAgent` | Chrome 136 UA | User-Agent header sent with requests |
| `WarmupDelay` | `1200ms` | Delay between the warm-up request and the page fetch |

### `Video` model

| Property | Type | Description |
| --- | --- | --- |
| `Id` | `string` | TikTok video ID |
| `Description` | `string` | Title / description |
| `Author` | `Author` | Author entity |
| `Duration` | `VideoDuration` | Duration in whole seconds, plus the precise value in seconds |
| `Stats` | `VideoStats` | Views, likes, comments, shares, saves, reposts |
| `Info` | `VideoInfo` | Technical information, bitrate, and download URLs |
| `Language` | `VideoLanguage` | Detected content language |
| `Location` | `string` | Location tag, if any |
| `Cover` | `VideoCover` | Static (JPEG) and animated (WebP) cover image URLs |
| `CreatedAt` | `DateTimeOffset` | Upload date |

### `Author` model

| Property | Type | Description |
| --- | --- | --- |
| `Id` | `string` | TikTok internal user ID |
| `UniqueId` | `string` | Account handle (e.g. `johndoe`) |
| `Name` | `string` | Display name |
| `Description` | `string` | Profile bio |
| `IsVerified` | `bool` | Verified flag |
| `IsPrivate` | `bool` | Private account |
| `Avatar` | `ProfileImageVariants` | Avatar image URLs (small, medium, large) |
| `Stats` | `AuthorStats` | Followers, following, friends, likes received, video count |
| `CreatedAt` | `DateTimeOffset` | Account creation date |

## Dependency injection

`TiktokExplode.Extensions.DependencyInjection` provides a fluent `AddTiktokExplode()` extension method that registers all TiktokExplode services in the .NET DI container.

```
// Default: Playwright fetcher, all defaults
services.AddTiktokExplode();

// Custom: Playwright with a visible browser window
services.AddTiktokExplode(b => b
    .UsePlaywrightFetcher(o => o.Headless = false));

// HTTP fetcher: lighter, no browser dependency
services.AddTiktokExplode(b => b
    .UseHttpFetcher(o => o.WarmupDelay = TimeSpan.Zero)
    .ConfigureTiktok(o => o.MaxWafRetries = 5));
```

Registered services:

| Service | Implementation | Lifetime |
| --- | --- | --- |
| `IVideoClient` | `TiktokClient` | Singleton |
| `IPageFetcher` | `PlaywrightFetcher` or `HttpFetcher` | Singleton |
| `TikTokOptions` | n/a | Singleton |
| `PlaywrightFetcherOptions` or `HttpFetcherOptions` | n/a | Singleton |

```
// Consume in your services via constructor injection
public class MyService(IVideoClient client)
{
    public async Task<string> GetTitleAsync(string url)
    {
        var video = await client.GetVideoAsync(url);
        return video.Description;
    }
}
```

## Error handling

```
using TiktokExplode.Domain.Exceptions;

try
{
    var video = await client.GetVideoAsync(url);
}
catch (TiktokWafException ex)
{
    // Bot detection triggered after all retries exhausted
}
catch (VideoNotFoundException ex)
{
    // Video does not exist or is private
}
catch (TiktokParsingException ex)
{
    // Unexpected page structure (TikTok changed their HTML/JSON)
}
catch (TiktokException ex)
{
    // Base exception: catch-all for library errors
}
```

## Project structure

```
TiktokExplode/                                  # Domain: zero external dependencies
  Domain/
    Entities/                                   # Video, Author
    ValueObjects/                               # VideoInfo, StreamInfo, VideoStats, VideoDuration, etc.
    Abstractions/                               # IVideoClient
    Exceptions/                                 # TiktokException hierarchy
    Utilities/                                  # URL validation
TiktokExplode.Infrastructure/                   # HTTP + browser automation (Playwright + AngleSharp)
  Clients/                                      # TiktokClient : IVideoClient
  Fetchers/                                     # IPageFetcher, PlaywrightFetcher, HttpFetcher
  Http/                                         # TikTokDownloadClient: CDN download management
  Browser/                                      # TikTokBrowser: internal Playwright wrapper
  Parsers/                                      # TikTokVideoParser: JSON extraction from hydration script
  Options/                                      # TikTokOptions, PlaywrightFetcherOptions, HttpFetcherOptions
  Common/                                       # StreamExtensions, TiktokClientExtensions
TiktokExplode.All/                              # Meta-package: installs both packages above in one command
TiktokExplode.Extensions.DependencyInjection/   # AddTiktokExplode() for Microsoft.Extensions.DI
```

## Acknowledgements

This library is heavily inspired by [**YoutubeExplode**](https://github.com/Tyrrrz/YoutubeExplode) by [Tyrrrz](https://github.com/Tyrrrz). His work showed me what a well-designed, clean .NET library for a media platform should look like. Without that reference, TiktokExplode would not exist. Thank you.

## License

MIT. See [LICENSE](LICENSE) for details.
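
## Example: linear retry backoff

The `TikTokOptions` table describes `RetryBaseDelay` as a base delay that grows linearly across up to `MaxWafRetries` retries. The sketch below shows one plausible reading of that behaviour, where retry number n waits n times the base delay. It is not taken from the library's source: `LinearBackoff`, `DelayFor`, `RunAsync`, and the `isTransient` predicate are hypothetical names introduced for illustration, and the exact formula TiktokExplode uses is an assumption.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, NOT part of TiktokExplode: illustrates linear backoff only.
public static class LinearBackoff
{
    // Delay before retry number `attempt` (1-based): baseDelay * attempt (assumed formula).
    public static TimeSpan DelayFor(int attempt, TimeSpan baseDelay) =>
        TimeSpan.FromTicks(baseDelay.Ticks * attempt);

    // Runs `action` once, then retries up to `maxRetries` times while `isTransient` accepts the error.
    public static async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> action,
        Func<Exception, bool> isTransient,
        int maxRetries,
        TimeSpan baseDelay,
        CancellationToken ct = default)
    {
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                return await action(ct);
            }
            catch (Exception ex) when (attempt < maxRetries && isTransient(ex))
            {
                // Wait 1x, 2x, 3x... the base delay before the next try.
                await Task.Delay(DelayFor(attempt + 1, baseDelay), ct);
            }
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        var calls = 0;

        // Simulated operation: fails twice with a TimeoutException, then succeeds.
        var result = await LinearBackoff.RunAsync(
            _ => ++calls < 3
                ? Task.FromException<string>(new TimeoutException("simulated block"))
                : Task.FromResult("ok"),
            ex => ex is TimeoutException,
            maxRetries: 3,
            baseDelay: TimeSpan.FromMilliseconds(10));

        Console.WriteLine($"{result} after {calls} calls"); // prints: ok after 3 calls
    }
}
```

With the documented defaults (`MaxWafRetries = 3`, `RetryBaseDelay = 2s`), this reading would wait 2 s, 4 s, and 6 s before giving up. Once the retries are exhausted the last exception propagates to the caller, which matches the README's description of `TiktokWafException` being raised after all retries fail.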