databrickslabs/dbldatagen

GitHub: databrickslabs/dbldatagen

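A synthetic data generation library designed for Databricks and Spark environments. It supports fast generation of large-scale, repeatable test datasets that keep relationships consistent across tables.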

Stars: 446 | Forks: 90

# Databricks Labs Data Generator (`dbldatagen`)

[Documentation](https://databrickslabs.github.io/dbldatagen/public_docs/index.html) | [Release Notes](CHANGELOG.md) | [Examples](examples) | [Tutorial](tutorial)

[![build](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/34a71edff9234252.svg)](https://github.com/databrickslabs/dbldatagen/actions?query=workflow%3Abuild+branch%3Amaster)
[![PyPi package](https://img.shields.io/pypi/v/dbldatagen?color=green)](https://pypi.org/project/dbldatagen/)
[![codecov](https://codecov.io/gh/databrickslabs/dbldatagen/branch/master/graph/badge.svg)](https://codecov.io/gh/databrickslabs/dbldatagen)
[![PyPi downloads](https://img.shields.io/pypi/dm/dbldatagen?label=PyPi%20Downloads)](https://pypistats.org/packages/dbldatagen)
[![lines of code](https://tokei.rs/b1/github/databrickslabs/dbldatagen)](https://github.com/databrickslabs/dbldatagen)