timescale/timescaledb

GitHub: timescale/timescaledb

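A time-series database packaged as a PostgreSQL extension. It keeps full SQL compatibility while providing automatic partitioning, columnstore compression, and continuous aggregates for high-performance real-time analytics.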

Stars: 22902 | Forks: 1108

TimescaleDB is a PostgreSQL extension for high-performance real-time analytics on time-series and event data.

[![Docs](https://img.shields.io/badge/Read_the_docs-black?style=for-the-badge&logo=readthedocs&logoColor=white)](https://docs.tigerdata.com/) [![SLACK](https://img.shields.io/badge/Ask_the_community-black?style=for-the-badge&logo=slack&logoColor=white)](https://timescaledb.slack.com/archives/C4GT3N90X) [![Try TimescaleDB for free](https://img.shields.io/badge/Try_Tiger_Cloud_for_free-black?style=for-the-badge&logo=timescale&logoColor=white)](https://console.cloud.timescale.com/signup)

## TimescaleDB quickstart

Get started with TimescaleDB in under 10 minutes. This guide helps you run TimescaleDB locally, create your first hypertable with columnstore enabled, write data to the columnstore, and see instant analytical query performance.

### What you'll learn

- How to run TimescaleDB with a one-line install or a Docker command
- How to create a hypertable with columnstore enabled
- How to insert data directly into the columnstore
- How to run analytical queries

### Prerequisites

- Docker installed on your machine
- 8 GB of RAM recommended
- The `psql` client (included with PostgreSQL) or any other PostgreSQL client, such as [pgAdmin](https://www.pgadmin.org/download/)

### Step 1: Start TimescaleDB

You have two options for starting TimescaleDB:

#### Option 1: One-line install (recommended)

The easiest way to get started:

**Linux/Mac:**

```
curl -sL https://tsdb.co/start-local | sh
```

This command:

- Downloads and starts TimescaleDB (if it is not already downloaded)
- Exposes PostgreSQL on port **6543** (a non-standard port, to avoid conflicts with other PostgreSQL instances on port 5432)
- Automatically tunes settings for your environment using timescaledb-tune
- Sets up a persistent data volume

#### Option 2: Manual Docker command (also works on Windows)

Alternatively, you can run TimescaleDB directly with Docker:

```
docker run -d --name timescaledb \
    -p 6543:5432 \
    -e POSTGRES_PASSWORD=password \
    timescale/timescaledb-ha:pg18
```

**Note:** We use port **6543** (mapped to container port 5432) to avoid conflicts with other PostgreSQL instances you may be running on the standard port 5432.

Wait about 1-2 minutes for TimescaleDB to download and initialize.

### Step 2: Connect to TimescaleDB

Connect using `psql`:

```
psql -h localhost -p 6543 -U postgres
# When prompted, enter the password: password
```

You should see the PostgreSQL prompt. Verify that TimescaleDB is installed:

```
SELECT extname, extversion FROM pg_extension WHERE extname = 'timescaledb';
```

Expected output:

```
   extname   | extversion
-------------+------------
 timescaledb | 2.x.x
```

**Prefer a GUI?** If you would rather use a graphical tool than the command line, download [pgAdmin](https://www.pgadmin.org/download/) and connect to TimescaleDB with the same connection details (host: `localhost`, port: `6543`, user: `postgres`, password: `password`).

### Step 3: Create your first hypertable

Let's create a hypertable with columnstore enabled for IoT sensor data:

```
-- Create a hypertable with automatic columnstore
CREATE TABLE sensor_data (
    time TIMESTAMPTZ NOT NULL,
    sensor_id TEXT NOT NULL,
    temperature DOUBLE PRECISION,
    humidity DOUBLE PRECISION,
    pressure DOUBLE PRECISION
) WITH (
    tsdb.hypertable
);

-- Create an index for per-sensor lookups ordered by most recent time
CREATE INDEX idx_sensor_id_time ON sensor_data(sensor_id, time DESC);
```

- `tsdb.hypertable` - Converts the table into a TimescaleDB hypertable

Learn more:

- [About hypertables](https://docs.tigerdata.com/use-timescale/latest/hypertables/)
- [API reference](https://docs.tigerdata.com/api/latest/hypertable/)
- [About columnstore](https://docs.tigerdata.com/use-timescale/latest/compression/about-compression/)
- [Enable columnstore manually](https://docs.tigerdata.com/use-timescale/latest/compression/manual-compression/)
- [API reference](https://docs.tigerdata.com/api/latest/compression/)

### Step 4: Insert sample data

Let's add some sample sensor readings:

```
-- Enable timing to see how long each query takes
\timing on

-- Insert sample data for multiple sensors.
-- Setting timescaledb.enable_direct_compress_insert = on inserts data
-- directly into the columnstore (columnar format for performance).
SET timescaledb.enable_direct_compress_insert = on;

INSERT INTO sensor_data (time, sensor_id, temperature, humidity, pressure)
SELECT
    time,
    'sensor_' || ((random() * 9)::int + 1),
    20 + (random() * 15),
    40 + (random() * 30),
    1000 + (random() * 50)
FROM generate_series(
    NOW() - INTERVAL '90 days',
    NOW(),
    INTERVAL '1 seconds'
) AS time;

-- Once the data is in the columnstore, optimize its order and structure.
-- This compacts and orders the data in each chunk for optimal query
-- performance and compression.
DO $$
DECLARE ch TEXT;
BEGIN
    FOR ch IN SELECT show_chunks('sensor_data') LOOP
        CALL convert_to_columnstore(ch, recompress := true);
    END LOOP;
END $$;
```

This generates about 7,776,001 readings (one per second over the past 90 days), spread across 10 sensors. Verify that the data was inserted:

```
SELECT COUNT(*) FROM sensor_data;
```

### Step 5: Run your first analytical queries

Now let's run some analytical queries that showcase TimescaleDB's performance:

```
-- Enable query timing to see performance
\timing on

-- Query 1: Average readings per sensor over the last 7 days
SELECT
    sensor_id,
    COUNT(*) AS readings,
    ROUND(AVG(temperature)::numeric, 2) AS avg_temp,
    ROUND(AVG(humidity)::numeric, 2) AS avg_humidity,
    ROUND(AVG(pressure)::numeric, 2) AS avg_pressure
FROM sensor_data
WHERE time > NOW() - INTERVAL '7 days'
GROUP BY sensor_id
ORDER BY sensor_id;

-- Query 2: Hourly averages using time_bucket
-- Time buckets let you aggregate data in hypertables by time interval
-- and calculate summary values.
SELECT
    time_bucket('1 hour', time) AS hour,
    sensor_id,
    ROUND(AVG(temperature)::numeric, 2) AS avg_temp,
    ROUND(AVG(humidity)::numeric, 2) AS avg_humidity
FROM sensor_data
WHERE time > NOW() - INTERVAL '24 hours'
GROUP BY hour, sensor_id
ORDER BY hour DESC, sensor_id
LIMIT 20;

-- Query 3: Daily statistics across all sensors
SELECT
    time_bucket('1 day', time) AS day,
    COUNT(*) AS total_readings,
    ROUND(AVG(temperature)::numeric, 2) AS avg_temp,
    ROUND(MIN(temperature)::numeric, 2) AS min_temp,
    ROUND(MAX(temperature)::numeric, 2) AS max_temp
FROM sensor_data
GROUP BY day
ORDER BY day DESC
LIMIT 10;

-- Query 4: Latest reading for each sensor
-- Highlights the value of SkipScan: the query executes in under 100 ms;
-- without SkipScan it takes over 5 seconds.
SELECT DISTINCT ON (sensor_id)
    sensor_id,
    time,
    ROUND(temperature::numeric, 2) AS temperature,
    ROUND(humidity::numeric, 2) AS humidity,
    ROUND(pressure::numeric, 2) AS pressure
FROM sensor_data
ORDER BY sensor_id, time DESC;
```

Notice how fast these analytical queries run, even when aggregating over millions of rows. This is the power of the TimescaleDB columnstore.

### What's happening behind the scenes?

TimescaleDB automatically:

- **Partitions your data** into time-based chunks for efficient querying
- **Writes directly to the columnstore**, using columnar storage (typically 90%+ compression) and faster vectorized queries
- **Optimizes queries** by scanning only the relevant time ranges and columns
- **Enables time_bucket()** - a powerful function for time-series aggregation

Learn more:

- [Query data](https://docs.tigerdata.com/use-timescale/latest/query-data/)
- [Write data](https://docs.tigerdata.com/use-timescale/latest/write-data/)
- [About time buckets](https://docs.tigerdata.com/use-timescale/latest/time-buckets/about-time-buckets/)
- [API reference](https://docs.tigerdata.com/api/latest/hyperfunctions/time_bucket/)
- [All TimescaleDB features](https://docs.tigerdata.com/use-timescale/latest/)

### Next steps

Now that you have the basics, there is more to explore:

### Create continuous aggregates

Continuous aggregates make real-time analytics on very large datasets run faster. They refresh a query continuously and incrementally in the background, so when you run the query, only the data that has changed needs to be computed, not the entire dataset. This is what sets them apart from regular PostgreSQL [materialized views](https://www.postgresql.org/docs/current/rules-materializedviews.html), which cannot be materialized incrementally and must be rebuilt from scratch on every refresh.

Let's create a continuous aggregate for hourly sensor statistics:

#### Step 1: Create the continuous aggregate

```
CREATE MATERIALIZED VIEW sensor_data_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS hour,
    sensor_id,
    AVG(temperature) AS avg_temp,
    AVG(humidity) AS avg_humidity,
    AVG(pressure) AS avg_pressure,
    MIN(temperature) AS min_temp,
    MAX(temperature) AS max_temp,
    COUNT(*) AS reading_count
FROM sensor_data
GROUP BY hour, sensor_id;
```

This creates a materialized view that pre-aggregates your sensor data into hourly time buckets. The view is automatically populated with the existing data.

#### Step 2: Add a refresh policy

To keep the continuous aggregate up to date as new data arrives, add a refresh policy:

```
SELECT add_continuous_aggregate_policy(
    'sensor_data_hourly',
    start_offset => INTERVAL '3 hours',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour'
);
```

This policy:

- Refreshes the continuous aggregate every hour
- Processes data from 3 hours ago up to 1 hour ago (leaving the most recent hour for real-time queries)
- Processes only new or changed data, incrementally

#### Step 3: Query the continuous aggregate

Now you can query the pre-aggregated data for faster results:

```
-- Get hourly averages for the last 24 hours
SELECT
    hour,
    sensor_id,
    ROUND(avg_temp::numeric, 2) AS avg_temp,
    ROUND(avg_humidity::numeric, 2) AS avg_humidity,
    reading_count
FROM sensor_data_hourly
WHERE hour > NOW() - INTERVAL '24 hours'
ORDER BY hour DESC, sensor_id
LIMIT 50;
```

#### Benefits of continuous aggregates

- **Faster queries**: Pre-aggregated data means queries complete in milliseconds rather than seconds
- **Incremental refresh**: Only new or changed data is processed, not the entire dataset
- **Automatic updates**: The refresh policy keeps your aggregates current without manual intervention
- **Real-time option**: You can enable real-time aggregation to combine materialized and raw data

#### Try it yourself

Compare the performance difference:

```
-- Query the raw hypertable (slower on large datasets)
\timing on
SELECT
    time_bucket('1 hour', time) AS hour,
    AVG(temperature) AS avg_temp
FROM sensor_data
WHERE time > NOW() - INTERVAL '60 days'
GROUP BY hour
ORDER BY hour DESC
LIMIT 24;

-- Query the continuous aggregate (much faster)
SELECT
    hour,
    avg_temp
FROM sensor_data_hourly
WHERE hour > NOW() - INTERVAL '60 days'
ORDER BY hour DESC
LIMIT 24;
```

Notice how much faster the continuous aggregate query is, especially as your dataset grows!

Learn more:

- [About continuous aggregates](https://docs.tigerdata.com/use-timescale/latest/continuous-aggregates/)
- [API reference](https://docs.tigerdata.com/api/latest/continuous-aggregates/create_materialized_view/)
- [TimescaleDB documentation](https://docs.timescale.com)
- [Time-series best practices](https://docs.timescale.com/use-timescale/latest/schema-management/)
- [Continuous aggregates](https://docs.timescale.com/use-timescale/latest/continuous-aggregates/)

## Examples

Learn TimescaleDB through complete, standalone examples that use real-world datasets. Each example includes sample data and analytical queries.

- **[NYC taxi data](docs/getting-started/nyc-taxi/)** - Transportation and location-based analytics
- **[Financial market data](docs/getting-started/financial-ticks/)** - Trading and market data analysis
- **[Application events](docs/getting-started/events-uuidv7/)** - Event logging with UUIDv7

Or try some of our workshops:

- **[AI workshop: EV charging station analysis](https://github.com/timescale/TigerData-Workshops/tree/main/AI-Workshop)** - Combine PostgreSQL with AI capabilities to manage and analyze EV charging station data
- **[Time-series workshop: Financial data analysis](https://github.com/timescale/TigerData-Workshops/tree/main/TimeSeries-Workshop-Finance/)** - Work with cryptocurrency tick data and create candlestick charts

## Want TimescaleDB hosted and managed for you? Try Tiger Cloud

[Tiger Cloud](https://docs.tigerdata.com/getting-started/latest/) is the modern PostgreSQL data platform for all your applications. It enhances PostgreSQL to handle time series, events, real-time analytics, and vector search, all in a single database alongside transactional workloads. You get one system that handles live data ingestion, late and out-of-order updates, and low-latency queries, with the performance, reliability, and scalability your application needs. Tiger Cloud is well suited to IoT, crypto, finance, SaaS, and many other domains, and lets you build data-heavy, mission-critical applications while retaining the familiarity and reliability of PostgreSQL.

See [our whitepaper](https://docs.tigerdata.com/about/latest/whitepaper/) for a deep dive into Tiger Cloud's architecture and how it meets the needs of even the most demanding applications.

A Tiger Cloud service is an optimized, 100% PostgreSQL database instance that you can use as is or extend to fit your specific business needs. The available capabilities are:

- **Time series and analytics**: PostgreSQL with TimescaleDB. The PostgreSQL you know and love, enhanced with features for storing and querying time-series data at scale for real-time analytics and other use cases. Get faster time-based queries with hypertables, continuous aggregates, and columnar storage. Save on storage with native compression, data retention policies, and tiering of data to Amazon S3.
- **AI and vector**: PostgreSQL with vector extensions. Use PostgreSQL as a vector database, with purpose-built extensions for building AI applications from start to scale. Get fast, accurate similarity search with the pgvector and pgvectorscale extensions. Create vector embeddings and run LLM inference on your data with the pgai extension.
- **PostgreSQL**: The trusted industry-standard RDBMS. Ideal for applications that need strong data consistency, complex relationships, and advanced querying capabilities. Get ACID compliance, extensive SQL support, JSON handling, and extensibility through custom functions, data types, and extensions.

All services include the production cloud tooling you would expect: [automatic backups](https://docs.tigerdata.com/use-timescale/latest/backup-restore/backup-restore-cloud/), [high availability](https://docs.tigerdata.com/use-timescale/latest/ha-replicas/), [read replicas](https://docs.tigerdata.com/use-timescale/latest/ha-replicas/read-scaling/), [data forking](https://docs.tigerdata.com/use-timescale/latest/services/service-management/#fork-a-service), [connection pooling](https://docs.tigerdata.com/use-timescale/latest/services/connection-pooling/), [tiered storage](https://docs.tigerdata.com/use-timescale/latest/data-tiering/), [usage-based storage](https://docs.tigerdata.com/about/latest/pricing-and-account-management/), and more.

## Check build status

|Linux/macOS|Linux i386|Windows|Coverity|Code coverage|OpenSSF|
|:---:|:---:|:---:|:---:|:---:|:---:|
|[![Linux/macOS build status](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/797215d56e180446.svg)](https://github.com/timescale/timescaledb/actions/workflows/linux-build-and-test.yaml?query=workflow%3ARegression+branch%3Amain+event%3Aschedule)|[![Linux i386 build status](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/d0f09dbbef180447.svg)](https://github.com/timescale/timescaledb/actions/workflows/linux-32bit-build-and-test.yaml?query=workflow%3ARegression+branch%3Amain+event%3Aschedule)|[![Windows build status](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/ce641521b0180448.svg)](https://github.com/timescale/timescaledb/actions/workflows/windows-build-and-test.yaml?query=workflow%3ARegression+branch%3Amain+event%3Aschedule)|[![Coverity scan build status](https://scan.coverity.com/projects/timescale-timescaledb/badge.svg)](https://scan.coverity.com/projects/timescale-timescaledb)|[![Code coverage](https://codecov.io/gh/timescale/timescaledb/branch/main/graphs/badge.svg?branch=main)](https://codecov.io/gh/timescale/timescaledb)|[![OpenSSF best practices](https://www.bestpractices.dev/projects/8012/badge)](https://www.bestpractices.dev/projects/8012)|

## Get involved

We welcome contributions to TimescaleDB! For details, see [Contributing](https://github.com/timescale/timescaledb/blob/main/CONTRIBUTING.md) and the [Code style guide](https://github.com/timescale/timescaledb/blob/main/docs/StyleGuide.md).

## Learn about Tiger Data

Tiger Data is the fastest PostgreSQL for transactional, analytical, and agentic workloads. To learn more about the company and its products, visit [tigerdata.com](https://www.tigerdata.com).

## Troubleshooting

#### Docker container won't start

```
# Check whether the container is running
docker ps -a

# View the container logs (use the matching container name)
# For the one-line install:
docker logs timescaledb-ha-pg18-quickstart
# For the manual Docker command:
docker logs timescaledb

# Stop and remove the existing container
# For the one-line install:
docker stop timescaledb-ha-pg18-quickstart && docker rm timescaledb-ha-pg18-quickstart
# For the manual Docker command:
docker stop timescaledb && docker rm timescaledb

# Start fresh
# Option 1: Use the one-line install
curl -sL https://tsdb.co/start-local | sh
# Option 2: Use the manual Docker command
docker run -d --name timescaledb -p 6543:5432 -e POSTGRES_PASSWORD=password timescale/timescaledb-ha:pg18
```

#### Can't connect with psql

- Verify that the Docker container is running: `docker ps`
- Check that port 6543 is not already in use: `lsof -i :6543`
- Try an explicit host: `psql -h 127.0.0.1 -p 6543 -U postgres`

#### TimescaleDB extension not found

The `timescale/timescaledb-ha:pg18` image has TimescaleDB pre-installed and pre-loaded. If you see errors, make sure you are using the correct image.

## Clean up

When you have finished experimenting:

#### If you used the one-line install:

```
# Stop the container
docker stop timescaledb-ha-pg18-quickstart

# Remove the container
docker rm timescaledb-ha-pg18-quickstart

# Remove the persistent data volume
docker volume rm timescaledb_data

# (Optional) Remove the Docker image
docker rmi timescale/timescaledb-ha:pg18
```

#### If you used the manual Docker command:

```
# Stop the container
docker stop timescaledb

# Remove the container
docker rm timescaledb

# (Optional) Remove the Docker image
docker rmi timescale/timescaledb-ha:pg18
```

**Note:** If you created a named volume with the manual Docker command, you can remove it by passing the volume's name to `docker volume rm`.
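
If you rerun the cleanup, `docker stop` and `docker rm` report an error once the container is already gone. The sketch below shows one way to make the step safe to repeat. It is an illustration, not part of the official quickstart: the function name `remove_container` is hypothetical, and it relies only on `docker ps -a`, `docker stop`, and `docker rm`.

```shell
#!/bin/sh
# Hypothetical helper (not part of TimescaleDB or Docker):
# stop and remove a container only if a container with that exact name exists.
remove_container() {
    name="$1"
    # `docker ps -a` lists running and stopped containers;
    # --format '{{.Names}}' prints one container name per line.
    # grep -F: fixed string, -x: match the whole line, -q: no output.
    if docker ps -a --format '{{.Names}}' | grep -Fqx -- "$name"; then
        docker stop "$name" && docker rm "$name"
    else
        echo "container $name not found, nothing to do"
    fi
}
```

For example, `remove_container timescaledb-ha-pg18-quickstart` covers the one-line install and `remove_container timescaledb` covers the manual Docker command. Removing the data volume and the image is left as a separate, deliberate step because it deletes data.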

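Tags: Docker, Hypertable, PostgreSQL extension, SQL, TimescaleDB, TSDB, event data, cloud database, columnar storage, multithreading, big data, security defense assessment, real-time analytics, client-side encryption, open source, data compression, database, time-series data, time-series database, test cases, IoT, monitoring, directory scanning, system auditing, request interception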