elitan/postgres-nanoid

GitHub: elitan/postgres-nanoid

A PostgreSQL function that generates secure, URL-friendly unique identifiers with prefixes, directly at the database layer.

Stars: 68 | Forks: 0

# PostgreSQL Nanoid

Secure, URL-friendly unique identifiers for PostgreSQL. Simple, fast, and works everywhere.

## Installation

Installation SQL (copy-paste ready):

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

DROP FUNCTION IF EXISTS nanoid CASCADE;
DROP FUNCTION IF EXISTS nanoid_optimized CASCADE;

-- Helper function for random generation
CREATE OR REPLACE FUNCTION nanoid_optimized(size int, alphabet text, mask int, step int)
    RETURNS text
    LANGUAGE plpgsql
    VOLATILE PARALLEL SAFE
AS $$
DECLARE
    idBuilder text := '';
    counter int := 0;
    bytes bytea;
    alphabetIndex int;
    alphabetArray text[];
    alphabetLength int;
BEGIN
    alphabetArray := regexp_split_to_array(alphabet, '');
    alphabetLength := array_length(alphabetArray, 1);
    LOOP
        bytes := gen_random_bytes(step);
        FOR counter IN 0..step - 1 LOOP
            alphabetIndex := (get_byte(bytes, counter) & mask) + 1;
            IF alphabetIndex <= alphabetLength THEN
                idBuilder := idBuilder || alphabetArray[alphabetIndex];
                IF length(idBuilder) = size THEN
                    RETURN idBuilder;
                END IF;
            END IF;
        END LOOP;
    END LOOP;
END
$$;

-- Main nanoid function - secure random IDs
CREATE OR REPLACE FUNCTION nanoid(
    prefix text DEFAULT '',
    size int DEFAULT 21,
    alphabet text DEFAULT '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',
    additionalBytesFactor float DEFAULT 1.02
)
    RETURNS text
    LANGUAGE plpgsql
    VOLATILE PARALLEL SAFE
AS $$
DECLARE
    random_size int;
    random_part text;
    finalId text;
    alphabetLength int;
    mask int;
    step int;
BEGIN
    IF size IS NULL OR size < 1 THEN
        RAISE EXCEPTION 'The size must be defined and greater than 0!';
    END IF;
    IF alphabet IS NULL OR length(alphabet) < 2 OR length(alphabet) > 255 THEN
        RAISE EXCEPTION 'The alphabet must be between 2 and 255 symbols!';
    END IF;
    IF additionalBytesFactor IS NULL OR additionalBytesFactor < 1 THEN
        RAISE EXCEPTION 'The additional bytes factor can''t be less than 1!';
    END IF;

    random_size := size - length(prefix);
    IF random_size < 1 THEN
        RAISE EXCEPTION 'The size must be larger than the prefix length! Need at least % characters.', length(prefix) + 1;
    END IF;

    alphabetLength := length(alphabet);
    mask := (2 << cast(floor(log(alphabetLength - 1) / log(2)) AS int)) - 1;
    step := cast(ceil(additionalBytesFactor * mask * random_size / alphabetLength) AS int);
    IF step > 1024 THEN
        step := 1024;
    END IF;

    random_part := nanoid_optimized(random_size, alphabet, mask, step);
    finalId := prefix || random_part;
    RETURN finalId;
END
$$;
```
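The main function derives two values before calling the helper: `mask`, the smallest all-ones bit pattern that covers every alphabet index, and `step`, the number of random bytes fetched per batch. The query below reproduces that arithmetic for the default 62-character alphabet and a 17-character random part (size 21 minus a 4-character prefix). The CTE and column names are illustrative only and not part of the project.

```sql
-- Reproduce the mask/step arithmetic from nanoid() for one set of inputs.
WITH params AS (
    SELECT 62          AS alphabet_length,  -- length of the default alphabet
           17          AS random_size,      -- size 21 minus the 4-char prefix
           1.02::float AS factor            -- default additionalBytesFactor
),
derived AS (
    SELECT *,
           (2 << cast(floor(log(alphabet_length - 1) / log(2)) AS int)) - 1 AS mask
    FROM params
)
SELECT mask,
       cast(ceil(factor * mask * random_size / alphabet_length) AS int) AS step
FROM derived;
```

With these inputs the query should return `mask = 63` and `step = 18`. Each random byte is masked to the range 0-63, the two values that fall outside the 62-character alphabet are discarded, and the small surplus in `step` means a single batch of bytes is usually enough.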

**Works with all Postgres providers:**

- AWS RDS, Google Cloud SQL, Azure Database, and others
- Self-hosted Postgres (v12+)
- Requires the `pgcrypto` extension (supported by most managed providers)

## Quick Start

```sql
-- Generate IDs with prefixes (sample outputs; every call returns a new random value)
SELECT nanoid('cus_');   -- e.g. cus_V1StGXR8kZ5jdHi6B
SELECT nanoid('ord_');   -- e.g. ord_K3JwF9HgNxP2mQrTy
SELECT nanoid('user_');  -- e.g. user_9LrfQXpAwB3mHkSt

-- Use in tables
CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    public_id TEXT NOT NULL UNIQUE DEFAULT nanoid('cus_'),
    name TEXT NOT NULL
);
```

## Why Nanoids

| Feature | Auto-increment | UUID | Nanoid |
| ------------------- | ----------------- | ------------- | ------------ |
| **Security** | No (exposes counts) | Yes | Yes |
| **Length** | Variable | 36 characters | 21 characters |
| **URL-friendly** | Yes | Verbose (36 characters with hyphens) | Yes |
| **Distributed-ready** | No | Yes | Yes |
| **Performance** | Fast | Slower | Fast |

## Performance

```sql
SELECT nanoid('ord_') FROM generate_series(1, 100000);
-- ~0.9s, roughly 110,000 IDs/sec
```

- Fast generation (100K+ IDs/sec)
- Memory efficient
- No coordination needed in distributed systems

## Usage

### Basic Examples

```sql
-- Default (21 chars)
SELECT nanoid();             -- e.g. V1StGXR8kZ5jdHi6BQmyT

-- With prefix (the prefix counts toward the 21 characters)
SELECT nanoid('user_');      -- e.g. user_V1StGXR8kZ5jdHi6
SELECT nanoid('ord_');       -- e.g. ord_K3JwF9HgNxP2mQrTy

-- Custom size
SELECT nanoid('cus_', 25);   -- e.g. cus_V1StGXR8kZ5jdHi6BQmyT

-- Custom alphabet (hex-only)
SELECT nanoid('tx_', 16, '0123456789abcdef');  -- e.g. tx_a3f9d2c1b8e47
```

### Production Tables

```sql
CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    public_id TEXT NOT NULL UNIQUE DEFAULT nanoid('cus_'),
    name TEXT NOT NULL,
    CHECK (public_id ~ '^cus_[0-9a-zA-Z]{17}$')
);

CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    public_id TEXT NOT NULL UNIQUE DEFAULT nanoid('ord_'),
    customer_id TEXT REFERENCES customers(public_id),
    amount DECIMAL(10,2)
);
```

**Size calculation:** the default size of 21 with the prefix `cus_` (4 characters) leaves 17 random characters.

**Tips:**

- A `UNIQUE` constraint on `public_id` is sufficient. It already creates an index, so no additional index is needed
- A CHECK constraint with a regular expression is fast and can be used to validate the prefix

### Bulk Generation

```sql
-- Assumes a products table with public_id and name columns
WITH batch_ids AS (
    SELECT nanoid('item_') AS id,
           'Product ' || n AS name
    FROM generate_series(1, 100000) AS n
)
INSERT INTO products (public_id, name)
SELECT id, name FROM batch_ids;
-- ~1 second for 100k IDs
```

### Parameters

- `prefix` (text, default `''`) - string prepended to the ID
- `size` (int, default `21`) - total length, including the prefix
- `alphabet` (text, default `0-9a-zA-Z`) - 62 URL-safe characters; the length must be between 2 and 255 characters
- `additionalBytesFactor` (float, default `1.02`) - multiplier used when sizing each batch of random bytes; must be at least 1

### Custom Alphabets

```sql
-- Hex-only IDs
SELECT nanoid('tx_', 16, '0123456789abcdef');  -- e.g. tx_a3f9d2c1b8e47

-- Numbers-only (not recommended - less entropy)
SELECT nanoid('ref_', 12, '0123456789');       -- e.g. ref_84739205

-- URL-safe base64
SELECT nanoid('tok_', 32, '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_');
```

## Time-Sorted IDs (Advanced)

If you need lexicographic time ordering (audit logs, event streams), you can use `nanoid_sortable()`. It embeds a timestamp in the ID, which **exposes creation time and business activity patterns**. Use it only when necessary.
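To make that exposure concrete, the query below splits one sample sortable ID into its three parts, assuming the layout the function produces with the default alphabet: the prefix, then an 8-character encoded timestamp, then the random tail. It is only an illustration and does not require the functions to be installed; the column names are made up for this example.

```sql
-- Sample sortable ID; layout: prefix | encoded timestamp | random part
SELECT substring(id FROM 1 FOR 4) AS prefix,        -- 'log_'
       substring(id FROM 5 FOR 8) AS encoded_time,  -- '0uQzNrIE'
       substring(id FROM 13)      AS random_part    -- 'g13LGTj4c'
FROM (VALUES ('log_0uQzNrIEg13LGTj4c')) AS sample(id);
```

Anyone who sees the ID can read the middle segment, which is why the timestamp should be treated as public information.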

Installation SQL for sortable IDs:

```sql
-- Add to your existing installation
DROP FUNCTION IF EXISTS nanoid_sortable CASCADE;
DROP FUNCTION IF EXISTS nanoid_extract_timestamp CASCADE;

CREATE OR REPLACE FUNCTION nanoid_sortable(
    prefix text DEFAULT '',
    size int DEFAULT 21,
    alphabet text DEFAULT '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',
    additionalBytesFactor float DEFAULT 1.02
)
    RETURNS text
    LANGUAGE plpgsql
    VOLATILE PARALLEL SAFE
AS $$
DECLARE
    timestamp_ms bigint;
    timestamp_encoded text := '';
    remainder int;
    temp_ts bigint;
    random_size int;
    random_part text;
    finalId text;
    alphabetArray text[];
    alphabetLength int;
    mask int;
    step int;
BEGIN
    IF size IS NULL OR size < 1 THEN
        RAISE EXCEPTION 'The size must be defined and greater than 0!';
    END IF;
    IF alphabet IS NULL OR length(alphabet) < 2 OR length(alphabet) > 255 THEN
        RAISE EXCEPTION 'The alphabet must be between 2 and 255 symbols!';
    END IF;
    IF additionalBytesFactor IS NULL OR additionalBytesFactor < 1 THEN
        RAISE EXCEPTION 'The additional bytes factor can''t be less than 1!';
    END IF;

    -- Encode the current time (milliseconds since epoch) using the alphabet as digits
    timestamp_ms := extract(epoch from clock_timestamp()) * 1000;
    alphabetArray := regexp_split_to_array(alphabet, '');
    alphabetLength := array_length(alphabetArray, 1);
    temp_ts := timestamp_ms;
    IF temp_ts = 0 THEN
        timestamp_encoded := alphabetArray[1];
    ELSE
        WHILE temp_ts > 0 LOOP
            remainder := temp_ts % alphabetLength;
            timestamp_encoded := alphabetArray[remainder + 1] || timestamp_encoded;
            temp_ts := temp_ts / alphabetLength;
        END LOOP;
    END IF;

    -- Left-pad the encoded timestamp to 8 characters
    WHILE length(timestamp_encoded) < 8 LOOP
        timestamp_encoded := alphabetArray[1] || timestamp_encoded;
    END LOOP;

    random_size := size - length(prefix) - 8;
    IF random_size < 1 THEN
        RAISE EXCEPTION 'The size including prefix and timestamp must leave room for random component! Need at least % characters.', length(prefix) + 9;
    END IF;

    mask := (2 << cast(floor(log(alphabetLength - 1) / log(2)) AS int)) - 1;
    step := cast(ceil(additionalBytesFactor * mask * random_size / alphabetLength) AS int);
    IF step > 1024 THEN
        step := 1024;
    END IF;

    random_part := nanoid_optimized(random_size, alphabet, mask, step);
    finalId := prefix || timestamp_encoded || random_part;
    RETURN finalId;
END
$$;

CREATE OR REPLACE FUNCTION nanoid_extract_timestamp(
    nanoid_value text,
    prefix_length int DEFAULT 0,
    alphabet text DEFAULT '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
)
    RETURNS timestamp
    LANGUAGE plpgsql
    IMMUTABLE PARALLEL SAFE
AS $$
DECLARE
    timestamp_encoded text;
    timestamp_ms bigint := 0;
    alphabetArray text[];
    alphabetLength int;
    char_pos int;
    i int;
BEGIN
    timestamp_encoded := substring(nanoid_value, prefix_length + 1, 8);
    alphabetArray := regexp_split_to_array(alphabet, '');
    alphabetLength := array_length(alphabetArray, 1);

    FOR i IN 1..length(timestamp_encoded) LOOP
        char_pos := array_position(alphabetArray, substring(timestamp_encoded, i, 1));
        IF char_pos IS NULL THEN
            RAISE EXCEPTION 'Invalid character in timestamp: %', substring(timestamp_encoded, i, 1);
        END IF;
        timestamp_ms := timestamp_ms * alphabetLength + (char_pos - 1);
    END LOOP;

    RETURN to_timestamp(timestamp_ms / 1000.0);
EXCEPTION
    WHEN OTHERS THEN
        RAISE EXCEPTION 'Invalid nanoid format or timestamp extraction failed: %', SQLERRM;
END
$$;
```
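String ordering only matches creation order when the column's collation orders characters the same way the alphabet does. The default alphabet lists lowercase before uppercase, whereas byte order (`COLLATE "C"`) puts uppercase first, so ordering can diverge once an encoded digit rolls over from `z` to `A`. It may be worth verifying this in your own database. The sketch below assumes the functions above are installed; the `events_demo` table and the ASCII-ordered alphabet are illustrative choices, not part of the project.

```sql
-- Hypothetical demo table. The alphabet is given in ASCII order (0-9, A-Z, a-z)
-- and the column uses the "C" collation so that string order follows byte order.
CREATE TEMP TABLE events_demo (
    event_id TEXT COLLATE "C" PRIMARY KEY
        DEFAULT nanoid_sortable(
            'evt_', 21,
            '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'),
    created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);

INSERT INTO events_demo DEFAULT VALUES;
SELECT pg_sleep(0.01);  -- make sure the two rows land in different milliseconds
INSERT INTO events_demo DEFAULT VALUES;

-- Both queries should list the rows in the same sequence
SELECT event_id, created_at FROM events_demo ORDER BY event_id;
SELECT event_id, created_at FROM events_demo ORDER BY created_at;
```

If a custom alphabet is used this way, the same alphabet has to be passed as the third argument of `nanoid_extract_timestamp()` when decoding.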

**Usage:**

```sql
-- Time-sorted IDs (8 chars timestamp + 9 chars random for size 21 with 4-char prefix)
SELECT nanoid_sortable('log_');  -- e.g. log_0uQzNrIEg13LGTj4c
SELECT nanoid_sortable('evt_');  -- e.g. evt_0uQzNrIEutvmf1aS4

-- Extract timestamp
SELECT nanoid_extract_timestamp('log_0uQzNrIBqK9ayvN1T', 4);
-- 2025-07-10 19:15:31.169 (with the session TimeZone set to UTC)

-- Use in tables
CREATE TABLE audit_logs (
    id SERIAL PRIMARY KEY,
    event_id TEXT NOT NULL UNIQUE DEFAULT nanoid_sortable('log_'),
    message TEXT
);
```

**Trade-offs:**

- **Pros:** lexicographic time ordering without a separate timestamp column
- **Cons:** exposes creation time and business activity patterns
- **Use case:** internal audit logs where privacy requirements are low

## Development

```bash
# Clone and test with Docker
git clone https://github.com/elitan/postgres-nanoid
cd postgres-nanoid
make up && make test-all  # Start + run tests
make psql                 # Connect and try functions
```
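Once `make psql` has connected, a few quick checks along these lines can confirm that the functions behave as documented. This is an informal sanity check written for this guide, not part of the project's test suite, and the column aliases are made up; each column should come back `true`.

```sql
-- Informal smoke test for the installed functions
SELECT length(nanoid()) = 21                              AS default_length_ok,
       nanoid('cus_') ~ '^cus_[0-9a-zA-Z]{17}$'           AS prefix_and_alphabet_ok,
       length(nanoid('tx_', 16, '0123456789abcdef')) = 16 AS custom_size_ok;
```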

Tags: NanoID, PostgreSQL, unique identifiers, database extension, test cases, random number generation