PostgreSQL— Benchmarking the Generation of Primary Key Values

Eric Hosick
The Opinionated Software Architect
7 min readSep 22, 2022

--

Gain insights into performance costs of generating primary key values for the following types: bigint (identity/bigserial), uuid v1, uuid v1, uuid v3, uuid v4, uuid v5, and timestamptz.

Photo by Zanardi.

The Opinionated TLDR

The Opinionated Architect’s TLDR:

General Findings (See uuid-ossp and pgcrypto libraries):

  • Hash UUID — Favor uuid_generate_v5(), which is 1.1x faster and uses SHA-1, over uuid_generate_v3() (which uses MD5).
  • Random UUID — Favor gen_random_uuid(), which is ~3.5x faster, over uuid_generate_v4() unless security is important (see RC5 and Fortuna).
  • Unique UUID (0% of clashing at the server level)— Favor uuid_generate_v1(), which is 1.7x faster, over uuid_generate_v1mc() unless security is important (uuid_generate_v1() exposes the mac Address of the machine and both expose uuid generation time).
  • Favor GENERATED IDENTITY, the SQL standard, over bigserial (see Don't Use Serial).
  • A PRIMARY KEY bigint column, with values generated using GENERATED IDENTITY, is ~3.0x faster than a PRIMARY KEY uuid column with values generated using gen_random_uuid(). When to choose a uuid over a bigint as the primary key for a table is beyond the scope of this article.
  • V1 UUIDs contain a timestamp. It’s still faster to call gen_random_uuid() + clock_timestamp() (~3.3x faster) than calling uuid_generate_v1().

NOTE:

  • Benchmarks are based on the function call times, ignoring other aspects of databases such as storage and indexing time. Be careful when making design decisions only based on the benchmarks provided.
  • These opinions are within the context of using PostgreSQL as your backend database. These opinions may not work with other persistence solutions.
  • All examples in this article use Idempotent SQL DDL.

This Article Is Not About

  • Selecting the correct uuid type varies based on the use case. See uuid-ossp and pgcrypto for more information about different uuid defaults.
  • We do not consider which is better to use as a primary key: a uuid or bigint.

Introduction

Measuring the performance of functions that take a tiny amount of time to execute can be difficult. Even on a dedicated server, we can see a significant difference in execution time. Check out PostgreSQL — Accurately Benchmarking Features Which Take Very Little Time to see how we tackled this issue.

Benchmarking and PostgreSQL

The data in this article came from running PostgreSQL 14.5 (Debian 14.5–1) in a Docker container with deploy.resources.cpus set to 3 and deploy.resources.memory set to 10G on a Macbook Pro M1 Max and 64GB ram.

Benchmarking Generating UUIDs

We've benchmarked the performance of different types of UUIDs from two libraries: uuid-ossp and pgcrypto. We only make comparisons between similar uuid types (trying for an apples-to-apples comparison).

You can verify the findings using primitive_benchmark_01.sql.

left     -  right     | Times  | left (micro-sec)| right (micro-sec)
--------------------------------------------------------------------
gran v1 - gran v1 | - | 1.1696 ± 0.0031 | 1.1680 ± 0.0021
gran v1 - uuid v1 | 3.5x | 1.1685 ± 0.0024 | 4.0586 ± 0.0038
gran v1 - uuid v4 | 2.8x | 1.1763 ± 0.0023 | 3.3236 ± 0.0054
uuid v1 - uuid v1mc | 1.7x | 4.0589 ± 0.0058 | 7.0557 ± 0.0048
uuid v5 - uuid v3 | 1.1x | 0.7567 ± 0.0020 | 0.6893 ± 0.0023

Note:

  • Benchmarks truncated (upper/lower 10th percentile).
  • Bold/Italic items had the fastest execution time. If neither is bolded, then execution times are considered equivalent.
  • gran v1 — gen_random_uuid(), uuid v1 uuid_generate_v1(), uuid v1mc — uuid_generate_v1(), uuid v3 — uuid_generate_v3(), uuid v4 — uuid_generate_v4(), uuid v5 — uuid_generate_v5()
  • Benchmarks for uuid_generate_v3() and uuid_generate_v5() are challenging to calculate as we need to vary the name parameter: accomplished by passing RANDOM()::text to the name parameter.

UUID Benchmark Results

UUID types are different enough (see uuid-ossp and pgcrypto) that we only compared similar uuid types.

  • Hashing (uuid_generate_v3(), uuid_generate_v5()) — Favor uuid_generate_v5(), which is 1.1x faster and uses SHA-1, over uuid_generate_v3() (which uses MD5). SHA-1 is considered more secure than MD5; however, both are vulnerable to dictionary attacks.
  • Random UUID (uuid_generate_v4(), gen_random_uuid()) — Generates a random uuid with a very small chance of collision (also see Are UUIDs Really Unique). For random UUIDs, favor gen_random_uuid(), which is ~3.5x faster, over uuid_generate_v4(). These functions use a different approach to generating random numbers. Delve deeper into the decision if security is important. See "What Is Name and Namespace" for further details.
  • Unique UUID (uuid_generate_v1(), uuid_generate_v1mc()) — Generates a unique UUID, that has 0% of clashing at the server level, using your computer's timestamp and MAC address (v1mc generates a random MAC address). If security is important, this might not be the right choice as a v1 uuid exposes the MAC Address and uuid generation time.

Benchmarking GENERATED IDENTITY (bigserial)

We favor GENERATED IDENTITY over bigserial because GENERATED IDENTITY is the SQL standard (see Don't Use Serial). As such, all performance measurements use GENERATED IDENTITY.

Benchmarking GENERATED IDENTITY requires us to move from benchmarking using SELECT to benchmarking that involves tables and indexes. For index type, we use PostgreSQL's b-tree index type (the default index type for a PRIMARY KEY). We will benchmark brin vs. b-tree indexes in another article.

Since we need to use indexes, we also want to test for variability based on the number of rows involved when inserting values. Within the left/right description column in the table below, we provide the number of rows for each benchmark (1e1 up to 9e5).

You can verify the findings using primitive_benchmark_02.sql.

left     -  right     | Times  | left (micro-sec)| right (micro-sec)
--------------------------------------------------------------------
rand v1 - bigint 1e1 | - | 18.675 ± 15.148 | 14.035 ± 7.7434
rand v1 - bigint 1e2 | 1.80x | 2.7130 ± 0.2107 | 1.5095 ± 0.1515
rand v1 - bigint 1e3 | 2.34x | 2.1507 ± 0.0414 | 0.9196 ± 0.0301
rand v1 - bigint 1e4 | 2.44x | 2.1271 ± 0.0290 | 0.8712 ± 0.0151
rand v1 - bigint 1e5 | 2.41x | 2.3914 ± 0.0328 | 0.9928 ± 0.0328
rand v1 - bigint 2e5 | 2.55x | 2.5007 ± 0.0293 | 0.9796 ± 0.0152
rand v1 - bigint 3e5 | 2.69x | 2.5603 ± 0.0289 | 0.9515 ± 0.0113
rand v1 - bigint 4e5 | 2.80x | 2.6448 ± 0.0253 | 0.9439 ± 0.0095
rand v1 - bigint 5e5 | 2.85x | 2.6752 ± 0.0284 | 0.9386 ± 0.0153
rand v1 - bigint 6e5 | 2.89x | 2.7230 ± 0.0220 | 0.9417 ± 0.0171
rand v1 - bigint 7e5 | 2.97x | 2.7687 ± 0.0159 | 0.9334 ± 0.0082
rand v1 - bigint 8e5 | 2.98x | 2.8064 ± 0.0193 | 0.9423 ± 0.0143
rand v1 - bigint 9e5 | 3.01x | 2.8344 ± 0.0201 | 0.9414 ± 0.0119

Note:

  • Benchmarks truncated (upper/lower 10th percentile).
  • Bold/Italic items had the fastest execution time. If neither is bolded, then execution times are considered equivalent.

UUID and IDENTITY Benchmark Results

For storing a value in a unique b-tree index, generating a bigint value is ~3x faster than generating a random uuid value.

The function uuid_generate_v4() generates UUIDs in a sorted order, which should make inserting them into an index faster. However, we found that gen_random_uuid() is faster by ~1.8x (see gen_random_uuid-vs-uuid_generate_v4.sql).

Comparing gen_random_uuid() and uuid_generate_v1()

Trying to compare gen_random_uuid() with uuid_generate_v1() is not an apples-to-apples comparison: these two UUID’s have different intent.

Further, generating a uuid_generate_v1() also provides a timestamp of uuid generation time. When architecting, we could leverage the creation time of a v1 uuid removing the need for a created_at column.

We’ll benchmark calling uuid_generate_v1() with gen_random_uuid() + clock_timestamp(). You can verify the findings using gen_random_uuid-vs-uuid_generate_v1.sql.

left     -  right     | Times  | left (micro-sec)| right (micro-sec)
--------------------------------------------------------------------
uuid v1 - rand+clkts | 2.88x | 22.7550 ± 7.261 | 7.9050 ± 2.4958
uuid v1 - rand+clkts | 3.33x | 8.5635 ± 1.3582 | 2.5705 ± 0.4494
uuid v1 - rand+clkts | 3.38x | 4.2949 ± 0.7240 | 1.2693 ± 0.1632
uuid v1 - rand+clkts | 3.32x | 3.9737 ± 0.0390 | 1.1970 ± 0.0209
uuid v1 - rand+clkts | 3.32x | 4.0693 ± 0.0287 | 1.2266 ± 0.0073
uuid v1 - rand+clkts | 3.30x | 4.0958 ± 0.0307 | 1.2413 ± 0.0055
uuid v1 - rand+clkts | 3.30x | 4.0934 ± 0.0247 | 1.2399 ± 0.0086
uuid v1 - rand+clkts | 3.31x | 4.0892 ± 0.0157 | 1.2372 ± 0.0079
uuid v1 - rand+clkts | 3.31x | 4.0852 ± 0.0190 | 1.2328 ± 0.0069
uuid v1 - rand+clkts | 3.31x | 4.0873 ± 0.0118 | 1.2346 ± 0.0047
uuid v1 - rand+clkts | 3.31x | 4.0861 ± 0.0125 | 1.2340 ± 0.0050
uuid v1 - rand+clkts | 3.31x | 4.0845 ± 0.0156 | 1.2326 ± 0.0045
uuid v1 - rand+clkts | 3.31x | 4.0787 ± 0.0091 | 1.2315 ± 0.0041

Note:

  • Benchmarks truncated (upper/lower 10th percentile).
  • Bold/Italic items had the fastest execution time. If neither is bolded, then execution times are considered equivalent.
  • uuid v1 — uuid_generate_v1(), rand+clkts — gen_random_uuid() + clock_timestamp()

gen_random_uuid() and uuid_generate_v1() Benchmark Results

Prior benchmarking shows that gen_random_uuid( ) is ~3.5x faster than uuid_generate_v1(). Above benchmarking shows that gen_random_uuid() + clock_timestamp() ~3.3x faster than uuid_generate_v1(). Only considering the time to execute functions, it is still faster to use gen_random_uuid() over uuid_generate_v1() even considering that uuid_generate_v1() also provides a timestamp.

Note that we have not included the overhead of saving the information via an INSERT.

Conclusion

In this article, we've taken the time to benchmark functions that generate UUIDs. Further, we did a quick deep dive into using bigint vs. a uuid as a primary key index.

--

--

Eric Hosick
The Opinionated Software Architect

Creator, entrepreneur, software architect, software engineer, lecturer, and technologist.