Laurenz Albe: Bulk Load Performance In PostgreSQL

There are several techniques to bulk load data into PostgreSQL. I decided to compare their performance in a simple test case. I’ll add some recommendations for parameter settings to improve the performance even more.

Contents hide

1 An example table to bulk load data

2 Different methods for bulk loading

2.1 Single INSERTs in autocommit mode (the fool’s way)

2.2 Single INSERTs in one transaction

2.3 Single INSERTs with a prepared statement in one transaction

2.4 Multi-line INSERTs in one transaction

2.5 Multi-line INSERTs with a prepared statement in one transaction

2.6 Bulk load with COPY (the king’s way)

3 Performance comparison of the bulk load methods

4 Discussion of the bulk load performance test results

5 Tuning PostgreSQL for bulk load

An example table to bulk load data

The table is simple enough:

CREATE TABLE instest (
   id    bigint PRIMARY KEY,
   value text   NOT NULL
);

It is a narrow table (only two columns), but it has a primary key index. Loading data would be much faster without the index, but in real life you cannot always drop all indexes and constraints before loading data. Moreover, dropping indexes before the bulk load and re-creating them afterwards can be slower, if the table already contains data.

For the performance test, I’m going to load 10 million rows that look like (1, '1'), (2, '2') and so on, counting up to 10000000. The test will be performed on PostgreSQL v16 with the default configuration. I am aware that the default configuration is not perfect for bulk loading, but that does not bother me, since I am only interested in comparing the different methods.

Different methods for bulk loading

I’ll try the following six techniques:

Single `INSERT`s in autocommit mode (the fool’s way)

This will be terribly slow, since each statement will run in its own transaction. For each transaction, PostgreSQL has to write the WAL (the transaction log) out to disk, which leads to 10 million I/O requests.

Naturally, this is not the correct thing to do for a bulk load. But it is the way a transactional application loads data into the database, with many clients inserting small amounts of data, and no way to bundle the individual requests into bigger transactions. There are some remedies available to boost performance in such a case:

Set synchronous_commit to off. That will boost performance amazingly, but an operating system crash could lead to some committed transactions getting lost.
Set commit_delay to a value greater than 0 and tune commit_siblings. That can reduce the number of I/O requests. The effect won’t be as marked as with synchronous_commit, but you can never lose a committed transaction.

Single `INSERT`s in one transaction

The only difference to the previous test is that we will insert all 10 million rows in a single transaction. That is bound to boost performance quite a lot.

Single `INSERT`s with a prepared statement in one transaction

The difference to the previous test is that we use a prepared statement:

PREPARE stmt(bigint,text) AS
INSERT INTO instest (id, value) VALUES ($1, $2);

and perform the inserts like this:

EXECUTE stmt(1, '1');

The performance should be better, because PostgreSQL can reuse the execution plan for the INSERT statement rather than planning it 10 million times. With short statements, that can be a notable performance improvement.

Multi-line `INSERT`s in one transaction

You can insert several rows with a single INSERT statement:

INSERT INTO instest (id, value) VALUES
   (1, '1'),
   (2, '2'),
   ...
   (1000, '1000');

For this test, I’ll insert 1000 rows per statement, so there will be 10000 such INSERT statements. The benefit is that there are fewer client-server round trips, and PostgreSQL has to plan and execute fewer statements.

Multi-line `INSERT`s with a prepared statement in one transaction

This test is like the previous one, except that I’ll use a prepared statement:

PREPARE stmt(bigint,text,bigint,text,...) AS
INSERT INTO instest (id, value) VALUES
   ($1, $2),
   ($3, $4),
   ...
   ($1999, $2000);

The idea is to combine the benefits of prepared statements and multi-line INSERTs.

Bulk load with `COPY` (the king’s way)

It is well known that COPY is the fastest way to load data into PostgreSQL. Let’s see how much better it really is! For this test, I’ll load all 10 million rows with a single COPY statement.

The down side of COPY is that it is a non-standard SQL statement, so not all APIs support it, and you cannot use it in programs that are to support other database systems as well.

Note that you can use COPY (FREEZE) if you create or truncate the table in the same transaction. That won’t speed up loading, but it avoids the overhead of setting hint bits on the rows by the first reader and anti-wraparound autovacuum on the table.

Performance comparison of the bulk load methods

The tests were run on my Laptop with Fedora Linux 37, untuned PostgreSQL v16 and an NVMe disk.

Bulk load method	Duration
single `INSERT`s, many transactions	8954 s
single `INSERT`s	841 s
single `INSERT`s, prepared statement	688 s
bulk `INSERT`s	52 s
bulk `INSERT`s, prepared statement	57 s
`COPY`	14 s

Discussion of the bulk load performance test results

Most test results are as expected:

running each INSERT in its own transaction is unbearable slow
using prepared statements instead for the small INSERTs is a win
few big statements perform better than many small ones
nothing is as fast as COPY

What may surprise is that using prepared statements for the multi-line statements slows down processing. I am not certain what the reason is, but it could be the overhead of the extended query protocol.

Tuning PostgreSQL for bulk load

For completeness’ sake, I will add some hints how you can configure PostgreSQL to speed up bulk loads. There is little you can to to speed up writes, but you can reduce the number of redundant, unnecessary writes.

The most important parameter to tune is max_wal_size. The default value is 1GB. If more than that amount of WAL has been written since the latest checkpoint, PostgreSQL will trigger another checkpoint. The idea is to keep crash recovery time short. But 1GB is very little when it comes to bulk loading, and you can end up having a checkpoint every couple of seconds. That is an unnecessary overhead, particularly since PostgreSQL might have to flush the same (index) pages to disk again and again.
For similar reasons, you should increase checkpoint_timeout. Fewer checkpoints mean fewer redundant writes. Moreover, reducing the number of checkpoints will reduce the size of the WAL, since fewer “full-page images” have to be written.