A Ledger In PostgreSQL Is Fast!
I’ve been working on a ledger implementation in pure PostgreSQL called pgledger. For the backstory, please read my previous blog post: Ledger Implementation in PostgreSQL.
Now that the project is a bit further along, I decided to gather some performance numbers. And it’s fast! Depending on the scenario, I can easily get over 10,000 ledger transfers per second on my laptop with a stock, unoptimized PostgreSQL. I would imagine a well-tuned production database would do a lot more.
Sure, it’s not TigerBeetle-level performance, but it’s still more than enough for most applications. And the simplicity of having the ledger in your main database is huge.
Scenarios and Scripting
Performance is a notoriously hard thing to measure, since different usage patterns and different hardware can yield very different results.
I have been iterating on a script in the pgledger repository to help measure performance: performance_check.go. My thought was that others could also run this script in their environments if they want to gather more realistic numbers for their setup.
The script takes a few inputs, including:
- The number of accounts (each transfer is moving money from one of these accounts to another one)
- The number of concurrent workers doing transfers
- The duration of time to run
To simulate a scenario where there isn’t much account contention, we can ensure there are many more accounts than workers. That way, concurrent workers rarely try to transfer between the same accounts. Or alternatively, if our system has only a handful of hot accounts, we can keep the number of accounts low to simulate workers waiting for locked accounts.
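To make that concrete, here is a minimal sketch of what one worker loop could look like. This is an illustration, not the actual performance_check.go, and the pgledger_create_transfer call is a placeholder rather than pgledger’s real API:

```go
package perfcheck

import (
	"context"
	"math/rand"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
)

// runWorker is a rough sketch of one concurrent worker: it repeatedly picks
// two distinct random accounts and transfers between them until the deadline
// passes. The pgledger_create_transfer call is a stand-in; check the pgledger
// repo for the actual function name and arguments.
func runWorker(ctx context.Context, pool *pgxpool.Pool, accountIDs []string, deadline time.Time) (int, error) {
	transfers := 0
	for time.Now().Before(deadline) {
		// With many more accounts than workers, two workers rarely pick the
		// same account, so there is little lock contention.
		from := rand.Intn(len(accountIDs))
		to := rand.Intn(len(accountIDs) - 1)
		if to >= from {
			to++ // ensure from != to
		}
		if _, err := pool.Exec(ctx,
			"SELECT pgledger_create_transfer($1, $2, $3)",
			accountIDs[from], accountIDs[to], "1.00"); err != nil {
			return transfers, err
		}
		transfers++
	}
	return transfers, nil
}
```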
The script also measures the database size before and after, and then calculates the amount of disk space used per transfer. This should take into account the data in both tables and indexes.
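Measuring the size is straightforward: pg_database_size reports the whole database’s on-disk footprint, including indexes and TOAST data. A minimal sketch of that bookkeeping (the helper name here is mine, not necessarily the script’s):

```go
package perfcheck

import (
	"context"

	"github.com/jackc/pgx/v5/pgxpool"
)

// databaseSize returns the current database's on-disk size in bytes.
// pg_database_size covers tables, indexes, and TOAST data, so the
// before/after difference captures everything a transfer writes.
func databaseSize(ctx context.Context, pool *pgxpool.Pool) (int64, error) {
	var size int64
	err := pool.QueryRow(ctx, "SELECT pg_database_size(current_database())").Scan(&size)
	return size, err
}
```

Bytes per transfer is then just the before/after size difference divided by the number of completed transfers.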
Local Results
Here are some results from my laptop (M3 MacBook Air). I am using a vanilla, unoptimized PostgreSQL 17.5, set up with:
brew install postgresql@17
brew services start postgresql@17
First, the low contention scenario:
> go run performance_check.go --accounts=50 --workers=20 --duration=30s
Completed transfers: 319105
Elapsed time in seconds: 30.0
Database size before: 1795 MB
Database size after: 2021 MB
Database size growth in bytes: 237223936
Transfers/second: 10636.8
Milliseconds/transfer: 1.9
Bytes/transfer: 743
We can see here that each transfer took less than 2 milliseconds (that’s the average per-worker latency: 20 workers × 30 seconds ÷ 319,105 transfers ≈ 1.9 ms), for an overall rate of 10,636.8 transfers per second. And each transfer added about 743 bytes to the database. Compared to many queries I’ve seen in financial application code, this is quite fast.
And the account contention scenario, where workers may need to wait on other workers currently using the same accounts:
> go run performance_check.go --accounts=10 --workers=20 --duration=30s
Completed transfers: 226767
Elapsed time in seconds: 30.0
Database size before: 2021 MB
Database size after: 2182 MB
Database size growth in bytes: 168566784
Transfers/second: 7558.9
Milliseconds/transfer: 2.6
Bytes/transfer: 743
In this scenario, our throughput dropped to 7,558.9 transfers per second since transfers were waiting to lock hot accounts before proceeding. You can also see that the time per transfer increased to 2.6 milliseconds. Still quite fast, though.
Remote Results
I also wanted to test the performance against a hosted database, so I set up an account on Neon. The free database has 2 vCPU and 8 GB of RAM. Since the latency from my laptop to the cloud-hosted database is higher, I ran more workers (since each worker spends more time waiting on network responses):
> go run performance_check.go --accounts=100 --workers=60 --duration=30s
Completed transfers: 48937
Elapsed time in seconds: 30.0
Database size before: 157 MB
Database size after: 192 MB
Database size growth in bytes: 36470784
Transfers/second: 1631.2
Milliseconds/transfer: 36.8
Bytes/transfer: 745
(Note that I also had to increase the pool size in the PostgreSQL client to handle the increased number of workers by appending &pool_max_conns=100 to the connection string.)
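For reference, with pgx this is just a query parameter on the connection string; something like the following, where the URL itself is a placeholder:

```go
package perfcheck

import (
	"context"

	"github.com/jackc/pgx/v5/pgxpool"
)

// newPool opens a connection pool large enough for all of the workers.
// Without pool_max_conns, pgxpool defaults to a much smaller pool
// (roughly the number of CPUs, with a minimum of 4), so 60 workers
// would spend time waiting for a free connection.
func newPool(ctx context.Context) (*pgxpool.Pool, error) {
	connString := "postgres://user:password@example-host.neon.tech/pgledger?sslmode=require&pool_max_conns=100"
	return pgxpool.New(ctx, connString)
}
```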
These numbers aren’t as great, but that’s still over 1,000 transfers per second on the free tier of a database that is at least one US state away, so I think it’s pretty good. If someone wants to test this on their large, optimized production database, I would love to see the results.
Final Thoughts
I’m happy with these numbers for now. I think this performance is high enough for most use cases, and it’s likely that any production system will have other bottlenecks in the database before hitting the pgledger limits.
I also haven’t done much optimization on the ledger. In fact, I’ve made some choices that hurt performance in the name of simplicity. If or when performance becomes a bigger concern, there are a few things that could make pgledger even faster. The limiting factor right now seems to be disk write speed, so reducing the amount of data written per transfer would help, such as:
- Removing fields that are duplicated between pgledger_transfers and pgledger_entries, such as the account IDs
- Using an ID format that takes up less space (I like prefixed ULIDs, but they could be stored as UUID types; see the sketch after this list)
- Removing the created_at fields and relying on the timestamps within the IDs
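To make those last two ideas concrete, here is a small sketch using the oklog/ulid and google/uuid packages. This illustrates the idea only; it is not how pgledger stores IDs today. A ULID is 128 bits, so it can be reinterpreted as a UUID and kept in a 16-byte uuid column, and the millisecond timestamp it encodes could stand in for a created_at column.

```go
package perfcheck

import (
	"fmt"

	"github.com/google/uuid"
	"github.com/oklog/ulid/v2"
)

// ulidStorageDemo is an illustration only; pgledger does not necessarily
// store IDs this way today.
func ulidStorageDemo() {
	id := ulid.Make()

	// Prefixed, human-readable form (the prefix is hypothetical):
	// 9 + 26 characters when stored as text.
	prefixed := "transfer_" + id.String()

	// The same 128 bits reinterpreted as a UUID, storable in a 16-byte
	// uuid column instead of a ~35-byte text column.
	asUUID := uuid.UUID(id)

	// A ULID already encodes a millisecond timestamp, which could replace
	// a separate created_at column.
	createdAt := ulid.Time(id.Time())

	fmt.Println(prefixed, asUUID, createdAt)
}
```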
So if you need a ledger, please check out pgledger and let me know what you think!