2025-08-19 08:54:00
www.orioledb.com
When many sessions try to insert into the same B-tree leaf page, classic exclusive page locking serializes progress and wastes time on sleep/wake cycles. We’re introducing a batch page insertion path that lets the session holding the page lock insert for itself and its neighbors. The result: dramatically reduced lock waits, and big gains at high client counts (2X throughput boost starting from 64 clients in our benchmark).
In OrioleDB beta12, inserts into a B-tree leaf are performed under an exclusive page lock. If several backends target the same leaf page, they queue, sleep, and wake up in turn.
That’s correct and simple — but not cheap:
- Every process waiting for the lock pays the cost of sleep → wake → re-check just to insert a tiny tuple.
- If the leaf is hot (e.g., time-ordered keys, skew, or tight key ranges), we get a huge queue on that single page.
- The lock hand-off cadence, not raw CPU, becomes the bottleneck.
The idea: cooperate while you hold the page
We introduced a cooperative scheme that transfers the insertion work to the process that already holds the leaf’s exclusive lock. Think of it as “group commit” for page inserts.
How the new batch page insertion works
-
Lockless list of waiting processes per page.
The lock-protected queue of page-lock waiting processes is replaced with a lockless list (we will call it a “wait list”), so publishing intent is wait-free for followers.
-
The waiting process publishes the tuple.
Before a process queues for the page lock, it publishes its tuple (key + payload + a small control meta-information) into shared memory and links it into the page’s wait list.
-
The lock holder process batch insert.
When a process inserts into the leaf page (under the lock), it also scans the wait list, collects the already-published tuples whose keys belong to that page, and inserts as many as fit, along with its own tuple.
-
Waking up is usually a no-op.
When the original inserters wake, many find their tuple is already inserted. They will just proceed.
If the batch filled the page and forced a split, the remaining inserters re-check the page’s hikey and may follow the rightlink, similarly to a standard insert after a split.
This is pure cooperation: no session is required to help others, but the lock holder has the best vantage point to make progress at a low cost for the group.
Thanks to this scheme, the lock holder performs N inserts for at least one lock handover. The N is limited by the wait list size and the number of tuples that can be inserted into the page with split. So, under high concurrency, the lock holder can grow one page at a time. This is great compared to inserting one tuple at a time.
Benchmark
The benchmark was performed on an AWS EC2 c7gd.metal machine with 64 Graviton3 vCPU. The simple benchmark is simulating an IoT (Internet of Things) workload. The DDL for this benchmark is given below.
CREATE TABLE measure (
sensor_id int NOT NULL,
ts timestamp NOT NULL,
value float8 NOT NULL,
PRIMARY KEY (sensor_id, ts)) USING orioledb;
CREATE INDEX measure_ts_idx ON measure(ts);
The measure
table stores measures coming from sensors at paricular moments of time. The measure_ts_idx
index on timestamp provides an effective way to query slices of measures. Obviously, this index will have ordered inserts and will suffer from lock convoys on its rightmost leaf page. The pgbench
script simulates the insertion of measures for 10 sensors in one query.
INSERT INTO measure VALUES
(:client_id * 10, current_timestamp, random()),
(:client_id * 10 + 1, current_timestamp, random()),
(:client_id * 10 + 2, current_timestamp, random()),
(:client_id * 10 + 3, current_timestamp, random()),
(:client_id * 10 + 4, current_timestamp, random()),
(:client_id * 10 + 5, current_timestamp, random()),
(:client_id * 10 + 6, current_timestamp, random()),
(:client_id * 10 + 7, current_timestamp, random()),
(:client_id * 10 + 8, current_timestamp, random()),
(:client_id * 10 + 9, current_timestamp, random());
-
OrioleDB beta12 vs. PostgreSQL heap
OrioleDB beta12 appears to be slower than PostgreSQL heap on 16 concurrent connections, but shows 10%-20% speedup starting from 64 connections.
-
Batch page insertion
Batch page insertion optimization dramatically improves the situation by reducing lock waits. While having no regression, it shows almost 2X speedup starting from 64 connections.
Try it and tell us what you see
If your workload shows skewed keys or time-based ordering (IoT, event logs, time-series dimensions, append-heavy OLTP), you’re a prime beneficiary. The optimization is now on our main branch and will be published in the OrioleDB beta13 release. We’d love to hear your results and traces — especially at 64+ clients where the gains are most significant.
Keep your files stored safely and securely with the SanDisk 2TB Extreme Portable SSD. With over 69,505 ratings and an impressive 4.6 out of 5 stars, this product has been purchased over 8K+ times in the past month. At only $129.99, this Amazon’s Choice product is a must-have for secure file storage.
Help keep private content private with the included password protection featuring 256-bit AES hardware encryption. Order now for just $129.99 on Amazon!
Help Power Techcratic’s Future – Scan To Support
If Techcratic’s content and insights have helped you, consider giving back by supporting the platform with crypto. Every contribution makes a difference, whether it’s for high-quality content, server maintenance, or future updates. Techcratic is constantly evolving, and your support helps drive that progress.
As a solo operator who wears all the hats, creating content, managing the tech, and running the site, your support allows me to stay focused on delivering valuable resources. Your support keeps everything running smoothly and enables me to continue creating the content you love. I’m deeply grateful for your support, it truly means the world to me! Thank you!
BITCOIN bc1qlszw7elx2qahjwvaryh0tkgg8y68enw30gpvge Scan the QR code with your crypto wallet app |
DOGECOIN D64GwvvYQxFXYyan3oQCrmWfidf6T3JpBA Scan the QR code with your crypto wallet app |
ETHEREUM 0xe9BC980DF3d985730dA827996B43E4A62CCBAA7a Scan the QR code with your crypto wallet app |
Please read the Privacy and Security Disclaimer on how Techcratic handles your support.
Disclaimer: As an Amazon Associate, Techcratic may earn from qualifying purchases.