Making Postgres 42,000x slower
Based on The PrimeTime's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Postgres can be driven to extreme slowdown—about 42,000× slower than a default setup—by tuning only configuration parameters, while still keeping enough transactional throughput to avoid a total shutdown. The exercise matters because it demonstrates how fragile “performance” can be: small, well-intentioned changes to caching, maintenance behavior, and write-ahead logging can compound into a system that spends most of its time doing expensive work instead of serving queries.
The benchmark starts with a baseline using TPC-C (via benchbase) at 128 warehouses, 100 connections, and a target of 10,000 TPS, running on Linux on a Ryzen 7950X with 32 GB RAM and a 2 TB SSD. With Postgres 19 (as described), default settings are adjusted only for a few standard performance knobs (shared buffers, work memory, and worker processes), yielding roughly 7,082 TPS. From there, the goal flips: force Postgres to read and write as inefficiently as possible, without deleting indexes or otherwise "cheating" by changing the schema.
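The summary names the knobs but not their values, so the following is only a sketch of what such a baseline might look like; every number here is an illustrative assumption, not the video's actual configuration:

```sql
-- Hypothetical baseline tuning for a 32 GB machine; the summary names
-- shared buffers, work memory, and worker processes but not the values.
ALTER SYSTEM SET shared_buffers = '10GB';    -- the 10 GB starting point cited below
ALTER SYSTEM SET work_mem = '64MB';          -- per-operation sort/hash memory (assumed value)
ALTER SYSTEM SET max_worker_processes = 16;  -- background/parallel workers (assumed value)
-- shared_buffers and max_worker_processes take effect only after a restart.
```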
The first major lever is cache starvation. Postgres' shared buffers and related caching behavior are central to avoiding disk reads; shrinking shared buffers forces more page requests to miss Postgres' own cache and fall through to the operating system page cache, and ultimately the disk. The tuning sequence pushes shared buffers down aggressively: from 10 GB to 8 MB, then toward ~2 MB, where throughput collapses to under 500 TPS and later to roughly 200–300 TPS. The hit rate drops sharply (from ~99.9% to around the 70% range at one point), which drives a surge in read system calls.
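A minimal sketch of that lever, paired with a standard pg_stat_database query to watch the hit rate fall:

```sql
-- Shrink the buffer cache toward its floor (128 kB is the hard minimum,
-- so ~2 MB is close to as low as a real workload can go).
-- Takes effect only after a server restart.
ALTER SYSTEM SET shared_buffers = '2MB';

-- Observe the damage: buffer cache hit ratio for the current database.
SELECT datname,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS hit_pct
FROM pg_stat_database
WHERE datname = current_database();
```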
Next comes maintenance sabotage. Autovacuum and autoanalyze are reconfigured to run far more frequently, with vacuum cost limits set so vacuuming rarely pauses, and maintenance memory/logging adjusted to make vacuum work heavier. The result is that vacuum and analysis repeatedly touch “hot” tables, and because the cache is already starved, each run forces significant disk reads. Logs are used to confirm the performance hit lines up with frequent vacuum/analyze activity.
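The exact values aren't given, but the described direction maps onto well-known autovacuum settings; a hedged sketch, with all numbers illustrative:

```sql
-- Make autovacuum/autoanalyze fire near-constantly and disable the
-- cost-based throttling that would normally make vacuum pause.
ALTER SYSTEM SET autovacuum_naptime = '1s';             -- minimum allowed wake interval
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.0;  -- ignore table size...
ALTER SYSTEM SET autovacuum_vacuum_threshold = 1;       -- ...vacuum after a single dead row
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.0;
ALTER SYSTEM SET autovacuum_analyze_threshold = 1;      -- analyze just as eagerly
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 0;      -- never sleep for cost limiting
ALTER SYSTEM SET log_autovacuum_min_duration = 0;       -- log every run to confirm the hit
SELECT pg_reload_conf();  -- all of these apply without a restart
```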
Then the write path is tuned to be as slow as possible. The write-ahead log (WAL) and checkpointer are configured to flush and checkpoint constantly: WAL flush delays are minimized, checkpoint frequency is maximized, and flush/checkpoint I/O is forced to happen in the most punishing way (including an open-datasync flush method and full-page-write behavior). Checkpoints that would normally be rare begin occurring back to back, and throughput drops further, first to roughly 98× slower than baseline and then past ~170× slower.
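A sketch in the same spirit, again with assumed values; "open data sync" in the description reads as the wal_sync_method = 'open_datasync' option:

```sql
-- Illustrative settings: flush WAL eagerly, checkpoint as often as the
-- server allows, and make each flush as expensive as possible.
ALTER SYSTEM SET wal_writer_delay = '1ms';           -- minimum; flush WAL almost continuously
ALTER SYSTEM SET checkpoint_timeout = '30s';         -- minimum allowed checkpoint interval
ALTER SYSTEM SET max_wal_size = '32MB';              -- tiny WAL ceiling triggers extra checkpoints
ALTER SYSTEM SET checkpoint_completion_target = 0.0; -- burst checkpoint I/O instead of spreading it
ALTER SYSTEM SET wal_sync_method = 'open_datasync';  -- the "open data sync" behavior mentioned
ALTER SYSTEM SET full_page_writes = on;              -- whole-page images after every checkpoint
ALTER SYSTEM SET log_checkpoints = on;               -- confirm the constant checkpointing in logs
SELECT pg_reload_conf();
```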
Finally, index usage is discouraged without removing indexes. By increasing the relative cost of random page access (random_page_cost) and adjusting CPU-related index costs, the planner is pushed toward sequential scans, which are slower under the cache-starved conditions. Throughput falls again to around 87 TPS, then below 1 TPS after additional tuning.
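A minimal sketch of that planner sabotage; the magnitudes are assumptions, and the table and column in the check are placeholders:

```sql
-- Make random (index) page access and per-index-tuple CPU work look
-- absurdly expensive so the planner prefers sequential scans everywhere.
ALTER SYSTEM SET random_page_cost = 1e6;
ALTER SYSTEM SET cpu_index_tuple_cost = 1e6;
SELECT pg_reload_conf();

-- Sanity check on any indexed table (orders/order_id are placeholder names):
EXPLAIN SELECT * FROM orders WHERE order_id = 42;  -- should now show a Seq Scan
```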
A last step uses Postgres 18's I/O controls to force all I/O through a single worker (via the io_method and io_workers settings). With I/O effectively serialized across the workload, the system reaches the headline outcome: well below 0.1 TPS, with only 11 transactions completing successfully across 100 connections and 120 seconds, plus further failures due to deadlocks. The takeaway is blunt: configuration-only changes can turn Postgres into a near-dead system, and the same knobs that help performance can be weaponized against it.
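Reading "the IO method knob and related worker settings" as PostgreSQL 18's asynchronous I/O GUCs, a minimal sketch of the final step:

```sql
-- Route I/O through background I/O workers, then allow exactly one,
-- serializing every read behind a single worker. Both settings require
-- a server restart to take effect.
ALTER SYSTEM SET io_method = 'worker';
ALTER SYSTEM SET io_workers = 1;
```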
Cornell Notes
A default-tuned Postgres setup at ~7,082 TPS (TPC-C, 128 warehouses, 100 connections) can be pushed to roughly 42,000× slower—well under 0.1 TPS—using configuration parameters alone. The slowdown comes from chaining three effects: starving shared buffers to force disk reads, making autovacuum/autoanalyze run constantly so maintenance repeatedly hits disk, and degrading WAL/checkpoint behavior so commits and checkpoints flush far more aggressively. Indexes aren't removed; planner costs are adjusted so sequential scans become preferable under the cache-starved conditions. The final step serializes I/O using Postgres 18's io_method controls, making throughput collapse even further, with many transactions failing due to deadlocks.
How does shrinking shared buffers translate into a large TPS drop?
Why does reconfiguring autovacuum and autoanalyze hurt performance so much in this setup?
What role do WAL and checkpoints play in the slowdown?
How can indexes be effectively “disabled” without deleting them?
Why does forcing I/O into one thread matter when there are 100 connections?
Review Questions
- Which tuning change first causes the biggest shift from memory-resident reads to disk reads, and how is that reflected in hit rate or read system calls?
- Explain how autovacuum frequency and vacuum cost limits interact with a tiny shared buffers setting to amplify disk I/O.
- What combination of planner cost changes and I/O serialization ultimately pushes the system from “hundreds of TPS” to “well under 0.1 TPS”?
Key Points
1. Start with a measurable baseline (TPC-C via benchbase, 128 warehouses, 100 connections) to quantify how far performance can be pushed in either direction.
2. Starving shared buffers forces page misses, increasing read system calls and collapsing TPS even before maintenance or WAL changes.
3. Making autovacuum/autoanalyze run almost continuously can dominate workload time when cache hit rates are already low.
4. Aggressive WAL flush and checkpoint tuning increases commit and durability overhead, with logs showing frequent checkpoint cycles.
5. Index usage can be discouraged without dropping indexes by raising random_page_cost and related planner costs until sequential scans look cheaper to the planner.
6. Postgres 18's io_method and io_workers settings can serialize I/O, preventing overlap and driving throughput toward near-zero under disk-heavy conditions.
7. Even when some transactions complete, deadlocks rise as the system becomes overloaded and maintenance/write behavior worsens.