Bun vs Go Perf | Prime Reacts
Based on ThePrimeTime's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.
Briefing
The central takeaway is that Bun and Go trade places depending on what kind of workload hits the system: Bun tends to hold lower client latency longer under a CPU-constrained, mostly “server does little work” HTTP workload, while Go can fall behind on that narrow test yet pull ahead once real persistence work (MongoDB inserts) dominates. The practical message is less “pick a winner” and more “benchmarks are workload-specific,” especially when 99th-percentile latency and Kubernetes throttling are part of the measurement.
The comparison uses a controlled HTTP setup built with standard libraries: an endpoint returns hardcoded JSON for a first round of tests, then a second round adds a persistence layer where each POST generates a UID and writes a document to MongoDB. Across both rounds, the emphasis stays on four user-facing “golden signals”: latency (tracked with P99), throughput (requests per second), saturation (service fullness via CPU usage and throttling behavior), and availability (failed requests). Latency is measured from the client side, and the load is ramped until each service breaks—either by timeouts, Kubernetes CPU throttling, or resource exhaustion.
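The summary describes the endpoints but not their code. As a rough sketch of the static round, assuming Go's standard library with a hypothetical `/json` route and a made-up response body (neither is taken from the video):

```go
package main

import (
	"log"
	"net/http"
)

func main() {
	// Static round: the handler does almost no work and returns a hardcoded
	// JSON body, so the test mostly exercises the HTTP stack under a tight
	// CPU limit rather than any application logic.
	http.HandleFunc("/json", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{"message":"hello","status":"ok"}`))
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

The Bun side would be the analogous `Bun.serve` fetch handler returning the same hardcoded object, so both services do comparably little work per request in this round.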
A key methodological choice shapes the results: both services are deployed with only one CPU allocated per service instance (even though Go’s runtime can spread work across multiple CPUs via goroutines). Two replicas of each service run per EC2 instance, and load is generated with an increasing number of virtual clients until failure. The test environment is production Kubernetes on EC2 (with Graviton instances), not localhost, aiming to avoid the distortions that come from local-only optimizations.
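The ramp itself (more virtual clients each stage until something breaks) is not shown as code in the video; the sketch below only illustrates the idea, with a made-up target URL, stage sizes, and time window rather than the actual load tool or settings used:

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"sync/atomic"
	"time"
)

func main() {
	url := "http://localhost:8080/json" // hypothetical target from the sketch above

	// Each stage doubles the number of virtual clients for a fixed window and
	// counts failures, so availability (failed / total) can be read per stage.
	for clients := 10; clients <= 80; clients *= 2 {
		var total, failed int64
		var wg sync.WaitGroup
		deadline := time.Now().Add(5 * time.Second)

		for i := 0; i < clients; i++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				for time.Now().Before(deadline) {
					atomic.AddInt64(&total, 1)
					resp, err := http.Get(url)
					if err != nil || resp.StatusCode != http.StatusOK {
						atomic.AddInt64(&failed, 1)
					}
					if err == nil {
						resp.Body.Close()
					}
				}
			}()
		}
		wg.Wait()
		fmt.Printf("clients=%d total=%d failed=%d\n", clients, total, failed)
	}
}
```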
In the “hardcoded JSON” scenario, Go’s latency and availability degrade earlier as request rates climb. The results show Go hitting a breaking point around the low-to-mid tens of thousands of requests per second, with latency rising sharply and Kubernetes CPU throttling contributing to the slowdown. Bun maintains lower latency longer and only starts to degrade at higher request rates, though it too eventually reaches a limit and gets throttled. The discussion of why centers on the mismatch between Go’s typical strength (efficient vertical scaling across CPU cores) and the test’s deliberate CPU restriction, which can stress scheduling overhead and worker behavior.
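One concrete mechanism behind "CPU cap plus throttling" is the interaction between GOMAXPROCS and container CPU limits. This is an aside, not something stated in the source, and the video does not say whether the benchmark tuned it:

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// On older Go releases, GOMAXPROCS defaults to the node's CPU count rather
	// than the container's cgroup limit, so a pod capped at one CPU can keep
	// more runnable threads than its quota covers and get throttled by the
	// kernel's CFS scheduler. Pinning GOMAXPROCS to the allocation (or using
	// uber-go/automaxprocs) is a common mitigation; whether the benchmark did
	// this is not stated in the video.
	runtime.GOMAXPROCS(1)
	fmt.Println("GOMAXPROCS now:", runtime.GOMAXPROCS(0))
}
```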
When MongoDB inserts enter the picture, the story flips. The workload becomes dominated by database write time, and Bun’s earlier advantage in the static endpoint shrinks. Go delivers substantially better end-to-end performance for the POST path: client latency and database insert duration are both reported as roughly half of Bun’s, and Go sustains higher throughput before failing. CPU usage patterns also change, with Go using less CPU while Bun degrades sooner.
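The persistence path is described (POST, generate a UID, insert a document into MongoDB, time the insert) but not shown as code. A hedged sketch of what that handler might look like, assuming Go's standard library, the official mongo-driver (v1 API), and google/uuid; the route, database, collection, and field names are placeholders:

```go
package main

import (
	"context"
	"encoding/json"
	"log"
	"net/http"
	"time"

	"github.com/google/uuid"
	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	coll := client.Database("bench").Collection("devices")

	// POST handler: generate a UID, insert a document, and time the insert,
	// since database write time dominates this round of the benchmark.
	http.HandleFunc("/devices", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		uid := uuid.NewString()
		start := time.Now()
		if _, err := coll.InsertOne(r.Context(), bson.M{"uuid": uid, "created_at": time.Now()}); err != nil {
			http.Error(w, "insert failed", http.StatusInternalServerError)
			return
		}
		insertDuration := time.Since(start) // the "database insert duration" signal

		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]any{
			"uuid":      uid,
			"insert_ms": insertDuration.Milliseconds(),
		})
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```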
The broader conclusion is cautionary: small synthetic or “small world” benchmarks can produce wildly different outcomes from tiny changes in assumptions—CPU limits, header sizes, database involvement, and even how concurrency is implemented. The takeaway lands on a decision rule: validate with your actual workload and measure production 99th-percentile outcomes rather than trusting micro-benchmarks that may not resemble real traffic.
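Since the recommendation is to judge results by client-side P99 latency and availability (failed requests over total requests), here is a tiny sketch of those two summary statistics; the sample numbers in main are invented for illustration:

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// p99 summarizes client-side latency samples at the 99th percentile
// (nearest-rank style), matching how latency is judged in the video.
func p99(samples []time.Duration) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	idx := int(math.Ceil(0.99*float64(len(sorted)))) - 1
	return sorted[idx]
}

// availability is the fraction of successful requests: 1 - failed/total.
func availability(total, failed int) float64 {
	if total == 0 {
		return 1
	}
	return 1 - float64(failed)/float64(total)
}

func main() {
	samples := []time.Duration{
		4 * time.Millisecond, 5 * time.Millisecond, 6 * time.Millisecond,
		7 * time.Millisecond, 180 * time.Millisecond, // one slow outlier dominates P99
	}
	fmt.Println("P99 latency:", p99(samples))
	fmt.Printf("availability: %.4f\n", availability(10000, 12))
}
```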
Cornell Notes
Bun and Go do not have a single, universal winner; performance depends on the workload and where the bottleneck sits. In a CPU-constrained HTTP test that mostly returns hardcoded JSON, Bun holds lower P99 latency and better availability longer, while Go degrades earlier as request rates rise and Kubernetes throttling appears. The test’s one-CPU-per-instance setup is highlighted as especially unfavorable to Go’s typical strength: efficient use of multiple CPU cores via goroutines. When the workload includes real persistence—POST requests that generate a UID and insert documents into MongoDB—Go performs significantly better, with lower client latency and faster database insert time. The practical lesson is to benchmark with realistic endpoints and measure production 99th-percentile results.
Why does the one-CPU-per-instance constraint matter so much for interpreting Bun vs Go results?
What does “golden signals” mean here, and why is P99 latency emphasized?
How does the workload shift from “static JSON” to “real persistence,” and why does that change the winner?
What role does Kubernetes CPU throttling play in the observed failures?
Why are “small world” benchmarks described as misleading even when they look rigorous?
Review Questions
- In the static JSON test, what combination of metrics (P99 latency, availability, throttling, saturation) signals that Go is breaking down earlier than Bun?
- How does adding MongoDB inserts change the dominant bottleneck, and which measured durations reflect that shift?
- What specific test design choice could make a language with strong multi-core scaling look worse than it would under more realistic CPU allocation?
Key Points
1. Latency is judged primarily by P99 from the client side, with availability defined by failed requests relative to total requests.
2. Throughput and saturation are tracked alongside CPU usage, including the impact of Kubernetes CPU throttling on response times.
3. The static endpoint test favors Bun under a one-CPU-per-instance constraint, while Go degrades earlier as request rates rise.
4. The persistence endpoint test (MongoDB inserts per POST) shifts the bottleneck toward database write time, where Go shows lower client latency and faster inserts.
5. Benchmark outcomes can flip when the workload changes from minimal server work to real I/O and persistence.
6. Micro-benchmarks and small-world tests can mislead decision-making; measuring production 99th-percentile outcomes is presented as the safer approach.