
Bun vs Go Perf | Prime Reacts

The PrimeTime · 5 min read

Based on The PrimeTime's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Latency is judged primarily by P99 from the client side, with availability defined by failed requests relative to total requests.

Briefing

The central takeaway is that Bun and Go trade places depending on what kind of workload hits the system: Bun tends to hold lower client latency longer under a CPU-constrained, mostly “server does little work” HTTP workload, while Go can fall behind on that narrow test but pulls ahead once real persistence work (MongoDB inserts) dominates. The practical message is less “pick a winner” and more “benchmarks are workload-specific,” especially when 99th-percentile latency and Kubernetes throttling are part of the measurement.

The comparison uses a controlled HTTP setup built with standard libraries: an endpoint returns hardcoded JSON for a first round of tests, then a second round adds a persistence layer where each POST generates a UID and writes a document to MongoDB. Across both rounds, the emphasis stays on four user-facing “golden signals”: latency (tracked with P99), throughput (requests per second), saturation (service fullness via CPU usage and throttling behavior), and availability (failed requests). Latency is measured from the client side, and the load is ramped until each service breaks—either by timeouts, Kubernetes CPU throttling, or resource exhaustion.

A key methodological choice shapes the results: both services are deployed with only one CPU allocated per instance (even though Go’s runtime can use multiple CPUs via goroutines). Two replicas of each service run per EC2 instance, and load is generated with increasing virtual clients until failure. The test environment is production Kubernetes on EC2 (with Graviton instances), not localhost, aiming to avoid the distortions that come from local-only optimizations.
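The one-CPU allocation would be expressed as a container resource limit; this fragment is illustrative (the source states only the 1-CPU-per-instance constraint, not the exact manifest):

```yaml
# Hypothetical pod spec fragment. The hard 1-CPU limit is the key
# constraint from the test: it becomes a cgroup CFS quota, so once a
# service saturates that quota, Kubernetes throttles it even though
# the Graviton node has more cores available.
resources:
  requests:
    cpu: "1"
  limits:
    cpu: "1"
```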

In the “hardcoded JSON” scenario, Go’s latency and availability degrade earlier as request rates climb. The results show Go hitting a breaking point around the low-to-mid tens of thousands of requests per second, with latency rising sharply and Kubernetes throttling contributing to the slowdown. Bun maintains lower latency longer and only starts to degrade at higher request rates, though it too eventually reaches a limit and gets throttled. The discussion of why this happens centers on the mismatch between Go’s typical strength—efficient vertical scaling across CPU cores—and the test’s deliberate CPU restriction, which can stress scheduling overhead and worker behavior.

When MongoDB inserts enter the picture, the story flips. The workload becomes dominated by database write time, and Bun’s earlier advantage in the static endpoint shrinks. Go delivers substantially better end-to-end performance for the POST path: client latency and database insert duration are both reported as roughly half of Bun’s, and Go sustains higher throughput before failing. CPU usage patterns also change, with Go using less CPU while Bun degrades sooner.

The broader conclusion is cautionary: small synthetic or “small world” benchmarks can produce wildly different outcomes from tiny changes in assumptions—CPU limits, header sizes, database involvement, and even how concurrency is implemented. The takeaway lands on a decision rule: validate with your actual workload and measure production 99th-percentile outcomes rather than trusting micro-benchmarks that may not resemble real traffic.

Cornell Notes

Bun and Go do not have a single, universal winner; performance depends on the workload and where the bottleneck sits. In a CPU-constrained HTTP test that mostly returns hardcoded JSON, Bun holds lower P99 latency and better availability longer, while Go degrades earlier as request rates rise and Kubernetes throttling appears. The test’s one-CPU-per-instance setup is highlighted as especially unfavorable to Go’s typical strength: efficient use of multiple CPU cores via goroutines. When the workload includes real persistence—POST requests that generate a UID and insert documents into MongoDB—Go performs significantly better, with lower client latency and faster database insert time. The practical lesson is to benchmark with realistic endpoints and measure production 99th-percentile results.

Why does the one-CPU-per-instance constraint matter so much for interpreting Bun vs Go results?

Go’s performance advantage often comes from being able to use multiple CPU cores effectively via goroutines. In this setup, each service is limited to one CPU (with cgroups), even though the system could support more. That means Go can’t realize its usual vertical-scaling behavior, so scheduling and concurrency overhead can become more visible. The result is that Go can look worse in a test designed to stress a narrow operating point, even if it would perform better when allowed to use more cores.

What does “golden signals” mean here, and why is P99 latency emphasized?

The tests track four user-facing metrics: latency (specifically P99), throughput (requests per second), saturation (how full the service is, including CPU throttling), and availability (failed requests relative to total). P99 matters because tail latency is what users feel most during spikes; average latency can look fine while the system is actually failing or timing out for a subset of requests.

How does the workload shift from “static JSON” to “real persistence,” and why does that change the winner?

The first phase returns hardcoded JSON with minimal server-side work. The second phase adds a persistence layer: each POST generates a UID and writes a document to MongoDB, with instrumentation to measure both client request duration and MongoDB insert duration. Once database writes dominate, the bottleneck moves away from runtime overhead and toward I/O and database interaction. In that regime, Go shows lower end-to-end latency and faster inserts, reducing Bun’s earlier advantage.

What role does Kubernetes CPU throttling play in the observed failures?

CPU throttling is treated as a key mechanism behind latency spikes and throughput collapse. When CPU usage hits the Kubernetes limits, throttling increases response times and reduces overall performance. In the static endpoint test, Go is reported to experience throttling earlier, which aligns with its earlier latency degradation and availability drops.

Why are “small world” benchmarks described as misleading even when they look rigorous?

Small benchmarks can isolate a tiny slice of behavior—like returning a static JSON payload—and that slice may not match real endpoints. Tiny changes (CPU limits, concurrency model, header sizes, whether a database insert happens, or how concurrency is scheduled) can flip the results. The transcript repeatedly stresses that micro-benchmarks can produce “wildly weird outcomes,” so production-like workloads and production measurements are necessary.

Review Questions

  1. In the static JSON test, what combination of metrics (latency P99, availability, throttling, saturation) signals that Go is breaking down earlier than Bun?
  2. How does adding MongoDB inserts change the dominant bottleneck, and which measured durations reflect that shift?
  3. What specific test design choice could make a language with strong multi-core scaling look worse than it would under more realistic CPU allocation?

Key Points

  1. Latency is judged primarily by P99 from the client side, with availability defined by failed requests relative to total requests.
  2. Throughput and saturation are tracked alongside CPU usage, including the impact of Kubernetes CPU throttling on response times.
  3. The static endpoint test favors Bun under a one-CPU-per-instance constraint, while Go degrades earlier as request rates rise.
  4. The persistence endpoint test (MongoDB inserts per POST) shifts the bottleneck toward database write time, where Go shows lower client latency and faster inserts.
  5. Benchmark outcomes can flip when the workload changes from minimal server work to real I/O and persistence.
  6. Micro-benchmarks and small-world tests can mislead decision-making; measuring production 99th-percentile outcomes is presented as the safer approach.

Highlights

Bun holds lower P99 latency longer in a CPU-limited static JSON workload, while Go degrades earlier as throttling and saturation kick in.
Once MongoDB inserts dominate the request path, Go delivers substantially better end-to-end latency—reported as roughly half of Bun’s for both client time and insert time.
The one-CPU-per-instance setup is framed as a major confounder for interpreting Go’s results, since Go’s strengths often rely on multi-core utilization.
The discussion repeatedly warns that tiny benchmark differences can produce large, misleading performance swings—so production measurement matters more than synthetic wins.

Topics

  • Bun vs Go
  • Kubernetes throttling
  • MongoDB persistence
  • P99 latency
  • Micro-benchmarks

Mentioned

  • Aiden
  • P99
  • CPU
  • JSON
  • HTTP
  • cgroups
  • Kubernetes
  • EC2
  • gRPC
  • HPACK
  • JIT
  • V8
  • JSC
  • PR
  • UID