Real engineer reacts to Notion speed issues
Based on Tools on Tech's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Notion’s front end can stay responsive because it runs as a full client-side web app, so typing doesn’t always depend on slow back-end operations.
Briefing
Notion’s speed problems likely come down to a scaling bottleneck in how its database handles writes—especially when many users edit or when templates with lots of blocks are loaded—rather than slow image storage. From an infrastructure perspective, the experience can still feel fast while typing because the front end runs locally in the browser, but heavier operations that require coordinated database updates can stall once demand grows.
The breakdown starts with the front end: Notion runs a full web application inside the browser (or mobile app), so most “snappy” interactions scale with the user’s own device resources. Memory usage per tab is described as roughly 100 MB, and because the front end handles immediate UI updates on the client side, typing into text fields doesn’t necessarily wait on the slowest parts of the system.
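To make the “typing feels instant” mechanism concrete, here is a minimal sketch of an optimistic client-side edit, assuming a generic block model and a hypothetical /api/saveBlock endpoint (neither is Notion’s real schema or API): the in-memory state updates on every keystroke, and persistence happens in the background so input never waits on the network.

```typescript
// Hypothetical block model; the shape is illustrative, not Notion's schema.
interface Block {
  id: string;
  text: string;
}

// In-memory state the UI renders from; updating it costs only local CPU/RAM.
const localBlocks = new Map<string, Block>();

function onKeystroke(blockId: string, newText: string): void {
  // 1. Apply the edit locally right away; the UI re-renders from memory,
  //    so typing latency depends only on the user's own device.
  const block = localBlocks.get(blockId) ?? { id: blockId, text: "" };
  block.text = newText;
  localBlocks.set(blockId, block);

  // 2. Persist in the background; the keystroke never waits on this request.
  void fetch("/api/saveBlock", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(block),
  }).catch((err) => {
    // A slow or failed save is handled later (retry, queue), not by blocking input.
    console.warn("save failed, will retry", err);
  });
}
```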
Behind the scenes, Notion relies on a back end that exchanges data with the front end through an application programming interface (the “API” calls visible in browser network traffic). The back end performs identity checks on every request (each one is treated like a fresh “passport check”) and then coordinates the data the front end needs. Actual content storage is split conceptually into a database for text and structured content, plus blob storage for images and large files, with images routed through Amazon Cloud storage. Since image delivery is handled externally, the argument narrows to the database as the most likely source of user-visible latency.
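That request path can be sketched as a single handler, assuming placeholder stores and URLs (the token set, the in-memory maps, and the S3-style blob host are illustrative, not Notion’s real services): every call repeats the identity check, structured content comes from the database, and images are only referenced here and served from blob storage.

```typescript
// Illustrative types; not Notion's actual data model.
interface Page { id: string; blockIds: string[] }
interface ContentBlock { id: string; text: string; imageKey?: string }

// Placeholder stores standing in for the real database and identity service.
const pages = new Map<string, Page>();
const blocks = new Map<string, ContentBlock>();
const validTokens = new Set<string>(["token-abc"]);

// Every request repeats the "passport check" before any data is returned.
function loadPage(token: string, pageId: string): { blocks: ContentBlock[]; imageUrls: string[] } {
  if (!validTokens.has(token)) {
    throw new Error("401: identity check failed");
  }

  const page = pages.get(pageId);
  if (!page) throw new Error("404: page not found");

  // Structured text content comes from the database...
  const pageBlocks = page.blockIds
    .map((id) => blocks.get(id))
    .filter((b): b is ContentBlock => b !== undefined);

  // ...while large files are only referenced here and fetched by the client
  // from blob storage (an S3-style bucket), so image bytes never flow through
  // this path.
  const imageUrls = pageBlocks
    .filter((b) => b.imageKey)
    .map((b) => `https://blob.example.com/${b.imageKey}`);

  return { blocks: pageBlocks, imageUrls };
}
```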
As Notion grows, the engineering problem shifts from “does it work” to “does it hold up under load.” The transcript uses an analogy of changing race-car wheels mid-race: scaling changes must preserve reliability while traffic keeps increasing. A personal story about a distributed-system failure illustrates how redundancy and failover can collapse during peak demand: machines become too busy to coordinate, leading to cascading shutdowns until only one server remains.
For Notion specifically, the key constraint is write coordination. The system is described as having a master database that receives changes from all users. That master can’t be easily scaled horizontally, because every update must propagate to other machines; trying to accept writes across multiple nodes creates a replication “traffic jam.” The slowdown is most noticeable in operations that touch lots of data and permission checks, such as opening a template containing many blocks, exactly when users expect the content to appear right away.
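A toy model of that single-master write path, under the assumption of one primary plus read replicas (the class names and replication scheme are illustrative): every write from every user lands on the primary, which must also push the change to each replica, so adding machines scales reads but not writes.

```typescript
// Toy model of a single-primary setup: every write funnels through one node,
// and replicas only serve reads. Names and structure are illustrative.
class ReplicaDatabase {
  private log: string[] = [];

  apply(change: string): void {
    this.log.push(change);
  }

  read(): string[] {
    return this.log; // reads scale out across replicas; writes do not
  }
}

class PrimaryDatabase {
  private log: string[] = [];
  private replicas: ReplicaDatabase[] = [];

  addReplica(r: ReplicaDatabase): void {
    this.replicas.push(r);
  }

  // All writes, from every user worldwide, arrive here. The primary must also
  // propagate each change to every replica, so its capacity bounds the whole
  // system's write throughput; adding replicas only helps reads.
  write(change: string): void {
    this.log.push(change);
    for (const replica of this.replicas) {
      replica.apply(change);
    }
  }
}
```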
The proposed fix is database sharding: splitting the data into multiple clusters (for example, one each for Europe, Asia, and the Americas) so users fetch and update data closer to where they are. That reduces cross-ocean latency but introduces new routing challenges: URLs would need to encode which cluster a page belongs to, likely via an added hashed string that the back end can decode and use to redirect the user to the correct cluster. The transcript frames this as complex, multi-layer engineering (front-end routing, back-end reconfiguration, and file-storage alignment), but positions it as the kind of work that can keep Notion from becoming slow enough to lose customers over time.
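As a rough sketch of that routing idea, assuming hypothetical cluster names, a SHA-256-based token, and placement by hashing the page ID (none of which reflect Notion’s real scheme): the URL carries an opaque suffix derived from the page and its cluster, and the back end resolves that suffix back to a cluster before redirecting the request.

```typescript
import { createHash } from "node:crypto";

// Hypothetical cluster layout; names are illustrative, not Notion's real topology.
const clusters = ["eu-cluster", "asia-cluster", "americas-cluster"];

// Derive a short, opaque token from the page ID plus its cluster assignment.
function shardToken(pageId: string, cluster: string): string {
  return createHash("sha256").update(`${pageId}:${cluster}`).digest("hex").slice(0, 8);
}

// Place a page on a cluster (here by hashing the page ID; a real system might
// place it by the owning workspace's region instead).
function assignCluster(pageId: string): string {
  const n = createHash("sha256").update(pageId).digest().readUInt32BE(0);
  return clusters[n % clusters.length];
}

// The public URL embeds the token, not the cluster name.
function buildUrl(pageId: string): string {
  const cluster = assignCluster(pageId);
  return `https://notion.example/p/${pageId}-${shardToken(pageId, cluster)}`;
}

// The back end "decodes" the token by finding the cluster whose token matches,
// then redirects the request to that cluster.
function resolveCluster(pageId: string, token: string): string | undefined {
  return clusters.find((c) => shardToken(pageId, c) === token);
}
```

In this sketch the suffix reveals nothing about internal cluster names; the router simply recomputes the candidate token for each cluster and redirects to the one that matches.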
Cornell Notes
Notion’s perceived speed depends on where the latency happens. The front end runs as a complete browser app, so typing can feel fast because the client handles immediate UI updates. Slower moments likely trace to the database layer, where writes must be coordinated through a master database and then propagated, creating a scaling bottleneck. Image and large-file handling is attributed to Amazon Cloud storage, so it’s less likely to be the main culprit. A likely long-term remedy is sharding—splitting the database into regional clusters and routing users to the right cluster using encoded information in URLs—though that adds significant engineering complexity.
Why can Notion feel responsive while still being slow overall?
What system components are identified as likely contributors to latency?
How does the described write path create a scaling bottleneck?
Why are templates with many blocks singled out as a pain point?
What is sharding, and how would it help Notion’s speed?
How might Notion route users to the correct cluster without exposing internal details?
Review Questions
- Which part of Notion’s architecture is most associated with “snappy typing,” and why?
- What specific property of the write path makes a single master database hard to scale?
- How would sharding change both performance and URL/routing complexity?
Key Points
1. Notion’s front end can stay responsive because it runs as a full client-side web app, so typing doesn’t always depend on slow back-end operations.
2. The database layer is the most likely source of user-visible latency, since image and large-file delivery is attributed to Amazon Cloud storage.
3. Write coordination through a master database creates a scaling bottleneck because updates must propagate to other machines, risking replication “traffic jams.”
4. Operations that load permission-heavy content, like templates with many blocks, are more likely to trigger noticeable delays.
5. Distributed failover and coordination can collapse under peak load when machines can’t communicate, leading to cascading failures.
6. Sharding (splitting data into multiple regional clusters) can reduce latency by keeping reads/writes closer to users.
7. Sharding requires new routing logic, likely encoding cluster identity into URLs so the back end can redirect requests correctly.