Uber Writes A Data Store To Save 6 Million

TL;DR

Uber migrated payment transaction data from DynamoDB and blob storage into an immutable long-term store called LedgerStore, targeting both cost savings and stronger integrity guarantees.

Briefing

Uber built LedgerStore, a purpose-built long-term data store, to take over payment transaction storage from DynamoDB and blob storage, cutting annual maintenance costs by an estimated $6 million while migrating roughly 1 trillion records. The shift matters because it targets one of the hardest problems in fintech-scale systems: keeping financial transaction data both complete and correct over time, across many access patterns, without letting storage costs and operational complexity spiral.

Before LedgerStore, DynamoDB handled only “hot” payment data up to about 12 weeks old, while older data lived in an in-house blob service. Rising storage costs pushed Uber to reduce DynamoDB usage, and the new architecture extends that cost-control strategy into a dedicated system designed for financial integrity. LedgerStore is described as immutable storage that provides verifiable completeness and correctness guarantees, treating ledgers as the source of truth for financial events and data movement.

The core engineering challenge is indexing at massive scale. Uber needs to look up ledgers via many access patterns, which drives “trillions of indexes” over “hundreds of billions of ledgers.” LedgerStore therefore relies on strongly consistent indexing, supported by a two-phase commit approach: it persists an intent in the index first, then writes the record, and finally asynchronously commits or rolls back the intent depending on whether the record write succeeded. The transcript also emphasizes the operational reality of managing those indexes over time, including index life cycle management, which automates reindexing when index definitions change. That process creates a new index, backfills data, validates it, swaps to the new index, and deletes the old one.
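The intent-first write order described above can be illustrated with a small in-memory sketch. This is not Uber's implementation: the class `ToyLedgerStore`, its methods, and the rule that a pending intent is visible to readers only when its record already exists are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class IntentState(Enum):
    PENDING = "pending"
    COMMITTED = "committed"
    ROLLED_BACK = "rolled_back"


@dataclass
class ToyLedgerStore:
    """In-memory stand-in; every name here is hypothetical."""
    records: dict = field(default_factory=dict)  # record_id -> payload
    index: dict = field(default_factory=dict)    # (index_key, record_id) -> IntentState
    pending: list = field(default_factory=list)  # intents awaiting resolution

    def write(self, record_id, payload, index_key, fail_record_write=False):
        # Step 1: persist the intent in the index before touching the record.
        entry = (index_key, record_id)
        self.index[entry] = IntentState.PENDING
        self.pending.append(entry)
        # Step 2: write the record (a failure can be simulated for testing).
        if not fail_record_write:
            self.records[record_id] = payload

    def resolve_intents(self):
        # Step 3: asynchronous in a real system; called by hand in this sketch.
        while self.pending:
            entry = self.pending.pop()
            _, record_id = entry
            if record_id in self.records:
                self.index[entry] = IntentState.COMMITTED
            else:
                self.index[entry] = IntentState.ROLLED_BACK

    def lookup(self, index_key):
        visible = []
        for (key, record_id), state in self.index.items():
            if key != index_key or state is IntentState.ROLLED_BACK:
                continue
            # Assumption: a pending intent counts only if its record exists.
            if state is IntentState.COMMITTED or record_id in self.records:
                visible.append(record_id)
        return sorted(visible)
```

Because the intent is written first, a crash between the two steps leaves at worst a dangling intent that the resolver rolls back, never a record that no index can find.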

Cost and performance improvements are paired with migration safety. Uber used “shadow validation” by double-writing payment data to both DynamoDB and LedgerStore, then comparing read results to confirm a near one-to-one match before fully switching. The team paired this with offline validation and historical data validation, and moved older data with an incremental backfill job running in Apache Spark. The backfill load reportedly reached about 10x normal production load and took around three months.
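A minimal sketch of the double-write and read-comparison idea follows, using plain dictionaries in place of the two stores. The function names `shadow_write` and `compare_reads` and the match-rate metric are hypothetical, chosen only to illustrate the technique.

```python
def shadow_write(primary, shadow, key, value):
    """Double-write: the primary store stays authoritative during validation."""
    primary[key] = value
    shadow[key] = value


def compare_reads(primary, shadow, keys):
    """Read each key from both stores and collect the keys that disagree."""
    mismatches = []
    for key in keys:
        if primary.get(key) != shadow.get(key):
            mismatches.append(key)
    # Fraction of sampled keys on which both stores returned the same value.
    match_rate = 1.0 - len(mismatches) / len(keys) if keys else 1.0
    return match_rate, mismatches
```

In practice the mismatch list matters more than the rate: each disagreeing key is a concrete case to investigate before cutover.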

To reduce risk during rollout, Uber took a conservative deployment strategy with a fallback path: if data wasn’t found in LedgerStore, systems could fetch it from DynamoDB. The migration reportedly completed without downtime or outages. Overall savings were estimated at more than $6 million per year, attributed to reduced reliance on DynamoDB and blob storage and the simplified long-term storage architecture. In the transcript’s framing, that is enough to fund a handful of additional engineers.
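The fallback path can be sketched as a read that tries the new store first and consults the old one only on a miss. The function name, the dictionary stand-ins, and the source labels below are illustrative assumptions, not Uber's API.

```python
def read_with_fallback(key, new_store, old_store):
    """Return (value, source); fall back to the old store on a miss."""
    value = new_store.get(key)
    if value is not None:
        return value, "ledger_store"
    # Miss in the new store: consult the old store before giving up.
    value = old_store.get(key)
    if value is not None:
        return value, "dynamodb_fallback"
    raise KeyError(key)
```

Returning the source alongside the value makes it easy to count fallback hits, which should trend toward zero as the backfill completes.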

The discussion around why build instead of buy centers on the mismatch between generic distributed databases and the specific requirements of immutable financial ledgers, verifiable integrity, and strongly consistent indexing at Uber’s scale. Even with skepticism about reinventing storage, the transcript repeatedly returns to the same point: when data volume, correctness requirements, and indexing demands reach extreme levels, bespoke engineering can become the only way to meet both integrity and cost targets.

Cornell Notes

Uber migrated payment transaction data from DynamoDB and blob storage into a purpose-built immutable long-term store called LedgerStore, targeting both cost reduction and stronger data integrity guarantees. LedgerStore is designed so ledgers act as the source of truth for financial events, with verifiable completeness and correctness. The system must support many access patterns, which creates massive indexing needs (trillions of indexes), so it uses strongly consistent indexing with a two-phase-commit-style workflow (intent persisted first, then record write, then async commit/rollback). Uber reduced risk during migration by shadow-writing to both systems and comparing read results, then using validation plus a fallback to DynamoDB. The result was an estimated yearly savings of over $6 million and a migration completed without downtime.

Why did Uber move payment storage away from DynamoDB and blob storage?

DynamoDB was used only for the most recent ~12 weeks of payment data, while older data sat in an in-house blob service. Rising storage costs drove Uber to reduce DynamoDB usage, and LedgerStore extends that approach by replacing the long-term storage layer with a dedicated system built for financial transaction integrity and cost control. The migration is described as cutting estimated annual maintenance costs by more than $6 million.

What makes LedgerStore different from a typical database from an integrity standpoint?

LedgerStore is described as immutable and built to provide verifiable data completeness and correctness guarantees. Since ledgers are treated as the source of truth for financial events and data movement, the store must ensure that what’s written is complete and correct, not merely eventually consistent. That integrity focus shows up again in the migration strategy (shadow validation and offline validation) and in the indexing workflow.

How does LedgerStore handle the indexing problem at Uber’s scale?

Uber needs to query ledgers through many access patterns, which leads to “trillions of indexes” over “hundreds of billions of ledgers.” To support strongly consistent index behavior, the transcript describes a two-phase commit approach: LedgerStore persists an intent in the index, then writes the record, and finally asynchronously commits or rolls back the intent depending on success or failure. It also supports index life cycle management: when definitions change, it creates a new index, backfills it, validates it, swaps over to it, and deletes the old index.
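The five life cycle steps can be sketched as one function over in-memory dictionaries. The name `reindex`, the `key_fn` parameter, and the validation rule (every record must be reachable through the new index) are assumptions for illustration; a real system would validate against the source independently and at far larger scale.

```python
def reindex(records, indexes, active, new_name, key_fn):
    """Create, backfill, validate, swap, delete. Returns the new active name."""
    # 1. Create the new index alongside the old one.
    new_index = {}
    indexes[new_name] = new_index
    # 2. Backfill it from existing records using the new definition.
    for record_id, record in records.items():
        new_index.setdefault(key_fn(record), set()).add(record_id)
    # 3. Validate: every record must be reachable through the new index.
    indexed = set().union(*new_index.values()) if new_index else set()
    if indexed != set(records):
        del indexes[new_name]
        raise ValueError("validation failed; keeping the old index")
    # 4. Swap (the caller starts using the returned name) and
    # 5. delete the old index.
    del indexes[active]
    return new_name
```

The old index stays in place until validation passes, so a failed backfill leaves readers on the previous definition.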

What migration techniques reduced the risk of switching from DynamoDB to LedgerStore?

Uber used shadow validation by double-writing data to DynamoDB and LedgerStore and comparing read results between the two systems to check for a near one-to-one mapping. It also used offline validation for correctness, plus historical data validation and an incremental backfill job in Apache Spark. For rollout safety, systems could fall back to DynamoDB if LedgerStore didn’t have the requested data.
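The source does not describe how the Spark job was structured, so the following is only a plain-Python sketch of what "incremental" can mean: copy one bounded batch at a time and record a checkpoint so the job can resume. The function name, the key-ordered checkpoint, and the batch size are all hypothetical.

```python
def incremental_backfill(source, target, checkpoint, batch_size):
    """Copy one batch of records past the checkpoint; return the new checkpoint."""
    keys = sorted(k for k in source if checkpoint is None or k > checkpoint)
    batch = keys[:batch_size]
    for key in batch:
        # Idempotent copy: re-running a batch rewrites the same values.
        target[key] = source[key]
    # An unchanged checkpoint signals that nothing is left to copy.
    return batch[-1] if batch else checkpoint
```

Bounding each batch is also one way to keep backfill load under control, since the job can pause between batches when production traffic is high.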

What operational costs and timeline constraints came with the migration?

The incremental backfill job reportedly generated load around 10x usual production load and took about three months. The transcript frames this as a major operational burden, alongside the need for validation and conservative rollout to avoid outages. Despite that, the migration reportedly finished without downtime or outage during or after the switch.

Review Questions

What integrity guarantees does LedgerStore provide, and why are those guarantees especially important for payment transaction data?
Explain the role of strongly consistent indexing and the two-phase commit workflow described for LedgerStore.
How did shadow validation and fallback to DynamoDB reduce migration risk during Uber’s switch to LedgerStore?

Key Points

1. Uber migrated payment transaction data from DynamoDB and blob storage into an immutable long-term store called LedgerStore, targeting both cost savings and stronger integrity guarantees.
2. DynamoDB was limited to hot payment data for about 12 weeks, with older data stored in an in-house blob service before the LedgerStore transition.
3. LedgerStore treats ledgers as the source of truth for financial events and data movement, aiming for verifiable completeness and correctness.
4. Massive indexing requirements (trillions of indexes over hundreds of billions of ledgers) drive the need for strongly consistent indexing and a two-phase-commit-style workflow.
5. Uber managed index changes through index life cycle management: create new index, backfill, validate, swap, and delete the old index.
6. Migration safety relied on shadow validation (double-write and read comparison), offline validation, Apache Spark backfill, and a fallback path to DynamoDB.
7. The migration reportedly completed without downtime and delivered estimated yearly savings of over $6 million.

Highlights

LedgerStore is positioned as immutable financial storage that provides verifiable completeness and correctness guarantees, with ledgers serving as the source of truth.

Indexing is the bottleneck at Uber scale: the system needs to support many access patterns, leading to trillions of indexes and requiring strongly consistent behavior.

Shadow validation, which double-writes to DynamoDB and LedgerStore and compares reads, was used to confirm correctness before full cutover.

Index life cycle management automates reindexing when definitions change: create the new index, backfill, validate, swap, and delete the old index.

The migration used a conservative rollout with fallback to DynamoDB and reportedly avoided downtime, despite a three-month backfill running at ~10x normal load.

Topics

LedgerStore
DynamoDB Migration
Strongly Consistent Indexing
Two-Phase Commit
Index Life Cycle Management