$135 Billion Accidentally Deleted By Google
Based on ThePrimeTime's video on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.
UniSuper’s Google Cloud VMware Engine private cloud was automatically deleted one year after provisioning, breaking a key service.
Briefing
A Google Cloud VMware Engine private cloud instance used by Australia's UniSuper, which manages about A$135 billion for more than 600,000 members, was automatically deleted exactly one year after it was created, knocking out a key service and wiping the underlying compute and database stack. The outage was so disruptive because, while restoring the application was relatively straightforward, recovering the data required extensive manual work and depended entirely on backups kept outside the deleted environment.
The chain of events traces back to May 1, 2023. UniSuper needed a VMware-based private cloud on Google Cloud, but its capacity requirements couldn’t be met through Google’s standard public interface. UniSuper contacted Google to provision a special private cloud using an internal deployment tool. Engineers reviewed the ticket, filled in account, region, and hardware specifications, and ran the command. Everything looked correct—until the private cloud’s default behavior kicked in.
On May 1, 2024, UniSuper developers found that a critical service had stopped responding and that the private cloud hosting it appeared to have vanished. Audit logs showed no human deletion, and the private cloud didn't behave like a typical customer-managed environment, which would normally persist indefinitely. The investigation turned to Google's internal API logs, which revealed that the instance had been deleted automatically when its one-year fixed term expired. That auto-deletion behavior was never supposed to be enabled for a customer-created private cloud.
The root cause was a missing parameter in the internal tool call made by Google engineers a year earlier. Because that parameter wasn’t included, the private cloud was created with the default “auto-delete after 1 year” setting. Google later confirmed the fault was on its side, after UniSuper initially issued a statement blaming a third-party provider and then updated it to name Google Cloud. By May 8, Google’s CEO publicly acknowledged responsibility.
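The failure mode generalizes well beyond this incident: when a provisioning call omits an optional parameter, the tool silently falls back to its default, and a dangerous default becomes a time bomb. A minimal sketch of that pattern (all names here are hypothetical illustrations, not Google's actual tooling):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrivateCloud:
    name: str
    # Hypothetical default: a one-year fixed term unless told otherwise.
    ttl_days: int = 365


def provision(name: str, ttl_days: Optional[int] = None) -> PrivateCloud:
    """Sketch of an internal deployment call.

    Omitting ttl_days (as the engineers effectively did) silently
    inherits the auto-delete-after-365-days default.
    """
    if ttl_days is None:
        return PrivateCloud(name)  # default kicks in: deleted in a year
    return PrivateCloud(name, ttl_days)


# Operator fills in account, region, and hardware, but omits retention:
cloud = provision("unisuper-gcve")
print(cloud.ttl_days)  # 365 -> silently scheduled for deletion one year out
```

A safer design makes retention a required argument (or rejects destructive defaults), so the omission fails loudly at provisioning time instead of a year later in production.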
Restoring the infrastructure and redeploying the application stack could build on existing code and tooling, but data recovery was the hard part. The deleted private cloud spanned multiple availability zones, so there was no intact replica of the production stack in another zone from which data could be pulled back. UniSuper's survival depended on backups: copies in Google Cloud Storage (a separate service from the deleted private cloud) and backups held with another provider. Once users could again log in and view their balances, service was largely back by May 15, about two weeks after the deletion.
The incident also sparked broader debate about service-level expectations and operational safeguards. Even with backups, the deletion caused downtime and significant recovery effort, raising questions about how defaults in internal tooling can become catastrophic when they slip into production. The practical takeaway was blunt: redundancy has to be real, with multiple copies on different media and with different providers, ideally offsite, because operational mistakes at a cloud provider can still erase environments that customers assumed would persist.
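The redundancy advice above maps onto the common 3-2-1 backup rule: keep at least three copies, on at least two kinds of media or providers, with one offsite. A hedged sketch of the kind of audit that would have predicted UniSuper's survival (the location names are illustrative placeholders, not UniSuper's actual setup):

```python
# Illustrative 3-2-1-style backup audit: after a single provider or
# environment fails, at least two independent copies must survive.
backup_locations = {
    "primary_private_cloud": False,  # deleted along with the environment
    "google_cloud_storage": True,    # separate service, survived
    "second_provider": True,         # independent vendor, survived
}

surviving = [loc for loc, ok in backup_locations.items() if ok]
independent_providers = len(surviving)

# If every copy had lived inside the deleted private cloud, this would fail.
assert independent_providers >= 2, "backups share a single failure domain"
print(f"{independent_providers} independent copies remain: {surviving}")
```

The point of the exercise is that redundancy only counts if the copies sit in different failure domains; replicas inside the same private cloud, even across availability zones, all died together.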
Cornell Notes
UniSuper’s Google Cloud VMware Engine private cloud was automatically deleted one year after provisioning because an internal deployment command was missing a crucial parameter. The omission caused the instance to inherit a default “auto-delete after 1 year” behavior, even though that was never expected for a customer environment. When the deletion occurred on May 1, 2024, UniSuper’s compute and database stack disappeared, breaking a key service. Restoring the application was manageable, but data recovery required extensive manual work using backups stored outside the deleted private cloud. The outage lasted until mid-May; UniSuper’s initial statement blamed a third-party provider, but Google later confirmed the fault was its own, with its CEO publicly acknowledging responsibility.
What exactly failed for UniSuper, and why was it so disruptive even if backups existed?
How did the deletion happen without any visible human action in audit logs?
What was the underlying technical mistake in the provisioning step?
Why couldn’t UniSuper rely on a replica in another availability zone?
What did Google’s public accountability look like after the outage?
What broader lesson emerged about cloud “defaults” and operational risk?
Review Questions
- What missing provisioning detail caused the private cloud to inherit an unexpected auto-deletion behavior?
- Why was data recovery harder than application restoration in UniSuper’s case?
- What backup locations/services were referenced as critical to getting UniSuper back online?
Key Points
1. UniSuper’s Google Cloud VMware Engine private cloud was automatically deleted one year after provisioning, breaking a key service.
2. The deletion occurred due to a missing parameter in Google’s internal deployment tool, which caused an unexpected default auto-delete behavior.
3. Audit logs showed no human deletion; internal API logs indicated the expiry-driven automated deletion.
4. Restoring compute and redeploying code was relatively straightforward, but recovering databases required extensive manual work because the entire multi-zone production environment was deleted.
5. UniSuper avoided total data loss by using backups in Google Cloud Storage and backups with another provider.
6. Google Cloud’s CEO later confirmed the fault was on Google’s side after initial third-party blame and updates from UniSuper.