Intro to Cloud Hacking (Leaky Buckets)
Based on NetworkChuck's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.
Briefing
Cloud security failures are often simple misconfigurations—especially in Amazon S3—and they can be exploited with basic, publicly available techniques. The core takeaway is that “leaky buckets” aren’t rare edge cases: many S3 buckets are discoverable through DNS and then readable through direct object access or overly permissive policies. That combination turns cloud hacking into something closer to a checklist than a deep research project, which is exactly why the topic is framed as both educational and unsettling.
The walkthrough starts with Amazon S3 itself: a storage service where companies place files that can be served like an external drive or cloud folder. It then explains why breaches happen—primarily because bucket settings and access policies are complicated, and because older buckets may not have the safer defaults that block public access and enable encryption. Even when a bucket is “public” in the console, access can still fail unless the bucket policy is correctly written in JSON; a small mistake can either lock data away or, in the worst case, expose it.
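For illustration, here is a minimal sketch of the kind of JSON bucket policy the paragraph describes (the bucket name is a placeholder). Note how small the margin for error is: with `"Resource": "arn:aws:s3:::example-bucket/*"` every object is world-readable, while dropping the `/*` leaves the same policy granting nothing, since `s3:GetObject` applies to objects, not the bucket itself.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```

The `"Principal": "*"` line is what makes this fully public; the legacy "any authenticated AWS user" pattern discussed later is a similarly broad grant, just gated behind having any valid AWS identity.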
To demonstrate how discovery works, the process begins at a challenge site (flaws.cloud) and uses DNS lookups to confirm that the domain resolves to an S3 website endpoint. With that confirmation, the method shifts to the AWS CLI: the walkthrough first lists bucket contents anonymously (using the s3:// URL scheme) to show how public buckets can be enumerated and files downloaded without credentials. A “secret” file is then retrieved from the public bucket, illustrating how quickly an exposed object can turn into a data leak.
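The discovery and anonymous-listing steps can be sketched as below, assuming `dig` and the AWS CLI are installed (each part is skipped when the tool is absent). flaws.cloud is the challenge site named above; the downloaded file is shown as `index.html` rather than the challenge's actual secret object.

```shell
#!/bin/sh
# A reverse lookup that returns a name like s3-website-us-west-2.amazonaws.com
# confirms the domain is served from an S3 website endpoint.
is_s3_website() {
  case "$1" in
    *s3-website*amazonaws.com*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v dig >/dev/null 2>&1; then
  ip=$(dig +short flaws.cloud | head -n 1)   # forward lookup of the domain
  ptr=$(dig +short -x "$ip")                 # reverse lookup of that IP
  is_s3_website "$ptr" && echo "S3 website endpoint: $ptr"
fi

if command -v aws >/dev/null 2>&1; then
  # Anonymous (unsigned) enumeration and download of a public bucket
  aws s3 ls s3://flaws.cloud/ --no-sign-request || true
  aws s3 cp s3://flaws.cloud/index.html . --no-sign-request || true
fi
```

The `--no-sign-request` flag is what makes the request anonymous: no AWS account or credentials are involved at all.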
The next phase shows how credentials can turn a near miss into a full compromise. After creating an AWS account, the workflow sets up an IAM user with the AmazonS3FullAccess policy, generates access keys, and configures an AWS CLI profile. With those credentials, a bucket that previously returned “Access denied” becomes readable. The reason is an older, risky configuration pattern: allowing access to “any authenticated AWS user,” meaning anyone with valid AWS credentials can read the bucket’s contents.
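A sketch of this step, with placeholder keys, profile name, and bucket: the credentials file is written directly (it is the same format `aws configure --profile` produces) so the example runs even without the AWS CLI installed, and the listing is only attempted when the CLI is present.

```shell
#!/bin/sh
# Placeholder credentials written to an isolated file, not ~/.aws,
# so nothing on the machine is modified.
creds=$(mktemp)
cat > "$creds" <<'EOF'
[lab]
aws_access_key_id = AKIAEXAMPLEPLACEHOLDER
aws_secret_access_key = example-secret-key-not-real
EOF

# Retry the bucket that was "Access denied" anonymously. A bucket policy
# granting read to "any authenticated AWS user" is satisfied by ANY valid
# AWS identity, including one from a throwaway account.
if command -v aws >/dev/null 2>&1; then
  AWS_SHARED_CREDENTIALS_FILE="$creds" \
    aws s3 ls s3://example-protected-bucket/ --profile lab || true
fi
grep '^\[lab\]' "$creds"
```

The key point is that "authenticated" here does not mean "authorized by the bucket owner"; it means any of the millions of AWS accounts in existence.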
A later level demonstrates a different failure mode: sensitive data hidden in version history. The bucket content includes a Git-based website update, and the attacker uses AWS CLI sync to pull down the site. By inspecting Git directories and checking out an earlier commit, the walkthrough recovers an “access keys” file that had been accidentally committed and later removed. Those leaked keys are then used to create another AWS CLI profile, which reveals additional buckets the attacker can access—showing how one mistake can cascade into broader exposure.
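The git-history recovery can be sketched as follows. In the walkthrough the site is pulled down with something like `aws s3 sync s3://<level-bucket>/ . --no-sign-request`; to keep this runnable without AWS access, a local repo with a committed-then-removed secrets file (placeholder contents) is fabricated instead, and the recovery steps are the same.

```shell
#!/bin/sh
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Fabricate a repo that mimics the bucket's Git artifacts: a secrets file
# committed by accident, then removed in a later commit.
git init -q .
git config user.email demo@example.com
git config user.name demo
echo "aws_access_key_id = AKIA_PLACEHOLDER" > access_keys.txt
git add access_keys.txt
git commit -qm "site update (secrets committed by mistake)"
git rm -q access_keys.txt
git commit -qm "remove sensitive file"

# The current tree no longer contains the secret...
[ ! -f access_keys.txt ] && echo "secret absent from current tree"

# ...but history still has it: find the first commit and restore the file.
first=$(git rev-list --max-parents=0 HEAD)
git checkout -q "$first" -- access_keys.txt
cat access_keys.txt
```

Deleting a file in a later commit removes it from the working tree, not from history, which is why secrets must be rotated (not just removed) once committed.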
The session ends by pointing to additional practice challenges and a practical tool, AWSBucketDump, which automates searching for S3 buckets using keyword lists and can download contents at scale. The overall message is blunt: S3 misconfiguration, overly broad permissions, and leaked credentials in public artifacts are recurring, fixable problems—and they remain exploitable until teams audit bucket policies, rotate keys, and remove sensitive data from version control.
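This is not the tool's actual code, but a minimal sketch of what such bucket-hunting tools automate: expanding keywords into likely bucket names and probing each name's S3 URL. The keyword and suffix lists are hypothetical, and the network probe only runs when `DO_PROBE=1` is set.

```shell
#!/bin/sh
# Expand each keyword into a few common bucket-naming patterns.
candidates() {
  for kw in "$@"; do
    for suffix in "" "-backup" "-dev" "-assets"; do
      echo "${kw}${suffix}"
    done
  done
}

candidates acme   # acme, acme-backup, acme-dev, acme-assets

if [ "${DO_PROBE:-0}" = "1" ] && command -v curl >/dev/null 2>&1; then
  for name in $(candidates acme); do
    # 404 = no such bucket, 403 = exists but denied, 200 = publicly listable
    code=$(curl -s -o /dev/null -w '%{http_code}' "https://${name}.s3.amazonaws.com/")
    echo "$name -> $code"
  done
fi
```

Because bucket names live in a global namespace, this guessing works at scale, which is why auditing and key rotation matter even for buckets you never advertised.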
Cornell Notes
Amazon S3 breaches often come from “leaky bucket” conditions: public or discoverable buckets, miswritten bucket policies, and legacy permission models that allow access to any authenticated AWS user. The workflow demonstrates how to confirm an S3-backed domain via DNS lookups, then enumerate and download objects using the AWS CLI without credentials when buckets are public. When buckets require authentication, creating an IAM user and using its access keys can still succeed if the bucket policy is overly permissive (e.g., “authenticated AWS users” can read). A separate level shows how Git artifacts can preserve secrets in earlier commits, enabling recovery of leaked access keys and access to additional buckets. The practical takeaway is that small configuration and operational mistakes can expand into full data exposure.
How can a domain be verified as pointing to an Amazon S3 bucket before attempting any S3 actions?
What’s the difference between listing a public S3 bucket anonymously and listing a private one with credentials?
Why did the “Access denied” bucket become readable after using an IAM profile?
How can Git history inside an S3-hosted website lead to recovered secrets?
Once access keys are recovered from a bucket, how does that expand the attack surface?
What does the AWSBucketDump tool automate, and why does it matter?
Review Questions
- What DNS steps would you take to confirm a domain is backed by an S3 website endpoint?
- Describe two distinct reasons S3 buckets can be “leaky” based on the walkthrough’s levels.
- How does checking out an earlier Git commit help recover secrets that aren’t visible in the current bucket contents?
Key Points
1. S3 “leaks” frequently start with discoverability: DNS lookups can confirm a domain maps to an S3 website endpoint.
2. Public buckets can often be enumerated and downloaded with the AWS CLI using unsigned requests (e.g., --no-sign-request).
3. Bucket policies are error-prone; older buckets may lack modern defaults that block public access and enforce safer settings.
4. Overly broad legacy permissions—such as allowing any authenticated AWS user—can turn “private” into effectively readable for anyone with credentials.
5. Leaked access keys can be used to create new AWS CLI profiles, which may expose additional buckets beyond the initial target.
6. Git artifacts preserved in bucket-hosted sites can retain secrets in earlier commits, recoverable via git checkout.
7. Automated tooling like AWSBucketDump can scale bucket discovery and content collection, making auditing and key rotation even more critical.