Intro to Cloud Hacking (Leaky Buckets)
Based on NetworkChuck's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.
Briefing
Cloud security failures are often simple misconfigurations—especially in Amazon S3—and they can be exploited with basic, publicly available techniques. The core takeaway is that “leaky buckets” aren’t rare edge cases: many S3 buckets are discoverable through DNS and then readable through direct object access or overly permissive policies. That combination turns cloud hacking into something closer to a checklist than a deep research project, which is exactly why the topic is framed as both educational and unsettling.
The walkthrough starts with Amazon S3 itself: a storage service where companies place files that can be served like an external drive or cloud folder. It then explains why breaches happen—primarily because bucket settings and access policies are complicated, and because older buckets may not have the safer defaults that block public access and enable encryption. Even when a bucket is “public” in the console, access can still fail unless the bucket policy is correctly written in JSON; a small mistake can either lock data away or, in the worst case, expose it.
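For illustration, here is a minimal sketch of the kind of JSON bucket policy the paragraph describes (the bucket name is a placeholder). Note how small the margin for error is: with `"Resource": "arn:aws:s3:::example-bucket/*"` every object is world-readable, while dropping the `/*` leaves the same policy granting nothing, since `s3:GetObject` applies to objects, not the bucket itself.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```

The `"Principal": "*"` line is what makes this fully public; the legacy "any authenticated AWS user" pattern discussed later is a similarly broad grant, just gated behind having any valid AWS identity.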
To demonstrate how discovery works, the process begins at a challenge site (flaws.cloud) and uses DNS lookups to confirm that the domain resolves to an S3 website endpoint. With that confirmation, the method shifts to the AWS CLI: the walkthrough first lists bucket contents anonymously (using the s3:// URL scheme) to show how public buckets can be enumerated and files downloaded without credentials. A “secret” file is then retrieved from the public bucket, illustrating how quickly an exposed object can turn into a data leak.
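The discovery and anonymous-listing steps can be sketched as below, assuming `dig` and the AWS CLI are installed (each part is skipped when the tool is absent). flaws.cloud is the challenge site named above; the downloaded file is shown as `index.html` rather than the challenge's actual secret object.

```shell
#!/bin/sh
# A reverse lookup that returns a name like s3-website-us-west-2.amazonaws.com
# confirms the domain is served from an S3 website endpoint.
is_s3_website() {
  case "$1" in
    *s3-website*amazonaws.com*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v dig >/dev/null 2>&1; then
  ip=$(dig +short flaws.cloud | head -n 1)   # forward lookup of the domain
  ptr=$(dig +short -x "$ip")                 # reverse lookup of that IP
  is_s3_website "$ptr" && echo "S3 website endpoint: $ptr"
fi

if command -v aws >/dev/null 2>&1; then
  # Anonymous (unsigned) enumeration and download of a public bucket
  aws s3 ls s3://flaws.cloud/ --no-sign-request || true
  aws s3 cp s3://flaws.cloud/index.html . --no-sign-request || true
fi
```

The `--no-sign-request` flag is what makes the request anonymous: no AWS account or credentials are involved at all.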
The next phase shows how credentials can turn a near miss into a full compromise. After creating an AWS account, the workflow sets up an IAM user with the AmazonS3FullAccess policy, generates access keys, and configures an AWS CLI profile. With those credentials, a bucket that previously returned “Access denied” becomes readable. The reason is an older, risky configuration pattern: allowing access to “any authenticated AWS user,” meaning anyone with valid AWS credentials can read the bucket’s contents.
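A sketch of this step, with placeholder keys, profile name, and bucket: the credentials file is written directly (it is the same format `aws configure --profile` produces) so the example runs even without the AWS CLI installed, and the listing is only attempted when the CLI is present.

```shell
#!/bin/sh
# Placeholder credentials written to an isolated file, not ~/.aws,
# so nothing on the machine is modified.
creds=$(mktemp)
cat > "$creds" <<'EOF'
[lab]
aws_access_key_id = AKIAEXAMPLEPLACEHOLDER
aws_secret_access_key = example-secret-key-not-real
EOF

# Retry the bucket that was "Access denied" anonymously. A bucket policy
# granting read to "any authenticated AWS user" is satisfied by ANY valid
# AWS identity, including one from a throwaway account.
if command -v aws >/dev/null 2>&1; then
  AWS_SHARED_CREDENTIALS_FILE="$creds" \
    aws s3 ls s3://example-protected-bucket/ --profile lab || true
fi
grep '^\[lab\]' "$creds"
```

The key point is that "authenticated" here does not mean "authorized by the bucket owner"; it means any of the millions of AWS accounts in existence.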
A later level demonstrates a different failure mode: sensitive data hidden in version history. The bucket content includes a Git-based website update, and the attacker uses AWS CLI sync to pull down the site. By inspecting Git directories and checking out an earlier commit, the walkthrough recovers an “access keys” file that had been accidentally committed and later removed. Those leaked keys are then used to create another AWS CLI profile, which reveals additional buckets the attacker can access—showing how one mistake can cascade into broader exposure.
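The git-history recovery can be sketched as follows. In the walkthrough the site is pulled down with something like `aws s3 sync s3://<level-bucket>/ . --no-sign-request`; to keep this runnable without AWS access, a local repo with a committed-then-removed secrets file (placeholder contents) is fabricated instead, and the recovery steps are the same.

```shell
#!/bin/sh
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Fabricate a repo that mimics the bucket's Git artifacts: a secrets file
# committed by accident, then removed in a later commit.
git init -q .
git config user.email demo@example.com
git config user.name demo
echo "aws_access_key_id = AKIA_PLACEHOLDER" > access_keys.txt
git add access_keys.txt
git commit -qm "site update (secrets committed by mistake)"
git rm -q access_keys.txt
git commit -qm "remove sensitive file"

# The current tree no longer contains the secret...
[ ! -f access_keys.txt ] && echo "secret absent from current tree"

# ...but history still has it: find the first commit and restore the file.
first=$(git rev-list --max-parents=0 HEAD)
git checkout -q "$first" -- access_keys.txt
cat access_keys.txt
```

Deleting a file in a later commit removes it from the working tree, not from history, which is why secrets must be rotated (not just removed) once committed.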
The session ends by pointing to additional practice challenges and a practical tool, AWSBucketDump, which automates searching for S3 buckets using keyword lists and can download contents at scale. The overall message is blunt: S3 misconfiguration, overly broad permissions, and leaked credentials in public artifacts are recurring, fixable problems—and they remain exploitable until teams audit bucket policies, rotate keys, and remove sensitive data from version control.
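This is not the tool's actual code, but a minimal sketch of what such bucket-hunting tools automate: expanding keywords into likely bucket names and probing each name's S3 URL. The keyword and suffix lists are hypothetical, and the network probe only runs when `DO_PROBE=1` is set.

```shell
#!/bin/sh
# Expand each keyword into a few common bucket-naming patterns.
candidates() {
  for kw in "$@"; do
    for suffix in "" "-backup" "-dev" "-assets"; do
      echo "${kw}${suffix}"
    done
  done
}

candidates acme   # acme, acme-backup, acme-dev, acme-assets

if [ "${DO_PROBE:-0}" = "1" ] && command -v curl >/dev/null 2>&1; then
  for name in $(candidates acme); do
    # 404 = no such bucket, 403 = exists but denied, 200 = publicly listable
    code=$(curl -s -o /dev/null -w '%{http_code}' "https://${name}.s3.amazonaws.com/")
    echo "$name -> $code"
  done
fi
```

Because bucket names live in a global namespace, this guessing works at scale, which is why auditing and key rotation matter even for buckets you never advertised.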
Cornell Notes
Amazon S3 breaches often come from “leaky bucket” conditions: public or discoverable buckets, miswritten bucket policies, and legacy permission models that allow access to any authenticated AWS user. The workflow demonstrates how to confirm an S3-backed domain via DNS lookups, then enumerate and download objects using the AWS CLI without credentials when buckets are public. When buckets require authentication, creating an IAM user and using its access keys can still succeed if the bucket policy is overly permissive (e.g., “authenticated AWS users” can read). A separate level shows how Git artifacts can preserve secrets in earlier commits, enabling recovery of leaked access keys and access to additional buckets. The practical takeaway is that small configuration and operational mistakes can expand into full data exposure.
How can a domain be verified as pointing to an Amazon S3 bucket before attempting any S3 actions?
What’s the difference between listing a public S3 bucket anonymously and listing a private one with credentials?
Why did the “Access denied” bucket become readable after using an IAM profile?
How can Git history inside an S3-hosted website lead to recovered secrets?
Once access keys are recovered from a bucket, how does that expand the attack surface?
What does the AWSBucketDump tool automate, and why does it matter?
Review Questions
- What DNS steps would you take to confirm a domain is backed by an S3 website endpoint?
- Describe two distinct reasons S3 buckets can be “leaky” based on the walkthrough’s levels.
- How does checking out an earlier Git commit help recover secrets that aren’t visible in the current bucket contents?
Key Points
1. S3 “leaks” frequently start with discoverability: DNS lookups can confirm a domain maps to an S3 website endpoint.
2. Public buckets can often be enumerated and downloaded with the AWS CLI using unsigned requests (e.g., --no-sign-request).
3. Bucket policies are error-prone; older buckets may lack modern defaults that block public access and enforce safer settings.
4. Overly broad legacy permissions—such as allowing any authenticated AWS user—can turn “private” into effectively readable for anyone with credentials.
5. Leaked access keys can be used to create new AWS CLI profiles, which may expose additional buckets beyond the initial target.
6. Git artifacts preserved in bucket-hosted sites can retain secrets in earlier commits, recoverable via git checkout.
7. Automated tooling like AWSBucketDump can scale bucket discovery and content collection, making auditing and key rotation even more critical.