Accidentally Expose All Your Stuff on S3 with ACLs

Before we learn all the cool ways to prevent data leaks, we're going to... leak some data so you know how these messes keep happening.

Prerequisites

  • None

The Lesson

As we move back into our discussion of data security, I need to remind you that I am not exactly the most subtle individual on the face of the planet. I like my hammers. So once again, I remind you of the two biggest sources of cloud security breaches:

  • Lost/stolen/exposed credentials

  • Public facing stuff

And when it comes to “public stuff” there isn’t a larger source of historical data leaks than Amazon S3. The wild part is that these leaks aren’t because S3 got hacked. Mostly it’s because someone puts data in S3, which defaults to being totally private, and then… makes it public?

Yeah. That’s a thing. For understandable reasons. The good news is that AWS has added multiple layers and defaults to make it much harder to accidentally expose data in S3, but mistakes still happen.

The crux of the issue is that there are some very legitimate reasons to expose data to the Internet in S3; heck, it’s one of the service’s best features! As security nerds our challenge is to understand how and why public exposures occur, how to find and remediate the unapproved ones, and how to prevent inadvertent disclosures moving forward.

Now I could have built this lesson and lab completely around the tools we use to identify and block public access, but that wouldn’t help you understand the how and why of it all. So we’ll start with a couple labs on how to make things public, then switch gears to detection and prevention.

What you need to know about S3

The power of S3 (Simple Storage Service) is that it enables you to store massive amounts of data, and even provide direct Internet access to your data. We use it to replace web servers, host images, serve up data, and… store massive amounts of internal (private) data. S3 is the honey badger of storage… it don’t care what you throw at it, its job is to never back down.

S3 defaults everything to being private, but anything in S3 can be made public over the Internet. This is by design. But due to the high number of inadvertent exposures, AWS has continued to add security controls and defaults, so you have to jump through multiple hoops to make anything public.

S3 is a highly scalable, highly available object storage service. Unlike traditional file systems which organize data in a hierarchy of directories and files, object storage treats data as distinct objects – each consisting of the data itself, metadata describing the object, and a unique identifier. Think of it like a massive digital warehouse where each item (object) has its own tracking number (identifier) and detailed label (metadata) — whether the item is a single photo or an entire database backup.

S3 stores objects in containers called buckets, which serve as the top-level organization unit. This approach enables S3 to scale virtually without limit while maintaining consistent performance and 99.999999999% (11 nines — not a typo) durability.

And there’s the origin of the problem: S3 isn’t like having a private file server you accidentally exposed to the Internet by opening some firewall rules. Anything and everything in S3 is potentially Internet accessible, because S3 is designed to store and serve files over the Internet. In fact when you name a bucket, even a private bucket, that name must be globally unique on the Internet, just in case you want to make it public later.

S3 is also positively ancient by cloud standards; it was released in 2006, which means it can legally vote and enlist in the military (but not drink without a fake ID). That release was looooonnnng before IAM, resource policies, AWS Organizations, and many of our other controls. It has some really weird behaviors — like you can share a bucket with someone, they can fill the bucket with files, and you pay the bill, even though you don’t own and cannot delete the files.

Access Control Lists

Part of the problem with S3 is that it has multiple access mechanisms. Some of this is due to age: as AWS evolves, it develops and adds better options, but it cannot always shut off the old methods because they are still in wide use.

For our purposes I like to distill these down to ACLs, resource policies, and identity-based policies.

  • S3 supports Access Control Lists (ACLs) at the bucket and object levels. These will be our focus today. Ideally you shouldn’t use these, and these days they are disabled by default. Yes, we will learn why.

  • Bucket policies are resource-based, and the preferred method for managing access. We played with one in a previous lab and will learn more in the next lab.

  • Within your account you can also manage access using IAM policies, but they only apply to users and roles in your account, and this mechanism cannot be used to provide direct public access.

ACLs are… weird… compared to everything else we’ve covered in AWS. It’s an older mechanism, from back when dinosaurs ruled the earth. AWS defines them as a sub-resource of a bucket or object, and they define which AWS accounts or groups (including the public) can have access. One sign of how old they are is that ACLs are XML rather than JSON.
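To make the XML point concrete, here is a minimal sketch of what a bucket ACL document looks like and how a script might spot the public grantee. The sample document is hand-written for illustration: the owner ID is a placeholder and the helper name `find_public_grants` is made up. The namespace and the AllUsers group URI are, to the best of my knowledge, the ones S3 uses.

```python
import xml.etree.ElementTree as ET

S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"

# Hand-written sample ACL (placeholder owner ID): the owner has full
# control, and the public "AllUsers" group has been granted READ.
SAMPLE_ACL = f"""<AccessControlPolicy xmlns="{S3_NS}">
  <Owner><ID>EXAMPLE-OWNER-CANONICAL-ID</ID></Owner>
  <AccessControlList>
    <Grant>
      <Grantee xmlns:xsi="{XSI_NS}" xsi:type="CanonicalUser">
        <ID>EXAMPLE-OWNER-CANONICAL-ID</ID>
      </Grantee>
      <Permission>FULL_CONTROL</Permission>
    </Grant>
    <Grant>
      <Grantee xmlns:xsi="{XSI_NS}" xsi:type="Group">
        <URI>{ALL_USERS}</URI>
      </Grantee>
      <Permission>READ</Permission>
    </Grant>
  </AccessControlList>
</AccessControlPolicy>"""


def find_public_grants(acl_xml: str) -> list[str]:
    """Return the permissions granted to the AllUsers (public) group."""
    ns = {"s3": S3_NS}
    root = ET.fromstring(acl_xml)
    public = []
    for grant in root.findall("s3:AccessControlList/s3:Grant", ns):
        uri = grant.findtext("s3:Grantee/s3:URI", default="", namespaces=ns)
        if uri == ALL_USERS:
            public.append(grant.findtext("s3:Permission", default="", namespaces=ns))
    return public


print(find_public_grants(SAMPLE_ACL))  # ['READ']
```

The thing to notice is that "public" is just one more grantee in the list, identified by a URI. Nothing about the document structure makes it stand out.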

A majority of modern use cases in Amazon S3 no longer require the use of ACLs. We recommend that you keep ACLs disabled, except in unusual circumstances where you need to control access for each object individually.

-AWS Documentation

The thing is, I still see them all over the place, and I guarantee you’ll run into them. Not in the “hi, nice to meet you,” way but more, “oh, look at that nice tree, I wonder if it will get out of my way?”

TL;DR: ACLs are antiquated and complicated; don’t use them.

Rather than try to cover everything about ACLs, here are the important bits:

  • ACLs are effectively disabled by default on new buckets. To use them you need to change a setting called S3 Object Ownership.

    • The default setting is Bucket owner enforced. That means whoever owns the bucket owns the objects inside it. This probably sounds weird, but S3 was originally designed so you could give someone access to store their objects in your bucket. You could actually have an object in your bucket which you couldn’t delete.

    • When Object Ownership is set to Bucket owner enforced, all ACL processing is disabled and you need to use bucket policies (next week’s lab) to control access instead.

  • If you enable ACLs, the options are Bucket owner preferred and Object writer. Bucket owner preferred means you own the objects even if someone else writes into your bucket. Object writer means you get to pay for someone else’s stuff.

    • There really are no legitimate use cases for object writer anymore.

  • Access is then granted based on something called a grantee and gets into … deep weirdness very quickly. That’s beyond our scope today; if you need to figure out what these are it will be because you are fixing someone else’s mess. In that case, the documentation is your friend.

  • The big whammy is that you can set a grantee to “all users” (public), which is how a metric ton of data leaks happen.

  • Permissions are read or write (with separate permissions to read or write ACLs).

    • If you set read to all users, the entire Internet can access your data.

    • If you set write to all users, the entire Internet can host their Napster libraries in your bucket.

      • We know it won’t be Napster libraries. Don’t make me say what you’ll really be hosting — this is a family-friendly learning environment.

  • ACLs can apply to buckets and objects. You can have a private bucket but a public object.

    • This gets ugly quickly, because even just scanning object-level permissions is incredibly time consuming and resource intensive.
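Pulling those bullets together, here is a rough sketch of how a checker might reason about ACL exposure. The assumptions are mine: the dictionaries mirror the shape I would expect from an SDK call that returns ACL grants (a list of entries with a `Grantee` and a `Permission`), the ownership strings are the API-style names for the three Object Ownership settings, and the function names and sample data are hypothetical.

```python
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"


def public_permissions(grants):
    """Return the sorted permissions granted to the AllUsers (public) group."""
    return sorted(
        g["Permission"]
        for g in grants
        if g.get("Grantee", {}).get("URI") == ALL_USERS
    )


def audit(ownership, bucket_grants, object_grants_by_key):
    """Report public ACL grants, honoring the Object Ownership setting."""
    if ownership == "BucketOwnerEnforced":
        # ACLs are disabled, so no grant (public or not) affects access.
        return {}
    findings = {}
    perms = public_permissions(bucket_grants)
    if perms:
        findings["<bucket>"] = perms
    # Every object carries its own ACL, so each one is checked separately.
    for key, grants in object_grants_by_key.items():
        perms = public_permissions(grants)
        if perms:
            findings[key] = perms
    return findings


owner = {"Grantee": {"Type": "CanonicalUser", "ID": "EXAMPLE-ID"},
         "Permission": "FULL_CONTROL"}
everyone_read = {"Grantee": {"Type": "Group", "URI": ALL_USERS},
                 "Permission": "READ"}

# A private bucket that still contains one public object.
report = audit(
    "BucketOwnerPreferred",
    bucket_grants=[owner],
    object_grants_by_key={"notes.txt": [owner], "backup.sql": [owner, everyone_read]},
)
print(report)  # {'backup.sql': ['READ']}
```

In a real account the loop over objects is the expensive part, since each object's ACL has to be fetched individually. That is why object-level scanning gets so slow on large buckets.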

Look, this isn’t to scare you. ACLs are messy but we don’t use them much anymore. AWS has also added extensive support to detect and prevent S3 exposures, which we will cover very soon.

But because you are highly likely to encounter ACLs — not because you will necessarily use them yourself — I want you to see what they look like and learn how they are misconfigured.

Key Lesson Points

  • S3 is an Internet-based object storage service which supports public and private repositories.

  • We store objects in buckets.

  • There are multiple security tools for managing access, and Access Control Lists (ACLs) are the oldest and not used by default.

    • But we still see them, and setting an ACL to public is one of the biggest sources of AWS data leaks.

The Lab

I think it’s important to see how misconfigurations happen, so even though you may never use ACLs yourself, today we will make a bucket and set it to public access. When we are done you might ask yourself how any idiot could jump through this many hoops to make sensitive data public, but please remember that you are seeing how AWS works in 2024, not how it worked even a couple years ago. Also, the AWS console will do its best to slow your stupidity roll… but when you operate from the command line, IaC, or API, there are fewer warnings and checkboxes.

To make a bucket public we will:

  • Double-check our account-level Block Public Access settings and disable them if needed. (We’ll review BPA in an upcoming lab.)

  • Create the bucket.

  • Disable bucket-level Block Public Access.

  • Enable ACLs and set object ownership.

  • Finish creating the bucket.

  • Change the ACLs to allow public read access.

  • And check all the warning boxes along the way telling us not to do this.
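For comparison, here is a hedged sketch of roughly the same sequence expressed as API calls, which is where the warnings and checkboxes disappear. The operation and parameter names follow the boto3 S3 client as I understand it; none of this comes from the lab itself. It assumes the us-east-1 region (elsewhere, bucket creation needs a location constraint), skips the account-level check, and uses the `public-read` canned ACL, which gives the public group READ on the bucket (listing) and is slightly narrower than what we pick in the console. The `RecordingClient` class is a made-up stand-in so the sketch runs without touching AWS.

```python
def public_bucket_plan(bucket):
    """Return the ordered (operation, parameters) pairs for the lab."""
    bpa_off = {
        "BlockPublicAcls": False,
        "IgnorePublicAcls": False,
        "BlockPublicPolicy": False,
        "RestrictPublicBuckets": False,
    }
    return [
        # Create the bucket with ACLs enabled (Bucket owner preferred).
        ("create_bucket",
         {"Bucket": bucket, "ObjectOwnership": "BucketOwnerPreferred"}),
        # Turn off all four bucket-level Block Public Access settings.
        ("put_public_access_block",
         {"Bucket": bucket, "PublicAccessBlockConfiguration": bpa_off}),
        # Apply the canned ACL that grants READ to the public group.
        ("put_bucket_acl", {"Bucket": bucket, "ACL": "public-read"}),
    ]


def apply_plan(client, plan):
    """Invoke each planned operation on the given client, in order."""
    for operation, params in plan:
        getattr(client, operation)(**params)


class RecordingClient:
    """Stand-in for a real S3 client: records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, operation):
        def record(**params):
            self.calls.append((operation, params))
        return record


dry_run = RecordingClient()
apply_plan(dry_run, public_bucket_plan("example-bucket-name"))
print([op for op, _ in dry_run.calls])
# ['create_bucket', 'put_public_access_block', 'put_bucket_acl']
```

Three calls, zero confirmation dialogs. Passing a real client to `apply_plan` instead of the recorder would presumably do the real thing, so only try that in a throwaway test account.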

Video Walkthrough

Step-by-Step

Start in your Sign in portal > TestAccount1 > AdministratorAccess. Then go to S3 > click the hamburger [3 horizontal lines] > Block Public Access settings for this account.

These should be off by default, but make sure you check.

Then go to Buckets > Create bucket.

Choose General purpose and then enter a name. S3 bucket names must be globally unique and must meet character restrictions! This means lowercase only, no spaces, dashes are okay but not slashes, and other stuff. I use my first initial and last name, then -slaw-, then random keyboard bashes, so mine looks like rmogull-slaw-32149870erhij.
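If you want to sanity-check a name before typing it in, here is a small sketch of the core character rules. It covers only the rules I am confident about (length, allowed characters, start and end characters, adjacent dots, IP-style names). AWS documents additional restrictions, such as reserved prefixes and suffixes, that this does not check, and a name that passes can still be taken by someone else. The function name is hypothetical.

```python
import re


def bucket_name_problems(name):
    """Return a list of rule violations; an empty list means no problems found."""
    problems = []
    if not 3 <= len(name) <= 63:
        problems.append("length must be 3 to 63 characters")
    if not re.fullmatch(r"[a-z0-9.-]+", name):
        problems.append("only lowercase letters, digits, dots, and hyphens are allowed")
    if not (name[:1].isalnum() and name[-1:].isalnum()):
        problems.append("must start and end with a letter or digit")
    if ".." in name:
        problems.append("dots must not be adjacent")
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", name):
        problems.append("must not be formatted like an IP address")
    return problems


print(bucket_name_problems("rmogull-slaw-32149870erhij"))  # []
print(bucket_name_problems("192.168.0.1"))
# ['must not be formatted like an IP address']
```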

Then scroll down and change Object ownership > ACLs enabled > Bucket owner preferred. Notice the warnings? (You won’t see those on the command line.)

Then we need to disable Block Public Access for the bucket. Uncheck the top and then check the acknowledgement.

Scroll down past encryption and other settings (S3 is always default encrypted; we’ll discuss later) and Create bucket.

Now you have a potentially public bucket, but it isn’t actually public yet.

Click into your bucket to change the ACLs. One sad note here: AWS used to have a big icon if your buckets were public or potentially public, right on this screen. They recently changed that and we need to turn on Access Analyzer for those warnings (which we will do in an upcoming lab).

Then click the Permissions tab and scroll down to Access Control List (ACL) > Edit.

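Yeah, that Grantee stuff looks weird. We want to grant Everyone (public access) List & Read. List allows anyone to see everything in the bucket, and Read allows them to read the bucket’s ACL. Whether they can download a given object still depends on that object’s own ACL.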

Then click Save changes.

And that’s it! You officially own a public bucket, accessible by anyone on the Internet, with… no data inside to lose. Yes, we jumped through a lot of hoops, but to repeat myself: this isn’t an uncommon scenario, and most of those checks and balances are relatively new.
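If you want to see the exposure from the outside, here is a hedged sketch of an anonymous check using only the Python standard library. It assumes the virtual-hosted style URL `https://<bucket>.s3.amazonaws.com/` and my reading of the usual responses (200 for a publicly listable bucket, 403 when anonymous listing is denied, 404 when the bucket does not exist). The function names are made up, and buckets outside the default region may answer with a redirect first.

```python
import urllib.error
import urllib.request


def bucket_url(bucket):
    """Build the virtual-hosted style URL for a bucket's root listing."""
    return f"https://{bucket}.s3.amazonaws.com/"


def interpret_status(status):
    """Translate an HTTP status from an anonymous listing request."""
    if status == 200:
        return "publicly listable"
    if status == 403:
        return "exists, but anonymous listing is denied"
    if status == 404:
        return "no such bucket"
    return f"unexpected status {status}"


def check_anonymous_listing(bucket, timeout=10):
    """Send an unauthenticated GET to the bucket root and interpret the result."""
    try:
        with urllib.request.urlopen(bucket_url(bucket), timeout=timeout) as resp:
            return interpret_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is on the error.
        return interpret_status(err.code)


print(bucket_url("rmogull-slaw-32149870erhij"))
# https://rmogull-slaw-32149870erhij.s3.amazonaws.com/
```

Calling `check_anonymous_listing` with your own bucket name before and after saving the ACL change should show the difference. Only point it at buckets you own.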

Okay, next time we'll make a bucket public using a bucket policy, and then we’ll learn more about Access Analyzer and Block Public Access.

Lab Key Points

  • Don’t use ACLs.

  • Definitely don’t set the grantee to Everyone.

  • Please.

-Rich
