Explore the Power and Pain of User-Data

The user-data field is a powerful tool for automation, but can be a major source of risk.

Rich Mogull
October 03, 2024 • Estimated Reading Time: 10 minutes

Prerequisites

Nothing special this week

The Lesson

Some of my earliest hands-on cloud work was building the CCSK training class for the Cloud Security Alliance back in 2010. Those were simpler days — without IAM, VPCs, or about 200 other services to figure out.

I needed to build labs students could run in their own accounts, but which would still be consistent, wouldn’t require students to be Linux masters, and ideally wouldn’t make them perform a bunch of non-security configuration that would blow all our class time. CloudFormation wasn’t an option yet, and our automation tooling was quite limited.

That’s when I learned about the user-data field. This very cool capability enables you to send data into an instance when you launch it; the instance can read and then act on the user-data. How does it work? It’s simple: when you run an instance you can specify content for the user-data field. This content can be text you type in or the contents of a local file you pass in when running from the command line. It can also point to a remote location, such as an S3 bucket, for the instance to fetch once it boots.
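Here is a minimal sketch of the command-line route, assuming the AWS CLI is installed and configured. The file name, its contents, the instance type, and the AMI ID are placeholders made up for illustration; the only part that matters is the `--user-data file://` option, which tells the CLI to read the field from a local file.

```shell
#!/usr/bin/env bash
# Sketch only: the file contents, AMI ID, and instance type are placeholders.

# Content we want the instance to receive (made-up example value).
cat > user-data.txt <<'EOF'
license_key=EXAMPLE-1234
EOF

launch_with_user_data() {
  local ami_id="$1"   # placeholder: an AMI ID valid in your region
  aws ec2 run-instances \
    --image-id "$ami_id" \
    --instance-type t3.micro \
    --user-data file://user-data.txt
}

# Usage (needs credentials and launches a billable instance):
#   launch_with_user_data ami-xxxxxxxxxxxxxxxxx
```

Depending on your account you may also need to pass a subnet or security group; those are left out to keep the sketch short.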

That data then lives in the metadata service, where it is accessible by instances. We use user-data two ways:

To pass data into the instance which is then available to system software, applications, and (logged-in) users. In the real world this is used for software license codes, variables needed by applications, and… passwords. Yeah, I think you see where I’m headed with this one.
To run scripts on boot. Many operating system images we use in cloud providers and containers (since user-data is supported all over the place now) are configured to execute properly-formatted user-data as a script.

There are good reasons to support both use cases. It enables us to take a single base image and reconfigure it for our current needs when we launch it. It’s a core automation capability which isn’t used quite as often these days, thanks to CloudFormation and Terraform, but is still in widespread use.
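To make the second use case a little more concrete, here is a rough sketch of what "properly-formatted" usually means on Linux images that use cloud-init: the user-data starts with a `#!` line, and the image runs it as root on first boot. The log path and message are invented for illustration, and this is not the script from my class.

```shell
#!/bin/bash
# Hypothetical user-data script. On images that use cloud-init, content
# starting with "#!" is typically executed as root the first time the
# instance boots. The path and message below are made up.
LOG_FILE=/tmp/user-data-demo.log

echo "configured by user-data at $(date)" >> "$LOG_FILE"
```

A real script would install packages or pull configuration at this point; the sketch only leaves a marker file so you can confirm it ran.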

And attackers love it!

In my case I maintained a script in S3 which configured student instances to all run the same software stack, all set up the way we needed for class. Remember that I built the trainings but a lot of other people delivered them, and this kept thousands of students around the world all running a consistent lab setup.

And yeah, it was super hard to set up and maintain, but still offered incredible scalability and automation.

There are two common security risks with user-data:

Cloud users sometimes use it as an easy way to embed credentials into an instance. Everything from database passwords to those pesky AKIAs. Attackers know to look for them as part of their attacks, so scraping user-data from a compromised instance is often automated. The data is also exposed if, instead of compromising the instance, they gain compromised IAM credentials with sufficient access.
Attackers who obtain compromised credentials which enable them to run instances will leverage user-data with a script that downloads attack tools or, more commonly, a cryptominer into the instance. We’ll get into image security soon, but this saves an attacker from running a non-standard image which might trigger alarm bells and lead back to them.

Key Lesson Points

The user-data field enables us to pass data or scripts into an instance when we launch it.
Attackers abuse it in two ways: stealing sensitive data (usually credentials) stored there, and using it to run attack or cryptomining scripts on instances they launch.

The Lab

We will break up our exploration of user-data into two parts. This week we will run an instance, pass in some sensitive data, and see where it’s potentially exposed. Next week we’ll learn how to run a script on launch, and we’ll… have a little fun with it.

For this lab we’ll use CloudFormation to create a VPC. Then we’ll launch an instance and embed some ‘secret’ user data for us to ‘steal’. Heck, we’ll even review the CloudTrail logs to see what it all looks like!

Step-by-Step

First we’ll set up our environment with our little friend, CloudFormation. Log into your Sign-in portal > TestAccount1 > AdministratorAccess > CloudFormation > Create stack, and use these details:

S3 URL: https://cloudslaw.s3.us-west-2.amazonaws.com/publicvpc.template
Name: SLAW

Once it launches go to EC2 > Instances > Launch instances, and follow the screenshots.

Name it Secret and choose Amazon Linux:

Scroll down and Proceed without a key pair, then Select an existing security group > default:

Scroll WAAAAYYY down into Advanced details until you see User data at the bottom. Type in whatever you want, but I like:

Username: gullible
Password: wordpass

Yes, that was a close friend’s password for nearly everything until about 5 years ago. But he doesn’t read these posts so he’ll never know I outed him. Then Launch instance.

Take a break for 3-5 minutes for the instance to launch and stabilize. It’s your time, do what you want — I don’t judge. Once things are running we will explore different ways to retrieve (or not) user-data.

After the wait, go back to Instances, then select your instance and go to Actions > Instance settings > Edit user data:

This is one of the few ways to see the data without logging into the instance, and everyone with read permissions for EC2 has access.
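The console screen is backed by an API call, so the same data can be read from the command line. This is a minimal sketch, assuming the AWS CLI is configured with credentials that allow `ec2:DescribeInstanceAttribute`; the function names are mine and the instance ID is a placeholder. The API returns user-data base64-encoded, so the sketch decodes it.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical helpers, not AWS tooling.

# The API hands back user-data base64-encoded; turn it into plain text.
decode_user_data() {
  base64 --decode
}

fetch_user_data() {
  local instance_id="$1"   # placeholder, e.g. i-0123456789abcdef0
  aws ec2 describe-instance-attribute \
    --instance-id "$instance_id" \
    --attribute userData \
    --query 'UserData.Value' \
    --output text | decode_user_data
}

# Usage (needs credentials):
#   fetch_user_data i-0123456789abcdef0
```

Note that nothing here touches the instance itself, which is why IAM read access alone is enough to expose whatever was stored.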

Okay, now let’s connect to the instance and run a couple command lines to see how the data is stored in the metadata service. Cancel the Edit user data screen, double-check that you have your instance selected, then Connect > Session Manager > Connect. Once you are in I want you to paste two command lines. The first gets your token for version 2 of the metadata service, because you are all good citizens and know you should never use IMDSv1. The second command line retrieves user-data:

TOKEN=`curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"`

curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/user-data

This shows how the data resides in the metadata service. It’s retrievable by anyone who can log into (or break into) the instance, and by anyone holding IAM credentials with sufficient access, no instance compromise required. This is a really terrible place to keep passwords, so promise you’ll never do this, and that you will kindly work with your developers so they never do it either.

What’s the alternative? That’s a big question, and the answer uses techniques and technologies we call secrets management. We will absolutely cover that in future labs.

So how can we detect sensitive data exposed in user-data? Well, let’s see what we can see in the logs. Go to CloudTrail > Event history. Then scroll down until you see the RunInstances action and click it:

Go to the JSON block and look in Request parameters, which is the part of the API call where you tell AWS what you want the action to do. Notice anything?

This is how you can tell AWS takes security seriously. They scrub this data from the logs because they know it is often used for sensitive data, and exposing sensitive data in logs is all too common. Think about it: these are operational logs, not just something we use for security, and if user-data was in there anyone with access to the logs could potentially see something sensitive.
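If you would rather pull these events from the command line than click through Event history, something like the following should work. It is a sketch that assumes the AWS CLI is configured with permission to call CloudTrail `LookupEvents`; the function name and the default of five events are my own choices.

```shell
#!/usr/bin/env bash
# Sketch only: lookup_run_instances is a hypothetical helper name.

lookup_run_instances() {
  local max_items="${1:-5}"   # how many recent events to return
  aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=RunInstances \
    --max-items "$max_items" \
    --query 'Events[].CloudTrailEvent' \
    --output text
}

# Usage (needs credentials):
#   lookup_run_instances 3
```

Each result is the raw event JSON, so you can check the request parameters yourself and compare them with what you typed into the User data box.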

Sweet! Except… uh… what if it’s your job to detect when someone stores a secret in user-data? Or there’s an incident and you need to see what might have been exposed?

Well, you will kinda need to also have permission to read the user-data directly, and hope the instance wasn’t terminated — because that’s when you lose the ability to see it forever. In incident response training, this is one of my favorite ways to mess with students.
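Since the logs won’t help, detection means reading the user-data itself and checking it for things that look like secrets. Here is a rough sketch of that check. The two patterns (an `AKIA` access key ID and a `password` label followed by `:` or `=`) are illustrative assumptions, not a complete ruleset, and the function name is made up.

```shell
#!/usr/bin/env bash
# Sketch only: a deliberately small, illustrative pattern set.

# Reads user-data text from stdin (or from files given as arguments),
# prints suspicious lines with line numbers, and returns 0 on a match.
scan_for_secrets() {
  grep -E -n \
    -e 'AKIA[0-9A-Z]{16}' \
    -e '[Pp]assword[[:space:]]*[:=]' \
    "$@"
}

# Usage:
#   scan_for_secrets saved-user-data.txt && echo "possible secret found"
```

In practice you would feed this the decoded user-data for each running instance, and you would have to do it before the instance is terminated.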

Congratulations, you now know how people mess up and store credentials in user-data, how to retrieve it, and what it looks like in the logs. Go ahead and Terminate Your Instance! Then go back into CloudFormation > SLAW > Delete stack, and I’ll see you next week!

Lab Key Points

We can specify user data when we launch an instance from the console, or via an API call or command line.
The user-data is readable in the console if you go to the UI to edit it, and it’s always available if you can log into the instance and connect to the metadata service.
The user-data is not readable in CloudTrail; AWS filters it out for security.
Don’t store passwords/credentials in user-data. Attackers always look there first.

-Rich
