Cloud Security Lab a Week (S.L.A.W)

Getting Started with CloudTrail Security Queries

Unlock the mysteries of CloudTrail logs with a few simple starter queries. We'll use these later as the basis of threat detectors and for incident analysis.

Rich Mogull
June 19, 2025 • Estimated Reading Time: 12 minutes

Prerequisites

Have Athena configured for querying CloudTrail, like we showed you in this previous lab.

The Lesson AND the Lab!

That’s right, this week we are skipping to the fun part — for the first time the Lesson and Lab are combined! Why? Because usually I use the Lesson section to focus on core principles, and then the Lab to put them into action. But for today’s topic it makes more sense to walk you through some basic queries as we explore the structure of CloudTrail, instead of just having you read about it and then… stare at it.

Today we will dig into CloudTrail and focus on searching on and extracting key data, which we tend to use in threat detectors and incident analysis. CloudTrail logs and events have a consistent structure, but since every AWS service is run by a different team, and the team generates its own CloudTrail events, there’s enough variation to… be kind of annoying at times. It isn’t all that bad once you learn some basics, so that’s our focus today.

Remember that a CloudTrail log entry is a record of an API call. Every API call includes an identity, an action, the parameters of that action (e.g., “delete THIS S3 object”), maybe a response from AWS (not all API calls provide a verbose response), and metadata like the time, region, source IP address, etc.

Bah, enough talk — let’s do!

Video Walkthrough

First, let’s make a couple of API calls, which will generate log entries we can dig into.

Log into your Sign in Portal > TestAccount1 > AdministratorAccess > EC2:

We’ll keep it simple and run a basic instance, let it finish booting up, and then terminate it. Follow these steps…

Launch Instance > name it ‘delete’ > Launch instance:

Click through to proceed without key pair > Launch instance:

Now wait 3-5 minutes. Then Instances > click your instance > Instance state > Terminate (delete) instance:

Great, we generated what we need, and it should show up in our central CloudTrail within 5 minutes. So, go take 5 minutes and do what you want. NO, NOT THAT!!! Oh, never mind, just… see you in a few minutes.

Now close the TestAccount1 tab > Sign In Portal > SecurityAudit > SecurityFullAdmin > Athena:

My Athena queries from the last lab still showed up, so I just closed them out and deleted whatever was in the Query 1 tab.

Let’s set up a basic scenario. Imagine you get a call from a scared intern who noticed that a critical EC2 instance is… not there anymore, and the entire Internet is down. I mean, this is the sort of thing interns tend to do, but in this case the intern swears they didn’t touch anything. In fact, they were at the beach when it went down (please don’t tell their manager). Also, they have no idea who even owns the instance; they just know the Internet is down and their little brother can’t watch Bluey and this is bad.

Okay, where to start?

Well, if we think about it, we want to know:

Who terminated the instance and when.
Who originally launched the instance, which could indicate who owns it.
Whether this was a mistake, or maybe a hack? I mean we don’t even have an instance ID to work with at this point.

Look, this is highly contrived and barely makes sense, but I’ve had multiple 5 and 6 am work calls this week, so it’s the best you’re going to get.

Let’s start by running a query to find recently terminated instances and order them with the most recent ones at the top. That should help narrow it down. And notice in the query that I’m searching based only on the Action (TerminateInstances).

Querying based on the Action is a common starting point, especially when building a threat detector.

Assuming you have the same table name as me from the prior lab, this query will work without any modification and you can just Copy > Paste > Run. If you used different names you will need to change it for your <databasename>.<tablename>:

SELECT 
    eventtime,
    eventname,
    sourceipaddress,
    useridentity.type as user_type,
    useridentity.principalid,
    useridentity.arn as user_arn,
    useridentity.userName,
    awsregion,
    requestparameters,
    responseelements,
    errorcode,
    errormessage,
    recipientaccountid
FROM cloudtrail_logs.organization_trail
WHERE eventname = 'TerminateInstances'
ORDER BY eventtime DESC;

Most of the fields are self-explanatory. Notice that userIdentity contains a lot of valuable data, and we pulled out a few specific fields with the “.” notation. Not every possible field is always present, so be careful how you write queries when searching on identity information.
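
Here is a rough sketch of one way to cope with those missing identity fields. For assumed-role sessions (like the ones we get through IAM Identity Center) userName is usually empty, and the role name tends to live under sessionContext.sessionIssuer instead. This assumes your table was created with the standard CloudTrail DDL, where useridentity.sessioncontext.sessionissuer.username exists as a nested field; the "who" alias is just a name I made up for the example:

```sql
-- Sketch: show the first identity field that is actually populated.
-- Assumes the standard CloudTrail table definition, where useridentity
-- is a struct containing a nested sessioncontext.sessionissuer struct.
SELECT
    eventtime,
    eventname,
    useridentity.type AS user_type,
    COALESCE(
        useridentity.username,                               -- IAM users
        useridentity.sessioncontext.sessionissuer.username,  -- assumed roles
        useridentity.principalid                             -- last resort
    ) AS who,
    useridentity.arn AS user_arn
FROM cloudtrail_logs.organization_trail
WHERE eventname = 'TerminateInstances'
ORDER BY eventtime DESC;
```

COALESCE returns the first non-null value, so the "who" column should stay populated even when userName is blank.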

Scroll around to the right and take a look at the data. Most of the rest, like the eventSource (the service) and eventName (the action) are pretty obvious, but there are two very important fields which can be confusing. requestParameters includes all the parameters in the request (in this case, the instance we are terminating) and the responseElements is everything AWS sent back in response to the API call.

Now here’s a bummer. Although there is sometimes a field for the resource, it’s … almost always missing or empty. You’d think every API call would include its resource (the full ARN, or at least the ID) but it doesn’t work that way. Why? Because sometimes you are making a request with a specific resource ID (e.g., TerminateInstances to terminate this instance) but in other cases you don’t even have an ID yet. For example, RunInstances — it isn’t like we get to pick our instance ID. AWS assigns that and returns it in the response.

Look in the requestParameters column, find the instance ID, and copy that into a text file.
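
If you would rather not squint at the raw JSON, here is a sketch that pulls the ID into its own column. It assumes requestparameters is stored as a JSON string (as in the standard CloudTrail table definition) and that TerminateInstances requests follow the instancesSet.items[] shape you see in your results; adjust the path if yours looks different:

```sql
-- Sketch: extract the first instance ID from a TerminateInstances request.
-- The JSON path is an assumption based on the typical EC2 event shape.
SELECT
    eventtime,
    useridentity.arn AS user_arn,
    json_extract_scalar(
        requestparameters,
        '$.instancesSet.items[0].instanceId'
    ) AS terminated_instance_id
FROM cloudtrail_logs.organization_trail
WHERE eventname = 'TerminateInstances'
ORDER BY eventtime DESC;
```

This only grabs the first item. A single TerminateInstances call can list several instances, so still check the raw requestParameters when it matters.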

This leads to another common starting point: a specific resource, when you have the ID. When searching all API calls for a resource, if you want all actions on that resource you need to search in both the requestParameters and responseElements!

And both of those fields embed other fields, so here’s how I handle it: the ever powerful “LIKE” search with wildcards! In SQL the ‘%’ sign is the wildcard, not ‘*’!!!

This query below pulls every single action involving that instance, since it’s looking in both requests and responses.

Hit “+” and paste this into Query 2, replacing YOUR INSTANCE ID with your instance ID in both places:

SELECT 
    eventtime,
    eventname,
    eventsource,
    sourceipaddress,
    useridentity.type as user_type,
    useridentity.arn as user_arn,
    useridentity.userName,
    awsregion,
    requestparameters,
    responseelements,
    errorcode,
    errormessage,
    recipientaccountid
FROM cloudtrail_logs.organization_trail
WHERE requestparameters LIKE '%YOUR INSTANCE ID%' 
   OR responseelements LIKE '%YOUR INSTANCE ID%'
ORDER BY eventtime DESC;

This is pretty great. We now see every action on the instance in reverse chronological order. Look at the responseElements field of the RunInstances API call — that’s where AWS first assigned the instance ID. This is why we need to look in both fields. Another example: if I ran a generic DescribeInstances to find one to attack, that would return every instance, even though I never specified an instance ID.
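
To answer the "who originally launched it" question from our scenario directly, a sketch like this narrows the search to the RunInstances call whose response contains the instance ID. The JSON path assumes the usual instancesSet.items[] shape of the RunInstances response, so verify it against your own responseElements:

```sql
-- Sketch: find the RunInstances call that created a given instance.
-- The instance ID only appears in the response, because AWS assigns it.
SELECT
    eventtime,
    useridentity.arn AS user_arn,
    sourceipaddress,
    json_extract_scalar(
        responseelements,
        '$.instancesSet.items[0].instanceId'
    ) AS launched_instance_id
FROM cloudtrail_logs.organization_trail
WHERE eventname = 'RunInstances'
  AND responseelements LIKE '%YOUR INSTANCE ID%'
ORDER BY eventtime DESC;
```

The LIKE filter does the real matching here; the extracted column is just a convenience for reading the results.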

It’s call and response! “Give me an instance!” “Okay, here is the ID for your instance, and everything else I think is important.”

These don’t show very well in screenshots, so explore your results and pay particular attention to the request and the response for both the Run and Terminate actions. Or you can check out the video and get dizzy watching me scroll around.

Now what about all those Describe actions? That’s because we are working in the console, and the console wants to show us stuff, so every time we click into a new screen or view it makes a Describe call to … show us stuff.
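
If the console chatter gets in the way, one option is to filter it out by event name. This is only a sketch, and it assumes the noise you want to drop all starts with "Describe":

```sql
-- Sketch: same resource search as before, minus the Describe* calls.
SELECT
    eventtime,
    eventname,
    eventsource,
    useridentity.arn AS user_arn,
    sourceipaddress
FROM cloudtrail_logs.organization_trail
WHERE (requestparameters LIKE '%YOUR INSTANCE ID%'
    OR responseelements LIKE '%YOUR INSTANCE ID%')
  AND eventname NOT LIKE 'Describe%'   -- drop DescribeInstances and friends
ORDER BY eventtime DESC;
```

Be careful using this during a real investigation: attackers run Describe calls too when they are looking around, so sometimes the chatter is the clue.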

To level set, we now have starting points based on an action or based on a resource, but what about the identity?

Looking at our two actions (Run and Terminate) they both use the same identity. We can just query directly on that identity and get a sense of everything that user/role did:

Copy useridentity.arn > Click “+” for Query 3 > Copy and Paste this query > replace with your useridentity.arn:

SELECT 
    eventtime,
    eventname,
    eventsource,
    sourceipaddress,
    awsregion,
    requestparameters,
    responseelements,
    errorcode,
    errormessage,
    recipientaccountid,
    useragent
FROM cloudtrail_logs.organization_trail
WHERE useridentity.arn = 'YOUR ARN'
ORDER BY eventtime DESC;

Very nice. Well, all that looks like normal stuff. Heck, it looks like I’m the one who terminated my very own instance! How about some clues if I think this is nefarious activity?

Well, we can look at all the IP addresses that originated the API calls. We can also look at the useragent and see if it suddenly changes to something different. Changing IP addresses and user agents often indicate use of stolen credentials or sessions.
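
Here is a sketch of that idea: group one identity's calls by source IP and user agent, so a sudden new combination stands out. It relies on eventtime being an ISO 8601 string (as in the standard table definition), so MIN and MAX sort correctly as text:

```sql
-- Sketch: summarize where an identity's API calls came from.
SELECT
    sourceipaddress,
    useragent,
    COUNT(*)       AS calls,
    MIN(eventtime) AS first_seen,
    MAX(eventtime) AS last_seen
FROM cloudtrail_logs.organization_trail
WHERE useridentity.arn = 'YOUR ARN'
GROUP BY sourceipaddress, useragent
ORDER BY first_seen;
```

A single row, or a small handful that match your own browser and network, is what normal tends to look like. A new IP paired with a new user agent partway through the timeline deserves a closer look.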

Keep in mind that although we are querying our Organizations trail, which has all the logs from every account (which is why our queries are slow, so eventually we will need to partition)… this particular user identity query only finds things in one account. Why? Well, we are looking at the activity after assuming a role, and if you review that ARN it has the account ID and a unique role name.

What if we want to look in every account? That’s important when we suspect an attacker is bouncing around accounts with that one base IAM Identity Center user. This makes it tougher to be 100% accurate, but we don’t need 100%. This query goes back to using LIKE and our ‘%’ wildcard, and matches only on the username at the very end of the ARN.

Copy and Paste into a new query window, and replace “rmogull” with whatever is after the last / in your ARN:

SELECT 
    eventtime,
    eventname,
    eventsource,
    sourceipaddress,
    useridentity.arn as user_arn,
    useridentity.userName,
    awsregion,
    recipientaccountid
FROM cloudtrail_logs.organization_trail
WHERE useridentity.arn LIKE '%/rmogull'
ORDER BY eventtime DESC;

Now scroll to the right and look at the recipientaccountid column. You now see the API calls across every account in your org.
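
To get a quick per-account summary instead of scrolling, a sketch like this counts that user's calls in each account. As before, swap “rmogull” for whatever is after the last / in your ARN:

```sql
-- Sketch: how active was this user in each account of the org?
SELECT
    recipientaccountid,
    COUNT(*)       AS calls,
    MIN(eventtime) AS first_seen,
    MAX(eventtime) AS last_seen
FROM cloudtrail_logs.organization_trail
WHERE useridentity.arn LIKE '%/rmogull'
GROUP BY recipientaccountid
ORDER BY calls DESC;
```

An account you never expected this user to touch showing up in the list is a good place to start digging.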

And thus we cover our three main security entry points for queries:

The action
The resource
The identity

In future labs we will simulate and trace a full attack, but first we need a better understanding of how to write Athena queries with a focus on the data in CloudTrail logs. Feel free to peek ahead and see some more complex queries in this Securosis blog post.

Key Lesson Points

Actions, resources, and identities tend to be starting points for security related queries.
We can use these queries both as threat detectors and for incident analysis (we will cover both in future labs).
Using the LIKE keyword and a “%resource or other id%” wildcard helps query, even when we don’t know all the exact parameters or response elements.

-Rich
