Distributing and Testing Security Autoremediation Events
Learn how to efficiently distribute and test security autoremediation events in Part 4 of our Epic Automation series.
The Lesson
We’re in the thick of it now!
Over the course of the three previous labs we learned some new tech (StackSets) and expanded our use of EventBridge with multiple event buses. Plus we managed cross-account centralization and permissions.
This week I want to talk a bit about architecture and efficiency. The pattern we are building out is used for much more than security; it’s one of many common event-driven architectures. A few criteria led us down this particular path:
We are automating only within AWS using AWS-native events. This is a big driver toward using EventBridge to centralize the events we want, with EventBridge Rules as our trigger and filtering mechanism.
Only forwarding the events we want keeps costs and workload down. That’s why we selected the specific S3 events we wanted.
We want to run our security tooling centrally, but manage distributed accounts. This pushed us into centralizing our event buses, and building our actual application in a central account instead of distributing it.
For example, we could have pushed serverless functions into every account. But in our case it makes more sense to run them in our Security Operations account and have them make changes in the target accounts.
My experience and comfort level with this architecture. I released my first autoremediation the weekend after EventBridge Rules (then called CloudWatch Rules) were released. An architecture similar to what we are building is the core of FireMon Cloud Defense, a battle-tested commercial platform I developed before FireMon acquired it. I know this works, and I know it scales to thousands of AWS accounts.
Unlike in our architecture, Cloud Defense customers send us nearly every CloudTrail and Security Hub event, and FireMon filters and processes them. Cloud Defense looks for a broader range of things, and its cost model is different from ours here.
Okay, that’s why we used EventBridge, centralized our event buses, and filtered for the events we want. Our next step is to process those events with serverless functions to enforce autoremediation guardrails.
I already spoiled the fact that we will use AWS Lambda, which is what we call serverless. I’ll talk a lot more about Lambda in our next lab, but for now just know that, at its core, a Lambda function is code we write that runs within AWS only when it’s triggered. We don’t need to maintain a container or server — we just load up our code and run it. We can trigger a Lambda function from an EventBridge rule, at a scheduled time (like a cron job), from another Lambda function, or via… a lot of other ways.
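To make that concrete, here’s a minimal sketch of what a Lambda handler looks like in Python. Everything in it is illustrative rather than from this lab; EventBridge invokes the function and passes the matched event as the first argument:

# Minimal Lambda handler sketch (Python). EventBridge passes the
# matched event in as `event`; all names here are illustrative.
def lambda_handler(event, context):
    # CloudTrail-sourced events carry the API call details under "detail".
    detail = event.get("detail", {})
    print(f"Received {detail.get('eventName')} from {detail.get('eventSource')}")
    return {"status": "ok"}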
Our next decision: how do we want to structure our Lambda function(s)? That affects how we build the EventBridge rules that trigger them.
1. One big function for all autoremediation, no matter the service. Sure, we only have S3 now, but someday we’ll probably add automation for more services.
2. Smaller functions, one for each automation.
3. A function per service. For example, one function for all S3 automations and another for all EC2 automations.
All of these are viable options. I’ve learned to generally avoid lots of smaller functions because they become difficult to manage. And there are nuances to Lambda functions, especially around how long they take to start up (cold vs. warm starts).
If option #2 is out, that leaves #1 or #3. For these labs I decided to go with #3 for a couple reasons:
Each function needs fewer IAM privileges, which is… nice.
Since these are learning labs, it allows us to compartmentalize each service. All our S3 code will be in one place, not commingled with other services, making it easier to read and discuss.
Once functions hit a certain size you need to upload them as .zip files instead of pasting into the console. For our labs I want you to see everything in the console.
One last point: we also need to decide whether we want our assessment and remediation code in the same function. Again, for simplicity and clarity, we will keep both consolidated.
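To show what that per-service choice looks like in practice, here’s a rough, hypothetical sketch of an S3 function that dispatches on the CloudTrail eventName and keeps assessment and remediation together (the handler names are made up for illustration):

# Hypothetical sketch of the per-service pattern: one function owns all
# S3 automations and dispatches on the CloudTrail eventName.
def assess_and_remediate_tagging(detail):
    # Assess the bucket's tags here, then remediate if needed.
    print(f"Assessing tagging change: {detail.get('requestParameters')}")

HANDLERS = {
    "PutBucketTagging": assess_and_remediate_tagging,
    # Future S3 automations register their eventName here.
}

def lambda_handler(event, context):
    detail = event.get("detail", {})
    handler = HANDLERS.get(detail.get("eventName"))
    if handler:
        handler(detail)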
Hopefully this gives you an idea of why I built these labs this way. Even experienced pros don’t always get things correct on the first shot. We have completely rebuilt the architecture of Cloud Defense multiple times as our needs and AWS features have changed. What’s great about cloud is that we can run multiple stacks simultaneously, which enables us to make these updates, usually without downtime or impact visible to users.
Key Lesson Points
We automate within AWS using AWS-native events, which is why we centralize with EventBridge and use EventBridge Rules as our trigger and filtering mechanism. Forwarding only the events we want keeps costs and workload down.
We run our security tooling centrally in SecurityOperations and manage distributed accounts, rather than pushing serverless functions into every account.
We will use one Lambda function per service (starting with S3), keeping assessment and remediation code together. This reduces each function’s IAM privileges and keeps the labs easy to read.
The Lab
In today’s lab we will build the EventBridge Rule in our central event hub (in SecurityOperations), which we will later use to trigger Lambda functions. Then we’ll test it by making changes in S3. But instead of triggering the Lambda we will just send ourselves an email via an SNS topic. Why? Because this is an easy way to test our current implementation and make sure everything is wired up correctly before we start kicking off code.

All the building will be in SecurityOperations and all the S3 changes will be in our new Production1 account. Since we didn’t add that account to IAM Identity Center, we’ll access it by popping into our CloudSLAW management account and using the old-fashioned switch role with the OrganizationAccountAccessRole. Don’t worry, it’s been a while but we covered it over a year ago and it’s easy.
Video Walkthrough
Step-by-Step
We’ll start with Sign-in console > SecurityOperations > AdministratorAccess > Oregon region > Simple Notification Service (SNS) > Create topic > and name it SecurityTesting. We’ll create a topic for our testing and subscribe to it via email. I’m going this route because it’s easiest and fastest for training purposes, but when building a real application I would be more likely to send test results to CloudWatch Logs. But honestly, what we’re doing today works just fine.
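If you’d rather script it, the same topic and subscription can be created with boto3. This is just a sketch of the equivalent calls; substitute your own email address:

# Equivalent boto3 calls for the console steps above (a sketch).
import boto3

sns = boto3.client("sns", region_name="us-west-2")  # Oregon
topic = sns.create_topic(Name="SecurityTesting")
sns.subscribe(
    TopicArn=topic["TopicArn"],
    Protocol="email",
    Endpoint="you@example.com",  # SNS emails this address a confirmation link
)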


Leave all the defaults and Create topic:

Then Create subscription > Email (for the protocol) > enter your email > Create subscription:


Then check your email and click the link:

Now we have testing alerts set up. Anything sent there will show up in email.
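If you want to double-check the plumbing before moving on, you can also publish a test message yourself. A sketch, with a placeholder account ID:

# Optional sanity check: publish a test message to the topic.
import boto3

sns = boto3.client("sns", region_name="us-west-2")
sns.publish(
    TopicArn="arn:aws:sns:us-west-2:123456789012:SecurityTesting",  # your account ID here
    Subject="SLAW test",
    Message="If you can read this, the topic and subscription work.",
)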
Next we need to create the EventBridge rule that will eventually send events to our Lambda functions. As a reminder, we are using the per-service pattern, so this rule’s job is to grab all S3 events. Since we don’t have those Lambda functions yet, we’ll test all our cross-account wiring by just sending ourselves email through that SNS topic.
Time to switch over to EventBridge > Rules > select the SecurityAutomation event bus > Create rule:

Name it S3-security-remediations and double check it’s set for Rule with an event pattern, then click Next:

Select Custom pattern, then paste in the pattern below (above the screenshot). Then click Next.
This pattern matches any S3 management-event API call recorded by CloudTrail. There are also things called data events, which we’ll cover down the road (it’s… a very long road). I actually messed this up pretty badly when I was putting this lab together and lost multiple evening hours, but eventually got there.
EventBridge Rule:
{
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["s3.amazonaws.com"]
  }
}

We’ll pick our SNS topic as the target. Remember this is just for testing — we’ll swap things around later. Select AWS Service > SNS Topic > Target in this account > SecurityTesting > Next:
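For reference, here’s a sketch of the same rule and target in boto3, assuming the SecurityAutomation bus and SecurityTesting topic from this lab (placeholder account ID). One caveat: the console also updates the topic’s access policy so EventBridge can publish to it, which you would need to handle yourself when scripting:

# Sketch of the same rule and target via boto3.
import json
import boto3

events = boto3.client("events", region_name="us-west-2")
pattern = {
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {"eventSource": ["s3.amazonaws.com"]},
}
events.put_rule(
    Name="S3-security-remediations",
    EventBusName="SecurityAutomation",
    EventPattern=json.dumps(pattern),
    State="ENABLED",
)
events.put_targets(
    Rule="S3-security-remediations",
    EventBusName="SecurityAutomation",
    Targets=[{
        "Id": "sns-testing",
        "Arn": "arn:aws:sns:us-west-2:123456789012:SecurityTesting",  # your account ID
    }],
)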

Review everything on the next page and then Create rule. No, you don’t get a screenshot for that button — sue me.
We will only test in one region to save time. This is also where we will do all our Lambda function coding and testing. After we get everything working in Oregon, at the very end we will replicate the final result into Virginia. I’ll talk about cross-region events then, and why that isn’t how we are implementing things here.
Now CLOSE THE SECURITYOPERATIONS BROWSER WINDOW, then go back to your Sign-in portal > CloudSLAW > AdministratorAccess. As I mentioned in the intro, we will use the switch role console feature to access our Production1 account to initiate our test. This is just a way of assuming a role from the console.
I thought about adding Production1 to IAM Identity Center so it would show in our portal, but decided not to add the overhead to this lab, since we won’t go back for a while.

Then go to Organizations > AWS accounts > Workloads > Prod > and copy the Production1 ID:

Then go to the upper-right and Switch Role:

Paste in the account ID, then enter the role name OrganizationAccountAccessRole. As a reminder, that’s the default name for the admin-level role AWS Organizations automatically creates in every account it creates. It’s super powerful, and one reason we restrict use of the management account. What we are doing is not a good security practice, but it is convenient for this lab! I try not to do that often, but it’s the best choice to keep this one from taking more than our usual 15-30 minutes.
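Under the hood, Switch Role is just an sts:AssumeRole call. Here’s a sketch of the programmatic equivalent, with a placeholder Production1 account ID:

# Console Switch Role is sts:AssumeRole under the hood (sketch).
import boto3

sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::111122223333:role/OrganizationAccountAccessRole",  # Production1 ID
    RoleSessionName="slaw-lab",
)["Credentials"]

# A session that acts inside Production1 with the role's permissions.
prod1 = boto3.Session(
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)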
You can reuse the account ID as the display name, pick an optional color, and Switch Role:

Now go to S3 and change to the Oregon region, then Create bucket:

I really couldn’t care less what you name it. Remember to keep it all lowercase and make it globally unique. Here’s mine (putting test and slaw in there might help you remember what it is 6.78 years from now):

Then Add tag… any tags… Again, here’s my example: the pinnacle of creativity. Then Create bucket:

If you recall, we don’t have a trigger on CreateBucket but we do have a trigger on creating a tag. Adding that tag is the key to making this test work. If everything is wired correctly you should get an email within 30 seconds (check your spam folder). Here’s mine:

Sweet! That means my events are flowing from Production1 to SecurityOperations, and triggering my new S3 rule.
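If you ever want to rerun this test without clicking through the console, the same bucket-and-tag sequence can be scripted from a Production1 session. A sketch; the bucket name is only an example and must be globally unique:

# Scripted version of the test. The PutBucketTagging call is what our
# forwarding rule matches, so this should also trigger the email.
import boto3

s3 = boto3.client("s3", region_name="us-west-2")  # use Production1 credentials
bucket = "test-slaw-example-12345"  # example only; must be globally unique
s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
)
s3.put_bucket_tagging(
    Bucket=bucket,
    Tagging={"TagSet": [{"Key": "project", "Value": "slaw"}]},
)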
Lab Key Points
We previously used CloudFormation StackSets to push an EventBridge rule into all accounts in our Prod OU. This includes the Production1 account.
As part of that rule, when we make an API call that creates or modifies a tag on an S3 bucket, the rule sends the event to an event bus in our SecurityOperations account. We set permissions so any account in our organization can send to that bus.
This week we added an EventBridge rule that emails us for any S3 event that arrives on our SecurityAutomation event bus.
Then we tested by creating a bucket with tags in Production1 and receiving the email.
This confirms that our cross-account event forwarding, based on specific rules, works. Now that we know this is up and running, we can start building automation.
-Rich