Cloud Security Lab a Week (S.L.A.W)

Stage Check: Network and Workload Fundamentals

We've covered a ridiculous amount in this block, so let's pull it together and see why it all matters.

Rich Mogull
October 31, 2024 • Estimated Reading Time: 9 minutes

Whoa!

Yeah, we’ve covered a lot since the last Stage Check: over three months of content on critically important topics. I don’t like to go this long between Stage Checks, but all those labs kept rolling into the next one, and there wasn’t a good stopping point until now.

We aren’t done talking about network and workload security, but we need to change over to cover data security fundamentals before we finish out our first year of labs.

My objective today isn’t to review all the ins and outs of everything we covered, but to tie it all together and hammer home why I picked this content at this point in our journey. Heck, the reason is pretty easy:

These are the most common mistakes that lead to actual breaches.

(Quick note: there won’t be any video with this stage check because I had some skin cancer removed from my face 2 days ago and… well, it wasn’t fun for you to look at in the first place).

Know Your Threat Model

One of the most valuable skill pairs in information security is the ability to build threat models and apply them to designing security defenses. Threat modeling is the formal process of identifying threats and aligning countermeasures. While that sounds ridiculously obvious, in day-to-day security operations it can be all too easy to get wrapped up in technologies, tools, processes, compliance, politics, and all sorts of other minutiae that distract from the big picture. A shockingly large percentage of the organizations I’ve talked with over the years spend insane amounts of money on security that isn’t aligned with a realistic threat model.

There are multiple formal processes and models out there, but early in your journey I recommend focusing on the basics. For our purposes I like the Universal Cloud Threat Model. Why? Because Chris Farris and I wrote it and we are awesome. Also it’s highly focused on the major sources of current cloud security breaches.

Over the course of 40+ sessions I’ve tried to hammer the top two sources of cloud breaches into you:

Exposed/stolen static credentials. For AWS this is mostly access keys. According to material in AWS presentations, this is the root cause of 66% of AWS customer breaches.
Publicly exposed resources. Typically vulnerable instances exposed to the Internet. As you might imagine, this often leads to lost/stolen credentials, when an instance has IAM keys or roles associated with it.

Attackers use their initial access for a lot of things, but the top two right now are cryptomining and ransomware. During this block we covered a lot of cryptomining, and as we move into data security we’ll get into preventing ransomware.

For nearly a year my primary focus has been helping you learn how to build a secure, enterprise-scale cloud infrastructure that eliminates static credentials, reduces the risk of stolen session credentials, and minimizes public resources.

Aligning Defenses

Since static credentials and publicly exposed resources are the root cause of most cloud breaches, defenses should focus on blocking those two attack vectors first.

And that’s exactly what we’ve been building!

The bulk of our Organizations setup, using IAM Identity Center and Delegated Administration, dramatically reduced our need for access keys, and the risk of exposing them.
Our network and workload block integrated a range of network tools (e.g., VPC Endpoints) and workload technologies (e.g., Session Manager) to reduce exposure of instances to the Internet.

If any of you are working in active AWS environments you likely know how valuable these approaches are. To drive the point home a little deeper, we:

Learned how to build a Minimum Viable Network which only includes the network components we need, to reduce Blast Radius. Unlike a physical network, a cloud network only needs the subnets and routes that let resources reach the things they actually need to talk to (each other or the Internet).
We learned how clouds use Software Defined Networks, and how they differ from traditional networks. A VPC behaves more like a circuit switched network than a packet switched network, which means communications go from source to destination almost as if they were on a private connection, and Security Groups and Route Tables combine to enforce this.
Many instances are publicly exposed so someone can log in and manage them. So we spent time on avoiding SSH authentication and using Session Manager instead, since it eliminates the need for both Internet access and SSH keys.
We learned some of the most common ways instances are compromised and credentials are stolen. These include SSRF and the Instance Metadata Service, opening SSH to the Internet, and abusing User Data fields.
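To make the "SSH open to the Internet" mistake concrete, here is a minimal sketch of a check for it. It assumes the input is a list of dictionaries shaped like the `SecurityGroups` list returned by boto3's `ec2.describe_security_groups()`; the function names and the sample data are hypothetical, and nothing here calls AWS.

```python
"""Flag security groups whose ingress rules expose SSH (TCP 22) to the world."""

# CIDR blocks that mean "anyone on the Internet" for IPv4 and IPv6.
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def rule_covers_ssh(permission):
    """Return True if an ingress rule allows traffic to TCP port 22."""
    protocol = permission.get("IpProtocol")
    if protocol == "-1":  # "-1" means all protocols on all ports
        return True
    if protocol != "tcp":
        return False
    from_port = permission.get("FromPort")
    to_port = permission.get("ToPort")
    if from_port is None or to_port is None:
        return False
    return from_port <= 22 <= to_port


def rule_is_world_open(permission):
    """Return True if the rule's source includes every IPv4 or IPv6 address."""
    cidrs = [r.get("CidrIp") for r in permission.get("IpRanges", [])]
    cidrs += [r.get("CidrIpv6") for r in permission.get("Ipv6Ranges", [])]
    return any(cidr in OPEN_CIDRS for cidr in cidrs)


def find_open_ssh(security_groups):
    """Return the GroupId of each group with SSH reachable from anywhere."""
    findings = []
    for group in security_groups:
        for permission in group.get("IpPermissions", []):
            if rule_covers_ssh(permission) and rule_is_world_open(permission):
                findings.append(group["GroupId"])
                break  # one finding per group is enough
    return findings


# Hypothetical sample data in the describe_security_groups response shape.
SAMPLE_GROUPS = [
    {"GroupId": "sg-open", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-office", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]}]},
]

print(find_open_ssh(SAMPLE_GROUPS))  # prints ['sg-open']
```

This only looks at CIDR sources, so it is a starting point rather than a full exposure audit. A group can still be effectively public through other paths that this sketch does not model.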

And yep, it took 16 labs to get through all that material, but we’ve just built a hell of a strong foundation.

Putting It All Together

We started with an overview of VPCs and how they differ from physical networks. Then we learned how to build a basic VPC from scratch, which illustrated the core components (VPC, subnets, route tables, Internet Gateways, and CIDR ranges). We moved on to learn about private subnets and how to give them Internet access with NAT Gateways. After a quick review of Infrastructure as Code we learned why Security Groups are awesome.

Those labs provided a solid foundation in AWS network fundamentals. Then we moved on to talk about workload security, and the interaction of workloads and networks.

First we launched an instance while learning the basics of virtual machines, including images. In that lab we connected using Session Manager, an AWS service which enables us to connect to instances in private subnets (so long as there’s a NAT Gateway). Because sometimes we want to access an AWS service without any Internet access, we then learned how to use VPC Endpoints instead. Do not underestimate the power of connecting securely to instances in private networks without sitting in the same datacenter! We rounded out our discussion of Session Manager by learning how to replace the need for SSH (so we can still use the command line and not the console) and how to log all Session Manager activity.

With the fundamentals covered we started seeing security in action. This is my favorite string of labs so far, since you got to experience how real breaches happen.

First we exposed an instance with port 22 open to the Internet and (hopefully) triggered GuardDuty alerts for the inevitable brute force attacks. That is why we keep our stuff private! But what if someone gets into an instance? You got to hack (sorta) my account with an access key I embedded in an image, showing the risks of static credentials. But “aha!” you say, “I use IAM roles!” Okie dokie, let’s see how those credentials are exposed if you are using the Instance Metadata Service V1 (IMDSv1). Combined with Server-Side Request Forgery (SSRF) vulnerabilities, this … has made headlines.
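For the IMDSv1 piece, here is a rough sketch of how you might spot instances that still accept IMDSv1 requests. It assumes the input is shaped like the `Reservations` list from boto3's `ec2.describe_instances()`, where each instance carries `MetadataOptions` with `HttpTokens` and `HttpEndpoint`. The function name is hypothetical, and treating missing metadata options as risky is my own conservative assumption.

```python
"""List instances that still accept IMDSv1 (token-less) metadata requests."""


def instances_allowing_imdsv1(reservations):
    """Return InstanceIds where the metadata endpoint is on and tokens are optional."""
    risky = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            # Assumption: if the options are missing, treat the instance as
            # exposed rather than silently passing it.
            endpoint_enabled = options.get("HttpEndpoint", "enabled") == "enabled"
            tokens_required = options.get("HttpTokens", "optional") == "required"
            if endpoint_enabled and not tokens_required:
                risky.append(instance["InstanceId"])
    return risky
```

Requiring tokens (IMDSv2) means a caller has to fetch a session token with a PUT request before reading credentials, which is much harder to pull off through a typical SSRF bug that can only issue GET requests.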

Where else do we screw up storing credentials? How about the user-data field? Oh, did I forget to mention that attackers can also use that to run arbitrary code, including cryptomining, in your account (if they have credentials to run an instance)?
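As a small illustration of the user-data problem, here is a sketch that scans an instance's user data for strings that look like AWS access key IDs. It assumes you already have the base64-encoded value (the form the EC2 API returns for the `userData` attribute). The function name and the regex are my own heuristic, based on the commonly seen 20-character IDs that start with `AKIA` (long-term keys) or `ASIA` (temporary keys), so expect both misses and false positives.

```python
"""Scan base64-encoded EC2 user data for strings shaped like access key IDs."""

import base64
import re

# Heuristic: "AKIA" or "ASIA" followed by 16 uppercase letters or digits.
ACCESS_KEY_ID_PATTERN = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_access_key_ids(user_data_b64):
    """Decode the user data and return any access-key-like strings found."""
    decoded = base64.b64decode(user_data_b64).decode("utf-8", errors="replace")
    return ACCESS_KEY_ID_PATTERN.findall(decoded)


# Usage with a made-up boot script and AWS's documented example key ID.
script = "#!/bin/bash\nexport AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\n"
encoded = base64.b64encode(script.encode("utf-8")).decode("ascii")
print(find_access_key_ids(encoded))  # prints ['AKIAIOSFODNN7EXAMPLE']
```

A hit only tells you where to look. The real fix is the one this whole block has been pushing: use roles instead of putting static keys in user data at all.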

As fun as simulating breaches is, I personally like to play detective, so we spent two labs learning about EBS Volumes and Snapshots and how to use them for forensics.

Like I said at the start, whoa.

I hope you enjoyed this learning block. These lessons help teach the concepts, the implementation, and why it all matters. We have a lot more on workloads and endpoints to cover, but as I mentioned earlier we need to knock out some data security basics before spreading our wings into more services and deeper topics.

Not bad for a free email list, eh?

-Rich
