Journey to the Center of the VPC: Getting Started with Cloud Networks

Cloud networks aren't business as usual. Learn how they work, how this changes security, and how to shift your mindset.

Prerequisites

  • A Workload account

The Lesson

I had this great intro all planned out, and then I realized we have a ton of information to cover so we should just jump into the content. Today’s lab is a lot of explanation with a little exploration, all to set us up to really understand AWS networking.

It used to be that most security pros came from networking backgrounds, but those days are long past. As is usual for CloudSLAW, I won’t assume you know a lot about networking or security, so we need to cover some traditional basics before we get into cloud networking. You need to understand why and how cloud is so different.

How a normal packet switched network works (for the newbies)

Let’s start with an overly simple review of pre-cloud networks. We used to have multiple network standards, but these days (thank you, Internet) we pretty much stick with Ethernet. Whether at home, work, or a coffee shop… wired or wireless, the TL;DR is that your network is effectively packet switched. This means you send out little packets with headers saying what they are, where they should go, and who you are, containing the data you want to deliver on the other side. The receiving side listens for packets addressed to it and reassembles them, hopefully in the right order, to read the transmission. Since we run bunches of different applications using different protocols, part of the address says “use this port [a number] and this protocol”. So web servers know to listen on port 80 for HTTP traffic, and ssh servers listen for… ssh on port 22 (or you could assign a different port).
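
If it helps to see that in code, here is a minimal sketch of the idea. It is a toy model, not a real network stack: the `Packet` fields, the `packetize` and `reassemble` helpers, the addresses, and the 8-byte chunk size are all made up for illustration.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Packet:
    src: str        # who sent it
    dst: str        # where it should go
    protocol: str   # e.g. "tcp"
    port: int       # e.g. 80 for HTTP, 22 for ssh
    seq: int        # position of this chunk in the original message
    payload: bytes  # the slice of data being delivered


def packetize(data: bytes, src: str, dst: str, protocol: str, port: int, size: int = 8) -> list:
    """Chop data into fixed-size chunks and put a header on each one."""
    return [
        Packet(src, dst, protocol, port, seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list, my_address: str, port: int) -> bytes:
    """Keep only packets addressed to us on this port, then restore the order."""
    mine = [p for p in packets if p.dst == my_address and p.port == port]
    return b"".join(p.payload for p in sorted(mine, key=lambda p: p.seq))


message = b"GET /index.html HTTP/1.1"
packets = packetize(message, "10.0.5.1", "10.0.5.4", "tcp", 80)
random.shuffle(packets)  # packets are not guaranteed to arrive in order
received = reassemble(packets, "10.0.5.4", 80)
print(received == message)
```

The receiver filters on its own address and port and sorts by sequence number, which is the "listens for packets addressed to it and reassembles them" part.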

You connect to physical hardware like switches, wireless access points, and routers, which figure out how to get packets to their destinations and route them around from hop to hop. By default nothing is encrypted and anything can talk to anything, which opens up opportunities for attackers like sniffing the network to read someone else’s packets, faking their address, and other shenanigans. There isn’t any security built in, so we add firewalls to perform additional analysis, with rules about which things are allowed to talk to each other over which ports. Some advanced firewalls do more interesting things with more complex rules. Without firewalls anything could talk to anything.

This hardware is physically wired together. Firewalls are expensive, so we tend to put them on the perimeters of our networks, and sometimes add some extra layers inside our networks. All these little packets are zooming around the wires from hop to hop until they get where they are going.
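
The firewall rules described above can be sketched as a simple allow list. This is an assumption-heavy toy: the `Rule` class, the two example rules, and `firewall_allows` are hypothetical, and real firewalls track far more than source, protocol, and port. The source ranges use the /16 style of notation we get to later.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    source: str    # who may connect, written as an address range
    protocol: str  # e.g. "tcp"
    port: int      # e.g. 22 for ssh


# Hypothetical rules, for illustration only.
ALLOW_RULES = [
    Rule("10.0.0.0/16", "tcp", 22),  # ssh only from inside the network
    Rule("0.0.0.0/0", "tcp", 80),    # HTTP from anywhere
]


def firewall_allows(src_ip: str, protocol: str, port: int, rules=ALLOW_RULES) -> bool:
    """Return True only if some rule explicitly allows this traffic."""
    src = ip_address(src_ip)
    return any(
        src in ip_network(rule.source)
        and protocol == rule.protocol
        and port == rule.port
        for rule in rules
    )


print(firewall_allows("10.0.5.4", "tcp", 22))        # inside host using ssh
print(firewall_allows("216.144.238.51", "tcp", 22))  # outside host using ssh
print(firewall_allows("216.144.238.51", "tcp", 80))  # outside host using HTTP
```

Anything that matches no rule is refused, which is the behavior a firewall adds on top of a network that would otherwise let anything talk to anything.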

A network is comprised of the physical boxes and wires connected together, but also the configuration that allows things plugged into the network to talk. Another over-simplification is that a network has a base address/configuration, and something that controls and manages all the connections inside it. A subnet is a smaller portion of a network, and they are how we divide things up. Most of you are familiar with seeing IP addresses like 10.0.5.4. That’s an IP version 4 address, and almost everything has an address — IPv4, IPv6, or both. For our example the ‘base’ of the network is 10.0.0.0 and 10.0.5.4 is an address assigned to a single device on the network, such as your computer. We will focus on IPv4 addresses for a while, but eventually will get into IP version 6 — once I figure it out myself :)

Some IP addresses are unique worldwide, and we can connect different networks together using various protocols like IP and others that run on top of IP — mostly TCP and UDP. This is how we build the Internet: by connecting together a bunch of networks using standard protocols. Our 10.0.0.0 address range won’t route over the Internet, so it’s one we use for smaller networks like homes, while an address like 216.144.238.51 is globally unique and routes… somewhere that doesn’t matter today.
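
Python's standard `ipaddress` module is a handy way to poke at these ideas. This is just a sketch: I am assuming a /16 size for the 10.0.0.0 network, which the text above does not actually specify.

```python
import ipaddress

# Assumption: the 10.0.0.0 network is a /16 (the text only gives the base address).
network = ipaddress.ip_network("10.0.0.0/16")
device = ipaddress.ip_address("10.0.5.4")
public = ipaddress.ip_address("216.144.238.51")

print(device in network)   # is the device's address inside the network's range?
print(device.version)      # 4 for IPv4, 6 for IPv6
print(device.is_private)   # private ranges do not route over the Internet
print(public.is_private)
print(public.is_global)    # globally routable addresses
```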

In today’s lab you’ll see networks and subnets in action, and we’ll spend more time on them as we explore a network already running in our AWS accounts.

How an AWS network works (the super simple starter edition)

All networks use software built into the switches/routers/etc., but because all those physical things need standard ways to talk to each other, there’s plenty of complexity and capabilities. It can also be limiting — the structure of the network is constrained by physical wiring and existing protocols.

Amazon, every other cloud provider, and many work networks get around some of this by using more advanced software overlays called Software Defined Networks (SDNs). SDNs decouple the control plane from the data plane, which means they still use all that hardware and all (well, some) of those standard protocols, but they manage how things move around with extra software. This is its own kind of virtualization.

Here are two simple examples of why you could use an SDN. Imagine if you want a subnet of things that want to talk to other things, but you can’t plug them all into the same boxes due to physical limitations. With an SDN you can define a subnet that covers two different corners of your data center, and all the software on the servers in both corners thinks it’s just running off a single large physical switch. Or, in the case of AWS and other cloud providers, what if you need to give two different customers the exact same network addresses, while also ensuring they can’t see each other’s traffic? Because that’s kind of the only way a big cloud can work.

What I’m about to write is my highly simplified interpretation of an amazing AWS presentation called a Day in the Life of a Billion Packets and the later version called Another Day, Another Billion Flows. I’ll be stealing from these presentations for the next few months as we dig into networking. If you have a network background I strongly recommend you watch them.

AWS needed to create massive global networks with ridiculously large datacenters, but still allow every customer to use whatever non-Internet addresses they wanted (because that’s what you can do in your datacenter, your office, or even your house) while not seeing anyone else’s traffic. If 1.75 million customers all want to use 10.0.5.4, AWS needs to let them. And I wouldn’t be surprised if more than 1.75 million customers were actually using that IP address right now!

Amazon networks are called VPCs (Virtual Private Clouds). Right now you have a VPC running in every region of every one of your accounts, and they all use the same IP addresses and look exactly the same, but you aren’t paying a penny for them until you use them. A VPC is a type of SDN. Here’s a simplified diagram of how a VPC works, and a walkthrough. Keep in mind that this is not totally accurate. There are a lot of behind-the-scenes things I’m choosing to skip over because I probably don’t fully understand them, and some are simply not public. How AWS manages their hardware and software shouldn’t matter to us as customers, but it is interesting.

  • We have 2 physical Amazon servers. Each has its own unique IP address.

  • We have 2 customers, each with a single VPC. One is Red, the other Blue, and each VPC has its own unique ID. You’ll actually see this when we start poking around, and the ID is part of the ARN (Amazon Resource Name).

  • Notice they both use the same IP address range: 10.0.0.0/16. The /16 tells us how large the network is, using a notation called Classless Inter-Domain Routing (CIDR), which we will discuss later.

  • Imagine that each Amazon server is running a bunch of customer instances (virtual machines). Red’s instance with the address of 10.0.5.1, instance ID i-159, on Amazon’s server with underlying address 10.0.5.4, wants to talk to Red’s 10.0.5.4 (instance i-789), which is on another Amazon server with its own address 192.168.7.2. But that server also contains a Blue instance with Blue IP 10.0.5.4.

    • Confusing, huh? That’s 3 things with the same IP address, overlapping in weird ways.

  • To start, instance 10.0.5.1 sends a packet out through its network interface. This isn’t a physical network card. It’s a virtual thing AWS provides called an Elastic Network Interface (ENI). The Amazon server ‘steals’ the packet, instead of putting it on a physical network.

    • An instance can have multiple ENIs with different addresses just like a computer can have multiple physical network cards.

  • Amazon’s server ‘wraps’ the packet up into a different… kind of packet. This wrapper has all the extra information Amazon needs, including the VPC ID.

  • The server then communicates with a massive database called the Mapping Service. This Amazon-specific service tracks all the VPCs and rules about what can talk to what, and where things are on Amazon’s actual datacenter network.

  • The Mapping Service tells the Amazon server where to send the packet, which then drops onto the physical network and goes to Amazon’s destination server.

  • The destination server sees the packet, checks its wrapper, and then checks back in with the Mapping Service to see whether the packet is supposed to be there and whether it’s allowed. Please, AWS friends — I KNOW THIS ISN’T HOW IT REALLY WORKS, but it’s the best way to explain in the little space I have! 

  • If the packet is allowed, it gets routed to the correct instance and unwrapped, and the new packet is sent to the ENI of the destination instance.

This works because the VPC software keeps track of what can talk to what and where it all is on the AWS datacenter’s network. Wrapping ensures only traffic from resources on the same VPC can talk to each other. All the Mapping Service checks make sure there aren’t mistakes, and that traffic can’t be spoofed/faked, because all sources and destinations are checked on both sides.
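
Here is the walkthrough above as a toy simulation. To be loud about it: this is my simplified story turned into a sketch, not how AWS actually implements anything. The `MAPPING` table, the `wrap` and `deliver` functions, the VPC IDs `vpc-red` and `vpc-blue`, and Blue's instance ID `i-456` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str      # customer-visible source IP
    dst: str      # customer-visible destination IP
    payload: str


@dataclass(frozen=True)
class Wrapped:
    vpc_id: str    # which customer network this belongs to
    src_host: str  # physical server that sent it
    dst_host: str  # physical server that should receive it
    inner: Packet  # the original customer packet, untouched


# Toy "Mapping Service": (VPC ID, customer IP) -> (physical server IP, instance ID)
MAPPING = {
    ("vpc-red", "10.0.5.1"): ("10.0.5.4", "i-159"),
    ("vpc-red", "10.0.5.4"): ("192.168.7.2", "i-789"),
    ("vpc-blue", "10.0.5.4"): ("192.168.7.2", "i-456"),  # hypothetical Blue instance
}


def wrap(vpc_id: str, packet: Packet):
    """Source server: intercept the packet and wrap it with VPC details."""
    src = MAPPING.get((vpc_id, packet.src))
    dst = MAPPING.get((vpc_id, packet.dst))
    if src is None or dst is None:
        return None  # unknown endpoint in this VPC, so the packet is dropped
    return Wrapped(vpc_id, src[0], dst[0], packet)


def deliver(wrapped: Wrapped):
    """Destination server: re-check the wrapper, then unwrap for the instance."""
    src = MAPPING.get((wrapped.vpc_id, wrapped.inner.src))
    dst = MAPPING.get((wrapped.vpc_id, wrapped.inner.dst))
    if src is None or dst is None:
        return None
    if src[0] != wrapped.src_host or dst[0] != wrapped.dst_host:
        return None  # wrapper does not match the mapping: spoofed or misrouted
    return dst[1]    # instance ID that receives the inner packet


red = wrap("vpc-red", Packet("10.0.5.1", "10.0.5.4", "hello"))
print(deliver(red))  # Red's 10.0.5.4, not Blue's and not the physical server
print(wrap("vpc-blue", Packet("10.0.5.1", "10.0.5.4", "hi")))  # Blue has no 10.0.5.1
```

Because every lookup is keyed on the VPC ID plus the customer IP, three things can share 10.0.5.4 without ever colliding, and a wrapper that lies about its source server fails the check on the destination side.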

Why VPCs are amazeballs

My first description of a regular network included some security problems. Without firewalls or other add-ons, anything can talk to anything else, and anything on the wires can sniff (listen to) any of the traffic passing by.

VPCs are default deny and eliminate spoofing, sniffing, and most common network layer attacks! How?

Everything a customer runs on a VPC is virtualized. You can’t trick your instance into listening to the underlying network’s traffic. You only get to see your overlaid SDN (the VPC). All customer traffic is snagged by the software layer, and doesn’t get to travel directly on the physical network.

The VPC gets to run all its software rules on those intercepted packets, and will only route allowed traffic around the network, and only to the defined destination!

VPC Red can only talk to VPC Red. VPC Blue will never see any VPC Red traffic, because Red packets only go from Red source to Red destination. Even cooler? Things in VPC Red can’t even sniff other Red traffic, because packets route directly from the source to the destination in the VPC, and are never dropped willy-nilly onto a network or subnet!

One year I was teaching this at Black Hat and James Arlen, my co-instructor, offered one of the most profound descriptions I’ve ever heard for why this works: “A VPC effectively turns a packet switched network into a circuit switched network.”

What does that mean? In packet switched networks, we drop packets onto subnets, and everything on the subnet listens for anything addressed to it. This is how adversaries can just sniff what’s going on. A circuit switched network is like running the wires directly from A to B. The packets flow directly, and nothing else is on the line.

(Technically AWS is on the line, but they have a lot of rules and internal tech to keep themselves from sniffing. Especially using their new, super-cool Nitro hardware, which is built into many of Amazon’s servers).
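
The packet switched versus circuit switched difference can be sketched in a few lines. This is a deliberately crude toy of the description above (the host names and both functions are made up), and it ignores everything real switches do.

```python
def shared_segment(hosts, dst, payload):
    """Packet switched toy: the packet lands on the segment and every host can see it."""
    # dst is ignored on purpose; it is up to each host to discard what is not theirs.
    return {host: payload for host in hosts}


def direct_circuit(hosts, dst, payload):
    """Circuit-like toy: the packet goes straight to the destination and nowhere else."""
    return {host: payload for host in hosts if host == dst}


hosts = ["web", "db", "sniffer"]
seen_shared = shared_segment(hosts, "db", "secret")
seen_direct = direct_circuit(hosts, "db", "secret")

print(sorted(seen_shared))  # ['db', 'sniffer', 'web']
print(sorted(seen_direct))  # ['db']
```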

On a VPC:

  • Only allowed traffic is accepted. The network drops anything weird or unexpected, and weird or unexpected traffic is how many network attacks work on non-cloud hardware.

  • Traffic only goes from source to destination, and only when allowed. Even on your own VPC you can’t sniff traffic without messing with the routing or using special (expensive) monitoring capabilities.

  • Traffic never leaves a VPC unless you use specific additional software constructs to allow it. One of these is called an Internet Gateway and handles communicating from your VPC to the Internet. We’ll see one later.

I love VPCs. They fundamentally changed how I think about networking and how I define architectures. Over the coming months (heck, years) we will explore many different ways of using them, and we will eventually move beyond my simplified explanations to learn more details about how they work and how to configure them. I wish more organizations were able to use them the right way, but that’s a thorny issue I’ll bring up later, when I’m ready to piss off a lot of people.

But we need to start by getting our hands dirty, seeing what a VPC really looks like!

Today’s lesson key points are:

  • Cloud providers use Software Defined Networks for their customers, but these still use an underlying physical network.

  • Amazon’s network is called Virtual Private Cloud (VPC).

  • VPCs effectively take a packet switched network and make it more like a circuit switched network, where packets are “wrapped” and go directly from a source to a destination.

  • VPCs use default deny and only traffic that is allowed, with a valid route, goes from a source to a destination. All other packets are dropped.

The Lab

Today’s lab is short and sweet — we will take a tour of the default VPC in one of our accounts, and look at the core primitives that make up a VPC.

Then we’ll delete it so we can build our own from scratch in the next lab.

Video Walkthrough

Step-by-Step

Start from your sign in portal. We want to use TestAccount1 with AdministratorAccess:

Then go to VPC:

The dashboard added a really cool feature to show all your VPC resources across all your regions. Expand the VPC section, and you can see you already have multiple virtual networks running in your account. There’s no traffic on them, so you aren’t paying anything yet. Also notice how you see Unsupported for some regions. Like every region except Oregon and Virginia. Why? Because we shut down those regions with our SCP in an earlier lab. 😀 

Each VPC is a complete virtual network. Make sure you are in us-west-2 (Oregon) and click Your VPCs, then the only one in the list for a deeper look.

There is a lot on the next screen. You get to see all the core primitives (components) which go into a VPC:

  • The VPC itself. This is the virtual network, and it has an IP address range. That /16 at the end is what we call CIDR notation; it means “this is a network large enough for about 65K potential IP addresses, using the range 172.31.x.x”.

  • 4 subnets. Each subnet has its own CIDR range inside 172.31.0.0.

    • Notice us-west-2a? That’s the Availability Zone. An AZ is one or more separate data centers in a region, connected to the rest of the region. Each AZ has its own power and networking, so if one goes down the others stay up. We can assign subnets to different AZs and spread resources around for resilience.

  • One route table. This decides where traffic is routed — whether to the same network, or out to the Internet.

  • One Internet Gateway. You don’t need one, but this is the construct that connects your VPC to the Internet. We will come back to IGs in a minute.
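
To make the VPC range and its subnets a little more concrete, here is a small sketch using Python's standard `ipaddress` module. The subnet ranges and AZ names below are assumptions for illustration; check the console for the real values in your account.

```python
import ipaddress

vpc = ipaddress.ip_network("172.31.0.0/16")
print(vpc.num_addresses)  # the "about 65K" from above

# Assumption: four /20 subnets carved from the start of the VPC range.
subnets = list(vpc.subnets(new_prefix=20))[:4]

# Assumption: one subnet per Availability Zone, with these AZ names.
zones = ["us-west-2a", "us-west-2b", "us-west-2c", "us-west-2d"]

for subnet, zone in zip(subnets, zones):
    # Every subnet must sit inside the VPC's range.
    print(subnet, zone, subnet.subnet_of(vpc))
```

Each subnet is a slice of the VPC's range, and the slices do not overlap, which is what lets us spread them across AZs.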

This visual map is pretty new and very welcome. Previously it was extremely difficult to walk through all the connections by clicking around the console.

Click around a little, focusing on the numbered items for now:

  1. This is the unique VPC ID. Remember everything in AWS has a unique ID, and this becomes part of the ARN.

  2. This is a default VPC. That means it’s there by default, and when you run things like instances, they will use this VPC unless you specify another one. You don’t need a default, and I usually delete them.

  3. Your CIDR (IP address) range. 172.31.0.0/16 is what we call a class B network (based on its size — that /16). I often use 10.0.0.0/16. 192.168.0.0/16 is another common non-Internet-routable IP network range. Those network ranges work fine for local networks, but won’t work on the public Internet.

  4. The subnets showing the visual connections and how they share…

  5. A route table. Subnets can share route tables, or a subnet can use its own private table.

  6. The network connection here shows connections to other networks. In this case it’s an Internet gateway. You can tell because the ID starts with “igw-”.

One piece to ignore for now is the Main network ACL. You can click it to look, but it’s set to allow all traffic. I don’t like ACLs in VPCs, and I generally don’t use them. We will get to them way down the road when we cover thornier advanced topics. We will eventually cover everything you see, but today is all about the VPC, subnets, and route tables.

But isn’t this pretty?

That /20 means about 4,000 IP addresses are available. Click on the ‘expand’ box/arrow thingy to open a new tab for this subnet.
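
If you want the exact number, the arithmetic is 2 to the power of (32 minus the prefix length). A quick sketch, assuming the subnet is 172.31.0.0/20 (yours may differ) and accounting for the five addresses AWS reserves in every subnet:

```python
import ipaddress

# AWS reserves the first four addresses and the last address of each subnet.
AWS_RESERVED_PER_SUBNET = 5


def subnet_capacity(cidr: str):
    """Return (total addresses, addresses you can actually assign) for a subnet."""
    total = ipaddress.ip_network(cidr).num_addresses
    return total, total - AWS_RESERVED_PER_SUBNET


# Assumption: an example /20 subnet inside the default VPC range.
print(subnet_capacity("172.31.0.0/20"))
print(2 ** (32 - 20))  # same total, computed by hand
```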

Explore this page, focusing on the Route Table. All the subnets share this one set of rules, but they don’t need to. Route tables in AWS are pretty easy to read once you get to know them:

  • All traffic in the same VPC should route locally. This doesn’t mean the packets are dropped into a local pool like on a physical network — they still go from point to point — but they are allowed to go anywhere on the local VPC.

  • 0.0.0.0/0 is a magic range that matches every address, so it catches anything without a more specific route. In practice that means “the Internet”. Anything destined for the Internet goes to the Internet Gateway.

These subnets are public because they connect directly to an IG. AWS also supports private subnets, which don’t route 0.0.0.0/0 to an IG.
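
Reading a route table comes down to "most specific match wins". Here is a minimal sketch of that lookup and of the public versus private check. The `ROUTE_TABLE` list, the helper functions, and the gateway ID are hypothetical; this models the idea, not the AWS API.

```python
import ipaddress

# A toy version of the default route table. The igw ID is made up.
ROUTE_TABLE = [
    ("172.31.0.0/16", "local"),
    ("0.0.0.0/0", "igw-0123456789abcdef0"),
]


def lookup(dst_ip: str, routes=ROUTE_TABLE):
    """Return the target of the most specific route matching dst_ip, or None."""
    dst = ipaddress.ip_address(dst_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if dst in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route, so the traffic goes nowhere
    return max(matches)[1]  # longest prefix wins


def is_public(routes=ROUTE_TABLE) -> bool:
    """A subnet is public if its route table sends 0.0.0.0/0 to an Internet Gateway."""
    return any(cidr == "0.0.0.0/0" and target.startswith("igw-") for cidr, target in routes)


print(lookup("172.31.20.7"))     # stays inside the VPC
print(lookup("216.144.238.51"))  # heads for the Internet Gateway
print(is_public())
print(is_public([("172.31.0.0/16", "local")]))  # a private subnet's table
```

Both routes match an address inside the VPC, but the /16 is more specific than the /0, so local traffic never heads for the gateway.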

Click around the VPC, subnets, and route tables. Then go back to the VPC and let’s delete it. Why? We don’t need a default VPC. It’s convenient, but also a small security risk, since everything on it can be open to the Internet, and it’s easy to make a mistake and deploy things there.

That’s it for this week. We’ll build our replacement VPC in the next lab.

-Rich
