NAT Your Way to Privacy (and Maybe Poverty)

This week we'll learn about private subnets and the different options for letting them talk to the Internet, which is annoyingly necessary a lot of the time.

Prerequisites

The Lesson

In Build a VPC from Scratch we created a VPC with two subnets connected to an Internet gateway. As mentioned, this means any resources in those subnets have access to the Internet, and are potentially exposed to inbound access from the Internet. Anything deployed there is potentially public, which is why, you know, we named the subnets “public 1” and “public 2”.

But by now I’ve turned you all into paranoid security freaks, so you break out in a cold sweat at the thought of a server touching the Internet. It’s like the fear you’d experience walking through the wilderness of Alaska. Naked. Covered in honey. And lox.

That’s why AWS calls them VPCs: Virtual Private Clouds. The original version created private networks which could never access the Internet, and could only connect to your datacenter. This led to customers asking annoying questions like, “you’re saying I have to route all my traffic, even to talk to S3, back through my datacenter and then back out through my own firewalls, because the only way to access S3 is over the Internet, even though I CAN SEE THE FILES RIGHT THERE IN THE CORNER OF THE BUILDING!?”

Yes, that was a thing. Actually, it still can be, but we’ll get there. And perhaps I should get to the point.

We don’t want everything to be out on the Internet. In fact, we want to connect as little as possible directly to the Internet, because it will be attacked before you finish reading this sentence. This is actually very easy to implement in AWS. All we need to do is set the route table of a subnet so it doesn’t send 0.0.0.0/0 traffic (meaning "everything else") to the Internet gateway.
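To make the routing idea concrete, here is a minimal Python sketch of how a route table picks a target. It is a toy model, not AWS code: the function name pick_route, the target labels, and the sample addresses are made up for illustration. The one real rule it borrows is that the most specific matching route wins.

```python
import ipaddress

def pick_route(routes, destination_ip):
    """Return the target of the most specific route matching destination_ip."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no matching route, so the traffic goes nowhere
    # Longest prefix wins: 10.0.0.0/16 beats 0.0.0.0/0 for local addresses.
    return max(matches, key=lambda match: match[0].prefixlen)[1]

# Hypothetical route tables for illustration only.
public_routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "internet-gateway"}
isolated_routes = {"10.0.0.0/16": "local"}

print(pick_route(public_routes, "10.0.1.25"))       # local
print(pick_route(public_routes, "203.0.113.10"))    # internet-gateway
print(pick_route(isolated_routes, "203.0.113.10"))  # None
```

Remove the 0.0.0.0/0 entry and anything outside 10.0.0.0/16 simply has nowhere to go, which is the whole trick behind a private subnet.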

In those early days of VPCs, that 0.0.0.0/0 traffic routed back to a datacenter over a dedicated link. We can still use those, and many large organizations still do. But if we don’t route the traffic through a datacenter, this can cause issues. There are two primary ones you discover very quickly:

  • You can’t talk to any AWS services. Why? Because the API endpoints for AWS are all on the Internet, and without Internet access you can’t reach them, even if the servers are in the same building.

  • You can’t perform software updates or talk to anything outside the subnets you are set to route to.

Sometimes this is okay. I’ve designed some really cool architectures which are super secure and work this way. But it is definitely limiting.

Network Address Translation (NAT)

Right now I’m writing this from my home office. My house has a bit of an over-engineered network with 42 active clients connecting via 3 wireless access points and multiple cabled connections. These all talk to the Internet over a single big connection with a modem that has… a single IP address. If you think about it, how does that even work? How can 42 devices share one connection and Internet address?

NAT has been around for a long time. Everything on my network has a private IP address in the 10.0.0.0/16 range, just like the network we built in AWS (heck, I even have multiple subnets because.. you’ve met me, right?) My router uses NAT to keep track of which device is talking to who on the Internet, and makes sure all the requests and responses go back where they should. “Translate this private network address into our shared Internet address, and send the response back to whoever requested it”.
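If you want to see the bookkeeping, here is a toy sketch in Python. It is nowhere near a real NAT implementation (no protocols, no timeouts, no packets), and the class name ToyNat, the starting port, and the addresses are all invented for the example. It only shows the core idea: remember who asked, so the answer can find its way home.

```python
import itertools

class ToyNat:
    """Toy port-based NAT table: many private addresses, one public address."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._out = {}   # (private_ip, private_port) -> public_port
        self._back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        """Translate a private source into the shared public address."""
        key = (private_ip, private_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = key
        return (self.public_ip, self._out[key])

    def inbound(self, public_port):
        """Find who a response belongs to; None means nobody asked for it."""
        return self._back.get(public_port)

nat = ToyNat("203.0.113.7")
laptop = nat.outbound("10.0.1.20", 51000)
phone = nat.outbound("10.0.2.31", 51000)

print(laptop)              # ('203.0.113.7', 40000)
print(phone)               # ('203.0.113.7', 40001)
print(nat.inbound(40001))  # ('10.0.2.31', 51000)
print(nat.inbound(45000))  # None
```

That last line is why NAT also helps with privacy: traffic nobody inside requested has no entry in the table, so there is nowhere to deliver it.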

AWS first solved this with something called a NAT instance. For a long time, this was an instance (virtual machine) you would deploy in your public subnet. It would have a route so it could talk to anything local. Then, in the private subnet, you could route all 0.0.0.0/0 traffic to the public instance. Of course this instance needed software to understand how to NAT. And, of course, they crashed at really inopportune times. This technique still works and is very close to how your home network works.

Instance running NAT software

Early NAT instances only supported one network interface, so they had to live in the public subnet. These days we can configure two network interfaces: one public and one private. This diagram shows the original, and still default, architecture.

Now I want to be clear — there are some very large organizations which still use this approach so they can run custom security services on their Internet-bound traffic. And these days it isn’t unusual to route all your traffic through virtual firewalls from major manufacturers.

But quite a few AWS customers wanted something so they didn’t have to maintain another server. So AWS created the NAT Gateway. This is like an Internet gateway, but it… does NAT. These are serverless constructs, which doesn’t mean there isn’t a server somewhere, just that you, the customer, don’t have to manage it and can’t ever log into it, and we have no idea what the operating system is, or anything else. It’s a magic black box; just accept that. These NAT gateways work well, and I use them all the time, but they have a drawback: cost.

NAT Gateway is a serverless AWS construct

Work in cloud for a while and you start learning which services really increase your bill. For me, NAT gateways are my AWS cost kryptonite. You pay for running it, even if you aren’t routing traffic through it. You also have to pay for an Elastic IP Address, which is a dedicated Internet-routable IP address in Amazon’s pool. Oh, and you also pay for traffic.
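Those three charges are easy to model. The sketch below is only arithmetic, and the rates in it are PLACEHOLDERS I made up so the example runs; they are not AWS prices, so check the current pricing page for your region before trusting any number. It also assumes the Elastic IP is billed by the hour, which is a modeling assumption on my part.

```python
def nat_gateway_monthly_cost(hours, gb_processed, hourly_rate, per_gb_rate, eip_hourly_rate):
    """Add up the three charges: running time, traffic, and the Elastic IP."""
    running = hours * hourly_rate        # billed even when idle
    traffic = gb_processed * per_gb_rate # billed per GB that passes through
    address = hours * eip_hourly_rate    # the Elastic IP (assumed hourly)
    return running + traffic + address

# PLACEHOLDER rates for illustration only, NOT real AWS prices.
RATES = {"hourly_rate": 0.05, "per_gb_rate": 0.05, "eip_hourly_rate": 0.005}

idle = nat_gateway_monthly_cost(hours=730, gb_processed=0, **RATES)
busy = nat_gateway_monthly_cost(hours=730, gb_processed=100, **RATES)

print(f"idle gateway: ${idle:.2f}")
print(f"busy gateway: ${busy:.2f}")
```

The point to notice is that the idle number is not zero. A NAT Gateway you forgot about keeps billing you.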

All that said, if you want a private subnet but also want to allow your resources to talk to things on the Internet, like other AWS services and software updates, NAT is the fastest and easiest option. And really, sometimes it is more cost effective than maintaining your own NAT instance.

Okay experienced AWS folks, you can stop biting your tongues. We do have other ways of allowing private subnet resources to talk to the Internet or AWS services (APIs). We will definitely get to service endpoints later. And there are enterprise-class network options we will never have a lab for since I don’t think Equinix is about to sponsor CloudSLAW and provide us all free MPLS. I’ll talk about all this… much later.

Lesson Key Points

  • Network Address Translation (NAT) enables us to connect private subnets to the Internet for outbound communications.

  • AWS services run on the public Internet, so we need NAT or some other connection to talk to them from our private subnets.

  • AWS supports two types of NAT:

    • NAT instances are virtual machines running NAT software.

    • NAT Gateways are serverless constructs which AWS manages and you pay for.

The Lab

We will create two private subnets, then create a NAT Gateway, and then update the routing tables so the private subnets route to the NAT Gateway.

We’ll finish by deleting the entire thing to save costs. Don’t worry — next lab I’ll provide a CloudFormation template to recreate this configuration on demand.
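The lab itself uses the console, but if you are curious what the build half of this plan looks like as code, here is a rough sketch using boto3 calls. Treat it as an illustration under assumptions, not the official lab: the function name build_private_side is mine, ec2 is assumed to be a boto3 EC2 client such as boto3.client("ec2", region_name="us-west-2"), you supply your own VPC and public subnet IDs, and it skips the Name tags for brevity.

```python
def build_private_side(ec2, vpc_id, public_subnet_id):
    """Create two private subnets, a NAT Gateway, and a route table that uses it.

    `ec2` is assumed to be a boto3 EC2 client; `vpc_id` and `public_subnet_id`
    are IDs from your own account.
    """
    # 1. Two new subnets, one per Availability Zone.
    subnet_ids = []
    for cidr, zone in [("10.0.3.0/24", "us-west-2a"), ("10.0.4.0/24", "us-west-2b")]:
        response = ec2.create_subnet(VpcId=vpc_id, CidrBlock=cidr, AvailabilityZone=zone)
        subnet_ids.append(response["Subnet"]["SubnetId"])

    # 2. The NAT Gateway needs an Elastic IP and sits in a public subnet.
    allocation_id = ec2.allocate_address(Domain="vpc")["AllocationId"]
    nat_id = ec2.create_nat_gateway(
        SubnetId=public_subnet_id, AllocationId=allocation_id
    )["NatGateway"]["NatGatewayId"]
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

    # 3. A new route table whose default route points at the NAT Gateway.
    route_table_id = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]
    ec2.create_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_id,
    )

    # 4. Attach the private subnets to that route table.
    for subnet_id in subnet_ids:
        ec2.associate_route_table(RouteTableId=route_table_id, SubnetId=subnet_id)

    return {
        "subnet_ids": subnet_ids,
        "allocation_id": allocation_id,
        "nat_gateway_id": nat_id,
        "route_table_id": route_table_id,
    }
```

Passing the client in as a parameter keeps the function easy to test with a stand-in object, and it means nothing touches your account until you deliberately call it. Running it for real creates billable resources.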

Video Walkthrough

Step-by-Step

Sign in and go to TestAccount1 with AdministratorAccess. Double check that you are in us-west-2 (Oregon) and then go to VPC. Since you have now navigated through the console dozens of times, I’ll skip the screenshots for this start. Email me if you think I should put them back in, seriously.

The first thing we need to do is create 2 new subnets. Before the step by step screenshots, here are the two key criteria:

  • Name them slaw-private-1 and slaw-private-2.

  • The CIDR ranges will be 10.0.3.0/24 and 10.0.4.0/24.

Even though we will delete these at the end, I will be using these names and ranges in the Infrastructure as Code for future labs. Consistency will make it easier to understand what’s going on. I promise I’m not making a big deal out of this because I was the kind of student who would give things funny names, like every variable in my code named after a Star Wars character and not what the variable was for, and that always worked out fine THANK YOU VERY MUCH.
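If you ever want to sanity check CIDR ranges before typing them into the console, Python's built-in ipaddress module will do it. This is a small sketch, and the helper name check_subnets is just something I made up; the VPC range and subnet ranges are the ones from this lab.

```python
import ipaddress
from itertools import combinations

vpc = ipaddress.ip_network("10.0.0.0/16")
new_subnets = {
    "slaw-private-1": ipaddress.ip_network("10.0.3.0/24"),
    "slaw-private-2": ipaddress.ip_network("10.0.4.0/24"),
}

def check_subnets(vpc, subnets):
    """Return a list of problems; an empty list means the ranges are fine."""
    problems = []
    for name, net in subnets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside {vpc}")
    for (name_a, net_a), (name_b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} overlaps {name_b}")
    return problems

print(check_subnets(vpc, new_subnets))  # []
for name, net in new_subnets.items():
    print(name, net, net.num_addresses, "addresses")
```

Each /24 holds 256 addresses, and both fit inside the VPC's 10.0.0.0/16 without stepping on each other.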

Go to Subnets > Create subnet:

You only have 1 VPC to choose from, making this step easy:

Then fill in the name and CIDR, and set the Availability Zone to us-west-2a:

Now repeat for slaw-private-2 with the 10.0.4.0/24 CIDR, update the Availability Zone to us-west-2b, and Create subnet:

Isn’t this pretty?

Now go to NAT Gateways > Create NAT Gateway. Name it slaw-NAT, choose the slaw-public-1 subnet, and click the button to Allocate Elastic IP. This will assign a public IP address from Amazon’s (massive) pool to your NAT Gateway. As a reminder, you pay for these so we will release it at the end of the lab. Then Create NAT gateway:

At this point we have two new subnets but they are sharing the main route table, so despite the name they are public subnets. We have a NAT gateway sitting in one of our public subnets, but nothing is routing traffic to it. We can quickly fix both issues with a simple route table. As a reminder, we have a single route table for our entire VPC, which routes all traffic through the Internet gateway. It looks like this:

Go to Route tables > Create route table:

Call it slaw-private-route and then Create route table:

It should take you right to the route table, where you can see there is only one route for local to 10.0.0.0/16 just like our first route table. We need to Edit routes:

Create a new route with a Destination 0.0.0.0/0, then click the dropdown under Target, select NAT Gateway, and then click the dropdown below to select the only one we have in the VPC. Then Save changes:

Now we need to tell our private subnets to use this new route table. Go to Subnet associations and Edit subnet associations:

Select our two private subnets and Save associations:

Now if you go back into Subnets, pick one of the private subnets, and click Route table, you can see that all Internet traffic now routes to the NAT Gateway:

Our VPC now has two public subnets which can talk to the Internet and accept Internet traffic using the Internet gateway. We also have two private subnets which can talk to the Internet but the Internet can’t see, thanks to the NAT Gateway.
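The public versus private distinction comes down to where 0.0.0.0/0 points, not what the subnet is called. Here is a small Python sketch of that rule. The function name, the "isolated" label for a table with no default route, and the resource IDs are my own placeholders; the only real convention it leans on is that Internet gateway IDs start with igw-.

```python
def classify_subnet(routes):
    """Classify a subnet by the target of its 0.0.0.0/0 route."""
    target = routes.get("0.0.0.0/0")
    if target is None:
        return "isolated"  # my label: no default route at all
    if target.startswith("igw-"):
        return "public"    # default route goes to an Internet gateway
    return "private"       # NAT Gateway, NAT instance, or anything else

# Hypothetical route tables; the IDs are placeholders, not real resources.
main_table = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-0example"}
private_table = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-0example"}
local_only_table = {"10.0.0.0/16": "local"}

print(classify_subnet(main_table))        # public
print(classify_subnet(private_table))     # private
print(classify_subnet(local_only_table))  # isolated
```

This is also why our new subnets were technically public until we associated them with the new route table, no matter what we named them.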

It’s pretty cool. And now we need to destroy it like a father kicking his kids’ sandcastle at the beach. Not out of spite, but to teach them what to expect if they buy beachfront property and don’t take global warming seriously.

Follow these steps; I’m including a few key screenshots for the important parts.

  • Delete the NAT Gateway. This is the most time-consuming part; it can take 5-10 minutes before it’s fully deleted. You won’t be able to clear anything else out until this is complete.

  • Release the Elastic IP address. You can’t do this until the NAT Gateway is fully deleted by AWS.

  • Delete the VPC.

You won’t be able to do this for 5-10 minutes, until the NAT Gateway is fully cleared.
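The ordering matters enough that it is worth seeing as code. This is a hedged sketch of the first two cleanup steps using boto3 calls; the function name tear_down_nat is mine, ec2 is assumed to be a boto3 EC2 client, and you supply the IDs from your own account. It stops before deleting the VPC, since the console's Delete VPC cleans up the subnets and route tables for you and a script would have to remove those pieces itself.

```python
def tear_down_nat(ec2, nat_gateway_id, allocation_id):
    """Delete the NAT Gateway, wait for it to be gone, then release its Elastic IP.

    `ec2` is assumed to be a boto3 EC2 client; both IDs come from your account.
    """
    ec2.delete_nat_gateway(NatGatewayId=nat_gateway_id)

    # Deletion is not instant; the address stays attached until it finishes.
    ec2.get_waiter("nat_gateway_deleted").wait(NatGatewayIds=[nat_gateway_id])

    # Only now can the Elastic IP be released, which stops the charge for it.
    ec2.release_address(AllocationId=allocation_id)
```

Skipping the wait is the classic mistake here: the release fails while the gateway is still deleting, and the Elastic IP quietly stays on your bill.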

That’s it!

Lab Key Points

  • NAT Gateways live in the public subnet so they can access the Internet.

  • NAT Gateways require a dedicated elastic IP address, an Internet-routable IP address, so they can receive traffic back from whatever they are talking to.

  • In our route tables, if the subnet routes 0.0.0.0/0 to an Internet gateway, that subnet is public. If 0.0.0.0/0 routes to a NAT Gateway, NAT instance, or other destination, it’s a private subnet.

-Rich
