Building a Kubernetes Cluster on AWS EKS using Terraform – Part II

Part II – configuring the AWS basics

In the last article of the series, I explained the basics of Terraform and how to set it up to connect to your AWS account and share its state via S3. This time, we will take a look at how you can split your Terraform resource setup into modules and set up the basic networking infrastructure we need on AWS, including the VPC, subnets, gateways and route tables.

Splitting your Terraform setup into multiple scripts isn’t strictly necessary, but I still recommend doing it, if only for clarity’s sake – it’s much easier to understand a finished stack of Terraform files if you can work through them one after another.

Creating a new Terraform module is easy – it basically means creating another folder with new .tf-files. Watch out though – a new folder also means the variables you defined in the root folder need to be defined again if you use them. When running the module from the root level, you can then insert the variable values into the module when you define it. You will see how exactly that is done at the end of the article, when we insert the finished module into the root level setup.

Creating the VPC: the foundation of your AWS network

The most basic part of an AWS infrastructure is the Virtual Private Cloud, or VPC for short. It encapsulates your infrastructure in its own networking area, which contains all the other relevant parts. It can be set up easily like so:
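A minimal sketch of such a VPC definition, assuming Terraform 0.12 or newer. The resource name `main`, the variable `vpc_cidr`, its default value and the tag values are illustrative assumptions, not necessarily the names used in the repository for this series:

```hcl
# Hypothetical variable; it has to be declared inside the module folder.
variable "vpc_cidr" {
  description = "CIDR block of the VPC"
  type        = string
  default     = "10.0.0.0/16"
}

resource "aws_vpc" "main" {
  cidr_block = var.vpc_cidr

  # Both DNS settings are needed later for internal address resolution.
  enable_dns_support   = true
  enable_dns_hostnames = true

  # Placeholder tag values; tags document where a resource came from.
  tags = {
    Name    = "eks-example-vpc"
    Project = "eks-example"
  }
}
```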

DNS hostnames and DNS support are features we will need at a later point to set up internal address resolution. The CIDR block is important if you use multiple VPCs in your project, as VPCs cannot be connected to each other properly (for example via VPC peering) if their CIDR blocks overlap. Tags can be assigned to any resource to make it easier to understand at a later point where it came from and what domain it belongs to.

Using subnets for network separation and high availability

The next part of our AWS infrastructure will be subnets. They are used to separate your VPC into multiple smaller networks, each with its own CIDR block and its own routing configuration. For our example, we create three subnets:

  • One gateway subnet, which contains an internet gateway that can be used to connect to outside resources and receive traffic from the internet,
  • One application subnet, which will contain our EKS cluster and other resources related to our hosted services and servers,
  • And one database subnet, which should only be accessed from our application subnet to make sure that our data is safe and basically impossible to access from the outside.

If you want to make sure your subnet contents are highly available, you can also create multiple subnets of each kind, using different availability zones. To do this, we define a variable holding the number of availability zones we want to cover, and pass it to the resource meta-argument "count". Terraform then creates that many instances of the resource:
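A sketch of how the three kinds of subnets could be declared with `count`, assuming Terraform 0.12 or newer and a VPC resource named `aws_vpc.main` with a /16 CIDR block in the same module. The variable name `az_count`, the resource names and the CIDR offsets (0, 10, 20) are assumptions chosen for illustration; the offsets only stay collision-free for up to ten availability zones:

```hcl
# Hypothetical variable: how many availability zones to cover.
variable "az_count" {
  description = "Number of availability zones to spread the subnets across"
  type        = number
  default     = 2
}

# Data source from the AWS provider: zones available in the provider's region.
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_subnet" "gateway" {
  count             = var.az_count
  vpc_id            = aws_vpc.main.id
  availability_zone = data.aws_availability_zones.available.names[count.index]
  # With a /16 VPC, 8 additional bits yield /24 subnets: x.x.0.0/24, x.x.1.0/24, ...
  cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)

  tags = {
    Name = "eks-example-gateway-${count.index}"
  }
}

resource "aws_subnet" "application" {
  count             = var.az_count
  vpc_id            = aws_vpc.main.id
  availability_zone = data.aws_availability_zones.available.names[count.index]
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, 10 + count.index)

  tags = {
    Name = "eks-example-application-${count.index}"
  }
}

resource "aws_subnet" "database" {
  count             = var.az_count
  vpc_id            = aws_vpc.main.id
  availability_zone = data.aws_availability_zones.available.names[count.index]
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, 20 + count.index)

  tags = {
    Name = "eks-example-database-${count.index}"
  }
}
```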

Note that we also use our first data source in this snippet. These can be defined in Terraform to get data from sources like the state and providers. In this case, we use a data source type provided by the AWS provider to fetch a list of available Availability Zones for the region we use with our AWS provider. The resulting list is then used to define the availability zone in each subnet.

Opening a doorway to the outside: Internet Gateways

Our subnets are currently closed off to the outside. We need a way to connect to the internet for various reasons though – one of them being the fact that our EKS master will not be available inside our custom subnet right away, and another that we need to set up various packages using repositories outside our cluster.

To do this, we use the AWS resource Internet Gateway.
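As a sketch, an Internet Gateway only needs to know which VPC it is attached to. The resource names `main` and the tag value are assumptions; `aws_vpc.main` stands for the VPC defined earlier in the module:

```hcl
resource "aws_internet_gateway" "main" {
  # Attach the gateway to the VPC (hypothetical resource name).
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "eks-example-igw"
  }
}
```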

While we want to be able to connect to instances in our application subnet, the other subnets should be mostly isolated from the internet. To ensure that these can still connect to the outside without being targetable from the internet, we create a NAT Gateway. To address these from each subnet, we also need to create an Elastic IP Address every time – these are unchanging IPs that we can associate with our resources to make sure they are always available at the same endpoint.
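A possible shape for this, with one Elastic IP and one NAT Gateway per availability zone. It assumes the hypothetical names `var.az_count`, `aws_subnet.gateway` and `aws_internet_gateway.main` exist in the same module, and a recent AWS provider (version 5 or newer) where the Elastic IP takes `domain = "vpc"`; older provider versions use `vpc = true` instead:

```hcl
# One unchanging public IP per NAT Gateway.
resource "aws_eip" "nat" {
  count  = var.az_count
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  count         = var.az_count
  allocation_id = aws_eip.nat[count.index].id
  # A NAT Gateway lives in a public subnet, here the gateway subnet of each zone.
  subnet_id = aws_subnet.gateway[count.index].id

  # Make sure the Internet Gateway exists before the NAT Gateway is created.
  depends_on = [aws_internet_gateway.main]
}
```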

Drawing the map to navigate the subnet-seas using Route Tables

To make sure that instances inside the subnets use our gateways by default when they try to connect to the internet, we implement Route Tables. They define which path traffic takes, based on its destination CIDR block:
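A sketch of two such Route Tables, one sending all outbound traffic through the Internet Gateway and one sending it through a NAT Gateway. The resource names and `var.az_count` are assumptions, and `aws_internet_gateway.main` and `aws_nat_gateway.main` are expected to be defined elsewhere in the module:

```hcl
# Gateway subnets: default route (0.0.0.0/0) points at the Internet Gateway.
resource "aws_route_table" "gateway" {
  count  = var.az_count
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

# Application subnets: default route points at the NAT Gateway of the same zone.
resource "aws_route_table" "application" {
  count  = var.az_count
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }
}
```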

We create a Route Table for each subnet, making sure that our gateway subnets use the previously built Internet Gateway to allow connections in and out, and the application subnets use the NAT Gateways to connect to the internet.

All that’s missing is the actual association between the Route Tables and our previously created subnets.
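The association can be sketched with `aws_route_table_association`, which links exactly one subnet to one Route Table. The names `aws_subnet.gateway`, `aws_subnet.application`, `aws_route_table.gateway`, `aws_route_table.application` and `var.az_count` are assumptions standing in for the resources created above:

```hcl
resource "aws_route_table_association" "gateway" {
  count          = var.az_count
  subnet_id      = aws_subnet.gateway[count.index].id
  route_table_id = aws_route_table.gateway[count.index].id
}

resource "aws_route_table_association" "application" {
  count          = var.az_count
  subnet_id      = aws_subnet.application[count.index].id
  route_table_id = aws_route_table.application[count.index].id
}
```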

Moving into a module

As mentioned at the start of the article, we can separate our Terraform files into modules to help with clarity. For this, we move our files into a subfolder of our initial setup.

We also create a file to call our submodules while running the root level:
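A sketch of what such a root-level file might contain, assuming Terraform 0.12 or newer. The module name `network`, the folder `./network` and the variable names `vpc_cidr` and `az_count` are hypothetical; the values themselves would come from terraform.tfvars:

```hcl
# Root-level declarations; the module folder must declare variables
# with the same names for the values to be passed in.
variable "vpc_cidr" {
  type = string
}

variable "az_count" {
  type = number
}

module "network" {
  # Relative path to the subfolder holding the networking .tf files.
  source = "./network"

  # Insert the root-level values into the module.
  vpc_cidr = var.vpc_cidr
  az_count = var.az_count
}
```

After adding a new module block, `terraform init` has to be run again so that Terraform registers the module before planning.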

Remember that you need to re-define your variables in the context of the module again!

Foundation laid

The resources resulting from our .tf-files are the foundation we need to build our EKS cluster on AWS. With all the networking set up, we will continue in the next article with the setup of Security Groups, which will be used to finely control which instances can communicate with certain other instances or the internet at a later point.

To try it out yourself, make sure you have added your own configuration to the terraform.tfvars and state_config.tf files and run "terraform plan -var-file terraform.tfvars". If everything goes right, Terraform will list all resources it needs to build to match the files, summarised as "Plan: 22 to add, 0 to change, 0 to destroy."

You can check out the code in my GitHub repository for the article series – don’t forget to enter your values for the access keys and region in the .tfvars file and the bucket configuration before running it!