DevOps

Deploying a three-tier infrastructure on AWS using Python and Pulumi

In my last article I introduced Pulumi as a viable alternative to Terraform for provisioning cloud resources from code. In this article, we will take a closer look at an example of Pulumi code used to build a simplified three-tier architecture for service deployments in the cloud. The main objective is to show how Python & Pulumi can be used in tandem to set up resources, so we are working with a relatively simple infrastructure example.

The full code is available on my GitHub – we create some resources multiple times with only small differences, so I will not be pasting all of the code into this article; I recommend following along in the code instead.

Tools of the trade

We use three different tools – or rather platforms – to provision our architecture. Our Infrastructure-as-Code tool of choice is Pulumi, a modernized alternative to Terraform based on the same principles. If you are curious about the similarities, differences, and general functionality of Pulumi, check out my last article outlining Pulumi.

The programming language and environment we use is Python. It allows us to comfortably list our required resources, and its extensibility and isolation principles make sure that we can run our code anywhere with the same results.

The cloud provider we use is Amazon Web Services. While it seems more complex than other providers at first glance, AWS offers powerful options to structure an infrastructure, with finely tunable control over interactions between every corner of your system.

Setting up Pulumi & Python

Setting up our software is easy. Make sure Python3, pip and Pulumi are installed on your system. On Mac, you can easily do this using Homebrew.

You also need to have the AWS CLI installed and configured on your system – I recommend using AWS profiles to save your credentials, as it makes connecting Pulumi and your AWS account much easier.

Once you have the software installed, create an empty directory and initialize a new Pulumi project using pulumi new python. This command will start an interactive dialogue.

When you are done, some files will have appeared:

  • Pulumi.yaml contains the basic metadata for your project, such as its name and runtime.
  • requirements.txt contains the dependencies Python needs to run the project. Make sure it contains

pulumi>=2.0.0,<3.0.0
pulumi-aws>=2.4.0

  • __main__.py, a Python file that contains your code. Note that in my code example, I split the basic setup and each tier into its own file to make it a little less crowded per file; see the sketch after this list.
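
To give you an idea of the structure, here is a minimal sketch of what the split __main__.py could look like – the module names are assumptions matching the file layout described in this article:

# __main__.py – importing each module is enough, since Pulumi
# registers the resources that are declared at module level.
import vpc_setup   # baseline: VPC, subnets, gateways, route tables
import tier1       # frontend: S3, CloudFront, Route53, ACM
import tier2       # backend: EC2 instance
import tier3       # data: RDS instance, DynamoDB endpoint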

I recommend also creating a Pulumi.<yourStackName>.yaml to set up your configuration values in code, like this:

config:
  AWS_example:pathToWebsiteContents: ./www
  AWS_example:targetDomain: <placeholder>
  aws:profile: <placeholder>
  aws:region: <placeholder>
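
From Python, these values can be read via Pulumi’s configuration API – a minimal sketch, assuming the key names from the file above:

import pulumi

config = pulumi.Config()             # project namespace (AWS_example)
target_domain = config.require("targetDomain")
content_path = config.require("pathToWebsiteContents")

aws_config = pulumi.Config("aws")    # provider namespace
region = aws_config.require("region")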

Once you are done with the Pulumi setup, create a Python virtual environment, run it, and install your dependencies:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install -r requirements.txt

What we build

Architectures nowadays are typically separated into tiers. Isolated tiers allow you to develop the respective parts of your system separately, which enables easier CI/CD and scaling of resources according to their load and demand. If you run multiple microservices, for example, you might want to scale up a single one without changing the number of replicas for your frontend hosting or database.

In this article, we take a look at a simple three-tier architecture that could be expanded to work with more complex systems.

The diagram above shows the AWS products we will be using for each tier. Before we provision servers, we need to set up general resources that are shared between tiers and allow the services to communicate with each other.

The baseline resources

A Virtual Private Cloud (VPC) is the most basic networking resource on AWS. It acts as a pool for your resources, with each resource in a VPC able to contact the others if you configure it that way. We create a single VPC and reference it later in our resources. We also set some networking details to make sure our routing doesn’t clash at a later point.

import pulumi_aws

# A single VPC shared by all three tiers.
shared_vpc = pulumi_aws.ec2.Vpc(
    resource_name='pulumi-aws-example',
    assign_generated_ipv6_cidr_block=True,
    cidr_block="10.0.0.0/16",
    enable_dns_hostnames=True,
    enable_dns_support=True)

Additionally, we set up subnets. They represent the finer networking separation within our VPC, each covering an Availability Zone. I set up a subnet for incoming connections from the internet (gateway), one for our application backend (application), and two for our database tier (database).

# Fetch the Availability Zones available in the configured region.
availableZones = pulumi_aws.get_availability_zones()

subnet_gateway = pulumi_aws.ec2.Subnet(
    resource_name='pulumi-aws-example_gateway',
    availability_zone=availableZones.names[0],
    cidr_block="10.0.10.0/24",
    vpc_id=shared_vpc.id)

To enable our backend tier to receive incoming connections and connect to outside resources, we add a NAT gateway, an Internet Gateway and an Elastic IP for said Internet Gateway:

internet_gateway = pulumi_aws.ec2.InternetGateway(
    resource_name='pulumi-aws-example',
    vpc_id=shared_vpc.id)

gateway_eip = pulumi_aws.ec2.Eip(
    resource_name='pulumi-aws-example',
    vpc=True)

nat_gateway = pulumi_aws.ec2.NatGateway(
    resource_name='pulumi-aws-example',
    allocation_id=gateway_eip.id,
    subnet_id=subnet_gateway.id)

Finally, we create route tables for each subnet and associate our gateways with them, like this:

routetable_application = pulumi_aws.ec2.RouteTable(
    resource_name='pulumi-aws-example_application',
    vpc_id=shared_vpc.id,
    routes=[
        {
            "cidrBlock": "0.0.0.0/0",
            # Routes through a NAT gateway are set via natGatewayId;
            # gatewayId is reserved for internet gateways.
            "natGatewayId": nat_gateway.id
        }])

routetableAssociation_application = pulumi_aws.ec2.RouteTableAssociation(
    resource_name='pulumi-aws-example_application',
    subnet_id=subnet_application.id,
    route_table_id=routetable_application.id)

For the full code, check out vpc_setup.py.

The frontend tier

For the frontend tier, we use a comprehensive example published by the Pulumi developers themselves. You can find the code in tier1.py. It uses the following services:

  • Amazon S3 stores the website’s contents. S3 buckets act as storage for static files.
  • Amazon CloudFront is the CDN serving the content. It makes your static website accessible from different corners of the world with high performance.
  • Amazon Route53 sets up the DNS for the website.
  • AWS Certificate Manager (ACM) secures the website via HTTPS.

I recommend checking out the documentation in the linked repository if you are curious what the example does exactly and how it fits together.

Note that I added code to create the Route53 hosted zone – the domain was expected as existing input in the original example, and I wanted to try setting up the domain completely from code.
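
For reference, creating a hosted zone takes only a few lines – a minimal sketch, with target_domain read from the stack configuration as shown earlier:

zone = pulumi_aws.route53.Zone(
    resource_name="pulumi-aws-example",
    name=target_domain)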

A basic static website is included in the example – once the deployment is finished, you can directly visit your defined domain URL to see it running!

The certificate validation is the most difficult part to get right – it can’t be completely automated, since certificate validation requires human input. If you run into problems deploying here, head over to the AWS console and check the status of your AWS Certificate Manager (ACM) certificate in the us-east-1 region – CloudFront can only use certificates from this region. If you already have a certificate and/or a Route53 hosted zone set up in your AWS account, it might be a good idea to substitute the code creating these resources with code that finds the existing resources on your account instead, using the ACM and Route53 packages. If you need more pointers on how to do this, my article series on EKS and Terraform used this approach.
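
If you go that route, the lookups could look roughly like this – a sketch assuming the certificate is already issued and the hosted zone already exists, with example.com as a placeholder domain:

# Look up an existing, issued ACM certificate for the domain. Remember
# that for CloudFront, the certificate must live in us-east-1.
existing_cert = pulumi_aws.acm.get_certificate(
    domain="example.com",
    statuses=["ISSUED"])

# Look up an existing Route53 hosted zone by its name.
existing_zone = pulumi_aws.route53.get_zone(name="example.com.")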

The backend tier

In tier 2, we deploy a simple EC2 instance that could host a service of your choice, for example using Docker. First, we get an AMI for a basic Ubuntu installation from AWS:

ami = pulumi_aws.get_ami(
    filters=[
        {
            "name": "name",
            "values": ["ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*"],
        },
        {
            "name": "virtualization-type",
            "values": ["hvm"],
        },
    ],
    most_recent=True,
    owners=["099720109477"])  # Canonical's AWS account ID

After that, we set up a security group for our EC2 instance to be deployed into. Note that the egress rule lets the instance reach the database tier and the internet through the NAT gateway, while the ingress rule accepts requests coming in through the internet gateway:

ec2SecurityGroup = pulumi_aws.ec2.SecurityGroup(
    resource_name="pulumi-aws-example_application",
    vpc_id=shared_vpc.id,
    egress=[{
        'from_port': 0,
        'to_port': 0,
        'protocol': '-1',             # all protocols, all outbound traffic
        'cidr_blocks': ['0.0.0.0/0']
    }],
    ingress=[{
        'cidr_blocks': ['0.0.0.0/0'],
        'from_port': 80,
        'to_port': 80,
        'protocol': 'tcp',
        'description': 'Allow internet access to instance'
    }])

After that, we set up the EC2 instance itself, connecting it to the relevant security group and subnet, using the AMI we fetched earlier:

ec2instance = pulumi_aws.ec2.Instance(
    resource_name="pulumi-aws-example",
    availability_zone=availableZones.names[0],
    # Inside a VPC, security groups are attached by ID via
    # vpc_security_group_ids (security_groups expects group names).
    vpc_security_group_ids=[ec2SecurityGroup.id],
    subnet_id=subnet_application.id,
    instance_type='t2.micro',
    ami=ami.id)

If you have code for a simple application that provides an API for a frontend to connect to (for example via REST) and connects to a database URL, you could host it on this instance to test it. For more information about connecting to your EC2 instance, see this documentation.
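
One small, optional addition: exporting the instance’s private IP as a stack output makes it easy to look up after deployment – a sketch based on the ec2instance resource from above:

import pulumi

# The exported value shows up in the stack outputs after pulumi up.
pulumi.export("backend_private_ip", ec2instance.private_ip)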

The data tier

Similarly to the second tier, our third tier mainly consists of a single instance running a simple database. The code in tier3.py creates a Security Group, a rule for that group, and a subnet group to fit the instance into our existing network infrastructure. Note that we set up a rule to allow the backend tier to connect to this tier; otherwise, it is isolated from external access.

databaseSecurityGroup = pulumi_aws.ec2.SecurityGroup(
    resource_name="pulumi-aws-example_database",
    vpc_id=vpc_setup.shared_vpc.id)

databaseSecurityGroupRule = pulumi_aws.ec2.SecurityGroupRule(
    resource_name="pulumi-aws-example",
    security_group_id=databaseSecurityGroup.id,
    # Reference the backend tier's security group by its ID.
    source_security_group_id=tier2.ec2SecurityGroup.id,
    protocol="tcp",
    from_port=5432,
    to_port=5432,
    type="ingress")

subnetGroup = pulumi_aws.rds.SubnetGroup(
    resource_name="pulumi-aws-example",
    subnet_ids=[vpc_setup.subnet_database.id, vpc_setup.subnet_database2.id]
)

Then we create our database instance. There are a lot of configuration options for databases regarding their size and fail-safe backups – if you are unfamiliar with the attributes and options, I recommend taking a look at the official documentation.

database = pulumi_aws.rds.Instance(
    resource_name="pulumi-aws-example",
    db_subnet_group_name=subnetGroup.name,  # pass the group's name, not the resource
    allocated_storage=20,
    port=5432,
    storage_type="gp2",
    engine="postgres",
    engine_version="10.6",
    instance_class="db.t2.micro",
    name="pulumiAwsExample",
    identifier="pulumi-aws-example",
    username="pulumiAwsExample",
    # For real deployments, set the password as a Pulumi secret
    # (pulumi config set --secret) instead of hardcoding it.
    password="example1Password",
    apply_immediately=False,
    final_snapshot_identifier="pulumi-aws-example",
    skip_final_snapshot=False,
    vpc_security_group_ids=[databaseSecurityGroup.id])

Another useful resource you can provision is a DynamoDB endpoint. DynamoDB is a document-based database system useful for managing large numbers of data pieces with varying formats (e.g. recipes). Providing an endpoint allows your resources to connect to DynamoDB without having to leave AWS networking.

dynamodbEndpoint = pulumi_aws.ec2.VpcEndpoint(
    resource_name="pulumi-aws-example_dynamodb",
    vpc_id=vpc_setup.shared_vpc.id,
    # Adjust the region segment to match your aws:region config value.
    service_name="com.amazonaws.eu-west-1.dynamodb",
    route_table_ids=[vpc_setup.routetable_application.id])


Usage & possible improvements

Once you have configured your project, you can deploy the infrastructure by running pulumi up. Pulumi will display a list of the changes needed to build the configured infrastructure and, once you confirm them, start deploying. Note that to fully deploy the certificate, you need to use the AWS CLI or the AWS console to make sure your ACM validation can finish.

When the infrastructure successfully deploys, you can put your own frontend into the resulting S3 bucket, run your own code on the backend EC2 instance, and set up the RDS database instance accordingly (I recommend doing this from your backend code; otherwise, you need to add another security group rule to allow your database to accept connections from outside the VPC). With that done, you should be able to implement a full example using all three tiers.
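
If you do want to open the database to a specific machine outside the VPC, such a rule could look roughly like this – a sketch with a placeholder IP address:

externalDbAccess = pulumi_aws.ec2.SecurityGroupRule(
    resource_name="pulumi-aws-example_external-db",
    security_group_id=databaseSecurityGroup.id,
    cidr_blocks=["203.0.113.17/32"],  # placeholder: your own IP address
    protocol="tcp",
    from_port=5432,
    to_port=5432,
    type="ingress")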

Done experimenting? Run pulumi destroy and Pulumi will remove all resources belonging to the stack.

Remember: Pulumi offers a centralized stack overview over at https://app.pulumi.com, where you can check out your deployment, its history, and other metadata.

With some upgrades, this infrastructure could be changed to support much more complex backend services. For example, if your project requires a multitude of different microservices, you could change the second tier to use an EKS-managed Kubernetes cluster instead of a simple EC2 instance, allowing you to scale the number of machines according to your needs. Take a look at my article series about setting up a similar infrastructure using Terraform – due to Pulumi’s direct ties to Terraform, it should be possible to translate the concept fairly easily.

I hope this article helped you understand how Python and Pulumi can work together to set up infrastructures of varying complexity – writing it certainly helped me!