Introduction
Infrastructure as Code (IaC) is, as the name implies, a way of managing your infrastructure as if it were code. This circular definition will become clearer with the example below.
Imagine you are trying to create a three-tiered web application on AWS as shown.
The presentation tier is responsible for presenting the user interface to the user. It includes the user interface components such as HTML, CSS, and JavaScript running on EC2 instances.
The logic tier is responsible for processing user requests and generating responses, communicating with the database tier to retrieve or store data. This tier is also deployed on EC2 instances.
The database tier is responsible for storing and managing the application's data and allows access to its data through the logic tier. The database runs on AWS RDS.
The instances in each tier sit in an Auto Scaling group with a load balancer in front of it (except for the database tier).
If you want to create this infrastructure through the AWS console, you have to manually click through various screens to spin it up. This is fine if it is a one-time activity. However, if you need to repeat this across different environments like development and test, or need to add additional infrastructure like caches, queues, firewall rules, IAM or SSL certificates, then it becomes increasingly complex to manage through the AWS console.
Managing complex infrastructure through the console also introduces the possibility of human error.
Infrastructure as code expresses your desired infrastructure in the language of code. This brings the benefits of code to managing your infrastructure, such as:
Version Control - allows you to store the history of your infrastructure and revert to a previous version if needed
Faster & safer deployments - you can recreate infrastructure in new environments quickly and with fewer errors, since every part of the infrastructure is clearly defined in the code
Documentation - your current infrastructure state is documented and kept up to date automatically whenever you make a change. This keeps your infrastructure documentation detailed and accurate, compared to having the infrastructure written up in a document or on a Confluence page that may not be updated whenever there is a change
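To make the "recreate infrastructure in new environments" benefit concrete, here is a minimal Python sketch that is not tied to any real IaC tool. The names TierSpec and three_tier_app, the instance types, and the sizing rule are assumptions made purely for illustration. The point is only that one definition can be reused for every environment.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TierSpec:
    """One EC2-backed tier of the application (hypothetical structure)."""
    name: str
    instance_type: str
    min_instances: int
    max_instances: int


def three_tier_app(environment: str) -> dict:
    """Describe the three-tier application for a single environment."""
    # Assumed sizing rule for illustration: production gets more capacity.
    max_instances = 6 if environment == "production" else 2
    tiers = [
        TierSpec("presentation", "t2.micro", 1, max_instances),
        TierSpec("logic", "t2.micro", 1, max_instances),
    ]
    return {
        "environment": environment,
        "tiers": [asdict(tier) for tier in tiers],
        "database": {"service": "RDS"},
    }


# The same definition yields a consistent description for every environment.
development = three_tier_app("development")
test = three_tier_app("test")
assert development["tiers"] == test["tiers"]
```

Because the definition is ordinary text, it can be committed to version control, reviewed, and diffed like any other code.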
How Infrastructure as Code Works - Explained with an Analogy
Infrastructure as code allows you to create a detailed blueprint of your infrastructure. This blueprint gives instructions to your cloud provider about the infrastructure you want created.
This is similar to how an architecture blueprint works. It outlines the layout, dimensions, materials, and various components of the structure. The blueprint serves as a reference for architects and engineers to understand the desired construction.
The blueprint leaves little room for error. It will be interpreted in the same way by any architect or engineer. If you wanted to build exact copies of a house, all you would need is its architecture blueprint.
Infrastructure as code, at a basic level, works in the same way as an architecture blueprint. It details the infrastructure you want to create, as code, in one of several possible languages (JSON, YAML, HCL, Python, Ruby, JavaScript, etc.), instructing the cloud provider to create your infrastructure exactly as specified.
Declarative & Imperative Infrastructure as Code Tools
There are many IaC options to choose from, and all the major cloud providers have their own dedicated tools:
AWS has CloudFormation
GCP has Deployment Manager
Azure has Azure Resource Manager
The limitation of these cloud provider specific tools is that they can only create infrastructure in their respective clouds, so CloudFormation only works in AWS and Deployment Manager only works in GCP. IaC for these tools is usually written in JSON or YAML format.
Terraform is open source and can be used to create infrastructure across all the major cloud providers. It uses HCL (HashiCorp Configuration Language).
Infrastructure as code can also be written using popular languages like Python and JavaScript.
These scripting/programming languages lie on a spectrum of declarative and imperative code as shown below.
The main difference between an imperative and a declarative language is that imperative languages explicitly define the control flow. Control flow is simply the order in which instructions are executed in a program. It determines the path the program takes and how it responds to different conditions or events.
In imperative languages, control flow is explicitly defined using control structures such as loops, conditionals, and function calls. Imperative languages give you more flexibility in configuring your infrastructure. This is not necessarily a positive, as more flexibility means more opportunity to introduce errors into your infrastructure.
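As a rough illustration of the imperative style, the Python sketch below spells out every step with a conditional and a loop. FakeCloud is a hypothetical in-memory stand-in, not a real cloud SDK, and the instance counts and types are assumptions.

```python
class FakeCloud:
    """In-memory stand-in for a cloud API (hypothetical, not a real SDK)."""

    def __init__(self) -> None:
        self.instances: list[dict] = []

    def create_instance(self, name: str, instance_type: str) -> None:
        self.instances.append({"name": name, "type": instance_type})


def provision_web_tier(cloud: FakeCloud, environment: str) -> None:
    """Imperative style: the code spells out each step and the order of steps."""
    # Conditional: the branch taken depends on the environment.
    if environment == "production":
        count, instance_type = 3, "t2.small"
    else:
        count, instance_type = 1, "t2.micro"

    # Loop: one explicit API call per instance.
    for index in range(count):
        cloud.create_instance(f"web-{environment}-{index}", instance_type)


cloud = FakeCloud()
provision_web_tier(cloud, "test")
print(cloud.instances)  # [{'name': 'web-test-0', 'type': 't2.micro'}]
```

Every branch and loop here is a place where a mistake can change what gets built, which is the trade-off that comes with the extra flexibility.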
A declarative language focuses on describing the desired result without giving specific instructions on how to achieve it.
An example JSON template is shown below, as used in AWS CloudFormation to create an EC2 instance. The resource sits under a logical name, here MyEC2Instance, in the template's Resources section.
{
  "Resources": {
    "MyEC2Instance": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "ImageId": "ami-0123456789",
        "InstanceType": "t2.micro",
        "KeyName": "my-key-pair",
        "SecurityGroupIds": ["sg-0123456789"],
        "SubnetId": "subnet-0123456789",
        "Tags": [
          {
            "Key": "Name",
            "Value": "MyEC2Instance"
          }
        ]
      }
    }
  }
}
A declarative format like JSON abstracts away the underlying complexity of how the EC2 instance will be created. All it cares about is the end state.
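One practical consequence of describing only the end state is that repeating the same declaration should not change anything. The Python sketch below contrasts this with an imperative step. Both functions are hypothetical illustrations and say nothing about how CloudFormation is implemented internally.

```python
def add_instance(instances: list[str]) -> None:
    """Imperative step: performs the action every time it is called."""
    instances.append(f"web-{len(instances)}")


def ensure_instance_count(instances: list[str], desired_count: int) -> None:
    """Declarative goal: adjust the list until it matches the desired count."""
    while len(instances) < desired_count:
        instances.append(f"web-{len(instances)}")
    while len(instances) > desired_count:
        instances.pop()


imperative: list[str] = []
add_instance(imperative)
add_instance(imperative)  # running the step again adds a second instance
print(imperative)  # ['web-0', 'web-1']

declarative: list[str] = []
ensure_instance_count(declarative, desired_count=1)
ensure_instance_count(declarative, desired_count=1)  # no change the second time
print(declarative)  # ['web-0']
```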
Terraform HCL is closer to the declarative end of the spectrum. Terraform allows you to describe the desired infrastructure's final state without specifying the exact steps to get there. Internally, Terraform manages the execution order and resource dependencies, and handles the infrastructure changes based on the desired configuration.
However, Terraform does support some imperative-style features like variables and expressions, allowing dynamic behaviour based on inputs. So, unlike JSON, it is not purely declarative.
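The idea of a tool working out the execution order from resource dependencies can be sketched with a topological sort. This Python example uses the standard library's graphlib module (Python 3.9+). It illustrates the general technique rather than Terraform's actual implementation, and the resource names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on.
# The resource names are hypothetical examples.
dependencies = {
    "vpc": set(),
    "subnet": {"vpc"},
    "security_group": {"vpc"},
    "instance": {"subnet", "security_group"},
}


def creation_order(deps: dict[str, set[str]]) -> list[str]:
    """Return an order in which every resource comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = creation_order(dependencies)
print(order)
```

In this graph "vpc" always comes first and "instance" always comes last. The relative order of "subnet" and "security_group" is not fixed, because neither depends on the other.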
How Terraform Works
There are two fundamental concepts that serve as a foundation for understanding Terraform:
The configuration file - this describes the desired infrastructure
The state file - this describes the current infrastructure as it exists in the real world
Terraform’s job is to create, modify or delete infrastructure as needed so that the desired infrastructure configuration is met. It does this by executing the necessary API calls to your cloud provider(s) to create, modify, or destroy the resources as specified. Once the infrastructure has been created/modified/destroyed to match the configuration file, the state file is updated to reflect the current infrastructure.
The terraform plan command creates an execution plan, which lets you preview the changes that Terraform plans to make to your infrastructure. By default, when Terraform creates a plan, it compares the desired configuration, as described in the configuration file, with the current configuration, as described in the state file. Terraform then proposes the list of changes needed to make the current configuration match the desired configuration.
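The comparison between desired and current configuration can be pictured as a simple diff. The following Python sketch is a simplified model under that assumption, not Terraform's real planning logic. The plan function and the resource names are hypothetical.

```python
def plan(desired: dict, state: dict) -> list[tuple[str, str]]:
    """Compare desired configuration with recorded state and list the changes."""
    changes: list[tuple[str, str]] = []
    for name, config in desired.items():
        if name not in state:
            changes.append(("create", name))  # wanted, but does not exist yet
        elif state[name] != config:
            changes.append(("update", name))  # exists, but with different settings
    for name in state:
        if name not in desired:
            changes.append(("delete", name))  # exists, but is no longer wanted
    return changes


desired = {
    "web": {"instance_type": "t2.micro"},
    "db": {"engine": "mysql"},
}
state = {
    "web": {"instance_type": "t2.nano"},
    "cache": {"nodes": 1},
}

print(plan(desired, state))
# [('update', 'web'), ('create', 'db'), ('delete', 'cache')]
```

Note that computing the plan changes nothing. It only reports what would have to happen.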
If you then run the terraform apply command, Terraform modifies the real-world infrastructure so that it matches the desired configuration, and updates the state file to show the new infrastructure configuration. At a high level, this is what Terraform does.
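The apply step can be modelled in the same spirit: make the calls needed to reach the desired configuration, then record the result as the new state. This Python sketch is an illustration only. FakeProvider and apply are hypothetical names, and no real cloud API is called.

```python
import copy


class FakeProvider:
    """Records API calls instead of contacting a real cloud (hypothetical)."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def create(self, name: str, config: dict) -> None:
        self.calls.append(("create", name))

    def update(self, name: str, config: dict) -> None:
        self.calls.append(("update", name))

    def delete(self, name: str) -> None:
        self.calls.append(("delete", name))


def apply(desired: dict, state: dict, provider: FakeProvider) -> dict:
    """Make the provider match `desired` and return the new state."""
    for name, config in desired.items():
        if name not in state:
            provider.create(name, config)
        elif state[name] != config:
            provider.update(name, config)
    for name in state:
        if name not in desired:
            provider.delete(name)
    # After a successful apply, the state records what now exists.
    return copy.deepcopy(desired)


provider = FakeProvider()
desired = {"web": {"instance_type": "t2.micro"}}
state = apply(desired, {}, provider)
print(provider.calls)  # [('create', 'web')]

# A second apply with an unchanged configuration has nothing left to do.
apply(desired, state, provider)
print(provider.calls)  # [('create', 'web')]
```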
Let’s bring back the architectural blueprint analogy. The configuration file is like the architectural blueprint: it details the infrastructure that needs to be built, i.e. the desired construction. The real-world infrastructure is the existing construction in the physical world, and the state file is a record of what currently exists, i.e. the current blueprint.
In this analogy, the engineers do the work of Terraform: they ensure that the existing construction matches the architecture blueprint. You don’t need to specify the details of how to build the house. You just specify what you want built, and the engineers handle the rest.
Bringing it Together
Infrastructure as code (IaC) is a great way of managing complex infrastructure configuration in the form of code. This naturally brings the advantages of code to your infrastructure, such as version control, faster and safer infrastructure deployments across different environments, and up-to-date documentation of your infrastructure.
Terraform is an open source IaC tool that allows you to work with multiple cloud providers to spin up infrastructure as defined in your configuration files. Terraform HCL is a largely declarative language that allows you to describe your desired infrastructure configuration. All you have to do is specify what you want created, and Terraform handles the creation on your behalf by making API calls to your chosen cloud provider(s).