Docker is both the name of a company (Docker, Inc.) and of the software it created, which packages software into containers. But what exactly is a container, and why is it useful? A container is a lightweight, standalone, and executable software package that includes everything needed to run a piece of software.
To understand how containers work and why they are incredibly useful for software development, you need to understand two seemingly unrelated topics - shipping containers and virtual machines.
A Brief History of Shipping Containers
"The Box: How the Shipping Container Made the World Smaller and the World Economy Bigger" is a book by Marc Levinson that explores the profound impact of the shipping container on global trade and the world economy. While the history of the shipping container may seem irrelevant in a discussion about Docker containers, the two have more in common than you would expect.
Before containers, cargo handling was labor-intensive and time-consuming, leading to inefficiencies and delays in global trade. Cargo arrived in various shapes and sizes, and the lack of standardised packaging made it challenging to stack and secure items efficiently. Without standardised containers, cargo was often stored haphazardly in the holds of ships or on dockyards. This inefficient use of space meant that ships were not carrying as much cargo as they could potentially hold, leading to higher transportation costs.
The adoption of uniform container dimensions and handling procedures allowed for seamless transfer of cargo between different modes of transportation - ships, trucks, trains and the cranes used to move the containers around.
This standardisation was the key to the success of shipping containers. After all, if one company’s containers would not fit on another’s ship, truck or freight train, every company would need its own fleet of containers to be able to send things to every customer, which would be an operational nightmare. Standardisation makes shipping containers portable, i.e. easy to move from one place to another. This portability is a key feature of Docker containers as well, which we shall discuss shortly.
Virtual Machines
Virtual machines (VMs) are created through a process called virtualisation.
Virtualisation is a technology that allows you to create multiple simulated environments or virtual versions of something, such as an operating system, a server, storage, or a network, on a single physical machine. These virtual environments behave as if they are independent, separate entities, even though they share the resources of the underlying physical system.
Virtualisation is like having a magician's hat that can conjure multiple hats within it. Just as the magician's hat creates the illusion of many hats appearing despite having just one physical hat, virtualisation allows a single physical computer or server to appear as multiple virtual machines (VMs), each with its own operating system and resources.
VMs virtualise the hardware. This simply means that a VM setup takes a single piece of hardware, such as a server, and creates virtual versions of other servers, each running its own operating system. Physically, it is just a single piece of hardware. Logically, multiple virtual machines can run on top of it; essentially, one or more computers running within a computer, as shown below.
How exactly does virtualisation work? This is illustrated below.
At the base, you have the host hardware and its operating system (OS). This is the physical machine that is used to create the virtual machines. On top of this, you have the hypervisor, which allows multiple virtual machines, each with its own OS, to run on a single physical server.
VMs have a few downsides, however, which containers address. Two downsides particularly stand out:
VMs consume more resources: VMs have a higher resource overhead due to the need to run a full OS instance for each VM. This can lead to larger memory and storage consumption. This in turn can have a negative effect on performance and startup times.
Portability: VMs are typically less portable due to differences in underlying OS environments. Moving VMs between different hypervisors or cloud providers can be more complex.
The major cloud providers all offer VMs: AWS has EC2, GCP has Compute Engine, and Azure has Azure Virtual Machines.
Containers
A container is a lightweight, standalone, and executable software package that includes everything needed to run a piece of software, including the code, runtime, system tools, and libraries. Containers are designed to isolate applications and their dependencies, ensuring that they can run consistently across different environments. Whether the application is running from your computer or in the cloud, the application behaviour remains the same.
Unlike VMs, containers virtualise the operating system. This simply means that containers do not each bring a full OS of their own: they run on top of a single OS shared from the host system, and each container packages only the application and its libraries.
This is illustrated below.
The container engine allows you to spin up containers. It provides the tools and services necessary for building, running, and deploying containerised applications.
Containers have several benefits:
Portability: Containers are designed to be platform-independent. They can run on any system that supports the container runtime, such as Docker, regardless of the underlying operating system. This makes it easier to move applications between different environments, including local development machines, testing servers, and cloud platforms.
Efficiency: Containers share the host system's operating system, which avoids the overhead of running a full operating system inside every virtual machine. This leads to more efficient resource utilisation and allows a higher density of applications to run on a single host.
Consistency: Containers package all the necessary components, including the application code, runtime, libraries, and dependencies, into a single unit. This eliminates the "it works on my machine" problem and ensures that the application runs consistently across different environments, from development to production.
Isolation: Containers provide a lightweight and isolated environment for running applications. Each container encapsulates the application and its dependencies, ensuring that they do not interfere with each other. This isolation helps prevent conflicts and ensures consistent behaviour across different environments.
Fast Deployment: Containers can be created and started quickly, often in a matter of seconds. This rapid deployment speed is particularly beneficial for applications that need to scale up or down based on demand.
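To make the efficiency point concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is a hypothetical assumption chosen only for illustration, and it ignores the overhead of the hypervisor and the container engine themselves; real figures vary widely from one workload to another.

```python
# All figures below are assumptions for illustration, not measurements.
HOST_MEMORY_MB = 16_384      # assumed: a host with 16 GiB of RAM
HOST_OS_MEMORY_MB = 1_024    # assumed: memory reserved for the host OS
GUEST_OS_MEMORY_MB = 1_024   # assumed: memory a full guest OS needs inside each VM
APP_MEMORY_MB = 256          # assumed: memory the application itself needs


def max_vms(host_mb: int, host_os_mb: int, guest_os_mb: int, app_mb: int) -> int:
    """Each VM pays for its own guest OS plus the application."""
    return (host_mb - host_os_mb) // (guest_os_mb + app_mb)


def max_containers(host_mb: int, host_os_mb: int, app_mb: int) -> int:
    """Containers share the host OS, so each one pays only for the application."""
    return (host_mb - host_os_mb) // app_mb


vm_count = max_vms(HOST_MEMORY_MB, HOST_OS_MEMORY_MB, GUEST_OS_MEMORY_MB, APP_MEMORY_MB)
container_count = max_containers(HOST_MEMORY_MB, HOST_OS_MEMORY_MB, APP_MEMORY_MB)

print(f"VMs that fit on the host: {vm_count}")
print(f"Containers that fit on the host: {container_count}")
```

With these assumed figures, the same host fits 12 VMs but 60 containers. The exact ratio depends entirely on the numbers you plug in; the point is only that removing the per-instance operating system leaves more room for applications.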
Docker, Dockerfile and Docker Images
Now that we have covered VMs and containers, what exactly is Docker? Docker is simply a tool for creating and managing containers.
At its core, Docker has two concepts that are useful to understand - Dockerfile and Docker Images.
A Dockerfile contains the set of instructions for building a Docker Image.
A Docker Image serves as a template for creating Docker containers. It contains all the necessary code, runtime, system tools, libraries, and settings required to run a software application.
So, a Dockerfile is used to build a Docker Image which is then used as the template for creating one or more Docker containers. This is illustrated below.
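The relationship between the three can also be sketched as a toy model in Python. This is not how Docker is implemented; it only mirrors the flow described above. The Dockerfile text uses real instructions (FROM, WORKDIR, COPY, CMD), but the base image tag, the app.py file, and the Image, Container, build_image and create_containers names are all assumptions made up for this illustration.

```python
import hashlib
from dataclasses import dataclass

# A minimal Dockerfile held as a string. The base image tag and app.py
# are assumptions for illustration.
DOCKERFILE = """\
FROM python:3.12-slim
WORKDIR /app
COPY app.py .
CMD ["python", "app.py"]
"""


@dataclass(frozen=True)
class Image:
    """Toy stand-in for a Docker Image: a fixed template built from a Dockerfile."""
    image_id: str
    instructions: tuple[str, ...]


@dataclass
class Container:
    """Toy stand-in for a Docker container: a running instance of an image."""
    name: str
    image: Image
    running: bool = False

    def start(self) -> None:
        self.running = True


def build_image(dockerfile: str) -> Image:
    """Turn the Dockerfile instructions into an image (hypothetical helper)."""
    instructions = tuple(
        line.strip() for line in dockerfile.splitlines() if line.strip()
    )
    # Derive the ID from the instructions so identical input gives an identical ID.
    digest = hashlib.sha256("\n".join(instructions).encode()).hexdigest()
    return Image(image_id=digest[:12], instructions=instructions)


def create_containers(image: Image, how_many: int) -> list[Container]:
    """Create several containers from the same image (hypothetical helper)."""
    return [Container(name=f"app-{i}", image=image) for i in range(1, how_many + 1)]


image = build_image(DOCKERFILE)
containers = create_containers(image, 3)
for container in containers:
    container.start()
    print(f"{container.name} is running from image {container.image.image_id}")
```

One Dockerfile produces one image, and that single image backs all three containers, which is why they behave identically.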
If this explanation still causes you to scratch your head, consider the following analogy using shipping containers.
Imagine you need to build multiple shipping containers to transport items all over the world. You start with a document listing the requirements for your shipping container. This will contain information such as the container dimensions, type of seals, door locking mechanisms, and ventilation and refrigeration requirements (for example, if you are shipping food that needs a temperature-controlled environment).
This requirement document will then be used to create a detailed template for the container which will include engineering drawings showing the dimensions and other specifications. From this template, the physical containers will then be built. This single template can be used to build one or many physical containers which will all be identical and match the specifications in the container template.
This is illustrated below.
The Dockerfile is analogous to the requirements document, which simply has a set of instructions for building the container template.
The Docker Image is analogous to the container template, which details all the instructions needed for building the physical container. Once created, Docker images are immutable, meaning they cannot be changed. If you need to make changes to an application, you modify the application code or the Dockerfile and build a new image. This immutability ensures consistency and reproducibility in application deployment.
And finally, the Docker container is analogous to the physical shipping container.
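The immutability of images can be illustrated with a small standalone Python sketch. It is only a toy model under stated assumptions: the Image class, the build_image helper and the two Dockerfile strings are hypothetical, and the hash-based ID merely imitates the idea of an identifier derived from an image's content.

```python
import hashlib
from dataclasses import FrozenInstanceError, dataclass

# Two hypothetical Dockerfiles that differ only in the base image tag.
ORIGINAL_DOCKERFILE = 'FROM python:3.12-slim\nCMD ["python", "app.py"]\n'
UPDATED_DOCKERFILE = 'FROM python:3.13-slim\nCMD ["python", "app.py"]\n'


@dataclass(frozen=True)
class Image:
    """Toy image: frozen, so its fields cannot be reassigned after creation."""
    image_id: str
    dockerfile: str


def build_image(dockerfile: str) -> Image:
    """Build a toy image whose ID is derived from the Dockerfile text."""
    image_id = hashlib.sha256(dockerfile.encode()).hexdigest()[:12]
    return Image(image_id=image_id, dockerfile=dockerfile)


original = build_image(ORIGINAL_DOCKERFILE)

# Trying to change an existing image in place fails.
try:
    original.dockerfile = UPDATED_DOCKERFILE
    modified_in_place = True
except FrozenInstanceError:
    modified_in_place = False

# The supported route is to change the Dockerfile and build a new image.
updated = build_image(UPDATED_DOCKERFILE)

print(f"Modified in place: {modified_in_place}")
print(f"Original image ID: {original.image_id}")
print(f"Updated image ID:  {updated.image_id}")
```

The original image is left untouched and the change shows up as a separate image with a different ID, so anything already built from the original keeps behaving exactly as before.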
Bringing it Together
In summary, containers provide a portable and efficient way to package applications and their dependencies, ensuring consistency across various environments. The benefits they bring to software development are similar to the benefits brought to the global economy by the humble shipping container. Shipping containers, through standardisation, ensure that any container, anywhere in the world, can be seamlessly moved across various modes of transportation: ships, trucks, and trains.
Portability ensures applications can run consistently across different environments, from development laptops to production servers, on-premises, or in the cloud.
Improved efficiency comes from the fact that containers share the host operating system, making them lightweight and efficient. This leads to rapid container startup times and efficient resource utilisation.