Stateful vs Stateless Architectures Explained

Benefits and tradeoffs

Aug 18, 2023

Introduction

"State" refers to the condition of a system, component, or application at a particular point in time. As a simple example, if you are shopping on amazon.com, whether you are currently logged in and whether you have anything in your cart are both examples of state.

State represents the data that is stored and used to keep track of the current status of the application. Understanding and managing the state is crucial for building interactive and dynamic web applications.

The concept of “state” crosses many boundaries in architecture. Design patterns (like REST and GraphQL), protocols (like HTTP and TCP), firewalls and functions can be stateful or stateless. However, the underlying principle of “state” remains the same across all of these domains.

This article will explain what state means. It will also explain stateful and stateless architectures with some analogies and the benefits and tradeoffs of both.

Stateful Architecture

Imagine you go to a pizza restaurant to eat some food. In this restaurant, there is only a single waiter, and the waiter takes detailed notes on your table number, what you ordered, your preferences based on past orders, like what type of pizza crust you like or toppings you are allergic to, etc.

All of these pieces of information that the waiter writes down in their notepad are the customer state. Only the waiter serving you has access to this information. If you want to make a change to your order or check how it's coming along, you need to speak to the same waiter that took your order. But since there is only one waiter, that is not a problem.

Now, suppose the restaurant starts to get busier. Your waiter has to respond to other guests so more waiters are called to work. You now want to check the status of your order and make a small change to it - a plain crust instead of a cheesy crust. The only available waiter is different to the one who initially took your order.

This new waiter does not have the details of your order, i.e. your state. Naturally, he will not be able to check the status of your order or make changes to it. A restaurant that operates like this, where only the waiter that initially took your order can give you updates about it or make changes to it, follows a stateful design.

Similarly, a stateful application will have a server that remembers the client's data (i.e. state). All future requests from that client will be routed to the same server using a load balancer with sticky sessions enabled. In this way, the server is always aware of the client.

Sticky sessions is a configuration that allows the load balancer to direct a user's requests consistently to the same backend server for the duration of their session. This is in contrast to traditional load balancing, where requests from a user can be directed to any available backend server in a round-robin or other load distribution pattern.
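To make this concrete, here is a minimal sketch of a stateful setup in Python. Everything in it is hypothetical: the StatefulServer and StickyLoadBalancer classes are made up for illustration, and hashing the client ID is just one simple way to get stickiness (real load balancers often use a cookie instead).

```python
import hashlib


class StatefulServer:
    """A server that keeps each client's cart in its own memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.carts: dict[str, list[str]] = {}  # state lives on this server only

    def add_to_cart(self, client_id: str, item: str) -> list[str]:
        self.carts.setdefault(client_id, []).append(item)
        return self.carts[client_id]

    def view_cart(self, client_id: str) -> list[str]:
        return self.carts.get(client_id, [])


class StickyLoadBalancer:
    """Pins each client to one server by hashing the client ID (an assumption)."""

    def __init__(self, servers: list[StatefulServer]) -> None:
        self.servers = servers

    def route(self, client_id: str) -> StatefulServer:
        digest = hashlib.sha256(client_id.encode()).hexdigest()
        return self.servers[int(digest, 16) % len(self.servers)]


servers = [StatefulServer("a"), StatefulServer("b")]
balancer = StickyLoadBalancer(servers)

pinned = balancer.route("alice")
pinned.add_to_cart("alice", "pizza")

# The sticky route always lands on the server that holds alice's cart.
same_server = balancer.route("alice")
# Any other server has never heard of alice.
other_server = next(s for s in servers if s is not pinned)
```

Here same_server.view_cart("alice") returns the cart, while other_server.view_cart("alice") returns an empty list. That is the second waiter with no notes about your order.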

What is the problem with a stateful architecture? Imagine a restaurant run in this manner. While it may be ideal and easy to implement for a small, family-run restaurant with only a few customers, such a design is not fault tolerant and not scalable.

What happens if the waiter who took a customer’s order has an emergency and needs to leave? All the information regarding that order leaves with him as well, disrupting the customer’s experience, since any new waiter brought in to replace the old one has no knowledge of previous orders - a design that is not fault tolerant. Also, having to distribute requests so that each customer can only speak to the same waiter means that the load is not evenly distributed across waiters. Some waiters will be overwhelmed with requests if they have a very demanding customer who always modifies or adds things to his order. Other waiters will have nothing to do, and can’t step in to help - a non-scalable design.

Similarly, storing state data for different customers on different servers is not fault tolerant and not scalable. A server failure will lead to loss of state data. So, if a user is logged in and about to check out a large order on Amazon.com, for example, the user's session and basket will be lost. They would have to log in again and fill up their basket from scratch - a poor user experience.

Scalability will also be difficult to achieve during peak times like Black Friday with a stateful design. New servers will be added to the auto scaling group, but since sticky sessions are enabled, existing clients will keep being routed to their original servers. Those servers stay overwhelmed while the new ones sit mostly idle, which can increase response times - a poor user experience.
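The scaling problem can be sketched in a few lines of Python. The SessionTableBalancer class below is hypothetical and deliberately simple: it remembers which server each client was first sent to, and sends new clients to the server with the fewest pinned clients. Real load balancers differ in the details (sessions eventually expire, for instance), so treat this as an illustration of the idea only.

```python
from collections import Counter


class SessionTableBalancer:
    """Sticky balancer: remembers which server each client was first sent to."""

    def __init__(self, servers: list[str]) -> None:
        self.servers = list(servers)
        self.assignments: dict[str, str] = {}  # client ID -> pinned server

    def add_server(self, name: str) -> None:
        self.servers.append(name)

    def route(self, client_id: str) -> str:
        if client_id not in self.assignments:
            # New clients go to the server with the fewest pinned clients.
            pinned = Counter(self.assignments.values())
            self.assignments[client_id] = min(self.servers, key=lambda s: pinned[s])
        return self.assignments[client_id]


balancer = SessionTableBalancer(["server-1", "server-2"])
clients = [f"client-{i}" for i in range(6)]
for client in clients:
    balancer.route(client)  # each client gets pinned on its first request

# Traffic spikes, so a third server is added.
balancer.add_server("server-3")

# The existing clients keep hitting the servers they were pinned to.
requests_per_server = Counter(balancer.route(client) for client in clients)
```

After scaling out, requests_per_server still only contains server-1 and server-2. The new server only picks up clients that arrive after it was added.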

Stateless architectures solve a lot of these problems.

Stateless Architecture

“Stateless” architecture is a confusing term, as it implies the system is without state. A stateless architecture does not, however, mean that state information is not stored. It simply means that state information is stored outside of the server. Therefore, the state of being stateless only applies to the server.

Bringing back the restaurant analogy, waiters in a stateless restaurant can be thought of as having perfectly forgetful memories. They do not recognise old customers, and can’t recall what you ordered or how you like your pizza. They will simply take note of customers’ orders on a separate system, say a computer, that is accessible by all the waiters. They can then refer back to the computer to get details of an order and make changes to it as required.

By storing the ‘state’ of a customer’s order on a central system accessible by other waiters, any waiter can serve any customer.

In a stateless architecture, HTTP requests from a client can be sent to any of the servers. State is typically stored in a separate database, accessible by all the servers. This creates a fault tolerant and scalable architecture, since web servers can be added or removed as needed without impacting state data. The load will also be more evenly distributed across all servers, since the load balancer will not need a sticky session configuration to route the same clients to the same servers.

Typically, state data is stored in a cache like Redis, an in-memory data store. Storing state data in memory improves read and write times compared to storing it on disk.
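As a rough sketch of the stateless version, the Python below moves the cart out of the servers and into a shared store. The SharedStore class is a hypothetical stand-in backed by a plain dict so the example runs on its own. In practice this would be an external store such as Redis, and the StatelessServer class is likewise made up for illustration.

```python
import random


class SharedStore:
    """Stand-in for an external store such as Redis (a plain dict here)."""

    def __init__(self) -> None:
        self._data: dict[str, list[str]] = {}

    def get_cart(self, client_id: str) -> list[str]:
        return list(self._data.get(client_id, []))

    def save_cart(self, client_id: str, cart: list[str]) -> None:
        self._data[client_id] = list(cart)


class StatelessServer:
    """Holds no client data between requests; everything goes through the store."""

    def __init__(self, name: str, store: SharedStore) -> None:
        self.name = name
        self.store = store

    def add_to_cart(self, client_id: str, item: str) -> list[str]:
        cart = self.store.get_cart(client_id)
        cart.append(item)
        self.store.save_cart(client_id, cart)
        return cart

    def view_cart(self, client_id: str) -> list[str]:
        return self.store.get_cart(client_id)


store = SharedStore()
servers = [StatelessServer(f"server-{i}", store) for i in range(3)]

# Each request may land on a different server.
random.choice(servers).add_to_cart("alice", "pizza")
random.choice(servers).add_to_cart("alice", "garlic bread")

# Losing a server does not lose the cart, because the cart was never on it.
servers.pop(0)
cart_after_failure = servers[0].view_cart("alice")
```

Whichever servers handled the two requests, every remaining server sees the same cart. The trade-off is that the shared store itself now has to be fast and highly available, which is why an in-memory store is a common choice.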

Bringing it Together

This article has described how stateful and stateless web applications work and the trade-offs of both. However, the principle of statefulness and statelessness applies beyond web applications.

If we look at network protocols as an example, HTTP is a stateless protocol. This means that each HTTP request from a client to a server is independent and carries no knowledge of previous requests or their context. The server treats each request as a separate and isolated transaction, and it doesn't inherently maintain information about the state of the client between requests. State is either maintained on the servers (stateful architecture) or in a separate database outside the servers (stateless architecture). The HTTP protocol itself does not maintain state.
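One way to picture this: because HTTP itself remembers nothing, the client has to send something with every request (commonly a session ID in a cookie) so the application can look up the state. The sketch below uses Python's standard http.cookies module to parse the Cookie header. The SESSIONS dict, the session_id cookie name and the handle_request function are all hypothetical.

```python
from http.cookies import SimpleCookie

# Hypothetical session data, kept by the application rather than by HTTP.
SESSIONS = {"abc123": {"user": "alice", "cart": ["pizza"]}}


def handle_request(headers: dict[str, str]) -> str:
    """Handle one request using only what the request itself carries."""
    cookie = SimpleCookie()
    cookie.load(headers.get("Cookie", ""))
    morsel = cookie.get("session_id")
    session = SESSIONS.get(morsel.value) if morsel else None
    if session is None:
        return "Hello, guest"
    return f"Hello, {session['user']}; cart: {', '.join(session['cart'])}"


# The first request carries the session ID, the second one does not.
first = handle_request({"Cookie": "session_id=abc123"})
second = handle_request({})
```

Even though the second call comes straight after the first, it is treated as a stranger, because nothing about the first request carries over unless the client sends the session ID again.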

Unlike HTTP, which is stateless, TCP is connection-oriented and stateful. It establishes a connection between two devices (usually a client and a server) and maintains a continuous communication channel until the connection is terminated.

The same logic applies to firewalls as well, which can be stateful or stateless.

In AWS, a security group is a virtual firewall that controls inbound and outbound traffic for virtual machines or instances within a cloud environment. Security groups are stateful. When you allow a specific incoming traffic flow, the corresponding outgoing traffic flow is automatically allowed as well. In other words, the state of the connection is tracked.

Network Access Control Lists (NACLs) are used to control inbound and outbound traffic at the subnet level in AWS. NACLs are stateless. Being stateless means that response traffic is not automatically allowed: you must explicitly define separate rules for inbound and outbound traffic. With a security group, by contrast, the response is automatically allowed once the incoming traffic is allowed.
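The difference between the two can be sketched with a toy model in Python. This is not the AWS API: the StatelessFirewall and StatefulFirewall classes are hypothetical, rules are reduced to sets of port numbers, and a connection is just a (client, client port, server port) tuple.

```python
class StatelessFirewall:
    """Checks every packet against explicit rules and remembers nothing (NACL-like)."""

    def __init__(self, inbound_ports: set[int], outbound_ports: set[int]) -> None:
        self.inbound_ports = inbound_ports
        self.outbound_ports = outbound_ports

    def allow_inbound(self, client: str, client_port: int, server_port: int) -> bool:
        return server_port in self.inbound_ports

    def allow_outbound(self, client: str, client_port: int, server_port: int) -> bool:
        # The reply goes to the client's port, so it needs its own outbound rule.
        return client_port in self.outbound_ports


class StatefulFirewall:
    """Tracks accepted connections and lets their replies out (security-group-like)."""

    def __init__(self, inbound_ports: set[int]) -> None:
        self.inbound_ports = inbound_ports
        self.tracked: set[tuple[str, int, int]] = set()

    def allow_inbound(self, client: str, client_port: int, server_port: int) -> bool:
        if server_port in self.inbound_ports:
            self.tracked.add((client, client_port, server_port))
            return True
        return False

    def allow_outbound(self, client: str, client_port: int, server_port: int) -> bool:
        # A reply is allowed only if it belongs to a connection seen earlier.
        return (client, client_port, server_port) in self.tracked


connection = ("203.0.113.7", 50000, 443)  # example client, client port, server port

stateful = StatefulFirewall(inbound_ports={443})
stateful_in = stateful.allow_inbound(*connection)
stateful_out = stateful.allow_outbound(*connection)

inbound_only = StatelessFirewall(inbound_ports={443}, outbound_ports=set())
stateless_in = inbound_only.allow_inbound(*connection)
stateless_out = inbound_only.allow_outbound(*connection)

# Adding an explicit outbound rule for high-numbered client ports fixes the reply.
both_rules = StatelessFirewall(inbound_ports={443}, outbound_ports=set(range(1024, 65536)))
reply_with_rule = both_rules.allow_outbound(*connection)
```

With only an inbound rule, the stateful firewall lets the reply out because it remembers the connection, while the stateless one blocks it until a matching outbound rule is added.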

Functions and design patterns can also be stateful or stateless. The key principle behind something that is stateful is that it has perfect memory or knowledge of previous calls or requests while something that is stateless has no memory or knowledge of previous calls or requests.
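For functions, the distinction can be shown in a few lines of Python. Both total_price and RunningTotal are hypothetical names chosen for this sketch.

```python
def total_price(unit_price: int, quantity: int) -> int:
    """Stateless: the result depends only on the arguments passed in."""
    return unit_price * quantity


class RunningTotal:
    """Stateful: each call depends on the calls that came before it."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, amount: int) -> int:
        self.total += amount
        return self.total


# Same arguments, same answer, no matter how often it is called.
first_price = total_price(5, 3)
second_price = total_price(5, 3)

# Same argument, different answers, because the object remembers.
basket = RunningTotal()
first_total = basket.add(5)
second_total = basket.add(5)
```

Calling total_price twice with the same arguments gives the same result both times, whereas the second basket.add(5) returns a different value from the first because the object carries memory of the earlier call.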

Lightcloud
