The Idempotence Principle in Software Architecture

The power of mental models for software engineers

A carefully curated toolbox of mental models is incredibly helpful for a software engineer. At the lower level, you write code: you implement algorithms and logic. At the upper level, you solve problems and make decisions, and mistakes at this level can cost you a lot.

Mental models are essential tools for software engineers, as they can help them quickly understand and solve problems in different areas. Given the vast amount of knowledge that a software engineer must master, simplifying and organizing complex issues is critical to success in the field.

Additionally, having a shared set of mental models can help to improve communication and collaboration among team members.

This article explains the idempotence principle in software engineering and provides some practical implementations for developers. By understanding and applying idempotence at each layer of the software development process, you can reduce errors, increase reliability, future-proof your systems, reduce technical debt, and improve productivity.

Meet the idempotence principle - your universal hammer

The concept is simple: performing an operation multiple times should have the same effect as performing it once. “Same effect” means that repeated calls return the same result and leave the system in the same state as the first call did. The first call may change the state, but the repeats must not change it any further.

This principle has many practical applications in software design and architecture that can improve software systems' robustness, reliability, and scalability. This article will explore examples of how the idempotence principle can be applied at different software design and architecture layers, from source code design to application and infrastructure architecture.

A few math applications

Here are just a few examples of this principle in math:

Multiplying by 0 - any number times 0 equals 0, so multiplying by 0 once has the same effect as multiplying by 0 multiple times. More generally, an element x is called idempotent if x · x = x; in a Boolean ring, every element is idempotent under multiplication
Applying the ceiling or floor function to a real number - once the value has been rounded to an integer, applying the function again changes nothing
Taking the absolute value of a number - | |x| | = |x|, so taking it once has the same effect as taking it multiple times
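To make the definition concrete, here is a minimal sketch that checks f(f(x)) = f(x) over a handful of sample values. The helper name isIdempotentOn and the sample list are assumptions made for this illustration, and passing the check on a few samples is evidence, not a proof.

```typescript
// Returns true if applying f twice gives the same result as applying it once
// for every sample value.
function isIdempotentOn(f: (x: number) => number, samples: number[]): boolean {
  return samples.every((x) => f(f(x)) === f(x));
}

const samples = [-2.5, -1, 0, 0.4, 3, 7.9];

const timesZero = (x: number): number => x * 0;
const addOne = (x: number): number => x + 1; // counter-example: each call adds again

console.log(isIdempotentOn(timesZero, samples));  // true
console.log(isIdempotentOn(Math.floor, samples)); // true
console.log(isIdempotentOn(Math.ceil, samples));  // true
console.log(isIdempotentOn(addOne, samples));     // false
```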

Computer science idempotence

In programming, idempotence is about side effects: what happens to the outside world when you call a function. Idempotence says, “Whether you call me once or many times, the outcome is the same.”

Source code idempotence

Pure functions

Same Input → Same Output + No Side Effects

A well-known practice is writing functions without side effects. Such functions are called pure functions because they do not modify any state outside their scope. Pure functions always produce the same output for the same input, no matter how many times they are called. Since a pure function changes nothing, calling it repeatedly leaves the system exactly as a single call would, which makes it trivially idempotent.

Examples:

🔴 Not idempotent: A function that calculates the average of a list of numbers and, while doing so, modifies the list by removing the elements it has already processed
🟢 Idempotent: A function that copies the list and works with the copy to calculate the average
🟢 Idempotent: Hashing function. Always the same result for the same input.

function simpleHash(s: string): number {
    let hash = 0;
    if (s.length === 0) {
        return hash;
    }
    for (let i = 0; i < s.length; i++) {
        const char = s.charCodeAt(i);
        hash = ((hash << 5) - hash) + char;
        hash = hash & hash; // Convert to 32bit integer
    }
    return hash;
}

console.log(simpleHash('Hello, World!')); // Same hash for same string
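The two averaging examples from the list above could look roughly like this. The function names are hypothetical, and returning 0 for an empty list is an assumption made to keep the sketch short.

```typescript
// Not idempotent: consumes the caller's array while computing the average.
function averageDestructive(values: number[]): number {
  const count = values.length;
  let sum = 0;
  while (values.length > 0) {
    sum += values.pop()!; // mutates the input
  }
  return count === 0 ? 0 : sum / count; // assumption: empty list averages to 0
}

// Idempotent: works on a copy, so the caller's array is never touched.
function averagePure(values: readonly number[]): number {
  const copy = [...values];
  if (copy.length === 0) return 0; // assumption: empty list averages to 0
  return copy.reduce((sum, v) => sum + v, 0) / copy.length;
}

const grades = [2, 4, 6];
console.log(averagePure(grades));        // 4
console.log(averagePure(grades));        // 4
console.log(averageDestructive(grades)); // 4
console.log(averageDestructive(grades)); // 0
```

The second destructive call returns a different answer because the first call emptied the list, which is exactly the kind of hidden state change that pure functions rule out.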

Practical benefits ⭐️

Pure functions are easier to reuse, test, read, understand, and compose.

React components

Same Props & State → Same Output

In React, rendering is designed to be pure and idempotent: a component should always produce the same output for the same props and state. This is achieved by keeping side effects out of rendering and relying on pure functions to compute the component's output.

This idempotent behavior is important because it makes React components predictable and easy to reason about, reducing the risk of bugs and inconsistencies. Moreover, it enables features like server-side rendering, which relies on the ability to render the same component output in different contexts and environments.
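Stripped of the framework, the idea can be sketched as a plain function from props and state to markup. This is not React's API: renderCounter and both interfaces are hypothetical names used only to show the "same props and state, same output" contract.

```typescript
// Hypothetical props and state shapes, for illustration only.
interface CounterProps { label: string }
interface CounterState { count: number }

// A "component" reduced to its essence: a pure function of props and state.
// It reads nothing else and changes nothing, so it can run anywhere, any number of times.
function renderCounter(props: CounterProps, state: CounterState): string {
  return `<button>${props.label}: ${state.count}</button>`;
}

const firstRender = renderCounter({ label: "Clicks" }, { count: 3 });
const secondRender = renderCounter({ label: "Clicks" }, { count: 3 });
console.log(firstRender === secondRender); // true
```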

Practical benefits ⭐️

Possibility of SSR, declarative style of view definition

Application Layer

API design

In the application layer, the idempotence principle can be implemented by designing APIs that are idempotent. An idempotent API can be called multiple times with the same input and leaves the system in the same state as a single call would. For example, an API that creates a user account should be idempotent so that if it is called multiple times with the same input, it does not create duplicate user accounts.

To achieve idempotence in APIs, developers can use techniques like unique identifiers, conditional requests, and optimistic concurrency control. Unique identifiers can be used to ensure that the same input is not processed twice. Conditional requests can be used to ensure that the state of the system has not changed since the last request. Optimistic concurrency control can be used to ensure that changes made by one request do not conflict with changes made by another request.
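As a minimal sketch of the unique-identifier technique, the client can send an idempotency key with each logical request, and the server can replay the stored result when it sees the same key again. The names createUser, idempotencyKey, and the in-memory stores are assumptions for this illustration; a real service would persist the keys and expire them.

```typescript
interface User { id: number; email: string }

// In-memory stores, for illustration only.
const users: User[] = [];
const resultsByKey = new Map<string, User>();

// idempotencyKey is a client-generated identifier that stays the same across retries.
function createUser(idempotencyKey: string, email: string): User {
  const previous = resultsByKey.get(idempotencyKey);
  if (previous !== undefined) {
    return previous; // a retry: replay the stored result and create nothing
  }
  const user: User = { id: users.length + 1, email };
  users.push(user);
  resultsByKey.set(idempotencyKey, user);
  return user;
}

const created = createUser("key-123", "ada@example.com");
const retried = createUser("key-123", "ada@example.com"); // same request sent again
console.log(created === retried, users.length); // true 1
```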

Examples:

🔴 Not idempotent: Pushing “Place order” creates a new order in the database
🟢 Idempotent: Pushing “Place order” moves order 234 from ‘cart’ to ‘finalized’ status.
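The "Place order" example above can be sketched as a status transition on an existing record instead of an insert. The in-memory orders map and the placeOrder function are hypothetical stand-ins for a real database and endpoint.

```typescript
type OrderStatus = "cart" | "finalized";
interface Order { id: number; status: OrderStatus }

// In-memory store holding the order from the example above.
const orders = new Map<number, Order>([[234, { id: 234, status: "cart" }]]);

// Moves an existing order to 'finalized'. Repeating the call changes nothing further.
function placeOrder(orderId: number): Order {
  const order = orders.get(orderId);
  if (order === undefined) {
    throw new Error(`Order ${orderId} not found`);
  }
  order.status = "finalized"; // sets a value instead of appending a record
  return order;
}

placeOrder(234);
placeOrder(234); // double click or network retry
console.log(orders.size, orders.get(234)?.status); // 1 finalized
```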

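// Idempotent API endpoint for updating a user's profile
import express from 'express';

const app = express();
app.use(express.json()); // parse JSON request bodies into req.body

// In-memory store, for illustration only
const userProfiles: { userId: string; name: string; email: string }[] = [];

app.put('/users/:id/profile', (req, res) => {
  const userId = req.params.id;
  const updatedProfile = req.body;

  // Check if the user profile already exists
  const existingProfile = userProfiles.find((profile) => profile.userId === userId);

  if (existingProfile) {
    // Update the existing user profile; repeating the request sets the same values again
    existingProfile.name = updatedProfile.name;
    existingProfile.email = updatedProfile.email;

    res.status(200).json(existingProfile);
  } else {
    // Create a new user profile
    const newProfile = {
      userId,
      name: updatedProfile.name,
      email: updatedProfile.email,
    };

    userProfiles.push(newProfile);

    res.status(201).json(newProfile);
  }
});

app.listen(3000); // the port is an arbitrary choice for this example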

Database migrations

Database migrations should be designed to be idempotent so that applying the same migration multiple times has the same effect as applying it once.

Example. In the following script, newField is the field that we want to add to the documents where it doesn't exist. If the newField doesn't exist in a document, it will be added with a default value "defaultValue". If it already exists, the document will remain unchanged. As a result, the script is idempotent - you can run it multiple times and it will always result in the same final state of the data.

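import { MongoClient } from "mongodb";

// Placeholder connection settings; replace with your own
const url = "mongodb://localhost:27017";
const dbName = "mydb";
const collectionName = "mycollection";

const addFieldToCollection = async () => {
  const client = new MongoClient(url);

  try {
    await client.connect();

    const db = client.db(dbName);
    const collection = db.collection(collectionName);

    // Add the 'newField' to documents where it does not exist
    const filter = { newField: { $exists: false } };
    const update = { $set: { newField: "defaultValue" } };

    const result = await collection.updateMany(filter, update);
    // On a repeated run no document matches the filter, so this logs 0
    console.log(`Documents updated: ${result.modifiedCount}`);
  } catch (err) {
    console.error(`An error occurred while performing the migration: ${err}`);
  } finally {
    await client.close();
    console.log("Connection to MongoDB closed");
  }
};

addFieldToCollection();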

Application dependencies management

Most package managers (npm, Yarn, Maven, etc.) follow the idempotence principle: installing packages and dependencies results in the same state, regardless of how many times it is done. When asked to install a package, the package manager checks whether the required version is already present and skips the installation if it is.

Moreover, package managers often use lock files (such as package-lock.json or yarn.lock) to ensure that the same versions of packages are installed across different environments or machines. So, running the installation process with the same lock file always results in the same state.
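The check-then-skip behavior can be modeled in a few lines. This is a toy model, not how any particular package manager is implemented: LockFile, installFromLock, and the package names are all hypothetical.

```typescript
type LockFile = Record<string, string>; // package name -> exact version

// Brings `installed` in line with the lock file and reports what had to change.
function installFromLock(lock: LockFile, installed: Map<string, string>): string[] {
  const changes: string[] = [];
  for (const [pkg, version] of Object.entries(lock)) {
    if (installed.get(pkg) !== version) {
      installed.set(pkg, version); // "install" the pinned version
      changes.push(`${pkg}@${version}`);
    }
  }
  return changes;
}

const lockFile: LockFile = { "pkg-a": "1.3.0", "pkg-b": "4.17.21" };
const installedPackages = new Map<string, string>();

console.log(installFromLock(lockFile, installedPackages).length); // 2
console.log(installFromLock(lockFile, installedPackages).length); // 0
```

The second run finds everything in place and does nothing, and any machine starting from the same lock file ends up with the same set of versions.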

Containerization

When creating a Docker image, the Dockerfile specifies the steps needed to create the image, including installing packages, configuring environment variables, and running commands. Each step in the Dockerfile creates a new layer in the image, which can be cached and reused if the Dockerfile is unchanged.

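This caching mechanism makes rebuilding a Docker image close to idempotent: building the same Dockerfile again with an unchanged build context reuses the cached layers and yields the same image. The guarantee is not absolute, because a step such as installing an unpinned package can produce a different result when the cache is cold, so pin base image tags and package versions where reproducibility matters. Moreover, Docker images can be pushed to a container registry and pulled to different machines, ensuring that the same image always results in the same container state, regardless of the machine or environment it is deployed in.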
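A toy model of layer caching shows why a repeated build does no work. It assumes a layer's cache key depends only on its parent's key and its own instruction; Docker's real cache also takes the contents of copied files into account, so treat this as a sketch of the idea rather than of Docker itself.

```typescript
// Keys of layers that have already been built.
const layerCache = new Set<string>();

// Returns how many layers had to be built, i.e. the number of cache misses.
function build(instructions: string[]): number {
  let parentKey = "";
  let built = 0;
  for (const instruction of instructions) {
    // Assumption: the key is the chain of instructions up to and including this one.
    const key = `${parentKey}\n${instruction}`;
    if (!layerCache.has(key)) {
      layerCache.add(key); // "build" the layer
      built++;
    }
    parentKey = key;
  }
  return built;
}

const dockerfile = ["FROM node:20", "COPY . /app", "RUN npm ci"];
console.log(build(dockerfile)); // 3
console.log(build(dockerfile)); // 0
console.log(build(["FROM node:20", "COPY . /app", "RUN npm install"])); // 1
```

Changing only the last instruction rebuilds one layer, while changing the first would invalidate every layer after it.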

Event stream processing

In event stream processing systems, idempotence plays a crucial role in ensuring data consistency and preventing duplicate processing of events. Idempotence is achieved by designing event processing logic to handle the same event multiple times without causing unintended side effects or producing inconsistent results.

Let's consider a system that processes customer order events in an e-commerce application. The goal is for each order event to take effect exactly once, even in the presence of failures or retries. Here's how idempotence can be achieved:

Deduplication: Incoming events can be assigned a unique identifier (such as a UUID) when they are generated. The processing system can maintain a record of processed events and check if an incoming event with the same identifier has already been processed. If the event has been processed before, it can be safely ignored or flagged as a duplicate.
Idempotent Processing Logic: The processing logic should be designed to be idempotent, meaning that processing the same event multiple times should have the same effect as processing it once. For example, when handling an order event, prefer operations that set a value (mark order 234 as invoiced) over operations that accumulate (append another invoice), so that a replay leaves the state unchanged. Where an action cannot be made repeatable, such as updating inventory counts, guard it with the deduplication check so that it is skipped for events that have already been processed.
Exactly-Once Delivery: Where the platform supports it, the event stream processing system should ensure that events are delivered to the processing logic exactly once. This can be achieved by leveraging features such as message acknowledgments, consumer offsets, or transactional message processing in the underlying event streaming platform. Because many platforms only guarantee at-least-once delivery when failures occur, the two techniques above remain the safety net.
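The deduplication step above can be sketched as a handler that remembers which event identifiers it has already applied. The event shape, the in-memory stores, and handleOrderEvent are assumptions for this illustration; in a real system the processed-ID record and the state change would need to be committed atomically, for example in one database transaction.

```typescript
interface OrderEvent { eventId: string; sku: string; quantity: number }

// In-memory stores, for illustration only.
const processedEventIds = new Set<string>();
const inventory = new Map<string, number>([["widget", 10]]);

// Applies an order event at most once, however many times it is delivered.
// Returns true if the event was applied, false if it was a duplicate.
function handleOrderEvent(orderEvent: OrderEvent): boolean {
  if (processedEventIds.has(orderEvent.eventId)) {
    return false; // duplicate delivery: skip
  }
  const inStock = inventory.get(orderEvent.sku) ?? 0;
  inventory.set(orderEvent.sku, inStock - orderEvent.quantity);
  processedEventIds.add(orderEvent.eventId);
  return true;
}

const orderPlaced: OrderEvent = { eventId: "evt-1", sku: "widget", quantity: 2 };
console.log(handleOrderEvent(orderPlaced)); // true
console.log(handleOrderEvent(orderPlaced)); // false
console.log(inventory.get("widget"));       // 8
```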

Infrastructure Layer

Infrastructure as Code (IaC)

In infrastructure architecture, the idempotence principle can be implemented by using tools and technologies that support idempotent operations. For example, infrastructure as code tools like Ansible (configuration management) and Terraform (resource provisioning) are designed to be idempotent. This means that if a configuration change is made, the tool will only apply the change if it is necessary. If the configuration is already in the desired state, the tool will not make any changes.

Idempotence is important in infrastructure architecture because it helps to ensure that the system is always in a consistent state. If the system drifts from the desired state, for example after a manual change or a partially failed run, the tool can simply be run again to bring it back.

Examples:

🔴 Not idempotent: Install MongoDB
🟢 Idempotent: Install MongoDB if it isn’t already installed
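The "install if it isn't already installed" pattern boils down to describing a desired state and acting only when it is not met. In this sketch a Set stands in for the packages present on a host; Host and ensureInstalled are hypothetical names, and the "changed" and "ok" results mirror the way tools such as Ansible report each task.

```typescript
// Stands in for the set of packages present on a machine.
type Host = Set<string>;

// Ensures the package is present and reports whether anything had to be done.
function ensureInstalled(host: Host, pkg: string): "changed" | "ok" {
  if (host.has(pkg)) {
    return "ok"; // already in the desired state: do nothing
  }
  host.add(pkg); // placeholder for the actual installation
  return "changed";
}

const server: Host = new Set();
console.log(ensureInstalled(server, "mongodb")); // changed
console.log(ensureInstalled(server, "mongodb")); // ok
```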

In the example below, we have a Terraform configuration that defines three resources: an S3 bucket, an IAM user, and an S3 bucket policy.

The idempotence is achieved by relying on Terraform's resource management capabilities. Terraform ensures that the desired state defined in the configuration matches the actual state of the resources, allowing for idempotent operations.

resource "aws_s3_bucket" "example_bucket" {
  bucket = "example-bucket"
  acl    = "private"
}

resource "aws_iam_user" "example_user" {
  name = "example-user"
}

resource "aws_s3_bucket_policy" "example_bucket_policy" {
  bucket = aws_s3_bucket.example_bucket.id

  # jsonencode keeps the policy valid JSON and avoids heredoc indentation issues
  policy = jsonencode({
    # "2012-10-17" is the current IAM policy language version, not an arbitrary date
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "AllowGetObject"
        Effect = "Allow"
        # Use the user's ARN attribute; the user's id is its name, not an account ID
        Principal = { AWS = aws_iam_user.example_user.arn }
        Action    = "s3:GetObject"
        Resource  = "${aws_s3_bucket.example_bucket.arn}/*"
      }
    ]
  })
}

Load balancing

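Load balancing benefits from idempotence in two ways. First, routing should not affect the outcome: whichever instance of a service handles a request, and in whatever order requests arrive, the effect on the system should be the same. Second, when an instance fails mid-request, the load balancer can safely retry the request on another instance only if the operation is idempotent.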
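One place where this shows up in practice is retries: when a backend fails, a load balancer can only resend the request elsewhere if doing so cannot apply the effect twice. The sketch below uses the list of idempotent HTTP methods from RFC 9110; the Backend type and the forward function are hypothetical simplifications that model a request as just its method.

```typescript
// Methods that RFC 9110 defines as idempotent.
const IDEMPOTENT_METHODS = new Set(["GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"]);

// Hypothetical backend: takes an HTTP method and returns true if the request succeeded.
type Backend = (method: string) => boolean;

// Tries backends in order and moves on to the next one only when a retry is safe.
function forward(method: string, backends: Backend[]): boolean {
  for (const backend of backends) {
    if (backend(method)) return true;
    if (!IDEMPOTENT_METHODS.has(method)) return false; // a retry could apply the effect twice
  }
  return false;
}

const failing: Backend = () => false;
const healthy: Backend = () => true;

console.log(forward("PUT", [failing, healthy]));  // true
console.log(forward("POST", [failing, healthy])); // false
```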

Orchestration platforms

Kubernetes, as an orchestration platform for containerized applications, leverages the idempotence principle to ensure reliable and consistent management of resources. With Kubernetes, applying the same desired state configuration multiple times has the same effect as applying it once, thanks to its declarative nature. This means that Kubernetes automatically handles resource creation, update, and reconciliation, ensuring that the desired state is achieved and maintained no matter how many times the configuration is applied. Idempotence in Kubernetes enables easy scaling, fault tolerance, and infrastructure management, reducing manual intervention and ensuring system reliability.
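The reconciliation idea can be sketched as a function that compares the desired state with the actual state and performs only the missing steps. This is a simplified model, not Kubernetes code: ClusterState and reconcile are hypothetical, and the state is reduced to a replica count.

```typescript
interface ClusterState { replicas: number }

// One reconciliation pass: moves `actual` toward `desired` and returns the actions taken.
function reconcile(desired: ClusterState, actual: ClusterState): string[] {
  const actions: string[] = [];
  while (actual.replicas < desired.replicas) {
    actual.replicas++;
    actions.push("start pod");
  }
  while (actual.replicas > desired.replicas) {
    actual.replicas--;
    actions.push("stop pod");
  }
  return actions;
}

const desiredState: ClusterState = { replicas: 3 };
const actualState: ClusterState = { replicas: 1 };

console.log(reconcile(desiredState, actualState).length); // 2
console.log(reconcile(desiredState, actualState).length); // 0
```

Applying the same desired state a second time produces no actions, because the first pass already closed the gap.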

Conclusion

The idempotence principle is a powerful concept in software engineering that can help to reduce errors and increase the reliability of systems. By implementing idempotent functions, APIs, and infrastructure architecture, we can build systems that are more robust and less prone to failure. It is important for software engineers to understand and apply the idempotence principle in all layers of software engineering, from source code to infrastructure architecture.

Ensuring idempotence at each layer of software engineering can help to improve the quality of the software produced. By focusing on idempotence, software engineers can reduce the amount of time spent on debugging and troubleshooting errors in the system. This approach can also help to reduce the amount of downtime experienced by the system, which can result in increased productivity and customer satisfaction.

In addition, the use of idempotent functions, APIs, and infrastructure architecture can help to future-proof the system. As the system evolves over time, idempotence helps it remain consistent, even as new features are added or existing features are modified. This limits the technical debt that would otherwise accumulate and make the system harder to maintain and update.

In conclusion, the idempotence principle is an important concept in software engineering that should be applied at each layer of the software development process. By prioritizing idempotence, software engineers can build more reliable, consistent, and future-proof systems.