Producers Guarantees for Event-driven Development

Written by wirtzleg | Published 2022/07/25
Tech Story Tags: event-driven-architecture | microservices | message-queue | transactions | distributed-systems | development | queue | software-architecture

TLDRIn this article, we're going to talk about how to provide guaranteed messages sending for producers. Simple transactions cannot cover the problem, but distributed transactions are expensive. Idempotency helps with scheduled jobs. An additional table in DB makes the application less overloaded.via the TL;DR App

Often, we see delivery guarantees for consumers in message queues - at least once, at most once, etc. But in this article, we're going to talk about how to provide guaranteed messages sending for producers.

Problem Statement

Consider the problem in the example of buying airline tickets. Let's imagine that we have a durable queue to send events of order payments, rejects, declines, etc. Our queue is the source of truth so that we can reread messages from it at any time to restore the system's state. Other services listen to these events, update booking statuses, and send emails.

Transactions

When a consumer has paid for an order, we want to:
1. Update the order status in the database
2. Send an event about status changing
If we perform these actions one by one, without using any transactions, we may update the data in the database but fail to send the message to the queue.
If we wrap the execution of operations in a transaction, we need to understand that, by default, a transaction will work only for one of these two data storages. With a transactional write to the database, a message may be sent to the queue successfully, but the transaction will fail when committing. The user will receive a letter about successful payment, but at the same time, we will have a different state in the database. We want to avoid such inconsistency.
For the transaction to cover both data sources, we need to use a distributed transaction, for example, using the 2-Phase-Commit protocol. Not all data sources support such transactions. Plus, we need a transaction orchestrator that will manage the process. It is also important to keep in mind that 2PC significantly slows down the system performance, so with intensive writing, services can become overloaded.

Idempotency

Alternatively, we can rely on idempotency. By using the message queue in this way, consumers must correctly reprocess events that they have handled before. If the service has already updated the booking status to
Issued
after payment, then it should not throw exceptions if it sees such a message again.
Thus, we should not worry if we send several identical messages to the queue. Therefore, to send status change events, we can use schedulers (for example, cron jobs). Our flow will look like this:
1. Transactional update of the order status in the database
2. Periodic job, which
a) Selects unsent status changes
b) For every change
1. Sends a message to the queue
2. Marks as sent in the database (sending to the queue and marking in the database do not have a common transaction)
Thus, in the worst case, we can send a message to the queue but fall on the mark in the database. Here, in the next tick of the scheduler, we will re-send the message to the queue and try to mark it in the database again. As a result, sooner or later the mark will be put down, and consumers will ignore repeated messages in the queue.

Real-world Optimizations

If we create a cron job for each such logic, then we can quickly find dozens of schedulers, which makes it difficult to maintain the application.
Instead, we can create one table in the database for the messages we want to send to the queue (for example,
queue_messages
). And create only 1 cron job, which we can run noticeably more often, reducing the overall delay in the system.
At the same time, we should update the status of orders and insert changes into
queue_messages
transactionally so as not to violate the consistency of the system.

Written by wirtzleg | Senior Software Engineer
Published by HackerNoon on 2022/07/25