About a year ago, I was working on a project at my previous job. The goal was to set up a reliable message queue. Most people might go straight to RabbitMQ, and I almost did too. But I decided to look for something different. That's when I came across NATS JetStream. After working with it for the past year, I can say it's truly impressive. It's fast, it can handle a lot of data, and it fits well with cloud systems.
Today, I'll walk you through how to use NATS JetStream with a Golang app. By the time you finish reading this article, you'll be able to implement a robust and fast message queue.
For those who are not familiar, what are message queues?
In the real world, message queues help systems talk to each other without getting overwhelmed. For example, when you order something online, your order might go into a message queue before it's processed. This way, even if many people order at the same time, the system can handle it smoothly.
Message queues are key for making sure data gets where it needs to go, especially when things get busy.
For my project, I had a clear list of needs when it came to selecting a message queue:
Durable Queues: It was crucial that our messages stayed safe. We couldn't afford to lose them due to network glitches or if a server needed to restart. In simple terms, once a message is in the queue, it should stay there until it's processed.
Retrying Messages: Sometimes, there can be small issues or temporary errors when a system tries to read a message. I needed a way to send these messages back into the queue so they could be tried again later. This ensures that no message gets left behind because of short-term problems.
Horizontal Scaling: As our system might get more users and data, I wanted our message queue to grow easily with us.
With these requirements in mind, I began my search for the perfect message queue solution.
When searching for the right message queue, I explored several options. Here's a brief overview:
Gearman, Beanstalkd, and similar: I didn't dive too deep into these. They're older and seem to lack current support.
RabbitMQ: It's favored by many small businesses. However, I found it a bit slow. Setting up a highly available (HA) cluster requires real expertise. Plus, it lacks a built-in way to requeue a message with a delay.
Redis: It's excellent for PubSub and storing KV data. But as a queue system? Not the best. If Redis is configured to keep every message, it slows down drastically.
Apache Kafka: It's a strong choice for event streaming. But it’s not the best choice for message queues. For instance, how can I requeue a message?
Apache Pulsar: It looked promising and fits the needs of large organizations. It met most of my criteria. Yet, horizontal scaling isn't straightforward. It demands a large infrastructure and a skilled team, which felt excessive for my project's scale.
Finally, I stumbled upon NATS JetStream. It seemed to align perfectly with my requirements.
To understand how NATS JetStream works, I wrote some simple code and ran a few tests. Here's what I learned:
Setting Things Up: I wrote two simple Golang CLI tools, a publisher and a consumer:
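Both tools share the same boilerplate before the fragments below: connect to NATS, obtain a JetStream context (the jetStream variable in the publisher), make sure a stream captures the test subject, and create a pull subscription (the jetStreamSubscriber variable in the consumer). Here's a minimal sketch of that setup; the stream name test_stream and the durable name test_durable are placeholders of mine, not taken from the original tools.
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to a local NATS server (localhost:4222 by default).
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatalf("connect: %s", err)
	}
	defer nc.Drain()

	// JetStream context used by the publisher fragment below.
	jetStream, err := nc.JetStream()
	if err != nil {
		log.Fatalf("jetstream: %s", err)
	}

	// Make sure a stream exists that captures the test subject.
	if _, err := jetStream.AddStream(&nats.StreamConfig{
		Name:     "test_stream", // placeholder stream name
		Subjects: []string{"test_subject_name"},
	}); err != nil {
		log.Fatalf("add stream: %s", err)
	}

	// Pull subscription used by the consumer fragment below.
	jetStreamSubscriber, err := jetStream.PullSubscribe("test_subject_name", "test_durable")
	if err != nil {
		log.Fatalf("pull subscribe: %s", err)
	}
	_ = jetStreamSubscriber // handed to the consumer's fetch loop
}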
...
publishedTotal := 0
for i := 0; i < 30000; i++ {
	subject := "test_subject_name"
	// PublishAsync returns as soon as the message is queued for sending;
	// the error here only covers local failures, server acks arrive asynchronously.
	_, err = jetStream.PublishAsync(subject, []byte(strconv.Itoa(i)))
	if err != nil {
		return fmt.Errorf("publish: %w", err)
	}
	publishedTotal++
}
fmt.Printf("published %d messages\n", publishedTotal)
...
...
totalReceived := int32(0)
for {
	// Each iteration runs in a closure so the context is cancelled via defer.
	func() {
		ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
		defer cancel()
		// Pull one message at a time from the durable consumer.
		msgs, fetchErr := jetStreamSubscriber.Fetch(1, nats.Context(ctx))
		if fetchErr != nil {
			// A deadline just means nothing arrived within 3 seconds; keep polling.
			if errors.Is(fetchErr, context.DeadlineExceeded) {
				return
			}
			log.Printf("sub fetch err: %s\n", fetchErr)
			return
		}
		for _, msg := range msgs {
			// Ack so JetStream does not redeliver the message.
			if ackErr := msg.Ack(); ackErr != nil {
				log.Printf("sub ack err: %s\n", ackErr)
			}
			totalReceived++
			log.Printf("total received: %d\n", totalReceived)
		}
	}()
}
...
With these tools ready, I checked a few things:
Question: What if I send thousands of messages all at once?
Answer: The consumers received every message. Nothing was lost.
Question: What if I send messages but there's no one to consume them?
Answer: When a consumer is ready, it receives all the messages.
Question: How do consumers act if the NATS server stops? Do they try to connect again when NATS is back?
Answer: Yes, consumers connect back to NATS by themselves when it's working again.
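For reference, the client's reconnect behaviour can also be tuned explicitly when connecting. This is a small sketch of the relevant options, not configuration taken from my test tools:
package main

import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// The client reconnects on its own; these options just make it more
	// persistent and log what happens along the way.
	nc, err := nats.Connect(
		nats.DefaultURL,
		nats.MaxReconnects(-1),            // keep retrying forever
		nats.ReconnectWait(2*time.Second), // pause between attempts
		nats.DisconnectErrHandler(func(_ *nats.Conn, err error) {
			log.Printf("disconnected: %v", err)
		}),
		nats.ReconnectHandler(func(nc *nats.Conn) {
			log.Printf("reconnected to %s", nc.ConnectedUrl())
		}),
	)
	if err != nil {
		log.Fatalf("connect: %s", err)
	}
	defer nc.Drain()
}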
NATS handled our first requirement, durable queues, with ease. Let's check out how it can send messages back to the queue.
In NATS JetStream, requeuing a message that has encountered a problem is straightforward. You can achieve this by calling the Nak() function available in the message API. Invoking Nak() ensures that the message is sent back to the queue, effectively providing a second chance for it to be processed at a later time.
Another feature that adds to the utility of NATS JetStream is the ability to requeue messages with a custom delay. This functionality comes in handy when dealing with tasks that are deferred and need to be carried out after a certain period has elapsed. For example, if you have a notification system that must send alerts at specific intervals, you can use this feature to schedule these messages accurately.
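As a rough sketch of both cases (the handle function below is hypothetical, standing in for real message processing, and would be called from a fetch loop like the one shown earlier):
package main

import (
	"errors"
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

// handle stands in for real message processing; it is hypothetical.
func handle(msg *nats.Msg) error {
	if len(msg.Data) == 0 {
		return errors.New("empty payload")
	}
	return nil
}

// ackOrRequeue acks a message on success and sends it back to the queue on failure.
func ackOrRequeue(msg *nats.Msg) {
	if err := handle(msg); err != nil {
		// Immediate requeue: JetStream will redeliver the message.
		if nakErr := msg.Nak(); nakErr != nil {
			log.Printf("nak err: %s", nakErr)
		}
		return
	}
	if ackErr := msg.Ack(); ackErr != nil {
		log.Printf("ack err: %s", ackErr)
	}
}

// requeueLater asks JetStream to redeliver the message after a custom delay,
// which suits deferred tasks such as scheduled notifications.
func requeueLater(msg *nats.Msg, delay time.Duration) {
	if err := msg.NakWithDelay(delay); err != nil {
		log.Printf("nak with delay err: %s", err)
	}
}

func main() {
	// In a real consumer, ackOrRequeue / requeueLater would be called
	// from the Fetch loop shown earlier in the article.
}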
That's two requirements nailed. One more important point remains: scalability.
When it comes to horizontal scaling, NATS JetStream makes the process relatively simple. Setting up a cluster involves running multiple NATS servers with JetStream enabled. All that's required is to specify the same cluster_name and define the routing paths between each node in the configuration.
Here's an example docker-compose.yml file that runs a local NATS cluster:
version: "3.5"
services:
  nats1:
    container_name: nats1
    image: nats
    entrypoint: /nats-server
    command: --server_name N1 --cluster_name JSC --js --sd /data --cluster nats://0.0.0.0:4245 --routes nats://nats2:4245,nats://nats3:4245 -p 4222
    networks:
      - nats
    ports:
      - 4222:4222
  nats2:
    container_name: nats2
    image: nats
    entrypoint: /nats-server
    command: --server_name N2 --cluster_name JSC --js --sd /data --cluster nats://0.0.0.0:4245 --routes nats://nats1:4245,nats://nats3:4245 -p 4222
    networks:
      - nats
    ports:
      - 4223:4222
  nats3:
    container_name: nats3
    image: nats
    entrypoint: /nats-server
    command: --server_name N3 --cluster_name JSC --js --sd /data --cluster nats://0.0.0.0:4245 --routes nats://nats1:4245,nats://nats2:4245 -p 4222
    networks:
      - nats
    ports:
      - 4224:4222
networks:
  nats:
    name: nats
Another thing that adds to the utility is that you can connect to any node within the cluster, whether from a publisher or a consumer application, and it just works. Messages are routed correctly regardless of which node you are connected to, so the cluster handles varying load distributions on its own.
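In practice, that just means passing every node URL to the client. Here's a minimal sketch against the local cluster above; the ports match the docker-compose port mappings, and the subject assumes the stream from the earlier setup sketch exists:
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	// List every node of the local cluster; the client picks one and
	// fails over to the others if it becomes unavailable.
	nc, err := nats.Connect("nats://localhost:4222,nats://localhost:4223,nats://localhost:4224")
	if err != nil {
		log.Fatalf("connect: %s", err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatalf("jetstream: %s", err)
	}

	// Publishing works the same regardless of which node accepted the connection.
	if _, err := js.Publish("test_subject_name", []byte("hello")); err != nil {
		log.Fatalf("publish: %s", err)
	}
}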
It was clear to me that NATS JetStream aligned with all my requirements perfectly.
I spent considerable time with NATS JetStream and can outline its strengths and limitations based on my experience:
A performance benchmark was conducted by StreamNative, comparing NATS JetStream with other popular message queues such as Apache Pulsar and RabbitMQ. It's important to approach these results with a degree of skepticism, as the company is known for its affinity towards Apache Pulsar. They themselves noted that they faced difficulties in configuring NATS JetStream for the benchmark, which could potentially skew the results in favor of their preferred system.
From their report: "NATS JetStream – During our tests, we attempted to follow the recommended practices for deliverGroups and deliverSubjects in NATS, but encountered difficulties. Our NATS subscriptions failed to act in a shared mode and instead exhibited a fan-out behavior, resulting in a significant read amplification of 16 times. This likely significantly impacted the overall publisher performance."
After working with NATS JetStream for an extended period, several aspects of the system became apparent:
Easy configuration.
High performance.
Broad functionality.
Beyond its capabilities as a message queue, NATS JetStream also functions as a message broker for event streaming, PubSub, and synchronous Request-Reply messaging.
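As a small taste of the Request-Reply side, here's a minimal sketch using core NATS; the greet subject and the echo responder are placeholders of mine:
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatalf("connect: %s", err)
	}
	defer nc.Drain()

	// A responder that echoes requests back; "greet" is a placeholder subject.
	if _, err := nc.Subscribe("greet", func(msg *nats.Msg) {
		_ = msg.Respond([]byte("hello, " + string(msg.Data)))
	}); err != nil {
		log.Fatalf("subscribe: %s", err)
	}

	// A synchronous request that waits up to one second for the reply.
	reply, err := nc.Request("greet", []byte("world"), time.Second)
	if err != nil {
		log.Fatalf("request: %s", err)
	}
	fmt.Println(string(reply.Data))
}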
While this article provides an initial overview, I plan to discuss the diverse features of NATS JetStream in more detail in future articles. This will include a closer look at its specific functionalities and how they can be applied in different scenarios.