The publish-subscribe (or pub/sub) messaging pattern is a design pattern for exchanging messages between senders (publishers) and receivers (subscribers). Subscribers receive only the messages on the topics they subscribe to, which keeps the two sides loosely coupled and lets each scale independently.
Messages are sent (pushed) from a publisher to subscribers as they become available. The host (publisher) publishes messages (events) to channels (topics). Subscribers can sign up for the topics they are interested in.
This is different from the standard request/response (pull) model, in which clients must repeatedly check whether new data has become available. Because messages are pushed as soon as they exist, pub/sub is well suited to streaming data in real-time.
It also means that dynamic networks can be built at internet scale. However, building a messaging infrastructure at such a scale can be problematic.
This introduction to the pub/sub messaging pattern describes what it is and why developers use it, and discusses the difficulties that must be overcome when building a messaging system at scale.
The Ably realtime platform uses the publish-subscribe pattern at internet scale for delivering messages in real-time.
In the pub/sub messaging pattern, publishers do not send messages directly to subscribers; instead, messages are sent via brokers. Publishers do not know who the subscribers are or to which (if any) topics they subscribe. This means publishers and subscribers can operate independently of each other. This is known as loose coupling, and it removes the service dependencies that traditional messaging patterns would otherwise introduce.
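A minimal sketch of the broker idea, assuming a single process and in-memory delivery. The `Broker` class and the topic names are hypothetical and exist only for illustration; a production broker would also handle networking, persistence, and failures.

```python
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[Any], None]


class Broker:
    """In-memory broker: routes each message to the subscribers of its topic."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver the message to every subscriber of the topic.

        Returns the number of subscribers that received it.
        """
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
scores = []
broker.subscribe("scores", scores.append)

# The publisher names only the topic, never a subscriber.
delivered = broker.publish("scores", "Home 1 - 0 Away")
ignored = broker.publish("weather", "Sunny")  # no subscribers, so nothing is delivered
```

The publisher's code stays the same whether a topic has zero, one, or many subscribers, which is the loose coupling described above.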
Pub/sub is different from the standard request/response model, in which clients poll (pull) to check whether new data is available. This makes the pub/sub method central to effective streaming of data in real-time.
The pub/sub pattern allows extremely dynamic networks to be built at scale without overloading the publishing components or causing unnecessary costs. However, scaling brings its own difficulties, and the different ways of getting around them need consideration.
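The difference between pulling and pushing can be sketched in a few lines. Both classes below are hypothetical stand-ins rather than any real API: the pull client asks on every tick whether data exists, while the push client is called only when there is something to deliver.

```python
from typing import Callable, Optional


class PollableSource:
    """Pull model: the client must keep asking whether anything is new."""

    def __init__(self) -> None:
        self._pending = []
        self.requests_served = 0

    def add(self, item: str) -> None:
        self._pending.append(item)

    def poll(self) -> Optional[str]:
        self.requests_served += 1
        return self._pending.pop(0) if self._pending else None


class PushSource:
    """Push model: the source calls each listener when data appears."""

    def __init__(self) -> None:
        self._listeners = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        self._listeners.append(listener)

    def add(self, item: str) -> None:
        for listener in self._listeners:
            listener(item)


# Pull: the client polls on every tick, but data only exists on one of them.
pull = PollableSource()
pulled = []
for tick in range(5):
    if tick == 3:
        pull.add("goal!")
    item = pull.poll()
    if item is not None:
        pulled.append(item)

# Push: a single delivery and no idle requests.
push = PushSource()
pushed = []
push.subscribe(pushed.append)
push.add("goal!")
```

In this run the pull client makes five requests to receive one message; the push client does no work until the message exists.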
Typical uses of the pub/sub pattern include event messaging, instant messaging, and data streaming (such as live-streaming sporting events). Pub/sub is also used for workload balancing and with asynchronous workflows.
Communication infrastructure for a pub/sub system (Diagram adapted from msn).
A simple information system can follow a simple pattern: input–processing–output. At a reasonable scale, the system will need multiple input and output modules for handling concurrent requests. A problem then arises of routing messages from input modules to their respective output modules.
To solve this routing problem, the input and output modules need an addressing mechanism. It is then the processing module’s job to route each message to the correct recipient based on its address.
At internet scale, the publish-subscribe pattern can handle tens of thousands of concurrent connections.
At internet scale, the system will handle thousands or even tens of thousands of concurrent connections. It also needs to handle high message volumes and a globally distributed user base.
At such a massive scale, the system needs to solve the following problems:
In short, the problems come down to minimizing the shared knowledge of addresses. Pub/sub solves the problems by using a data pipe through which modules can post and retrieve their messages.
The modules do not need to maintain shared knowledge of the whereabouts of other modules. The input modules only accept user input, processing modules only process the data, and the output modules only display the output.
In pub/sub, there is one channel for posting messages and one for retrieving. It happens in steps like this:
The same pattern works at any scale.
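As a rough sketch of the data pipe, the example below uses two in-process queues as the posting and retrieving channels. The module functions and the uppercase "processing" step are assumptions made for illustration; the point is that no module holds the address of any other.

```python
from queue import Queue

inbound = Queue()   # channel that input modules post to
outbound = Queue()  # channel that output modules retrieve from


def input_module(text: str) -> None:
    """Accept user input and post it, without knowing who will process it."""
    inbound.put(text)


def processing_module() -> None:
    """Drain the inbound channel, process each message, and post the result."""
    while not inbound.empty():
        outbound.put(inbound.get().upper())


def output_module() -> list:
    """Retrieve everything currently waiting on the outbound channel."""
    results = []
    while not outbound.empty():
        results.append(outbound.get())
    return results


input_module("hello")
input_module("world")
processing_module()
displayed = output_module()
```

Each module talks only to a channel, so more input or output modules can be added without changing the others.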
In pub/sub messaging, pre- and post-processing of messages is used to address routing problems at internet scale.
Consider a hypothetical logistics company. It would typically have a mix of customer data and generic data, and a highly variable customer load. The data channels between the customers, the drivers, and the delivery office may also be unreliable. It is important that subscribers receive all of the messages customers are sending, but they do not need to know who the customers are or how many there are.
It is also important that the company does not over-provision its service (which would be costly) or over-provision load balancing, which would add complexity and degrade the performance of the network.
It is important to remember that the pub/sub pattern is suited to conveying information whose relevance fades fast. (What is the score now? And now?) As information is frequently replaced, there is no pressing need to store it. Usually, it is enough to keep the most recent message, or enough information to recreate a view of fairly recent events.
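One way to keep "the most recent message, or enough to recreate fairly recent events" is a small bounded buffer per topic. This is a sketch under that assumption; `RecentHistory`, its default size, and the topic name are hypothetical.

```python
from collections import deque


class RecentHistory:
    """Keeps only the last `size` messages per topic; older ones are discarded."""

    def __init__(self, size: int = 3) -> None:
        self._size = size
        self._by_topic = {}

    def record(self, topic: str, message: str) -> None:
        if topic not in self._by_topic:
            # A deque with maxlen drops its oldest entry once it is full.
            self._by_topic[topic] = deque(maxlen=self._size)
        self._by_topic[topic].append(message)

    def latest(self, topic: str):
        """Return the newest message for the topic, or None if there is none."""
        history = self._by_topic.get(topic)
        return history[-1] if history else None

    def replay(self, topic: str) -> list:
        """Return the retained messages for the topic, oldest first."""
        return list(self._by_topic.get(topic, []))


history = RecentHistory(size=3)
for score in ["0-0", "1-0", "1-1", "2-1"]:
    history.record("match/42", score)
```

A subscriber that joins late can call `latest` to show the current score, or `replay` to rebuild a short view of recent events, without the system storing the full stream.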
Developers use pub/sub to take advantage of edge computing and the network backbone:
Event messaging: pub/sub powers many realtime interactions across domains like EdTech, B2B platforms, and delivery logistics. As we shop online more frequently for a wider variety of goods, package delivery has become commonplace. Logistics companies need to use delivery resources more efficiently. To optimize delivery, dispatching systems need up-to-date information on where their drivers are. Pub/sub event messaging helps logistics companies do this.
Dispatchers need to access drivers’ location information on demand, ideally continually. Having this data at the ready allows them to better predict arrival times and improve routing solutions. Dispatching systems also send out information such as cancellations, traffic information, and new package pickups.
As the day goes on, this information becomes more critical. It gets harder to maintain delivery time windows, and schedule adjustments must be made to maximize the number of on-time deliveries.
This is a lot of data, and not all of it is relevant at any given time. To get around this problem, devices need to be able to subscribe to updates that matter to them. With a pattern like pub/sub, all parties only subscribe to whatever is relevant to them:
These systems enable customers to track deliveries in real-time and, for example, to reschedule a package in transit. They can also alert drivers to pickups to be made en route, which allows for more effective routing, reduces fuel costs, and improves efficiency.
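To make the "subscribe only to what is relevant" idea concrete, here is a small sketch with wildcard topic patterns. The topic layout (`driver/<id>/location` and so on) and the `TopicBroker` class are assumptions for illustration, and Python's `fnmatchcase` is used as a simple stand-in for a real broker's topic matching.

```python
from fnmatch import fnmatchcase


class TopicBroker:
    """Delivers a message to every inbox whose pattern matches the topic."""

    def __init__(self):
        self._subscriptions = []  # (pattern, inbox) pairs

    def subscribe(self, pattern, inbox):
        self._subscriptions.append((pattern, inbox))

    def publish(self, topic, message):
        for pattern, inbox in self._subscriptions:
            # Note: fnmatch's "*" also matches "/" characters, which is
            # looser than the per-level wildcards some brokers provide.
            if fnmatchcase(topic, pattern):
                inbox.append((topic, message))


broker = TopicBroker()
dispatcher, driver_17, customer = [], [], []

broker.subscribe("driver/*/location", dispatcher)   # dispatcher: every driver's position
broker.subscribe("driver/17/pickups", driver_17)    # driver 17: only their own pickups
broker.subscribe("package/PKG-1/status", customer)  # customer: only their own package

broker.publish("driver/17/location", (51.5, -0.12))
broker.publish("driver/23/location", (48.85, 2.35))
broker.publish("driver/17/pickups", "Collect at depot B")
broker.publish("package/PKG-1/status", "Out for delivery")
broker.publish("package/PKG-2/status", "Delayed")
```

Five messages are published, but the customer sees one and the driver sees one; only the dispatcher receives both location updates.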
Other use-case examples include:
Here are two examples of pub/sub applications with code snippets.
Faye
Faye is an open source system used by Aha! Roadmap software and Shopify. It is based on pub/sub messaging. The following code sample shows how to start a server, create a client, and send messages:
Ably Realtime Chat App
Here is an example of how you might add pub/sub functionality to a chat app using one of Ably’s Realtime SDKs.
When the app launches, the SDK initializes and subscribes to the topic that represents a public chat room.
Subsequently, when the user wants to send a chat message, the chat app publishes the message on the same topic.
The app unsubscribes from the channel when the user logs out or leaves the chat room.
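The subscribe, publish, and unsubscribe lifecycle described above can be sketched as follows. This is not Ably's SDK: `ChatChannel` is a hypothetical in-memory stand-in, used so the example runs without a network connection or an API key.

```python
class ChatChannel:
    """Stand-in for a realtime channel; a real SDK would deliver over the network."""

    def __init__(self, name):
        self.name = name
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def unsubscribe(self, listener):
        self._listeners.remove(listener)

    def publish(self, sender, text):
        # Copy the list so a listener may unsubscribe while being notified.
        for listener in list(self._listeners):
            listener({"sender": sender, "text": text})


room = ChatChannel("public-chat")
alice_screen = []


def show_on_alice_screen(message):
    alice_screen.append(f"{message['sender']}: {message['text']}")


# 1. App launch: subscribe to the topic that represents the chat room.
room.subscribe(show_on_alice_screen)

# 2. Sending a chat message: publish on the same topic.
room.publish("bob", "Hi all")
room.publish("alice", "Hello Bob")

# 3. Leaving the room: unsubscribe, so later messages are no longer shown.
room.unsubscribe(show_on_alice_screen)
room.publish("bob", "Anyone there?")
```

Alice's own message comes back to her through the channel like any other, so the app needs only one code path for rendering messages.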
It is straightforward to implement a single-channel pub/sub messaging framework. But when you start to scale, the classic problems of distributed systems engineering emerge. When scaling to multiple channels and increasing complexity to any significant degree, the problems increase, and maintaining reliability becomes difficult.
Distributed messaging systems should ideally have the three elements of reliability, speed, and ordering. However, it’s usually the case that you only get to have two of them. To create a system that allows all three, you have to start at the design level with a watertight mathematical model. It is just about impossible to add in the missing third element later.
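The tension between ordering and speed shows up even in a toy receiver. The sketch below assumes the publisher stamps every message with a sequence number starting at 0; `OrderedReceiver` is hypothetical and is not a description of how any particular platform implements ordering.

```python
class OrderedReceiver:
    """Releases messages strictly in sequence order, buffering early arrivals."""

    def __init__(self):
        self._next_seq = 0
        self._early = {}     # seq -> payload, waiting for a gap to be filled
        self.delivered = []

    def receive(self, seq, payload):
        if seq < self._next_seq or seq in self._early:
            return  # duplicate (for example a retransmission): drop it
        self._early[seq] = payload
        # Release every message that is now contiguous with what was delivered.
        while self._next_seq in self._early:
            self.delivered.append(self._early.pop(self._next_seq))
            self._next_seq += 1


receiver = OrderedReceiver()
# The network reorders and duplicates: 1 arrives before 0, and 1 arrives twice.
for seq, payload in [(1, "b"), (0, "a"), (1, "b"), (3, "d"), (2, "c")]:
    receiver.receive(seq, payload)
```

Message 1 is held back until message 0 arrives. That wait is the cost: ordering is preserved, but delivery is only as fast as the slowest missing message.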
These are the problems to deal with:
These are all problems of building a system at scale. Because you don’t necessarily know all the information you might need about your system at any given time, either the framework needs to be clever enough to handle it, or all the applications in your system need to be quite advanced.
Ably balances the above concerns through judicious use of the TCP layer. By generating multiple paths, we gain reliability without sacrificing speed: we can do fast pathing because we control the path we follow. Also, because of the way the network is set up, we can maintain ordering, which is often lost in the trade-off with speed of delivery.
This is baked in at the design stage, because the problems that arise when building in a global framework are almost impossible to correct at a later stage.
You can either build a pub/sub messaging infrastructure yourself (self-deploy) or adopt a cloud-native Software-as-a-Service (SaaS) infrastructure, such as Ably.
Solving the design considerations of building a globally scaling system is far from easy for reasons described in the previous section. Building your own messaging system requires budgeting for more design upfront.
If you choose to self-deploy, you also need to consider infrastructure setup, installation, and framework configuration. Doing these yourself gives you control over the features you build into your system, but it is also time-consuming and expensive.
The advantages of “as-a-service” pub/sub infrastructure over self-deployment are:
Ably is an enterprise-ready pub/sub messaging platform. We make it easy to efficiently design, quickly ship, and seamlessly scale critical realtime functionality delivered directly to end-users. Every day we deliver billions of realtime messages to millions of users for thousands of companies.
We power the apps that people, organizations, and enterprises depend on every day, like Lightspeed Systems’ realtime device management platform for over seven million school-owned devices, Vitac’s live captioning for hundreds of millions of multilingual viewers for events like the Olympic Games, and Split’s realtime feature flagging for one trillion feature flags per month.
We’re the only pub/sub platform with a suite of baked-in services to build complete realtime functionality: presence shows a driver’s live GPS location on a home-delivery app, history instantly loads the most recent score when opening a sports app, stream resume automatically handles reconnection when swapping networks, and our integrations extend Ably into third-party clouds and systems like AWS Kinesis and RabbitMQ. With 25+ SDKs we target every major platform across web, mobile, and IoT.
Our platform is mathematically modeled around Four Pillars of Dependability so we’re able to ensure messages don’t get lost while still being delivered at low latency over a secure, reliable, and highly available global edge network.
Developers from startups to industrial giants choose to build on Ably because it lets them simplify engineering, minimize DevOps overhead, and increase development velocity.
See the in-depth Ably article, Everything you need to know about publish-subscribe, for further details on aspects of the publish-subscribe pattern.
Now that you know about the basics of pub/sub, find out more about
Also, read more about messaging design patterns and realtime technologies in general. Or jump in and try sending and receiving some messages with the Ably platform.
Previously published at https://www.ably.io/topic/pub-sub