Occasionally, I get asked very specific questions about SocketCluster’s (or SCC’s) scalability. The answers to those questions tend to boil down to a few simple rules of thumb which I would like to discuss in this article.
As with any distributed system, achieving unlimited scalability requires adhering to certain constraints. SC was designed to be a front-end-facing pub/sub system, and so its scalability constraints reflect that. As opposed to back-end-facing message queues such as RabbitMQ, ZeroMQ, Kafka and NSQ, which aim to support a limited number of unlimited-throughput channels, SC is designed to support an unlimited number of limited-throughput channels.
In SC, the only hard limit to scalability is the number of messages that can be published to a specific channel each second. That number varies greatly depending on implementation and concurrency levels but, as a very rough guideline, a channel can handle between 10K and 20K published messages per second on a decent machine (assuming a proportional number of concurrent sockets; note that throughput may be several times higher at lower concurrency levels).
In practice, what this means is that in SC, you should aim to spread your publish operations over as many channels as possible. Note that there is no theoretical limit when it comes to the total number of subscribers that you can have for any given channel and there is also no theoretical limit when it comes to the total number of channels that you can have in your system. The only limit that you need to consider is the maximum number of publishes on a per-channel basis.
If you keep in mind that SC is intended for front-end use, this is generally not a problem. It would be rare to come across a scenario where each end user in your system would want to consume more than 1K messages per second (on the front-end)… Even fast-paced multiplayer games rarely exceed 30 messages per second per user.
When thinking about scalability in SC, the only question that you need to ask yourself is whether or not the total number of publish operations in your system is directly proportional to the total number of users in your system. If so, you need to be aware of how those messages will flow through the system.
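As a back-of-envelope sanity check, you can estimate whether a given channel design stays within the per-channel publish budget. This is just a sketch: the `channelWithinBudget` helper and the 10K figure are illustrative (the conservative end of the rough guideline above), not SC constants.

```javascript
// Rough per-channel publish budget, taken from the conservative end of the
// guideline above. This is an assumption for illustration, not an SC constant.
const MAX_PUBLISHES_PER_CHANNEL_PER_SECOND = 10000;

// Returns true if a channel with `publishersPerChannel` publishers, each
// publishing `messagesPerPublisherPerSecond` messages per second, stays
// within the rough per-channel budget.
function channelWithinBudget(publishersPerChannel, messagesPerPublisherPerSecond) {
  return publishersPerChannel * messagesPerPublisherPerSecond <=
    MAX_PUBLISHES_PER_CHANNEL_PER_SECOND;
}

// A fast-paced game room with 100 players at 30 messages/sec each is fine:
console.log(channelWithinBudget(100, 30)); // true (3K <= 10K)

// One global channel where a million users each publish once per second is not:
console.log(channelWithinBudget(1000000, 1)); // false
```

The key observation is that the check only looks at publishers, not subscribers: adding subscribers to a channel does not eat into this budget.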
To help you think about this, here are two extreme scenarios which are fully supported by SCC, assuming that you can spin up enough hosts/machines (note that x can be practically any number):

- x million users, each subscribed to their own unique channel, with each channel receiving only a small number of published messages per second.
- x million users, all subscribed to a single shared channel, with that channel receiving a bounded number of published messages per second (within the 10K–20K guideline above) regardless of x.
Every scenario between those two extremes is also fully supported provided that you have enough machines.
Here is an extreme scenario which is not supported by SCC (regardless of how many hosts/machines you spin up):

- x million users, all subscribed to a single channel, with each user also publishing to that same channel, so the channel receives messages at a rate proportional to x.
While the scenario above might work with RabbitMQ or Kafka, it is not supported in SC because it exceeds SC’s fundamental limit of publishes per second per channel.
If you think about the unsupported scenario above and consider that SC is a front-end facing system, you should see that this publish constraint is actually not a problem at all. Imagine being an end-user and seeing a chat-box that is bound to a specific channel and rendering messages at a rate of x million messages per second… It wouldn’t make any sense from a UX perspective.
If you do happen to need really high publish throughput for a ‘single channel’ in SC, you can always increase capacity by sharding your channel and using a hash function to distribute data evenly between the shards, e.g. my-channel/1, my-channel/2, my-channel/3…
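A minimal sketch of that sharding idea: hash a stable key (such as a user ID) onto one of the shard channel names so that publishes are spread evenly. The `shardChannel` helper, the hash function and the shard count here are all hypothetical, not part of the SC API.

```javascript
// Number of shards; pick this based on your expected total publish rate
// divided by the per-channel budget (hypothetical value for illustration).
const NUM_SHARDS = 4;

// Simple deterministic string hash (djb2-style). Any stable hash works here.
function hashString(str) {
  let hash = 5381;
  for (let i = 0; i < str.length; i++) {
    hash = ((hash * 33) ^ str.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Maps a stable key to one of my-channel/1 … my-channel/NUM_SHARDS, so the
// same key always lands on the same shard.
function shardChannel(baseName, key, numShards) {
  const shardIndex = (hashString(key) % numShards) + 1;
  return baseName + '/' + shardIndex;
}

// The same key always maps to the same shard channel:
console.log(shardChannel('my-channel', 'user-123', NUM_SHARDS));
```

A publisher would then publish to `shardChannel('my-channel', userId, NUM_SHARDS)` instead of to `my-channel` directly, and any consumer that needs the full stream subscribes to all of the shard channels.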
Note that although systems like RabbitMQ and Kafka can support extremely high throughput on a per-channel (queue/topic) basis, the bottleneck tends to be the number of new unique channels that you can create per second; this is an area where SC shines because it has no such limit on channel creation.
You have to think of each channel in SC as being like a water pipe: there is only so much water that you can push through any given pipe, but you can make sure that everyone gets enough water by simply adding more pipes.