Srushtika Neelakantam

@n.srushtika

Everything you ever wanted to know about WebSockets, literally!

We are knee-deep in the real-time world by this point, with so many applications working with live data. It’s high time there was an explanation of all the events leading up to this point from a technological standpoint. So, here goes…

These days, applications are moving from utilizing stale data from a database, or data that’s created on the fly following an event trigger, to a live experience that follows real-world events. The first thing we think of when it comes to realtime applications is ‘WebSockets’. But, in spite of a lot of people constantly tossing around this term in technological circles, there actually seem to be huge misconceptions about what it means and how it works.

Let’s bust the jargon and understand what’s happening!

HTTP -> Long Polling -> WebSockets

Back in the day, HTTP’s stateless request-response mechanism worked perfectly well for the use cases of the time, letting any two nodes communicate over the internet. Since every request was self-contained, even if the connection dropped, you could easily pick the communication back up by sending the next request over a fresh connection.

However, applications soon started moving to realtime implementations, that is, sharing data with minimal latency just as it is created in the real world, and the traditional request-response cycles turned out to be a huge overhead. Why? High-frequency request-response cycles led to more latency, since each of these cycles typically required a new connection to be set up.
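To put a rough number on that overhead, here is a minimal sketch that counts how many polling cycles come back with nothing new. The function name, the event timestamps and the polling interval are all made up for illustration; nothing here measures a real network.

```python
def short_poll_stats(event_ms, interval_ms, duration_ms):
    """Count polls and empty responses for a client polling at a fixed interval.

    All times are integer milliseconds. `event_ms` lists the (hypothetical)
    moments at which the server produced new data.
    """
    polls = 0
    empty = 0
    last_poll = 0
    for now in range(interval_ms, duration_ms + 1, interval_ms):
        polls += 1
        # A poll is useful only if some event happened since the previous poll.
        if not any(last_poll < e <= now for e in event_ms):
            empty += 1
        last_poll = now
    return polls, empty


if __name__ == "__main__":
    # Two events in ten seconds, polled once per second.
    print(short_poll_stats([2500, 7200], 1000, 10000))  # (10, 8)
```

In this made-up run, eight of the ten request-response cycles carry no data at all, yet each one still pays for the round trip.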

Logically, the next step would be a way to minimize these cycles for the same amount of data flow. Solution? Long polling!

With long polling, the underlying TCP socket connection is kept open for a little longer than usual: instead of answering right away, the server holds on to the request until it has something to send. This not only gave the server an opportunity to collate more than one piece of data into a single response rather than sending individual responses, but it also almost completely eliminated empty responses caused by a lack of data, as the server could now simply respond whenever it actually had some data to give back.
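The idea of holding a request open can be sketched without any networking at all. In the snippet below, which is only an illustration, a `queue.Queue` stands in for the data the server is waiting on, and `long_poll` is a hypothetical handler name, not part of any real framework.

```python
import queue
import threading


def long_poll(mailbox, timeout):
    """Hold the 'request' open until data arrives or the timeout expires."""
    try:
        # Block here instead of replying immediately with an empty response.
        first = mailbox.get(timeout=timeout)
    except queue.Empty:
        return []  # timed out; a real client would simply ask again
    batch = [first]
    # Collate anything else already waiting into the same response.
    while True:
        try:
            batch.append(mailbox.get_nowait())
        except queue.Empty:
            return batch


if __name__ == "__main__":
    mailbox = queue.Queue()
    # Simulate new data showing up 100 ms after the request was made.
    threading.Timer(0.1, mailbox.put, args=("price: 101",)).start()
    print(long_poll(mailbox, timeout=2.0))  # ['price: 101']
```

The client still has to issue a fresh request after every response, which is exactly the remaining cost discussed next.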

But even the long polling technique involved connection setup and frequent request-response cycles, much like traditional HTTP, so of course it still led to more latency.

For most realtime applications, the speed of data, down to the millisecond, is absolutely critical, hence neither of the above options sounds useful.

What then?

Since I started off the article with the mention of WebSockets, you obviously would have guessed what I was getting at.

So, WebSockets, unlike HTTP, is a stateful communications protocol. Like HTTP, it works over TCP.

The communication initially starts off as an HTTP handshake, but if both of the communicating parties agree to continue over WebSockets, the connection is simply upgraded, giving rise to a full-duplex, persistent connection. This means the connection remains open for as long as the application is running. This gives the server a way to initiate communication and send data to pre-subscribed clients, so they don’t have to keep sending in requests inquiring about the availability of new data.
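For the curious, here is roughly what the server side of that handshake looks like. The article itself doesn’t go into these details; they come from the WebSocket specification (RFC 6455), which has the client send a random `Sec-WebSocket-Key` header and the server prove it understood by hashing that key together with a fixed GUID. The function names below are my own, and this is a sketch of the reply only, not a working server.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(client_key):
    """Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(client_key):
    """Build the HTTP reply that switches the connection over to WebSockets."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(client_key)}\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    # Sample key taken from the example in RFC 6455.
    print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

Once the client receives the `101 Switching Protocols` reply, the same TCP connection stops speaking HTTP and both sides can send WebSocket frames whenever they like.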

There’s actually a lot more happening under the hood of realtime applications than what I have summarized in this article. You can find the full article linked below, which talks about the history of the internet protocols, the people behind their creation, the motivations behind conceptualizing these protocols, open source solutions that you can implement for free, and how to take it to the next level by adding scalability. You should definitely give it a read!
