Whether it’s a World Cup match, the Super Bowl, or the French Open finals, watching it with your friends on a Saturday night is #goals. Sadly, not all of us can get tickets and travel across cities, countries, or continents to attend these events. Thankfully, live streaming makes it possible to watch all the action, close to real-time. The only question is: “how close to real-time are we talking?”

Video streaming is largely facilitated on the back of a video protocol called HLS (HTTP Live Streaming). While the origins and fundamentals of HLS are explained in another piece on our blog, the current piece will focus on how HLS resolved one of its greatest shortcomings: latency.

To start with, let’s take a quick peek at how HLS works.

The Way of the HLS

We will first try to understand how HLS works and makes live streaming possible. This is what the typical flow of an HLS streaming system looks like:

1. The audio/video stream captured by input devices is encoded and ingested into a media server.
2. The media server transcodes the stream into an HLS-compatible format with multiple ABR variants and also creates a playlist file to be used by the video players.
3. Then, the media server serves the media and the playlist file to the clients, either directly or via CDNs, by acting as an origin server.
4. The players, on the client end, make use of the playlist file to navigate through the video segments. These segments are typically “slices” of the video being generated, with a definite duration (called segment size, usually 2 to 6 seconds).
5. The playlist is refreshed based on segment size, and players can select the segments specified in it, based on the order of playback and the video quality they require.

Even though HLS offers a reliable way of streaming video, its high latency levels may pose obstacles and issues for many streamers or video distributors. According to the initial HLS specification, a player should load media files in advance before playing them. This makes HLS an inherently higher-latency protocol, with a latency of about 30 to 60 seconds.
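Where does that 30-to-60-second figure come from? A back-of-the-envelope sketch, assuming (as a common rule of thumb, not a normative requirement) that a player buffers about three segments before playback starts:

```python
# Rough model: player-side HLS latency scales with segment size.
# The "three buffered segments" figure is an assumption for illustration.

def approx_player_latency(segment_size_s: float, buffered_segments: int = 3) -> float:
    """Rough player-side latency contribution, in seconds."""
    return segment_size_s * buffered_segments

for size_s in (10, 6, 2):
    print(f"{size_s}s segments -> about {approx_player_latency(size_s):.0f}s buffered before playback")
```

With the original 10-second segments, three buffered segments alone account for roughly 30 seconds; encoding, transcoding, and CDN delivery push the total toward the upper end of the range.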
Tuning HLS for Low Latency

Everyone was interested in implementing HLS, but the high latency was a serious roadblock. So, devs and enthusiasts started to find workarounds to reduce latency and refine the protocol for effective usage. Some of these practices offered such positive results that they became a silent standard alongside the HLS specification. Two of these practices are listed below.

Reducing the default segment size

When Apple introduced HLS, the typical segment size was 10 seconds. Most implementers found it too long, because of which Apple decided to reduce it to 6 seconds. The overall latency can be reduced by reducing the segment size and the buffer size of the player.

However, this carries some issues, including increased overall bitrate, or buffering and jitter for devices with inferior network conditions. The ideal segment size should be decided based on the target audience and could be in the range of 2 to 4 seconds.

Media Ingest with faster protocols

The main reason HLS is used for live streaming is the scalability, reliability, and player compatibility it provides across all platforms, especially when compared to other protocols. This has made HLS irreplaceable for video delivery so far.

But the first-mile contribution (also known as ingest) from the HLS stack can be replaced with lower-latency protocols to reduce overall latency.

The HLS ingest is usually replaced by RTMP ingest, which enjoys wide support across encoders and services and has proved to be a cost-effective solution. The stream ingested with RTMP is then transcoded to HLS with the help of a media server before serving the content. Even though there have been experiments with other protocols such as WebRTC and SRT for the ingest part, RTMP remains the most popular option.
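The RTMP-in / HLS-out setup described above is commonly wired together with ffmpeg. The sketch below assembles such a command from Python; the ingest URL, output path, and chosen segment length are hypothetical placeholders, and this is an illustrative sketch rather than a tuned production pipeline:

```python
# Sketch: drive an RTMP -> HLS transcode with ffmpeg from Python.
# URLs and file names below are hypothetical examples.
import subprocess

def build_ffmpeg_cmd(rtmp_url: str, out_playlist: str, segment_s: int = 4) -> list[str]:
    """Assemble an ffmpeg command that ingests RTMP and emits HLS."""
    return [
        "ffmpeg",
        "-i", rtmp_url,                    # first-mile ingest over RTMP
        "-c:v", "libx264", "-c:a", "aac",  # transcode to HLS-compatible codecs
        "-f", "hls",
        "-hls_time", str(segment_s),       # segment size in seconds
        "-hls_list_size", "6",             # keep a rolling live playlist
        out_playlist,
    ]

cmd = build_ffmpeg_cmd("rtmp://ingest.example.com/live/streamkey", "live.m3u8")
# subprocess.run(cmd, check=True)  # uncomment to actually run ffmpeg
print(" ".join(cmd))
```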
The Evolution of HLS to LL-HLS

The latency in HLS started posing a significant hurdle, leading to less-than-stellar user experiences. This was becoming more frequent as HLS was being widely adopted around the world. Tuning HLS wasn’t enough, and everyone was looking for better and more sustainable solutions.

It was in 2016 that Twitter’s Periscope engineering team made some major changes to their implementation in order to achieve low latency with HLS. This proprietary version of HLS, often referred to as LHLS, offered latency of 2 to 5 seconds.

DASH, the main competitor of HLS, came up with a low-latency solution based on chunked CMAF in 2017, following which a community-based low-latency solution (L-HLS) was drafted in 2018. This variant was heavily inspired by Periscope’s LHLS and leveraged Chunked Transfer Encoding (CTE) to reduce latency. It is often referred to as Community Low Latency HLS (CL-HLS).

While this version of HLS was gaining popularity, Apple decided to release their own extension of the protocol, called Low Latency HLS (LL-HLS), in 2019. This is often referred to as Apple Low Latency HLS (ALHLS). This version of HLS offered low latency comparable to CL-HLS and promised compatibility with Apple devices. Since then, LL-HLS has been merged into the HLS specification, and the two have technically become a single protocol.

How LL-HLS reduces Latency

In this section, we’ll explore the changes LL-HLS brings to HLS, making low latency streaming possible. The protocol came with 2 main changes in the spec, responsible for its low latency nature. One is to divide the segments into parts and deliver them as soon as they’re available. The other is to inform the player about the data to be loaded next, before said data is even available.

Dividing Segments into Parts

The video segments are further divided into parts (similar to chunks used in CMAF). These parts are just “smaller segments” with a definite duration, represented with the EXT-X-PART tag in the media playlist.
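Some rough numbers show why parts shrink the player-side buffer. The 6-second segment and 200-millisecond part sizes below match the example playlist later in this article; the “three items buffered” figure is an assumption for illustration:

```python
# Sketch: how many parts one segment becomes, and what that means for
# the amount of media a player has to buffer before it can start.

def parts_per_segment(segment_size_s: float, part_size_s: float) -> int:
    """How many parts a single segment is published as."""
    return round(segment_size_s / part_size_s)

SEGMENT_S = 6.0  # segment size from the article's example
PART_S = 0.2     # part size from the article's example

print(parts_per_segment(SEGMENT_S, PART_S))  # 30 parts per segment
print(f"3 buffered segments hold {3 * SEGMENT_S:.1f}s of media")
print(f"3 buffered parts hold {3 * PART_S:.1f}s of media")
```

A player that fills its buffer with a handful of 200 ms parts is holding well under a second of media, instead of many seconds’ worth of full segments.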
The players can fill up their buffer more efficiently since the parts are published while the segment is still being generated. Reducing the buffer size on the player side using this approach results in reduced latency. The parts are then collectively replaced with their respective segments upon completion, and the segments remain available for a longer period of time.

Preload Hints

When LL-HLS was first introduced, it had HTTP/2 push specified as a requirement on the server side for sending new data to clients. Many commercial CDN providers did not support this feature at the time, which resulted in a lot of confusion.

This issue was addressed by Apple in a subsequent update, replacing HTTP/2 push with preload hints. They decided to include support for preload hints by adding a new EXT-X-PRELOAD-HINT tag to the playlist, reducing overhead.

With the help of a preload hint, a video player can anticipate the data to be loaded next and can send a request to the URI from the hint to gain faster access to the next part/data. The servers should block all requests for the preload-hint data and return them as soon as the data becomes available, thus reducing latency.

A look at the LL-HLS Media Playlist

Now, let’s take a look at how these tags are specified in the media playlist file, using an example. We will assume the segment size to be 6 seconds and the part size to be 200 milliseconds. We will also assume that 2 segments (segment A and segment B) have been completely played, while the 3rd (segment C) is still being generated. Segment C is being published as a list of parts, in the order of playback, because it has not yet been completed.

The following is a sample media playlist (M3U8 file).
```
#EXTM3U
# Other tags
#
# The following tags are used for accessing the sequences
# that are completely generated and can be loaded without any
# delay.
# These segments are specified by their duration, followed
# by a unique URI.
#EXTINF:6.0,
fileSegmentA.mp4
#EXTINF:6.0,
fileSegmentB.mp4
#
# The following tags are used for accessing the parts of
# the segment currently being generated.
# These parts are specified by their duration, followed
# either by a unique URI for that specific part or the segment
# URI with a byte-range.
#EXT-X-PART:DURATION=0.200,URI="filePartC.0.mp4"
#EXT-X-PART:DURATION=0.200,URI="filePartC.1.mp4"
# or
#EXT-X-PART:DURATION=0.200,URI="fileSegmentC.mp4",BYTERANGE=20000@21000
#
# The following tag is used to inform the player about the
# most likely part to be fetched next, before it becomes available,
# to be used for playback.
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="filePartC.2.mp4"
```

Players that don’t support LL-HLS yet tend to ignore tags like EXT-X-PART and EXT-X-PRELOAD-HINT, enabling them to treat the playlist with the traditional HLS approach and load segments at a higher latency.

Low-Latency HLS on non-Apple devices

The new and improved HLS has a latency of about 3 seconds or less. The only reasonable competition for this protocol is LL-DASH. But Apple does not support DASH on its devices. This makes LL-HLS the only low latency live streaming protocol with wide client-side support, including Apple devices.

One of the main advantages of using LL-HLS is its backward compatibility with legacy players. Players that don’t support this variant may fall back to standard HLS and still work, at a higher latency. Since the protocol requires players to start loading unfinished media segments instead of waiting until they become fully available, the changes in the HLS spec made it difficult to adapt quickly for all players.

It took a while for most non-Apple devices to start supporting LL-HLS. Now, it is widely supported across almost all platforms with relatively newer versions of players. Even though some of them have been planning support for the protocol since its inception, most implementations are new and are improving their compatibility at the moment.
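That backward compatibility comes from a rule HLS has had from the start: clients must ignore tags they don’t recognize. The sketch below mimics a hypothetical legacy parser that only knows #EXTM3U and #EXTINF, so the LL-HLS-specific lines in a playlist simply fall through (real players are far more complete than this):

```python
# Sketch: a legacy parser that predates LL-HLS skips unknown tags,
# so EXT-X-PART and EXT-X-PRELOAD-HINT lines are ignored harmlessly.

LEGACY_KNOWN_TAGS = ("#EXTM3U", "#EXTINF")

def visible_to_legacy_player(playlist: str) -> list[str]:
    """Return the lines a parser with no LL-HLS support would act on."""
    kept = []
    for line in playlist.splitlines():
        if line.startswith("#"):
            if line.startswith(LEGACY_KNOWN_TAGS):
                kept.append(line)  # recognized tag
            # unknown tags such as #EXT-X-PART are skipped
        elif line:
            kept.append(line)      # segment URI
    return kept

playlist = "\n".join([
    "#EXTM3U",
    "#EXTINF:6.0,",
    "fileSegmentA.mp4",
    '#EXT-X-PART:DURATION=0.200,URI="filePartC.0.mp4"',
    '#EXT-X-PRELOAD-HINT:TYPE=PART,URI="filePartC.2.mp4"',
])
print(visible_to_legacy_player(playlist))
# ['#EXTM3U', '#EXTINF:6.0,', 'fileSegmentA.mp4']
```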
Here are some popular players from different platforms that support LL-HLS in its entirety:

- AVPlayer (iOS)
- ExoPlayer (Android)
- THEOplayer
- JW Player
- HLS.js
- Video.js
- AgnoPlay

Comparing LL-HLS, LL-DASH and WebRTC

Here, we compare the three protocols LL-HLS, LL-DASH, and WebRTC on six parameters: compatibility, delivery method, support for ABR, security, latency, and best use case.

Compatibility

LL-HLS provides good support for all Apple devices and browsers, and it has been gaining support on most non-Apple devices. LL-DASH supports most non-Apple devices and browsers but is not supported on any Apple device. WebRTC is supported across all popular browsers and platforms.

Delivery Method

First, let’s go through a few relevant terms used with CMAF.

Chunked Encoding (CE) is a technique used for making publishable “chunks”. When added together, these chunks create a video segment. Chunks have a set duration and are the smallest unit that can be published.

Chunked Transfer Encoding (CTE) is a technique used to deliver the “chunks” in sequential order, as they are created. With CTE, one request for a segment is enough to receive all its chunks, and the transmission ends once a zero-length chunk is sent. This method allows even small chunks to be used for transfer.

LL-HLS uses Chunked Encoding to create “parts” or “chunks” of a segment. But instead of using Chunked Transfer Encoding, the protocol uses its own method of delivering chunks over TCP: the client has to make a request for every single part, instead of requesting the whole segment and receiving it in parts.

LL-DASH uses Chunked Encoding for creating chunks and Chunked Transfer Encoding for delivering them over TCP.

WebRTC uses the Real-time Transport Protocol (RTP) for sending video and audio streams over UDP.
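To make the zero-length-chunk detail concrete, here is the HTTP/1.1 chunked transfer coding frame format that CTE relies on: each chunk is sent as a hexadecimal length line, the chunk bytes, and a CRLF, and the stream ends with a zero-length chunk. The payload bytes below are made up:

```python
# Sketch: frame byte chunks as an HTTP/1.1 chunked message body.

def chunked_encode(chunks: list[bytes]) -> bytes:
    """Frame a list of byte chunks as an HTTP/1.1 chunked message body."""
    body = b""
    for chunk in chunks:
        body += f"{len(chunk):x}\r\n".encode() + chunk + b"\r\n"
    return body + b"0\r\n\r\n"  # the zero-length chunk ends the transfer

print(chunked_encode([b"part-0", b"part-1"]))
# b'6\r\npart-0\r\n6\r\npart-1\r\n0\r\n\r\n'
```

This is what lets a server start sending a segment’s early chunks before the later ones exist: the total length never has to be declared up front.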
Support for Adaptive Bitrate (ABR)

Adaptive Bitrate (ABR) is a technique for dynamically adjusting the compression level and video quality of a stream to match bandwidth availability. It heavily impacts the video streaming experience for the viewer.

LL-HLS has support for ABR. LL-DASH has support for ABR. WebRTC doesn’t support ABR, but a similar technique called Simulcast is used for dynamically adjusting video quality.

Security

Both LL-HLS and LL-DASH support media encryption and benefit from security features such as token authentication and digital rights management (DRM). WebRTC supports end-to-end encryption of media, along with transfer, user, file, and round-trip authentication. This is often sufficient for DRM purposes.

Latency

Both LL-HLS and LL-DASH have a latency of 2 to 5 seconds. WebRTC, on the other hand, has sub-second latency of ~500 milliseconds.

Best Use Case

Both LL-HLS and LL-DASH are best suited for live streaming events that need to be delivered to millions of viewers. They are often used for streaming sporting events live. WebRTC is very frequently used for solutions such as video conferencing that require minimal latency and are not expected to scale to a big number of viewers.

Now that HLS supports low latency streaming, it is all set to conquer the video streaming space, ready to serve millions of fans watching their favourite team play a crucial match without any issues. Whether you want to start live streaming yourself or build an app that facilitates live streaming, LL-HLS remains your best friend.