How to Speed Test a CDN: An Introduction
by @productivityhacks


Too Long; Didn't Read

A CDN is just a massively distributed caching layer. There are 3 all-important metrics to measure: cold cache latency, hot cache latency, and warm cache latency. The cold cache is much slower than the hot cache, while the warm cache result is more ambiguous because it must be fetched some time after the cold and hot cache tests. On CloudFront, for example, I’m seeing a drop from 1.2 seconds on the cold cache to 0.232 seconds on the hot cache.


Since I’m going to be writing a whole bunch of articles about CDN speeds, I wanna consolidate my testing methodology into a single article. I’m going to be linking back to this article a lot.

Questions to be answered in this article:


  • What numbers are measured for CDN download speed?
  • Why were those numbers selected?
  • How are the numbers measured?
  • Where is the measurement done? Cloud server or home computer?


The three important measurements to make

At the end of the day, a CDN is really just a massively distributed caching layer. That’s all it is.


And if you’re familiar at all with how caching layers work, there are 3 all-important metrics to measure:


  • Cold cache latency — item is not in cache at all
  • Hot cache latency — item is in fastest response layer of cache
  • Warm cache latency — item is at a slower response layer of cache.

To illustrate this in hardware terms: a server responding from RAM gives the fastest possible latency. That’s a hot cache. Getting the item from disk (SSD) is the second fastest. That’s a warm cache, and you can add more, slower layers (HDD). Finally, the cold cache is when the server doesn’t have the item at all.
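
As a rough illustration of those layers, here is a toy Python sketch. The two dictionaries standing in for RAM and disk, and the file names, are made up for the example; a real CDN is far more complicated than this.

```python
# Toy model of a tiered cache: each tier is a dict, checked fastest-first.
# The tiers and their contents are illustrative assumptions.

def classify_lookup(key, ram, disk):
    """Return 'hot', 'warm', or 'cold' depending on where the key is found."""
    if key in ram:
        return "hot"    # fastest tier (RAM)
    if key in disk:
        return "warm"   # slower tier (SSD/HDD)
    return "cold"       # not cached at all, so it must come from the origin

ram = {"logo.png": b"..."}
disk = {"logo.png": b"...", "old-banner.png": b"..."}

for name in ("logo.png", "old-banner.png", "never-seen.png"):
    print(name, "->", classify_lookup(name, ram, disk))
```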


With regards to a CDN, how do we measure hot, cold, and warm cache?


Cold cache latency: You just fetch a file that the CDN has never seen before. This is going to take the longest time. On CloudFront, for example, I’m seeing 1.2 seconds on the cold cache, dropping to 0.232 seconds on the hot cache.


Hot cache latency: You can fetch the same file 10 times in a row and take the average.


Warm cache latency: Trickier. Much trickier. We must wait some amount of time for the file to leave the hot cache. For now, we are sticking with 30 minutes, but we may adjust this as we learn how CDNs move items from hot cache to warm cache. The warm cache result is also much more ambiguous, because the fetch has to happen some time after the cold and hot cache tests.
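
The three measurements can be sketched in Python roughly like this. It is a minimal sketch, not the actual test script: `fetch` is a hypothetical callable that downloads the URL and returns the total time in seconds, and the demo at the bottom feeds in made-up timings so it runs without a network or a 30-minute wait.

```python
import statistics
import time

def run_cache_test(fetch, url, hot_runs=10, warm_wait_s=30 * 60, sleep=time.sleep):
    """Measure cold, hot, and warm latency for one URL.

    `fetch(url)` is a hypothetical callable returning the total
    download time in seconds.
    """
    cold = fetch(url)                                         # first-ever request
    hot = statistics.mean([fetch(url) for _ in range(hot_runs)])
    sleep(warm_wait_s)                                        # let the file leave the hot cache
    warm = fetch(url)
    return {"cold": cold, "hot": hot, "warm": warm}

# Demo with made-up timings and a no-op sleep (both are assumptions).
fake_times = iter([1.2] + [0.1] * 10 + [0.23])
print(run_cache_test(lambda url: next(fake_times),
                     "https://cdn.example.com/file.png",
                     sleep=lambda seconds: None))
```

The cold fetch has to come first, and the URL must be one the CDN has never served before, otherwise the first number is not really a cold cache number.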

You can see that the cold cache is much slower than the hot cache by running this curl command twice in a row:


curl -w "@curl-format.txt" -o tmp -s "https://d3va53q3li7xt1.cloudfront.net/wp-content/uploads/2021/05/shoeb-1024x576.png"


This example uses CloudFront. There is a massive drop from 1+ seconds on the first run to 0.1–0.2 seconds on the second.
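
The contents of curl-format.txt are not shown in this article. As a sketch of the same idea in Python, the write-out format can be built inline from curl’s documented timing variables. Which variables to include, and the helper names `build_curl_command`, `parse_timings`, and `time_url`, are assumptions made for this example.

```python
import subprocess

# curl write-out variables; the selection here is an assumption.
TIMING_VARS = ["time_namelookup", "time_connect", "time_appconnect",
               "time_pretransfer", "time_redirect", "time_starttransfer",
               "time_total"]

def build_curl_command(url, out_path="tmp"):
    """Build a curl command that saves the body and prints one timing per line."""
    fmt = "".join(f"{name}: %{{{name}}}\n" for name in TIMING_VARS)
    return ["curl", "-s", "-o", out_path, "-w", fmt, url]

def parse_timings(output):
    """Turn 'name: value' lines from curl's write-out into a dict of floats."""
    timings = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() in TIMING_VARS:
            timings[name.strip()] = float(value)
    return timings

def time_url(url):
    """Run curl once and return its timings (needs curl on PATH and network access)."""
    result = subprocess.run(build_curl_command(url),
                            capture_output=True, text=True, check=True)
    return parse_timings(result.stdout)

print(build_curl_command("https://cdn.example.com/file.png"))
```

Calling `time_url` twice in a row on a fresh URL should show the same cold-then-hot drop as the curl command above.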


How to measure the 3 measurements: my open-sourced Python script (check GitHub)

I used a Python script to automate the measurements. This is important so that the measurements are consistent between CDNs. It also removes human error.


It is open-sourced here: https://github.com/speedtestdemon/speed-tests/blob/master/test.py.

There is a massive drop in CDN download time from the cold cache to the hot cache. CloudFront gave me 1.09 seconds for the cold cache, then 0.112 seconds for the hot cache, then 0.226 seconds for the warm cache.

Here is the full Python output. The cool thing is that it prints out the headers it got for the cold cache and the warm cache. The headers are a sanity check that the cold cache and warm cache tests are doing the correct thing.

-------------------------------------------------------------
Testing "Cold cache speed"
-------------------------------------------------------------
Got headers:
HTTP/2 200
content-type: image/png
content-length: 719983
date: Sun, 20 Jun 2021 21:47:06 GMT
last-modified: Mon, 07 Jun 2021 00:16:21 GMT
etag: "52ae2ff2354d4a68e680b77b4da58985"
accept-ranges: bytes
server: AmazonS3
x-cache: Miss from cloudfront
via: 1.1 c39432c353feb02b03735f3850e19107.cloudfront.net (CloudFront)
x-amz-cf-pop: IAH50-C1
x-amz-cf-id: NgkCqcrwb3K65LeGu7uhebFNODrNI9s8wVeHZ93lq2XKrE3q9PMm-A==

time_namelookup: 0.06327399999999999691
time_connect: 0.01762099999999999778
time_appconnect: 0.05910300000000001663
time_pretransfer: 0.00010199999999999099
time_redirect: 0.00000000000000000000
time_starttransfer: 0.95064099999999995827
time to download: 0.00008300000000005525
time_total: 1.09082400000000001583
-------------------------------------------------------------
Testing "Hot cache speed"
-------------------------------------------------------------
10 requests done.
Average:
time_namelookup: 0.00190460000000000000
time_connect: 0.02373660000000000006
time_appconnect: 0.06306589999999999419
time_pretransfer: 0.00019590000000000024
time_redirect: 0.00000000000000000000
time_starttransfer: 0.02317680000000000434
time to download: 0.00007860000000000089
time_total: 0.11215840000000001919
-------------------------------------------------------------
Testing "Warm cache speed"
-------------------------------------------------------------
Sleeping for 0.5 hr to move cache from hot to warm
Got headers:
HTTP/2 200
content-type: image/png
content-length: 719983
last-modified: Mon, 07 Jun 2021 00:16:21 GMT
accept-ranges: bytes
server: AmazonS3
date: Mon, 21 Jun 2021 00:09:46 GMT
etag: "52ae2ff2354d4a68e680b77b4da58985"
x-cache: Hit from cloudfront
via: 1.1 9b59bfec44582f64d3d8dac9fb7d27b7.cloudfront.net (CloudFront)
x-amz-cf-pop: DFW50-C1
x-amz-cf-id: hzwVoHfaHen2TR3cCNRsnwniXMc3_BaOWk7oa2DiQaWkioXqwSGRrg==

time_namelookup: 0.11359600000000000253
time_connect: 0.01640300000000000091
time_appconnect: 0.07292699999999999183
time_pretransfer: 0.00030900000000000372
time_redirect: 0.00000000000000000000
time_starttransfer: 0.02220300000000000051
time to download: 0.00011099999999999999
time_total: 0.22554899999999999949

Please note that metrics like time_namelookup here do not mean exactly the same thing as curl’s time_namelookup. curl reports cumulative times, so each value is always larger than the one before it. I want to look at how long each stage took on its own, so curl’s cumulative, ever-increasing timestamps were not helpful. The script prints per-stage durations instead, which is why the stages in the output above add up to time_total.
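
Here is a minimal sketch of that conversion from cumulative timers to per-stage durations. The stage order is curl’s order for a single HTTPS request; the sample numbers are invented, time_redirect is left out for simplicity, and the actual script may compute this differently.

```python
# Order in which curl's cumulative timers fire for one HTTPS request.
STAGES = ["time_namelookup", "time_connect", "time_appconnect",
          "time_pretransfer", "time_starttransfer"]

def per_stage(cumulative):
    """Convert curl's cumulative timers into how long each stage took."""
    durations = {}
    previous = 0.0
    for name in STAGES:
        durations[name] = cumulative[name] - previous
        previous = cumulative[name]
    # Whatever is left after the first byte arrives is the download itself.
    durations["time to download"] = cumulative["time_total"] - previous
    durations["time_total"] = cumulative["time_total"]
    return durations

# Invented cumulative values, in seconds, for illustration only.
example = {"time_namelookup": 0.06, "time_connect": 0.08,
           "time_appconnect": 0.14, "time_pretransfer": 0.14,
           "time_starttransfer": 1.09, "time_total": 1.10}

for name, seconds in per_stage(example).items():
    print(f"{name}: {seconds:.3f}")
```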

Common pitfalls of speed testing CDNs

These are some mistakes I made while speed-testing CDNs. These are easy mistakes to make, so I decided to write about them.

Mistake #1: cold cache tests should be done where the CDN does not have the file cached.

Why it’s an easy mistake to make: Most people think files are cached for only 24 hours. That’s not necessarily true. Some CDNs, like Jetpack CDN, evidently cache files for more than 1 day (based on my tests). Additionally, most people’s curl calls do not print the response headers, so they never see the warning sign that the file may already be cached.


How to fix: Check the headers returned. You should see the keyword “MISS” somewhere in there. If you see a “HIT,” that’s a warning sign. This is also why the python script prints out the headers for the cold cache test and warm cache test.
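
A small sketch of that header check in Python. It looks at the x-cache header, which is what CloudFront returned in the output above; other CDNs may use a different header name, so treat the header name as an assumption.

```python
def cache_status(header_lines, header_name="x-cache"):
    """Return 'miss', 'hit', or 'unknown' from raw response header lines."""
    for line in header_lines:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == header_name:
            value = value.lower()
            if "miss" in value:
                return "miss"
            if "hit" in value:
                return "hit"
    return "unknown"

cold_headers = ["HTTP/2 200",
                "content-type: image/png",
                "x-cache: Miss from cloudfront"]

status = cache_status(cold_headers)
if status != "miss":
    print("Warning: this was not a real cold cache test, got", status)
else:
    print("Cold cache test looks valid:", status)
```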

Mistake #2: you need to actually download the file.

Why it’s an easy mistake to make: for some reason, it is extremely easy to accidentally make curl skip the download of the actual file. There are at least 3 ways of doing this.

  1. If you set the “NOBODY” option to curl via pycurl, it does not download the file. I made this mistake in my python script.
  2. If you pass “-O /dev/null” to curl on the command line, it does not download the file. You would think it just downloads the file and dumps it to /dev/null. No. In my tests, curl was “smarter” than that and skipped the download altogether.
  3. If you set the “-I” flag via curl in the command line, it does not download the file.

Why this is a serious mistake to avoid: CloudFront does not cache the file if you do not download the file! And since CloudFront is an industry-standard CDN, it is likely other CDNs have the same behavior.


How you can detect this mistake: Look at the download time (time_total - time_starttransfer). If it is in the hundreds of microseconds, that’s a problem. Light travels only about 0.186 miles per microsecond. If you got only 200 microseconds, that is about 37 miles…roundtrip. Or under 19 miles one way. It is highly unlikely there’s a data center that close to where you are. It means that there was no network transfer.
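
The arithmetic can be written out as a short Python sketch, using 186,282 miles per second for the speed of light in a vacuum. The 1 millisecond threshold and the function names are assumptions chosen for the example.

```python
SPEED_OF_LIGHT_MILES_PER_SECOND = 186_282

def max_one_way_miles(round_trip_us):
    """Farthest a server could be if light made the round trip in this many microseconds."""
    miles_per_us = SPEED_OF_LIGHT_MILES_PER_SECOND / 1_000_000
    return round_trip_us * miles_per_us / 2

def looks_like_no_download(time_total, time_starttransfer, threshold_s=0.001):
    """Flag a download time (curl's cumulative values, in seconds) that is suspiciously short."""
    return (time_total - time_starttransfer) < threshold_s

print(f"{max_one_way_miles(200):.1f} miles one way")
print(looks_like_no_download(time_total=1.0908, time_starttransfer=1.0907))
```

Signals in real fiber travel slower than light in a vacuum, so the real limit is even tighter than this.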


How to fix: In pycurl, write the downloaded contents to a BytesIO or StringIO variable. To see how it’s done, see my Python CDN speed test script. Or if you’re using curl from the command line, avoid the “-O /dev/null” and “-I” flags and make sure the download time is at least several milliseconds.
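
For reference, here is a minimal pycurl sketch of writing the body into a BytesIO. It is not the code from the speed test script, and the helper names `fetch_body` and `body_is_complete` are made up for this example. pycurl is a third-party package, so the import is guarded to let the rest of the sketch load without it.

```python
from io import BytesIO

try:
    import pycurl  # third-party; only needed by fetch_body()
except ImportError:
    pycurl = None

def fetch_body(url):
    """Download `url` with pycurl and return (body_bytes, download_seconds)."""
    if pycurl is None:
        raise RuntimeError("pycurl is not installed")
    buffer = BytesIO()
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, url)
    curl.setopt(pycurl.WRITEDATA, buffer)  # keep the body, so curl must transfer it
    curl.perform()
    download_s = (curl.getinfo(pycurl.TOTAL_TIME)
                  - curl.getinfo(pycurl.STARTTRANSFER_TIME))
    curl.close()
    return buffer.getvalue(), download_s

def body_is_complete(body, content_length):
    """True if the number of bytes received matches the content-length header."""
    return len(body) == int(content_length)

# Offline demo with a fake 10-byte body.
print(body_is_complete(b"x" * 10, "10"))
```

For the CloudFront image used in this article, the body should come out at 719983 bytes, matching the content-length header in the output above.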

Mistake #3: The input URL is invalid.

Why it’s an easy mistake to make: Your curl speed test doesn’t report any 404 error. So you get back the speed test results without realizing there’s an error.


How to catch the mistake: One red flag to look at is the “download” time. If it’s less than 1 millisecond, that’s way too fast. Do the math. Light travels about 0.186 miles per microsecond, and there are 1000 microseconds in a millisecond. If you’re getting, let’s say, 200 microseconds, that’s only about 37 miles…roundtrip. One way, it’s under 19 miles. So 200 microseconds is way too fast and indicates there’s probably no network transfer (i.e., no download) happening at all.


Of course, the other thing you could do is actually download the file…use wget. You’ll see an error message pretty easily.


How to fix the mistake: Use the right URL.
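
Another way to catch a bad URL is to check the status line in the headers before trusting the timings. This is a sketch under the assumption that you have the raw header lines, like the ones the script prints; the function names are made up.

```python
def final_status(header_lines):
    """Return the HTTP status code from the last status line, or None if there is none."""
    code = None
    for line in header_lines:
        if line.startswith("HTTP/"):
            code = int(line.split()[1])  # e.g. "HTTP/2 200" -> 200
    return code

def is_successful(header_lines):
    """True only for a 2xx response."""
    code = final_status(header_lines)
    return code is not None and 200 <= code < 300

print(is_successful(["HTTP/2 200", "content-type: image/png"]))
print(is_successful(["HTTP/2 404", "content-type: text/html"]))
```

Taking the last status line means that a redirect followed by a 404 is still reported as a failure.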

Mistake #4: The input URL is HTTP, not HTTPS.

Why it’s an easy mistake to make: People are careless or perhaps not technical enough. This is actually an important thing to get right, as I’m noticing the SSL exchange normally takes about 50 milliseconds. That is a big share of the total, given that a fast curl time is only around 120 milliseconds (from a home computer, not a cloud provider).


How to fix the mistake: Be consistent in your testing. Use HTTPS for all or HTTP for all. And you should probably use HTTPS since that’s the standard (everybody uses HTTPS).
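
A quick guard for this in Python could look like the sketch below. The function name `check_same_scheme` and the second URL are made up for illustration.

```python
from urllib.parse import urlsplit

def check_same_scheme(urls, expected="https"):
    """Raise ValueError if any URL does not use the expected scheme."""
    bad = [url for url in urls if urlsplit(url).scheme != expected]
    if bad:
        raise ValueError(f"these URLs are not {expected}: {bad}")

urls = [
    "https://d3va53q3li7xt1.cloudfront.net/wp-content/uploads/2021/05/shoeb-1024x576.png",
    "https://other-cdn.example.com/shoeb-1024x576.png",  # hypothetical second CDN
]
check_same_scheme(urls)
print("all URLs use HTTPS")
```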

Why I chose not to test on a normal web server like DigitalOcean

It turns out it’s not a good idea to use DigitalOcean to run these CDN speed tests.


I initially thought it would be a good idea to run these CDN speed tests on a standardized server location like DigitalOcean. Since I travel a lot, I can’t guarantee my internet will always be the same quality, or the location could be drastically different, such as different countries.


So that’s what I did. I tried it out…and noticed DigitalOcean’s internet speed is fast. REALLY fast. Like, way faster than my home internet speed. I was a little shocked. I didn’t think download speeds could ever get that fast. It makes me wonder what I need to do to make my home internet as good as that, because I really thought I had the best possible internet that money could buy (I basically use my computer 100 hours a week, OK. It’s important to me).


If you wanna see an example of the drastic difference in CDN speed…well, here’s one. It’s using the CloudFront example that we’re using in the “Free CDN” blog series. When I ran my speed test python script with:


python3 test.py https://d3va53q3li7xt1.cloudfront.net/wp-content/uploads/2021/05/shoeb-1024x576.png


I got back a download time of a blazing fast 0.0225 seconds or 23 milliseconds. Holy cow. And then doing the same thing on my home internet, I only get 0.2315 seconds or 232 milliseconds. What the hell.


Out of curiosity, I ran an internet speed test on the DigitalOcean server. Like how the hell is it that much faster. That’s gotta get anybody’s interest up. And would you believe the numbers???

# speedtest-cli
Retrieving speedtest.net configuration...
Testing from DigitalOcean (128.199.187.118)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by NewMedia Express (Singapore) [13.17 km]: 2.16 ms
Testing download speed................................................................................
Download: 1740.52 Mbit/s
Testing upload speed......................................................................................................
Upload: 1420.29 Mbit/s

LOL, holy crap, Download speeds of 1740.52 Mbit/s and Upload speeds of 1420.29 Mbit/s? How much money did DigitalOcean pay for that? I wanna buy it.


That is literally 10x as fast as my internet, which tops out at 160 Mbit/s (using fast.com).

So there are several reasons why I decided not to run these speed tests on DigitalOcean after all:

  • It’s hard to tell the difference between good and bad CDNs when every download is that fast.
  • It’s not a real-world usage of how CDNs actually get used. Almost nobody will have internet speeds that fast, where it’s nearly 10x faster than the most premium internet you can get.


BTW - a relevant article that also talks about the distorted internet speeds of cloud providers: how accurate are CDNPerf’s numbers? Answer: not very accurate. TODO: I will link to this essay (rant) once I’ve written it!

Why do the CDN speed tests need to be done almost at the same time

This is probably obvious to most internet users, but because I’m trying to measure CDNs as methodically as possible, I’ll explain it briefly anyway.


Internet speeds usually vary greatly depending on the time of day. For example, during the evenings, there is typically massive internet congestion as people get off work and have leisure time. Video streaming is particularly notorious for hogging bandwidth. For example, did you know Netflix alone hogs 40% of internet traffic during the evenings? Google it. And that’s just Netflix. Imagine if you added YouTube too. The evenings just get really congested with internet traffic.


And since we are focused on CDNs as the variable, we should make internet conditions invariant. This means running the CDN speed test for 2 CDNs at basically the same time (let’s say within 1 minute is fine).
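
One way to enforce that window is sketched below: run the measurements back to back and complain if the start times drift too far apart. `measure` is a hypothetical callable that tests one URL and returns its time in seconds, and the 60 second limit is just the “within 1 minute” rule from above.

```python
import time

def run_back_to_back(measure, urls, max_spread_s=60, clock=time.monotonic):
    """Measure each URL in turn and check that all tests started close together."""
    starts = []
    results = {}
    for url in urls:
        starts.append(clock())
        results[url] = measure(url)
    spread = max(starts) - min(starts)
    if spread > max_spread_s:
        raise RuntimeError(f"tests started {spread:.0f}s apart, rerun them closer together")
    return results

# Demo with a fake measurement and made-up URLs.
print(run_back_to_back(lambda url: 0.1,
                       ["https://cdn-a.example.com/f.png",
                        "https://cdn-b.example.com/f.png"]))
```

Running the tests one after another, instead of in parallel, keeps the two downloads from competing for the same home bandwidth.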

Summary (TL;DR)

  • I measure three CDN download speeds: cold, hot, and warm cache.
  • The measurement is automated with a python script (it’s open-sourced, see it here: https://github.com/speedtestdemon/speed-tests/blob/master/test.py)
  • CDN speed measurements on 2 or more CDNs are to be conducted within the space of a few minutes.
  • CDN speed measurement is conducted on a home computer with home internet, not on the cloud where internet speeds are 10x home internet speeds (easily).

Questions? You can ask me on social media https://twitter.com/SpeedTestDemon


Also published on https://speedtestdemon.com/testing-methodology/.