Since I’m going to be writing a whole bunch of articles about CDN speeds, I wanna consolidate my testing methodology into a single article. I’m going to be linking back to this article a lot.
Questions to be answered in this article:
At the end of the day, the CDN is really just a caching layer. All it is is just a massively distributed caching layer. That’s it. That’s all it is.
And if you’re familiar at all with how caching layers work, there are 3 all-important metrics to measure: hot cache latency, warm cache latency, and cold cache latency.
To illustrate this in hardware terms: the server responding from RAM is the fastest possible latency. That’s a hot cache. Getting it from disk (SSD) is the second-fastest possible latency. That’s a warm cache, and you can add more layers (HDD). Finally, the cold cache is when the server doesn’t have the item at all.
With regards to a CDN, how do we measure hot, cold, and warm cache?
Cold cache latency: You just fetch a file that the CDN has never seen before. This is going to take the longest time. On CloudFront, for example, I’m seeing a drop from 1.2 seconds to 0.232 seconds in the hot cache.
Hot cache latency: You can fetch the same file 10 times in a row and take the average.
Warm cache latency: Trickier. Much trickier. We must wait some amount of time for the file to leave the hot cache. For now, we are sticking with 30 minutes, but we may adjust this as we learn how CDNs move items from hot cache to warm cache. The warm cache result is also much more ambiguous, since it has to be measured at a different time from the cold and hot cache tests.
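To make the three measurements concrete, here is a minimal Python sketch of the sequence described above. It is not the actual test script: `run_cache_tests`, `fetch`, and the `sleep` parameter are hypothetical names, and `fetch` is assumed to perform one full download and return the total time in seconds.

```python
import time
from statistics import mean


def run_cache_tests(fetch, hot_runs=10, warm_wait_s=30 * 60, sleep=time.sleep):
    """Measure cold, hot, and warm cache latency with a caller-supplied fetch().

    fetch is assumed to download the file once and return the elapsed seconds.
    """
    # Cold: the very first request for a file the CDN has never seen.
    cold = fetch()
    # Hot: repeat immediately and average the runs.
    hot = mean(fetch() for _ in range(hot_runs))
    # Warm: wait for the file to (hopefully) leave the hot cache, then fetch once.
    sleep(warm_wait_s)
    warm = fetch()
    return {"cold": cold, "hot": hot, "warm": warm}
```

Passing `sleep` as a parameter keeps the 30 minute wait easy to shorten while experimenting.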
You can see the cold cache is much slower than the hot cache by running this curl:
    curl -w "@curl-format.txt" -o tmp -s "https://d3va53q3li7xt1.cloudfront.net/wp-content/uploads/2021/05/shoeb-1024x576.png"
This example uses CloudFront. There is a massive jump from 1+ seconds to 0.1–0.2 seconds.
I used a Python script to automate the measurements. This is important so that the measurements are very consistent between CDNs. It also removes human error.
It is open-sourced here: https://github.com/speedtestdemon/speed-tests/blob/master/test.py.
There is a massive drop in CDN download time from the cold cache to the hot cache. CloudFront gave me 1.09 seconds for cold cache, then 0.112 seconds for hot cache, then 0.226 for the warm cache.
Here is the full python output. The cool thing is that it prints out the headers it got for the cold cache and the warm cache. The headers are used to sanity check that the test is doing the correct thing regarding cold cache and warm cache test.
    -------------------------------------------------------------
    Testing "Cold cache speed"
    -------------------------------------------------------------
    Got headers:
    HTTP/2 200
    content-type: image/png
    content-length: 719983
    date: Sun, 20 Jun 2021 21:47:06 GMT
    last-modified: Mon, 07 Jun 2021 00:16:21 GMT
    etag: "52ae2ff2354d4a68e680b77b4da58985"
    accept-ranges: bytes
    server: AmazonS3
    x-cache: Miss from cloudfront
    via: 1.1 c39432c353feb02b03735f3850e19107.cloudfront.net (CloudFront)
    x-amz-cf-pop: IAH50-C1
    x-amz-cf-id: NgkCqcrwb3K65LeGu7uhebFNODrNI9s8wVeHZ93lq2XKrE3q9PMm-A==
    time_namelookup: 0.06327399999999999691
    time_connect: 0.01762099999999999778
    time_appconnect: 0.05910300000000001663
    time_pretransfer: 0.00010199999999999099
    time_redirect: 0.00000000000000000000
    time_starttransfer: 0.95064099999999995827
    time to download: 0.00008300000000005525
    time_total: 1.09082400000000001583
    -------------------------------------------------------------
    Testing "Hot cache speed"
    -------------------------------------------------------------
    10 requests done.
    Average:
    time_namelookup: 0.00190460000000000000
    time_connect: 0.02373660000000000006
    time_appconnect: 0.06306589999999999419
    time_pretransfer: 0.00019590000000000024
    time_redirect: 0.00000000000000000000
    time_starttransfer: 0.02317680000000000434
    time to download: 0.00007860000000000089
    time_total: 0.11215840000000001919
    -------------------------------------------------------------
    Testing "Warm cache speed"
    -------------------------------------------------------------
    Sleeping for 0.5 hr to move cache from hot to warm
    Got headers:
    HTTP/2 200
    content-type: image/png
    content-length: 719983
    last-modified: Mon, 07 Jun 2021 00:16:21 GMT
    accept-ranges: bytes
    server: AmazonS3
    date: Mon, 21 Jun 2021 00:09:46 GMT
    etag: "52ae2ff2354d4a68e680b77b4da58985"
    x-cache: Hit from cloudfront
    via: 1.1 9b59bfec44582f64d3d8dac9fb7d27b7.cloudfront.net (CloudFront)
    x-amz-cf-pop: DFW50-C1
    x-amz-cf-id: hzwVoHfaHen2TR3cCNRsnwniXMc3_BaOWk7oa2DiQaWkioXqwSGRrg==
    time_namelookup: 0.11359600000000000253
    time_connect: 0.01640300000000000091
    time_appconnect: 0.07292699999999999183
    time_pretransfer: 0.00030900000000000372
    time_redirect: 0.00000000000000000000
    time_starttransfer: 0.02220300000000000051
    time to download: 0.00011099999999999999
    time_total: 0.22554899999999999949
Please note that metrics like time_namelookup here do not have the same meaning as curl’s time_namelookup. curl reports cumulative times, so each value is always larger than the one before it. However, I want to look at how long each stage took on its own, so curl’s cumulative, ever-increasing timestamps were not helpful.
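One way to turn curl's cumulative timers into per-stage durations is to subtract each timer from the next one. The sketch below assumes that approach; `per_stage_times` is a hypothetical helper, and the real script may compute its numbers differently. The input keys are curl's documented `--write-out` timers.

```python
def per_stage_times(cumulative):
    """Convert curl's cumulative timers (seconds) into per-stage durations."""
    dns = cumulative["time_namelookup"]
    connect = cumulative["time_connect"]
    # curl reports time_appconnect as 0 for plain HTTP (no TLS handshake),
    # so fall back to the connect timestamp in that case.
    tls_done = cumulative["time_appconnect"] or connect
    pretransfer = cumulative["time_pretransfer"]
    first_byte = cumulative["time_starttransfer"]
    total = cumulative["time_total"]
    return {
        "time_namelookup": dns,                           # DNS lookup
        "time_connect": connect - dns,                    # TCP connect
        "time_appconnect": tls_done - connect,            # TLS handshake
        "time_pretransfer": pretransfer - tls_done,       # request setup
        "time_starttransfer": first_byte - pretransfer,   # wait for first byte
        "time to download": total - first_byte,           # body transfer
        "time_total": total,                              # unchanged
    }
```

With this layout, the per-stage values (everything except `time_total`) add up to `time_total`.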
These are some mistakes I made while speed-testing CDNs. These are easy mistakes to make, so I decided to write about them.
Why it’s an easy mistake to make: Most people think files are cached for only 24 hours. That’s not necessarily true. Some CDNs like Jetpack CDN evidently cache it for more than 1 day (based on my tests). Additionally, most people’s curl calls do not print the response headers, so they never see the warning sign that the file could already be cached.
How to fix: Check the headers returned. You should see the keyword “MISS” somewhere in there. If you see a “HIT,” that’s a warning sign. This is also why the python script prints out the headers for the cold cache test and warm cache test.
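The header check can be automated. The sketch below is a rough illustration, assuming the CDN reports its cache status in a header whose name contains "cache" (CloudFront's `x-cache` in the output above is one example). `cache_verdict` is a hypothetical helper, and other CDNs may use different header names or values.

```python
def cache_verdict(headers):
    """Return "HIT", "MISS", or None based on cache-related response headers.

    headers is assumed to be a dict of header name -> value.
    """
    for name, value in headers.items():
        if "cache" not in name.lower():
            continue  # only look at headers like x-cache
        words = value.lower().replace(",", " ").split()
        if "miss" in words:
            return "MISS"
        if "hit" in words:
            return "HIT"
    return None  # no recognizable cache status header


# A cold cache test should report a MISS; anything else is a warning sign.
cold_headers = {"x-cache": "Miss from cloudfront", "server": "AmazonS3"}
if cache_verdict(cold_headers) != "MISS":
    print("warning: the file may already be cached")
```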
Why it’s an easy mistake to make: for some reason, it is extremely easy to accidentally make curl skip the download of the actual file. There are at least 3 ways of doing this.
Why this is a serious mistake to avoid: CloudFront does not cache the file if you do not download the file! And since CloudFront is an industry-standard CDN, it is likely other CDNs have the same behavior.
How you can detect this mistake: Look at the download time (time_total - time_starttransfer). If it is in the hundreds of microseconds, that’s a problem. The speed of light is only about 0.186 miles/microsecond. If you got only 200 microseconds, then that is about 37 miles…roundtrip. Or about 18 miles one way. It is highly unlikely there’s a data center that close to where you are. It means that there was no network transfer.
How to fix: In pycurl, write the downloaded contents to a BytesIO or StringIO variable. To see how it’s done, see my Python CDN speed test script. Or if you’re using curl from the command line, avoid the “-o /dev/null” and “-I” flags, and make sure the download time is at least several milliseconds.
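For readers without pycurl, the same idea (always read the whole body) can be sketched with Python's standard `urllib.request` instead. This swaps in a different library from the one the script uses, so the timings are only roughly comparable to curl's. `timed_download` and its result keys are hypothetical names.

```python
import time
import urllib.request


def timed_download(url, timeout=30):
    """Download url completely and report byte count and timings in seconds."""
    start = time.perf_counter()
    # urlopen raises urllib.error.HTTPError for error responses such as 404.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        headers_at = time.perf_counter()  # status line and headers have arrived
        body = response.read()            # read the entire body, never skip it
    done = time.perf_counter()
    return {
        "bytes": len(body),
        "ttfb_s": headers_at - start,      # time until the response started
        "download_s": done - headers_at,   # time spent reading the body
        "total_s": done - start,
    }
```

Comparing `bytes` against the `content-length` header (719983 in the output above) is a quick way to confirm the file was really transferred.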
Why it’s an easy mistake to make: Your curl speed test doesn’t report any 404 error. So you get back the speed test results without realizing there’s an error.
How to catch mistakes: One red flag you wanna look at is the “download” time. If it’s less than 1 millisecond, that’s way too fast. Do the math. The speed of light is about 0.186 miles per microsecond. There are 1000 microseconds in a millisecond. If you’re getting, let’s say, 200 microseconds, that’s only about 37 miles…roundtrip. One way, it’s only about 18 miles. So 200 microseconds is way too fast and indicates there’s probably no network transfer (i.e., no download) happening at all.
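The arithmetic can be written out as a small Python helper. This sketch assumes the vacuum speed of light, about 186,282 miles per second; signals in real networks travel slower, so the true distance limit is even shorter. `max_one_way_miles` is a hypothetical name.

```python
# Speed of light in vacuum, about 186,282 miles per second.
MILES_PER_MICROSECOND = 186_282 / 1_000_000


def max_one_way_miles(download_seconds):
    """Farthest a server could be if the download time covered a round trip."""
    microseconds = download_seconds * 1_000_000
    round_trip_miles = microseconds * MILES_PER_MICROSECOND
    return round_trip_miles / 2


# 200 microseconds allows a round trip of about 37 miles, so about 18.6 one way.
print(round(max_one_way_miles(0.0002), 1))
```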
Of course, the other thing you could do is actually download the file…use wget. You’ll see an error message pretty easily.
How to fix the mistake: Use the right URL.
Why it’s an easy mistake to make: People are careless or perhaps not technical enough. This is actually an important thing to ensure, as I’m noticing the SSL exchange normally takes 50 milliseconds. That’s a big chunk, given that a fast curl time is only around 120 milliseconds (from a home computer, not a cloud provider).
How to fix the mistake: Be consistent in your testing. Use HTTPS for all or HTTP for all. And you should probably use HTTPS since that’s the standard (everybody uses HTTPS).
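A small guard can catch a mixed HTTP/HTTPS test list before any timing starts. This is a sketch only; `same_scheme` is a hypothetical helper and the URLs in the usage comment are placeholders.

```python
from urllib.parse import urlsplit


def same_scheme(urls):
    """Return the shared URL scheme, or raise ValueError if the schemes differ."""
    schemes = {urlsplit(url).scheme.lower() for url in urls}
    if len(schemes) != 1:
        raise ValueError(f"mixed schemes in test URLs: {sorted(schemes)}")
    return schemes.pop()


# Usage (placeholder URLs):
# same_scheme(["https://cdn-a.example/img.png", "https://cdn-b.example/img.png"])
```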
It turns out it’s not a good idea to use DigitalOcean to run these CDN speed tests.
I initially thought it would be a good idea to run these CDN speed tests on a standardized server location like DigitalOcean. Since I travel a lot, I can’t guarantee my internet will always be the same quality, or the location could be drastically different, such as different countries.
So that’s what I did. I tried it out…and noticed DigitalOcean’s internet speed is fast — REALLY fast. Like, way faster than my home internet speed. I was a little shocked. I didn’t think download speeds could ever get that fast. It makes me wonder what I need to do to make my home internet as good as that because I really thought I had the best possible internet that money could pay for (I basically use my computer 100 hours a week, OK. It’s important to me).
If you wanna see an example of the drastic difference in CDN speed…well, here’s one. It’s using the CloudFront example that we’re using in the “Free CDN” blog series. When I ran my speed test python script with:
    python3 test.py https://d3va53q3li7xt1.cloudfront.net/wp-content/uploads/2021/05/shoeb-1024x576.png
I got back a download time of a blazing fast 0.0225 seconds or 23 milliseconds. Holy cow. And then doing the same thing on my home internet, I only get 0.2315 seconds or 232 milliseconds. What the hell.
Out of curiosity, I ran an internet speed test on the DigitalOcean server. Like how the hell is it that much faster. That’s gotta get anybody’s interest up. And would you believe the numbers???
    # speedtest-cli
    Retrieving speedtest.net configuration...
    Testing from DigitalOcean (128.199.187.118)...
    Retrieving speedtest.net server list...
    Selecting best server based on ping...
    Hosted by NewMedia Express (Singapore) [13.17 km]: 2.16 ms
    Testing download speed................................................................................
    Download: 1740.52 Mbit/s
    Testing upload speed......................................................................................................
    Upload: 1420.29 Mbit/s
LOL, holy crap, Download speeds of 1740.52 Mbit/s and Upload speeds of 1420.29 Mbit/s? How much money did DigitalOcean pay for that? I wanna buy it.
That is literally 10x as fast as my internet, which tops out at 160 Mbit/s (using fast.com).
So there are several reasons why I decided not to run these speed tests on DigitalOcean after all:
BTW - a relevant article that also talks about the distorted internet speeds of cloud providers: how accurate are CDNPerf’s numbers? Answer: not very accurate. TODO: I will link to this essay (rant) once I’ve written it!
This is probably obvious to most internet users, but because I’m trying to measure CDNs as methodically as possible, I’ll explain it briefly anyway.
Internet speeds usually vary greatly depending on the time of the day. For example, during the evenings, there is typically massive internet congestion as people get off work and get leisure time. Video is particularly notorious for hogging a lot of bandwidth. For example, did you know Netflix alone reportedly hogs around 40% of internet traffic during the evenings? Google it. And that’s just Netflix. Imagine if you added YouTube too. The evenings just get really congested with internet traffic.
And since we are focused on CDNs as the variable, we should make internet conditions invariant. This means running the CDN speed test for 2 CDNs at basically the same time (let’s say within 1 minute is fine).
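The "within 1 minute" rule can be enforced in code. The sketch below is a hypothetical wrapper, not part of the test script: `compare_back_to_back`, `measure`, and `clock` are assumed names, and `measure` stands in for whatever function times a single URL.

```python
import time


def compare_back_to_back(urls, measure, max_gap_s=60, clock=time.monotonic):
    """Measure each CDN URL one after another and reject runs that took too long.

    urls is a dict of CDN name -> URL; measure(url) returns that URL's result.
    """
    started = clock()
    results = {name: measure(url) for name, url in urls.items()}
    elapsed = clock() - started
    if elapsed > max_gap_s:
        # Network conditions may have shifted, so the comparison is not fair.
        raise RuntimeError(f"measurements spanned {elapsed:.0f}s, limit is {max_gap_s}s")
    return results
```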
Questions? You can ask me on social media https://twitter.com/SpeedTestDemon
Also published on https://speedtestdemon.com/testing-methodology/.