“Warm up” your test tool and get better performance tests

Here is how we did it with WebPageTest.

Part of my job at Ekstra Bladet Udvikling/Development is to monitor web site performance at one of Denmark’s biggest and busiest websites.

When you test and measure a website’s performance, it’s important that your tool resembles a real user’s experience as closely as possible. This is possible with the performance tools we are currently using in Ekstra Bladet Development, but it requires a little fiddling. Read how it’s possible and what you can do to get started.

🕐 The Challenge

Performance is a huge factor in the user’s experience. This is especially true if you have a website that (like ekstrabladet.dk) contains a lot of ads, which all have an effect on the technical performance.

Back in July 2016, I summed up our work with performance and published ‘Your ads performance is your performance’ (among other articles) here on Medium. Here I wrote about how much of our performance is affected by technology that is out of our control.

Besides that, a lot has happened in advertising technology, and ekstrabladet.dk is no exception. Advertisers increasingly prefer to reach users who they hope will be receptive to their message. They do this by using a range of technology partners that promise to help with this, and they try to match the content with the users based on, among other things, their cookies.

When so much has happened in the value chain from advertiser to user, it seems strange that performance is still widely measured by throwing a bunch of URLs into a tool, which then visits and measures them and shows us the results. Many ads (and the technology matching them with users) depend on cookies, and “cold browsers” like the ones in performance tools don’t have cookies.

Therefore it would be pretty neat if you could “warm up” the test browser, thereby making it more attractive to the various technologies and ads that live on a website but won’t show up until there is something to target.

🕑 The Concept

A while ago I read an article (which I regrettably can’t find despite numerous attempts…otherwise I would have linked to it) where a company behind a performance tool had a suggestion for a solution. They let their tool visit a series of URL addresses before visiting the URL where the actual performance testing is happening.

This has the advantage that the test browser builds a cookie profile (from the places it has visited), which all of a sudden makes it more interesting to various personalization technologies.

In the article the company told how the new way of measuring showed that a website was in fact slower than they had thought. Meaning that the website took longer to load for a real user than for the cold test browsers which a lot of people trust when they measure performance.

🕒 The Tools

I decided that I wanted to conduct the same experiment with ekstrabladet.dk. We are quite well off in the sense that the tools we are using for performance testing can already do this.

WebPageTest is (part of) the beating heart in SpeedCurve.

We use SpeedCurve for the continuous performance tests, while we use WebPageTest for our ‘ad hoc’ (from time to time) tests. SpeedCurve is built on WebPageTest, so deep down we are actually only using one tool. And the type of scripting I have used in WebPageTest (which I’ll get back to in a moment) is supported in the Enterprise edition of SpeedCurve, which we have. Hooray.

As recommended by Steve Souders from SpeedCurve, I decided to try it out in some manual WebPageTest tests. It is the results of these tests I want to share with you in this article.

🕓 The Setup

🕔 Find the URLs

The first step was to get some good URLs I could get WebPageTest to visit to warm up the test browser.
I talked with some of my colleagues in our Sales/BackOffice/AdOps team, and they handed me the following addresses which they believe could be interesting:

http://superbrugsen.dk/tilbudsavis/
https://www.alka.dk/bilforsikring
http://www.nykredit.dk/dit-liv/bolig/ny-bolig
https://danskespil.dk/oddset?intcmp=top_menu_oddset_brand
http://www.circlek.dk/dk_DK/pg1334082175653/privat/extraClub.html

Here we are using five URLs, possibly because I asked for “a handful”, so the number 5 has no significance in this context.

🕕 Write the script

Now we want WebPageTest to visit these URLs before visiting the article on ekstrabladet.dk that we want to have measured. Here we use the scripting interface that is built into WebPageTest.

Luckily, scripting is a part of the WebPageTest documentation, so it’s easy to get started.

On the documentation page you’ll find this example:

logData	0
// put any urls you want to navigate
navigate	www.aol.com
navigate	news.aol.com
logData	1
// this step will get recorded
navigate	news.aol.com/world

WebPageTest scripting works by issuing various commands (in this case “navigate” and “logData”) followed by one or more parameters (in this case 0/1 or a URL). The command and its parameters must be separated by a tab. That is important to remember.

The example above starts out by visiting ‘www.aol.com’ and then goes to ‘news.aol.com’. But because the test browser is instructed not to save any data on the performance test (“logData” is set to 0), it doesn’t take any notes, so to speak. It does that, however, when it navigates to ‘news.aol.com/world’, because “logData” has by then been set to 1.

It’s actually quite logical. In this case the purpose could be to measure what I call the “cache win”: how much easier/faster a page loads when the user has previously visited another page using some of the same resources (images, CSS, JS files etc.). This can also be measured by visiting the same URL twice and only saving the performance test from the last visit.

We can use this function for what we want to achieve. Not because we want to measure the cache win (the five advertiser websites probably don’t share resources with ekstrabladet.dk), but because the data/information that is loaded on a page is stored in the browser cache and as cookies.

If we want to write a script that visits the five URLs and then does a performance test on the front page at ekstrabladet.dk, it will look like this:

logData	0
navigate	http://superbrugsen.dk/tilbudsavis/
navigate	https://www.alka.dk/bilforsikring
navigate	http://www.nykredit.dk/dit-liv/bolig/ny-bolig
navigate	https://danskespil.dk/oddset?intcmp=top_menu_oddset_brand
navigate	http://www.circlek.dk/dk_DK/pg1334082175653/privat/extraClub.html
logData	1
navigate	http://ekstrabladet.dk

If you need a copy-paste version of the script, I have uploaded it as a .txt file 😉

🕖 Define your measuring range

For these tests I have chosen to focus on the performance of articles. So far a lot of our performance work has focused on the front page, but we are currently rolling out a new article design across our website, and a part of our way of working is ‘mobile first’, so it made sense to look at the mobile edition/version of articles. Therefore I made WebPageTest emulate an iPhone 6.

I chose to use the WebPageTest server based in Ireland. There are others close to Denmark, Germany for example, but I’ve been using the one in Ireland for a long time, so for the sake of comparison I stuck with that.

Note that when you test/measure performance from another country or a far-away location, you shouldn’t get too attached to the actual values. A load time of 9 seconds isn’t necessarily 9 seconds just because some test from Ireland says so. On the other hand, you can trust comparable measurements/tests. If you change something on your website and the load time drops from 10 to 5 seconds, it has still been cut in half, as long as the before/after tests are done from the same location and in the same way, of course. And comparable tests are exactly what we want to do here.

I decided on articles in our entertainment section (“flash!”), since it is one of the sections that has the new design and the accompanying functionality. I have tested 10 articles. Both the times of publication and of test were spread out across a couple of days:

Ingen skilsmissekommentarer fra Aqua-Lene
Dansk grandprixvinder er blevet gift
Line Baun: Derfor flyttede jeg hjemmefra
Dronningen om sin barndom: Én sætning foragtede jeg
‘Paradise’-deltager ville være politiker: Derfor er planen droppet
Efter Facebook-fight: - Jeg føler mig som en beskidt luder
Her er verdens bedst betalte kendis
Smukke Helenas hund passer på missen
Ærlig Line Baun om sit livs kiks: Jeg ville da ønske, jeg aldrig havde sagt det
Ærlig Mascha Vang: Sådan påvirker terroren mig

🕗 Remember the general WebPageTest tips

When you are using WebPageTest on a website containing third party content/technology, it is important to get all of that into the load being tested. Some tech providers will hide ads, for instance, if they can see it’s a test browser, so as not to waste precious ad displays on a machine. Therefore it is important to check the ‘Preserve original User Agent string’ setting found under ‘Advanced’ in ‘Advanced Settings’:

‘PTST’ is the WebPageTest test browser’s way of identifying itself as a test browser. This can cause some issues, so I always check this box ↑

You should also watch ‘Velocity 2014 - WebPagetest Power Users - Part 1’ on YouTube. It is the first part of a presentation given by Patrick Meenan from Google’s Chrome team (and the guy behind WebPageTest) at the performance conference ‘Velocity’ in 2014.

Among other things he recommends that you run an odd number of runs, since WebPageTest picks the run to report by choosing the median one. He also stresses that you should have more than one run, since the first run warms up the DNS cache, server, database etc.

I have chosen to have five runs in my test. Besides this, I run every test 10 times (meaning 50 runs all in all), which I then gather in a spreadsheet to find the average.

🕘 The results

It can all get a little abstract if we don’t have the same data to look at. Therefore I’ve uploaded my spreadsheet to Google Docs so you can have a look. There you will also find links to every WebPageTest test I have used.

There is nothing secret in these tests/measurements. Everyone can measure the performance of ekstrabladet.dk articles using WebPageTest. I have merely copied in the results (and converted the number format), calculated the average and compared across tests with and without the five URLs. That’s it.

→ You can see the spreadsheet here

The most interesting part, of course, is the comparisons, so here they are. Forgive me for just pasting screenshots from Excel; I just didn’t feel like spending half a day pasting data into HTML tables (note: ‘Ny’ is Danish for ‘New’):

A quick glance across the numbers shows that the Speed Index value is where the action is. Speed Index is (besides being documented) an expression of how quickly the first viewport is ready. It is, in other words, an attempt to measure the perceived performance, or a large part of it, at least.

To get a more general impression of the results we can look at the average percentage change. Note that this is not really statistically sound, since there might be big differences between the articles (and there are after all only 10 of them), but it can give a broader view and maybe help identify certain tendencies:

This clearly shows that Speed Index in particular is improved by including the five URLs. Here we can see that Speed Index improves by 30 percent, which is significant.
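The arithmetic behind these comparisons can be sketched in a few lines of Python. This is only an illustration: the Speed Index values below are invented so the result is easy to check by hand, they are not taken from the spreadsheet, and the helper name percent_change is my own.

```python
from statistics import mean, median


def percent_change(baseline: float, new: float) -> float:
    """Relative change from baseline to new, in percent.

    A negative result means the new value is lower, which for
    Speed Index and load time means faster.
    """
    return (new - baseline) * 100.0 / baseline


# Hypothetical Speed Index values (ms), one per test; NOT real measurements.
cold_tests = [5200, 4800, 5000]  # test browser without warm-up URLs
warm_tests = [3600, 3400, 3500]  # test browser with warm-up URLs

cold_avg = mean(cold_tests)
warm_avg = mean(warm_tests)
change = percent_change(cold_avg, warm_avg)

# With an odd number of runs, the median is always one of the actual runs.
five_runs = [4100, 3900, 5200, 4000, 4300]
median_run = median(five_runs)

print(f"cold average: {cold_avg}, warm average: {warm_avg}")
print(f"change in Speed Index: {change:.1f} %")
print(f"median of five runs: {median_run}")
```

A negative result means the warmed-up browser got a lower (better) Speed Index. The same function gives -50 for the drop from 10 to 5 seconds used as an example earlier.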
This graph (which might receive a ‘World’s Ugliest Graph’ nomination) illustrates the drop in Speed Index; the articles are paired by color:

Speed Index for the 10 different articles, with and without the new URLs in the test tool.

🕙 The Conclusion

Despite the fact that the other values aren’t really affected, it does look like the first view/viewport is noticeably faster when the test tool (WebPageTest) has visited the five URLs and built an (admittedly quite poor) cookie profile prior to visiting the ekstrabladet.dk article.

As such, this doesn’t really mean anything for the user experience at ekstrabladet.dk, and we don’t need to change anything on our website following this discovery. Yet it is still important. It indicates that our site may perform better for our users than our tools show (at least when it comes to Speed Index), because a test browser with a cookie profile is more similar to a real user.

Therefore we need to consider changing our setup for performance testing and measuring; I will get back to that in a short while.

At the same time it’s interesting that there is only a very small change in the load time of the articles. The article itself doesn’t load faster for the user (or the test browser, to be honest), but the first viewport is ready much faster. That could indicate that some elements/requests are being fetched in another order, so that the top elements on the page are loaded first, which makes great sense from a performance point of view.

I also find it interesting that we see the exact opposite of what I was expecting. I thought we would make the same discovery as the people in the article I can no longer find: that our site is actually slower for real users than in a test tool.

Instead we see that at least Speed Index is improved when we start imitating real browser behavior. That is good news (although it might be the other way around with other or more URLs), but how can it be like this?
🕚 Why (maybe) and the future

These tests (even though there is a considerable number of runs) do not reveal why the articles in the test have a better perceived performance if the test browser is equipped with cookies.

A theory which I share with a colleague in our BackOffice/AdOps team is that the profile is simply becoming more “attractive” for the various technologies. A lot of ads are after all trying to get to the right users by using things like cookies, so that conclusion is right there in front of us.

Now we need to dig into it and find out whether this is a binary state (where the big difference lies in ‘cookies’/‘no cookies’) or if it’s more of a gradual thing, where we might see even better performance if we have 25 URLs instead of 5.

It may be a glitch in WebPageTest, of course: that the test browser for some reason becomes faster at loading the first viewport when it has visited other sites prior to the article. I don’t think so, but even in that case it doesn’t change the fact that there are certain uncertainties regarding the tools we use to measure and test performance.

With these 10 articles we saw a drop in Speed Index. Next time it might be something different. For example, I just did a control test on another article from the same section; here Speed Index dropped by about 20 percent, while the load time dropped more than 10 percent.

So, what happens now?

First of all, we need to do more testing. As I mention above, we need to find out whether the number of URLs being visited before the performance test makes a difference.

This isn’t really something we can use on our website, but we can use it to build better performance tests and measurements. When the user experience is as dynamic as it is nowadays, it’s extremely important that our performance tests are as similar as possible to real browser visits.

I would also like to do tests where the ads (and similar third party technology) are kept out of the equation.
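To test whether the number of warm-up URLs matters, the same script can be generated with a growing list of addresses. Below is a minimal Python sketch of that idea. Note that build_warmup_script is a hypothetical helper of my own (not part of WebPageTest), the article URL is a placeholder, and the sketch only uses the logData and navigate commands shown earlier, with a tab between command and parameter.

```python
from typing import List, Sequence

TAB = "\t"  # command and parameter must be separated by a tab


def build_warmup_script(warmup_urls: Sequence[str], target_url: str) -> str:
    """Build a WebPageTest script that visits warmup_urls unrecorded,
    then records the visit to target_url."""
    lines: List[str] = ["logData" + TAB + "0"]  # stop recording
    for url in warmup_urls:
        lines.append("navigate" + TAB + url)    # unrecorded warm-up visit
    lines.append("logData" + TAB + "1")         # start recording
    lines.append("navigate" + TAB + target_url)
    return "\n".join(lines)


# The five warm-up URLs from this article.
WARMUP_URLS = [
    "http://superbrugsen.dk/tilbudsavis/",
    "https://www.alka.dk/bilforsikring",
    "http://www.nykredit.dk/dit-liv/bolig/ny-bolig",
    "https://danskespil.dk/oddset?intcmp=top_menu_oddset_brand",
    "http://www.circlek.dk/dk_DK/pg1334082175653/privat/extraClub.html",
]

# Placeholder for the article under test (assumption, not a real article URL).
TARGET = "http://ekstrabladet.dk"

# One script per warm-up depth: 0, 1, 3 and 5 URLs.
scripts = {n: build_warmup_script(WARMUP_URLS[:n], TARGET) for n in (0, 1, 3, 5)}

for n, script in scripts.items():
    print(f"--- {n} warm-up URL(s) ---")
    print(script)
```

Each generated script can be pasted into the WebPageTest script field, and the Speed Index results can then be compared across the different numbers of warm-up URLs.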
My hypothesis is that, with the ads out of the picture, the difference in Speed Index will vanish, but that needs to be confirmed/ruled out through tests.

When we have found the best possible setup, we need to change our setup in SpeedCurve, which we use for the continuous performance tests. Here we will face an interesting discussion: Should we also change the way we test the other websites we use for benchmarking? And if so, would it be the same X number of URLs, or are other URLs better suited for other sites?

Here we will also need to decide whether this setup needs to be maintained, whereby we from time to time add/edit the URLs on the list.

If you work with the performance of a website with ads or other personalization technology, I recommend you do something similar. Talk with others in your organization about which URLs could be interesting to use in your tests.

It may well be that you find there is no difference, but then you’ll know. And remember to keep the tests alive and repeat them so you are always making decisions on the right basis.

🕛 What is real?

There can be no doubt that performance testing is a very…lively field where you can never be sure that you are actually measuring what the users are experiencing. As Morpheus tells Neo in ‘The Matrix’:

What is real? How can you define real?

These tests underline a point others have made many times: that personalization causes each of us to experience our own World Wide Web.

This makes it impossible to define what a ‘real’ user experience is, and it means that performance at a website like ours will always be a general expression of how the site is probably experienced by as many users as possible.

This article was originally posted (in Danish) at our Ekstra Bladet Development blog →