Of all the types of tests that make up the software testing pyramid, end-to-end UI tests are considered by far the hardest to manage: slow to execute, unstable, flaky, and sometimes downright impractical. Consider this recent post from the ThoughtWorks blog, in which the author recalls a discussion with an engineering team about the state of their UI testing:
"Well, we have a couple of Selenium tests, but they're pretty brittle. They always seem to be broken, so we rarely run them."
Elsewhere across the industry, the team behind the Cypress project seems to think that software testing doesn't enjoy a favorable reputation among software developers: "Until now, end-to-end testing wasn't easy. It was the part developers hated."
Convincing developers that software testing is exciting and cool, the same way frontend development is perceived to be, will take some stretching of the imagination. Testing is “dreaded” by developers simply because the entire software industry has adopted a dualist view of the software development process. On one side we have developers, who are promised they'll get to build cool stuff and experiment with the latest technologies; on the other side we have testing and quality assurance, which developers often regard as tedious and unexciting.
As long as this duality between development and testing exists, with developers dreading the testing part (and perhaps also quality engineering dreading the developers’ obsession with constant change and slickness), the tests will continue to be flaky and brittle, which will lead to overall poor quality.
What I will try to suggest is that coding and testing are not two separate activities that happen independently of each other. Testing is an integral part of software development, in the same way as coding is. Every developer who thinks that testing is not their thing is still doing some form of testing. They are just not doing it in an automated fashion, but rather in a manual, tedious, and somewhat random one.
About ten years ago, when Node.js was first released, there were barely any tools available for end-to-end testing in JavaScript, and the ones that existed required significant effort to set up and use effectively. Since the release of Nightwatch.js in 2014, however, several other Node.js-powered testing frameworks have been published, such as Cypress.
Surely the days when you had to spend hours, if not days, setting up a UI automation project are behind us. With Nightwatch, it now takes only a minute or two to be up and running with a test automation project using Chrome, Firefox, or Safari, with built-in support for Selenium Grid and cloud testing via Browserstack. Additional services are easy to add.
<iframe title="vimeo-player" src="https://player.vimeo.com/video/376838936" width="640" height="360" frameborder="0" allowfullscreen></iframe>
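For reference, a minimal configuration along those lines might look like the sketch below, with a local Chrome environment and a Browserstack environment side by side. The option names follow the Nightwatch documentation, but treat this as an illustration rather than any project's actual config; the credentials are read from environment variables whose names are my own choice.

```javascript
// nightwatch.conf.js — a minimal sketch, not a drop-in configuration
module.exports = {
  src_folders: ['tests'],

  test_settings: {
    // Local Chrome, driven directly by chromedriver
    default: {
      desiredCapabilities: { browserName: 'chrome' },
      webdriver: {
        start_process: true,
        server_path: require('chromedriver').path
      }
    },

    // Cloud testing on Browserstack; credentials come from the environment
    browserstack: {
      selenium_host: 'hub-cloud.browserstack.com',
      selenium_port: 443,
      desiredCapabilities: {
        browserName: 'chrome',
        'browserstack.user': process.env.BROWSERSTACK_USER,
        'browserstack.key': process.env.BROWSERSTACK_KEY
      }
    }
  }
};
```

Running `npx nightwatch` would then use the local Chrome setup by default, while `npx nightwatch --env browserstack` would target the cloud grid.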
But are the tests still brittle and flaky? Those are, after all, the two adjectives most often attached to UI tests, especially Selenium tests. Nightwatch now includes pretty much everything you can expect from an open-source framework to mitigate brittleness and flakiness in test runs: implicit waits for elements, automatic retries on failed assertions, retries of failed test cases, retries on network errors, and so on. Our plan is not just to detect and mitigate flakiness, but to make it impossible.
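To make the implicit-wait idea concrete, here is a simplified sketch in plain Node.js (not Nightwatch's actual implementation): instead of failing the moment a condition isn't met, poll it until it passes or a timeout expires.

```javascript
// A simplified sketch of the "implicit wait" technique: poll a check
// function until it returns a truthy value, or fail after a timeout.
// An illustration of the concept, not Nightwatch's internal code.
async function waitFor(check, { timeout = 5000, interval = 100 } = {}) {
  const deadline = Date.now() + timeout;
  for (;;) {
    const result = await check();
    if (result) {
      return result; // condition met — e.g. the element became visible
    }
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeout}ms`);
    }
    // back off briefly and retry, instead of failing immediately
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}
```

A check that would fail outright on a slow page, such as looking up a list item that hasn't rendered yet, becomes a retried check when wrapped in a helper like this, which is exactly what turns a flaky assertion into a stable one.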
The story of web development's evolution over the past decade is less convincing, though. Or perhaps it depends on what you mean by “evolved”. Sure, there's webpack and React and ES6, but according to data from httparchive.org, the median page load time has remained roughly the same over the past ten years, even though internet speeds have steadily increased alongside rapid advances in hardware.
It’s safe to say that the internet is faster, but websites aren’t. In addition, as the team behind the Skypack utility has pointed out, “building for the web has never been more complicated.”
With this in mind, I'd be fairly confident in saying that web development hasn't actually evolved all that much, even though we now have better and more sophisticated tools. Frontend testing, on the other hand, has evolved and is still evolving on multiple fronts, perhaps even more rapidly than web development itself.
The feedback loop is an iterative activity made up of the following steps: plan, code, verify, exit – the process of planning a code change, implementing it, and then verifying whether the outcome is as intended.
Feedback loops are everywhere in software development and beyond. I first heard about them from a colleague who delivered a presentation at a jQuery conference back in 2010. Feedback loops are a common concept in Agile software methodologies, but the same concept appears in control theory, biology, mathematics, engineering, and elsewhere. Even the Earth's climate system has feedback loops.
One of the founding texts of cognitive science, Plans and the Structure of Behavior (1960), introduced an early form of the feedback loop as a fundamental unit of behavior in humans. The authors identified the following steps: Test-Operate-Test-Exit, which make up a TOTE unit. As they described it, “in its weakest form, the TOTE asserts simply that the operations an organism performs are constantly guided by the outcomes of various tests.” [1]
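The TOTE unit translates almost directly into code. Here is a minimal sketch of one as a loop, with the `test` and `operate` names mirroring the terminology from Miller et al.; the shape is my own illustration, not something from the book.

```javascript
// A TOTE (Test-Operate-Test-Exit) unit sketched as a loop:
// keep operating until the test passes, then exit.
function tote(test, operate, maxIterations = 1000) {
  let iterations = 0;
  while (!test()) {
    if (iterations >= maxIterations) {
      throw new Error('TOTE unit did not converge');
    }
    operate(); // perform one operation guided by the last test outcome
    iterations++;
  }
  return iterations; // Exit: the test passed after this many operations
}
```

A developer's plan-code-verify cycle fits this shape: `test` is the verification ("does the feature work as intended?") and `operate` is another round of implementation work.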
According to the “cybernetic hypothesis”, the feedback loop is the fundamental building block of the nervous system [2]. This brings me back to my earlier point: every developer who says they don't do testing is still doing some form of testing. Any task we have to complete, whether it's implementing a new feature or fixing a bug, involves at least one feedback loop – a TOTE unit, usually with multiple iterations.
Here’s an example task. Browserstack is a popular choice for users of Nightwatch who are looking to run their tests on a distributed cloud infrastructure that contains multiple desktop and mobile browsers.
Say you’re a developer building the Browserstack Automate UI dashboard and your task is to make a new Nightwatch test appear in the list, in real-time.
Considering the above TOTE unit, the Test phase will be quite complex and will involve several different operations before you can assert whether the feature has been implemented successfully (the condition that stops the loop). There are even some nested feedback loops in there.
Once you implement the changes, you need to perform the verification manually each time: run a Nightwatch test configured to target Browserstack, open the Automate dashboard, and check that the new test run appears in the list in real time.
The implementation phase varies at each iteration, but the testing phase is quite fixed, involving the same steps and roughly the same amount of effort every time. Therefore, if we manage to reduce it, not only will it be less complex to execute, it will also be shorter.
Thankfully, we can now automate all the manual steps involved in the preparation phase, and we can also add a test assertion to verify that the condition was met (the test script that was executed appears in the dashboard list). The actual Test phase of the feedback loop will then consist only of running this newly created automated script.
If you’re still not convinced, maybe this demo project, available on GitHub, will help. It contains all the code for the experiment I’ve described above.
I've included the public URL here in order to have a working example, but of course, during the implementation phase, a dev server would be used instead.
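Switching between the two could be done from the Nightwatch config, for example by overriding the launch URL with an environment variable. This is a sketch of the idea; `DEV_SERVER_URL` is a hypothetical variable name, not something the demo project defines.

```javascript
// nightwatch.conf.js (fragment) — use the local dev server during
// implementation when DEV_SERVER_URL is set, the public URL otherwise.
module.exports = {
  // ...rest of the configuration...
  launch_url: process.env.DEV_SERVER_URL || 'https://automate.browserstack.com/dashboard'
};
```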
Here's what the main end-to-end test does: it loads the dashboard page, waits for the session list to render, and asserts that the newly started test run appears in it.
By using this technique, not only will you have shorter feedback loops, you'll also end up with an end-to-end test when the task is done, which can then be used for regression testing and continuous integration.
[1] George A. Miller, Eugene Galanter, Karl H. Pribram – Plans and the Structure of Behavior (Henry Holt and Co, 1960) p. 39
[2] The “cybernetic hypothesis” formulated by Norbert Wiener in Cybernetics or Control and Communication in the Animal and the Machine (New York: Wiley, 1948)