Love Selenium? It may be cheating on you

You’ve been using Selenium and are pretty happy with it, right? It may not be the most fun you’ve had programming, but it does the job and you know you can always count on it for UI test automation. Or can you?

The truth is scary

OK, so I picked a dramatic heading to get a reaction, but here’s the thing. Most of my friends, from CTOs down to junior developers, feel that Selenium gives them good enough automation coverage, and they feel comfortable pushing changes to prod if the automated tests pass.

But about 5 minutes after I bring up my points about Selenium, I see their eyes open up, their confidence take a hit and their minds start racing for solutions.

The truth is that the WebDriver API is not capable of verifying half of the problems that a human can spot instantly, and even a well-coded Selenium test typically verifies less than 10% of the user interface. A passing Selenium test gives you an illusion of safety, when in reality you have to use manual testers to ensure that users don’t get shipped crap. Have I got your attention now?

What’s missing in a typical test?

To understand why I claim that a test only covers 10% of the UI functionality, let’s examine the top answer from Stack Overflow for automating the testing of Google search:

public static void main(String[] args) {
    WebDriver driver = new FirefoxDriver();
    driver.get("http://www.google.com");
    WebElement element = driver.findElement(By.name("q"));
    element.sendKeys("Cheese!\n"); // also send a "\n"
    element.submit();

    // wait until the Google page shows the results
    WebElement myDynamicElement = (new WebDriverWait(driver, 10))
            .until(ExpectedConditions.presenceOfElementLocated(By.id("resultStats")));

    List<WebElement> findElements = driver.findElements(By.xpath("//*[@id='rso']//h3/a"));

    // these are all the links you would like to visit
    for (WebElement webElement : findElements) {
        System.out.println(webElement.getAttribute("href"));
    }
}

Pretty simple: open the page, find the search box, type in the value and submit the form. Then find the result elements on the page using XPath and read their link attributes.

So what’s missing? A whole lot, actually.

Visual Verification

Let’s start with the visuals. Take a look at the image below, see how many problems you can find in it.

Did you spot all 3 of them? Here’s the scary part: all of them are invisible to a Selenium test, so it will happily report this page as “green”.

Does the test verify that CSS styles are still good? No. Does it check if the text fits into the allotted space and is not cropped? Nope. That images are not broken? Nada. That the color of the text on the button is readable on the background? That would be another no.

Content Verification

Well, maybe you don’t care much about the colors, fonts, images and users with different resolutions than yours (sarcasm!:-). What about content? All the text, numbers and messages that users are shown. That’s pretty important, right?

A typical Selenium test checks the page title and a couple of strings inside the page body. It would be crazy to write a bunch of lookups for elements, get their text, try to verify its correctness, and then expect that code not to break. So once again, our Selenium test doesn’t verify what the user will see.

Can we punt to the manual tester for content verification too? Let’s again look at some pages and see how humans do with change detection. Here’s the Wikipedia article on Selenium:

Something is broken in the screenshot of the same page below. Can you spot what it is?

Did you see that “scripting language” link become regular text? I bet this wasn’t nearly as easy as spotting missing buttons in the previous example, but losing a link kind of sucks for the user.

What about checking the same content but in a different resolution?

Whether or not you noticed that the “Selenium IDE” reference has disappeared, I am sure it was harder than spotting a missing button in the earlier example. Missing text, extra text and changed text can all be a major problem for users of a financial, legal or even a social application (“like” vs “don’t like”).

By now we can all agree that verifying what we show to the user is actually very important, and it’s not addressed well by Selenium. The obvious objection would be that a test can’t check every pixel and every letter on the screen because it would always be broken, and reviewing these broken tests is a giant waste of time. I’d say that’s true, but it doesn’t mean there’s no solution to this. It just needs a little magic, but let’s talk about that later :-)

Test Maintenance

Having some automation testing is infinitely better than having none, so while Selenium is far from perfect, one can argue that it does the job. You code the test and reap the benefits ever after. But anyone who’s implemented a decent number of automated tests knows that maintenance of these tests is a Herculean task.

Broken Selectors

The more elements you want to verify, the more often the test breaks. One element moved on a page can break a hundred tests that referred to it with a CSS or XPath selector. There are design patterns like page objects that can be implemented to improve this, but they require upfront planning and an investment of time, and fixing broken locators still takes time.

Keeping it all in sync

Browsers auto-update themselves. WebDriver libraries need to keep pace, but with hundreds of unfixed bugs in each one of them, keeping all of these in working order is a “pick your poison” situation.

Error Analysis

A test can fail due to an application error, but it can also fail due to a bug in WebDriver, a browser crash, or a timeout. Sniffing through logs and figuring out what actually happened is not a fun task.

Managing State

Controlling which tests run and in what order is not easy. Organizing them in suites, prioritizing the execution, scheduling them to run concurrently and quickly interpreting the results of the execution are all left for the developer to set up and review at a low level. To me, dealing with these mundane tasks in the 21st century feels like using a bucket to flush the toilet. I can do it, but I prefer the auto-refill water tank with a handle, thank you very much!

Stop b$tching, start a revolution

It’s easy to harp on some tool’s problems. And it wouldn’t be fair to do this without offering a solution. So here I go.

A couple of years ago we gave up on our attempts to make Selenium work for the UI testing of our product AjaxSwing. We realized that Selenium does not guarantee that the UI of our product is not broken, and we couldn’t find anything that did. We also realized that we, humans, can’t detect all the changes during manual testing, and we knew that we couldn’t just ship an untested product to our customers. So we built Screenster, a cloud-based Selenium alternative that truly tests what humans actually see on the screen.

Let me briefly share how Screenster addresses each of the fundamental problems of Selenium outlined above.

Visual Verification

Screenster has a concept of a baseline, which gets captured when a user records a test. The baseline stores screenshots of each page along with its DOM state so that Screenster can understand each element on the page.

After the baseline is captured, subsequent runs can compare new screenshots with baseline screenshots and the unexpected changes are highlighted for a tester to review. The tester can approve the changes if those are new features, or mark them as a bug to be fixed. Presto!
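To make the idea of comparing against a baseline concrete, here is a minimal sketch of a pixel-by-pixel screenshot comparison in plain Java. This is not Screenster’s actual implementation: the class name BaselineDiff and the method changedFraction are made up for illustration, and a real tool would also need to tolerate anti-aliasing and dynamic content.

```java
import java.awt.image.BufferedImage;

public class BaselineDiff {

    /**
     * Returns the fraction of pixels (0.0 to 1.0) whose RGB value differs
     * between the baseline screenshot and the current one.
     * Hypothetical helper, for illustration only.
     */
    public static double changedFraction(BufferedImage baseline, BufferedImage current) {
        // Assumption: a change in page size counts as a full mismatch.
        if (baseline.getWidth() != current.getWidth()
                || baseline.getHeight() != current.getHeight()) {
            return 1.0;
        }
        long changed = 0;
        for (int y = 0; y < baseline.getHeight(); y++) {
            for (int x = 0; x < baseline.getWidth(); x++) {
                if (baseline.getRGB(x, y) != current.getRGB(x, y)) {
                    changed++;
                }
            }
        }
        long total = (long) baseline.getWidth() * baseline.getHeight();
        return (double) changed / total;
    }
}
```

A run could then be flagged for review whenever changedFraction returns anything above a threshold the team picks, which is the point where a human approves the change or files a bug.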

Content Verification

Visual verification is great for pixel-perfect UIs, but it can result in a lot of changes to review. A button moved a few pixels or a minor adjustment of background color shouldn’t break every test.

Screenster supports Content Verification mode, where it compares all content during a test run with the content of the baseline. All CSS changes are ignored but any difference in text is reported to the user for approval or rejection.
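As a rough illustration of what a content-only comparison involves, the sketch below takes the visible text fragments of a baseline page and of the current page and reports what went missing and what was added. It is an assumption-laden toy, not how Screenster works internally: the class ContentDiff and its methods are hypothetical, and extracting the text from the page is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ContentDiff {

    /** Text fragments that were in the baseline but are absent from the current page. */
    public static List<String> missing(List<String> baseline, List<String> current) {
        // LinkedHashSet keeps the baseline order so the report reads top to bottom.
        Set<String> remaining = new LinkedHashSet<>(baseline);
        remaining.removeAll(current);
        return new ArrayList<>(remaining);
    }

    /** Text fragments that appear on the current page but were not in the baseline. */
    public static List<String> added(List<String> baseline, List<String> current) {
        return missing(current, baseline);
    }
}
```

With the Wikipedia example from earlier, a baseline containing “Selenium IDE” and a current page without it would put “Selenium IDE” in the missing list, no matter how the page is styled or at what resolution it is rendered.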

Test Maintenance

We put a lot of effort into reducing the ways a test can break. We have self-healing tests with smart selectors that automatically locate elements that were moved, renamed or changed. This feature alone addresses 90% of the reasons Selenium tests break.
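One simple way to picture a selector that survives a rename is a fallback chain: keep several candidate selectors for the same element and use the first one that still matches. The sketch below shows only that idea. It is not Screenster’s algorithm, the class FallbackLocator is hypothetical, and the lookup function stands in for whatever actually queries the page (for example, a wrapper around a WebDriver call).

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public class FallbackLocator {

    /**
     * Tries each candidate selector in order and returns the first element found.
     * The lookup function is an assumption: it should return Optional.empty()
     * when a selector matches nothing.
     */
    public static <T> Optional<T> locate(List<String> candidates,
                                         Function<String, Optional<T>> lookup) {
        for (String selector : candidates) {
            Optional<T> found = lookup.apply(selector);
            if (found.isPresent()) {
                return found;
            }
        }
        return Optional.empty();
    }
}
```

If the primary selector stops matching after a redesign, the test keeps running on a secondary one instead of failing outright, and the tool can report which selector needed the fallback.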

Screenster has smart logic for timeouts and can retry failed operations. When errors happen, they are shown visually on the screen where they happened.
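Retrying a flaky step is easy to sketch in plain Java. The helper below is a generic illustration under my own assumptions, not Screenster’s logic: the class Retry and the method withRetries are hypothetical names, and a real implementation would usually wait between attempts and retry only on specific errors.

```java
import java.util.function.Supplier;

public class Retry {

    /**
     * Runs the operation up to maxAttempts times and returns its first
     * successful result. If every attempt throws, the last exception is rethrown.
     */
    public static <T> T withRetries(int maxAttempts, Supplier<T> operation) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

Wrapping a click or a lookup in withRetries(3, ...) means a single timeout no longer turns the whole run red, which takes some of the pain out of the error analysis described earlier.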

There is a cool portal where a team can collaborate, a scheduler that can run tests in parallel, and awesome-looking dashboards for the results that you can impress the business and management with :-)

Bottom Line

Writing Selenium tests is buying into an illusion that you have secured the UI through automation. Sorry, but that’s not true. Your tests are not touching 90% of the UI and therefore are not catching 90% of the issues that would get shipped to the user.

New-generation platforms like Screenster provide a solution to many problems of hand-coded frameworks. They are not mature yet, but they are getting better every month, and they aim to relieve humans of monkey work, allowing them to actually develop features.

Check it out and let me know what you think!