Selenium testing: a new hope

Part II. Losing weight and going to containers.

In the first part I described simple approaches to building a scalable Selenium cluster without writing a single line of code. In this part we’re going to dive into more subtle Selenium questions:

How to prepare easily scalable worker nodes using the standard Selenium hub
Why it’s possible and even recommended to run the majority of browsers inside containers, and how to do that
Which ready-to-use open source tools exist

What’s inside a worker node

All the new tools described in the first part are in fact smart thin proxies that redirect user requests to a real Selenium Hub and Nodes. However, when you think a bit more, questions arise:

How to arrange hubs and nodes to efficiently consume hardware resources and to still remain scalable?
Which operating system to use?
Which software should be installed?
Can we work without a real display?

One could try using hardware servers with one Selenium Hub and a lot of Selenium Nodes for different browsers. This seems reasonable but in fact is far from practical:

As I said previously, a Selenium hub becomes very slow with a large number of connected nodes. I’m not sure about the actual reasons, but that’s what our experience shows. A piece of advice: don’t look at the Selenium source code before going to bed if you wish to sleep without nightmares. So we can’t use dozens of Selenium nodes with the same hub. Let’s then use only a few nodes with one hub. To stay efficient we need to reduce the total number of CPU cores per hub, which is a good reason to migrate to the cloud. For example, our installation for a long time used small virtual machines with only 2 vCPUs and 4 GB of RAM.
It’s not clear how to install different versions of the same browser in an easy way (e.g. using binary packages).
It’s not clear how to easily keep track of the total quantity of each browser version.
Different Selenium Node versions are compatible with different browser versions. E.g. a newer Selenium node may not support a relatively old browser version.

The simplest way to always have the same number of nodes connected to the same hub is to launch both the hub and the nodes on the same virtual machine. If you use one machine per browser version, calculating the total number of available browsers becomes an elementary school task. You can easily add and remove virtual machines containing compatible versions of the Selenium node and browser. This is what we recommend when running a Selenium cluster in the cloud with a static quantity of each browser version.

But what should be present inside such a virtual machine, apart from Selenium Hub and Selenium Node, for it to work smoothly?

First of all, we recommend using Linux as the base operating system where possible. With Linux you can cover 80% of your browser needs. It’s simpler to enumerate what is not covered:

Internet Explorer and Microsoft Edge. These run only on Microsoft Windows and are a subject for a separate article. For never was a story of more woe.
Desktop Safari. Is anybody using it? Selenium has rather poor support for this browser.
iOS and real Apple devices. You need Apple hardware such as a Mac Mini, plus Appium, to work with them.

To run standard Selenium you need Java (JDK or JRE) installed and the Selenium JAR of the desired version present.
A virtual machine has no real display, so you need to launch the Selenium node under a special X server that can work without a display. One such implementation is called Xvfb. This can be done as follows:

xvfb-run -l -a -s '-screen 0 1600x1200x24 -noreset' java -jar \
    selenium-server-standalone.jar -role node <...other args>

Note that Xvfb is needed only for the Selenium node process.

You may also want to install additional font packages such as Microsoft TrueType fonts.
If you wish to take screenshots, it’s recommended to turn off caret blinking for Gtk-based browsers.
If your tests need to interact with the sound system, you also need to set up a dummy sound card. A script for Ubuntu could look like the following:

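#!/bin/bash
# Install ALSA packages and the extra kernel modules for the running kernel
apt-get -y install linux-sound-base libasound2-dev
apt-get -y install alsa-utils alsa-oss
apt-get -y install --reinstall "linux-image-extra-$(uname -r)"
# Load the dummy sound card module now
modprobe snd-dummy
# Make sure the module is also loaded on every boot
if ! grep -Fxq "snd-dummy" /etc/modules; then
    echo "snd-dummy" >> /etc/modules
fi
# Allow the current user to access audio devices
adduser "$(whoami)" audio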

Losing weight

As you have probably noticed, Selenium is a Java application. You need a Java Virtual Machine (JVM) installed on your system to run it. The smallest Java installation package, the JRE, is about 50 megabytes. The Selenium JAR for the latest version, 3.0.1, adds 20 more megabytes. Then consider the operating system size, all the required fonts and the browser distribution size, and you easily add up to several hundred megabytes. Although disk storage is now relatively cheap, we can do better. The Selenium 2.0 and 3.0 series are also called Selenium WebDriver. This is because support for different browsers is implemented using so-called webdriver binaries. Here’s how it works:

Browser developers can implement their product any way they want. To have the product supported by Selenium they need to provide a standalone binary that exposes the same API as Selenium Server does and supports the JSONWire protocol. This binary should be able to launch the browser process, execute protocol commands according to their specification and stop the process when requested. Any details of communication between the driver binary and the browser binary are left up to the browser developers. The only contract is to support the same Selenium API. For example, Chrome has Chromedriver, Opera Blink has OperaDriver and so on.
When setting up Selenium you specify only the path to the driver binary.
When you request a new session, Selenium in fact launches the driver binary and then delegates (proxies) your requests to the driver process. The driver does the rest. You can achieve the same result by manually starting the driver process on a desired port and running your tests against this port.
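
The last point can be illustrated with a minimal Python sketch that builds the JSONWire "new session" request for a driver started by hand. The address http://localhost:9515 (Chromedriver’s default port) and the helper names new_session_request and create_session are assumptions made for illustration. Nothing is sent over the network until create_session is called against a driver that is actually running.

```python
import json
import urllib.request


def new_session_request(driver_url, browser_name):
    """Build a JSONWire 'new session' request for a driver process."""
    payload = {"desiredCapabilities": {"browserName": browser_name}}
    return urllib.request.Request(
        url=driver_url.rstrip("/") + "/session",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def create_session(driver_url, browser_name, timeout=30):
    """Send the request to a driver that is already listening and return its JSON reply."""
    request = new_session_request(driver_url, browser_name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumption: chromedriver was started manually, e.g. `chromedriver --port=9515`.
request = new_session_request("http://localhost:9515", "chrome")
print(request.get_method(), request.full_url)
```

A Selenium client pointed at the same URL sends essentially this request when a test asks for a browser, which is why the driver can stand in for the server.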

Having said that, isn’t it a bit expensive to spend hundreds of megabytes on a simple proxy? A year ago the answer was definitely no, because there was no such binary for Firefox, the most widely used Selenium browser. It was the Selenium server’s responsibility to start the Firefox process, upload an extension to it and proxy requests to the port opened by this extension. During the last year the situation has changed. Starting from Firefox 48.0, Selenium interacts with the browser through a standalone driver called Geckodriver. That means that at least for the majority of desktop browsers we can safely remove Selenium Server and proxy requests directly to driver binaries.

Going to containers

In the previous sections I described how we can organize a Selenium cluster using virtual machines in the cloud. With this approach virtual machines are always running and thus spending your money. Also, the total number of hosts for a particular browser version is limited, and this can lead to running out of free browsers during peak loads. I have heard about working and even patented sophisticated solutions that prelaunch and warm up a pool of virtual machines according to the current load so that free browsers are always available. That works, but can we do better? The main issue with hypervisor virtualization is speed. It can take several minutes to launch a new virtual machine. Let’s think a bit more: do we need a separate operating system for every browser? The answer is no. We only need filesystem and network isolation that works fast. This is where container virtualization comes into play. At the moment containers work mostly under Linux, but as I said, Linux easily covers 80% of the most popular browsers. Containers start in seconds and go down even faster.

What should we place inside a container? Almost the same stuff as we do for a virtual machine: the browser binary, fonts, Xvfb. For old Firefox versions we still need Java and Selenium server, but for Chrome, Opera and the latest Firefox we can use the driver binary as the container’s main process. Using a minimalistic Linux distribution such as Alpine, we can deliver extremely small and lightweight containers.

Selenoid

The most popular and well-known container platform today is Docker. Selenium developers provide a set of prebuilt Docker images to launch a standalone Selenium server or a Selenium Grid in a Docker environment. Unfortunately, to create a cluster you need to start and stop these containers manually or with an automation tool like Docker Compose. This is already better than installing Selenium from packages, but it would be better to have a lightweight daemon with the following behavior:

Somebody starts this daemon instead of the standard Selenium Hub.
The daemon knows that, e.g., to launch Firefox 48 it needs to pull and run container X, and for Chrome 53, container Y.
A user requests a Selenium session as usual, but from this new daemon.
The daemon analyses the desired capabilities, starts the correct container and then proxies all requests to its main process (either a Selenium server or just a webdriver binary).

And we did it… and even more.

During years of using the standard Selenium server on a large scale we came to understand that running a JVM and a fat Selenium JAR just to proxy requests is overhead. So we searched for a lighter technology. Finally we chose the Go programming language, aka Golang. Why is it better for our purposes?

Static linking. The compilation result is a single binary to run. With this binary there is no need to install anything else, like a JVM for Java.
Cross compilation. We can compile binaries for different platforms using the same Go compiler.
Rich standard library. For us the most important things were out-of-the-box reverse proxying and HTTP/2 support.
Big community. It’s already becoming mainstream.
Supported by a popular IDE. There’s a good plugin for IntelliJ IDEA and an alpha version of the Gogland IDE on the same platform.

We did not find a good name for this new Go daemon. This is why it’s called just Selenoid. To start working with Selenoid, follow three simple steps:

Create a JSON file with a browser version to container mapping:

{
    "firefox": {
        "default": "latest",
        "versions": {
            "48.0": {
                "image": "selenoid/firefox:48.0",
                "port": "4444"
            },
            "latest": {
                "image": "selenoid/firefox:latest",
                "port": "4444"
            }
        }
    },
    "chrome": {
        "default": "53.0",
        "versions": {
            "53.0": {
                "image": "selenoid/chrome:53.0",
                "port": "4444"
            }
        }
    }
}

As in the GridRouter XML file, you specify the available browser versions. But Selenoid starts containers on the same machine or through a remote Docker API, so there’s no need to enter host names and regions. For each browser version you need to provide the container image name, its version and the port that the container listens on.
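
To make the mapping more tangible, here is a rough Python sketch of how a daemon could go from the desired capabilities of a new-session request to a container image using a file like the one above. The function names requested_browser and resolve_container are hypothetical, and the matching rule (exact version, falling back to the "default" entry when no version is requested) is my simplifying assumption, not a description of Selenoid’s actual code.

```python
import json

BROWSERS_JSON = """
{
  "firefox": {
    "default": "latest",
    "versions": {
      "48.0": {"image": "selenoid/firefox:48.0", "port": "4444"},
      "latest": {"image": "selenoid/firefox:latest", "port": "4444"}
    }
  },
  "chrome": {
    "default": "53.0",
    "versions": {
      "53.0": {"image": "selenoid/chrome:53.0", "port": "4444"}
    }
  }
}
"""


def requested_browser(body):
    """Read browserName and version from a JSONWire new-session request body."""
    caps = json.loads(body).get("desiredCapabilities", {})
    return caps.get("browserName", ""), caps.get("version", "")


def resolve_container(config, name, version):
    """Return (image, port), using the default version when none is requested."""
    browser = config.get(name)
    if browser is None:
        raise LookupError("unsupported browser: %r" % name)
    entry = browser["versions"].get(version or browser["default"])
    if entry is None:
        raise LookupError("unsupported version: %s %s" % (name, version))
    return entry["image"], int(entry["port"])


config = json.loads(BROWSERS_JSON)
body = b'{"desiredCapabilities": {"browserName": "chrome"}}'
print(resolve_container(config, *requested_browser(body)))
```

Once the image and port are known, what remains is starting the container and proxying the session traffic to that port.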

Run the Selenoid binary:

$ selenoid -limit 10 -conf /etc/selenoid/browsers.json

By default it starts on port 4444, as if it were a Selenium Hub.

Run your tests, pointing them to the Selenoid host just as you would for standard Selenium.

Our tests show that Docker containers, even with a standard Selenium server inside, start in a few seconds. What you get in return is a guaranteed memory and disk state. The browser state is always the same as after a fresh installation. More than that, you can install Selenoid on a cluster of hosts that have the same set of desired browsers stored as Docker images. This gives you a big Selenium cluster that automagically scales according to browser consumption. If current requests need more Chrome sessions, more containers are launched. When there are no Chrome requests, all those containers go down and free up room for other browser requests.

To deliver better load distribution, Selenoid automatically puts requests that exceed the session limit into wait queues and processes them when some sessions on the same host end. But Selenoid is not just a container management tool. Instead of containers, it can also start any driver binaries on demand. The main use case for this feature is replacing Selenium Server on Windows. In that case Selenoid starts the IEDriverServer binary, thus saving memory and avoiding some proxying errors in Selenium Server.
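
The idea of making extra requests wait for a free slot can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the SessionLimiter class is hypothetical, a sleep stands in for a real browser session, and a plain semaphore does not guarantee first-come-first-served order the way a real queue could.

```python
import threading
import time


class SessionLimiter:
    """Allow at most `limit` sessions at once; extra requests block until a slot frees up."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        self._slots.acquire()  # waits here while all slots are busy
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        return self

    def __exit__(self, exc_type, exc, tb):
        with self._lock:
            self.active -= 1
        self._slots.release()  # lets one waiting request proceed
        return False


def run_session(limiter, finished):
    with limiter:
        time.sleep(0.05)  # stands in for a browser session
    finished.append(threading.current_thread().name)


limiter = SessionLimiter(limit=2)
finished = []
threads = [threading.Thread(target=run_session, args=(limiter, finished))
           for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(finished), limiter.peak)
```

All six requests complete, but never more than two run at the same time, which is the behaviour the -limit flag is meant to provide per host.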

Go Grid Router (aka ggr)

Do you remember that GridRouter is also a Java application? We made an effort and created a lightweight Go implementation called Go Grid Router (or simply ggr). What are the benefits?

Increased performance. It can serve at least 25% more requests.
Lower memory consumption. Under a load of 150 rps it consumes 100–200 megabytes of RAM, and this consumption remains stable.
Client disconnect issue fixed. When a client disconnects (e.g. because of a timeout) the original GridRouter continues its attempts to create a new session. This clutters the network and decreases GridRouter performance when too many hubs become unavailable. The Go implementation, in contrast, stops new attempts as soon as the client disconnects.
Graceful restart. When used outside of Docker containers, you can restart the server gracefully (without losing client requests) by sending SIGUSR2.
Quota reload on request. When using multiple GridRouter instances behind a load balancer, it’s important to update quota XML files synchronously. When you add new hub hosts and update XML files on a running cluster, a quota inconsistency may occur. In that case some client sessions can get a 404 error because not all GridRouter instances have the latest host lists installed yet. The Go implementation does not reload quota files automatically but waits for a SIGHUP signal. This works with both the standalone binary and the Docker container.
Encrypted passwords. Ggr uses Apache htpasswd files with a plain text user name and an encrypted password.
Reduced binary size. Currently it’s about 6 megabytes. No need to download and install Java. When packed inside an Alpine Docker container, the total container size is 11 megabytes.
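
The "reload only on request" idea from the list above can be sketched in Python as follows. Everything here is an assumption made for illustration: the QuotaStore class is hypothetical, and the XML layout is a deliberately simplified stand-in, not the real GridRouter quota schema. The point is only that a changed file on disk has no effect until a reload is explicitly triggered, for example by SIGHUP.

```python
import os
import signal
import tempfile
import threading
import xml.etree.ElementTree as ET


class QuotaStore:
    """Keeps the host list in memory and replaces it only when reload() is called."""

    def __init__(self, path):
        self.path = path
        self.hosts = {}
        self.reload()

    def reload(self, signum=None, frame=None):
        # Parse into a fresh dict, then swap the reference in one step,
        # so readers never see a half-updated host list.
        fresh = {}
        for browser in ET.parse(self.path).getroot().iter("browser"):
            fresh[browser.get("name")] = [
                (host.get("name"), int(host.get("port")))
                for host in browser.iter("host")
            ]
        self.hosts = fresh


def install_sighup_handler(store):
    """Reload on SIGHUP; handlers can only be installed from the main thread."""
    if hasattr(signal, "SIGHUP") and threading.current_thread() is threading.main_thread():
        signal.signal(signal.SIGHUP, store.reload)
        return True
    return False


def write_quota(path, host):
    # Simplified, hypothetical layout used only for this sketch.
    with open(path, "w") as f:
        f.write('<browsers><browser name="firefox">'
                '<host name="%s" port="4444"/></browser></browsers>' % host)


quota_file = os.path.join(tempfile.mkdtemp(), "quota.xml")
write_quota(quota_file, "hub-1")
store = QuotaStore(quota_file)
install_sighup_handler(store)

write_quota(quota_file, "hub-2")   # the file changes on disk...
before = store.hosts["firefox"]    # ...but the running instance keeps the old list
store.reload()                     # what the SIGHUP handler would do
after = store.hosts["firefox"]
print(before, after)
```

With several instances behind a load balancer, you would first update the file everywhere and then signal all instances, so that they switch to the new host list at roughly the same moment.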

When combined with Selenoid, ggr makes it possible to build scalable and robust clusters.

Conclusion

In this part I told you about the bleeding-edge technologies that can be used to organize a Selenium cluster in a modern way:

Why Selenium is well suited to running in a container environment
What should be put inside a container
Which open source solutions exist to achieve this

In the next part I am going to show how to start using the solutions described in this part on popular cloud platforms.

À bientôt…