The Selenium project, launched in 2004, has become an industry standard for browser automation. However, if your QA department is relatively big, sooner or later you will run into the limitations of the recommended Selenium architecture. In this article I would like to show you how to build a scalable and fault-tolerant Selenium solution easily.
The Selenium architecture has changed radically several times since 2004, when its first prototype was created. The current architecture, introduced in the 2.0 branch, is called Selenium Grid. It works as follows:
Usually a cluster consists of two daemon applications: Selenium Hub and Selenium Node. A hub is an API that handles user requests and redirects them to the respective nodes. A node is the actual request executor: it launches browser processes and sends them the requested test steps. In theory, an unlimited number of Selenium Nodes can be connected to one Selenium Hub, and every node can launch any installed browser. But what happens in practice?
The simplest scalable approach is to use multiple Selenium Hubs distributed across multiple datacenters. However, the standard Selenium libraries can only work with one Selenium hub, so we need to teach them to work with such a distributed system.
The initial approach, which we used successfully several years ago, was a client library that did client-side load balancing. This is how it works:
A single line of test code (the new session request) has to be changed to support that library. For example, in Java tests a new session request may look like this:
WebDriver driver = new RemoteWebDriver(new URL("http://my-hub.example.com:4444/wd/hub"), caps);
All classes in this code come from the standard Selenium Java client. If, for example, the client-side library is called SeleniumHubFinder, a new session request will look like this:
WebDriver driver = SeleniumHubFinder.find(caps);
No Selenium hub URL is used in the updated code; this information is stored inside the client library. That’s it! This approach worked for years, and hundreds of software testers in our company were satisfied. What are the drawbacks of using a client library?
Based on our experience with the client-side solution, we set the following natural requirements for the server-side one:
We called the server GridRouter because the only thing it does is route user requests to the correct Selenium Grid hub. Here’s the new architecture:
Initially we implemented GridRouter using Java, Jetty and Spring Framework. Its source code is available on GitHub. This implementation uses a plain-text properties file to store the list of users and an XML file to store the list of Selenium hubs for each user. A typical users list (by default /etc/grid-router/users.properties) looks like this:
user:password, user
user2:password2, user
Every line corresponds to one user. Passwords in the current implementation are stored without any encryption, because users exist mainly to account for browser consumption by different teams. Selenium hub lists are stored in XML files of the following format (by default /etc/grid-router/quota/*.xml):
<qa:browsers xmlns:qa="urn:config.gridrouter.qatools.ru">
    <browser name="firefox" defaultVersion="33.0">
        <version number="33.0">
            <region name="us-west">
                <host name="ff33-hub-1.example.com" port="4444" count="5"/>
            </region>
            <region name="us-east">
                <host name="ff33-hub-2.example.com" port="4444" count="5"/>
            </region>
        </version>
        <version number="37.0">
            <region name="us-west">
                <host name="ff37-hub-1.example.com" port="4444" count="3"/>
                <host name="ff37-hub-2.example.com" port="4444" count="4"/>
            </region>
            <region name="us-east">
                <host name="ff37-hub-3.example.com" port="4444" count="2"/>
            </region>
        </version>
    </browser>
    <browser name="chrome" defaultVersion="42.0">
        <version number="42.0">
            <region name="us-west">
                <host name="ch42-hub-1.example.com" port="4444" count="10"/>
            </region>
            <region name="us-east">
                <host name="ch42-hub-2.example.com" port="4444" count="10"/>
            </region>
        </version>
    </browser>
</qa:browsers>
You can see that we define the available browser names, their versions and a set of hosts distributed across multiple regions. A region in our terms is just a datacenter. Information about datacenters matters mainly when one datacenter goes down: if the first session attempt fails, we select a host from another datacenter. This makes it more likely that the next attempt succeeds, so the session is created faster.
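To make the region failover more concrete, here is a minimal Java sketch of how such a selection could work. It is not GridRouter's actual code: the class HostSelector, its methods and the Host type are hypothetical names, and treating the count attribute as a positive capacity weight is an assumption on my part. The tryCreateSession callback stands in for the real request that asks a hub to start a browser.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch; class and method names are not taken from GridRouter.
public class HostSelector {

    /** One <host> entry from the quota file together with its enclosing <region>. */
    public static final class Host {
        public final String region;
        public final String name;
        public final int port;
        public final int count; // assumed to be a positive capacity weight

        public Host(String region, String name, int port, int count) {
            if (count <= 0) {
                throw new IllegalArgumentException("count must be positive");
            }
            this.region = region;
            this.name = name;
            this.port = port;
            this.count = count;
        }
    }

    private final Random random;

    public HostSelector(Random random) {
        this.random = random;
    }

    /** Picks one host at random; a host with a larger count is chosen more often. */
    public Host pickWeighted(List<Host> hosts) {
        int total = 0;
        for (Host host : hosts) {
            total += host.count;
        }
        int ticket = random.nextInt(total); // 0 <= ticket < total
        for (Host host : hosts) {
            ticket -= host.count;
            if (ticket < 0) {
                return host;
            }
        }
        throw new IllegalStateException("unreachable: counts are positive");
    }

    /**
     * Tries hosts until tryCreateSession accepts one. After a failure, hosts from
     * regions that have not failed yet are preferred. Returns null if every host fails.
     */
    public Host route(List<Host> hosts, Predicate<Host> tryCreateSession) {
        List<Host> remaining = new ArrayList<>(hosts);
        Set<String> failedRegions = new HashSet<>();
        while (!remaining.isEmpty()) {
            List<Host> preferred = new ArrayList<>();
            for (Host host : remaining) {
                if (!failedRegions.contains(host.region)) {
                    preferred.add(host);
                }
            }
            Host host = pickWeighted(preferred.isEmpty() ? remaining : preferred);
            if (tryCreateSession.test(host)) {
                return host;
            }
            remaining.remove(host); // never retry the same hub
            failedRegions.add(host.region);
        }
        return null;
    }
}
```

With the three Firefox 37.0 hosts from the quota file above and a us-west datacenter that is down, route needs at most two attempts: either the first pick already lands in us-east, or the failed us-west attempt steers the second pick there.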
As I said previously, GridRouter implements the standard Selenium protocol and is fully compatible with all existing client libraries. The remaining topic is how to authenticate in GridRouter, i.e. how to specify which quota we want to use. All Selenium client libraries support only one authentication method: Basic HTTP Authentication. That’s why GridRouter supports only this method too. Usually a Selenium hub URL looks like this:
http://example.com:4444/wd/hub
As you probably know, a basic HTTP authentication username and password can be embedded in the URL like this:
http://username:password@example.com:4444/wd/hub
This is the only change you need to make in your code to use GridRouter instead of Selenium Hub. The majority of Selenium client libraries, including the Java and Python implementations, work with this notation. Some Selenium-based JavaScript tools, however, require you to specify the username and password as separate configuration options.
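If you are curious what a client does with credentials embedded in the URL, the following Java sketch shows the standard Basic HTTP Authentication transformation using only the JDK. The class name BasicAuthUrl and its method are hypothetical and exist only for this illustration; real Selenium clients do the equivalent internally, and their exact code may differ.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not part of Selenium or GridRouter.
public class BasicAuthUrl {

    /**
     * Returns the value of the Authorization header for the credentials
     * embedded in the given URL, or null if the URL carries no credentials.
     */
    public static String authorizationHeader(String url) {
        // For "http://username:password@host/..." this is "username:password".
        String userInfo = URI.create(url).getUserInfo();
        if (userInfo == null) {
            return null;
        }
        String encoded = Base64.getEncoder()
                .encodeToString(userInfo.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```

Note that Base64 is an encoding, not encryption, so anyone who can read the request can recover the password. This matches the earlier point that users here serve accounting rather than security.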
GridRouter allowed us to stop using client-side libraries. It gave users of different programming languages access to a scalable Selenium installation. To scale a GridRouter installation you just need to add more Selenium hubs to its XML configuration; all changes are applied automatically without a service restart. To serve more requests per second you also need to add GridRouter hosts behind a load balancer. Our experience shows that GridRouter works perfectly while the share of browsers in use, for any version, stays below roughly 80%. Problems begin when peak load arrives and browser consumption grows to 90–100% of total capacity. In this case the uniform random distribution of session attempts becomes inefficient.
We try to obtain a Selenium session on a fully occupied hub too often and have to make attempts on several hubs before returning a session to the user. This increases session start time and slows down tests. The next stage in our Selenium cluster development, aimed at resolving these issues, was a new product called Selenograph. Selenograph is a Java server based on the GridRouter source code and fully compatible with its configuration files. The main differences are:
In this part I told you about the scalability problems of standard Selenium and how they can be resolved with minor changes to your cluster architecture. In the next part we’ll discuss topics like:
Stay tuned…