Selenium has been the de facto standard for browser/UI automation for quite some time.
The creators of Selenium also created Selenium Grid for scaling UI testing. The grid consists of a Selenium hub and several Selenium nodes; the nodes host the browsers, which makes it possible to run many UI tests simultaneously. Selenium also provides easy ways to customize and extend its capabilities. For most use cases the hub/node architecture scales well and needs little customization.
Although the architecture supports scaling the Selenium nodes, it does not make it easy to scale the hub itself. A typical deployment has a single hub with several nodes attached to it.
The mapping of sessions to browser instances is maintained in memory inside the hub, so if the hub fails, the whole grid becomes unusable.
A simple way to get some level of HA is to run at least two hubs.
Requirement of sticky sessions
Since Selenium hubs can't talk to each other, distributing load across multiple hubs needs some kind of stickiness: once a session is created on one hub, all subsequent requests for that session must reach the same hub.
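For context, WebDriver clients address a session through its ID embedded in the URL path, which is what makes per-session routing possible at all. The endpoints below follow the standard WebDriver wire protocol; the session ID shown is a placeholder:

```
POST   /wd/hub/session                       → create a session; the response carries its ID
GET    /wd/hub/session/<session-id>/url      → every subsequent command embeds the session ID
DELETE /wd/hub/session/<session-id>          → end the session
```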
Session stickiness is not a new problem. For browser-based clients it is typically done via cookies: the LB looks at the cookie passed with each request to decide which upstream server the request should be forwarded to.
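As a sketch of what cookie-based stickiness looks like in plain nginx, the `hash` directive can key the upstream choice on a cookie value (the cookie name and hub addresses here are illustrative assumptions):

```nginx
upstream hubs {
    # Requests carrying the same "route" cookie always hash to the
    # same hub. Requests without the cookie all hash to one hub,
    # so a real setup must ensure the cookie is always set.
    hash $cookie_route consistent;
    server hub1.internal:4444;
    server hub2.internal:4444;
}
```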
Cookie based stickiness ruled out
Passing this extra bit of information via a cookie is not possible in Selenium automation scenarios, so this was not an option for our use case.
Client IP based stickiness
One option that we have been using so far is stickiness based on client IP: all calls originating from one client IP always land on a given hub. While this works for most scenarios, the load is not distributed evenly across the hubs.
This particular technique solves the HA problem as such, but it is not a great solution for scaling when the load is uneven.
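Client IP stickiness needs no cooperation from the client at all; in plain nginx it is a one-line directive (hub addresses are illustrative):

```nginx
upstream hubs {
    # All requests from the same client IP are routed to the same hub.
    # Simple and stateless, but one busy client can overload one hub.
    ip_hash;
    server hub1.internal:4444;
    server hub2.internal:4444;
}
```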
Uneven load and uneven test cases in test suites
A large test suite with a huge number of UI tests can completely choke the single hub on which all of its tests land.
To make our test infrastructure more reliable and robust, we had to explore a different avenue.
OpenResty is a software bundle of nginx with some high-quality Lua libraries included. It lets you customize load-balancing behavior and add custom logic.
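The key property is that Lua code runs inside nginx's own request lifecycle, so no separate application server is needed. A minimal illustration (the `/ping` endpoint is made up for this example):

```nginx
location /ping {
    # Lua executes in-process within the nginx worker handling
    # the request, with full access to request/connection state.
    content_by_lua_block {
        ngx.say("pong from ", ngx.var.server_addr)
    }
}
```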
We can add OpenResty as a proxy layer in front of the hubs.
We can deploy a couple of OpenResty instances in front of the hubs, and the LB forwards requests to one of these instances in round-robin fashion (earlier, the LB forwarded requests directly to a hub using client IP hashing).
The OpenResty instances work as reverse proxies, and every hub is an upstream server for each OpenResty instance.
The way the request forwarding works is: a new-session request can be routed to any hub, while every subsequent request carries its session ID in the URL path, and the embedded Lua code uses that ID to route the request to the hub that owns the session.
Sample nginx conf with embedded Lua code:
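The sketch below shows one way this routing can be made stateless, so that any OpenResty instance can route any request without shared storage: the hub's identity is embedded into the session ID returned to the client, then parsed and stripped on subsequent requests. The hub addresses, the `hubN-` prefix scheme, and the helper names are assumptions for illustration; error handling and response-body buffering (a real body filter must cope with chunked responses) are omitted:

```nginx
http {
    upstream hub1 { server 10.0.0.1:4444; }
    upstream hub2 { server 10.0.0.2:4444; }

    server {
        listen 4444;

        location /wd/hub/ {
            set $target "";
            set $is_new_session "";

            rewrite_by_lua_block {
                -- Session-bound requests look like:
                --   /wd/hub/session/hub<N>-<original-id>/...
                local m = ngx.re.match(ngx.var.uri,
                    [[^/wd/hub/session/hub(\d+)-([^/]+)(.*)$]], "jo")
                if m then
                    -- Route to the hub encoded in the prefix and
                    -- restore the original session ID for the hub.
                    ngx.var.target = "hub" .. m[1]
                    ngx.req.set_uri("/wd/hub/session/" .. m[2] .. m[3])
                else
                    if ngx.req.get_method() == "POST"
                       and ngx.var.uri == "/wd/hub/session" then
                        ngx.var.is_new_session = "1"
                    end
                    -- New session: pick any hub (random here;
                    -- a least-loaded choice would also work).
                    ngx.var.target = "hub" .. math.random(1, 2)
                end
            }

            proxy_pass http://$target;

            header_filter_by_lua_block {
                -- Body length changes when we rewrite the session ID.
                if ngx.var.is_new_session == "1" then
                    ngx.header.content_length = nil
                end
            }

            body_filter_by_lua_block {
                -- Prefix the session ID in the new-session response so
                -- the client echoes it back on every later request.
                local body = ngx.arg[1]
                if ngx.var.is_new_session == "1" and body then
                    local n = ngx.var.target:match("hub(%d+)")
                    ngx.arg[1] = ngx.re.sub(body,
                        [["sessionId"\s*:\s*"([^"]+)"]],
                        '"sessionId":"hub' .. n .. '-$1"', "jo")
                end
            }
        }
    }
}
```

With this scheme the proxy layer itself holds no session state, which is what allows the LB to spray requests across OpenResty instances in plain round-robin.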
Having solved scale and availability with the OpenResty proxy, we can also use this proxy layer to build more features on top of the Selenium hub.
Selenium grid: https://www.seleniumhq.org/docs/07_selenium_grid.jsp