Disclaimer - the thoughts in this post are mine alone and the "we" in the title refers to an open-source project where I happen to be the lead developer.

I wasn't really planning to write this blog post, but I'm quite concerned by this incident on Twitter - where a certain tool vendor was compelled to take down a blog post. The blog post in question was satire, and poked fun at Selenium - but it apparently hurt the feelings of some of the developers.

Now, I've been doing open-source for a while and I know how thankless a job it can be. But I've also drawn my fair share of attack from the big-guns of the "testing community" when I've published technical comparisons with other open-source projects. I've come to the inevitable conclusion that you will end up upsetting someone or the other on the internet - no matter how data-backed and objective you try to be. But still - it really irks me that certain influential personalities in test-automation circles do their best to shut down any kind of criticism. And in this case seem to have succeeded.

If you open-source something, you have to face the cold, hard reality that someone will criticize it at some point. One has to grow up and deal with it. Not shoot the messenger.

If things continue like this, we can't have progress. I'm going to throw caution to the winds, put myself squarely in the line of fire and get a few things off my chest.

I have been working on creating an open-source alternative to Selenium for a few months now. This did not happen overnight, and in the true spirit of open-source - I genuinely explored the option of contributing to the Selenium project. But I eventually chose the option of doing a complete re-write - and will explain my reasoning below.

I will try to be respectful of the efforts that various developers over the years have put into the Selenium project.
But I believe the project has some technical shortcomings, and it is important to discuss these - even though it may make some people feel uncomfortable. So here goes.

1) Contributing to Selenium is really hard - Selenium has been around since 2004, which is a terrific achievement, but it does mean that the code-base has a strong "legacy" feel to it - and has a lot of "surface area" and dead-code. I found it very hard to navigate the code base, looking for answers to what I thought would be simple questions. As a long term Java and open-source programmer, I am very particular that a project should be "build-able" locally after just doing a git clone and installing one or two development tools - but Selenium is very far from this ideal state. In fact, the Selenium committers regularly conduct a full-day workshop for those who aspire to contribute code to the Selenium project.

What I especially don't like is that the project keeps changing the build tool. Originally I guess it was Maven, then it became "Buck" and now they moved to "Bazel", putting it even more out of the comfort zone of regular Java programmers like me. But there is a good reason why they need these "exotic" build tools - which is that Selenium needs to support multiple programming languages. Which brings me to my next point.

2) Selenium has to support multiple language bindings - One of the main reasons for the code base being so hard to understand and the need for such inscrutable build tools is this overarching architecture decision - that Selenium has to support at least all the following languages - all at the same time: 1) Java, 2) Python, 3) C# / .NET, 4) JavaScript, 5) Ruby.

Now this is understandable, we programmers are a very fussy and opinionated lot, we pick a language of choice and then we tend to look down our noses at any other language. But I can't get the thought out of my head that when I look at the Selenium project, I see a massive duplication of effort and waste of energy.
All we are trying to do is remote-control a web-browser, but the Selenium project ends up implementing the same thing 5 times over. Along with all the obvious challenges of having to co-ordinate releases across a relatively large development team - and having to document all of these things.

What is the alternative, you may ask. The way we have approached this in Karate is by creating a DSL (Domain Specific Language) which is programming-language "neutral". So although the engine and implementation behind the scenes is pure Java, users have to use this "one true" DSL for scripting tests. Since Karate can be used as a binary executable via the command-line, you don't need to be a hard-core Java programmer and this has been working well so far for API testing. And I believe browser-automation is yet another domain where teams should not insist on using their "favorite" programming language - if a simpler, cross-platform scripting option exists.

So I'm treating the fact that Selenium has to "spread itself thin" - as a weakness. Of course this is just my opinion, and time will tell.

3) Selenium is incomplete as a testing-framework - this seems to be by design, and I know the developers themselves will agree that this is true. To be really useful, a testing framework has to address the following concerns, there are more, but let me pick some obvious ones:

- CI integration
- Configuration and Environment Switching
- Grouping / Tagging
- HTML / reporting
- Parallel-execution
- Assertions

Selenium does not solve for these, it is assumed that all these needs will be filled either by separate unit-testing frameworks, home-grown frameworks or 3rd-party "wrapper" frameworks. Not surprisingly, there is a proliferation of frameworks both open-source and not, that layer themselves over Selenium to address these concerns. A pet peeve of mine is that many in the test-automation community have an unhealthy obsession with "creating frameworks" instead of focusing on testing - and I consider Selenium to blame.

Karate has everything built-in, and you truly need only one library that does the job.

4) The WebDriver W3C spec has limitations - Selenium depends on the W3C WebDriver specification - the design of which is greatly influenced by the internal wire-protocol that Selenium used to use in the old days. The concept has certainly stood the test of time, but has been criticized for a few reasons, which have been well described elsewhere. To summarize, it is stateless, requires multiple network "hops" to achieve simple browser-automation primitives, and is very loosely-coupled with the browser, which may sound like good design - but ends up being a limitation when you want to closely track what's happening within the browser. Chrome-native automation has emerged nowadays as a popular option, and even the Selenium team is exploring how to bolt on some of the advantages - but all that is still work-in-progress at this point.

Karate implements the Chrome DevTools Protocol without depending on any 3rd-party library. What's unique about Karate is that it also implements the WebDriver spec (also from scratch) and layers both options over a unified interface. You can indeed write your test-suites for Chrome, do all your dev and testing with it, and then swap out the config for another browser at run-time.

5) Limited Debug Support - some of the architecture decisions we made, such as re-writing the parts of Cucumber we needed - have turned out to be extremely good ones. The fact that we now have an execution engine written from the ground up, and the fact that this happens to be an "interpreted" language that falls into the "keyword driven" category - means that we can do some pretty cool things, technically. I don't know of any other UI automation framework where you can step backwards during a debug-session and hot-reload code.

6) Selenium "Waits" are hard to understand - one of the first things that hits you when you get into Selenium is the system of "Implicit Waits", "Explicit Waits" and "Fluent Waits" that everybody seems to be talking about and is the subject of many a blog post, tutorial and "interview-question" peddler. And I'm going to admit that I still don't fully understand what they do and how to get them to work. I was determined to design a better, less-confusing API when working on Karate's UI automation implementation - and here is the result.

7) The Element locator API is clunky - Another pet-peeve that I have when I observe the Selenium ecosystem is the amount of time spent obsessing over patterns such as the so-called "Page Object Model". I have a point-of-view that needing such patterns is a sign of weakness in your API, and it is because your API is not concise enough that people have to layer things on top of it in the pursuit of "re-use". And most of the Selenium "wrappers" don't do a good enough job IMO.

Karate from day-one has had a reputation of making commonly needed operations into one-liners, which is what you would expect from a true DSL. This section of the documentation describes our approach to the need to achieve "re-use" while still ensuring that your main "flow" remains readable. There are some interesting (and possibly controversial) decisions made - such as that the locator string can have a "prefix" that encodes the type of locator. See if you can spot some "friendly locators" below.

And by the way, I still don't understand what on earth a "Screenplay Pattern" is. So sue me.

8) Parallel and Distributed Testing is hard - and can be made simpler. You can't have a discussion on Selenium without the word "grid" coming up at some point. While doing "grid stuff" was really hard in the past - it has come a long way though, and there are Zalenium and 3rd party alternatives such as Selenoid / Aerokube that make things easier nowadays. Surprisingly, most testing frameworks fail at parallel testing, but Karate has the advantage that since it started out as a "headless" API testing framework, it had to get parallel testing right - from day one, and this is part of the core, not an add-on.

And we are experimenting with a way to do distributed parallel testing without needing to provision or install a "grid" equivalent, and which just works in your existing CI set-up - so please help us test this if you can !

9) Hybrid API and UI tests are hard - this is probably an unfair advantage that Karate has, because it started out as an API testing tool, which I strongly believe is the right architectural "sequence". I see many tool vendors and projects that started out as UI automation solutions - but then had to retro-actively "bolt on" poor substitutes for API testing. More and more people are catching on to the fact that moving down the "testing pyramid" is a good thing.

So with Karate you can indeed achieve very effective approaches such as getting an Auth token via an API call and then dropping a cookie or two onto a blank page - which means that you can completely by-pass a time-consuming and potentially "flaky" sign-in "form" or UI. Karate has a serious advantage as a framework that can do both API and UI testing - within the same syntax, within the same test flow.

Karate happens to have API mocking (which can even serve HTML) and also performance-testing capabilities, and we are excited about how these can potentially complement - or be mixed into UI testing in the future.

By the way, we have contributors working on Appium support, which is a sign that Karate is flexible enough to handle mobile and even desktop application testing within the same core framework.
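
To make the hybrid flow from point 9 a little more concrete, here is a rough sketch of the order of operations: get a token over the API, open a blank page, drop the cookie, then go straight to the page you actually want to test. It is written in plain Python only so that it runs anywhere without a browser, and it is not Karate syntax or Karate's implementation. Every name in it (`fetch_token`, `sign_in_without_form`, `RecordingBrowser`, `fake_post`, the `auth` cookie name and the `app.example` URLs) is hypothetical and made up for illustration.

```python
import json
from dataclasses import dataclass, field


def fetch_token(post, login_url, username, password):
    """Call the sign-in API and return the auth token.

    `post` is any callable (url, payload_dict) -> response body as a str.
    It is injected so that this sketch runs without a network.
    """
    body = post(login_url, {"username": username, "password": password})
    return json.loads(body)["token"]


def sign_in_without_form(browser, token, blank_url, home_url, cookie_name="auth"):
    """Seed the session cookie on a blank page, then open the real page."""
    browser.open(blank_url)  # load a lightweight page first, so there is somewhere to drop the cookie
    browser.set_cookie({"name": cookie_name, "value": token})
    browser.open(home_url)  # the sign-in form is never visited


@dataclass
class RecordingBrowser:
    """Hypothetical stand-in for a real driver: it only records what it was asked to do."""
    actions: list = field(default_factory=list)

    def open(self, url):
        self.actions.append(("open", url))

    def set_cookie(self, cookie):
        self.actions.append(("cookie", cookie["name"], cookie["value"]))


def fake_post(url, payload):
    """Canned response standing in for a real sign-in endpoint (an assumption of this sketch)."""
    return json.dumps({"token": "t-" + payload["username"]})


browser = RecordingBrowser()
token = fetch_token(fake_post, "https://app.example/api/login", "john", "secret")
sign_in_without_form(browser, token, "https://app.example/blank", "https://app.example/home")
print(browser.actions)
```

In a real test the recording stand-in would be replaced by an actual browser driver and `fake_post` by a real HTTP call. The point is only that the API step and the UI step sit in one flow and share the token.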

10) Executing JavaScript in the Browser is hard - in the Selenium world, you will find plenty of examples and references to the "JavascriptExecutor" and the various equivalents in the different language bindings and "wrappers" - and this is another area we've tried our best to simplify in Karate. Most of the time, you just need to execute a snippet of JS against a given HTML DOM element. Karate recognizes this common need - and makes it dead simple.

So I'll stop here with these top 10 ! Karate is certainly a new entrant into the UI automation "wild west", but the community reception and adoption of the API testing capabilities has been extraordinary, and we are confident that the community will appreciate the things that we have tried to improve, and the things that we have tried to do differently.

Please try it out, let us know what works and what doesn't - and if you found the arguments above compelling, please support this project by spreading the word - even if you can't contribute code ! Thanks :)