
Open-sourcing Ferrum: a fearless Ruby Chrome driver

by Iurii Gurzhii, February 29th, 2020



If you want to run integration tests on your website, you have three options: Poltergeist, Selenium and now, a new secret weapon — Ferrum.

Poltergeist is fantastic, but unfortunately, it’s really quite outdated now. Selenium requires installing additional software, it’s slower, and it doesn’t give you full control of the browser. It’s definitely not an option for everyone. Ferrum is faster than Selenium, there’s no additional software to install, and with it, you have complete control of the browser.

Back in the 1970s, when Unix first appeared on the horizon, almost everything was headless: the vast majority of the tools you used had no visible UI at all, apart from printing error messages when something went wrong.

Web start-up philosophy tends to be a little different from that of Unix. Start-ups can evolve in many ways, and we’re constantly having to make trade-offs. Interpreted languages, dynamic typing, and automatic memory management all make life much easier than the tools the original Unix developers had to work with, and they allow us to write very readable code, but that code can still be error-prone. The quicker we can write integration tests and run them, the lower the chances of missing a bug, and that’s where Ferrum, written by our colleague Dmitry, comes in.

Meet CDP

CDP stands for “Chrome DevTools Protocol”. It’s not something especially new, and you’ve probably seen Chrome’s Web Inspector: CDP is what powers the Inspector under the hood.

CDP itself is a fairly simple protocol, based upon JSON-RPC. The API is split into multiple domains targeting different aspects of Chrome: the core browser application is in the Browser domain, a particular page is dealt with by the Page domain, you can interact with the DOM tree using methods in the DOM domain, and so on. There are quite a number of different domains, each of which has its own set of methods that you can call.
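
To make the message shape concrete, here is a minimal sketch that uses only Ruby's standard library. It builds a command as a JSON object with an id, a "Domain.method" name, and params, then matches a reply to it by id. The helper names (cdp_command, split_method, reply_for?) and the example URL are made up for illustration; they are not part of Ferrum or of Chrome.

```ruby
require "json"

# Build a CDP-style command: a client-chosen id, a "Domain.method" name,
# and a params object, serialized as JSON.
def cdp_command(id, name, params = {})
  JSON.generate({ id: id, method: name, params: params })
end

# Split a name such as "Page.navigate" into its domain and method parts.
def split_method(name)
  domain, method_name = name.split(".", 2)
  { domain: domain, method: method_name }
end

# A reply belongs to a command when it carries the same id.
def reply_for?(raw_reply, id)
  JSON.parse(raw_reply)["id"] == id
end

puts cdp_command(1, "Page.navigate", { url: "https://example.com" })
# prints {"id":1,"method":"Page.navigate","params":{"url":"https://example.com"}}
```

A real client sends this string over the WebSocket and waits for a reply with the same id; Ferrum takes care of that bookkeeping for you.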

Ferrum — a secret weapon for Chrome

Now that we’ve covered an introduction to CDP itself, let’s move on to Ferrum. Ferrum is a Ruby gem that controls Chrome through a WebSocket using the Chrome DevTools Protocol, and provides you with a high-level API to it.

Let’s try using Ferrum to take a screenshot of the Google homepage:

require "ferrum"
browser = Ferrum::Browser.new
browser.go_to("https://google.com")
browser.screenshot(path: "google.png")

And that’s all there is to it! Using Ferrum really is simplicity itself.

Just because it’s easy to use doesn’t mean it isn’t capable of a lot: there are a great many CDP features already supported, beyond basic navigation, search, and screenshots. You can intercept network traffic, stub responses, control authentication, and modify cookies, headers, and scripts; you can even send mouse and keyboard events that are indistinguishable from real ones!

(The one thing it can’t do is solve a CAPTCHA for you… yet).

Fully thread-safe control

Ferrum was written to be fully thread-safe from the outset. You can work with dozens of open tabs from the same Chrome process using multiple threads, and each page is maintained independently through a dedicated WebSocket. For example:

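require "ferrum"

browser = Ferrum::Browser.new
context = browser.contexts.create

# Each thread opens its own page (tab) inside the shared context.
# Call page.go_to(url) before the screenshot to capture a real site.
t1 = Thread.new(context) do |c|
  page = c.create_page
  page.screenshot(path: "t1.png")
end

t2 = Thread.new(context) do |c|
  page = c.create_page
  page.screenshot(path: "t2.png")
end

[t1, t2].each(&:join)
context.dispose
browser.quit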

In the code above we create a context, which is a Chrome feature particularly suited for testing — in effect, it works like a private browsing window, and once you close it, everything is gone. You can create tabs in the same context if they should share some history or data, or separate them into different contexts if they need to be completely independent of each other.

Ferrum + Capybara = Cuprite

Cuprite is a driver for Capybara that uses Ferrum to control Chrome. This means you can use the familiar Capybara API without needing to rely on additional dependencies that other drivers need for headless Chrome. If you’re already using Capybara, Cuprite gives you the benefits of Ferrum without having to learn a new set of APIs.
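
As a rough sketch of what that looks like in a test setup, the configuration below registers Cuprite as a Capybara driver. It assumes the cuprite gem is installed and Chrome is available on the machine; the window_size value is an arbitrary choice for illustration.

```ruby
require "capybara/cuprite"

# Register a driver named :cuprite; window_size is an illustrative value.
Capybara.register_driver(:cuprite) do |app|
  Capybara::Cuprite::Driver.new(app, window_size: [1200, 800])
end

# Use Cuprite for the tests that need JavaScript.
Capybara.javascript_driver = :cuprite
```

With this in place, existing Capybara specs should keep using the same visit, click, and expectation helpers as before.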

Ferrum + Crawling = Vessel

If you’re considering using Ferrum to perform some web crawling, then look no further than Vessel, a Ferrum-based crawling framework for Ruby.

Now that you’ve had a quick introduction to Ferrum, all you need to do is head over to the Ferrum GitHub repository and get started on your next Chrome automation project! Here at Evrone, as well as open-sourcing some of the tools we've created ourselves, we also support other open-source initiatives, including offering our design and identity services to particularly promising and exceptional projects, such as Ferrum.