How To Build a Links Detector That Makes Links in Your Book Clickable

by Oleksii Trekhleb (@trekhleb), Software Engineer

🤷🏻 The Problem

I recently bought a printed book about Machine Learning, and while reading through the first several chapters I encountered many printed links in the text that looked like https://tensorflow.org/ or https://some-url.com/which/may/be/even/longer?and_with_params=true.

I saw all these links, but I couldn't click on them since they were printed (thanks, cap!). To visit these links I needed to start typing them character by character in the browser's address bar, which was pretty annoying and error-prone.

💡 The Solution

So I was thinking: what if, similarly to QR-code detection, we try to "teach" the smartphone to (1) detect and (2) recognize printed links for us and to make them clickable? This way you would do just one click instead of multiple keystrokes. The operational complexity of "clicking" the printed links goes from O(N) to O(1), where N is the number of characters in the link.

This is exactly what I've tried to achieve by building the Links Detector app. It lets you make just one click on the link instead of typing the whole link manually, character by character.

I put together a custom dataset of 120 photos of book pages that contained links. I used the TensorFlow 2 Object Detection API to train a custom object detector model to find the positions and bounding boxes of sub-strings like https:// in the text image (i.e. in the smartphone camera stream). You may find the details of the training in the 📖 👆🏻 Making the Printed Links Clickable Using TensorFlow 2 Object Detection API long-read article.
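To make the detection step a bit more concrete, here is a minimal sketch of how raw detector output could be turned into pixel bounding boxes. It is not taken from the app's source: the `Detection` and `PixelBox` types, the `toPixelBoxes` helper, the normalized `[yMin, xMin, yMax, xMax]` box order, and the `0.5` score threshold are all assumptions made for illustration.

```typescript
// Hypothetical shape of a single detection (illustrative, not the app's real types).
interface Detection {
  // Assumed normalized coordinates in [0, 1], ordered as [yMin, xMin, yMax, xMax].
  box: [number, number, number, number];
  // Confidence that the box contains an "https://" prefix.
  score: number;
}

interface PixelBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Drops low-confidence detections and scales the rest to image pixels.
function toPixelBoxes(
  detections: Detection[],
  imageWidth: number,
  imageHeight: number,
  minScore: number = 0.5, // assumed threshold, tune it for your own model
): PixelBox[] {
  return detections
    .filter((detection) => detection.score >= minScore)
    .map(({ box: [yMin, xMin, yMax, xMax] }) => ({
      x: Math.round(xMin * imageWidth),
      y: Math.round(yMin * imageHeight),
      width: Math.round((xMax - xMin) * imageWidth),
      height: Math.round((yMax - yMin) * imageHeight),
    }));
}
```

Each resulting box marks where a link probably starts in the camera frame. Filtering by score first keeps weak guesses from being passed on to the slower text recognition step.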

The text of each link (the continuation to the right of the https:// bounding box) was recognized using the Tesseract library.
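The "right continuation" idea can be sketched roughly like this. Again, this is only an illustration under my own assumptions, not the app's actual code: the `Box` type, the `linkRegion` and `extractUrl` helpers, and the `4`-pixel padding are hypothetical. The first helper computes the strip of the image that would be handed to the text recognizer, and the second one pulls a link out of the recognized text.

```typescript
// Hypothetical pixel rectangle (illustrative, not the app's real type).
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Region that starts at the detected "https://" prefix and runs
// to the right edge of the image, with a small assumed padding.
function linkRegion(
  prefix: Box,
  imageWidth: number,
  imageHeight: number,
  padding: number = 4,
): Box {
  const x = Math.max(0, prefix.x - padding);
  const y = Math.max(0, prefix.y - padding);
  const bottom = Math.min(imageHeight, prefix.y + prefix.height + padding);
  return { x, y, width: imageWidth - x, height: bottom - y };
}

// Picks the first "https://..." token out of the recognized text
// and strips sentence punctuation that may trail the printed link.
function extractUrl(recognizedText: string): string | null {
  const match = recognizedText.match(/https:\/\/\S+/);
  if (!match) {
    return null;
  }
  return match[0].replace(/[.,;:!?)]+$/, "");
}
```

Recognizing only this narrow strip instead of the whole page keeps the text recognition step small, which matters when everything runs in the browser on a phone.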

🚀 Launch Links Detector demo from your smartphone to see the final result.
📝 Open links-detector repository on GitHub to see the complete source code of the application.

⚠️ Limitations

Currently, the application is in an experimental Alpha stage and has many issues and limitations. So don't raise your expectations too high until these issues are resolved 🤷🏻.

⚙️ Technologies

Links Detector is a pure frontend React application written in TypeScript. Links detection happens right in your browser, without the need to send images to a server.

Links Detector is a PWA (Progressive Web App) friendly application built on top of the Workbox library. While you navigate through the app, it tries to cache all resources to make them available offline and to make subsequent visits much faster for you. You may also install Links Detector as a standalone app on your smartphone.

Links detection and recognition are done by means of the TensorFlow and Tesseract.js libraries, which in turn rely on WebGL and WebAssembly browser support.

Also published at https://towardsdatascience.com/making-the-printed-links-clickable-using-tensorflow-2-object-detection-api-be42bd65488a
