We are all familiar with Google’s reCAPTCHA, the small test that is meant to distinguish us from the machines. For many of us, the reCAPTCHA challenge took the form of two scrambled words that we had to decipher before we were allowed to access a web page. It appeared about three years after Google began its quest to scan every book in the world. The scanning relied on sophisticated OCR technology, but it wasn’t perfect. As one might suspect, the automatic transcription process left many small errors, and it would have been impractical to have people go over the text looking for mistakes. At the same time, bots were becoming a nuisance, and the need to distinguish genuine users from bots became clear.
reCAPTCHA was originally developed at Carnegie Mellon University and was later acquired by Google. The idea was simple: show the user two images, one of a word whose transcription was already known and one of a word the OCR could not read, and ask them to transcribe both. The known word served as the actual test. If the user got it right, their answer for the unknown word was recorded as a vote, and once enough people had agreed on the same transcription, it was accepted.
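To make the known-word/unknown-word idea concrete, here is a minimal Python sketch of how such a scheme could work. It is only an illustration, not reCAPTCHA's actual implementation: the class name `WordPairChallenge`, the `submit` method, the word ids, and the agreement threshold are all assumptions of mine.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: how many matching answers are needed before an
# unknown word is accepted. The real value is not stated anywhere above.
MIN_AGREEING_VOTES = 3


class WordPairChallenge:
    """Toy model of the two-word scheme: one known word, one unknown word."""

    def __init__(self, min_votes=MIN_AGREEING_VOTES):
        self.min_votes = min_votes
        # unknown word id -> Counter of transcriptions submitted so far
        self.votes = defaultdict(Counter)
        # unknown word id -> transcription accepted by consensus
        self.accepted = {}

    def submit(self, known_answer, known_guess, unknown_id, unknown_guess):
        """Return True if the user passes; count their vote for the unknown word."""
        # The known word is the real test: fail it and the vote is ignored.
        if known_guess.strip().lower() != known_answer.strip().lower():
            return False
        guess = unknown_guess.strip().lower()
        if guess and unknown_id not in self.accepted:
            self.votes[unknown_id][guess] += 1
            # Accept the transcription once enough users agree on it.
            if self.votes[unknown_id][guess] >= self.min_votes:
                self.accepted[unknown_id] = guess
        return True


challenge = WordPairChallenge(min_votes=2)
challenge.submit("morning", "morning", "scan-17", "upon")
challenge.submit("morning", "mourning", "scan-17", "spam")  # fails the known word
challenge.submit("morning", "Morning", "scan-17", "Upon")
print(challenge.accepted)  # {'scan-17': 'upon'}
```

The point of the design is that the user never learns which of the two words is the test, so the only way to pass reliably is to make an honest attempt at both.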
This is how they got many people to transcribe books and newspapers for free. Of course, not everyone was happy, and some people even went as far as suing Google for using their free labour. That wasn’t the only issue: as OCR technology got better, bots became able to solve reCAPTCHAs with 99% accuracy. So Google iterated on it, and now we have the new No CAPTCHA reCAPTCHA, which asks users to pick out specific images from a grid of nine pictures.
Initially the images were of house numbers, presumably to help Google Maps match addresses in Street View. Recently, however, most of the images are of street signs (with the occasional storefront). This leads me to believe that we are all currently teaching our future drivers the basics of driving: the street signs. Who knows what will come next? Maybe we will have to identify potential hazards?