The current world population is about 8 billion, so why aim for just the 331.9 million people in the United States? Well, we just did the impossible, the unthinkable, and the unpredictable. At HackerNoon, we want (or at least we’re trying) to give everyone on Earth with internet access a fair shot at reading the relevant content that is moving the world.
We have used machine learning to translate all top stories on our platform from English into Spanish, Hindi, Mandarin, Vietnamese, French, Portuguese, and Japanese, and we will keep translating new top stories as they come in. Top stories now show all of these language options above the feature image.
Also, if you navigate to the tag page of a specific language, you will notice that the entire page is in that language. For example, https://hackernoon.com/tagged/hackernoon-hi shows only in Hindi. The same goes for the other languages.
Well, thanks to the new poll system created by Jeferson, we were able to ask users which languages they would like to read stories in. We also cross-referenced the answers with our existing readership. The results of the poll were pretty clear, so we decided to move forward with the project. Check the results right here:
We started with the Google Translation API. We really like its accuracy, and with a diverse team, we are able to check the content in most languages to make sure the article translations are reliable. After seeing the simplicity of the API, I think this was the best choice. We are exploring various rules and tools to improve the base translation, but in the long run, we will be betting on the community to improve these translations (more to come at a later date!). We also created a new database to store translated articles, which keeps them separate from the original content while still linking each translation back to its original.
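To make the storage idea concrete, here is a minimal sketch of how a translated article might be kept apart from the original while still pointing back to it. Everything named here is an assumption for illustration: the `TranslatedStory` record, the field names, the language codes, and the in-memory `store` are hypothetical, and `fake_translate` is a stand-in for a real call to a translation service, not HackerNoon's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Assumed language codes for the seven target languages (illustrative only).
LANGUAGES = ["es", "hi", "zh", "vi", "fr", "pt", "ja"]


@dataclass
class TranslatedStory:
    original_id: str  # links the translation back to the English story
    language: str
    title: str
    body: str


def translate_story(
    story: dict, language: str, translate: Callable[[str, str], str]
) -> TranslatedStory:
    """Build one translated record; `translate` is injected by the caller."""
    return TranslatedStory(
        original_id=story["id"],
        language=language,
        title=translate(story["title"], language),
        body=translate(story["body"], language),
    )


def translate_to_all(
    story: dict,
    translate: Callable[[str, str], str],
    store: Dict[Tuple[str, str], TranslatedStory],
) -> Dict[Tuple[str, str], TranslatedStory]:
    """Store one record per language, keyed by (original id, language)."""
    for language in LANGUAGES:
        store[(story["id"], language)] = translate_story(story, language, translate)
    return store


def fake_translate(text: str, language: str) -> str:
    # Placeholder for a real translation call; it only tags the text.
    return f"[{language}] {text}"


story = {"id": "story-1", "title": "New story", "body": "Hello, world"}
store = translate_to_all(story, fake_translate, {})
```

Keying each record by the original story's id plus the language is one simple way to get the correlation between the two sets of content: given an English story, every translation can be found, and given a translation, so can its source.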
The hardest part of this project was creating a framework that loads the static data of a page in the language of that page. The idea here is that if someone is reading a story in French, static data (text that does not change) such as “New story” will show its translated version, “Nouvelle histoire”, and the same goes for the other languages.
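As a rough sketch of that idea, the static text can be pictured as one object per language, looked up by the language of the page. The `STATIC_TEXT` object, the `new_story` key, and the fallback to English are assumptions for illustration; a dictionary stands in here for whatever is actually stored and served.

```python
# Hypothetical static strings, one dictionary per language code.
STATIC_TEXT = {
    "en": {"new_story": "New story"},
    "fr": {"new_story": "Nouvelle histoire"},
}

DEFAULT_LANGUAGE = "en"


def load_static_text(language: str) -> dict:
    """Return the strings for a language, or the English set if it is missing."""
    return STATIC_TEXT.get(language, STATIC_TEXT[DEFAULT_LANGUAGE])


def label(key: str, language: str) -> str:
    """Look up one static label, falling back to English for untranslated keys."""
    strings = load_static_text(language)
    return strings.get(key, STATIC_TEXT[DEFAULT_LANGUAGE][key])


print(label("new_story", "fr"))
```

Falling back to English is a design choice in this sketch, so a page in a language with incomplete strings still renders something readable.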
There are multiple ways of accomplishing this. I decided to simply create an object to store the text of each language, store that object in our database, and then load it via API depending on the language of the page. After the static translation was done, all that was left was to actually translate the top stories via a script. It was a long process: it took about two weeks to translate all the stories into all of these languages. The best part is that all translated stories are also added to the HackerNoon sitemap with metadata in the language of the page, which should make them easier to find and share. Like this search in Vietnamese on Google:
I hope you all enjoy reading in your native language. Comment and share!