Our current design sprints aim to improve how we add new languages to our text classification system.
Translations, l33t speak and other nuances make it impossible for the machine to do it all. Human verification ensures a high level of accuracy, although it’s often a slow and arduous process.
A small (but significant) starting point is checking groupings of words for inaccuracies. For example, we can take a set of translations in a new language and compare them to an established language. We let the machine make some assumptions (i.e. we assume most of the words are “correct”) and have humans vet those assumptions.
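As a rough sketch of that triage idea (the function names, confidence scores and threshold below are our own illustration, not the project’s actual code), the machine auto-accepts translations it’s confident about and queues the rest for humans:

```python
# Hypothetical sketch: auto-accept high-confidence translations,
# queue uncertain ones for human review. Threshold is arbitrary.

def triage(pairs, confidence, threshold=0.8):
    """Split (source, translation) pairs into auto-accepted and
    human-review lists based on a model confidence score."""
    accepted, review = [], []
    for pair in pairs:
        if confidence(pair) >= threshold:
            accepted.append(pair)   # machine assumes "correct"
        else:
            review.append(pair)     # humans vet these
    return accepted, review

# Toy usage with made-up confidence scores
pairs = [("red", "rojo"), ("blue", "azul"), ("green", "verde")]
scores = {"rojo": 0.95, "azul": 0.60, "verde": 0.90}
accepted, review = triage(pairs, lambda p: scores[p[1]])
```

The key design choice is that humans only see the `review` queue, which is what keeps verification from being slow and arduous for every single word.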
What’s an appropriate UI for verifying a list of words? Where is the balance between speed and accuracy?
Using a set of words for colours in Spanish, translated through Google Translate, we prototyped our ideas. After some talking and sketching, we gained good insights and were able to make some early decisions.
A/B Testing Layouts
After some initial HTML wireframes, we landed on two different layouts. We wanted to see which one was optimal for speed and accuracy: speed being how many words a person can “verify” per second, and accuracy being how many words they verified correctly.
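To make those two metrics concrete, here’s a minimal sketch (the `Session` structure and sample numbers are ours, purely for illustration) of how each layout’s sessions could be scored and compared:

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Session:
    verified: int   # total words the tester marked
    correct: int    # of those, how many were marked correctly
    seconds: float  # time taken

def speed(s: Session) -> float:
    """Words verified per second."""
    return s.verified / s.seconds

def accuracy(s: Session) -> float:
    """Fraction of verified words that were correct."""
    return s.correct / s.verified

# Hypothetical sessions for each layout
layout_a = [Session(40, 36, 60.0), Session(50, 44, 70.0)]
layout_b = [Session(30, 29, 60.0)]

mean_speed_a = mean(speed(s) for s in layout_a)
mean_accuracy_b = mean(accuracy(s) for s in layout_b)
```

Averaging per-session rather than pooling raw counts keeps one fast tester from dominating a layout’s score.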
Setting the complications of language aside, here’s what we learned about our first interface ideas. That said, the patterns of commonly missed words prompted some great new ideas.
Test users: internal staff, plus people recruited by posting links on Designer News, Dribbble and Hacker News.
FREE CODE: Dig through the repo.
What should we do next? Or differently?
Any feedback or suggestions that would help us are much appreciated.