Which came first, the dataset or the algorithm?

Shruti Gandhi of Array Ventures recently sat down with Lukas Biewald of CrowdFlower to discuss his thoughts on machine learning on this [Array] Podcast. Lukas suggests people embrace the following realizations:

1. You Need A LOT of Data

Unlike in years past, modern machine learning can comb through enormous amounts of training data. In general, the more data you have, the better the machine learning algorithm performs, and the better the algorithm, the better the product can be. For instance, the US Government spent billions of dollars trying to build a comprehensive translation services platform. Google beat them to the punch by using as much data as possible. Even though Google Translate was just a side project for the company, it could crawl millions of international websites to help the algorithm learn.

2. Companies Have To Be Realistic About Expectations

Most people think that computer programs don’t make mistakes. The problem is that they do, especially within the realm of artificial intelligence. The best AI solutions are 80–82% accurate. But many executives question the value proposition if a machine-learning program makes a mistake ~20% of the time. This is understandable, given that mistakes can be costly. So a human is still needed in the loop to handle the roughly 20% of cases that the AI can’t.
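One common way to keep a human in the loop is to let the model handle only the cases it is confident about and send the rest to a person. The snippet below is a minimal sketch of that idea, not something described in the podcast: the `Prediction` class, the `route` function, the sample confidences, and the 0.8 threshold are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str
    confidence: float  # model's estimated probability for `label`, in [0, 1]


def route(prediction: Prediction, threshold: float = 0.8) -> str:
    """Return 'auto' when the model is confident enough, otherwise 'human'.

    The 0.8 default is an illustrative assumption, not a recommended value.
    """
    return "auto" if prediction.confidence >= threshold else "human"


# Made-up predictions standing in for real model output.
predictions = [
    Prediction("cat", 0.97),
    Prediction("dog", 0.55),
    Prediction("cat", 0.81),
    Prediction("dog", 0.42),
    Prediction("cat", 0.90),
]

routes = [route(p) for p in predictions]
human_share = routes.count("human") / len(routes)

print(routes)
print(f"{human_share:.0%} of cases sent to a human reviewer")
```

In practice the threshold would be tuned against the cost of a mistake: the more expensive an error, the more cases it makes sense to hand to a reviewer.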

3. The Development of AI is Just Starting

However, the quality of machine learning is on the rise. In just the last five years, machine learning has improved dramatically.

In his latest article, Dave Gershgorn, a reporter covering AI, said that in 2012, “Google’s neural network […] taught itself to detect the shapes of cats and humans with more than 70% accuracy. It was a 70% improvement over any other machine learning at the time.” Today, AI is even more advanced. Gershgorn goes on to say that “artificial intelligence research has progressed more in the last five years than in the last 50 years in part because so much more data is available to use in training the AI. Much of that progress can be seen in products from Google, Amazon, and Facebook: Your photos can be tagged automatically, your email app knows how you like to respond to emails, or a new smart speaker can use AI to recognize what you’re saying.”

Again, the more data you have, the better the algorithm can be.

However, Lukas believes that expectations should be limited. Making AI algorithms 100% correct is extremely difficult, if not impossible, at this point. But it always helps to dream, and the possibilities increase every day.

Still want more? Check out the entire podcast here. Subscribe to [Array] Podcast to learn an array of hacks and skills from other successful founders.

Shruti Gandhi is managing partner at Array Ventures. Array Ventures is a VC firm focused on investing in founders creating companies that take advantage of data, artificial intelligence, and new behaviors to create new platforms for large markets.