Where Can Dreams Take You? The Art of Teaching Associative Thinking to Machines

Written by lensai | Published 2020/05/10
Tech Story Tags: artificial-intelligence | hackernoon-top-story | startups-top-story | technology | entrepreneur | engineering | machine-learning | ai-top-story

TLDR: LensAI is an AI-powered contextual computer vision ad platform that monetizes any visual content. If you've missed Part I of this story, you can read it here: How We Taught Artificial Intelligence to Sell. In this installment: how we dissected our competitors, built an influence map linking detected objects to advertising categories, and taught a machine associative decision-making.

The idea of LensAI was born last year sometime in May. If you've missed Part I of my story, you can read it here -> How We Taught Artificial Intelligence to Sell.

At the early stage of our product development, we were passionate and full of energy. We wanted to change the world, now and forever. We felt we were innovators the industry had never seen, and that massive changes would come with us! We wanted to fly, but the experience gained from previous startups kept pulling us back down to earth.
"You are dreaming again. Stop it! It is time to build a product that can make you money, preferably quickly! Enough thinking about changing the world, at least until you see the first AI-earned $1k in your pocket," my mind kept repeating to me all the time.
After negotiating with my inner critics and my partner, we decided to press the stop button. A well-known and proven business model was chosen for our startup. From that moment on, there were no more thoughts of integrating our technology into lenses and glasses. We shifted our focus to recognizing objects and selling ads.

Competitors as a Source of Inspiration

We were doing our own research on competitors, but our social media followers were a great help too, continually sending us information on companies that were somehow engaged in object recognition in images or videos.

The Cutting Board
Being restless minds, Konstantin and I were tormented by endless doubts. We had so many unanswered questions.
"Why all these competing companies are worth millions and hundreds of millions of dollars, but there are no unicorns among them? Why are their technologies so rare to come across on the Internet? Why? Why? Why?" I was asking myself so often that it drove me nuts. 
To proceed further with our idea, we had to find the answers. Using "visual search" and "advertising" as keywords, we found the companies that were more or less related to our idea. We gathered all of them on our "cutting board," took out a "knife," and began the messy examination process.
  • Analyzing the companies one after another, we noticed two major similarities: limited advertising inventory and limited ad displays. All those projects were tailored to a narrow range of products. More than that, some of them worked only within a single product category or a single brand. Also, to display an advertisement, an identical match of the product had to be available. This prerequisite seemed too strict to us.
  • Many projects had limitations due to the business model they used. Either they were B2B (business-to-business) and very complex or B2C (business-to-consumer) and, in our opinion, not viable. For example, some projects required interaction with product placement agencies to show ads in a video that was streamed on Smart TV.
  • All projects had the same "chicken and egg" problem in common: there was too little data about the products on sale at any given moment that could be displayed in relevant visual content, even though advertisers were ready to pay to advertise those products.
Our "cutting board" helped us to identify three main problems that we needed to solve more efficiently than our competitors if we wanted to succeed:
  • The ads of products/services had to be hyper-relevant to the detected objects of the visual content regardless of the date when the content was first released.
  • We had to place our ads straight into detected objects and show them in such a way that ad placement decisions would not depend strictly on similarity algorithms. At the same time, we didn't want users to question our ad placements or ad choices.
  • We had to supply the machine with a broad and relevant advertising inventory, replenished regularly, so that it could apply various display advertising algorithms.

Eureka!

We had to make the machine "think" exactly like a person browsing. For example:
  • "I want this red bag. It goes perfect with my shoes!"
  • "What a cute shirt this guy is wearing, I know that a men's watch would match it perfectly!"
  • "Great hotel, I should book it for my next vacation!" 
As we know, the human brain's thought process is hard for a computer to duplicate, not least because a machine has no emotions. But we decided to give it a try and train the machine in associative decision-making.

Teaching Associative Thinking to a Machine

Our first try at formulating all the requirements and linking relationships between detectable objects, entities, people, and other characteristics ended up looking like some unknown monster that demanded we analyze all possible and impossible data.
Through multiple tests and plenty of brain pain, we collected all the data we possessed that could be analyzed. The following picture cleared up:
We systematized the entire "zoo" of data into an influence map that described how every entity influenced every other entity in a particular scene. Let me show you how it worked within a frame. For example, once "Environment" was identified as "Kitchen," the influence of objects from advertising categories such as "Kitchen Furniture," "Kitchen Appliance," "Spoons," "Kitchen Care Products," "Cups," "Tableware," and others, was increased.
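To make that concrete, here is a minimal sketch of how such an influence map might be applied. The entity names, boost factors, and the boost_categories helper are illustrative assumptions for this post, not LensAI's actual parameters.

```python
# A minimal, hypothetical sketch of an influence map: detecting one
# entity ("Kitchen") raises the weight of related advertising categories.
# All names and weights here are illustrative, not LensAI's real values.

INFLUENCE_MAP = {
    "Kitchen": {
        "Kitchen Furniture": 1.5,
        "Kitchen Appliance": 1.4,
        "Tableware": 1.3,
        "Cups": 1.2,
    },
    "Office": {
        "Desks": 1.5,
        "Stationery": 1.3,
    },
}

def boost_categories(detected_entities, base_scores):
    """Multiply each category's base relevance by the boosts contributed
    by every entity detected in the scene."""
    scores = dict(base_scores)
    for entity in detected_entities:
        for category, factor in INFLUENCE_MAP.get(entity, {}).items():
            scores[category] = scores.get(category, 1.0) * factor
    return scores

# If "Kitchen" is detected, kitchen-related categories rise above the rest.
print(boost_categories(["Kitchen"], {"Tableware": 1.0, "Desks": 1.0}))
```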
As a result, we built an influence map, a version of which you can see below. (We distorted it on purpose, as it includes parameters of the algorithm that are of value to us.)
In this image, you can see which entities we operate with and how they influence each other as images and context are analyzed to display the most relevant advertisement.
The task was successfully defined: determine the relationships of objects and entities to advertising categories according to characterizing features.
Once that was accomplished, the next task was to establish a connection between two entities: a category and an object.
An object relationship is a way for an object to have an obvious connection with some Taxonomy category. In other words, each object relation can be described as the influence of the object on products from the Taxonomy. When we see an object in the image, we can assume it is a central one and look for products relevant to it according to some characterizing feature.
When working on this task, it was necessary to apply the following relationships (characterizing features), illustrated in the sketch after this list:
  • Taxonomy products are often located on the object,
  • Taxonomy products are often used for the object,
  • Taxonomy products cannot be used without the object,
  • Taxonomy products are used with the object.
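As a rough illustration, these four characterizing features could be encoded as weighted edge types between a detected object and Taxonomy categories. The enum, weights, and example edges below are assumptions made for the sketch, not the algorithm's real parameters.

```python
from enum import Enum

class Relation(Enum):
    """Hypothetical encoding of the four characterizing features; the
    weights are illustrative, not LensAI's real parameters."""
    LOCATED_ON = 1.0   # Taxonomy products are often located on the object
    USED_FOR = 0.8     # Taxonomy products are often used for the object
    REQUIRES = 1.2     # Taxonomy products cannot be used without the object
    USED_WITH = 0.6    # Taxonomy products are used with the object

# Edges from a detected object ("Table") to Taxonomy categories.
TABLE_EDGES = [
    ("Tableware", Relation.LOCATED_ON),
    ("Tablecloths", Relation.USED_WITH),
]

def relevance(edges):
    """Score each linked category by the weight of its relation type."""
    return {category: rel.value for category, rel in edges}

print(relevance(TABLE_EDGES))  # {'Tableware': 1.0, 'Tablecloths': 0.6}
```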
The Tool
We needed two elements to work with large volumes of data and the various types of relationships between entities:
  • people with hands and heads,
  • a tool that would help these people to transfer their knowledge to a machine, in our case, the particular knowledge to identify relationships between objects. 
Guess which one was much harder to find? (Spoiler Alert! It was the first one.)
We searched all over the Internet but could not find a tool that satisfied our requirements, so we had no choice but to build our own. It combined the ability to work with massive amounts of data with support for various types of connections, where each possible connection carried its own weight of influence. All these connections also had to be established by different people who could catch machine-made errors and correct them. As a result, we got a two-panel data manager with a broad set of functions and the ability to work with both relational and graph databases.
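For a sense of what the relational side of such a tool might store, here is a toy sketch of weighted, typed connections with per-annotator attribution. The schema, table names, and values are hypothetical, not LensAI's actual data model.

```python
import sqlite3

# A toy version of the data model behind such a tool: weighted, typed
# edges between entities, stored relationally. Table and column names
# are assumptions for this sketch, not LensAI's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE edges (
        source   TEXT NOT NULL,   -- detected object or entity
        target   TEXT NOT NULL,   -- Taxonomy category
        relation TEXT NOT NULL,   -- characterizing feature
        weight   REAL NOT NULL,   -- influence of this connection
        author   TEXT NOT NULL    -- annotator, so errors can be traced
    )
""")
conn.execute(
    "INSERT INTO edges VALUES (?, ?, ?, ?, ?)",
    ("Kitchen", "Kitchen Appliance", "located_on", 1.4, "annotator_1"),
)
for row in conn.execute("SELECT * FROM edges WHERE source = 'Kitchen'"):
    print(row)
```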
Handwork: Hell Exists
This stage of product development was the one that almost burnt us out. Neither Konstantin nor I experienced the euphoria we had at the beginning of the project. It seemed to us that processing and linking such a volume of data was beyond the bounds of possibility. Our mission to determine the relationships between the objects was about to fail.
To feel our pain, imagine a few of the processes we had to deal with, such as linking detectable objects to advertising categories, collecting all possible variations of detectable objects, connecting both industries and context themes with advertising categories...
We didn't count all these processes, but there were hundreds of thousands of them. It felt like HELL. We knew it was time to delegate, so two data science groups of five people each were created, both with the sole purpose of linking all entities together. Here is the picture we got.
You can see the relationships of various entities in the images above. The last one shows a "scene," aka the influence of multiple entities within a single image.
We achieved a scenario where the objects detected in the frame always compete for the right to be displayed, and the algorithm's job is to determine which ad to show in each frame. For example, the final decision on which of three advertisements to show (a teapot, a coffee maker, or a furniture store brand) rests entirely on the algorithm's shoulders.
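A stripped-down version of that competition might look like the sketch below, where each candidate ad's score combines its object's detection confidence with its influence-boosted category weight. The scoring formula and the numbers are illustrative assumptions, not the production algorithm.

```python
def pick_ad(candidates):
    """Each detected object proposes an ad; the highest combined score
    wins the frame. The scoring formula here is an illustrative guess."""
    def score(c):
        return c["detection_confidence"] * c["category_weight"]
    return max(candidates, key=score)

# Three ads competing for one frame, as in the example above.
frame_candidates = [
    {"ad": "teapot", "detection_confidence": 0.92, "category_weight": 1.3},
    {"ad": "coffee maker", "detection_confidence": 0.88, "category_weight": 1.4},
    {"ad": "furniture store brand", "detection_confidence": 0.95, "category_weight": 1.1},
]
print(pick_ad(frame_candidates)["ad"])  # -> 'coffee maker'
```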
Nowadays, you cannot surprise anyone with object recognition software, but you can with high recognition accuracy.
The next step was to identify how accurately our models worked and give Fs to the machines when they were wrong. In parallel, we also manually tested some of the relationships we had established between entities.
As a result, we got a whole bunch of classification errors made by the machine algorithms, found across thousands of images and movie video frames. To correct them, both of our data science groups had to mark the errors manually and then classify them one by one.
There were nearly thirty different classes of errors. I will list some of the largest ones below; a sketch of how such errors might be recorded follows the list. Please know that each of these error classes includes up to ten subclasses.
Here is a list of errors:
  • Wrong Class
  • Wrong Product
  • Wrong Taxonomy
  • For Classes Without Coordinates: Wrong Surrounding
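To illustrate, a manual labeling record for such errors might look like this. The structure and subclass names are hypothetical; only the top-level class names come from the list above.

```python
from dataclasses import dataclass

# Hypothetical record used when manually marking a classification error;
# the structure is an assumption for illustration, not LensAI's format.
@dataclass
class LabeledError:
    frame_id: str
    error_class: str   # e.g. "Wrong Class", "Wrong Taxonomy"
    subclass: str      # each class has up to ten subclasses
    annotator: str

errors = [
    LabeledError("movie_042_frame_133", "Wrong Class",
                 "confused-with-similar-object", "annotator_3"),
    LabeledError("movie_042_frame_134", "Wrong Taxonomy",
                 "category-too-broad", "annotator_7"),
]
# Count occurrences of one error class across the labeled set.
print(sum(e.error_class == "Wrong Class" for e in errors))
```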
We continue to work and are getting closer to our goal. The next step is to release a public demo so that you can evaluate our work or point out any shortcomings that need fixing.
Continue to follow LensAI and join us on PH.
We promise to delight you with quality content, updates on the progress of our project, and various entertaining stories about machine learning.
LensAI is an Affiliate Marketing Platform that automatically delivers the right ads in the right place at the right time! 🎯
We are building an ad format that turns visual content into a seamless shopping experience — no more gap between browsing and sale.
And yes, we know how to catch a fleeting glance of a customer! 😉
LensAI is about to bring some changes to digital advertising.
Be the first one to witness them! 👀 -> https://www.producthunt.com/upcoming/lensai

Written by lensai | AI-powered contextual computer vision ad platform that monetizes any visual content
Published by HackerNoon on 2020/05/10