How I made a food-recognizing bot for Telegram and why it doesn’t have a future (yet)

Meet @ItadakimasuBot. He’s a genuinely nice guy who was created to help you maintain your diet. He keeps track of everything you eat and gives you statistics about it.

But he’s special: he can recognize food from your pictures. You don’t need to write down all the nutrition info by hand, because he does it for you.

Note 1: This project was made mainly for learning purposes, so it still needs a lot of work.

Note 2: In the first part of the article I explain what makes the bot work. If you’re not interested in the technical parts, skip to Part 2, where I give my view on the current situation with bots.

Part 1. How @ItadakimasuBot works

@ItadakimasuBot is a bot for Telegram (for now), written in pure JavaScript. It uses Node.js for both the bot logic and the generation of the statistics page.

To recognize food I used the Clarifai API. It’s far from perfect (it doesn’t recognize branded food, for example: both Snickers and Bounty are just tagged as ‘Chocolate Bar’), but it works well enough for now, so I decided against building my own system.

To get nutrition info, I found two possible choices: FatSecret and Nutritionix. Both of them support full USDA information, but FatSecret works with custom portions.

I decided to stick with Nutritionix because I think its information comes in a slightly more user-friendly form.

Since bots don’t have a built-in session, I decided to use Redis for intermediate information, such as the current bot state.
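A minimal sketch of how per-chat state could be kept this way, not the bot's actual code. It assumes any client with promise-returning `get(key)` and `set(key, value)` methods; the key format, the `step` names and the `FakeClient` stand-in are made up for illustration.

```javascript
// Hypothetical key format: one key per Telegram chat.
const stateKey = (chatId) => `bot:state:${chatId}`;

// In-memory stand-in for a Redis client so the sketch runs without a server.
// Only get/set are assumed; a real client would be passed in instead.
class FakeClient {
  constructor() { this.store = new Map(); }
  async get(key) { return this.store.has(key) ? this.store.get(key) : null; }
  async set(key, value) { this.store.set(key, value); return 'OK'; }
}

// Save the current conversation step (for example "awaiting_calories") as JSON.
async function saveState(client, chatId, state) {
  await client.set(stateKey(chatId), JSON.stringify(state));
}

// Load the state, falling back to an idle state for chats we have not seen.
async function loadState(client, chatId) {
  const raw = await client.get(stateKey(chatId));
  return raw === null ? { step: 'idle' } : JSON.parse(raw);
}
```

Keeping the state as one small JSON value per chat means every incoming update can be handled with a single read and, at most, a single write.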

To keep long-term information, such as Food logs, Stats URLs and User information, I used Postgres.

Besides the bot, I also built a server that shows the statistics to the user. It’s built on Express.js and is really simple: it handles only two requests, one that returns the *.html page and one that returns the data.

I didn’t use any fancy frameworks such as React or Angular for the front-end because, honestly, using a big framework for a single page with 3 graphs would be ridiculous. So I just used jQuery (I didn’t want to, but the DatePicker of my choice depends on it), Chart.js for the graphs and Webpack for minification.
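To give an idea of how little code such a page needs, here is a rough sketch that turns food-log entries into a Chart.js bar-chart configuration. The entry shape (`date`, `calories`) and the function names are assumptions for illustration, not the bot's real data model.

```javascript
// Sum calories per day. The entry shape ({ date, calories }) is hypothetical.
function caloriesPerDay(entries) {
  const totals = new Map();
  for (const { date, calories } of entries) {
    totals.set(date, (totals.get(date) || 0) + calories);
  }
  // Sort by ISO date string so the chart runs left to right in time.
  return [...totals.entries()].sort(([a], [b]) => a.localeCompare(b));
}

// Build a Chart.js configuration object for a bar chart.
function buildChartConfig(entries) {
  const perDay = caloriesPerDay(entries);
  return {
    type: 'bar',
    data: {
      labels: perDay.map(([date]) => date),
      datasets: [{ label: 'Calories', data: perDay.map(([, total]) => total) }],
    },
  };
}
```

On the page, the resulting object would be passed to `new Chart(canvasContext, config)`.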

The bot reacts to the following commands:

/help - Displays a message with a list of commands

/norms - Displays the daily norms and lets you change them

/log food_name - Logs a food by name

/start - Shows the starting message, sets daily norms for the user and creates the User objects

/stats - Generates a URL and sends it to the user to see statistics

[photo] - Recognizes the food and logs it
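Routing these commands could look roughly like the sketch below. The `parseCommand` helper and its return shape are hypothetical; the only Telegram-specific detail assumed is that a command may carry an `@botname` suffix, as happens in group chats.

```javascript
// Commands listed above; anything else is reported as unknown.
const KNOWN_COMMANDS = new Set(['help', 'norms', 'log', 'start', 'stats']);

// Split a message like "/log apple pie" into a command name and its argument.
// Returns null when the text is not a command at all.
function parseCommand(text) {
  // Optional "@botname" suffix, then optional whitespace-separated argument.
  const match = /^\/([a-z]+)(?:@\w+)?(?:\s+([\s\S]*))?$/i.exec(text.trim());
  if (!match) return null;
  const name = match[1].toLowerCase();
  if (!KNOWN_COMMANDS.has(name)) return { name: 'unknown', arg: '' };
  return { name, arg: (match[2] || '').trim() };
}
```

For example, `parseCommand('/log apple pie')` gives the name `log` with the argument `apple pie`, while plain text such as `hello` gives `null` and can be handled according to the current bot state.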

To communicate with the user I decided to use InlineKeyboard, because ReplyKeyboard (a modified OS keyboard) doesn’t allow setting custom key values, which would require either additional use of the session or additional API calls.
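The difference is easiest to see in the markup itself. Below is a small sketch that builds the `reply_markup` object for an inline keyboard, where each button has a visible `text` and a separate `callback_data` value. The helper name and the `food:` value format are made up for illustration.

```javascript
// Build reply_markup for a list of choices, one button per row.
// `text` is what the user sees; `callback_data` is what the bot receives.
function buildInlineKeyboard(choices) {
  return {
    inline_keyboard: choices.map(({ label, value }) => {
      // Telegram limits callback_data to 64 bytes.
      if (Buffer.byteLength(value, 'utf8') > 64) {
        throw new RangeError(`callback_data too long: ${value}`);
      }
      return [{ text: label, callback_data: value }];
    }),
  };
}

// Hypothetical choices offered after a photo was recognized.
const markup = buildInlineKeyboard([
  { label: 'Apple', value: 'food:apple' },
  { label: 'Apple pie', value: 'food:apple_pie' },
]);
```

The resulting object would go into the `reply_markup` parameter of a `sendMessage` call.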

Daily norms input flowchart
Food logging flowchart

Part 2. Why bots are useless. For now.

Bots are fun to play with. You don’t need to install stand-alone applications, and they work and look the same on any platform.

Bots are easy and fun to develop. You don’t have to worry about GUI or platforms at all.

But there are several serious flaws.

The first problem is that every simple action, such as entering nutrition values, becomes really clumsy and requires additional validation. Do you ask the user to confirm their info at every step or only at the end? And what if the user accidentally enters wrong info in one of the fields: should they re-enter everything?
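One way to soften this is to validate each answer as it arrives and re-ask only the field that failed. A minimal sketch, assuming numeric fields and arbitrary bounds (the limits and error texts here are placeholders, not values from the bot):

```javascript
// Validate one free-text answer for a numeric field such as daily calories.
// The default min/max bounds are arbitrary placeholders.
function parseAmount(input, { min = 0, max = 10000 } = {}) {
  // Accept "65,5" as well as "65.5".
  const normalized = String(input).trim().replace(',', '.');
  if (!/^\d+(\.\d+)?$/.test(normalized)) {
    return { ok: false, error: 'Please send a number, for example 2000.' };
  }
  const value = Number(normalized);
  if (value < min || value > max) {
    return { ok: false, error: `Please send a value between ${min} and ${max}.` };
  }
  return { ok: true, value };
}
```

When `ok` is false, the bot can send the `error` text and stay in the same state, so the user only repeats the one field they got wrong.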

It could be solved by letting bots customize input: disabling the normal keyboard and providing custom input methods.

As of now, there are two different input methods: InlineKeyboard and ReplyKeyboard. Both have their pluses, but both also have their own problems.

InlineKeyboard doesn’t hide the normal input field, which requires additional validation in case the user types something that’s not on the list. This is partly solved by the fact that an InlineKeyboard answer sends a CallbackQuery instead of a normal Message event, but since older InlineKeyboards aren’t disabled, it doesn’t completely solve the problem.

ReplyKeyboard is used in place of the normal keyboard and hides the original from the user. Yet the original QWERTY keyboard can still be accessed. Another problem is that it’s currently impossible to send custom information with it: the data sent by a key is exactly the text on it, which would require additional use of either the session or the API.
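A possible guard against presses on old keyboards, sketched under assumptions: Telegram delivers a button press as a callback query whose `message` field identifies the message the keyboard was attached to, and the bot would remember the id of the latest keyboard it sent. The `activeMessageId` session field is hypothetical.

```javascript
// Decide whether a button press belongs to the keyboard the bot is
// currently waiting on. `session.activeMessageId` is a hypothetical field
// the bot would store whenever it sends a new keyboard.
function isCurrentKeyboard(session, callbackQuery) {
  const pressedOn = callbackQuery.message && callbackQuery.message.message_id;
  return pressedOn !== undefined && pressedOn === session.activeMessageId;
}
```

Presses that fail this check can simply be acknowledged and ignored, at the cost of one more validation on every button press.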

Even without entering the nutrition info, a minimum of two validations is required

The second problem is the lack of any visual control. Currently, it’s impossible to display graphs and tables inside the bot, so there are only two choices:

  1. Generate graphs as images on the backend. A really bad choice, because most bot users are on mobile phones with small screens. Plus, it removes most of what makes a graph informative: it’s impossible to see the exact values.
  2. Send a URL to a web page that has everything. It works, but then again, why do we need bots?
Imagine looking at this on your mobile phone screen

The bots of 2017 aren’t that different from the bots that we had in IRC or Jabber 20 years ago. They have a really bright future, but as of now it’s really hard to make them user-friendly for anything more complex than controlling the lights in your house.

Part 3. Future

As of now, @ItadakimasuBot isn’t really meant to be used in daily life. I made it in several weeks and, as a student (pls hire me), can’t really afford to keep working on it.

But if I could, I’d:

  1. Port it to Facebook Messenger.
  2. Make my own food-recognition system and nutrition DB.
  3. Add support for different languages.
  4. Add support for UK/US units (I might actually do this in the near future).

I made it open-source, so if anyone wants to code-review it (as an aspiring programmer, I would love to have someone more experienced review my code!) or do something with it, you’re welcome.



Thanks for your time!
