

The Not Hotdog app from HBO's Silicon Valley has become one of the most iconic joke apps in tech. Like most things in the show, there is a grain of truth to the fictitious app that makes it actually plausible. With Not Hotdog, HBO went a step further: they actually built the thing. If you ever need help figuring out whether something is a hotdog, head over to the App Store.
Tim Anglade, a software developer for the show tasked with building the app, was kind enough to share his experience in a fantastic blog post. In just a few hours, he got a quick prototype working using Google's Vision API, but for a variety of reasons he wanted everything to run on device. Basically, it had to work in airplane mode. This seemingly simple requirement led him on a circuitous, months-long journey that included help from the creators of TensorFlow.
It's been 6 months since Tim's post. This afternoon, using Apple's new open source tool Turi Create, I was able to build a simple Not Hotdog in about 3 hours. I often hear about the breakneck pace of machine learning, but today I felt it. If you're a developer who has been waiting for the right time to join in, the bar has never been lower.
In the rest of this post, I'll show you how to build Not Hotdog in a few hours, writing fewer than 100 lines of code. All you need is Turi Create and your laptop.
Turi Create is a high-level library for creating custom machine learning models. If you're vaguely familiar with machine learning frameworks, Turi Create feels an order of magnitude simpler than Keras, which feels an order of magnitude simpler than raw TensorFlow.
Under the hood of Turi Create is Apache's MXNet, though you really can't access any of it. Apple gives you the option of using a few pre-trained models and architectures, but nearly all of the parameters are inaccessible.
Finally, Apple has implemented some barebones data structures mirroring much of Pandas DataFrame functionality. I'm not sure why they decided to re-implement these from scratch, but they made good choices of method names and it feels quite natural if you've used Pandas. One nice addition is an interactive GUI for visualizing data, activated by calling the explore() method. Shown below is the interactive viewer I used to inspect images for this project. If you are doing a task like object detection, the visualization tool will even draw annotations like bounding boxes when available.
As with many machine learning tools, Python is the language of choice. Apple has easy-to-follow instructions in the project's readme. I'll create a clean virtualenv and install with pip install turicreate. I've never used MXNet, but it appears that all the dependencies are installed just fine. I haven't tried installing with GPU support, but that seems pretty easy, too.
Before I can start training a model I need data. I considered writing a scraper to crawl a few thousand results from Google Images, but then I found kmather37's advice about using ImageNet to explore and download images in specific categories.
Simply type in a search term, click on the appropriate Synset category, then in the Download tab click Download URLs of images in Synset. I'll save the list of URLs in a text file for later.
I need two categories of images: "hotdogs" and... well... "not hotdogs".
For hotdogs, I'm using the hotdog and frankfurter bun categories, yielding a total of 2,311 images. For not hotdogs, I'm using the categories plants, pets, buildings, and pizza, yielding a total of 8,035 images. The image URLs are saved, one per line, in two text files.
Next, I need to loop through the URLs and download the actual images. There are a couple of edge cases and failure modes to work around, but I'll spare you the details. You can have a look at the function I ended up writing in this gist. It's important here to download images from each category into its own folder so that Turi Create can make labels for each. The two folders I'm using are hotdog and nothotdog.
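The author's actual function lives in the linked gist and is not reproduced here. As a rough idea of what such a downloader might look like, here is a minimal sketch using only the Python standard library. The names `image_extension` and `download_images`, the 10-second timeout, and the numbered file names are all assumptions made for illustration.

```python
import http.client
import os
import urllib.request

# Leading bytes of JPEG and PNG files. Dead links often return HTML error
# pages instead of images, so anything else is rejected.
SIGNATURES = {
    b"\xff\xd8\xff": ".jpg",
    b"\x89PNG\r\n\x1a\n": ".png",
}


def image_extension(payload):
    """Return '.jpg' or '.png' if the payload looks like that format, else None."""
    for signature, extension in SIGNATURES.items():
        if payload.startswith(signature):
            return extension
    return None


def download_images(url_file, out_dir, timeout=10):
    """Download every URL listed in url_file (one per line) into out_dir.

    Returns the number of files saved. Failed or non-image downloads are
    skipped instead of aborting the whole run.
    """
    os.makedirs(out_dir, exist_ok=True)
    with open(url_file) as f:
        urls = [line.strip() for line in f if line.strip()]
    saved = 0
    for index, url in enumerate(urls):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                payload = response.read()
        except (OSError, ValueError, http.client.HTTPException):
            # OSError covers URLError and timeouts, ValueError covers
            # malformed URLs, HTTPException covers broken responses.
            continue
        extension = image_extension(payload)
        if extension is None:
            continue
        with open(os.path.join(out_dir, "%05d%s" % (index, extension)), "wb") as out:
            out.write(payload)
        saved += 1
    return saved
```

Calling it once per category, for example `download_images('hotdog_urls.txt', 'images/hotdog')` with a hypothetical URL file name, would give each category its own folder as described above.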
Any data scientist will tell you that data prep and cleaning are the most time-consuming part of each project. Each image recognition model generally takes a slightly different image format, size, and preprocessing. I had expected to write a bunch of tips and tricks for data prep. Nope. Apple has baked all of those steps into training. All I need to do is load the data into Turi Create.
import turicreate as tc

data = tc.image_analysis.load_images(
    'path/to/images',
    with_path=True
)
A few of the image files I downloaded were corrupt and Turi Create threw warnings, but it's nothing to be concerned about. The argument with_path=True adds a path column to the resulting data frame with the absolute path of each image. I can now use a clever trick to create labels for each one of the training images:
data['label'] = data['path'].apply(
    lambda path: 'hotdog' if '/hotdog/' in path else 'nothotdog'
)
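The trick depends on the slashes: the substring '/hotdog/' never appears in a path through the nothotdog folder, because there the folder name follows "not" rather than a slash. A slightly more explicit variant, sketched below in plain Python, reads the label straight from the parent folder name. The helper name `label_from_path` and the sample paths are made up for illustration, and the sketch assumes every image sits directly inside its category folder.

```python
import os


def label_from_path(path):
    """Use the name of the image's parent folder as its label."""
    return os.path.basename(os.path.dirname(path))


# Hypothetical paths, shaped like the ones with_path=True would produce.
paths = [
    "/data/images/hotdog/00001.jpg",
    "/data/images/nothotdog/00001.jpg",
]
for p in paths:
    substring_label = "hotdog" if "/hotdog/" in p else "nothotdog"
    # Both approaches agree on each path.
    print(p, substring_label, label_from_path(p))
```

A named function like this can be passed to apply in place of the lambda, and it would keep working if more category folders were added later.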
That's it. Training data cleaned and loaded. Just for fun, I decided to test out the groupby method on the data frame:
data.groupby('label', [tc.aggregate.COUNT])
Result:
label      Count
hotdog     1586
nothotdog  5651
I lost about 30% of the images I started with (7,237 loaded out of 10,346 URLs). This training set could easily be augmented by creating variants with random noise, blur, and transformations, but let's keep going for now.
The first thing I want to do is split the data into a training set and a testing set for validation. Turi Create has a nice function for this:
train_data, test_data = data.random_split(0.8)
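For anyone curious what a split like this does, here is a rough plain-Python sketch of a row-wise random split. It only illustrates the idea and is not Turi Create's implementation; the function name `random_split_rows` is hypothetical. Each row goes to the first set with probability `fraction`, so the 80/20 proportions are approximate, and fixing the seed makes the split repeatable (random_split also accepts a seed argument for that purpose).

```python
import random


def random_split_rows(rows, fraction, seed=None):
    """Split rows into two lists; each row lands in the first with probability `fraction`."""
    rng = random.Random(seed)
    first, second = [], []
    for row in rows:
        (first if rng.random() < fraction else second).append(row)
    return first, second


rows = list(range(1000))
train_rows, test_rows = random_split_rows(rows, 0.8, seed=42)
print(len(train_rows), len(test_rows))  # roughly 800 and 200
```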
Now it's time to train. Training happens by calling image_classifier.create(). This threw me off initially because most ML / deep learning frameworks will initialize a model with random weights and require separate steps to compile, train, and evaluate. With Turi Create, the create function does everything. The same function preprocesses the images, extracts features, trains the model, and evaluates it. It's really refreshing to work this way. There aren't nearly as many configuration options compared with Keras, TensorFlow, or Caffe, but it's so much more accessible.
By default, Turi Create will use ResNet-50 for its image classifiers. While ResNet provides great performance for its size, it's a little over 100 MB, which is heavy for a mobile app. I'll switch to SqueezeNet, which is only 5 MB but sacrifices a bit of accuracy. After some experimenting, here is the create function I ended up with:
model = tc.image_classifier.create(
    train_data,
    target='label',
    model='squeezenet_v1.1',
    max_iterations=50
)
This took about 5 minutes to run on my 2015 15-inch MacBook Pro. I was really surprised at how fast training was. My guess is that Apple is using transfer learning here, but I'll need to look into the source to confirm it. As I understand it, Turi Create starts with the pre-trained SqueezeNet model, analyzes your labels, and replaces the output layer with a new classifier that has the proper number of classes. It then fits that classifier to your data rather than training the whole network from scratch, which would explain the speed.
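To make the transfer-learning idea concrete, here is a toy sketch in plain Python of one common form of it: a "pre-trained" feature extractor that is kept frozen, plus a small logistic-regression head trained on its outputs. Everything here (the `extract_features` stand-in, the synthetic data, the learning rate) is an assumption for illustration. It is not Turi Create's code, and a real extractor would be a network such as SqueezeNet.

```python
import math
import random


def extract_features(pixels):
    """Stand-in for a frozen pre-trained network: raw input in, small feature vector out."""
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return [mean, spread]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, labels, epochs=200, lr=0.5):
    """Fit logistic-regression weights on top of the fixed features with SGD."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5 else 0


# Synthetic "images": class 1 is bright, class 0 is dark.
rng = random.Random(0)
examples = [([rng.uniform(0.6, 1.0) for _ in range(16)], 1) for _ in range(50)]
examples += [([rng.uniform(0.0, 0.4) for _ in range(16)], 0) for _ in range(50)]
rng.shuffle(examples)

# The extractor is only applied, never trained; only the head learns.
features = [extract_features(image) for image, _ in examples]
labels = [label for _, label in examples]
weights, bias = train_head(features, labels)

correct = sum(predict(weights, bias, x) == y for x, y in zip(features, labels))
accuracy = correct / len(labels)
print("training accuracy:", accuracy)
```

Because only a handful of head parameters are learned, training like this is cheap, which fits with a run that finishes in minutes on a laptop.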
One thing I'd like to see added is the ability to start with an arbitrary model. For example, I started by training for a default 10 iterations, then decided to bump it up to 25. I needed to start from scratch and redo the first 10. It would be nice to simply continue where I left off with the model from the first training.
I was going to attempt to run things with a GPU, but honestly this was so fast I decided I didn't need to.
Here is the output from training:
Earlier I created a 20% holdout group of images for testing. Turi Create models have a convenient model.evaluate() method to test the accuracy on full data sets. The results are pretty encouraging:
The last number is the one to care about: 96.3% accuracy on the test data. Not bad for an hour of downloading images and 5 minutes of training!
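Accuracy is simply the share of test images whose predicted label matches the true label. Because the data set is imbalanced (1,586 hotdogs versus 5,651 not-hotdogs before the split), it helps to compare that figure with what you would get by always guessing the majority class. The sketch below shows both calculations in plain Python. The helper names and the tiny example lists are made up, and the baseline assumes the test split keeps roughly the same class proportions as the full set.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def majority_baseline(label_counts):
    """Accuracy obtained by always predicting the most common label."""
    return max(label_counts.values()) / sum(label_counts.values())


# Tiny made-up example, not the article's data.
truth = ["hotdog", "nothotdog", "nothotdog", "hotdog", "nothotdog"]
guess = ["hotdog", "nothotdog", "hotdog", "hotdog", "nothotdog"]
print(accuracy(truth, guess))  # 0.8

# Label counts reported by the groupby call earlier in the post.
counts = {"hotdog": 1586, "nothotdog": 5651}
print(round(majority_baseline(counts), 3))  # 0.781
```

Against a baseline of roughly 78%, a score of 96.3% means the model is doing far more than guessing "not hotdog" every time.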
Turi Create can save two model formats: .model and .mlmodel. The former is a binary format readable by Turi Create (I suspect this might just be an MXNet file) and the latter is a Core ML file that can be dropped into your Xcode project. It couldn't be simpler:
model.export_coreml('HotdogNotHotdog.mlmodel')
The final .mlmodel file weighs in at a svelte 4.7 MB.
It's time to fire up Xcode and put my model into an actual iOS app. I'm not a Swift developer so this part scares me the most. I'm able to get things working by changing a single line in Apple's image classification example app (download it here).
Unzip the example project and copy the exported HotdogNotHotdog.mlmodel file into the Model/ folder, then add it to the Compile Sources list of the project. There might be a way to get Xcode to do this automatically, but I always end up doing it by hand.
Now I need to swap my model out for the original. Change line 30 of the ImageClassificationViewController.swift file to:
let model = try VNCoreMLModel(for: HotdogNotHotdog().model)
That's it! Time to build and test it out.
Apple is changing... For decades, they have been laser-focused on high-end hardware differentiated by tightly controlled software. They've eschewed cloud services almost entirely and their best offering to developers is the monolithic 6 GB Xcode, which never seems to be up-to-date on my machine. The rise of machine learning, and now deep learning, seems to be driving change.
Apple's first and only blog is a Machine Learning Journal. Their last two open source projects have been coremltools and now Turi Create. Both make it easier for developers to get models into applications. These tools are still a bit rough around the edges, as Apple hasn't had a lot of practice here, but I am shocked at how easy it was to complete this project.
If you're a mobile developer who wants to give your users magical experiences that leverage deep learning, it's never been easier. You no longer need a PhD in Artificial Intelligence or a math textbook to get started. I was able to build a "Not Hotdog" clone in under 100 lines of code in an afternoon. Kudos to all of the people working on these tools at Apple.
If you've got an awesome project that uses machine learning or AI on the edge, comment below or send us an email at heartbeat@fritz.ai.