The Not Hotdog app from HBO’s Silicon Valley has become one of the most iconic joke apps in tech. Like most things in the show, there is a grain of truth to the fictitious app that makes it actually plausible. With Not Hotdog, HBO went a step further: they actually built the thing. If you ever need help figuring out whether something is a hotdog, head over to the App Store. Tim Anglade, a software developer for the show tasked with building the app, was kind enough to share his experience in a fantastic blog post. In just a few hours, he got a quick prototype working using Google’s Vision API, but for a variety of reasons he wanted everything to run on device. Basically, it had to work in airplane mode. This seemingly simple requirement led him on a circuitous, months-long journey that included help from the creators of TensorFlow.

It’s been 6 months since Tim’s post. This afternoon, using Apple’s new open source tool Turi Create, I was able to build a simple Not Hotdog in about 3 hours. I often hear about the breakneck pace of machine learning, but today I felt it. If you’re a developer who has been waiting for the right time to join in, the bar has never been lower.

In the rest of this post, I’ll show you how to build Not Hotdog in a few hours, writing fewer than 100 lines of code. All you need is Turi Create and your laptop.

What is Turi Create?

Turi Create is a high-level library for creating custom machine learning models. If you’re vaguely familiar with machine learning frameworks, Turi Create feels an order of magnitude simpler than Keras, which feels an order of magnitude simpler than raw TensorFlow.

Under the hood of Turi Create is Apache’s MXNet, though you really can’t access any of it. Apple gives you the option of using a few pre-trained models and architectures, but nearly all of the parameters are inaccessible. Finally, Apple has implemented some barebones data structures mirroring much of Pandas’ DataFrame functionality. I’m not sure why they decided to re-implement these from scratch, but they made good choices of method names and it feels quite natural if you’ve used Pandas. One nice addition is an interactive GUI for visualizing data, activated by calling the explore() method. Shown below is the interactive viewer I used to inspect images for this project. If you are doing a task like object detection, the visualization tool will even draw annotations like bounding boxes when available.

[Image: Turi Create comes with some nice interactive GUIs to explore data.]

Installation

As with many machine learning tools, Python is the language of choice. Apple has easy-to-follow instructions in the project’s readme. I’ll create a clean virtualenv and install with pip install turicreate. I’ve never used MXNet, but it appears that all the dependencies install just fine. I haven’t tried installing with GPU support, but that seems pretty easy, too.

Data Collection

Before I can start training a model, I need data. I considered writing a scraper to crawl a few thousand results from Google Images, but then I found kmather37’s advice about using ImageNet to explore and download images in specific categories. Simply type in a search term, click on the appropriate synset category, then in the Download tab click “Download URLs of images in Synset.” I’ll save the list of URLs in a text file for later.

[Image: ImageNet Explorer results for HotDogs.]

I need two categories of images: “hotdogs” and…well…“not hotdogs”. For hotdogs, I’m using the hotdog and frankfurter bun categories, yielding a total of 2,311 images. For not hotdogs, I’m using the plants, pets, buildings, and pizza categories, yielding a total of 8,035 images. The image URLs are saved, one per line, in two text files.

Next, I need to loop through the URLs and download the actual images. There are a couple of edge cases and failure modes to work around, but I’ll spare you the details. You can have a look at the function I ended up writing in this gist. It’s important here to download images from each category into its own folder so that Turi Create can make labels for each. The two folders I’m using are hotdog and nothotdog.
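To give a concrete picture of that step, here is a minimal sketch of what such a downloader might look like. The filenames hotdog_urls.txt and nothotdog_urls.txt and the images/ output folders are placeholder names I’m assuming here; the function in the gist handles more failure modes than this:

    # Minimal downloader sketch: read URLs (one per line) and save each image
    # into a per-category folder so Turi Create can label images by path.
    import os
    import urllib.request

    def download_images(url_file, out_dir):
        os.makedirs(out_dir, exist_ok=True)
        with open(url_file) as f:
            urls = [line.strip() for line in f if line.strip()]
        for i, url in enumerate(urls):
            try:
                urllib.request.urlretrieve(url, os.path.join(out_dir, '%d.jpg' % i))
            except Exception:
                # Many ImageNet URLs are dead or return junk; just skip failures.
                continue

    download_images('hotdog_urls.txt', 'images/hotdog')
    download_images('nothotdog_urls.txt', 'images/nothotdog')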
Data Preprocessing

Any data scientist will tell you that data prep and cleaning is the most time-consuming part of any project. Each image recognition model generally takes a slightly different image format, size, and preprocessing. I had expected to write a bunch of tips and tricks for data prep. Nope. Apple has baked all of those steps into training. All I need to do is load the data into Turi Create.

    import turicreate as tc

    data = tc.image_analysis.load_images('path/to/images', with_path=True)

A few of the image files I downloaded were corrupt and Turi Create threw warnings, but it’s nothing to be concerned about. The with_path=True argument adds a path column to the resulting data frame with the absolute path of each image. I can now use a clever trick to create labels for each one of the training images:

    data['label'] = data['path'].apply(lambda path: 'hotdog' if '/hotdog/' in path else 'nothotdog')

That’s it. Training data cleaned and loaded. Just for fun, I decided to test out the groupby method on the data frame:

    data.groupby('label', [tc.aggregate.COUNT])

Result:

    label      Count
    hotdog      1586
    nothotdog   5651

I lost about 30% of the images I started with. This training set could easily be augmented by creating variants with random noise, blur, and transformations, but let’s keep going for now.

Training

The first thing I want to do is split the data into a training set and a testing set for validation. Turi Create has a nice function for this:

    train_data, test_data = data.random_split(0.8)

Now it’s time to train. Training happens by calling tc.image_classifier.create(). This threw me off initially because most ML / deep learning frameworks will initialize a model with random weights and require separate steps to compile, train, and evaluate. With Turi Create, the create function does everything: the same call preprocesses the images, extracts features, trains the model, and evaluates it. It’s really refreshing to work this way. There aren’t nearly as many configuration options compared with Keras, TensorFlow, or Caffe, but it’s so much more accessible.

By default, Turi Create will use ResNet-50 for its image classifiers. While ResNet provides great performance for its size, it’s a little over 100 MB, which is heavy for a mobile app. I’ll switch to SqueezeNet, which is only 5 MB but sacrifices a bit of accuracy. After some experimenting, here is the create function I ended up with:

    model = tc.image_classifier.create(train_data, target='label', model='squeezenet_v1.1', max_iterations=50)

This took about 5 minutes to run on my 2015 15-inch MacBook Pro. I was really surprised at how fast training was. My guess is that Apple is using transfer learning here, but I‘ll need to look into the source to confirm it. Turi Create starts with the pre-trained SqueezeNet model. It analyzes your labels and recreates the output layers to have the proper number of classes. Then it tunes the original weights to better fit your data.

One thing I’d like to see added is the ability to start with an arbitrary model. For example, I started by training for a default 10 iterations, then decided to bump it up to 25. I needed to start from scratch and redo the first 10. It would be nice to simply continue where I left off with the model from the first training.

I was going to attempt to run things with a GPU, but honestly this was so fast that I decided I didn’t need to. Here is the output from training:

[Image: Output from model training.]
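Before moving on to a full evaluation, it’s easy to spot-check the freshly trained classifier on a few held-out images. This is just a quick sanity-check sketch, reusing the model and test_data objects from the steps above:

    # Spot check: predict labels for a handful of held-out images.
    sample = test_data.head(5)

    # One predicted label per image.
    print(model.predict(sample))

    # Top-2 classes with probabilities for each image.
    print(model.predict_topk(sample, k=2))

If those predictions look reasonable, the full evaluation below should hold no surprises.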
Testing

Earlier I created a 20% holdout group of images for testing. Turi Create models have a convenient model.evaluate() method to test the accuracy on full data sets. The results are pretty encouraging: the last number in the evaluation output is the one to care about, and it shows 96.3% accuracy on the test data. Not bad for an hour of downloading images and 5 minutes of training!

Exporting

Turi Create can save two model formats: .model and .mlmodel. The former is a binary format readable by Turi Create (I suspect this might just be an MXNet file) and the latter is a Core ML file that can be dropped into your Xcode project. It couldn’t be simpler:

    model.export_coreml('HotdogNotHotdog.mlmodel')

The final .mlmodel file weighs in at a svelte 4.7 MB.

Creating the Not Hotdog App

It’s time to fire up Xcode and put my model into an actual iOS app. I’m not a Swift developer, so this part scares me the most. I’m able to get things working by changing a single line in Apple’s image classification example app (download it here).

Unzip the example project and copy the exported HotdogNotHotdog.mlmodel file into the Model/ folder, then add it to the Compile Sources list of the project. There might be a way to get Xcode to do this automatically, but I always end up doing it by hand.

[Image: Don’t forget to add your model to the Compile Sources list of your Xcode project, or else it won’t be able to compile your model during the Build phase.]

Now I need to swap my model in for the original. Change line 30 of the ImageClassificationViewController.swift file to:

    let model = try VNCoreMLModel(for: HotdogNotHotdog().model)

That’s it! Time to build and test it out.

[Image: “Not Hotdog” working on my iPhone X.]

Final thoughts

Apple is changing… For decades, they have been laser-focused on high-end hardware differentiated by tightly controlled software. They’ve eschewed cloud services almost entirely, and their best offering to developers is the monolithic 6 GB Xcode, which never seems to be up to date on my machine.

The rise of machine learning, and now deep learning, seems to be driving change. Apple’s first and only blog is a Machine Learning Journal. Their last two open source projects have been coremltools and now Turi Create. Both make it easier for developers to get models into applications. These tools are still a bit rough around the edges, as Apple hasn’t had a lot of practice here, but I am shocked at how easy it was to complete this project.

If you’re a mobile developer who wants to give your users magical user experiences that leverage deep learning, it’s never been easier. You no longer need a PhD in Artificial Intelligence or a math textbook to get started. I was able to build a “Not Hotdog” clone in under 100 lines of code in an afternoon. Kudos to all of the people working on these tools at Apple.

If you’ve got an awesome project that uses machine learning or AI on the edge, comment below or send us an email at heartbeat@fritz.ai