Shrink the model size and reduce the computational resources needed to do the inference calculations.

After that, stripping the model was as simple as running the following command:

```
bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
  --in_graph=tmp/tensorflow_inception_graph.pb \
  --out_graph=tmp/quantized_graph.pb \
  --inputs="input_1" \
  --outputs="output_node0" \
  --transforms='add_default_attributes
    strip_unused_nodes(type=float, shape="1,299,299,3")
    remove_nodes(op=Identity, op=CheckNumerics)
    fold_constants(ignore_errors=true)
    fold_batch_norms
    fold_old_batch_norms
    quantize_weights
    strip_unused_nodes
    sort_by_execution_order'
```

Now that we have the quantized graph, to run it on iOS we can just replace the one in https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/ios/camera/data. Do not forget to add the labels text file too.

The last change you need to make is to update the fields in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/ios/camera/CameraExampleViewController.mm#L38 to match what your model expects as input. In our case it was:

```cpp
const int wanted_input_width = 299;
const int wanted_input_height = 299;
const int wanted_input_channels = 3;
const float input_mean = 0.0f;
const float input_std = 255.0f;
const std::string input_layer_name = "input_1";
const std::string output_layer_name = "output_node0";
```

Running it on Android is very similar: add your quantized graph and labels to https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android/assets, and update the lines in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/ClassifierActivity.java#L61 to match what your input expects.

In a coming post I will cover how you can embed the quantized graph from the last step into your existing iOS and Android apps. Would you be interested in learning more about this? https://leanpub.com/ml-mobile
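As an aside, the `quantize_weights` transform used in the command above is what buys most of the size reduction: each float weight is stored as an 8-bit value plus a per-tensor offset and scale. Here is a minimal Python sketch of that idea — not TensorFlow's actual implementation, just an illustration with made-up function names:

```python
def quantize(weights):
    """Map a list of float weights onto 0..255 with a per-tensor offset/scale."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    quantized = [round((w - lo) / scale) for w in weights]
    return quantized, lo, scale

def dequantize(quantized, lo, scale):
    """Recover approximate float weights from the 8-bit representation."""
    return [lo + q * scale for q in quantized]

weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, lo, scale = quantize(weights)
restored = dequantize(q, lo, scale)
# every restored weight is within one quantization step (scale) of the original
```

Each weight now needs one byte instead of four, at the cost of a small, bounded rounding error — which is why the quantized graph is roughly a quarter of the original size with little accuracy loss.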
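One more note on the `input_mean` and `input_std` fields shown above: the example apps preprocess every raw channel byte `v` as `(v - input_mean) / input_std` before feeding it to the network, so with a mean of 0 and a std of 255 the 0–255 pixel range is scaled into 0.0–1.0. A small stand-alone sketch of that scaling (the function name is hypothetical):

```python
INPUT_MEAN = 0.0
INPUT_STD = 255.0

def preprocess_pixel(value, mean=INPUT_MEAN, std=INPUT_STD):
    """Scale a raw 0-255 channel value the way the camera examples do."""
    return (value - mean) / std

# raw RGB bytes for one pixel
pixel = (0, 128, 255)
scaled = tuple(preprocess_pixel(v) for v in pixel)
# scaled is (0.0, ~0.502, 1.0)
```

If your model was trained with a different normalization (for example, inputs in the range -1..1), set `input_mean` and `input_std` accordingly, or the predictions will be badly off.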