Deep Learning CNNs in Tensorflow with GPUs

In my last tutorial, you created a complex convolutional neural network from a pre-trained Inception v3 model. In this tutorial, you’ll learn the architecture of a convolutional neural network (CNN), how to create a CNN in Tensorflow, and how to predict labels for images. Finally, you’ll learn how to run the model on a GPU so you can spend your time creating better models, not waiting for them to converge.

Overview

- Introduction to CNNs
- Creating your first CNN and training on CPU
- Training on a GPU

Prerequisites

- Basic machine learning understanding
- Basic Tensorflow understanding
- AWS account (for GPU)

Convolutional Neural Networks

Convolutional neural networks are the current state-of-the-art architecture for image classification. They’re used in practice today in facial recognition, self-driving cars, and detecting whether an object is a hot dog.

Basic Architecture

A CNN architecture is built from three basic components: a convolution layer, a pooling layer, and a fully connected layer. These components work together to learn a dense feature representation of an input.

Convolution

A convolution consists of a kernel, also called a filter, that is applied in a sliding-window fashion to extract features from the input. After each operation, the filter is shifted across the input by an amount called the stride. At each position, the kernel is multiplied element-wise with the current region of the input and the products are summed. Filters can be stacked to create high-dimensional representations of the input.

What happens if the filter doesn’t evenly map to the size of the input?

There are two ways of handling a mismatch between filter size and input size, known as same padding and valid padding. Same padding pads the input border with zeros so that, at a stride of 1, the input width and height are preserved. Valid padding does not pad. Typically, you’ll want to use same padding, or you’ll rapidly reduce the dimensionality of your input.
Finally, an activation function (typically a ReLU) is applied to give the convolution non-linearity. ReLUs differ from other activation functions, such as sigmoid or tanh, in that they are one-sided: every negative input maps to zero. This one-sided property lets the network create sparse representations (zero values for hidden units), increasing computational efficiency.

Figure: ReLU

Pooling

Pooling is an operation that reduces dimensionality. It applies a function that summarizes neighboring information. Two common functions are max pooling and average pooling. By calculating the max of an input region, the output summarizes the intensity of the surrounding values.

Pooling layers also have a kernel and padding, and are moved in strides. To calculate the output size of a pooling operation, you can use the formula (input width - kernel width + 2 * padding) / strides + 1, rounding the division down when it is not exact.

Fully Connected Layer

You are likely familiar with fully connected layers from standard neural networks. Each neuron in the input is connected to each neuron in the output, hence fully connected. Because of this connectivity, every output neuron depends on every input neuron.

Figure: Fully connected neural network

In a CNN, the output of the pooling layer is fed into the fully connected layer. Depending on the task, a regression or classification algorithm can be applied to create the desired output.

Review

You’ve now learned what makes up a convolutional neural network. By passing the input through a convolution, you extract high-dimensional features. Pooling summarizes spatial information and reduces dimensionality. Lastly, this feature representation is passed through fully connected layers to a classifier or regressor.

Figure: Full CNN Architecture (source)

Creating a CNN in Tensorflow

Now that you have the idea behind a convolutional neural network, you’ll code one in Tensorflow. You’ll create a CNN and train it on the MNIST dataset (images of handwritten digits). After training, you’ll achieve ~98.0% accuracy at 10k iterations.
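Before writing any Tensorflow code, it can help to trace how a 28x28 MNIST image changes size as it moves through the network. The sketch below applies the output-size formula from the pooling section; the same arithmetic holds for a convolution. The layer list is a hypothetical configuration chosen for illustration and is not necessarily the one the tutorial's model uses.

```python
def output_size(input_size, kernel_size, stride, padding=0):
    """(input - kernel + 2 * padding) / stride + 1, rounded down."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


# Hypothetical layer stack for a 28x28 MNIST image (illustrative values only,
# not necessarily the ones used in the tutorial's model).
layers = [
    ("conv 5x5, stride 1, pad 2", 5, 1, 2),
    ("max pool 2x2, stride 2", 2, 2, 0),
    ("conv 5x5, stride 1, pad 2", 5, 1, 2),
    ("max pool 2x2, stride 2", 2, 2, 0),
]

size = 28
sizes = [size]
for name, kernel, stride, padding in layers:
    size = output_size(size, kernel, stride, padding)
    sizes.append(size)
    print(f"{name:28s} -> {size}x{size}")
```

In this assumed configuration the padded convolutions leave the width and height unchanged, and each pooling layer halves them, so the fully connected layer would receive 7x7 feature maps.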
Setup Environment

First, you’ll need to set up your environment. Additionally, you’ll create a setup.py file. Anaconda environment files for Python 3.5 and Python 2.7 are listed below.

https://gist.github.com/ColeMurray/35fce8ffc78aeaf8d4f16d55e947bb69?embedable=true#file-environment35-yml
https://gist.github.com/ColeMurray/9ede676dda786148572e1361693ec707?embedable=true#file-environment27-yml

If you do not use Anaconda, you can install Tensorflow via pip:

$ pip install tensorflow

https://gist.github.com/ColeMurray/0236c0cf2eca7aecffd5830052f2926e?embedable=true#file-setup-py

Run:

$ python3 setup.py develop

The Data

Here, you’ll create three separate inputs: a training set, a validation set, and a test set. A validation set lets you tune hyperparameters against data the model was not trained on, without touching the test set.

Download the Data

The data can be retrieved with these commands:

$ curl https://pjreddie.com/media/files/mnist_train.csv -o data/mnist_train.csv # 104 MB
$ curl https://pjreddie.com/media/files/mnist_test.csv -o data/mnist_test.csv # 17.4 MB

https://gist.github.com/ColeMurray/bf25aad332b3074c2776f4b0112f3947?embedable=true#file-mnist-py

Architecture

Here, you’ll create a few helper functions for creating the network. These functions are used to create the individual components discussed earlier.

Helper Functions / Model definition:

https://gist.github.com/ColeMurray/0876c8a2b8104888ad567f332c9a1f7d?embedable=true#file-model-py

Model

Here’s the code for training the model. The three public functions are explained below:

https://gist.github.com/ColeMurray/7018d60f18054d34fdbb77abb1b79822?embedable=true#file-model-py

Code is available here: https://github.com/ColeMurray/tensorflow-cnn-tutorial/tree/add_model_functions

Inference. This function is responsible for predicting what the input represents. Here, it returns a 1x10 tensor for each input.
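To make the 1x10 output concrete, the sketch below takes one such tensor of raw scores, normalizes it with softmax, and scores it against a one-hot label with cross entropy, which is the job of the loss function explained further down. It is plain Python rather than Tensorflow, and the function names and sample scores are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Normalize raw scores into probabilities that sum to one."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot) if t > 0)


# Hypothetical 1x10 output for one image: the highest raw score is at index 3.
logits = [0.1, 0.2, 0.1, 4.0, 0.3, 0.1, 0.2, 0.1, 0.5, 0.1]
label = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]  # one-hot encoding of the digit 3

probs = softmax(logits)
loss_correct = cross_entropy(probs, label)

wrong_label = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]  # pretend the true digit was 7
loss_wrong = cross_entropy(probs, wrong_label)
```

The loss is small when the highest score lines up with the true label and grows when it does not, which is the signal the optimizer uses to adjust the weights.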
The values in this tensor are passed to the loss function to determine how far off the prediction is from the ground truth. As indicated by the batch_size hyperparameter, you are processing 128 images at a time. This technique is known as mini-batch training. By processing inputs in smaller batches, as opposed to the entire dataset at once, the input can fit in memory. Additionally, the model converges more rapidly because the weights are updated after each batch rather than after processing all examples.

Loss. Here, you’ll use the softmax cross entropy function to perform an N-way classification. The softmax function normalizes the output of the inference function so that the values in the tensor sum to one. With this normalized tensor, cross entropy is calculated against the one-hot encoded labels. Cross entropy gives a measure of how far off the prediction is from the ground truth. At each iteration, an optimizer is applied to minimize this cross entropy.

Figure: cross entropy

Train & Evaluate

Below, you’ll train the model for 10k iterations. Every 1000 iterations, you’ll test the model against the validation set to get an idea of the accuracy. Finally, you’ll evaluate the trained model against the test dataset to get a measure of out-of-sample accuracy. At 10k iterations, you should see accuracy around 98.0%.

To execute this code, run this command:

$ python3 mnist_conv2d_medium_tutorial/train.py

(Building the computational graph can take a few seconds depending on hardware.)

https://gist.github.com/ColeMurray/a073ed2749f8660fac9e93b9800afcce?embedable=true#file-train-py

With the model trained, you’ll now evaluate it on the test set from the last checkpoint.
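Evaluation comes down to comparing the highest-scoring index of each prediction with the true label and counting the matches. The sketch below shows that calculation in plain Python. It is not the tutorial's evaluate.py; the helper names and sample scores are assumptions, and only three classes are used instead of ten to keep the example short.

```python
def argmax(values):
    """Index of the largest value, i.e. the predicted class."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of rows whose highest-scoring index matches the true label."""
    correct = sum(1 for scores, label in zip(predictions, labels) if argmax(scores) == label)
    return correct / len(labels)


# Hypothetical outputs for four images (three classes instead of ten for brevity).
predictions = [
    [2.0, 0.1, 0.3],  # predicts class 0
    [0.2, 1.5, 0.1],  # predicts class 1
    [0.9, 0.8, 0.1],  # predicts class 0
    [0.1, 0.2, 3.0],  # predicts class 2
]
labels = [0, 1, 2, 2]

result = accuracy(predictions, labels)  # 3 of the 4 predictions match
```

Running the same calculation over the held-out test set, which the model never saw during training or tuning, is what yields the out-of-sample accuracy figure.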
https://gist.github.com/ColeMurray/73e509e973379a943f6efbe0451d99c1?embedable=true#file-evaluate-py

$ python3 mnist_conv2d_medium_tutorial/evaluate.py

Code up to this point: https://github.com/ColeMurray/tensorflow-cnn-tutorial/tree/train_and_evaluate

You can visualize your results by running:

$ tensorboard --logdir=graphs/ --port=6006

Then navigate to localhost:6006 in your browser.

Training on a GPU

As you noticed, training a CNN can be quite slow due to the amount of computation required for each iteration. You’ll now use GPUs to speed up the computation. Tensorflow, by default, gives higher priority to GPUs when placing operations if both CPU and GPU are available for the given operation. To keep the tutorial simple, you won’t explicitly define operation placement. You can read more about how to do this here.

Create a GPU Box

For this tutorial, you’ll use a community AMI. Head over to the AWS console and launch a new EC2 instance. At the AMI screen, select community and enter this AMI ID: ami-5e853c48. This AMI comes with Tensorflow and Nvidia drivers with CUDA pre-installed.

For the instance type, select g2.2xlarge. After selecting an instance type, be sure to create a key pair. This key pair will allow you to ssh into the instance and copy and execute your code.

Sync Your Code

Now that your instance is created, you’ll need to copy your code and dataset onto it. The easiest way to do this is with rsync. Rsync is a Unix command that transfers files efficiently, typically over ssh. It’s highly flexible, offering many options to alter its behavior. The command below copies your project directory to the home directory of your GPU instance’s user.

$ rsync -trucv mnist_conv2d_medium_tutorial ip-address-of-your-gpu-box:/home/ubuntu/

Run the code

Below, you’ll ssh into the instance and install the package. After installation, run the train command. After running the train command, you’ll see output indicating where the operations are being placed.
If everything is set up correctly, the operations are placed onto the GPU.

$ ssh ubuntu@ip-address-of-your-gpu-box
$ cd mnist-conv2d-medium-tutorial
$ pip3 install .
$ python3 mnist_conv2d_medium_tutorial/train.py

After ~20 mins, training will complete and you can run the evaluate command to test against the test set.

$ python3 mnist_conv2d_medium_tutorial/evaluate.py

Conclusion

In this tutorial, you learned the concepts behind convolutional neural networks. Additionally, you learned how to implement a basic CNN in Tensorflow that achieves ~98.0% accuracy. Finally, you learned how to run your code on a GPU for improved performance.

Complete code here: https://github.com/ColeMurray/tensorflow-cnn-tutorial (Tensorflow tutorial on convolutional neural networks)

Next Steps:

- Play with hyperparameters (batch size, learning rate, kernel size, number of iterations) to see how they affect model performance
- Train and evaluate your model against other datasets (CIFAR-10)
- Go deeper

Call to Action:

If you enjoyed this tutorial, follow and recommend!

Interested in learning more about Deep Learning / Machine Learning? Check out my other tutorials:

- Deep learning with Keras on Google Compute Engine
- Recommendation Systems with Apache Spark on Google Compute Engine

Other places you can find me:

Twitter: https://twitter.com/_ColeMurray