The human visual system is a marvel. People can effortlessly recognize digits, but the task is not as simple as it looks. The human brain has tens of billions of neurons and trillions of connections between them, which is what makes this exceptionally complex task of image processing feel easy. For computers, however, recognizing digits is a challenging task. Simple hunches about how to recognize digits become difficult to express algorithmically. Moreover, handwriting varies significantly from person to person, which makes the problem even harder.

A handwritten digit recognition system is a machine that trains itself so that it can recognize digits from different sources such as emails, bank cheques, papers and images.

Google Colab

Google Colab has been used to implement the network. It is a free cloud service that can be used to develop deep learning applications using popular libraries such as Keras, TensorFlow, PyTorch and OpenCV. The most important feature that distinguishes Colab from other free cloud services is that it provides a GPU at no cost. Thus, if your PC does not meet the hardware requirements or does not have a GPU, Colab is a good option because a stable internet connection is the only requirement.

MNIST Dataset

MNIST stands for “Modified National Institute of Standards and Technology”. It is a dataset of 70,000 images of handwritten digits. Each image is 28x28 pixels, i.e. 784 features. Each feature represents one pixel’s intensity, from 0 (white) to 255 (black). The database is further divided into 60,000 training and 10,000 testing images.

Phases of Implementation

Import the libraries

First, we imported all the libraries that we are going to use. We imported TensorFlow, which is a free, open-source library used for machine learning applications such as neural networks. Further, we imported pyplot, which is used for plotting, from the matplotlib library, which is used for visualisation purposes. After that, we imported NumPy, i.e. Numerical Python.
NumPy is used to perform various mathematical operations.

Load the dataset

The Keras library already contains some datasets such as CIFAR10, CIFAR100, the Boston Housing price regression dataset and the IMDB movie review sentiment classification dataset. The MNIST dataset is also part of it. So, we imported it from keras.datasets and loaded it into the variable “objects”. The objects.load_data() method returns the training data (train_img), its labels (train_lab), the testing data (test_img) and its labels (test_lab). Out of the 70,000 images provided in the dataset, 60,000 are given for training and 10,000 for testing.

Before preprocessing the data, we displayed the first 20 images of the training set with the help of a for loop.

subplot() is used to add a subplot, or grid-like structure, to the current figure. The first argument is the number of rows, the second is the number of columns and the third is the position index in the grid. Suppose we have to plot 10 images in a 4x5 grid starting from the second position in the grid. Then the position index would run from 2 to 11, for example subplot(4, 5, i + 2) for i from 0 to 9.

imshow() is used to display data as an image, here the training image train_img[i], whereas cmap stands for the colour map. cmap is an optional parameter. If the image is an array of shape (M, N), then cmap controls the colour map used to display the values. cmap='gray' displays the image in grayscale, while cmap='gray_r' displays it in inverse grayscale.

title() sets the title for each image. We have set “Digit: train_lab[i]” as the title for each image in the subplot.

subplots_adjust() is used for tuning the subplot layout. To change the space between two rows, we have used hspace. If you want to change the space between two columns, you can use wspace. The default values of the subplot layout parameters come from matplotlib's rcParams (the figure.subplot.* settings).

To hide the axes of each image, plt.axis('off') has been used.
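The plotting loop described above can be sketched roughly as follows. This is a minimal illustration, not the original notebook code: it uses random stand-in arrays instead of the real MNIST data so that it runs without a download, and the figure size and hspace value are assumptions chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in Colab to see the figure inline
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (assumption): random 28x28 "images" and labels instead of MNIST.
rng = np.random.default_rng(0)
train_img = rng.integers(0, 256, size=(20, 28, 28), dtype=np.uint8)
train_lab = rng.integers(0, 10, size=20)

fig = plt.figure(figsize=(10, 8))            # figure size is an illustrative choice
for i in range(20):
    plt.subplot(4, 5, i + 1)                 # 4 rows, 5 columns, position i + 1
    plt.imshow(train_img[i], cmap="gray_r")  # inverse grayscale
    plt.title(f"Digit: {train_lab[i]}")      # label shown above each image
    plt.axis("off")                          # hide the axes
plt.subplots_adjust(hspace=0.5)              # extra space between rows (assumed value)
```

With the real data, train_img and train_lab would come from load_data() instead of the random generator, and the figure would appear directly below the notebook cell.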
With the axes hidden via plt.axis('off'), we then displayed the shapes of the training and testing sets. (60000, 28, 28) means there are 60,000 images in the training set and each image is of size 28x28 pixels. Similarly, there are 10,000 images of the same size in the testing set.

So each image is of size 28x28, i.e. 784 features, and each feature represents the intensity of one pixel from 0 to 255. You can use print(train_img[0]) to print the first training image as a 28x28 matrix.

We plotted the first training image as a histogram. hist() is used to plot the histogram for the first training image, i.e. train_img[0]. The image has been reshaped into a 1-D array of size 784. facecolor is an optional parameter which specifies the colour of the histogram. The title of the histogram, the Y-axis and the X-axis have been named “Pixel vs its intensity”, “PIXEL” and “Intensity”.

[Figure: histogram of the first training image before normalisation]

Pre-process the data

Before feeding the data to the network, we normalize it. Normalizing the input data helps to speed up the training. It also reduces the chance of getting stuck in local optima, since we are using stochastic gradient descent to find the optimal weights for the network.

The pixel values are between 0 and 255. Scaling the input values is good practice when using neural network models. Since the scale is well known and well behaved, we can quickly normalize the pixel values to the range 0 to 1 by dividing each value by the maximum intensity of 255.

[Figure: histogram of the first training image after normalisation]

Creating the model

There are 3 ways to create a model in Keras:

1. The Sequential model, which is very straightforward and simple. It allows you to build a model layer by layer.
2. The Functional API, which is an easy-to-use, fully-featured API that supports arbitrary model architectures. This is the Keras “industry-strength” model.
3. Model subclassing, where you implement everything from scratch on your own.

Here, we have used the Sequential model. This model has one input layer, one output layer and two hidden layers.
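Returning to the pre-processing step above, the scaling and the reshape used for the histogram can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: train_img here is a small random stand-in for the real training array, and the name train_img_norm is hypothetical.

```python
import numpy as np

# Stand-in for train_img (assumption): five random 28x28 uint8 images.
rng = np.random.default_rng(0)
train_img = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Scale intensities from [0, 255] to [0, 1] by dividing by the maximum intensity.
train_img_norm = train_img / 255.0

# Flatten one image into the 784 values that a histogram would be drawn from.
first_flat = train_img_norm[0].reshape(784)

print(train_img.shape, train_img.dtype)            # shape and integer type before scaling
print(train_img_norm.min(), train_img_norm.max())  # both values lie within [0, 1]
print(first_flat.shape)
```

Dividing by 255.0 rather than 255 makes the floating-point result explicit, although NumPy's true division would produce floats either way.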
In the Sequential model, Sequential() is used to create the layers of the network in sequence, and .add() is used to add a layer to the model.

In the first layer (the input layer), we feed the image as the input. Since each image is of size 28x28, we have used Flatten() to turn it into a one-dimensional vector of 784 values.

We have used Dense() in the other layers. It ensures that each neuron in the previous layer is connected to every neuron in the next layer.

The model is a simple neural network with two hidden layers of 512 neurons each. A rectified linear unit (ReLU) activation function is used for the neurons in the hidden layers. The nicest thing about it is that its gradient is equal to 1 for every positive input (and 0 otherwise), so for active neurons the error passes through undiminished during back-propagation.

The output layer has 10 neurons, i.e. one for each class from 0 to 9. A softmax activation function is used on the output layer to turn the outputs into probability-like values.

Note: You can add more neurons in the hidden layers. You can even increase the number of hidden layers in the model to try to improve accuracy. However, training will take more time.

Compiling the network

Next, we need to compile our model. Compiling the model takes three parameters: optimizer, loss and metrics.

The optimizer controls the learning rate. We are using 'adam' as our optimizer. It is generally a good optimizer to use for many cases. It adjusts the learning rate throughout the training.

We will use 'sparse_categorical_crossentropy' for our loss function because it saves memory as well as computation, since it uses a single integer for each class rather than a whole vector. A lower score indicates that the model is performing better.

To determine the accuracy, we will use the 'accuracy' metric, which reports the accuracy score while we train the model (on the training data, and on a validation set if one is passed to fit()).

Train the model

We will train the model with the help of the fit() function.
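To make the activation and loss choices from the model and compile steps more concrete, here is a small NumPy sketch of what the layers compute on a forward pass. It is not the Keras code used in the article: the weights are random and untrained, the helper names are hypothetical, and the batch of images and the labels are stand-ins chosen only to show the shapes involved.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: negative values become 0.
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient of ReLU: 1 where the input is positive, 0 elsewhere.
    return (x > 0).astype(float)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(probs, labels):
    # Labels are plain integers (0 to 9), not one-hot vectors.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

rng = np.random.default_rng(0)
x = rng.random((4, 28, 28))      # batch of 4 stand-in images in [0, 1)
labels = np.array([5, 0, 4, 1])  # hypothetical integer labels

# Random, untrained weights (assumption, for shape illustration only).
w1, b1 = rng.normal(0, 0.05, (784, 512)), np.zeros(512)
w2, b2 = rng.normal(0, 0.05, (512, 512)), np.zeros(512)
w3, b3 = rng.normal(0, 0.05, (512, 10)), np.zeros(10)

flat = x.reshape(len(x), -1)     # Flatten: (4, 28, 28) -> (4, 784)
h1 = relu(flat @ w1 + b1)        # first hidden layer, 512 neurons
h2 = relu(h1 @ w2 + b2)          # second hidden layer, 512 neurons
probs = softmax(h2 @ w3 + b3)    # output layer, 10 probability-like values
loss = sparse_categorical_crossentropy(probs, labels)

print(probs.shape, loss)
```

Each row of probs sums to 1, and the loss only needs the integer label of each image to pick out the relevant probability, which is why the sparse variant avoids building one-hot vectors.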
fit() takes as parameters the training data (train_img), the training labels (train_lab) and the number of epochs. The number of epochs is the number of times the model will cycle through the data. The more epochs we run, the more the model will improve, up to a certain point. After that point, the model will stop improving with each epoch.

We will save the model as project.h5.

Evaluate the model

model.evaluate() computes the loss and any metric defined when compiling the model. So in our case, the accuracy is computed on the 10,000 testing examples using the network weights given by the saved model.

verbose can be 0, 1 or 2. By default verbose is 1.
verbose = 0: silent.
verbose = 1: progress bar and one line per epoch.
verbose = 2: one line per epoch, i.e. epoch no./total no. of epochs.

After evaluating the model, we check its predictions on the testing set. model.predict() is used to make predictions on the testing set. np.argmax() returns the indices of the maximum values along an axis.

To make a prediction for a new image that is not part of the MNIST dataset, we first create a function named load_image. This function converts the image into an array of pixels, which is fed to the model as an input.

To upload a file from the local drive, we used the code:

    from google.colab import files
    uploaded = files.upload()

It will prompt you to select a file. Click on “Choose Files”, then select the file and wait for it to be uploaded 100%. You will see the name of the file once Colab has uploaded it.

To display the image file, we used the code:

    from IPython.display import Image
    Image('img.jpeg', width=250, height=250)

img.jpeg is the file name.

As you can see, we have successfully predicted the value as 5.

Now, if we want to run the model after a few days, we would have to run the whole code again, which is time-consuming. In that case, you can use the saved model, i.e. project.h5.
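The prediction step described above, where model.predict() returns one row of probabilities per image and np.argmax() picks the most likely digit, can be illustrated with a small NumPy sketch. The probability values and labels below are made up for illustration and are not output from the trained model.

```python
import numpy as np

# Hypothetical softmax outputs for three images (each row sums to 1).
predictions = np.array([
    [0.01, 0.01, 0.02, 0.05, 0.01, 0.80, 0.03, 0.03, 0.02, 0.02],
    [0.90, 0.01, 0.01, 0.01, 0.01, 0.02, 0.01, 0.01, 0.01, 0.01],
    [0.05, 0.05, 0.10, 0.10, 0.30, 0.05, 0.05, 0.20, 0.05, 0.05],
])
true_labels = np.array([5, 0, 7])  # hypothetical ground-truth digits

# axis=1 takes the index of the largest probability within each row.
predicted_digits = np.argmax(predictions, axis=1)

# Fraction of predictions that match the labels, i.e. the accuracy metric.
accuracy = float(np.mean(predicted_digits == true_labels))

print(predicted_digits)  # [5 0 4]
print(accuracy)
```

The third image is deliberately misclassified here (predicted 4, labelled 7) to show how the accuracy falls below 1 when a prediction is wrong.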
The saved project.h5 file can be downloaded from the folder symbol before closing the Colab notebook. When you want to run the model again, all you have to do is upload the project.h5 file from your computer using the code:

    from google.colab import files
    uploaded = files.upload()

When the file is 100% uploaded, use the following code. After that, you can predict the digit for new images without running the whole code.

    model = tf.keras.models.load_model('project.h5')

Link for reference

https://colab.research.google.com/drive/10LzhqSlJx4bnCNT6C8llhuXTDuh_WQPG?usp=sharing

Thanks for reading!

Also published at https://medium.com/@officialgargijha/mnist-handwritten-digit-recognition-using-neural-network-2b729bacb0d5