Originally published at https://www.datacamp.com/community/tutorials/tensorflow-tutorial
Deep learning is a subfield of machine learning: a set of algorithms inspired by the structure and function of the brain.
TensorFlow is the second machine learning framework that Google created and used to design, build, and train deep learning models. You can use the TensorFlow library to do numerical computations, which in itself doesn't seem all too special, but these computations are done with data flow graphs. In these graphs, nodes represent mathematical operations, while the edges represent the data, usually multidimensional data arrays or tensors, that is communicated between them.
You see? The name "TensorFlow" is derived from the operations that neural networks perform on multidimensional data arrays or tensors! It's literally a flow of tensors. For now, this is all you need to know about tensors, but you'll go deeper into this in the next sections!
Today's TensorFlow tutorial for beginners will introduce you to performing deep learning in an interactive way:
- You'll first learn more about tensors;
- Then, the tutorial will briefly go over some of the ways that you can install TensorFlow on your system so that you're able to get started and load data in your workspace;
- After this, you'll go over some of the TensorFlow basics: you'll see how you can easily get started with simple computations.
- After this, you get started on the real work: you'll load in data on Belgian traffic signs and explore it with simple statistics and plotting.
- In your exploration, you'll see that there is a need to manipulate your data in such a way that you can feed it to your model. That's why you'll take the time to rescale your images and convert them to grayscale.
- Next, you can finally get started on your neural network model! You'll build up your model layer by layer;
- Once the architecture is set up, you can use it to train your model interactively and to eventually also evaluate it by feeding some test data to it.
- Lastly, you'll get some pointers for further improvements that you can make to the model you just constructed and for how you can continue your learning with TensorFlow.
Download the notebook of this tutorial here.
Also, you might be interested in a course on Deep Learning in Python, DataCamp's Keras tutorial, or the keras with R tutorial.
Introducing Tensors
To understand tensors well, it's good to have some working knowledge of linear algebra and vector calculus. You already read in the introduction that tensors are implemented in TensorFlow as multidimensional data arrays, but some more introduction is maybe needed in order to completely grasp tensors and their use in machine learning.
Before you go into plane vectors, it's a good idea to briefly revisit the concept of "vectors": vectors are special types of matrices, which are rectangular arrays of numbers. Because vectors are ordered collections of numbers, they are often seen as column matrices: they have just one column and a certain number of rows. In other terms, you could also consider vectors as scalar magnitudes that have been given a direction.
Remember: an example of a scalar is "5 meters" or "60 m/sec", while a vector is, for example, "5 meters north" or "60 m/sec east". The difference between these two is obviously that the vector has a direction. Nevertheless, the examples that you have seen up until now might seem far off from the vectors that you might encounter when you're working with machine learning problems. This is normal: the length of a mathematical vector is a pure number; it is absolute. The direction, on the other hand, is relative: it is measured relative to some reference direction and has units of radians or degrees. You usually assume that the direction is positive and in counterclockwise rotation from the reference direction.
Visually, of course, you represent vectors as arrows, as you can see in the picture above. This means that you can also consider vectors as arrows that have direction and length. The direction is indicated by the arrow's head, while the length is indicated by the length of the arrow.
So what about plane vectors then?
Plane vectors are the simplest setup of tensors. They are much like the regular vectors that you have seen above, with the sole difference that they find themselves in a vector space. To understand this better, let's start off with an example: you have a vector that is 2 × 1. This means that the vector belongs to the set of real numbers that come paired two at a time. Or, stated differently, they are part of two-space. In such cases, you can represent vectors on the coordinate (x,y) plane with arrows or rays.
Working from this coordinate plane in a standard position, where vectors have their starting point at the origin (0,0), you can derive the x coordinate by looking at the first row of the vector, while you'll find the y coordinate in the second row. Of course, this standard position doesn't always need to be maintained: vectors can move parallel to themselves in the plane without experiencing changes.
Note that, similarly, for vectors that are of size 3 × 1, you talk about three-space. You can represent the vector as a three-dimensional figure with arrows pointing to positions in the vector space: they are drawn on the standard x, y and z axes.
It's nice to have these vectors and to represent them on the coordinate plane, but in essence, you have these vectors so that you can perform operations on them. One thing that can help you in doing this is expressing your vectors as bases or unit vectors.
Unit vectors are vectors with a magnitude of one. You'll often recognize the unit vector by a lowercase letter with a circumflex, or "hat". Unit vectors come in handy if you want to express a 2-D or 3-D vector as a sum of two or three orthogonal components, such as the x- and y-axes, or the z-axis.
And when you are talking about expressing one vector, for example, as sums of components, you'll see that you're talking about component vectors, which are two or more vectors whose sum is that given vector.
Tip: watch this video, which explains what tensors are with the help of simple household objects!
Next to plane vectors, covectors and linear operators are two other cases; all three have one thing in common: they are specific cases of tensors. You still remember how a vector was characterized in the previous section as a scalar magnitude that has been given a direction. A tensor, then, is the mathematical representation of a physical entity that may be characterized by magnitude and multiple directions.
And, just like you represent a scalar with a single number and a vector with a sequence of three numbers in a 3-dimensional space, for example, a tensor can be represented by an array of 3^R numbers in a 3-dimensional space.
The "R" in this notation represents the rank of the tensor: this means that in a 3-dimensional space, a second-rank tensor can be represented by 3^2 = 9 numbers. In an N-dimensional space, scalars will still require only one number, while vectors will require N numbers, and tensors will require N^R numbers. This explains why you often hear that scalars are tensors of rank 0: since they have no direction, you can represent them with one number.
With this in mind, it's fairly easy to recognize scalars, vectors, and tensors and to set them apart: scalars can be represented by a single number, vectors by an ordered set of numbers, and tensors by an array of numbers.
What makes tensors so special is the combination of components and basis vectors: basis vectors transform one way between reference frames and the components transform in just such a way as to keep the combination between components and basis vectors the same.
Installing TensorFlow
Now that you know more about TensorFlow, it's time to get started and install the library. Here, it's good to know that TensorFlow provides APIs for Python, C++, Haskell, Java, Go, and Rust, and there's also a third-party package for R called `tensorflow`.
Tip: if you want to know more about deep learning packages in R, consider checking out DataCamp's keras: Deep Learning in R Tutorial.
In this tutorial, you will download a version of TensorFlow that will enable you to write the code for your deep learning project in Python. On the TensorFlow installation webpage, you'll see some of the most common ways and latest instructions to install TensorFlow using `virtualenv`, `pip`, and Docker; lastly, there are also some other ways of installing TensorFlow on your personal computer.
Note: you can also install TensorFlow with Conda if you're working on Windows. However, since this installation of TensorFlow is community supported, it's best to check the official installation instructions.
Now that you have gone through the installation process, it's time to double-check that you have installed TensorFlow correctly by importing it into your workspace under the alias `tf`:
```python
import tensorflow as tf
```
Note that the alias that you used in the line of code above is sort of a convention: it's used to ensure that you remain consistent with other developers that are using TensorFlow in data science projects on the one hand, and with open-source TensorFlow projects on the other hand.
Getting Started With TensorFlow: Basics
You'll generally write TensorFlow programs that you run as a chunk; this is at first sight kind of contradictory when you're working with Python. However, if you would like, you can also use TensorFlow's Interactive Session, which you can use to work more interactively with the library. This is especially handy when you're used to working with IPython.
For this tutorial, you'll focus on the second option: this will help you to get kickstarted with deep learning in TensorFlow. But before you go any further into this, let's first try out some minor stuff before you start with the heavy lifting.
First, import the `tensorflow` library under the alias `tf`, as you have seen in the previous section. Then initialize two variables that are actually constants. Pass an array of four numbers to the `constant()` function.
Note that you could potentially also pass in an integer, but that more often than not, you'll find yourself working with arrays. As you saw in the introduction, tensors are all about arrays! So make sure that you pass in an array :) Next, you can use `multiply()` to multiply your two variables. Store the result in the `result` variable. Lastly, print out `result` with the help of the `print()` function. Find the exercise here.
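Put together, these steps look something like the minimal sketch below. It uses the TensorFlow 1.x API that this tutorial is written against, and the example arrays are just an assumption; the exact numbers in the original exercise may differ.

```python
import tensorflow as tf

# Initialize two constants, each holding an array of four numbers
x1 = tf.constant([1, 2, 3, 4])
x2 = tf.constant([5, 6, 7, 8])

# Multiply the two constants element-wise
result = tf.multiply(x1, x2)

# This prints an abstract tensor, not the computed values
print(result)
```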
Note that you have defined constants in the DataCamp Light code chunk above. However, there are two other types of values that you can potentially use, namely placeholders, which are values that are unassigned and that will be initialized by the session when you run it. Like the name already gave away, a placeholder is simply a stand-in for a tensor that will always be fed when the session is run. There are also Variables, which are values that can change. The constants, as you might have already gathered, are values that don't change.
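As a quick, hypothetical illustration of these three kinds of values in the TensorFlow 1.x API:

```python
c = tf.constant([1.0, 2.0])                 # a constant: its value never changes
p = tf.placeholder(tf.float32, shape=(2,))  # a placeholder: fed when the session runs
v = tf.Variable([0.0, 0.0])                 # a Variable: a value that can change
```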
The result of the lines of code is an abstract tensor in the computation graph. However, contrary to what you might expect, `result` doesn't actually get calculated; you have just defined the model, but no process has run to calculate the result. You can see this in the print-out: there's not really a result that you want to see (namely, 30). This means that TensorFlow uses lazy evaluation!
However, if you do want to see the result, you have to run this code in an interactive session. You can do this in a few ways, as is demonstrated in the DataCamp Light code chunks below.
Note that you can also use the following lines of code to start up an interactive Session, run `result`, and close the Session automatically again after printing the output.
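A sketch of that pattern, using a `with` block so the Session is closed automatically:

```python
# Start a Session, run `result`, and close the Session automatically
with tf.Session() as sess:
    output = sess.run(result)
    print(output)
```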
In the code chunks above you have just defined a default Session, but it's also good to know that you can pass in options as well. You can, for example, specify the `config` argument and then use the `ConfigProto` protocol buffer to add configuration options for your session.
For example, if you add `config=tf.ConfigProto(log_device_placement=True)` to your Session, you make sure that you log the GPU or CPU device that is assigned to an operation. You will then get information about which devices are used in the session for each operation. You could also use the following configuration, for example, when you use soft constraints for the device placement: `config=tf.ConfigProto(allow_soft_placement=True)`.
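For instance, a session that logs device placement could look like this, assuming the `result` tensor from before:

```python
# Log which device (GPU or CPU) each operation is placed on
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(result))
```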
Now that you've got TensorFlow installed and imported into your workspace and you've gone through the basics of working with this package, it's time to leave this aside for a moment and turn your attention to your data. Just like always, you'll first take your time to explore and understand your data better before you start modeling your neural network.
Belgian Traffic Signs: Background
Even though traffic is a topic that is generally known amongst you all, it doesn't hurt to briefly go over the observations that are included in this dataset to see if you understand everything before you start. In essence, in this section, you'll get up to speed with the domain knowledge that you need to go further with this tutorial.
Of course, because I'm Belgian, I'll make sure you'll also get some anecdotes :)
- Belgian traffic signs are usually in Dutch and French. This is good to know, but for the dataset that you'll be working with, it's not too important!
- There are six categories of traffic signs in Belgium: warning signs, priority signs, prohibitory signs, mandatory signs, signs related to parking and standing still on the road and, lastly, designatory signs.
- On January 1st, 2017, more than 30,000 traffic signs were removed from Belgian roads. These were all prohibitory signs relating to speed.
- Talking about removal, the overwhelming presence of traffic signs has been an ongoing discussion in Belgium (and by extension, the entire European Union).
Loading And Exploring The Data
Now that you have gathered some more background information, it's time to download the dataset here. You should get the two zip files listed next to "BelgiumTS for Classification (cropped images)", which are called "BelgiumTSC_Training" and "BelgiumTSC_Testing".
Tip: if you have downloaded the files or will do so after completing this tutorial, take a look at the folder structure of the data that you've downloaded! You'll see that the testing, as well as the training data folders, contain 62 subfolders, one for each of the 62 types of traffic signs that you'll use for classification in this tutorial. Additionally, you'll find that the files have the file extension `.ppm`, or Portable Pixmap Format. You have downloaded images of the traffic signs!
Let's get started with importing the data into your workspace. Let's start with the lines of code that appear below the User-Defined Function (UDF) `load_data()`:
- First, set your `ROOT_PATH`. This path is the one where you have made the directory with your training and test data.
- Next, you can add the specific paths to your `ROOT_PATH` with the help of the `join()` function. You store these two specific paths in `train_data_directory` and `test_data_directory`.
- You see that after that, you can call the `load_data()` function and pass in the `train_data_directory` to it.
- Now, the `load_data()` function itself starts off by gathering all the subdirectories that are present in the `train_data_directory`; it does so with the help of a list comprehension, which is quite a natural way of constructing lists. It basically says that if you find something in the `train_data_directory`, you'll double-check whether this is a directory, and if it is one, you'll add it to your list. Remember that each subdirectory represents a label.
- Next, you have to loop through the subdirectories. You first initialize two lists, `labels` and `images`. Next, you gather the paths of the subdirectories and the file names of the images that are stored in these subdirectories. After that, you can collect the data in the two lists with the help of the `append()` function. A sketch of these steps follows this list.
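Since the exported article no longer shows the code chunk itself, here is a sketch of what the `load_data()` UDF and the lines below it might look like. It assumes an older version of scikit-image in which `skimage.data.imread()` is still available, and `ROOT_PATH` is a placeholder you should adjust to your own setup.

```python
import os
import skimage.data

def load_data(data_directory):
    # Gather all subdirectories of `data_directory`: each one represents a label
    directories = [d for d in os.listdir(data_directory)
                   if os.path.isdir(os.path.join(data_directory, d))]
    labels = []
    images = []
    # Loop through the subdirectories and collect the images and their labels
    for d in directories:
        label_directory = os.path.join(data_directory, d)
        file_names = [os.path.join(label_directory, f)
                      for f in os.listdir(label_directory)
                      if f.endswith(".ppm")]
        for f in file_names:
            images.append(skimage.data.imread(f))
            labels.append(int(d))
    return images, labels

# Hypothetical root path: the directory that contains the `TrafficSigns` folder
ROOT_PATH = "/your/root/path"
train_data_directory = os.path.join(ROOT_PATH, "TrafficSigns/Training")
test_data_directory = os.path.join(ROOT_PATH, "TrafficSigns/Testing")

images, labels = load_data(train_data_directory)
```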
Note that in the above code chunk, the training and test data are located in folders named "Training" and "Testing", which are both subdirectories of another directory "TrafficSigns". On a local machine, this could look something like "/Users/yourName/Downloads/TrafficSigns", with two subfolders called "Training" and "Testing".
With your data loaded in, it's time for some data inspection! You can start off with a pretty simple analysis with the help of the `ndim` and `size` attributes of the `images` array:
Note that the `images` and `labels` variables are lists, so you might need to use `np.array()` to convert the variables to arrays in your own workspace. This has been done for you here!
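A minimal sketch of this inspection step, assuming the `images` and `labels` lists from the loading code above:

```python
import numpy as np

# Convert the lists to arrays for inspection
images_array = np.array(images)
labels_array = np.array(labels)

# Print the number of dimensions and the number of elements
print(images_array.ndim)
print(images_array.size)

# Print the first image: an array of arrays of pixel values
print(images_array[0])
```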
Note that the `images[0]` that you printed out is in fact one single image that is represented by arrays in arrays! This might seem counterintuitive at first, but it's something that you'll get used to as you go further into working with images in machine learning or deep learning applications.
Next, you can also take a small look at the `labels`, but you shouldn't see too many surprises at this point.
These numbers already give you some insights into how successful your import was and the exact size of your data. At first sight, everything has been executed the way you expected it to, and you see that the size of the array is considerable if you take into account that you're dealing with arrays within arrays.
Tip: try adding the `flags`, `itemsize` and `nbytes` attributes to your arrays to get more information about the memory layout, the length of one array element in bytes, and the total bytes consumed by the array's elements. You can test this out in the IPython console in the DataCamp Light chunk above!
Next, you can also take a look at the distribution of the traffic signs.
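A quick way to plot that distribution, assuming the `labels` list from before, is a histogram with one bin per class:

```python
import matplotlib.pyplot as plt

# Plot the distribution of the labels over the 62 classes
plt.hist(labels, 62)
plt.show()
```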
Awesome job! Now let's take a closer look at the histogram that you made!
You clearly see that not all types of traffic signs are equally represented in the dataset. This is something that you'll deal with later when you're manipulating the data before you start modeling your neural network.
At first sight, you see that there are labels that are more heavily present in the dataset than others: the labels 22, 32, 38, and 61 definitely jump out. At this point, it's nice to keep this in mind, but you'll definitely go further into this in the next section!
The previous, small analyses or checks have already given you some idea of the data that you're working with, but when your data mostly consists of images, the next step you should take to explore your data is to visualize it.
Let's check out some random traffic signs:
- First, make sure that you import the `pyplot` module of the `matplotlib` package under the common alias `plt`.
- Then, you're going to make a list with 4 random numbers. These will be used to select traffic signs from the `images` array that you have just inspected in the previous section. In this case, you go for `300`, `2250`, `3650` and `4000`.
- Next, you'll say that for every element in the length of that list, so from 0 to 4, you're going to create subplots without axes (so that they don't go running off with all the attention and your focus is solely on the images!). In these subplots, you're going to show a specific image from the `images` array that is in accordance with the number at index `i`. In the first loop, you'll pass `300` to `images[]`, in the second round `2250`, and so on. Lastly, you'll adjust the subplots so that there's enough width in between them.
- The last thing that remains is to show your plot with the help of the `show()` function! A sketch of these steps follows this list.
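A sketch that follows the steps above; the four indexes are the arbitrary ones named in the list:

```python
import matplotlib.pyplot as plt

# Determine the (arbitrary) indexes of the images you want to see
traffic_signs = [300, 2250, 3650, 4000]

# Fill out the subplots with the selected images and turn the axes off
for i in range(len(traffic_signs)):
    plt.subplot(1, 4, i + 1)
    plt.axis('off')
    plt.imshow(images[traffic_signs[i]])
    plt.subplots_adjust(wspace=0.5)

plt.show()
```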
As you sort of guessed by the 62 labels that are included in this dataset, the signs are different from each other.
But what else do you notice? Take another close look at the images below:
These four images are not of the same size!
You can obviously toy around with the numbers that are contained in the `traffic_signs` list and follow up more thoroughly on this observation, but be that as it may, this is an important observation which you will need to take into account when you start working more towards manipulating your data so that you can feed it to the neural network.
Let's confirm the hypothesis of the differing sizes by printing the shape, the minimum and the maximum values of the specific images that you have included in the subplots.
The code below heavily resembles the code that you used to create the above plot, but differs in that here, you'll alternate sizes and images instead of plotting just the images next to each other.
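A sketch of that variation, continuing from the previous snippet, which prints the shape and value range of each selected image alongside it:

```python
# Plot each selected image and print its shape, minimum and maximum pixel value
for i in range(len(traffic_signs)):
    plt.subplot(1, 4, i + 1)
    plt.axis('off')
    plt.imshow(images[traffic_signs[i]])
    plt.subplots_adjust(wspace=0.5)
    plt.show()
    print("shape: {0}, min: {1}, max: {2}".format(
        images[traffic_signs[i]].shape,
        images[traffic_signs[i]].min(),
        images[traffic_signs[i]].max()))
```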
Note how you use the `format()` method on the string `"shape: {0}, min: {1}, max: {2}"` to fill out the arguments `{0}`, `{1}`, and `{2}` that you defined.
Now that you have seen loose images, you might also want to revisit the histogram that you printed out in the first steps of your data exploration; you can easily do this by plotting an overview of all the 62 classes and one image that belongs to each class.
Note that even though you define 64 subplots, not all of them will show images (as there are only 62 labels!). Note also that, again, you don't include any axes to make sure that the readers' attention doesn't dwell far from the main topic: the traffic signs!
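One way to build that overview, assuming the `images` and `labels` lists from before:

```python
# Get the unique labels
unique_labels = set(labels)

# Initialize the figure
plt.figure(figsize=(15, 15))

i = 1
for label in unique_labels:
    # Pick the first image for each unique label
    image = images[labels.index(label)]
    # 64 subplots in an 8 by 8 grid, for the 62 labels
    plt.subplot(8, 8, i)
    plt.axis('off')
    # Title each subplot with the label and its instance count
    plt.title("Label {0} ({1})".format(label, labels.count(label)))
    i += 1
    plt.imshow(image)

plt.show()
```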
As you mostly guessed in the histogram above, there are considerably more traffic signs with labels 22, 32, 38, and 61. This hypothesis is now confirmed in this plot: you see that there are 375 instances with label 22, 316 instances with label 32, 285 instances with label 38 and, lastly, 282 instances with label 61.
One of the most interesting questions that you could ask yourself now is whether there's a connection between all of these instances: maybe all of them are designatory signs?
Let's take a closer look: you see that labels 22 and 32 are prohibitory signs, but that labels 38 and 61 are a designatory and a priority sign, respectively. This means that there's no immediate connection between these four, except for the fact that half of the signs with a heavy presence in the dataset are of the prohibitory kind.
Feature Extraction
Now that you have thoroughly explored your data, it's time to get your hands dirty! Let's recap briefly what you discovered to make sure that you don't forget any steps in the manipulation:
- The size of the images was unequal;
- There are 62 labels or target values (as your labels start at 0 and end at 61);
- The distribution of the traffic sign values is pretty unequal; there wasn't really any connection between the signs that were heavily present in the dataset.
Now that you have a clear idea of what you need to improve, you can start manipulating your data in such a way that it's ready to be fed to the neural network, or whichever model you want to feed it to. Let's start with extracting some features: you'll rescale the images, and you'll convert the images that are held in the `images` array to grayscale. You'll do this color conversion mainly because color matters less in classification questions like the one you're trying to answer now. For detection, however, color does play a big part, so in those cases that conversion is not needed!
To tackle the differing image sizes, you're going to rescale the images; you can easily do this with the help of the `skimage` or scikit-image library, which is a collection of algorithms for image processing.
In this case, the `transform` module will come in handy, as it offers you a `resize()` function; you'll see that you make use of a list comprehension (again!) to resize each image to 28 by 28 pixels. Once again, note the way you actually form the list: for every image that you find in the `images` array, you perform the transformation operation that you borrow from the `skimage` library. Finally, you store the result in the `images28` variable:
```python
# Import the `transform` module from `skimage`
from skimage import transform

# Rescale the images in the `images` array
images28 = [transform.resize(image, (28, 28)) for image in images]
```
This was fairly easy, wasn't it?
Note that the images are now four-dimensional: if you convert `images28` to an array and check its `shape` attribute, you'll see that the printout tells you that the dimensions of `images28` are `(4575, 28, 28, 3)`. Each image consists of 28 by 28, or 784, pixels per color channel.
You can check the result of the rescaling operation by re-using the code that you used above to plot the 4 random images with the help of the `traffic_signs` variable; just don't forget to change all references to `images` to `images28`.
Check out the result here:
Note that because you rescaled, your `min` and `max` values have also changed; they now all seem to lie in the same range, which is really great because then you don't necessarily need to normalize your data!
As said in the introduction to this section of the tutorial, the color in the pictures matters less when you're trying to answer a classification question. That's why you'll also go through the trouble of converting the images to grayscale.
Note, however, that you can also test out on your own what would happen to the final results of your model if you don't follow through with this specific step.
Just like with the rescaling, you can again count on the scikit-image library to help you out; in this case, it's the `color` module with its `rgb2gray()` function that you need to get where you need to be.
That's going to be nice and easy!
However, don't forget to convert the `images28` variable back to an array, as the `rgb2gray()` function does expect an array as an argument.
```python
import numpy as np

# Import `rgb2gray` from `skimage.color`
from skimage.color import rgb2gray

# Convert `images28` to an array
images28 = np.array(images28)

# Convert `images28` to grayscale
images28 = rgb2gray(images28)
```
Double-check the result of your grayscale conversion by plotting some of the images; here, you can again re-use and slightly adapt some of the code to show the adjusted images.
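A sketch of the adapted plotting code, assuming the `traffic_signs` indexes from before, with the gray color map that the next note explains:

```python
# Plot four of the rescaled, grayscale images
for i in range(len(traffic_signs)):
    plt.subplot(1, 4, i + 1)
    plt.axis('off')
    # Pass `cmap="gray"` so the images are actually shown in grayscale
    plt.imshow(images28[traffic_signs[i]], cmap="gray")
    plt.subplots_adjust(wspace=0.5)

plt.show()
```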
Note that you indeed have to specify the color map, or `cmap`, and set it to `"gray"` to plot the images in grayscale. That is because `imshow()` uses a heatmap-like color map by default. Read more here.
Tip: since you have been re-using this plotting code quite a bit in this tutorial, you might look into how you can turn it into a function :)
These two steps are very basic ones; other operations that you could have tried out on your data include data augmentation (rotating, blurring, shifting, changing brightness, and so on). If you want, you could also set up an entire pipeline of data manipulation operations through which you send your images.
Deep Learning With TensorFlow
Now that you have explored and manipulated your data, it's time to construct your neural network architecture with the help of the TensorFlow package!
Continue to read here.
Where To Go Next?
If you want to continue working with this dataset and the model that you have put together in this tutorial, try out the following things:
- Apply regularized LDA on the data before you feed it to your model. This is a suggestion that comes from one of the original papers, written by the researchers that gathered and analyzed this dataset.
- You could also, as said in the tutorial itself, look at some other data augmentation operations that you can perform on the traffic sign images. Additionally, you could try to tweak this network further; the one that you have created now was fairly simple.
- Early stopping: keep track of the training and testing error while you train the neural network. Stop training when both errors have gone down but the testing error suddenly starts going back up: this is a sign that the neural network has started to overfit the training data.
- Play around with the optimizers.
Make sure to check out the Machine Learning With TensorFlow book, written by Nishant Shukla.
Tip: also check out the TensorFlow Playground and TensorBoard.
If you want to keep on working with images, definitely check out DataCamp's scikit-learn tutorial, which tackles the MNIST dataset with the help of PCA, K-Means and Support Vector Machines (SVMs). Or take a look at other tutorials, such as this one, that use the Belgian traffic signs dataset.
Originally published at www.datacamp.com.