Getting to Know TensorFlow by @binroot

November 3rd, 2016

*This article was excerpted from **Machine Learning with TensorFlow**.*

Before jumping into machine learning algorithms, you should first familiarize yourself with how to use the tools. This article covers some essential advantages of TensorFlow, to convince you it's the machine learning library of choice.

As a thought experiment, let's imagine what happens when we write Python code without a handy computing library. It'll be like using a new smartphone without installing any extra apps. The phone still works, but you'd be more productive if you had the right apps.

Consider the following… You're a business owner tracking the flow of sales, and you want to calculate your revenue from selling your products. Your inventory consists of 100 different products, and you represent each price in a vector called prices. Another vector of size 100, called amounts, represents the inventory count of each item. You can write the chunk of Python code shown in listing 1 to calculate the revenue of selling all products. Keep in mind that this code doesn't import any libraries.
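A minimal sketch of what listing 1 might look like (the three example values here stand in for the 100 real entries):

```python
# Listing 1 (sketch): computing revenue with no libraries at all.
# prices[i] is the price of product i; amounts[i] is its inventory count.
prices = [1.0, 2.0, 3.0]   # illustrative; the article assumes 100 entries
amounts = [10, 20, 30]

revenue = 0
for price, amount in zip(prices, amounts):
    revenue += price * amount

print(revenue)  # 1*10 + 2*20 + 3*30 = 140.0
```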

That's a lot of code just to calculate the inner product of two vectors (also known as the *dot product*). Imagine how much code would be required for something more complicated, such as solving linear equations or computing the distance between two vectors.

By installing the TensorFlow library, you also install a well-known and robust Python library called NumPy, which facilitates mathematical manipulation in Python. Using Python without its libraries (e.g. NumPy and TensorFlow) is like using a camera without autofocus: you gain more flexibility, but you can easily make careless mistakes. It's already pretty easy to make mistakes in machine learning, so let's keep our camera on autofocus and use TensorFlow to help automate some tedious software development.

Listing 2 shows how to concisely write the same inner-product using NumPy.
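A sketch of listing 2 (the example prices and amounts are illustrative):

```python
import numpy as np

# Listing 2 (sketch): the same inner product, in one line with NumPy.
prices = np.array([1.0, 2.0, 3.0])
amounts = np.array([10, 20, 30])

revenue = np.dot(prices, amounts)
print(revenue)  # 140.0
```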

Python is a succinct language. Fortunately for you, that means you won't see pages and pages of cryptic code. On the other hand, the brevity of the Python language implies that a lot is happening behind each line of code, which you should familiarize yourself with carefully as you work.

By the way… Detailed documentation about the various functions in the Python and C++ APIs for TensorFlow is available at https://www.tensorflow.org/api_docs/index.html.

This article is geared toward using TensorFlow for computations, because machine learning relies on mathematical formulations. After going through the examples and code listings, you'll be able to use TensorFlow for some arbitrary tasks, such as computing statistics on big data. The focus here is entirely on how to use TensorFlow, as opposed to machine learning in general.

Machine learning algorithms require a large number of mathematical operations. Often, an algorithm boils down to a composition of simple functions iterated until convergence. Sure, you could use any standard programming language to perform these computations, but the secret to both manageable and performant code is the use of a well-written library.

That sounds like a gentle start, right? Without further ado, **let's write our first TensorFlow code!**

First, we need to ensure that everything is working correctly. Check the oil level in your car, repair the blown fuse in your basement, and ensure that your credit balance is zero.

Just kidding, I'm talking about TensorFlow.

Go ahead and create a new file called *test.py* for our first piece of code. Import TensorFlow by running the following script:

import tensorflow as tf

This single import prepares TensorFlow for your bidding. If the Python interpreter doesn't complain, then we're ready to start using TensorFlow!

Having technical difficulty? A common cause of error at this step is that you installed the GPU version and the library fails to find the CUDA drivers. Remember, if you compiled the library with CUDA, you need to update your environment variables with the path to CUDA. Check the CUDA instructions on TensorFlow. (See https://www.tensorflow.org/versions/master/get_started/os_setup.html#optional-linux-enable-gpu-support for further information.)

The TensorFlow library is usually imported under the *tf* qualified name. Generally, qualifying TensorFlow as *tf* is a good idea to remain consistent with other developers and open-source TensorFlow projects. You may choose not to qualify it or to change the qualification name, but then successfully reusing other people's snippets of TensorFlow code in your own projects will be an involved process.

Now that we know how to import TensorFlow into a Python source file, let's start using it! A convenient way to describe an object in the real world is by listing out its properties, or features. For example, you can describe a car by its color, model, engine type, and mileage. An ordered list of some features is called a *feature vector*, and that's exactly what we'll represent in TensorFlow code.

Feature vectors are one of the most useful devices in machine learning because of their simplicity (they're lists of numbers). Each data item typically consists of a feature vector, and a good dataset has thousands, if not millions, of these feature vectors. No doubt, you'll often be dealing with more than one vector at a time. A *matrix* concisely represents a list of vectors, where each column of a matrix is a feature vector.

The syntax to represent matrices in TensorFlow is a vector of vectors, each of the same length. Figure 1 shows an example of a matrix with two rows and three columns, [[1, 2, 3], [4, 5, 6]]. Notice that this is a vector containing two elements, and each element corresponds to a row of the matrix.

We access an element in a matrix by specifying its row and column indices. For example, the first row and first column indicate the top-left element. Sometimes it's convenient to use more than two indices, such as when referencing a pixel in a color image not only by its row and column, but also by its red/green/blue channel. A *tensor* is a generalization of a matrix that specifies an element by an arbitrary number of indices.

Example of a tensor… Suppose an elementary school enforces assigned seating for its students. You're the principal, and you're terrible with names. Luckily, each classroom has a grid of seats, where you can easily nickname a student by his or her row and column index.

There are multiple classrooms, so you can't just say "Good morning 4,10! Keep up the good work." You need to also specify the classroom: "Hi 4,10 from classroom 2." Unlike a matrix, which needs only two indices to specify an element, the students in this school need three numbers. They're all part of a rank-3 tensor!

The syntax for tensors is even more nested vectors. For example, a 2-by-3-by-2 tensor is [[[1,2], [3,4], [5,6]], [[7,8], [9,10], [11,12]]], which can be thought of as two matrices, each of size 3-by-2. Consequently, we say this tensor has a *rank* of 3. In general, the rank of a tensor is the number of indices required to specify an element. Machine learning algorithms in TensorFlow act on tensors, and it's important to understand how to use them.

It's easy to get lost in the many ways to represent a tensor. Intuitively, each of the following three lines of code in listing 3 tries to represent the same 2-by-2 matrix. This matrix represents two feature vectors of two dimensions each. It could, for example, represent two people's ratings of two movies. Each person, indexed by the row of the matrix, assigns a number to describe his or her review of the movie, indexed by the column. Run the code to see how to generate a matrix in TensorFlow.
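A sketch of what listing 3 might look like (the matrix values are illustrative; the article was written against the 1.x-era TensorFlow API):

```python
import numpy as np
import tensorflow as tf

# Listing 3 (sketch): three representations of the same 2x2 matrix.
m1 = [[1.0, 2.0],
      [3.0, 4.0]]                              # a plain Python list
m2 = np.array([[1.0, 2.0],
               [3.0, 4.0]], dtype=np.float32)  # a NumPy ndarray
m3 = tf.constant([[1.0, 2.0],
                  [3.0, 4.0]])                 # a TensorFlow Tensor

# tf.convert_to_tensor(...) normalizes all three into Tensor objects
t1 = tf.convert_to_tensor(m1, dtype=tf.float32)
t2 = tf.convert_to_tensor(m2, dtype=tf.float32)
t3 = tf.convert_to_tensor(m3, dtype=tf.float32)

print(type(t1))
print(type(t2))
print(type(t3))
```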

The first variable (*m1*) is a list, the second variable (*m2*) is an *ndarray* from the NumPy library, and the last variable (*m3*) is TensorFlow's *Tensor* object. All operators in TensorFlow, such as *neg*, are designed to operate on tensor objects. A convenient function we can sprinkle anywhere to make sure that we're dealing with tensors, as opposed to the other types, is *tf.convert_to_tensor(…)*. Most functions in the TensorFlow library already perform this conversion (redundantly), even if you forget to. Using *tf.convert_to_tensor(…)* is optional, but I show it here because it helps demystify the implicit type system being handled across the library. The aforementioned listing 3 produces the following output three times:

<class 'tensorflow.python.framework.ops.Tensor'>

Let's take another look at defining tensors in code. After importing the TensorFlow library, we can use the constant operator as follows in listing 4.
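A sketch of listing 4; the name *matrix1* is referenced later in the text, while the other two names are illustrative:

```python
import tensorflow as tf

# Listing 4 (sketch): constants of various shapes and dtypes.
matrix1 = tf.constant([[1., 2.]])          # 1x2 matrix of floats
matrix2 = tf.constant([[1],
                       [2]])               # 2x1 matrix of ints
tensor3 = tf.constant([[[1, 2], [3, 4], [5, 6]],
                       [[7, 8], [9, 10], [11, 12]]])  # 2x3x2, rank-3

print(matrix1)
print(matrix2)
print(tensor3)
```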

Running listing 4 produces the following output:

Tensor( "Const:0",
  shape=TensorShape([Dimension(1), Dimension(2)]),
  dtype=float32 )

Tensor( "Const_1:0",
  shape=TensorShape([Dimension(2), Dimension(1)]),
  dtype=int32 )

Tensor( "Const_2:0",
  shape=TensorShape([Dimension(2), Dimension(3), Dimension(2)]),
  dtype=int32 )

As you can see from the output, each tensor is represented by the aptly named *Tensor* object. Each *Tensor* object has a unique label (*name*), a dimension (*shape*) to define its structure, and a data type (*dtype*) to specify the kind of values we'll manipulate. Because we didn't explicitly provide a name, the library generated them automatically: "Const:0", "Const_1:0", and "Const_2:0".

Notice that each of the elements of *matrix1* ends with a decimal point. The decimal point tells Python that the data type of the elements isn't an integer, but a float. We can also pass in explicit *dtype* values. Much like NumPy arrays, tensors take on a data type that specifies the kind of values we'll manipulate in that tensor.

TensorFlow also comes with a few convenient constructors for some simple tensors. For example, *tf.zeros(shape)* creates a tensor of a specific shape with all values initialized at zero. Similarly, *tf.ones(shape)* creates a tensor of a specific shape with all values initialized at one. The shape argument is a one-dimensional (1D) tensor of type *int32* (a list of integers) describing the dimensions of the tensor.
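For instance, these constructors look like the following in code (the 2-by-3 shape is just an illustrative choice):

```python
import tensorflow as tf

# A 2x3 tensor of all zeros, and a 2x3 tensor of all ones.
zeros = tf.zeros([2, 3])
ones = tf.ones([2, 3])
```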

Now that we have a few starting tensors ready to use, we can apply more interesting operators, such as addition or multiplication. Consider each row of a matrix representing the transaction of money to (positive value) and from (negative value) another person. Negating the matrix is a way to represent the transaction history of the other person's flow of money. Let's start simple and run the negation op (short for operation) on our *matrix1* tensor from listing 4. Negating a matrix turns the positive numbers into negative numbers of the same magnitude, and vice versa.

Negation is one of the simplest operations. As shown in listing 5, negation takes only one tensor as input and produces a tensor with every element negated. Now, try running the code yourself. If you master how to define negation, it'll provide a stepping stone to generalizing that skill to all other TensorFlow operations.
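A sketch of listing 5. Note that *tf.neg* was the name when this article was written; it became *tf.negative* in TensorFlow 1.0, so the sketch picks whichever is available:

```python
import tensorflow as tf

# Listing 5 (sketch): defining (not yet running) the negation op.
neg = getattr(tf, 'neg', None) or tf.negative  # renamed in TensorFlow 1.0

matrix1 = tf.constant([[1, 2]])
neg_matrix = neg(matrix1)

print(neg_matrix)
```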

Aside… *Defining* an operation, such as negation, is different from *running* it.

Listing 5 generates the following output:

Tensor("Neg:0", shape=(1, 2), dtype=int32)

The official documentation carefully lays out all available math ops: https://www.tensorflow.org/api_docs/python/math_ops.html.

Some specific examples of commonly used operators include:

- *tf.add(x, y)*: adds two tensors of the same type, x + y
- *tf.sub(x, y)*: subtracts tensors of the same type, x - y
- *tf.mul(x, y)*: multiplies two tensors element-wise
- *tf.pow(x, y)*: takes the element-wise power of x to y
- *tf.exp(x)*: equivalent to pow(e, x), where e is Euler's number (2.718…)
- *tf.sqrt(x)*: equivalent to pow(x, 0.5)
- *tf.div(x, y)*: takes the element-wise division of x and y
- *tf.truediv(x, y)*: same as tf.div, except it casts the arguments as floats
- *tf.floordiv(x, y)*: same as tf.truediv, except it rounds down the final answer to an integer
- *tf.mod(x, y)*: takes the element-wise remainder from division
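A couple of these operators in action, using two small constant tensors as illustrative inputs (only ops whose names are unchanged across TensorFlow releases are shown):

```python
import tensorflow as tf

# Two small constant tensors as illustrative inputs.
x = tf.constant([[1., 2.]])
y = tf.constant([[3., 4.]])

total = tf.add(x, y)  # element-wise sum, x + y
powed = tf.pow(x, y)  # element-wise power, x ** y
```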

Exercise… Use the TensorFlow operators we've learned to produce the Gaussian distribution (also known as the normal distribution). See figure 3 for a hint. For reference, you can find the probability density function of the normal distribution online: https://en.wikipedia.org/wiki/Normal_distribution.

Most mathematical expressions, such as *, -, and +, are shortcuts for their TensorFlow equivalents, for the sake of brevity. The Gaussian function includes many operations, so it's cleaner to use some shorthand notations as follows:

from math import pi

mean = 1.0
sigma = 1.0  # sigma must be nonzero; a value of 0.0 would divide by zero
x = tf.constant(2.0)  # an illustrative input value

(tf.exp(tf.neg(tf.pow(x - mean, 2.0) /  # tf.negative in TensorFlow 1.0+
        (2.0 * tf.pow(sigma, 2.0)))) *
 (1.0 / (sigma * tf.sqrt(2.0 * pi))))

As you can see, TensorFlow algorithms are easy to visualize. They can be described by flowcharts. The technical (and more correct) term for the flowchart is a *graph*. Every arrow in a flowchart is called the *edge* of the graph. In addition, every state of the flowchart is called a *node*.

A session is an environment of a software system that describes how the lines of code should run. In TensorFlow, a session sets up how the hardware devices (such as CPU and GPU) talk to each other. That way, you can design your machine learning algorithm without worrying about micro-managing the hardware that it runs on. Of course, you can later configure the session to change its behavior without changing a line of the machine learning code.

To execute an operation and retrieve its calculated value, TensorFlow requires a session. Only a registered session may fill the values of a *Tensor* object. To do so, you must create a session object using *tf.Session()* and tell it to run an operator (listing 6). The result will be a value you can later use for further computations.

Congratulations! You have just written your first full TensorFlow code. Although all it does is negate a matrix to produce [[-1, -2]], the core overhead and framework are just the same as everything else in TensorFlow.

You can also pass options to *tf.Session*. For example, TensorFlow automatically determines the best way to assign a GPU or CPU device to an operation, depending on what is available. We can pass an additional option, *log_device_placement=True*, when creating a session, as shown in listing 7.

This outputs info about which CPU/GPU devices are used in the session for each operation. For example, running listing 7 results in traces of output like the following, showing which device was used to run the negation op:

Neg: /job:localhost/replica:0/task:0/cpu:0

Sessions are essential in TensorFlow code. You need to call a session to actually "run" the math. Figure 4 maps out how the different components of TensorFlow interact with the machine learning pipeline. A session not only runs a graph operation, but can also take placeholders, variables, and constants as input. We've used constants so far, but in later sections we'll start using variables and placeholders. Here's a quick overview of these three types of values:

- **Placeholder**: a value that is unassigned, but will be initialized by the session wherever it is run.
- **Variable**: a value that can change, such as a parameter of a machine learning model.
- **Constant**: a value that does not change, such as hyper-parameters or settings.

That's it for now. I hope you have successfully acquainted yourself with some of the basic workings of TensorFlow. If this article has left you ravenous for more delicious TensorFlow tidbits, please go **download the first chapter of Machine Learning with TensorFlow** and see this Slideshare presentation for more information (and a **discount code**).