November 3rd 2016

*This article was excerpted from **Machine Learning with TensorFlow**.*

Before jumping into machine learning algorithms, you should first familiarize yourself with how to use the tools. This article covers some essential advantages of TensorFlow, to convince you it’s the machine learning library of choice.

As a thought experiment, let’s imagine what happens when we write Python code without a handy computing library. It’ll be like using a new smartphone without installing any extra apps. The phone still works, but you’d be more productive if you had the right apps.

Consider the following… You’re a business owner tracking the flow of sales. You want to calculate your revenue from selling your products. Your inventory consists of 100 different products, and you represent each price in a vector called *prices*. Another vector of size 100 called *amounts* represents the inventory count of each item. You can write the chunk of Python code shown in listing 1 to calculate the revenue of selling all products. Keep in mind that this code doesn’t import any libraries.
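A minimal sketch of such library-free code is shown here. The contents of *prices* and *amounts* are made-up values chosen only so the example runs; any two vectors of equal length would do.

```python
# Hypothetical data: 100 products with made-up prices and stock counts.
prices = [1.0 + 0.5 * i for i in range(100)]
amounts = [(i % 10) + 1 for i in range(100)]

# Inner product without any libraries: multiply pairwise, then accumulate.
revenue = 0.0
for price, amount in zip(prices, amounts):
    revenue += price * amount

print(revenue)
```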

That’s a lot of code just to calculate the inner-product of two vectors (also known as *dot product*). Imagine how much code would be required for something more complicated, such as solving linear equations or computing the distance between two vectors.

By installing the TensorFlow library, you also install a well-known and robust Python library called NumPy, which facilitates mathematical manipulation in Python. Using Python without its libraries (e.g. NumPy and TensorFlow) is like using a camera without autofocus: you gain more flexibility, but you can easily make careless mistakes. It’s already pretty easy to make mistakes in machine learning, so let’s keep our camera on auto-focus and use TensorFlow to help automate some tedious software development.

Listing 2 shows how to concisely write the same inner-product using NumPy.
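As a rough sketch of the NumPy version, the whole loop collapses into a single call to *np.dot*. The three prices and amounts below are hypothetical stand-ins for the 100-element vectors in the example.

```python
import numpy as np

prices = np.array([2.0, 3.5, 10.0])   # hypothetical prices
amounts = np.array([4, 2, 1])         # hypothetical stock counts

# np.dot multiplies matching elements and sums the products.
revenue = np.dot(prices, amounts)
print(revenue)
```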

Python is a succinct language. Fortunately for you, that means you won’t see pages and pages of cryptic code. On the other hand, the brevity of the Python language implies that a lot is happening behind each line of code, which you should familiarize yourself with carefully as you work.

By the way… Detailed documentation about various functions for the Python and C++ APIs for TensorFlow is available at https://www.tensorflow.org/api_docs/index.html.

This article is geared toward using TensorFlow for computations, because machine learning relies on mathematical formulations. After going through the examples and code listings, you’ll be able to use TensorFlow for some arbitrary tasks, such as computing statistics on big data. The focus here is entirely on how to use TensorFlow, as opposed to machine learning in general.

Machine learning algorithms require a large number of mathematical operations. Often, an algorithm boils down to a composition of simple functions iterated until convergence. Sure, you might use any standard programming language to perform these computations, but the secret to both manageable and performant code is the use of a well-written library.

That sounds like a gentle start, right? Without further ado, **let’s write our first TensorFlow code!**

First, we need to ensure that everything is working correctly. Check the oil level in your car, repair the blown fuse in your basement, and ensure that your credit balance is zero.

Just kidding, I’m talking about TensorFlow.

Go ahead and create a new file called *test.py* for our first piece of code. Import TensorFlow by running the following script:

import tensorflow as tf

This single import prepares TensorFlow for your bidding. If the Python interpreter doesn’t complain, then we’re ready to start using TensorFlow!

Having technical difficulty? A common cause of error at this step is that you installed the GPU version and the library fails to find the CUDA drivers. Remember, if you compiled the library with CUDA, you need to update your environment variables with the path to CUDA. Check the CUDA instructions on TensorFlow. (See https://www.tensorflow.org/versions/master/get_started/os_setup.html#optional-linux-enable-gpu-support for further information.)

The TensorFlow library is usually imported with the *tf* qualified name. Generally, qualifying TensorFlow with *tf* is a good idea to remain consistent with other developers and open-source TensorFlow projects. You may choose not to qualify it or change the qualification name, but then successfully reusing other people’s snippets of TensorFlow code in your own projects will be an involved process.

Now that we know how to import TensorFlow into a Python source file, let’s start using it! A convenient way to describe an object in the real world is by listing out its properties, or features. For example, you can describe a car by its color, model, engine type, and mileage. An ordered list of some features is called a *feature vector,* and that’s exactly what we’ll represent in TensorFlow code.

Feature vectors are one of the most useful devices in machine learning because of their simplicity (they’re lists of numbers). Each data item typically consists of a feature vector, and a good dataset has hundreds, if not thousands, of these feature vectors. No doubt, you’ll often be dealing with more than one vector at a time. A *matrix* concisely represents a list of vectors, where each column of a matrix is a feature vector.

The syntax to represent matrices in TensorFlow is a vector of vectors, each of the same length. Figure 1 is an example of a matrix with two rows and three columns, such as [[1, 2, 3], [4, 5, 6]]. Notice, this is a vector containing two elements, and each element corresponds to a row of the matrix.

We access an element in a matrix by specifying its row and column indices. For example, the first row and first column indicate the top-left element. Sometimes it’s convenient to use more than two indices, such as when referencing a pixel in a color image not only by its row and column, but also by its red/green/blue channel. A *tensor* is a generalization of a matrix that specifies an element by an arbitrary number of indices.

Example of a tensor… Suppose an elementary school enforces assigned seating for its students. You’re the principal, and you’re terrible with names. Luckily, each classroom has a grid of seats, where you can easily nickname a student by his or her row and column index.

There are multiple classrooms, so you can’t say “Good morning 4,10! Keep up the good work.” You need to also specify the classroom, “Hi 4,10 from classroom 2.” Unlike a matrix, which needs only two indices to specify an element, the students in this school need three numbers. They’re all a part of a rank three tensor!

The syntax for tensors is even more nested vectors. For example, a 2-by-3-by-2 tensor is [[[1,2], [3,4], [5,6]], [[7,8], [9,10], [11,12]]], which can be thought of as two matrices, each of size 3-by-2. Consequently, we say this tensor has a *rank* of 3. In general, the rank of a tensor is the number of indices required to specify an element. Machine learning algorithms in TensorFlow act on Tensors, and it’s important to understand how to use them.
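The idea that rank is the number of indices can be checked with plain nested lists. The helper *nested_shape* below is a hypothetical name introduced for illustration (it is not part of TensorFlow), and it assumes every level of nesting is non-empty and regular.

```python
def nested_shape(value):
    """Return the shape of a regularly nested, non-empty list as a tuple."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0]   # descend one level of nesting
    return tuple(shape)

matrix = [[1, 2, 3], [4, 5, 6]]
tensor = [[[1, 2], [3, 4], [5, 6]], [[7, 8], [9, 10], [11, 12]]]

matrix_shape = nested_shape(matrix)
tensor_shape = nested_shape(tensor)

# The rank is the number of dimensions, i.e. the number of indices needed.
print(matrix_shape, len(matrix_shape))
print(tensor_shape, len(tensor_shape))

# Three indices pick out one element of the rank-3 tensor.
element = tensor[1][0][1]
print(element)
```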

It’s easy to get lost in the many ways to represent a tensor. Intuitively, each of the three lines of code in Listing 3 is trying to represent the same 2-by-2 matrix. This matrix represents two feature vectors of two dimensions each. It could, for example, represent two people’s ratings of two movies. Each person, indexed by the row of the matrix, assigns a number to describe his or her review of the movie, indexed by the column. Run the code to see how to generate a matrix in TensorFlow.

The first variable (*m1*) is a list, the second variable (*m2*) is an *ndarray* from the NumPy library, and the last variable (*m3*) is TensorFlow’s *Tensor* object. All operators in TensorFlow, such as *neg*, are designed to operate on tensor objects. A convenient function we can sprinkle anywhere to make sure that we’re dealing with tensors, as opposed to the other types, is *tf.convert_to_tensor( … )*. Most functions in the TensorFlow library already perform this conversion for you, even if you forget to. Using *tf.convert_to_tensor( … )* is optional, but I show it here because it helps demystify the implicit type system being handled across the library. Listing 3 produces the following output three times:

<class 'tensorflow.python.framework.ops.Tensor'>

Let’s take another look at defining tensors in code. After importing the TensorFlow library, we can use the constant operator as follows in Listing 4.

Running listing 4 produces the following output:

Tensor("Const:0", shape=TensorShape([Dimension(1), Dimension(2)]), dtype=float32)

Tensor("Const_1:0", shape=TensorShape([Dimension(2), Dimension(1)]), dtype=int32)

Tensor("Const_2:0", shape=TensorShape([Dimension(2), Dimension(3), Dimension(2)]), dtype=int32)

As you can see from the output, each tensor is represented by the aptly named *Tensor* object. Each *Tensor* object has a unique label (*name*), a dimension (*shape*) to define its structure, and a data type (*dtype*) to specify the kind of values we’ll manipulate. Because we didn’t explicitly provide names, the library automatically generated them: “Const:0”, “Const_1:0”, and “Const_2:0”.

Notice that each of the elements of *matrix1* ends with a decimal point. The decimal point tells Python that the data type of the elements isn’t an integer, but instead a float. We can also pass in explicit *dtype* values. Much like NumPy arrays, tensors take on a data type that specifies the kind of values we’ll manipulate in that tensor.
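Because NumPy arrays follow the same convention, the effect of the decimal point is easy to see there. This is only a NumPy sketch of the idea, not TensorFlow code, and the variable names are made up for illustration. Note that NumPy picks wider default types (such as float64) than the float32 and int32 shown in the TensorFlow output above.

```python
import numpy as np

float_matrix = np.array([[1., 2.]])             # decimal points: floating-point dtype
int_matrix = np.array([[1], [2]])               # no decimal points: integer dtype
forced = np.array([[1, 2]], dtype=np.float32)   # explicit dtype overrides inference

print(float_matrix.dtype, int_matrix.dtype, forced.dtype)
```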

TensorFlow also comes with a few convenient constructors for some simple tensors. For example, *tf.zeros(shape)* creates a tensor of a specific shape with all values initialized at zero. Similarly, *tf.ones(shape)* creates a tensor of a specific shape with all values initialized at one. The shape argument is a one-dimensional (1D) tensor of type *int32* (a list of integers) describing the dimensions of the tensor.

Now that we have a few starting tensors ready to use, we can apply more interesting operators, such as addition or multiplication. Consider each row of a matrix representing the transaction of money to (positive value) and from (negative value) another person. Negating the matrix is a way to represent the transaction history of the other person’s flow of money. Let’s start simple and run the negation op (short for operation) on our *matrix1* tensor from listing 4. Negating a matrix turns the positive numbers into negative numbers of the same magnitude, and vice versa.

Negation is one of the simplest operations. As shown in listing 5, negation takes only one tensor as input, and produces a tensor with every element negated. Now, try running the code yourself. If you master how to define negation, it’ll provide a stepping stone to generalize that skill to all other TensorFlow operations.

Aside… *Defining* an operation, such as negation, is different from *running* it.

Listing 5 generates the following output:

Tensor("Neg:0", shape=(1, 2), dtype=int32)

The official documentation carefully lays out all available math ops: https://www.tensorflow.org/api_docs/Python/math_ops.html.

Some specific examples of commonly used operators include:

*tf.add(x, y)*: Add two tensors of the same type, x + y

*tf.sub(x, y)*: Subtract tensors of the same type, x - y

*tf.mul(x, y)*: Multiply two tensors element-wise

*tf.pow(x, y)*: Take the element-wise power of x to y

*tf.exp(x)*: Equivalent to pow(e, x), where e is Euler’s number (2.718…)

*tf.sqrt(x)*: Equivalent to pow(x, 0.5)

*tf.div(x, y)*: Take the element-wise division of x and y

*tf.truediv(x, y)*: Same as tf.div, except casts the arguments as a float

*tf.floordiv(x, y)*: Same as truediv, except rounds down the final answer into an integer

*tf.mod(x, y)*: Takes the element-wise remainder from division
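Python’s standard *operator* module happens to use several of the same names for the scalar versions of these operations, which makes it handy for building intuition. The sketch below is plain Python, not TensorFlow; *elementwise* is a hypothetical helper that applies a scalar operator to matching elements of two lists.

```python
import operator

def elementwise(op, xs, ys):
    """Apply a two-argument operator to matching elements of two lists."""
    return [op(x, y) for x, y in zip(xs, ys)]

x = [7, 8, 9]
y = [2, 2, 2]

added = elementwise(operator.add, x, y)            # element-wise sum
true_div = elementwise(operator.truediv, x, y)     # always produces floats
floor_div = elementwise(operator.floordiv, x, y)   # rounds each quotient down
remainder = elementwise(operator.mod, x, y)        # remainder of each division

print(added, true_div, floor_div, remainder)
```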

Exercise… Use the TensorFlow operators we’ve learned to produce the Gaussian distribution (also known as the normal distribution). See Figure 3 for a hint. For reference, you can find the probability density of the normal distribution online: https://en.wikipedia.org/wiki/Normal_distribution.

Most mathematical operators, such as "*", "-", and "+", are shortcuts for their TensorFlow equivalents, for the sake of brevity. The Gaussian function includes many operations, and it’s cleaner to use some shorthand notation as follows:

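from math import pi

# x is assumed to be a tensor defined earlier

mean = 0.0

sigma = 1.0  # must be nonzero, because the expression divides by sigma

(tf.exp(tf.neg(tf.pow(x - mean, 2.0) / (2.0 * tf.pow(sigma, 2.0)))) * (1.0 / (sigma * tf.sqrt(2.0 * pi))))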
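One way to sanity-check an expression like this is to evaluate the same formula with Python’s *math* module and compare. The sketch below assumes a mean of 0 and a sigma of 1 (the standard normal); the function name *gaussian* is introduced here for illustration only.

```python
from math import exp, pi, sqrt

def gaussian(x, mean=0.0, sigma=1.0):
    """Probability density of the normal distribution at x (sigma must be nonzero)."""
    return exp(-((x - mean) ** 2) / (2.0 * sigma ** 2)) * (1.0 / (sigma * sqrt(2.0 * pi)))

# The density peaks at the mean and is symmetric around it.
peak = gaussian(0.0)
left, right = gaussian(-1.0), gaussian(1.0)

# A crude Riemann sum over [-8, 8] should come out very close to 1.
step = 0.01
area = sum(gaussian(-8.0 + i * step) * step for i in range(1601))

print(peak, left, right, area)
```

If the TensorFlow expression is evaluated at the same points, its results should agree with these values up to floating-point precision.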

As you can see, TensorFlow algorithms are easy to visualize. They can be described by flowcharts. The technical (and more correct) term for the flowchart is a *graph*. Every arrow in a flowchart is called the *edge* of the graph. In addition, every state of the flowchart is called a *node*.
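To make the graph idea concrete, here is a toy computation graph in plain Python. It is only a sketch of the concept: the names *Node*, *constant*, *op*, and *run* are hypothetical and have nothing to do with TensorFlow’s actual classes. Building the nodes computes nothing; the values appear only when *run* walks the edges.

```python
class Node:
    """A node in a toy computation graph (hypothetical, not TensorFlow's class)."""
    def __init__(self, name, func=None, inputs=(), value=None):
        self.name = name
        self.func = func        # operation to apply; None for constants
        self.inputs = inputs    # incoming edges: the nodes this node depends on
        self.value = value      # stored value, used only by constants

def constant(name, value):
    return Node(name, value=value)

def op(name, func, *inputs):
    return Node(name, func=func, inputs=inputs)

def run(node):
    """Evaluate a node by first evaluating everything it depends on."""
    if node.func is None:
        return node.value
    return node.func(*(run(i) for i in node.inputs))

# Defining the graph: two constants, a negation, and an addition.
x = constant("x", 3.0)
neg_x = op("neg", lambda a: -a, x)
total = op("add", lambda a, b: a + b, neg_x, constant("one", 1.0))

# Running the graph: only now is any arithmetic performed.
result = run(total)
print(total.name, result)
```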

A session is an environment of a software system that describes how the lines of code should run. In TensorFlow, a session sets up how the hardware devices (such as CPU and GPU) talk to each other. That way, you can design your machine learning algorithm without worrying about micro-managing the hardware that it runs on. Of course, you can later configure the session to change its behavior without changing a line of the machine learning code.

To execute an operation and retrieve its calculated value, TensorFlow requires a session. Only a registered session may fill the values of a Tensor object. To do so, you must create a session object using *tf.Session()* and tell it to run an operator (listing 6). The result will be a value you can later use for further computations.
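A minimal sketch of defining and then running the negation op follows. It assumes a TensorFlow release that provides *tf.compat.v1.Session* and *tf.negative*; the listings in this article use the older spellings *tf.Session* and *tf.neg*, so adjust the names to match your installed version.

```python
import tensorflow as tf

# Define the graph: these lines only create nodes, they compute nothing.
graph = tf.Graph()
with graph.as_default():
    x = tf.constant([[1., 2.]])
    neg_op = tf.negative(x)   # spelled tf.neg in older releases

# Run the graph: the session fills in the value of neg_op.
# (Assumption: tf.compat.v1.Session stands in for the article's tf.Session.)
with tf.compat.v1.Session(graph=graph) as sess:
    result = sess.run(neg_op)

print(result)
```

The value returned by *sess.run* is an ordinary NumPy array, so it can be passed on to further computations outside the session.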

Congratulations! You have just written your first full TensorFlow code. Although all it does is negate a matrix to produce [[-1, -2]], the core overhead and framework are just the same as everything else in TensorFlow.

You can also pass options to *tf.Session*. For example, TensorFlow automatically determines the best way to assign a GPU or CPU device to an operation, depending on what is available. We can set an additional option, *log_device_placement=True*, in the configuration passed when creating a Session, as shown in listing 7.

This outputs info about which CPU/GPU devices are used in the session for each operation. For example, running listing 7 results in traces of output like the following to show which device was used to run the negation op:

Neg: /job:localhost/replica:0/task:0/cpu:0

Sessions are essential in TensorFlow code. You need to call a session to actually “run” the math. Figure 4 maps out how the different components of TensorFlow interact with the machine learning pipeline. A session not only runs a graph operation, but can also take placeholders, variables, and constants as input. We’ve used constants so far, but in later sections we’ll start using variables and placeholders. Here’s a quick overview of these three types of values.

**Placeholder:** A value that is unassigned, but will be initialized by the session wherever it is run.

**Variable:** A value that can change, such as a parameter of a machine learning model.

**Constant:** A value that does not change, such as hyper-parameters or settings.

That’s it for now. I hope that you have successfully acquainted yourself with some of the basic workings of TensorFlow. If this article has left you ravenous for more delicious TensorFlow tidbits, please go **download the first chapter of Machine Learning with TensorFlow** and see this Slideshare presentation for more information (and a **discount code**).