Michael Ramos


TensorFlow (GPU) Setup for Developers

tensorflow, machine learning, gpu, setup-guide


For devs who want to run some cool models or experiments with TensorFlow (on a GPU, for more intense training). This probably isn’t for professional data scientists or anyone creating production models; I imagine their setups are a bit more involved.

This blog post covers my manual process for building TensorFlow with GPU support. I’ve spent a decent amount of time reading posts and going through walkthroughs, and learned a ton from them, so I pieced together this installation guide, which I’ve been using routinely ever since (I should have a CloudFormation script soon). There are obviously simple installs via pip/conda, but we want to build TF manually and get it running on a GPU. Still, this guide sticks to simple/default configurations and settings.

The Core (we will cover these as we go along):

  • AWS EC2 p2.xlarge
  • Ubuntu 16.04
  • CUDA 8.0 (along with cuDNN)
  • TensorFlow 1.0
  • Python 2.x and libs (libs not directly needed here, but I base my installations around Python usage)

Launch your AWS EC2 Instance

  1. Choose OS type: Ubuntu Server 16.04
  2. Choose Instance Type: GPU Compute > p2.xlarge

Why p2? These instances were made specifically for what we want to do, which is to run intense computations on the GPU. p2.xlarge gives us a GPU with 12GB of memory. We will be able to do (batched) computationally intensive learning just fine on this machine.

WARNING: p2 instances cost $0.90/hour while running, so I suggest you only keep the instance running while you are performing tasks such as training.
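
To get a feel for what that hourly rate adds up to, here is a small sketch of the arithmetic. The $0.90/hour figure is the one quoted above and may not match current AWS pricing; `estimate_cost` is a hypothetical helper name, and billing granularity is ignored.

```python
# Rough cost estimate for a running p2.xlarge instance.
# HOURLY_RATE_USD is the figure quoted in this post; check current AWS pricing.
HOURLY_RATE_USD = 0.90


def estimate_cost(hours, hourly_rate=HOURLY_RATE_USD):
    """Return the approximate on-demand cost in USD for `hours` of uptime."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * hourly_rate


# Leaving the instance up for a full 30-day month vs. a one-hour build:
month = estimate_cost(24 * 30)
build = estimate_cost(1)
```

At the quoted rate, a month of continuous uptime comes to roughly $648, which is why stopping the instance between sessions matters.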

3) Configuration: Basic configurations are fine. You will need to use a VPC (the default VPC and subnet settings are fine as well).

4) Storage: When adding storage, account for the amount of data you’ll most likely be training on. For example, the COCO 2014 dataset (images) is around 15GB, and the trained neural net most commonly used for COCO is 0.5GB. We will also be downloading hundreds of MBs of software and libraries. I’ve opted for 50GB of storage in my work, and have been using around 30–40GB of that.
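
The sizing reasoning above can be sketched as a quick sum. The dataset and model sizes are the ones quoted in this post; the software and OS entries, the 25% headroom, and the helper name `required_storage_gb` are assumptions for illustration only.

```python
# Back-of-the-envelope disk sizing, in GB.
def required_storage_gb(components, headroom=0.25):
    """Sum component sizes and add fractional headroom for scratch space."""
    total = sum(components.values())
    return total * (1.0 + headroom)


components = {
    "coco_2014_images": 15.0,   # from the text above
    "trained_model": 0.5,       # from the text above
    "software_and_libs": 10.0,  # assumption: CUDA, TensorFlow source, build output
    "os_and_packages": 8.0,     # assumption: base Ubuntu install plus apt packages
}
needed = required_storage_gb(components)
```

With these assumed numbers the estimate lands a little over 40GB, which fits inside a 50GB volume. Swap in your own dataset sizes before picking a volume size.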

5) Continue on, and set up with your preferences or the defaults. No other special settings are needed to proceed.

Launch the instance, wait for it to be running, and we should be good to move on.

Installation Guide

SSH into your newly launched instance.

Ubuntu updates

$ sudo apt-get update
$ sudo apt-get upgrade
Note: If you get the response:
“new version of boot/grub/menu.lst …”
— I keep the local version in my testing environments. For production use, due diligence is required.


useful tools we will need:

$ sudo apt-get -y install git vim curl build-essential cmake pkg-config zip

this is also where I install the python libraries I will use (not required for tensorflow install):

$ sudo apt-get -y install python-pip python-dev python-numpy python-scipy ipython python-matplotlib python-sklearn python-wheel

other system libraries I’ve found to be required when building and using tensorflow:

$ sudo apt-get -y install libpng12-dev libjasper-dev libfreetype6 libjpeg-dev libtiff5-dev libgtk2.0-dev libavcodec-dev libavformat-dev libswscale-dev libv4l-dev swig

now create a directory for us to work out of:

$ mkdir install && cd install

Building TensorFlow

We are going to build TensorFlow from source. There is a simple pip installation, but in some use cases we will get better performance by building it ourselves.


Bazel is TensorFlow’s build tool. Bazel needs to use our IP, and because we are in a VPC, we need to change the hosts file.

To get your private IP address, run ifconfig in the terminal. In the output, look for inet addr and note that IP (it is also available in the AWS Console).

  1. modify /etc/hosts:
$ sudo vim /etc/hosts
// [vim opens the file …]
## change this line: localhost
## to (if your private ip is ip-155-10-100-10)
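
The hostname form used above (ip-155-10-100-10) follows the EC2 convention of prefixing the private IPv4 address with `ip-` and replacing dots with hyphens. Below is a minimal sketch that derives it from a dotted address. Note that the `hosts_entry` layout is an assumption (one common way to make the hostname resolve locally), not necessarily the exact line used in this guide, and both helper names are hypothetical.

```python
import socket


def ec2_style_hostname(private_ip):
    """Turn '155.10.100.10' into 'ip-155-10-100-10' (plain ASCII hyphens)."""
    socket.inet_aton(private_ip)  # raises an error if the address is malformed
    octets = private_ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return "ip-" + "-".join(octets)


def hosts_entry(private_ip):
    """Build a candidate /etc/hosts line (assumed layout, adjust as needed)."""
    return "127.0.0.1 localhost {0}".format(ec2_style_hostname(private_ip))
```

Running `hostname` on the instance should print the same `ip-...` name, which is a quick way to double-check it before editing the file.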

2. install Java (needed for Bazel):

$ sudo add-apt-repository ppa:webupd8team/java
// [press enter]
$ sudo apt-get update
$ sudo apt-get install -y oracle-java8-installer
// [accept terms]

3. install Bazel (0.5.1):

$ wget https://github.com/bazelbuild/bazel/releases/download/0.5.1/bazel-0.5.1-installer-linux-x86_64.sh
$ chmod +x bazel-0.5.1-installer-linux-x86_64.sh
$ ./bazel-0.5.1-installer-linux-x86_64.sh

Bazel will now be installed. Let’s add it to .bashrc.

modify ~/.bashrc (add the 2 lines to the end of the file):

$ vim ~/.bashrc
// [vim opens the file]
# Add to end of file:
source /home/ubuntu/.bazel/bin/bazel-complete.bash
export PATH="$PATH:$HOME/bin"

let’s load .bashrc into our shell now:

$ source ~/.bashrc

We now have Bazel.

Ok, before we get TensorFlow and build, we need to install our dependencies for GPU support. These include CUDA and cuDNN.
For more info on these: https://en.wikipedia.org/wiki/CUDA

install CUDA:

$ wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_8.0.44-1_amd64.deb
$ sudo dpkg -i cuda-repo-ubuntu1404_8.0.44-1_amd64.deb
$ sudo apt-get -y update
$ sudo apt-get -y upgrade
$ sudo apt-get install -y cuda
// [will take a couple of minutes]
$ sudo sh -c "echo '/usr/local/cuda/lib64' > /etc/ld.so.conf.d/cuda.conf"
$ sudo ldconfig

To test that we have this installed correctly, run a simple command:

$ nvidia-smi
## note, using:
## watch -n 0.5 nvidia-smi
## is a useful way to launch a “live” monitor of GPU stats, use this when you train your models
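
If you would rather read the same stats from a script (for example, to log memory use during training), nvidia-smi can emit CSV through its `--query-gpu` and `--format` options. This is a minimal sketch, not a full monitoring tool; the choice of fields and the helper names are my own, and `query_gpu_stats` only works on a machine where the NVIDIA driver is installed.

```python
import subprocess

# Fields requested from nvidia-smi; memory values come back in MiB.
QUERY_FIELDS = ["name", "memory.total", "memory.used", "utilization.gpu"]


def parse_gpu_stats(csv_text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip anything that does not look like a data row
        gpus.append(dict(zip(QUERY_FIELDS, values)))
    return gpus


def query_gpu_stats():
    """Run nvidia-smi and return one dict per GPU (needs the NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    output = subprocess.check_output(cmd)
    return parse_gpu_stats(output.decode("utf-8"))
```

Calling `query_gpu_stats()` on the instance returns a list with one entry per GPU. The values are kept as strings, so convert them with `int()` if you want to do arithmetic on them.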

Note: For cuDNN, you will need an NVIDIA developer account. You can get one here: https://developer.nvidia.com/rdp/cudnn-download

After you have set up your account, go ahead and download the 8.0-linux-x64-v5.1 tarball to your own machine.

We will transfer this file (and all others coming from our machine) to our EC2 instance using scp.

Note: Before you scp the cudnn tarball, edit the command below with your ec2 and correct paths

run scp command from your own machine (replace ip):

$ scp -i \
~/keys/learner-key.pem \
~/Downloads/cudnn-8.0-linux-x64-v5.1.tgz \
<receiving_machine_and_path_to_dest>
## the syntax:
## scp -i <path_to_ec2_private_key> <path_to_file_to_transfer> <receiving_machine_and_path_to_dest>
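
To make the argument order concrete, here is a small sketch that assembles the scp command from its parts. The host is a placeholder, the `ubuntu` user is the usual default on Ubuntu AMIs, and the `~/install/` destination is an assumption based on the working directory created earlier; `build_scp_command` is a hypothetical helper.

```python
def build_scp_command(key_path, local_file, user, host, remote_dir):
    """Return the scp argument list: key, file to send, then user@host:dest."""
    destination = "{0}@{1}:{2}".format(user, host, remote_dir)
    return ["scp", "-i", key_path, local_file, destination]


# Hypothetical values: replace the host with your instance's public DNS or IP.
cmd = build_scp_command(
    key_path="~/keys/learner-key.pem",
    local_file="~/Downloads/cudnn-8.0-linux-x64-v5.1.tgz",
    user="ubuntu",                 # assumed default user on Ubuntu AMIs
    host="<your-ec2-public-dns>",  # placeholder, not a real address
    remote_dir="~/install/",       # assumed: the working directory created earlier
)
print(" ".join(cmd))
```

The printed line can be pasted into a terminal once the placeholder is filled in. If you pass the list to `subprocess` instead, expand the `~` paths first with `os.path.expanduser`, since no shell is involved to do it for you.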

After the tarball transfers, unpack it and copy the files into place:

$ tar -zxf cudnn-8.0-linux-x64-v5.1.tgz
$ sudo cp -P cuda/lib64/* /usr/local/cuda/lib64/
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include/


Yea.. cool. Let’s go and get TensorFlow. Using v1 here.

$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow
$ git checkout v1.0.0
$ ./configure
// [launches configuration]

This will launch the configuration script.

Note: You will need to configure one setting as non-default: CUDA usage. So [press enter] to get all default settings, until you see the question asking if you want to support CUDA. YES we do. The following CUDA settings can be default as well.

Ok, TensorFlow is configured and ready to build. The build process is time-consuming, usually lasting around an hour on p2.xlarge. So that the process isn’t bound to your open shell, we will use screen. Screen is a tool that lets us launch a new window in the shell, start a process, detach from that window, and eventually come back and re-attach to that window/process. See more here: https://www.rackaid.com/blog/linux-screen-tutorial-and-how-to/

(screen also comes in handy when training your models and running those respective processes)

launch a new screen window:

$ screen

We should still be in the install/tensorflow directory.

build TensorFlow with Bazel:

$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
// [bazel builds tensorflow here — long process]

Note: This is a long process; expect it to take about an hour. Thanks to screen, we can detach from this window (and close the ssh connection if you’d like) and take a break.

To detach from the screen window where Bazel is building:

ctrl-a, then d

… time period of tensorflow building …

After this break, reattach to the screen window that is building/built TensorFlow:

$ screen -r

install the new build using pip:

$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.0.0-cp27-cp27mu-linux_x86_64.whl
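
The wheel filename encodes the build target, so it can differ from the one shown here if your Python or platform differs. As a rough sketch (helper names are hypothetical), the snippet below splits a wheel filename into its standard components as defined by PEP 427 and lists whatever wheels actually landed in /tmp/tensorflow_pkg.

```python
import glob
import os


def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    base = os.path.basename(filename)
    if not base.endswith(".whl"):
        raise ValueError("not a wheel file: " + base)
    parts = base[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + base)
    keys = ["distribution", "version", "python_tag", "abi_tag", "platform_tag"]
    if len(parts) == 6:
        # A sixth component, when present, is the optional build tag.
        keys.insert(2, "build_tag")
    return dict(zip(keys, parts))


def find_built_wheels(pkg_dir="/tmp/tensorflow_pkg"):
    """List whichever TensorFlow wheels build_pip_package actually produced."""
    return sorted(glob.glob(os.path.join(pkg_dir, "tensorflow-*.whl")))


info = parse_wheel_filename("tensorflow-1.0.0-cp27-cp27mu-linux_x86_64.whl")
```

Here `cp27` is the Python tag (CPython 2.7) and `cp27mu` the ABI tag. If pip rejects the wheel as unsupported on your platform, comparing these tags against your interpreter is a good first check.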

We now want to make sure TensorFlow can find CUDA before running. We need to add it to our environment.

modify ~/.bashrc

$ vim ~/.bashrc
// [vim opens the file]
# Add to end of file:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
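
A quick way to verify that a fresh shell picked up these settings is to inspect the environment from Python. This is only a sketch: the two directories and the CUDA_HOME value mirror the lines added to .bashrc above, and the helper names are hypothetical.

```python
import os

# The directories added to LD_LIBRARY_PATH in .bashrc above.
REQUIRED_CUDA_DIRS = [
    "/usr/local/cuda/lib64",
    "/usr/local/cuda/extras/CUPTI/lib64",
]


def missing_cuda_dirs(ld_library_path):
    """Return the required CUDA dirs that are absent from a path string."""
    entries = [p.rstrip("/") for p in ld_library_path.split(":") if p]
    return [d for d in REQUIRED_CUDA_DIRS if d not in entries]


def check_environment(environ=os.environ):
    """Collect human-readable problems with the current CUDA environment."""
    problems = []
    for d in missing_cuda_dirs(environ.get("LD_LIBRARY_PATH", "")):
        problems.append("LD_LIBRARY_PATH is missing " + d)
    if environ.get("CUDA_HOME") != "/usr/local/cuda":
        problems.append("CUDA_HOME is not set to /usr/local/cuda")
    return problems
```

An empty list from `check_environment()` means both variables look as expected. Anything else usually means .bashrc has not been reloaded in the current shell.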

reload .bashrc (source ~/.bashrc, as before), then cd out to the home directory (make sure you are out of the tensorflow directory)



let’s test this in interactive Python:

$ python
// [python interpreter opens]
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
… more output like this (a good sign that we see CUDA)

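>>> a = tf.constant([1.0, 2.0, 3.0], name='a')
>>> b = tf.constant([4.0, 5.0, 6.0], name='b')
>>> output = tf.add(a, b)  # example op; any small graph works
>>> with tf.Session() as sess:
...     sess.run(tf.global_variables_initializer())
...     print sess.run(output)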
[press enter]
// [script runs — you will see an initial output]

and to confirm we are using the GPU, look in that output for lines that name a GPU device (a Tesla K80 on p2 instances).
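
A programmatic check can complement reading the logs. The sketch below assumes `device_lib.list_local_devices()` from `tensorflow.python.client`, which returns one entry per device TensorFlow can see; the helper names are hypothetical, and TensorFlow is imported lazily so the filtering logic can be tried without it.

```python
def gpu_device_names(devices):
    """Return the names of devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def tensorflow_gpu_names():
    """List the GPUs TensorFlow can see; returns [] if none are visible."""
    # Imported lazily so this module loads even where TensorFlow is absent.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())
```

On the instance, a non-empty result from `tensorflow_gpu_names()` suggests the GPU build is working. An empty list usually points back to the CUDA paths or the cuDNN files.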

Now go do some cool s***.
