Keras with GPU on Amazon EC2 – a step-by-step guide

Written by mateuszsieniawski | Published 2017/02/16
Tech Story Tags: cloud-computing | keras | deep-learning | tensorflow | aws-ec2


As neural networks grow more and more complex, they also require better hardware. Our PCs often cannot handle such large networks, but you can relatively easily rent a powerful computer, paid by the hour, through the Amazon EC2 service.

I use Keras – an open source neural network library for Python. It’s great for beginning a journey with deep learning, mostly because of its ease of use. It is built on top of TensorFlow (though Theano can be used as well) – an open source software library for numerical computation. The rented machine will be accessible via browser using Jupyter Notebook – a web app that lets you share and edit documents with live code.

Keras can run on a GPU using cuDNN – a GPU-accelerated deep neural network library. This approach is much, much faster than a typical CPU because GPUs are designed for parallel computation. I advise you to take a look at a few CNN benchmarks comparing the running time of the most popular neural networks on different GPUs and CPUs.
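By the way, once you are on a GPU instance you can quickly check whether the TensorFlow backend actually sees the GPU. A minimal sketch (assuming TensorFlow is the backend, as on the AMI used below):

# List the devices TensorFlow can use; on a working GPU setup the output
# should include a '/gpu:0' (or '/device:GPU:0') entry besides the CPU.
from tensorflow.python.client import device_lib

print(device_lib.list_local_devices())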

I’ll walk you through, step by step, how to set up such a deep learning environment from a pre-prepared Amazon Machine Image (AMI).

1) Create an account

Visit https://aws.amazon.com/ and create an AWS account.

Now Sign in to the console.

Your dashboard should look something like this.

Make sure that you select Frankfurt, N. Virginia, or Singapore as your region. This will later allow you to use a pre-prepared Keras AMI. If you want to set up such an AMI yourself, you can follow this guide.

2) Launch an instance

Now let’s move on to EC2 dashboard.

“Amazon Elastic Compute Cloud (Amazon EC2) provides scalable computing capacity in the Amazon Web Services (AWS) cloud. Using Amazon EC2 eliminates your need to invest in hardware up front, so you can develop and deploy applications faster. You can use Amazon EC2 to launch as many or as few virtual servers as you need, configure security and networking, and manage storage. Amazon EC2 enables you to scale up or down to handle changes in requirements or spikes in popularity, reducing your need to forecast traffic.” (as Amazon docs say).

So in other words, you can rent a server at any time to run computations – in this case, machine learning model training. Let’s launch such an instance!

First you need to select an AMI on which all the necessary tools are already installed (Keras on TensorFlow with Jupyter Notebook).

Select an instance type (how powerful a machine you are renting). Of course, the better the instance you choose, the more you pay. But you are creating your first instance ever, so you don’t want the best type out there, do you? Just pick t2.micro; it’s meant to be a testing instance. It’ll allow you to find your feet without emptying your wallet.

When you get comfortable with it and need more computing power, I suggest using a g* type instance (g stands for GPU back-end), e.g. g2.2xlarge. It’s the default GPU instance, priced at about $0.772 per hour.

Nothing fancy here, just skip this step.

You can use up to 30 GB of storage for free. Also, if you don’t want your data to disappear after terminating the instance, uncheck the “Delete on Termination” checkbox.

Just move on.

OK, this stage is important, because you’ll want to access your instance not only over SSH but also via the browser. Add a custom TCP rule for port 8888, and make both port 8888 and port 22 (SSH) accessible only from your IP address.

Everything is ready, so let’s finally launch the instance!

You only need to create a new key pair (or choose an existing one). It is needed to log in to your machine via SSH.

Download the generated key and keep it private! Nobody except you should have access to it.

Now let’s see the status of the machine.

As you can see, the instance is up and running. Good job! You just launched an AWS instance.
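By the way, everything you just clicked through can also be scripted. Here is a rough boto3 sketch of the same launch; the AMI ID, key pair name, and security group ID are placeholders you would replace with your own values:

# Rough boto3 sketch of the launch; ImageId, KeyName and SecurityGroupIds
# are placeholders for your own values.
import boto3

ec2 = boto3.resource('ec2', region_name='eu-central-1')  # Frankfurt; use the region you picked above
instances = ec2.create_instances(
    ImageId='ami-xxxxxxxx',           # the Keras AMI in your region
    InstanceType='t2.micro',          # or 'g2.2xlarge' for GPU training
    KeyName='my-key-pair',            # the key pair you created above
    SecurityGroupIds=['sg-xxxxxxxx'], # the group with ports 22 and 8888 open
    MinCount=1,
    MaxCount=1,
)
instances[0].wait_until_running()     # block until the instance is up
instances[0].reload()                 # refresh attributes such as the public DNS
print(instances[0].public_dns_name)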

3) Set up Jupyter Notebook

Now let’s make use of it. Connect via ssh.

As the instructions say, restrict the permissions of the private key (chmod 400 on the key file) and type the example command into a terminal. After the -i parameter, insert the path to the private key, and type ‘ubuntu’ instead of ‘root’. So the command looks as follows (if you are using Windows, check out how to connect via PuTTY):

ssh -i path/to/private/key ubuntu@public_dns

Run the notebook by typing

sudo jupyter notebook

into the terminal. You can then access the notebook in your browser by going to your_public_dns:8888 (8888 is Jupyter’s default port).

4) Connect to your instance

The default password is “machinelearningisfun” (I suggest you change it; the Jupyter Notebook documentation explains how to do so).
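For reference, one way the Jupyter docs describe changing it is to put a hashed password into your Jupyter config; a minimal sketch of generating that hash (run in a Python shell or a notebook cell):

# Generates a password hash to paste into ~/.jupyter/jupyter_notebook_config.py
# as: c.NotebookApp.password = u'<the printed hash>'
from notebook.auth import passwd

print(passwd())  # prompts for the new password twice, then prints its hash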

The MNIST database is a well-known collection of handwritten digits. I prepared a sample notebook that loads the dataset and fits a sample convolutional neural network. Open the mnist.ipynb example and run the cells yourself.

The code comes from the Keras repository examples.
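If you would rather see the idea in a few lines than open the notebook, here is a condensed sketch in the spirit of the Keras mnist_cnn.py example; the layer sizes are illustrative and the argument names follow the Keras 2 API, so they may differ slightly from the notebook:

# Condensed MNIST CNN sketch in the spirit of the Keras examples.
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.utils import to_categorical

# Load the digits and reshape to (samples, height, width, channels).
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255
y_train, y_test = to_categorical(y_train, 10), to_categorical(y_test, 10)

# A small convolutional network: two conv layers, pooling, and a dense head.
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# A few epochs already give high test accuracy; fast on a GPU instance,
# slow but workable on t2.micro.
model.fit(x_train, y_train, batch_size=128, epochs=3,
          validation_data=(x_test, y_test))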

When you’re done, remember to terminate your instance! Billing is based on the amount of time the instance was up. For example, if you forget about a g2.2xlarge instance that has been running for a month, you’ll pay $0.772 * 24 * 30 = $555.84.
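The arithmetic, if you want to adapt it to another instance type or rental period, is trivial:

# Rough cost of an instance you forgot to terminate for a month.
hourly_rate = 0.772           # g2.2xlarge, USD per hour
hours_running = 24 * 30       # a full month
print(hourly_rate * hours_running)  # -> 555.84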

So, what’s next? I encourage you to take a look at the notMNIST dataset, which contains alphabet letters in a variety of fonts. You may also be interested in CIFAR-10 – a set of color images that can be matched to 10 categories, e.g. airplanes, ships, birds, or cats.
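Keras can download CIFAR-10 for you, so getting started is a one-liner, for example:

# CIFAR-10: 50,000 training and 10,000 test 32x32 color images in 10 classes.
from keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print(x_train.shape)  # e.g. (50000, 32, 32, 3) with the TensorFlow image ordering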

If you are new to Keras you may be interested in this tutorial. Or, as in my case, in detecting trypophobia triggers (for your sanity, please do not google images of it) in photos. I learned the basics of convolutional neural networks (and how to set up a machine) during a workshop at the Polish Children’s Fund tutored by Piotr Migdał. The source code of one of the other participants, using VGG16 for feature extraction, is available on GitHub.

If you are interested in how I approached detecting trypophobia triggers – stay tuned for the next post!

