Install Python GDAL 🌐 using Conda 🐍 on a Mac 🍎

Written by filipwodnicki | Published 2017/10/31
Tech Story Tags: python | geospatial | analytics | jupyter-notebook | obscure-psa


Do not pip install gdal, and do not install GDAL inside a virtual-env. Instead, use Conda.

These are my instructions on how to install GDAL using Conda on a Mac. Before we dive in, let me explain why I am writing this guide. GDAL stands for the “Geospatial Data Abstraction Library” and it is released by the Open Source Geospatial Foundation. For Python, the GDAL package ships together with a package called **osgeo**. As it happens, I need both for a project I’m doing, and I want them neatly wrapped up inside a virtual environment.

Image. One Does Not Simply pip install gdal.

I started by trying to use pip and virtual-env. This obscure how-to (and first post on Medium!) comes to you after hours of googling and trying to debug the errors I encountered along the way. In the end, I shifted gears, tried something new and switched over to Conda. This was the solution to all my problems. OK, except a few, but I fixed those too.

Conda is great because it’s a package manager like pip, but it also manages your virtual environments like virtual-env does. Except it does both way better and it’s a joy to use. Treat yo self and switch to Conda.

I wanted to spare you the trouble I experienced so I wrote up the following instructions.

Environment:

A quick note about the environment I’m working in:

  • Mac OS 10.12.6 Sierra
  • Miniconda2 for Python 2 (Conda 4.3.30)
  • Python 2.7.14

Optional Python packages:

  • Jupyter notebook, installs with $ conda install jupyter
  • nb_conda, makes Jupyter play nice with Conda, installs with $ conda install nb_conda

If you are working in a different development environment, your mileage may vary.

And now, the instructions that I myself needed hours ago…

How to Install GDAL 🌐 using Conda 🐍 on a Mac 🍎

This tutorial assumes you have Conda already installed and a Conda environment already created. The Conda documentation has instructions for both.

Step 1: Activate your Conda environment 🚀

Open up Terminal, run this:

$ source activate [yourEnvironmentName]

For me, [yourEnvironmentName] = geoenv.

Screenshot. We’ve gone from the global shell to the Conda local environment that was just created.

(We can deactivate with the command $ source deactivate.)
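If you would rather confirm the active environment from Python than from the shell prompt, here is a minimal sketch. It assumes Conda sets the CONDA_DEFAULT_ENV environment variable when an environment is activated; the helper name describe_environment is made up for illustration.

```python
import os
import sys


def describe_environment():
    """Return the active Conda environment name and the interpreter location."""
    return {
        # Assumption: Conda sets CONDA_DEFAULT_ENV on activation.
        # In a shell without an active environment this comes back as None.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # sys.prefix is the root folder of the Python that is running.
        "prefix": sys.prefix,
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("{}: {}".format(key, value))
```

With geoenv active, conda_env should read geoenv and prefix should point at that environment's folder. If it does not, the Python you are running is probably not the one from your Conda environment.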

Step 2: Ok, now we get to install GDAL. 🔧

Still in Terminal, run this command:

$ conda install gdal

Here’s what I got:

Screenshot. Output from “conda install gdal”

Great. As it turns out, for the osgeo subpackage to work, we actually need the dependency **jpeg version 8**, rather than **9**. You can read more about how I came to that conclusion towards the end of this post, under #Diagnosing.

For now, all you need to do is run this:

$ conda install -f jpeg=8

The “-f” flag forces the install (which is really a downgrade of the jpeg package).

Screenshot. Install jpeg version 8 with Conda
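One way to double-check the downgrade is to look for the library file itself. The sketch below lists the libjpeg files in the environment's lib folder. It assumes that Conda keeps shared libraries under lib inside the environment (sys.prefix) and that you run it with the environment's own Python; find_jpeg_libraries is a hypothetical helper name.

```python
import glob
import os
import sys


def find_jpeg_libraries(prefix=sys.prefix):
    """List the libjpeg files under <prefix>/lib, sorted by name."""
    # Assumption: the Conda environment stores shared libraries in <prefix>/lib.
    pattern = os.path.join(prefix, "lib", "libjpeg*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


if __name__ == "__main__":
    found = find_jpeg_libraries()
    print(found)
    # libjpeg.8.dylib is the file named in the import error on a Mac.
    print("libjpeg.8.dylib present: {}".format("libjpeg.8.dylib" in found))
```

If libjpeg.8.dylib is in the list, the import test in the next step has a good chance of passing.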

OK, we should have a working version of GDAL now! Let’s just test it to make sure.

Step 3: Test the installation 🔍

You can do this in the command line or in a Jupyter notebook. Since I want to make sure gdal will work in Jupyter later, I’m going to test there.

To open a new Jupyter Notebook 📙, go back to Terminal, run this command:

$ jupyter notebook

This command will open up a new tab in your internet browser with the Jupyter Notebook file viewer. Navigate to the directory where you wish to save your notebook. Now, we want to start a new notebook. Go to the upper right-hand corner and click “New”.

Screenshot. Jupyter Notebook, creating a new notebook, I select “Python [conda env:geoenv]”

Make sure to choose the Conda environment you’ve been working with as the Python Kernel.

Let’s go ahead and test! Run these commands in the notebook.

import gdal
help(gdal)

Screenshot. Import gdal and get the help to make sure it works!

The help for gdal works, so we’re off to the races.

import osgeo
help(osgeo)

Screenshot. Import osgeo and run help(osgeo) to make sure it works!
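If you prefer a single cell that reports on everything at once, the two checks above can be sketched as a small loop. The module names osgeo, osgeo.gdal and osgeo.ogr are the ones that ship with GDAL's Python bindings; check_imports is a made-up helper name, not part of GDAL.

```python
import importlib


def check_imports(module_names=("osgeo", "osgeo.gdal", "osgeo.ogr")):
    """Try to import each module and map its name to 'ok' or the error text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = "ok"
        except ImportError as error:
            # Keep the message: it usually names the library that is missing.
            results[name] = "failed: {}".format(error)
    return results


if __name__ == "__main__":
    for name, status in sorted(check_imports().items()):
        print("{}: {}".format(name, status))
```

Every line should end in ok. Anything else prints the import error, which is the first clue for the diagnosis section at the end of this post.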

Success! 🤗

We’ve finally got GDAL installed as well as osgeo. Everything is working (for now). This somewhat lengthy post was a joy to write, as this problem caused me innumerable hours of strife. I hope to save you from the same. Thanks in advance for your claps 👏🏽 Let me know if something needs an edit or clarification. With that, I’m off to explore graphs with **networkx**!

With ❤︎,

Filip

p.s.

#Motivation for this post

I generally like to use virtual environments on projects to keep things organized. First, I tried to install GDAL inside a Python virtual-env, which was a huge fail. There are instructions out there on how to do that for Windows and Ubuntu, but I couldn’t get it to work on a Mac. Virtual-env was more like virtual-enemy. Some folks on StackOverflow suggested using Conda instead. I ran into a few snags anyways, so I decided to publish these instructions on how to install GDAL using Python/Conda on a Mac. Dear reader, I hope this guide saves you some time.

#What am I using GDAL for

I need GDAL for a very particular reason. It’s a dependency of the **read_shp()** function of the networkx Python module. That function reads in an ESRI shapefile (geospatial data) and converts it into a network/graph object. Obviously, you might need GDAL for something else.

To install, I tried using **pip install gdal** inside a Python virtual environment (a.k.a. virtual-env) at first. That failed. I guess you could say it was only a pip dream, sigh. Or maybe it had something to do with having QGIS via Kyngchaos installed. That distribution includes GDAL not as a Python package, but as a Framework.

Anyways, the bottom line is that I still needed GDAL to work inside a Python virtual environment.

#Diagnosing the Conda install issue:

It was not possible for me to get GDAL installed inside a virtual-env using pip. That’s why I switched to Conda.

When running the install in Conda, I ran into a few issues. Simply running the read_shp function from **networkx** was giving me a generic error, much like it was in virtual-env.

ImportError: read_shp requires OGR:

Screenshot. Jupyter notebook. What happens when I try to run the command: G = nx.read_shp(‘file.shp’)

In the screenshot you can see that the code requires **from osgeo import ogr** which is actually included as part of the GDAL module.
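A generic message like this typically comes from a guarded import. The sketch below only illustrates that pattern and is not copied from networkx; the function name require_ogr and its message are made up.

```python
def require_ogr():
    """Import and return osgeo.ogr, or raise a short, generic ImportError."""
    try:
        from osgeo import ogr
    except ImportError:
        # On Python 2 this hides the original error (for example, a missing
        # libjpeg), so the message gives no hint about the real cause.
        raise ImportError("this function requires OGR")
    return ogr
```

This is why it pays to run the import yourself: the bare import shows the underlying error instead of the generic one.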

So when we try to **import gdal**, we can see what’s actually happening:

Library not loaded: @rpath/libjpeg.8.dylib

Error when I try to run import gdal. The library jpeg.8 is missing.
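If the traceback is long, the library name can be pulled out of the error text with a regular expression. This is a small sketch that assumes the message contains the phrase "Library not loaded:" followed by a path, as in the error above; missing_library is a hypothetical helper name.

```python
import re


def missing_library(error_text):
    """Return the file named after 'Library not loaded:', or None."""
    match = re.search(r"Library not loaded:\s*(\S+)", error_text)
    if match is None:
        return None
    # Drop any folder or @rpath prefix and keep just the file name.
    return match.group(1).rsplit("/", 1)[-1]


message = "Library not loaded: @rpath/libjpeg.8.dylib"
print(missing_library(message))  # libjpeg.8.dylib
```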

The jpeg 8 library is not loading. To investigate, we can check what packages Conda has installed:

$ conda list

Result of the conda list command. Sure enough, we have jpeg=9, rather than 8.
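The same check can be scripted by pasting the conda list output into Python. The sketch below assumes the usual layout of '#' header lines followed by whitespace-separated name, version and build columns. The sample text is made up for illustration and is not copied from my screenshot, and parse_conda_list is a hypothetical helper name.

```python
def parse_conda_list(text):
    """Map package name to version from `conda list` style output."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 2:
            versions[fields[0]] = fields[1]
    return versions


# Made-up sample for illustration only (paths and build numbers are invented).
sample = """\
# packages in environment at /path/to/envs/geoenv:
#
jpeg                      9b                    0
python                    2.7.14                0
"""

installed = parse_conda_list(sample)
print(installed["jpeg"])                  # 9b
print(installed["jpeg"].startswith("8"))  # False
```

A False on the last line means jpeg is not at version 8, which matches the missing libjpeg.8.dylib in the import error.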

Moreover, when I uninstall and reinstall only **gdal**, it actually becomes evident that gdal itself updates jpeg to version 9, only to break later.

GDAL breaks itself. Or rather, GDAL breaks osgeo which it’s bundled with(!)

The fix is to simply downgrade jpeg 9 to jpeg 8 after installing gdal. You can find the recipe for that in Step 2 of the #Instructions above. Thanks!

Sources I used:

  • egayer’s comment in this thread on the gdal GitHub
  • “Having trouble installing GDAL for python”
  • I’m not including my crazy, exhaustive searches for anything related to “pip install GDAL” or “GDAL python install mac virtual-env” in this list. Bless your heart if you try to go down that path.

Published by HackerNoon on 2017/10/31