A Better Guide to Build Apache Superset From source

by Kartik Khare, September 22nd, 2019

In this article, we’ll take a deep dive into building Apache Superset from source. The official documentation is too complicated for a new contributor, so this is my attempt to simplify it.

First, you’ll need the following installed on your system:

  • Python 3.6 or 3.7
  • NodeJS
  • NPM
  • Yarn package manager for NodeJS
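Before going further, it can help to confirm that these tools are actually on your PATH. The following is a minimal sketch rather than part of the official build guide; the helper name `check_tool` is made up for illustration, and it only checks that each command exists, not which version it is.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: reports whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Count how many of the prerequisites listed above are absent.
missing=0
for tool in python3 node npm yarn; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "$missing prerequisite(s) missing"
```

If anything is reported as missing, install it before moving on to the OS dependencies.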

Let’s first install the OS dependencies. Most of these should already be present on your system.

MacOS:

brew install pkg-config libffi openssl python
env LDFLAGS="-L$(brew --prefix openssl)/lib" CFLAGS="-I$(brew --prefix openssl)/include" pip install cryptography==2.4.2

Debian/Ubuntu:

sudo apt-get install build-essential libssl-dev libffi-dev python3.6-dev python-pip libsasl2-dev libldap2-dev

Fedora/RHEL-Derivatives:

sudo yum upgrade python36u-setuptools
sudo yum install gcc gcc-c++ libffi-devel python36u-devel python36u-pip python-wheel openssl-devel libsasl2-devel openldap-devel

On RHEL, you might find that the python36u-devel package doesn’t exist. In that case, search for the correct python3-devel package for your architecture using

yum search python3 | grep devel

Once you have these dependencies set up, you are ready to go to the next step.

The Superset repository contains both the frontend and the backend, and each needs to be built separately. Let’s start with the backend.

Clone the repository

git clone https://github.com/apache/incubator-superset.git
cd incubator-superset/

Create a virtual environment

A Python virtual environment isolates your dependencies from the rest of the system. This avoids dependency conflicts and is therefore recommended.

python3 -m venv path/to/new/virtual/env

Once you have created the virtual environment, activate it using

source path/to/new/virtual/env/bin/activate
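To confirm that the activation worked, you can look at the `VIRTUAL_ENV` variable, which the venv activate script sets in your shell. This is a small sketch; `venv_status` is a hypothetical helper name used only for this example.

```shell
#!/bin/sh
# venv_status is a hypothetical helper.
# VIRTUAL_ENV is exported by the activate script; if it is empty,
# no virtual environment is active in this shell.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: $VIRTUAL_ENV"
    else
        echo "inactive"
    fi
}

venv_status
```

If it prints "inactive", run the `source` command again in the same terminal you plan to use for the `pip install` steps.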

Install the dependencies

Now you can install all the required dependencies.

pip install -r requirements.txt
pip install -r requirements-dev.txt

Some of the dependencies in requirements.txt usually give errors while running. To avoid them, install the pinned versions listed below

pip install numpy==1.17
pip install sqlalchemy==1.2.18
pip install pandas==0.23.4
pip install markupsafe==1.0
pip install mysqlclient
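If you would rather keep these pins in one place instead of typing five commands, one option is to write them to a small requirements file. This is only a sketch of that idea: the file name `requirements-pins.txt` is invented here and is not part of the Superset repository.

```shell
#!/bin/sh
# Write the pinned versions from this guide to a scratch file.
# requirements-pins.txt is a hypothetical name, not a file shipped with Superset.
cat > requirements-pins.txt <<'EOF'
numpy==1.17
sqlalchemy==1.2.18
pandas==0.23.4
markupsafe==1.0
mysqlclient
EOF

echo "wrote requirements-pins.txt"
```

With the virtual environment active, `pip install -r requirements-pins.txt` then installs the same packages as the individual commands above.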

Install superset

pip install -e .

This will use the setup.py file to install Superset.

Now, let’s proceed to build the frontend.

First, we need to change the directory to superset/assets/

cd superset/assets/

Once that is done, we can start building the Superset UI.

Pull the dependencies

First, we pull all the required Node.js dependencies using Yarn. To do that, just run

yarn

Yarn will download and install all the dependencies listed in the package.json file.

Build UI

To finally build the front-end, use

npm run build

Voilà! We are done installing Superset from source. You can now run Superset using

superset run

Run Tests

To run all the tests, you can run

tox

Let us look at some of the common errors which you might encounter during the build process.


1. Error: flask_appbuilder.base: ‘NoneType’ object has no attribute ‘name’
Solution:

superset init
superset db upgrade

2. Error: Failure while creating virtualenv.

  • virtualenv is installed with python3.6 but you are using python3.7 to create venv. 
    Solution: Reinstall virtualenv for python3.7
  • libffi.so missing. 
    Solution: Install python3-devel to fix that
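When chasing the version mismatch described above, it helps to see which interpreter `python3` actually resolves to. The sketch below assumes the 3.6/3.7 requirement stated at the start of this guide; `supported_python` is a hypothetical helper name.

```shell
#!/bin/sh
# supported_python is a hypothetical helper: succeeds only for the
# Python versions this guide lists as prerequisites (3.6 or 3.7).
supported_python() {
    case "$1" in
        3.6|3.7) return 0 ;;
        *) return 1 ;;
    esac
}

if command -v python3 >/dev/null 2>&1; then
    # Ask the interpreter itself for its major.minor version.
    ver=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if supported_python "$ver"; then
        echo "python3 is $ver: matches this guide"
    else
        echo "python3 is $ver: this guide expects 3.6 or 3.7"
    fi
else
    echo "python3 not found on PATH"
fi
```

Creating the environment with the same interpreter you checked (for example `python3 -m venv`, as shown earlier) avoids mixing a virtualenv installed for one version with a different Python.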

3. Error: flask command not found
Solution:

pip install flask-cli

4. Error: Yarn — There appears to be trouble with your network connection. Retrying...

Solution:

You are probably installing all these dependencies in a closed network such as an office or college. Set up a valid yarn proxy to allow it to download dependencies. 

export http_proxy=http://host:port/
export https_proxy=https://host:port/
yarn config set proxy http://host:port/
yarn config set https-proxy https://host:port/
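After setting the proxy, you may want to check what your shell and Yarn will actually use. This is a minimal sketch; `show_proxy_settings` is a hypothetical helper name, and the Yarn lines are printed only if `yarn` is on your PATH.

```shell
#!/bin/sh
# show_proxy_settings is a hypothetical helper: prints the proxy values
# currently visible to the shell and, when available, to Yarn.
show_proxy_settings() {
    echo "http_proxy=${http_proxy:-<unset>}"
    echo "https_proxy=${https_proxy:-<unset>}"
    if command -v yarn >/dev/null 2>&1; then
        # "yarn config get" reads back values stored by "yarn config set".
        echo "yarn proxy=$(yarn config get proxy)"
        echo "yarn https-proxy=$(yarn config get https-proxy)"
    fi
}

show_proxy_settings
```

If the values still show as unset, the `export` commands were probably run in a different terminal session.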

If you encounter any errors other than the ones mentioned above, please refer to the official build guide.

Connect with me on LinkedIn or Facebook.