Knowing how to work with packages is an important skill for any Python developer. However, unlike the language itself, there is nothing Zen about packaging in Python.
Packaging is one of the hardest things to learn in Python. That’s why we asked two Mighty Avengers to help us.
We’ll work with Clint and Natasha on a small application, starting with a couple of modules and finish with our very own python package.
The code examples in this article can be found here.
We break down our code into modules for modularity. A python module is just one file containing code. Modular code is easier to work with. Why?
Less stuff to keep in (human) memory while coding, less scrolling, knowing exactly where to look when you have to find a particular thing, breaking big problems into smaller ones, et cetera.
Let’s meet Clint. Clint is working on a new app, which prints a message based on the command line argument entered. The app uses a custom print function. Clint asks Natasha to write that code for him. Natasha creates the bprint
module for Clint, who imports it in his app
module.
"""01/app.pyMain app module."""
import sys
import bprint
commands = {"reboot": "Rebooting launch control system.","halt": "Halting all systems.",}
try:command = sys.argv[1]except IndexError:bprint.upper_print("No command provided. Terminating.")sys.exit(1)else:print(commands.get(command, "Doing absolutely nothing."))
Here’s the bprint
module.
"""01/bprint.pyCool print functions."""
def upper_print(message):print(message.upper())
if __name__ == '__main__':print("Hello little child.")upper_print("You have run bprint as a script.")print("That seems great. Goodnight.")
Let’s see Clint’s directory structure.
.├── app.py└── bprint.py
0 directories, 2 files
Clint can run his app using:
python3 app.py <command>
Awesome! Modules are pretty helpful. Clint and Natasha did a great job together. It doesn’t seem like they needed any packages yet.
In the next version of his app, Clint decides that he needs more custom print functions to display his messages. Again, he asks Natasha to help him out. Natasha writes a module sprint
containing the code that Clint requires.
Clint can just copy the files in the same directory along with his app.py
and use them.
.├── app.py├── bprint.py└── sprint.py
0 directories, 3 files
But now Clint feels uneasy, his code is mixed up with Natasha’s code. This will cause problems as the size of the code base increases. He decides to re-organize it. So he moves Natasha’s code into a sub-folder and modifies the imports in the app.py
file.
.├── app.py└── print_helpers├── bprint.py└── sprint.py
1 directory, 3 files
Clint’s modified app.py
file.
"""02/app.pyClint's new main app module."""
import sys
from print_helpers import bprint, sprint
commands = {"reboot": "Rebooting launch control system.","halt": "Halting all systems.",}
try:command = sys.argv[1]except IndexError:bprint.upper_print("No command provided. Terminating.")sys.exit(1)else:sprint.tacky_print(commands.get(command, "Doing absolutely nothing."))
And that’s it. Anyone who wants to use Natasha’s code can (with her permission) copy the print_helpers
directory to their code and import
.
Clint just created a package of Natasha’s code. A python package is collection of modules. Developers generally put modules related to each other in the same package.
Natasha is not satisfied with this. Copy pasting complete packages is error prone. It feels like a temporary hack. Natasha tries to learn more about packages and finds that with just a little more work, she can distribute her code in a better way.
By providing a setup.py
install script.
A lot can go inside the setup.py
file. Here is the minimal example that Natasha uses.
"""03/setup.pyNatasha's initial setup.py"""
from setuptools import setup, find_packages
setup(name='print-helpers',version='0.1',packages=find_packages(),)
Natasha also creates an empty file named __init__.py
inside the print_helpers
directory. This is required by python. A directory containing a __init__.py
is identified as a package.
.├── print_helpers│ ├── bprint.py│ ├── __init__.py│ └── sprint.py└── setup.py
1 directory, 4 files
Natasha can now pack all of her code into a single archive file and distribute it. Wheels are the recommended format for distributing packages. Natasha can create a wheel for her package by running
python3 setup.py bdist_wheel
This will generate a lot of new files along with the wheel
in the dist/ directory.
.├── build│ ├── bdist.linux-x86_64│ └── lib│ └── print_helpers│ ├── bprint.py│ ├── __init__.py│ └── sprint.py├── dist│ └── print_helpers-0.1-py3-none-any.whl├── print_helpers│ ├── bprint.py│ ├── __init__.py│ └── sprint.py├── print_helpers.egg-info│ ├── dependency_links.txt│ ├── PKG-INFO│ ├── SOURCES.txt│ └── top_level.txt└── setup.py
7 directories, 12 files
Natasha can then send this .whl
file to Clint.
Clint can now use pip
to install Natasha’s package. Pip is a command line utility to manage python packages. Here’s the command Clint runs.
pip install ./print_helpers-0.1-py3-none-any.whl
pip
will automatically copy Natasha’s code files from the package to specific locations on Clint’s computer. These are the locations where Python looks for modules when it encounters an import statement in the code.
Check the output of pip show
to examine the location of the newly installed package.
$ pip show print-helpers
Name: print-helpersVersion: 0.1Summary: UNKNOWNHome-page: UNKNOWNAuthor: UNKNOWNAuthor-email: UNKNOWNLicense: UNKNOWNLocation: /home/ramit/my_py_env/lib/python3.6/site-packagesRequires:Required-by:
This location will vary from one computer to the next, but you will also find this location in sys.path
which is a list of directories where python searches for modules.
Any Python code running on Clint’s computer can now import Natasha’s package. Clint does not need to maintain a copy of Natasha’s code along with his own in his projects.
"""04/app.py"""
import sys
from print_helpers import bprint, sprint
commands = {"reboot": "Rebooting launch control system.","halt": "Halting all systems.",}
try:command = sys.argv[1]except IndexError:bprint.upper_print("No command provided. Terminating.")sys.exit(1)else:sprint.tacky_print(commands.get(command, "Doing absolutely nothing."))
Directory structure of Clint’s app.
.└── app.py
0 directories, 1 file
Pip can install packages from PyPI, local or remote archives, Github repos or project directories. Python packages are generally uploaded to PyPI which serves as a repository for python packages. Natasha can upload her package to PyPI and anyone wanting to use her package can install the package directly using pip
, eliminating the need to acquire a package archive manually. This is what most of us are familiar with.
Try finding a few python packages on Github. Here’s one if you can’t remember any. Here’s another. Examine the directory structure of the repository linked above.
.├── LICENSE.txt├── MANIFEST.in├── README.md├── requirements.txt├── setup.py├── src│ └── fsanitize│ ├── app.py│ ├── __init__.py│ ├── logmgr.py│ └── sanitize.py└── tests├── __init__.py└── test_all.py
3 directories, 11 files
These contain a lot of files along with setup.py
. These can include tests, test configs, manifest files, license information, documentation, build scripts and much more. These are distributed along with the python modules inside the package.
Ideally, Clint would also create a package for his app and specify Natasha’s package as a dependency. Dependencies get installed along with the main package automatically.
print-helpers
is present in python’s sys.path
bprint
from the command line.console_scripts
and entry_points
do.__init__.py
file and running bprint
from the command line.install_requires
keyword argument does.