Building a Python package

Ioannis Nasios

I have been working with Python for several years, mainly on machine learning projects, but I had never built a package.
Recently I built my first Python package, SolarSystem, and published it on GitHub and on PyPI. The idea behind SolarSystem, i.e. estimating the positions of planets and other celestial objects as a function of time, was an old one, as was the relevant code, but I had not got around to organizing the code into a package until recently.

On the way to my first Python package I discovered that, besides the code itself, there are many things I could add to my project to make it complete, stable, and user friendly. Below I describe all the steps I took, for anyone wishing to follow, as I found that not every step is crystal clear and some need a bit of digging.

What follows has been tested on Ubuntu 18.04, but I expect it to work on other operating systems too, with minimal changes.

Step 1: Write the code

Write all your Python scripts, containing all the necessary functions and classes.

Step 2: Make your scripts a Python package

As per the official documentation, your package name will be the name of the directory containing your __init__.py file. __init__.py is required in this directory; it can be empty, contain imports of your other Python scripts, or even hold your whole package code.
Then you can create a setup.py file as described in the documentation (or preferably copy a setup file from a GitHub repo and make all the necessary adjustments).
Create a README.md file, as this will be your repo’s landing page, with a detailed description and any information you want to share with visitors, and a LICENSE file describing the terms of use of your package (there are many licence types available; in practice you select one of them rather than writing your own).
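As a rough illustration of the setup.py mentioned above, here is a minimal sketch. It is not the SolarSystem setup file: the name, version, author, URL, licence and Python requirement are all placeholder assumptions to be replaced with your own values.

```python
# setup.py -- minimal sketch; every value below is a placeholder assumption.
from pathlib import Path

from setuptools import find_packages, setup

# Reuse README.md as the long description shown on the package index page.
long_description = Path(__file__).with_name("README.md").read_text(encoding="utf-8")

setup(
    name="your_package",                # name users will pass to pip install
    version="0.1.0",                    # placeholder version
    description="One-line summary of what the package does",
    long_description=long_description,
    long_description_content_type="text/markdown",
    author="Your Name",                 # placeholder
    url="https://github.com/your_user/your_package",  # placeholder URL
    license="MIT",                      # assumption: use the licence you chose
    packages=find_packages(exclude=["tests", "tests.*"]),
    python_requires=">=3.6",            # assumption: adjust to what you support
    install_requires=[],                # list runtime dependencies here
)
```

find_packages() picks up every directory that contains an __init__.py, which is why that file matters for the layout described in this step.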

Step 3: Install your package locally

Now that you have created the setup.py file you can install your package from your local directory in order to manually check that your code works and everything runs as expected. So far your directory structure should be as below.

whole_project/
    your_package/
        __init__.py
        all_other_python_scripts.py
    setup.py
    LICENSE
    README.md

# To install your package:
cd whole_project/
pip install .

Note that whole_project can have the same name as your_package (e.g. my whole_project is named solarsystem, the same as my package).
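One way to confirm that the local install was registered is to ask Python for the installed version. This is only a sketch, assuming Python 3.8 or newer (where importlib.metadata is part of the standard library); installed_version is a hypothetical helper name, and "your_package" stands for whatever name you set in setup.py.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "your_package" is a placeholder for the name given in setup.py.
    version = installed_version("your_package")
    if version is None:
        print("your_package is not installed in this environment")
    else:
        print(f"your_package {version} is installed")
```

This only checks the installation metadata; importing the package and calling a function or two is still the real test that the code works.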

Step 4: Upload your package to Pypi

To make it easier for others to install your package, you can share it by uploading it to PyPI (the Python Package Index), the official repository for Python packages.
To upload to PyPI you should first generate the distribution archives.

cd whole_project/
pip install -U setuptools wheel
python setup.py sdist bdist_wheel
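The two commands above should leave a source archive (.tar.gz) and a wheel (.whl) in a dist/ directory. If you want to check that before uploading, something like the following sketch would do; find_archives is a hypothetical helper, not part of setuptools or twine.

```python
from pathlib import Path


def find_archives(dist_dir="dist"):
    """List the source archives and wheels found in the given directory."""
    dist_path = Path(dist_dir)
    return {
        "sdist": sorted(p.name for p in dist_path.glob("*.tar.gz")),
        "wheel": sorted(p.name for p in dist_path.glob("*.whl")),
    }


if __name__ == "__main__":
    # Run from whole_project/ after building the archives.
    for kind, names in find_archives().items():
        print(kind, names if names else "MISSING")
```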

Create an account on PyPI and another on TestPyPI, so that you can test your package first.

# install twine package:
python -m pip install --user --upgrade twine

To upload your package you can follow the instructions on PyPI, or create a .pypirc file in your HOME directory containing the credentials from your registration on PyPI and TestPyPI (so you don’t have to give your username and password every time) and then upload. Example of a .pypirc file:

[distutils]
index-servers =
    pypi
    pypitest

[pypi]
repository: https://upload.pypi.org/legacy/
username: your_user_name_on_pypi
password: your_password_on_pypi

[pypitest]
repository: https://test.pypi.org/legacy/
username: your_user_name_on_testpypi
password: your_password_on_testpypi

# first upload to the pypitest server to ensure everything runs normally
twine upload dist/* -r pypitest
# install from testpypi to check all went well
pip install --index-url https://test.pypi.org/simple/ --no-deps your_package
# if all looks OK, upload to pypi and install from pypi
twine upload dist/* -r pypi
pip install your_package
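Since .pypirc is an INI-style file, a typo in a section name is easy to miss until the upload fails. The sketch below uses the standard configparser module to list the servers the file declares; declared_servers is a hypothetical helper, and the sample text mirrors the example above with placeholder values only.

```python
import configparser


def declared_servers(pypirc_text):
    """Map each server listed under [distutils] to its repository URL."""
    # interpolation=None so that a '%' in a password is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(pypirc_text)
    names = parser.get("distutils", "index-servers", fallback="").split()
    return {
        name: parser.get(name, "repository", fallback=None)
        for name in names
        if parser.has_section(name)
    }


if __name__ == "__main__":
    sample = "\n".join([
        "[distutils]",
        "index-servers =",
        "    pypi",
        "    pypitest",
        "",
        "[pypi]",
        "repository: https://upload.pypi.org/legacy/",
        "",
        "[pypitest]",
        "repository: https://test.pypi.org/legacy/",
    ])
    for name, url in declared_servers(sample).items():
        print(name, "->", url)
```

A server that is listed under index-servers but has no section of its own is simply left out of the result, which makes a misspelled section name stand out.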

Step 5: Make a Github repo and upload your package

I assume you already have git (a free distributed version-control system) installed; if not, install it first (not shown).
Go to GitHub, create an account if necessary, and create a repository with the same name as your package.
Optionally, create a .gitignore file inside your local project directory, listing the local files and directories that git should not add to the repository.
At the top of your GitHub repository’s Quick Setup page, click to view your remote_repository_URL.
Working with git and GitHub has many aspects and there are many commands, but just a few are enough for our case. Here are the commands needed to commit locally and upload to GitHub.

cd whole_project/
# Initialize the local directory as a Git repository. (Run Once)
git init
# Add files in your new local repository. This stages them for the first commit.
git add .
# Commit the files that you've staged in your local repository with a comment.
git commit -m "First commit"
# Add the URL for the remote repository where your local repository will be pushed (Run Once).
git remote add origin remote_repository_URL
# Push changes from your local repository to GitHub.
git push origin master

Step 6: Build package documentation

Documentation is necessary for users to understand how a class or function in your code works.
A thorough way of providing it is to write docstrings (""" """) in a structured text format inside every class and function. This way, anyone who can see the code will always know how to use a given class or function. Furthermore, you can later use the sphinx package to extract this documentation from your Python scripts and create HTML documentation. You can read further about how to document your code here (personally I prefer the Google docstring format), and here is how to use sphinx to create your documentation:
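First, though, a brief illustration of the Google docstring format mentioned above. The function is a made-up example (it is not taken from the SolarSystem package) and only serves to show the Args / Returns / Example sections.

```python
def degrees_to_hours(angle_deg):
    """Convert an angle in degrees to hours of right ascension.

    Args:
        angle_deg (float): Angle in degrees. Values outside [0, 360)
            are wrapped into that range first.

    Returns:
        float: The angle expressed in hours, in the range [0, 24).

    Example:
        >>> degrees_to_hours(180.0)
        12.0
    """
    # 360 degrees correspond to 24 hours, i.e. 15 degrees per hour.
    return (angle_deg % 360.0) / 15.0
```

With docstrings of this kind in place, the Sphinx commands are as follows.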

# Building documentation locally by using sphinx package.
pip install sphinx
cd whole_project/
mkdir docs
cd docs
# use sphinx-apidoc
sphinx-apidoc --full --separate --tocfile your-package.rst -o path-to-docs-directory/ /path-to-your_package/

This will create a conf.py and some .rst files inside your docs directory.
Open and edit your conf.py file. This can be tricky, so I recommend copying my conf.py file (or a conf.py from some other repository) and editing it to change project, copyright and author.
Extensions used in my file:

  • sphinx.ext.autodoc, for automatically building documentation from docstrings,
  • sphinx.ext.viewcode, for adding links to the source code inside the documentation,
  • sphinx.ext.napoleon, necessary if the Google or NumPy docstring format is used.
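For orientation, the relevant part of a conf.py using these three extensions might look like the sketch below. It is not a copy of my file: the project, author and copyright values are placeholders, and it assumes conf.py sits in whole_project/docs/ so that the package is one directory up.

```python
# docs/conf.py -- minimal sketch; project, author and copyright are placeholders.
import os
import sys

# Make the package importable so that autodoc can read its docstrings.
# Assumes this file lives in whole_project/docs/.
sys.path.insert(0, os.path.abspath(".."))

project = "your_package"
author = "Your Name"
copyright = "2019, Your Name"

extensions = [
    "sphinx.ext.autodoc",    # pull documentation from docstrings
    "sphinx.ext.viewcode",   # add links to highlighted source code
    "sphinx.ext.napoleon",   # understand Google / NumPy style docstrings
]

# Napoleon settings: accept Google style only (an assumption; enable both if needed).
napoleon_google_docstring = True
napoleon_numpy_docstring = False

html_theme = "alabaster"     # Sphinx's default theme
```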

You can now build the docs locally by simply running:

make html

Build your docs on Read the Docs: for your docs to be permanently online and rebuilt every time you push something to your GitHub repository, you can register for free and add your project. Read the Docs will read your conf.py and your .rst files and build your documentation on a subdomain, meaning anyone will be able to access it at
https://your-project.readthedocs.io

Step 7: Build some tests for your code (optional)

Create a tests directory inside your whole_project directory.

cd whole_project/
mkdir tests
# Install pytest package if not already installed
pip install pytest

Write some Python scripts that use your package and check that it works. You can see my Python tests here.
You can now test if your code is working by running:
python tests/my_test_script.py
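A test script can be as simple as a few functions containing plain assert statements. The sketch below is hypothetical: degrees_to_hours is a stand-in defined inline so that the example is self-contained, whereas a real test would import the function from your_package instead.

```python
# tests/test_angles.py -- hypothetical test file, for illustration only.


def degrees_to_hours(angle_deg):
    """Stand-in for a function that would normally be imported from your_package."""
    return (angle_deg % 360.0) / 15.0


def test_half_circle_is_twelve_hours():
    assert degrees_to_hours(180.0) == 12.0


def test_angles_wrap_around():
    # 375 degrees is one full turn plus 15 degrees.
    assert degrees_to_hours(375.0) == degrees_to_hours(15.0)


if __name__ == "__main__":
    # Allows running the file directly with: python tests/test_angles.py
    test_half_circle_is_twelve_hours()
    test_angles_wrap_around()
    print("all tests passed")
```

Note that pytest, by default, only collects files named test_*.py or *_test.py and functions whose names start with test, so naming files this way lets pytest find them without extra configuration.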

When you have a library ready, you should check which versions of Python can run your code flawlessly. Running your tests on various Python versions will tell you which versions your package supports. Travis is an online platform that can run your tests for free on any required Python version and verify that they pass.
Here is the solarsystem passing tests example.
In order to use Travis you will need to create a free account (log in with GitHub) and add a GitHub repo (your project). You should also add a .travis.yml file; see a simple example here. This way, every time you push to GitHub, all the tests will run automatically.

Besides passing your tests on a list of Python versions, there is another important aspect of code testing: what percentage of your code is exercised by your tests. You can easily measure it locally:

pip install codecov
cd whole_project/
coverage run --source=your_package -m pytest -v
# and view the report
coverage report -m

I suggest you go a step further and generate an html coverage report, which will help you check which parts of your code are tested and which are not; you can do this locally:

# if you want to generate html files locally
coverage html

Alternatively, you can upload or create a coverage report at codecov.io. This way the results will be available for others to see in your repository. To generate the HTML files at codecov.io, create a free account (log in with GitHub), as we did for Travis above, and add a GitHub repo (your project).
You can now either upload your previously generated report:

codecov --token=-your-project-token-provided-by-codecov --gcov-root=your_package/

or edit your .travis.yml file like this one, so that Travis runs codecov after your tests finish and uploads the results to codecov.io.

Step 8: Update README.md 

After following the above steps, we can add some important information to our README.md.

A good practice is to insert badges in your README.md that provide useful information about the status of your code (version, build, docs etc.). Here are some typical examples:

  • a PyPI package badge, showing the latest version uploaded to PyPI,
  • a docs badge provided by Read the Docs, declaring that your docs have been built successfully online,
  • a build badge provided by Travis, declaring that your tests pass on all the given Python versions,
  • a codecov badge provided by codecov.io, showing what percentage of your package code is exercised by your tests.

Here is the markdown code I used for my own project:

[![PyPI version](https://badge.fury.io/py/solarsystem.svg)](https://badge.fury.io/py/solarsystem)
[![Documentation Status](https://readthedocs.org/projects/solarsystem/badge/?version=latest)](https://solarsystem.readthedocs.io/en/latest/?badge=latest)
[![Build Status](https://travis-ci.org/IoannisNasios/solarsystem.svg?branch=master)](https://travis-ci.org/IoannisNasios/solarsystem)
[![codecov](https://codecov.io/gh/IoannisNasios/solarsystem/branch/master/graph/badge.svg)](https://codecov.io/gh/IoannisNasios/solarsystem)
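If you want to reuse the same four badges for another project, the URL patterns above can be filled in programmatically. This is just a sketch: badge_lines is a hypothetical helper, and it assumes the project has the same name on PyPI, Read the Docs and GitHub and that the default branch is master, as in my case.

```python
def badge_lines(github_user, project):
    """Return the four README badge lines for the given GitHub user and project."""
    return [
        f"[![PyPI version](https://badge.fury.io/py/{project}.svg)]"
        f"(https://badge.fury.io/py/{project})",
        f"[![Documentation Status](https://readthedocs.org/projects/{project}/badge/?version=latest)]"
        f"(https://{project}.readthedocs.io/en/latest/?badge=latest)",
        f"[![Build Status](https://travis-ci.org/{github_user}/{project}.svg?branch=master)]"
        f"(https://travis-ci.org/{github_user}/{project})",
        f"[![codecov](https://codecov.io/gh/{github_user}/{project}/branch/master/graph/badge.svg)]"
        f"(https://codecov.io/gh/{github_user}/{project})",
    ]


if __name__ == "__main__":
    # Prints the markdown to paste near the top of README.md.
    print("\n".join(badge_lines("IoannisNasios", "solarsystem")))
```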

Of course, there are many other things that you can (and should) include in your README.md, e.g. provide links, installation instructions and/or options for your library, citations etc. I’ll leave it to you here, as there are literally thousands of nice README files out there to get inspiration from. Happy coding!



Ioannis Nasios

Ioannis is one of our resident Data Scientists. He is a Kaggle Master, and currently in top-150 of Kaggle participants worldwide.