Packaging Python Projects — Supplement

John Tucker
8 min read · Oct 6, 2020

A supplement to the official Packaging Python Projects tutorial that fills in a number of gaps.

I was surprised at the number of challenges I faced in walking through the official Packaging Python Projects tutorial, so much so that I decided to write this supplementary article.

If you wish to follow along, you will need Python (a 3.X version) and pipenv (for Python development workflow) installed on your workstation. While not necessary, I also used pyenv (for Python version management).

note: This article was written using the latest (as of the time of the article) version of Python: 3.8.6.

The final distribution package source code developed in this article is available for download.

A Simple Project

The tutorial starts off by having us create two folders and an empty file:

Right off the bat, I was confused by the dual meaning of the word “package”. There is the distribution package (outer folder) and the import package (inner folder).

Distribution Package

A versioned archive file that contains Python packages, modules, and other resource files that are used to distribute a Release. The archive file is what an end-user will download from the internet and install.

A distribution package is more commonly referred to with the single words “package” or “distribution”, but this guide may use the expanded term when more clarity is needed to prevent confusion with an Import Package (which is also commonly called a “package”) or another kind of distribution (e.g. a Linux distribution or the Python language distribution), which are often referred to with the single term “distribution”.

— Python — Python Packaging User Guide: Glossary

Import Package

A Python module which can contain other modules or recursively, other packages.

An import package is more commonly referred to with the single word “package”, but this guide will use the expanded term when more clarity is needed to prevent confusion with a Distribution Package which is also commonly called a “package”.

— Python — Python Packaging User Guide: Glossary

So, a distribution package can contain one or more import packages. The recommendation, however, is that it contain only one.

Distribute only one package (or only one module) per project, and use package (or module) name as project name.

— Python — PEP 423 — Naming conventions and recipes related to packaging

Now that we know we are going to create a distribution package that is named the same as the single import package it contains, we have to decide on a name.

Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.

— Python — PEP 8 — Style Guide for Python Code

It turns out that we will want to pick an all-lowercase name with no underscores. This is because distribution packages uploaded to the Python Package Index cannot be named with underscores.

note: I learned this the hard way as I first named a distribution package with underscores only to find that when I uploaded it to the Python Package Index, the underscores were silently changed to dashes.
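The silent renaming is consistent with the name normalization rule described in PEP 503, under which runs of dashes, underscores, and periods are treated as equivalent. A minimal sketch of that rule follows; the function name normalize is my own choice, and the snippet only illustrates the rule rather than what the Python Package Index runs internally.

```python
import re


def normalize(name: str) -> str:
    """Normalize a distribution name following the rule in PEP 503."""
    # Collapse any run of "-", "_" or "." into a single "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    print(normalize("example_pkg_larkintuckerllc"))
    print(normalize("Random_Word"))
```

Two names that normalize to the same string refer to the same project, which is one more reason to avoid separators in the name altogether.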

Finally, we will want to pick a name that does not conflict with an existing distribution package in the Python Package Index. With all this in mind, I started off by creating two folders and an empty file as such:

examplepkglarkintuckerllc
|- examplepkglarkintuckerllc
   |- __init__.py
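For those who prefer to script this step, the same layout can be sketched with pathlib. The helper name create_skeleton is hypothetical, and the demo writes into a temporary folder so that it leaves nothing behind.

```python
import tempfile
from pathlib import Path


def create_skeleton(root: Path, name: str) -> Path:
    """Create <root>/<name>/<name>/__init__.py and return the __init__.py path."""
    # Outer folder: distribution package; inner folder: import package.
    import_package = root / name / name
    import_package.mkdir(parents=True, exist_ok=True)
    init_file = import_package / "__init__.py"
    init_file.touch()  # an empty file is enough to mark the import package
    return init_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        created = create_skeleton(Path(tmp), "examplepkglarkintuckerllc")
        print(created.relative_to(tmp))
```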

note: While this structure meets all the aforementioned requirements, in practice we would want to create a package name that is both meaningful and easy to remember.

Creating the (Import) Package Files

The official tutorial skips the step of creating anything else in the import package which makes it effectively meaningless when we test the final result. So, let us create a simple module with a single function; examplepkglarkintuckerllc/example_module.py:
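A minimal sketch of such a module, assuming a single function named hello_world (a hypothetical name, chosen to match the Hello World! output shown further down):

```python
"""Example module with a single function."""


def hello_world() -> str:
    """Return a fixed greeting."""
    return "Hello World!"
```

Returning the string, rather than printing it, keeps the function easy to test.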

For completeness, we supply a docstring for the import package in examplepkglarkintuckerllc/__init__.py:

To allow us to execute the package (say for development), we create examplepkglarkintuckerllc/__main__.py:

With this, we can execute the import package from the command line in the root of the distribution package:

$ python -m examplepkglarkintuckerllc
Hello World!
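To make the mechanics of python -m concrete, here is a self-contained sketch that writes a tiny import package into a temporary folder and executes it. The file contents are assumptions (in particular the function name hello_world); the point is only that python -m looks for __main__.py inside the named package.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

PACKAGE = "examplepkglarkintuckerllc"

# Hypothetical file contents; the function name hello_world is an assumption.
FILES = {
    "__init__.py": '"""An example import package."""\n',
    "example_module.py": 'def hello_world():\n    return "Hello World!"\n',
    "__main__.py": (
        f"from {PACKAGE}.example_module import hello_world\n"
        "\n"
        "print(hello_world())\n"
    ),
}


def run_package() -> str:
    """Write the package to a temporary folder and run it with python -m."""
    with tempfile.TemporaryDirectory() as tmp:
        package_dir = Path(tmp) / PACKAGE
        package_dir.mkdir()
        for filename, content in FILES.items():
            (package_dir / filename).write_text(content)
        # cwd plays the role of the distribution package root.
        result = subprocess.run(
            [sys.executable, "-m", PACKAGE],
            cwd=tmp,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


if __name__ == "__main__":
    print(run_package(), end="")
```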

Because the tutorial also has us create a tests folder in the root of the distribution package, we can populate it with a test case; test_example_module.py:

We can then run the test from the distribution package folder using the following command:

$ python -m unittest discover -s tests
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
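A test case along these lines would produce the single passing test shown above. This is a sketch: hello_world is a hypothetical function name, and it is defined inline here only so the snippet runs on its own. In the actual tests folder it would be imported from the import package.

```python
import unittest


# Stand-in so the sketch is self-contained. In tests/test_example_module.py
# this would instead be imported from the package, for example:
# from examplepkglarkintuckerllc.example_module import hello_world
def hello_world():
    return "Hello World!"


class TestExampleModule(unittest.TestCase):
    def test_hello_world(self):
        # The function is expected to return the greeting unchanged.
        self.assertEqual(hello_world(), "Hello World!")
```

No entry point is needed in the file because python -m unittest discover finds and runs the TestCase classes itself.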

Creating the (Distribution) Package Files

We now focus on creating files in the distribution package folder; starting with README.md:

note: The file is named, README.md; I had to remove the extension to make it work with the embedded block above.

We create a file; LICENSE:

The package distribution configuration is supplied as a Python module; setup.py:

Things to observe:

  • The only functional difference between this and the official tutorial is the addition of the license keyword argument; this avoids having an empty license value in the uploaded distribution package
  • The principal stylistic difference is the sorting of the keyword arguments
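As a rough sketch of what those two observations amount to, the keyword arguments might look like the following. Every value here is a placeholder or an assumption (including the MIT license value and the url), and the arguments are held in a dictionary so that the snippet can be run without setuptools installed.

```python
# Keyword arguments for setuptools.setup(), sorted by name.
# All values are placeholders or assumptions, not the actual project metadata.
SETUP_KWARGS = {
    "author": "Example Author",
    "author_email": "author@example.com",
    "classifiers": [
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    "description": "A small example package",
    "license": "MIT",  # the added keyword; the value is an assumption
    "long_description": "...",  # setup.py would read this from README.md
    "long_description_content_type": "text/markdown",
    "name": "examplepkglarkintuckerllc",
    "packages": ["examplepkglarkintuckerllc"],
    "python_requires": ">=3.6",
    "url": "https://example.com/examplepkglarkintuckerllc",  # placeholder
    "version": "0.0.1",
}

# In setup.py these would be passed on as: setuptools.setup(**SETUP_KWARGS)
```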

Generating and Uploading Distribution Archives

Before we build and upload our distribution package, we ensure we have the most up-to-date related build tools:

$ pip install --upgrade setuptools wheel twine

note: Because I am using pyenv, which manages my python version, I can use a simplified syntax for installing python distribution packages.

Now we build the distribution package archive:

$ python setup.py sdist bdist_wheel

The build artifacts, not checked into source control, are three folders:

  • build
  • dist
  • examplepkglarkintuckerllc.egg-info

Assuming that we already created an account on TestPyPi, we upload the distribution package archive:

$ twine upload --repository testpypi dist/*

At this point, our distribution package is available to be installed from TestPyPi.

Installing Your Newly Uploaded Package

To illustrate using our package, we create a new folder and execute the following command in it:

$ pipenv install --pypi-mirror https://test.pypi.org/simple/ examplepkglarkintuckerllc

This command downloads the distribution package and installs the single import package that it contains. Here pipenv installs the distribution package into a virtual environment that is specific to this folder.

One point of confusion here is that by installing the distribution package, we actually observe that our virtual environment’s site-packages folder contains the import package (also named examplepkglarkintuckerllc) delivered by the distribution package.
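The relationship between an installed distribution package and the import packages it delivers can be inspected with importlib.metadata (part of the standard library since Python 3.8). The sketch below assumes the distribution recorded a top_level.txt file, which setuptools-built archives typically do but other build tools may not; the function name import_package_names is my own.

```python
from importlib import metadata
from typing import List, Optional


def import_package_names(distribution_name: str) -> Optional[List[str]]:
    """Return the top-level import names recorded for an installed distribution.

    Returns None if the distribution is not installed or did not record a
    top_level.txt file.
    """
    try:
        dist = metadata.distribution(distribution_name)
    except metadata.PackageNotFoundError:
        return None
    top_level = dist.read_text("top_level.txt")
    if top_level is None:
        return None
    return top_level.split()


if __name__ == "__main__":
    # Prints None unless the distribution is installed in this environment.
    print(import_package_names("examplepkglarkintuckerllc"))
```

Run inside the virtual environment (for example with pipenv run python), this would list the import package names that the distribution package placed in site-packages.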

In the newly created folder, we now create a simple Python module that uses our import package; example.py:

Finally, we run our module:

$ pipenv run python example.py
Hello World!

Updating Distribution Packages

To illustrate updating our distribution package, we can simply change the version (e.g., to 0.0.2) in the setup.py module and build a new archive:

$ python setup.py sdist bdist_wheel

note: Because building a new archive does not delete the previous archive, one will want to regularly delete the dist folder before building a new archive.

and then upload it:

$ twine upload --repository testpypi dist/*
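Regarding the note above about stale archives, the cleanup can be scripted. The following is a sketch with a hypothetical helper name, clean_build_artifacts, that removes the three build artifact folders listed earlier if they exist.

```python
import shutil
from pathlib import Path
from typing import List


def clean_build_artifacts(project_dir: Path) -> List[str]:
    """Delete build, dist and *.egg-info folders; return the removed names."""
    targets = [project_dir / "build", project_dir / "dist"]
    targets.extend(project_dir.glob("*.egg-info"))
    removed = []
    for target in targets:
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target.name)
    return sorted(removed)
```

Calling clean_build_artifacts(Path(".")) from the distribution package folder before each build keeps twine from trying to upload archives of earlier versions again.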

Distribution Packages and Dependencies

Let's go through an example where we update our import package (in the distribution package) to depend on another package. From our distribution package folder, we execute:

$ pipenv install random-word

note: During the writing of this article, the random-word package started to fail; it did not return words from a backend API. If you are following along, it may or may not work. I did not want to change to a new package as I was done with the example code in the article.

In addition to creating a virtual environment for this folder, this command creates a Pipfile and Pipfile.lock in our distribution package folder to track the dependencies for code in this folder. It is important to note, however, that these files have nothing to do with our distribution package itself; they just happen to be in the same folder.

We update our code to use this dependency; examplepkglarkintuckerllc/example_module.py:
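A sketch of what the updated module might look like. The function names hello_world and get_random_word are hypothetical, and the use of the RandomWords class with its get_random_word method is an assumption based on that project's README; check it against the version you install.

```python
"""Example module, now with a function backed by the random-word package."""


def hello_world():
    """Return a fixed greeting."""
    return "Hello World!"


def get_random_word():
    """Return a random word using the third-party random-word package."""
    # Imported inside the function only so that this sketch can be loaded
    # without the dependency installed; a real module would more typically
    # place the import at the top of the file.
    from random_word import RandomWords

    # Assumed API: RandomWords().get_random_word() queries a backend service
    # and may return nothing when that service is unavailable.
    return RandomWords().get_random_word()
```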

Observe that this distribution package did not follow the recommendation of using a single-word name. This then led to the discrepancy between the distribution package name (random-word) and import package name (random_word).

For completeness, we can update the examplepkglarkintuckerllc/__main__.py file:

And from the distribution package folder, we can test our code with:

$ pipenv run python -m examplepkglarkintuckerllc
Hello World!
amenability

The problem, however, is that we have not informed our distribution package of this dependency. Luckily, setuptools provides a solution:

install_requires is a setuptools setup.py keyword that should be used to specify what a project minimally needs to run correctly. When the project is installed by pip, this is the specification that is used to install its dependencies.

— Python — install_requires vs requirements files

At first glance, we would expect to simply add a keyword parameter to our setup.py file as such:

install_requires=['random-word'],

Better yet (and highly recommended), we would pin a specific version:

install_requires=['random-word==1.0.4'],

We would expect that, like other dependency management tools (npm for Node.js comes to mind), pip would traverse the dependency tree and also install any dependencies that random-word has when we install our distribution package. Nope, or at least that was not my experience.

Looking at the setup.py in the source code for random-word, we can observe that it depends on the latest version of both the requests and nose distribution packages.

install_requires=[
    'requests', 'nose'
],

But these distribution packages have their own dependencies. This is painful.

Luckily, pip provides a solution where we can determine the full set of dependencies (including versions) that our project depends on.

$ pipenv run pip freeze > requirements.txt

Looking at the contents of this file, we can now properly update the install_requires keyword attribute in our setup.py file:

$ cat requirements.txt
certifi==2020.6.20
chardet==3.0.4
idna==2.10
nose==1.3.7
Random-Word==1.0.4
requests==2.24.0
urllib3==1.25.10

We can partially automate this process by updating our setup.py configuration file to parse the requirements.txt file:
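One way to do the parsing is sketched below. The helper name read_requirements is hypothetical; it simply returns the non-blank, non-comment lines of a pip freeze style file, which is the form install_requires accepts.

```python
import tempfile
from pathlib import Path
from typing import List


def read_requirements(path) -> List[str]:
    """Return the requirement specifiers found in a pip freeze style file."""
    requirements = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if line and not line.startswith("#"):
            requirements.append(line)
    return requirements


if __name__ == "__main__":
    # Demonstrate with a temporary file rather than the real requirements.txt.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "requirements.txt"
        sample.write_text("nose==1.3.7\nrequests==2.24.0\n")
        print(read_requirements(sample))
```

In setup.py the result would be passed as install_requires=read_requirements("requirements.txt"). One caveat: a source archive does not include requirements.txt unless it is listed in MANIFEST.in, so without that entry setup.py cannot find the file when the package is installed from the source archive.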

We can then bump the version (in setup.py) and build a new archive:

$ python setup.py sdist bdist_wheel

and then upload it to PyPi (not TestPyPi), which requires creating a separate account:

$ twine upload dist/*

The reason we use PyPi instead of TestPyPi is that, when we install our package using the pypi-mirror option with TestPyPi, pip attempts to find all the dependencies on TestPyPi. The specific versions of the dependencies we provided are unlikely to be on TestPyPi, and our distribution will fail to install. I lost hours troubleshooting this problem.

Installing Your Newly Uploaded Package (Again)

As before, we create a new folder and execute the following command in it:

$ pipenv install examplepkglarkintuckerllc

In the newly created folder, we now create a simple Python module that uses our import package; example.py:

Finally, we run our module:

$ pipenv run python example.py
Hello World!
amenability

note: Again, by the time I got to the end of this article, the random-word package started to fail (even where it worked earlier).

Conclusion

After hours of banging my head on the wall and writing an overly complex article, I finally believe I understand how to properly create Python distribution packages. Hope you do too.
