Extreme Python

How python absolute imports work

by Alan

Python supports two types of imports: absolute and relative.

import X is the form for an absolute import. X is either a single identifier like foo or a dot path like foo.bar.

from X import Y is the form for both absolute and relative imports. If X has a leading dot, e.g., .foo.bar, it’s a relative import; if there is no leading dot, it’s an absolute import.

Let’s look at some concrete examples using a sample directory:

app/
	main.py
	module_one.py
	package_one/
		__init__.py
		package_module_one.py

In main.py, we can include any of the following correct import statements:
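Here’s a sketch of what those statements can look like. It assumes module_one.py defines the names greeting and foobar (the names used in the notes below) and that package_module_one.py defines a made-up name, package_greeting. So that the snippet can run on its own, it first recreates the sample app/ tree in a temporary directory and puts it at the front of sys.path, which is roughly what running python app/main.py does for you. Only the import lines at the end would actually go in main.py.

```python
import os
import sys
import tempfile

# Recreate the sample app/ tree in a temporary directory.
# greeting, foobar and package_greeting are assumed names for illustration.
app = os.path.join(tempfile.mkdtemp(), "app")
os.makedirs(os.path.join(app, "package_one"))
files = {
    "module_one.py": "greeting = 'hello'\nfoobar = 42\n",
    os.path.join("package_one", "__init__.py"): "",
    os.path.join("package_one", "package_module_one.py"):
        "package_greeting = 'hi from the package'\n",
}
for relative_path, body in files.items():
    with open(os.path.join(app, relative_path), "w") as fh:
        fh.write(body)

# Emulate `python app/main.py`, which puts app/ first on sys.path.
sys.path.insert(0, app)

# --- statements that could appear in main.py ---
import module_one
from module_one import greeting
from module_one import greeting, foobar
# from module_one import *   (wildcard form: works, but not recommended)
import package_one.package_module_one
from package_one import package_module_one
from package_one.package_module_one import package_greeting

print(greeting, foobar, package_greeting)
```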

A few things to point out:

  1. We never include .py in the import statement. Some languages are cool with the extension, but python imports only allow proper identifiers in the name of the thing being imported.
  2. There are no leading dots (e.g., from .package_one ...). Leading dots mean a relative import, which I’m not covering here and don’t really recommend for most cases.
  3. You can import names defined inside modules directly, either by specifying a single name (like greeting), several comma-separated names (like greeting, foobar), or everything at once with the wildcard *. I recommend being explicit about what you’re bringing in and avoiding wildcard imports.

Now that I’ve shown you examples of what absolute import statements look like and what they do, let’s dig a bit deeper into the import machinery.

The lookup path

When the dotted (or undotted) name following either import or from is evaluated, where exactly does python start looking?

Step 1: Is it in the module cache?

There’s no point in importing something that’s already been imported, so before python goes looking for what you want, it checks if it already has it.
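The module cache is exposed as the dictionary sys.modules, so you can watch this step happen. A minimal sketch, using json only as a convenient example module:

```python
import sys

import json  # first import: the module is loaded and stored in the cache

cached = sys.modules["json"]  # the cache is a plain dict keyed by module name

import json as json_again  # second import: answered straight from the cache

same_object = cached is json_again
print(same_object)
```

Both names point at the very same module object, which is why a module’s top-level code only runs once no matter how many files import it.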

Step 2: Is it a built-in module?

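Built-in modules are compiled directly into the interpreter, so python can provide them without searching any directory. sys is one of them, and the full list is available at sys.builtin_module_names. They are part of the python standard library (batteries included), but most of the standard library consists of ordinary files that are found through the search path in step 3.

Some modules are already loaded (and therefore already in the module cache by the time your code runs) at the start of every python program, such as sys and usually os. Others are available for import but are not loaded by default, such as json.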
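If you’re curious which category a module falls into, here’s a small sketch. It assumes CPython; other implementations may compile a different set of modules into the interpreter.

```python
import sys

import json

# Names of the modules compiled into this interpreter.
sys_is_builtin = "sys" in sys.builtin_module_names
json_is_builtin = "json" in sys.builtin_module_names

print("sys built-in: ", sys_is_builtin)
print("json built-in:", json_is_builtin)

# A module loaded from the search path normally records the file it came from;
# a built-in module has no file behind it.
print("sys has __file__: ", hasattr(sys, "__file__"))
print("json has __file__:", hasattr(json, "__file__"))
```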

Step 3: Finally, is it in our module search path?

The module search path is a list of directories that python looks for modules in. The actual list can be accessed at sys.path.

The list of directories in sys.path is made up of (in exact order):

1) Current directory or directory of program

It all depends on how you start the python program.

If you’re calling it inline like this:

python -c 'import sys; print(sys.path)'

Then the value will be '' (an empty string), which represents the current directory.

If you’re passing a script to the interpreter, then the value will be the directory of where that script is located.

So if you run python hey/there/main.py, then the value will be the absolute path to hey/there.
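Both cases can be checked with a short sketch that launches the same interpreter twice: once with -c and once with a script file. The hey/there/ directory is created inside a temporary directory purely for the demonstration, and python 3.7+ is assumed for subprocess.run’s capture_output argument.

```python
import os
import subprocess
import sys
import tempfile

show_first = "import sys; print(sys.path[0])"

# Case 1: inline code passed with -c
inline_first = subprocess.run(
    [sys.executable, "-c", show_first],
    capture_output=True, text=True, check=True,
).stdout.rstrip("\n")

# Case 2: a script file at <temp dir>/hey/there/main.py
script_dir = os.path.join(tempfile.mkdtemp(), "hey", "there")
os.makedirs(script_dir)
script = os.path.join(script_dir, "main.py")
with open(script, "w") as fh:
    fh.write(show_first + "\n")

script_first = subprocess.run(
    [sys.executable, script],
    capture_output=True, text=True, check=True,
).stdout.rstrip("\n")

print(repr(inline_first))  # first entry when using -c
print(script_first)        # first entry when running a script
```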

2) PYTHONPATH

The second value in the list will be the value of the PYTHONPATH environment variable if it exists.

PYTHONPATH='/usr/whattheshit' python -c 'import sys; print(sys.path)'

This will place /usr/whattheshit right after the current directory value ''.
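To check the ordering without touching your own shell environment, here’s a sketch that sets PYTHONPATH only for a child interpreter. The extra directory is a throwaway temporary one, and python 3.7+ is assumed.

```python
import json
import os
import subprocess
import sys
import tempfile

extra_dir = tempfile.mkdtemp()

# Copy the current environment and set PYTHONPATH for the child process only.
child_env = dict(os.environ, PYTHONPATH=extra_dir)

output = subprocess.run(
    [sys.executable, "-c", "import sys, json; print(json.dumps(sys.path[:2]))"],
    env=child_env, capture_output=True, text=True, check=True,
).stdout

first_two = json.loads(output)
print(first_two)  # the '' entry first, then the PYTHONPATH directory
```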

Here’s an example output using both inline execution and setting PYTHONPATH.

PYTHONPATH='/usr/whattheshit' python -c 'import sys; print(sys.path)'

[
	'',
	'/usr/whattheshit',
	'/usr/lib/python2.7',
	'/usr/lib/python2.7/plat-x86_64-linux-gnu',
	'/usr/lib/python2.7/lib-tk',
	'/usr/lib/python2.7/lib-old',
	'/usr/lib/python2.7/lib-dynload',
	'/usr/local/lib/python2.7/dist-packages',
	'/usr/lib/python2.7/dist-packages',
	'/usr/lib/python2.7/dist-packages/gtk-2.0'
]

You might be wondering what all that /usr/* stuff is at the end. You’ll find out next!

3) System python installation paths

That’s what all those /usr/* paths are. This depends on where python is installed on your platform and is where all of the core packages and files are located.

You already saw these paths in the output above, but here’s the exact list on my linux machine:

[
	'/usr/lib/python2.7',
	'/usr/lib/python2.7/plat-x86_64-linux-gnu',
	'/usr/lib/python2.7/lib-tk',
	'/usr/lib/python2.7/lib-old',
	'/usr/lib/python2.7/lib-dynload',
	'/usr/local/lib/python2.7/dist-packages',
	'/usr/lib/python2.7/dist-packages',
	'/usr/lib/python2.7/dist-packages/gtk-2.0'
]

I rarely ever have to pay any attention to these system paths. All you really have to know is that they’re there.
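If you ever do want to know where these live on your own machine, the standard sysconfig module can tell you. A small sketch; the exact directories will differ from the list above depending on your platform and python version.

```python
import sys
import sysconfig

paths = sysconfig.get_paths()
print("standard library:    ", paths["stdlib"])
print("third-party packages:", paths["purelib"])

# Entries of sys.path that sit under the interpreter's installation prefixes.
install_entries = [
    entry for entry in sys.path
    if entry.startswith((sys.prefix, sys.base_prefix))
]
for entry in install_entries:
    print(entry)
```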

Let’s bring it all together

All together, the search path (sys.path) is made up of:

  1. The current directory or the directory containing the script being executed
  2. PYTHONPATH environment variable
  3. Python system installation paths

Once it’s constructed based on the rules above, it remains the same throughout the lifetime of the program unless you modify it.

Sometimes you need to add a directory to the search path at runtime to get what you need, which I’ll explain below.

Common questions

What if I have a module with the same name as a builtin module?

This depends. If you’re doing import sys or from sys ... and sys is also a module you defined, python will still return the built-in one (because sys matches the name of a built-in, and it is already in the module cache). However, if your version of sys lives inside a package and you import it with a dotted path, python will find your version. For example, import test.sys loads your module, because no built-in module has the dotted name test.sys.
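Here’s a sketch of both situations. It creates a top-level sys.py and a package containing its own sys.py in a temporary directory. The package is called mypkg (a made-up name) rather than test, so it can’t be confused with the test package that ships with python, and marker is just an illustrative attribute.

```python
import importlib
import os
import sys
import tempfile


def write(path, text):
    """Create a small source file."""
    with open(path, "w") as fh:
        fh.write(text)


root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "mypkg"))
write(os.path.join(root, "sys.py"), "marker = 'top-level sys.py'\n")
write(os.path.join(root, "mypkg", "__init__.py"), "")
write(os.path.join(root, "mypkg", "sys.py"), "marker = 'mypkg/sys.py'\n")

# Put our directory first, just like the directory of a script would be.
sys.path.insert(0, root)
importlib.invalidate_caches()

import sys as plain_sys          # still the built-in module
import mypkg.sys as packaged_sys  # our own module, reached via its package

print(hasattr(plain_sys, "marker"))
print(packaged_sys.marker)
```

The top-level sys.py is never loaded even though its directory is first on the search path.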

When would you ever really need to change sys.path?

If you’re always running a top level program, you usually don’t have to.

For example:

app/
	main.py
	package_a/
	package_b/
	subdirectory_a/
		utility.py
	module_a.py

Running python main.py will put the directory containing the script (app/) at the front of the module search path, which gives main.py access to all the top level modules and packages.

For example, main.py can have import module_a because module_a is in app/.

However, what if import module_a was also in the file subdirectory_a/utility.py?

Well, first of all, executing python subdirectory_a/utility.py will add subdirectory_a to the search path (because the rule is that if you execute a script, the directory containing that script is added to the front of sys.path). But since app/ is not in our search path, we’re going to get a ModuleNotFoundError (a plain ImportError on python versions before 3.6).
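To see the failure for yourself, this sketch rebuilds a trimmed-down version of the tree in a temporary directory and runs utility.py as a script with the same interpreter. The contents of module_a.py are made up, and python 3.7+ is assumed.

```python
import os
import subprocess
import sys
import tempfile

app = os.path.join(tempfile.mkdtemp(), "app")
subdirectory = os.path.join(app, "subdirectory_a")
os.makedirs(subdirectory)

with open(os.path.join(app, "module_a.py"), "w") as fh:
    fh.write("value = 'loaded module_a'\n")
with open(os.path.join(subdirectory, "utility.py"), "w") as fh:
    fh.write("import module_a\n")

# Equivalent to: python subdirectory_a/utility.py
result = subprocess.run(
    [sys.executable, os.path.join(subdirectory, "utility.py")],
    capture_output=True, text=True,
)

print(result.returncode)                       # non-zero: the script crashed
print(result.stderr.strip().splitlines()[-1])  # last line of the traceback
```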

We have a few options here to get what we want:

  1. Move utility.py to the top level directory app/ (hey that’s cheating!)
  2. Modify sys.path to include the path to app/
  3. Set PYTHONPATH to the path to app/

Let’s take a look at option #2. We can do this in the script being run itself (utility.py here).

Example:

import sys

# 'path/to/app/' is a placeholder; use the real path to the app/ directory
sys.path.append('path/to/app/')
import module_a

Now our import will succeed!
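A hard-coded path only works on one machine. One variation (my own suggestion, not a requirement) is to derive the path to app/ from the location of utility.py via __file__. The sketch below writes such a utility.py into a temporary copy of the tree and runs it; the contents of module_a.py are made up, and python 3.7+ is assumed.

```python
import os
import subprocess
import sys
import tempfile

# Source for utility.py: compute app/ as the parent of this file's directory.
utility_source = """\
import os
import sys

app_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.append(app_dir)

import module_a
print(module_a.value)
"""

app = os.path.join(tempfile.mkdtemp(), "app")
subdirectory = os.path.join(app, "subdirectory_a")
os.makedirs(subdirectory)

with open(os.path.join(app, "module_a.py"), "w") as fh:
    fh.write("value = 'loaded module_a'\n")
with open(os.path.join(subdirectory, "utility.py"), "w") as fh:
    fh.write(utility_source)

# Equivalent to: python subdirectory_a/utility.py
result = subprocess.run(
    [sys.executable, os.path.join(subdirectory, "utility.py")],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())
```

Option #3 needs no code change at all: start the script with PYTHONPATH set to the path to app/ and the same import resolves.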
