Relative imports for the billionth time
Script vs. Module
Here's an explanation. The short version is that there is a big difference between directly running a Python file, and importing that file from somewhere else. Just knowing what directory a file is in does not determine what package Python thinks it is in. That depends, additionally, on how you load the file into Python (by running or by importing).
There are two ways to load a Python file: as the top-level script, or as a module. A file is loaded as the top-level script if you execute it directly, for instance by typing python myfile.py on the command line. It is loaded as a module when an import statement is encountered inside some other file. There can only be one top-level script at a time; the top-level script is the Python file you ran to start things off.
Naming
When a file is loaded, it is given a name (which is stored in its __name__ attribute). If it was loaded as the top-level script, its name is __main__. If it was loaded as a module, its name is the filename, preceded by the names of any packages/subpackages of which it is a part, separated by dots.
So for instance in your example:
    package/
        __init__.py
        subpackage1/
            __init__.py
            moduleX.py
        moduleA.py
If you imported moduleX (note: imported, not directly executed), its name would be package.subpackage1.moduleX. If you imported moduleA, its name would be package.moduleA. However, if you directly run moduleX from the command line, its name will instead be __main__, and if you directly run moduleA from the command line, its name will also be __main__. When a module is run as the top-level script, it loses its normal name and its name is instead __main__.
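The naming rule above can be checked with a small experiment. This is only a sketch: it builds a throwaway copy of the example layout in a temporary directory (the helper name observed_names is made up for illustration), makes moduleX print its own __name__, and then loads it once through an import and once as the top-level script.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def observed_names():
    """Return the __name__ moduleX reports when imported and when run directly."""
    with tempfile.TemporaryDirectory() as root:
        sub = Path(root, "package", "subpackage1")
        sub.mkdir(parents=True)
        Path(root, "package", "__init__.py").write_text("")
        (sub / "__init__.py").write_text("")
        (sub / "moduleX.py").write_text("print(__name__)\n")

        # Loaded as a module: imported through its package from the parent directory.
        imported = subprocess.run(
            [sys.executable, "-c", "import package.subpackage1.moduleX"],
            cwd=root, capture_output=True, text=True, check=True,
        ).stdout.strip()

        # Loaded as the top-level script: executed directly by file path.
        executed = subprocess.run(
            [sys.executable, str(sub / "moduleX.py")],
            cwd=root, capture_output=True, text=True, check=True,
        ).stdout.strip()
    return imported, executed


if __name__ == "__main__":
    print(observed_names())
```

The same file on disk reports two different names; only the way it was loaded differs.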
Accessing a module NOT through its containing package
There is an additional wrinkle: the module's name depends on whether it was imported "directly" from the directory it is in, or imported via a package. This only makes a difference if you run Python in a directory, and try to import a file in that same directory (or a subdirectory of it). For instance, if you start the Python interpreter in the directory package/subpackage1 and then do import moduleX, the name of moduleX will just be moduleX, and not package.subpackage1.moduleX. This is because Python adds the current directory to its search path when the interpreter is entered interactively; if it finds the to-be-imported module in the current directory, it will not know that that directory is part of a package, and the package information will not become part of the module's name.
A special case is if you run the interpreter interactively (e.g., just type python and start entering Python code on the fly). In this case the name of that interactive session is __main__.
Now here is the crucial thing for your error message: if a module's name has no dots, it is not considered to be part of a package. It doesn't matter where the file actually is on disk. All that matters is what its name is, and its name depends on how you loaded it.
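As a rough illustration of the "imported from its own directory" case, the sketch below starts Python inside package/subpackage1 and imports moduleX from there. It assumes that python -c behaves like an interactive session for this purpose (both put the current directory first on the search path); the helper name is hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def name_when_imported_from_own_directory():
    """Import moduleX with the interpreter started inside package/subpackage1."""
    with tempfile.TemporaryDirectory() as root:
        sub = Path(root, "package", "subpackage1")
        sub.mkdir(parents=True)
        Path(root, "package", "__init__.py").write_text("")
        (sub / "__init__.py").write_text("")
        (sub / "moduleX.py").write_text("print(__name__)\n")

        # The current directory is subpackage1 itself, so Python finds
        # moduleX.py there before it ever sees the enclosing package.
        result = subprocess.run(
            [sys.executable, "-c", "import moduleX"],
            cwd=sub, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(name_when_imported_from_own_directory())
```

The reported name has no dots, so as far as Python is concerned the module is not in any package, even though __init__.py files surround it on disk.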
Now look at the quote you included in your question:
Relative imports use a module's __name__ attribute to determine that module's position in the package hierarchy. If the module's name does not contain any package information (e.g. it is set to '__main__') then relative imports are resolved as if the module were a top level module, regardless of where the module is actually located on the file system.
Relative imports...
Relative imports use the module's name to determine where it is in a package. When you use a relative import like from .. import foo, the dots indicate to step up some number of levels in the package hierarchy. For instance, if your current module's name is package.subpackage1.moduleX, then ..moduleA would mean package.moduleA. For a from .. import to work, the module's name must have at least as many dots as there are in the import statement.
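The dot-counting can be tried out without creating any files by using importlib.util.resolve_name from the standard library, which turns a relative name into an absolute one. A minimal sketch: the second argument is the package the current module lives in (package.subpackage1 for a module named package.subpackage1.moduleX), and the names .moduleY and ...too_far are made up for illustration.

```python
from importlib.util import resolve_name

# The package containing a module named package.subpackage1.moduleX.
CURRENT_PACKAGE = "package.subpackage1"


def resolve(relative_name, package=CURRENT_PACKAGE):
    """Return the absolute module name, or None if the dots climb too far."""
    try:
        return resolve_name(relative_name, package)
    except (ImportError, ValueError):
        # Older Python 3 versions raise ValueError here, newer ones ImportError.
        return None


if __name__ == "__main__":
    for relative_name in (".moduleY", "..moduleA", "...too_far"):
        print(relative_name, "->", resolve(relative_name))
```

One dot stays inside package.subpackage1, two dots step up to package, and three dots would step above the top-level package, which is why the last lookup fails.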
... are only relative in a package
However, if your module's name is __main__, it is not considered to be in a package. Its name has no dots, and therefore you cannot use from .. import statements inside it. If you try to do so, you will get the "relative-import in non-package" error (in Python 3 the message reads "attempted relative import with no known parent package").
Scripts can't import relative
What you probably did is you tried to run moduleX or the like from the command line. When you did this, its name was set to __main__, which means that relative imports within it will fail, because its name does not reveal that it is in a package. Note that this will also happen if you run Python from the same directory where a module is, and then try to import that module, because, as described above, Python will find the module in the current directory "too early" without realizing it is part of a package.
Also remember that when you run the interactive interpreter, the "name" of that interactive session is always __main__. Thus you cannot do relative imports directly from an interactive session. Relative imports are only for use within module files.
Two solutions:
1. If you really do want to run moduleX directly, but you still want it to be considered part of a package, you can do python -m package.subpackage1.moduleX. The -m tells Python to load it as a module, not as the top-level script.
2. Or perhaps you don't actually want to run moduleX, you just want to run some other script, say myfile.py, that uses functions inside moduleX. If that is the case, put myfile.py somewhere else (not inside the package directory) and run it. If inside myfile.py you do things like from package.moduleA import spam, it will work fine.
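Both solutions, and the failure they avoid, can be reproduced in a scratch directory. The sketch below is an illustration under assumptions: the file contents (a spam string in moduleA, a relative import in moduleX) and the helper name run_three_ways are invented for the demo, and the example targets Python 3.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_three_ways():
    """Run moduleX as a script, run it with -m, and run an outside myfile.py."""
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root, "package")
        sub = pkg / "subpackage1"
        sub.mkdir(parents=True)
        (pkg / "__init__.py").write_text("")
        (pkg / "moduleA.py").write_text("spam = 'spam from moduleA'\n")
        (sub / "__init__.py").write_text("")
        (sub / "moduleX.py").write_text("from ..moduleA import spam\nprint(spam)\n")
        # myfile.py sits next to the package directory, not inside it.
        Path(root, "myfile.py").write_text(
            "from package.moduleA import spam\nprint(spam)\n"
        )

        def run(*args):
            return subprocess.run(
                [sys.executable, *args], cwd=root, capture_output=True, text=True
            )

        as_script = run(str(sub / "moduleX.py"))            # name is __main__: fails
        as_module = run("-m", "package.subpackage1.moduleX")  # solution 1
        outside = run("myfile.py")                           # solution 2
    return as_script, as_module, outside


if __name__ == "__main__":
    for result in run_three_ways():
        print(result.returncode, (result.stdout or result.stderr).strip()[-70:])
```

Only the direct script run fails; the -m run and the outside script both find package on the search path and import through it.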
Notes
- For either of these solutions, the package directory (package in your example) must be accessible from the Python module search path (sys.path). If it is not, you will not be able to use anything in the package reliably at all.
- Since Python 2.6, the module's "name" for package-resolution purposes is determined not just by its __name__ attribute but also by the __package__ attribute. That's why I'm avoiding using the explicit symbol __name__ to refer to the module's "name". Since Python 2.6 a module's "name" is effectively __package__ + '.' + __name__, or just __name__ if __package__ is None.
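To see the role of __package__ concretely, the following sketch makes moduleX print both attributes and runs it as a plain script and with -m. It again uses a temporary copy of the example layout, and the helper name name_and_package is hypothetical.

```python
import ast
import subprocess
import sys
import tempfile
from pathlib import Path


def name_and_package():
    """Return (__name__, __package__) for a script run and for a -m run."""
    with tempfile.TemporaryDirectory() as root:
        sub = Path(root, "package", "subpackage1")
        sub.mkdir(parents=True)
        Path(root, "package", "__init__.py").write_text("")
        (sub / "__init__.py").write_text("")
        (sub / "moduleX.py").write_text("print((__name__, __package__))\n")

        def run(*args):
            out = subprocess.run(
                [sys.executable, *args],
                cwd=root, capture_output=True, text=True, check=True,
            ).stdout
            # The child prints a tuple literal; parse it back into a tuple.
            return ast.literal_eval(out.strip())

        as_script = run(str(sub / "moduleX.py"))
        as_module = run("-m", "package.subpackage1.moduleX")
    return as_script, as_module


if __name__ == "__main__":
    print(name_and_package())
```

In both runs __name__ is __main__; what differs is __package__, which is set to package.subpackage1 only in the -m run, and that is what lets relative imports resolve there.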
This is really a problem within Python. The confusion arises because people mistakenly treat a relative import as relative to the file's path, which it is not.
For example when you write in faa.py:
from .. import foo
This has a meaning only if faa.py was identified and loaded by Python, during execution, as part of a package. In that case, the module's name for faa.py would be, for example, some_packagename.faa. If the file was loaded just because it is in the current directory when Python is run, then its name would not refer to any package and the relative import would fail.
A simple way to refer to modules in the current directory is to use this:
    if __package__ is None or __package__ == '':
        # uses current directory visibility
        import foo
    else:
        # uses current package visibility
        from . import foo
Here's a general recipe, modified to fit as an example, that I am using right now for dealing with Python libraries written as packages that contain interdependent files, where I want to be able to test parts of them piecemeal. Let's call this lib.foo and say that it needs access to lib.fileA for functions f1 and f2, and lib.fileB for class Class3.
I have included a few print calls to help illustrate how this works. In practice you would want to remove them (and maybe also the from __future__ import print_function line).
This particular example is too simple to show when we really need to insert an entry into sys.path. (See Lars' answer for a case where we do need it, when we have two or more levels of package directories, and then we use os.path.dirname(os.path.dirname(__file__)), but it doesn't really hurt here either.) It's also safe enough to do this without the if _i not in sys.path test. However, if each imported file inserts the same path (for instance, if both fileA and fileB want to import utilities from the package), this clutters up sys.path with the same path many times, so it's nice to have the if _i not in sys.path in the boilerplate.
    from __future__ import print_function  # only when showing how this works

    if __package__:
        print('Package named {!r}; __name__ is {!r}'.format(__package__, __name__))
        from .fileA import f1, f2
        from .fileB import Class3
    else:
        print('Not a package; __name__ is {!r}'.format(__name__))
        # these next steps should be used only with care and if needed
        # (remove the sys.path manipulation for simple cases!)
        import os, sys
        _i = os.path.dirname(os.path.abspath(__file__))
        if _i not in sys.path:
            print('inserting {!r} into sys.path'.format(_i))
            sys.path.insert(0, _i)
        else:
            print('{!r} is already in sys.path'.format(_i))
        del _i  # clean up global name space

        from fileA import f1, f2
        from fileB import Class3

    ... all the code as usual ...

    if __name__ == '__main__':
        import doctest, sys
        ret = doctest.testmod()
        sys.exit(0 if ret.failed == 0 else 1)
The idea here is this (and note that these all function the same across Python 2.7 and Python 3.x):
1. If run as import lib or from lib import foo as a regular package import from ordinary code, __package__ is lib and __name__ is lib.foo. We take the first code path, importing from .fileA, etc.
2. If run as python lib/foo.py, __package__ will be None and __name__ will be __main__. We take the second code path. The lib directory will already be in sys.path so there is no need to add it. We import from fileA, etc.
3. If run within the lib directory as python foo.py, the behavior is the same as for case 2.
4. If run within the lib directory as python -m foo, the behavior is similar to cases 2 and 3. However, the path to the lib directory is not in sys.path, so we add it before importing. The same applies if we run Python and then import foo.
(Since . is in sys.path, we don't really need to add the absolute version of the path here. This is where a deeper package nesting structure, where we want to do from ..otherlib.fileC import ..., makes a difference. If you're not doing this, you can omit all the sys.path manipulation entirely.)
Notes
There is still a quirk. If you run this whole thing from outside:
    $ python2 -m lib.foo
or:
    $ python3 -m lib.foo
the behavior depends on the contents of lib/__init__.py. If that exists and is empty, all is well:
    Package named 'lib'; __name__ is '__main__'
But if lib/__init__.py itself imports foo so that it can export foo.name directly as lib.name, you get:
    $ python2 -m lib.foo
    Package named 'lib'; __name__ is 'lib.foo'
    Package named 'lib'; __name__ is '__main__'
That is, the module gets imported twice, once via the package and then again as __main__ so that it runs your main code. Python 3.6 and later warn about this:
    $ python3 -m lib.foo
    Package named 'lib'; __name__ is 'lib.foo'
    [...]/runpy.py:125: RuntimeWarning: 'lib.foo' found in sys.modules
    after import of package 'lib', but prior to execution of 'lib.foo';
    this may result in unpredictable behaviour
      warn(RuntimeWarning(msg))
    Package named 'lib'; __name__ is '__main__'
The warning is new, but the warned-about behavior is not. It is part of what some call the double import trap. (For additional details see issue 27487.) Nick Coghlan says:
This next trap exists in all current versions of Python, including 3.3, and can be summed up in the following general guideline: "Never add a package directory, or any directory inside a package, directly to the Python path".
Note that while we violate that rule here, we do it only when the file being loaded is not being loaded as part of a package, and our modification is specifically designed to allow us to access other files in that package. (And, as I noted, we probably shouldn't do this at all for single level packages.) If we wanted to be extra-clean, we might rewrite this as, e.g.:
    import os, sys
    _i = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    if _i not in sys.path:
        sys.path.insert(0, _i)
    else:
        _i = None

    from sub.fileA import f1, f2
    from sub.fileB import Class3

    if _i:
        sys.path.remove(_i)
    del _i
That is, we modify sys.path long enough to achieve our imports, then put it back the way it was (deleting one copy of _i if and only if we added one copy of _i).
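The same add-then-restore idea could also be wrapped in a context manager, so the cleanup runs even if one of the imports raises. This is one possible packaging of the pattern, not part of the recipe above; the name temporary_sys_path is invented here.

```python
import sys
from contextlib import contextmanager


@contextmanager
def temporary_sys_path(directory):
    """Put directory at the front of sys.path for the duration of the block.

    The entry is removed afterwards only if this call added it, so a path
    that was already present is left untouched.
    """
    added = directory not in sys.path
    if added:
        sys.path.insert(0, directory)
    try:
        yield
    finally:
        # Remove our copy, unless the body of the block already removed it.
        if added and directory in sys.path:
            sys.path.remove(directory)
```

Inside a module, the imports from the snippet above would then go in the body of a with temporary_sys_path(...) block, with the parent directory computed from __file__ exactly as before.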