sklearn doesn't have attribute 'datasets'


sklearn is a package. This answer said it very succinctly:

when you import a package, only variables/functions/classes in the __init__.py file of that package are directly visible, not sub-packages or modules.

datasets is a sub-package of sklearn. This is why this happens:

    In [1]: import sklearn

    In [2]: sklearn.datasets
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-2-325a2bfc35d0> in <module>()
    ----> 1 sklearn.datasets

    AttributeError: module 'sklearn' has no attribute 'datasets'

However, the reason why this works:

    In [3]: from sklearn import datasets

    In [4]: sklearn.datasets
    Out[4]: <module 'sklearn.datasets' from '/home/ethan/.virtualenvs/test3/lib/python3.5/site-packages/sklearn/datasets/__init__.py'>

is that when you load the sub-package datasets with from sklearn import datasets, the import system automatically binds it as an attribute in the namespace of the package sklearn. This is one of the lesser-known "traps" of the Python import system.
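You can reproduce this behaviour without scikit-learn at all. The sketch below builds a throwaway package in a temporary directory; the names demo_pkg and parts are made up for illustration and stand in for sklearn and datasets. It assumes nothing beyond the standard library.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package "demo_pkg" with an empty __init__.py and one
# sub-package "parts". Both names are hypothetical stand-ins for
# sklearn and sklearn.datasets.
root = Path(tempfile.mkdtemp())
(root / "demo_pkg" / "parts").mkdir(parents=True)
(root / "demo_pkg" / "__init__.py").write_text("")
(root / "demo_pkg" / "parts" / "__init__.py").write_text("VALUE = 42\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg

# Importing the parent package alone does not load the sub-package.
before = hasattr(demo_pkg, "parts")

# Importing the sub-package binds it as an attribute on the parent.
from demo_pkg import parts

after = hasattr(demo_pkg, "parts")
same_object = demo_pkg.parts is parts

print(before, after, same_object)
```

Writing import demo_pkg.parts instead of the from form has the same side effect on the parent package, which is why import sklearn.datasets is the usual fix for the error in the question.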

Also, note that if you look at the __init__.py for sklearn you will see 'datasets' as a member of __all__, but this only allows you to do:

    In [1]: from sklearn import *

    In [2]: datasets
    Out[2]: <module 'sklearn.datasets' from '/home/ethan/.virtualenvs/test3/lib/python3.5/site-packages/sklearn/datasets/__init__.py'>
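To see that __all__ on its own does not make the sub-package an attribute, here is a small sketch with another made-up package (star_pkg and parts are hypothetical names). Its __init__.py lists the sub-package in __all__ but never imports it. The star import is run through exec in a separate dictionary only so that the names it binds are easy to inspect.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package whose __init__.py names "parts" in __all__
# but does not import it. Both names are hypothetical.
root = Path(tempfile.mkdtemp())
(root / "star_pkg" / "parts").mkdir(parents=True)
(root / "star_pkg" / "__init__.py").write_text("__all__ = ['parts']\n")
(root / "star_pkg" / "parts" / "__init__.py").write_text("")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import star_pkg

listed = "parts" in star_pkg.__all__           # the name is listed...
visible_before = hasattr(star_pkg, "parts")    # ...but not loaded yet

# Run the star import in its own namespace and look at what it bound.
namespace = {}
exec("from star_pkg import *", namespace)
star_loaded = "parts" in namespace

# The star import loaded the sub-package, so it is now an attribute too.
visible_after = hasattr(star_pkg, "parts")

print(listed, visible_before, star_loaded, visible_after)
```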

One last point to note is that if you inspect either sklearn or datasets, you will see that although they are packages, their type is module. This is because every package is a module, but not every module is a package.
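A quick way to check this, sketched here with the standard library's json package rather than sklearn so it runs anywhere: both objects are instances of types.ModuleType, and what marks a package is that it also has a __path__ attribute.

```python
import types
import json          # a package: a directory with an __init__.py
import json.decoder  # a plain module inside that package

# Both the package and the plain module have the same type.
package_type = type(json).__name__
module_type = type(json.decoder).__name__
both_modules = isinstance(json, types.ModuleType) and isinstance(
    json.decoder, types.ModuleType
)

# Only the package carries a __path__ attribute.
package_has_path = hasattr(json, "__path__")
plain_module_has_path = hasattr(json.decoder, "__path__")

print(package_type, module_type, package_has_path, plain_module_has_path)
```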