How does collections.defaultdict work?


Usually, a Python dictionary raises a KeyError if you try to get an item with a key that is not currently in the dictionary. The defaultdict, in contrast, will simply create any item that you try to access with d[key] (provided, of course, it does not exist yet). To create such a "default" item, it calls the function object that you pass to the constructor, with no arguments (more precisely, it's an arbitrary "callable" object, which includes function and type objects). For the first example, default items are created using int(), which returns the integer object 0. For the second example, default items are created using list(), which returns a new empty list object.
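A minimal sketch of that behaviour, written for Python 3 (the variable names are illustrative only). It also shows a detail worth knowing: only subscript access, d[key], triggers the factory; .get() and the in operator leave the dictionary unchanged.

```python
from collections import defaultdict

counts = defaultdict(int)    # missing keys get int() == 0
groups = defaultdict(list)   # missing keys get list() == []

first = counts["a"]          # the key "a" is created on access
groups["x"].append(1)        # the new list is stored, then appended to

looked_up = counts.get("b")  # .get() does not call the factory
has_c = "c" in counts        # membership tests do not create keys either

print(first, dict(counts), dict(groups), looked_up, has_c)
```

After this runs, counts holds only the key "a" and groups holds only the key "x"; neither "b" nor "c" was added.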


defaultdict means that if a key is not found in the dictionary, then instead of a KeyError being raised, a new entry is created. The value of this new entry is produced by calling the argument of defaultdict.

For example:

from collections import defaultdict

somedict = {}
print(somedict[3])  # KeyError

someddict = defaultdict(int)
print(someddict[3])  # prints int(), thus 0


defaultdict

"The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default (value to be returned) up front when the container is initialized."

as defined by Doug Hellmann in The Python Standard Library by Example
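To make the contrast in the quote concrete, here is a small sketch that builds the same grouping both ways. The sample data and variable names are made up for illustration.

```python
from collections import defaultdict

pairs = [("a", 1), ("b", 2), ("a", 3)]

# Plain dict: the default has to be supplied at every call site.
plain = {}
for key, value in pairs:
    plain.setdefault(key, []).append(value)

# defaultdict: the default is specified once, when the container is created.
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)

print(plain)
print(dict(grouped))
```

Both loops end up with the same contents; the difference is only where the default is declared.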

How to use defaultdict

Import defaultdict

>>> from collections import defaultdict

Initialize defaultdict

Initialize it by passing

a callable as its first argument (the default_factory; if it is omitted, default_factory is None and missing keys raise KeyError as usual)

>>> d_int = defaultdict(int)
>>> d_list = defaultdict(list)
>>> def foo():
...     return 'default value'
...
>>> d_foo = defaultdict(foo)
>>> d_int
defaultdict(<type 'int'>, {})
>>> d_list
defaultdict(<type 'list'>, {})
>>> d_foo
defaultdict(<function foo at 0x7f34a0a69578>, {})

keyword arguments (**kwargs) after the callable (optional); they become the initial items, just as with dict()

>>> d_int = defaultdict(int, a=10, b=12, c=13)
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12})

or

>>> kwargs = {'a':10,'b':12,'c':13}
>>> d_int = defaultdict(int, **kwargs)
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12})
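Everything after the first argument is handed to the dict constructor, so a mapping or an iterable of key/value pairs should work as well as keyword arguments. A short Python 3 sketch (the names are illustrative only):

```python
from collections import defaultdict

# Initial items from a mapping.
from_mapping = defaultdict(int, {"a": 10, "b": 12})

# Initial items from an iterable of (key, value) pairs.
from_pairs = defaultdict(list, [("x", [1]), ("y", [2])])

# A mapping and keyword arguments combined, as dict() allows.
combined = defaultdict(int, {"a": 10}, b=12)

# No factory at all: default_factory is None, so this acts like a plain dict.
no_factory = defaultdict()

print(from_mapping, from_pairs, combined, no_factory.default_factory)
```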

How does it work

As defaultdict is a subclass of the standard dict, it supports all the same operations.

But when you look up an unknown key with d[key], it stores and returns the default value instead of raising an error. For example:

>>> d_int['a']
10
>>> d_int['d']
0
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12, 'd': 0})
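Under the hood, this relies on the __missing__ hook that dict lookups call when a key is absent. The class below is a simplified, hypothetical stand-in (SketchDefaultDict is not a real library class, and the real defaultdict is implemented in C), meant only to show the mechanism:

```python
class SketchDefaultDict(dict):
    """Hypothetical, simplified stand-in for collections.defaultdict."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict.__getitem__ calls this hook when the key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()
        self[key] = value  # store it, so the key exists from now on
        return value


d = SketchDefaultDict(int, a=10)
print(d["a"], d["d"], d)
```

Because the hook is only reached through d[key], methods such as .get() never create entries.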

To change the default value, overwrite default_factory:

>>> d_int.default_factory = lambda: 1
>>> d_int['e']
1
>>> d_int
defaultdict(<function <lambda> at 0x7f34a0a91578>, {'a': 10, 'c': 13, 'b': 12, 'e': 1, 'd': 0})

or

>>> def foo():
...     return 2
...
>>> d_int.default_factory = foo
>>> d_int['f']
2
>>> d_int
defaultdict(<function foo at 0x7f34a0a0a140>, {'a': 10, 'c': 13, 'b': 12, 'e': 1, 'd': 0, 'f': 2})
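Since default_factory is an ordinary writable attribute, it can also be set to None to switch automatic creation off again, for instance once the dictionary has been filled. A small Python 3 sketch (the key names are made up):

```python
from collections import defaultdict

d = defaultdict(int)
d["seen"] += 1

d.default_factory = None  # disable automatic creation of missing keys

try:
    d["unseen"]
    raised = False
except KeyError:
    raised = True

print(raised, dict(d))
```

Existing entries are untouched; only lookups of missing keys go back to raising KeyError.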

Examples in the Question

Example 1

As int has been passed as default_factory, any unknown key will return 0 by default.

As the loop iterates over the string, it increments the count for each letter in d.

>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> d.default_factory
<type 'int'>
>>> for k in s:
...     d[k] += 1
...
>>> d.items()
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]
>>> d
defaultdict(<type 'int'>, {'i': 4, 'p': 2, 's': 4, 'm': 1})
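The session above is from Python 2. For reference, the same count as a standalone Python 3 script might look like this; there, items() returns a view rather than a list and the repr shows <class 'int'>, so the sketch sorts the items to get a stable result:

```python
from collections import defaultdict

s = "mississippi"
d = defaultdict(int)
for ch in s:
    d[ch] += 1  # the first time a letter is seen, d[ch] starts at int() == 0

letter_counts = sorted(d.items())
print(letter_counts)
```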

Example 2

As list has been passed as default_factory, any unknown (non-existent) key will return [] (i.e. a new empty list) by default.

As the loop iterates over the list of tuples, it appends each value to the list stored under d[color].

>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> d.default_factory
<type 'list'>
>>> for k, v in s:
...     d[k].append(v)
...
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
>>> d
defaultdict(<type 'list'>, {'blue': [2, 4], 'red': [1], 'yellow': [1, 3]})
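A Python 3 version of the same grouping, as a rough sketch. The final conversion to a plain dict is an optional extra, not part of the original example; it can be handy when later code should get a KeyError for a missing color instead of silently creating an empty list.

```python
from collections import defaultdict

s = [("yellow", 1), ("blue", 2), ("yellow", 3), ("blue", 4), ("red", 1)]
d = defaultdict(list)
for color, number in s:
    d[color].append(number)  # a new [] is created the first time a color appears

result = dict(d)  # plain dict copy: missing keys raise KeyError again
print(sorted(result.items()))
```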