Sometimes it’s useful to define an associative array that maps a key to a list of elements. This comes up whenever you need to categorize things. For example, suppose we want to group a bunch of people by name. Each person has a unique id, but several people may share the same name.

In a situation like this we need a dictionary mapping strings to lists of identifiers, and if we see a name for the first time we want an empty list.

Simple and stupid way:

def insert(D, name, id):
    # First time we see this name: start from an empty list.
    if name not in D:
        l = []
        D[name] = l
    else:
        l = D[name]
    l.append(id)

foo = dict()
insert(foo, "Jack", 2)
insert(foo, "John", 3)
insert(foo, "John", 4)

What we get:

>>> foo
{'John': [3, 4], 'Jack': [2]}

A shorter way:

def insert(D, name, id):
    # Fall back to a fresh empty list when the name is missing.
    l = D.get(name, [])
    l.append(id)
    D[name] = l

Advantage: fewer lines of code;
Drawback: a new empty list is built on every call, even when the name is already in the dictionary and the default is thrown away.
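
To make the drawback visible, here is a minimal sketch written for Python 3. CountingList is a hypothetical helper, not part of any library; it is just a list subclass that counts how many times it gets instantiated, and it stands in for the plain [] used above.

```python
class CountingList(list):
    """Hypothetical list subclass that counts how many instances are built."""
    created = 0

    def __init__(self):
        super().__init__()
        CountingList.created += 1


def insert(D, name, id):
    # The default argument is evaluated before get() runs,
    # so a CountingList is built on every call, hit or miss.
    l = D.get(name, CountingList())
    l.append(id)
    D[name] = l


foo = {}
insert(foo, "Jack", 2)
insert(foo, "John", 3)
insert(foo, "John", 4)

print(foo)                   # {'Jack': [2], 'John': [3, 4]}
print(CountingList.created)  # 3 lists built for only 2 distinct names
```

The third list is created and immediately discarded, because "John" was already in the dictionary.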

Best way: Proudly Found Elsewhere

The Python standard library provides this, so we get the functionality for free:

from collections import defaultdict
...
D = defaultdict(list)
D["jack"].append(2)
D["john"].append(3)
D["john"].append(4)

What we get:

>>> D
defaultdict(<type 'list'>, {'john': [3, 4], 'jack': [2]})
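
Going back to the people-by-name example, the whole grouping can be sketched like this. It is written for Python 3; group_by_name and the people list of (id, name) pairs are names made up for illustration.

```python
from collections import defaultdict


def group_by_name(people):
    """Group (id, name) pairs into a mapping from name to list of ids."""
    groups = defaultdict(list)
    for person_id, name in people:
        # A missing name gets an empty list automatically.
        groups[name].append(person_id)
    return groups


people = [(2, "Jack"), (3, "John"), (4, "John")]
groups = group_by_name(people)

print(dict(groups))     # {'Jack': [2], 'John': [3, 4]}
print("Jim" in groups)  # False
```

One thing to keep in mind: indexing a missing key (groups["Jim"]) inserts an empty list for it, so use the in operator or get() when you only want to check.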

Bonus trick:

What if we need to assign a unique, incrementing identifier to each name in a set? Check this out:

import itertools
from collections import defaultdict
...
class Cnt:
    def __init__(self):
        # Python 2 only: relies on an old-style class and on the iterator's .next method.
        self.__call__ = itertools.count().next

idgen = defaultdict(Cnt())

We obtain this:

>>> idgen['foo']
0
>>> idgen['bar']
1
>>> idgen['foo']
0
>>> idgen['fubar']
2
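
On Python 3 the class-based version needs a small change, because __call__ has to be defined on the class rather than set on the instance. A possible sketch, keeping the Cnt name from above (the _counter attribute is just an illustrative choice):

```python
import itertools
from collections import defaultdict


class Cnt:
    def __init__(self):
        self._counter = itertools.count()

    def __call__(self):
        # Each call hands out the next unused integer.
        return next(self._counter)


idgen = defaultdict(Cnt())

print(idgen["foo"])    # 0
print(idgen["bar"])    # 1
print(idgen["foo"])    # 0
print(idgen["fubar"])  # 2
```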

Oh! Actually we can obtain the same result in a shorter (and maybe cleaner) way:

>>> idgen = defaultdict(itertools.count().next)
>>> idgen['foo']
0
>>> idgen['bar']
1
>>> idgen['foo']
0
>>> idgen['fubar']
2
>>> idgen
defaultdict(<method-wrapper 'next' of itertools.count object at 0xa815a8>,
{'foo': 0, 'bar': 1, 'fubar': 2})
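
The session above is Python 2. On Python 3 the iterator method is spelled __next__ instead of next, so the same trick can be sketched as follows:

```python
import itertools
from collections import defaultdict

# The bound __next__ of a counter takes no arguments, which is exactly
# what defaultdict expects from its default factory.
idgen = defaultdict(itertools.count().__next__)

print(idgen["foo"])    # 0
print(idgen["bar"])    # 1
print(idgen["foo"])    # 0
print(idgen["fubar"])  # 2
print(dict(idgen))     # {'foo': 0, 'bar': 1, 'fubar': 2}
```

Ids are handed out in order of first lookup, so a name keeps the number it got the first time it was seen.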