What hash algorithm does Python's dictionary mapping use?

python hashmap

The hash that it uses depends on the object being used as a key -- each class can define its own __hash__() method, and the value that it returns for a particular instance is what is used for the dictionary.

Python itself provides the hash implementation for str and tuple types. A quick look at the source should reveal the exact algorithm for those.

The hash of a tuple is based on the hashes of its contents. The algorithm is essentially this (simplified slightly):

def hash(tuple):    mult = 1000003    x = 0x345678    for index, item in enumerate(tuple):        x = ((x ^ hash(item)) * mult) & (1<<32)        mult += (82520 + (len(tuple)-index)*2)    return x + 97531

For strings, the interpreter also iterates over every character, combining them with this (again, slightly simplified) algorithm:

def hash(string):    x = string[0] << 7    for chr in string[1:]:        x = ((1000003 * x) ^ chr) & (1<<32)    return x

A bigger issue to worry about is avoiding hash collisions. Colliding hash keys will cause a linear search as the dictionary tries to find a place to store the new object (this is now being recognized as a security issue, and tha behavior may be changing in upcoming python versions)

CodeHunter

What hash algorithm does Python's dictionary mapping use?

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last