
HashMap get/put complexity


It depends on many things. It's usually O(1), with a decent hash which itself is constant time... but you could have a hash which takes a long time to compute, and if there are multiple items in the hash map which return the same hash code, get will have to iterate over them calling equals on each of them to find a match.

In the worst case, a HashMap has an O(n) lookup due to walking through all entries in the same hash bucket (e.g. if they all have the same hash code). Fortunately, that worst case scenario doesn't come up very often in real life, in my experience. So no, O(1) certainly isn't guaranteed - but it's usually what you should assume when considering which algorithms and data structures to use.
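To make that concrete, here's a minimal sketch of the degenerate case (CollidingKey and CollisionDemo are made-up names): a key type whose constant hashCode forces every entry into one bucket, so get() has to fall back to equals() on each entry in that bucket.

```java
import java.util.HashMap;
import java.util.Map;

// A key whose hashCode is deliberately constant: every instance lands in the
// same bucket, so lookups degrade from O(1) towards O(n) (modulo the JDK 8
// treeification described in the next paragraph).
final class CollidingKey {
    final int id;
    CollidingKey(int id) { this.id = id; }

    @Override public int hashCode() { return 1; }   // all keys collide
    @Override public boolean equals(Object o) {
        return o instanceof CollidingKey && ((CollidingKey) o).id == this.id;
    }
}

class CollisionDemo {
    public static void main(String[] args) {
        Map<CollidingKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 10_000; i++) {
            map.put(new CollidingKey(i), i);
        }
        // This walks the single overcrowded bucket, calling equals() as it goes.
        System.out.println(map.get(new CollidingKey(9_999)));
    }
}
```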

In JDK 8, HashMap has been tweaked so that if keys can be compared for ordering, then any densely-populated bucket is implemented as a tree, so that even if there are lots of entries with the same hash code, the complexity is O(log n). That can cause issues if you have a key type where equality and ordering are different, of course.
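As a sketch of that JDK 8 behaviour (ComparableKey is a made-up name): give the same pathological key type a compareTo that's consistent with equals, and a heavily collided bucket can be treeified, keeping lookups at O(log n).

```java
// Same degenerate hashCode as before, but Comparable: on JDK 8+, once a
// bucket grows past the treeify threshold it becomes a red-black tree ordered
// by compareTo, so lookups among colliding keys cost O(log n) rather than O(n).
final class ComparableKey implements Comparable<ComparableKey> {
    final int id;
    ComparableKey(int id) { this.id = id; }

    @Override public int hashCode() { return 1; }   // all keys still collide
    @Override public boolean equals(Object o) {
        return o instanceof ComparableKey && ((ComparableKey) o).id == this.id;
    }
    @Override public int compareTo(ComparableKey other) {
        // Ordering agrees with equality on 'id' - this is the case the
        // answer above warns about getting wrong.
        return Integer.compare(this.id, other.id);
    }
}
```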

And yes, if you don't have enough memory for the hash map, you'll be in trouble... but that's going to be true whatever data structure you use.


It has already been mentioned that hashmaps are O(n/m) on average, where n is the number of items and m is the number of buckets. It has also been mentioned that in principle the whole thing could collapse into a singly linked list with O(n) query time. (This all assumes that calculating the hash is constant time.)

However, what isn't often mentioned is that, with probability at least 1 - 1/n (so for 1000 items that's a 99.9% chance), the largest bucket won't hold more than O(log n) elements! Hence matching the average complexity of binary search trees. (And the constant is good; a tighter bound is (log n)*(m/n) + O(1).)
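For intuition, that claim follows from a standard balls-into-bins union bound (a sketch, assuming n items hashed uniformly into m = n buckets): the probability that some bucket receives at least k items is at most n * C(n,k) * (1/n)^k <= n * (e/k)^k, which drops below 1/n once k = c*log n for a suitable constant c.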

All that's required for this theoretical bound is that you use a reasonably good hash function (see Wikipedia: Universal Hashing; it can be as simple as a*x >> m), and of course that the person giving you the values to hash doesn't know how you have chosen your random constants.
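For illustration, here's a minimal sketch of that multiply-shift scheme (the class and method names are made up; only the a*x >> m idea comes from the paragraph above):

```java
import java.util.concurrent.ThreadLocalRandom;

// Multiply-shift hashing: multiply by a random odd constant, then keep the
// top 'bits' bits. Choosing 'a' at random (and keeping it secret from whoever
// supplies the keys) is what gives the universal-hashing guarantee.
final class MultiplyShiftHash {
    private final int a;     // random odd multiplier, fixed per table
    private final int bits;  // table has 2^bits buckets

    MultiplyShiftHash(int bits) {
        this.bits = bits;
        this.a = ThreadLocalRandom.current().nextInt() | 1; // force odd
    }

    int bucket(int x) {
        // a*x >> m, where the shift m = 32 - bits keeps the high-order bits
        return (a * x) >>> (Integer.SIZE - bits);
    }
}
```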

TL;DR: With very high probability, the worst-case get/put complexity of a hashmap is O(log n).


I'm not sure the default hashCode is the address - I read the OpenJDK source for hashCode generation a while ago, and I remember it being something a bit more complicated. Still not something that guarantees a good distribution, perhaps. However, that is to some extent moot, as few classes you'd use as keys in a hashmap use the default hashCode - they supply their own implementations, which ought to be good.

On top of that, what you may not know (again, this is based on reading the source - it's not guaranteed) is that HashMap stirs the hash before using it, to mix entropy from throughout the word into the bottom bits, which is where it's needed for all but the hugest hashmaps. That helps deal with hashes that specifically don't do that themselves, although I can't think of any common cases where you'd see that.
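For reference, the stirring step in the OpenJDK 8 source looks roughly like this (paraphrased from java.util.HashMap.hash; the method name here is mine, and the exact form is an implementation detail, not a guarantee):

```java
// Paraphrase of OpenJDK 8's HashMap.hash(Object): XOR the high 16 bits of the
// hashCode into the low 16 bits, because the bucket index is taken from the
// low-order bits and would otherwise ignore the upper half of the word.
static int stir(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
```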

Finally, what happens when the table is overloaded is that it degenerates into a set of parallel linked lists - performance becomes O(n). Specifically, the number of links traversed will on average be half the load factor (so at a load factor of 8, a successful lookup walks about 4 links on average).