Memory consumption of a list and set in Python

python-3.x


I think it's because of the inherent difference between a list and a set (or dict), i.e. the way in which the elements are stored.

A list is nothing but a collection of references to the original objects. Suppose you create 1000 integers: then 1000 integer objects are created, and the list only contains references to these objects.
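To see this concretely, here is a minimal sketch (byte counts assume a 64-bit CPython build and vary by version):

from sys import getsizeof

nums = list(range(1000))

# The list itself is just a header plus roughly 8 bytes per reference.
print(getsizeof(nums))                  # ~8056 bytes: 1000 pointers + header

# The 1000 integer objects live outside the list and are counted separately.
print(sum(getsizeof(n) for n in nums))  # total size of the int objects themselves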

On the other hand, a set or dictionary has to compute the hash value for these 1000 integers, and its memory grows with the number of buckets in its hash table, which is kept larger than the number of elements.
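You can observe this over-allocation directly: the reported size of a set plateaus between resizes and then jumps when the bucket table grows. A minimal sketch (exact byte counts vary by CPython version and platform):

from sys import getsizeof

# The size stays flat between resizes because buckets are allocated
# ahead of time, then jumps whenever the table grows.
for n in range(0, 25, 3):
    print(n, getsizeof(set(range(n))))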

For example: in both set and dict, the smallest table size is 8 by default (that is, if you are storing only 3 values, Python will still allocate 8 slots). On a resize, the number of buckets increases by 4x until we reach 50,000 elements, after which the size is increased by 2x. This gives the following possible sizes:

16, 64, 256, 1024, 4096, 16384, 65536, 131072, 262144, ...
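The sequence above can be reproduced with a short sketch of that growth rule (an approximation: it treats the 50,000-element threshold as roughly a 65,536-slot table, which is an assumption rather than CPython's exact resize condition):

def possible_sizes(count):
    """Yield the first `count` table sizes under the 4x-then-2x growth rule."""
    size = 16  # the first size after the initial 8-slot table is outgrown
    for _ in range(count):
        yield size
        # quadruple below the ~50,000-element threshold, then double
        size *= 4 if size < 65536 else 2

print(list(possible_sizes(9)))
# [16, 64, 256, 1024, 4096, 16384, 65536, 131072, 262144]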

Some examples (with getsizeof imported from sys):

In [26]: a = [i for i in range(60000)]

In [27]: b = {i for i in range(60000)}

In [30]: b1 = {i for i in range(100000)}

In [31]: a1 = [i for i in range(100000)]

In [32]: getsizeof(a)
Out[32]: 514568

In [33]: getsizeof(b)
Out[33]: 2097376

In [34]: getsizeof(a1)
Out[34]: 824464

In [35]: getsizeof(b1)
Out[35]: 4194528
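These set figures line up with the table sizes listed earlier. A back-of-envelope check, assuming each bucket stores an 8-byte key pointer plus an 8-byte cached hash (16 bytes per entry on a 64-bit build; the exact layout is a CPython implementation detail):

entry = 16                           # assumed bytes per slot: pointer + hash
header = 2097376 - 131072 * entry    # ~224 bytes of fixed per-set overhead

# 60,000 elements fit in a 131,072-slot table; 100,000 need 262,144 slots.
print(131072 * entry + header)       # 2097376, matches Out[33]
print(262144 * entry + header)       # 4194528, matches Out[35]

The list sizes, by contrast, grow roughly linearly: about 8 bytes per reference, plus a little over-allocation slack from the list comprehension.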

Answers: Yes, it is the internal structure of the set, i.e. the way it stores its elements, that consumes this much memory. And sys.getsizeof is correct; there's nothing wrong with using it here.

For a more detailed reference on list, set, and dict internals, see this chapter of High Performance Python.