Why deepcopy of list of integers returns the same integers in memory?



It was suggested in another answer that this may be due to the fact that Python interns small integers. While that statement is correct, it is not what causes this behaviour.

Let's have a look at what happens when we use bigger integers.

>>> from copy import deepcopy
>>> x = 1000
>>> x is deepcopy(x)
True

If we dig into the copy module, we find that calling deepcopy on an atomic value defers the call to the function _deepcopy_atomic.

def _deepcopy_atomic(x, memo):
    return x

So what is actually happening is that deepcopy does not copy an atomic value at all; it simply returns it.

For example, this is the case for int, float, str, function objects and more.
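A quick way to see this is to check object identity for several atomic types. This is a minimal sketch; the identity check succeeds precisely because _deepcopy_atomic returns its argument unchanged:

```python
from copy import deepcopy

# deepcopy hands atomic values to _deepcopy_atomic, which returns
# the argument itself -- so the "copy" is the very same object.
for value in (1000, 3.14, "hello", len):
    assert deepcopy(value) is value
print("all atomic values returned unchanged")
```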


The reason for this behavior is that Python optimizes small integers by caching them, so they do not actually live in different memory locations. Check out the id of 1 — it is always the same:

>>> x = 1
>>> y = 1
>>> id(x)
1353557072
>>> id(y)
1353557072
>>> a = [1, 2, 3, 4, 5]
>>> id(a[0])
1353557072
>>> import copy
>>> b = copy.deepcopy(a)
>>> id(b[0])
1353557072

Reference from the Integer Objects documentation:

The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object. So it should be possible to change the value of 1. I suspect the behaviour of Python in this case is undefined. :-)
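You can observe the cache boundary directly. This is a sketch of a CPython implementation detail, not guaranteed language behaviour; int("...") is used to force the integers to be constructed at runtime, since literals in the same code object may be shared by the compiler:

```python
# CPython keeps a cache of int objects for -5..256. Building the same
# value at runtime via int() avoids compile-time constant sharing.
a, b = int("256"), int("256")
print(a is b)   # True on CPython: both names refer to the cached 256

c, d = int("257"), int("257")
print(c is d)   # False on CPython: 257 is outside the cache
```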


Olivier Melançon's answer is the correct one if we take this as a mechanical question of how the deepcopy function call ends up returning references to the same int objects rather than copies of them. I'll take a step back and answer the question of why that is the sensible thing for deepcopy to do.

The reason we need to make copies of data structures - either deep or shallow copies - is so we can modify their contents without affecting the state of the original; or so we can modify the original while still keeping a copy of the old state. A deep copy is needed for that purpose when a data structure has nested parts which are themselves mutable. Consider this example, which multiplies every number in a 2D grid, like [[1, 2], [3, 4]]:

import copy

def multiply_grid(grid, k):
    new_grid = copy.deepcopy(grid)
    for row in new_grid:
        for i in range(len(row)):
            row[i] *= k
    return new_grid

Objects such as lists are mutable, so the operation row[i] *= k changes their state. Making a copy of the list is a way to defend against mutation; a deep copy is needed here to make copies of both the outer list and the inner lists (i.e. the rows), which are also mutable.
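To make the difference concrete, here is a minimal comparison of copy.copy and copy.deepcopy on a nested grid (illustrative sketch):

```python
import copy

grid = [[1, 2], [3, 4]]

shallow = copy.copy(grid)   # new outer list, but the rows are shared
shallow[0][0] = 99
print(grid[0][0])           # 99 -- mutating the shallow copy hit the original

grid = [[1, 2], [3, 4]]
deep = copy.deepcopy(grid)  # outer list and inner rows are all fresh objects
deep[0][0] = 99
print(grid[0][0])           # 1 -- the original is untouched
```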

But objects such as integers and strings are immutable, so their state cannot be modified. If an int object is 13 then it will stay 13, even if you multiply it by k; the multiplication results in a different int object. There is no mutation to defend against, and hence no need to make a copy.
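You can watch this happen with id (a sketch; the specific id values themselves are arbitrary):

```python
x = 13
before = id(x)
x *= 2                  # does not mutate 13; rebinds x to a new int, 26
assert x == 26
assert id(x) != before  # x now names a different object
print("multiplication produced a new int object")
```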

Interestingly, deepcopy doesn't make copies of tuples if their components are all immutable, but it does when they have mutable components:

>>> import copy
>>> x = ([1, 2], [3, 4])
>>> x is copy.deepcopy(x)
False
>>> y = (1, 2)
>>> y is copy.deepcopy(y)
True

The logic is the same: if an object is immutable but has nested components which are mutable, then a copy is needed to avoid mutation to the components of the original. But if the whole structure is completely immutable, there is no mutation to defend against and hence no need for a copy.
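A short illustration of that logic: deepcopy of the mixed tuple copies its inner lists too, so mutating the copy leaves the original alone, while a fully immutable tuple comes back as the same object:

```python
import copy

x = ([1, 2], [3, 4])
y = copy.deepcopy(x)
assert y is not x       # new tuple, because its components are mutable
y[0].append(99)
assert x[0] == [1, 2]   # the original's inner list is untouched

t = (1, 2)
assert copy.deepcopy(t) is t  # fully immutable: same object returned
print("ok")
```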