Error with numpy array calculations using int dtype (it fails to cast dtype to 64 bit automatically when needed)


Type casting and promotion in numpy is fairly complicated and occasionally surprising. This recent unofficial write-up by Sebastian Berg explains some of the nuances of the subject (mostly concentrating on scalars and 0d arrays).

Quoting from this document:

Python Integers and Floats

Note that python integers are handled exactly like numpy ones. They are, however, special in that they do not have a dtype associated with them explicitly. Value based logic, as described here, seems useful for python integers and floats to allow:

    arr = np.arange(10, dtype=np.int8)
    arr += 1
    # or:
    res = arr + 1
    res.dtype == np.int8

which ensures that no upcast (for example with higher memory usage) occurs.

(emphasis mine.)
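To see what "no upcast" means in practice, here is a minimal sketch. The array values are my own, picked so that the sums no longer fit in int8; the added Python integer itself fits in int8, so the result dtype stays int8 and the values wrap around silently.

```python
import numpy as np

# Illustrative values (not from the quoted document): both sums exceed 127.
arr = np.array([100, 120], dtype=np.int8)

# 100 fits in int8, so the array's dtype decides the result dtype.
res = arr + 100

print(res.dtype)  # int8
print(res)        # [-56 -36], i.e. 200 and 220 wrapped modulo 256
```

No warning is raised for this kind of array overflow, which is why it is easy to miss.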

See also Allan Haldane's gist suggesting C-style type coercion, linked from the previous document:

Currently, when two dtypes are involved in a binary operation numpy's principle is that "the output dtype's range covers the range of both input dtypes", and when a single dtype is involved there is never any cast.

(emphasis again mine.)
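Both halves of that principle can be checked directly. This is a small sketch with values of my own choosing: np.result_type reports the dtype that would result from combining two dtypes, and the last part shows that an operation involving only int32 stays int32 even when the result overflows.

```python
import numpy as np

# Two different dtypes: the result covers the range of both inputs.
print(np.result_type(np.int8, np.int16))    # int16
print(np.result_type(np.int32, np.uint32))  # int64

# A single dtype: never any cast, so the sum wraps around.
a = np.array([2**31 - 1], dtype=np.int32)
total = a + a
print(total.dtype, total)  # int32 [-2]
```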

So my understanding is that the promotion rules for numpy scalars and arrays differ, primarily because it's not feasible to check every element inside an array to determine whether casting can be done safely. Again from the former document:

Scalar based rules

Unlike arrays, where inspection of all values is not feasible, for scalars (and 0-D arrays) the value is inspected.
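One way to get a feel for what "the value is inspected" means is np.min_scalar_type, which returns the smallest dtype able to hold a given scalar value. This is only an illustration of the idea, on the assumption that it mirrors the kind of inspection the quote describes; the promotion actually applied in arithmetic depends on your numpy version.

```python
import numpy as np

# The dtype is chosen from the value itself, not from a declared type.
small = np.min_scalar_type(100)      # fits in 8 unsigned bits
negative = np.min_scalar_type(-100)  # needs a signed type
larger = np.min_scalar_type(300)     # too big for 8 bits

print(small, negative, larger)  # uint8 int8 uint16
```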

This means you can either use np.int64 from the start to be safe (on 64-bit Linux, dtype=int already gives you this on its own), or check the maximum value of your arrays before suspect operations and decide, case by case, whether you have to promote the dtype yourself. I understand that this might not be practical if you are doing a lot of calculations, but I don't believe there is a way around it given numpy's current type promotion rules.
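A rough sketch of the "check before you multiply" approach follows. The helper name promote_if_needed and the sample values are hypothetical (not part of numpy), and it assumes a non-empty integer array and a Python integer factor.

```python
import numpy as np

def promote_if_needed(arr, factor):
    """Hypothetical helper: upcast to int64 if arr * factor would not fit arr's dtype."""
    info = np.iinfo(arr.dtype)
    # Do the bound check with Python ints, which cannot overflow.
    lo = int(arr.min()) * factor
    hi = int(arr.max()) * factor
    lo, hi = min(lo, hi), max(lo, hi)
    if lo < info.min or hi > info.max:
        return arr.astype(np.int64)
    return arr

values = np.array([50_000, 60_000], dtype=np.int32)
factor = 100_000

naive = values * factor                            # stays int32, wraps silently
safe = promote_if_needed(values, factor) * factor  # int64, correct values

print(naive.dtype, naive)
print(safe.dtype, safe)
```

The check costs one pass over the array for min and max, so it is cheap next to the operation it guards, but it has to be repeated for each suspect operation.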