
What is the difference between the types <type 'numpy.string_'> and <type 'str'>?


numpy.string_ is the NumPy datatype used for arrays containing fixed-width byte strings. str, on the other hand, is a native Python type and cannot be used as a datatype for NumPy arrays*.

If you create a NumPy array containing strings, the array will use the numpy.string_ type (or the numpy.unicode_ type in Python 3). More precisely, the array will use a sub-datatype of np.string_:

>>> import numpy as np
>>> a = np.array(['abc', 'xy'])
>>> a
array(['abc', 'xy'], dtype='|S3')
>>> np.issubdtype(a.dtype, np.string_)
True

In this case the datatype is '|S3': the | indicates that byte order does not apply (each element is a sequence of single bytes), S denotes the byte-string type, and 3 indicates that each value in the array holds up to three bytes.
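The pieces of a string dtype can also be inspected programmatically. The following is a minimal sketch, assuming Python 3 and a NumPy version in which np.string_ is spelled np.bytes_ and np.unicode_ is spelled np.str_ (the older names were removed in NumPy 2.0). The variable names are only illustrative.

```python
import numpy as np

# Byte strings produce a fixed-width bytes dtype (kind "S").
a = np.array([b'abc', b'xy'])
kind = a.dtype.kind            # single-letter type code
itemsize = a.dtype.itemsize    # bytes reserved per element
is_bytes = np.issubdtype(a.dtype, np.bytes_)

# On Python 3, plain str literals produce a fixed-width unicode dtype (kind "U").
u = np.array(['abc', 'xy'])
u_kind = u.dtype.kind
u_itemsize = u.dtype.itemsize  # 4 bytes per character, 3 characters wide
is_text = np.issubdtype(u.dtype, np.str_)

print(a.dtype, kind, itemsize, is_bytes)
print(u.dtype, u_kind, u_itemsize, is_text)
```

The width is set by the longest element at creation time, which is why both arrays reserve room for three characters even though 'xy' has only two.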

One property that np.string_ and str share is immutability. Trying to increase the length of a Python str object will create a new object in memory. Similarly, if you want a fixed-width NumPy array to hold more characters, a new, larger array has to be created in memory.
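A short sketch of what the fixed width means in practice, assuming Python 3 and byte-string literals: assigning a value that is too long does not grow the array, it is silently truncated, and widening requires a new array (here created with astype, one of several ways to do it).

```python
import numpy as np

a = np.array([b'abc', b'xy'])        # each element holds at most 3 bytes

# Assigning a longer value does not widen the array; the value is cut to fit.
a[1] = b'longer'
truncated = a[1]                     # only the first 3 bytes survive

# To store longer values, build a new array with a wider dtype.
wider = a.astype('S6')               # astype returns a copy with 6-byte slots
wider[1] = b'longer'
shares = np.shares_memory(a, wider)  # the two arrays use separate buffers

print(truncated, wider[1], shares)
```

The original array a is left unchanged by astype; only the new array has room for the six-byte value.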


* Note that it is possible to create a NumPy object array (dtype=object) which contains references to Python str objects, but such arrays behave quite differently from normal arrays.
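For comparison, here is a minimal sketch of such an object array, assuming Python 3. Because each slot stores a reference to an ordinary Python object rather than a fixed-width buffer, there is no width limit and no truncation.

```python
import numpy as np

# dtype=object stores references to ordinary Python str objects.
obj = np.array(['abc', 'xy'], dtype=object)

# There is no fixed width, so longer strings fit without truncation.
obj[1] = 'a much longer string'

elem_type = type(obj[0])   # the element is a plain Python str
print(obj.dtype, elem_type, obj[1])
```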