Convert string to numpy array
You could read the characters as ASCII values and subtract 48 (the ASCII code of '0'). This should be the fastest way for large strings.
>>> np.fromstring("100110", np.int8) - 48
array([1, 0, 0, 1, 1, 0], dtype=int8)
Alternatively, you could convert the string to a list of integers first (in Python 3, wrap map in list(), since map returns an iterator rather than a list):
>>> np.array(list(map(int, "100110")))
array([1, 0, 0, 1, 1, 0])
Edit: I did some quick timing, and the first method is over 100x faster than converting the string to a list first.
Adding to the above answers, NumPy now gives a deprecation warning when you use fromstring:
DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead.
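Following that warning, the same trick can be written with frombuffer. One caveat: frombuffer needs a bytes-like object, so the string has to be encoded first. A minimal sketch:

```python
import numpy as np

# frombuffer requires bytes, so encode the ASCII digit string first;
# subtracting 48 maps the characters '0'/'1' to the integers 0/1
arr = np.frombuffer("100110".encode("ascii"), dtype=np.int8) - 48
print(arr)  # -> [1 0 0 1 1 0]
```

The result keeps the compact int8 dtype, just like the fromstring version.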
A better option is to use fromiter, which runs about twice as fast. This is what I got in a Jupyter notebook:
import numpy as np
mystr = "100110"
np.fromiter(mystr, dtype=int)
>> array([1, 0, 0, 1, 1, 0])

# Time comparison
%timeit np.array(list(mystr), dtype=int)
>> 3.5 µs ± 627 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit np.fromstring(mystr, np.int8) - 48
>> 3.52 µs ± 508 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit np.fromiter(mystr, dtype=int)
>> 1.75 µs ± 133 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
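As a small variant on the above (not from the timings shown here, just a sketch): np.fromiter also accepts a count argument, which lets NumPy preallocate the output array instead of growing it as it consumes the iterator. For a string, the length is known up front:

```python
import numpy as np

mystr = "100110"
# count=len(mystr) lets fromiter preallocate the result in one step
arr = np.fromiter(mystr, dtype=int, count=len(mystr))
print(arr)  # -> [1 0 0 1 1 0]
```

Whether this helps measurably for short strings is worth timing yourself; it mainly matters for long inputs.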