Find and replace multiple values in Python
Assuming that your val_old array is sorted (which is the case here, but if later on it's not, then don't forget to sort val_new along with it!), you can use numpy.searchsorted and then index val_new with the results. The session below names these two arrays old_val and new_val.
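This does not work if a number has no mapping: searchsorted returns an insertion position rather than an exact match, so an unmapped value silently picks up a neighbor's replacement or indexes past the end of val_new. In that case you have to provide a one-to-one mapping for every value that can occur, including identity mappings for values that should stay unchanged.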
    In [1]: import numpy as np

    In [2]: a = np.array([2, 3, 2, 5, 4, 4, 1, 2])

    In [3]: old_val = np.array([1, 2, 3, 4, 5])

    In [4]: new_val = np.array([2, 3, 4, 5, 1])

    In [5]: a_new = np.array([3, 4, 3, 1, 5, 5, 2, 3])

    In [6]: i = np.searchsorted(old_val, a)

    In [7]: a_replaced = new_val[i]

    In [8]: all(a_replaced == a_new)
    Out[8]: True
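If the old values are not sorted, or a may contain values with no mapping, the same idea can still be used with a little bookkeeping. The following is a minimal sketch rather than part of the session above: the function name replace_values is hypothetical, and it assumes old_val is non-empty and contains no duplicates.

```python
import numpy as np


def replace_values(a, old_val, new_val):
    """Replace each occurrence of old_val[k] in a with new_val[k].

    Values of a that do not appear in old_val are left unchanged.
    Hypothetical helper; assumes old_val is non-empty with no duplicates.
    """
    a = np.asarray(a)
    old_val = np.asarray(old_val)
    new_val = np.asarray(new_val)

    # Sort old_val and reorder new_val the same way so pairs stay aligned.
    order = np.argsort(old_val)
    old_sorted = old_val[order]
    new_sorted = new_val[order]

    # Insertion points; clip so that values larger than every old value
    # do not index past the end of the arrays.
    i = np.searchsorted(old_sorted, a)
    i = np.clip(i, 0, len(old_sorted) - 1)

    # A position is a real hit only if the old value found there equals a.
    found = old_sorted[i] == a
    return np.where(found, new_sorted[i], a)


a = np.array([2, 3, 2, 5, 4, 4, 1, 2, 9])
old_val = np.array([5, 1, 3, 2, 4])  # deliberately unsorted
new_val = np.array([1, 2, 4, 3, 5])  # 5->1, 1->2, 3->4, 2->3, 4->5
result = replace_values(a, old_val, new_val)
print(result)  # the trailing 9 has no mapping and is kept as-is
```

The extra equality check costs one more pass over the array, but it turns a silent wrong answer into a pass-through for unmapped values.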
50k numbers? No problem!
    In [22]: import time

    In [23]: def timed():
       ....:     t0 = time.time()
       ....:     i = np.searchsorted(old_val, a)
       ....:     a_replaced = new_val[i]
       ....:     t1 = time.time()
       ....:     print('%s Seconds' % (t1 - t0))
       ....:

    In [24]: a = np.random.choice(old_val, 50000)

    In [25]: timed()
    0.00288081169128 Seconds
500k? You won't notice the difference!
    In [26]: a = np.random.choice(old_val, 500000)

    In [27]: timed()
    0.019248008728 Seconds
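A single time.time() measurement can be noisy, so it may be worth repeating the measurement. The sketch below is one possible way to do that with the standard timeit module; the helper name replace and the repeat counts are arbitrary choices for illustration, and the numbers it prints will differ from machine to machine.

```python
import timeit

import numpy as np

old_val = np.array([1, 2, 3, 4, 5])
new_val = np.array([2, 3, 4, 5, 1])


def replace(a):
    # Same two steps as in the session: look up positions, then index.
    return new_val[np.searchsorted(old_val, a)]


timings = {}
for n in (50000, 500000):
    a = np.random.choice(old_val, n)
    # Best of 5 repeats, 10 calls each, reported as seconds per call.
    best = min(timeit.repeat(lambda: replace(a), number=10, repeat=5)) / 10
    timings[n] = best
    print('%d elements: %.6f seconds per call' % (n, best))
```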
In vanilla Python, without the speed of numpy or pandas, this is one way:
    a = [2, 3, 2, 5, 4, 4, 1, 2]
    val_old = [1, 2, 3, 4, 5]
    val_new = [2, 3, 4, 5, 1]
    expected_a_new = [3, 4, 3, 1, 5, 5, 2, 3]

    # Map each old value to its replacement.
    d = dict(zip(val_old, val_new))
    # Values without a mapping are kept unchanged by d.get(e, e).
    a_new = [d.get(e, e) for e in a]

    print(a_new)                    # [3, 4, 3, 1, 5, 5, 2, 3]
    print(a_new == expected_a_new)  # True
The average time complexity for this algorithm is O(M + N), where M is the length of your "translation list" and N is the length of list a. Building the dictionary takes O(M), and each of the N lookups is O(1) on average.
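To make the two cost terms visible, the dictionary approach can be wrapped in a small function. This is only a sketch: the name replace_all is hypothetical, and the extra 9 in the input is added here to show that values with no mapping pass through unchanged.

```python
def replace_all(values, val_old, val_new):
    """Hypothetical helper wrapping the dict-based replacement."""
    # Building the lookup table touches each of the M pairs once: O(M).
    mapping = dict(zip(val_old, val_new))
    # One average O(1) lookup per element of values: O(N) in total.
    # mapping.get(v, v) returns v itself when v has no mapping.
    return [mapping.get(v, v) for v in values]


a = [2, 3, 2, 5, 4, 4, 1, 2, 9]
result = replace_all(a, [1, 2, 3, 4, 5], [2, 3, 4, 5, 1])
print(result)  # [3, 4, 3, 1, 5, 5, 2, 3, 9]
```

Unlike the searchsorted version, this needs neither sorted input nor a complete mapping, at the cost of a Python-level loop.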