Shuffling non-zero elements of each row in an array - Python / NumPy


You could use the non-inplace numpy.random.permutation with explicit non-zero indexing:

    >>> import numpy as np
    >>> X = np.array([[2,3,1,0], [0,0,2,1]])
    >>> for i in range(len(X)):
    ...     idx = np.nonzero(X[i])
    ...     X[i][idx] = np.random.permutation(X[i][idx])
    ...
    >>> X
    array([[3, 2, 1, 0],
           [0, 0, 2, 1]])
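If you want the loop as a reusable function, here is a minimal sketch. The function name `shuffle_nonzeros_loop` and the `seed` parameter are my own additions for illustration, and it uses `np.random.default_rng` (NumPy 1.17 or later) instead of the legacy `np.random.permutation`. It works on a copy, so the input is left untouched.

```python
import numpy as np

def shuffle_nonzeros_loop(X, seed=None):
    """Return a copy of X with the non-zero entries of each row shuffled."""
    rng = np.random.default_rng(seed)   # seed is a hypothetical convenience parameter
    out = X.copy()
    for row in out:                     # each row is a view into out
        idx = np.nonzero(row)[0]        # column positions of the non-zero entries
        row[idx] = rng.permutation(row[idx])
    return out

X = np.array([[2, 3, 1, 0], [0, 0, 2, 1]])
Y = shuffle_nonzeros_loop(X, seed=0)

# Zeros keep their positions and every row keeps the same values
assert ((X == 0) == (Y == 0)).all()
assert (np.sort(X, axis=1) == np.sort(Y, axis=1)).all()
```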


I think I found the three-liner?

    i, j = np.nonzero(a.astype(bool))
    k = np.argsort(i + np.random.rand(i.size))
    a[i,j] = a[i,j[k]]
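Here is my reading of why this works, as a small self-checking sketch (the sample array and the `original` name are just for illustration). `np.nonzero` returns indices in row-major order, so `i` is non-decreasing. Adding a random fraction in [0, 1) to each row index cannot push an entry past the next integer, so the argsort only reorders entries within the same row. I call `np.nonzero(a)` directly here, which should be equivalent to going through `astype(bool)`.

```python
import numpy as np

a = np.array([[8, 5, 0, -4],
              [0, 6, 0, 3],
              [8, 5, 0, -4]])
original = a.copy()

i, j = np.nonzero(a)                        # row-major order, so i is non-decreasing
k = np.argsort(i + np.random.rand(i.size))  # jitter in [0, 1) never crosses a row boundary

# k only reorders positions that share a row index
assert (i[k] == i).all()

a[i, j] = a[i, j[k]]                        # right-hand side is evaluated before the assignment

# Zeros stay where they were, and each row keeps the same values
assert ((a == 0) == (original == 0)).all()
assert (np.sort(a, axis=1) == np.sort(original, axis=1)).all()
```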


As promised, this being the fourth day of the bounty period, here's my attempt at a vectorized solution. The steps involved are explained in some detail below:

  • For easy reference, let's call the input array a. Generate, for each row, a set of unique indices covering the range of the row length. To do this, we can simply generate random numbers of the same shape as the input array and take the argsort indices along each row; each row of the result is then a random permutation of the column indices. This idea has been explored before in this post.

  • Index into each row of the input array using those indices as column indices, which calls for advanced indexing. This gives us an array in which every row is shuffled. Let's call it b.

  • Since the shuffling is restricted to each row, b has the same number of non-zero elements per row as a. Boolean indexing returns elements in row-major order, so b[b!=0] gives the shuffled non-zero elements of the first row, then those of the second row, and so on. Likewise, a[a!=0] gives the non-zero elements row by row, but in their original order. So the final step is simply to grab the masked elements b[b!=0] and assign them into the masked places a[a!=0].

Thus, an implementation covering the three steps above would be -

    m,n = a.shape
    rand_idx = np.random.rand(m,n).argsort(axis=1) #step1
    b = a[np.arange(m)[:,None], rand_idx]          #step2
    a[a!=0] = b[b!=0]                              #step3

A sample step-by-step run might make things clearer -

    In [50]: a # Input array
    Out[50]:
    array([[ 8,  5,  0, -4],
           [ 0,  6,  0,  3],
           [ 8,  5,  0, -4]])

    In [51]: m,n = a.shape # Store shape information

    # Unique indices per row that covers the range for row length
    In [52]: rand_idx = np.random.rand(m,n).argsort(axis=1)

    In [53]: rand_idx
    Out[53]:
    array([[0, 2, 3, 1],
           [1, 0, 3, 2],
           [2, 3, 0, 1]])

    # Get corresponding indexed array
    In [54]: b = a[np.arange(m)[:,None], rand_idx]

    # Do a check on the shuffling being restricted to per row
    In [55]: a[a!=0]
    Out[55]: array([ 8,  5, -4,  6,  3,  8,  5, -4])

    In [56]: b[b!=0]
    Out[56]: array([ 8, -4,  5,  6,  3, -4,  8,  5])

    # Finally do the assignment based on masking on a and b
    In [57]: a[a!=0] = b[b!=0]

    In [58]: a # Final verification on desired result
    Out[58]:
    array([[ 8, -4,  0,  5],
           [ 0,  6,  0,  3],
           [-4,  8,  0,  5]])
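For convenience, the three steps could be wrapped into a function along these lines. This is only a sketch: the name `shuffle_nonzeros_vectorized` and the `seed` parameter are hypothetical, it draws random numbers from `np.random.default_rng` (NumPy 1.17 or later) instead of `np.random.rand`, and it returns a shuffled copy instead of modifying the input in place.

```python
import numpy as np

def shuffle_nonzeros_vectorized(a, seed=None):
    """Return a copy of the 2D array a with non-zeros shuffled within each row."""
    rng = np.random.default_rng(seed)              # seed is a hypothetical parameter
    m, n = a.shape
    rand_idx = rng.random((m, n)).argsort(axis=1)  # step 1: one permutation per row
    b = a[np.arange(m)[:, None], rand_idx]         # step 2: shuffle every row
    out = a.copy()
    out[out != 0] = b[b != 0]                      # step 3: masked assignment
    return out

a = np.array([[8, 5, 0, -4],
              [0, 6, 0, 3],
              [8, 5, 0, -4]])
shuffled = shuffle_nonzeros_vectorized(a, seed=0)

# Zeros keep their positions and every row keeps the same values
assert ((a == 0) == (shuffled == 0)).all()
assert (np.sort(a, axis=1) == np.sort(shuffled, axis=1)).all()
```

Passing the same `seed` gives a repeatable shuffle, which is handy for testing; leave it as `None` for a fresh shuffle on each call.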