Replace values in dataframe column depending on another column with condition

python pandas dataframe replace

This takes several steps, but it works. First find the index of the rows from each 255 up to (but not including) the next 1, and save it in idx. Then create x_new1 by copying x on the idx rows and on the rows where y == 1 or y == 255. Finally, forward-fill the remaining rows.

    import numpy as np

    # Index of rows between 255 and 1 in column y
    idx = df.loc[df['y'].replace(0, np.nan).ffill() == 255, 'y'].index

    # Create x_new1 and assign value of x where index is idx or y == 1 or y == 255
    df.loc[idx, 'x_new1'] = df['x']
    df.loc[(df['y'] == 1) | (df['y'] == 255), 'x_new1'] = df['x']

    # ffill rest of the values in x_new1
    df['x_new1'] = df['x_new1'].ffill()

Output:

            x    y  z  x_new  x_new1
    0   12.28    1  1  12.28   12.28
    1   11.99    0  1  12.28   12.28
    2   11.50    0  1  12.28   12.28
    3   11.20    0  1  12.28   12.28
    4   11.01    0  1  12.28   12.28
    5    9.74  255  0   9.74    9.74
    6   13.80    0  0  13.80   13.80
    7   15.20    0  0  15.20   15.20
    8   17.80    0  0  17.80   17.80
    9   12.10    1  1  12.10   12.10
    10  11.90    0  1  12.10   12.10
    11  11.70    0  1  12.10   12.10
    12  11.20    0  1  12.10   12.10
    13  10.30  255  0  10.30   10.30
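The steps above assume a frame named df already exists. As a minimal sketch, the sample data can be rebuilt from the x and y columns of the output shown in this answer and run through the same steps end to end. The variable names are the answer's own; only the frame construction is added here.

```python
import numpy as np
import pandas as pd

# Sample data copied from the x and y columns of the output in this answer.
df = pd.DataFrame({
    "x": [12.28, 11.99, 11.50, 11.20, 11.01, 9.74, 13.80, 15.20, 17.80,
          12.10, 11.90, 11.70, 11.20, 10.30],
    "y": [1, 0, 0, 0, 0, 255, 0, 0, 0, 1, 0, 0, 0, 255],
})

# Rows whose most recent non-zero y is 255: the 255 row itself and the zeros after it.
idx = df.index[df["y"].replace(0, np.nan).ffill() == 255]

# Copy x on those rows and on the y == 1 rows; every other row stays NaN for now.
df.loc[idx, "x_new1"] = df["x"]
df.loc[df["y"] == 1, "x_new1"] = df["x"]

# The remaining NaN rows sit between a 1 and the next 255, so they take the
# value carried forward from the y == 1 row.
df["x_new1"] = df["x_new1"].ffill()

print(df)
```

Note that rows appearing before the first 1 or 255 would stay NaN, since there is nothing to carry forward into them.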


Try:

    # mark the occurrences of 1 and 255
    df['is_1_255'] = df.y[(df.y == 1) | (df.y == 255)]
    df['x_n'] = None

    # copy the 1's
    df.loc[df.is_1_255 == 1, 'x_n'] = df.loc[df.is_1_255 == 1, 'x']

    # fill is_1_255 with markers:
    # 255 means between 255 and 1, 1 means between 1 and 255
    df['is_1_255'] = df['is_1_255'].ffill()

    # update the 255 values
    df.loc[df.is_1_255 == 255, 'x_n'] = df.loc[df.is_1_255 == 255, 'x']

    # update the 1 values (assign the result back; an inplace ffill on a
    # selected column may not update df in recent pandas versions)
    df['x_n'] = df['x_n'].ffill()

Output:

    +-----+-------+-----+---+-------+----------+-------+
    | idx |   x   |  y  | z | x_new | is_1_255 |  x_n  |
    +-----+-------+-----+---+-------+----------+-------+
    |   0 | 12.28 |   1 | 1 | 12.28 | 1.0      | 12.28 |
    |   1 | 11.99 |   0 | 1 | 12.28 | 1.0      | 12.28 |
    |   2 | 11.50 |   0 | 1 | 12.28 | 1.0      | 12.28 |
    |   3 | 11.20 |   0 | 1 | 12.28 | 1.0      | 12.28 |
    |   4 | 11.01 |   0 | 1 | 12.28 | 1.0      | 12.28 |
    |   5 | 9.74  | 255 | 0 | 9.74  | 255.0    | 9.74  |
    |   6 | 13.80 |   0 | 0 | 13.80 | 255.0    | 13.80 |
    |   7 | 15.20 |   0 | 0 | 15.20 | 255.0    | 15.20 |
    |   8 | 17.80 |   0 | 0 | 17.80 | 255.0    | 17.80 |
    |   9 | 12.10 |   1 | 1 | 12.10 | 1.0      | 12.10 |
    |  10 | 11.90 |   0 | 1 | 12.10 | 1.0      | 12.10 |
    |  11 | 11.70 |   0 | 1 | 12.10 | 1.0      | 12.10 |
    |  12 | 11.20 |   0 | 1 | 12.10 | 1.0      | 12.10 |
    |  13 | 10.30 | 255 | 0 | 10.30 | 255.0    | 10.30 |
    +-----+-------+-----+---+-------+----------+-------+
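If the is_1_255 helper column is not wanted in the result, the same marker idea can probably be condensed into a where followed by a forward fill. This is a sketch rather than the answer's own code; the sample frame is rebuilt from the output above and the variable name marker is made up for illustration.

```python
import pandas as pd

# Sample data copied from the x and y columns of the output above.
df = pd.DataFrame({
    "x": [12.28, 11.99, 11.50, 11.20, 11.01, 9.74, 13.80, 15.20, 17.80,
          12.10, 11.90, 11.70, 11.20, 10.30],
    "y": [1, 0, 0, 0, 0, 255, 0, 0, 0, 1, 0, 0, 0, 255],
})

# Last marker (1 or 255) seen at or before each row; zeros inherit the previous marker.
marker = df["y"].where(df["y"].isin([1, 255])).ffill()

# Keep x on the y == 1 rows and on every row inside a 255 stretch, blank the rest,
# then forward-fill so the blanked rows take the value from the preceding y == 1 row.
df["x_n"] = df["x"].where(df["y"].eq(1) | marker.eq(255)).ffill()

print(df)
```

Because x_n is built from the float column x, it stays float64 instead of the object dtype that initialising the column with None produces.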


Assuming clean data where 1 and 255 always occur in pairs, we can form groups of 1-255 and groupby to fill in the data.

    s = (df.y.eq(1).cumsum() == df.y.eq(255).cumsum() + 1)
    df['xnew'] = (df.groupby(s.ne(s.shift()).cumsum().where(s))
                    .x.transform('first')
                    .fillna(df.x))

Output:

            x    y  z   xnew
    0   12.28    1  1  12.28
    1   11.99    0  1  12.28
    2   11.50    0  1  12.28
    3   11.20    0  1  12.28
    4   11.01    0  1  12.28
    5    9.74  255  0   9.74
    6   13.80    0  0  13.80
    7   15.20    0  0  15.20
    8   17.80    0  0  17.80
    9   12.10    1  1  12.10
    10  11.90    0  1  12.10
    11  11.70    0  1  12.10
    12  11.20    0  1  12.10
    13  10.30  255  0  10.30

For something like this, though, you should really write thorough unit tests, because the logic gets tricky and can misbehave on malformed input.
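A rough sketch of what such tests might look like, assuming the groupby logic is wrapped in a function. The name hold_first_in_run and the small edge-case frame are hypothetical, invented here for illustration. The second test only records what the groupby approach does when a 1 is repeated before any 255; whether that is the desired result depends on the actual requirements.

```python
import pandas as pd

def hold_first_in_run(df):
    """Hypothetical wrapper around the groupby approach from this answer."""
    # True on rows that fall inside an open 1..255 run (the 255 row excluded).
    in_run = df["y"].eq(1).cumsum() == df["y"].eq(255).cumsum() + 1
    # Give each consecutive True stretch its own id; rows outside a run get NaN.
    run_id = in_run.ne(in_run.shift()).cumsum().where(in_run)
    # First x of each run, falling back to the row's own x outside runs.
    return df.groupby(run_id)["x"].transform("first").fillna(df["x"])

def test_paired_markers():
    # Sample data from the answer: every 1 is closed by a 255.
    df = pd.DataFrame({
        "x": [12.28, 11.99, 11.50, 11.20, 11.01, 9.74, 13.80, 15.20, 17.80,
              12.10, 11.90, 11.70, 11.20, 10.30],
        "y": [1, 0, 0, 0, 0, 255, 0, 0, 0, 1, 0, 0, 0, 255],
    })
    expected = [12.28] * 5 + [9.74, 13.80, 15.20, 17.80] + [12.10] * 4 + [10.30]
    assert hold_first_in_run(df).tolist() == expected

def test_repeated_1_without_255():
    # Malformed input: a second 1 arrives before the first one is closed.
    df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0], "y": [1, 0, 1, 0, 255]})
    # The counts no longer line up after the second 1, so rows 2 and 3 keep
    # their own x instead of holding a value from a 1 row.
    assert hold_first_in_run(df).tolist() == [1.0, 1.0, 3.0, 4.0, 5.0]

test_paired_markers()
test_repeated_1_without_255()
```

The forward-fill approaches would hold 3.0 on row 3 of the malformed frame, so the three answers on this page are only interchangeable when the markers are properly paired.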
