Set value of first item in slice in python pandas


I think you can use idxmax to get the index of the first True value and then set the value with loc:

    import numpy as np
    import pandas as pd

    np.random.seed(1)
    df = pd.DataFrame(np.random.randint(4, size=(5,1)))
    print (df)

       0
    0  1
    1  3
    2  0
    3  0
    4  3

    print ((df[0] == 0).idxmax())

    2

    df.loc[(df[0] == 0).idxmax(), 0] = 100
    print (df)

         0
    0    1
    1    3
    2  100
    3    0
    4    3

    # applied to the original df, before the change above
    df.loc[(df[0] == 3).idxmax(), 0] = 200
    print (df)

         0
    0    1
    1  200
    2    0
    3    0
    4    3
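One thing to watch with this approach: idxmax on a mask that is all False still returns the first index label, so a value that never occurs would silently overwrite row 0. A minimal sketch of a guard, assuming a unique index; the helper name set_first_match is hypothetical and not part of pandas:

```python
import pandas as pd

def set_first_match(df, column, target, new_value):
    """Set the first row where df[column] == target; return False if no row matches.

    Assumes a unique index, because idxmax returns a label and loc
    would touch every row that shares that label.
    """
    mask = df[column] == target
    if not mask.any():
        # Without this check, idxmax would return the first label anyway.
        return False
    df.loc[mask.idxmax(), column] = new_value
    return True

df = pd.DataFrame({0: [1, 3, 0, 0, 3]})
changed = set_first_match(df, 0, 0, 100)   # first 0 is at label 2
missing = set_first_match(df, 0, 7, 999)   # 7 never occurs, df is left alone
result = df[0].tolist()
```

Here result is [1, 3, 100, 0, 3], and the second call leaves df untouched.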

EDIT:

Solution for a non-unique index:

    np.random.seed(1)
    df = pd.DataFrame(np.random.randint(4, size=(5,1)), index=[1,2,2,3,4])
    print (df)

       0
    1  1
    2  3
    2  0
    3  0
    4  3

    df = df.reset_index()
    df.loc[(df[0] == 3).idxmax(), 0] = 200
    df = df.set_index('index')
    df.index.name = None
    print (df)

         0
    1    1
    2  200
    2    0
    3    0
    4    3

EDIT1:

Solution with MultiIndex:

    np.random.seed(1)
    df = pd.DataFrame(np.random.randint(4, size=(5,1)), index=[1,2,2,3,4])
    print (df)

       0
    1  1
    2  3
    2  0
    3  0
    4  3

    df.index = [np.arange(len(df.index)), df.index]
    print (df)

         0
    0 1  1
    1 2  3
    2 2  0
    3 3  0
    4 4  3

    df.loc[(df[0] == 3).idxmax(), 0] = 200
    df = df.reset_index(level=0, drop=True)
    print (df)

         0
    1    1
    2  200
    2    0
    3    0
    4    3

EDIT2:

Solution with double cumsum:

    np.random.seed(1)
    df = pd.DataFrame([4,0,4,7,4], index=[1,2,2,3,4])
    print (df)

       0
    1  4
    2  0
    2  4
    3  7
    4  4

    mask = (df[0] == 0).cumsum().cumsum()
    print (mask)

    1    0
    2    1
    2    2
    3    3
    4    4
    Name: 0, dtype: int32

    df.loc[mask == 1, 0] = 200
    print (df)

         0
    1    4
    2  200
    2    4
    3    7
    4    4
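Why the double cumsum isolates the first match may be easier to see with the intermediate steps spelled out. The sketch below uses made-up data with two matching rows; the variable names are only for illustration, and the final assignment goes through a NumPy boolean array as one way to sidestep label alignment on the duplicated index:

```python
import pandas as pd

s = pd.Series([4, 0, 4, 0, 4], index=[1, 2, 2, 3, 4])

hits = s == 0               # False, True, False, True, False
running = hits.cumsum()     # matches seen so far: 0, 1, 1, 2, 2
twice = running.cumsum()    # 0, 1, 2, 4, 6

# The running count first reaches 1 at the first match, so the second
# cumulative sum equals 1 only there and is larger on every later row.
first_only = twice == 1

# Assign by boolean position, so the duplicated label 2 does not matter.
s.loc[first_only.to_numpy()] = 200
```

Only the first 0 is replaced; the second match and the other row labelled 2 stay as they were.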


Consider the dataframe df

    df = pd.DataFrame(dict(A=[1, 2, 3, 4, 5]))
    print(df)

       A
    0  1
    1  2
    2  3
    3  4
    4  5

Create some arbitrary slice slc

    slc = df[df.A > 2]
    print(slc)

       A
    2  3
    3  4
    4  5

Access the first row of slc within df by using index[0] and loc

    df.loc[slc.index[0]] = 0
    print(df)

       A
    0  1
    1  2
    2  0
    3  4
    4  5
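Note that df.loc[label] = 0 with only a row label assigns to every column of that row. With a single column that makes no difference, but for a wider frame a sketch along these lines, which names the column and skips empty slices, may be closer to what is wanted. Column B is made up for the illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [10, 20, 30, 40, 50]})
slc = df[df.A > 2]

# slc.index[0] raises IndexError on an empty slice, so check first.
if not slc.empty:
    # Naming the column restricts the assignment to a single cell.
    df.loc[slc.index[0], "A"] = 0
```

After this, only A at label 2 is 0; column B is unchanged.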


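    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.rand(6, 1), index=[1, 2, 2, 3, 3, 3])
    df[1] = 0
    df.columns = ['a', 'b']
    # Label the slice: b is 1 where a >= 0.5 and stays 0 elsewhere.
    df.loc[df['a'] >= 0.5, 'b'] = 1
    # DataFrame.sort was removed from pandas; sort_values replaces it.
    df = df.sort_values(['b', 'a'], ascending=[False, True])
    # Set 'a' at the first label of the b == 0 slice.
    # loc works by label, so every row sharing that label is changed.
    df.loc[df[df['b'] == 0].index.tolist()[0], 'a'] = 0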

This method does not create an extra copy of the dataframe, but it introduces an extra column, which can be dropped after processing. To change the nth item in the slice instead of the first one, change the last line as follows:

    df.loc[df[df['b'] == 0].index.tolist()[n], 'a'] = 0
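Because the index in this example contains duplicates, selecting by label can touch more than one row. As an alternative sketch (a position-based technique, not the method used in the answer above), the nth matching row can be located with np.flatnonzero and written with iloc. The helper name set_nth_match and the sample values are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

def set_nth_match(df, mask, column, n, new_value):
    """Set `column` in the n-th (0-based) row where `mask` is True, by position."""
    positions = np.flatnonzero(mask.to_numpy())
    if not 0 <= n < len(positions):
        raise IndexError("fewer than n + 1 rows match the mask")
    # iloc addresses one cell by integer position, so duplicated labels are safe.
    df.iloc[positions[n], df.columns.get_loc(column)] = new_value

df = pd.DataFrame({"a": [0.1, 0.2, 0.3, 0.4, 0.7, 0.8]},
                  index=[1, 2, 2, 3, 3, 3])
set_nth_match(df, df["a"] < 0.5, "a", n=1, new_value=0.0)
```

Here the second matching row is set to 0.0, while the other row labelled 2 keeps its value.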

df

              a
    1  0.111089
    2  0.255633
    2  0.332682
    3  0.434527
    3  0.730548
    3  0.844724

df after slicing and labelling them

              a  b
    1  0.111089  0
    2  0.255633  0
    2  0.332682  0
    3  0.434527  0
    3  0.730548  1
    3  0.844724  1

After changing value of first item in slice (labelled as 0) to 0

              a  b
    3  0.730548  1
    3  0.844724  1
    1  0.000000  0
    2  0.255633  0
    2  0.332682  0
    3  0.434527  0