Set value of first item in slice in python pandas
I think you can use idxmax to get the index of the first True value and then set the value with loc:

    import numpy as np
    import pandas as pd

    np.random.seed(1)
    df = pd.DataFrame(np.random.randint(4, size=(5,1)))
    print (df)
       0
    0  1
    1  3
    2  0
    3  0
    4  3

    print ((df[0] == 0).idxmax())
    2

    df.loc[(df[0] == 0).idxmax(), 0] = 100
    print (df)
         0
    0    1
    1    3
    2  100
    3    0
    4    3
The same call on the original df sets the first 3 instead:

    df.loc[(df[0] == 3).idxmax(), 0] = 200
    print (df)
         0
    0    1
    1  200
    2    0
    3    0
    4    3
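One thing to watch with idxmax on a boolean mask: if no row matches, every value is False and idxmax still returns the first index label, so the first row would be overwritten by mistake. A minimal sketch of a guard follows; the helper name `set_first_match` is hypothetical and not part of pandas, and it assumes a unique index.

```python
import pandas as pd

def set_first_match(df, mask, column, value):
    """Set `value` at the first row where `mask` is True.

    Hypothetical helper: does nothing when no row matches,
    and assumes the index of `df` is unique.
    """
    if mask.any():
        df.loc[mask.idxmax(), column] = value
    return df

df = pd.DataFrame({0: [1, 3, 0, 0, 3]})
set_first_match(df, df[0] == 0, 0, 100)  # the first 0 sits at index 2
set_first_match(df, df[0] == 7, 0, -1)   # no 7 present, so df is left unchanged
print(df)
```

Without the `mask.any()` check, the second call would have written -1 into row 0.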
EDIT:
Solution with a non-unique index (reset the index so that idxmax returns a unique label, then restore it):

    np.random.seed(1)
    df = pd.DataFrame(np.random.randint(4, size=(5,1)), index=[1,2,2,3,4])
    print (df)
       0
    1  1
    2  3
    2  0
    3  0
    4  3

    df = df.reset_index()
    df.loc[(df[0] == 3).idxmax(), 0] = 200
    df = df.set_index('index')
    df.index.name = None
    print (df)
         0
    1    1
    2  200
    2    0
    3    0
    4    3
EDIT1:
Solution with MultiIndex:

    np.random.seed(1)
    df = pd.DataFrame(np.random.randint(4, size=(5,1)), index=[1,2,2,3,4])
    print (df)
       0
    1  1
    2  3
    2  0
    3  0
    4  3

    df.index = [np.arange(len(df.index)), df.index]
    print (df)
         0
    0 1  1
    1 2  3
    2 2  0
    3 3  0
    4 4  3

    df.loc[(df[0] == 3).idxmax(), 0] = 200
    df = df.reset_index(level=0, drop=True)
    print (df)
         0
    1    1
    2  200
    2    0
    3    0
    4    3
EDIT2:
Solution with double cumsum:

    np.random.seed(1)
    df = pd.DataFrame([4,0,4,7,4], index=[1,2,2,3,4])
    print (df)
       0
    1  4
    2  0
    2  4
    3  7
    4  4

    mask = (df[0] == 0).cumsum().cumsum()
    print (mask)
    1    0
    2    1
    2    2
    3    3
    4    4
    Name: 0, dtype: int32

    df.loc[mask == 1, 0] = 200
    print (df)
         0
    1    4
    2  200
    2    4
    3    7
    4    4
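To see why the double cumsum isolates the first match even when there are several, it may help to trace the intermediate steps. The sketch below uses illustrative data with two matches (not the data from the answer above); the variable names are made up for the example.

```python
import pandas as pd

# Illustrative data: the value 0 occurs twice, under a non-unique index.
s = pd.Series([4, 0, 4, 0, 4], index=[1, 2, 2, 3, 4])

hits = s == 0                 # False, True, False, True, False
running = hits.cumsum()       # matches seen so far: 0, 1, 1, 2, 2
double = running.cumsum()     # 0, 1, 2, 4, 6
first_only = double == 1      # True only at the first match

print(first_only.tolist())
```

The first cumsum is 0 before the first match and at least 1 from then on, so the second cumsum is exactly 1 at the first match and at least 2 on every later row. Because the result is a boolean mask rather than a label, it works with a non-unique index.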
Consider the dataframe df

    df = pd.DataFrame(dict(A=[1, 2, 3, 4, 5]))
    print(df)
       A
    0  1
    1  2
    2  3
    3  4
    4  5

Create some arbitrary slice slc

    slc = df[df.A > 2]
    print(slc)
       A
    2  3
    3  4
    4  5

Access the first row of slc within df by using index[0] and loc

    df.loc[slc.index[0]] = 0
    print(df)
       A
    0  1
    1  2
    2  0
    3  4
    4  5
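Note that `df.loc[label] = 0` assigns to the whole row, which is harmless here only because df has a single column. A minimal sketch with a second column (column `B` is hypothetical, added only for illustration) shows how to restrict the assignment to one column:

```python
import pandas as pd

# Column B is an assumption for the example; the original df only has A.
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [10, 20, 30, 40, 50]})

slc = df[df.A > 2]
first_label = slc.index[0]     # label of the first row of the slice

# Passing the column as well leaves B untouched.
df.loc[first_label, "A"] = 0
print(df)
```

As with the original, this relies on the index labels being unique.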
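    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.rand(6,1), index=[1,2,2,3,3,3])
    df[1] = 0
    df.columns = ['a','b']
    # label the rows: b = 1 where a >= 0.5, otherwise b stays 0
    df.loc[df['a'] >= 0.5, 'b'] = 1
    # DataFrame.sort was removed from pandas; sort_values replaces it
    df = df.sort_values(['b','a'], ascending=[False, True])
    # set 'a' to 0 for the first row of the slice where b == 0
    df.loc[df[df['b']==0].index.tolist()[0], 'a'] = 0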
This method does not create an extra copy of the dataframe, but it introduces an extra column, which can be dropped after processing. To choose any item instead of the first one, change the last line as follows

    df.loc[df[df['b']==0].index.tolist()[n], 'a'] = 0

to change the nth item in the slice. Note that loc selects by label, so if the chosen label occurs more than once in the index, every row with that label is set.
df

              a
    1  0.111089
    2  0.255633
    2  0.332682
    3  0.434527
    3  0.730548
    3  0.844724

df after slicing and labelling the rows

              a  b
    1  0.111089  0
    2  0.255633  0
    2  0.332682  0
    3  0.434527  0
    3  0.730548  1
    3  0.844724  1

After changing the value of the first item in the slice (labelled as 0) to 0

              a  b
    3  0.730548  1
    3  0.844724  1
    1  0.000000  0
    2  0.255633  0
    2  0.332682  0
    3  0.434527  0
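When index labels repeat, as they do here, a label-based assignment through loc touches every row that shares the label. One alternative is to locate the nth match by position and assign with iloc. This is a sketch of a different technique, not the method from the answer above; the helper name `set_nth_match` and the sample values are made up for the example.

```python
import numpy as np
import pandas as pd

def set_nth_match(df, mask, column, value, n=0):
    """Set `value` in `column` at the n-th (0-based) row where `mask` is True.

    Hypothetical helper: works by position, so duplicate index labels are fine.
    """
    positions = np.flatnonzero(mask.to_numpy())
    if n >= len(positions):
        raise IndexError("mask has fewer than n + 1 matching rows")
    df.iloc[positions[n], df.columns.get_loc(column)] = value
    return df

df = pd.DataFrame({"a": [0.1, 0.2, 0.3, 0.4, 0.7, 0.8]},
                  index=[1, 2, 2, 3, 3, 3])

# Zero the second item (n=1) of the slice a < 0.5; its label 2 is duplicated.
set_nth_match(df, df["a"] < 0.5, "a", 0.0, n=1)
print(df)
```

Only the one targeted row changes; the other row labelled 2 keeps its value, and no helper column or sort is needed.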