How to drop a list of rows from Pandas dataframe?
Use DataFrame.drop and pass it the index labels of the rows to remove (here selected by position through df.index):
    In : df
    Out:
           one  two
    one      1    4
    two      2    3
    three    3    2
    four     4    1

    In : df.drop(df.index[[1,3]])
    Out:
           one  two
    one      1    4
    three    3    2
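Note that you may need to pass the inplace=True argument when you want the drop to modify the DataFrame in place. By default, drop returns a new DataFrame and leaves the original unchanged. Because the code in your original question does not assign the result to anything, this argument (or a reassignment such as df = df.drop(...)) should be used.
http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.drop.html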
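To make the difference concrete, here is a minimal sketch that rebuilds the small frame from the example above and compares the default call with inplace=True. The variable names (labels, dropped, df_inplace, result) are only illustrative.

```python
import pandas as pd

# Rebuild the small example frame.
df = pd.DataFrame(
    {"one": [1, 2, 3, 4], "two": [4, 3, 2, 1]},
    index=["one", "two", "three", "four"],
)

# Labels of the rows at positions 1 and 3 ("two" and "four").
labels = df.index[[1, 3]]

# Default behaviour: drop returns a new DataFrame, df is left untouched.
dropped = df.drop(labels)

# inplace=True: the frame itself is modified and the call returns None.
df_inplace = df.copy()
result = df_inplace.drop(labels, inplace=True)

print(dropped)
print(result)
```

If you prefer to avoid inplace, reassigning the result (df = df.drop(labels)) has the same effect on the name df.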
If the DataFrame is huge and the number of rows to drop is large as well, then a simple drop by index,
df.drop(df.index[indexes_to_drop]), takes too much time.
In my case, I have a multi-indexed DataFrame of floats with 100M rows x 3 cols, and I need to remove 10k rows from it. The fastest method I found is, quite counterintuitively, to take the remaining rows.
Let indexes_to_drop be an array of positional indexes to drop ([1, 2, 4] in the question).
    # df.shape is a (rows, cols) tuple, so use df.shape[0] for the row count
    indexes_to_keep = set(range(df.shape[0])) - set(indexes_to_drop)
    df_sliced = df.take(list(indexes_to_keep))
In my case this took 20.5s, while the simple df.drop took 5min 27s and consumed a lot of memory. The resulting DataFrame is the same.
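As a sanity check, the following sketch runs the keep-the-rest approach on a small, made-up multi-indexed frame and compares it with df.drop. The frame size, column names, and seed are assumptions for illustration only, and no timings are claimed here. Two things in it are not from the answer above: sorted() is used instead of list() so the row order is guaranteed to be preserved, and the boolean-mask variant is another common way to express the same idea.

```python
import numpy as np
import pandas as pd

# Hypothetical small stand-in for the 100M-row multi-indexed frame.
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product([range(100), range(10)], names=["a", "b"])
df = pd.DataFrame(rng.random((1000, 3)), index=index, columns=["x", "y", "z"])

indexes_to_drop = [1, 2, 4]  # positional indexes, as in the question

# Keep-the-rest approach: compute the positions to keep, then take them.
# sorted() guarantees the original row order (set order is not guaranteed).
indexes_to_keep = sorted(set(range(df.shape[0])) - set(indexes_to_drop))
df_sliced = df.take(indexes_to_keep)

# Boolean-mask variant (an assumption, not part of the original answer).
keep_mask = np.ones(len(df), dtype=bool)
keep_mask[indexes_to_drop] = False
df_masked = df[keep_mask]

# Reference result using the simple drop by index labels.
df_dropped = df.drop(df.index[indexes_to_drop])

print(df_sliced.equals(df_dropped), df_masked.equals(df_dropped))
```

Whether either variant is faster than df.drop on your data depends on the frame size and index type, so it is worth timing them yourself.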