
How to drop a list of rows from Pandas dataframe?


Use DataFrame.drop and pass it the index labels of the rows to remove; df.index[[1,3]] converts row positions into those labels:

In [65]: df
Out[65]:
       one  two
one      1    4
two      2    3
three    3    2
four     4    1

In [66]: df.drop(df.index[[1,3]])
Out[66]:
       one  two
one      1    4
three    3    2
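For anyone who wants to reproduce the session above, here is a minimal sketch. The constructor call is an assumption reconstructed from the printed output, and the names labels_to_drop and result are only illustrative.

```python
import pandas as pd

# Rebuild the frame shown in the session (values inferred from the printed output).
df = pd.DataFrame(
    {"one": [1, 2, 3, 4], "two": [4, 3, 2, 1]},
    index=["one", "two", "three", "four"],
)

# df.index[[1, 3]] turns positions 1 and 3 into the labels "two" and "four".
labels_to_drop = df.index[[1, 3]]

# drop returns a new DataFrame; df itself is left unchanged.
result = df.drop(labels_to_drop)
print(result)
```

Because drop works on labels, an index with duplicate labels would lose every row carrying a dropped label, not only the rows at the given positions.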


Note that it may be important to pass the inplace=True parameter when you want the drop to modify the DataFrame in place.

df.drop(df.index[[1,3]], inplace=True)

By default, drop returns a new DataFrame and leaves the original untouched, so if the result is not assigned to anything, as in your original question, this form should be used. See http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.drop.html
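The difference between the two forms can be sketched as follows. The sample frame and the names trimmed and ret are assumptions made for illustration only.

```python
import pandas as pd

df = pd.DataFrame(
    {"one": [1, 2, 3, 4], "two": [4, 3, 2, 1]},
    index=["one", "two", "three", "four"],
)

# Option 1: keep the returned copy; df still has all four rows afterwards.
trimmed = df.drop(df.index[[1, 3]])
rows_before_inplace = len(df)

# Option 2: modify df itself; with inplace=True the call returns None.
ret = df.drop(df.index[[1, 3]], inplace=True)
rows_after_inplace = len(df)
```

Assigning the result back (df = df.drop(...)) has the same visible effect as inplace=True and is often easier to chain with other calls.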


If the DataFrame is huge and the number of rows to drop is large as well, then a simple drop by index, df.drop(df.index[]), takes too much time.

In my case, I have a multi-indexed DataFrame of floats with 100M rows x 3 cols, and I need to remove 10k rows from it. The fastest method I found is, quite counterintuitively, to take the remaining rows.

Let indexes_to_drop be an array of positional indexes to drop ([1, 2, 4] in the question).

indexes_to_keep = set(range(df.shape[0])) - set(indexes_to_drop)
df_sliced = df.take(list(indexes_to_keep))

In my case this took 20.5s, while the simple df.drop took 5min 27s and consumed a lot of memory. The resulting DataFrame is the same.
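A variant of the same keep-instead-of-drop idea, sketched with a NumPy boolean mask. This is not the code that was timed above, and no speed claim is made for it; the small MultiIndex frame and the name keep_mask are assumptions for illustration. One reason to consider it is that the order of the kept positions is explicit, whereas the iteration order of a Python set is not formally guaranteed.

```python
import numpy as np
import pandas as pd

# Small stand-in for the large multi-indexed frame of floats (hypothetical data).
index = pd.MultiIndex.from_product([["x", "y"], [0, 1, 2]])
df = pd.DataFrame(
    np.arange(18, dtype=float).reshape(6, 3),
    index=index,
    columns=["a", "b", "c"],
)
indexes_to_drop = [1, 2, 4]  # positional indexes, as in the question

# Start by keeping every row, then switch off the positions to drop.
keep_mask = np.ones(len(df), dtype=bool)
keep_mask[indexes_to_drop] = False

# flatnonzero returns the surviving positions in ascending order.
indexes_to_keep = np.flatnonzero(keep_mask)
df_sliced = df.take(indexes_to_keep)
```

Since take is purely positional, this also works when the index contains duplicate labels.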