
Pandas - Slice large dataframe into chunks


You can use list comprehension to split your dataframe into smaller dataframes contained in a list.

```python
n = 200000  # chunk row size
list_df = [df[i:i+n] for i in range(0, df.shape[0], n)]
```

You can access the chunks with:

```python
list_df[0]
list_df[1]
```

etc...

Then you can reassemble the chunks into a single dataframe using pd.concat.
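A minimal round trip illustrating the slicing and reassembly above (the small dataframe and chunk size here are made up for demonstration; the answer uses n = 200000 on a large dataframe):

```python
import pandas as pd

# Toy dataframe standing in for a large one
df = pd.DataFrame({"x": range(10)})

n = 3  # chunk row size
list_df = [df[i:i + n] for i in range(0, df.shape[0], n)]

# 10 rows in chunks of 3 gives 4 chunks; the last one is shorter
print(len(list_df))  # 4

# Reassemble into a single dataframe; the original index is preserved
reassembled = pd.concat(list_df)
print(reassembled.equals(df))  # True
```

Because plain slicing keeps the original index, pd.concat restores a dataframe identical to the one you started with.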

By AcctName

```python
list_df = []
for n, g in df.groupby('AcctName'):
    list_df.append(g)
```
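A quick sketch of the groupby approach, assuming a dataframe with an 'AcctName' column (the sample data is invented):

```python
import pandas as pd

# Hypothetical data: three accounts with a few rows each
df = pd.DataFrame({
    "AcctName": ["A", "A", "B", "C", "C", "C"],
    "Amount": [10, 20, 30, 40, 50, 60],
})

# One sub-dataframe per account; groupby sorts keys by default
list_df = []
for n, g in df.groupby("AcctName"):
    list_df.append(g)

print(len(list_df))  # 3 — one chunk per account: A, B, C
```

Note that here the chunk boundaries follow the account values rather than a fixed row count, so chunk sizes vary with the data.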


I'd suggest using the more_itertools dependency. It handles edge cases such as an uneven partition of the dataframe, and it returns an iterator, which makes things a tiny bit more efficient.

(updated using code from @Acumenus)

```python
from more_itertools import sliced

CHUNK_SIZE = 5

index_slices = sliced(range(len(df)), CHUNK_SIZE)
for index_slice in index_slices:
    chunk = df.iloc[index_slice]  # your dataframe chunk, ready for use
```
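A runnable sketch of the sliced() approach on a toy dataframe (12 rows here is an arbitrary choice to show the uneven final chunk; requires more_itertools to be installed):

```python
import pandas as pd
from more_itertools import sliced

# Toy dataframe; 12 rows so the last chunk comes out short
df = pd.DataFrame({"x": range(12)})

CHUNK_SIZE = 5

# sliced() yields slices of the position range; iloc accepts each one
chunks = [df.iloc[index_slice]
          for index_slice in sliced(range(len(df)), CHUNK_SIZE)]

print([len(c) for c in chunks])  # [5, 5, 2]
```

The uneven final partition is handled for you: the last chunk simply contains the leftover rows.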