Pandas to_csv progress bar with tqdm
You can split the DataFrame into chunks of rows and write it to a CSV file chunk by chunk, using mode='w' for the first chunk (which also writes the header) and mode='a' without a header for the rest:
Example:
    import numpy as np
    import pandas as pd
    from tqdm import tqdm

    df = pd.DataFrame(data=[i for i in range(0, 10000000)], columns=["integer"])
    print(df.head(10))

    # Split the index into 100 chunks (100,000 rows each for this DataFrame)
    chunks = np.array_split(df.index, 100)

    for chunk, subset in enumerate(tqdm(chunks)):
        if chunk == 0:
            # First chunk: create (or overwrite) the file and write the header
            df.loc[subset].to_csv('data.csv', mode='w', index=True)
        else:
            # Remaining chunks: append without repeating the header
            df.loc[subset].to_csv('data.csv', header=False, mode='a', index=True)
Output:
       integer
    0        0
    1        1
    2        2
    3        3
    4        4
    5        5
    6        6
    7        7
    8        8
    9        9
    100%|██████████| 100/100 [00:12<00:00, 8.12it/s]
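If you would rather fix the number of rows per chunk than the number of chunks, the same idea can be wrapped in a small helper. This is only a sketch: the function name `to_csv_with_progress`, its `chunk_size` parameter, and the output file name are made up for illustration and are not part of pandas or tqdm. It keeps a single file handle open instead of reopening the file in append mode for every chunk.

```python
import pandas as pd
from tqdm import tqdm


def to_csv_with_progress(df, path, chunk_size=100_000):
    """Write df to path in blocks of chunk_size rows, showing a tqdm bar.

    Hypothetical helper for illustration; not a pandas or tqdm API.
    Note: an empty DataFrame produces an empty file (no header row).
    """
    # Start offsets of each block: 0, chunk_size, 2 * chunk_size, ...
    starts = range(0, len(df), chunk_size)
    # newline="" lets pandas control line endings in the opened file
    with open(path, "w", newline="") as handle:
        for i, start in enumerate(tqdm(starts, unit="chunk")):
            block = df.iloc[start:start + chunk_size]
            # Only the first block writes the header row
            block.to_csv(handle, header=(i == 0), index=True)


if __name__ == "__main__":
    df = pd.DataFrame({"integer": range(1_000_000)})
    to_csv_with_progress(df, "data_chunked.csv", chunk_size=50_000)
```

Smaller chunks give a smoother progress bar but add a little per-call overhead, so a chunk size in the tens of thousands of rows is a reasonable starting point to experiment with.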