Apply GZIP compression to a CSV in Python Pandas
Using df.to_csv()
with the keyword argument compression='gzip'
should produce a gzip archive. I tested it using same keyword arguments as you, and it worked.
You may need to upgrade pandas, as gzip was not implemented until version 0.17.1, but trying to use it on prior versions will not raise an error, and just produce a regular csv. You can determine your current version of pandas by looking at the output of pd.__version__
.
From documentation
import gzipcontent = "Lots of content here"with gzip.open('file.txt.gz', 'wb') as f: f.write(content)
with pandas
import gzipcontent = df.to_csv( sep='|', header=True, index=False, quoting=csv.QUOTE_ALL, quotechar='"', doublequote=True, line_terminator='\n')with gzip.open('foo-%s.csv.gz' % todaysdatestring, 'wb') as f: f.write(content)
The trick here being that to_csv
outputs text if you don't pass it a filename. Then you just redirect that text to gzip
's write
method.