xls to csv converter


I would use xlrd: it's fast, cross-platform, and works directly with the file. One thing to note: older versions don't read xlsx files, so you'd have to save your Excel file as xls. Edit: As of version 0.8.0, xlrd reads both XLS and XLSX files. (Note that xlrd 2.0 and later dropped XLSX support again and read only XLS files.)

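import csv
import xlrd

def csv_from_excel():
    # Open the workbook and pick the sheet to export.
    wb = xlrd.open_workbook('your_workbook.xls')
    sh = wb.sheet_by_name('Sheet1')
    # Python 3: open in text mode; newline='' lets the csv module control line endings.
    with open('your_csv_file.csv', 'w', newline='', encoding='utf-8') as your_csv_file:
        wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)
        # Write every row of the sheet as one CSV record.
        for rownum in range(sh.nrows):
            wr.writerow(sh.row_values(rownum))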
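To see what quoting=csv.QUOTE_ALL does to the output, here is a small standard-library sketch. The sample row and the helper name render_row are made up for illustration; it simply renders the same row with the default quoting and with QUOTE_ALL.

```python
import csv
import io

def render_row(row, quoting=csv.QUOTE_MINIMAL):
    """Return the CSV text the csv module would write for one row."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(row)
    return buf.getvalue()

# Hypothetical row, shaped like what sh.row_values() might return.
sample = ["Widget, large", 3.0, ""]

print(render_row(sample))                 # default: only the field containing a comma is quoted
print(render_row(sample, csv.QUOTE_ALL))  # QUOTE_ALL: every field is quoted, numbers included
```

Quoting everything makes the file a little larger, but it avoids surprises when a cell contains commas or line breaks.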


I would use pandas. The computationally heavy parts are written in Cython or as C extensions to speed up the process, and the syntax is very clean. For example, to turn "Sheet1" from the file "your_workbook.xls" into the file "your_csv.csv", use the top-level function read_excel and the DataFrame method to_csv as follows:

import pandas as pd

data_xls = pd.read_excel('your_workbook.xls', 'Sheet1', index_col=None)
data_xls.to_csv('your_csv.csv', encoding='utf-8')

Setting encoding='utf-8' alleviates the UnicodeEncodeError mentioned in other answers.
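If you need every sheet rather than just one, something along these lines might work. It is only a sketch: write_sheets and workbook_to_csvs are hypothetical helper names, it assumes a reasonably recent pandas where read_excel accepts sheet_name=None and returns a dict of DataFrames, and it assumes the sheet names are valid file names. It also passes index=False so the DataFrame's row index is not written as an extra first column.

```python
from pathlib import Path

import pandas as pd

def write_sheets(sheets, out_dir="."):
    """Write each DataFrame in `sheets` (sheet name -> DataFrame) to <name>.csv."""
    out_dir = Path(out_dir)
    written = []
    for name, frame in sheets.items():
        target = out_dir / "{}.csv".format(name)
        # index=False leaves out the DataFrame's row index column.
        frame.to_csv(target, index=False, encoding="utf-8")
        written.append(target)
    return written

def workbook_to_csvs(excel_file, out_dir="."):
    """Read all sheets of a workbook and write one CSV per sheet."""
    # sheet_name=None asks read_excel for every sheet, returned as a dict.
    return write_sheets(pd.read_excel(excel_file, sheet_name=None), out_dir)

# Example call (assumes the file exists):
# workbook_to_csvs('your_workbook.xls')
```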


Maybe someone will find this ready-to-use piece of code useful. It creates a CSV file from every worksheet in an Excel workbook.

Python 2:

# -*- coding: utf-8 -*-
import csv
import sys

import xlrd

def csv_from_excel(excel_file):
    workbook = xlrd.open_workbook(excel_file)
    all_worksheets = workbook.sheet_names()
    # One CSV file per worksheet, named after the sheet.
    for worksheet_name in all_worksheets:
        worksheet = workbook.sheet_by_name(worksheet_name)
        with open(u'{}.csv'.format(worksheet_name), 'wb') as your_csv_file:
            wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)
            for rownum in xrange(worksheet.nrows):
                # Python 2's csv module writes bytes, so encode each cell as UTF-8.
                wr.writerow([unicode(entry).encode("utf-8") for entry in worksheet.row_values(rownum)])

if __name__ == "__main__":
    csv_from_excel(sys.argv[1])

Python 3:

import csv
import sys

import xlrd

def csv_from_excel(excel_file):
    workbook = xlrd.open_workbook(excel_file)
    all_worksheets = workbook.sheet_names()
    # One CSV file per worksheet, named after the sheet.
    for worksheet_name in all_worksheets:
        worksheet = workbook.sheet_by_name(worksheet_name)
        # newline='' stops the csv module from emitting blank lines on Windows.
        with open(u'{}.csv'.format(worksheet_name), 'w', newline='', encoding="utf-8") as your_csv_file:
            wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)
            for rownum in range(worksheet.nrows):
                wr.writerow(worksheet.row_values(rownum))

if __name__ == "__main__":
    csv_from_excel(sys.argv[1])
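One thing worth knowing about the xlrd scripts: xlrd returns every numeric cell as a float, so a cell that shows 3 in Excel ends up as 3.0 in the CSV. If that matters to you, a small helper can tidy each row before it goes to writerow. This is only a sketch, and tidy_row is a hypothetical name, not part of xlrd or csv.

```python
def tidy_row(values):
    """Turn integer-valued floats (3.0) into ints (3); leave everything else alone."""
    return [
        int(v) if isinstance(v, float) and v.is_integer() else v
        for v in values
    ]

# Hypothetical row, shaped like the output of worksheet.row_values(rownum).
print(tidy_row([3.0, 2.5, "text", ""]))
```

In the loop you would then call wr.writerow(tidy_row(worksheet.row_values(rownum))). Keep in mind that date cells also arrive as floats (Excel serial numbers), and this helper does not convert them to dates.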