
How to read .xls in parallel using pandas?


Just for your information: I'm reading a 13 MB CSV file with 29,000 lines in about 4 seconds, without any parallel processing. Setup: Arch Linux, AMD Phenom II X2, Python 3.4, python-pandas 0.16.2.

How big is your file, and how long does it take to read it? That would help to understand the problem better. Is your Excel sheet very complex? Maybe read_excel has difficulty processing that complexity.

Suggestion: install Gnumeric and use its command-line helper ssconvert to translate the file to CSV. In your program, switch to read_csv. Check the time used by ssconvert and the time taken by read_csv. By the way, python-pandas had major improvements as it went from version 0.13 to 0.16, so it is useful to check that you have a recent version.
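The suggestion above could be tried with something like the following rough sketch. It assumes ssconvert is on your PATH. The helper names (timed, ssconvert_command, convert_and_read) are made up for illustration and are not part of pandas or Gnumeric.

```python
import shutil
import subprocess
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def ssconvert_command(xls_path, csv_path):
    """Build the ssconvert call; the output format is inferred from the .csv extension."""
    return ["ssconvert", xls_path, csv_path]


def convert_and_read(xls_path, csv_path):
    """Convert xls_path to CSV with ssconvert, then load the CSV with pandas."""
    # Imported here so the two helpers above can be used without pandas installed.
    import pandas as pd

    if shutil.which("ssconvert") is None:
        raise RuntimeError("ssconvert not found; install Gnumeric first")

    # Time the external conversion step.
    _, convert_seconds = timed(
        subprocess.run, ssconvert_command(xls_path, csv_path), check=True
    )
    # Time the CSV parsing step separately.
    frame, read_seconds = timed(pd.read_csv, csv_path)
    print(f"ssconvert: {convert_seconds:.2f} s, read_csv: {read_seconds:.2f} s")
    return frame
```

Calling convert_and_read("data.xls", "data.csv") (placeholder file names) prints both timings. Wrapping your current read_excel call in timed(...) the same way gives a number to compare against.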