python read_fwf error: 'dtype is not supported with python-fwf parser'


Instead of specifying dtypes, specify a converter for the column you want to keep as str, building on @TomAugspurger's example:

from io import StringIO
import pandas as pd

data = StringIO(u"""
121301234
121300123
121300012
""")

pd.read_fwf(data, colspecs=[(0,3),(4,8)], converters = {1: str})

Leads to

    \n Unnamed: 1
0  121       0123
1  121       0012
2  121       0001

Converters are a mapping from a column name or index to a function that converts each cell value in that column (e.g. int converts values to integers, float to floats, etc.).
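As a rough sketch of the same idea with named columns: the column names `prefix` and `code` below are made up for illustration, and the data is the sample from this answer without the leading blank line. It assumes converters can be keyed by the labels given in `names`, as the pandas documentation describes.

```python
from io import StringIO

import pandas as pd

# Same sample rows as above, one 9-character record per line.
data = StringIO("121301234\n121300123\n121300012\n")

df = pd.read_fwf(
    data,
    widths=[4, 5],              # first field is 4 chars wide, second is 5
    header=None,
    names=["prefix", "code"],   # hypothetical column names
    converters={
        "prefix": int,          # parse this field as an integer
        "code": str,            # keep the raw text, so leading zeros survive
    },
)

print(df["code"].tolist())
```

Each converter receives the cell as a string, so any callable that accepts a string can be used in place of `int` or `str`.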


The documentation is probably incorrect there. I think the same base docstring is used for several readers. As for a workaround, since you know the widths ahead of time, I think you can prepend the zeros after the fact.

With this file and widths [4, 5]

121301234
121300123
121300012

we get:

In [38]: df = pd.read_fwf('tst.fwf', widths=[4,5], header=None)

In [39]: df
Out[39]:
      0     1
0  1213  1234
1  1213   123
2  1213    12

To fill in the missing zeros, would this work?

In [45]: df[1] = df[1].astype('str')

In [53]: df[1] = df[1].apply(lambda x: ''.join(['0'] * (5 - len(x))) + x)

In [54]: df
Out[54]:
      0      1
0  1213  01234
1  1213  00123
2  1213  00012

The 5 in the lambda above comes from the correct width. You'd need to select out all the columns that need leading zeros and apply the function (with the correct width) to each.
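One way to apply this to several columns at once is a small mapping from column to width. This is only a sketch: the `widths` dict is an illustrative name, the frame is built by hand to mirror the parsed result above, and it uses the `str.zfill` string method in place of the lambda.

```python
import pandas as pd

# Mirrors what read_fwf returned above: the leading zeros are already lost.
df = pd.DataFrame({0: [1213, 1213, 1213], 1: [1234, 123, 12]})

# Hypothetical mapping: column label -> original fixed width of that field.
widths = {0: 4, 1: 5}

for col, width in widths.items():
    # Convert to text, then left-pad with zeros up to the field width.
    df[col] = df[col].astype(str).str.zfill(width)

print(df)
```

Values that already fill the full width (column 0 here) are left unchanged by `zfill`.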


This works fine in pandas versions after 0.20.2.

from io import StringIO
import pandas as pd

data = StringIO(u"""
121301234
121300123
121300012
""")

# Use the built-in str here: the np.str alias was removed in NumPy 1.24.
pd.read_fwf(data, colspecs=[(0,3),(4,8)], header=None, dtype={0: str, 1: str})

Output:

     0     1
0  NaN   NaN
1  121  0123
2  121  0012
3  121  0001
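If every column should stay as text, a single `dtype=str` can be passed instead of a per-column dict. A minimal sketch, assuming a pandas version where `read_fwf` accepts `dtype`; the column names are made up and the sample data has no leading blank line, so no empty first row appears.

```python
from io import StringIO

import pandas as pd

data = StringIO("121301234\n121300123\n121300012\n")

# dtype=str applies to all columns, so no field is parsed as a number.
df = pd.read_fwf(
    data,
    widths=[4, 5],
    header=None,
    names=["prefix", "code"],  # hypothetical column names
    dtype=str,
)

print(df["code"].tolist())
```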