numpy genfromtxt/pandas read_csv; ignore commas within quote marks


Just managed to find this:

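The key parameter I was missing is skipinitialspace=True, which "deals with the spaces after the comma-delimiter". Without it, the space before the opening quote stops the parser from treating the field as quoted, so the comma inside the quotes is read as a delimiter.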

    a=pd.read_csv('a.dat',quotechar='"',skipinitialspace=True)

       address 1  address 2            address 3  num1  num2  num3
    0  address 1  address 2            address 3     1     2     3
    1  address 1  address 2  address 3, address4     1     2     3

This works :-)
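For anyone who wants to try this without the original file, here is a minimal sketch. The SAMPLE string is a guess at what a.dat looks like, reconstructed from the output above (a space after every comma, one quoted field containing a comma); it is not the actual file.

```python
import io

import pandas as pd

# Assumed stand-in for a.dat, reconstructed from the printed DataFrame.
# Note the space after every comma, including before the opening quote.
SAMPLE = (
    'address 1, address 2, address 3, num1, num2, num3\n'
    'address 1, address 2, address 3, 1, 2, 3\n'
    'address 1, address 2, "address 3, address4", 1, 2, 3\n'
)

# skipinitialspace=True drops the space after each delimiter, so the
# quote character is seen at the start of the field and is honoured.
df = pd.read_csv(io.StringIO(SAMPLE), quotechar='"', skipinitialspace=True)

# The quoted comma stays inside a single field.
print(df.loc[1, "address 3"])
```

The header names also lose their leading spaces, so the columns can be addressed as "num1" rather than " num1".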


Python's built-in csv module can deal with this kind of data.

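    import csv
    import numpy

    # newline="" is what the csv docs recommend when opening files for csv.reader
    with open("a.dat", newline="") as f:
        reader = csv.reader(f, skipinitialspace=True)
        header = next(reader)
        # Three 20-byte string columns followed by three float columns.
        # On Python 3, zip() is lazy, so wrap it in list() for numpy.dtype.
        dtype = numpy.dtype(list(zip(header, ['S20', 'S20', 'S20', 'f8', 'f8', 'f8'])))
        # itertools.imap is Python 2 only; the built-in map is its Python 3 equivalent.
        data = numpy.fromiter(map(tuple, reader), dtype=dtype)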
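To see what skipinitialspace changes in the csv module itself, the sketch below parses one line both ways. The LINE string and the parse helper are made up for illustration; the line is assumed to match the quoted row of a.dat.

```python
import csv
import io

# Assumed example row: a space follows every comma, and one field is quoted.
LINE = 'address 1, address 2, "address 3, address4", 1, 2, 3\n'


def parse(line, **kwargs):
    """Parse a single CSV line and return its list of fields (hypothetical helper)."""
    return next(csv.reader(io.StringIO(line), **kwargs))


# Default settings: the quote follows a space, so it is treated as ordinary
# text and the comma inside the quotes splits the field in two.
default_row = parse(LINE)

# With skipinitialspace=True the space is skipped, the quote opens a quoted
# field, and the embedded comma is preserved.
stripped_row = parse(LINE, skipinitialspace=True)

print(len(default_row), default_row)
print(len(stripped_row), stripped_row)
```

The default parse yields one field too many, which is also why the row would no longer line up with a six-column header.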