numpy genfromtxt/pandas read_csv; ignore commas within quote marks
Just managed to find this: the key parameter that I was missing is skipinitialspace=True, which "deals with the spaces after the comma-delimiter".
    import pandas as pd

    a = pd.read_csv('a.dat', quotechar='"', skipinitialspace=True)

       address 1  address 2            address 3  num1  num2  num3
    0  address 1  address 2            address 3     1     2     3
    1  address 1  address 2  address 3, address4     1     2     3
This works :-)
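For anyone who wants to try this without the original file, here is a minimal sketch. The SAMPLE text is an assumption reconstructed from the output above (quoted address fields, with a space after each comma); it is not the actual contents of a.dat.

```python
import io

import pandas as pd

# Assumed sample data, reconstructed from the output shown above.
SAMPLE = (
    'address 1, address 2, address 3, num1, num2, num3\n'
    '"address 1", "address 2", "address 3", 1, 2, 3\n'
    '"address 1", "address 2", "address 3, address4", 1, 2, 3\n'
)

# With skipinitialspace=True the space after each delimiter is dropped,
# so the opening quote is the first character of the field and is
# treated as a quote character.
df = pd.read_csv(io.StringIO(SAMPLE), quotechar='"', skipinitialspace=True)
print(df)

# Without it, the quotes are not at the start of the field, so the comma
# inside the last address splits the row into an extra column.
try:
    pd.read_csv(io.StringIO(SAMPLE), quotechar='"')
except pd.errors.ParserError as exc:
    print("parse failed:", exc)
```

The quoted comma only survives in the first call, because quotechar is only recognised when the quote is the first character of a field.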
Python's built-in csv module can deal with this kind of data.
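    import csv
    import numpy

    with open("a.dat", newline="") as f:
        reader = csv.reader(f, skipinitialspace=True)
        header = next(reader)
        # One named field per column: three text columns, then three floats.
        dtype = numpy.dtype(list(zip(header, ['S20', 'S20', 'S20', 'f8', 'f8', 'f8'])))
        # Each row has to be a tuple to fill a structured dtype.
        data = numpy.fromiter(map(tuple, reader), dtype=dtype)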
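To see what skipinitialspace changes in the csv module itself, here is a small sketch that parses the same text with and without it. The SAMPLE string and the parse helper are made up for illustration; the real a.dat may differ.

```python
import csv
import io

# Assumed sample: a header plus one row whose last address contains a comma.
SAMPLE = (
    'address 1, address 2, address 3, num1, num2, num3\n'
    '"address 1", "address 2", "address 3, address4", 1, 2, 3\n'
)


def parse(text, **kwargs):
    """Hypothetical helper: return all rows of `text` as lists of strings."""
    return list(csv.reader(io.StringIO(text), **kwargs))


# Default settings: the space after each comma starts the field, so the
# following quote is kept as a literal character and the embedded comma
# splits the address in two.
naive = parse(SAMPLE)

# skipinitialspace=True: the space is skipped, the quote opens a quoted
# field, and the embedded comma stays inside it.
fixed = parse(SAMPLE, skipinitialspace=True)

print(len(naive[1]), naive[1])
print(len(fixed[1]), fixed[1])
```

The row comes back with seven fields in the first case and six in the second, which is why the reader above is created with skipinitialspace=True.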