Python: How to read a data file with uneven number of columns


An even easier approach I just thought of:

    import numpy

    with open("hk_L1.ref") as f:
        data = numpy.array(f.read().split(), dtype=float).reshape(7000, 8)

This reads the data as a one-dimensional array first, completely ignoring all new-line characters, and then we reshape it to the desired shape.

The task is probably I/O-bound anyway, but this approach should also use little processor time, if that matters.
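If the number of rows is not known in advance, the same idea can be sketched with `reshape(-1, 8)`, which lets NumPy infer the row count. The sample data below is made up for illustration, and `io.StringIO` stands in for the real file:

```python
import io
import numpy

# Hypothetical sample: 16 values spread unevenly over three lines.
sample = io.StringIO("1 2 3 4 5\n6 7 8 9 10 11\n12 13 14 15 16\n")

# Read everything as one flat array, ignoring line breaks.
values = numpy.array(sample.read().split(), dtype=float)

# -1 asks NumPy to infer the number of rows. This raises ValueError
# if the number of values is not a multiple of 8.
data = values.reshape(-1, 8)
print(data.shape)
```

The `ValueError` on a mismatched count is useful here, since it flags a truncated or malformed file instead of silently misaligning the columns.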


Provided I understood you correctly (see my comment), you can split the input into tokens and then group them into rows of eight, regardless of where the line breaks fall:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    with open('filename.ref') as f:
        tokens = f.read().split()

    rows = []
    for idx, token in enumerate(tokens):
        if idx % 8 == 0:
            # Every eighth token starts a new row, so use a new list.
            row = []
            rows.append(row)
        row.append(token)

    # rows is now a list of lists with the desired data.

As is, this runs in under 0.2 seconds on my computer.

Edit: used @SvenMarnach's suggestion.
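The same grouping could also be written with list slicing instead of a running index. This is only a sketch of an alternative; `chunk_tokens` is a hypothetical helper name and the sample text is made up:

```python
def chunk_tokens(tokens, width=8):
    """Split a flat list of tokens into rows of `width` items.

    Hypothetical helper: the last row is shorter if the token
    count is not a multiple of `width`.
    """
    return [tokens[i:i + width] for i in range(0, len(tokens), width)]


# Made-up input: 16 tokens spread unevenly over three lines.
text = "a b c d e\nf g h i j k\nl m n o p\n"
rows = chunk_tokens(text.split())
print(rows)
```

Note that the tokens are still strings at this point; convert them with `float()` if numeric values are needed.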


How about this?

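    data = []
    curRow = []
    dataPerRow = 8
    for row in FILE.readlines():
        for item in row.split():
            if len(curRow) == dataPerRow:
                # The current row is full, so store it and start a new one.
                data.append(curRow)
                curRow = []
            curRow.append(item)
    # Store the final row, which is never flushed inside the loop.
    data.append(curRow)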

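(This assumes FILE is the open file object being read.) You then have a list of lists of string tokens, eight per row, which can be used for whatever you need. If the total number of tokens is not a multiple of eight, the last row will be shorter.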
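Since the rows hold strings and the last one may be incomplete, a small check before further use might look like the following. This is a sketch only; `to_float_rows` is a hypothetical helper, not part of any library:

```python
def to_float_rows(rows, width=8):
    """Validate row lengths and convert every token to float.

    Hypothetical helper: raises ValueError if any row does not
    contain exactly `width` items.
    """
    for n, row in enumerate(rows):
        if len(row) != width:
            raise ValueError(
                "row %d has %d items, expected %d" % (n, len(row), width)
            )
    return [[float(item) for item in row] for row in rows]


# Made-up example: one complete row of eight string tokens.
numbers = to_float_rows([["1", "2", "3", "4", "5", "6", "7", "8"]])
print(numbers)
```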