Python: How to read a data file with uneven number of columns
An even easier approach I just thought of:
import numpy

with open("hk_L1.ref") as f:
    data = numpy.array(f.read().split(), dtype=float).reshape(7000, 8)
This reads the data as a one-dimensional array first, completely ignoring all new-line characters, and then we reshape it to the desired shape.
While I think that the task will be I/O-bound anyway, this approach should use little processor time if it matters.
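To see why the uneven line lengths don't matter here, a small self-contained sketch (the sample file name and its contents are made up for illustration; the real file is "hk_L1.ref" with 7000 rows):

```python
import numpy

# Write a tiny sample file with an uneven number of columns per line.
with open("sample.ref", "w") as f:
    f.write("1 2 3 4 5\n6 7 8\n9 10 11 12 13 14 15 16\n")

# split() treats newlines as ordinary whitespace, so the ragged layout
# disappears: we get one flat list of 16 tokens, then reshape to 2x8.
with open("sample.ref") as f:
    data = numpy.array(f.read().split(), dtype=float).reshape(2, 8)

print(data.shape)  # (2, 8)
```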
Provided I understood you correctly (see my comment), you can split your input into tokens and then process them in blocks of eight, regardless of where the line breaks fall:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

f = open('filename.ref')
tokens = f.read().split()
rows = []
for idx, token in enumerate(tokens):
    if idx % 8 == 0:
        # this is a new row, use a new list.
        row = []
        rows.append(row)
    row.append(token)
# rows is now a list of lists with the desired data.
This runs in under 0.2 seconds on my computer as is.
Edit: used @SvenMarnach's suggestion.
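The same grouping can also be written more compactly with the standard "grouper" idiom from the itertools recipes, a sketch assuming the same whitespace-separated token stream (the sample tokens here are made up):

```python
# Sample flat token list standing in for f.read().split().
tokens = [str(n) for n in range(16)]

# zip over eight references to the SAME iterator pulls eight
# consecutive tokens per tuple, i.e. one row per iteration.
it = iter(tokens)
rows = [list(chunk) for chunk in zip(*[it] * 8)]

print(len(rows))  # 2
```

Note that `zip` silently drops a final incomplete group; use `itertools.zip_longest` instead if the token count might not be a multiple of eight.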
How about this?
data = []
curRow = []
dataPerRow = 8
for row in FILE.readlines():
    for item in row.split():
        if len(curRow) == dataPerRow:
            data.append(curRow)
            curRow = []
        curRow.append(item)
data.append(curRow)
(assuming FILE is the file being read in). You then have a list of lists, which can be used for whatever.
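One thing to note: the items stay as strings with this approach. A runnable sketch of the same loop on made-up sample lines, with a final step converting the nested lists to floats:

```python
# Hypothetical sample input: two lines with uneven column counts,
# standing in for FILE.readlines() on the real data file.
lines = ["1 2 3 4 5\n", "6 7 8 9 10 11 12 13 14 15 16\n"]

data = []
curRow = []
dataPerRow = 8
for row in lines:
    for item in row.split():
        if len(curRow) == dataPerRow:
            data.append(curRow)   # row is full, start a new one
            curRow = []
        curRow.append(item)
data.append(curRow)               # don't forget the last row

# The tokens are still strings; convert them to floats.
values = [[float(x) for x in r] for r in data]
print(values[0])  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```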