Pythonic way to import data from multiple files into an array


"But the problem with this code, is that I can only process data when it's in the for loop. "

Assuming your code works:

    import glob
    import numpy as np

    # source_dir is the folder containing the text files (as in the question)
    file_list = glob.glob(source_dir + '/*.TXT')

    data = []
    for file_path in file_list:
        data.append(
            np.genfromtxt(file_path, delimiter=',', skip_header=3, skip_footer=18))

    # now you can access it outside the "for" loop
    for d in data:
        print(d)


If all the data is of the same shape, just append each file's array to a list. Before the loop:

    all_data = []

and in your loop:

    all_data.append(data)

Finally, you have

    np.asarray(all_data)

which is an array of shape (10,50,2) (transpose if you want). This does not work if the shapes don't match, though: numpy cannot build a regular array from rows of different shapes. In that case you might need another loop that creates an array of the largest shape and copies your data into it.
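A minimal sketch of that "largest shape" fallback, assuming every file yields a 2-D array. The helper name `stack_padded` and the NaN fill value are illustrative choices, not part of the original answer:

```python
import numpy as np

def stack_padded(arrays, fill_value=np.nan):
    """Stack 2-D arrays of differing shapes into one 3-D array, padding with fill_value."""
    max_rows = max(a.shape[0] for a in arrays)
    max_cols = max(a.shape[1] for a in arrays)
    # Allocate the output at the largest shape, pre-filled with the padding value
    out = np.full((len(arrays), max_rows, max_cols), fill_value, dtype=float)
    for i, a in enumerate(arrays):
        # Copy each array into the top-left corner of its slot
        out[i, :a.shape[0], :a.shape[1]] = a
    return out

# Hypothetical stand-ins for per-file arrays with different row counts
all_data = [np.ones((3, 2)), np.ones((5, 2))]
padded = stack_padded(all_data)
print(padded.shape)
```

Padding with NaN keeps the missing entries distinguishable from real data; functions such as `np.nanmean` can then skip them.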


Crude but quick:

    listFiles = ["1.txt", "2.txt", ... , "xxx.txt"]  # placeholder: put your file names here
    allData = []
    for file in listFiles:
        with open(file, 'r') as f:
            lines = f.readlines()
        filedata = {}
        filedata['name'] = file
        filedata['rawLines'] = lines
        col1Vals = []
        col2Vals = []
        mapValues = {}
        for line in lines:
            # strip the trailing newline before splitting on commas
            values = line.strip().split(',')
            col1Vals.append(values[0])
            col2Vals.append(values[1])
            mapValues[values[0]] = values[1]
        filedata['col1'] = col1Vals
        filedata['col2'] = col2Vals
        filedata['map'] = mapValues
        allData.append(filedata)


If you want to get a list of files from a specific directory, take a look at os.walk.
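A rough sketch of collecting file paths with os.walk. The helper name `find_txt_files` and the case-insensitive `.txt` filter are assumptions for illustration; the demo builds a throwaway directory so it runs anywhere:

```python
import os
import tempfile

def find_txt_files(root_dir, extension=".txt"):
    """Return sorted paths of files under root_dir whose names end with extension."""
    matches = []
    # os.walk yields (dirpath, dirnames, filenames) for root_dir and every subdirectory
    for dirpath, dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if name.lower().endswith(extension):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)

# Demo on a temporary directory with one nested folder and one non-matching file
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for rel in ("1.txt", os.path.join("sub", "2.TXT"), "notes.md"):
        with open(os.path.join(tmp, rel), "w") as f:
            f.write("1,2\n")
    found = [os.path.relpath(p, tmp) for p in find_txt_files(tmp)]

print(found)
```

The returned list could be used in place of a hand-written list of file names. Note that os.walk descends into subdirectories; for a single flat folder, glob.glob is simpler.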

Since it's not clear how you want the data, I've shown several ways to store it.

allData is a list of dictionaries, one per file.

To get the 2nd column of data from the 3rd file, you'd do allData[2]['col2']

If you wanted the name of the third file: allData[2]['name']
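Because the values come from splitting text lines, they are stored as strings. A small sketch of pulling one column out and converting it to a numeric numpy array; the two-entry `allData` below is a hypothetical stand-in for the list built by the loop:

```python
import numpy as np

# Hypothetical stand-in for the allData list of dictionaries (two tiny "files")
allData = [
    {'name': 'a.txt', 'col1': ['0', '1'], 'col2': ['1.5', '2.5']},
    {'name': 'b.txt', 'col1': ['0', '1'], 'col2': ['3.0', '4.0']},
]

# Index 1 is the second file; convert its string values to floats before doing arithmetic
second_file_col2 = np.array(allData[1]['col2'], dtype=float)
print(allData[1]['name'], second_file_col2.mean())
```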