How to read a text file into a list or an array with Python


You will have to split your string into a list of values using split()

So,

lines = text_file.read().split(',')
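As a minimal sketch of the same idea, assuming a hypothetical file numbers.txt that holds one line of comma-separated values (the file name and its contents are made up for illustration):

```python
import os
import tempfile

# Create a small sample file so the example is self-contained (made-up data).
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
with open(path, "w") as text_file:
    text_file.write("0,1,2,3\n")

# Read the whole file, drop the trailing newline, and split on commas.
with open(path) as text_file:
    lines = text_file.read().strip().split(",")

# split() returns strings; convert them if numbers are needed.
numbers = [int(value) for value in lines]
print(lines)
print(numbers)
```

Note that without the strip() call the last element would keep the trailing newline.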

EDIT: I didn't realise this would get so much traction. Here's a more idiomatic approach.

import csv

with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        print(row)  # do something with each row (a list of strings)
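If the goal is simply a list of rows, the reader can be passed straight to list(). A small sketch, where io.StringIO stands in for the opened file and the sample data is invented:

```python
import csv
import io

# io.StringIO stands in for an open file here (the sample data is made up).
fd = io.StringIO("a,b,c\n1,2,3\n")

# csv.reader yields one list of strings per line of input.
rows = list(csv.reader(fd))
print(rows)
```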


You can also use numpy loadtxt like

from numpy import loadtxt

lines = loadtxt("filename.dat", comments="#", delimiter=",", unpack=False)


So you want to create a list of lists... We need to start with an empty list

list_of_lists = []

Next, we read the file content line by line

with open('data') as f:
    for line in f:
        inner_list = [elt.strip() for elt in line.split(',')]
        # alternatively, if you need the file content as numbers:
        # inner_list = [int(elt.strip()) for elt in line.split(',')]
        list_of_lists.append(inner_list)

A common use case is that of columnar data, but our units of storage are the rows of the file, which we have read one by one, so you may want to transpose your list of lists. This can be done with the following idiom

by_cols = zip(*list_of_lists)

Another common use is to give a name to each column

col_names = ('apples sold', 'pears sold', 'apples revenue', 'pears revenue')
by_names = {}
for i, col_name in enumerate(col_names):
    by_names[col_name] = by_cols[i]

so that you can operate on homogeneous data items

mean_apple_prices = [money/fruits for money, fruits in
                     zip(by_names['apples revenue'], by_names['apples sold'])]
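Putting the transposition and naming steps together, here is a short sketch with invented numbers. It assumes the values were already converted to numbers while reading, and it wraps zip in list() so that the columns can be indexed on Python 3 as well:

```python
# Made-up rows: apples sold, pears sold, apples revenue, pears revenue.
list_of_lists = [[10, 5, 20.0, 15.0],
                 [4, 8, 10.0, 16.0]]

# Transpose rows into columns; list() makes the result indexable on Python 3.
by_cols = list(zip(*list_of_lists))

# Pair each column name with its column.
col_names = ('apples sold', 'pears sold', 'apples revenue', 'pears revenue')
by_names = dict(zip(col_names, by_cols))

# Revenue divided by units sold, row by row.
mean_apple_prices = [money / fruits
                     for money, fruits in zip(by_names['apples revenue'],
                                              by_names['apples sold'])]
print(mean_apple_prices)
```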

Most of what I've written can be sped up using the csv module from the standard library. A third-party alternative is pandas, which lets you automate most aspects of a typical data analysis (but has a number of dependencies).
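As one possible illustration of the csv route, csv.DictReader can use the header line for the column names. This is only a sketch: it assumes the file starts with a header row and that every value is an integer, and io.StringIO with invented data stands in for the real file.

```python
import csv
import io
from collections import defaultdict

# io.StringIO stands in for an open file; the header and values are made up.
data = io.StringIO("apples sold,pears sold\n10,5\n4,8\n")

# Collect each column under its header name.
by_names = defaultdict(list)
for row in csv.DictReader(data):
    for name, value in row.items():
        by_names[name].append(int(value))  # assumes integer data

print(dict(by_names))
```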


Update: In Python 2, zip(*list_of_lists) returns a new (transposed) list of tuples, whereas in Python 3 it returns a zip object, which is not subscriptable.

If you need indexed access you can use

by_cols = list(zip(*list_of_lists))

which gives you a list of tuples in both versions of Python.
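The difference can be checked directly. The following sketch assumes Python 3 and uses a made-up two-row input:

```python
list_of_lists = [[1, 2, 3], [4, 5, 6]]

# On Python 3, indexing a bare zip object raises TypeError.
zipped = zip(*list_of_lists)
try:
    zipped[0]
    subscriptable = True
except TypeError:
    subscriptable = False

# Wrapping in list() gives indexed access to the columns (as tuples).
by_cols = list(zip(*list_of_lists))

# If real lists are needed, convert each column explicitly.
as_lists = [list(col) for col in by_cols]
print(subscriptable, by_cols, as_lists)
```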

On the other hand, if you don't need indexed access and what you want is just to build a dictionary indexed by column names, a zip object is just fine...

with open('some_data.csv') as file:
    # get_names is not defined here; it is assumed to turn the header line into column names
    names = get_names(next(file))
    columns = zip(*((x.strip() for x in line.split(',')) for line in file))
    d = {}
    for name, column in zip(names, columns):
        d[name] = column
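To try this out without a file on disk, here is a sketch in which get_names is a hypothetical helper (the answer does not define it) that splits the header line on commas, and io.StringIO with invented data replaces the file:

```python
import io


def get_names(header_line):
    # Hypothetical helper: assumed to split the header line into column names.
    return [name.strip() for name in header_line.split(',')]


# io.StringIO stands in for open('some_data.csv'); the contents are made up.
file = io.StringIO("apples sold,pears sold\n10,5\n4,8\n")

names = get_names(next(file))  # consume the header line
columns = zip(*((x.strip() for x in line.split(',')) for line in file))

# Each column is a tuple of strings keyed by its name.
d = dict(zip(names, columns))
print(d)
```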