
Read CSV file to numpy array, first row as strings, rest as float


You can keep the column names by passing the names=True argument to np.genfromtxt:

 data = np.genfromtxt(path_to_csv, dtype=float, delimiter=',', names=True) 

Note the dtype=float, which converts your data to float. This is more efficient than dtype=None, which asks np.genfromtxt to guess the data type of each column for you.

The output will be a structured array, where you can access individual columns by name. The names are taken from your first row. Some modifications may occur; for example, spaces in a column name are changed to _. The np.genfromtxt documentation should cover most questions you could have.
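As a minimal sketch of the behaviour described above: the CSV contents here are made up for illustration, and io.StringIO stands in for path_to_csv so the snippet runs without a file on disk.

```python
import io
import numpy as np

# Hypothetical CSV contents; io.StringIO stands in for path_to_csv.
csv_text = "time,max speed\n0.0,1.5\n1.0,2.5\n"

data = np.genfromtxt(io.StringIO(csv_text), dtype=float, delimiter=',', names=True)

# Field names come from the header row; the space in "max speed"
# has been replaced with an underscore.
print(data.dtype.names)

# Each column is accessed by its (sanitized) name and holds floats.
print(data['max_speed'])
```

Note that the result is one-dimensional (one record per data row), so indexing works by field name rather than by column number.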


I'm not sure what you mean when you say you need the headers in the final version, but you can generate a structured array where the columns are accessed by strings like this:

data = np.genfromtxt(path_to_csv, dtype=None, delimiter=',', names=True)

and then access columns with data['col1_name'], data['col2_name'], etc.
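For illustration, here is a small sketch of dtype=None with a file that mixes text and numbers. The column names and values are invented, io.StringIO stands in for path_to_csv, and passing encoding='utf-8' is an assumption made so that text columns come back as ordinary strings rather than bytes on older NumPy versions.

```python
import io
import numpy as np

# Hypothetical CSV with one text column and one numeric column.
csv_text = "name,score\nalice,1.5\nbob,2\n"

data = np.genfromtxt(
    io.StringIO(csv_text),
    dtype=None,          # let genfromtxt infer a type per column
    delimiter=',',
    names=True,          # take field names from the first row
    encoding='utf-8',    # assumption: decode text columns as str
)

print(data['name'])      # text column, kept as strings
print(data['score'])     # numeric column, inferred as float
```

With dtype=float the name column could not be parsed as numbers, which is the case where letting genfromtxt infer the types per column is useful.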


The whole idea of a numpy array is that all elements are the same type. Read the headers into a Python list and manage them separately from the numbers. You can also create a structured array (an array of records) and in this case you can use the headers to name the fields in the records. Storing them in the array would be redundant in that case.
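The first suggestion (headers in a Python list, numbers in a plain float array) could look roughly like this. The CSV contents are invented, and io.StringIO is used in place of a real file; with a file on disk you would open the path instead.

```python
import io
import numpy as np

# Hypothetical CSV contents standing in for a file on disk.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# First row as a plain list of strings.
headers = io.StringIO(csv_text).readline().strip().split(',')

# Remaining rows as an ordinary 2-D float array; skiprows=1 skips the header.
values = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1)

# Look up a column by header name through the list.
col_b = values[:, headers.index('b')]
print(headers, values.shape, col_b)
```

This keeps the array homogeneous, so normal 2-D slicing and arithmetic work, at the cost of managing the name-to-index mapping yourself.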