
pandas.read_csv from string or package data


To pass a string to pandas.read_csv(), wrap it in io.StringIO, e.g.:

    import pandas as pd
    from io import StringIO

    df = pd.read_csv(StringIO("csv string..."))
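For a version you can run directly, here is a minimal sketch of the same idea. The CSV text and its column names are made up for illustration and are not taken from any real dataset.

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV content, invented for this example.
csv_text = "id,switch,arsenic\n1,1,2.36\n2,1,0.71\n3,0,2.07\n"

# StringIO wraps the str in a file-like object, which read_csv accepts
# in place of a path.
df = pd.read_csv(StringIO(csv_text), index_col="id")
print(df)
```

StringIO only accepts text (str). If what you have is bytes, either decode it first or use io.BytesIO instead.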


The following worked for me in Python 3.3:

    >>> import numpy as np, pandas as pd
    >>> import io, pkgutil
    >>> wells = pkgutil.get_data('pymc.examples', 'data/wells.dat')
    >>> type(wells)
    <class 'bytes'>
    >>> df = pd.read_csv(io.BytesIO(wells), encoding='utf8', sep=" ", index_col="id", dtype={"switch": np.int8})
    >>> df.head()
        switch  arsenic       dist  assoc  educ
    id
    1        1     2.36  16.826000      0     0
    2        1     0.71  47.321999      0     0
    3        0     2.07  20.966999      0    10
    4        1     1.15  21.486000      0    12
    5        1     1.10  40.874001      1    14

    [5 rows x 5 columns]

N.B. I had to put wells.dat in that location by hand, and I deleted some whitespace along the way, so I can't swear that I copied it correctly or that no trailing whitespace remains. Either way, passing read_csv a BytesIO object together with an encoding parameter should work. (You can probably get away without the encoding parameter, but specifying it is a good habit. io.TextIOWrapper might be another option.)
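To make the alternatives concrete, here is a rough sketch comparing three ways of reading bytes into a DataFrame. I haven't run it against the real wells.dat. The `raw` bytes below are a hypothetical stand-in for what pkgutil.get_data() returns, and the header line in particular is my assumption about the file layout.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the bytes returned by pkgutil.get_data().
# The header row is an assumption about how the file is laid out.
raw = (
    b"id switch arsenic dist assoc educ\n"
    b"1 1 2.36 16.826 0 0\n"
    b"2 1 0.71 47.321999 0 0\n"
    b"3 0 2.07 20.966999 0 10\n"
)

# Keyword arguments shared by all three calls.
kwargs = dict(sep=" ", index_col="id", dtype={"switch": np.int8})

# Option 1: hand read_csv the bytes buffer and tell it the encoding.
df_bytes = pd.read_csv(io.BytesIO(raw), encoding="utf8", **kwargs)

# Option 2: wrap the bytes buffer so that it behaves like a text file.
text_stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf8")
df_wrapped = pd.read_csv(text_stream, **kwargs)

# Option 3: decode to str up front and use StringIO.
df_decoded = pd.read_csv(io.StringIO(raw.decode("utf8")), **kwargs)

# All three should produce the same frame.
print(df_bytes.equals(df_wrapped) and df_bytes.equals(df_decoded))
```

For a small file like this the choice is mostly a matter of taste. Option 3 holds both the bytes and the decoded string in memory at once, which may matter for large package data.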