pandas.read_csv from string or package data
To pass a string to pandas read_csv(), you can use io.StringIO, i.e.:
    import pandas as pd
    from io import StringIO
    df = pd.read_csv(StringIO("csv string..."))
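As a slightly fuller illustration of the StringIO approach, here is a minimal sketch with an actual CSV string. The column names and values are made up for the example and are not from the original question.

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV content, invented purely for illustration
csv_text = "id,name,score\n1,alice,3.5\n2,bob,4.0\n"

# StringIO wraps the str in a file-like object that read_csv can read from
df = pd.read_csv(StringIO(csv_text), index_col="id")
print(df)
```

Any other read_csv keyword (sep, dtype, and so on) should work the same way as it does when reading from a file path.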
The following worked for me in Python 3.3:
    >>> import numpy as np, pandas as pd
    >>> import io, pkgutil
    >>> wells = pkgutil.get_data('pymc.examples', 'data/wells.dat')
    >>> type(wells)
    <class 'bytes'>
    >>> df = pd.read_csv(io.BytesIO(wells), encoding='utf8', sep=" ", index_col="id", dtype={"switch": np.int8})
    >>> df.head()
        switch  arsenic       dist  assoc  educ
    id
    1        1     2.36  16.826000      0     0
    2        1     0.71  47.321999      0     0
    3        0     2.07  20.966999      0    10
    4        1     1.15  21.486000      0    12
    5        1     1.10  40.874001      1    14

    [5 rows x 5 columns]
N.B. I had to manually put wells.dat in that location, so I can't swear that I copied it correctly or that there isn't trailing whitespace, because I deleted some. But passing read_csv a BytesIO object and an encoding parameter should work. (Actually, you can probably get away without the encoding, but specifying it is a good habit. io.TextIOWrapper might be another option.)
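To compare the BytesIO and TextIOWrapper routes without needing the package installed, here is a rough sketch. It assumes a bytes literal standing in for whatever pkgutil.get_data() returns. The first three rows are copied from the df.head() output above, but the exact layout of the real wells.dat is a guess.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the bytes returned by pkgutil.get_data(); the layout of the
# real wells.dat is an assumption (space-separated with a header row)
raw = (
    b"id switch arsenic dist assoc educ\n"
    b"1 1 2.36 16.826000 0 0\n"
    b"2 1 0.71 47.321999 0 0\n"
    b"3 0 2.07 20.966999 0 10\n"
)

# Arguments shared by both variants
options = dict(sep=" ", index_col="id", dtype={"switch": np.int8})

# Variant 1: hand read_csv the bytes buffer and let it do the decoding
df_bytes = pd.read_csv(io.BytesIO(raw), encoding="utf8", **options)

# Variant 2: decode first with TextIOWrapper, then pass the text stream
text_stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf8")
df_text = pd.read_csv(text_stream, **options)

print(df_bytes)
print(df_bytes.equals(df_text))
```

For plain UTF-8 data the two variants should give the same frame; the TextIOWrapper form mainly helps when other code also expects a text stream.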