Pandas read_csv from url


As of pandas 0.19.2 (the latest version at the time of writing), you can pass the URL directly to read_csv:

import pandas as pd
url = "https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"
c = pd.read_csv(url)


UPDATE: From pandas 0.19.2 you can now just pass read_csv() the url directly, although that will fail if it requires authentication.


For older pandas versions, or if you need authentication, or if you want to handle HTTP failures yourself:

Use pandas.read_csv with a file-like object as the first argument.

  • If you want to read the csv from a string, you can use io.StringIO.

  • The URL https://github.com/cs109/2014_data/blob/master/countries.csv returns an HTML page, not raw CSV. To get the raw CSV response, use the URL behind the Raw link on the GitHub page, which is https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv
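The mapping between the two URLs above can be sketched as a small helper. This is only an illustration: github_blob_to_raw is a hypothetical name, and it assumes the plain https://github.com/<user>/<repo>/blob/<branch>/<path> layout seen in the example URL.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url):
    """Turn a github.com "blob" page URL into its raw.githubusercontent.com form.

    Hypothetical helper: it assumes the layout
    https://github.com/<user>/<repo>/blob/<branch>/<path>.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expect at least: <user>/<repo>/blob/<branch>/<file>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a github.com blob URL: %r" % url)
    user, repo, _blob, *rest = segments
    # Drop the "blob" segment and switch the host.
    raw_path = "/" + "/".join([user, repo] + rest)
    return urlunsplit((parts.scheme, "raw.githubusercontent.com", raw_path, "", ""))


page_url = "https://github.com/cs109/2014_data/blob/master/countries.csv"
raw_url = github_blob_to_raw(page_url)
print(raw_url)
```

The resulting raw_url is what you would pass on to requests.get or pd.read_csv.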

Example:

import pandas as pd
import io
import requests
url = "https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"
s = requests.get(url).content
c = pd.read_csv(io.StringIO(s.decode('utf-8')))

Notes:

In Python 2.x, the string-buffer object was StringIO.StringIO.
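For the authentication and fault-tolerance cases mentioned above, a minimal sketch could look like the following. The function name read_csv_from_url, the (user, password) tuple and the 10-second timeout are assumptions made for illustration; the tuple form of auth makes requests use HTTP Basic authentication, which may not be what your server expects.

```python
import io

import pandas as pd
import requests


def read_csv_from_url(url, auth=None, timeout=10, **read_csv_kwargs):
    """Download a CSV over HTTP and parse it with pandas.

    Hypothetical helper. `auth` is passed straight to requests.get,
    e.g. a (user, password) tuple for HTTP Basic authentication.
    """
    response = requests.get(url, auth=auth, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx instead of parsing an error page as CSV.
    response.raise_for_status()
    # response.text is already decoded to str, so StringIO can wrap it directly.
    return pd.read_csv(io.StringIO(response.text), **read_csv_kwargs)
```

Calling read_csv_from_url(url) with the raw URL from the example should give the same DataFrame as the snippet above; add auth=("user", "password") only if the server requires it.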


As I commented, you need to use a StringIO object and decode, i.e. c = pd.read_csv(io.StringIO(s.decode("utf-8"))). If you are using requests, you need to decode because .content returns bytes. If you used .text instead, you would just pass s as is: s = requests.get(url).text, then c = pd.read_csv(io.StringIO(s)).
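The bytes-versus-text distinction can be tried without any network access. In this sketch the content variable is a stand-in for requests.get(url).content, holding only two rows copied from the sample output; the io.BytesIO variant is an alternative not mentioned above, shown for comparison.

```python
import io

import pandas as pd

# Stand-in for requests.get(url).content: raw bytes as they arrive over HTTP.
# Only two rows from the sample output are used; the real file is longer.
content = b"Country,Region\nAlgeria,AFRICA\nAngola,AFRICA\n"

# Option 1: decode the bytes to str, then wrap the text in a StringIO buffer.
from_text = pd.read_csv(io.StringIO(content.decode("utf-8")))

# Option 2: wrap the bytes in a binary buffer and let pandas do the decoding.
from_bytes = pd.read_csv(io.BytesIO(content), encoding="utf-8")

# Both routes should produce the same DataFrame.
print(from_text.equals(from_bytes))
```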

A simpler approach is to pass the correct URL of the raw data directly to read_csv. You don't have to pass a file-like object; read_csv accepts a URL, so you don't need requests at all:

c = pd.read_csv("https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv")
print(c)

Output:

                              Country         Region
0                             Algeria         AFRICA
1                              Angola         AFRICA
2                               Benin         AFRICA
3                            Botswana         AFRICA
4                             Burkina         AFRICA
5                             Burundi         AFRICA
6                            Cameroon         AFRICA
..................................

From the docs:

filepath_or_buffer : string or file handle / StringIO

The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. For instance, a local file could be file://localhost/path/to/table.csv
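The file scheme from the docs excerpt can be tried locally. The sketch below writes a tiny CSV to a temporary directory and reads it back through a file://localhost/... URL; the file contents and variable names are made up for illustration, and the URL construction assumes a POSIX-style absolute path.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write a small CSV so the example is self-contained.
    csv_path = Path(tmp_dir) / "table.csv"
    csv_path.write_text("Country,Region\nAlgeria,AFRICA\n", encoding="utf-8")

    # Path.as_uri() gives file:///...; insert the host the docs ask for.
    file_url = csv_path.as_uri().replace("file://", "file://localhost", 1)

    # read_csv loads the whole file here, so df stays usable
    # after the temporary directory is removed.
    df = pd.read_csv(file_url)

print(df)
```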