Reading Files in HDFS (Hadoop filesystem) directories into a Pandas dataframe

It looks like the pydoop.hdfs module solves this problem while meeting most of the goals.
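A minimal, untested sketch of how that might look, assuming the directory is flat and holds CSV part files that share a header. The helper name `read_hdfs_dir` and its `lister`/`opener` parameters are hypothetical and only there for illustration; by default they fall back to `pydoop.hdfs.ls` and `pydoop.hdfs.open`.

```python
import io

import pandas as pd


def read_hdfs_dir(directory, lister=None, opener=None, **read_csv_kwargs):
    """Read every data file in `directory` into a single DataFrame.

    Hypothetical helper: `lister` and `opener` default to pydoop.hdfs.ls and
    pydoop.hdfs.open, but can be swapped out (e.g. for local testing).
    """
    if lister is None or opener is None:
        # Imported lazily so the function can be tried without pydoop installed.
        import pydoop.hdfs as hdfs
        lister = lister or hdfs.ls
        opener = opener or hdfs.open

    frames = []
    for path in sorted(lister(directory)):
        name = path.rsplit("/", 1)[-1]
        if name.startswith(("_", ".")):
            continue  # skip marker/hidden files such as _SUCCESS

        with opener(path) as f:
            data = f.read()

        # Depending on the opener, read() may return bytes or str.
        buf = io.BytesIO(data) if isinstance(data, bytes) else io.StringIO(data)
        frames.append(pd.read_csv(buf, **read_csv_kwargs))

    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

Usage would be something like `df = read_hdfs_dir("/user/hduser/in")`, where the path is just a placeholder. This reads each file fully into memory before parsing, so it only suits data that fits in RAM, and it does not handle subdirectories or empty part files.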

I was not able to evaluate this, as pydoop has very strict build requirements and my Hadoop version is a bit dated.

CodeHunter