How can I remove extra whitespace from strings when parsing a csv file in Pandas?
You could use converters:
import pandas as pddef strip(text): try: return text.strip() except AttributeError: return textdef make_int(text): return int(text.strip('" '))table = pd.read_table("data.csv", sep=r',', names=["Year", "Make", "Model", "Description"], converters = {'Description' : strip, 'Model' : strip, 'Make' : strip, 'Year' : make_int})print(table)
yields
Year Make Model Description0 1997 Ford E350 None1 1997 Ford E350 None2 1997 Ford E350 Super, luxurious truck3 1997 Ford E350 Super "luxurious" truck4 1997 Ford E350 Super luxurious truck5 1997 Ford E350 None6 1997 Ford E350 None7 2000 Mercury Cougar None
Adding parameter skipinitialspace=True
to read_table
worked for me.
So try:
pd.read_table("data.csv", sep=r',', names=["Year", "Make", "Model", "Description"], skipinitialspace=True)
Same thing works in pd.read_csv()
.