Python: skip comment lines marked with # in csv.DictReader


Actually this works nicely with filter:

    import csv

    fp = open('samples.csv')
    rdr = csv.DictReader(filter(lambda row: row[0]!='#', fp))
    for row in rdr:
        print(row)
    fp.close()
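The filter approach only catches lines whose very first character is #. As a minimal sketch of a slightly more tolerant variant, the predicate below also skips comment lines that start with whitespace. The `not_comment` helper, the in-memory sample, and its column names are made up for illustration; `io.StringIO` stands in for an open file.

```python
import csv
import io

# Stand-in for an open file; the data and column names are invented for this sketch.
sample = io.StringIO(
    "# exported by some tool\n"
    "name,count\n"
    "  # indented comment\n"
    "apple,3\n"
    "pear,5\n"
)

def not_comment(line):
    # Keep the line unless its first non-blank character is '#'.
    return not line.lstrip().startswith('#')

# filter() lazily drops comment lines before DictReader ever sees them.
rows = list(csv.DictReader(filter(not_comment, sample)))
for row in rows:
    print(row)
```

This still treats a # later in the line as ordinary data, so trailing comments after the fields are not removed.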


Good question. Python's csv library has no built-in support for comments, even though comment lines are not uncommon at the top of CSV files. While Dan Stowell's solution works for the OP's specific case, it is limited in that # must be the first character of the line. A more general solution would be:

    import csv

    def decomment(csvfile):
        for row in csvfile:
            # keep only the text before the first '#', without surrounding whitespace
            raw = row.split('#')[0].strip()
            if raw: yield raw

    with open('dummy.csv') as csvfile:
        reader = csv.reader(decomment(csvfile))
        for row in reader:
            print(row)

As an example, the following dummy.csv file:

    # comment
     # comment
    a,b,c # comment
    1,2,3
    10,20,30
    # comment

returns

    ['a', 'b', 'c']
    ['1', '2', '3']
    ['10', '20', '30']

Of course, this works just as well with csv.DictReader().
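As a sketch of the csv.DictReader() case, the snippet below feeds the same generator into DictReader. It repeats the decomment definition so it runs on its own, and uses an in-memory copy of dummy.csv (via `io.StringIO`, an assumption made here to avoid needing a file on disk).

```python
import csv
import io

def decomment(csvfile):
    # Drop everything from the first '#' on each line and skip lines left empty.
    for row in csvfile:
        raw = row.split('#')[0].strip()
        if raw:
            yield raw

# In-memory copy of dummy.csv so the sketch runs without a file on disk.
dummy = io.StringIO(
    "# comment\n"
    " # comment\n"
    "a,b,c # comment\n"
    "1,2,3\n"
    "10,20,30\n"
    "# comment\n"
)

# The first surviving line (a,b,c) becomes the header row.
records = list(csv.DictReader(decomment(dummy)))
for record in records:
    print(record)
```

Because the split happens before the CSV parser runs, a # inside a quoted field would be cut off as well, so this suits files where # never appears in the data.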


Another way to read a CSV file with comments is to use pandas.

Here's some sample code:

    import pandas as pd

    df = pd.read_csv('test.csv',
                     sep=',',     # field separator
                     comment='#', # comment
                     index_col=0, # number or label of index column
                     skipinitialspace=True,
                     skip_blank_lines=True,
                     # pandas >= 1.3; older versions used
                     # error_bad_lines=False, warn_bad_lines=True instead
                     on_bad_lines='warn'
                     ).sort_index()
    print(df)
    df.fillna('no value', inplace=True) # replace NaN with 'no value'
    print(df)

For this csv file:

    a,b,c,d,e
    1,,16,,55#,,65##77
    8,77,77,,16#86,18#
    #This is a comment
    13,19,25,28,82

we will get this output:

           b   c     d   e
    a
    1    NaN  16   NaN  55
    8   77.0  77   NaN  16
    13  19.0  25  28.0  82
               b   c         d   e
    a
    1   no value  16  no value  55
    8         77  77  no value  16
    13        19  25        28  82