How to replace all '0xa0' chars with a ' ' in a bunch of text files?
OK, first point: your output file is set to automatically encode text written to it as utf-8, so don't include an explicit encode('utf-8') method call when passing arguments to the write() method.
So the first thing to try is to simply use the following in your inner loop:
writer.write(line)
If that doesn't work, then the problem is almost certainly the fact that, as others have noted, you aren't decoding your input file properly.
Taking a wild guess and assuming that your input files are encoded in cp1252, you could try the following in the inner loop as a quick test:
for line in codecs.open(infile, 'r', 'cp1252'):
    writer.write(line)
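Putting the pieces together, here is a minimal sketch of the whole round trip, including the replacement the question asks about. It assumes Python 3 and uses the built-in open() with an encoding argument instead of codecs.open; the function name replace_nbsp is made up for illustration, and cp1252 is still only a guess at the input encoding.

```python
def replace_nbsp(infile, outfile, in_encoding="cp1252"):
    """Copy infile to outfile as UTF-8, turning each U+00A0 into a plain space.

    replace_nbsp is a hypothetical helper; in_encoding defaults to the
    cp1252 guess and should be changed if the input is encoded differently.
    """
    # Decode on read and encode on write, so the loop only ever sees text.
    # newline="" keeps the original line endings untouched.
    with open(infile, "r", encoding=in_encoding, newline="") as reader, \
         open(outfile, "w", encoding="utf-8", newline="") as writer:
        for line in reader:
            # The byte 0xA0 in cp1252 decodes to U+00A0 (no-break space).
            writer.write(line.replace("\u00a0", " "))
```

Because both files are opened with an explicit encoding, there is no encode() or decode() call anywhere in the loop. If the input is not really cp1252, decoding may raise UnicodeDecodeError or silently produce the wrong characters, which is a useful signal that the guess was wrong.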
Minor point: 'wtr' is a nonsensical mode string: it mixes the write and read flags, and 'w' on its own opens the file for writing only. Simplify it to either 'wt' or even just 'w'.