Remove Unicode characters from text files: sed and other Bash/shell methods
Remove all non-ASCII characters from file.txt, using either of these commands:
$ iconv -c -f utf-8 -t ascii file.txt
$ strings file.txt
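As a rough sketch of one more shell-only route, tr can delete every byte outside the 7-bit ASCII range. This is a different technique from iconv or strings, and the function name strip_non_ascii is a hypothetical helper introduced here for illustration only.

```shell
# Hypothetical helper: delete every byte outside the 7-bit ASCII range.
# LC_ALL=C makes tr work on single bytes rather than multibyte characters.
strip_non_ascii() {
    LC_ALL=C tr -cd '\000-\177'
}

# Usage: filter a sample line that contains one non-ASCII character (e-acute,
# encoded in UTF-8 as the two bytes \303\251).
printf 'caf\303\251 au lait\n' | strip_non_ascii
```

To process a file, redirect it into the function, for example strip_non_ascii < file.txt > file.ascii.txt (the output name is just an example). Unlike strings, this keeps short lines and all ASCII control characters such as tabs.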
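To remove only particular characters, and if Python is available, build the character set with Python and pass it to sed. This needs a UTF-8 locale, so that sed treats each listed character as a unit rather than as separate bytes: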
CHARS=$(python3 -c 'import sys; sys.stdout.buffer.write("\u0091\u0092\u00a0\u200e".encode("utf-8"))')
sed 's/['"$CHARS"']//g' < /tmp/utf8_input.txt > /tmp/ascii_output.txt
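If Python is not available, the same four characters can probably be removed with printf and sed alone. The sketch below is an assumption-laden variant, not the method above: it spells out each character's UTF-8 byte sequence as octal escapes and matches those bytes literally, so it does not depend on a UTF-8 locale. The function name strip_chars and the variable names are hypothetical.

```shell
# Hypothetical helper: strip four specific characters by matching their
# UTF-8 byte sequences, written as octal escapes so no Python is needed.
strip_chars() {
    c91=$(printf '\302\221')       # U+0091
    c92=$(printf '\302\222')       # U+0092
    nbsp=$(printf '\302\240')      # U+00A0 no-break space
    lrm=$(printf '\342\200\216')   # U+200E left-to-right mark
    # LC_ALL=C makes sed match raw bytes, independent of the user's locale.
    LC_ALL=C sed -e "s/$c91//g" -e "s/$c92//g" -e "s/$nbsp//g" -e "s/$lrm//g"
}

# Usage: the sample contains a no-break space and a left-to-right mark.
printf 'a\302\240b\342\200\216c\n' | strip_chars
```

Because each character gets its own substitution instead of sharing one bracket expression, other non-ASCII text in the input is left untouched.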