
Removing non-alphanumeric characters with sed


tr's -c (complement) flag may be an option:

echo "Â10.41.89.50-._ " | tr -cd '[:alnum:]._-'
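A minimal sketch of the same tr approach wrapped in a function, for reuse in a script. The name keep_allowed is hypothetical, and forcing LC_ALL=C is an assumption made here so that only ASCII letters and digits count as alphanumeric, whatever the caller's locale is:

```shell
# Hypothetical helper: delete (-d) every character in the complement (-c)
# of the allowed set: ASCII letters, digits, period, underscore, hyphen.
keep_allowed() {
    # LC_ALL=C is an assumption: it keeps [:alnum:] limited to ASCII.
    printf '%s' "$1" | LC_ALL=C tr -cd '[:alnum:]._-'
    # tr -cd also deletes newlines, so print one explicitly.
    printf '\n'
}

keep_allowed "Â10.41.89.50-._ "
```

Because the newline is part of the complement too, the function adds one back after tr has run.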


You might want to use the [:alpha:] class instead:

echo "Â10.41.89.50 " | sed "s/[[:alpha:]]//g"

should work. If not, you might need to change your locale settings.
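How much the locale matters can be seen in a small sketch. It assumes you want to keep letters, digits, periods, underscores and hyphens, and it forces the C locale for a single command (that choice is an assumption here, not something the answer prescribes):

```shell
dirty="Â10.41.89.50 "

# Assumption: force the C locale for this one command. There only ASCII
# letters and digits are alphanumeric, so the negated class deletes the
# bytes of "Â" as well as the trailing space.
printf '%s\n' "$dirty" | LC_ALL=C sed 's/[^[:alnum:]._-]//g'
```

In a UTF-8 locale the same expression may keep the Â, because it counts as a letter there.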

On the other hand, if you only want to keep digits, hyphens and periods:

echo "Â10.41.89.50 " | sed "s/[^[:digit:].-]//g"

If your string is in a variable, you can use pure bash and parameter expansions for that:

$ dirty="Â10.41.89.50 "
$ clean=${dirty//[^[:digit:].-]/}
$ echo "$clean"
10.41.89.50

or

$ dirty="Â10.41.89.50 "
$ clean=${dirty//[[:alpha:]]/}
$ echo "$clean"
10.41.89.50
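The parameter expansion can also be wrapped in a function if you need it in several places. This is only a sketch: the name keep_ip_chars is hypothetical, and it needs bash, since ${var//pattern/} is not POSIX sh:

```shell
# Hypothetical helper (bash only): keep digits, periods and hyphens.
keep_ip_chars() {
    local dirty=$1
    # ${var//pattern/} deletes every match of the pattern; the bracket
    # expression matches any single character that is not a digit, '.' or '-'.
    printf '%s\n' "${dirty//[^[:digit:].-]/}"
}

keep_ip_chars "Â10.41.89.50 "
```

No external process is started, which can matter when the cleanup runs in a tight loop.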

You can also have a look at 1_CR's answer.


Well, sed may not handle Unicode characters the way you expect, since its character classes depend on the locale. Use perl instead:

> s="Â10.41.89.50 "
> perl -pe 's/[^\w.-]+//g' <<< "$s"
10.41.89.50