Replace non-ASCII characters in a string


This will find and remove all non-ASCII characters:

String resultString = subjectString.replaceAll("[^\\x00-\\x7F]", "");
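For repeated use, the same pattern can be compiled once and wrapped in a helper. The following is only a sketch: the class name `NonAsciiStripper` and the method `stripNonAscii` are hypothetical names chosen for illustration, not part of any library.

```java
import java.util.regex.Pattern;

final class NonAsciiStripper {
    // Compiled once; matches any character outside U+0000..U+007F.
    private static final Pattern NON_ASCII = Pattern.compile("[^\\x00-\\x7F]");

    // Hypothetical helper: returns the input with every non-ASCII character removed.
    static String stripNonAscii(String input) {
        return NON_ASCII.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // "naïve café", written with Unicode escapes to avoid source-encoding issues
        System.out.println(stripNonAscii("na\u00EFve caf\u00E9")); // prints: nave caf
    }
}
```

Note that the accented letters are dropped entirely here; nothing is substituted for them.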


FailedDev's answer is good, but can be improved. If you want to preserve the ASCII equivalents of accented characters, you need to normalize first:

import java.text.Normalizer;
String subjectString = "öäü";
subjectString = Normalizer.normalize(subjectString, Normalizer.Form.NFD);
String resultString = subjectString.replaceAll("[^\\x00-\\x7F]", "");
// resultString is now "oau"

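That way, characters like "öäü" will be mapped to "oau", which at least preserves some information: NFD splits each accented letter into its base letter plus a combining mark, and the regex then removes only the mark. Without normalization, the resulting String will be empty.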
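To make the difference concrete, here is a small sketch comparing the two approaches. The class name `AsciiFoldingExample` and the method `foldToAscii` are made up for this example. It also shows a limitation: characters with no canonical decomposition, such as "ß", are still removed.

```java
import java.text.Normalizer;

final class AsciiFoldingExample {
    // Hypothetical helper: decompose first, then drop everything non-ASCII.
    static String foldToAscii(String input) {
        // NFD turns "ö" (U+00F6) into "o" followed by a combining diaeresis (U+0308).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // The combining marks are non-ASCII, so only they are removed here.
        return decomposed.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        String precomposed = "\u00F6\u00E4\u00FC"; // "öäü" in precomposed form

        // Without normalization every character is stripped.
        System.out.println(precomposed.replaceAll("[^\\x00-\\x7F]", "").isEmpty()); // prints: true

        // With normalization the base letters survive.
        System.out.println(foldToAscii(precomposed)); // prints: oau

        // "ß" (U+00DF) has no decomposition, so it is dropped: "Straße" becomes "Strae".
        System.out.println(foldToAscii("Stra\u00DFe")); // prints: Strae
    }
}
```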


This would be the Unicode-block solution:

String s = "A função, Ãugent";
String r = s.replaceAll("\\P{InBasic_Latin}", "");

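\p{InBasic_Latin} matches any character in the Basic Latin Unicode block, which covers all characters (not only letters) in the range U+0000..U+007F (see regular-expressions.info).

\P{InBasic_Latin} is the negation of \p{InBasic_Latin}: it matches any character outside that block.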
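Since the Basic Latin block and the range U+0000..U+007F cover the same characters, the block-based pattern should behave like the hex-range pattern from the first answer. Java's regex engine also offers the POSIX-style class `\p{ASCII}` for the same range. A short sketch to compare them (the class name `BasicLatinExample` and the method `keepBasicLatin` are illustrative only):

```java
final class BasicLatinExample {
    // Hypothetical helper: removes everything outside the Basic Latin block.
    static String keepBasicLatin(String input) {
        return input.replaceAll("\\P{InBasic_Latin}", "");
    }

    public static void main(String[] args) {
        String s = "A fun\u00E7\u00E3o"; // "A função"

        String byBlock = keepBasicLatin(s);
        String byRange = s.replaceAll("[^\\x00-\\x7F]", "");
        String byPosix = s.replaceAll("\\P{ASCII}", "");

        System.out.println(byBlock);                 // prints: A funo
        System.out.println(byBlock.equals(byRange)); // prints: true
        System.out.println(byBlock.equals(byPosix)); // prints: true
    }
}
```

Which spelling to use is mostly a matter of readability; the block name states the intent, while the hex range is more widely recognized across regex flavors.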