Illegal character - CTRL-CHAR


I would do what OrangeDog suggests. But if you want to solve it in your code, try:

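replaceAll("[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F\\x7F]", "")

This removes the control characters that XML 1.0 forbids (plus DEL, \\x7F) while keeping tab (\\x09), line feed (\\x0A) and carriage return (\\x0D), which are legal. \\x0F (code 15) is the char.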
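For illustration, here is a minimal sketch of that stripping step as a reusable helper. It assumes the goal is to drop only the C0 control characters that XML 1.0 does not allow and to keep tab, line feed and carriage return. The class and method names are made up for this example.

```java
import java.util.regex.Pattern;

class XmlControlCharStripper {
    // C0 control characters that XML 1.0 does not allow. Tab (\x09),
    // line feed (\x0A) and carriage return (\x0D) are legal and are kept.
    private static final Pattern ILLEGAL =
            Pattern.compile("[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F]");

    // Returns the text with every illegal control character removed.
    static String strip(String text) {
        return ILLEGAL.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String raw = "New\u000FSfn\tok\n";   // \u000F is code 15
        String clean = strip(raw);
        System.out.println(clean.equals("NewSfn\tok\n")); // prints "true"
    }
}
```

Compiling the pattern once avoids recompiling the regex on every call, which is what String.replaceAll does internally.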


This error is being thrown by the Woodstox XML parser. The source code from the InputBootstrapper class looks like this:

protected void reportUnexpectedChar(int i, String msg)
    throws WstxException
{
    char c = (char) i;
    String excMsg;
    // WTF? JDK thinks null char is just fine as?!
    if (Character.isISOControl(c)) {
        excMsg = "Unexpected character (CTRL-CHAR, code "+i+")"+msg;
    } else {
        excMsg = "Unexpected character '"+c+"' (code "+i+")"+msg;
    }
    Location loc = getLocation();
    throw new WstxUnexpectedCharException(excMsg, loc, c);
}

Amusing comment aside, Woodstox is performing some additional validation on top of the JDK parser, and is rejecting the ASCII character 15 (0x0F) as invalid.
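To see where such characters sit in your own data before the parser ever touches it, a small scan like the following may help. This is only a sketch and not Woodstox's own logic: it assumes you care about the C0 control characters that XML 1.0 forbids, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class XmlIllegalCharFinder {
    // True for C0 control characters other than tab, line feed and
    // carriage return, i.e. the ones XML 1.0 does not allow.
    static boolean isIllegalXmlControl(char c) {
        return c < 0x20 && c != '\t' && c != '\n' && c != '\r';
    }

    // Returns the index of every illegal control character in the text.
    static List<Integer> findIllegal(String text) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            if (isIllegalXmlControl(text.charAt(i))) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String data = "New\u000FSfn";   // \u000F is code 15
        for (int pos : findIllegal(data)) {
            // prints "CTRL-CHAR, code 15 at index 3"
            System.out.println("CTRL-CHAR, code " + (int) data.charAt(pos)
                    + " at index " + pos);
        }
    }
}
```

Note that Character.isISOControl is broader than this check: it also returns true for tab, line feed, carriage return and the range 0x7F to 0x9F.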

As to why that character is there, we can't tell you; it's in your data. Similarly, we can't tell you whether removing it will break anything, since again, it's your data. You can only establish that for yourself.


Thanks, guys, for your input. I am sharing my solution in case it helps others. The requirement was not to strip the control character: it should remain as it is in the DB, and when the web service sends it across the network, the client should still receive it. So I implemented it as follows:

  1. Encode the strings using URLEncoder in the web service code.
  2. Decode them on the client side using URLDecoder.

Sample code and output are below.
Sample code:

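// Needs: import java.net.URLDecoder; import java.net.URLEncoder;
// \u000F is the control character (code 15).
System.out.println("New\u000FSfn");
System.out.println(URLEncoder.encode("New\u000FSfn", "UTF-8"));
System.out.println(URLDecoder.decode("New%0FSfn", "UTF-8"));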

Output:

NewSfn
New%0FSfn
NewSfn

So the client will receive the control characters.

EDIT: Stack Exchange does not display the control character above. NewSfn is actually New(CONTROL CHAR)Sfn.