Special Characters in XML


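You are trying to use an HTML entity in a document that is neither HTML nor XHTML. Named entities such as &raquo; are declared in the HTML/XHTML Document Type Definition (DTD). XML itself predefines only &lt;, &gt;, &amp;, &apos; and &quot;, so a parser that never sees that DTD does not know the others.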

You should use the numeric character reference instead of the named entity. For example, instead of &raquo; you should use &#187;
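As a quick check of the difference, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The element name foo and the variable names are only illustrative, and other parsers may word their errors differently:

```python
import xml.etree.ElementTree as ET

# A numeric character reference needs no DTD; any XML parser accepts it.
numeric = ET.fromstring("<foo>&#187;</foo>")
print(numeric.text == "\u00bb")

# The named HTML entity is not declared anywhere, so the parser rejects it.
try:
    ET.fromstring("<foo>&raquo;</foo>")
    named_rejected = False
except ET.ParseError:
    named_rejected = True
print(named_rejected)
```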

Alternatively, you can define them in your XML document's DTD:

<!ENTITY entity-name "entity-value">
<!ENTITY raquo "&#187;">
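To illustrate the declaration in context, here is a small Python sketch that puts it in an internal DTD subset. It assumes a parser that expands internal entities, which xml.etree.ElementTree does. The root element foo and the text "Next" are made up for the example:

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares raquo, so the reference below resolves.
doc = (
    '<!DOCTYPE foo [<!ENTITY raquo "&#187;">]>'
    "<foo>Next &raquo;</foo>"
)
root = ET.fromstring(doc)
print(root.text == "Next \u00bb")
```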

Otherwise, if your document is UTF-8, I believe you can just use the actual character directly in your XML document.

»
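A rough way to confirm this, sketched in Python with xml.etree.ElementTree (the element name foo is just a placeholder): encode the document as UTF-8 with the literal character in it and parse the bytes.

```python
import xml.etree.ElementTree as ET

# Build the document as UTF-8 bytes with the literal character in the text.
raw = '<?xml version="1.0" encoding="UTF-8"?><foo>\u00bb</foo>'.encode("utf-8")
literal = ET.fromstring(raw)
print(literal.text == "\u00bb")
```

The important part is that the bytes on disk really match the encoding the document declares; a file saved as Latin-1 but declared as UTF-8 would fail.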


Did you specify a doctype for your file?

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

I think you might get such errors if you forget to specify it.

Also sometimes the entities work if you specify them by number instead of name.

&#187; &#171; instead of &raquo; and &laquo;


You don't need to declare an entity in your DTD, or even use a DTD. You probably don't need to use the Unicode representation of the character. You certainly don't need to use a CDATA section.

What you need to do is use a DOM to build your XML instead of trying to build it with string manipulation. The DOM will fix this problem for you.

In C#, this code:

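    // Requires: using System.Diagnostics; using System.Xml;
    XmlDocument d = new XmlDocument();
    d.LoadXml("<foo/>");

    // Case 1: the literal character U+00BB (decimal 187).
    char c = (char)187;
    d.DocumentElement.InnerText = "Here's that character: " + c;
    Debug.WriteLine(d.OuterXml);

    // Case 2: the seven characters "&raquo;" as ordinary text.
    d.DocumentElement.InnerText = "Here it is as an HTML entity: &raquo;";
    Debug.WriteLine(d.OuterXml);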

produces this output:

    <foo>Here's that character: »</foo>
    <foo>Here it is as an HTML entity: &amp;raquo;</foo>

As you can see from the first example, the » character is perfectly legal in XML text. But I don't think you're trying to represent that character.

I think you're trying to do what's in the second example, based on the error message that you reported. You're trying to represent the string of characters &raquo;. The proper way to represent that string of characters in XML text is by escaping the ampersand, thus: &amp;raquo;.

So if you must use string manipulation to build your XML, just make sure that you escape any ampersands in your source data. Not to belabor the point, but if you were using a DOM, this would have been done for you automatically.
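If you do end up concatenating strings, the escaping step might look like this minimal Python sketch. It uses xml.sax.saxutils.escape from the standard library; the variable names and the foo element are only for illustration:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

source = "Here it is as an HTML entity: &raquo;"

# escape() replaces &, < and > so the data is safe inside element text.
xml_text = "<foo>" + escape(source) + "</foo>"
print(xml_text)

# Parsing the result gives back the original string unchanged.
round_trip = ET.fromstring(xml_text).text
print(round_trip == source)
```

Note that escape() only covers element text by default; attribute values also need their quotes escaped, which is one more reason to let a DOM or an XML writer do this for you.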

One other thing. It's quite likely that in your original question, which now reads "I am using »", what you actually typed is "I am using &raquo;". The rendered post doesn't look like that, though. If you need to represent text literally in markdown, enclose it in backticks; otherwise, HTML entities will get converted to their character representation when the post is rendered.