
Escape invalid XML characters in C#


To remove invalid XML characters, I suggest using the XmlConvert.IsXmlChar method. It has been available since .NET Framework 4 and is also present in Silverlight. Here is a small sample:

void Main() {
    string content = "\v\f\0";
    Console.WriteLine(IsValidXmlString(content)); // False

    content = RemoveInvalidXmlChars(content);
    Console.WriteLine(IsValidXmlString(content)); // True
}

// Keeps only the characters that XmlConvert.IsXmlChar accepts.
static string RemoveInvalidXmlChars(string text) {
    var validXmlChars = text.Where(ch => XmlConvert.IsXmlChar(ch)).ToArray();
    return new string(validXmlChars);
}

// VerifyXmlChars throws if the string contains an invalid XML character.
static bool IsValidXmlString(string text) {
    try {
        XmlConvert.VerifyXmlChars(text);
        return true;
    } catch {
        return false;
    }
}
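One caveat with the sample above: XmlConvert.IsXmlChar looks at a single char, so it returns false for each half of a surrogate pair, and the filter also drops valid characters outside the Basic Multilingual Plane (emoji, for example). Below is a minimal sketch of a variant that keeps valid pairs by also checking XmlConvert.IsXmlSurrogatePair. The XmlSanitizer class name is my own and not part of the framework.

```csharp
using System;
using System.Text;
using System.Xml;

static class XmlSanitizer
{
    // Hypothetical helper: drops characters that XML 1.0 forbids,
    // but keeps valid surrogate pairs intact.
    public static string RemoveInvalidXmlChars(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char ch = text[i];
            if (XmlConvert.IsXmlChar(ch))
            {
                sb.Append(ch);
            }
            else if (i + 1 < text.Length &&
                     XmlConvert.IsXmlSurrogatePair(text[i + 1], ch))
            {
                // IsXmlSurrogatePair takes the low surrogate first, then the high one.
                sb.Append(ch).Append(text[i + 1]);
                i++; // skip the low surrogate that was just appended
            }
            // Anything else (control characters, lone surrogates) is dropped.
        }
        return sb.ToString();
    }
}
```

With this version, a string such as "x\uD83D\uDE00y" comes back unchanged, while "a\v\f\0b" is reduced to "ab".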

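To escape invalid XML characters instead of removing them, I suggest using the XmlConvert.EncodeName method. Keep in mind that it is designed for XML names, so it also encodes characters that are valid in XML content but not in a name, such as spaces. Here is a small sample: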

void Main() {
    const string content = "\v\f\0";
    Console.WriteLine(IsValidXmlString(content)); // False

    string encoded = XmlConvert.EncodeName(content);
    Console.WriteLine(IsValidXmlString(encoded)); // True

    string decoded = XmlConvert.DecodeName(encoded);
    Console.WriteLine(content == decoded); // True
}

// VerifyXmlChars throws if the string contains an invalid XML character.
static bool IsValidXmlString(string text) {
    try {
        XmlConvert.VerifyXmlChars(text);
        return true;
    } catch {
        return false;
    }
}

Update: Note that encoding produces a string whose length is greater than or equal to the length of the source string. This can matter when you store the encoded string in a database column with a length limit but validate the length of the source string in your app against that limit.
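To make the length issue concrete, here is a minimal sketch that validates the encoded length rather than the source length. The EncodedLength class and the maxLength parameter are hypothetical names for illustration. Each escaped character becomes a sequence of the form _xHHHH_, so a single char can grow to seven.

```csharp
using System;
using System.Xml;

static class EncodedLength
{
    // Hypothetical helper: true when the encoded form of `source`
    // fits in a column limited to `maxLength` characters.
    public static bool FitsAfterEncoding(string source, int maxLength)
    {
        string encoded = XmlConvert.EncodeName(source);
        return encoded.Length <= maxLength;
    }
}
```

For the three-character string "\v\f\0", FitsAfterEncoding(content, 3) returns false even though the source itself fits, because the encoded form is 21 characters long.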


Use SecurityElement.Escape

using System;
using System.Security;

class Sample {
    static void Main() {
        string text = "Escape characters : < > & \" \'";
        string xmlText = SecurityElement.Escape(text);

        // Output:
        // Escape characters : &lt; &gt; &amp; &quot; &apos;
        Console.WriteLine(xmlText);
    }
}


If you are writing XML, just use the classes provided by the framework to create it. You won't have to bother with escaping or anything.

Console.Write(new XElement("Data", "< > &"));

This will output:

<Data>&lt; &gt; &amp;</Data>
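XElement takes care of escaping markup characters, but as far as I know it does not silently drop characters such as \0 that XML does not allow at all. A minimal sketch that combines it with the IsXmlChar filter from the first answer might look like this. The SafeElement class and its ToXml method are hypothetical names, and the simple per-char filter also drops surrogate pairs.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

static class SafeElement
{
    // Hypothetical helper: filters out invalid characters first,
    // then lets XElement escape the remaining text.
    public static string ToXml(string name, string value)
    {
        string clean = new string(value.Where(ch => XmlConvert.IsXmlChar(ch)).ToArray());
        return new XElement(name, clean).ToString();
    }
}
```

Calling SafeElement.ToXml("Data", "< > &\0") removes the null character and escapes the remaining text, giving <Data>&lt; &gt; &amp;</Data>.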

If you need to read a malformed XML file, do not use regular expressions. Instead, use the Html Agility Pack.