XML serializing with XmlWriter via StringBuilder is utf-16 while via Stream is utf-8?
When you create an XmlWriter around a TextWriter, the XmlWriter always uses the encoding of the underlying TextWriter. The encoding of a StringWriter is always UTF-16, since that's how .NET strings are encoded internally.
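A minimal sketch of this behaviour, assuming a one-element document is enough to show it. The class names TextWriterEncodingDemo and Utf8StringWriter are hypothetical and not part of the framework; the subclass is a commonly used workaround that only changes the encoding the writer reports, which is what ends up in the XML declaration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: a StringWriter that reports UTF-8 instead of UTF-16.
// The characters are still held in a .NET string; only the reported
// encoding (and therefore the XML declaration) changes.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class TextWriterEncodingDemo
{
    // Writes a one-element document through the given writer and returns the text.
    public static string Write(StringWriter target)
    {
        // Requesting UTF-8 here has no effect: the TextWriter's encoding wins.
        var settings = new XmlWriterSettings { Encoding = Encoding.UTF8 };
        using (XmlWriter writer = XmlWriter.Create(target, settings))
        {
            writer.WriteElementString("root", "value");
        }
        return target.ToString();
    }
}
```

Calling TextWriterEncodingDemo.Write(new StringWriter()) should give a declaration naming utf-16, while passing a new Utf8StringWriter() should give one naming utf-8. If the string is later written to a file or socket, it still has to be encoded as UTF-8 at that point for the declaration to be truthful.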
When you create an XmlWriter around a Stream, there is no encoding defined for the Stream, so it uses the encoding specified in the XmlWriterSettings.
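To illustrate the Stream case, here is a small sketch in which the encoding is taken from XmlWriterSettings. The class and method names (StreamEncodingDemo, WriteBytes) are made up for the example, and the document content is a placeholder.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class StreamEncodingDemo
{
    // Writes a one-element document to a MemoryStream and returns the raw bytes.
    // A Stream carries no encoding of its own, so settings.Encoding decides
    // both the bytes produced and the name in the XML declaration.
    public static byte[] WriteBytes(Encoding encoding)
    {
        var settings = new XmlWriterSettings { Encoding = encoding };
        using (var stream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                writer.WriteElementString("root", "value");
            }
            // The writer is disposed (and flushed) before the bytes are read.
            return stream.ToArray();
        }
    }
}
```

Passing new UTF8Encoding(false) should produce UTF-8 bytes without a byte order mark, whereas Encoding.UTF8 and Encoding.Unicode include one. That matters if the bytes are later decoded with GetString, which keeps the mark as a leading character.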
The most elegant solution for me is to write to a MemoryStream and then use an Encoding to decode the bytes into a string, like so:

```csharp
using (MemoryStream memS = new MemoryStream())
{
    // set up the XML settings
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.OmitXmlDeclaration = OmitXmlHeader;
    settings.Encoding = encoding; // the writer must produce the same encoding that is decoded below

    using (XmlWriter writer = XmlWriter.Create(memS, settings))
    {
        // write the XML to the stream
        xmlSerializer.Serialize(writer, objectToSerialize);
    }

    // decode the memory stream's bytes into a string
    retString.Append(encoding.GetString(memS.ToArray()));
}
```

The decoding takes place at encoding.GetString(memS.ToArray()). The same encoding is assigned to settings.Encoding so that the writer and the decoder agree.
Where possible, the XmlWriter uses the encoding of the underlying stream. If it wrote UTF-8 data to a stream it knew was UTF-16, you'd end up with a mess. Writing UTF-16 data to a UTF-8 stream also causes problems, especially in environments that use null-terminated strings (like C/C++).
The StringBuilder/StringWriter presents a UTF-16 stream to the XmlWriter, so the XmlWriter ignores your requested setting and uses that.
In practice I usually don't emit the header; that way I can use a StringBuilder underneath and save a few lines of code spent switching encodings.
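A rough sketch of that approach, assuming the consumer of the XML does not need the declaration. NoDeclarationDemo is a hypothetical name and the single element is placeholder content.

```csharp
using System.Text;
using System.Xml;

public static class NoDeclarationDemo
{
    // Writes straight into a StringBuilder with the XML declaration omitted,
    // so no encoding name appears in the output at all.
    public static string Write()
    {
        var builder = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(builder, settings))
        {
            writer.WriteElementString("root", "value");
        }
        return builder.ToString();
    }
}
```

With no declaration in the text, the encoding question is deferred to whatever code eventually turns the string into bytes.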