SQL Server 2008 - Add XML Declaration to XML Output


TL;DR

Concatenate this: <?xml version="1.0" encoding="windows-1252" ?> with your XML, converted to varchar(max).

Details

I agree with j0N45 that the schema will not change anything. As the answer he references points out:

You have to add it manually.

I provided some example code to do so in another answer. Basically, you CONVERT the XML into varchar or nvarchar and then concatenate it with the XML declaration, such as <?xml version="1.0" encoding="windows-1252" ?>.
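As a rough illustration of that concatenation, here is a minimal sketch. The variable @xml and its sample contents are hypothetical stand-ins for your own query, and the encoding name assumes a database collation that uses code page 1252.

```sql
-- Hypothetical sample document; substitute your own FOR XML query.
DECLARE @xml xml = (
    SELECT 1 AS [id], 'example' AS [name]
    FOR XML PATH('row'), ROOT('rows'), TYPE
);

-- The xml type does not keep an XML declaration, so prepend it to the
-- varchar form. The encoding name assumes a code page 1252 collation.
SELECT '<?xml version="1.0" encoding="windows-1252" ?>'
     + CONVERT(varchar(max), @xml) AS XmlWithDeclaration;
```

The result is a plain varchar(max) string, not an xml value, so it should be returned or written out as text.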

However, it's important to choose the right encoding. SQL Server produces non-Unicode strings according to its collation settings. By default, that will be governed by the database collation settings, which you can determine using this SQL:

SELECT DATABASEPROPERTYEX('ExampleDatabaseName', 'Collation');

A common default collation is "SQL_Latin1_General_CP1_CI_AS", which has a code page of 1252. You can retrieve the code page with this SQL:

SELECT COLLATIONPROPERTY('SQL_Latin1_General_CP1_CI_AS', 'CodePage') AS 'CodePage';
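The two lookups above can probably be combined into one query against the current database, so that you don't need to type the collation name by hand. This is a sketch; it uses DB_NAME() for the current database rather than a hard-coded name.

```sql
-- DATABASEPROPERTYEX returns sql_variant, so convert it explicitly
-- before passing it to COLLATIONPROPERTY.
SELECT COLLATIONPROPERTY(
           CONVERT(nvarchar(128), DATABASEPROPERTYEX(DB_NAME(), 'Collation')),
           'CodePage') AS CodePage;
```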

For code page 1252, you should use an encoding name of "windows-1252". The use of "ISO-8859-1" is inaccurate. You can test that using the "bullet" character: •. It has a Unicode Code Point value of 8226 (Hex 2022). You can generate the character in SQL reliably, regardless of collation, using this code:

SELECT NCHAR(8226);

It also has a code point of 149 in the windows-1252 code page, so if you are using the common default collation of "SQL_Latin1_General_CP1_CI_AS", you can also produce it using:

SELECT CHAR(149);

However, CHAR(149) won't be a bullet in all collations. For example, if you try this:

SELECT CONVERT(char(1),char(149)) COLLATE Chinese_Hong_Kong_Stroke_90_BIN;

You don't get a bullet at all.
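One way to check what CHAR(149) maps to in your own database is to ask for its Unicode code point and compare it with the bullet's. This is only a sketch; the column aliases are made up for illustration. On a code page 1252 database both columns should come back as 8226, while under a different code page the first value may differ.

```sql
-- CHAR(149) is interpreted through the database collation's code page;
-- NCHAR(8226) is the bullet regardless of collation.
SELECT UNICODE(CHAR(149))   AS FromChar149,
       UNICODE(NCHAR(8226)) AS FromNchar8226;
```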

"ISO-8859-1" corresponds to Windows code page 28591. None of the SQL Server collations (in 2005 anyway) use that code page. You can list every collation together with its code page using:

SELECT [Name], [Description], [CodePage] = COLLATIONPROPERTY([Name], 'CodePage')
FROM ::fn_helpcollations()
ORDER BY [CodePage] DESC;

You can further verify that "ISO-8859-1" is the wrong choice by trying to use it in SQL itself. On a database with a code page 1252 collation, the following SQL:

SELECT CONVERT(xml,'<?xml version="1.0" encoding="ISO-8859-1"?><test>•</test>');

produces XML that does not contain a bullet. Indeed, it shows no visible character at all, because ISO-8859-1 defines no printable character at code point 149.

SQL Server handles Unicode strings differently. With Unicode strings (nvarchar), "there is no need for different code pages to handle different sets of characters". However, SQL Server does NOT use "UTF-8" encoding. If you try to use it within SQL itself:

SELECT CONVERT(xml,N'<?xml version="1.0" encoding="UTF-8"?><test></test>');

You will get an error:

Msg 9402, Level 16, State 1, Line 1
XML parsing: line 1, character 38, unable to switch the encoding

Rather, SQL Server uses "UCS-2" encoding for Unicode strings, so this will work:

SELECT CONVERT(xml,N'<?xml version="1.0" encoding="UCS-2"?><test></test>');
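If you need Unicode output, the same manual concatenation can be sketched with nvarchar and the "UCS-2" declaration. The variable names here are hypothetical, and the sketch assumes whatever consumes the string reads it as UCS-2/UTF-16 text.

```sql
-- Hypothetical sample document containing the bullet character.
DECLARE @xml xml = N'<test>' + NCHAR(8226) + N'</test>';

-- Prepend the declaration to the nvarchar form of the XML.
DECLARE @text nvarchar(max) =
    N'<?xml version="1.0" encoding="UCS-2"?>' + CONVERT(nvarchar(max), @xml);

-- Converting back to xml parses the string; the xml type does not
-- preserve the declaration, which is why it is added manually.
SELECT @text AS XmlWithDeclaration,
       CONVERT(xml, @text) AS RoundTripped;
```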


I think this answers your question: How to add xml encoding <?xml version="1.0" encoding="UTF-8"?> to xml Output in SQL Server.

I don't think creating a schema would change anything, because it is only used for validation.

Cheers