In SQL Server, what is the best way to determine if a given string is a valid XML or not? In SQL Server, what is the best way to determine if a given string is a valid XML or not? xml xml

In SQL Server, what is the best way to determine if a given string is a valid XML or not?


I would like to parse the XML data and extract the actual message.

Perhaps it is not necessary to check for valid XML. You could check for the presence of the appropriate xml tag with charindex in a case statement and extract the error message using substring.

Here is a sample with a simplified XML string but I think you get the idea.

declare @T table(ID int, Col1 nvarchar(1000))insert into @T values(1, 'No xml value 1'),(2, 'No xml value 2'),(3, '<root><item>Text value in xml</item></root>')select  case when charindex('<item>', Col1) = 0  then Col1  else    substring(Col1, charindex('<item>', Col1)+6, charindex('</item>', Col1)-charindex('<item>', Col1)-6)  end  from @T

Result:

No xml value 1No xml value 2Text value in xml


Based on the accepted answer, I created this to check for valid XML and optionally convert the input string to XML (or extract the required elements / attributes from the XML), because I figured out that TRY_CONVERT works succesfully in case you just pass in plain text, what I did not expect, so another check was required to prevent the final cast to XML I need to work in case the source column just holds some text (sample row 1):

declare @T table(ID int, Col1 nvarchar(1000))insert into @T values(1, 'random text value 1'),(2, '<broken> or invalid xml value 2'),(3, '<root><item>valid xml</item></root>')select id, Col1, Converted_XML = CASE         when [Col1] IS NULL THEN NULL                   /* NULL stays NULL */         when TRY_CONVERT(xml, [Col1]) is null THEN NULL /* Xml Document Error */        when CHARINDEX('<', [Col1]) < 1 AND CHARINDEX('>', [Col1]) < 1 THEN NULL        /* no xml */        else CONVERT(xml, [Col1])                       /* Parsing succesful. => in this case you can convert string to XML and/or extract the values */    END,    Result_Comment = CASE         when [Col1] IS NULL THEN 'NULL always stays NULL'        when TRY_CONVERT(xml, [Col1]) is null THEN 'Xml Document Error'        when CHARINDEX('<', [Col1]) < 1 AND CHARINDEX('>', [Col1]) < 1 THEN 'no xml'        else [Col1]    ENDFROM @T ;


You could create an XML schema and use it to validate against the XML strings.

See here for additional info : http://msdn.microsoft.com/en-us/library/ms176009.aspx

Here's an example:

CREATE XML SCHEMA COLLECTION UserSchemaCollection AS N'<?xml version="1.0" encoding="UTF-16"?>  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">    <xsd:element name = "User" >        <xsd:complexType>            <xsd:sequence>                <xsd:element name = "UserID" />                <xsd:element name = "UserName" />            </xsd:sequence>        </xsd:complexType>    </xsd:element>  </xsd:schema>';DECLARE @x XML(UserSchemaCollection)SELECT @x = '<User><UserID>1234</UserID><UserName>Sebastian</UserName></User>'

Examples:

DECLARE @y XML(UserSchemaCollection)SELECT @y = '<User><UserName>Sebastian</UserName></User>'

Msg 6965, Level 16, State 1, Line 2XML Validation: Invalid content. Expected element(s):UserID where element 'UserName' was specified. Location: /:User[1]/:UserName[1]

DECLARE @z XML(UserSchemaCollection)SELECT @z = 'Some text'

Msg 6909, Level 16, State 1, Line 2XML Validation: Text node is not allowed at this location, the type was defined with element only content or with simple content. Location: /