In SQL Server, what is the best way to determine if a given string is a valid XML or not?
I would like to parse the XML data and extract the actual message.
Perhaps it is not necessary to check for valid XML. You could check for the presence of the appropriate xml
tag with charindex
in a case statement and extract the error message using substring
.
Here is a sample with a simplified XML string but I think you get the idea.
declare @T table(ID int, Col1 nvarchar(1000))insert into @T values(1, 'No xml value 1'),(2, 'No xml value 2'),(3, '<root><item>Text value in xml</item></root>')select case when charindex('<item>', Col1) = 0 then Col1 else substring(Col1, charindex('<item>', Col1)+6, charindex('</item>', Col1)-charindex('<item>', Col1)-6) end from @T
Result:
No xml value 1No xml value 2Text value in xml
Based on the accepted answer, I created this to check for valid XML and optionally convert the input string to XML (or extract the required elements / attributes from the XML), because I figured out that TRY_CONVERT works succesfully in case you just pass in plain text, what I did not expect, so another check was required to prevent the final cast to XML I need to work in case the source column just holds some text (sample row 1):
declare @T table(ID int, Col1 nvarchar(1000))insert into @T values(1, 'random text value 1'),(2, '<broken> or invalid xml value 2'),(3, '<root><item>valid xml</item></root>')select id, Col1, Converted_XML = CASE when [Col1] IS NULL THEN NULL /* NULL stays NULL */ when TRY_CONVERT(xml, [Col1]) is null THEN NULL /* Xml Document Error */ when CHARINDEX('<', [Col1]) < 1 AND CHARINDEX('>', [Col1]) < 1 THEN NULL /* no xml */ else CONVERT(xml, [Col1]) /* Parsing succesful. => in this case you can convert string to XML and/or extract the values */ END, Result_Comment = CASE when [Col1] IS NULL THEN 'NULL always stays NULL' when TRY_CONVERT(xml, [Col1]) is null THEN 'Xml Document Error' when CHARINDEX('<', [Col1]) < 1 AND CHARINDEX('>', [Col1]) < 1 THEN 'no xml' else [Col1] ENDFROM @T ;
You could create an XML schema and use it to validate against the XML strings.
See here for additional info : http://msdn.microsoft.com/en-us/library/ms176009.aspx
Here's an example:
CREATE XML SCHEMA COLLECTION UserSchemaCollection AS N'<?xml version="1.0" encoding="UTF-16"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name = "User" > <xsd:complexType> <xsd:sequence> <xsd:element name = "UserID" /> <xsd:element name = "UserName" /> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:schema>';DECLARE @x XML(UserSchemaCollection)SELECT @x = '<User><UserID>1234</UserID><UserName>Sebastian</UserName></User>'
Examples:
DECLARE @y XML(UserSchemaCollection)SELECT @y = '<User><UserName>Sebastian</UserName></User>'
Msg 6965, Level 16, State 1, Line 2XML Validation: Invalid content. Expected element(s):UserID where element 'UserName' was specified. Location: /:User[1]/:UserName[1]
DECLARE @z XML(UserSchemaCollection)SELECT @z = 'Some text'
Msg 6909, Level 16, State 1, Line 2XML Validation: Text node is not allowed at this location, the type was defined with element only content or with simple content. Location: /