Microsoft SQL Server 2005/2008: XML vs text/varchar data type Microsoft SQL Server 2005/2008: XML vs text/varchar data type xml xml

Microsoft SQL Server 2005/2008: XML vs text/varchar data type


If you store xml in an xml typed column, the data will not get stored as simple text, as in the nvarchar case, it will be stored in some sort of parsed data tree, which in turn will be smaller than the unparsed xml version. This not only decreases the database size, but gives you other advantages, like validation, easy manipulation etc. (even though you're not using any of these, still, they are there for future use).

On the other hand, the server will have to parse the data upon insertion, which will probably slow your database down - you have to make a decision of speed vs. size.

Edit:

Personally, I think that data in the database should be stored as xml only when it has structure which is hard to implement in a relational model, e.g. layouts, style descriptions etc. Usually that means that there won't be much data and speed is not an issue, thus added xml features, like data validation and manipulation ability (also, last but not least, the ability to click on the value in managment studio and see formatted xml - I really love that feature!), outweight the costs.

I don't have direct experience in storing large amounts of xml in the database and I wouldn't do that if I had the option, since it is almost always slower that a relational model, but if that would be the case, I'd recommend profiling both options, and choosing between size and speed that best suit your needs.


My quick investigation shows that MS SQL 2005 (Express Edition)

Microsoft SQL Server 2005 - 9.00.3073.00 (Intel X86) Aug 5 2008 12:31:12 Copyright (c) 1988-2005 Microsoft Corporation Express Edition on Windows NT 6.0 (Build 6000: )

store XML with overhead about 70% (possible for faster processing/parsing).

My data before conversion: rows=160320, reserved=178576 KB, data=178184 KB, index_size=272 KB, unused=120 KB

My data after the conversion: rows=160320, reserved=309702 KB, data=307216 KB, index_size=1672 KB, unused=184 KB

So it hasn't any sense to store XML data in XML data type if you are not planning use XML technology on database side.


I am using xml extensively for communicating to handheld devices and am using XQuery in most all my stored procs to retrieve only the data from the XML that I need. It works GREAT!! I am just worried about storage space because with only a hundred thousand records or so, the DB size is 1 to 2 GB. We are expecting going live with millions of records stored for logging and customer usage.. so it will be worrisome until I see what it's actually going to do in terms of space usage.