
Why would people use pure XML databases over plain RDBMSs?


Conditions that suggest XML isn't a crazy idea

  • Your data looks like a collection of documents. For example, novels have structure: chapters, paragraphs, sentences, words. You might want to access that structure programmatically, but it would be hard to design a relational schema that supports it.

  • A mind-boggling number of fields and tables would be required, and almost all of them are optional. For example, not all novels have a villain, but a villain attribute or tag is easy enough to add to an XML document.

  • You have a fairly small amount of data.

  • The data is strongly hierarchical. It is easier to query an XML document of an organizational chart than to run the equivalent query against an employee table whose manager column links back to the same table.

Example: DasBlog, which uses plain old XML as its datastore.
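To make the org-chart point concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The chart, the employee names, and the `reports_of` helper are all made up for illustration. Nesting already encodes who reports to whom, so "everyone under a manager" is a single descendant walk, where a self-referencing employee table would typically need a recursive query or one self-join per level.

```python
import xml.etree.ElementTree as ET

# Hypothetical org chart: nesting encodes the reporting line.
ORG_CHART = """
<employee name="Ada" title="CEO">
  <employee name="Grace" title="CTO">
    <employee name="Linus" title="Engineer"/>
    <employee name="Barbara" title="Engineer"/>
  </employee>
  <employee name="Edgar" title="CFO"/>
</employee>
"""


def reports_of(root, manager_name):
    """Return the names of everyone below manager_name, at any depth."""
    if root.get("name") == manager_name:
        manager = root
    else:
        # './/employee' only matches descendants, hence the root check above.
        manager = root.find(f".//employee[@name='{manager_name}']")
    if manager is None:
        return []
    # iter() walks the whole subtree in document order, manager included.
    return [e.get("name") for e in manager.iter("employee") if e is not manager]


root = ET.fromstring(ORG_CHART)
print(reports_of(root, "Grace"))
print(reports_of(root, "Ada"))
```

The helper interpolates the name straight into the path expression, which is fine for a sketch but would need escaping for names containing quotes.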

Conditions that suggest a relational model is better

  • Most of your data fits nicely into tables and columns, with a fairly small number of fields, most of which are required.

  • There is a lot of data. The relational world has been optimizing for performance for much longer than the XML database world.

You can have it both ways

  • Most modern relational databases support XML as a first-class data type.


One advantage that hasn't been mentioned is that XML storage is much easier if the data is supposed to be available to the public (via RSS or whatever). If the main use for the data is some kind of public API, or if it is going to be formatted as XML later anyway, then why not store it as XML? Say you wanted to store some HTML templates. Wouldn't it be easier to store them as HTML? You'd also save the overhead of an RDBMS and of converting the data into XML.

XML is also okay if you do almost all reading and very little writing, although a simpler format such as JSON might be more efficient depending on what you're doing.

In any other case, especially if there is a lot of data manipulation involved, real databases (whether relational, object-oriented, document-oriented, whatever) are going to be much more efficient because they're built for that. XML wasn't meant for 100,000,000 rows of data.

I think the reason some databases use XML is that it's such a widely used format, especially for things like RSS feeds. Like I said before, if your data needs to be XML in the end, then why not store it as XML and make your life easier?


As someone who works on an open source product that heavily uses XML databases, I find XML data sources invaluable as they are representations of the pure data structures used in the program.

XML allows me to model a complex structure in code and then serialize it directly to XML, to be read in either elsewhere or at a later date. More specifically, in my case, I export a complex structure as XML and then manipulate and query it in memory. There are other options, such as ODBMSs and ORMs, which offer many of the same advantages (and then some), but they come with a knowledge or performance overhead.
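As a rough illustration of that workflow (not the code from the product mentioned above), here is a small Python sketch that models a structure in code, serializes it to XML, reads it back, and queries it in memory. The `Novel` and `Chapter` classes and the element names are hypothetical; the optional `villain` element is simply left out when there isn't one.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Chapter:
    title: str
    paragraphs: List[str] = field(default_factory=list)


@dataclass
class Novel:
    title: str
    villain: Optional[str] = None  # optional: omitted from the XML when None
    chapters: List[Chapter] = field(default_factory=list)


def novel_to_xml(novel):
    """Serialize a Novel to an XML string that mirrors its structure."""
    root = ET.Element("novel", title=novel.title)
    if novel.villain is not None:
        ET.SubElement(root, "villain").text = novel.villain
    for chapter in novel.chapters:
        ch = ET.SubElement(root, "chapter", title=chapter.title)
        for text in chapter.paragraphs:
            ET.SubElement(ch, "p").text = text
    return ET.tostring(root, encoding="unicode")


def novel_from_xml(xml_text):
    """Rebuild a Novel from the XML produced by novel_to_xml."""
    root = ET.fromstring(xml_text)
    chapters = [
        Chapter(ch.get("title"), [p.text or "" for p in ch.findall("p")])
        for ch in root.findall("chapter")
    ]
    # findtext() returns None when there is no <villain> element.
    return Novel(root.get("title"), root.findtext("villain"), chapters)


novel = Novel(
    title="A Quiet Year",
    chapters=[
        Chapter("Spring", ["The snow melted.", "The river rose."]),
        Chapter("Summer", ["Nothing much happened."]),
    ],
)
xml_text = novel_to_xml(novel)
restored = novel_from_xml(xml_text)

# The serialized form can also be queried in memory without the classes.
paragraph_count = len(ET.fromstring(xml_text).findall(".//p"))
```

The same XML string can be written to a file and loaded later or by another program, which is the "elsewhere or at a later date" part.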