Storing formatted text in a DB while maintaining abstraction Storing formatted text in a DB while maintaining abstraction database database

Storing formatted text in a DB while maintaining abstraction


I would store the structure of the document using XML, and always apply some XSLT transformation before showing it in the web browser. That way the information can be adapted to different browsers, or other usages like displaying in a normal UI or exporting to some plain text document.

The structure would have to be something meaningful, not only formatting information. Ideally it would be a representation of some domain-specific data model.

Of course nothing stops you, if the meaningful information is the document structure, to define something like:

<document>  <title>SomeTitle</title>  <paragraph>Some Long paragraph text</paragraph></document>

Another advantage of using XML in this context is that if your database supports it (like Oracle), you could query the contents of the text.

We are assuming that the text is something that doesn't need to be queried often, or that the content is really for display purposes only. Otherwise it might be better normalize the database.


There are two ideas that clash slightly in your question - that of keeping the data separate to the content so it can be re purposed, and that of including formatting data.

Is the formatting data part of the data, or just meta data?

Haven't we seen this before; it basically seems to be a CSS / HTML conundrum.

If these blocks of text fit into a known data scheme (as Mario's answer assumes) then yeah, I'd go with his answer, but re-reading your questions I'll answer (and assume) you have bits of formatting within, say, the paragraph tag that Mario used?

Going with the idea the formatting is basically part of the data, not just an added extra, I'd suggest adopting something like the CSS / HTML solution. Store the text with standard XHTML tags, ready for your CSS. This could then be parsed when you want to use a standard UI (as in non web app?), and just strip the tags and replace as necessary.

Of course, you could make up your own markup ([myBitOfText #] instead of < span class="myBitOfText />) but you may as well have one return from your database that requires no re-purposing or string manipulation.