What is the reason that CDATA even exists?


CDATA sections exist for the convenience of human authors, not for programs. Their only use is to let humans easily include, for example, SVG sample code in an XHTML page without having to carefully replace every < with &lt; and so on.

That, to me, is the intended use. It is not meant to make the resulting document a few bytes smaller because you can write < instead of &lt;.

Taking the same example again (SVG code in XHTML), it also makes it easy to open the source of the XHTML file and copy-paste the SVG code out without having to replace every &lt; back with <.
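As a rough illustration of that point, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The two fragments and the SVG snippet are made-up sample data; the point is only that a parser hands the application the same text either way, so the CDATA form is purely an authoring convenience.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments: the same SVG sample text, written two ways.
escaped = '<pre>&lt;svg width="10"&gt;&lt;circle r="5"/&gt;&lt;/svg&gt;</pre>'
with_cdata = '<pre><![CDATA[<svg width="10"><circle r="5"/></svg>]]></pre>'

escaped_text = ET.fromstring(escaped).text
cdata_text = ET.fromstring(with_cdata).text

# Both forms yield identical character data for the application.
print(escaped_text == cdata_text)

# The markup inside the CDATA section did not become child elements.
print(len(ET.fromstring(with_cdata)))
```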


PCDATA - parsed character data, which means the parser examines the text and recognizes any markup and entity references in it.

CDATA - the text between the CDATA delimiters is not parsed as markup. The parser does not discard it; it passes the text through to the application unchanged, as literal character data. As a result, a malicious user can still send harmful data to the application inside a CDATA section, so that content needs the same validation as any other input.

CDATA section starts with <![CDATA[ and ends with ]]>.

The only string that cannot occur in CDATA is ]]>.
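Because of that restriction, text that happens to contain ]]> has to be split across two adjacent CDATA sections. The helper below is a minimal sketch of that common workaround; `wrap_cdata` is a hypothetical name, not part of any library.

```python
import xml.etree.ElementTree as ET

def wrap_cdata(text):
    """Wrap text in CDATA, splitting any ']]>' across two sections."""
    # ']]>' becomes ']]' + end of section + new section + '>'.
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"

tricky = "if (a[b[0]]>c) { }"
doc = "<code>" + wrap_cdata(tricky) + "</code>"

# The parser joins the adjacent sections back into the original text.
round_tripped = ET.fromstring(doc).text
print(round_tripped == tricky)
```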

The only reason to use CDATA is that text such as JavaScript code contains a lot of < and & characters. To avoid errors, script code can be placed in a CDATA section, because a bare < will generate an error: the parser interprets it as the start of a new element. Similarly, the parser can interpret & as the start of an entity reference.
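A small Python sketch of that behaviour, using `xml.etree.ElementTree` from the standard library. The script text and the `parses` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical script body containing both < and & characters.
script = "if (a < b && c > 0) { run(); }"

bare = "<script>" + script + "</script>"
wrapped = "<script><![CDATA[" + script + "]]></script>"

def parses(doc):
    """Return True if the document is well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

print(parses(bare))     # the bare < and & are rejected by the parser
print(parses(wrapped))  # the CDATA section is accepted as literal text
```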


I believe that CDATA was intended to allow raw binary data: as long as it doesn't contain "]]>", anything goes in a CDATA section. This sets it apart from normal XML and should speed up parsing (and remove the need for a full text encoding, giving a second performance boost). In practice it proved quite problematic, with people not escaping the closing sequence and several early parsers being broken in various ways, so most people now just use a text encoding for binary data, which makes the CDATA section somewhat pointless, yes.

EDIT: note that this answer is in fact wrong, as Tomalak identifies in comments. I've not deleted it because I know there are other people out there who think that raw binary is acceptable in CDATA and this might clear up that little misunderstanding.
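To make the correction concrete, here is a minimal Python sketch. The sample bytes and the `blob` element name are arbitrary assumptions. A CDATA section is still subject to the XML rules on legal characters and encoding, so raw bytes are rejected, while a text encoding such as Base64 round-trips cleanly.

```python
import base64
import xml.etree.ElementTree as ET

payload = bytes([0x01, 0x02, 0xFF])  # arbitrary sample bytes

# Raw bytes inside CDATA: control characters and invalid UTF-8
# are still illegal in XML, so the parser refuses the document.
raw_doc = b"<blob><![CDATA[" + payload + b"]]></blob>"
try:
    ET.fromstring(raw_doc)
    raw_ok = True
except ET.ParseError:
    raw_ok = False

# Base64 text in an ordinary element works without any CDATA.
elem = ET.Element("blob")
elem.text = base64.b64encode(payload).decode("ascii")
encoded_doc = ET.tostring(elem)
decoded = base64.b64decode(ET.fromstring(encoded_doc).text)

print(raw_ok, decoded == payload)
```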