How do you make strings "XML safe"? How do you make strings "XML safe"? php php

How do you make strings "XML safe"?


Since PHP 5.4 you can use:

htmlspecialchars($string, ENT_XML1);

You should specify the encoding, such as:

htmlspecialchars($string, ENT_XML1, 'UTF-8');

Update

Note that the above will only convert:

  • & to &
  • < to <
  • > to >

If you want to escape text for use in an attribute enclosed in double quotes:

htmlspecialchars($string, ENT_XML1 | ENT_COMPAT, 'UTF-8');

will convert " to " in addition to &, < and >.


And if your attributes are enclosed in single quotes:

htmlspecialchars($string, ENT_XML1 | ENT_QUOTES, 'UTF-8');

will convert ' to &apos; in addition to &, <, > and ".

(Of course you can use this even outside of attributes).


See the manual entry for htmlspecialchars.


By either escaping those characters with htmlspecialchars, or, perhaps more appropriately, using a library for building XML documents, such as DOMDocument or XMLWriter.

Another alternative would be to use CDATA sections, but then you'd have to look out for occurrences of ]]>.

Take also into consideration that that you must respect the encoding you define for the XML document (by default UTF-8).


1) You can wrap your text as CDATA like this:

<mytag>    <![CDATA[Your text goes here. Btw: 5<6 and 6>5]]></mytag>

see http://www.w3schools.com/xml/xml_cdata.asp

2) As already someone said: Escape those chars. E.g. like so:

5<6 and 6>5