illegal self closing node notation for empty nodes - outputting XHTML with PHP DOMDocument


Sorry for the late reply, but you know... it was Christmas. :D

    function export_html(DOMDocument $dom)
    {
        // Void elements: the only ones that may be serialized as <tag/>.
        $voids = ['area', 'base', 'br', 'col', 'command', 'embed', 'hr',
                  'img', 'input', 'keygen', 'link', 'meta', 'param',
                  'source', 'track', 'wbr'];

        // Every empty node. There is no reason to match nodes with content inside.
        $query = '//*[not(node())]';
        $nodes = (new DOMXPath($dom))->query($query);

        foreach ($nodes as $n) {
            if (! in_array($n->nodeName, $voids, true)) {
                // If it is not a void/empty tag,
                // we need to leave the tag open.
                $n->appendChild(new DOMComment('NOT_VOID'));
            }
        }

        // Let's remove the placeholder.
        return str_replace('<!--NOT_VOID-->', '', $dom->saveXML());
    }

In your example

    $dom = new DOMDocument();
    $dom->loadXML(<<<XML
    <html>
            <textarea id="something"></textarea>
            <div id="someDiv" class="whaever"></div>
    </html>
    XML
    );

echo export_html($dom); will produce

    <?xml version="1.0"?>
    <html>
        <textarea id="something"></textarea>
        <div id="someDiv" class="whaever"></div>
    </html>
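If you want to see why the placeholder trick works, here is a small sketch. The input markup is made up for illustration, and I use `$dom->createComment()` rather than `new DOMComment()` only to keep the snippet self-contained. It shows which elements `//*[not(node())]` selects, and how an element that holds a comment is serialized with a separate end tag.

```php
<?php
// Illustrative input only: one void element, one empty non-void element,
// one element with text and one element that already contains a comment.
$dom = new DOMDocument();
$dom->loadXML('<html><br/><div></div><p>text</p><span><!-- c --></span></html>');

// Collect the names of all elements that have no child nodes at all.
$empty = [];
foreach ((new DOMXPath($dom))->query('//*[not(node())]') as $n) {
    $empty[] = $n->nodeName;
}
echo implode(', ', $empty), "\n"; // br, div

// Once an element has a child (here a comment), it can no longer be
// written in the short <div/> form.
$div = $dom->getElementsByTagName('div')->item(0);
$div->appendChild($dom->createComment('NOT_VOID'));
$serialized = $dom->saveXML($div);
echo $serialized, "\n"; // <div><!--NOT_VOID--></div>
```

Note that `<span>` is not selected, because a comment counts as a child node for `node()`.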

Merry Christmas! ^_^


In case you did not know that HTML5 can be written and served as XML, consider this quote: "It seems not very clear for many people. So let’s set the record straight. HTML 5 can be written in html and XML."

Next, to actually serve any PHP example as XML, set the appropriate header:

header("content-type: application/xhtml+xml; charset=UTF-8");

In actual XML documents you cannot write an empty element without the closing slash: <br> has to be <br/>, and so on. With that prelude let's go on...
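To see that rule in action, here is a minimal sketch (the sample markup is invented for illustration). It feeds the XML parser the same fragment twice, once with an HTML-style `<br>` and once with `<br/>`.

```php
<?php
// Collect parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// HTML-style void tag: the parser treats <br> as an element that is never closed.
$htmlStyle = new DOMDocument();
$okHtml = $htmlStyle->loadXML('<p>line one<br>line two</p>');

// XML-style self-closing tag: well-formed.
$xmlStyle = new DOMDocument();
$okXml = $xmlStyle->loadXML('<p>line one<br/>line two</p>');

libxml_clear_errors();

var_dump($okHtml); // bool(false)
var_dump($okXml);  // bool(true)
```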

We found that using the LIBXML_NOEMPTYTAG option in

    $xml = new DOMDocument();
    $xml->loadXML(utf8_encode($temp));
    // do stuff with the DOM
    $temp = utf8_decode($xml->saveXML(NULL, LIBXML_NOEMPTYTAG));

does not "solve" the problem but reverses it: every empty element is now written with an explicit end tag, including the ones that are supposed to self-close (you get <br></br>). The HTML5 spec names a number of "void elements". They are: area, base, br, col, embed, hr, img, input, keygen, link, meta, param, source, track, wbr. To quote the spec on them: "Void elements can't have any contents (since there's no end tag, no content can be put between the start tag and the end tag)."
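A quick way to see the reversal for yourself is the following sketch. The sample document is an assumption for illustration; it serializes the same DOM with and without the option.

```php
<?php
// Illustrative document: one void element and one empty non-void element.
$dom = new DOMDocument();
$dom->loadXML('<div><br/><textarea></textarea></div>');

// Default serialization: every empty element collapses to <tag/>.
$default = $dom->saveXML();

// With LIBXML_NOEMPTYTAG: every empty element gets a separate end tag.
$noEmpty = $dom->saveXML(null, LIBXML_NOEMPTYTAG);

echo $default; // contains <div><br/><textarea/></div>
echo $noEmpty; // contains <div><br></br><textarea></textarea></div>
```

Neither output is what XHTML wants: the first breaks `<textarea>`, the second breaks `<br>`.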

Because void elements are defined to have no content, a simple RegExp can collapse their empty start/end tag pairs back into the self-closing form (for lack of an actual solution):

$temp = preg_replace('#></(area|base|br|col|embed|hr|img|input|keygen|link|meta|param|source|track|wbr)>#si', '/>', $temp);
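Put together, the two steps might look like the sketch below. The function name `save_xhtml` and the sample markup are hypothetical; the element list is the one quoted from the spec above.

```php
<?php
// Hypothetical helper: serialize with explicit end tags everywhere,
// then collapse the void elements back to the self-closing form.
function save_xhtml(DOMDocument $dom): string
{
    $voids = 'area|base|br|col|embed|hr|img|input|keygen|link|meta|param|source|track|wbr';
    $xml = $dom->saveXML(null, LIBXML_NOEMPTYTAG);

    // "></br>" becomes "/>", so <br></br> turns into <br/>.
    return preg_replace('#></(' . $voids . ')>#i', '/>', $xml);
}

$dom = new DOMDocument();
$dom->loadXML('<div><br/><img src="a.png"/><textarea></textarea></div>');

$out = save_xhtml($dom);
echo $out; // contains <div><br/><img src="a.png"/><textarea></textarea></div>
```

This relies on void elements really being empty. If a void element somehow has child elements, the pattern would match in the wrong place, so treat it as a workaround and not as a parser.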

After which we can go on with the other stupid fixes I had in the question:

    $temp = str_replace(' xmlns:default="http://www.w3.org/1999/xhtml"', '', $temp);
    $temp = str_replace('<default:', "<", $temp);
    $temp = str_replace('</default:', "</", $temp);
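For reference, this is what those three replacements do to a small made-up fragment (the input string is an assumption, shaped like the prefixed output described in the question):

```php
<?php
// Made-up sample of serialized output carrying the unwanted "default:" prefix.
$temp = '<default:div xmlns:default="http://www.w3.org/1999/xhtml"><default:br/></default:div>';

// Drop the namespace declaration, then strip the prefix from start and end tags.
$temp = str_replace(' xmlns:default="http://www.w3.org/1999/xhtml"', '', $temp);
$temp = str_replace('<default:', '<', $temp);
$temp = str_replace('</default:', '</', $temp);

echo $temp; // <div><br/></div>
```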