What does LIBXML_NOENT do (and why isn't it called LIBXML_ENT)?


Q: What exactly does the LIBXML_NOENT flag do?

The flag enables the substitution of XML entity references with their replacement text, whether the entities are internal or external.
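As a minimal sketch of that substitution, the snippet below parses a small document with an internal entity and passes LIBXML_NOENT to loadXML. The document and the entity name c are only illustrative; the element is serialized on its own so the DOCTYPE does not clutter the output.

```php
<?php
// Illustrative document: an internal entity "c" referenced inside <test>.
$xml = '<!DOCTYPE test [<!ENTITY c "TEST">]><test>&c;</test>';

$dom = new DOMDocument();
// LIBXML_NOENT asks the parser to substitute entity references while parsing.
$dom->loadXML($xml, LIBXML_NOENT);

// Serialize only the root element, without the XML declaration and DOCTYPE.
echo $dom->saveXML($dom->documentElement), "\n"; // <test>TEST</test>
```

The reference &c; is gone from the tree; the element holds the replacement text instead.');
</test>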

Q: Why is it called LIBXML_NOENT? What is it short for, and wouldn't LIBXML_ENT or LIBXML_PARSE_EXTERNAL_ENTITIES be a better fit?

The name is indeed misleading. The constant mirrors libxml2's XML_PARSE_NOENT parser option. I think that NOENT simply means that the node tree of the parsed document won't contain any entity reference nodes, because the parser substitutes the entities. Without NOENT, the parser creates DOMEntityReference nodes for entity references.
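One way to check this reading of the name is to look at the class of the node the parser creates for the reference. The sketch below does that with and without the flag; firstChildClass is a hypothetical helper written for this illustration, not part of the DOM extension.

```php
<?php
// Illustrative document: <test> contains nothing but the reference &c;.
$xml = '<!DOCTYPE test [<!ENTITY c "TEST">]><test>&c;</test>';

// Hypothetical helper: parse $xml with the given libxml options and
// return the class name of the first child of the root element.
function firstChildClass(string $xml, int $options): string
{
    $dom = new DOMDocument();
    $dom->loadXML($xml, $options);
    return get_class($dom->documentElement->firstChild);
}

echo firstChildClass($xml, 0), "\n";            // DOMEntityReference
echo firstChildClass($xml, LIBXML_NOENT), "\n"; // DOMText
```

With the flag set there is "no entity" node left in the tree, only the text it stood for.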

Q: Is there a flag that actually prevents the parsing of all entities?

LIBXML_NOENT enables the substitution of all entity references. If you don't want entities to be expanded, simply omit the flag. For example:

    $xml = '<!DOCTYPE test [<!ENTITY c "TEST">]><test>&c;</test>';
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    echo $dom->saveXML();

prints

    <?xml version="1.0"?>
    <!DOCTYPE test [
    <!ENTITY c "TEST">
    ]>
    <test>&c;</test>

It seems that textContent replaces entities on its own, which might be a peculiarity of the PHP bindings. Without LIBXML_NOENT, this leads to different behavior for internal and external entities, because external entities won't be loaded.
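The textContent behavior can be seen with the same internal entity. This is a small sketch that parses without any flags and compares the serialized element with its textContent; it only covers the internal case, and the variable names are illustrative.

```php
<?php
// Illustrative document with an internal entity, parsed WITHOUT LIBXML_NOENT.
$xml = '<!DOCTYPE test [<!ENTITY c "TEST">]><test>&c;</test>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$root = $dom->documentElement;

// The tree still holds the reference, so serialization keeps "&c;".
$serialized = $dom->saveXML($root);
// textContent resolves the internal entity to its replacement text.
$text = $root->textContent;

echo $serialized, "\n"; // <test>&c;</test>
echo $text, "\n";       // TEST
```');
assert($text === 'TEST');
</test>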