Best way to process large XML in PHP [duplicate]


For a large file, you'll want to use a SAX parser rather than a DOM parser.

A DOM parser reads the whole file and loads it into an object tree in memory. A SAX parser reads the file sequentially and calls your user-defined callback functions to handle the data (start tags, end tags, CDATA, etc.).

With a SAX parser you need to maintain state yourself (e.g. which tag you are currently in), which makes it a bit more complicated, but for a large file it is much more memory-efficient.
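
To make the callback idea concrete, here is a minimal sketch using PHP's built-in event-based parser (`xml_parser_create()` and friends). The function name `collectElementText` and the element you ask for are made up for illustration; the point is only to show the start/end/character-data handlers and the hand-kept "which tag am I in" state while the file is read in small chunks.

```php
<?php
// Sketch only: collect the text content of every <$wanted> element,
// feeding the file to the parser in 8 KB chunks.
// collectElementText() is a hypothetical helper, not part of any library.
function collectElementText(string $path, string $wanted): array
{
    $results = [];
    $inside  = false; // state we track ourselves: are we inside <$wanted>?
    $buffer  = '';

    $parser = xml_parser_create();
    // By default tag names are upper-cased; turn that off.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Start-tag callback
        function ($p, $name, $attrs) use (&$inside, &$buffer, $wanted) {
            if ($name === $wanted) {
                $inside = true;
                $buffer = '';
            }
        },
        // End-tag callback
        function ($p, $name) use (&$inside, &$buffer, &$results, $wanted) {
            if ($inside && $name === $wanted) {
                $results[] = $buffer;
                $inside = false;
            }
        }
    );

    // Character data may arrive in several pieces, so append rather than assign.
    xml_set_character_data_handler($parser, function ($p, $data) use (&$inside, &$buffer) {
        if ($inside) {
            $buffer .= $data;
        }
    });

    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    while (!feof($fh)) {
        $chunk = (string) fread($fh, 8192);
        // The third argument tells the parser whether this is the final chunk.
        if (!xml_parse($parser, $chunk, feof($fh))) {
            $error = xml_error_string(xml_get_error_code($parser));
            $line  = xml_get_current_line_number($parser);
            fclose($fh);
            throw new RuntimeException("XML error on line $line: $error");
        }
    }
    fclose($fh);

    return $results;
}
```

Only one chunk and the text of the element currently being collected are held in memory, so memory use stays roughly constant regardless of file size. In real code you would usually process each value inside the end-tag callback instead of accumulating them all in an array.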


My take on it:

https://github.com/prewk/XmlStreamer

A simple class that extracts all children of the XML root element while streaming the file. Tested on a 108 MB XML file from pubmed.com.

    class SimpleXmlStreamer extends XmlStreamer {
        public function processNode($xmlString, $elementName, $nodeIndex) {
            $xml = simplexml_load_string($xmlString);
            // Do something with your SimpleXML object
            return true;
        }
    }

    $streamer = new SimpleXmlStreamer("myLargeXmlFile.xml");
    $streamer->parse();
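
If you would rather not pull in a library, the same "one root child at a time" idea can be sketched with PHP's built-in `XMLReader`. To be clear, this is not how XmlStreamer is implemented; it is a different technique (a pull parser) that gives a similar result. The function name `streamRootChildren` is made up for this example.

```php
<?php
// Sketch only: yield each direct child of the root element as a
// SimpleXMLElement, without loading the whole document.
// streamRootChildren() is a hypothetical helper name.
function streamRootChildren(string $path): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    // Advance to the root element (depth 0).
    while ($reader->read() && $reader->nodeType !== XMLReader::ELEMENT) {
        // skip the XML declaration, comments, etc.
    }

    // Step to the first node below the root.
    $more = $reader->read();
    while ($more) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->depth === 1) {
            // Only this one child subtree is turned into an object.
            yield simplexml_load_string($reader->readOuterXml());
            // next() skips the subtree we just handled.
            $more = $reader->next();
        } else {
            $more = $reader->read();
        }
    }

    $reader->close();
}
```

Usage would look something like `foreach (streamRootChildren("myLargeXmlFile.xml") as $xml) { ... }`, where `$xml` plays the same role as the SimpleXML object inside `processNode()` above.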


When using a DOMDocument with large XML files, don't forget to pass the LIBXML_PARSEHUGE flag in the options of the load() method. (The same applies to the other load methods of the DOMDocument object.)

    $checkDom = new \DOMDocument('1.0', 'UTF-8');
    $checkDom->load($filePath, LIBXML_PARSEHUGE);

(Works with a 120 MB XML file)
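
As an illustration of "the other load methods", here is a small sketch that passes the same flag to `loadXML()` and checks the result, since the load methods return `false` on failure rather than throwing. The wrapper name `loadHugeXml` is made up for this example, and throwing an exception is just one possible way to report the error.

```php
<?php
// Sketch only: parse an XML string with LIBXML_PARSEHUGE and turn
// a failed parse into an exception. loadHugeXml() is a hypothetical helper.
function loadHugeXml(string $xml): DOMDocument
{
    // Collect libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument('1.0', 'UTF-8');
    $ok  = $dom->loadXML($xml, LIBXML_PARSEHUGE);

    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        $message = $errors ? trim($errors[0]->message) : 'unknown error';
        throw new RuntimeException("XML could not be parsed: $message");
    }

    return $dom;
}
```

Keep in mind that the flag only lifts the parser's built-in size limits; the whole tree still has to fit in memory.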