How to extract HTML comments and all HTML contained by a node?


Comment nodes should be easy to find in XPath with the comment() test, analogous to the text() test:

$comments = $xpath->query('//comment()'); // or another path, as you prefer

They are standard DOM nodes: see the PHP manual entry for the DOMComment class.
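As a rough end-to-end sketch of the XPath approach: the markup in $html below is made up purely for illustration, and the libxml error suppression is an assumption about how you might want to handle imperfect real-world HTML.

```php
<?php
// Hypothetical sample markup, used only for illustration.
$html = '<div id="main"><!-- first --><p>Hello<!-- second --></p></div>';

$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

$comments = array();
foreach ($xpath->query('//comment()') as $comment) {
    // Each item is a DOMComment; ->data is the text between <!-- and -->.
    $comments[] = trim($comment->data);
}

print_r($comments);
```

The comments come back in document order, and you can narrow the path (for example '//div[@id="main"]//comment()') if you only want the comments under one element.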


Your other question is a bit trickier. The simplest way is to use saveXML() with its optional $node argument (saveHTML() accepts the same argument as of PHP 5.3.6):

// $el should be the element you want to get the HTML for
$html = $dom->saveXML($el);
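Note that saveXML($el) includes the element's own tags. If you only want what the node contains, one possible approach is to serialize each child in turn. The helper name innerXml and the sample markup below are hypothetical, so treat this as a sketch rather than the one right way:

```php
<?php
// Hypothetical sample markup, used only for illustration.
$html = '<div id="box"><!-- note --><p>Hi <b>there</b></p></div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$el = $xpath->query('//div[@id="box"]')->item(0);

// Hypothetical helper: concatenate the serialization of every child node,
// which covers elements, text nodes, and comments alike.
function innerXml(DOMNode $node) {
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveXML($child);
    }
    return $out;
}

$outer = $dom->saveXML($el); // includes the <div> tags themselves
$inner = innerXml($el);      // only the div's contents, comments included

echo $outer, "\n", $inner, "\n";
```

Because comment nodes are ordinary children, they are preserved in the serialized output without any special handling.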


For the HTML comments, a quick regex-based method is:

function getComments($html) {
    $matches = array();
    // PREG_SET_ORDER groups the results per match, so $m[1] is the comment body
    if (preg_match_all('#<!--(.*?)-->#s', $html, $matches, PREG_SET_ORDER)) {
        $comments = array();
        foreach ($matches as $m) {
            $comments[] = $m[1];
        }
        return $comments;
    }
    // No comments matched
    return null;
}


This regex may help you:

\s*<!--[\s\S]+?-->

It matches each comment together with any whitespace immediately before it.