DomDocument and special characters
Solution:
$oDom = new DOMDocument();
$oDom->encoding = 'utf-8';
$oDom->loadHTML( utf8_decode( $sString ) ); // important!
$sHtml = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">';
$sHtml .= $oDom->saveHTML( $oDom->documentElement ); // important!
The saveHTML() method behaves differently when you pass it a node. You can pass the root node ($oDom->documentElement) and prepend the desired !DOCTYPE manually.
The other important part is utf8_decode(). In my case, none of the other attributes or methods of the DOMDocument class produced the desired result.
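Note that utf8_decode() is deprecated as of PHP 8.2, so here is a possible alternative, offered only as a sketch and not as what the answer above does: declare the charset in the markup itself before handing it to loadHTML(). The helper name loadUtf8Html() and the sample string are made up for illustration, and the example assumes the input really is valid UTF-8.

```php
<?php
// Hypothetical helper (not from the answers in this thread): parse a UTF-8
// HTML fragment by declaring the charset in the markup itself.
function loadUtf8Html(string $html): DOMDocument
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    // The meta tag tells the HTML parser to read the input as UTF-8.
    $dom->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html
    );

    // Discard collected warnings and restore the previous error handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom;
}

// Sample input (assumed to be UTF-8 encoded).
$dom  = loadUtf8Html('<p>Café ünïcode</p>');
$text = $dom->getElementsByTagName('p')->item(0)->textContent;
echo $text, PHP_EOL;
```

One side effect to keep in mind: the injected meta element stays in the document's head, so remove it before saving if you do not want it in the output.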
Try to set the encoding type after you have loaded the HTML.
$dom = new DOMDocument();
$dom->loadHTML($data);
$dom->encoding = 'utf-8';
echo $dom->saveHTML();
$dom = new DomDocument();
$str = htmlentities($str);
$dom->loadHTML(utf8_decode($str));
$dom->encoding = 'utf-8';
...
$str = $dom->saveHTML();
$str = html_entity_decode($str);
The above code worked for me.