
htmlentities in PHP but preserving html tags


You can get the list of character => entity mappings used by htmlentities with the function get_html_translation_table. Consider this code:

$list = get_html_translation_table(HTML_ENTITIES);
var_dump($list);

(You might want to check the second parameter of that function in the manual; you may need to set it to something other than the default.)

It will get you something like this:

array
  ' ' => string '&nbsp;' (length=6)
  '¡' => string '&iexcl;' (length=7)
  '¢' => string '&cent;' (length=6)
  '£' => string '&pound;' (length=7)
  '¤' => string '&curren;' (length=8)
  ....
  ....
  ....
  'ÿ' => string '&yuml;' (length=6)
  '"' => string '&quot;' (length=6)
  '<' => string '&lt;' (length=4)
  '>' => string '&gt;' (length=4)
  '&' => string '&amp;' (length=5)
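To make that second parameter concrete, here is a minimal sketch of how the flags decide which quote characters end up in the table. The variable names are only illustrative.

```php
<?php
// With ENT_NOQUOTES the table contains neither quote character,
// so there is nothing to unset for quotes later on.
$noQuotes = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES);

// With ENT_QUOTES both the double and the single quote are in the table.
$allQuotes = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES);

var_dump(isset($noQuotes['"']));   // bool(false)
var_dump(isset($allQuotes['"']));  // bool(true)
var_dump(isset($allQuotes["'"]));  // bool(true)
var_dump(isset($noQuotes['<']));   // bool(true): <, > and & are present either way
```

If single quotes must stay untouched as well, either request the table with ENT_NOQUOTES or unset that key along with the others. As far as I know the default flags changed in PHP 8.1 to include ENT_QUOTES, so passing the flags explicitly avoids surprises.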

Now remove the mappings you don't want:

unset($list['"']);
unset($list['<']);
unset($list['>']);
unset($list['&']);

Your list now has all the character => entity mappings used by htmlentities, except for the few characters you don't want to encode.

Now you just have to extract the lists of keys and values:

$search = array_keys($list);
$values = array_values($list);

And, finally, you can use str_replace to do the replacement:

$str_in = '<p><font style="color:#FF0000">Camión español</font></p>';
$str_out = str_replace($search, $values, $str_in);
var_dump($str_out);

And you get:

string '<p><font style="color:#FF0000">Cami&Atilde;&sup3;n espa&Atilde;&plusmn;ol</font></p>' (length=84)

Which looks like what you wanted ;-)


Edit: well, except for the encoding problem (damn UTF-8, I suppose; I'm trying to find a solution for that, and will edit again).
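To see where the stray entities come from, here is a small sketch that reproduces the mismatch on purpose. It assumes PHP 5.3.4 or later, where get_html_translation_table accepts an encoding as its third argument. The byte escapes stand for a UTF-8 encoded ó, so the example does not depend on how the file itself is saved.

```php
<?php
// Ask for the table with single-byte ISO-8859-1 keys (assumption: this mirrors
// the table that the code above received).
$latin1 = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES, 'ISO-8859-1');
unset($latin1['<'], $latin1['>'], $latin1['&']);

// "Camión" in UTF-8: the ó is the two bytes 0xC3 0xB3.
$utf8 = "Cami\xC3\xB3n";

// Each byte is matched on its own: 0xC3 is Ã and 0xB3 is ³ in ISO-8859-1.
$mangled = str_replace(array_keys($latin1), array_values($latin1), $utf8);
var_dump($mangled); // string(19) "Cami&Atilde;&sup3;n"
```

So the replacement itself works; the keys and the input string simply disagree about the encoding.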

Second edit, a couple of minutes later: it seems you'll have to use utf8_encode on the $search list before calling str_replace :-( The table's keys are apparently ISO-8859-1 here, while the input string is UTF-8.

Which means using something like this:

$search = array_map('utf8_encode', $search);

This goes between the call to array_keys and the call to str_replace.

And, this time, you should really get what you wanted:

string '<p><font style="color:#FF0000">Cami&oacute;n espa&ntilde;ol</font></p>' (length=70)


And here is the full portion of code:

$list = get_html_translation_table(HTML_ENTITIES);
unset($list['"']);
unset($list['<']);
unset($list['>']);
unset($list['&']);
$search = array_keys($list);
$values = array_values($list);
$search = array_map('utf8_encode', $search);
$str_in = '<p><font style="color:#FF0000">Camión español</font></p>';
$str_out = str_replace($search, $values, $str_in);
var_dump($str_in, $str_out);

And the full output:

string '<p><font style="color:#FF0000">Camión español</font></p>' (length=58)
string '<p><font style="color:#FF0000">Cami&oacute;n espa&ntilde;ol</font></p>' (length=70)

This time, it should be OK ^^
It doesn't really fit in one line, and it might not be the most optimized solution, but it should work fine, and it has the advantage of letting you add or remove any character => entity mapping as needed.
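On more recent PHP versions the utf8_encode step can probably be skipped. A sketch, assuming PHP 5.3.4 or later, is to request the table in UTF-8 directly through the third argument. This also sidesteps utf8_encode, which is deprecated as of PHP 8.2. The byte escapes below stand for UTF-8 encoded ó and ñ.

```php
<?php
// Request UTF-8 keys up front; ENT_NOQUOTES leaves both quote characters out of the table.
$list = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES, 'UTF-8');

// Drop the characters that make up the markup itself.
unset($list['<'], $list['>'], $list['&']);

// Same sample as above, written with byte escapes for ó (0xC3 0xB3) and ñ (0xC3 0xB1).
$str_in = "<p><font style=\"color:#FF0000\">Cami\xC3\xB3n espa\xC3\xB1ol</font></p>";
$str_out = str_replace(array_keys($list), array_values($list), $str_in);

var_dump($str_out);
// string(70) "<p><font style="color:#FF0000">Cami&oacute;n espa&ntilde;ol</font></p>"
```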

Have fun!


Might not be terribly efficient, but it works:

$sample = '<p><font style="color:#FF0000">Camión español</font></p>';
echo htmlspecialchars_decode(
    htmlentities($sample, ENT_NOQUOTES, 'UTF-8', false),
    ENT_NOQUOTES
);
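The two calls work as a round trip: htmlentities encodes everything, including the markup, and htmlspecialchars_decode then restores only &lt;, &gt; and &amp;. Here is a sketch with the intermediate value made visible. The variable names and the sample string are illustrative, and the byte escapes stand for UTF-8 encoded ó and ñ.

```php
<?php
$sample = "<p>Cami\xC3\xB3n & espa\xC3\xB1ol</p>";

// Step 1: encode everything except quotes; existing entities are not double-encoded.
$encoded = htmlentities($sample, ENT_NOQUOTES, 'UTF-8', false);
echo $encoded, "\n";  // &lt;p&gt;Cami&oacute;n &amp; espa&ntilde;ol&lt;/p&gt;

// Step 2: decode only the special characters, which brings the tags back.
$restored = htmlspecialchars_decode($encoded, ENT_NOQUOTES);
echo $restored, "\n"; // <p>Cami&oacute;n & espa&ntilde;ol</p>

// An &amp; that was already in the input is decoded too.
$existing = htmlspecialchars_decode(
    htmlentities('a &amp; b', ENT_NOQUOTES, 'UTF-8', false),
    ENT_NOQUOTES
);
echo $existing, "\n"; // a & b
```

One side effect to be aware of: an entity such as &amp; that is already present in the input comes out as a bare &, because the decode step cannot tell it apart from one produced by the first step.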


This is an optimized version of the accepted answer.

$list = get_html_translation_table(HTML_ENTITIES);
unset($list['"']);
unset($list['<']);
unset($list['>']);
unset($list['&']);
$string = strtr($string, $list);
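
Beyond being shorter, strtr also behaves differently from str_replace. When given arrays, str_replace applies the pairs one after another, so a later pair can rewrite the output of an earlier one, whereas strtr with an array never touches text it has already replaced. A small sketch with a hand-written two-entry map (hypothetical, not the real table) shows the difference:

```php
<?php
// Hypothetical mini-table in which '&' is still present.
$map = array(
    "\xC3\xA9" => '&eacute;', // é in UTF-8
    '&'        => '&amp;',
);
$text = "caf\xC3\xA9 & bar";

// str_replace: the first pair turns é into &eacute;, then the second pair
// re-encodes the '&' that was just produced.
$sequential = str_replace(array_keys($map), array_values($map), $text);
echo $sequential, "\n"; // caf&amp;eacute; &amp; bar

// strtr: replaced text is not scanned again.
$singlePass = strtr($text, $map);
echo $singlePass, "\n"; // caf&eacute; &amp; bar
```

With '&' unset from the table, as in the code above, both functions give the same result; the difference only matters once an entry's key can appear inside another entry's replacement.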