I sometimes see that people wrap json_encode() in htmlspecialchars(). Why?
Tags: php

I sometimes see that people wrap json_encode() in htmlspecialchars(). Why?


I don't know PHP, so I'll assume htmlspecialchars escapes any HTML special characters.

Given that assumption, the use case is planting JSON data directly inside HTML content, à la

echo "<script>"; echo json_encode($result); echo "</script>";

JSON encoding only escapes content so that it cannot break out of a JavaScript string literal; it does nothing about the HTML parser. This scenario would therefore allow someone to insert JSON data that the HTML parser interprets as ending the script tag.

Something like

{"foo": "</script><script>doSomethingEvil()</script>"}

would then reach the browser as

<script>{"foo": "</script><script>doSomethingEvil()</script>"}</script>

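This clearly results in doSomethingEvil() being executed, because the HTML parser ends the script element at the first </script> it sees, regardless of JavaScript string quoting. By escaping any HTML tokens you instead end up sending something like

<script>{"foo": "&lt;/script&gt;&lt;script&gt;doSomethingEvil()&lt;/script&gt;"}</script>

which avoids the XSS vulnerability.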
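For what it's worth, the effect can be checked in PHP with a minimal sketch like the one below. The variable names are made up for illustration, and passing ENT_NOQUOTES is an assumption about how the wrapper is typically called. Note that PHP's json_encode() escapes "/" as "\/" by default, so the real output differs slightly from the hand-written example above.

```php
<?php
// Hypothetical payload standing in for untrusted data.
$result = ['foo' => '</script><script>doSomethingEvil()</script>'];

// JSON encoding alone leaves "<" and ">" untouched.
$json = json_encode($result);

// Additionally escape the HTML special characters (quotes left alone here).
$escaped = htmlspecialchars($json, ENT_NOQUOTES);

echo $json, "\n";
echo $escaped, "\n";
```

The second line printed contains no literal "<" or ">", so nothing in it can be read as a tag by an HTML parser.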

A far better solution to this problem is to simply not send JSON data directly in an HTML source. JSON encoding just makes the content safe to embed in JS, not HTML.


None of the characters which have special meaning in JSON have special meaning in HTML (except for the quote characters which ENT_NOQUOTES explicitly excludes).

Wrapping the JSON in htmlspecialchars() makes the data safe for dumping into HTML documents (so long as it isn't placed inside an attribute value) via innerHTML. Naturally, it makes the data unsafe for any purpose that doesn't involve building markup using strings.
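A small sketch of what that looks like in practice, assuming the ENT_NOQUOTES flag is used; the sample string and variable names are hypothetical. Because the JSON quotes are left alone, the escaped text is still valid JSON, and the string values inside it come out HTML-escaped:

```php
<?php
// Hypothetical user-supplied text containing HTML special characters.
$json = json_encode(['comment' => '<b>hi</b> & "bye"']);

// ENT_NOQUOTES escapes &, < and > but leaves the JSON quotes intact.
$safe = htmlspecialchars($json, ENT_NOQUOTES);

// The result still parses as JSON.
$decoded = json_decode($safe, true);

echo $decoded['comment'], "\n";
// prints: &lt;b&gt;hi&lt;/b&gt; &amp; "bye"
```

The double quotes around bye survive unescaped, which is why the value is fine as element content but not inside an attribute value.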

I wouldn't advise doing this in the general case. Escaping should be done at the last possible moment before inserting into the desired data format, both to maximise reusability and to make it clear where developers should expect to see escaping code.
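One way to picture "escape at the last possible moment" is the rough sketch below. The helper names toJsonResponse() and toHtmlParagraph() are invented for illustration and are not part of PHP; the point is only that the raw value stays unescaped until it is written into a specific format.

```php
<?php
// Hypothetical helper: encode for a JSON consumer, nothing else.
function toJsonResponse(array $data): string
{
    return json_encode($data);
}

// Hypothetical helper: escape for HTML only at the point of building markup.
function toHtmlParagraph(string $text): string
{
    return '<p>' . htmlspecialchars($text, ENT_QUOTES) . '</p>';
}

// The raw value is stored and passed around unescaped.
$comment = 'Tom & "Jerry" <3';

echo toJsonResponse(['comment' => $comment]), "\n";
echo toHtmlParagraph($comment), "\n";
```

The same $comment can then be reused for either output without anyone having to undo an earlier escaping step.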


olliej's answer is excellent, but there is one case they haven't covered.

Why would you use htmlspecialchars on JSON?

If you are embedding in a script tag in XHTML outside a CDATA section then you should escape XML special characters. htmlspecialchars should work fine for that.

This is a pretty marginal case, though, so take olliej's advice.
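To illustrate the XHTML case, here is a hedged sketch that uses PHP's SimpleXML extension as a stand-in for an XML parser (this assumes the extension is installed; the sample data is made up). Unlike an HTML parser, an XML parser decodes entities inside the script element, so the escaped JSON comes back out as the original text:

```php
<?php
// Collect parse errors quietly instead of emitting warnings.
libxml_use_internal_errors(true);

// Hypothetical data containing XML special characters.
$json = json_encode(['expr' => '1 < 2 && 3 > 2']);

// Unescaped, the markup is not well-formed XML and fails to parse.
$broken = simplexml_load_string('<script>var data = ' . $json . ';</script>');

// Escaped, it parses, and the text content is the original script again.
$xhtml = '<script>var data = ' . htmlspecialchars($json, ENT_NOQUOTES) . ';</script>';
$script = simplexml_load_string($xhtml);
$content = (string) $script;

var_dump($broken === false);
echo $content, "\n";
```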