Problem json_encode utf-8 [duplicate]


json_encode() is not actually outputting JSON* there. It’s outputting a JavaScript string literal. (It outputs JSON when you give it an object or an array to encode.) That’s fine, as a JavaScript string is what you want.

In JavaScript (and in JSON), č may be escaped as \u010d. The two are equivalent, so there’s nothing wrong with what json_encode() is doing. It should work fine, and I’d be very surprised if this is actually causing you any problem. However, if the transfer is safely in a Unicode encoding (UTF-8, usually)†, there’s no need for the escaping either. If you want to turn it off, you can do so thus: json_encode('Svrček', JSON_UNESCAPED_UNICODE). Note that the flag JSON_UNESCAPED_UNICODE was introduced in PHP 5.4.0 and is unavailable in earlier versions.
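As a minimal sketch of the difference, assuming PHP 5.4.0 or later and a source file saved as UTF-8 (the variable names are only illustrative):

```php
<?php
// Assumption: this file is saved as UTF-8, so $name holds UTF-8 bytes.
$name = 'Svrček';

// Default behaviour: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode($name);

// With the flag (PHP 5.4.0+), the character is emitted as-is.
$unescaped = json_encode($name, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n";   // "Svr\u010dek"
echo $unescaped, "\n"; // "Svrček"

// Both forms decode back to the same string.
var_dump(json_decode($escaped) === json_decode($unescaped)); // bool(true)
```

Either output is a valid string literal; the unescaped form is only safe if everything between the server and the consumer treats the bytes as UTF-8.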

By the way, contrary to what @onteria_ says, JSON does use UTF-8:

The character encoding of JSON text is always Unicode. UTF-8 is the only encoding that makes sense on the wire, but UTF-16 and UTF-32 are also permitted.


* Or, at least, it's not outputting JSON as defined in RFC 4627. However, there are other definitions of JSON, by which scalar values are allowed.

† JSON may be in UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, or UTF-32BE.


OK, so after you open the database connection in your PHP script, add this line and it should work. At least, it solved my problem:

mysql_query('SET CHARACTER SET utf8');


Yes, json_encode() escapes non-ASCII characters. If you decode the result, you'll get your original string back:

    $string = "こんにちは";
    echo "ENCODING: " . mb_detect_encoding($string) . "\n";
    $encoded = json_encode($string);
    echo "ENCODED JSON: $encoded\n";
    $decoded = json_decode($encoded);
    echo "DECODED JSON: $decoded\n";

Output:

    ENCODING: UTF-8
    ENCODED JSON: "\u3053\u3093\u306b\u3061\u306f"
    DECODED JSON: こんにちは

EDIT: It's worth noting that:

JSON uses Unicode exclusively.

The self-documenting format that describes structure and field names as well as specific values;

Source: http://www.json.org/fatfree.html

It uses Unicode, NOT UTF-8. This FAQ explains the difference between UTF-8 and Unicode:

http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8

When you use json_encode(), your non-ASCII characters get escaped as Unicode code points by default. For example, こ is code point U+3053.
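To make the code point versus encoding distinction concrete, here is a small sketch, assuming the source file is saved as UTF-8. It uses only json_encode(), json_decode(), strlen() and bin2hex():

```php
<?php
// Assumption: this file is saved as UTF-8, so $char holds the UTF-8 bytes of こ.
$char = 'こ';

// The code point is an abstract number (U+3053); json_encode() writes it as an escape.
echo json_encode($char), "\n"; // "\u3053"

// UTF-8 is one way to encode that code point as bytes: three of them here.
echo strlen($char), "\n";      // 3
echo bin2hex($char), "\n";     // e38193

// Decoding the escape yields the same UTF-8 bytes again.
var_dump(json_decode('"\u3053"') === $char); // bool(true)
```

So the escape names the code point, while the bytes on disk or on the wire depend on which Unicode encoding is in use.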