How do I decode a string with escaped unicode? (javascript)


Edit (2017-10-12):

@MechaLynx and @Kevin-Weber note that unescape() is deprecated in non-browser environments and does not exist in TypeScript. decodeURIComponent works as a drop-in replacement in this case. For broader compatibility, use the following instead:

decodeURIComponent(JSON.parse('"http\\u00253A\\u00252F\\u00252Fexample.com"'));
> 'http://example.com'

Original answer:

unescape(JSON.parse('"http\\u00253A\\u00252F\\u00252Fexample.com"'));
> 'http://example.com'

You can offload all the work of resolving the \u escapes to JSON.parse.
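To illustrate why JSON.parse can do this work: a JSON string literal may contain \uXXXX escapes, and the parser resolves them to characters. The following is a minimal sketch; the variable names are only illustrative and are not part of the original answer.

```javascript
// The raw text contains literal backslash-u sequences (written as \\u in JS source).
const raw = "http\\u00253A\\u00252F\\u00252Fexample.com";

// Wrapping the text in double quotes turns it into a JSON string literal;
// JSON.parse then resolves each \u0025 escape to "%".
const percentEncoded = JSON.parse('"' + raw + '"');
console.log(percentEncoded); // http%3A%2F%2Fexample.com

// A second step is still needed to undo the percent-encoding.
const decoded = decodeURIComponent(percentEncoded);
console.log(decoded); // http://example.com
```

Note that this simple wrapping assumes the raw text contains no unescaped double quotes or control characters, which would make the JSON literal invalid.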


UPDATE: Please note that this solution is intended for older browsers or non-browser platforms, and is kept alive for instructional purposes. Please refer to @radicand's answer below for a more up-to-date answer.


This is a Unicode-escaped string. The string was first percent-encoded (escaped), and then each percent sign was written as a Unicode escape sequence (\u0025). To convert it back to normal:

var x = "http\\u00253A\\u00252F\\u00252Fexample.com";
var r = /\\u([\d\w]{4})/gi;
x = x.replace(r, function (match, grp) {
    return String.fromCharCode(parseInt(grp, 16));
});
console.log(x);  // http%3A%2F%2Fexample.com
x = unescape(x);
console.log(x);  // http://example.com

To explain: I use a regular expression to look for \u0025. However, since I need only a part of this string for my replace operation, I use parentheses to isolate the part I'm going to reuse, 0025. This isolated part is called a group.

The gi part at the end of the expression denotes it should match all instances in the string, not just the first one, and that the matching should be case insensitive. This might look unnecessary given the example, but it adds versatility.
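The effect of each flag can be seen in a small sketch. Note one assumption: the pattern here uses a hex-only character class, [\da-f], rather than the [\d\w] class from the answer, so that the case-insensitive flag has a visible effect on the captured digits. The sample string and helper name are made up for illustration.

```javascript
// Literal text: \u0041\u004a\u004A (three escapes, the last with an uppercase hex digit)
const sample = "\\u0041\\u004a\\u004A";

// Converts the captured four hex digits into the corresponding character.
const decodeMatch = (match, grp) => String.fromCharCode(parseInt(grp, 16));

// No flags: only the first escape is replaced.
const firstOnly = sample.replace(/\\u([\da-f]{4})/, decodeMatch);
console.log(firstOnly); // A\u004a\u004A

// g only: all lowercase-hex escapes are replaced, but \u004A is skipped.
const globalOnly = sample.replace(/\\u([\da-f]{4})/g, decodeMatch);
console.log(globalOnly); // AJ\u004A

// g and i together: every escape is replaced regardless of letter case.
const all = sample.replace(/\\u([\da-f]{4})/gi, decodeMatch);
console.log(all); // AJJ
```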

Now, to convert from one string to the next, I need to execute some steps on each group of each match, and I can't do that by simply transforming the string. Helpfully, the String.replace operation can accept a function, which will be executed for each match. The return of that function will replace the match itself in the string.

I use the second parameter this function accepts, which is the captured group, and convert it to the corresponding character with String.fromCharCode. At that point the string is still percent-encoded, so I then use the built-in unescape function to decode it to its proper form.
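The steps above can be collected into a reusable function. This is a sketch under a few assumptions: the name decodeEscapedUnicode is hypothetical, the pattern is narrowed to hex digits only, and decodeURIComponent is swapped in for the deprecated unescape.

```javascript
// Hypothetical helper: resolves literal \uXXXX sequences, then undoes percent-encoding.
function decodeEscapedUnicode(input) {
  // Step 1: replace each \uXXXX with the character for that code unit.
  const withPercent = input.replace(/\\u([\da-f]{4})/gi, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
  // Step 2: decode the percent-encoded result (used here instead of unescape).
  return decodeURIComponent(withPercent);
}

console.log(decodeEscapedUnicode("http\\u00253A\\u00252F\\u00252Fexample.com"));
// http://example.com
```

Unlike unescape, decodeURIComponent throws a URIError on malformed percent sequences, so callers may want to wrap the call in try/catch.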


Note that the use of unescape() is deprecated and doesn't work with the TypeScript compiler, for example.

Based on radicand's answer and the comments section below, here's an updated solution:

var string = "http\\u00253A\\u00252F\\u00252Fexample.com";
decodeURIComponent(JSON.parse('"' + string.replace(/\"/g, '\\"') + '"'));

http://example.com
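For inputs that are not guaranteed to be well formed, the same approach could be wrapped in a function with a fallback. This is only a sketch: the name decodeViaJson and the choice to return the input unchanged on failure are assumptions, not part of the answers above.

```javascript
// Hypothetical wrapper around the JSON.parse + decodeURIComponent approach.
function decodeViaJson(input) {
  // Escape embedded double quotes so the wrapped text stays a valid JSON string literal.
  const jsonLiteral = '"' + input.replace(/"/g, '\\"') + '"';
  try {
    return decodeURIComponent(JSON.parse(jsonLiteral));
  } catch (err) {
    // JSON.parse throws SyntaxError on invalid escapes;
    // decodeURIComponent throws URIError on malformed percent sequences.
    // Assumption: falling back to the original input is acceptable.
    return input;
  }
}

console.log(decodeViaJson("http\\u00253A\\u00252F\\u00252Fexample.com")); // http://example.com
console.log(decodeViaJson('a "quoted" \\u0041')); // a "quoted" A
console.log(decodeViaJson("100%")); // 100%
```

Input that already contains backslash-escaped quotes or raw control characters will not parse as a JSON literal and takes the fallback path.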