How to convert \uXXXX unicode to UTF-8 using console tools in *nix


Might be a bit ugly, but echo -e should do it:

echo -en "$(curl "$URL")"

-e interprets escapes, -n suppresses the newline echo would normally add.

Note: The \u escape works in the bash builtin echo, but not /usr/bin/echo.

As pointed out in the comments, the \u escape requires bash 4.2 or later, and the 4.2.x releases have a bug handling values in the range 0x80-0xff.
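If your bash is too old for \u, or is one of the affected 4.2.x releases, one workaround is to do the decoding by hand. The following is only a minimal sketch in plain POSIX shell, not something from the bash manual: the function names (emit_byte, cp_to_utf8, decode_u_escapes) are made up for illustration, it handles only four-digit \uXXXX escapes, and it does not combine surrogate pairs into a single character.

```shell
# Emit one raw byte given its numeric value, via a printf octal escape.
emit_byte() {
  printf "\\$(printf '%03o' "$1")"
}

# Encode one code point (four hex digits, so BMP only) as UTF-8 bytes.
cp_to_utf8() {
  cp=$((0x$1))
  if [ "$cp" -lt 128 ]; then
    emit_byte "$cp"
  elif [ "$cp" -lt 2048 ]; then
    emit_byte $(( 0xC0 | (cp >> 6) ))
    emit_byte $(( 0x80 | (cp & 0x3F) ))
  else
    emit_byte $(( 0xE0 | (cp >> 12) ))
    emit_byte $(( 0x80 | ((cp >> 6) & 0x3F) ))
    emit_byte $(( 0x80 | (cp & 0x3F) ))
  fi
}

# Replace every \uXXXX in the first argument, left to right.
decode_u_escapes() {
  esc='\u'
  rest=$1
  while :; do
    case $rest in
      *"$esc"*) ;;
      *) break ;;
    esac
    prefix=${rest%%"$esc"*}   # text before the first \u
    printf '%s' "$prefix"
    rest=${rest#*"$esc"}      # text after the first \u
    case $rest in
      [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]*)
        tail=${rest#????}     # everything after the four hex digits
        cp_to_utf8 "${rest%"$tail"}"
        rest=$tail
        ;;
      *)
        printf '%s' "$esc"    # not a valid escape: keep it literally
        ;;
    esac
  done
  printf '%s\n' "$rest"
}
```

Usage would look like decode_u_escapes "$(curl "$URL")". Because the bytes are written with octal escapes, the result should not depend on the current locale.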


I don't know which distribution you are using, but uni2ascii should be available in its package repositories.

$ sudo apt-get install uni2ascii

It only depends on libc6, so it's a lightweight solution (uni2ascii i386 4.18-2 is 55.0 kB on Ubuntu)!

Then to use it:

$ echo 'Character 1: \u0144, Character 2: \u00f3' | ascii2uni -a U -q
Character 1: ń, Character 2: ó


I found native2ascii from the JDK to be the best way to do it:

native2ascii -encoding UTF-8 -reverse src.txt dest.txt

A detailed description is here: http://docs.oracle.com/javase/1.5.0/docs/tooldocs/windows/native2ascii.html

Update: native2ascii is no longer available as of JDK 9: https://bugs.openjdk.java.net/browse/JDK-8074431