What's the difference between hex code (\x) and unicode (\u) chars?

r unicode hex

The escape sequence \xNN inserts the raw byte NN into a string, whereas \uNNNN (one to four hex digits) inserts the UTF-8 bytes for the Unicode code point NNNN into a UTF-8 string:

> charToRaw('\xA3')
[1] a3
> charToRaw('\uA3')
[1] c2 a3
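The difference in stored bytes can also be checked without reading raw dumps. The following is a minimal sketch; the variable names `raw_byte` and `code_point` are only illustrative:

```r
# "\xA3" stores the single byte 0xA3; "\u00A3" stores U+00A3 encoded as UTF-8.
raw_byte   <- "\xA3"
code_point <- "\u00A3"

# Count bytes rather than characters: the raw-byte string is not valid UTF-8,
# so counting characters on it may fail in a UTF-8 locale.
n_raw   <- nchar(raw_byte, type = "bytes")    # 1
n_point <- nchar(code_point, type = "bytes")  # 2

# Recover the code point from the UTF-8 string.
cp <- utf8ToInt(code_point)                   # 163, i.e. 0xA3
```

Both strings refer to the same number, 0xA3, but only the \u form records it as a code point; the \x form is just a byte whose meaning depends on the encoding it is later read in.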

These two types of escape sequence cannot be mixed in the same string:

> '\ua3\xa3'
Error: mixing Unicode and octal/hex escapes in a string is not allowed

This is because the escape sequences also define the encoding of the string. A \uNN sequence explicitly marks the entire string as "UTF-8", whereas \xNN leaves it with the default "unknown" mark, which R treats as the native encoding:

> Encoding('\xa3')
[1] "unknown"
> Encoding('\ua3')
[1] "UTF-8"
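The encoding mark is stored per element, and changing it does not touch the bytes. A small sketch of both points, with `x` and `y` as illustrative names:

```r
# Each element of a character vector carries its own encoding mark.
# Pure ASCII strings are always reported as "unknown".
x <- c("abc", "\xa3", "\u00a3")
marks <- Encoding(x)   # "unknown" "unknown" "UTF-8"

# Declaring an encoding only relabels the string; the stored byte stays a3.
y <- "\xa3"
Encoding(y) <- "latin1"
bytes_after <- charToRaw(y)
```

Because the mark is only a label, declaring the wrong encoding does not raise an error. It simply causes the bytes to be misread when the string is later converted.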

This becomes important when printing strings, as they need to be converted into the appropriate output encoding (e.g., that of your console). Strings with a defined encoding can be converted appropriately (see enc2native), but those with an "unknown" encoding are simply output as-is:

On Linux, your console is probably expecting UTF-8 text, and as 0xA3 is not a valid UTF-8 sequence, it gives you "�".
On Windows, your console is probably expecting Windows-1252 text, and as 0xA3 is the correct encoding for "£", that's what you see. (When the string is \uA3, a conversion from UTF-8 to Windows-1252 takes place.)

If the encoding is set explicitly, the appropriate conversion will take place on Linux:

> s <- '\xa3'
> Encoding(s) <- 'latin1'
> cat(s)
£
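If you would rather convert the bytes than rely on conversion at print time, base R's iconv() and enc2utf8() can do this. The sketch below assumes the byte really is Latin-1; the variable names are illustrative:

```r
s <- "\xa3"

# Option 1: state the source encoding and convert in one step.
u1 <- iconv(s, from = "latin1", to = "UTF-8")

# Option 2: mark the string as latin1, then let enc2utf8() convert it.
s2 <- s
Encoding(s2) <- "latin1"
u2 <- enc2utf8(s2)

# Both results hold the same bytes as the \u escape: c2 a3.
same_bytes <- identical(charToRaw(u1), charToRaw("\u00a3")) &&
  identical(charToRaw(u2), charToRaw("\u00a3"))
```

After conversion the string is marked "UTF-8", so it should print correctly on any console that can display the character.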
