How do I escape a Unicode string with Ruby?


In Ruby 1.8.x, String#inspect may be what you are looking for, e.g.

>> multi_byte_str = "hello\330\271!"
=> "hello\330\271!"
>> multi_byte_str.inspect
=> "\"hello\\330\\271!\""
>> puts multi_byte_str.inspect
"hello\330\271!"
=> nil

In Ruby 1.9, if you want multi-byte characters to have their component bytes escaped, you can say something like:

>> multi_byte_str.bytes.to_a.map(&:chr).join.inspect
=> "\"hello\\xD8\\xB9!\""
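As an alternative sketch of the same byte-level escaping, assuming Ruby 1.9 or later: retagging a copy of the string as binary makes String#inspect print every non-ASCII byte as a \xHH escape. The helper name escape_bytes is made up for this illustration and does not come from the answer above.

```ruby
# Hypothetical helper (name is an assumption for this example): return the
# inspect form of a string with each non-ASCII byte written as \xHH.
def escape_bytes(str)
  # dup so the caller's string keeps its original encoding tag
  str.dup.force_encoding(Encoding::BINARY).inspect
end

multi_byte_str = "hello\u0639!"   # same bytes as "hello\330\271!"
puts escape_bytes(multi_byte_str)
```

Because only the encoding tag changes, the underlying bytes are untouched; the result is the inspect form, so it includes the surrounding double quotes.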

In both Ruby 1.8 and 1.9, if you are instead interested in the (escaped) Unicode code points, you could do this (though it escapes printable ASCII characters too):

>> multi_byte_str.unpack('U*').map{ |i| "\\u" + i.to_s(16).rjust(4, '0') }.join
=> "\\u0068\\u0065\\u006c\\u006c\\u006f\\u0639\\u0021"
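If escaping the printable characters is unwanted, one possible variation is to escape only code points outside ASCII. This is a minimal sketch, not part of the original answer: the helper name escape_non_ascii is invented here, and it assumes the input is valid UTF-8 with all characters in the Basic Multilingual Plane (code points up to U+FFFF), since longer code points would not fit the four-digit \uXXXX form.

```ruby
# Hypothetical helper (name is an assumption for this example): leave ASCII
# characters as they are and write every other code point as \uXXXX.
# Assumes code points fit in four hex digits (U+0000..U+FFFF).
def escape_non_ascii(str)
  str.unpack('U*').map { |cp|
    cp < 128 ? cp.chr : format('\\u%04x', cp)
  }.join
end

puts escape_non_ascii("hello\u0639!")
```

For the example string this leaves "hello" and "!" readable and escapes only the Arabic letter.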


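To use a Unicode character in a Ruby string literal (Ruby 1.9 and later), use the "\uXXXX" escape, where XXXX is the character's Unicode code point written as four hexadecimal digits. Code points above U+FFFF need the braced form "\u{XXXXX}". See http://leejava.wordpress.com/2009/03/11/unicode-escape-in-ruby/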
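A short sketch of the escape in use, assuming Ruby 1.9 or later. The variable names are only for illustration; the first character is the same Arabic letter used in the earlier examples.

```ruby
# "\uXXXX" takes exactly four hex digits naming a code point.
ain = "\u0639"                 # ARABIC LETTER AIN, U+0639
ain_codepoint = ain.unpack('U*').first
ain_bytes = ain.bytes.to_a     # UTF-8 encoding of U+0639 is two bytes

# Code points above U+FFFF use the braced form "\u{...}".
grin = "\u{1F600}"             # U+1F600, outside the Basic Multilingual Plane
grin_codepoint = grin.unpack('U*').first

puts ain_codepoint.to_s(16)
puts ain_bytes.map { |b| b.to_s(16) }.join(' ')
puts grin_codepoint.to_s(16)
```

The escape names the code point, not the bytes: U+0639 is stored as the two UTF-8 bytes D8 B9, which is why the 1.8 output above shows \330\271.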


If you have Rails kicking around you can use the JSON encoder for this:

require 'active_support'
x = ActiveSupport::JSON.encode('µ')
# x is now "\u00b5"

The usual non-Rails JSON encoder doesn't "\u"-ify Unicode by default.
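That said, the json library's generator accepts an ascii_only option that asks it to escape non-ASCII characters. The following is a sketch assuming a json version that supports the option; the value is wrapped in an array because older json releases only generate top-level arrays and objects.

```ruby
require 'json'

# Default behaviour: the non-ASCII character is emitted as-is.
plain = JSON.generate(["\u00b5"])

# With :ascii_only, non-ASCII characters come out as \u escapes instead.
escaped = JSON.generate(["\u00b5"], :ascii_only => true)

puts plain
puts escaped
```

Both strings parse back to the same array, so the option changes only how the JSON text is written, not the data it represents.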