How do I encode/decode HTML entities in Ruby?


To encode the characters, you can use CGI.escapeHTML:

string = CGI.escapeHTML('test "escaping" <characters>')

To decode them, there is CGI.unescapeHTML:

CGI.unescapeHTML("test &quot;unescaping&quot; &lt;characters&gt;")

Of course, before that you need to require the CGI library:

require 'cgi'
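
Putting the pieces above together, here is a minimal round-trip sketch using only the standard library. The sample strings are my own, chosen for illustration. It also shows a limit worth knowing: as far as I can tell, CGI.unescapeHTML only handles the basic entities (&amp;amp;, &amp;lt;, &amp;gt;, &amp;quot;, the apostrophe) and numeric references, so other named entities come back unchanged.

```ruby
require 'cgi'

original = 'test "escaping" <characters> & more'

# Encode: &, <, > and the quote characters become entities.
escaped = CGI.escapeHTML(original)
puts escaped
# => test &quot;escaping&quot; &lt;characters&gt; &amp; more

# Decode: reverses the encoding above.
restored = CGI.unescapeHTML(escaped)
puts restored == original
# => true

# Numeric character references are decoded...
puts CGI.unescapeHTML('I&#39;m')
# => I'm

# ...but named entities outside the basic set are left as-is.
puts CGI.unescapeHTML('caf&eacute;')
# => caf&eacute;
```

If you need the full set of named entities decoded, a dedicated library is the better fit.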

And if you're in Rails, you don't need CGI to encode the string. There's the h helper method:

<%= h 'escaping <html>' %>
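
Outside of Rails you can get the same kind of helper from the standard library: ERB::Util provides html_escape, with h as an alias. A small sketch, assuming a plain ERB template rather than a Rails view (the comment variable and the template string are made up for the example):

```ruby
require 'erb'

# html_escape is the method behind the h helper; h is an alias for it.
puts ERB::Util.html_escape('escaping <html>')
# => escaping &lt;html&gt;

# In a plain ERB template, include ERB::Util to bring h into scope.
include ERB::Util

comment = 'escaping <html> & "quotes"'  # hypothetical user input
template = ERB.new('<p><%= h(comment) %></p>')
puts template.result(binding)
# => <p>escaping &lt;html&gt; &amp; &quot;quotes&quot;</p>
```

Note that Rails 3 and later escape the output of <%= %> by default, so calling h explicitly mostly matters in older Rails versions or in plain ERB like the above.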


HTMLEntities can do it:

: jmglov@laurana; sudo gem install htmlentities
Successfully installed htmlentities-4.2.4
: jmglov@laurana; irb
irb(main):001:0> require 'htmlentities'
=> []
irb(main):002:0> HTMLEntities.new.decode "¡I&#39;m highly annoyed with character references!"
=> "¡I'm highly annoyed with character references!"


I think the Nokogiri gem is also a good choice. It is very stable and has a large community of contributors.

Samples:

a = Nokogiri::HTML.parse "foo&nbsp;b&auml;r"
a.text
=> "foo bär"

or

a = Nokogiri::HTML.parse "¡I&#39;m highly annoyed with character references!"
a.text
=> "¡I'm highly annoyed with character references!"