Splitting emoji, safely


JavaScript's strings are UTF-16, so your emoji is internally represented as two code units:

> "\ud83d\ude0e" === "😎"
true
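As a quick check, the sketch below looks at that same emoji one code unit at a time. It uses only standard String methods; the variable names are made up for illustration.

```javascript
// The emoji U+1F60E lies outside the Basic Multilingual Plane,
// so UTF-16 stores it as a surrogate pair.
const emoji = "\ud83d\ude0e";

// length counts UTF-16 code units, not characters.
const unitCount = emoji.length; // 2

// charCodeAt reads one code unit at a time.
const high = emoji.charCodeAt(0).toString(16); // "d83d"
const low = emoji.charCodeAt(1).toString(16);  // "de0e"

// codePointAt combines the pair back into a single code point.
const codePoint = emoji.codePointAt(0).toString(16); // "1f60e"

console.log(unitCount, high, low, codePoint);
```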

The String.prototype.split function doesn't really care about surrogate pairs in UTF-16, so split("") naively separates the individual code units and breaks your emoji. JavaScript's basic string operations (indexing, length, split) work on code units, not on whole characters.
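To make the breakage concrete, here is a small sketch of what split("") returns for the emoji. The variable names are illustrative only.

```javascript
const emoji = "😎";

// split("") cuts between every UTF-16 code unit,
// so the surrogate pair is torn into two lone surrogates.
const units = emoji.split("");

console.log(units.length);          // 2
console.log(units[0] === "\ud83d"); // true (high surrogate on its own)
console.log(units[1] === "\ude0e"); // true (low surrogate on its own)

// Neither half is a valid character by itself. Joining them in the
// original order restores the emoji; any other order does not.
const rejoined = units.join("");
const reversed = units.slice().reverse().join("");

console.log(rejoined === emoji); // true
console.log(reversed === emoji); // false
```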

There's no easy way to make split deal with it. You need a library like spliddit, which keeps surrogate pairs together instead of cutting them into individual code units.
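For a rough idea of what surrogate-aware splitting involves, here is a minimal hand-written sketch. It is not spliddit's actual implementation, and the helper name splitByCodePoint is hypothetical.

```javascript
// Hypothetical helper: split a string into code points by keeping
// each high surrogate together with the low surrogate that follows it.
function splitByCodePoint(str) {
  const result = [];
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    const next = str.charCodeAt(i + 1); // NaN past the end of the string
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const nextIsLow = next >= 0xdc00 && next <= 0xdfff;
    if (isHigh && nextIsLow) {
      result.push(str[i] + str[i + 1]); // keep the pair as one element
      i++;                              // skip the low surrogate
    } else {
      result.push(str[i]);
    }
  }
  return result;
}

const parts = splitByCodePoint("a😎b");
console.log(parts.length);       // 3
console.log(parts[1] === "😎");  // true
```

For this input the built-in string iterator (Array.from("a😎b") or [..."a😎b"]) gives the same three elements, since it also steps by code point. Neither approach keeps characters made of several code points together.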

I'm not 100% familiar with the terminology, so please edit my answer as needed.


spliddit currently can't split some text correctly. For example, it fails to split this Hindi text into its 5 characters: "अनुच्छेद"
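The sketch below shows why stepping by code point is not enough for this word. It only counts code points with the built-in Array.from and makes no claim about spliddit's exact output; the variable names are illustrative.

```javascript
// The word अनुच्छेद written out code point by code point.
const word = "\u0905\u0928\u0941\u091a\u094d\u091b\u0947\u0926";

// Stepping by code point treats every vowel sign and the virama
// as a separate element.
const codePoints = Array.from(word);

console.log(word.length);       // 8 (every code point here fits in one code unit)
console.log(codePoints.length); // 8, not the 5 characters mentioned above
```

The combining marks (for example U+0941, the vowel sign after न) end up as elements of their own, which is why a splitter needs grapheme rules rather than surrogate-pair handling alone.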

You need the grapheme-splitter library: https://github.com/orling/grapheme-splitter

It is a full implementation of the UAX-29 Unicode standard and will split even the most exotic letters, emoji being just one of many use cases.
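If adding a dependency is not an option, newer runtimes ship a built-in grapheme segmenter. The sketch below uses the standard Intl.Segmenter API instead of grapheme-splitter, so it is a swapped-in alternative and not that library's API. It assumes a runtime that provides Intl.Segmenter (recent Node.js versions and browsers), and the helper name toGraphemes is made up here.

```javascript
// Hypothetical helper built on the standard Intl.Segmenter API.
function toGraphemes(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  // segment() returns an iterable of { segment, index, input } objects.
  return Array.from(segmenter.segment(str), (s) => s.segment);
}

const face = toGraphemes("😎");

// A family emoji: three emoji joined by zero-width joiners (U+200D).
const family = toGraphemes("\u{1F468}\u200D\u{1F469}\u200D\u{1F467}");

console.log(face.length);   // 1
console.log(family.length); // 1
```

Cluster boundaries for scripts such as Devanagari can depend on the Unicode version the runtime implements, so results for text like the Hindi example may differ between environments.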