Trouble with strings with <U+0092> Unicode characters


I'm not sure this will work for you, but for the same symptoms I converted the strings to ASCII:

x <- iconv(x, from = "", to = "ASCII", sub = "byte")

Each non-ASCII byte is replaced by "<xx>", where xx is the hex code of the byte.

You can then use gsub to replace the hex codes with whatever values suit you.
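
A minimal sketch of that approach, assuming the string is UTF-8 encoded and that the stray character should become a plain apostrophe. Both are assumptions, and the example string is made up:

```r
# Hypothetical example string containing U+0092 (R marks it as UTF-8).
x <- "it\u0092s"

# Convert to ASCII; every byte that cannot be represented is written as <xx>.
# from = "UTF-8" is given explicitly so the result does not depend on the
# session locale (the call above uses "" to mean the current locale).
ascii <- iconv(x, from = "UTF-8", to = "ASCII", sub = "byte")
print(ascii)

# U+0092 is the two bytes C2 92 in UTF-8, so it shows up as "<c2><92>".
# Replace that marker with the value that suits you, here an apostrophe.
cleaned <- gsub("<c2><92>", "'", ascii, fixed = TRUE)
print(cleaned)
```

If the input is Latin-1 rather than UTF-8, the same character is a single byte and would appear as "<92>" instead, so it is worth printing the converted string before writing the gsub pattern.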


I've had a bit of a horrible time with this pernicious little problem, but I think/hope I've finally got somewhere.

After messing around with the read_csv option locale = locale(encoding = "xyz") and trying various combinations of other solutions (the gsub solution didn't work), I tried the stringi solution...

It didn't work either. But stringi has a function stri_enc_detect, which I ran on the problem values: stri_enc_detect(x). It gave me an encoding I hadn't tried, in this case windows-1252, which I promptly set in the read_csv options: locale = locale(encoding = "windows-1252")

Hey presto, it's displaying correctly now.
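
That sequence can be sketched as follows, assuming the stringi and readr packages are installed. The temporary file and its contents are invented for illustration. Byte 0x92 is the right single quotation mark in windows-1252, while decoding it as ISO-8859-1 yields the control character U+0092. Encoding detection is a heuristic, so treat its output as a list of candidates to try rather than a definite answer.

```r
library(stringi)
library(readr)

# Hypothetical input: a tiny CSV whose second line contains byte 0x92.
path <- tempfile(fileext = ".csv")
bytes <- c(charToRaw("text\nit"), as.raw(0x92), charToRaw("s fine\n"))
writeBin(bytes, path)

# Ask stringi for candidate encodings of the raw file contents.
# The result is a list holding one data frame of guesses with confidences.
guesses <- stri_enc_detect(readBin(path, what = "raw", n = file.size(path)))
print(guesses[[1]])

# Re-read the file with the encoding that worked in the post above.
df <- read_csv(path,
               locale = locale(encoding = "windows-1252"),
               col_types = cols(.default = col_character()))
print(df$text)
```

On a real file, the guess with the highest confidence is a sensible first value to pass to locale(encoding = ...), but short or mostly-ASCII inputs can be misdetected.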


In my case I managed to replace the weird character with the code below:

Table$Column[Table$Column == "your weird character"] <- "new character"
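
Note that this assignment only changes cells whose entire value equals the weird character. Here is a small sketch on a made-up data frame (Table, Column and the replacement apostrophe are placeholders), with a gsub call added as one possible way to handle cells where the character sits inside longer text:

```r
# Hypothetical data: one cell is exactly U+0092, one merely contains it.
Table <- data.frame(Column = c("\u0092", "it\u0092s", "plain"),
                    stringsAsFactors = FALSE)

# Exact-match assignment: only the first cell is replaced.
Table$Column[Table$Column == "\u0092"] <- "'"
print(Table$Column)

# Substring replacement: also fixes the character inside "it\u0092s".
# fixed = TRUE treats the pattern as literal text, not a regular expression.
Table$Column <- gsub("\u0092", "'", Table$Column, fixed = TRUE)
print(Table$Column)
```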