How to properly concatenate bidi strings in R?


There is actually no problem with gsub:

    text <- dput("یہ جملہ ایک مثال کے لیے استعمال کیا جا رہا ہے")
    # "<U+06CC><U+06C1> <U+062C><U+0645><U+0644><U+06C1> <U+0627><U+06CC><U+06A9><U+0645><U+062B><U+0627><U+0644> <U+06A9><U+06D2> <U+0644><U+06CC><U+06D2> <U+0627><U+0633><U+062A><U+0639><U+0645><U+0627><U+0644> <U+06A9><U+06CC><U+0627> <U+062C><U+0627> <U+0631><U+06C1><U+0627> <U+06C1><U+06D2>"

    pattern <- dput("کیا جا")
    # "<U+06A9><U+06CC><U+0627> <U+062C><U+0627>"

    replaceWith <- dput(paste0("<somemark>", pattern, "</somemark>"))
    # "<somemark><U+06A9><U+06CC><U+0627> <U+062C><U+0627></somemark>"

    dput(gsub(pattern, replaceWith, text))
    # "<U+06CC><U+06C1> <U+062C><U+0645><U+0644><U+06C1> <U+0627><U+06CC><U+06A9> <U+0645><U+062B><U+0627><U+0644> <U+06A9><U+06D2> <U+0644><U+06CC><U+06D2> <U+0627><U+0633><U+062A><U+0639><U+0645><U+0627><U+0644> <somemark><U+06A9><U+06CC><U+0627> <U+062C><U+0627></somemark> <U+0631><U+06C1><U+0627> <U+06C1><U+06D2>"
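One way to convince yourself that the stored string is correct, independently of how a console or browser displays it, is to compare character positions. The following is only a minimal sketch: the variable names `marked`, `open_pos`, `match_pos` and `stripped` are mine, and the strings are written with `\u` escapes (taken from the `dput` output above) so the example does not depend on the encoding of the source file.

```r
# Same sentence and pattern as above, written as \u escapes.
text <- paste(
  "\u06cc\u06c1", "\u062c\u0645\u0644\u06c1", "\u0627\u06cc\u06a9",
  "\u0645\u062b\u0627\u0644", "\u06a9\u06d2", "\u0644\u06cc\u06d2",
  "\u0627\u0633\u062a\u0639\u0645\u0627\u0644", "\u06a9\u06cc\u0627",
  "\u062c\u0627", "\u0631\u06c1\u0627", "\u06c1\u06d2"
)
pattern <- "\u06a9\u06cc\u0627 \u062c\u0627"

marked <- gsub(pattern, paste0("<somemark>", pattern, "</somemark>"),
               text, fixed = TRUE)

# Character position of the opening tag in the result, and of the match
# in the original text.
open_pos  <- regexpr("<somemark>", marked, fixed = TRUE)[1]
match_pos <- regexpr(pattern, text, fixed = TRUE)[1]

# Removing the tags again should give back the original sentence.
stripped <- gsub("</?somemark>", "", marked)

# The tag was inserted exactly where the match starts, and nothing else
# in the stored (logical) order changed.
stopifnot(open_pos == match_pos, identical(stripped, text))
```

If both checks pass, any surprising appearance of the result comes from how the bidirectional text is displayed, not from `gsub`.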

The rendering of the result (a string containing both right-to-left and left-to-right characters) also seems quite logical to me:

  1. The beginning of the string contains right-to-left characters, so it is rendered from right to left:

یہ جملہ ایک مثال کے لیے استعمال

  2. Then the string continues with left-to-right characters. They are rendered left to right and added at the end (to the left of what was previously rendered):

یہ جملہ ایک مثال کے لیے استعمال <somemark>

  3. Then the string continues with right-to-left characters. They are rendered right to left and added at the end:

یہ جملہ ایک مثال کے لیے استعمال <somemark>کیا جا

  4. Then the string continues with left-to-right characters. They are rendered left to right and added at the end:

یہ جملہ ایک مثال کے لیے استعمال <somemark>کیا جا</somemark>

  5. Finally, the string ends with right-to-left characters. They are rendered right to left and added at the end:

یہ جملہ ایک مثال کے لیے استعمال <somemark>کیا جا</somemark> رہا ہے
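The walk-through above treats the result as five consecutive runs. A rough sketch of how to recover those runs from the stored string follows. It assumes that anything of the form `<...>` is a left-to-right run and everything else is right-to-left, which holds for this example but is not a general direction test. The names `marked` and `runs` are mine.

```r
# Same sentence and pattern as in the answer, written as \u escapes.
text <- paste(
  "\u06cc\u06c1", "\u062c\u0645\u0644\u06c1", "\u0627\u06cc\u06a9",
  "\u0645\u062b\u0627\u0644", "\u06a9\u06d2", "\u0644\u06cc\u06d2",
  "\u0627\u0633\u062a\u0639\u0645\u0627\u0644", "\u06a9\u06cc\u0627",
  "\u062c\u0627", "\u0631\u06c1\u0627", "\u06c1\u06d2"
)
pattern <- "\u06a9\u06cc\u0627 \u062c\u0627"
marked <- gsub(pattern, paste0("<somemark>", pattern, "</somemark>"),
               text, fixed = TRUE)

# Split into alternating runs: either a whole tag or a stretch without "<".
runs <- regmatches(marked, gregexpr("<[^>]+>|[^<]+", marked))[[1]]

# One run per step of the list above, in stored (logical) order.
for (i in seq_along(runs)) {
  kind <- if (startsWith(runs[i], "<")) "left-to-right" else "right-to-left"
  cat(i, kind, nchar(runs[i]), "characters\n")
}
```

Pasting the runs back together with `paste(runs, collapse = "")` reproduces `marked`, so the split loses nothing.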

Your idea of what should be rendered does not seem any more logical to me, though I must admit I have no experience with right-to-left text rendering.

Anyway, if the formatting is meant to be interpreted by the renderer, like the <b>...</b> tags in HTML, then it works perfectly (in Markdown/HTML):

یہ جملہ ایک مثال کے لیے استعمال <b>کیا جا</b> رہا ہے

renders as

یہ جملہ ایک مثال کے لیے استعمال کیا جا رہا ہے

In Shiny, I have not managed to print anything but question marks:

???? ???????? ?????? ???????? ???? ?????? ?????????????? <somemark>?????? ????</somemark> ?????? ????
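For the case where the markup should be interpreted rather than shown, a Shiny app would need to emit the string as HTML instead of plain text. The sketch below is an assumption on my part, not something I got working against the question marks above, which look like a separate encoding issue. `mark_html` is a hypothetical helper, the output id `marked` is made up, and the input is assumed to contain no characters that need HTML escaping.

```r
# Hypothetical helper: wrap every literal occurrence of `pattern`
# in <b>...</b> so that an HTML renderer can style it.
mark_html <- function(text, pattern) {
  gsub(pattern, paste0("<b>", pattern, "</b>"), text, fixed = TRUE)
}

text <- paste(
  "\u06cc\u06c1", "\u062c\u0645\u0644\u06c1", "\u0627\u06cc\u06a9",
  "\u0645\u062b\u0627\u0644", "\u06a9\u06d2", "\u0644\u06cc\u06d2",
  "\u0627\u0633\u062a\u0639\u0645\u0627\u0644", "\u06a9\u06cc\u0627",
  "\u062c\u0627", "\u0631\u06c1\u0627", "\u06c1\u06d2"
)
pattern <- "\u06a9\u06cc\u0627 \u062c\u0627"

# Only build the app if shiny is installed; nothing is launched here.
if (requireNamespace("shiny", quietly = TRUE)) {
  ui <- shiny::fluidPage(
    # dir = "rtl" sets the base direction of the container to right-to-left.
    shiny::tags$div(dir = "rtl", shiny::htmlOutput("marked"))
  )
  server <- function(input, output, session) {
    # HTML() marks the string as markup so the tags are not escaped.
    output$marked <- shiny::renderUI(shiny::HTML(mark_html(text, pattern)))
  }
  app <- shiny::shinyApp(ui, server)  # start with shiny::runApp(app)
}
```

With `textOutput` the tags would be escaped and shown literally. `htmlOutput` with `HTML()` leaves them for the browser to interpret.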


I gave it a try. I did take the liberty of hard-coding the arguments instead of reading them from the session, though.

Server:

    output$mysub <- function(){ # (text=NULL, pattern=NULL)
      text <- "یہ جملہ ایک مثال کے لیے استعمال کیا جا رہا ہے"
      pattern <- "کیا جا"
      Encoding(text) <- "UTF-8"
      Encoding(pattern) <- "UTF-8"
      print(text)
      # Split the text around the first match of the pattern
      beforePattern <- substr(text, 1, regexpr(pattern, text)[1]-1)
      afterPattern <- substr(text, regexpr(pattern,text)[1] + nchar(pattern), nchar(text))
      replaceWith <- paste0("<somemark>", pattern, "</somemark>")
      # Note the swapped order: the part after the match comes first
      result <- paste(afterPattern, replaceWith, beforePattern)
      # result <- paste( beforePattern, replaceWith, afterPattern)
      # Encoding(result) <- "UTF-8"
      print(length(result))
      print(result)
      return(result)
    }

ui.R:

    h2( textOutput("mysub") )

The output I got on the Shiny web page was:

[image: bidi text output]
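The before/after split used in the server code can be pulled out into a small function, which makes the two candidate orderings easy to compare. This is a sketch only: `split_around`, `logical_order` and `swapped_order` are names I made up, and the match is treated as a literal string (`fixed = TRUE`) rather than a regular expression.

```r
# Hypothetical helper: split `text` around the first literal occurrence
# of `pattern`. Positions are counted in characters, not bytes.
split_around <- function(text, pattern) {
  m <- regexpr(pattern, text, fixed = TRUE)
  if (m[1] == -1L) {
    return(list(before = text, match = "", after = ""))
  }
  start <- m[1]
  end <- start + attr(m, "match.length") - 1L
  list(
    before = substr(text, 1L, start - 1L),
    match  = substr(text, start, end),
    after  = substr(text, end + 1L, nchar(text))
  )
}

text <- paste(
  "\u06cc\u06c1", "\u062c\u0645\u0644\u06c1", "\u0627\u06cc\u06a9",
  "\u0645\u062b\u0627\u0644", "\u06a9\u06d2", "\u0644\u06cc\u06d2",
  "\u0627\u0633\u062a\u0639\u0645\u0627\u0644", "\u06a9\u06cc\u0627",
  "\u062c\u0627", "\u0631\u06c1\u0627", "\u06c1\u06d2"
)
pattern <- "\u06a9\u06cc\u0627 \u062c\u0627"
parts <- split_around(text, pattern)

# Stored order kept as written: identical to what gsub() produces.
logical_order <- paste0(parts$before, "<somemark>", parts$match,
                        "</somemark>", parts$after)

# Order used in the server code: the tail of the sentence comes first.
# This changes the stored word order, not only how it is displayed.
swapped_order <- paste0(parts$after, "<somemark>", parts$match,
                        "</somemark>", parts$before)
```

Using `paste0` here avoids the extra spaces that `paste` inserts between its arguments, since `before` and `after` already carry the spaces adjacent to the match.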