Split a character vector into individual characters? (opposite of paste or stringr::str_c)
Yes, strsplit will do it. strsplit returns a list, so you can either use unlist to flatten the result into a single character vector, or use the list index [[1]] to access the first element.
    x <- paste(LETTERS, collapse = "")
    unlist(strsplit(x, split = ""))
    #  [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
    # [20] "T" "U" "V" "W" "X" "Y" "Z"
Or (note that it is not actually necessary to name the split argument):
    strsplit(x, "")[[1]]
    #  [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
    # [20] "T" "U" "V" "W" "X" "Y" "Z"
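To see why strsplit returns a list, it may help to pass a vector holding more than one string: each input element gets its own list entry. A minimal sketch, where the vector words is made up purely for illustration:

```r
# Hypothetical input: a character vector with two elements
words <- c("abc", "de")

# strsplit() returns one list element per input string
chars <- strsplit(words, "")
chars[[1]]  # the characters of "abc"
chars[[2]]  # the characters of "de"

# unlist() flattens everything into one vector,
# dropping the per-string grouping
all_chars <- unlist(chars)
all_chars
```

With a single input string the list has only one element, which is why [[1]] and unlist give the same result in the examples above.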
You can also split on NULL or character(0) for the same result.
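A quick way to check that the three split values behave the same, sketched with the same alphabet string (the variable names by_empty, by_null and by_char0 are only illustrative):

```r
x <- paste(LETTERS, collapse = "")

# Split the same string three ways
by_empty <- strsplit(x, "")[[1]]
by_null  <- strsplit(x, NULL)[[1]]
by_char0 <- strsplit(x, character(0))[[1]]

# Each comparison should report that the results match
identical(by_empty, by_null)
identical(by_empty, by_char0)
identical(by_empty, LETTERS)
```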
str_extract_all() from stringr offers a nice way to perform this operation:
    library(stringr)
    str_extract_all("ABCDEFGHIJKLMNOPQRSTUVWXYZ", boundary("character"))[[1]]
    #  [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U"
    # [22] "V" "W" "X" "Y" "Z"
To find which characters are repeated in sequence (the steps are shown one at a time for clarity; in practice, you would wrap them in a function):
    the_string <- "BaaaaaaH"
    # split string into characters
    the_runs <- strsplit(the_string, "")[[1]]
    # find runs
    result <- rle(the_runs)
    # find values that are repeated
    result$values[which(result$lengths > 1)]
    #> [1] "a"

    # retest with more runs
    the_string <- "BaabbccH"
    # split string into characters
    the_runs <- strsplit(the_string, "")[[1]]
    # find runs
    result <- rle(the_runs)
    # find values that are repeated
    result$values[which(result$lengths > 1)]
    #> [1] "a" "b" "c"
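One possible way to wrap those steps in a function is sketched below. The name repeated_chars is hypothetical, and the input check is an added assumption rather than part of the steps above:

```r
# Hypothetical helper: returns the characters that occur
# two or more times in a row within a single string
repeated_chars <- function(s) {
  # Assumption: the helper only handles one string at a time
  stopifnot(is.character(s), length(s) == 1)
  # Split into characters, then run-length encode them
  runs <- rle(strsplit(s, "")[[1]])
  # Keep only the values whose run is longer than one
  runs$values[runs$lengths > 1]
}

repeated_chars("BaaaaaaH")
#> [1] "a"
repeated_chars("BaabbccH")
#> [1] "a" "b" "c"
```

The run lengths themselves are available in the lengths component of the rle result, should the number of repetitions be needed as well.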