How to delete a segment of a string with a specific start and end in R using regular expressions?
You can do this with ease using the qdapRegex package that I maintain:
str = c("F14 : M114L","W15 : M116L, W15 : M118L","W15 : D111L, F14 : E112L, F14 : M116L")
library(qdapRegex)
rm_between(str, "\\s:", "L")
## [1] "F14" "W15, W15" "W15, F14, F14"
qdapRegex aims to be useful as it teaches. If you are interested in the regex used...
S("@rm_between", "\\s:", "L")
## [1] "(\\s:)(.*?)(L)"
gsub(S("@rm_between", "\\s:", "L"), "", str)
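If you would rather not load the package, the same lazy pattern can be sketched in base R. This is only an illustration built from the regex printed above; the `perl = TRUE` argument and the `cleaned` variable name are choices made for this sketch, not part of qdapRegex.

```r
# Input vector from the question
str <- c("F14 : M114L",
         "W15 : M116L, W15 : M118L",
         "W15 : D111L, F14 : E112L, F14 : M116L")

# "\\s:" matches the whitespace and colon that start the segment,
# ".*?" lazily consumes as little as possible, and "L" ends the segment.
# perl = TRUE is an assumption of this sketch (PCRE handles the lazy quantifier).
cleaned <- gsub("\\s:.*?L", "", str, perl = TRUE)

cleaned
```

The lazy `.*?` matters here: a greedy `.*` would run to the last `L` in each string and remove everything between the first colon and the end.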
A couple of approaches.
Take the first few letters if it's always three:
substr(str,1,3)
I personally like stringr too. It makes extraction really straightforward. Pattern what you want, not what you don't want.
library(stringr)
str_extract(str,"[A-Z][0-9]*")
I've simplified these for a vector, but since your strings contain comma-separated sub-elements, you'll need something like:
splits <- strsplit(str,", ")
result <- lapply(splits, substr, start = 1, stop = 3 )
or
result <- lapply(splits, str_extract, pattern = "[A-Z][0-9]*")
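Either `lapply` call returns a list of character vectors rather than one string per input. If you want the comma-separated form back, one possible sketch is to paste the pieces together inside `vapply`. It uses the `substr` variant and assumes every code is exactly three characters long; the `collapsed` name is made up for this example.

```r
# Input vector from the question
str <- c("F14 : M114L",
         "W15 : M116L, W15 : M118L",
         "W15 : D111L, F14 : E112L, F14 : M116L")

# Split each string into its comma-separated sub-elements
splits <- strsplit(str, ", ")

# Keep the first three characters of each sub-element, then
# glue the pieces of each original string back together
collapsed <- vapply(
  splits,
  function(parts) paste(substr(parts, 1, 3), collapse = ", "),
  character(1)
)

collapsed
```

`vapply` with `character(1)` is used instead of `sapply` so that the result is guaranteed to be a character vector with one element per input string.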