r - Extract multiple substring patterns from a vector

Suppose I have a vector like this:

patient_condition <- c("Pre_P1","Post_P1","Enriched_Post_P1","Post_P1_2","Pre_P2","Post_P2", "P3_Pre")
to_match <- c("P1","P2","P3")

I want to create another vector that, for each element, contains only the value from to_match that occurs in that element as a substring:

[1] "P1"  "P1"  "P1"  "P1"  "P2"  "P2"  "P3"

Any help is appreciated. Thanks!

Best answer

We can use

stringr::str_extract(patient_condition, "P[0-9]+")
#[1] "P1" "P1" "P1" "P1" "P2" "P2" "P3"
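For readers who would rather not depend on stringr, a roughly equivalent base R sketch is shown here. The variable name `extracted` is only illustrative. One difference to keep in mind: `regmatches()` combined with `regexpr()` silently drops elements that have no match, whereas `str_extract()` returns `NA` for them.

```r
patient_condition <- c("Pre_P1", "Post_P1", "Enriched_Post_P1", "Post_P1_2",
                       "Pre_P2", "Post_P2", "P3_Pre")

# regexpr() locates the first match of the pattern in each element;
# regmatches() then pulls out the matched text.
# Elements without a match are dropped rather than returned as NA.
extracted <- regmatches(patient_condition,
                        regexpr("P[0-9]+", patient_condition))
extracted
```

Because every element here contains a match, the result has the same length as the input.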

Other replies

In my case, this answer works, but I guess the question I am asking is how to extract substrings from a vector given some values to match. That means this answer won't work if I want to extract words (e.g. Pre, Post, Enriched, etc.).

to_match <- c("Pre", "Post", "Enriched")

In that case, we can use

## R-level loop through `to_match`
tmp <- t(sapply(to_match, stringr::str_extract, string = patient_condition))
tmp[!is.na(tmp)]
#[1] "Pre"      "Post"     "Post"     "Enriched" "Post"     "Pre"      "Post"
#[8] "Pre"
## note: 8 values, because "Enriched_Post_P1" matches both "Post" and "Enriched"

Or

## combine the patterns into a single regex with the "or" operator `|`
stringr::str_extract(patient_condition, paste0(to_match, collapse = "|"))
#[1] "Pre"      "Post"     "Enriched" "Post"     "Pre"      "Post"     "Pre"

ThomasIsCoding's answer, which uses gregexpr + regmatches, is also a good option.
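A minimal sketch of what a gregexpr + regmatches approach might look like is given here. This is an illustration under assumptions, not necessarily the code from that answer, and the names `pattern` and `all_matches` are made up for the example. Unlike `str_extract()`, it keeps every match in each element, so the result is a list.

```r
patient_condition <- c("Pre_P1", "Post_P1", "Enriched_Post_P1", "Post_P1_2",
                       "Pre_P2", "Post_P2", "P3_Pre")
to_match <- c("Pre", "Post", "Enriched")

# Build one alternation pattern: "Pre|Post|Enriched"
pattern <- paste0(to_match, collapse = "|")

# gregexpr() finds all matches per element; regmatches() extracts them.
# The result is a list with one character vector per input element
# (character(0) where nothing matches).
all_matches <- regmatches(patient_condition,
                          gregexpr(pattern, patient_condition))

all_matches[[3]]      # both matches found in "Enriched_Post_P1"
lengths(all_matches)  # number of matches per element
```

Calling `unlist(all_matches)` flattens the list into a single character vector if the per-element grouping is not needed.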

Note that this performs exact (case-sensitive) substring matching.
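One caveat worth sketching: the approaches above treat each value in `to_match` as a regular expression, so values containing metacharacters such as `.` or `+` would not be matched literally. A possible base R alternative, assuming literal matching is what is wanted, uses `grepl(..., fixed = TRUE)`. The names `hits`, `first_idx` and `first_hit` are hypothetical, chosen for this example only.

```r
patient_condition <- c("Pre_P1", "Post_P1", "Enriched_Post_P1", "Post_P1_2",
                       "Pre_P2", "Post_P2", "P3_Pre")
to_match <- c("P1", "P2", "P3")

# Logical matrix: one row per element of patient_condition,
# one column per value in to_match. fixed = TRUE matches literally.
hits <- sapply(to_match, grepl, x = patient_condition, fixed = TRUE)

# For each row, take the index of the first value in to_match that occurs
# in the element (NA if none occurs).
first_idx <- apply(hits, 1, function(r) which(r)[1])
first_hit <- to_match[first_idx]
first_hit
```

Here "first" means first in the order of `to_match`, not leftmost in the string, which differs from how the regex alternation picks its match.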

https://stackoverflow.com/questions/72957686/
