I have a very simple question: how can I split the following text into chunks of 3 characters in a single line of code?
mycodes <- c("ATTTGGGCTAATTTTGTTTCTTTCTGGGTCTCTC")
strsplit(mycodes, split = character(3), fixed = T, perl = FALSE, useBytes = FALSE)
[[1]]
[1] "A" "T" "T" "T" "G" "G" "G" "C" "T" "A" "A" "T" "T" "T" "T" "G" "T" "T" "T" "C"
[21] "T" "T" "T" "C" "T" "G" "G" "G" "T" "C" "T" "C" "T" "C"
This is not what I want; I want three letters at a time:
[1] "ATT" "TGG" "GCT" ...
and so on. The final element may have one, two, or three letters, depending on how many letters remain.
Thanks.
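Best answer
I assume you want to work with codons. If that is the case, you may want to look at the Biostrings package from Bioconductor. It provides a variety of tools for working with biological sequence data.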
library(Biostrings)
?codons
You can get what you want with a bit of clumsy coercion:
as.character(codons(DNAString(mycodes)))
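If installing Biostrings is not an option, the same split can probably be done in base R. The following is only a minimal sketch, not part of the original answer: it assumes the input is a single plain character string, and the names `starts` and `triplets` are made up for illustration. It relies on `substring()` truncating at the end of the string, so a trailing chunk of one or two letters is kept.

```r
# Base R sketch: split a string into chunks of 3 characters.
mycodes <- "ATTTGGGCTAATTTTGTTTCTTTCTGGGTCTCTC"

# Start position of each chunk: 1, 4, 7, ...
starts <- seq(1, nchar(mycodes), by = 3)

# substring() is vectorised over start/stop positions and stops at the
# end of the string, so the last chunk may be shorter than 3 letters.
triplets <- substring(mycodes, starts, starts + 2)
triplets
```

For the 34-letter string in the question this gives twelve elements, the last of which is a single letter.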
https://stackoverflow.com/questions/7452156/