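I have speech Turns of different lengths. I want to split each Turn into its individual words and assign each word to a new column according to its position in the utterance: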
test <- data.frame(
Speaker = c("PS1G3","PS1G2","PS1G3","PS1G2"),
N_words = c(2,3,4,1),
Turn = c("can you","what 's the","are you going up","what"),
c5 = c("VM0 PNP","DTQ VBZ AT0","VBB PNP VVG AVP","DTQ"))
The output I am looking for is this:
test
  Speaker N_words             Turn              c5   w1   w2    w3   w4
1   PS1G3       2          can you         VM0 PNP  can  you  <NA> <NA>
2   PS1G2       3      what 's the     DTQ VBZ AT0 what   's   the <NA>
3   PS1G3       4 are you going up VBB PNP VVG AVP  are  you going   up
4   PS1G2       1             what             DTQ what <NA>  <NA> <NA>
I know how to split Turn into individual words, but that is where I get stuck:
lapply(test$Turn, function(x) unlist(strsplit(x, "\\s")))
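Best answer
We can use read.table from base R. With fill = TRUE, shorter rows are padded with blank fields, and na.strings = "" turns those blanks into NA; quote = "" keeps the apostrophe in 's from being read as a quote character: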
cbind(test, read.table(text = test$Turn,
header = FALSE, fill = TRUE, quote = "", na.strings = ""))
-output
#  Speaker N_words             Turn              c5   V1   V2    V3   V4
#1   PS1G3       2          can you         VM0 PNP  can  you  <NA> <NA>
#2   PS1G2       3      what 's the     DTQ VBZ AT0 what   's   the <NA>
#3   PS1G3       4 are you going up VBB PNP VVG AVP  are  you going   up
#4   PS1G2       1             what             DTQ what <NA>  <NA> <NA>
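The columns above come out as V1 to V4 rather than the w1 to w4 asked for. Also, with fill = TRUE, read.table infers the number of columns from the first five input lines unless col.names is given, so a longest Turn further down a larger data set could be wrapped onto an extra row. The following is a minimal sketch of two ways around both points, not a definitive implementation: the first finishes the strsplit attempt from the question by padding each word vector with NA, and the second passes col.names to read.table. The names parts, n_max, words, result and words_rt are illustrative and not part of the original post.

```r
# Example data from the question (stringsAsFactors = FALSE keeps Turn as
# character on R versions before 4.0, where factors were the default)
test <- data.frame(
  Speaker = c("PS1G3", "PS1G2", "PS1G3", "PS1G2"),
  N_words = c(2, 3, 4, 1),
  Turn    = c("can you", "what 's the", "are you going up", "what"),
  c5      = c("VM0 PNP", "DTQ VBZ AT0", "VBB PNP VVG AVP", "DTQ"),
  stringsAsFactors = FALSE
)

# Approach 1: split on whitespace, then pad every turn to the longest one.
# Indexing a character vector past its end yields NA, which does the padding.
parts <- strsplit(test$Turn, "\\s+")
n_max <- max(lengths(parts))
words <- do.call(rbind, lapply(parts, `[`, seq_len(n_max)))
colnames(words) <- paste0("w", seq_len(n_max))

result <- cbind(test, as.data.frame(words, stringsAsFactors = FALSE))

# Approach 2: read.table as in the answer, with col.names fixing both the
# column names and the number of columns
words_rt <- read.table(text = test$Turn, header = FALSE, fill = TRUE,
                       quote = "", na.strings = "",
                       col.names = paste0("w", seq_len(n_max)))
```

Printing result should show the layout requested in the question. The first approach is handy when the words need further processing before they are bound back, since parts stays available as a list.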
https://stackoverflow.com/questions/67400851/