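In R, I need to count, by group, how many consecutive months exist up to and including each row's month. It is a running count that should restart whenever a month is missing. Here is an example; the result column holds the desired output.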
date <- c("2020-01-01", "2020-02-01", "2020-03-01", "2020-05-01", "2020-06-01", "2020-01-01", "2020-03-01", "2020-04-01")
group <- c("a","a","a","a","a","b","b","b")
result <- c(1,2,3,1,2,1,1,2)
data.frame(date=as.Date(date), group=group, result=result)
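For group "a", the count breaks and restarts in May because April does not exist for "a". The same goes for "b": February is missing, so the count restarts in March. How can I get the result column?
Best answer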
Here is a data.table option using rowid + cumsum:
library(data.table)
setDT(df)[, result := rowid(cumsum(c(TRUE, round(diff(date) / 30.42) != 1))), group]
which gives
         date group result
1: 2020-01-01     a      1
2: 2020-02-01     a      2
3: 2020-03-01     a      3
4: 2020-05-01     a      1
5: 2020-06-01     a      2
6: 2020-01-01     b      1
7: 2020-03-01     b      1
8: 2020-04-01     b      2
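The one-liner treats a gap of roughly 30.42 days as one month, which works for first-of-month dates like these but is an approximation. As a rough sketch of the same cumsum idea in base R, the version below uses an exact month index (year * 12 + month) instead. The helper name consecutive_months is hypothetical and not part of the original answer, and the sketch assumes dates are sorted ascending within each group.

```r
# Rebuild the example data from the question.
df <- data.frame(
  date  = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01", "2020-05-01",
                    "2020-06-01", "2020-01-01", "2020-03-01", "2020-04-01")),
  group = c("a", "a", "a", "a", "a", "b", "b", "b")
)

# Hypothetical helper (an assumption, not from the answer): running count of
# consecutive months for one group's dates, assumed sorted ascending.
consecutive_months <- function(d) {
  lt <- as.POSIXlt(d)
  # Exact month index: years since 1900 times 12, plus the zero-based month.
  month_index <- lt$year * 12L + lt$mon
  # TRUE where a new run starts: the first row, or a step that is not exactly 1 month.
  new_run <- c(TRUE, diff(month_index) != 1L)
  # cumsum() gives each run its own id; ave() then numbers the rows inside each run.
  run_id <- cumsum(new_run)
  ave(seq_along(run_id), run_id, FUN = seq_along)
}

# Apply the helper per group by passing row indices through ave(),
# which returns the counts in the original row order.
df$result <- ave(seq_len(nrow(df)), df$group,
                 FUN = function(i) consecutive_months(df$date[i]))
```

Because the comparison is on whole month numbers, this variant does not depend on the day of the month, so dates such as 2020-01-31 followed by 2020-02-01 would still count as consecutive.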
Data
> dput(df)
structure(list(date = structure(c(18262, 18293, 18322, 18383,
18414, 18262, 18322, 18353), class = "Date"), group = c("a",
"a", "a", "a", "a", "b", "b", "b")), class = "data.frame", row.names = c(NA,
-8L))
https://stackoverflow.com/questions/67339893/