r - Get a cumulative count of values over time

Suppose I have the following data.

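library(data.table)

# client and vals are drawn at random, so the exact rows differ between runs
dt = data.table(
  date = c("2020-10-01", "2020-10-02", "2020-10-03", "2020-10-01", "2020-10-01",
           "2020-10-03", "2020-10-04", "2020-10-04", "2020-10-05", "2020-10-05"),
  client = sample(LETTERS[1:3], 10, replace = TRUE),
  vals = rnorm(10))

# view the rows in date order
dt[order(date)]

# total vals per date and client
dt2 = dt[order(date), .(sum_vals = sum(vals)), by = .(date, client)]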

         date client    sum_vals
1: 2020-10-01      B  2.53737527
2: 2020-10-01      C  0.64366866
3: 2020-10-02      A  1.01776243
4: 2020-10-03      C -0.06303562
5: 2020-10-03      A  0.63702089
6: 2020-10-04      B  0.12681052
7: 2020-10-04      A  0.82889616
8: 2020-10-05      B -1.45734539
9: 2020-10-05      C  0.02594185

What I want to do is compute the cumulative number of distinct clients by date.

So in this case it would look like this.

         date acts
1: 2020-10-01    2   # B and C were active on or before 10/01
2: 2020-10-02    3   # A, B and C were active on or before 10/02
3: 2020-10-03    3   # A, B and C were active on or before 10/03
4: 2020-10-04    3   # A, B and C were active on or before 10/04
5: 2020-10-05    3   # A, B and C were active on or before 10/05

Any ideas on how to achieve this with data.table or dplyr?

Best answer

We can do

# for each unique date, count the distinct clients seen on or before it
dt2[, .(date = unique(date), acts = unlist(lapply(unique(date),
         function(x) uniqueN(client[date <= x]))))]

-output (client is sampled at random, so the exact counts depend on the draw)

          date acts
1: 2020-10-01    2
2: 2020-10-02    2
3: 2020-10-03    3
4: 2020-10-04    3
5: 2020-10-05    3
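
For larger tables, rescanning every row for each date may get slow. A possible alternative, offered only as a sketch and not part of the original answer, is to flag each client's first appearance in date order and take a running sum of those flags per date. The example below retypes the date/client pairs from the dt2 table printed in the question so that it is reproducible; the names is_new, new_clients and res are made up for illustration.

```r
library(data.table)

# date/client pairs copied by hand from the dt2 printout in the question
dt2 <- data.table(
  date   = c("2020-10-01", "2020-10-01", "2020-10-02", "2020-10-03", "2020-10-03",
             "2020-10-04", "2020-10-04", "2020-10-05", "2020-10-05"),
  client = c("B", "C", "A", "C", "A", "B", "A", "B", "C")
)

# ISO-formatted date strings sort chronologically, so ordering the
# character column is enough here
setorder(dt2, date)

# TRUE on the first row where each client appears (hypothetical helper column)
dt2[, is_new := !duplicated(client)]

# number of first-time clients per date, then a running total across dates
res <- dt2[, .(new_clients = sum(is_new)), by = date][
  , .(date, acts = cumsum(new_clients))]

print(res)
```

On this data the acts column should come out as 2, 3, 3, 3, 3, matching the table the question asks for. The trade-off is that this version relies on the rows being sorted by date before duplicated() is applied.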

https://stackoverflow.com/questions/69170009/
