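bash - Summing values across multiple rows while keeping the preceding columns

I am trying to add up integers spread across multiple rows while keeping the data that precedes the number.

This is my raw data: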

"2020-06","28347","Afghanistan","791","anonymous","3128"
"2020-06","28347","Afghanistan","830","anonymous","402"
"2020-06","28347","Afghanistan","10019","anonymous","79"
"2020-06","28347","Afghanistan","10070","anonymous","829"
"2020-06","28347","Afghanistan","10604","anonymous","4319"
"2020-06","28347","Albania","266","anonymous","60"
"2020-06","28347","Albania","824","anonymous","23"
"2020-06","28347","Albania","10163","anonymous","166"
"2020-06","28347","Algeria","267","anonymous","11047"

This is the output I expect:

28347,Afghanistan,8757
28347,Albania,249
28347,Algeria,11047

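What I have done so far is extract the second and third columns from the data, then loop over each pair with grep and add the values together. Unfortunately, I get a single combined total instead of one value per country.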

COUNTRIES=$(awk -F\, '{OFS=",";}{print $2,$3}' file.dat | sort | uniq)

for COUNTRY in "${COUNTRIES[@]}"
do
  NUMBER=$(grep $COUNTRY file.dat | awk -F\, '{print $6}' | sed 's/\"//g' | awk '{s+=$1} END {print s}')
  echo "$COUNTRY,$NUMBER" | sed 's/\"//g'
done

This gives me:

28347,Afghanistan
28347,Albania
28347,Algeria,20053

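I am not sure why it gives me the grand total rather than a total per country. Any ideas?

Best answer

You can use this awk, which splits each row on "," and adds the last field to a running sum keyed by the second and third columns: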

awk -F'","' -v OFS=, '{sums[$2 OFS $3] += $NF} END {for (i in sums) print i, sums[i]}' file

28347,Albania,249
28347,Algeria,11047
28347,Afghanistan,8757
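
To see why this works even though the rows are quoted, here is a minimal sketch that prints what awk sees for a single row from the question. The helper name `show_fields` is hypothetical and exists only for illustration. With `","` as the separator, `$2` and `$3` come out unquoted, while `$NF` still carries the closing quote; awk uses its leading digits when the field is treated as a number.

```shell
# Hypothetical helper: show how one row is split when the separator is ","
show_fields() {
  # $NF + 0 forces numeric conversion, which keeps only the leading digits
  awk -F'","' '{ printf "%s|%s|%s|%d\n", $2, $3, $NF, $NF + 0 }'
}

printf '%s\n' '"2020-06","28347","Afghanistan","791","anonymous","3128"' | show_fields
# prints: 28347|Afghanistan|3128"|3128
```

The first field keeps its opening quote in the same way, which does not matter here because it is never used.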

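The order in which `for (i in sums)` visits the keys is unspecified, so the rows may come out in any order. If you want them sorted alphabetically by country name, use this GNU awk variant: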

awk -F'","' -v OFS=, '
{sums[$2 OFS $3] += $NF}
END {
   PROCINFO["sorted_in"]="@ind_str_asc"
   for (i in sums)
      print i, sums[i]
}' file

28347,Afghanistan,8757
28347,Albania,249
28347,Algeria,11047
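
As for the loop in the question: `COUNTRIES=$(...)` stores one string, not an array, so in bash `"${COUNTRIES[@]}"` expands to a single word that holds all three keys separated by newlines. The loop body therefore runs only once, which matches the output shown (three keys, then one number). If you prefer to keep the loop approach, the sketch below reads one key per line instead. The function name `sum_with_loop` is hypothetical, and it assumes a key never appears as a substring of an unrelated row.

```shell
# Hypothetical rewrite of the loop from the question; $1 is the data file.
sum_with_loop() {
  file=$1
  # Emit one key per line, e.g. "28347","Afghanistan"
  awk -F, -v OFS=, '{ print $2, $3 }' "$file" | sort -u |
  while IFS= read -r key; do
    # -F treats the key as a fixed string; the quotes keep it as one argument
    total=$(grep -F -- "$key" "$file" |
            awk -F, '{ gsub(/"/, "", $6); s += $6 } END { print s + 0 }')
    # Strip the double quotes from the key before printing
    printf '%s,%s\n' "$key" "$total" | tr -d '"'
  done
}

# Usage: sum_with_loop file.dat
```

This rescans the file once per key, so the single-pass awk above is still the simpler choice for large inputs.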

https://stackoverflow.com/questions/67939757/
