python - pandas: convert a nested dictionary to MultiIndex rows and columns

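I have a nested dictionary that I want to turn into a DataFrame with a MultiIndex on both rows and columns, as shown below. However, my data gets lost when I build the table.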

    test = {12: {'Category 1': {'TestA': {'att_1': 1, 'att_2': 'whatever'}, 'TestB': {'att_1': 3, 'att_2': 'spring'}}, 'Category 2': {'TestA': {'att_1': 23, 'att_2': 'another'}, 'TestB': {'att_1': 9, 'att_2': 'summer'}}}, 15: {'Category 1': {'TestA': {'att_1': 10, 'att_2': 'foo'}, 'TestB': {'att_1': 29, 'att_2': 'fall'}}, 'Category 2': {'TestA': {'att_1': 30, 'att_2': 'bar'}, 'TestB': {'att_1': 36, 'att_2': 'winter'}}}}
    columns = pd.MultiIndex.from_arrays([['TestA','TestA','TestB','TestB'],['att_1','att_2','att_1','att_2']])

The layout I want (with my values where NaN is shown):

              TestA       TestB      
              att_1 att_2 att_1 att_2
12 Category 1   NaN   NaN   NaN   NaN
   Category 2   NaN   NaN   NaN   NaN
15 Category 1   NaN   NaN   NaN   NaN
   Category 2   NaN   NaN   NaN   NaN

I tried:

    pd.DataFrame(test,index=pd.MultiIndex.from_arrays([[12,12,15,15],['Category 1','Category 2','Category 1','Category 2']]),columns=pd.MultiIndex.from_arrays([['TestA','TestA','TestB','TestB'],['att_1','att_2','att_1','att_2']]))

But my data is lost, as shown here:

              TestA       TestB      
              att_1 att_2 att_1 att_2
12 Category 1   NaN   NaN   NaN   NaN
   Category 2   NaN   NaN   NaN   NaN
15 Category 1   NaN   NaN   NaN   NaN
   Category 2   NaN   NaN   NaN   NaN
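
The NaNs are consistent with how the `DataFrame` constructor treats a dict: when the data already carries labels, `index` and `columns` select matching labels instead of relabeling the data. The top-level keys of `test` are `12` and `15`, so none of the requested `('TestA', 'att_1')`-style columns exist. A minimal sketch with a small made-up dict (`data`, `matching`, and `missing` are illustrative names, not from the question):

```python
import pandas as pd

# Hypothetical toy data: outer keys become columns, inner keys become rows.
data = {"a": {"x": 1, "y": 2}, "b": {"x": 3, "y": 4}}

# Requested labels exist in the dict, so the values are kept.
matching = pd.DataFrame(data, index=["x", "y"], columns=["a", "b"])

# Requested column labels do not exist in the dict, so every cell is NaN.
missing = pd.DataFrame(data, index=["x", "y"], columns=["c", "d"])

print(matching)
print(missing)
```

The same selection happens in the question, only with MultiIndex labels that never match the dict's keys.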

It works if I only build a MultiIndex for the rows, but I want a MultiIndex on both rows and columns.

     pd.DataFrame.from_dict({(i,j): test[i][j] 
                           for i in test.keys() 
                           for j in test[i].keys()},
                       orient='index')

                                           TestA                             TestB
12 Category 1  {'att_1': 1, 'att_2': 'whatever'}   {'att_1': 3, 'att_2': 'spring'}
   Category 2  {'att_1': 23, 'att_2': 'another'}   {'att_1': 9, 'att_2': 'summer'}
15 Category 1      {'att_1': 10, 'att_2': 'foo'}    {'att_1': 29, 'att_2': 'fall'}
   Category 2      {'att_1': 30, 'att_2': 'bar'}  {'att_1': 36, 'att_2': 'winter'}
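
One possible way to go from this row-only frame to MultiIndex columns is to expand each column of dicts separately and join the pieces with `pd.concat`. This is a sketch, not the accepted answer's method; the names `nested` and `wide` are assumptions introduced here:

```python
import pandas as pd

test = {
    12: {'Category 1': {'TestA': {'att_1': 1, 'att_2': 'whatever'}, 'TestB': {'att_1': 3, 'att_2': 'spring'}},
         'Category 2': {'TestA': {'att_1': 23, 'att_2': 'another'}, 'TestB': {'att_1': 9, 'att_2': 'summer'}}},
    15: {'Category 1': {'TestA': {'att_1': 10, 'att_2': 'foo'}, 'TestB': {'att_1': 29, 'att_2': 'fall'}},
         'Category 2': {'TestA': {'att_1': 30, 'att_2': 'bar'}, 'TestB': {'att_1': 36, 'att_2': 'winter'}}},
}

# Row-only MultiIndex frame, built as in the question; each cell is still a dict.
nested = pd.DataFrame.from_dict(
    {(i, j): test[i][j] for i in test for j in test[i]},
    orient="index",
)

# Expand each column of dicts into its own small frame, then place the frames
# side by side. The dict keys passed to concat become the outer column level.
wide = pd.concat(
    {col: pd.DataFrame(nested[col].tolist(), index=nested.index) for col in nested.columns},
    axis=1,
)

print(wide)
```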

Best answer

You can build the required DataFrame like this:

    import pandas as pd
    import numpy as np

    test = {12: {'Category 1': {'TestA': {'att_1': 1, 'att_2': 'whatever'}, 'TestB': {'att_1': 3, 'att_2': 'spring'}}, 'Category 2': {'TestA': {'att_1': 23, 'att_2': 'another'}, 'TestB': {'att_1': 9, 'att_2': 'summer'}}}, 15: {'Category 1': {'TestA': {'att_1': 10, 'att_2': 'foo'}, 'TestB': {'att_1': 29, 'att_2': 'fall'}}, 'Category 2': {'TestA': {'att_1': 30, 'att_2': 'bar'}, 'TestB': {'att_1': 36, 'att_2': 'winter'}}}}

    # Row index: one array per level (outer id, then category)
    row_index = [[12,12,15,15],['Category 1','Category 2','Category 1','Category 2']]

    # Column index: one array per level (test name, then attribute)
    col_index = [['TestA','TestA','TestB','TestB'],['att_1','att_2','att_1','att_2']]

    # Values, listed row by row
    values = [1,'whatever',3,'spring',23,'another',9,'summer',10,'foo',29,'fall',30,'bar',36,'winter']

    # Convert the list to a NumPy array. dtype=object keeps the integers as
    # integers; without it NumPy would coerce every value to a string.
    value = np.array(values, dtype=object)

    # Reshape to (4, 4), since the DataFrame has 4 rows and 4 columns
    value = value.reshape(4, 4)

    # Build the required DataFrame
    pd.DataFrame(value, index=row_index, columns=col_index)
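
The values above are typed in by hand. As a sketch of an alternative that reads them from `test` directly, the nested dict can be flattened into a dict keyed by column tuples whose inner dicts are keyed by row tuples; pandas builds a MultiIndex on both axes from such tuple keys. The names `flat` and `df` are assumptions introduced for this example:

```python
import pandas as pd

test = {
    12: {'Category 1': {'TestA': {'att_1': 1, 'att_2': 'whatever'}, 'TestB': {'att_1': 3, 'att_2': 'spring'}},
         'Category 2': {'TestA': {'att_1': 23, 'att_2': 'another'}, 'TestB': {'att_1': 9, 'att_2': 'summer'}}},
    15: {'Category 1': {'TestA': {'att_1': 10, 'att_2': 'foo'}, 'TestB': {'att_1': 29, 'att_2': 'fall'}},
         'Category 2': {'TestA': {'att_1': 30, 'att_2': 'bar'}, 'TestB': {'att_1': 36, 'att_2': 'winter'}}},
}

# Outer key: (test name, attribute) column tuple.
# Inner key: (id, category) row tuple.
flat = {}
for row_id, categories in test.items():
    for category, tests in categories.items():
        for test_name, attrs in tests.items():
            for attr, value in attrs.items():
                flat.setdefault((test_name, attr), {})[(row_id, category)] = value

# Tuple keys on both levels give MultiIndex columns and a MultiIndex row index.
df = pd.DataFrame(flat)

print(df)
```

Because nothing is hard-coded, this version should keep working if more ids, categories, or attributes are added to the dictionary.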

https://stackoverflow.com/questions/60440396/
