python – 从pandas MultiIndex中选择列

我有DataFrame和MultiIndex列,如下所示：

# sample data
col = pd.MultiIndex.from_arrays([['one', 'one', 'one', 'two', 'two', 'two'],
                                ['a', 'b', 'c', 'a', 'b', 'c']])
data = pd.DataFrame(np.random.randn(4, 6), columns=col)
data

从第二级别选择特定列(例如[‘a’,’c’]而不是范围)的正确,简单方法是什么？

目前我这样做：

import itertools
tuples = [i for i in itertools.product(['one', 'two'], ['a', 'c'])]
new_index = pd.MultiIndex.from_tuples(tuples)
print(new_index)
data.reindex_axis(new_index, axis=1)

然而,它并不是一个好的解决方案,因为我必须淘汰itertools,手工构建另一个MultiIndex然后重新索引(我的实际代码甚至更麻烦,因为列列表不是那么容易获取).我很确定必须有一些ix或xs这样的方法,但我尝试的所有内容都会导致错误.

最佳答案

这不是很好,但也许：

>>> data
        one                           two                    
          a         b         c         a         b         c
0 -0.927134 -1.204302  0.711426  0.854065 -0.608661  1.140052
1 -0.690745  0.517359 -0.631856  0.178464 -0.312543 -0.418541
2  1.086432  0.194193  0.808235 -0.418109  1.055057  1.886883
3 -0.373822 -0.012812  1.329105  1.774723 -2.229428 -0.617690
>>> data.loc[:,data.columns.get_level_values(1).isin({"a", "c"})]
        one                 two          
          a         c         a         c
0 -0.927134  0.711426  0.854065  1.140052
1 -0.690745 -0.631856  0.178464 -0.418541
2  1.086432  0.808235 -0.418109  1.886883
3 -0.373822  1.329105  1.774723 -0.617690

会工作？

点击查看更多相关文章

转载注明原文：python – 从pandas MultiIndex中选择列 - 乐贴网