import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
print(df.columns)
df

Index(['col1', 'col2'], dtype='object')

After adding a new column, it appears in the Index returned by df.columns.

df['col3'] = [5, 6]
print(df.columns)
df

Index(['col1', 'col2', 'col3'], dtype='object')

After dropping a column, using axis=1, it is no longer in the Index object.

x = df.drop(['col1'], axis=1)
display(x.columns)
x

Index(['col2', 'col3'], dtype='object')

But how does it work for a hierarchical MultiIndex?

columns = pd.MultiIndex.from_product([['head', 'body'], ['x', 'y']],
                                     names=['bodypart', 'coordinates'])

df = pd.DataFrame([[1, 2, 3, 4], [9, 2, 3, 4]], columns=columns)
df

display(df.columns)
display(df.columns.levels)
display(list(df.columns.levels[0]))

MultiIndex([('head', 'x'),
            ('head', 'y'),
            ('body', 'x'),
            ('body', 'y')],
           names=['bodypart', 'coordinates'])

FrozenList([['body', 'head'], ['x', 'y']])

['body', 'head']

What happens if we add another column? Will it appear in the column index?

df['tail', 'x'] = 5
df['tail', 'y'] = 9

display(df)

display(df.columns)
display(df.columns.levels)
display(list(df.columns.levels[0]))

MultiIndex([('head', 'x'),
            ('head', 'y'),
            ('body', 'x'),
            ('body', 'y'),
            ('tail', 'x'),
            ('tail', 'y')],
           names=['bodypart', 'coordinates'])

FrozenList([['body', 'head', 'tail'], ['x', 'y']])

['body', 'head', 'tail']

It seems to be in column level 0! And what happens if we drop one of the original columns? Are the levels of the column index updated accordingly?

df_drop = df.drop(['body'], axis=1)

display(df_drop)

display(df_drop.columns)
display(df_drop.columns.levels)
display(list(df_drop.columns.levels[0]))

/home/kwittek/.local/share/virtualenvs/winkie-pT7_bdXL/lib/python3.8/site-packages/pandas/core/generic.py:4150: PerformanceWarning: dropping on a non-lexsorted multi-index without a level parameter may impact performance.
  obj = obj._drop_axis(labels, axis, level=level, errors=errors)

MultiIndex([('head', 'x'),
            ('head', 'y'),
            ('tail', 'x'),
            ('tail', 'y')],
           names=['bodypart', 'coordinates'])

FrozenList([['body', 'head', 'tail'], ['x', 'y']])

['body', 'head', 'tail']

So we can see that dropping a column removes it from the column MultiIndex itself, but not from its levels (which are stored in a FrozenList)! While some might consider this a bug, the pandas developers consider it a philosophical question, and the behaviour works as intended.
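The reason the stale label survives can be made visible by looking at the codes next to the levels. The following is a minimal sketch, assuming a small MultiIndex built the same way as above; the names `mi`, `kept`, and `used` are only illustrative.

```python
import pandas as pd

mi = pd.MultiIndex.from_product([['head', 'body'], ['x', 'y']],
                                names=['bodypart', 'coordinates'])

# Keep only the entries whose first level is not 'body'.
kept = mi[mi.get_level_values('bodypart') != 'body']

# The levels are the lookup table of all labels the index knows about.
level0 = list(kept.levels[0])

# The codes are integer positions into that table, one per remaining entry.
codes0 = kept.codes[0].tolist()

# Resolving the codes against the levels gives the labels actually in use.
used = [level0[code] for code in codes0]

print(level0)  # 'body' is still listed here
print(codes0)  # but no code points at it any more
print(used)
```

Filtering only rewrites the codes, so the lookup table can keep entries that nothing refers to.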

However, there is a good workaround:

display(df_drop.columns.get_level_values(0).unique())

Index(['head', 'tail'], dtype='object', name='bodypart')
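The same idea can be wrapped in a small helper that reports the labels in use for every level at once. This is a sketch only: `used_labels` is a hypothetical helper, not a pandas function, and it assumes that all levels of the index are named.

```python
import pandas as pd


def used_labels(index: pd.MultiIndex) -> dict:
    """Map each level name to the labels that actually occur in the index."""
    return {
        name: index.get_level_values(name).unique().tolist()
        for name in index.names
    }


columns = pd.MultiIndex.from_product([['head', 'body', 'tail'], ['x', 'y']],
                                     names=['bodypart', 'coordinates'])
frame = pd.DataFrame([[1, 2, 3, 4, 5, 9], [9, 2, 3, 4, 5, 9]], columns=columns)
frame_drop = frame.drop(['body'], axis=1)

in_use = used_labels(frame_drop.columns)
print(in_use)
```

Unlike `levels`, the result keeps the labels in order of appearance rather than sorted order.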

There is another way: setting a new column index with the unused levels removed. While this seems a reasonable approach for some use cases, there might be unforeseen (performance) implications, which is why it is not the default behaviour.

df_drop.columns = df_drop.columns.remove_unused_levels()
display(df_drop.columns.levels)

FrozenList([['head', 'tail'], ['x', 'y']])
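To see what remove_unused_levels changes and what it leaves alone, the pruned index can be compared with the stale one. A minimal sketch, assuming a frame shaped like the one above; `stale` and `pruned` are illustrative names.

```python
import pandas as pd

columns = pd.MultiIndex.from_product([['head', 'body', 'tail'], ['x', 'y']],
                                     names=['bodypart', 'coordinates'])
frame = pd.DataFrame([[1, 2, 3, 4, 5, 9], [9, 2, 3, 4, 5, 9]], columns=columns)

stale = frame.drop(['body'], axis=1).columns
pruned = stale.remove_unused_levels()  # returns a new MultiIndex

stale_labels = list(stale.levels[0])    # still lists 'body'
pruned_labels = list(pruned.levels[0])  # only the labels in use

# Both indexes describe the same columns in the same order.
same_columns = pruned.equals(stale)

print(stale_labels, pruned_labels, same_columns)
```

Only the internal lookup table differs, so assigning the pruned index back to the DataFrame does not change which columns it has.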

And how can we set values in certain rows of a MultiIndexed DataFrame if there is a non-MultiIndexed column?

df['behavior'] = 'cute'
df

df.behavior[df['head', 'x'] == 1] = 'foobar' # this does not work!
df

<ipython-input-12-5a1259736b5a>:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df.behavior[df['head', 'x'] == 1] = 'foobar' # this does not work!
/home/kwittek/.local/share/virtualenvs/winkie-pT7_bdXL/lib/python3.8/site-packages/pandas/core/series.py:963: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._where(~key, value, inplace=True)

As you can see, setting the value using chained indexing does not work. This follows from how the pandas DSL is translated into Python method calls. The official docs provide a detailed explanation of the reasons. You are actually assigning to a copy! (This is logged in a warning, which may or may not be visible depending on how you render the notebook.)
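The translation into method calls can be traced without pandas at all. The sketch below uses `Tracer`, a hypothetical stand-in class that only records which special methods Python invokes. It uses item access where the cell above uses attribute access (df.behavior), but the two-step structure is the same.

```python
class Tracer:
    """Hypothetical stand-in for a DataFrame that records special-method calls."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __getitem__(self, key):
        self.log.append(f"{self.name}.__getitem__({key!r})")
        # Hand back a new object, the way indexing a DataFrame may return a copy.
        return Tracer(f"{self.name}[{key!r}]", self.log)

    def __setitem__(self, key, value):
        self.log.append(f"{self.name}.__setitem__({key!r}, {value!r})")


log = []
frame = Tracer("frame", log)

# Chained indexing: one statement, but two separate calls.
frame["behavior"]["row_mask"] = "foobar"

for call in log:
    print(call)
```

The assignment lands on whatever object the first call returned. If that object is a copy, the original never sees the new value.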

Instead, we should make use of the loc indexer:

df.loc[df['head', 'x'] == 1, ['behavior']] = 'buzzz'
df

Behind the scenes, the operator overloading of the Python data model comes into play, which the pandas DSL uses extensively. What looks similar to a method call with square brackets will actually call the overloaded __setitem__ method (since we are assigning) on an internal _LocIndexer object. While the internal code branches are a bit more involved, what happens functionally in our case is that the first argument is used as a boolean Series to select specific rows and the second argument specifies which column to access (and thereby overwrite).
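The equivalence between the subscript assignment and the explicit method call can be checked directly. This is a sketch under assumptions: the frame is built with an explicit ('behavior', '') column key instead of adding the column afterwards, and the value 'calm' is made up for illustration.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [('behavior', ''), ('head', 'x'), ('head', 'y')],
    names=['bodypart', 'coordinates'],
)
frame = pd.DataFrame([['cute', 1, 2], ['cute', 9, 2]], columns=columns)

mask = frame[('head', 'x')] == 1  # boolean Series that selects the rows
target = ('behavior', '')         # full key of the column to overwrite

# Subscript assignment on the loc indexer ...
frame.loc[mask, target] = 'buzzz'
after_subscript = frame[target].tolist()

# ... is a single __setitem__ call with a (rows, column) tuple as the key.
frame.loc.__setitem__((~mask, target), 'calm')
after_explicit = frame[target].tolist()

print(after_subscript)
print(after_explicit)
```

Because rows and column arrive in one call, the indexer can write into the original DataFrame instead of an intermediate object.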

Rendered DataFrame outputs of the first four cells, in order:

   col1  col2
0     1     3
1     2     4

   col1  col2  col3
0     1     3     5
1     2     4     6

   col2  col3
0     3     5
1     4     6

bodypart    head    body
coordinates    x  y    x  y
0              1  2    3  4
1              9  2    3  4

Pandas MultiIndex examples