
Pandas selecting by label sometimes returns Series, sometimes returns DataFrame


Granted, the behavior is inconsistent, but I think it's easy to imagine cases where this is convenient. Anyway, to get a DataFrame every time, just pass a list to loc. There are other ways, but in my opinion this is the cleanest.

In [2]: type(df.loc[[3]])
Out[2]: pandas.core.frame.DataFrame

In [3]: type(df.loc[[1]])
Out[3]: pandas.core.frame.DataFrame
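For a fuller picture, here is a small sketch comparing a scalar label with a one-element list. The frame is an assumption on my part (a single column named 0, with the label 3 repeated in the index), since the question's data isn't shown here.

```python
import pandas as pd

# Assumed example frame: one column named 0, label 3 appears three times.
df = pd.DataFrame({0: list(range(5))}, index=[1, 2, 3, 3, 3])

scalar_unique = df.loc[1]      # scalar label that occurs once
scalar_repeated = df.loc[3]    # scalar label that occurs three times
listed_unique = df.loc[[1]]    # same labels, but wrapped in a list
listed_repeated = df.loc[[3]]

results = {
    "df.loc[1]": scalar_unique,
    "df.loc[3]": scalar_repeated,
    "df.loc[[1]]": listed_unique,
    "df.loc[[3]]": listed_repeated,
}
for expr, result in results.items():
    # Print the container type and shape for each way of selecting.
    print(f"{expr:12} {type(result).__name__:10} {result.shape}")
```

Only the scalar lookups change type depending on how often the label occurs; the list form should give a DataFrame in both cases.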


Your index contains the label 3 three times. For this reason df.loc[3] returns a DataFrame.

The reason is that you don't specify the column. So df.loc[3] selects the three matching rows across all columns (here that is just column 0), while df.loc[3,0] returns a Series. Likewise, df.loc[1:2] returns a DataFrame, because you slice the rows.

Selecting a single row (as df.loc[1]) returns a Series with the column names as the index.
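To make the three cases concrete, a minimal sketch. I'm assuming a frame with a single column named 0 and the index [1, 2, 3, 3, 3]; the variable names are only for illustration.

```python
import pandas as pd

# Assumed example frame: one column named 0, label 3 appears three times.
df = pd.DataFrame({0: list(range(5))}, index=[1, 2, 3, 3, 3])

# Row label AND column label given: the three matching rows of column 0.
one_column = df.loc[3, 0]

# Row slice (label based, both ends included): stays two-dimensional.
row_slice = df.loc[1:2]

# Single unique row label, no column: the row comes back as a Series
# whose index holds the column names.
single_row = df.loc[1]

print(type(one_column).__name__, len(one_column))
print(type(row_slice).__name__, row_slice.shape)
print(type(single_row).__name__, list(single_row.index))
```

Note that the label slice relies on the index being sorted here; with a non-monotonic index that has duplicates, slicing by label can raise an error.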

If you want to be sure to always get a DataFrame, you can slice like df.loc[1:1]. Another option is boolean indexing (df.loc[df.index==1]) or the take method (df.take([0]), but note that this uses positions, not labels!).
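A short sketch of the three options side by side, again on an assumed single-column frame with index [1, 2, 3, 3, 3]:

```python
import pandas as pd

# Assumed example frame: one column named 0, label 3 appears three times.
df = pd.DataFrame({0: list(range(5))}, index=[1, 2, 3, 3, 3])

by_slice = df.loc[1:1]             # label slice from 1 to 1, inclusive
by_mask = df.loc[df.index == 1]    # boolean mask built from the index
by_position = df.take([0])         # POSITION 0, which happens to hold label 1

for name, result in [("slice", by_slice), ("mask", by_mask), ("take", by_position)]:
    # Each result keeps two dimensions even though only one row matches.
    print(name, type(result).__name__, result.shape, list(result.index))
```

The take call only lines up with the other two because label 1 sits at position 0 in this frame; after sorting or filtering that would no longer be guaranteed.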


The TLDR

When using loc

df.loc[:] = DataFrame

df.loc[label] = Series if the label occurs once in the index, DataFrame if the label occurs more than once (the number of columns does not matter)

df.loc[:, ["col_name"]] = DataFrame, always, even if the selection has only 1 row

df.loc[:, "col_name"] = Series

Not using loc

df["col_name"] = Series

df[["col_name"]] = DataFrame
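If you'd rather check these against your own pandas version than take a summary on trust, something like the following works. The frame, its column names "a" and "b", and the index labels are made up for illustration.

```python
import pandas as pd

# Hypothetical two-column frame with a unique index.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=[10, 20, 30])
one_row = df.iloc[[0]]  # a frame holding a single row

checks = {
    'df.loc[:]': df.loc[:],
    'df.loc[10]': df.loc[10],
    'df.loc[:, "a"]': df.loc[:, "a"],
    'df.loc[:, ["a"]]': df.loc[:, ["a"]],
    'one_row.loc[:, ["a"]]': one_row.loc[:, ["a"]],
    'df["a"]': df["a"],
    'df[["a"]]': df[["a"]],
}
for expr, result in checks.items():
    # Report which container type each selection produced.
    print(f"{expr:24} {type(result).__name__}")
```

The pattern I'd expect to see: a scalar selector drops a dimension, while a list or a slice keeps it, regardless of how many rows or columns end up in the result.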