'Could not interpret input' error with Seaborn when plotting groupbys
The reason for the exception you are getting is that Program becomes the index of the dataframes df_mean and df_count after your groupby operation, so it is no longer available as a column.
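To see this for yourself, here is a minimal sketch. The sample data is hypothetical (it is not the data from your question); I only picked values whose group means match the output shown further down.

```python
import pandas as pd

# Hypothetical sample data, chosen so that the group means are 20, 40 and 45.
df = pd.DataFrame({
    "Program": ["prog1", "prog1", "prog2", "prog2", "prog3"],
    "Value": [10, 30, 35, 45, 45],
})

df_mean = df.groupby("Program").mean()

# 'Program' is no longer listed among the columns...
print(df_mean.columns.tolist())
# ...because it is now the name of the index.
print(df_mean.index.name)
```

A plotting call that looks up x='Program' among the columns of df_mean therefore has nothing to find.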
If you want to get the factorplot from df_mean, an easy solution is to add the index as a column,
In [7]: df_mean['Program'] = df_mean.index

In [8]: %matplotlib inline
        import seaborn as sns
        sns.factorplot(x='Program', y='Value', data=df_mean)
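As a side note, pandas also offers reset_index(), which moves the index back into a regular column instead of duplicating it. This is only a sketch of that alternative, again on hypothetical sample data, and not something from your original code.

```python
import pandas as pd

# Hypothetical sample data (not the data from the question).
df = pd.DataFrame({
    "Program": ["prog1", "prog1", "prog2", "prog2", "prog3"],
    "Value": [10, 30, 35, 45, 45],
})

df_mean = df.groupby("Program").mean()

# reset_index() turns the 'Program' index into an ordinary column
# and replaces the index with a default integer one.
df_flat = df_mean.reset_index()
print(df_flat)
```

The resulting df_flat has both Program and Value as columns, so it can be passed as data to the plotting call.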
However, you could even more simply let factorplot do the calculations for you,
sns.factorplot(x='Program', y='Value', data=df)
You'll obtain the same result. Hope it helps.
EDIT after comments
Indeed, you make a very good point about the parameter as_index; by default it is set to True, and in that case Program becomes part of the index, as in your question.
In [14]: df_mean = df.groupby('Program', as_index=True).mean().sort(['Value'], ascending=False)[['Value']]
         df_mean
Out[14]:
         Value
Program
prog3       45
prog2       40
prog1       20
Just to be clear, this way Program is not a column anymore; it becomes the index. The trick df_mean['Program'] = df_mean.index actually keeps the index as it is and adds a new column holding the index values, so that Program is now duplicated.
In [15]: df_mean['Program'] = df_mean.index
         df_mean
Out[15]:
         Value Program
Program
prog3       45   prog3
prog2       40   prog2
prog1       20   prog1
However, if you set as_index to False, you get Program as a column, plus a new auto-incrementing integer index,
In [16]: df_mean = df.groupby('Program', as_index=False).mean().sort(['Value'], ascending=False)[['Program', 'Value']]
         df_mean
Out[16]:
  Program  Value
2   prog3     45
1   prog2     40
0   prog1     20
This way you could feed it directly to seaborn. Still, you could use df and get the same result.
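One caveat if you are on newer library versions: DataFrame.sort was later removed from pandas in favor of sort_values, and seaborn renamed factorplot to catplot in version 0.9. A rough sketch of the as_index=False variant with the newer pandas call, on hypothetical sample data, could look like this:

```python
import pandas as pd

# Hypothetical sample data (not the data from the question).
df = pd.DataFrame({
    "Program": ["prog1", "prog1", "prog2", "prog2", "prog3"],
    "Value": [10, 30, 35, 45, 45],
})

# as_index=False keeps 'Program' as a regular column;
# sort_values replaces the older DataFrame.sort.
df_mean = (
    df.groupby("Program", as_index=False)["Value"]
      .mean()
      .sort_values("Value", ascending=False)
)
print(df_mean)
```

With a recent seaborn, the plotting call would then be along the lines of sns.catplot(x='Program', y='Value', data=df_mean, kind='point'); I am passing kind='point' here because catplot's default plot kind differs from what factorplot used to draw.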
Hope it helps.