linearmodels panelOLS: Regression output with stars
A bit late, but here is what I use. In the example above I calculated two fixed effects regressions, with their results stored in fe_res_VS and fe_res_CVS:
import numpy as np
import pandas as pd

pd.set_option('display.precision', 4)
pd.options.display.float_format = '{:,.4f}'.format

Reg_Output_FAmount = pd.DataFrame()

# 1) First model: coefficients, p-values, stars and standard errors
Table1 = pd.DataFrame(fe_res_VS.params)
Table1['id'] = np.arange(len(Table1))  # create numerical index for pd.DataFrame
Table1 = Table1.reset_index().set_index(keys='id')  # set numerical index as new index
Table1 = Table1.rename(columns={"index": "parameter", "parameter": "coefficient 1"})

P1 = pd.DataFrame(fe_res_VS.pvalues)
P1['id'] = np.arange(len(P1))  # create numerical index for pd.DataFrame
P1 = P1.reset_index().set_index(keys='id')  # set numerical index as new index
P1 = P1.rename(columns={"index": "parameter"})

Table1 = pd.merge(Table1, P1, on='parameter')
Table1['significance 1'] = np.where(Table1['pvalue'] <= 0.01, '***',
                           np.where(Table1['pvalue'] <= 0.05, '**',
                           np.where(Table1['pvalue'] <= 0.1, '*', '')))
Table1.rename(columns={"pvalue": "pvalue 1"}, inplace=True)

SE1 = pd.DataFrame(fe_res_VS.std_errors)
SE1['id'] = np.arange(len(SE1))  # create numerical index for pd.DataFrame
SE1 = SE1.reset_index().set_index(keys='id')  # set numerical index as new index
SE1 = SE1.rename(columns={"index": "parameter", "std_error": "coefficient 1"})
SE1['parameter'] = SE1['parameter'].astype(str) + '_SE'
SE1['significance 1'] = ''
SE1 = SE1.round(4)
SE1['coefficient 1'] = '(' + SE1['coefficient 1'].astype(str) + ')'

# Stack the standard-error rows under the coefficient rows.
# pd.concat is used because DataFrame.append no longer exists in pandas 2.0+.
Table1 = pd.concat([Table1, SE1])
Table1 = Table1.sort_values('parameter')
Table1.replace(np.nan, '', inplace=True)
del P1
del SE1

# 2) Second model: same steps with the "2" column names
Table2 = pd.DataFrame(fe_res_CVS.params)
Table2['id'] = np.arange(len(Table2))  # create numerical index for pd.DataFrame
Table2 = Table2.reset_index().set_index(keys='id')  # set numerical index as new index
Table2 = Table2.rename(columns={"index": "parameter", "parameter": "coefficient 2"})

P2 = pd.DataFrame(fe_res_CVS.pvalues)
P2['id'] = np.arange(len(P2))  # create numerical index for pd.DataFrame
P2 = P2.reset_index().set_index(keys='id')  # set numerical index as new index
P2 = P2.rename(columns={"index": "parameter"})

Table2 = pd.merge(Table2, P2, on='parameter')
Table2['significance 2'] = np.where(Table2['pvalue'] <= 0.01, '***',
                           np.where(Table2['pvalue'] <= 0.05, '**',
                           np.where(Table2['pvalue'] <= 0.1, '*', '')))
Table2.rename(columns={"pvalue": "pvalue 2"}, inplace=True)

SE2 = pd.DataFrame(fe_res_CVS.std_errors)
SE2['id'] = np.arange(len(SE2))  # create numerical index for pd.DataFrame
SE2 = SE2.reset_index().set_index(keys='id')  # set numerical index as new index
SE2 = SE2.rename(columns={"index": "parameter", "std_error": "coefficient 2"})
SE2['parameter'] = SE2['parameter'].astype(str) + '_SE'
SE2['significance 2'] = ''
SE2 = SE2.round(4)
SE2['coefficient 2'] = '(' + SE2['coefficient 2'].astype(str) + ')'

Table2 = pd.concat([Table2, SE2])
Table2 = Table2.sort_values('parameter')
Table2.replace(np.nan, '', inplace=True)
del P2
del SE2

# Merging tables and adding stats
Reg_Output_FAmount = pd.merge(Table1, Table2, on='parameter', how='outer')

# Each summary row is a one-row DataFrame appended below the coefficients
stat_cols = ['parameter', 'pvalue 1', 'significance 1', 'pvalue 2', 'significance 2']
stat_rows = [
    ["observ.", fe_res_VS.nobs, '', fe_res_CVS.nobs, ''],
    ["Rsquared", "{:.4f}".format(fe_res_VS.rsquared), '', "{:.4f}".format(fe_res_CVS.rsquared), ''],
    ["Model type", fe_res_VS.name, '', fe_res_CVS.name, ''],
    ["DV", fe_res_VS.model.dependent.vars[0], '', fe_res_CVS.model.dependent.vars[0], ''],
]
for row in stat_rows:
    Reg_Output_FAmount = pd.concat(
        [Reg_Output_FAmount, pd.DataFrame(np.array([row]), columns=stat_cols)],
        ignore_index=True)
Reg_Output_FAmount.fillna('', inplace=True)
This results in a nice regression output that looks like this:
    parameter   coefficient 1  pvalue 1  significance 1  coefficient 2  pvalue 2  significance 2
0   IV          0.0676         0.2269                    0.0732         0.1835
1   IV_SE       (0.0559)                                 (0.055)
2   Control     0.3406         0.0125    **              0.3482         0.0118    **
3   Control_SE  (0.1363)                                 (0.1383)
4   const       0.2772         0.0000    ***             0.2769         0.0000    ***
5   const_SE    (0.012)                                  (0.012)
6   observ.                    99003                                    99003
7   Rsquared                   0.12                                     0.14
8   Model type                 PanelOLS                                 PanelOLS
9   DV                         FAmount                                  FAmount
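If you have more than two models, repeating the per-model block gets long. The coefficient/standard-error layout can be factored into a small helper. This is only a sketch: coef_rows is a hypothetical name, not something from linearmodels, and it assumes each input maps parameter names to floats (a plain dict, or a pandas Series such as fe_res_VS.params, fe_res_VS.std_errors and fe_res_VS.pvalues).

```python
def coef_rows(params, std_errors, pvalues, digits=4):
    """Build (label, text, stars) rows: a coefficient row followed by a
    standard-error row for every parameter.

    Hypothetical helper for illustration. Each argument maps a parameter
    name to a float; dicts and pandas Series both work.
    """
    rows = []
    for name, coef in params.items():
        p = pvalues[name]
        # Same cut-offs as the np.where chain above: 1%, 5%, 10%.
        if p <= 0.01:
            stars = "***"
        elif p <= 0.05:
            stars = "**"
        elif p <= 0.1:
            stars = "*"
        else:
            stars = ""
        rows.append((name, f"{coef:.{digits}f}", stars))
        # Standard errors go in parentheses on their own "<name>_SE" row.
        rows.append((f"{name}_SE", f"({std_errors[name]:.{digits}f})", ""))
    return rows


# Illustrative numbers copied from the first model's column in the table above.
rows = coef_rows(
    params={"IV": 0.0676, "Control": 0.3406},
    std_errors={"IV": 0.0559, "Control": 0.1363},
    pvalues={"IV": 0.2269, "Control": 0.0125},
)
for label, text, stars in rows:
    print(f"{label:<12}{text:>10} {stars}")
```

Calling it once per results object and merging the outputs on the label would give one column per model without copying the block each time.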
I have been struggling with the same problem for a few days, so I am very excited to share a very easy way to include the significance stars and remove the CIs. Here it is:
Step 1: Install the linearmodels package.
Step 2: Import the compare function from linearmodels.panel:
from linearmodels.panel import compare
Step 3: Call compare with the arguments you want. For instance, specifying stars=True will give you significance stars. Very convenient!
# results_A and results_B stand for the fitted results of model A and model B
compare({'model_A_name': results_A, 'model_B_name': results_B}, stars=True)
This small function saved my life! Enjoy it.
One more thing: the stars are based on the p-value of each coefficient, where 1, 2 and 3 stars correspond to p-values of 10%, 5% and 1%, respectively. I am not sure whether the thresholds can be customized, for example so that 1, 2 and 3 stars correspond to p-values of 5%, 1% and 0.1%.
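If compare turns out not to support custom thresholds, one workaround is to compute the stars yourself from the p-values. A minimal sketch: stars_for and its default cut-offs are hypothetical, written for illustration only, and are not part of linearmodels.

```python
def stars_for(pvalue, thresholds=(0.05, 0.01, 0.001)):
    """Return one star for every threshold the p-value is at or below.

    Hypothetical helper. The default cut-offs are 5%, 1% and 0.1%;
    pass thresholds=(0.1, 0.05, 0.01) for the 10%, 5%, 1% convention.
    """
    return "*" * sum(pvalue <= t for t in thresholds)


print([stars_for(p) for p in (0.2, 0.03, 0.005, 0.0005)])
# prints ['', '*', '**', '***']
```

Since the p-values of a fitted model are a pandas Series, something like results.pvalues.apply(stars_for) should give one label per coefficient, which you can then attach to your own table.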
The credit goes to the fantastic package developer and maintainer. Thank you all! For more information, see the source file at: ~/opt/anaconda3/lib/python3.7/site-packages/linearmodels/panel/results.py