scikit learn output metrics.classification_report into CSV/tab-delimited format
As of scikit-learn v0.20, the easiest way to convert a classification report to a pandas DataFrame is to have the report returned as a dict:
report = classification_report(y_test, y_pred, output_dict=True)
and then construct a DataFrame and transpose it:
df = pandas.DataFrame(report).transpose()
From here on, you are free to use the standard pandas methods to generate your desired output format (CSV, HTML, LaTeX, ...). See the documentation.
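As a minimal sketch of that last step, the snippet below writes the transposed report as both comma-separated and tab-delimited text. The toy labels and the `"class"` index label are made up for illustration; the rest only uses `classification_report(..., output_dict=True)` and `DataFrame.to_csv`.

```python
import pandas as pd
from sklearn.metrics import classification_report

# Toy labels, made up for illustration only.
y_test = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]

report = classification_report(y_test, y_pred, output_dict=True)
df = pd.DataFrame(report).transpose()

# With no path argument, to_csv returns the text instead of writing a file.
csv_text = df.to_csv(index_label="class")
tsv_text = df.to_csv(sep="\t", index_label="class")

# Passing a path writes to disk instead, e.g.:
# df.to_csv("classification_report.tsv", sep="\t", index_label="class")
print(tsv_text)
```

Note that the `accuracy` entry of the dict is a single number, so after the transpose it shows up as a row with the same value repeated in every column.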
If you want the individual scores, this should do the job just fine.
import pandas as pd
from sklearn.metrics import classification_report

def classification_report_csv(report):
    report_data = []
    lines = report.split('\n')
    # Skip the header and the blank line after it; stop at the blank
    # line that precedes the summary rows.
    for line in lines[2:]:
        if not line.strip():
            break
        # Split from the right so class names containing spaces stay intact.
        row_data = line.strip().rsplit(maxsplit=4)
        row = {}
        row['class'] = row_data[0]
        row['precision'] = float(row_data[1])
        row['recall'] = float(row_data[2])
        row['f1_score'] = float(row_data[3])
        row['support'] = float(row_data[4])
        report_data.append(row)
    dataframe = pd.DataFrame(report_data)
    dataframe.to_csv('classification_report.csv', index=False)

report = classification_report(y_true, y_pred)
classification_report_csv(report)
We can get the actual values from the precision_recall_fscore_support function and then put them into a DataFrame. The code below gives the same result, but now in a pandas DataFrame:
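import pandas as pd
from sklearn import metrics

# `true` and `pred` are the label arrays; `nb` is the fitted classifier.
clf_rep = metrics.precision_recall_fscore_support(true, pred)
out_dict = {
    "precision": clf_rep[0].round(2),
    "recall": clf_rep[1].round(2),
    "f1-score": clf_rep[2].round(2),
    "support": clf_rep[3],
}
out_df = pd.DataFrame(out_dict, index=nb.classes_)

# Summary row: unweighted mean of each score column, sum of the support column.
avg_tot = (
    out_df.apply(lambda x: round(x.mean(), 2) if x.name != "support" else round(x.sum(), 2))
    .to_frame()
    .T
)
avg_tot.index = ["avg/total"]
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
out_df = pd.concat([out_df, avg_tot])
print(out_df)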
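The snippet above relies on `true`, `pred` and a fitted classifier `nb` that are defined elsewhere. For a version that runs on its own, here is a rough sketch with made-up toy labels; it passes an explicit `labels` list (an assumption, replacing `nb.classes_`) so the row order is known, and it emits tab-delimited text rather than printing the frame.

```python
import pandas as pd
from sklearn.metrics import precision_recall_fscore_support

# Toy labels, made up for illustration only.
true = ["cat", "dog", "dog", "cat", "dog", "cat"]
pred = ["cat", "dog", "cat", "cat", "dog", "dog"]

# Fixing the label order makes the row index explicit
# (stands in for nb.classes_ from a fitted classifier).
labels = sorted(set(true) | set(pred))
precision, recall, f1, support = precision_recall_fscore_support(
    true, pred, labels=labels
)

out_df = pd.DataFrame(
    {"precision": precision, "recall": recall, "f1-score": f1, "support": support},
    index=labels,
)

# Round only for display; out_df itself keeps full precision.
tsv_text = out_df.round(2).to_csv(sep="\t", index_label="class")
print(tsv_text)
```

Keeping the unrounded values in `out_df` and rounding only at export time avoids compounding rounding error if the scores are aggregated later.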