scikit learn output metrics.classification_report into CSV/tab-delimited format


As of scikit-learn v0.20, the easiest way to convert a classification report to a pandas DataFrame is to have the report returned as a dict:

report = classification_report(y_test, y_pred, output_dict=True)

and then construct a DataFrame and transpose it:

df = pandas.DataFrame(report).transpose()

From here on, you are free to use the standard pandas methods to generate your desired output formats (CSV, HTML, LaTeX, ...).
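As a minimal sketch of that last step, assuming toy labels invented purely for illustration, the dict-based report can be turned into tab-delimited text like this (the `index_label="class"` column name is my own choice, not something scikit-learn prescribes):

```python
import pandas as pd
from sklearn.metrics import classification_report

# Toy labels, made up for illustration only
y_test = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

# Requires scikit-learn >= 0.20 for output_dict
report = classification_report(y_test, y_pred, output_dict=True)
df = pd.DataFrame(report).transpose()

# With no path given, to_csv returns the text instead of writing a file.
# sep="\t" gives tab-delimited output; drop it for plain CSV.
tsv = df.to_csv(sep="\t", index_label="class")
print(tsv)
```

To write to disk instead, pass a file name as the first argument to `to_csv`. In recent scikit-learn versions the dict also holds a scalar `accuracy` entry, which shows up as a row repeating the same value in every column, so you may want to drop that row before exporting.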

See the documentation.


If you want the individual scores, this should do the job just fine.

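import pandas as pd
from sklearn.metrics import classification_report

def classification_report_csv(report):
    report_data = []
    lines = report.split('\n')
    # Skip the header and the blank line after it, then read the per-class
    # rows until the blank line that precedes the summary rows.
    for line in lines[2:]:
        if not line.strip():
            break
        # Split from the right so class names containing spaces stay intact
        row_data = line.rsplit(None, 4)
        row = {}
        row['class'] = row_data[0].strip()
        row['precision'] = float(row_data[1])
        row['recall'] = float(row_data[2])
        row['f1_score'] = float(row_data[3])
        row['support'] = float(row_data[4])
        report_data.append(row)
    dataframe = pd.DataFrame(report_data)
    dataframe.to_csv('classification_report.csv', index=False)

report = classification_report(y_true, y_pred)
classification_report_csv(report)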


We can get the actual values from the precision_recall_fscore_support function and put them into a DataFrame. The code below gives essentially the same table, but as a pandas DataFrame:

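import pandas as pd
from sklearn import metrics

# true and pred are the label arrays; nb is the fitted classifier
clf_rep = metrics.precision_recall_fscore_support(true, pred)
out_dict = {
    "precision": clf_rep[0].round(2),
    "recall": clf_rep[1].round(2),
    "f1-score": clf_rep[2].round(2),
    "support": clf_rep[3],
}
out_df = pd.DataFrame(out_dict, index=nb.classes_)
# Summary row: unweighted mean of each score, total of the support column
avg_tot = (out_df.apply(lambda x: round(x.mean(), 2) if x.name != "support"
                        else round(x.sum(), 2)).to_frame().T)
avg_tot.index = ["avg/total"]
# DataFrame.append was removed in pandas 2.0, so use pd.concat instead
out_df = pd.concat([out_df, avg_tot])
print(out_df)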
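
For anyone who wants to try this approach without a fitted classifier at hand, here is a self-contained sketch on toy labels (the data and the `labels` list are made up for illustration). Unlike the plain column mean, it builds the summary row as a support-weighted average by calling the same function again with `average="weighted"`:

```python
import pandas as pd
from sklearn.metrics import precision_recall_fscore_support

# Toy labels, made up for illustration only
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
labels = [0, 1, 2]

# Per-class arrays: precision, recall, f1-score, support
p, r, f, s = precision_recall_fscore_support(y_true, y_pred, labels=labels)
per_class = pd.DataFrame(
    {"precision": p, "recall": r, "f1-score": f, "support": s},
    index=labels,
)

# With average="weighted" the function returns scalars; support comes back as None
wp, wr, wf, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=labels, average="weighted"
)
summary = pd.DataFrame(
    {"precision": [wp], "recall": [wr], "f1-score": [wf], "support": [s.sum()]},
    index=["weighted avg"],
)

out_df = pd.concat([per_class, summary])
csv_text = out_df.round(2).to_csv()
print(csv_text)
```

Rounding is left to the very end here so the weighted averages are computed from unrounded scores.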