
How to save Scikit-Learn-Keras Model into a Persistence File (pickle/hd5/json/yaml)


Edit 1: original answer about saving the model

With HDF5:

# saving model
json_model = model_tt.model.to_json()
open('model_architecture.json', 'w').write(json_model)

# saving weights
model_tt.model.save_weights('model_weights.h5', overwrite=True)

# loading model
from keras.models import model_from_json
model = model_from_json(open('model_architecture.json').read())
model.load_weights('model_weights.h5')

# don't forget to compile your model
model.compile(loss='binary_crossentropy', optimizer='adam')
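
The same pattern works with YAML (also mentioned in the question), assuming your Keras version still ships to_yaml()/model_from_yaml() (they were removed from recent tf.keras releases), so treat this as a minimal sketch:

# saving the architecture as YAML instead of JSON
yaml_model = model_tt.model.to_yaml()
open('model_architecture.yaml', 'w').write(yaml_model)

# loading it back
from keras.models import model_from_yaml
model = model_from_yaml(open('model_architecture.yaml').read())
model.load_weights('model_weights.h5')
model.compile(loss='binary_crossentropy', optimizer='adam')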

Edit 2: full code example with the iris dataset

# Train model and make predictions
import numpy
import pandas
from keras.models import Sequential, model_from_json
from keras.layers import Dense
from keras.utils import np_utils
from sklearn import datasets
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

# load dataset
iris = datasets.load_iris()
X, Y, labels = iris.data, iris.target, iris.target_names
X = preprocessing.scale(X)

# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# convert integers to dummy variables (i.e. one hot encoded)
y = np_utils.to_categorical(encoded_Y)

def build_model():
    # create model
    model = Sequential()
    model.add(Dense(4, input_dim=4, init='normal', activation='relu'))
    model.add(Dense(3, init='normal', activation='sigmoid'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

def save_model(model):
    # saving model
    json_model = model.to_json()
    open('model_architecture.json', 'w').write(json_model)
    # saving weights
    model.save_weights('model_weights.h5', overwrite=True)

def load_model():
    # loading model
    model = model_from_json(open('model_architecture.json').read())
    model.load_weights('model_weights.h5')
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    return model

X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.3, random_state=seed)

# build
model = build_model()
model.fit(X_train, Y_train, nb_epoch=200, batch_size=5, verbose=0)

# save
save_model(model)

# load
model = load_model()

# predictions
predictions = model.predict_classes(X_test, verbose=0)
print(predictions)

# reverse encoding
for pred in predictions:
    print(labels[pred])

Please note that I used Keras only, not the scikit-learn wrapper; it only adds complexity to something simple. The code is also deliberately not factored, so you can see the whole picture.

Also, you said you want to output 1 or 0. That is not possible with this dataset, because it has 3 output dimensions and classes (Iris-setosa, Iris-versicolor, Iris-virginica). If you had only 2 classes, your output dimension and classes would be 0 or 1 using a sigmoid output function.
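
For illustration, here is a minimal two-class sketch; the 4-feature input and the hypothetical y_binary labels are assumptions, not part of the iris example above:

# minimal sketch of a binary classifier with a sigmoid output
from keras.models import Sequential
from keras.layers import Dense

binary_model = Sequential()
binary_model.add(Dense(4, input_dim=4, activation='relu'))
binary_model.add(Dense(1, activation='sigmoid'))  # single unit -> output in [0, 1]
binary_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# binary_model.fit(X_train, y_binary_train, nb_epoch=200, batch_size=5, verbose=0)
# predictions are probabilities; threshold at 0.5 to get 0 or 1
# preds = (binary_model.predict(X_test) > 0.5).astype(int)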


Just adding to gaarv's answer: if you don't require the separation between the model structure (model.to_json()) and the weights (model.save_weights()), you can use one of the following:

  • Use the built-in keras.models.save_model and keras.models.load_model, which store everything together in an HDF5 file (a minimal sketch follows this list).
  • Use pickle to serialize the Model object (or any class that contains references to it) into a file/network/whatever.
    Unfortunately, Keras doesn't support pickle by default. You can use my patchy solution that adds this missing feature. Working code is here: http://zachmoshe.com/2017/04/03/pickling-keras-models.html
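
A minimal sketch of the single-file approach, assuming an already compiled Keras model named model (the file name is just illustrative):

from keras.models import load_model

# save architecture, weights, and optimizer state in one HDF5 file
model.save('full_model.h5')  # equivalent to keras.models.save_model(model, 'full_model.h5')

# later, restore everything in one call; no need to re-compile
restored_model = load_model('full_model.h5')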


Another great alternative is to use callbacks when you fit your model, specifically the ModelCheckpoint callback, like this:

from keras.callbacks import ModelCheckpoint

# Create instance of ModelCheckpoint
chk = ModelCheckpoint("myModel.h5", monitor='val_loss', save_best_only=False)

# add that callback to the list of callbacks to pass
callbacks_list = [chk]

# create your model
model_tt = KerasClassifier(build_fn=create_model, nb_epoch=150, batch_size=10)

# fit your model with your data. Pass the callback(s) here
model_tt.fit(X_train, y_train, callbacks=callbacks_list)

This saves the model at each epoch to the myModel.h5 file. This provides great benefits: you can stop training whenever you want (for example, when you see it has started to overfit) and still retain the results of the previous epochs.

Note that this saves both the structure and the weights in the same HDF5 file (as shown by Zach), so you can then load your model using keras.models.load_model.

If you want to save only your weights, you can use the save_weights_only=True argument when instantiating your ModelCheckpoint, and then load your model as explained by Gaarv (see the sketch after the quote). Quoting from the docs:

save_weights_only: if True, then only the model's weights will be saved (model.save_weights(filepath)), else the full model is saved (model.save(filepath)).
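
A minimal sketch of the weights-only variant, assuming the same create_model build function used with the KerasClassifier above to rebuild the architecture before loading:

from keras.callbacks import ModelCheckpoint

# checkpoint only the weights at each epoch
chk = ModelCheckpoint("myWeights.h5", monitor='val_loss', save_weights_only=True)
model_tt.fit(X_train, y_train, callbacks=[chk])

# to restore: rebuild the architecture first, then load the weights
model = create_model()
model.load_weights("myWeights.h5")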