
Running Keras model for prediction in multiple threads


Multi-threading in Python doesn't necessarily make better use of your resources, since Python uses a global interpreter lock (GIL) and only one native thread can execute Python bytecode at a time.

In Python you should usually use multiprocessing to utilize your resources, but since we're talking about Keras models, I'm not sure even that is the right thing to do. Loading several models in several processes has its own overhead, and you could simply increase the batch size, as others have already pointed out.
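Increasing the batch size just means handing `model.predict` bigger chunks instead of calling it once per sample from many threads. A minimal sketch of that chunking logic, with `fake_predict` standing in for `model.predict` so it runs without TensorFlow (with Keras you would call `model.predict(batch, batch_size=...)` instead):

```python
def fake_predict(batch):
    # Stand-in for model.predict: returns one "score" per sample.
    return [x * 2 for x in batch]

def predict_in_batches(samples, batch_size=1024):
    # Feed the model large slices instead of one sample per thread.
    results = []
    for start in range(0, len(samples), batch_size):
        results.extend(fake_predict(samples[start:start + batch_size]))
    return results

if __name__ == "__main__":
    print(predict_in_batches(list(range(10)), batch_size=4))
```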

Or, if you have a heavy pre-processing stage, you could preprocess your data in one process and predict in another (although I doubt that would be necessary either).
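That producer/consumer split can be sketched with `multiprocessing.Queue`: one process does the pre-processing, and the process that owns the model consumes the queue. `preprocess` and `fake_predict` below are placeholders (assumptions, not Keras APIs); in practice the Keras model would live only in the consumer process.

```python
from multiprocessing import Process, Queue

def preprocess(x):
    # Placeholder for a heavy pre-processing step.
    return x + 0.5

def fake_predict(x):
    # Placeholder for model.predict; the model lives in this process only.
    return x * 2

def producer(raw_items, queue):
    # Runs in a separate process: preprocess and hand results over.
    for item in raw_items:
        queue.put(preprocess(item))
    queue.put(None)  # sentinel: no more work

def run(raw_items):
    queue = Queue()
    p = Process(target=producer, args=(raw_items, queue))
    p.start()
    results = []
    while True:
        item = queue.get()
        if item is None:
            break
        results.append(fake_predict(item))
    p.join()
    return results

if __name__ == "__main__":
    print(run([0, 1, 2]))
```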


It's a bad idea to predict data in multiple threads. You can use a greater batch_size in model.predict when predicting offline, and use TensorFlow Serving when predicting online.


Keras is not thread-safe. For predicting a large batch, you can use batch_size to set an upper limit on how many samples are processed at once. If you are deploying to production, the ideal approach is to convert the model weights to the TensorFlow protobuf (SavedModel) format and then use TensorFlow Serving.
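A rough sketch of that deployment, assuming TensorFlow 2.x and the official TensorFlow Serving Docker image (the model name and paths here are assumptions, not fixed values):

```shell
# Export the Keras model as a SavedModel (protobuf) from Python first, e.g.:
#   model.save("/models/my_model/1")   # Serving expects a version subdirectory
# Then serve it with the TensorFlow Serving image:
docker run -p 8501:8501 \
  --mount type=bind,source=/models/my_model,target=/models/my_model \
  -e MODEL_NAME=my_model -t tensorflow/serving
# Predictions go over the REST API, which is safe to call from many clients:
#   curl -d '{"instances": [[1.0, 2.0]]}' \
#     http://localhost:8501/v1/models/my_model:predict
```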

You can follow this blog: http://machinelearningmechanic.com/keras/2019/06/26/keras-serving-keras-model-quickly-with-tensorflow-serving-and-docker-md.html