
Tensorflow Serving: When to use it rather than simple inference inside Flask service?


I believe most of the reasons why you would prefer TensorFlow Serving over Flask are related to performance:

  • TensorFlow Serving makes use of gRPC and Protobuf, while a regular Flask web service uses REST and JSON. REST typically runs over HTTP/1.1 while gRPC uses HTTP/2 (there are important differences). In addition, Protobuf is a binary serialization format, which is more compact and faster to parse than JSON (see the gRPC client sketch after this list).
  • TensorFlow Serving can batch requests to the same model, which uses hardware (e.g. GPUs) more efficiently.
  • TensorFlow Serving can manage model versioning.
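
For illustration, here is a minimal sketch of a gRPC Predict call against TF Serving. The model name my_model, the input tensor name input, and port 8500 are all assumptions; check your own model's signature and serving config:

```python
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Assumes TF Serving exposes its gRPC endpoint on localhost:8500.
channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "my_model"                  # hypothetical model name
request.model_spec.signature_name = "serving_default"
# "input" is a hypothetical input tensor name; check your model's signature.
request.inputs["input"].CopyFrom(
    tf.make_tensor_proto([[1.0, 2.0, 3.0]], dtype=tf.float32)
)

# The request and response travel as binary Protobuf over HTTP/2.
response = stub.Predict(request, timeout=5.0)
print(response.outputs)
```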

As with almost everything, it depends a lot on your use case and scenario, so it's important to weigh the pros and cons against your requirements. TensorFlow Serving has great features, but with some effort these features could also be implemented on top of Flask (for instance, you could build your own batching mechanism, as sketched below).
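
As a rough illustration of that effort, here is a minimal, simplified batching sketch for Flask: requests are queued, a background thread groups them into batches, runs one forward pass, and hands each result back. The DummyModel stand-in and the size/timeout values are assumptions; a production version would need more care (error handling, a threaded server, backpressure):

```python
import queue
import threading

import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

class DummyModel:
    """Stand-in for a real model, e.g. tf.keras.models.load_model(...)."""
    def predict(self, batch):
        return batch.sum(axis=-1, keepdims=True)

model = DummyModel()
MAX_BATCH_SIZE = 8        # assumed values; tune for your workload
BATCH_TIMEOUT_S = 0.01
_pending = queue.Queue()  # (input array, per-request result queue) tuples

def _batch_worker():
    while True:
        items = [_pending.get()]          # block until one request arrives
        try:
            while len(items) < MAX_BATCH_SIZE:
                items.append(_pending.get(timeout=BATCH_TIMEOUT_S))
        except queue.Empty:
            pass                          # timeout hit: run with what we have
        outputs = model.predict(np.stack([x for x, _ in items]))
        for (_, result_q), out in zip(items, outputs):
            result_q.put(out)             # wake up the waiting request

threading.Thread(target=_batch_worker, daemon=True).start()

@app.route("/predict", methods=["POST"])
def predict():
    x = np.asarray(request.get_json()["instances"], dtype=np.float32)
    result_q = queue.Queue()
    _pending.put((x, result_q))
    return jsonify({"prediction": result_q.get().tolist()})
```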


Flask is used to handle request/response, whereas TensorFlow Serving is built specifically for serving flexible ML models in production.

Let's take some scenarios where you want to:

  • Serve multiple models to multiple products (many-to-many relations) at the same time.
  • See which model is making an impact on your product (A/B testing).
  • Update model weights in production, which is as easy as saving a new model to a folder (see the export sketch after this list).
  • Get performance on par with code written in C/C++.
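
For example, a minimal sketch of the versioning point: TF Serving watches a model's base directory and automatically loads the highest-numbered version subdirectory it finds there, so a deploy is just another save. The toy model and the path /models/my_model are assumptions:

```python
import tensorflow as tf

# Toy model; in practice this would be your trained model.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(3,))])

# TF Serving is pointed at /models/my_model (hypothetical path) and serves
# the highest-numbered version subdirectory it finds there.
tf.saved_model.save(model, "/models/my_model/1")

# ...later, after retraining: saving version 2 is the whole deployment.
# TF Serving detects the new folder and swaps traffic over to it.
tf.saved_model.save(model, "/models/my_model/2")
```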

And you can always get all those advantages for FREE by sending requests to TF Serving from Flask.
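
For example, a minimal sketch of that setup, assuming TF Serving exposes its REST API on port 8501 and serves a model named my_model (both assumptions): Flask handles the app-level concerns and forwards inference to TF Serving.

```python
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

# TF Serving REST endpoint; host, port, and model name are assumptions.
TF_SERVING_URL = "http://localhost:8501/v1/models/my_model:predict"

@app.route("/predict", methods=["POST"])
def predict():
    # App-specific work (auth, validation, feature prep) happens here...
    payload = {"instances": request.get_json()["instances"]}
    # ...while batching, versioning, and the fast runtime live in TF Serving.
    resp = requests.post(TF_SERVING_URL, json=payload, timeout=5)
    resp.raise_for_status()
    return jsonify(resp.json())
```

This way Flask stays a thin layer in front of the model, and the heavy lifting is delegated to TF Serving.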