Serving different Models in one MMS

christian_b · May 23, 2018, 6:26pm

Currently, it seems that MMS supports only one service file per model server. What is the preferred way to serve different types of models (differing in their pre/post-processing)?

Is it possible to apply conditional logic in the service file based on the queried endpoint?
Can multiple MMS instances co-exist on a single node, separated by serving port?
Should one implement a MultiNode Service class?
…

Thanks for pointers and guidance …

vamshidhardk · May 23, 2018, 6:53pm

To understand the requirement further, could you specify the use case for requiring multiple service files for the same model?

MMS supports loading and serving multiple models at the same time. Each of these models has its own endpoint for you to run queries against. If you are trying to run multiple models, you could use a different model service file for each model.

Also, how are you running MMS? Is it the container image or standalone MMS? You could run multiple model servers on multiple ports, but that's not advised…

christian_b · May 24, 2018, 6:34am

My inquiry is exactly what you describe … That is, I would like to have a service file per model.
I have two different types of model:

Classification
Detection

I would like to use a different service file for each. So I do …

mxnet-model-export --model-name model1 --model-path <DIR-MODEL1> --service-file-path <DIR-MODEL1>/<Model1>.py

mxnet-model-export --model-name model2 --model-path <DIR-MODEL2> --service-file-path <DIR-MODEL2>/<Model2>.py

This generates the two different model files (each with a separate model service file). In the __init__ method of each model’s service file, I included a simple print("Model1") or print("Model2") statement.

Now, I start the server:

mxnet-model-server --models model1=<MODEL1>.model model2=<MODEL2>.model --host <MyHost>

The models get registered as Flask endpoints. When I look through the startup messages, I see that only the __init__ of the first registered model is executed (e.g., only “Model1” is printed). This means that this __init__ method is run for both Model 1 and Model 2.

Later on, when I call the endpoints via curl, I see the behavior that …

CURL MODEL 1 - OK 
CURL MODEL 2 - Exception with ... 

File "/usr/local/lib/python2.7/dist-packages/mms/serving_frontend.py", line 468, in predict_callback
    response = modelservice.inference(input_data)
  File "/usr/local/lib/python2.7/dist-packages/mms/model_service/model_service.py", line 105, in inference
    data = self._postprocess(data)
  File "/home/local/.../<MODEL1>.py", line 31, in _postprocess

That is, the service file for MODEL1 is hit.

My suspicion is probably confirmed by this statement:

Note that if you supply a custom service for pre or post-processing, both models will use that same pipeline. There is currently no support for using different pipelines per-model.

Consequently, my question is whether and how I can deal with the above scenario.
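
If both models do end up sharing one service file, one conceivable workaround is to branch on the model name inside that shared file. The following is only a minimal sketch in plain Python. It assumes the shared service can learn which model it is serving, which has not been verified against MMS here. All names (SharedService, POSTPROCESSORS, the two post-processing functions) are hypothetical and not part of the MMS API.

```python
# Hypothetical sketch: none of these names come from the MMS API.


def classification_postprocess(scores):
    """Return the index of the highest score (top-1 class)."""
    return {"class_id": max(range(len(scores)), key=scores.__getitem__)}


def detection_postprocess(boxes, threshold=0.5):
    """Keep only boxes whose confidence (last element) passes the threshold."""
    return {"boxes": [b for b in boxes if b[-1] >= threshold]}


# Map each model name to its own post-processing function.
POSTPROCESSORS = {
    "model1": classification_postprocess,
    "model2": detection_postprocess,
}


class SharedService(object):
    """One service class that picks its pipeline from the model name."""

    def __init__(self, model_name):
        # Assumption: the model name is available when the service is created.
        if model_name not in POSTPROCESSORS:
            raise ValueError("no pipeline registered for %r" % model_name)
        self.model_name = model_name
        self._postprocess = POSTPROCESSORS[model_name]

    def inference(self, raw_output):
        # A real service would run preprocessing and the model first;
        # here raw_output simply stands in for the model's output.
        return self._postprocess(raw_output)
```

With this layout, SharedService("model1") and SharedService("model2") run different post-processing even though they share one class, so a single service file could cover both the classification and the detection model.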

vamshidhardk · October 30, 2018, 3:30am

@christian_b

We have a new version of MMS on our GitHub. You should definitely check this version out. It’s very flexible and highly scalable compared to the previous version. Do let us know if you would like to check it out.
