Gluon-cv inference example // can't get a gluon-cv model zoo model loaded in mms

Hi,

I am trying to get a fine-tuned ResNet50_v1 model running in MXNet Model Server (MMS), but loading the model fails with an error.

An overview of the steps:

  1. gluoncv.model_zoo.get_model('ResNet50_v1', pretrained=True)
  2. I change the output layer (more classes), initialize the new layer and hybridize the model.
  3. ctx = [mx.cpu(0)]
  4. gluoncv.utils.export_block(model_name, net, preprocess=True, layout='HWC')

=> GluonCV’s export_block function is special in that it adds two pre-processing layers right after the data layer. See extract of the model structure file below.

Now I’m hosting this model in AWS Labs MMS server, running in a docker container. I use the example mxnet_vision_service python wrapper, no modifications.

When I start the container and it calls the python code to load my model, I systematically get the following error:

"
Backend worker process die.
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/mms/model_service_worker.py", line 167, in <module>
worker.run_server()
File "/usr/local/lib/python2.7/dist-packages/mms/model_service_worker.py", line 150, in run_server
self.handle_connection(cl_socket)
File "/usr/local/lib/python2.7/dist-packages/mms/model_service_worker.py", line 116, in handle_connection
service, result, code = self.load_model(msg)
File "/usr/local/lib/python2.7/dist-packages/mms/model_service_worker.py", line 96, in load_model
service = model_loader.load(model_name, model_dir, handler, gpu, batch_size)
File "/usr/local/lib/python2.7/dist-packages/mms/model_loader.py", line 126, in load
entry_point(None, service.context)
File "/home/model-server/tmp/models/26f63b91906d31c8c71285eb2fb1095a35220e26/mxnet_vision_service.py", line 84, in handle
_service.initialize(context)
File "/home/model-server/tmp/models/26f63b91906d31c8c71285eb2fb1095a35220e26/mxnet_model_service.py", line 96, in initialize
self.mx_model.bind(for_training=False, data_shapes=data_shapes)
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/module.py", line 429, in bind
state_names=self._state_names)
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 279, in __init__
self.bind_exec(data_shapes, label_shapes, shared_group)
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 375, in bind_exec
shared_group))
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 662, in _bind_ith_exec
shared_buffer=shared_data_arrays, **input_shapes)
File "/usr/local/lib/python2.7/dist-packages/mxnet/symbol/symbol.py", line 1528, in simple_bind
raise RuntimeError(error_msg)
RuntimeError: simple_bind error. Arguments:
data: (1, 3, 224, 224)
Error in operator _defaultpreprocess0_broadcast_minus0: [18:04:44] src/operator/tensor/./elemwise_binary_broadcast_op.h:68: Check failed: l == 1 || r == 1 operands could not be broadcast together with shapes [1,3,224,224] [1,1,1,3]
"

The failing layer subtracts the mean (R,G,B) value from each of the 224x224 pixels.
While typing this I see that the shapes don't match: to broadcast, either the data should be [1,224,224,3] or the mean should be [1,3,1,1]. Let's check that first.
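To make the shape mismatch concrete, here is a minimal pure-Python sketch of the per-axis rule quoted in the error message (two axis lengths are compatible when they are equal or one of them is 1). The function names are my own and this is not MXNet code; it only assumes both shapes have the same number of axes, as they do in the traceback.

```python
def broadcast_shape(lhs, rhs):
    """Return the broadcast result shape, or raise ValueError.

    Illustrative helper (not part of MXNet): axis lengths are compatible
    when they are equal or when one of them is 1.
    """
    if len(lhs) != len(rhs):
        raise ValueError("rank mismatch: %s vs %s" % (lhs, rhs))
    out = []
    for axis, (l, r) in enumerate(zip(lhs, rhs)):
        if l != r and l != 1 and r != 1:
            raise ValueError(
                "axis %d: %d and %d cannot be broadcast together" % (axis, l, r)
            )
        # When one side is 1, the result takes the other side's length.
        out.append(r if l == 1 else l)
    return tuple(out)


def can_broadcast(lhs, rhs):
    """True when the two shapes satisfy the rule above."""
    try:
        broadcast_shape(lhs, rhs)
        return True
    except ValueError:
        return False


mean_shape = (1, 1, 1, 3)  # shape of _defaultpreprocess0_init_mean

# Channels-first input, as in the traceback: fails on the last axis (224 vs 3).
print(can_broadcast((1, 3, 224, 224), mean_shape))  # False

# Channels-last input: every axis is either equal or paired with a 1.
print(can_broadcast((1, 224, 224, 3), mean_shape))  # True
```

So with the mean stored as (1, 1, 1, 3), only a channels-last input can pass through the subtraction.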

My main question:

For exporting the model I’m just following this example:
https://gluon-cv.mxnet.io/build/examples_deployment/export_network.html

Is there a good example of a python inference server that can load GluonCV models exported using this (default) method?

thanks,
Lieven

Extract from the net-symbol.json file, starting at line 1:

{
  "nodes": [
    {
      "op": "null",
      "name": "data",
      "inputs": []
    },
    {
      "op": "null",
      "name": "_defaultpreprocess0_init_mean",
      "attrs": {
        "__dtype__": "0",
        "__init__": "Constant__defaultpreprocess0_init_mean_140368577460824",
        "__lr_mult__": "1.0",
        "__shape__": "(1, 1, 1, 3)",
        "__storage_type__": "0",
        "__wd_mult__": "1.0"
      },
      "inputs": []
    },
    {
      "op": "broadcast_sub",
      "name": "_defaultpreprocess0_broadcast_minus0",
      "inputs": [[0, 0, 0], [1, 0, 0]]
    },
    {
      "op": "null",
      "name": "_defaultpreprocess0_init_scale",
      "attrs": {
        "__dtype__": "0",
        "__init__": "Constant__defaultpreprocess0_init_scale_140368577460600",
        "__lr_mult__": "1.0",
        "__shape__": "(1, 1, 1, 3)",
        "__storage_type__": "0",
        "__wd_mult__": "1.0"
      },
      "inputs": []
    },
    {
      "op": "broadcast_div",
      "name": "_defaultpreprocess0_broadcast_div0",
      "inputs": [[2, 0, 0], [3, 0, 0]]
    },
    {
      "op": "transpose",
      "name": "_defaultpreprocess0_transpose0",
      "attrs": {"axes": "(0, 3, 1, 2)"},
      "inputs": [[4, 0, 0]]
    },
    {
      "op": "null",
      "name": "resnetv10_conv0_weight",
      "attrs": {
        "__dtype__": "0",
        "__lr_mult__": "1.0",
        "__shape__": "(64, 3, 7, 7)",
        "__storage_type__": "0",
        "__wd_mult__": "1.0"
      },
      "inputs": []
    },
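For anyone who wants to check their own export, a small sketch like the one below can list the pre-processing steps without reading the JSON by hand. It is only an illustration: the embedded JSON is a cut-down, closed copy of the nodes in the extract (so that it parses), the helper name summarize_preprocess is made up, and I am assuming the attribute keys are "__shape__" (with plain "shape" as a fallback) and "axes".

```python
import ast
import json

# Cut-down copy of the pre-processing nodes from the extract, closed so it parses.
SYMBOL_JSON = """
{
  "nodes": [
    {"op": "null", "name": "data", "inputs": []},
    {"op": "null", "name": "_defaultpreprocess0_init_mean",
     "attrs": {"__shape__": "(1, 1, 1, 3)"}, "inputs": []},
    {"op": "broadcast_sub", "name": "_defaultpreprocess0_broadcast_minus0",
     "inputs": [[0, 0, 0], [1, 0, 0]]},
    {"op": "null", "name": "_defaultpreprocess0_init_scale",
     "attrs": {"__shape__": "(1, 1, 1, 3)"}, "inputs": []},
    {"op": "broadcast_div", "name": "_defaultpreprocess0_broadcast_div0",
     "inputs": [[2, 0, 0], [3, 0, 0]]},
    {"op": "transpose", "name": "_defaultpreprocess0_transpose0",
     "attrs": {"axes": "(0, 3, 1, 2)"}, "inputs": [[4, 0, 0]]}
  ]
}
"""


def summarize_preprocess(symbol, prefix="_defaultpreprocess"):
    """List (kind, detail) for each pre-processing node, in graph order."""
    summary = []
    for node in symbol["nodes"]:
        if not node["name"].startswith(prefix):
            continue
        attrs = node.get("attrs", {})
        if node["op"] == "null":
            # Constant input (mean or scale): report its stored shape.
            shape = attrs.get("__shape__", attrs.get("shape"))
            summary.append(("constant", ast.literal_eval(shape)))
        elif node["op"] == "transpose":
            summary.append(("transpose", ast.literal_eval(attrs["axes"])))
        else:
            summary.append((node["op"], None))
    return summary


steps = summarize_preprocess(json.loads(SYMBOL_JSON))
for step in steps:
    print(step)
```

The constants being shaped (1, 1, 1, 3) and the transpose coming last are the two things to look for: together they suggest the graph wants its input with channels on the last axis.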

I figured out the problem and the solution, and I'm adding it here in the hope that it saves other people some time.

When constructing an MMS model file you need to provide a signature.json. This file describes the type of data the model expects. The example file is:

{
  "inputs": [
    {
      "data_name": "input_0",
      "data_shape": [
        0,
        3,
        224,
        224
      ]
    }
  ]
}

The default data_shape makes perfect sense, but doesn’t work for a model exported with the Gluon CV export_block function.

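The preprocessing layers that the export_block function adds to the model not only subtract the mean and divide by the standard deviation, but also transpose the data from (B,H,W,C) to (B,C,H,W), as the transpose node with axes (0, 3, 1, 2) in the extract shows. The exported model therefore expects channels-last input.

So in order to make the GluonCV pre-processing layers bind successfully when loading the model in MMS, we need to set the input image shape in signature.json to (1,H,W,C), for example 224, 224, 3 in place of 3, 224, 224 after the batch dimension.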
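As a rough sketch of the fix, the snippet below builds a channels-last signature and checks what shape the transpose would hand to the first convolution. The helper names are made up for illustration. I am assuming the input is named "data" (the node name in the symbol file, which may differ from the "input_0" in the example signature) and I copy the leading 0 for the batch dimension from the example file.

```python
import json


def nhwc_signature(height, width, channels=3, data_name="data"):
    """Build a signature dict with a channels-last data_shape.

    Hypothetical helper: the leading 0 mirrors the example signature.json.
    """
    return {
        "inputs": [
            {
                "data_name": data_name,
                "data_shape": [0, height, width, channels],
            }
        ]
    }


def transpose_shape(shape, axes):
    """Shape after a transpose: output axis i takes input axis axes[i]."""
    return tuple(shape[a] for a in axes)


signature = nhwc_signature(224, 224)
print(json.dumps(signature, indent=2))

# What the first convolution receives once the exported transpose
# (axes 0, 3, 1, 2) has been applied to a single 224x224 RGB image.
conv_input = transpose_shape((1, 224, 224, 3), (0, 3, 1, 2))
print(conv_input)  # (1, 3, 224, 224)
```

The second print matches the (64, 3, 7, 7) weight of resnetv10_conv0, which expects 3 input channels on axis 1.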

thanks for posting your solution @lgo!