Quantization for object detection


I am trying to adapt the quantization example at https://github.com/apache/incubator-mxnet/tree/master/example/quantization to an object detection scenario, using the vgg16_reduced model from the (now deprecated) mxnet-ssd GitHub repository.

Is this possible? The quantization process itself runs without problems, but when I try to load the quantized model with the usual sequence (mx.model.load_checkpoint, mx.mod.Module, mod.bind) I receive an error: Check failed: data shape[C] % 4 == 0U (3 vs 0) for 8bit cudnn conv, the number of channel must be multiple of 4.

I compared the original inference example with mine, and the data shapes seem pretty similar: (3, 244, 244) vs. (3, 300, 300).

The current implementation is based on cuDNN int8 convolution, which requires the input to a quantized conv op to have a channel count that is a multiple of 4. The network input has 3 channels, so the first conv layer fails this check. Normally you skip generating a quantized conv op for the first conv layer during quantization by listing its name in excluded_sym_names, as the quantization example linked above does.
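To make the constraint concrete, here is a minimal sketch. The first helper simply mirrors the modulo test from the error message (3 % 4 is 3, which is compared against 0). The second function, quantize_excluding, is a hypothetical wrapper and not part of MXNet; it assumes the MXNet 1.x mxnet.contrib.quantization.quantize_model API and uses calib_mode='none' only to keep the sketch short. The layer name 'conv1_1' in the usage note is an assumption for vgg16_reduced and should be verified against your own symbol file.

```python
def int8_conv_channels_ok(data_shape, channel_axis=1):
    """Mirror the check from the error message: C % 4 == 0 (NCHW layout assumed)."""
    return data_shape[channel_axis] % 4 == 0


def quantize_excluding(prefix, epoch, excluded_sym_names, ctx=None):
    """Hypothetical wrapper: quantize a checkpoint but keep the named layers in fp32.

    Assumes the MXNet 1.x contrib quantization API. No calibration is done
    (calib_mode='none'); a real run would normally pass calibration data.
    """
    # Imported lazily so the helper above can be used without MXNet installed.
    import mxnet as mx
    from mxnet.contrib.quantization import quantize_model

    if ctx is None:
        ctx = mx.gpu(0)
    sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)
    # Returns (quantized symbol, quantized arg_params, aux_params).
    return quantize_model(
        sym=sym,
        arg_params=arg_params,
        aux_params=aux_params,
        ctx=ctx,
        excluded_sym_names=excluded_sym_names,
        calib_mode='none',
    )


print(int8_conv_channels_ok((1, 3, 300, 300)))   # False: 3 % 4 == 3
print(int8_conv_channels_ok((1, 64, 300, 300)))  # True: 64 % 4 == 0
```

With a checkpoint on disk, the call would look something like quantize_excluding('ssd_vgg16_reduced_300', 0, ['conv1_1']), where both the prefix and the layer name are placeholders.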

You can open the model json file and check the node name of the first conv layer.
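If you prefer not to read the file by hand, the lookup can be sketched with the standard json module. This assumes the usual symbol file layout (a top-level "nodes" list whose entries carry "op" and "name") and that nodes are stored in topological order, so the first Convolution entry is normally the one fed by the input. The function name first_conv_name and the tiny example graph are made up for illustration.

```python
import json


def first_conv_name(symbol_json_text):
    """Return the name of the first Convolution node in a symbol JSON string."""
    graph = json.loads(symbol_json_text)
    for node in graph["nodes"]:
        if node["op"] == "Convolution":
            return node["name"]
    return None  # no convolution found


# Tiny hand-written graph in the symbol JSON layout (not a real model).
example = json.dumps({"nodes": [
    {"op": "null", "name": "data", "inputs": []},
    {"op": "null", "name": "conv1_1_weight", "inputs": []},
    {"op": "Convolution", "name": "conv1_1", "inputs": [[0, 0, 0], [1, 0, 0]]},
    {"op": "Activation", "name": "relu1_1", "inputs": [[2, 0, 0]]},
]})
print(first_conv_name(example))  # conv1_1
```

For a real model, read the text from the <prefix>-symbol.json file saved next to the checkpoint and pass it to the function; the returned name is what goes into excluded_sym_names.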

I wonder which layers should be excluded when quantizing. In the /example/ssd/quantization.py example, some flatten and concat layers are listed in excluded_sym_names. Should other layers, such as transpose, slice, zeros_like, etc., also be excluded?