How to modify layers in a pretrained object detection model?

Greetings everyone,

I am trying to replace the convolution layers in the feature extraction block of an SSD network, and if possible also in the auxiliary layers of the SSD network, with the deformable convolution layers in [1].

However, there seems to be no clear direction on how to do this. Is someone able to advise on how to do so?

[1]https://mxnet.incubator.apache.org/versions/1.5.0/api/python/gluon/contrib.html#mxnet.gluon.contrib.cnn.DeformableConvolution

You can iterate through the layers and replace them:

import mxnet as mx
from mxnet import gluon

def replace_conv2D(net):
    # walk the registered child blocks and swap out every Conv2D
    for key, layer in net._children.items():
        if isinstance(layer, gluon.nn.Conv2D):
            # build a replacement Conv2D (this example halves the channel count)
            new_conv = gluon.nn.Conv2D(
                channels=layer._channels // 2,
                kernel_size=layer._kwargs['kernel'],
                strides=layer._kwargs['stride'],
                padding=layer._kwargs['pad'],
                in_channels=layer._in_channels // 2)
            # re-register under the same name to overwrite the old layer
            with net.name_scope():
                net.register_child(new_conv, key)
            new_conv.initialize(mx.init.Xavier())
        else:
            # not a Conv2D: recurse into this block's children
            replace_conv2D(layer)

net = gluon.model_zoo.vision.get_model("resnet18_v1", pretrained=True)
replace_conv2D(net)
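
Since your goal is DeformableConvolution specifically, the same pattern should carry over with the contrib layer from your link; here is a minimal sketch, assuming you want to keep the original channel counts (the halving above is only an example). Untested on SSD, so treat it as a starting point:

from mxnet.gluon.contrib.cnn import DeformableConvolution

def replace_with_deformable(net):
    for key, layer in net._children.items():
        if isinstance(layer, gluon.nn.Conv2D):
            # build a deformable layer with the same shape settings as the original Conv2D
            new_conv = DeformableConvolution(
                channels=layer._channels,
                kernel_size=layer._kwargs['kernel'],
                strides=layer._kwargs['stride'],
                padding=layer._kwargs['pad'],
                in_channels=layer._in_channels)
            with net.name_scope():
                net.register_child(new_conv, key)
            new_conv.initialize(mx.init.Xavier())
        else:
            replace_with_deformable(layer)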

You can check these threads for more details: "Modifying pre-trained gluon model zoo" and "Modify structure of loaded network".


Hi @NRauschmayr, thank you very much for replying to my post!

I have tried the replacement on a resnet18_v1 model using the two links you gave, and could verify in the model printout that the convolutional layers were replaced.

However, with the SSD_resnet50 model I am using, I could not verify that a replacement of the convolutional layers had occurred.
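
For reference, the printouts below come from printing the model before and after running the replace function, along these lines (a sketch only; the model name shown is just an example GluonCV SSD-ResNet50 variant):

from gluoncv import model_zoo

# example model name for illustration; substitute the SSD variant actually in use
ssd_net = model_zoo.get_model('ssd_512_resnet50_v1_voc', pretrained=True)
print(ssd_net)   # "before" printout
# ... run the layer-replacement function on ssd_net ...
print(ssd_net)   # "after" printout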

Before replacing Conv2D with DeformableConvolution in ResNet (truncated for readability):

ResNetV1(
  (features): HybridSequential(
    (0): Conv2D(3 -> 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=64)
    (2): Activation(relu)
    (3): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
    )

After replacing Conv2D with DeformableConvolution in ResNet (truncated for readability):

ResNetV1(
  (features): HybridSequential(
    (0): DeformableConvolution(None -> 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
    (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=64)
    (2): Activation(relu)
    (3): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
    )

Before replacing Conv2D with DeformableConvolution in SSD_resnet50 (truncated and compacted for readability):

SSD(
  (features): FeatureExpander(
  <Symbol group [ssd0_resnetv10_stage3_activation5, ssd0_resnetv10_stage4_activation2, ssd0_expand_reu0, ssd0_expand_reu1, ssd0_expand_reu2, ssd0_expand_reu3]> : 1 -> 6
  )
  (class_predictors): HybridSequential(
    (0): ConvPredictor(
      (predictor): Conv2D(1024 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (1): ConvPredictor((predictor): Conv2D(2048 -> 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (2): ConvPredictor((predictor): Conv2D(512 -> 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (3): ConvPredictor((predictor): Conv2D(512 -> 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (4): ConvPredictor((predictor): Conv2D(256 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (5): ConvPredictor((predictor): Conv2D(256 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
  )
  (box_predictors): HybridSequential(
    (0): ConvPredictor((predictor): Conv2D(1024 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (1): ConvPredictor((predictor): Conv2D(2048 -> 24, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (2): ConvPredictor((predictor): Conv2D(512 -> 24, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (3): ConvPredictor((predictor): Conv2D(512 -> 24, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (4): ConvPredictor((predictor): Conv2D(256 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (5): ConvPredictor((predictor): Conv2D(256 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))

After replacing Conv2D using the approach from the first linked thread:

SSD(
  (features): FeatureExpander(
  <Symbol group [ssd0_resnetv10_stage3_activation5, ssd0_resnetv10_stage4_activation2, ssd0_expand_reu0, ssd0_expand_reu1, ssd0_expand_reu2, ssd0_expand_reu3]> : 1 -> 6
  )
  (class_predictors): HybridSequential(
    (0): ConvPredictor(
      (predictor): Conv2D(1024 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (1): ConvPredictor((predictor): Conv2D(2048 -> 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (2): ConvPredictor((predictor): Conv2D(512 -> 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (3): ConvPredictor((predictor): Conv2D(512 -> 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (4): ConvPredictor((predictor): Conv2D(256 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (5): ConvPredictor((predictor): Conv2D(256 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
  )
  (box_predictors): HybridSequential(
    (0): ConvPredictor((predictor): Conv2D(1024 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (1): ConvPredictor((predictor): Conv2D(2048 -> 24, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (2): ConvPredictor((predictor): Conv2D(512 -> 24, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (3): ConvPredictor((predictor): Conv2D(512 -> 24, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (4): ConvPredictor((predictor): Conv2D(256 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
    (5): ConvPredictor((predictor): Conv2D(256 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))

As observed, there is no difference, so I cannot verify whether the ResNet feature extractor had its convolutional layers replaced.
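
A rough way to see why nothing changes is to print what the recursion can actually reach, i.e. the registered child blocks and their types (sketch only, using ssd_net from the snippet above):

def print_children(block, indent=0):
    # walk the registered Gluon children and print their types
    for name, child in block._children.items():
        print(' ' * indent + '{}: {}'.format(name, type(child).__name__))
        print_children(child, indent + 2)

print_children(ssd_net)

If the FeatureExpander shows up with no Conv2D children underneath it, that would suggest the backbone convolutions live inside its symbol rather than being exposed as Gluon blocks, which would explain why the replacement never reaches them.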

After replacing Conv2D using the approach from the second linked thread (truncated and compacted for readability):

features FeatureExpander(
<Symbol group [ssd2_resnetv10_stage3_activation5, ssd2_resnetv10_stage4_activation2, ssd2_expand_reu0, ssd2_expand_reu1, ssd2_expand_reu2, ssd2_expand_reu3]> : 1 -> 6
)

Recursing instead

class_predictors HybridSequential(
  (0): ConvPredictor((predictor): Conv2D(1024 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
  (1): ConvPredictor((predictor): Conv2D(2048 -> 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
  (2): ConvPredictor((predictor): Conv2D(512 -> 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
  (3): ConvPredictor((predictor): Conv2D(512 -> 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
  (4): ConvPredictor((predictor): Conv2D(256 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
  (5): ConvPredictor((predictor): Conv2D(256 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
)

Recursing instead
0 ConvPredictor((predictor): Conv2D(1024 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))

Recursing instead
predictor Conv2D(1024 -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
replacing layer

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-13-14db52908be1> in <module>()
     28             replace_conv2D(layer)
     29 
---> 30 replace_conv2d(net)
     31 #print("After")
     32 #print(net)

4 frames

/usr/local/lib/python3.6/dist-packages/mxnet/gluon/block.py in __setattr__(self, name, value)
    197                 raise TypeError('Changing attribute type for {name} from {type1} to {type2}' \
    198                                 'is not allowed.'.format(
--> 199                                     name=name, type1=type(existing), type2=type(value)))
    200 
    201         if isinstance(value, Block):

TypeError: Changing attribute type for predictor from <class 'mxnet.gluon.nn.conv_layers.Conv2D'> to 
<class 'mxnet.gluon.contrib.cnn.conv_layers.DeformableConvolution'>is not allowed.	

It seems that the predictor blocks do not allow replacement, but I am fine with that. However, I still cannot confirm whether the convolutional layers in the feature extraction block are replaced with their deformable counterparts.
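
A rough way to check this programmatically, instead of eyeballing the printout, is to count layer types among the registered children (sketch only):

from mxnet import gluon
from mxnet.gluon.contrib.cnn import DeformableConvolution

def count_blocks(block, cls):
    # recursively count registered child blocks that are instances of cls
    n = int(isinstance(block, cls))
    for child in block._children.values():
        n += count_blocks(child, cls)
    return n

print('Conv2D:               ', count_blocks(ssd_net, gluon.nn.Conv2D))
print('DeformableConvolution:', count_blocks(ssd_net, DeformableConvolution))

If the DeformableConvolution count stays at zero after running the replacement, nothing that is registered as a Gluon block was actually swapped.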