Which conv layer should be selected as the last conv layer in gradcam example when using resnet50v2?

I followed the “Visualizing Decisions of Convolutional Neural Networks” tutorial, and it gives the correct output images with the pretrained VGG16 model. Then I changed the network to ResNet50v2 with its pretrained model, but the output images look abnormal. Some code snippets are as follows.

    # ResNetV2 using gradcam's Conv2D and Activation
    net = ResNetV2(BottleneckV2, layers, channels, **kwargs)
    net.initialize(ctx=ctx)

    resnet50v2 = mx.gluon.model_zoo.vision.resnet50_v2()
    # load pretrained parameters
    resnet50v2.load_parameters('D:/Model/mxnet/models/resnet50_v2-ecdde353.params', ctx=ctx)
    params = resnet50v2.collect_params()
    for key in params:
        param = params[key]
        net.collect_params()[net.prefix + key.replace(resnet50v2.prefix, '')].set_data(param.data())

    # ...
    # last conv layer of the final bottleneck in stage 4
    last_conv_layer_name = net.features[8][2].conv3.name
    show_images(*visualize(net, "hummingbird.jpg", last_conv_layer_name))

[Figure_1: the upper row uses the ResNet pretrained model, the lower row uses the VGG16 model]

Update on 23rd Jan
I tried outputting the imggrad without recording the gradient of the conv layer, since the detail is recovered using the image gradient alone; in this setup only the Activation (ReLU) layer is rewritten. The result is correct with VGG16, but with ResNet the imggrad is still unclear.

Hi @7oud,

A ResNet is a collection of residual blocks, and a residual block computes only the ‘change’ (i.e. the residual) to the overall feature map. So I think you’re just showing the pixels that contribute to the largest change in that specific residual block, which isn’t really meaningful or useful. You need to be working with the overall feature map.
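To make the point above concrete, here is a toy numpy sketch (the scalar weight `w` is a hypothetical stand-in for the conv branch, not anything from this thread): a residual block computes `y = x + F(x)`, so the identity path carries a gradient of 1, and the gradient seen inside the branch alone is only a small fraction of the total.

```python
import numpy as np

# Toy residual block: y = x + F(x), with F(x) = w * x standing in for the conv branch.
x = np.array([1.0, 2.0, 3.0])
w = 0.1
y = x + w * x            # residual block output

# Gradient of y w.r.t. x through the FULL block: 1 + w (identity path + residual path)
full_grad = np.full_like(x, 1.0 + w)

# Gradient seen if you only tap the residual branch F(x): just w
branch_grad = np.full_like(x, w)

# The identity path dominates, so gradients taken inside one residual
# branch (e.g. at conv3 of a bottleneck) miss most of the signal.
print(full_grad, branch_grad)
```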

You could try applying GradCam just before the GlobalAvgPool2D.

Also, for networks where the output of the GlobalAvgPool2D corresponds to class logits (i.e. before softmax), you can just visualise the feature maps that are the input to GlobalAvgPool2D directly, which avoids the need for GradCam. Just select the channel of the feature map that corresponds to the class you’re interested in. In this resnet50_v2 network, though, it looks like there’s a dense layer at the end, which breaks the correspondence between channel and class.
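A minimal numpy sketch of that shortcut, assuming the GAP output really is the logits (no dense layer after it); the shapes and values here are made up for illustration:

```python
import numpy as np

# If GlobalAvgPool2D output == logits, the heatmap for class c is
# simply channel c of the pre-GAP feature map.
C, H, W = 4, 7, 7
fmap = np.random.rand(C, H, W)      # feature map entering GlobalAvgPool2D
logits = fmap.mean(axis=(1, 2))     # what GlobalAvgPool2D would output
c = int(np.argmax(logits))          # predicted class
heatmap = fmap[c]                   # spatial evidence for that class
print(heatmap.shape)  # (7, 7)
```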

@thomelane Thanks for the explanation!
There is a problem with using GradCam before the GlobalAvgPool: gradcam.visualize() cannot take the name of a non-conv layer, because only Conv2D is rewritten. Can I use the resnet model without modifying gradcam.py?
As you said, the feature map before GAP can be used to visualize the heatmap, but if I want to visualize the saliency map (the 4th picture), GradCam is still needed.

I think you’ll have to modify the gradcam.py script and overwrite the GlobalAvgPool2D operator, but it shouldn’t be too tricky if you copy the form of the Conv2D. @indu, can you confirm this?

@thomelane The output of GlobalAvgPool2D has 1x1 shape; its gradient and output are too small to recover the image activation. Maybe the last Add layer should be used instead, am I right? But how do I rewrite the add op?
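For intuition on why the gradient through GlobalAvgPool2D carries so little spatial information: backprop through an average spreads the incoming gradient uniformly as 1/(H*W) over every position, so on its own it cannot localise anything. A small numpy check (the 7x7 shape is illustrative):

```python
import numpy as np

H, W = 7, 7
x = np.random.rand(H, W)
y = x.mean()                       # GlobalAvgPool2D on one channel

# d y / d x[i, j] = 1 / (H * W) at every position: a flat gradient map.
grad = np.full((H, W), 1.0 / (H * W))

# Numerical check at one position via finite differences
eps = 1e-6
x2 = x.copy()
x2[0, 0] += eps
num = (x2.mean() - y) / eps
print(grad[0, 0], num)  # both ~ 1/49
```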

Hi @7oud,

Were you able to fix this problem?