Can BlockGrad and weight decay be used together?

hurrdurrrderp · April 7, 2018, 3:59pm

I want to use a pre-trained model such as ResNet or VGG, freeze all convolutional layers, and replace the fully connected layers on top of them. I’m pretty sure the features that the convolutional layers provide are good for my task, so I put mx.sym.BlockGrad between the two parts. I also don’t want my model to overfit, so I set some weight decay in the optimizer. Am I correct that this will eventually drive all weights of the convolutional layers to zero? If so, how can I apply weight decay only to some layers?
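The concern can be made concrete with a small sketch. It assumes the plain SGD rule with L2 weight decay, w <- w - lr * (grad + wd * w), and mimics a layer behind BlockGrad by fixing the gradient at zero. The values of lr and wd and the helper name sgd_step are illustrative assumptions, not taken from any real training setup.

```python
# Sketch: plain SGD with L2 weight decay, w <- w - lr * (grad + wd * w).
# The gradient is held at zero to mimic a layer sitting behind BlockGrad.

def sgd_step(w, grad, lr, wd):
    """One SGD update with weight decay folded into the gradient."""
    return w - lr * (grad + wd * w)

lr, wd = 0.1, 0.01   # illustrative values, not from the thread
steps = 1000
w = 1.0              # a single "frozen" weight

for _ in range(steps):
    w = sgd_step(w, grad=0.0, lr=lr, wd=wd)

# With a zero gradient, every step multiplies w by (1 - lr * wd),
# so the weight shrinks geometrically even though nothing is learned.
closed_form = (1 - lr * wd) ** steps
print(w, closed_form)
```

Under these assumptions the weight keeps shrinking toward zero as long as the optimizer still updates it, which is why blocking the gradient alone would not protect the pre-trained layers.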

indu · April 7, 2018, 6:38pm

You can use the fixed_param_names parameter when creating the Module. It takes a list of the parameter names you want to freeze, and you can build that list by matching a regex against the symbol's argument names. Check this example.
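A minimal sketch of building such a list with a regex follows. The argument names and the helper select_fixed_params are hypothetical; in practice the names would come from the pre-trained symbol (for instance via list_arguments()), and the pattern would need to match that model's naming scheme. Frozen parameters should then be skipped by the optimizer entirely, so weight decay would not shrink them.

```python
import re

def select_fixed_params(arg_names, pattern):
    """Return the parameter names that fully match `pattern`."""
    regex = re.compile(pattern)
    return [name for name in arg_names if regex.fullmatch(name)]

# Hypothetical argument names, standing in for what the symbol reports.
arg_names = [
    "data",
    "conv1_weight", "conv1_bias",
    "conv2_weight", "conv2_bias",
    "fc_new_weight", "fc_new_bias",
    "softmax_label",
]

# Freeze every convolutional weight and bias; leave the new FC layer trainable.
fixed = select_fixed_params(arg_names, r"conv.*_(weight|bias)")
print(fixed)

# The list would then be handed to the module, for example:
#   mod = mx.mod.Module(symbol=net, fixed_param_names=fixed)
```

With this approach the weight decay set on the optimizer only affects the parameters left out of the list, here the replaced fully connected layer.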
