How to train specific layers with different learning rates using Gluon?

Hi!
It’s actually two questions.

  1. For example, I want to train the first conv layer while the other layers are frozen. However, it seems that I can only keep the base layers unchanged while training the last layer:

    for X, y in train_data:
        y = y.astype('float32').as_in_context(ctx)
        # features come from the frozen base, outside of autograd.record()
        output_features = net.features(X.as_in_context(ctx))
        with autograd.record():
            # only the output layer's computation is recorded for gradients
            outputs = net.output(output_features)
            l = loss(outputs, y)
        l.backward()
        trainer.step(batch_size)
        train_l += l.mean().asscalar()

  2. Is there any way to train different blocks of a net with different learning rates? For example, the lr of the output part stays at lr, while the middle layers of the model are trained with lr/3, and the base is trained with an even smaller lr/10. I know that in fast.ai it’s very easy to do so:

    lrs = np.array([lr/10, lr/3, lr])
    learner.fit(lrs/4, 4, cycle_len=2, use_clr=(10, 20))

How can I do the above tasks?
Thanks a lot!

Hi @JWarlock,

Just spotted this question (which you’ve probably solved already!), but since it’s related to another question I just asked here, I can give you a simple example for both scenarios.

import mxnet as mx

net = mx.gluon.nn.HybridSequential()
net.add(mx.gluon.nn.Conv2D(channels=3, kernel_size=3))
net.add(mx.gluon.nn.Conv2D(channels=4, kernel_size=3))
net.add(mx.gluon.nn.Dense(units=5))

# 'freezing' the 1st Conv2D layer
for param in net[0].collect_params().values():
    param.grad_req = 'null'
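# (to train *only* the 1st layer instead, as in question 1, flip this
#  around: freeze everything first, then re-enable gradients on net[0];
#  a sketch, not from the original reply:)
# for param in net.collect_params().values():
#     param.grad_req = 'null'
# for param in net[0].collect_params().values():
#     param.grad_req = 'write'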

# use 1/2 of the base learning rate for the 2nd Conv2D layer
for param in net[1].collect_params().values():
    param.lr_mult = 0.5
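
The effective learning rate of each parameter is the Trainer's base rate multiplied by its lr_mult, so the fast.ai-style lr/10, lr/3, lr scheme from the question maps onto lr_mult values of 0.1, 1/3, and the default 1.0. Here is a minimal end-to-end sketch of that, assuming MXNet 1.x with plain SGD; the lr value and the dummy data are just placeholders:

import mxnet as mx
from mxnet import autograd, gluon, nd

lr = 0.1  # placeholder base learning rate

net = gluon.nn.HybridSequential()
net.add(gluon.nn.Conv2D(channels=3, kernel_size=3))  # 'base'
net.add(gluon.nn.Conv2D(channels=4, kernel_size=3))  # 'middle'
net.add(gluon.nn.Dense(units=5))                     # 'output'
net.initialize()

# effective lr per block = base lr * lr_mult
for param in net[0].collect_params().values():
    param.lr_mult = 0.1        # base: lr/10
for param in net[1].collect_params().values():
    param.lr_mult = 1.0 / 3    # middle: lr/3
# net[2] keeps the default lr_mult = 1.0, i.e. the full lr

trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': lr})

# one dummy update to show the whole thing runs
x = nd.random.uniform(shape=(2, 3, 8, 8))
y = nd.random.uniform(shape=(2, 5))
loss_fn = gluon.loss.L2Loss()
with autograd.record():
    l = loss_fn(net(x), y)
l.backward()
trainer.step(batch_size=2)

Parameters frozen with grad_req = 'null' have no gradient arrays and should simply be skipped by the update, so freezing and lr_mult can be combined under the same Trainer.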