Issue about training faster-rcnn using adam

Lowtec-Sam · April 12, 2018, 6:37am

I tried to train my model using Adam by modifying the code as follows:

adam_optimizer = mx.optimizer.Adam()
optimizer_params = {'wd': 0.0,
                    'learning_rate': lr,
                    'lr_scheduler': lr_scheduler,
                    'rescale_grad': (1.0 / batch_size)}
# train
mod.fit(train_data, eval_metric=eval_metrics, epoch_end_callback=epoch_end_callback,
        batch_end_callback=batch_end_callback, kvstore=args.kvstore,
        optimizer=adam_optimizer, optimizer_params=optimizer_params,
        arg_params=arg_params, aux_params=aux_params, begin_epoch=begin_epoch,
        num_epoch=end_epoch)

However, the training loss at the beginning became very large (Train-RPNAcc = 0.843564, RPNLogLoss=1.250399, RPNL1Loss=30.644248, RCNNAcc=0.818452, RCNNLogLoss=2.148393, RCNNL1Loss=90.618619). If I change it back to SGD, the loss values become normal (Train-RPNAcc = 0.904762, RPNLogLoss=0.297273, RPNL1Loss=1.345441, RCNNAcc=0.849330, RCNNLogLoss=0.461176, RCNNL1Loss=1.354734).
It seems that the pretrained model is not loaded correctly when I use Adam.
Why?
Thanks a lot!

safrooze · April 17, 2018, 2:09am

If the only change between the two training scripts is the optimizer, that would have no impact on loading the pre-trained parameters. Adam does require a different set of hyper-parameters than plain SGD, including the learning rate. For example, rescale_grad has little to no impact in Adam because of how the optimization algorithm works: each update is the running mean of the gradient divided by the square root of the running mean of its square, so a constant scale factor on the gradient largely cancels out. I would try reducing the learning_rate by a factor of 10 or more until you see proper convergence behavior, and let Adam “adapt”.
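To illustrate the point about rescale_grad, here is a minimal sketch of the textbook Adam update on a single scalar parameter. It is not MXNet's implementation, and the gradient values and batch size are made up for the example. It only shows that dividing every gradient by a constant leaves the resulting updates essentially unchanged (up to the small epsilon term), and that the first step has a magnitude of roughly the learning rate regardless of how large the gradient is.

```python
import math


def adam_updates(grads, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return the updates textbook Adam applies for a sequence of scalar gradients."""
    m = 0.0  # running mean of the gradient (first moment)
    v = 0.0  # running mean of the squared gradient (second moment)
    updates = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias-corrected first moment
        v_hat = v / (1.0 - beta2 ** t)  # bias-corrected second moment
        updates.append(-lr * m_hat / (math.sqrt(v_hat) + eps))
    return updates


# Hypothetical values, chosen only for illustration.
raw_grads = [30.0, -12.0, 45.0, 8.0]
batch_size = 128
scaled_grads = [g / batch_size for g in raw_grads]  # what rescale_grad would do

raw_updates = adam_updates(raw_grads)
scaled_updates = adam_updates(scaled_grads)

for raw, scaled in zip(raw_updates, scaled_updates):
    print(f"unscaled: {raw:.9f}   rescaled: {scaled:.9f}")
```

Because the step size is set almost entirely by the learning rate, a learning rate that was tuned for SGD can translate into much larger effective steps under Adam, which is why lowering it is the first thing to try.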

Lowtec-Sam · May 24, 2018, 12:45pm

I have tried reducing the learning_rate to 1e-6, but it still doesn’t work.
Is Adam not suitable for this task? It works fine with Caffe, though.
So confused.
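
One thing that may be worth ruling out here (an assumption about MXNet 1.x behavior, not something confirmed in this thread): as far as I recall, Module.fit applies optimizer_params only when the optimizer is given by name as a string. When an already constructed instance such as mx.optimizer.Adam() is passed, those parameters are not applied to it, so the instance keeps its own defaults and a changed learning_rate may never reach it. A minimal sketch of passing Adam by name instead follows; adam_fit_kwargs is a hypothetical helper, and the learning rate shown is only a placeholder.

```python
def adam_fit_kwargs(learning_rate, lr_scheduler=None, wd=0.0):
    """Build the optimizer-related keyword arguments for Module.fit.

    Hypothetical helper: it only assembles a dict, so it runs without MXNet.
    """
    optimizer_params = {"learning_rate": learning_rate, "wd": wd}
    if lr_scheduler is not None:
        optimizer_params["lr_scheduler"] = lr_scheduler
    # Passing the optimizer by name lets the module construct it
    # from optimizer_params itself.
    return {"optimizer": "adam", "optimizer_params": optimizer_params}


kwargs = adam_fit_kwargs(learning_rate=1e-4)  # placeholder value
print(kwargs)

# Intended use inside the training script (other arguments as in the
# original call):
# mod.fit(train_data, eval_metric=eval_metrics, ..., **kwargs)
```

The alternative is to keep the instance and give it the settings directly, for example mx.optimizer.Adam(learning_rate=lr, lr_scheduler=lr_scheduler, wd=0.0), and then check the learning rate printed in the training log to confirm which value is actually in effect.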
