What is the best way of saving a gluon.Trainer? I need to update a pre-trained model by training it on a new dataset; however, if I recreate the Trainer, it would start from the initial learning rate, which is a problem for optimizers such as AdaGrad that adjust the learning rate with respect to frequently occurring features. I could not find a method such as save_params for Trainer, so please let me know if there is an easy way of saving it. Thanks!
From the documentation: you can use save_states(fname) to save your trainer's state, and then load_states(fname) to restore it to its previous configuration.
e.g. (assuming mynet is an initialized gluon network and lr is the learning rate):

from mxnet import gluon

trainer = gluon.Trainer(mynet.collect_params(), 'adam', {'learning_rate': lr})
flname = 'trainer_adam.states'
trainer.save_states(flname)

then restore:

trainer.load_states(flname)
edit: The save command works, but when trying to restore the trainer with load_states I get an error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-13-a93de5be24c4> in <module>()
1 with autograd.record():
----> 2 trainer.load_states(r'../saved_models/resunet-trainer-epoch-41-stats.states')
/home/foivos/mxnet/gluon/trainer.pyc in load_states(self, fname)
224 Path to input states file.
225 """
--> 226 if self._update_on_kvstore:
227 self._kvstore.load_optimizer_states(fname)
228 self._optimizer = self._kvstore._updater.optimizer
AttributeError: 'Trainer' object has no attribute '_update_on_kvstore'
edit 2: Without completely understanding what is going on (I am learning Gluon/mxnet these days), it seems you need to call the step operation at least once in order to “create” the attribute '_update_on_kvstore'. Once I perform at least one trainer.step(Nbatch), loading states works normally. You also need to perform a trainer.step(Nbatch) before saving states for the first time; see the sketch below.
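For reference, a minimal sketch of this workaround; the network, data shapes, loss, and Nbatch below are assumptions for illustration only:

import mxnet as mx
from mxnet import gluon, autograd

# hypothetical small network and dummy batch, just to force one update step
mynet = gluon.nn.Dense(1)
mynet.initialize()
trainer = gluon.Trainer(mynet.collect_params(), 'adam', {'learning_rate': 0.01})

Nbatch = 32
x = mx.nd.random.uniform(shape=(Nbatch, 10))
y = mx.nd.random.uniform(shape=(Nbatch, 1))
loss_fn = gluon.loss.L2Loss()

with autograd.record():
    loss = loss_fn(mynet(x), y)
loss.backward()
trainer.step(Nbatch)  # one update, so '_update_on_kvstore' exists

trainer.save_states('trainer_adam.states')  # now saving works
trainer.load_states('trainer_adam.states')  # ...and so does loading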
edit 3: Updated to the latest version of mxnet, v1.1.0, and now previously saved states load directly without a problem (the error I described above no longer appears). Because of delayed initialization, you still need to make a single forward pass before updating the parameters, so that the optimizer knows the correct dimensions of the layers (I got an error calling trainer.load_states('some_flname.states') without running a single forward pass first). I think it relates to this issue.
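A minimal sketch of that ordering, assuming the same toy network as in the sketch above and a hypothetical states filename:

import mxnet as mx
from mxnet import gluon

mynet = gluon.nn.Dense(1)
mynet.initialize()  # with deferred initialization, parameter shapes are not known yet
trainer = gluon.Trainer(mynet.collect_params(), 'adam', {'learning_rate': 0.01})

_ = mynet(mx.nd.random.uniform(shape=(4, 10)))  # one forward pass triggers shape inference
trainer.load_states('some_flname.states')  # now the optimizer knows the layer dimensions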