Difference between exported and imported model results

I can export my HybridSequential network using the built-in export function. This works fine, but if I then import it again using the import function, its loss suddenly rises for a while for no apparent reason.
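For reference, this is roughly what I'm doing; the toy network, layer sizes, and file names below are just stand-ins for my real setup:

```python
import mxnet as mx
from mxnet import gluon, nd

# Toy stand-in for my real HybridSequential network
net = gluon.nn.HybridSequential()
net.add(gluon.nn.Dense(64, activation="relu"),
        gluon.nn.Dense(1))
net.initialize()
net.hybridize()
net(nd.ones((1, 10)))          # one forward pass so the symbolic graph is cached

# Export: writes mynet-symbol.json and mynet-0000.params
net.export("mynet", epoch=0)

# Import: rebuild the network from the exported files
net2 = gluon.SymbolBlock.imports("mynet-symbol.json", ["data"],
                                 "mynet-0000.params", ctx=mx.cpu())
```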

From what I’ve seen, when I first train the network the loss slowly drops from about 2000 to 600 over several thousand epochs (each epoch has 500 batches). After I export and import it again, the loss is still around 600 for the first batch, but after that the autograd backward pass changes my network’s output, so batches 2-500 each get an increasingly larger loss. By the end of the first epoch the loss is back around 1900, and only then does it start dropping again over several hundred more epochs.

Why doesn’t it just load the model and resume training without going the wrong way first? I’m loading the same learning rate scheduler (I’ve also tried changing that) and the same trainer setup in general, so I’m confused about what it’s doing.

Instead of using export/import, try using save_parameters/load_parameters and check the result.
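Something along these lines; the network and file name here are only an example:

```python
import mxnet as mx
from mxnet import gluon, nd

# Example network; you would use your own HybridSequential here
net = gluon.nn.HybridSequential()
net.add(gluon.nn.Dense(64, activation="relu"),
        gluon.nn.Dense(1))
net.initialize()
net(nd.ones((1, 10)))          # forward pass so deferred initialization completes

# Save only the parameters (no architecture, no trainer/optimizer state)
net.save_parameters("mynet.params")

# To resume, rebuild the same architecture in code, then load the weights back
net.load_parameters("mynet.params", ctx=mx.cpu())
```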

That gives the same problem, sadly.

I have found the solution, though. The trainer/optimizer state is not exported along with the rest of the data unless you explicitly save it with trainer.save_states(), which means the rise in loss after loading happens because the trainer doesn’t know what it was doing (e.g. its momentum terms and update count are reset).
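For anyone hitting the same thing, here is roughly what my checkpointing looks like now; the network, optimizer settings, and file names are just examples:

```python
import mxnet as mx
from mxnet import gluon, nd

# Example setup mirroring my training loop
net = gluon.nn.HybridSequential()
net.add(gluon.nn.Dense(64, activation="relu"),
        gluon.nn.Dense(1))
net.initialize()
trainer = gluon.Trainer(net.collect_params(), "adam", {"learning_rate": 1e-3})
loss_fn = gluon.loss.L2Loss()

# One dummy training step standing in for the real training loop
with mx.autograd.record():
    loss = loss_fn(net(nd.ones((8, 10))), nd.zeros((8, 1)))
loss.backward()
trainer.step(batch_size=8)

# Checkpoint: save the parameters AND the optimizer state
net.save_parameters("mynet.params")
trainer.save_states("mynet.states")

# Resume: rebuild net and trainer exactly as before, then load both
net.load_parameters("mynet.params", ctx=mx.cpu())
trainer.load_states("mynet.states")
```

With both files loaded, training continues from roughly the same loss instead of climbing back up first.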

It’s a bit weird that the official docs/examples don’t mention having to save your trainer states explicitly when they talk about saving and loading models to resume training.