Training gets slower with time

Hello,

I’m trying to train a ResNet with triplet loss. Everything seems to work fine, except that training gets slower and slower after each batch.

You can see how the training slows down here:

Does anyone have an idea why this is happening and how to resolve it?

Thank you in advance for your help!

Lucas

Hi @lucas,

Most likely, either your triplet sampling technique or some metric accumulation you are doing is computationally expensive and becomes more expensive as training goes on, so each iteration of the training loop takes longer. It’s unlikely to be related to MXNet per se, and very likely to be related to the logging, sampling, or metric computation in your training loop.
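One way to check which part is responsible is to time each phase of the loop separately and see which one grows from batch to batch. The sketch below is only an illustration: `time_training_phases`, `sample_fn`, `step_fn`, and `metric_fn` are hypothetical placeholder names for your own sampling, forward/backward, and metric code, not anything from MXNet.

```python
import time
from collections import defaultdict


def time_training_phases(batches, sample_fn, step_fn, metric_fn):
    """Record wall-clock seconds spent in each phase, per batch.

    sample_fn, step_fn and metric_fn are hypothetical stand-ins for the
    triplet sampling, the forward/backward/update step, and the metric or
    logging code of a real training loop.
    """
    timings = defaultdict(list)
    for batch in batches:
        t0 = time.perf_counter()
        triplets = sample_fn(batch)      # e.g. triplet mining
        t1 = time.perf_counter()
        loss = step_fn(triplets)         # forward, backward, optimizer step
        # Note: MXNet executes operations asynchronously, so in a real loop
        # call mx.nd.waitall() here before reading the clock; otherwise the
        # cost of this step may be attributed to a later phase.
        t2 = time.perf_counter()
        metric_fn(loss)                  # metric accumulation / logging
        t3 = time.perf_counter()

        timings["sampling"].append(t1 - t0)
        timings["step"].append(t2 - t1)
        timings["metric"].append(t3 - t2)
    return timings
```

If one of the three lists keeps increasing while the others stay flat, that phase is the one to look at.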

Hello @ThomasDelteil,

Thank you for your reply. It was indeed the metric accumulation in the training loop that was slowing everything down.

Thanks a lot for your help!
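The thread does not show the code that caused the slowdown, so the following is only a guess at the general pattern. A metric that stores every per-batch value and recomputes over the whole history does work proportional to the number of batches seen so far, so each batch costs more than the last. Keeping a running total instead makes every update constant-time. `mean_from_history` and `RunningMean` are hypothetical names used for illustration.

```python
def mean_from_history(history, value):
    """Slow pattern: the list grows forever and is re-summed on every batch."""
    history.append(value)
    return sum(history) / len(history)   # O(number of batches so far)


class RunningMean:
    """Fast pattern: constant work and constant memory per update."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        # With MXNet, convert to a plain Python number first, for example
        # loss.mean().asscalar(), rather than storing NDArrays.
        self.total += float(value)
        self.count += 1

    def get(self):
        return self.total / self.count if self.count else 0.0


# Both approaches give the same mean; only the per-batch cost differs.
history = []
running = RunningMean()
for batch_loss in [1.0, 2.0, 3.0]:
    slow_mean = mean_from_history(history, batch_loss)
    running.update(batch_loss)
fast_mean = running.get()
```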