Most likely, either your sampling technique for the triplet loss or some metric accumulation you are doing is computationally expensive, and its cost grows as training goes on, so the training loop gets slower and slower. It’s unlikely to be related to mxnet per se, and very likely to be related to the logging, sampling, or metrics you compute inside your training loop.
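One way to check this is to time each part of the loop separately and see which one grows from epoch to epoch. Below is a minimal sketch using only the Python standard library. `SectionTimer`, `demo`, and the section names are hypothetical, and the loop body is a stand-in for your real code: the "metric" section deliberately recomputes a mean over the whole loss history on every step, which is the kind of accumulation that gets slower over time.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SectionTimer:
    """Accumulates wall-clock time per named section of a training loop."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def section(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start

    def report_and_reset(self):
        """Return the accumulated totals and start a fresh measurement."""
        snapshot = dict(self.totals)
        self.totals.clear()
        return snapshot


def demo(num_epochs=3, steps_per_epoch=50):
    """Stand-in training loop; replace the section bodies with your own code."""
    timer = SectionTimer()
    history = []
    reports = []
    for epoch in range(num_epochs):
        for step in range(steps_per_epoch):
            with timer.section("forward_backward"):
                # Placeholder for the real forward/backward pass.
                # MXNet executes asynchronously, so in real code call
                # mx.nd.waitall() before leaving this block; otherwise the
                # time is attributed to whichever later section blocks first.
                loss = sum(i * i for i in range(1000))
            with timer.section("metric"):
                # Assumed culprit: work proportional to everything seen so far.
                history.append(loss)
                mean_loss = sum(history) / len(history)
        reports.append(timer.report_and_reset())
    return reports


if __name__ == "__main__":
    for epoch, report in enumerate(demo()):
        print(epoch, {name: round(secs, 6) for name, secs in report.items()})
```

If the time for the forward/backward section stays flat across epochs while the sampling or metric section keeps climbing, that section is the one to fix, for example by keeping a running sum instead of re-scanning the full history.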