MXNet forward operation on the first batch is very slow

data_iter = mx.io.NDArrayIter(imgs, batch_size=batch_size)
for batch in data_iter:

The first batch takes a very long time, while the following batches are super fast. Is this normal? What’s the reason for that?

When you say “very long time”, how long is it in seconds?
Could that be due to time taken to load model params into memory? How large is the model?

The model is Google Inception BN; the model file is 45.3M. I tested with 400, 800, and 10k images stacked into an NDArray. The 400 and 800 image sets take more than 1 minute to run, while the 10k dataset just hangs forever.

What’s the data type of imgs passed to NDArrayIter here?

On the first batch, MXNet initializes the GPU context and tunes cuDNN for performance, which might take a long time.
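
To check whether one-time setup explains the delay, it may help to time each batch separately. The sketch below is not MXNet code: `time_batches` and `FakeModel` are hypothetical names, and the model is a stand-in that sleeps once to mimic first-call setup. With a real model the forward pass is asynchronous, so you would need to block on the result (for example with `mx.nd.waitall()`) inside the timed region. `MXNET_CUDNN_AUTOTUNE_DEFAULT` is the MXNet environment variable for cuDNN autotuning; I am assuming it is set before `mxnet` is imported.

```python
import os
import time

# Assumption: set this before `import mxnet` so the setting is picked up.
# "0" turns off cuDNN convolution autotuning.
os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"


def time_batches(forward, batches):
    """Call forward(batch) for each batch and return per-batch seconds."""
    durations = []
    for batch in batches:
        start = time.perf_counter()
        forward(batch)
        durations.append(time.perf_counter() - start)
    return durations


class FakeModel:
    """Hypothetical stand-in for a real model: pays a one-time setup cost."""

    def __init__(self, setup_seconds=0.2):
        self.setup_seconds = setup_seconds
        self.ready = False

    def forward(self, batch):
        if not self.ready:
            # Mimics context creation and tuning on the first call only.
            time.sleep(self.setup_seconds)
            self.ready = True
        return sum(batch)


model = FakeModel()
durations = time_batches(model.forward, [[1, 2], [3, 4], [5, 6]])
for i, seconds in enumerate(durations):
    print(f"batch {i}: {seconds:.3f} s")
```

If only the first number is large, the cost is probably setup rather than the data itself. In that case a common workaround is to run one throwaway batch before the timed or latency-sensitive work starts.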