Reading images fast: list comprehension vs for loop


I’m looking for the fastest way to read a folder of same-size images into a single NDArray. Surprisingly, a for loop of concats is about 3x faster than a list comprehension (100 ms vs 320 ms). Any idea why? Any suggestions for a faster technique?

Idea 1: For Loop of concats (100ms)

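from mxnet import nd, image as mxim  # assumed to be what the nd / mxim aliases refer to

ims = (mxim.imread(batch_path + '/' + piclist[0])
       .expand_dims(0)        # Create an extra dim for the concat
       .as_in_context(ctx))   # Same context as the images concatenated below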

for picname in piclist[1:]:
    pic = mxim.imread(batch_path + '/' + picname).expand_dims(0)
    ims = nd.concat(ims, pic.as_in_context(ctx), dim=0)

Idea 2: list comprehension (320ms)

ims = nd.concat(
    *[mxim.imread(batch_path + '/' + pic).expand_dims(0) for pic in piclist],
    dim=0)

Update: it turns out that moving each image to the GPU inside the list comprehension and concatenating there is even faster than concatenating on the CPU and sending the whole batch to the GPU afterwards:


ims = [mxim.imread(batch_path + '/' + pic).expand_dims(0).as_in_context(ctx) for pic in piclist]
ims = nd.concat(*ims, dim=0)
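
One caveat on the timings: as far as I understand, MXNet queues NDArray operations asynchronously, so a plain stopwatch around the Python call may measure only the time to enqueue the work rather than the time to finish it. Below is a minimal sketch of a timing helper that accepts an optional synchronization callback. The name time_ms and its parameters are made up for illustration and are not part of MXNet.

```python
import time


def time_ms(fn, sync=None, repeats=5):
    """Return the best wall-clock time of fn() in milliseconds.

    sync is an optional zero-argument callable (hypothetical parameter)
    invoked before the clock starts and again before it stops, so that
    asynchronously queued work is included in the measurement.
    """
    best = float('inf')
    for _ in range(repeats):
        if sync is not None:
            sync()  # drain any pending work before starting the clock
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for the work queued by fn() to finish
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best
```

With MXNet this could be called as time_ms(load_batch, sync=nd.waitall), where load_batch is a hypothetical zero-argument function wrapping one of the snippets above and nd.waitall blocks until all queued operations have completed.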