I ran into performance issues packing about 2 million images into .rec files: after about 200k images, the time to pack each batch of 1,000 images got really high. As a workaround, I used the
--chunks option for im2rec, which produced about 10 .rec files with around 200k images each.
That worked, but I don’t see an easy way to combine them. I’m using Gluon’s ImageRecordDataset, which only accepts a single .rec file and its .idx file.
Is there an easy way to combine the .rec files I’ve generated? I’d be OK with either merging them into one file to pass to ImageRecordDataset, or having a dataset that supports multiple input files.
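
To make the second option concrete, here is a minimal sketch of a wrapper that presents several indexable datasets as one. It is only an illustration: ConcatDataset is a hypothetical helper written for this post, not an existing Gluon class, and it assumes each wrapped dataset supports len() and integer indexing.

```python
import bisect


class ConcatDataset:
    """Present several indexable datasets as one.

    Hypothetical helper for illustration; not part of Gluon.
    """

    def __init__(self, datasets):
        self._datasets = list(datasets)
        # Running totals of dataset lengths, e.g. lengths [3, 2] -> [3, 5].
        self._cumulative = []
        total = 0
        for dataset in self._datasets:
            total += len(dataset)
            self._cumulative.append(total)

    def __len__(self):
        return self._cumulative[-1] if self._cumulative else 0

    def __getitem__(self, idx):
        if idx < 0:
            idx += len(self)
        if not 0 <= idx < len(self):
            raise IndexError("index out of range")
        # Locate the first dataset whose running total exceeds idx.
        part = bisect.bisect_right(self._cumulative, idx)
        offset = self._cumulative[part - 1] if part > 0 else 0
        return self._datasets[part][idx - offset]
```

With the chunked files, the idea would be something like ConcatDataset([ImageRecordDataset(p) for p in rec_paths]), where rec_paths is a placeholder list of the .rec files and each one has its .idx file next to it. I have not checked how this behaves with a multi-worker DataLoader.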