Is there an example of a data iterator which does a lot of multithreaded preprocessing and builds a queue for each GPU?
If you’re using the Gluon API, you can set num_workers on the DataLoader to use multi-processing with any type of Dataset. You typically want to set this to the number of CPUs available for optimal performance, which you can find with multiprocessing.cpu_count(). All data loading and preprocessing (e.g. data augmentation) will be performed in parallel across different processes, and the resulting batches are automatically added to a queue to be sent to the GPUs. Check out the tutorial here for an example of this.
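For reference, a minimal sketch of the num_workers setup described above. The helper names pick_num_workers and make_loader are hypothetical and only there for illustration; the sketch assumes MXNet with Gluon is installed and that dataset is any Gluon Dataset.

```python
import multiprocessing


def pick_num_workers(requested=None):
    """Hypothetical helper: default to one worker process per available CPU."""
    if requested is None:
        return multiprocessing.cpu_count()
    return requested


def make_loader(dataset, batch_size=32, num_workers=None):
    """Hypothetical helper: wrap a Gluon Dataset in a multi-process DataLoader."""
    # Imported here so the helpers above can be used without MXNet installed.
    from mxnet.gluon.data import DataLoader

    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        # Worker processes load and preprocess batches in parallel.
        num_workers=pick_num_workers(num_workers),
    )
```

You would then iterate with something like `for data, label in make_loader(my_dataset): ...`, where my_dataset is your own Dataset. Whether one worker per CPU is actually the best setting depends on how heavy the preprocessing is, so it is worth timing a few values.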
With the Module API, you can perform multi-threading (different from multi-processing) for the data loading and augmentation using the preprocess_threads argument of
Yeah, that leads to my real question - how do you guarantee each batch is only sent once? What if two threads/processes call next at the same time? There’s no lock on the actual index update. My current solution is:
with self.rlock: self.index += 1
multiprocessing.Value works as well.
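To make the locking idea concrete, here is a minimal sketch using only the standard library. The names LockedBatchIterator and claim_next are hypothetical, and this is not how any MXNet iterator is implemented; it only shows that the bounds check, read, and increment need to happen under a single lock so no index is handed out twice.

```python
import multiprocessing
import threading


class LockedBatchIterator:
    """Hands out batch indices so that each index is returned exactly once."""

    def __init__(self, num_batches):
        self.num_batches = num_batches
        self.index = 0
        self.lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Check, read, and increment while holding the lock, so two threads
        # calling next() at the same time can never claim the same index.
        with self.lock:
            if self.index >= self.num_batches:
                raise StopIteration
            current = self.index
            self.index += 1
        return current


def claim_next(counter, num_batches):
    """Same idea for processes: counter is a multiprocessing.Value('i', 0).

    Returns the claimed index, or None once all batches are taken.
    """
    with counter.get_lock():
        if counter.value >= num_batches:
            return None
        current = counter.value
        counter.value += 1
    return current
```

Note that incrementing alone under the lock is not enough if the index is read outside it; the value has to be captured inside the same critical section.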
@thomelane Just wanted to follow up on this
I’m not exactly sure what code you’re looking at. In the Gluon DataLoader, the main process creates batches of indices, and each batch of indices is passed to a worker process. A worker process fetches one batch of indices and constructs the batch of data by reading the dataset at those indices. Because only the main process decides which indices go into which batch, no batch is produced twice. This is the code if you’re interested: https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/data/dataloader.py#L215
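A rough sketch of that pattern, to show why no shared counter is needed. This is not the DataLoader source: the function names are hypothetical, and a thread pool stands in for the worker processes to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor


def batch_indices(num_samples, batch_size):
    """Yield consecutive lists of sample indices, one list per batch."""
    for start in range(0, num_samples, batch_size):
        yield list(range(start, min(start + batch_size, num_samples)))


def load_batch(dataset, indices):
    """Worker task: build one batch by reading the dataset at the given indices."""
    return [dataset[i] for i in indices]


def iterate_batches(dataset, batch_size, num_workers=2):
    """Single producer: only this function decides which indices form a batch."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Each batch of indices is submitted exactly once, so workers never
        # compete over "which batch is next".
        futures = [
            pool.submit(load_batch, dataset, indices)
            for indices in batch_indices(len(dataset), batch_size)
        ]
        # Collect results in submission order to keep batches ordered.
        for future in futures:
            yield future.result()
```

The workers only ever receive an explicit list of indices, so there is nothing for them to race on; the cost is that the producer has to know the dataset length up front.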
Is this possible with native MXNet? Or just gluon?
Hi @dmadeka, you can use data iterators like mxnet.io.ImageRecordIter that run on the engine (if that’s what you mean by native), but the Gluon DataLoader paradigm is more flexible and easier to work with.