mxnet.io.CSVIter delimiter

Hi,
Do you know how to change the CSV delimiter when calling mxnet.io.CSVIter? It seems to be hard-coded to a comma ‘,’ and I don’t see how to change it.

Alternatively, I would like to create an iterator that can be pointed at a directory containing multiple files and that reads those files with multiple threads (the files use a given delimiter, not necessarily a comma).

Thanks,
E

You could try implementing a customized CSVIter. Something like the following:

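import csv

import mxnet as mx
import numpy as np


class customizedCSVIter(mx.io.DataIter):
    def __init__(self, data_names, data_shapes, label_names, label_shapes,
                 csvfile, delimiter=',', batch_size=100):
        super(customizedCSVIter, self).__init__(batch_size)
        self.delimiter = delimiter
        self.batch_size = batch_size
        # (name, shape) pairs; shapes are assumed to include the batch dimension.
        self._provide_data = list(zip(data_names, data_shapes))
        self._provide_label = list(zip(label_names, label_shapes))
        self.file = open(csvfile, 'r')
        self.csvreader = csv.reader(self.file, delimiter=delimiter)

    def __iter__(self):
        return self

    def reset(self):
        # Rewind the file and rebuild the reader for the next epoch.
        self.file.seek(0)
        self.csvreader = csv.reader(self.file, delimiter=self.delimiter)

    def __next__(self):
        return self.next()

    @property
    def provide_data(self):
        return self._provide_data

    @property
    def provide_label(self):
        return self._provide_label

    def next(self):
        # Assumed body, adapt to your file layout: every row holds the
        # feature values followed by a single label in the last column.
        rows = []
        for row in self.csvreader:
            if not row:
                continue  # skip blank lines
            rows.append([float(v) for v in row])
            if len(rows) == self.batch_size:
                break
        if len(rows) < self.batch_size:
            raise StopIteration  # end of file; a trailing partial batch is dropped
        arr = np.asarray(rows, dtype='float32')
        data = [mx.nd.array(arr[:, :-1])]
        label = [mx.nd.array(arr[:, -1])]
        return mx.io.DataBatch(data=data, label=label, pad=0)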
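
For the directory case from the question, here is a rough sketch that uses only the standard library. The names read_delimited_file and iter_batches, the '*.csv' pattern, and the assumption that every field is numeric are all made up for illustration and are not part of MXNet. It parses the files on a thread pool and yields plain Python batches, which you would still need to wrap in a DataIter like the one above.

```python
import csv
import glob
import os
from concurrent.futures import ThreadPoolExecutor


def read_delimited_file(path, delimiter):
    """Parse one file into a list of rows of floats, skipping blank lines."""
    with open(path, 'r', newline='') as f:
        return [[float(v) for v in row]
                for row in csv.reader(f, delimiter=delimiter) if row]


def iter_batches(directory, pattern='*.csv', delimiter=',',
                 batch_size=100, num_workers=4):
    """Yield lists of up to batch_size rows taken from all matching files."""
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    batch = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() submits every file up front and returns results in path
        # order, so whole files are held in memory while they wait.
        parsed = pool.map(lambda p: read_delimited_file(p, delimiter), paths)
        for rows in parsed:
            for row in rows:
                batch.append(row)
                if len(batch) == batch_size:
                    yield batch
                    batch = []
    if batch:
        yield batch  # final partial batch
```

Usage would look like `for batch in iter_batches('/path/to/dir', delimiter=';', batch_size=32): ...`. Since parsing is pure Python, the threads mostly help with file I/O; for CPU-heavy parsing a process pool may be a better fit.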

Thanks. That’s what I ended up doing. I was just wondering whether this keeps the benefits of the native CSVIter class, which is multithreaded while reading the data. I want to be sure that while one core is assembling the next batch, another thread is computing and applying the gradient. I haven’t tested whether my implementation is multithreaded when I use it inside a training loop. Do you know if that is handled at a higher level?
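
As far as I know, a DataIter written in Python simply runs in whichever thread calls next(), so it does not prefetch on its own. MXNet 1.x has mx.io.PrefetchingIter, which wraps existing iterators for this purpose; it is worth checking the docs for your version. The idea itself can be sketched with the standard library alone. PrefetchWrapper below is a hypothetical name, not an MXNet class: a background thread fills a bounded queue while the consumer works on the previous batch.

```python
import queue
import threading


class PrefetchWrapper:
    """Iterate over `iterable`, producing items from a background thread."""

    _END = object()  # sentinel that marks the end of the stream

    def __init__(self, iterable, capacity=2):
        # A bounded queue keeps the producer at most `capacity` items ahead.
        self._queue = queue.Queue(maxsize=capacity)
        self._thread = threading.Thread(
            target=self._produce, args=(iterable,), daemon=True)
        self._thread.start()

    def _produce(self, iterable):
        try:
            for item in iterable:
                self._queue.put(item)  # blocks while the queue is full
        finally:
            # Always signal the end, even if the source raised.
            self._queue.put(self._END)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()  # blocks until an item is ready
        if item is self._END:
            self._queue.put(self._END)  # keep later calls raising too
            raise StopIteration
        return item
```

Wrapping a batch source is then `for batch in PrefetchWrapper(batches): ...`. How much overlap you actually get depends on the work done per batch: file reads release Python's global interpreter lock, while pure-Python parsing mostly does not.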