Return a variable number of ndarrays from __getitem__ in a custom dataset

I am developing a multi-output reader that can feed the DataLoader of gluoncv, but I have run into an issue.

# assumes `import mxnet as mx` at module level
def __getitem__(self, idx):
    base = mx.image.imread(self._image_list[idx])
    if self._transform is not None:
        base = self._transform(base)
    # self._label_list[idx] is a list; self._label_list[idx][0] is an ndarray
    # that encodes the label as a one-hot vector
    return base, self._label_list[idx][0], self._label_list[idx][1]  # the size of this tuple could change

I would like to return self._label_list[idx] itself (returning a list is not accepted by the DataLoader) rather than splitting it into self._label_list[idx][0], self._label_list[idx][1], because the number of outputs may vary (the network could have more than two branches).

How could I solve this issue? Thanks.

Edit: the source code is on GitHub.

I had a look at your code at https://github.com/stereomatchingkiss/blogCodes2/blob/master/mxnet_and_multi_outputs/MultiOutputImageDataset.py but I am not sure I fully understand your question. Do you want to return a list of labels instead of self._label_list[idx][0], self._label_list[idx][1] because you may have more than two labels per image?
If so, it might be easier to load the data into an ArrayDataset and pass that to the DataLoader, instead of overriding __getitem__.
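
To illustrate the idea without requiring MXNet, here is a minimal sketch. ZippedDataset is a hypothetical stand-in written for this post, not a library class; it mimics what ArrayDataset does as far as I understand it: it takes any number of equally long sequences and returns one tuple per index. The image names and one-hot labels are made up, and plain Python lists stand in for ndarrays. Since your labels are stored per sample, they are first regrouped per branch with zip.

```python
class ZippedDataset:
    """Hypothetical stand-in that mimics the idea behind ArrayDataset."""

    def __init__(self, *arrays):
        if not arrays:
            raise ValueError("at least one array is required")
        length = len(arrays[0])
        if any(len(a) != length for a in arrays):
            raise ValueError("all arrays must have the same length")
        self._arrays = arrays

    def __len__(self):
        return len(self._arrays[0])

    def __getitem__(self, idx):
        # One tuple per sample; its size follows the number of arrays passed in.
        return tuple(a[idx] for a in self._arrays)


# Made-up data: three samples, each with one label per network branch.
images = ["img0", "img1", "img2"]
label_list = [
    [[1, 0], [0, 0, 1]],  # sample 0: branch 0 label, branch 1 label
    [[0, 1], [1, 0, 0]],  # sample 1
    [[1, 0], [0, 1, 0]],  # sample 2
]

# Regroup from per-sample lists to per-branch sequences.
per_branch = list(zip(*label_list))

# The number of branches is not hard-coded anywhere.
dataset = ZippedDataset(images, *per_branch)
sample = dataset[1]
```

With MXNet installed, mx.gluon.data.ArrayDataset(images, *per_branch) should be able to take the place of the stand-in, with real ndarrays instead of lists. I have not run that against your repository, so treat it as an assumption to verify.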
