Pascal VOC data processing DataLoader issue

I am trying to process PASCAL VOC 2012 on my own by subclassing the ArrayDataset class and then using it in a DataLoader. However, I keep getting a "TBlob.get_with_shape: new and old shape do not match total elements" error, no matter what I change.

Here is the code for reference. Am I missing something?

import os
import multiprocessing
import xml.etree.ElementTree as ET

import cv2
import numpy as np
from mxnet import nd, gluon
from mxnet.gluon.data import ArrayDataset


class PascalImage(ArrayDataset):
    """Generates an ArrayDataset of tuples structured as (image, label)."""

    def __init__(self, data_path, d_type):
        self._path = data_path
        self._data_type = d_type
        self.max_len = 0
        self._class = ['__background__', 'person', 'bird', 'cat', 'cow', 'dog', 'horse', 'sheep',
                       'aeroplane', 'bicycle', 'boat', 'bus', 'car', 'motorbike', 'train',
                       'bottle', 'chair', 'diningtable', 'pottedplant', 'sofa', 'tvmonitor']
        self._num_class = len(self._class)
        self._class_lookup = {key: val for key, val in zip(self._class, range(self._num_class))}
        self._img_name_list = self._get_img_name_list()[:10]
        self._data = self._get_data()
        self._length = len(self._data)
        super(PascalImage, self).__init__(self._data)

    def _get_img_name_list(self):
        """
            Gets the list of image file names from the metafiles. Checks which
            type of data is to be generated: train, trainval or test.
        """
        if self._data_type == "trainval":
            file_name = 'trainval.txt'
        elif self._data_type == "test":
            file_name = 'test.txt'
        else:
            file_name = 'train.txt'

        with open(os.path.join(self._path, 'ImageSets/Main', file_name), 'r') as f:
            name_list = f.readlines()

        return [img_name.strip() for img_name in name_list]


    def _get_image(self):
        """
            Reads every image in the file name list with cv2 and returns a
            list of NDArray representations of the images.
        """
        return [nd.array(cv2.imread(os.path.join(self._path, 'JPEGImages', img_name + '.jpg')))
                for img_name in self._img_name_list]


    def _prase_xml(self, img_name):
        """
            Given an image file name, scrapes the bounding boxes and classes
            from its annotation file.

            @param img_name: Image file name (without extension).
            @returns bboxes: NDArray of bounding boxes, one row per object.
        """
        xmlTree = ET.parse(os.path.join(self._path, 'Annotations', img_name + '.xml'))
        objects = xmlTree.findall('object')
        bboxes = np.zeros((len(objects), 5), dtype=np.uint16)
        if len(objects) > self.max_len:
            self.max_len = len(objects)
        for ix, obj in enumerate(objects):
            bbox = obj.find('bndbox')
            xmin = float(bbox.find('xmin').text) - 1
            ymin = float(bbox.find('ymin').text) - 1
            xmax = float(bbox.find('xmax').text) - 1
            ymax = float(bbox.find('ymax').text) - 1
            obj_class = self._class_lookup[obj.find('name').text.lower().strip()]
            bboxes[ix, :] = [obj_class, xmin, ymin, xmax, ymax]
        return nd.array(bboxes)


    def _get_bbox_class(self):
        """Gets list of bounding boxes for various images in the dataset."""
        return [self._prase_xml(img_name) for img_name in self._img_name_list]


    def _get_data(self):
        """Gets the tuple dataset of image and bboxes."""
        return list(zip(self._get_image(), self._get_bbox_class()))


    def __getitem__(self, idx):
        return self._data[idx]

Here is the transformation that I want to apply to the dataset:

def aug_transform(image, label):
    """
        Normalizes images and converts labels from (xmin, ymin, xmax, ymax)
        to (center-x, center-y, width, height).
    """
    # Pad the bounding boxes and expand the image dimensions.
    max_len = 56
    image = np.expand_dims(image, axis=2)
    print image.shape
    padded_bbox = np.zeros((max_len, label.shape[1] + 1), dtype=np.float32)
    bbox = label
    print bbox.shape
    # Convert the boxes from (xmin, ymin, xmax, ymax) to (center-x, center-y, w, h).
    w = bbox[:, 3] - bbox[:, 1]
    h = bbox[:, 4] - bbox[:, 2]
    bbox[:, 1] = bbox[:, 1] + w / 2
    bbox[:, 2] = bbox[:, 2] + h / 2
    image = nd.array(image)
    return image, nd.array(bbox)

And when I run this code,

trainDataset = PascalImage('VOCdevkit/VOC2012', 'train')
train_data = gluon.data.DataLoader(trainDataset.transform(aug_transform), batch_size,
                                   shuffle=True, last_batch="discard",
                                   num_workers=multiprocessing.cpu_count() - 2)

And processing it as
for ix, (X, Y) in enumerate(train_data):
    print ix
    print X

I get the following error:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
    send(obj)
  File "/Users/aakashp/Documents/AmazonPrep/Programming/PracticeNN/practenv/lib/python2.7/site-packages/mxnet/gluon/data/dataloader.py", line 83, in send
    ForkingPickler(buf, pickle.HIGHEST_PROTOCOL).dump(obj)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 606, in save_list
    self._batch_appends(iter(obj))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 639, in _batch_appends
    save(x)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/forking.py", line 66, in dispatcher
    rv = reduce(obj)
  File "/Users/aakashp/Documents/AmazonPrep/Programming/PracticeNN/practenv/lib/python2.7/site-packages/mxnet/gluon/data/dataloader.py", line 48, in reduce_ndarray
    return rebuild_ndarray, data._to_shared_mem()
  File "/Users/aakashp/Documents/AmazonPrep/Programming/PracticeNN/practenv/lib/python2.7/site-packages/mxnet/ndarray/ndarray.py", line 200, in _to_shared_mem
    self.handle, ctypes.byref(shared_pid), ctypes.byref(shared_id)))
  File "/Users/aakashp/Documents/AmazonPrep/Programming/PracticeNN/practenv/lib/python2.7/site-packages/mxnet/base.py", line 252, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
MXNetError: [18:57:17] include/mxnet/./tensor_blob.h:257: Check failed: this->shape_.Size() == shape.Size() (562500 vs. 499500) TBlob.get_with_shape: new and old shape do not match total elements

Stack trace returned 10 entries:
[bt] (0) 0   libmxnet.so                         0x000000010b1f0b90 libmxnet.so + 15248
[bt] (1) 1   libmxnet.so                         0x000000010b1f093f libmxnet.so + 14655
[bt] (2) 2   libmxnet.so                         0x000000010b2238b1 libmxnet.so + 223409
[bt] (3) 3   libmxnet.so                         0x000000010c4ea755 libmxnet.so + 19912533
[bt] (4) 4   libmxnet.so                         0x000000010c7338ed MXNDListFree + 561325
[bt] (5) 5   libmxnet.so                         0x000000010c6ba7f4 MXNDListFree + 65460
[bt] (6) 6   libmxnet.so                         0x000000010c6bcec8 MXNDListFree + 75400
[bt] (7) 7   libmxnet.so                         0x000000010c6c0261 MXNDListFree + 88609
[bt] (8) 8   libmxnet.so                         0x000000010c6c017f MXNDListFree + 88383
[bt] (9) 9   libmxnet.so                         0x000000010c6bdc55 MXNDListFree + 78869

A second worker process raised the same traceback, ending in:

MXNetError: [18:57:17] include/mxnet/./tensor_blob.h:257: Check failed: this->shape_.Size() == shape.Size() (490500 vs. 663000) TBlob.get_with_shape: new and old shape do not match total elements

Can someone point out where I am going wrong?

Note: the image sizes and the number of bounding boxes per image both vary.

It looks like you are trying to batchify images of different sizes. This is not possible; you need to resize them to the same size in your transform first, so that they can be stacked into a single tensor for your batch.
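For example, a minimal sketch of such a transform (the 512 target size is an arbitrary choice, and the box rescaling assumes your [class, xmin, ymin, xmax, ymax] label layout):

import mxnet as mx
from mxnet import nd

def resize_transform(image, label, size=512):
    # Resize every image to size x size so the default batchify
    # function can stack the batch into one (N, H, W, C) tensor.
    h, w = image.shape[0], image.shape[1]
    image = mx.image.imresize(image.astype('uint8'), size, size)
    # Rescale the box coordinates to match the resized image.
    bbox = label.asnumpy()
    bbox[:, [1, 3]] *= float(size) / w  # xmin, xmax
    bbox[:, [2, 4]] *= float(size) / h  # ymin, ymax
    return image.astype('float32'), nd.array(bbox)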

Alternatively, you can have a look at the gluon-cv dataset for Pascal VOC, which already has an implementation:
https://gluon-cv.mxnet.io/api/data.datasets.html#gluoncv.data.VOCDetection
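Usage is roughly as follows (a sketch based on the linked docs; gluon-cv downloads VOC to ~/.mxnet/datasets/voc by default):

from gluoncv import data

train_dataset = data.VOCDetection(splits=[(2012, 'train')])
image, label = train_dataset[0]
# label has one row per object: [xmin, ymin, xmax, ymax, class_id, difficult]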

So is there a way I can work with different/variable image sizes? What do you suggest, bucketing? I also have a variable number of bounding boxes per image; padding those is something I have already done. But I want to be able to represent the data in a format where things can be of variable size. What do you suggest?

If you want to use batches of size > 1, you will need to have images of the same size.
Bucketing is one option, padding is another, and resizing and/or random cropping is another; see the batchify sketch below for the padding route.
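Gluon's DataLoader accepts a batchify_fn argument, so padding the labels can look something like this (a sketch that assumes the images were already resized to a common size; -1 marks padded rows):

import numpy as np
from mxnet import nd
from mxnet.gluon.data import DataLoader

def pad_batchify(batch, max_objs=56):
    # Stack the (already same-sized) images along a new batch axis.
    imgs = nd.stack(*[img for img, _ in batch])
    # Pad every label to max_objs rows with -1 so they stack as well.
    padded = np.full((len(batch), max_objs, 5), -1, dtype=np.float32)
    for i, (_, label) in enumerate(batch):
        label = label.asnumpy()
        padded[i, :label.shape[0], :] = label
    return imgs, nd.array(padded)

train_data = DataLoader(trainDataset.transform(aug_transform), batch_size,
                        shuffle=True, batchify_fn=pad_batchify)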

For the labels, it is going to depend on what network you are using for training. For SSD, for example, you need to generate training targets based on your default anchor boxes and your ground-truth bounding boxes. Since every image has the same number of default anchors, you won't need to pad the targets to make them match in size between images.
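With gluon-cv, this target generation is already wrapped in a preset transform; roughly, following its SSD tutorial:

import mxnet as mx
from mxnet import autograd
from gluoncv import model_zoo
from gluoncv.data.transforms.presets.ssd import SSDDefaultTrainTransform

net = model_zoo.get_model('ssd_512_resnet50_v1_voc', pretrained_base=False)
net.initialize()
# One dummy forward pass in train mode yields the default anchor boxes.
with autograd.train_mode():
    _, _, anchors = net(mx.nd.zeros((1, 3, 512, 512)))
# The transform augments/resizes the image and converts raw bounding boxes
# into per-anchor classification and box-regression targets.
train_transform = SSDDefaultTrainTransform(512, 512, anchors)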
