Data loader with rectangular images for object-detections

LauLauThom · March 23, 2020, 11:34am

Hi guys,
I read through the object-detection tutorial on fine-tuning an existing network:

https://gluon-cv.mxnet.io/build/examples_detection/finetune_detection.html

and I also checked this script

dmlc/gluon-cv/blob/master/scripts/detection/ssd/train_ssd.py

"""Train SSD"""
import argparse
import os
import logging
import warnings
import time
import numpy as np
import mxnet as mx
from mxnet import nd
from mxnet import gluon
from mxnet import autograd
import gluoncv as gcv
gcv.utils.check_version('0.6.0')
from gluoncv import data as gdata
from gluoncv import utils as gutils
from gluoncv.model_zoo import get_model
from gluoncv.data.batchify import Tuple, Stack, Pad
from gluoncv.data.transforms.presets.ssd import SSDDefaultTrainTransform
from gluoncv.data.transforms.presets.ssd import SSDDefaultValTransform
from gluoncv.data.transforms.presets.ssd import SSDDALIPipeline


So I understood that for SSD the images are resized such that the shorter side is 512 or 300, depending on the network.
In the example with the Pikachu dataset the images are square (datashape = 512), so the data loader is configured with square dimensions:
width, height = datashape, datashape
and
SSDDefaultTrainTransform(width, height, anchors)

I was wondering: if my images are rectangular rather than square, should I modify those lines to match?
For example, if I know that my images will be resized so that the shorter side is 512, I could apply the same ratio to compute the expected length of the longer side and use those values for width and height above.
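
To make the arithmetic in the question concrete, here is a minimal sketch of the ratio calculation. The helper name `scaled_size` and the 1024x768 example size are hypothetical and not part of gluon-cv; the rounding to the nearest pixel is also an assumption.

```python
def scaled_size(orig_width, orig_height, short_side=512):
    """Return (width, height) after scaling so the shorter side equals short_side.

    Hypothetical helper for illustration; the aspect ratio is preserved and
    both sides are rounded to the nearest pixel.
    """
    shorter = min(orig_width, orig_height)
    width = int(round(orig_width * short_side / shorter))
    height = int(round(orig_height * short_side / shorter))
    return width, height


# Example: a 1024x768 landscape image scaled so the shorter side is 512.
width, height = scaled_size(1024, 768, short_side=512)
print(width, height)  # 683 512
```

The resulting width and height could then be used in place of the square `datashape` values, provided every image in the dataset shares the same aspect ratio.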

LauLauThom · March 30, 2020, 2:39pm

Dear @zhreshold @hetong007,
I noticed you were the last ones to contribute to train_ssd.py. Would you be able to shed some light on this?
Thanks a lot!

zhreshold · March 30, 2020, 7:00pm

SSD ideally takes a batch of inputs that all have the same resolution, so if you decide not to use square images, you can set a fixed shape, e.g., (384, 512), in the transform functions, and that is all you need to change.
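
A minimal sketch of what this could look like, modelled on the data loader setup in the fine-tuning tutorial linked above. It assumes (384, 512) means height 384 and width 512, which the thread does not specify. The helper names `dummy_input_shape` and `make_train_loader` are hypothetical, and `net` and `dataset` stand for an SSD network and a detection dataset that you already have.

```python
def dummy_input_shape(width, height):
    """Shape of the NCHW dummy batch used to generate anchors (hypothetical helper)."""
    return (1, 3, height, width)


def make_train_loader(net, dataset, width, height, batch_size=16, num_workers=0):
    """Build a training DataLoader that resizes every image to (width, height)."""
    # Imported lazily so dummy_input_shape can be used without MXNet installed.
    import mxnet as mx
    from mxnet import autograd, gluon
    from gluoncv.data.batchify import Tuple, Stack
    from gluoncv.data.transforms.presets.ssd import SSDDefaultTrainTransform

    # One forward pass in training mode yields the anchors for this input size.
    with autograd.train_mode():
        _, _, anchors = net(mx.nd.zeros(dummy_input_shape(width, height)))

    # Every sample is resized to the same fixed shape, so samples can be stacked.
    transform = SSDDefaultTrainTransform(width, height, anchors)
    batchify_fn = Tuple(Stack(), Stack(), Stack())
    return gluon.data.DataLoader(
        dataset.transform(transform), batch_size, shuffle=True,
        batchify_fn=batchify_fn, last_batch='rollover', num_workers=num_workers)


# Assumed reading of the (384, 512) example: height 384, width 512.
height, width = 384, 512
# train_loader = make_train_loader(net, dataset, width, height)
```

Note that the transform takes width first and height second, while the dummy input follows the NCHW layout with height before width.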
