Image resizing and scale invariance for object detection

I followed the tutorial to finetune an object-detection network.

My images are initially 2048x2048 and are rescaled to 512x512 during training (function get_dataloader with data_shape=512), and also for detection (using gcv.data.transforms.presets.ssd.load_test with short=512 and max_size=1024).
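For reference, here is a minimal sketch of the two preprocessing paths involved, assuming get_dataloader wraps SSDDefaultTrainTransform as in the GluonCV fine-tuning tutorial (the image path is a placeholder):

```python
import gluoncv as gcv

# Training path: SSDDefaultTrainTransform warps every image to a
# fixed 512x512, regardless of its aspect ratio.
train_transform = gcv.data.transforms.presets.ssd.SSDDefaultTrainTransform(512, 512)

# Detection path: load_test preserves the aspect ratio, resizing the
# short side to 512 while capping the long side at 1024.
x, img = gcv.data.transforms.presets.ssd.load_test('my_image.jpg', short=512, max_size=1024)
```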

I can use my fine-tuned network to detect my objects in new 2048x2048 squared images.

I then tried to run the detection on cropped images (1466x442), but it completely fails!
With those rectangular crops, load_test returns an image of dimensions 1024x309: scaling the short side to 512 would push the long side to about 1698, beyond max_size=1024, so the long side is capped at 1024 instead.
I thought the data augmentation used during training would make the trained network scale-invariant to some extent, or at least that it would still perform well on cropped images.

Not all features are scale-invariant, so if the objects you want the model to detect appear, say, 4 times larger in the test data than in the training data, the model will likely not be able to detect them. It is best to use the same preprocessing for training and test data: if you rescale your training images, you should rescale the test images the same way.
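One way to reproduce the training-time scale on the rectangular crops would be to warp them to the same fixed 512x512 before running the network. A minimal sketch, where 'crop_1466x442.jpg' and `net` stand in for your own image and fine-tuned model:

```python
import mxnet as mx
import gluoncv as gcv

# Warp the crop to the same fixed 512x512 used at training time
# (ignoring the aspect ratio, just as SSDDefaultTrainTransform does).
img = mx.image.imread('crop_1466x442.jpg')   # HWC uint8 NDArray
img = mx.image.imresize(img, 512, 512)

# Normalize and batch; short=512 is a no-op here because the image
# is already 512x512, so no further rescaling happens.
x, orig_img = gcv.data.transforms.presets.ssd.transform_test(img, short=512)

class_ids, scores, bboxes = net(x)
```

This way the objects appear at roughly the same scale the network saw during training, instead of the ~2.8x larger scale produced by the aspect-preserving load_test resize on a 1466x442 crop.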
Here is a nice article that explains the problem: https://miguel-data-sc.github.io/2017-11-23-second/
