Number of predicted bounding boxes is 0 while box loss goes down

I am training SSD. My loss function consists of classification loss and box loss, where classification loss is softmax cross entropy and box loss is L1 loss.

As I ran the network for more epochs, I noticed that the box loss consistently goes down, but the number of bounding boxes predicted by the network is 0.

What situations are prone to this kind of behavior?

Do you train the network from scratch or do you use a pre-trained SSD? Are you normalizing the input data? Can you provide a minimal reproducible example?

I am using a pre-trained ResNet34v1. I am also normalizing the input data to the range 0 to 1. It is not standardized in the (data - mean)/SD form; for now I am just dividing images by 255 and label data by the size of the image.
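To make the two normalization schemes mentioned above concrete, here is a minimal sketch in plain Python. The function names are hypothetical (they are not from the notebook), and the box layout `(xmin, ymin, xmax, ymax)` in pixels is an assumption.

```python
def normalize_pixels(pixels):
    """Scale 8-bit pixel values into the range [0, 1]."""
    return [p / 255.0 for p in pixels]


def normalize_box(box, width, height):
    """Divide corner coordinates by the image size (assumed box layout:
    xmin, ymin, xmax, ymax in pixels)."""
    xmin, ymin, xmax, ymax = box
    return (xmin / width, ymin / height, xmax / width, ymax / height)


def standardize(values):
    """The (data - mean) / SD form, using the population standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    sd = variance ** 0.5
    return [(v - mean) / sd for v in values]


pixels_01 = normalize_pixels([0, 128, 255])
box_01 = normalize_box((0, 0, 1675, 1250), width=1675, height=1250)
```

Either scheme keeps inputs in a small range. A pre-trained backbone usually expects the same preprocessing it was trained with, so that is worth checking against the model's documentation.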

Okay, I have attached pretty much my entire code as a Jupyter notebook, with some sample examples, linked here…

JupyterNotebook Google Drive Link

Set up virtualenv and you should be good to go once you install requirements.txt.

I do know the anchor boxes are a bit off and might not be scaled properly. I am working on fixing that.

So the main question is: why is the loss decreasing? Should I use IoU as the error function?

Also, do you think it would be viable to use ResNet101 instead of ResNet34 for images of size 1675x1250?

One more question: would it be good to use rectangular images? That was the next area that I wanted to explore.
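For reference, this is a minimal sketch of what IoU (intersection over union) measures for two boxes. It assumes corner format `(xmin, ymin, xmax, ymax)`, and the function name `iou` is only illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes in (xmin, ymin, xmax, ymax) format."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # overlap area 1, union area 7
```

Note that IoU is exactly 0 for any pair of non-overlapping boxes, so on its own it gives no signal about how far apart two disjoint boxes are.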

Thanks

@ThomasDelteil

I would appreciate your input on this.

Hi @aakashpatel, you can have a look at this tutorial on fine-tuning a pre-trained object detection algorithm.
https://gluon-cv.mxnet.io/build/examples_detection/finetune_detection.html

For the loss, there is usually a box loss for localization and a prediction loss for classification.
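One possible explanation for the symptom in the question follows from how these two terms are commonly combined. The sketch below assumes that class 0 means background, that the box loss is only accumulated over anchors matched to an object, and that both terms are divided by the number of matched anchors. The function names are hypothetical and this is not the code from the notebook. Under those assumptions, the box loss can be low while the classifier still scores every anchor as background, which would yield zero detections.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


def l1_loss(pred, target):
    """Sum of absolute differences between predicted and target box values."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def ssd_loss(cls_logits, box_preds, cls_targets, box_targets):
    """Return (classification loss, box loss) over all anchors.

    Assumption: label 0 is background, and only anchors with label > 0
    contribute to the box loss.
    """
    cls_loss, box_loss, num_pos = 0.0, 0.0, 0
    for logits, pred, label, target in zip(
        cls_logits, box_preds, cls_targets, box_targets
    ):
        cls_loss += softmax_cross_entropy(logits, label)
        if label > 0:
            box_loss += l1_loss(pred, target)
            num_pos += 1
    norm = max(num_pos, 1)  # avoid dividing by zero when nothing is matched
    return cls_loss / norm, box_loss / norm


# Two anchors: the first is matched to an object (label 1), the second is background.
# The box regression is perfect, but both anchors score background highest.
cls_l, box_l = ssd_loss(
    cls_logits=[[5.0, 0.0], [5.0, 0.0]],
    box_preds=[[0.1, 0.1, 0.2, 0.2], [0.0, 0.0, 0.0, 0.0]],
    cls_targets=[1, 0],
    box_targets=[[0.1, 0.1, 0.2, 0.2], [0.0, 0.0, 0.0, 0.0]],
)
```

In this toy case the box loss is zero while the classification loss stays high, so it may be worth logging the two terms separately and checking how many anchors are actually matched to ground-truth boxes.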

Rectangular images should be fine and shouldn’t impact anything.
1675x1250 seems very big for ResNet101: you might only manage to fit a single image per batch, or not even that. I would consider rescaling your images down a bit, but it depends on your GPU.
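As a rough sketch of the rescaling suggestion, the helper below computes a smaller target size that keeps the aspect ratio. The name `rescaled_size` and the 800-pixel limit are assumptions chosen for illustration, not recommended values.

```python
def rescaled_size(width, height, max_side):
    """Shrink (width, height) so the longer side is at most max_side,
    keeping the aspect ratio. Images already small enough are left alone."""
    scale = min(1.0, max_side / max(width, height))
    return round(width * scale), round(height * scale)


new_w, new_h = rescaled_size(1675, 1250, max_side=800)
```

If the box labels are already divided by the image size, as described earlier in the thread, they stay valid after resizing because they are relative coordinates.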