What could be the reason for getting 0 as box loss in SSD?

I am trying to train an SSD network on my own dataset. However, I keep getting a box loss of 0. The classification loss is not 0; only the box loss is.

When my default anchors, labels and class_predictions go through MultiBoxTarget, I get all zeros for box_target, box_mask, and class_target.

I have tried a lot of things:

  • Changed the size of the images.
  • Converted the images from rectangular to square.
  • Tried a similar small dataset.
  • Random translation and other augmentations.
  • Even passed another dataset through my network (there the network works fine and MultiBoxTarget returns nonzero values).
  • Tried classifying on the same dataset with 2 classes (Background, Object).
  • Let it run for a large number of epochs (the box loss is still 0, the class loss approaches 0, and eventually the network converges on the class loss only).
  • Ensured that the training_target function works fine (copy-pasted the literal code from the MXNet object detection module).
  • Even tried it on a small network with no body; MultiBoxTarget still returns 0.
  • Played around with lots of different anchor sizes and ratios.

So my question is what could be wrong? What could be wrong with my dataset? Why does multibox target always return 0 for everything?

And what exactly does a multibox target function do?

I ran into the same problem two days ago. In my case, the cause was that the shape of the label differed between samples in one batch. Did you check that?

Yes, every label is padded to 117 rows, so I have 117 labels per image and the label shape is (batch size, 117, 5).
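For anyone checking the same thing, here is a minimal sketch of that padding in plain Python. It assumes each label row is [class_id, xmin, ymin, xmax, ymax] and that padding rows are filled with -1, which is, as far as I know, the convention MultiBoxTarget uses to skip invalid rows. `pad_labels` and `count_valid` are hypothetical helper names, not MXNet functions.

```python
def pad_labels(labels, max_objects, pad_value=-1.0):
    """Pad a list of [class_id, xmin, ymin, xmax, ymax] rows to max_objects rows.

    Assumption: rows whose class_id equals pad_value are treated as "no object".
    """
    if len(labels) > max_objects:
        raise ValueError("more objects than max_objects")
    padded = [[float(v) for v in row] for row in labels]
    padded += [[pad_value] * 5 for _ in range(max_objects - len(labels))]
    return padded


def count_valid(padded, pad_value=-1.0):
    """Count rows that describe a real object (class_id is not the pad value)."""
    return sum(1 for row in padded if row[0] != pad_value)


# One real object, padded to 117 rows so every sample has shape (117, 5).
example = pad_labels([[0, 0.1, 0.2, 0.5, 0.6]], max_objects=117)
print(len(example), len(example[0]), count_valid(example))
```

If `count_valid` comes back as 0 for every image, the labels themselves are the problem rather than the anchors.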

Are you using the SSD from GluonCV or your own implementation?

My own implementation.

Is your “boxloss” an L1 loss between the predicted box center and width/height and the anchors? If so, did you try outputting these values for one sample to see what the output is?

My box loss is L1 loss.

However, the issue is that when I use MultiBoxTarget (beforehand, to compute the targets needed for calculating the box loss), I get an all-zero matrix for both the box mask and the box target. This leads to the overall box predictions being 0.

Hence the box loss is 0.

What I don’t get is why MultiBoxTarget’s output is 0 no matter which image I pass.
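To make that chain concrete, here is a minimal sketch in plain Python (not my actual training code; `masked_l1_loss` is a hypothetical helper). It assumes the mask is multiplied into the difference before the L1 loss is taken, so an all-zero mask gives a zero loss regardless of what the network predicts.

```python
def masked_l1_loss(box_preds, box_target, box_mask):
    """Mean absolute error over offsets, with unmatched offsets masked out."""
    total = 0.0
    for pred, target, mask in zip(box_preds, box_target, box_mask):
        total += abs((pred - target) * mask)
    return total / len(box_preds)


preds = [0.3, -0.2, 0.5, 0.1]    # arbitrary predicted offsets for one anchor
targets = [0.1, 0.0, 0.4, 0.1]   # arbitrary target offsets for the same anchor

loss_unmatched = masked_l1_loss(preds, targets, [0.0] * 4)  # anchor not matched
loss_matched = masked_l1_loss(preds, targets, [1.0] * 4)    # anchor matched
print(loss_unmatched, loss_matched)
```

So a zero box loss here says nothing about the quality of the predictions; it only says that no anchor was matched to a ground-truth box.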

The entire network minimizes the class loss and converges. What are the possible scenarios when MultiboxTarget could return 0 for box_mask and box_target?

Can you explain what exactly the function does? Then, if possible, I might implement it myself.
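As far as I understand it, the core of the function is matching anchors to ground-truth boxes by IoU. Below is a heavily simplified sketch of that idea in plain Python; it is not the MXNet implementation. It assumes corner-format boxes, an IoU threshold of 0.5, class 0 reserved for background, and one mask value per anchor (the real op, as I understand it, has one per box coordinate). It leaves out forcing every ground-truth box to claim its best anchor, the offset encoding, and negative mining. `iou` and `match_anchors` are hypothetical names.

```python
def iou(a, b):
    """IoU of two boxes in (xmin, ymin, xmax, ymax) corner format."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def match_anchors(anchors, labels, threshold=0.5):
    """Return (box_mask, cls_target), one entry per anchor."""
    box_mask, cls_target = [], []
    for anchor in anchors:
        best_iou, best_cls = 0.0, -1
        for cls_id, *box in labels:
            if cls_id < 0:  # padding row, no object
                continue
            overlap = iou(anchor, box)
            if overlap > best_iou:
                best_iou, best_cls = overlap, cls_id
        matched = best_iou >= threshold
        box_mask.append(1.0 if matched else 0.0)
        cls_target.append(int(best_cls) + 1 if matched else 0)
    return box_mask, cls_target


# Anchors live in normalized [0, 1] coordinates.
anchors = [(0.1, 0.1, 0.5, 0.5), (0.4, 0.4, 0.9, 0.9)]
normalized_labels = [[0, 0.12, 0.10, 0.50, 0.52]]
pixel_labels = [[0, 36, 30, 150, 156]]  # same box in pixels on a 300x300 image

print(match_anchors(anchors, normalized_labels))  # ([1.0, 0.0], [1, 0])
print(match_anchors(anchors, pixel_labels))       # ([0.0, 0.0], [0, 0])
```

The second call shows one likely way to end up with all zeros: if the anchors are normalized but the labels are in pixels, almost nothing overlaps, so every anchor is treated as background.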

A question to the general audience: are we supposed to normalize the labels to between 0.0 and 1.0 to be able to use the MultiBoxTarget function?

Because when I normalized the labels to that range, I was able to get a nonzero bounding box loss.
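For reference, this is roughly the normalization I mean, as a minimal sketch in plain Python. It assumes label rows of [class_id, xmin, ymin, xmax, ymax] in pixel coordinates and -1 padding rows that must stay untouched; `normalize_labels` is a hypothetical helper, and the 300x300 image size is only an example.

```python
def normalize_labels(labels, img_width, img_height):
    """Scale pixel-coordinate boxes to [0, 1]; leave padding rows as they are."""
    normalized = []
    for cls_id, xmin, ymin, xmax, ymax in labels:
        if cls_id < 0:  # padding row, keep unchanged
            normalized.append([cls_id, xmin, ymin, xmax, ymax])
            continue
        normalized.append([
            cls_id,
            xmin / img_width,
            ymin / img_height,
            xmax / img_width,
            ymax / img_height,
        ])
    return normalized


pixel_labels = [[0, 36, 30, 150, 156], [-1, -1, -1, -1, -1]]
scaled = normalize_labels(pixel_labels, img_width=300, img_height=300)
print(scaled)
```

The width and height should be those of the image the pixel coordinates refer to, so if the image is resized after annotation, divide by the original size (or rescale the boxes first).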