Is your “boxloss” an L1 loss between the predicted box center and width/height and the anchors? If so, did you try printing these values for one sample to see what the output is?
However, the issue is that when I use MultiboxTarget (beforehand, to compute the targets for the boxloss), I get an all-zero matrix for both the box mask and the box target. As a result, the box predictions all end up being 0.
Hence boxloss is 0.
What I don’t get is why MultiboxTarget’s output is 0 no matter which image I pass.
The network as a whole minimizes the class loss and converges. In which scenarios could MultiboxTarget return 0 for box_mask and box_target?
Can you explain what exactly the function does? If it is feasible, I might be able to implement it myself.
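For reference, here is a rough NumPy sketch of the anchor-matching idea behind SSD-style target functions. It is not the library's implementation: `iou` and `match_anchors` are hypothetical helper names, and the sketch assumes corner-format boxes `(xmin, ymin, xmax, ymax)`, anchors normalized to [0, 1], label rows of the form `[class_id, xmin, ymin, xmax, ymax]`, and a negative class id marking padding. It leaves out offset encoding and hard negative mining and only builds the per-anchor mask.

```python
import numpy as np

def iou(anchors, box):
    """IoU of each anchor (N, 4) with a single box (4,), corner format."""
    ix1 = np.maximum(anchors[:, 0], box[0])
    iy1 = np.maximum(anchors[:, 1], box[1])
    ix2 = np.minimum(anchors[:, 2], box[2])
    iy2 = np.minimum(anchors[:, 3], box[3])
    # Negative widths/heights mean no overlap, so clip them to zero.
    inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
    area_a = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    area_b = (box[2] - box[0]) * (box[3] - box[1])
    return inter / (area_a + area_b - inter)

def match_anchors(anchors, labels, overlap_threshold=0.5):
    """Simplified sketch: 1 for anchors matched to some ground-truth box."""
    mask = np.zeros(len(anchors))
    for lab in labels:
        if lab[0] < 0:  # assumption: negative class id is padding
            continue
        overlaps = iou(anchors, lab[1:5])
        # Anchors that overlap the ground truth enough count as positives.
        mask[overlaps >= overlap_threshold] = 1
        # The best anchor is also kept, but only if it overlaps at all.
        best = overlaps.argmax()
        if overlaps[best] > 0:
            mask[best] = 1
    return mask

anchors = np.array([[0.0, 0.0, 0.5, 0.5],
                    [0.5, 0.5, 1.0, 1.0],
                    [0.25, 0.25, 0.75, 0.75]])

# Label in normalized coordinates: overlaps the third anchor.
mask_norm = match_anchors(anchors, np.array([[0, 0.3, 0.3, 0.7, 0.7]]))
# Same box in pixel coordinates (300x300 image): no overlap with any anchor.
mask_pixel = match_anchors(anchors, np.array([[0, 90.0, 90.0, 210.0, 210.0]]))
# Row treated as padding because of the negative class id.
mask_pad = match_anchors(anchors, np.array([[-1, 0.3, 0.3, 0.7, 0.7]]))

print(mask_norm, mask_pixel, mask_pad)
```

Under these assumptions, the pixel-coordinate and padded labels both give an all-zero mask, so two things worth checking are whether the labels are normalized to the same [0, 1] range as the anchors and whether the class ids are non-negative.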