# Keep predicted values constant in some terms of the loss function

Hi,

I have a Gluon network that predicts a matrix from my input. I want to define a loss function like this:

• for every even column, use the L2 loss of the values themselves: l2_loss(pred[0::2,:], true[0::2,:])
• for every odd column, I want to use the L2 loss of its difference from the even column next to it: l2_loss(pred[1::2,:] - pred[0::2,:], true[1::2,:] - true[0::2,:])

If I put this together, I get something like this: lambda pred, true: l2_loss(pred[:,0::2,:], true[:,0::2,:]) + l2_loss(pred[:,1::2,:]-pred[:,0::2,:], true[:,1::2,:]-true[:,0::2,:])

However, what I want to achieve is that the -pred[:,0::2,:] part in the loss for the odd columns is treated as a constant, that is, I do not want to back-propagate the error from the second term into the even columns. I hope this explanation makes sense.

Any idea how I can achieve that?

Edit: I am now using a copy() on the -pred[:,0::2,:] term, and the network still trains. But I don’t know whether that actually does what I expect: lambda pred, true: l2_loss(pred[:,0::2,:], true[:,0::2,:]) + l2_loss(pred[:,1::2,:]-pred.copy()[:,0::2,:], true[:,1::2,:]-true[:,0::2,:])

Edit 2: Just to be on the super safe side I am now also doing a detach(): lambda pred, true: l2_loss(pred[:,0::2,:], true[:,0::2,:]) + l2_loss(pred[:,1::2,:]-pred[:,0::2,:].copy().detach(), true[:,1::2,:]-true[:,0::2,:]) but I would still like to understand what’s really needed to make this work, and whether the copy().detach() is doing what I hope or if it’s overkill or wrong…

Hi @cangerer,

Yes, detach() alone should be sufficient here (the copy() is not needed). I just tested with this example:

import mxnet as mx

x = mx.nd.random.uniform(shape=(5,3))
w = mx.nd.random.uniform(shape=(3,4))
z_t = mx.nd.random.uniform(shape=(5,2,2))

# backward() needs a gradient buffer and an autograd recording scope
w.attach_grad()
with mx.autograd.record():
    y = mx.nd.dot(x, w)
    z = y.reshape(-1,2,2)
    loss_even = ((z[:,:,0::2] - z_t[:,:,0::2])**2).sum()
    # z.detach() cuts the even columns out of the graph for the odd-column term
    loss_odd = (((z[:,:,1::2]-z.detach()[:,:,0::2]) - (z_t[:,:,1::2]-z_t[:,:,0::2]))**2).sum()
    loss = loss_even + loss_odd
    #loss = loss_even
    #loss = loss_odd
loss.backward()