Also referring to the Multiple losses topic:
It turned out that, given the following loss function (I still don't get why in MXNet the losses returned from hybrid_forward have shape (batch_size,) instead of being a scalar loss):
```python
import mxnet as mx

class SomeLoss(mx.gluon.loss.Loss):
    def __init__(self, weight=1., batch_axis=0, **kwargs):
        super(SomeLoss, self).__init__(weight=weight, batch_axis=batch_axis, **kwargs)

    def hybrid_forward(self, F, x, sample_weight=None):
        y = F.sign(data=x)          # hard sign of the input
        b_n = 0.5 * (y + 1)         # map {-1, +1} to {0, 1}
        mu_m = F.mean(b_n, axis=0)  # mean activation over the batch
        loss = F.square(mu_m - 0.5)
        return loss
```
the gradients do not backpropagate correctly, which is why the weights of the network are not updated!
I further found out that this is related to the fact that x only enters the computation through F.sign(…); this function is non-differentiable at x=0 and its gradient is zero everywhere else, so no gradient can flow back to x.
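This is easy to check with autograd (a minimal sketch; the input shape is just made up for illustration):

```python
import mxnet as mx
from mxnet import autograd

x = mx.nd.random.normal(shape=(4, 3))
x.attach_grad()

with autograd.record():
    y = mx.nd.sign(x)
    b_n = 0.5 * (y + 1)
    mu_m = mx.nd.mean(b_n, axis=0)
    loss = mx.nd.square(mu_m - 0.5)
loss.backward()

print(x.grad)  # all zeros: sign() contributes no gradient
```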
As a solution we could approximate the sign with F.sigmoid/F.tanh (a sketch of that workaround is at the end of this post), but I still wonder why the backend cannot handle this, since for the following loss:
```python
class OtherLoss(mx.gluon.loss.Loss):
    def __init__(self, weight=1., batch_axis=0, **kwargs):
        super(OtherLoss, self).__init__(weight=weight, batch_axis=batch_axis, **kwargs)

    def hybrid_forward(self, F, x, sample_weight=None):
        y = F.sign(data=x)
        b_n = 0.5 * (y + 1)
        loss = F.square(b_n - x)  # x also enters outside of F.sign, so it receives a gradient
        loss = F.mean(loss, axis=0, exclude=True)  # keep the batch axis, average over the rest
        return loss
```
the gradients are calculated and the weights are updated.
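For completeness, here is a rough sketch of the sigmoid workaround mentioned above; the steepness factor k is an arbitrary choice of mine, not anything prescribed by MXNet:

```python
class SoftSomeLoss(mx.gluon.loss.Loss):
    def __init__(self, k=10.0, weight=1., batch_axis=0, **kwargs):
        super(SoftSomeLoss, self).__init__(weight=weight, batch_axis=batch_axis, **kwargs)
        self._k = k  # steepness of the sigmoid; larger values get closer to sign()

    def hybrid_forward(self, F, x, sample_weight=None):
        # smooth approximation of 0.5 * (sign(x) + 1)
        b_n = F.sigmoid(self._k * x)
        mu_m = F.mean(b_n, axis=0)
        loss = F.square(mu_m - 0.5)
        return loss
```

With this version the x.grad check from above gives non-zero gradients, so the weights do get updated.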