Dense layer can't specify linear explicitly

dmadeka · October 9, 2017, 1:42am

I’m curious why gluon.nn.Dense doesn’t support activation='linear', with both an unspecified activation and activation='linear' giving the same behavior.

I’m mostly asking because a) in Python it’s usually considered better to be explicit, and b) for pedagogical purposes it seems to make the code nicer to read. Just curious why the decision was made.

smolix · October 9, 2017, 3:19am

Thanks for the suggestion. I guess this was simply made to keep the code looking clean (otherwise you’d have to intercept the ‘linear’ argument specifically for every layer that’s being coded up, thus adding quite a bit of spurious code).

dmadeka · October 9, 2017, 3:21am

I’m not sure I understand, @smolix. A keyword argument means they wouldn’t have to specify it each time, but could if they wanted to. I’m thinking something like:

def Dense(x, activation='linear'):

So, in case you did want to make it clear, you could. And I don’t think it changes the current behavior, but I might be wrong?

smolix · October 9, 2017, 3:26am

I understand what you want. But at some point on the backend someone needs to take care of the argument. Not a big deal … if you feel strongly about it, why not create a pull request that implements it for all available layers? This is open source.

dmadeka · October 9, 2017, 3:27am

Sure, I can have a go. I just wanted to make sure I wasn’t diving deep before discovering there was a systematic reason not to do that.

smolix · October 9, 2017, 3:28am

Best check with Junyuan. But AFAICT it’s just that nobody so far asked for this.

dmadeka · October 9, 2017, 3:31am

@piiswrong I was thinking of just modifying this line:

github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/nn/basic_layers.py#L194


             in_units=0, **kwargs):
    super(Dense, self).__init__(**kwargs)
    self._flatten = flatten
    with self.name_scope():
        self._units = units
        self._in_units = in_units
        self.weight = self.params.get('weight', shape=(units, in_units),
                                      init=weight_initializer,
                                      allow_deferred_init=True)
        if use_bias:
            self.bias = self.params.get('bias', shape=(units,),
                                        init=bias_initializer,
                                        allow_deferred_init=True)
        else:
            self.bias = None
        if activation is not None:
            self.act = Activation(activation, prefix=activation+'_')
        else:
            self.act = None


def hybrid_forward(self, F, x, weight, bias=None):

to something like:

if activation is not None and activation != 'linear':
    self.act = Activation(activation, prefix=activation+'_')
else:
    self.act = None
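
For illustration only, the same check can be sketched without MXNet. The names normalize_activation and DenseSketch below are hypothetical stand-ins, not part of gluon, and the sketch stores only the activation name where a real layer would build an Activation block:

```python
def normalize_activation(activation):
    """Treat the proposed alias 'linear' the same as no activation (None)."""
    if activation is None or activation == 'linear':
        return None
    return activation


class DenseSketch:
    """Hypothetical stand-in for a layer constructor; not gluon.nn.Dense."""

    def __init__(self, units, activation=None):
        self.units = units
        # A real layer would construct an Activation block here;
        # this sketch keeps only the normalized name.
        self.act = normalize_activation(activation)


explicit = DenseSketch(10, activation='linear')
implicit = DenseSketch(10)
print(explicit.act is None and implicit.act is None)
```

Under these assumptions both spellings end up with self.act set to None, so the existing behavior for activation=None would be unchanged.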

dmadeka · October 9, 2017, 5:31am

Okay, it looks like there are three real options here. None seem too appealing to me, which might be why it was never done:

1. What I suggested above, which adds lines of code to the files basic_layers.py, conv_layers.py, rnn_cell.py and rnn_layer.py.
2. Adding a decorator to the __init__ functions or (slightly preferable) using a common metaclass/base class and overriding the activation attribute to be None if it is 'linear'.
3. Changing the underlying C/CUDA code to allow a linear activation. This, if possible, would be the best approach, since my guess is it would apply to all other languages, avoiding the creation of an inconsistent interface. What I don’t see is how this might impact performance or create unnecessary overhead every time a new layer is created.

I’m happy to do any of the three, or to let it be if these complications were why it wasn’t done in the first place.
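
As a rough idea of what the decorator option could look like, here is a minimal sketch in plain Python. The decorator linear_as_none and the class ToyDense are hypothetical names made up for this example; nothing here is existing MXNet code, and it assumes the constructor takes a parameter literally named activation:

```python
import functools
import inspect


def linear_as_none(init):
    """Hypothetical decorator: rewrite activation='linear' to None
    before the wrapped __init__ runs."""
    sig = inspect.signature(init)

    @functools.wraps(init)
    def wrapper(self, *args, **kwargs):
        # Bind the call to the signature so the check works whether
        # activation was passed positionally or by keyword.
        bound = sig.bind(self, *args, **kwargs)
        if bound.arguments.get('activation') == 'linear':
            bound.arguments['activation'] = None
        return init(*bound.args, **bound.kwargs)

    return wrapper


class ToyDense:
    """Stand-in for a layer class; it only records its arguments."""

    @linear_as_none
    def __init__(self, units, activation=None):
        self.units = units
        self.activation = activation


layer = ToyDense(16, activation='linear')
print(layer.activation is None)
```

The decorator keeps the interception in one place, but it still has to be applied to every layer's __init__, which is the extra surface for bugs that the first option also has.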

smolix · October 9, 2017, 3:38pm

Quite honestly, I would probably let it slide. There’s no extra functionality that this offers. It’s just that it looks prettier by some standard. But it adds extra code that means that there’s extra space for introducing bugs. So, unless someone badly wants it, let’s keep it the way it is.
