How to apply custom initialization to only a specific layer?

Hi, I want to apply custom initialization to only a specific layer, like the code below in PyTorch, but I don’t know how to do it in MXNet.
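Something along these lines (a simplified sketch, not my exact code; the layer sizes and the constant value are just placeholders):

import torch

net = torch.nn.Sequential(
    torch.nn.Linear(3, 5),
    torch.nn.Linear(5, 1),
)
# custom initialization for the first layer only
torch.nn.init.constant_(net[0].weight, 2.0)
torch.nn.init.zeros_(net[0].bias)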

Just use the weight_initializer and bias_initializer parameters.
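For example, a quick sketch (mx.init.Constant and mx.init.Zero here are just stand-ins for whatever custom initializers you actually want):

import mxnet as mx

net = mx.gluon.nn.Sequential()
net.add(
    mx.gluon.nn.Dense(5, in_units=3,
                      weight_initializer=mx.init.Constant(2),
                      bias_initializer=mx.init.Zero()),  # custom init for this layer only
    mx.gluon.nn.Dense(1, in_units=5),                    # default init for the rest
)
net.collect_params().initialize(mx.init.Xavier())  # the other layers get Xavier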

2 Likes

@adrian Thanks for the reply!
But what about when I initialize the network later, with net.collect_params().initialize(init.Xavier())? Will that layer get reset?

No, the layer won’t get reset unless you explicitly set force_reinit=True.

Also, you can do what you want by referencing the layer directly when initializing the network, with something like:
net.W.collect_params().initialize(my_custom_init)
and then initializing the other layers separately.

Alternatively, you can initialize the entire network first and then force-reinitialize the particular layer you want a custom initializer for:

net.collect_params().initialize(init.Xavier())
net.W.collect_params().initialize(my_custom_init, force_reinit=True)
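Putting that together, a minimal sketch (assuming the layer you care about is exposed as an attribute W on the block, and using a constant initializer as the stand-in for your custom init):

import mxnet as mx

class MyNet(mx.gluon.nn.HybridBlock):
    def __init__(self, **kwargs):
        super(MyNet, self).__init__(**kwargs)
        with self.name_scope():
            self.W = mx.gluon.nn.Dense(5, in_units=3)    # the layer to customize
            self.out = mx.gluon.nn.Dense(1, in_units=5)

    def hybrid_forward(self, F, x):
        return self.out(self.W(x))

net = MyNet()
net.collect_params().initialize(mx.init.Xavier())                          # whole network first
net.W.collect_params().initialize(mx.init.Constant(2), force_reinit=True)  # then just W
print(net.W.weight.data())  # all 2s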

3 Likes

@MrDough I think you might be misinterpreting what setting a parameter initializer does. It does not cause that parameter to be initialized immediately. Instead, it associates an initializer with that weight’s Parameter object, to be used whenever initialization is finally invoked.

So until you call net.collect_params().initialize(init.Xavier()), no initialization has actually been performed. When you do call it, MXNet (Gluon) iterates over each parameter: if a parameter has a custom initializer associated with it, that initializer is used; otherwise the one you passed to initialize is used.

This code should help clarify what’s going on.

import mxnet as mx
import numpy as np


if __name__ == '__main__':
    net = mx.gluon.nn.Sequential(prefix='mynet_')
    with net.name_scope():
        net.add(
            mx.gluon.nn.Dense(5, in_units=3, weight_initializer=mx.init.Constant(2)),  # weight name will be mynet_dense0_weight
            mx.gluon.nn.Dense(1, in_units=5)
        )

    params = net.collect_params()
    target_param = 'mynet_dense0_weight'
    try:
        print(params[target_param].data())
    except Exception:
        # an exception will always be caught because we didn't initialize yet
        print('mynet_dense0_weight has not been initialized so .data() failed!')
        print(f'but it does have a custom initializer set for it: {params[target_param].init}')

    # only after this call is initialization run
    net.initialize(mx.init.Xavier())

    # but parameters with custom initializers set for them will defer to them, not Xavier
    np.testing.assert_allclose(params[target_param].data().asnumpy(), np.full((5, 3), 2.))

    # re-initing doesn't matter, it will still defer to any custom initializers set for the parameters
    net.initialize(mx.init.Xavier(), force_reinit=True)
    np.testing.assert_allclose(params[target_param].data().asnumpy(), np.full((5, 3), 2.))

In that example, I provided a weight initializer for a Dense layer rather than a Conv2d, but it’s the same concept.

1 Like