Hi, I want to apply custom initialization to only a specific layer like the code below in PyTorch, but I don't know how to do it in MXNet.
Just use the weight_initializer and bias_initializer parameters.
@adrian Thanks for the reply!
But how about when I initialize the network later, like net.collect_params().initialize(init.Xavier()) ? Will that layer get reset?
No, the layer won’t get reset unless you explicitly set force_reinit=True.
Also, you can do what you want by referencing the layer directly when initializing the network, with something like:
net.W.collect_params().initialize(my_custom_init)
and then initializing the other layers separately.
Alternatively, you can initialize the entire network first and then force-reinitialize the particular layer you want custom initialization for:
net.collect_params().initialize(init.Xavier())
net.W.collect_params().initialize(my_custom_init, force_reinit=True)
@MrDough I think you might be misinterpreting what setting a parameter initializer does. It does not cause that parameter to be initialized immediately. What it does is associate an initializer with the weight Parameter object, to be called whenever initialization is finally invoked.
So until you call net.collect_params().initialize(init.Xavier()), no initialization has actually been performed. When you do call it, MXNet (Gluon) iterates over each parameter. If a parameter has a custom initializer associated with it, that one is used; otherwise, the initializer you passed to initialize is used.
This code should help clarify what’s going on.
import mxnet as mx
import numpy as np

if __name__ == '__main__':
    net = mx.gluon.nn.Sequential(prefix='mynet_')
    with net.name_scope():
        net.add(
            mx.gluon.nn.Dense(5, in_units=3, weight_initializer=mx.init.Constant(2)),  # weight name will be mynet_dense0_weight
            mx.gluon.nn.Dense(1, in_units=5)
        )
    params = net.collect_params()
    target_param = 'mynet_dense0_weight'
    try:
        print(params[target_param].data())
    except RuntimeError:
        # this is always raised here because we haven't initialized yet
        print('mynet_dense0_weight has not been initialized so .data() failed!')
        print(f'but it does have a custom initializer set for it: {params[target_param].init}')
    # only after this call is initialization run
    net.initialize(mx.init.Xavier())
    # but parameters with custom initializers set for them will defer to them, not Xavier
    np.testing.assert_allclose(params[target_param].data().asnumpy(), np.full((5, 3), 2.))
    # re-initing doesn't matter; it will still defer to any custom initializers set for the parameters
    net.initialize(mx.init.Xavier(), force_reinit=True)
    np.testing.assert_allclose(params[target_param].data().asnumpy(), np.full((5, 3), 2.))
In that example, I provided a weight initializer for a Dense layer rather than a Conv2d, but it’s the same concept.