Which initializer will be used if I specify an initializer in Conv2D and then call initialize()?

Below is my code:

import mxnet as mx
from mxnet import nd
from mxnet.gluon import nn

class ConvBlock(nn.HybridBlock):
    def __init__(self, **kwargs):
        super(ConvBlock, self).__init__(**kwargs)
        self.feature = nn.HybridSequential()
        self.feature.add(nn.Conv2D(16, 3, 1, 1, weight_initializer=mx.init.Xavier()))
        self.feature.add(nn.BatchNorm())
        self.feature.add(nn.Activation('relu'))

    def hybrid_forward(self, F, x):
        return self.feature(x)

if __name__ == '__main__':
    net = ConvBlock()
    net.initialize()   # what initializer is used actually to initialize the net parameters ?

My question is: what is the actual initializer used to initialize net.feature[0], i.e. the Conv2D layer?

If no initialization method is specified, as in net.initialize(), MXNet uses its default random initialization: each element of the weight parameter is sampled from a uniform distribution U[−0.07, 0.07], and all bias parameters are set to 0. However, a per-parameter initializer such as weight_initializer takes precedence over the one passed to initialize(), so your Conv2D weights will be initialized with mx.init.Xavier(). Please refer to http://d2l.ai/chapter_multilayer-perceptrons/numerical-stability-and-init.html#parameter-initialization for more information.

For example, if you use net.initialize(mx.init.One()) together with weight_initializer=mx.init.One(), as in:

net = nn.Sequential()
net.add(nn.Conv2D(16, 3, 1, 1, weight_initializer=mx.init.One(), activation='relu'))
net.add(nn.Dense(10))
net.initialize(mx.init.One()) 

x = nd.random.uniform(shape=(1,1,2, 10))
net(x)
print(net[1].weight.data())

You can see that the weights are:

[[1. 1. 1. … 1. 1. 1.]
[1. 1. 1. … 1. 1. 1.]
[1. 1. 1. … 1. 1. 1.]

[1. 1. 1. … 1. 1. 1.]
[1. 1. 1. … 1. 1. 1.]
[1. 1. 1. … 1. 1. 1.]]
<NDArray 10x320 @cpu(0)>

But if you use net.initialize() together with weight_initializer=mx.init.One(), as in:

net = nn.Sequential()
net.add(nn.Conv2D(16, 3, 1, 1, weight_initializer=mx.init.One(), activation='relu'))
net.add(nn.Dense(10))
net.initialize() 

x = nd.random.uniform(shape=(1,1,2, 10))
net(x)
print(net[1].weight.data())

The weights are:

[[-0.02335968 -0.01407015 -0.05864581 … 0.05376593 -0.03404731
-0.06720807]
[-0.05495147 0.06723212 0.0113286 … -0.00772312 0.01102494
0.05772652]
[ 0.00861803 -0.06468396 0.05282693 … 0.06916221 -0.01219748
-0.0272661 ]

[ 0.04516336 0.0003779 0.01198862 … -0.04161773 0.02530347
0.06112661]
[-0.00585618 -0.02169332 -0.01204515 … -0.05935499 -0.03586181
-0.06428157]
[-0.0113162 0.01251774 -0.05391466 … 0.00918427 0.03665156
0.02068089]]
<NDArray 10x320 @cpu(0)>