First of all, welcome to the community.
Now, to your solution.
1 - You have to use nn.HybridBlock (which can be hybridized) instead of nn.Block.
2 - So you have to write a "hybrid_forward" method, not "forward".
3 - While using HybridBlock, remember to include "F" as an argument in your "hybrid_forward" definition.
4 - You have to hybridize the layer that you want to insert.
Hybridizing converts your code from a dynamic graph to a static graph, which can then be attached to another static graph (if needed). In MXNet we usually (or I should say 99.9999% of the time) hybridize our model for better performance. For more about Gluon performance check out this, for more about Gluon check out this, this, this and this, and this HOLY BOOK FOR GLUON.
And in your case, we can plug a hybridized Gluon layer into a Symbol model.
Below is working code for a toy example:
import mxnet as mx
from mxnet.gluon import nn

class gluon_layer(nn.HybridBlock):
    def __init__(self, **kwargs):
        super(gluon_layer, self).__init__(**kwargs)
        self.dense = nn.Dense(128, 'relu')

    def hybrid_forward(self, F, x):
        return self.dense(x)

GLUON_LAYER = gluon_layer()
GLUON_LAYER.hybridize()

data = mx.sym.var('data')
layer1 = GLUON_LAYER(data)
layer2 = mx.sym.FullyConnected(data=layer1, num_hidden=10)
output = mx.sym.SoftmaxOutput(data=layer2, name='softmax')
Getting our toy MNIST data and training the model:
mnist = mx.test_utils.get_mnist()
train_iter = mx.io.NDArrayIter(mnist['train_data'], mnist['train_label'], 128, shuffle=True)

# create a module
module = mx.mod.Module(symbol=output,
                       context=mx.gpu(),  # change to mx.cpu() if you don't have a GPU
                       data_names=['data'],
                       label_names=['softmax_label'])

# fit the module
module.fit(train_iter,
           optimizer='sgd',
           optimizer_params={'learning_rate': 0.1},
           num_epoch=5)
This will print:
INFO:root:Epoch[0] Train-accuracy=0.779351
INFO:root:Epoch[0] Time cost=0.674
INFO:root:Epoch[1] Train-accuracy=0.908615
INFO:root:Epoch[1] Time cost=0.636
INFO:root:Epoch[2] Train-accuracy=0.924257
INFO:root:Epoch[2] Time cost=0.789
INFO:root:Epoch[3] Train-accuracy=0.934768
INFO:root:Epoch[3] Time cost=0.792
INFO:root:Epoch[4] Train-accuracy=0.943230
INFO:root:Epoch[4] Time cost=0.800
Though I wouldn't recommend the practice I've shown above: mixing Gluon and Symbol is not a recommended way to build and train your model.
And as far as I know, there is literally nothing you can't do with Gluon. As a personal suggestion, I'd recommend you use Gluon; it's a lot more flexible and easier to write and debug.
Hope this helps.