NLP prediction using a pretrained CNN model


Following the tutorial posted here, I tried loading the model from a checkpoint so that I could make predictions on single samples of text (in other words, batches of 1 sample). However, because the model was trained with a batch size of 50, binding it with a batch size of 1 fails.

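import mxnet as mx

# Load the symbol and parameters saved at epoch 3 under the prefix 'cnn'.
sym, arg_params, aux_params = mx.model.load_checkpoint('cnn', 3)
mod = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
# Bind for inference with a single sample of 56 tokens.
mod.bind(for_training=False, data_shapes=[('data', (1, 56))])
mod.set_params(arg_params, aux_params, allow_missing=True)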

The above code breaks with:

data: (1, 56)
Error in operator reshape0: [20:32:53] src/operator/tensor/./matrix_op-inl.h:179: Check failed: oshape.Size() == dshape.Size() Target shape size is different to source. Target: 840000
Source: 16800

This is because the CNN model has several Reshape layers which are configured based on the batch size:

conv_input = mx.sym.Reshape(data=embed_layer, target_shape=(batch_size, 1, sentence_size, num_embed))
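The two numbers in the error message follow directly from that hard-coded shape. Below is a minimal pure-Python sketch of the arithmetic, not MXNet code. It assumes num_embed = 300, which is only inferred from the error message (16800 / 56) and is not stated in the post, and resolve_shape is a hypothetical helper written for illustration. It also shows how a reshape whose batch dimension is inferred (written as -1 here) would adapt to any batch size. MXNet's Reshape operator documents a shape argument that accepts such special values, but whether rebuilding the symbol that way works with this particular checkpoint is untested here.

```python
from math import prod

# Values from the post: training batch size 50, sentence length 56.
# NUM_EMBED = 300 is an assumption inferred from the error message (16800 / 56).
TRAIN_BATCH_SIZE = 50
SENTENCE_SIZE = 56
NUM_EMBED = 300


def resolve_shape(target_shape, source_size):
    """Hypothetical helper: fill in at most one -1 entry so sizes match."""
    if target_shape.count(-1) > 1:
        raise ValueError("only one dimension can be inferred")
    known = prod(d for d in target_shape if d != -1)
    if -1 not in target_shape:
        if known != source_size:
            raise ValueError(f"Target: {known} Source: {source_size}")
        return tuple(target_shape)
    if source_size % known != 0:
        raise ValueError(f"cannot split {source_size} elements into {target_shape}")
    return tuple(source_size // known if d == -1 else d for d in target_shape)


# Elements coming out of the embedding layer for a single sentence.
source_size = prod((1, SENTENCE_SIZE, NUM_EMBED))

# The hard-coded reshape always expects a full training batch.
fixed_target = (TRAIN_BATCH_SIZE, 1, SENTENCE_SIZE, NUM_EMBED)
fixed_size = prod(fixed_target)

print(source_size, fixed_size)  # 16800 840000

# With an inferred batch dimension the same reshape fits one sample.
flexible_target = (-1, 1, SENTENCE_SIZE, NUM_EMBED)
print(resolve_shape(flexible_target, source_size))  # (1, 1, 56, 300)
```

The fixed target only matches when exactly 50 sentences are fed in, which is why binding with a (1, 56) data shape trips the size check.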

The question is: how can I load the model and use it to predict on a single sample of text? I do not want to train with a batch size of 1, because that is not optimal.

If the network architecture depends on the batch size, you need to feed in data with the same batch size. A possible solution is to repeat your input 50 times to form data of shape (50, 56).
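A rough sketch of that padding step in plain Python, assuming the sentence is already encoded as 56 word indices. The helper names pad_to_batch and first_prediction are hypothetical, and the zeros only stand in for real word ids.

```python
def pad_to_batch(sample, batch_size):
    """Repeat one encoded sentence until it fills a whole batch."""
    return [list(sample) for _ in range(batch_size)]


def first_prediction(batch_outputs):
    """Every row is a copy of the same sentence, so keep only the first output row."""
    return batch_outputs[0]


# Hypothetical encoded sentence: 56 word indices (zeros are placeholders).
sample = [0] * 56
batch = pad_to_batch(sample, 50)
print(len(batch), len(batch[0]))  # 50 56
```

The nested list would then be converted to an NDArray (for example with mx.nd.array) and passed to a module bound with a (50, 56) data shape. Only the first row of the output is needed, at the cost of 49 redundant forward computations per prediction.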


That is indeed a way of solving the issue, and I have actually tried it. However, I don't think brute-forcing my way into using the model like this is elegant.

I face the same problem. Is there a solution other than repeating the data to match the batch_size?