Serialize MXNet model cache to reduce model init time?

First time MXNet user here.

I am trying to deploy a Seq2Seq model using Sockeye, which uses MXNet. At inference time the model seems pretty quick, except the first time it sees a piece of text longer than any it has seen before (I'm guessing MXNet has to allocate/bind for the new input shape).

I figured out a work-around: at startup, I warm the model up with texts spanning the range of lengths I expect at inference time, from short to long (see the sketch below).
That fixed the latency spikes, but now the model initialization takes too long for my use case.
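For reference, my warm-up looks roughly like this, if I'm reading the Sockeye 1.x Python API right (`make_input_from_plain_string` and `Translator.translate` are the calls I'm using; `translator` is the `sockeye.inference.Translator` I already load at startup, loading omitted here):

```python
import sockeye.inference

# Dummy inputs spanning the lengths I expect in production,
# from short to long, so every input size gets exercised once.
warmup_texts = [
    "short text",
    "a medium length piece of text " * 3,
    "a long piece of text, repeated to pad it out " * 10,
]

for i, text in enumerate(warmup_texts):
    trans_input = sockeye.inference.make_input_from_plain_string(
        sentence_id=i, string=text)
    translator.translate([trans_input])  # output discarded; warm-up only
```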

I am wondering how I can initialize the model once with a custom set of texts, serialize whatever cache that builds, and ship it alongside the model to deployment.
Then I would just need to (a) load the architecture, (b) load the weights, and (c) load the cache.
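To make it concrete, here is the flow I'm imagining at deployment time. Steps (a) and (b) use real MXNet calls; step (c) is purely hypothetical (`load_executor_cache` is a name I made up, and as far as I can tell nothing like it exists in MXNet today):

```python
import mxnet as mx

# (a) + (b): standard MXNet checkpoint loading -- these calls exist.
sym, arg_params, aux_params = mx.model.load_checkpoint("model", 0)

MAX_LEN = 100  # longest source length I expect
mod = mx.mod.Module(symbol=sym, data_names=("data",), label_names=None,
                    context=mx.cpu())
mod.bind(data_shapes=[("data", (1, MAX_LEN))], for_training=False)
mod.set_params(arg_params, aux_params)

# (c): the part I'm asking about -- restore the executors / memory plans
# that were built during an offline warm-up run.
mod.load_executor_cache("model.cache")  # <-- hypothetical, does not exist
```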

Has anybody tried anything like this? Or have an opinion on whether this would work (i.e., reduce the model initialization time)? Or any other ideas for achieving the same thing?

I am open to suggestions.

If you have tried something like this, or have come across cases where it has been tried, can you point me to them?