Saving and loading cuDNN autotune and graph optimization results

Hello everyone,

I created a feature request on GitHub about saving cuDNN autotune and graph fusion results to disk so that they can be reloaded later:

In our use case TensorRT is very helpful but requires a long start-up time.
The start-up takes so long that our executable gets killed by external programs because it does not reply to ping requests in time.

Is anyone interested in this feature, too?


Thanks for creating the feature request. A similar request was opened a while ago: It would certainly be helpful to cache autotune results.


I am interested in this feature. It would make MXNet much simpler to deploy in production.

Running multiple MXNet processes on the same server carries this risk: if several cuDNN autotunes are triggered at the same time, the resulting spike in memory usage may cause out-of-memory errors.
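
Until autotune results can be cached, one possible stopgap is to make the processes take turns during warm-up so that only one autotune runs at a time. Below is a minimal sketch, assuming a Unix system (it uses `fcntl`) and assuming that the first forward pass is what triggers autotune. The names `autotune_lock`, `warm_up`, and the lock file path are hypothetical helpers for illustration, not part of MXNet.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock path; any location visible to all processes on the server works.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "cudnn_autotune.lock")


@contextmanager
def autotune_lock(path=LOCK_PATH):
    """Hold an exclusive inter-process lock for the duration of the block."""
    with open(path, "w") as lock_file:
        # Blocks until every other process holding the lock has released it.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def warm_up(run_first_forward):
    """Run the first forward pass (assumed to trigger autotune) under the lock."""
    with autotune_lock():
        return run_first_forward()
```

Each process would call `warm_up` with a function that runs one forward pass on a dummy batch before it starts serving. This only avoids the simultaneous memory spike; it does not shorten start-up, and actually makes the last process wait longer. If autotune is not needed at all, setting the environment variable `MXNET_CUDNN_AUTOTUNE_DEFAULT=0` should disable it, at the cost of possibly slower convolutions.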