What triggers performance tests for best convolution?

mikeobr · January 22, 2019, 7:13pm

I haven’t been able to find thorough documentation on exactly how the performance tests are run (controlled by MXNET_CUDNN_AUTOTUNE_DEFAULT=1).

If I pass three different input shapes through a network, will the tests run three times? And when running inference on thousands of inputs, will they run only once per input shape?

ThomasDelteil · January 22, 2019, 7:46pm

Here is the relevant part of the code:
src/operator/nn/cudnn/cudnn_algoreg-inl.h:89

    ParamKey key{param, in_shape[0], in_shape[1], out_shape[0], cudnn_data_type,
                 cudnn_forward_compute_type, cudnn_backward_compute_type, sm_arch, add_to_weight};
    auto i = reg_.find(key);
    if (i != reg_.end()) {
      *fwd = i->second.fwd;
      *bwd = i->second.bwd;
      *flt = i->second.flt;
    } else {
    ... (find best algo)

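As you can see, the search is run once per unique key. Since the key includes the input and output shapes, three different shapes for a given convolution trigger three searches, and later inputs with a shape that has already been seen reuse the cached result.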
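To make the caching behaviour concrete, here is a minimal standalone sketch of the same lookup pattern. It is not MXNet code: `ShapeKey`, `AlgoRegistry`, `ExpensiveSearch` and `CountSearches` are hypothetical names invented for illustration, and the key is reduced to the input shape only, whereas the real `ParamKey` also carries the parameters, data types and architecture shown above.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for ParamKey: only the input shape is used here.
using ShapeKey = std::vector<int>;

// Hypothetical stand-in for the cached forward/backward/filter algorithm choice.
struct Algo {
  int fwd;
  int bwd;
  int flt;
};

class AlgoRegistry {
 public:
  // Returns the cached algorithms for `key`; runs the search only on a cache miss.
  Algo Find(const ShapeKey& key) {
    auto it = reg_.find(key);
    if (it != reg_.end()) {
      return it->second;  // Cache hit: no performance test is run.
    }
    ++num_searches_;
    Algo best = ExpensiveSearch(key);
    reg_.emplace(key, best);
    return best;
  }

  std::size_t num_searches() const { return num_searches_; }

 private:
  // Placeholder for the autotuning step; a real search would time each algorithm.
  static Algo ExpensiveSearch(const ShapeKey&) { return Algo{0, 0, 0}; }

  std::map<ShapeKey, Algo> reg_;
  std::size_t num_searches_ = 0;
};

// Looks up three distinct shapes `repeats` times each and
// returns how many searches were actually performed.
std::size_t CountSearches(int repeats) {
  AlgoRegistry registry;
  const std::vector<ShapeKey> shapes = {
      {1, 3, 224, 224}, {1, 3, 256, 256}, {1, 3, 512, 512}};
  for (int i = 0; i < repeats; ++i) {
    for (const auto& shape : shapes) {
      registry.Find(shape);
    }
  }
  return registry.num_searches();
}
```

Under these assumptions, `CountSearches(1)` and `CountSearches(1000)` both perform three searches: one per distinct shape, regardless of how many inputs follow.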
