Manually compiled MXNet 1.3.0 release code runs much slower than the official release

I compiled the 1.3.0 release code following the official instructions.
Code download address:
Compile method reference:
Build MXNet from Source
The compile command is:
make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1
The compilation succeeds, but the resulting library performs very differently from the officially published version.
Test method: face detection with the MTCNN algorithm on a single GPU. The published version processes 90 images per second, while my own build processes only 20 images per second (GPU utilization is very low during the run).

I would like to ask you a few questions:

  1. The libmxnet.so I compiled is only 295M, while the official release is 500M. What’s the difference?
  2. I later found that the library I compiled is a Debug build. How can I compile the release version?
  3. Why is the performance of my own build so low?

The difference can come from several factors. Usually it comes from the OpenBLAS or OpenCV version you are compiling against, and from which flags have been set for these libraries. We are moving the official publishing pipeline to the public Jenkins, so you will be able to see all the flags for each library, and see which libraries are statically linked and which ones are dynamically linked.
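
In the meantime, one rough way to compare the two builds yourself is to look at the file size and at what `ldd` reports for each libmxnet.so. This is only a minimal sketch: the `LIBS` variable and the function names `filter_deps` and `inspect_lib` are made up for illustration, and the list of library names in the pattern is an assumption about which dependencies matter here. A library that does not show up in the `ldd` output is either statically linked into libmxnet.so or not used at all, and static linking is one possible reason for a larger file.

```shell
#!/bin/sh
# Sketch: report the size of a shared library and which of its
# BLAS / OpenCV / CUDA dependencies are resolved dynamically.

# Keep only the ldd lines that mention the libraries of interest.
# Reads ldd-style output on stdin; prints nothing if none match.
filter_deps() {
    grep -iE 'openblas|opencv|cudnn|cudart|cublas' || true
}

# Print the size in bytes and the filtered dynamic dependencies of $1.
inspect_lib() {
    if [ ! -f "$1" ]; then
        echo "not found: $1" >&2
        return 1
    fi
    echo "== $1"
    echo "size_bytes: $(wc -c < "$1")"
    ldd "$1" | filter_deps
}

# LIBS is a hypothetical, space-separated list of libraries to compare.
# Nothing is inspected when it is unset.
for lib in ${LIBS:-}; do
    inspect_lib "$lib"
done
```

Saved as, say, inspect.sh, it could be run as `LIBS="lib/libmxnet.so /path/to/official/libmxnet.so" sh inspect.sh`, where both paths are placeholders for your own build and the published one. If the two outputs list different OpenBLAS or OpenCV files, that is a good first place to look.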
