Very low CPU utilization

When I run CNN prediction on CPU with MXNet (Scala), CPU utilization never goes above 15%. The machine doesn’t appear to be doing much I/O either.

Any idea what could be going wrong?

Could you share more information, such as the MXNet version and how it was installed (built from source or via pip; if built from source, which compile flags were used)? If possible, could you also share the script you ran for prediction?


Try raising the number of CPU worker threads:

export MXNET_CPU_WORKER_NTHREADS=<a larger integer>
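
For example, here is a minimal sketch, assuming a Linux or macOS shell where `nproc` or `getconf` is available. Using one worker thread per logical core is only an assumed starting point; the `NCORES` variable is a name made up for this example.

```shell
# Count the logical cores visible to this shell (fall back to getconf, then to 1).
NCORES=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Assumption: one MXNet CPU worker thread per logical core as a first try.
export MXNET_CPU_WORKER_NTHREADS="$NCORES"

# Print the value so it can be checked before launching the job.
echo "MXNET_CPU_WORKER_NTHREADS=$MXNET_CPU_WORKER_NTHREADS"
```

The variable has to be present in the environment of the process that loads MXNet, so export it in the same shell (or launcher script) that starts the JVM.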


Using MXNet 0.11.0:

  • built from source
  • OpenMP enabled
  • Using MKL (MKLML)
  • Not using experimental MKL
  • Using Lapack

Per the suggestions on the MXNet CPU performance page, I tried setting the thread affinity and the number of threads available to OpenMP, but it made no difference.
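
For anyone following along, settings of this kind usually look something like the sketch below. The values are illustrative assumptions, not the ones used in this post: half the logical cores is a common guess for hyper-threaded machines, and `KMP_AFFINITY` only has an effect with the Intel OpenMP runtime.

```shell
# Count logical cores (fall back to getconf, then to 1).
NCORES=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Assumption: use half the logical cores, but never fewer than one thread.
HALF=$(( NCORES / 2 ))
if [ "$HALF" -lt 1 ]; then
  HALF=1
fi

# Number of threads OpenMP may use.
export OMP_NUM_THREADS="$HALF"

# Thread pinning for the Intel OpenMP runtime; ignored by other runtimes.
export KMP_AFFINITY="granularity=fine,compact,1,0"

echo "OMP_NUM_THREADS=$OMP_NUM_THREADS KMP_AFFINITY=$KMP_AFFINITY"
```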

I can’t share all of my prediction code, but it’s pretty standard:

  • Features given as List[Array[Float]]
  • Turned into an NDArray
  • Wrapped into an NDArrayIter
  • Passed to Module::predict

I’ll give that a shot and report back. Thanks.