Python multithread queue and multi process on CPU question

Hi,

I am trying to extract features and run inference on the CPU. I have a MacBook Pro with 8 cores and 16 GB of RAM.

mxnet: 1.3.1
mxnet-mkl: 1.3.1
Python: 3.6

In my Python code I try to set the following environment variables:

os.environ["MXNET_CPU_WORKER_NTHREADS"] = "4"
os.environ["MXNET_CPU_PRIORITY_NTHREADS"] = "8"
os.environ["OMP_NUM_THREADS"] = "8"
os.environ["MXNET_CPU_NNPACK_NTHREADS"] = "8"
os.environ["MXNET_MP_OPENCV_NUM_THREADS"] = "1"

I have two questions:

1 - On my Mac I can utilize only 2-3 of the 8 cores. How can I tell MXNet to use all available cores and get the full power of the CPU?

2 - I am using a multithreaded queue and want to run inference in 4 threads simultaneously. With 1 thread everything is OK.

When I increase it to 2 or 4 threads, it gives the error below from time to time, while the same image runs without error in a single thread.

<class 'mxnet.base.MXNetError'>, MXNetError('[13:45:17] src/operator/contrib/…/tensor/…/elemwise_op_common.h:133: Check failed: assign(&dattr, (*vec)[i]) Incompatible attr in node at 0-th output: expected [1,3,20,35], got [1,3,198,360]

Stack trace returned 10 entries:
[bt] (0) 0 libmxnet.so 0x0000000111601b90 libmxnet.so + 15248
[bt] (1) 1 libmxnet.so 0x000000011160193f libmxnet.so + 14655
[bt] (2) 2 libmxnet.so 0x0000000111601569 libmxnet.so + 13673
[bt] (3) 3 libmxnet.so 0x000000011173d1c2 libmxnet.so + 1307074
[bt] (4) 4 libmxnet.so 0x000000011173ce1f libmxnet.so + 1306143
[bt] (5) 5 libmxnet.so 0x0000000111737f94 libmxnet.so + 1286036
[bt] (6) 6 libmxnet.so 0x0000000112b485da MXNDListFree + 502922
[bt] (7) 7 libmxnet.so 0x0000000112b470a4 MXNDListFree + 497492
[bt] (8) 8 libmxnet.so 0x0000000112aa441e MXCustomFunctionRecord + 20926
[bt] (9) 9 libmxnet.so 0x0000000112aa5140 MXImperativeInvokeEx + 176
'), <traceback object at 0x14f788148>)

Hi,

  1. Do you mean your Mac only utilizes 2-3 cores during inference? The CPU cores are also used during data loading, so there may be some contention there. The environment variables you set seem to be the right ones for controlling the number of worker threads for the engine, which determines how many independent operators can be executed in parallel (a minimal example of setting them is sketched below). https://github.com/apache/incubator-mxnet/blob/master/docs/faq/env_var.md#set-the-number-of-threads

  2. You can't run inference with 4 threads simultaneously because the MXNet engine is not thread safe. You could explore having multiple processes create the input data, but they must feed a single input queue to the computational engine (see the second sketch below). There's some more info in this GitHub issue: https://github.com/apache/incubator-mxnet/issues/3946
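On point 1, one thing worth double-checking is when the variables are set. As far as I know (please verify for your setup), they are read when libmxnet and the OpenMP runtime initialize, so the safe pattern is to put them into os.environ before importing mxnet. A minimal sketch, reusing the variables you already listed:

import os

# Set these before mxnet is imported, otherwise the engine/OpenMP
# may already have picked up its defaults.
os.environ["OMP_NUM_THREADS"] = "8"            # threads used inside a single operator
os.environ["MXNET_CPU_WORKER_NTHREADS"] = "8"  # how many independent operators may run in parallel
os.environ["MXNET_MP_OPENCV_NUM_THREADS"] = "1"

import mxnet as mx  # import only after the variables are set

a = mx.nd.random.uniform(shape=(1024, 1024))
b = mx.nd.dot(a, a)
b.wait_to_read()  # force execution so you can watch core utilization while this runs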
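On point 2, here is a rough sketch of that pattern, assuming preprocessing is the part you want to parallelize: the worker processes only touch numpy (or OpenCV-style) code, and only the main process imports and calls MXNet, so all inference goes through a single engine. The model, paths and shapes below are placeholders, not your actual pipeline:

import multiprocessing as mp
import numpy as np


def producer(task_q, batch_q):
    # Decode / preprocess images in a separate process; no MXNet calls here.
    while True:
        item = task_q.get()
        if item is None:  # poison pill -> shut down
            break
        # ... load and preprocess `item` into an NCHW float32 array ...
        batch_q.put(np.random.rand(1, 3, 224, 224).astype("float32"))  # placeholder


def main():
    task_q, batch_q = mp.Queue(), mp.Queue()
    workers = [mp.Process(target=producer, args=(task_q, batch_q)) for _ in range(4)]
    for w in workers:
        w.start()

    images = ["img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg"]  # hypothetical inputs
    for path in images:
        task_q.put(path)
    for _ in workers:
        task_q.put(None)

    # Only this process touches MXNet, so a single engine handles all inference.
    import mxnet as mx
    net = mx.gluon.model_zoo.vision.resnet18_v1(pretrained=True)  # example model
    for _ in images:
        out = net(mx.nd.array(batch_q.get()))
        print(out.shape)

    for w in workers:
        w.join()


if __name__ == "__main__":
    main()

Starting the workers before mxnet is imported (with fork on macOS) also keeps the engine out of the child processes entirely.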