I'm running into a problem when I try to train MNIST in distributed mode with MXNet. I get the following output:
Traceback (most recent call last):
  File "train_mnist.py", line 25, in <module>
    from common import find_mxnet, fit
  File "/home/lm/incubator-mxnet/example/image-classification/common/find_mxnet.py", line 24, in <module>
    import mxnet as mx
  File "/home/lm/incubator-mxnet/example/image-classification/common/…/…/…/python/mxnet/__init__.py", line 25, in <module>
    from . import engine
  File "/home/lm/incubator-mxnet/example/image-classification/common/…/…/…/python/mxnet/engine.py", line 23, in <module>
    from .base import _LIB, check_call
  File "/home/lm/incubator-mxnet/example/image-classification/common/…/…/…/python/mxnet/base.py", line 29, in <module>
    import numpy as np
ImportError: No module named numpy
Traceback (most recent call last):
  File "train_mnist.py", line 25, in <module>
    from common import find_mxnet, fit
  File "/home/lm/incubator-mxnet/example/image-classification/common/find_mxnet.py", line 24, in <module>
    import mxnet as mx
  File "/home/lm/incubator-mxnet/example/image-classification/common/…/…/…/python/mxnet/__init__.py", line 25, in <module>
    from . import engine
  File "/home/lm/incubator-mxnet/example/image-classification/common/…/…/…/python/mxnet/engine.py", line 23, in <module>
    from .base import _LIB, check_call
  File "/home/lm/incubator-mxnet/example/image-classification/common/…/…/…/python/mxnet/base.py", line 29, in <module>
    import numpy as np
ImportError: No module named numpy
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/home/lm/anaconda3/envs/lm2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/lm/anaconda3/envs/lm2/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/lm/incubator-mxnet/tools/…/dmlc-core/tracker/dmlc_tracker/ssh.py", line 61, in run
    subprocess.check_call(prog, shell = True)
  File "/home/lm/anaconda3/envs/lm2/lib/python2.7/subprocess.py", line 186, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command 'ssh -o StrictHostKeyChecking=no lm@10.10.143.238 -p 22 'export LD_LIBRARY_PATH=/home/lm/PSPNet/build/lib:/usr/local/cuda-8.0/lib64; export DMLC_ROLE=worker; export DMLC_PS_ROOT_PORT=9108; export DMLC_PS_ROOT_URI=10.10.143.108; export DMLC_NUM_SERVER=1; export DMLC_NUM_WORKER=1; cd /home/lm/incubator-mxnet/example/image-classification/; python train_mnist.py --network lenet --kv-store dist_device_sync'' returned non-zero exit status 1
Training MNIST on a single machine works fine, and in that setup Python imports numpy without any error.
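A check that might help narrow this down is sketched below. It rests on an unverified assumption: that the `python` started by the non-interactive `ssh` command in the log may not be the same interpreter as the one in the `lm2` Anaconda environment used interactively. The function name `describe_interpreter` and the file name `check_env.py` are hypothetical and not part of MXNet or the tracker.

```python
from __future__ import print_function  # keeps print() working on Python 2.7

import importlib
import sys


def describe_interpreter(module_name="numpy"):
    """Return (path of the running interpreter, True if module_name imports)."""
    try:
        importlib.import_module(module_name)
        importable = True
    except ImportError:
        importable = False
    return sys.executable, importable


if __name__ == "__main__":
    executable, importable = describe_interpreter()
    print("interpreter:", executable)
    print("numpy importable:", importable)
```

Saved as the hypothetical `check_env.py`, this could be run once in the local shell and once through the same kind of `ssh lm@10.10.143.238 '...'` command the tracker uses. If the two runs report different interpreter paths, that would point to the remote session not picking up the environment where numpy is installed.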
What can I do?