Forward-backward pass being a bottleneck in multi-GPU training

Copying @ThomasDelteil’s answer here for points 2 and 3 for greater visibility.

“I wanted to take the time to run some experiments to give you more data points, but what I would recommend is trying Horovod on a single node. Horovod uses NCCL for GPU-to-GPU communication, and each GPU runs its own process. For Horovod, this discussion might be helpful for you: Distributed training questions
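
For reference, here is a rough sketch of what that single-node Horovod setup can look like with MXNet Gluon; this is my own illustration, not something from Thomas's answer. The network, the data loader (`train_loader`), and the learning-rate scaling are placeholders you would swap for your own. The key pieces are pinning each process to `mx.gpu(hvd.local_rank())`, wrapping the optimizer in `hvd.DistributedTrainer` so gradients are averaged over NCCL, and broadcasting the initial parameters so all workers start from the same weights.

```python
# Minimal single-node Horovod + MXNet Gluon sketch (one process per GPU).
# Launch with, for example:  horovodrun -np 4 python train.py
import mxnet as mx
import horovod.mxnet as hvd
from mxnet import autograd, gluon

hvd.init()                                  # one process per GPU
ctx = mx.gpu(hvd.local_rank())              # pin this process to its own GPU

# Placeholder model; replace with your actual network.
net = gluon.nn.Dense(10)
net.initialize(mx.init.Xavier(), ctx=ctx)
params = net.collect_params()

# DistributedTrainer wraps the optimizer so gradient aggregation happens
# via Horovod's NCCL allreduce rather than through a kvstore.
trainer = hvd.DistributedTrainer(params, 'sgd',
                                 {'learning_rate': 0.01 * hvd.size()})

# Ensure every worker starts from the same initial weights.
hvd.broadcast_parameters(params, root_rank=0)

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
# train_loader is a hypothetical per-worker data iterator (each process
# should see its own shard of the data).
for data, label in train_loader:
    data, label = data.as_in_context(ctx), label.as_in_context(ctx)
    with autograd.record():
        loss = loss_fn(net(data), label)
    loss.backward()
    trainer.step(data.shape[0])
```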