Aggregate gradients manually over n batches

olivcruche October 20, 2019, 1:11pm 25

This fix, from the thread "About stale gradient", seems to work!

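# Accumulate gradients across backward passes instead of overwriting them.
for p in net.collect_params().values():
    if p.grad_req != 'null':  # skip parameters that do not require gradients
        p.grad_req = 'add'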
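For anyone wondering why this helps: as far as I understand it, with the default grad_req of 'write' each backward pass overwrites the gradient buffer, so only the last batch counts, while 'add' sums into it. Below is a rough pure-Python sketch of that arithmetic, not MXNet code. The toy model, the data, and the helper names `grad_sum` and `accumulate` are all made up for illustration.

```python
# Toy model: prediction = w * x, loss = 0.5 * (w * x - y) ** 2.
# All names and numbers here are hypothetical, chosen only to show the arithmetic.

def grad_sum(w, batch):
    """Summed gradient of the loss with respect to w over one batch."""
    return sum((w * x - y) * x for x, y in batch)

def accumulate(w, batches, mode):
    """Mimic a gradient buffer across several backward passes.

    mode='write' overwrites the buffer on every pass,
    mode='add' sums each new gradient into the buffer.
    """
    buf = 0
    for batch in batches:
        g = grad_sum(w, batch)
        buf = buf + g if mode == 'add' else g
    return buf

data = [(1, 2), (2, 3), (3, 7), (4, 8)]
batches = [data[0:2], data[2:4]]  # two small batches standing in for one large batch
w = 1

full = grad_sum(w, data)                    # gradient over all samples at once
added = accumulate(w, batches, 'add')       # same value as `full`
written = accumulate(w, batches, 'write')   # only the last batch survives

print(full, added, written)
```

Two things to keep in mind when carrying this over to Gluon: with 'add' the buffers are never cleared automatically, so they need to be zeroed after each update (for example with net.collect_params().zero_grad()), and trainer.step(batch_size) rescales the gradient by 1/batch_size, so it should receive the total number of samples accumulated rather than the size of one small batch.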

How to make gradient accumulation work in MXNet?
