About stale gradient

Hey, bro!!! GOOD NEWS!!! I think I figured out the problem!!! The ‘grad_req’ of ‘bn0_moving_mean’ is ‘null’, which means it does not need a gradient.


However, I had a line in my training code that forcibly changed every ‘grad_req’ to ‘add’, including the ones that were ‘null’, and that is what led to the warning. After I modified my training code, the warning is gone :grin::grin::grin:
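For anyone who lands here later, here is a rough sketch of the idea, not my exact code: only switch a parameter to ‘add’ if its ‘grad_req’ is not ‘null’. `Param` and `set_grad_req_add` are made-up names for illustration; `Param` is just a stand-in for a Gluon parameter with a `grad_req` attribute.

```python
class Param:
    """Hypothetical stand-in for a Gluon parameter (only the fields used here)."""

    def __init__(self, name, grad_req="write"):
        self.name = name
        self.grad_req = grad_req


def set_grad_req_add(params):
    """Set grad_req to 'add' on trainable parameters, leaving 'null' ones alone.

    Returns the names of the parameters that were skipped.
    """
    skipped = []
    for p in params:
        if p.grad_req == "null":
            # e.g. BatchNorm moving statistics: they never receive a gradient
            skipped.append(p.name)
        else:
            p.grad_req = "add"
    return skipped


params = [
    Param("conv0_weight"),
    Param("bn0_gamma"),
    Param("bn0_moving_mean", grad_req="null"),
]
skipped = set_grad_req_add(params)
print("skipped:", skipped)
print({p.name: p.grad_req for p in params})
```

With a real network I would expect the same loop to work on `net.collect_params().values()`, assuming `net` is a Gluon block, but please check that against your MXNet version.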