We have an implementation of a recurrent network in MXNet and are trying to obtain the second-order derivatives of a loss function with respect to arbitrary (all) parameters in the computational graph.
It’s unclear to me how many operators support higher-order gradients at this time, so it may not work on your network, but there is an interface that should let you do this, provided all of the operators in your graph support it.
See:
mxnet.autograd.grad
You can find documentation for it on this page, but you have to scroll down because for some reason, there isn’t an anchor link for it at the top.
The gist should be something like this (I haven’t tested it):
with mx.autograd.record():
    output = net(x)
    loss = loss_func(output)
    # first-order gradients w.r.t. z, recorded as part of the graph so they can be differentiated again
    dz = mx.autograd.grad(loss, [z], create_graph=True)  # where [z] is the parameter(s) you want
dz[0].backward()  # now the actual parameters should have second-order gradients in their .grad buffers
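If it helps, here is a self-contained sketch of the same pattern that I would expect to run as-is. It swaps your network and loss for a toy elementwise function (sin), whose higher-order gradient is implemented, and uses a placeholder variable z standing in for a parameter, purely to show the mechanics:

import mxnet as mx

# toy stand-in for a parameter; it needs a gradient buffer attached before recording
z = mx.nd.array([1.0, 2.0, 3.0])
z.attach_grad()

with mx.autograd.record():
    # pretend "loss", built only from ops with higher-order gradient support
    loss = mx.nd.sin(z)
    # first-order gradient w.r.t. z, kept differentiable via create_graph
    dz = mx.autograd.grad(loss, [z], create_graph=True, retain_graph=True)

dz[0].backward()  # differentiate the first-order gradient
print(z.grad)     # -sin(z), i.e. the second derivative of sin

For your real network you would replace the sin expression with the forward pass and loss, and pass the parameter arrays as the variables list (Gluon parameters already have gradients attached after initialization); whether that succeeds still depends on each operator’s backward being differentiable itself.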