Obtaining second order derivatives for a function wrt arbitrary parameters in the computation graph


We have an implementation of a recurrent network in MXNet and are trying to obtain the second-order derivatives of a loss function with respect to arbitrary (ideally all) parameters in the computational graph.

Is there a way to do this? Any help would be nice.

I’m not sure how many operators support higher-order gradients at this time, so it might not work on your network, but there is an interface, mx.autograd.grad, that should allow you to do it provided all the operators in your network support it.


You can find documentation for it on this page, but you have to scroll down because for some reason, there isn’t an anchor link for it at the top.

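The gist should be something like this (I didn’t test it):

import mxnet as mx

with mx.autograd.record():
    output = net(x)
    loss = loss_func(output)  # pass the labels as well if your loss function needs them
    # [z] is the parameter(s) you want; create_graph=True records the gradient
    # computation itself so that it can be differentiated a second time
    dz = mx.autograd.grad(loss, [z], create_graph=True)

# backward() on a non-scalar array backpropagates its sum, so after this call the
# parameters' gradients should hold the second-order terms d(sum(dz[0]))/d(param)
dz[0].backward()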

Thanks! This (https://github.com/apache/incubator-mxnet/issues/10002) seems to imply that not all operators support this yet.

I’ll try the interface you’ve pointed to and report operators it fails on to the contributors.

Dear all,

Is there a timeline for when higher-order derivatives will be released for MXNet/Gluon? A lot of GAN-like systems require them for stabilised training.

Looks like it’s starting to get some active development efforts. Refer to here (particularly the end of the thread):

And here:

I’m also eagerly watching this one. It’s the last critical feature that I think MXNet lacks.


A few operators, such as sigmoid, relu, and log, already support second-order gradients. Please stay tuned; we are adding support for more operators.
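
For anyone who wants to sanity-check the values autograd returns for these three operators: each has a simple closed-form second derivative, so you can compare against it directly. Below is a minimal sketch in plain Python. It makes no MXNet calls, and the helper names are hypothetical ones chosen for illustration. It also includes a central finite-difference estimate, which should work as a rough reference for operators where no closed form is at hand.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_second_derivative(x):
    # With s = sigmoid(x): s' = s * (1 - s), hence s'' = s * (1 - s) * (1 - 2 * s)
    s = sigmoid(x)
    return s * (1.0 - s) * (1.0 - 2.0 * s)


def log_second_derivative(x):
    # d/dx log(x) = 1 / x, hence d^2/dx^2 log(x) = -1 / x^2 (for x > 0)
    return -1.0 / (x * x)


def relu(x):
    return max(x, 0.0)


def relu_second_derivative(x):
    # relu is piecewise linear, so its second derivative is zero away from
    # x == 0 (at x == 0 it is not defined; returning 0.0 there is a convention)
    return 0.0


def numeric_second_derivative(f, x, h=1e-4):
    # Central finite difference: (f(x + h) - 2 f(x) + f(x - h)) / h^2
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print("x =", x)
        print("  sigmoid:", sigmoid_second_derivative(x),
              numeric_second_derivative(sigmoid, x))
        print("  log:    ", log_second_derivative(x),
              numeric_second_derivative(math.log, x))
        print("  relu:   ", relu_second_derivative(x),
              numeric_second_derivative(relu, x))
```

If the second-order gradient you get for an elementwise operator differs from these reference values by much more than the finite-difference error, that operator probably does not support higher-order gradients yet (or falls back to something else), and is worth reporting.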


Thank you very much for this @apeforest.