# WGAN-gp: can't compute gradient penalty with gluon?

Dear all,

I’m working on an implementation of WGAN with gradient penalty (WGAN-GP), but I get the following error during the backward step:

MXNetError: Operator _backward_Convolution is non-differentiable because it didn't register FGradient attribute.


I suspect it goes wrong in the computation of the gradient-penalty term \(\left( \lVert \nabla_{x_m} net_c(x_m) \rVert_2 - 1 \right)^2\) inside the loss function: if I remove this term from the loss, the training loop works again. Did I do something wrong in the Gluon implementation? What is the proper way to compute a loss function that involves second-order derivatives?
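For context, the critic loss I am trying to implement is the standard WGAN-GP objective (Gulrajani et al.), where \(x_m\) is a random interpolation between real samples \(x_r\) and generated samples \(x_f\):

```latex
L_{critic} = \mathbb{E}\left[ net_c(x_f) \right] - \mathbb{E}\left[ net_c(x_r) \right]
           + \lambda \, \mathbb{E}\left[ \left( \left\lVert \nabla_{x_m} net_c(x_m) \right\rVert_2 - 1 \right)^2 \right]
```

The last term is the gradient penalty; differentiating it during `loss.backward()` requires backpropagating through the gradient itself, i.e. a second-order derivative.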

I made a minimal example to reproduce the error:

import mxnet as mx
from mxnet import nd, gluon, autograd
from mxnet.gluon import nn

# Define and init dummy network.
net = nn.HybridSequential()
net.add(
    nn.Conv2D(in_channels=1, channels=64, kernel_size=4, strides=2, activation="relu"),
    nn.Conv2D(in_channels=64, channels=128, kernel_size=4, strides=2, activation="relu"),
    nn.Conv2D(in_channels=128, channels=1, kernel_size=4, strides=2)
)
net.initialize()

trainer = gluon.Trainer(net.collect_params(), "adam", {"learning_rate": 0.00002})

batch_size = 8
clambda = 10

# Do one training step on random data.
xr = nd.random.randn(batch_size, 1, 28, 28)   # "real" batch
xf = nd.random.randn(batch_size, 1, 28, 28)   # "fake" batch
epsilon = nd.ones(shape=(batch_size, 1, 1, 1)) * 0.5
xm = epsilon * xr + (1 - epsilon) * xf        # interpolated samples
xm.attach_grad()

with autograd.record():
    ym = net(xm)
    # Gradient of the critic output w.r.t. the interpolated input;
    # create_graph=True keeps the graph so the penalty stays differentiable.
    grads = autograd.grad(ym, [xm], create_graph=True, retain_graph=True)[0]
    # The gradient-penalty term that triggers the error:
    loss = clambda * ((grads.reshape((batch_size, -1)).norm(axis=1) - 1) ** 2).mean()

loss.backward()   # <- MXNetError: Operator _backward_Convolution is non-differentiable
trainer.step(batch_size)