WGAN-GP: can't compute gradient penalty with Gluon?

Dear all,

I’m working on an implementation of WGAN with gradient penalty (WGAN-GP), but I get an error during the backward step:

MXNetError: Operator _backward_Convolution is non-differentiable because it didn't register FGradient attribute.

I guess it goes wrong because of the gradient-penalty term \left( \lVert \nabla_{x_m} net_c(x_m) \rVert_2 - 1 \right)^2 in the loss function: backpropagating through it requires a second-order derivative of the critic. If I remove this term from the loss function, the training loop works again. Did I do something wrong in the Gluon implementation? What is the proper way to compute a loss function that involves a second-order derivative?
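
For reference, the critic loss from the WGAN-GP paper (Gulrajani et al., 2017) that I am ultimately after is written below; the minimal example that follows only keeps enough of it to trigger the error (a fixed \epsilon = 0.5, and the first term uses net_c(x_m) instead of net_c(x_f)), with \lambda corresponding to clambda in the code:

L_{critic} = \mathbb{E}\left[ net_c(x_f) \right] - \mathbb{E}\left[ net_c(x_r) \right] + \lambda \, \mathbb{E}\left[ \left( \lVert \nabla_{x_m} net_c(x_m) \rVert_2 - 1 \right)^2 \right], \qquad x_m = \epsilon x_r + (1 - \epsilon) x_f, \quad \epsilon \sim U[0, 1]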

I made a minimal example to reproduce the error:

import mxnet as mx
from mxnet import nd, gluon, autograd
from mxnet.gluon import nn

# Define and initialize a dummy critic network.
net = nn.HybridSequential()
net.add(
    nn.Conv2D(in_channels=1, channels=64, kernel_size=4, strides=2, activation="relu"),
    nn.Conv2D(in_channels=64, channels=128, kernel_size=4, strides=2, activation="relu"),
    nn.Conv2D(in_channels=128, channels=1, kernel_size=4, strides=2)
)
net.initialize()

trainer = gluon.Trainer(net.collect_params(), "adam", {"learning_rate": 0.00002})

batch_size = 8
clambda = 10  # gradient-penalty coefficient (lambda in the loss)

# Do one training step
with autograd.record():
    xr = nd.random.randn(batch_size, 1, 28, 28)  # dummy "real" batch
    xf = nd.random.randn(batch_size, 1, 28, 28)  # dummy "fake" batch
    epsilon = nd.ones(shape=(batch_size, 1, 1, 1)) * 0.5  # fixed interpolation factor (U[0, 1] in full WGAN-GP)
    xm = epsilon * xr + (1 - epsilon) * xf  # interpolated samples
    xm.attach_grad()  # we need the gradient of the critic w.r.t. xm for the penalty
    yr = net(xr)
    ym = net(xm)
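    # First-order gradient of the critic output w.r.t. xm; create_graph=True keeps it in the
    # autograd graph so that the penalty term can itself be backpropagated through.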
    grad_ym = mx.autograd.grad(heads=ym, variables=[xm], retain_graph=True, create_graph=True)[0]
    grad_ym = grad_ym.reshape(batch_size, -1)
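    # Critic loss with the gradient-penalty term that seems to cause the error.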
    loss = nd.mean(ym) - nd.mean(yr) + clambda * nd.mean((nd.norm(grad_ym, axis=1) - 1) ** 2)
    print("loss: ", loss)
loss.backward()  # the MXNetError above is raised here
trainer.step(batch_size)