Hello,
in the paper "Deep Pyramidal Residual Networks" (https://arxiv.org/abs/1610.02915), zero-padding of the feature channels is used to increase the dimensionality of the identity shortcuts.
There are two sample code repositories for PyramidNet; one of them implements the padding like this:
# [...]
batch_size = out.size()[0]
residual_channel = out.size()[1]
shortcut_channel = shortcut.size()[1]

if residual_channel != shortcut_channel:
    padding = torch.autograd.Variable(
        torch.cuda.FloatTensor(batch_size,
                               residual_channel - shortcut_channel,
                               featuremap_size[0],
                               featuremap_size[1]).fill_(0))
    out += torch.cat((shortcut, padding), 1)
else:
    out += shortcut
I’m wondering how to do this efficiently in Gluon.
Should one create a custom operator that pads additional channels onto the data, or apply the element-wise addition to only a subset of the channels?
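For reference, the two alternatives are numerically equivalent: concatenating a zero tensor along the channel axis and then adding, versus adding the shortcut onto only the first `shortcut_channel` channels of the output. A minimal framework-agnostic sketch in NumPy (array names and sizes are illustrative, not from the PyramidNet code):

```python
import numpy as np

# Illustrative shapes: output has more channels than the shortcut.
batch, res_c, short_c, h, w = 2, 8, 4, 3, 3
out = np.random.rand(batch, res_c, h, w).astype(np.float32)
shortcut = np.random.rand(batch, short_c, h, w).astype(np.float32)

# Option 1: zero-pad the shortcut along the channel axis, then add.
padding = np.zeros((batch, res_c - short_c, h, w), dtype=np.float32)
out_pad = out + np.concatenate((shortcut, padding), axis=1)

# Option 2: add the shortcut onto only the first short_c channels.
out_slice = out.copy()
out_slice[:, :short_c] += shortcut

# Both produce the same result; option 2 avoids materializing the zeros.
assert np.allclose(out_pad, out_slice)
```

Option 2 avoids allocating and concatenating a zero tensor on every forward pass, so it may be the more memory-friendly of the two, but I don't know which one hybridizes better in Gluon.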
This question was also asked on GitHub 4 years ago: