Dear all,
I just saw this paper on coordinate convolution layers (CoordConv), and I think it may prove useful for problems in semantic segmentation, edge detection, etc. I will test it in the following weeks, so I gave it a try and implemented it as a HybridBlock. I haven't read the whole paper yet (Section S8 has the implementation), but the basic idea is to augment the input channels with two new channels holding the i and j pixel indices. I initially tried to implement a collective index idx = i + dim(i)*j, but that wasn't easy in the HybridBlock format :/.
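For reference, the coordinate channels the paper describes can be sketched in plain NumPy (this is my own illustration, not code from the paper; `coord_channels` is a hypothetical helper name):

```python
import numpy as np

def coord_channels(batch, height, width):
    # row indices 0..H-1 repeated across columns; column indices 0..W-1 repeated across rows
    rows = np.tile(np.arange(height, dtype=np.float32)[:, None], (1, width))
    cols = np.tile(np.arange(width, dtype=np.float32)[None, :], (height, 1))
    # normalize to [-1, 1]
    rows = 2.0 * rows / (height - 1) - 1.0
    cols = 2.0 * cols / (width - 1) - 1.0
    # broadcast to shape (N, 1, H, W) so they can be concatenated on the channel axis
    rows = np.broadcast_to(rows, (batch, 1, height, width))
    cols = np.broadcast_to(cols, (batch, 1, height, width))
    return rows, cols
```

The difficulty with a HybridBlock is that the spatial shape is not known inside hybrid_forward, which is why the implementation below has to construct the indices from the input tensor itself.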
This is my “solution” (needs polishing for sure):
import mxnet as mx
from mxnet import nd, gluon
from mxnet.gluon import HybridBlock

class coordConv(HybridBlock):
    def __init__(self, channels, kernel_size=3, padding=1, strides=1, **kwards):
        HybridBlock.__init__(self, **kwards)
        with self.name_scope():
            self.conv = gluon.nn.Conv2D(channels=channels, kernel_size=kernel_size,
                                        padding=padding, strides=strides, **kwards)

    def hybrid_forward(self, F, x):
        # single-channel array of ones with the same spatial shape as x
        temp = F.ones_like(F.slice_axis(x, axis=1, begin=0, end=1))
        # argsort of a constant array should yield the index along the sorted axis
        rows = F.argsort(temp, axis=2)
        cols = F.argsort(temp, axis=3)
        x = F.concat(x, rows, cols, dim=1)
        return self.conv(x)
mynet = coordConv(32)
mynet.initialize(mx.init.Xavier())
mynet.hybridize()
xx = nd.random.uniform(shape=[5, 64, 128, 128])
temp = mynet(xx)
print(temp.shape)
(5, 32, 128, 128)
Could the experts provide some feedback?
I ended up with this solution after observing the following tests:
temp = nd.ones(shape=[5,1,3,3])
nd.argsort(temp,axis=2)
prints
[[[[0. 0. 0.]
[1. 1. 1.]
[2. 2. 2.]]]
[[[0. 0. 0.]
[1. 1. 1.]
[2. 2. 2.]]]
[[[0. 0. 0.]
[1. 1. 1.]
[2. 2. 2.]]]
[[[0. 0. 0.]
[1. 1. 1.]
[2. 2. 2.]]]
[[[0. 0. 0.]
[1. 1. 1.]
[2. 2. 2.]]]]
<NDArray 5x1x3x3 @cpu(0)>
and
temp = nd.ones(shape=[5,1,3,3])
nd.argsort(temp,axis=3)
prints
[[[[0. 1. 2.]
[0. 1. 2.]
[0. 1. 2.]]]
[[[0. 1. 2.]
[0. 1. 2.]
[0. 1. 2.]]]
[[[0. 1. 2.]
[0. 1. 2.]
[0. 1. 2.]]]
[[[0. 1. 2.]
[0. 1. 2.]
[0. 1. 2.]]]
[[[0. 1. 2.]
[0. 1. 2.]
[0. 1. 2.]]]]
<NDArray 5x1x3x3 @cpu(0)>
so these look like row and column indices (repeated over the batch).
edit: The function nd.argsort does not consistently return strictly increasing indices for large input arrays, at least when all elements are equal. E.g.
temp = nd.ones(shape=[5,1,32,32])
nd.argsort(temp,axis=3)[0,0]
prints
[[16. 16. 16. ... 16. 16. 16.]
[31. 31. 31. ... 31. 31. 31.]
[30. 30. 30. ... 30. 30. 30.]
...
[ 3. 3. 3. ... 3. 3. 3.]
[ 2. 2. 2. ... 2. 2. 2.]
[ 1. 1. 1. ... 1. 1. 1.]]
which is not strictly ordered (ties are broken arbitrarily by the underlying sort), so the implementation I presented above is wrong.
edit2: A secondary sort solves the problem. The following implementation works:
class coordConv(HybridBlock):
    def __init__(self, channels, kernel_size=3, padding=1, strides=1, **kwards):
        HybridBlock.__init__(self, **kwards)
        with self.name_scope():
            self.conv = gluon.nn.Conv2D(channels=channels, kernel_size=kernel_size,
                                        padding=padding, strides=strides, **kwards)

    def hybrid_forward(self, F, x):
        temp = F.ones_like(F.slice_axis(x, axis=1, begin=0, end=1))
        # sorting the argsort output guarantees strictly increasing indices,
        # even when argsort breaks ties arbitrarily
        rows = F.sort(F.argsort(temp, axis=-1), axis=-1)
        rmax = F.max(rows)
        cols = F.sort(F.argsort(temp, axis=-2), axis=-2)
        cmax = F.max(cols)
        rows = F.broadcast_div(rows, rmax)
        cols = F.broadcast_div(cols, cmax)
        # rescale from [0, 1] to [-1, 1]
        rows = 2. * rows - 1.
        cols = 2. * cols - 1.
        x = F.concat(x, rows, cols, dim=1)
        return self.conv(x), rows, cols  # rows, cols also returned for visualization purposes
import matplotlib.pyplot as plt

mynet = coordConv(32)
mynet.initialize(mx.init.Xavier())
mynet.hybridize()
xx = nd.random.uniform(shape=[5, 64, 256, 256])
temp = mynet(xx)
fig = plt.figure(figsize=(13, 5))
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
im1 = ax1.imshow(temp[1][0, 0].asnumpy())
im2 = ax2.imshow(temp[2][0, 0].asnumpy())
plt.colorbar(im1, ax=ax1)
plt.colorbar(im2, ax=ax2)
In addition, following the PyTorch implementation of @kobenaxie, rows and cols are in the range [-1, 1].
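For what it's worth, the secondary-sort trick can be checked outside MXNet; in NumPy, sorting the argsort output along the same axis always recovers the ordered indices regardless of how ties were broken:

```python
import numpy as np

# argsort of a constant array is some permutation of 0..N-1 along the axis;
# sorting that permutation yields exactly 0, 1, ..., N-1,
# no matter how the underlying sort broke the ties.
temp = np.ones((5, 1, 32, 32), dtype=np.float32)
rows = np.sort(np.argsort(temp, axis=-1), axis=-1)
assert (rows[0, 0, 0] == np.arange(32)).all()
```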