Hi! I just found that the `diag` operator does not support N-d arrays where N > 2. In my experience, it could be made more useful if the N > 2 cases were properly designed. For example, I find it troublesome to take the diagonals of several matrices of the same shape at the same time. To support this, the behaviour when N > 2 could be designed as taking the diagonal over the last two axes, i.e., when fed an array of shape `[d1, d2, d3, ..., dn-2, dn-1, dn]`, where the diagonal of `[dn-1, dn]` is of length `k`, `diag` would return an array of shape `[d1, d2, d3, ..., dn-2, k]`. Of course, this could be designed to be more flexible (allowing the axes to reduce to be specified, for example).
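To make the proposed shape rule concrete, here is a rough sketch using NumPy as a stand-in (not MXNet's actual API). The name `batched_diag` is hypothetical and only for illustration; it relies on `np.diagonal` with `axis1=-2, axis2=-1`, which is one possible way to get the behaviour described above.

```python
import numpy as np

def batched_diag(a, offset=0):
    """Hypothetical helper: take the diagonal over the last two axes.

    Input shape  [d1, ..., dn-2, dn-1, dn]
    Output shape [d1, ..., dn-2, k], where k is the diagonal length.
    """
    a = np.asarray(a)
    if a.ndim < 2:
        raise ValueError("expected an array with at least 2 dimensions")
    # np.diagonal appends the diagonal as the last axis of the result.
    return np.diagonal(a, offset=offset, axis1=-2, axis2=-1)

# A batch of 2 x 3 matrices, each of shape 4 x 5.
x = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
d = batched_diag(x)
print(d.shape)  # (2, 3, 4)
```

The more flexible variant would simply expose `axis1` and `axis2` as arguments instead of fixing them to the last two axes.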
PyTorch provides a `diag` operator that behaves in the same way. TensorFlow actually splits it into two operators, `diag` and `diag_part`: the former constructs diagonal matrices and the latter takes diagonals from matrices. They are designed to support N > 2, but not in a way I find useful or flexible.
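For what it's worth, the construct/extract split can be sketched in a batched form with NumPy. This is only an illustration of the two directions under the "last axes" convention proposed above, not how TensorFlow or PyTorch actually handle N > 2; `make_diag` and `take_diag` are hypothetical names.

```python
import numpy as np

def make_diag(v):
    """Hypothetical: build diagonal matrices, shape [..., k] -> [..., k, k]."""
    v = np.asarray(v)
    k = v.shape[-1]
    # Broadcasting against the identity keeps v[..., i] at position (i, i).
    return v[..., :, None] * np.eye(k, dtype=v.dtype)

def take_diag(m):
    """Hypothetical: extract diagonals, shape [..., k, k] -> [..., k]."""
    return np.diagonal(np.asarray(m), axis1=-2, axis2=-1)

v = np.array([[1, 2, 3], [4, 5, 6]])
m = make_diag(v)
print(m.shape)                          # (2, 3, 3)
print(np.array_equal(take_diag(m), v))  # True
```

With this convention the two functions are inverses of each other on batches of vectors, which is the property I would want from a pair like this.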
@jason_yu I agree that would be a nice upgrade of the diag operator. I think the reason it hasn’t been done before is that there wasn’t a clear use case. I would encourage you to create an issue on the GitHub repo and suggest the tags ‘Feature Request’ and ‘Operator’.
Even better, I think it would be a great way to contribute to MXNet if you wanted to extend the diag operator to operate on the last two dimensions of an N-dimensional array.
@jason_yu btw thanks for reporting the spam
Thanks for your response! I have taken your advice and created the issue.