I'm completely lost on how broadcasting works. Could someone please explain how it behaves in mxnet? I expected a completely different result from the example in the book.


Can you post a snippet of the code you're looking at, and what you expect the result of the operation to be? Just a note: broadcasting in mxnet is not supported to the full extent it might be in something like numpy.

Here is the broadcast example:
from mxnet import nd

a = nd.arange(3).reshape((3, 1))
b = nd.arange(2).reshape((1, 2))

a + b
[[0. 1.]
 [1. 2.]
 [2. 3.]]
<NDArray 3x2 @cpu(0)>

I expected the outcome to be
[[0. 1.]
 [1. 1.]
 [2. 1.]]
<NDArray 3x2 @cpu(0)>

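I'm not sure how you arrived at that expected result, but here is an explanation of the actual output.
a = [[0],
     [1],
     [2]]

b = [[0, 1]]

When you compute a + b, mxnet uses broadcast_add by default. a has shape (3, 1) and b has shape (1, 2), so each one is stretched along its size-1 axis to the common shape (3, 2): the single column of a is repeated twice, and the single row of b is repeated three times. In other words, every element of a is added to the whole row b. a has 3 rows, so the output has 3 rows as well.
So a + b is
[[0, 1] + 0,
 [0, 1] + 1,
 [0, 1] + 2]
which is exactly what you're getting.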
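
If it helps to see the stretching step spelled out, here is a minimal sketch. It uses NumPy rather than mxnet, which is my assumption so that it runs without mxnet installed; for this particular (3, 1) + (1, 2) case the result matches the mxnet output posted above. The names a_full, b_full, explicit and implicit are just illustrative.

```python
import numpy as np

a = np.arange(3).reshape((3, 1))   # shape (3, 1): [[0], [1], [2]]
b = np.arange(2).reshape((1, 2))   # shape (1, 2): [[0, 1]]

# Stretch both operands to the common shape (3, 2) by hand.
a_full = np.broadcast_to(a, (3, 2))  # every column is a copy of a
b_full = np.broadcast_to(b, (3, 2))  # every row is a copy of b

explicit = a_full + b_full  # plain elementwise add of two (3, 2) arrays
implicit = a + b            # the same add, with broadcasting done implicitly

print(a_full)
print(b_full)
print(implicit)
```

Printing a_full and b_full side by side shows the two (3, 2) arrays that actually get added elementwise, which is usually where the expected and actual results part ways.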
