I am completely lost on how broadcasting works. Could someone please explain how it works in MXNet? I was expecting a completely different result from the example in the book.

Hi,

Can you post a snippet of the code you’re looking at, along with the result you expect from the operation? Just a note: broadcasting in MXNet is not supported to the full extent that it is in something like NumPy.

Here is the broadcast example:

from mxnet import nd

a = nd.arange(3).reshape((3, 1))
b = nd.arange(2).reshape((1, 2))
a + b

[[0. 1.]
 [1. 2.]
 [2. 3.]]
<NDArray 3x2 @cpu(0)>

I expected the outcome to be:

[[0. 1.]
 [1. 1.]
 [2. 1.]]
<NDArray 3x2 @cpu(0)>

I’m not sure why you would expect that, but here is an explanation of the output.

a = [[0],
     [1],
     [2]]

b = [[0, 1]]

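When you compute a + b, it is a broadcast_add by default. a has shape (3, 1) and b has shape (1, 2), so both are stretched to the common shape (3, 2): the single column of a is copied across two columns, and the single row of b is copied down three rows. In other words, each element of a is added to the whole row of b. a has 3 rows, so the output also has 3 rows.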

So a + b is:

[[0, 1] + 0,
 [0, 1] + 1,
 [0, 1] + 2]

which is indeed what you’re getting.
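If it helps to see the stretching step spelled out, here is a minimal plain-Python sketch of the same computation. It deliberately avoids MXNet and uses nested lists, so the names a_stretched and b_stretched are purely illustrative and not part of any library; it only mimics what broadcasting does for these two particular shapes.

```python
# Plain-Python sketch of the broadcast in a + b (no MXNet needed).
# Nested lists stand in for the NDArrays from the example above.
a = [[0.0], [1.0], [2.0]]   # shape (3, 1)
b = [[0.0, 1.0]]            # shape (1, 2)

rows, cols = len(a), len(b[0])   # common (broadcast) shape: (3, 2)

# Step 1: stretch each operand to the common shape by repeating
# it along the axis where its size is 1.
a_stretched = [[a[i][0]] * cols for i in range(rows)]   # a's column copied across 2 columns
b_stretched = [list(b[0]) for _ in range(rows)]         # b's row copied down 3 rows

# Step 2: ordinary elementwise addition of two (3, 2) arrays.
result = [
    [a_stretched[i][j] + b_stretched[i][j] for j in range(cols)]
    for i in range(rows)
]

print(a_stretched)   # [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
print(b_stretched)   # [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
print(result)        # [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
```

The second column of the result is [1, 2, 3] rather than [1, 1, 1] because the stretched a contributes 0, 1, 2 to every column, not just the first one.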