Expected labels of dependent variable in binary classification

I am a bit confused about which labels MXNet is expecting in a binary classification context.
In my problem, I have a dependent variable which looks like an array of 1s and 0s, i.e. [1, 0, 0, 0, 1, 1, …, 0, 0, 0, 1].
In numpy terms, its shape is (n_data_points,).

Given that, the last two layers of my model are defined as follows:

fc2 = mx.symbol.FullyConnected(data=fc1bn, name='fc2', num_hidden=1)
mlp = mx.symbol.LogisticRegressionOutput(data=fc2, name='softmax')

This works perfectly.
The thing is, this works as well:

fc2 = mx.symbol.FullyConnected(data=fc1bn, name='fc2', num_hidden=2)
mlp = mx.symbol.SoftmaxOutput(data=fc2, name='softmax')

whereas I would have expected the above to work only if the dependent variable were one-hot encoded, i.e. [[1,0], [0,1], [0,1], [0,1], [1,0], …, [0,1], [1,0]], or, in numpy terms, shaped as (n_data_points, 2).
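Not MXNet-specific, but a quick numpy sketch of the two label layouts described above (the flat integer array versus its one-hot equivalent):

```python
import numpy as np

# Integer labels, shape (n_data_points,), as in the original problem.
labels = np.array([1, 0, 0, 0, 1, 1, 0, 0, 0, 1])

# The one-hot encoding of the same labels, shape (n_data_points, 2):
# row i is [1, 0] if labels[i] == 0, and [0, 1] if labels[i] == 1.
one_hot = np.eye(2)[labels]

print(labels.shape)   # (10,)
print(one_hot.shape)  # (10, 2)
```

Taking argmax along axis 1 of the one-hot array recovers the original integer labels, which is why the two encodings carry the same information.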

Apparently SoftmaxOutput is smart enough to interpret the flat label array as integer class indices, while still producing a probability per class.
Now, the question is: is there a recommended way of structuring a binary classification problem?
Should one use a one-hot-encoded variable or not?
And given that LogisticRegressionOutput and SoftmaxOutput do exactly the same thing in a binary context, which one is recommended?

As the examples here show: http://mxnet.incubator.apache.org/api/python/symbol.html?highlight=softmaxoutput#mxnet.symbol.SoftmaxOutput
SoftmaxOutput takes integer labels by default.
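As a plain-numpy illustration (not MXNet code) of what integer labels mean for a softmax cross-entropy loss: the integer label simply indexes the predicted probability of the true class in each row, so no one-hot encoding is needed.

```python
import numpy as np

# Hypothetical per-class probabilities for 3 samples (rows sum to 1).
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.7, 0.3]])

# Integer class indices, shape (n,), as SoftmaxOutput expects by default.
labels = np.array([0, 1, 0])

# Cross-entropy: pick out the probability of each sample's true class
# by fancy indexing, then average the negative log-likelihoods.
loss = -np.log(probs[np.arange(len(labels)), labels]).mean()
print(loss)
```

This indexing is exactly equivalent to multiplying by a one-hot matrix and summing, just without materializing the one-hot array.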

Thanks a lot!