I just noticed that the Gluon RNN layer does not return a softmax output. Instead it returns out with shape (sequence_length, batch_size, num_hidden). Does that mean we have to manually add a softmax layer that takes the RNN output as input to get a probability distribution? The documentation describes the outputs as follows:
out: output tensor with shape (sequence_length, batch_size, num_hidden) when layout is “TNC”. If bidirectional is True, output shape will instead be (sequence_length, batch_size, 2*num_hidden).
out_states: output recurrent state tensor with the same shape as states. If states is None, out_states will not be returned.
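
To make the shapes concrete: going by the description above, out holds one hidden vector per time step and batch item, not class probabilities. The usual pattern is to project each hidden vector to num_classes scores and then apply softmax over that last axis. Below is a minimal, framework-free sketch of that idea in plain Python. The sizes, the random stand-in for out, and the weight and bias values are all assumptions for illustration, not anything taken from Gluon. In Gluon itself, the analogous pieces would probably be a gluon.nn.Dense(num_classes, flatten=False) layer on top of the RNN output, followed by a softmax over the last axis. As far as I know, gluon.loss.SoftmaxCrossEntropyLoss applies the softmax itself by default, so for training you would typically pass it the unnormalized scores.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(hidden, weight, bias):
    """Map one hidden vector (num_hidden) to class scores (num_classes)."""
    return [sum(h * w for h, w in zip(hidden, row)) + b
            for row, b in zip(weight, bias)]


def rnn_output_to_probs(out, weight, bias):
    """Apply projection + softmax at every (time step, batch item) position.

    out uses the "TNC" layout: out[t][n] is a hidden vector of length num_hidden.
    """
    return [[softmax(project(h, weight, bias)) for h in step] for step in out]


# Hypothetical sizes, chosen only for illustration.
sequence_length, batch_size, num_hidden, num_classes = 4, 2, 3, 5
rng = random.Random(0)

# Stand-in for the RNN layer's output: (sequence_length, batch_size, num_hidden).
out = [[[rng.uniform(-1, 1) for _ in range(num_hidden)]
        for _ in range(batch_size)]
       for _ in range(sequence_length)]

# Stand-in for learned projection parameters:
# weight is (num_classes, num_hidden), bias is (num_classes,).
weight = [[rng.uniform(-1, 1) for _ in range(num_hidden)]
          for _ in range(num_classes)]
bias = [0.0] * num_classes

# Result has shape (sequence_length, batch_size, num_classes).
probs = rnn_output_to_probs(out, weight, bias)
```

Each probs[t][n] is then a distribution over the num_classes outputs for time step t and batch item n. With bidirectional set to True, the hidden vectors would have length 2*num_hidden, so the projection weights would need that input size instead.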