Hi there,
I have a one hot tensor like this:
The first dimension is the batch size, the second is the character position, and the third is the one-hot vector. One batch looks like this:
And an embedding tensor like this:
Each element is in a certain range:
Then I use char_embed = nd.batch_dot(one_hot, bert_embed) to look up my embedding, and the result contains nan, which has confused me for a whole day.
If there is no nan in bert_embed, why does batch_dot with a one-hot tensor produce nan?
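To make the question concrete, here is a minimal sketch of the situation, written in NumPy rather than MXNet so it runs without extra setup. The shapes, the values, and the helper name `batch_dot` are all made up for illustration and are not the real data or the MXNet implementation. It shows the result one would expect from a one-hot lookup, and one known way for nan to appear without any nan in the input: an inf entry, because 0 * inf is nan under IEEE 754. Whether that is what happens here is only an assumption, but it is cheap to rule out.

```python
import numpy as np


def batch_dot(a, b):
    """Hypothetical stand-in for nd.batch_dot: per-batch matrix product.

    a: (batch, chars, vocab), b: (batch, vocab, dim) -> (batch, chars, dim)
    Written as an explicit multiply-and-sum so every product is visible.
    """
    return (a[:, :, :, None] * b[:, None, :, :]).sum(axis=2)


# Made-up sizes: 2 batches, 3 chars each, vocabulary of 4, embedding size 5.
batch, chars, vocab, dim = 2, 3, 4, 5
ids = np.array([[0, 1, 2], [3, 0, 1]])

one_hot = np.eye(vocab, dtype=np.float32)[ids]  # (batch, chars, vocab)
rng = np.random.default_rng(0)
bert_embed = rng.uniform(-1, 1, size=(batch, vocab, dim)).astype(np.float32)

# Expected behaviour: the product equals a plain index lookup.
char_embed = batch_dot(one_hot, bert_embed)
lookup = bert_embed[np.arange(batch)[:, None], ids]  # (batch, chars, dim)
print("matches index lookup:", np.allclose(char_embed, lookup))

# Assumed failure mode: a single inf (not nan) somewhere in the embedding.
bad = bert_embed.copy()
bad[0, 1, 2] = np.inf
with np.errstate(invalid="ignore"):
    bad_out = batch_dot(one_hot, bad)

print("nan in input: ", bool(np.isnan(bad).any()))
print("inf in input: ", bool(np.isinf(bad).any()))
print("nan in output:", bool(np.isnan(bad_out).any()))
```

If a check like np.isfinite over the whole embedding tensor passes, this explanation does not apply and the cause lies elsewhere. The index lookup in the sketch never multiplies by zero, so comparing it against the batch_dot result is another way to narrow things down.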