Dot product on fp16 for simple networks

I’m working on a problem that can be seen as either a very simple neural network or a handful of linear algebra operations. I’m hoping to use MXNet to take advantage of the fast tensor cores on the Volta V100 GPUs in Amazon EC2 P3 instances. But when I try a simple matrix-vector multiply with mx.nd.dot on two fp16 NDArrays, MXNet crashes with an error saying that dot only supports float32 and float64 (code).
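
For reference, here is a minimal sketch of the kind of call I mean. The shapes, the all-ones values, and the `fp16_matvec` helper name are placeholders for illustration, not my actual code. It defaults to the CPU context so it can be loaded anywhere; on a P3 instance I would pass `mx.gpu(0)` instead, and I have not checked whether the CPU path fails in the same way.

```python
# Minimal sketch of an fp16 matrix-vector multiply with mx.nd.dot.
# Shapes, values, and the helper name are illustrative assumptions.
try:
    import mxnet as mx
except ImportError:  # lets the snippet load on machines without MXNet
    mx = None


def fp16_matvec(rows=4, cols=3, ctx=None):
    """Multiply a (rows x cols) fp16 matrix by a length-cols fp16 vector."""
    if mx is None:
        raise RuntimeError("MXNet is not installed")
    ctx = ctx or mx.cpu()  # on a P3 instance this would be mx.gpu(0)
    A = mx.nd.ones((rows, cols), ctx=ctx, dtype="float16")
    x = mx.nd.ones((cols,), ctx=ctx, dtype="float16")
    y = mx.nd.dot(A, x)  # the call that is rejected for fp16 inputs
    return y.asnumpy()   # asnumpy() blocks, so a deferred error shows up here


if __name__ == "__main__" and mx is not None:
    print(fp16_matvec())
```

Calling `.asnumpy()` forces MXNet's asynchronous engine to finish the operation, so if the dtype is rejected the failure is reported at that line rather than somewhere later.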

Is there some other way to do this? Or is this not supported yet?