KFAC, KFRA and other, fancier "backward" passes

Hi,
So KFAC and KFRA are fairly sophisticated optimization methods for neural networks. They use a Kronecker-factored approximation for each block of the Gauss-Newton matrix. However, computing these factors essentially requires a very specialized extra backward pass. I was wondering if anyone could give advice on the most appropriate way of actually doing this at the block level. Additionally, these methods require matrix inverses, so I was also wondering whether the corresponding functions (e.g. potrf and trsm) have been linked to work on the GPU via cuBLAS/cuSOLVER.
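
For reference, the property that makes the block-level approach attractive: if a block is approximated as A ⊗ G, then (A ⊗ G)^{-1} = A^{-1} ⊗ G^{-1}, so applying the inverse to a weight gradient only ever touches the two small factors. A rough sketch with made-up factor matrices (not taken from any KFAC codebase):

```python
import torch

# Illustrative sizes and made-up SPD Kronecker factors for one layer:
# A built from the layer's inputs, G from back-propagated gradients.
n_in, n_out = 128, 64
A = torch.randn(n_in, n_in);   A = A @ A.t() + torch.eye(n_in)
G = torch.randn(n_out, n_out); G = G @ G.t() + torch.eye(n_out)

grad_W = torch.randn(n_out, n_in)  # gradient of the layer's weight

# (A ⊗ G)^{-1} vec(grad_W) = vec(G^{-1} grad_W A^{-1}), so the full
# (n_in * n_out)^2 block never has to be formed or inverted.
step = torch.linalg.solve(G, grad_W) @ torch.linalg.inverse(A)
print(step.shape)  # torch.Size([64, 128])
```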

LAPACK functions should work on the GPU now.
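
For example, a quick sketch assuming a CUDA build of PyTorch — torch.linalg.cholesky and torch.linalg.solve_triangular are the current names for the potrf/trsm-style routines (older releases exposed them as torch.potrf and torch.trtrs), and they run on the GPU when given CUDA tensors:

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Build a small SPD matrix directly on the GPU.
n = 256
M = torch.randn(n, n, device=device)
A = M @ M.t() + n * torch.eye(n, device=device)

# Cholesky factorization (the potrf analogue).
L = torch.linalg.cholesky(A)

# Triangular solves (the trsm analogue): X = A^{-1} B via two solves.
B = torch.randn(n, 8, device=device)
Y = torch.linalg.solve_triangular(L, B, upper=False)
X = torch.linalg.solve_triangular(L.t(), Y, upper=True)

print((A @ X - B).abs().max())  # residual should be near zero
```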

To modify the backward pass, consider using autograd.Function.
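
A minimal sketch of that idea for a linear layer — the names (`LinearWithStats`, the `stats` dict) are illustrative, not from any library. The backward computes the usual gradients and additionally accumulates the two Kronecker factors A = E[a aᵀ] and G = E[g gᵀ] that KFAC needs for this block:

```python
import torch

class LinearWithStats(torch.autograd.Function):
    """Plain linear op whose backward also records the
    per-block Kronecker factors (hypothetical helper)."""

    @staticmethod
    def forward(ctx, input, weight, stats):
        ctx.save_for_backward(input, weight)
        ctx.stats = stats  # plain dict, passed through untouched
        return input.mm(weight.t())

    @staticmethod
    def backward(ctx, grad_output):
        input, weight = ctx.saved_tensors
        # Usual backward pass.
        grad_input = grad_output.mm(weight)
        grad_weight = grad_output.t().mm(input)
        # Specialized extra work: batch-averaged Kronecker factors.
        n = input.shape[0]
        ctx.stats['A'] = input.t().mm(input) / n              # E[a a^T]
        ctx.stats['G'] = grad_output.t().mm(grad_output) / n  # E[g g^T]
        return grad_input, grad_weight, None  # no grad for the dict

stats = {}
x = torch.randn(32, 128, requires_grad=True)
W = torch.randn(64, 128, requires_grad=True)
LinearWithStats.apply(x, W, stats).sum().backward()
print(stats['A'].shape, stats['G'].shape)  # (128, 128), (64, 64)
```

Another common route is registering forward/backward hooks on existing nn.Module layers to capture the same quantities, which avoids rewriting each op; the autograd.Function approach gives you the most control over what the extra backward work does.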