I had this idea that it should be possible to increase neural net layer sizes incrementally during training. Could you guys look into it and give me feedback, even if it's just to say it's not worth it?
OK, imagine you're training a network and you want to test how it would work with bigger layer sizes. Changing the network architecture requires reinitializing the weights and learning from scratch, which is a pain if you've already spent a week training your model on cheap hardware.
So here is how this could be managed in theory:
(I found it hard to describe the problem using the current editor, so below is a link to an image)
Explanations:
The link above shows a way to augment a weight matrix with a zero matrix so that the output is not affected: even though the weight matrices have different sizes, the matrix multiplication and bias addition give the same result, because the zero entries contribute nothing to the products. If the values going into the activation function don't change after such an augmentation, the activation function gives the same results too, and the output of the whole neural network stays the same.
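Since the image may not be visible to everyone, here is a minimal NumPy sketch of what I mean. It is only my reading of the idea, not MXNet code: I assume two consecutive dense layers somewhere in the middle of the network, a ReLU activation and zero biases for the new units, and the names `widen_with_zeros`, `W_in`, `W_out` are made up for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(h, W_in, b_in, W_out, b_out):
    # h is the activation coming from the preceding layer
    return relu(h @ W_in + b_in) @ W_out + b_out

def widen_with_zeros(W_in, b_in, W_out, extra):
    """Add `extra` hidden units between two layers using zero blocks only."""
    # New units get zero incoming weights (extra columns) and zero biases...
    W_in_big = np.hstack([W_in, np.zeros((W_in.shape[0], extra))])
    b_in_big = np.concatenate([b_in, np.zeros(extra)])
    # ...and zero outgoing weights (extra rows), so they add nothing downstream.
    W_out_big = np.vstack([W_out, np.zeros((extra, W_out.shape[1]))])
    return W_in_big, b_in_big, W_out_big

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 4))                      # batch of 5, 4 features
W_in, b_in = rng.normal(size=(4, 3)), rng.normal(size=3)
W_out, b_out = rng.normal(size=(3, 2)), rng.normal(size=2)

W_in_big, b_in_big, W_out_big = widen_with_zeros(W_in, b_in, W_out, extra=6)

y_small = forward(h, W_in, b_in, W_out, b_out)
y_big = forward(h, W_in_big, b_in_big, W_out_big, b_out)
same_output = np.allclose(y_small, y_big)
print(same_output)
```

The hidden layer grows from 3 to 9 units here, while the shapes seen by the layers before and after it stay the same.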
There is a problem, however. Using zeros is not a good idea, because all of the new weights might receive the same deltas during training if the number of network outputs is low. So instead of padding the matrix with exact zeros, it is possible to pad it with random values that are very close to zero.
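A rough NumPy sketch of the noisy variant, again under my own assumptions: a tanh activation, Gaussian noise, and an arbitrary `scale` of 1e-4 that I picked only for illustration (`widen_with_noise` is a hypothetical helper, not a library function). The output is no longer exactly preserved, but the drift should stay tiny.

```python
import numpy as np

def forward(h, W_in, b_in, W_out, b_out):
    # h is the activation coming from the preceding layer
    return np.tanh(h @ W_in + b_in) @ W_out + b_out

def widen_with_noise(W_in, b_in, W_out, extra, scale=1e-4, seed=0):
    """Add `extra` hidden units initialized with small random values."""
    rng = np.random.default_rng(seed)
    # Each new unit gets different near-zero weights, so the new units
    # do not all start out identical.
    W_in_big = np.hstack([W_in, scale * rng.normal(size=(W_in.shape[0], extra))])
    b_in_big = np.concatenate([b_in, scale * rng.normal(size=extra)])
    W_out_big = np.vstack([W_out, scale * rng.normal(size=(extra, W_out.shape[1]))])
    return W_in_big, b_in_big, W_out_big

rng = np.random.default_rng(1)
h = rng.normal(size=(5, 4))
W_in, b_in = rng.normal(size=(4, 3)), rng.normal(size=3)
W_out, b_out = rng.normal(size=(3, 2)), rng.normal(size=2)

W_in_big, b_in_big, W_out_big = widen_with_noise(W_in, b_in, W_out, extra=6)

y_old = forward(h, W_in, b_in, W_out, b_out)
y_new = forward(h, W_in_big, b_in_big, W_out_big, b_out)
max_drift = np.abs(y_new - y_old).max()  # largest change in any output value
print(max_drift)
```

How small `scale` has to be is something I would expect to tune: too large and the output shifts noticeably, too small and the new units start out almost indistinguishable from zeros.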
One more thing: the first layer's weight matrix cannot be changed, or it would affect the size of the input. So this change is possible only on layers 2, 3, 4, …
Question:
To test this, I need to create a new neural network layer from the values of an existing one. This turned out to be an unbearable task in MXNet and seems to lead to re-implementing the layer itself. Or maybe I'm missing something, and it is simple to create a new layer from custom weights and biases?
Another question is about inserting additional hidden layers between those that are already defined. Is there a way to do that?
Thank you in advance!