If I wanted to reuse an output from one part of a graph in another part (without backpropagating gradients through it), is there an example of how I could do this with the MXNet Symbol API? It would look something like:
output_1 = mx.sym.slice_axis(output, axis=1, begin=0, end=1)  # slice_axis, since take() expects indices rather than begin/end
out = output_1
out = mx.sym.BlockGrad(out)
new_input = mx.sym.concat(out, new_input)
fc = mx.sym.FullyConnected(new_input, num_hidden=128)  # num_hidden is required; 128 is just a placeholder
second_out = mx.sym.Activation(fc, act_type='relu')
final = output_1 + second_out
There isn’t an example as far as I can tell; I agree it would be nice to create one. The pseudocode you wrote above should work. What issues did you run into?
For some reason it completely destroyed the metrics. My question is: when I do out = output_1, does that create a new copy, or do I need to deep-copy it? And does blocking the gradient on out also affect output_1?
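The name-binding part of this question is plain Python rather than anything MXNet-specific, and can be illustrated with a small stand-in sketch. It assumes BlockGrad behaves like other symbol ops, i.e. it returns a new symbol wrapping its input rather than mutating the input in place:

```python
# Minimal stand-in for symbol nodes: ops return NEW nodes that
# reference their inputs; nothing is copied or mutated in place.
class Node:
    def __init__(self, name, inputs=(), blocks_grad=False):
        self.name = name
        self.inputs = list(inputs)
        self.blocks_grad = blocks_grad

def block_grad(sym):
    # Mirrors the assumed behaviour of mx.sym.BlockGrad: wrap the
    # input in a new node and leave the original node untouched.
    return Node("blockgrad", inputs=[sym], blocks_grad=True)

output_1 = Node("output_1")
out = output_1          # plain name binding: no copy is made
print(out is output_1)  # both names refer to the same node

out = block_grad(out)   # rebinds 'out' to a NEW wrapper node
print(out is output_1)  # now False: output_1 itself is unchanged
print(output_1.blocks_grad)  # False: this node never blocked gradients
```

So out = output_1 copies nothing (no deep copy is needed), and under this assumption BlockGrad only stops gradients on paths that go through the wrapper symbol; the direct output_1 + second_out branch still receives gradients.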