I am reading a tutorial about MXNet. The authors use ‘mxnet.gluon.nn.Sequential()’ as a container to store several blocks (see code 1); then they redefine how the blocks are connected in ‘def forward(self, x)’ (see codes 2 and 3). Are there any side effects of doing this? Also, what is the difference between ‘Sequential()’ and ‘HybridSequential()’? I tried using a plain list in place of ‘Sequential’, and I got the following warning during initialization.
‘"ToySSD.downsamplers" is a container with Blocks. Note that Blocks inside the list, tuple or dict will not be registered automatically. Make sure to register them using register_child() or switching to nn.Sequential/nn.HybridSequential instead.’
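To make sure I understand what the warning means, here is a rough analogy in plain Python (this is my own toy mock-up, not the real Gluon API): registration happens when a block is assigned as a direct attribute, so blocks hidden inside a list are invisible to the parent unless registered by hand.

```python
# Toy mock-up of child registration (NOT the real Gluon implementation):
# a block assigned directly becomes a registered child, but a block
# buried inside a list is never seen by __setattr__'s type check.

class ToyBlock:
    def __init__(self):
        self._children = []
        self._params = []

    def __setattr__(self, name, value):
        # only values that are themselves ToyBlocks get registered
        if isinstance(value, ToyBlock):
            self._children.append(value)
        object.__setattr__(self, name, value)

    def register_child(self, block):
        # manual registration, as the warning suggests
        self._children.append(block)

    def collect_params(self):
        params = list(self._params)
        for child in self._children:
            params.extend(child.collect_params())
        return params

class Dense(ToyBlock):
    def __init__(self, name):
        super().__init__()
        self._params.append(name + '.weight')

net = ToyBlock()
net.layer = Dense('seen')        # direct attribute: registered
net.hidden = [Dense('missed')]   # inside a list: NOT registered
print(net.collect_params())      # only 'seen.weight' shows up

net.register_child(net.hidden[0])  # the fix the warning recommends
print(net.collect_params())        # now both parameters are collected
```

If this analogy is right, the list itself is harmless at forward time; the problem is only that the parameters inside it are never collected or initialized by the parent.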
As far as I know, putting blocks into ‘mxnet.gluon.nn.Sequential()’ or ‘mxnet.gluon.nn.HybridSequential()’ tells the framework that these blocks are connected in order. However, if you then describe the relationships between the blocks in the ‘forward’ function, you are telling the framework to connect them in a different way. Will this cause a conflict? And if I only connect some of the blocks in ‘forward’, what happens to the other blocks in ‘Sequential()’ that the ‘forward’ function never uses?
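My current guess, sketched as a minimal plain-Python analogue (again my own mock-up, not the actual Gluon code): the chaining in a Sequential lives only in its default call behavior, so a custom forward that indexes into it is free to ignore that chaining, and blocks it never touches are simply not invoked.

```python
# Toy analogue of nn.Sequential (NOT the real Gluon class): add() only
# stores the blocks; the in-order chaining is just the default __call__,
# which a hand-written forward can bypass entirely.

class Scale:
    """Stand-in for a layer: multiplies its input by a fixed factor."""
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor

class Sequential:
    def __init__(self):
        self._blocks = []

    def add(self, *blocks):
        self._blocks.extend(blocks)   # registration only, no wiring

    def __getitem__(self, i):
        return self._blocks[i]

    def __call__(self, x):
        # default forward: run the blocks in the order they were added
        for block in self._blocks:
            x = block(x)
        return x

seq = Sequential()
seq.add(Scale(2), Scale(3), Scale(5))

print(seq(1))   # default chaining: 1 * 2 * 3 * 5 = 30

def custom_forward(x):
    # custom wiring: use only the first two blocks; Scale(5) is unused
    return seq[1](seq[0](x))

print(custom_forward(1))   # 1 * 2 * 3 = 6
```

If that picture is correct, there is no conflict: the unused blocks keep their registered (and initialized) parameters but contribute nothing to the output. I would like confirmation that the real ‘Sequential’ behaves this way.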
The entire tutorial can be found here.
code 1:
```python
def toy_ssd_model(num_anchors, num_classes):
    downsamplers = nn.Sequential()
    for _ in range(3):
        downsamplers.add(down_sample(128))

    class_predictors = nn.Sequential()
    box_predictors = nn.Sequential()
    for _ in range(5):
        class_predictors.add(class_predictor(num_anchors, num_classes))
        box_predictors.add(box_predictor(num_anchors))

    model = nn.Sequential()
    model.add(body(), downsamplers, class_predictors, box_predictors)
    return model
```
code 2:
```python
def toy_ssd_forward(x, model, sizes, ratios, verbose=False):
    body, downsamplers, class_predictors, box_predictors = model
    anchors, class_preds, box_preds = [], [], []
    # feature extraction
    x = body(x)
    for i in range(5):
        # predict
        anchors.append(MultiBoxPrior(
            x, sizes=sizes[i], ratios=ratios[i]))
        class_preds.append(
            flatten_prediction(class_predictors[i](x)))
        box_preds.append(
            flatten_prediction(box_predictors[i](x)))
        if verbose:
            print('Predict scale', i, x.shape, 'with',
                  anchors[-1].shape[1], 'anchors')
        # down sample
        if i < 3:
            x = downsamplers[i](x)
        elif i == 3:
            x = nd.Pooling(
                x, global_pool=True, pool_type='max',
                kernel=(x.shape[2], x.shape[3]))
    # concat data
    return (concat_predictions(anchors),
            concat_predictions(class_preds),
            concat_predictions(box_preds))
```
code 3:
```python
from mxnet import gluon

class ToySSD(gluon.Block):
    def __init__(self, num_classes, verbose=False, **kwargs):
        super(ToySSD, self).__init__(**kwargs)
        # anchor box sizes and ratios for 5 feature scales
        self.sizes = [[.2, .272], [.37, .447], [.54, .619],
                      [.71, .79], [.88, .961]]
        self.ratios = [[1, 2, .5]] * 5
        self.num_classes = num_classes
        self.verbose = verbose
        num_anchors = len(self.sizes[0]) + len(self.ratios[0]) - 1
        # use name_scope to guard the names
        with self.name_scope():
            self.model = toy_ssd_model(num_anchors, num_classes)

    def forward(self, x):
        anchors, class_preds, box_preds = toy_ssd_forward(
            x, self.model, self.sizes, self.ratios, verbose=self.verbose)
        # it is better to have class predictions reshaped for softmax computation
        class_preds = class_preds.reshape(shape=(0, -1, self.num_classes + 1))
        return anchors, class_preds, box_preds
```