Hey all, as the title indicates, I think there’s a bug (or at least surprising behavior) in how ParameterDict handles the parameters of the shared ParameterDict it can be initialized with.
The Gluon Block’s collect_params method does not surface the parameters of “shared” ParameterDicts, and we need this functionality for some other work.
I think this is a bug in the implementation of the ParameterDict class’s __getitem__ functionality, but I definitely could be missing something. The get method looks in the shared dict, but the __getitem__ method doesn’t, and __getitem__ is what the Gluon Block’s collect_params method uses. I don’t need it to add items to the shared dict, just for them to show up when I loop through the overall ParameterDict.
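To make the asymmetry concrete, here is a toy stand-in written in plain Python. It is not MXNet's source: the class name ToyParameterDict and all of its internals are assumptions, chosen only to mirror the behavior described above, where get consults the shared dict and __getitem__ does not.

```python
from collections import OrderedDict


class ToyParameterDict:
    """Hypothetical stand-in for ParameterDict; not MXNet's actual code."""

    def __init__(self, prefix="", shared=None):
        self.prefix = prefix
        self._params = OrderedDict()   # parameters owned by this dict
        self._shared = shared          # optional dict to borrow parameters from

    def get(self, name):
        full_name = self.prefix + name
        if full_name in self._params:
            return self._params[full_name]
        # get() falls back to the shared dict before creating anything new
        if self._shared is not None and full_name in self._shared._params:
            return self._shared._params[full_name]
        self._params[full_name] = object()  # placeholder for a real Parameter
        return self._params[full_name]

    def __getitem__(self, key):
        # only the local store is consulted here
        return self._params[key]

    def keys(self):
        return self._params.keys()


outer = ToyParameterDict()
a = outer.get("a")

inner = ToyParameterDict(shared=outer)

# Check direct indexing and iteration first, then get().
try:
    inner["a"]
    found_by_getitem = True
except KeyError:
    found_by_getitem = False           # __getitem__ misses the shared entry
visible_keys = list(inner.keys())      # iteration misses it as well
found_by_get = inner.get("a") is a     # get() does find it
```

In this sketch, anything that walks the dict through its local store sees nothing, while a lookup through get succeeds, which matches the symptom I'm seeing.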
Minimum reproducible example
from mxnet.gluon import HybridBlock, ParameterDict

# Create a ParameterDict that owns a single parameter "a".
outer_params = ParameterDict()
outer_params.get("a", shape=(2, 2))
outer_params.initialize()

class ExampleBlock(HybridBlock):
    def __init__(self, param_dict, prefix=''):
        # param_dict is handed to the Block as its shared ParameterDict.
        super(ExampleBlock, self).__init__(prefix=prefix, params=param_dict)

eb = ExampleBlock(outer_params)
eb.collect_params()
Running this, I would expect it to output the a parameter, but instead it outputs just an empty parameter dictionary because collect_params doesn’t look inside the shared param_dict.
What have you tried to solve it?
To unblock ourselves temporarily, we’re using a workaround in our code: manually overriding the _params of the Block with the correct ParameterDict. That’s not a real option even in the short term, though. The obvious solution I see would be to make __getitem__ also loop over the shared ParameterDict’s items, but I don’t know what issues that might cause elsewhere.
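A rough sketch of what I have in mind, again as a toy class in plain Python rather than a patch against MXNet. ToySharedAwareDict and its attributes are hypothetical names; the only point is the fallback in __getitem__ and an items() that also walks the shared dict.

```python
from collections import OrderedDict


class ToySharedAwareDict:
    """Hypothetical sketch; names and structure are assumptions, not MXNet code."""

    def __init__(self, shared=None):
        self._params = OrderedDict()   # entries owned by this dict
        self._shared = shared          # optional dict whose entries are borrowed

    def __getitem__(self, key):
        if key in self._params:
            return self._params[key]
        if self._shared is not None:
            return self._shared[key]   # recurse, so nested sharing works too
        raise KeyError(key)

    def items(self):
        # local entries first, then shared entries not shadowed locally
        for key, value in self._params.items():
            yield key, value
        if self._shared is not None:
            for key, value in self._shared.items():
                if key not in self._params:
                    yield key, value


outer = ToySharedAwareDict()
outer._params["a"] = "param-a"         # strings stand in for real Parameters

inner = ToySharedAwareDict(shared=outer)
inner._params["b"] = "param-b"

merged = list(inner.items())           # local "b" followed by shared "a"
```

The part I'm unsure about is whether code that currently assumes it only sees locally owned parameters (initialization or saving, for example) would end up handling shared parameters more than once with a change like this.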
Thanks all,
Eric
P.S. Related issue I filed here.