I’m within spitting distance of having a major project finished with MXNet, but I’ve hit an absolute wall with this latest issue. Any advice anyone has is welcome!
I’ve rewritten this network as small as possible while still encountering the same issue. I’m pretty sure it relates to how I’m loading my data from a CSV file.
```python
import mxnet as mx

batch_size = 100

class MyCsvSplittingIter(mx.io.DataIter):
    def __init__(self, fname, batch_size, n_col):
        super(MyCsvSplittingIter, self).__init__()
        self.batch_size = batch_size
        self.n_col = n_col
        self.csv_iter = mx.io.CSVIter(data_csv=fname, data_shape=(n_col,),
                                      batch_size=batch_size, round_batch=True)
        input1_desc = mx.io.DataDesc('input1', (batch_size,))
        target_desc = mx.io.DataDesc('target', (batch_size,))
        self._provide_data = [input1_desc]
        self._provide_label = [target_desc]

    def __iter__(self):
        return self

    def reset(self):
        self.csv_iter.reset()

    def __next__(self):
        return self.next()

    @property
    def provide_data(self):
        return self._provide_data

    @property
    def provide_label(self):
        return self._provide_label

    def next(self):
        batch = self.csv_iter.next()
        input1_data = mx.nd.slice(batch.data[0], begin=(0, 0), end=(self.batch_size, 4))
        target_data = mx.nd.slice(batch.data[0], begin=(0, 4), end=(self.batch_size, 5))
        return mx.io.DataBatch([input1_data], [target_data])
```
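As a sanity check on the slicing itself, I worked through the shapes with plain NumPy (a stand-in for the CSVIter batch, not my real code), and the slices come out as (100, 4) and (100, 1) - which makes me wonder whether the `(batch_size,)` shapes in my `DataDesc` entries even match what `next()` returns:

```python
import numpy as np

batch_size = 100
n_col = 149

# Stand-in for one CSVIter batch: batch_size rows of n_col values.
batch = np.zeros((batch_size, n_col))

# Same slices as in next() above.
input1_data = batch[:, 0:4]   # first 4 columns -> the network input
target_data = batch[:, 4:5]   # column 4 -> the label

print(input1_data.shape)  # (100, 4)
print(target_data.shape)  # (100, 1)
```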
```python
input1_data_ = mx.sym.Variable('input1')
target = mx.sym.Variable('target')
net = mx.sym.FullyConnected(input1_data_, name='end', num_hidden=1)
net = mx.sym.LinearRegressionOutput(net, label=target, name='lro')

mod = mx.mod.Module(symbol=net, context=mx.cpu(),
                    data_names=['input1'], label_names=['target'])

train_iter = MyCsvSplittingIter('train.csv', batch_size, 149)  # the csv has more data than I need
test_iter = MyCsvSplittingIter('test.csv', batch_size, 149)
train_iter.reset()
mod.fit(train_iter, eval_data=test_iter, optimizer='sgd',
        optimizer_params={'learning_rate': 0.1}, eval_metric='acc', num_epoch=2)
```
It's straightforward, and apart from reading my data out of the CSV file I think it's working. I'm following the steps recommended here (Label and data in the same csv - #2 by safrooze), with some tweaks.
What I get back is this error message:
```
MXNetError: [02:39:01] src/executor/graph_executor.cc:876: Shape of unspecifie arg: end_weight changed. This can cause the new executor to not share parameters with the old one. Please check for error in network.If this is intended, set partial_shaping=True to suppress this warning.
```

```
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/module/base_module.py in fit
--> 528 self.forward_backward(data_batch)
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/module/base_module.py in forward_backward
--> 196 self.forward(data_batch, is_train=True)
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/module/module.py in forward
--> 629 self.reshape(new_dshape, new_lshape)
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/module/module.py in reshape
--> 477 self._exec_group.reshape(self._data_shapes, self._label_shapes)
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/module/executor_group.py
--> 403 self.bind_exec(data_shapes, label_shapes, reshape=True)
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/module/executor_group.py in bind_exec
--> 379 allow_up_sizing=True, **dict(data_shapes_i + label_shapes_i))
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/executor.py in reshape
--> 458 ctypes.byref(handle)))
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/base.py in check_call(ret)
--> 252 raise MXNetError(py_str(_LIB.MXGetLastError()))
```
For some context around my motivation: that csv file actually has 149 columns - 1 label, 4 data, and 24 sets of 6 columns describing 24 items of the same class. My actual, crazy-large network uses weight tying to process all 24 items and then combines their processed output. It's a fun little project. But even when I drop the weight sharing, the multiple inputs, all of that, I still can't do something as simple as read multiple data inputs from a single CSV file. Both my big network and the smaller one above produce the same error message.
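For the arithmetic, the column layout I'm describing works out to 149 like this (just NumPy bookkeeping to show the intended split; the ordering of the data and label columns follows the slices in my iterator above):

```python
import numpy as np

n_data = 4        # shared input columns
n_label = 1       # the label column
n_items = 24      # items of the same class
cols_per_item = 6 # columns describing each item

n_col = n_data + n_label + n_items * cols_per_item
print(n_col)  # 149

# One row splits into: 4 shared inputs | label | 24 blocks of 6
row = np.arange(n_col)
shared = row[0:4]
label = row[4:5]
items = row[5:].reshape(n_items, cols_per_item)
print(items.shape)  # (24, 6)
```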
Any advice is welcome. Thanks,
mabbo