How to add an L2 penalty to MXNet's LR model?

I can't use Gluon's LR because I have a large dataset that must be loaded with MXNet's DataIter, but I don't know whether it supports an L2 or L1 penalty. How should I deal with this?

@janelu9, can you clarify why you cannot use Gluon for loading your data?
Are you aware of the Dataset and DataLoader classes that might help you with that? DataLoader allows the use of multiple workers for asynchronously pre-fetching data effectively.

You can use the weight decay parameter wd of the trainer; all optimizers accept it. In most cases you can treat it as L2 regularization, and with plain SGD the two are exactly equivalent up to a factor of 2: a weight decay of wd matches a penalty of (wd/2) * ||w||^2 added to the loss. More details here: https://bbabenko.github.io/weight-decay/

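from mxnet import gluon

# 'wd' is the weight decay coefficient; LEARNING_RATE, WDECAY and MOMENTUM are your own hyperparameter values
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': LEARNING_RATE,
                         'wd': WDECAY,
                         'momentum': MOMENTUM})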
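
To make the "factor 2" remark concrete, here is a minimal pure-Python sketch (no MXNet needed). It assumes a plain SGD step of the form w <- w - lr * (grad + wd * w), without momentum or gradient rescaling. The function names and the numbers are made up for illustration and are not part of any MXNet API.

```python
def sgd_step_with_wd(w, grad, lr, wd):
    """One plain SGD step with weight decay: w <- w - lr * (grad + wd * w)."""
    return [wi - lr * (gi + wd * wi) for wi, gi in zip(w, grad)]


def sgd_step_with_l2(w, grad, lr, lam):
    """One plain SGD step on loss + lam * sum(w_i^2).

    The gradient of the penalty term is 2 * lam * w_i.
    """
    return [wi - lr * (gi + 2.0 * lam * wi) for wi, gi in zip(w, grad)]


# Hypothetical weights and data-loss gradient, chosen only for the demo.
w = [0.5, -1.0, 2.0]
grad = [0.1, 0.2, -0.3]
lr, wd = 0.1, 0.01

after_wd = sgd_step_with_wd(w, grad, lr, wd)
after_l2 = sgd_step_with_l2(w, grad, lr, lam=wd / 2.0)

# Largest difference between the two updated weight vectors.
max_diff = max(abs(a - b) for a, b in zip(after_wd, after_l2))
print(max_diff)
```

So if you have an L2 coefficient lam in mind for a penalty written as lam * ||w||^2, pass wd = 2 * lam to the trainer. If you write the penalty as (lam / 2) * ||w||^2, then wd = lam.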