I’m doing a binary classification problem on the Pima Indians dataset. My MLP in Keras gets an accuracy of 85%, but the MxNet version of the same network only gets about 63% and outputs mostly constant values. I’ve tried normalising the data, scaling it, changing activations, number of neurons, batch size and epochs, but nothing seems to help. Any suggestions on what’s causing the huge difference?

Here’s the MxNet code:

```
batch_size=10
train_iter=mx.io.NDArrayIter(mx.nd.array(df_train), mx.nd.array(y_train),
batch_size, shuffle=True)
val_iter=mx.io.NDArrayIter(mx.nd.array(df_test), mx.nd.array(y_test), batch_size)
data=mx.sym.var('data')
fc1 = mx.sym.FullyConnected(data=data, num_hidden=12)
act1 = mx.sym.Activation(data=fc1, act_type='relu')
fc2 = mx.sym.FullyConnected(data=act1, num_hidden=8)
act2 = mx.sym.Activation(data=fc2, act_type='relu')
fcfinal = mx.sym.FullyConnected(data=act2, num_hidden=2)
mlp = mx.sym.SoftmaxOutput(data=fcfinal, name='softmax')
mlp_model = mx.mod.Module(symbol=mlp, context=mx.cpu())
mlp_model.fit(train_iter,
eval_data=val_iter,
optimizer='sgd',
eval_metric='ce',
num_epoch=150)
```

Here’s the Keras code:

```
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(df_train_res, y_train_res)
```