Terrible classification accuracy with mxnet

I’m new to mxnet. I tried the MNIST handwritten-digit recognition example from the tutorial and it worked perfectly. However, when I modify the example to classify 2 simple classes from a hyperspectral image (220 bands/features, a 2-D data set: samples × bands) using a simple MLP, the network does not learn: on every epoch it stays at a fixed, bad accuracy (60-70%), which was the best I could get by varying the number of layers and hidden nodes. I tested the same data set in Weka, which uses a single hidden layer, and got an accuracy of almost 100% (it takes forever, but it learns). If I use the same network layout in mxnet (a single hidden layer with the same number of nodes as in Weka), I get an accuracy of 0% on all epochs. So the network in mxnet is not learning; the accuracy stays the same (really bad) on each epoch. I’ve also tried another example from the web (1000 random vectors of 100 features, 10 classes) and the network learns beautifully, but not with my data set. Any help would be really appreciated.

Do you mind posting your original code? What optimizer and metric are you using? Did you try different hyper-parameters? A random guess would result in an accuracy of 50% for 2-class classification.
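Also note that if the two classes are not balanced, the number to beat is the majority-class frequency rather than 50%. A quick way to get that baseline (a minimal sketch, assuming the labels sit in a 1-D numpy array of non-negative integer class ids):

import numpy

# accuracy of always predicting the most frequent class
counts = numpy.bincount(labels)
print('majority-class baseline:', counts.max() / float(counts.sum()))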

Hi Eric, I’m posting the code here; I can’t upload the images, though. If you give me an e-mail address, I can send you everything you need to run it. I’m still puzzled that Weka does so well with a very simple network layout (though it takes forever, so it would not be practical for a big image).

import mxnet
import logging
import spectral
import numpy

# import data
img = spectral.open_image('19920612_AVIRIS_IndianPine_Site3.lan')
no_bands = img.shape[2]
# training/test split
fraction_training_samples = 0.9
labels_file = spectral.envi.open('IndianPineLabels.hdr')
labels = labels_file.read_band(0)
training_data_list = []
testing_data_list = []
training_label_list = []
testing_label_list = []
no_classes = int(labels.max())
for i in range(1, no_classes + 1):
    rows, cols = (labels == i).nonzero()
    samples = len(rows)
    no_training_samples = int(samples * fraction_training_samples)
    for p in range(no_training_samples):
        training_data_list.append(img[rows[p], cols[p], :].flatten())
        training_label_list.append(i)
    for p in range(no_training_samples, samples):
        testing_data_list.append(img[rows[p], cols[p], :].flatten())
        testing_label_list.append(i)
training_data = numpy.asarray(training_data_list, dtype=numpy.float32)
training_labels = numpy.asarray(training_label_list)
testing_data = numpy.asarray(testing_data_list, dtype=numpy.float32)
testing_labels = numpy.asarray(testing_label_list)

# normalize features (per-band min-max scaling)
for i in range(no_bands):
    maximum = max(training_data[:, i].max(), testing_data[:, i].max())
    minimum = min(training_data[:, i].min(), testing_data[:, i].min())
    training_data[:, i] = (training_data[:, i] - minimum) / (maximum - minimum)
    testing_data[:, i] = (testing_data[:, i] - minimum) / (maximum - minimum)
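# note: the per-band loop above can also be written in vectorized form, e.g.
#   minimum = numpy.minimum(training_data.min(axis=0), testing_data.min(axis=0))
#   maximum = numpy.maximum(training_data.max(axis=0), testing_data.max(axis=0))
#   training_data = (training_data - minimum) / (maximum - minimum)
#   testing_data = (testing_data - minimum) / (maximum - minimum)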

batch_size = 100
train_iter = mxnet.io.NDArrayIter(training_data, training_labels, batch_size, shuffle=True)
test_iter = mxnet.io.NDArrayIter(testing_data, testing_labels, batch_size)

# multilayer perceptron
data = mxnet.sym.var('data')
# the first fully-connected layer and the corresponding activation function
fc1 = mxnet.sym.FullyConnected(data=data, num_hidden=110)
act1 = mxnet.sym.Activation(data=fc1, act_type='relu')
# the second fully-connected layer and the corresponding activation function
fc2 = mxnet.sym.FullyConnected(data=act1, num_hidden=50)
act2 = mxnet.sym.Activation(data=fc2, act_type='relu')
fc3 = mxnet.sym.FullyConnected(data=act2, num_hidden=10)
act3 = mxnet.sym.Activation(data=fc3, act_type='relu')
fc4 = mxnet.sym.FullyConnected(data=act3, num_hidden=2)
# softmax with cross-entropy loss
mlp = mxnet.sym.SoftmaxOutput(data=fc4, name='softmax')

logging.getLogger().setLevel(logging.DEBUG)  # logging to stdout

# create a trainable module on CPU
mlp_model = mxnet.mod.Module(symbol=mlp, context=mxnet.cpu())
mlp_model.fit(train_iter,                 # train data
              eval_data=test_iter,        # validation data
              optimizer='sgd',            # use SGD to train
              optimizer_params={'learning_rate': 0.1},  # use a fixed learning rate
              eval_metric='acc',          # report accuracy during training
              batch_end_callback=mxnet.callback.Speedometer(batch_size, 100),  # output progress every 100 batches
              num_epoch=50)
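One thing that may be worth double-checking in the listing above: the class loop appends labels starting at 1, while SoftmaxOutput with num_hidden=2 produces class indices 0 and 1, so the class labelled 2 can never be predicted correctly and the accuracy metric is capped well below 100%. A zero-based remap right after the label arrays are built (a minimal sketch) would be:

training_labels = training_labels - 1  # map labels 1..no_classes to 0..no_classes-1
testing_labels = testing_labels - 1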