I have tested the Python MNIST examples with the Alex network, and they work fine.
However, I then changed the code to feed in my own images. These are artificial two-class images (red outlined circle vs. blue outlined square). In Mathematica with the Alex network, this learns very fast (1 iteration) with an accuracy of 100%. Of course, I changed the last FC layer to 2 outputs.
In MXNet, however, the network does not learn. The output probabilities are around [0.4966, 0.5034] and are the same for every image.
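For reference, a two-class softmax produces values like these when the two logits are almost equal. The short NumPy sketch below is only an illustration: `softmax` is a helper defined here, not an MXNet call, and it simply recovers the logit gap implied by [0.4966, 0.5034].

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D list of logits (illustrative helper)."""
    z = np.asarray(logits, dtype=np.float64)
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Probabilities reported above.
probs = np.array([0.4966, 0.5034])

# For two classes, the difference between the logits is log(p1 / p0).
gap = np.log(probs[1] / probs[0])

# Feeding that gap back through softmax reproduces the probabilities.
recovered = softmax([0.0, gap])

print(gap)
print(recovered)
```

The gap comes out at roughly 0.014, so the two outputs of the last layer are practically identical for every input.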
I checked whether each batch contains the correctly encoded images and the corresponding labels, and that seems OK. I tried several learning rates and initializations, but nothing seems to work. The only difference I can see is that the MNIST pixel values lie within 0…1, while my images have pixel values between 0…255.
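To make the range difference concrete, here is a minimal NumPy sketch, assuming the per-channel mean of 220 used in the iterator settings further down. The helper name `normalize` and the factor 1/255 are illustrative choices and have not been verified against this dataset. As far as I know, `mx.io.ImageRecordIter` also accepts a `scale` argument that multiplies the pixel values, which might achieve something similar inside the iterator.

```python
import numpy as np

def normalize(pixels, mean=220.0, scale=1.0 / 255.0):
    """Subtract a mean and rescale 0..255 pixel values (illustrative helper)."""
    return (np.asarray(pixels, dtype=np.float32) - mean) * scale

# Darkest pixel, a pixel at the assumed mean, and the brightest pixel.
raw = np.array([0.0, 220.0, 255.0], dtype=np.float32)

# Mean subtraction only: values still span hundreds of units.
centered = normalize(raw, scale=1.0)

# Mean subtraction plus rescaling: values stay within [-1, 1].
scaled = normalize(raw)

print(centered)
print(scaled)
```

With mean subtraction alone the inputs range from -220 to 35, which is far from the 0…1 range of the MNIST example.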
Any idea what’s wrong?
Using MXNet 0.11 with CUDA 8.0 / cuDNN 7 (installed via pip) on Windows 10, VS2017, Anaconda 2, Python 2.7, Titan-X GPU.
Code (rec/lst files can be downloaded from: download link rec/lst files):
import numpy as np
import mxnet as mx
import matplotlib.pyplot as plt
batch_size = 10
#train_iter = mx.io.NDArrayIter(mnist['train_data'], mnist['train_label'], batch_size, shuffle=True)
#val_iter = mx.io.NDArrayIter(mnist['test_data'], mnist['test_label'], batch_size)
def get_iterators(batch_size, data_shape=(3, 28, 28)):
    train = mx.io.ImageRecordIter(
        path_imgrec = 'C:/DataSets/CirclesAndRectsBase28Color/CirclesAndRectsBase28Color_train.rec',
        path_imglist = 'C:/DataSets/CirclesAndRectsBase28Color/CirclesAndRectsBase28Color_train.lst',
        data_name = 'data',
        label_name = 'softmax_label',
        batch_size = batch_size,
        data_shape = data_shape,
        mean_r = 220,
        mean_g = 220,
        mean_b = 220,
        shuffle = False,
        rand_crop = False,
        rand_mirror = False)
    val = mx.io.ImageRecordIter(
        path_imgrec = 'C:/DataSets/CirclesAndRectsBase28Color/CirclesAndRectsBase28Color_val.rec',
        path_imglist = 'C:/DataSets/CirclesAndRectsBase28Color/CirclesAndRectsBase28Color_val.lst',
        data_name = 'data',
        label_name = 'softmax_label',
        batch_size = batch_size,
        data_shape = data_shape,
        rand_crop = False,
        rand_mirror = False)
    return (train, val)
data = mx.sym.var('data')
# first conv layer
conv1 = mx.sym.Convolution(data=data, kernel=(5,5), num_filter=20)
tanh1 = mx.sym.Activation(data=conv1, act_type="tanh")
pool1 = mx.sym.Pooling(data=tanh1, pool_type="max", kernel=(2,2), stride=(2,2))
# second conv layer
conv2 = mx.sym.Convolution(data=pool1, kernel=(5,5), num_filter=50)
tanh2 = mx.sym.Activation(data=conv2, act_type="tanh")
pool2 = mx.sym.Pooling(data=tanh2, pool_type="max", kernel=(2,2), stride=(2,2))
# first fullc layer
flatten = mx.sym.flatten(data=pool2)
fc1 = mx.symbol.FullyConnected(data=flatten, num_hidden=500)
tanh3 = mx.sym.Activation(data=fc1, act_type="tanh")
# second fullc
fc2 = mx.sym.FullyConnected(data=tanh3, num_hidden=2)
# softmax loss
lenet = mx.sym.SoftmaxOutput(data=fc2, name='softmax')
import logging
logging.getLogger().setLevel(logging.DEBUG) # logging to stdout
# create a trainable module on CPU
(train, val) = get_iterators(batch_size)
#test if image is the image we expect
train.reset()
batch = train.next()
dataN = batch.data[0]
im = dataN[0].asnumpy()
imT = im.transpose(2,1,0)
imTscaled = imT * (1.0/255.0)
plt.imshow(imTscaled)
plt.show()
lenet_model = mx.mod.Module(symbol=lenet, context=mx.cpu())
lenet_model.fit(train,
                eval_data=val,
                optimizer='sgd',
                optimizer_params={'learning_rate':1},
                eval_metric='acc',
                batch_end_callback = mx.callback.Speedometer(batch_size, 20),
                num_epoch=10,
                initializer = mx.initializer.Xavier)
val.reset()
prob = lenet_model.predict(val)
probNP = prob.asnumpy()
a = probNP.shape
b = prob.ndim
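One way to quantify "the same for each image" is to look at how much each output column varies across the validation set. This is a minimal sketch, assuming `probNP` is a (num_images, 2) array as produced above. The helper `prediction_spread`, the threshold of 1e-3, and the sample values in `fake_probs` are made up for illustration and are not real output from this model.

```python
import numpy as np

def prediction_spread(probs):
    """Return, per class, max minus min of the predicted probability across images."""
    probs = np.asarray(probs, dtype=np.float64)
    return probs.max(axis=0) - probs.min(axis=0)

# Made-up stand-in for probNP with nearly identical rows.
fake_probs = np.array([[0.4966, 0.5034],
                       [0.4966, 0.5034],
                       [0.4967, 0.5033]])

spread = prediction_spread(fake_probs)

# A spread near zero in every column means the output ignores the input image.
collapsed = bool(np.all(spread < 1e-3))

print(spread)
print(collapsed)
```

Calling `prediction_spread(probNP)` on the real predictions would show whether the network output depends on the input at all.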