I could use some help writing a custom iterator for the Python API. I have a list of libsvm files that I want to read in and extract certain column ranges from, to set up private/public spaces in a mixed-data, multi-task procedure. I can do this one file at a time with scikit-learn's load_svmlight_file, but I'm tangled up wrapping the data extraction procedure around a list of files. Is it possible to do this? Here is what I do for a single file:
import mxnet as mx
from sklearn.datasets import load_svmlight_file

x, y = load_svmlight_file(input_file, n_features=55182)
a = x[:,0].toarray().flatten()
b = x[:,1:211]
c = x[:,212:9568]
d = x[:,9569:55182]
data = {'a': mx.nd.array(a), 'b': mx.nd.sparse.array(b), 'c': mx.nd.sparse.array(c), 'd': mx.nd.sparse.array(d)}
label = {'autoencoder_label': mx.nd.sparse.array(b), 'softmax_label': mx.nd.array(y)}
train_iter = mx.io.NDArrayIter(data=data, label=label, batch_size=64, shuffle=True, last_batch_handle='discard')
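
One possible way to wrap this around a list of files, as a minimal sketch rather than a tested solution: load each file in turn, build the same NDArrayIter for it, and yield its batches from a plain Python generator. The names split_columns and batches_from_files are hypothetical helpers, not part of MXNet, and the column bounds are copied unchanged from the single-file code above.

```python
def split_columns(x):
    """Slice one loaded libsvm matrix into the four blocks used above."""
    return {
        'a': x[:, 0].toarray().flatten(),  # dense 1-D vector from column 0
        'b': x[:, 1:211],                  # remaining blocks stay sparse
        'c': x[:, 212:9568],
        'd': x[:, 9569:55182],
    }


def batches_from_files(files, n_features=55182, batch_size=64):
    """Yield batches from each file in turn (hypothetical helper)."""
    # Imported here so split_columns stays usable without these packages.
    import mxnet as mx
    from sklearn.datasets import load_svmlight_file

    for path in files:
        x, y = load_svmlight_file(path, n_features=n_features)
        parts = split_columns(x)
        data = {
            'a': mx.nd.array(parts['a']),
            'b': mx.nd.sparse.array(parts['b']),
            'c': mx.nd.sparse.array(parts['c']),
            'd': mx.nd.sparse.array(parts['d']),
        }
        label = {
            'autoencoder_label': mx.nd.sparse.array(parts['b']),
            'softmax_label': mx.nd.array(y),
        }
        file_iter = mx.io.NDArrayIter(data=data, label=label,
                                      batch_size=batch_size, shuffle=True,
                                      last_batch_handle='discard')
        # Hand out this file's batches before loading the next file.
        yield from file_iter
```

With this, a custom training loop could do `for batch in batches_from_files(list_of_files): ...`. Only one file is held in memory at a time, but shuffling happens within each file only, and the incomplete last batch of every file is discarded. A generator like this is not an mx.io.DataIter, so passing it to code that expects one would presumably need a subclass that also provides provide_data, provide_label and reset.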