Hi,
I want to load sparse data into Gluon.
When I call todense() to turn the csr_matrix into a NumPy array, I get a memory error (my array is quite large: about 40,000 items with about 164,000 features each).
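For scale, here is a rough back-of-the-envelope check of why the dense conversion runs out of memory. It is only a sketch: the helper name `dense_bytes` is made up for illustration, and the element widths (8 bytes for float64, 4 bytes for float32) are assumptions, since the actual dtype of the matrix is not stated above.

```python
def dense_bytes(n_rows, n_cols, bytes_per_value):
    """Memory needed to store every entry of a dense n_rows x n_cols array."""
    return n_rows * n_cols * bytes_per_value


# Shape taken from the description above; dtype widths are assumptions.
n_items, n_features = 40_000, 164_000

for name, width in (("float64", 8), ("float32", 4)):
    gib = dense_bytes(n_items, n_features, width) / 2**30
    print(f"{name}: {gib:.1f} GiB")
```

Under those assumptions the dense array needs roughly 49 GiB as float64 or 24 GiB as float32, which is more than most single machines can allocate in one block.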
When I call mx.nd.sparse.array() and pass it my csr_matrix, I get a sparse matrix for MXNet. But when I try to load it with Gluon (to split it into batches and iterate over them), I get undefined behaviour, and I don't know how to iterate through my sparse data in Gluon.
This is my data-loading code:
x_train = mx.nd.sparse.array(x_train, ctx=mx.gpu())
x_test = mx.nd.sparse.array(x_test, ctx=mx.gpu())
y_train = nd.array(y_train)
y_test = nd.array(y_test)
batch_size = 32
epochs = 10
train_iter = gluon.data.DataLoader(dc(x_train, y_train), batch_size = batch_size, shuffle = True)
test_iter = gluon.data.DataLoader(dc(x_test, y_test), batch_size = batch_size, shuffle = False)
trainer = gluon.Trainer(m.collect_params(), "adam", {"learning_rate":0.0001})
met = mx.metric.Accuracy()
for e in range(1, epochs):
    met.reset()
    for i, (xt, yt) in enumerate(train_iter):
        xt.attach_grad()
        yt.attach_grad()
        with autograd.record():
            l = criterian(m(xt), yt)
        l.backward()
        print("batch {0}, loss:{1}".format(i, l.mean()))
        # step() expects the batch size (an int), not the first sample
        trainer.step(xt.shape[0])
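To make the goal concrete, here is a minimal sketch of the kind of batching I am after: keep the data as a SciPy csr_matrix and densify only one mini-batch at a time. This is an illustration under assumptions, not a Gluon API: `iter_minibatches` is a hypothetical helper, the labels are assumed to be a NumPy array, and the toy data below merely stands in for the real matrix.

```python
import numpy as np
import scipy.sparse as sp


def iter_minibatches(x_csr, y, batch_size, shuffle=True, seed=0):
    """Yield (dense_x, y) mini-batches from a SciPy CSR matrix.

    Only the rows of the current batch are converted to a dense array,
    so the full matrix is never densified.
    """
    n = x_csr.shape[0]
    order = np.arange(n)
    if shuffle:
        np.random.default_rng(seed).shuffle(order)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        # Row-indexing a CSR matrix returns another (small) CSR matrix.
        yield x_csr[idx].toarray(), y[idx]


# Toy stand-in data (hypothetical sizes, far smaller than the real set):
# 100 rows, 50 features, exactly one non-zero per row.
dense = np.zeros((100, 50), dtype=np.float32)
dense[np.arange(100), np.arange(100) % 50] = 1.0
x = sp.csr_matrix(dense)
y = np.arange(100) % 2

batches = list(iter_minibatches(x, y, batch_size=32))
print(len(batches), batches[0][0].shape, batches[-1][0].shape)
```

Each dense batch could then be wrapped with mx.nd.array(xb, ctx=...) before the forward pass. With 32 rows of 164,000 float32 features, one batch is about 21 MB, so this trades some per-batch conversion cost for a small memory footprint.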
Thanks.