For 8.3.2, I created a large matrix where each column is the one-hot encoding of one character, then summed every N one-hot encodings to get my training features. As there are almost 5 million such samples in total, it took forever to run. Is there a faster way to obtain such an embedding? Thank you.

Not sure if this will help, but I tried converting the matrices to numpy first, using .sum(axis=1) in numpy, and converting the result back to nd.array. I think it’s faster than repeatedly adding nd.arrays.
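
In case it helps, here is a minimal sketch of the numpy route. It assumes the groups of N characters do not overlap and that the characters are already mapped to integer ids; the function name `sum_onehot_groups` and its parameters are made up for illustration. Reshaping the one-hot matrix lets a single `.sum` call handle all groups at once, with no Python loop.

```python
import numpy as np

def sum_onehot_groups(char_ids, vocab_size, n):
    """Sum the one-hot vectors of every n consecutive characters (hypothetical helper)."""
    char_ids = np.asarray(char_ids)
    num_groups = len(char_ids) // n           # drop any incomplete trailing group
    trimmed = char_ids[: num_groups * n]
    # One-hot matrix with one column per character: (vocab_size, num_groups * n)
    onehot = np.zeros((vocab_size, num_groups * n), dtype=np.float32)
    onehot[trimmed, np.arange(num_groups * n)] = 1.0
    # Split the columns into groups of n and sum inside each group
    return onehot.reshape(vocab_size, num_groups, n).sum(axis=2)

# Toy example: 6 characters, vocabulary of 3, groups of 3
features = sum_onehot_groups([0, 2, 2, 1, 0, 0], vocab_size=3, n=3)
print(features.shape)  # one column per group
```

The result is a plain numpy array, so it would still need to be converted back to an nd.array before training, as described above.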

Thanks. I later tried creating an nd.zeros matrix and filling it with nd.sum(…) vectors, and it took a reasonable amount of time.
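
For anyone reading later, a rough sketch of the preallocate-and-fill idea, written with numpy as a stand-in for nd. The function name `fill_window_sums`, the `stride` parameter, and the choice of sliding windows are assumptions for illustration, not the exact code used here. The point is that the output matrix is allocated once and each column is written in place, instead of building the result by repeated array additions.

```python
import numpy as np

def fill_window_sums(onehot, n, stride=1):
    """Fill a preallocated matrix with the sum of each window of n columns.

    onehot: (vocab_size, num_chars) matrix, one one-hot column per character.
    stride: step between window starts (assumed; use stride=n for
            non-overlapping groups).
    """
    vocab_size, num_chars = onehot.shape
    num_windows = (num_chars - n) // stride + 1
    out = np.zeros((vocab_size, num_windows), dtype=onehot.dtype)  # allocate once
    for i in range(num_windows):
        start = i * stride
        out[:, i] = onehot[:, start:start + n].sum(axis=1)  # write in place
    return out

# Toy example: characters with ids 0, 2, 2, 1 and a vocabulary of 3
onehot = np.eye(3)[:, [0, 2, 2, 1]]
window_sums = fill_window_sums(onehot, n=2)
print(window_sums)
```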