SIMD-based Symbol or imperative operators


First of all, I want to praise all the developers involved in this outstanding project!
I am working on my first-year master's thesis on reusing ML frameworks for the internal workloads of columnar databases.
At this point I am benchmarking and comparing ML frameworks. Here is my graph, built on first-order (flat) int64 tensors (shape = (8192)):
where(
    or(
        equal(src_tensor, first_filter_tensor),
        equal(src_tensor, first_filter_tensor)),
    src_tensor, empty_magic_tensor)
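For reference, the intended semantics of this graph can be sketched in plain NumPy. This is only an illustration, not the MXNet code I timed. It assumes the outer op is `where` and that the second comparison uses a distinct filter; `second_filter_tensor`, the filter values 3 and 7, and the sentinel -1 are all hypothetical.

```python
import numpy as np

N = 8192  # flat int64 tensors, shape (8192,)


def filter_graph(src, first_filter, second_filter, empty_magic):
    """Sketch of where(or(equal(src, f1), equal(src, f2)), src, empty_magic)."""
    # Element-wise comparisons produce boolean masks.
    mask = np.logical_or(np.equal(src, first_filter),
                         np.equal(src, second_filter))
    # Keep the source value where either filter matched, else the sentinel.
    return np.where(mask, src, empty_magic)


rng = np.random.default_rng(0)
src_tensor = rng.integers(0, 16, size=N, dtype=np.int64)

# Hypothetical filter values and sentinel, chosen only for this sketch.
first_filter_tensor = np.full(N, 3, dtype=np.int64)
second_filter_tensor = np.full(N, 7, dtype=np.int64)
empty_magic_tensor = np.full(N, -1, dtype=np.int64)

result = filter_graph(src_tensor, first_filter_tensor,
                      second_filter_tensor, empty_magic_tensor)
```

Every element of `result` is either a value that matched one of the filters or the sentinel, which is the behavior I expect from the MXNet version as well.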
Do I understand correctly that there is currently no SIMD-based CPU implementation of the where, or, and equal ops in mshadow? Does it make sense to try the Symbol ops, or are they translated into the same GOMP + mshadow primitives?
I would appreciate any suggestions on performance or on the imperative code.
Here is the actual code I used to measure the timings for the graph I mentioned.