Performance issue on Intel CPU

Hey!
I am using MXNet for inference with a face-recognition model that extracts features from a face image. The main issue is that inference with the CPU context is far slower on an Intel chipset than on an AMD Ryzen chipset.

Intel processor - Intel Xeon Gold 6138
AMD processor - Ryzen 5 2600
In other generic CPU benchmarks, both chipsets perform almost the same in single-core tests.
But inference takes 0.3 seconds on AMD and 7 seconds on Intel.

The model is initialised with the following block:

import mxnet as mx

# Init model (prefix, epoch, layer_name and image_size are defined elsewhere)
ctx = mx.cpu()
sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)
# Cut the network at the feature layer
all_layers = sym.get_internals()
sym = all_layers[layer_name + '_output']
model = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
model.bind(data_shapes=[('data', (1, 3, image_size[0], image_size[1]))])
model.set_params(arg_params, aux_params)
ctx.empty_cache()

Inference is run with the following block:

model.forward(aligned_img, is_train=False)
embedding = model.get_outputs()[0].asnumpy()
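
To make the per-call numbers easier to compare between the two machines, here is a minimal sketch of a timing helper. It is not how the figures in this post were taken. The names `time_inference`, `summarize` and `run_once` are hypothetical, and the workload in the `__main__` block is a stand-in so the sketch runs without MXNet installed. A few warm-up calls are excluded from the measurements, since the first forward pass usually carries one-time setup cost.

```python
import statistics
import time


def time_inference(run_once, warmup=3, repeats=10):
    """Return per-call latencies in seconds for run_once, after warm-up calls."""
    for _ in range(warmup):
        run_once()  # warm-up calls are not measured
    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Reduce a list of latencies to min / median / max."""
    return {
        "min": min(latencies),
        "median": statistics.median(latencies),
        "max": max(latencies),
    }


if __name__ == "__main__":
    # Stand-in workload (an assumption, not the real model).
    # With the model above, run_once would wrap the two inference lines:
    #   model.forward(aligned_img, is_train=False)
    #   model.get_outputs()[0].asnumpy()  # asnumpy() waits for the result
    def run_once():
        return sum(i * i for i in range(10000))

    print(summarize(time_inference(run_once)))
```

The `asnumpy()` call should stay inside the timed function, because MXNet schedules work asynchronously and the forward call alone can return before the computation has finished.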

I have also checked the imgs/sec values with image-classification/benchmark_score.py from the official MXNet GitHub repository, and the results are the same.

I need to prepare my model for CPU inference, so I need it to perform similarly on comparable chipsets.