Optimized topology-aware PS by default?

Hi,
Is the topology-aware single-node all-reduce described at https://cwiki.apache.org/confluence/display/MXNET/Single+machine+All+Reduce+Topology-aware+Communication used by default for single-node multi-device reduction, or does it have to be enabled explicitly?
Cheers