Joining multiple mx.recordio.MXIndexedRecordIO files

Assuming I have a set of record files with their indices, e.g. train1.rec, train1.idx, train2.rec, train2.idx, what is the most efficient way to join them into a single train.rec / train.idx pair?

I have a large set of large raster files, ~300 x (3 x 10k x 10k), and I want to process them (extract training chips for segmentation) in a distributed environment so that I can quickly create a single rec file. I can parallelize the extraction from a single file (with locks), but I am then bound to process all 300 images serially, one by one, to end up with a single rec file (I am actually trying to hack around this, but am not there yet). Any ideas?
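One way around the single-file lock might be to give every raster its own shard, so the workers never touch the same file. A minimal sketch, assuming a hypothetical layout of one sub-directory per raster (out_dir/<name>/<name>.rec) and a hypothetical user-supplied process_one callback that does the actual chip extraction and writing:

```python
import os
from concurrent.futures import ThreadPoolExecutor


def shard_names(raster_path, out_dir):
    """Return the (idx, rec) paths of the shard that belongs to one raster.

    Assumed layout (hypothetical): out_dir/<stem>/<stem>.idx and .rec
    """
    stem = os.path.splitext(os.path.basename(raster_path))[0]
    shard_dir = os.path.join(out_dir, stem)
    return (os.path.join(shard_dir, stem + ".idx"),
            os.path.join(shard_dir, stem + ".rec"))


def process_all(raster_paths, out_dir, process_one, workers=4):
    """Call process_one(raster, idx_path, rec_path) for every raster in parallel.

    process_one is a placeholder for the chip extraction; because each call
    gets its own output paths, no lock is shared between workers.
    """
    jobs = [(path,) + shard_names(path, out_dir) for path in raster_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the results in the same order as raster_paths
        return list(pool.map(lambda job: process_one(*job), jobs))
```

The shards then still have to be joined into one file afterwards.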

Actually, opening multiple files and joining them together is a viable option time-wise, and much faster than creating the single file in one go: ~1 minute for 4 large rasters. Something like this (needs polishing):

from multiprocessing import Lock
from pathos.pools import ThreadPool as pp
import mxnet as mx
import pickle
import glob

class JoinRecIO(object):
    def __init__(self,prefix_read=r'../temp/'):
        flname_recs = sorted(glob.glob(prefix_read + r'*/*.rec'))
        flname_idxs = sorted(glob.glob(prefix_read + r'*/*.idx'))
        self.nameTuples = list(zip(flname_idxs, flname_recs))
        self.record_target = mx.recordio.MXIndexedRecordIO(idx_path=prefix_read+'global.idx', uri=prefix_read +'global.rec', flag='w')
        self.lock = Lock()
        self.global_idx = 0
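    def thread_write(self,nthreads=8):
        # Run write_rec on every (idx, rec) pair using a thread pool, then close the target file.
        pool = pp(nthreads)
        pool.map(self.write_rec, self.nameTuples)
        self.record_target.close()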
    def write_rec(self, names_idx_rec):
        name1, name2 = names_idx_rec
        record_read = mx.recordio.MXIndexedRecordIO(idx_path=name1, uri=name2, flag='r')

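        for key in record_read.keys:
            read_in = record_read.read_idx(key)
            # Only one thread at a time may write and advance the global key.
            with self.lock:
                self.record_target.write_idx(self.global_idx, read_in)
                self.global_idx += 1
        record_read.close()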


myJoin = JoinRecIO()
myJoin.thread_write()
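
Since every write has to go through one lock anyway, a plain single-threaded loop may turn out to be about as fast; that is worth timing on real data. A minimal sketch of the same renumbering idea, written against any object that exposes keys, read_idx(key) and write_idx(key, buf) the way MXIndexedRecordIO does. InMemoryRecords is a hypothetical stand-in used only so the example runs without mxnet:

```python
def merge_indexed_records(readers, writer):
    """Copy every record of each reader into writer under new consecutive keys.

    Returns the number of records written.
    """
    next_key = 0
    for reader in readers:
        for key in reader.keys:
            writer.write_idx(next_key, reader.read_idx(key))
            next_key += 1
    return next_key


class InMemoryRecords(object):
    """Hypothetical stand-in for an indexed record file (demo only)."""

    def __init__(self, records=()):
        self._data = dict(enumerate(records))
        self.keys = list(self._data)

    def read_idx(self, key):
        return self._data[key]

    def write_idx(self, key, buf):
        self._data[key] = buf
        self.keys.append(key)


part1 = InMemoryRecords([b"a", b"b"])
part2 = InMemoryRecords([b"c"])
merged = InMemoryRecords()
count = merge_indexed_records([part1, part2], merged)
```

With real files, the readers would be MXIndexedRecordIO objects opened with flag='r' and the writer one opened with flag='w', closed once the merge is done.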