Gluon.data.vision.transforms.Compose, why does the order in Compose matter?

wilhelm · February 3, 2023, 5:47pm

Dear all,

I noticed that the order of the transformations passed to Compose determines whether the code works or raises an error.

Example
The following snippet is from my code:

…
transformer = transforms.Compose([transforms.CenterCrop(227), transforms.ToTensor()])

trainDataset = ImageFolderDataset(root = '~/images', flag = 1).transform_first(transformer)
valDataset = ImageFolderDataset(root = '~/images', flag = 1).transform_first(transformer)

trainData = DataLoader(trainDataset, batch_size = 64, shuffle = True, last_batch = 'discard')
valData = DataLoader(valDataset, batch_size = 64, shuffle = True, last_batch = 'discard')
…

This runs smoothly to the end of the program, with no errors. I get the images in the desired size (227x227x3) and everything is fine.

But if I swap the two transformations, as in the following snippet:

transformer = transforms.Compose([transforms.ToTensor(), transforms.CenterCrop(227)])

trainDataset = ImageFolderDataset(root = '~/images', flag = 1).transform_first(transformer)
valDataset = ImageFolderDataset(root = '~/images', flag = 1).transform_first(transformer)

trainData = DataLoader(trainDataset, batch_size = 64, shuffle = True, last_batch = 'discard')
valData = DataLoader(valDataset, batch_size = 64, shuffle = True, last_batch = 'discard')

then I get the following error:

Traceback (most recent call last):
  File "mx_bloodcells.py", line 113, in <module>
    data = mx.gluon.utils.split_and_load(inputs, ctx_list = ctx, batch_axis = 0)
  File "/opt/anaconda/anaconda3/envs/dl/lib/python3.8/site-packages/mxnet/gluon/utils.py", line 118, in split_and_load
    slices = split_data(data, len(ctx_list), batch_axis, even_split)
  File "/opt/anaconda/anaconda3/envs/dl/lib/python3.8/site-packages/mxnet/gluon/utils.py", line 64, in split_data
    size = data.shape[batch_axis]
  File "/opt/anaconda/anaconda3/envs/dl/lib/python3.8/site-packages/mxnet/ndarray/ndarray.py", line 2409, in shape
    check_call(_LIB.MXNDArrayGetShapeEx(
  File "/opt/anaconda/anaconda3/envs/dl/lib/python3.8/site-packages/mxnet/base.py", line 246, in check_call
    raise get_last_ffi_error()
mxnet.base.MXNetError: Traceback (most recent call last):
  File "/work/mxnet/src/c_api/c_api.cc", line 2104
MXNetError: Check failed: arr->shape().Size() < (int64_t{1} << 31) - 1 (16885022720 vs. 2147483647) : [Get Shape] Size of tensor you are trying to allocate is larger than 2^31 elements. Please build with flag USE_INT64_TENSOR_SIZE=1

The network is exactly the same. There were no other changes to the code; I only swapped the two transformations.

Could you please help me to understand this problem?
Thanks

wilhelm · February 3, 2023, 6:03pm

OK,
I didn't find out why, but I found out that this behaviour is already known.

The official online documentation says:

Caution: ordering of transforms is important. e.g. ToTensor should be applied before Normalize, but after Resize and CenterCrop.

Source.
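
A likely explanation, going by the shapes given in the documentation: ToTensor turns an (H x W x C) uint8 image into a (C x H x W) float32 tensor, while CenterCrop reads its input as (H x W x C). If ToTensor runs first, the crop would therefore misread the axes. The sketch below only tracks shapes in plain Python; the helper functions and the 480x640 image size are hypothetical illustrations, not MXNet APIs, and it does not reproduce the exact allocation error above.

```python
# Shape-only sketch; these helpers are hypothetical, not part of MXNet.

def to_tensor_shape(shape):
    """Mimic the layout change of ToTensor: (H, W, C) -> (C, H, W)."""
    h, w, c = shape
    return (c, h, w)

def axes_seen_by_crop(shape):
    """Name the axes the way a crop that assumes (H, W, C) would read them."""
    height, width, channels = shape
    return {"height": height, "width": width, "channels": channels}

image = (480, 640, 3)  # assumed example image: 480 rows, 640 columns, RGB

# Crop first: the crop sees an ordinary 480x640 image with 3 channels.
crop_first = axes_seen_by_crop(image)

# ToTensor first: the crop sees a 3-pixel-tall "image" with 640 channels.
tensor_first = axes_seen_by_crop(to_tensor_shape(image))

print(crop_first)
print(tensor_first)
```

With the crop applied first, the axes mean what CenterCrop expects. With ToTensor applied first, the channel axis is taken for the height, so the crop (and any resizing it does) operates on the wrong dimensions.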
