Is this a correct way to prepare custom data for yolo v3 detector?

stereomatchingkiss · January 7, 2019, 4:58pm

My solution

Convert the data to rec format, as this post mentioned.

Change the get_dataset function to the following (the training set and validation set are both pikachu_train.rec from this example):

def get_dataset(args):
    train_dataset = gcv.data.RecordFileDetection(args.train_dataset)
    val_dataset = gcv.data.RecordFileDetection(args.validate_dataset)
    classes = read_classes(args)  # this function reads the class names from a txt file
    val_metric = VOC07MApMetric(iou_thresh=0.5, class_names=classes)

    if args.num_samples < 0:
        args.num_samples = len(train_dataset)
    if args.mixup:
        from gluoncv.data import MixupDetection
        train_dataset = MixupDetection(train_dataset)
    return train_dataset, val_dataset, val_metric

Everything else is the same as train_yolo3.py.
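
The read_classes helper is not shown in this post. A minimal sketch, assuming it reads one class name per line from the file passed as --classes_list (the attribute name args.classes_list and the file layout are assumptions based on the command line used later in this thread):

```python
import argparse
import os
import tempfile


def read_classes(args):
    """Return the class names listed one per line in args.classes_list.

    Assumption: args.classes_list is the path to a plain-text file;
    blank lines are skipped.
    """
    with open(args.classes_list, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Usage with a throwaway single-class list file.
with tempfile.TemporaryDirectory() as tmp_dir:
    list_path = os.path.join(tmp_dir, "pikachu_list.txt")
    with open(list_path, "w", encoding="utf-8") as f:
        f.write("pikachu\n")
    demo_args = argparse.Namespace(classes_list=list_path)
    demo_classes = read_classes(demo_args)

print(demo_classes)
```

The returned list can be passed as class_names to VOC07MApMetric, as in get_dataset above.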

The training results look fine(?), although not as good as SSD. I wonder if I introduced any bugs.

PS: the validate function doesn’t work at all; I am looking for a way to fix it. The error message is
“ValueError: zero-dimensional arrays cannot be concatenated”

thomelane · January 7, 2019, 9:46pm

Hi @stereomatchingkiss,

One reason for the drop in performance compared to SSD could be a lack of augmentation. I see that the SSD example you’ve linked to uses SSDDefaultTrainTransform. You could try YOLO3DefaultTrainTransform for your case.

As for the error, it would be great if you could provide the full stack trace. You mentioned that you’re using pikachu_train.rec for the validation set, is this what you intended?

stereomatchingkiss · January 8, 2019, 12:13am

Yes, I use YOLO3DefaultTrainTransform in my case. Changing the detection size to 608 detects more pikachus, but the results are still worse than SSD. Maybe one of the reasons is that the pikachus are too small and YOLOv3 is worse at detecting small objects than SSD.

def get_dataloader(net, train_dataset, val_dataset, data_shape, batch_size, num_workers, args):
    """Get dataloader."""
    width, height = data_shape, data_shape
    batchify_fn = Tuple(*([Stack() for _ in range(6)] + [Pad(axis=0, pad_val=-1) for _ in range(1)]))  # stack image, all targets generated
    if args.no_random_shape:
        print("no random shape")
        train_loader = gluon.data.DataLoader(
            train_dataset.transform(YOLO3DefaultTrainTransform(width, height, net, mixup=args.mixup)),
            batch_size, True, batchify_fn=batchify_fn, last_batch='rollover', num_workers=num_workers)
    else:
        print("with random shape")
        transform_fns = [YOLO3DefaultTrainTransform(x * 32, x * 32, net, mixup=args.mixup) for x in range(10, 20)]
        train_loader = RandomTransformDataLoader(
            transform_fns, train_dataset, batch_size=batch_size, interval=10, last_batch='rollover',
            shuffle=True, batchify_fn=batchify_fn, num_workers=num_workers)
    val_batchify_fn = Tuple(Stack(), Pad(pad_val=-1))    
    val_loader = gluon.data.DataLoader(
        val_dataset.transform(YOLO3DefaultValTransform(width, height)),
        batch_size, True, batchify_fn=val_batchify_fn, last_batch='keep', num_workers=num_workers)
    return train_loader, val_loader

Traceback (most recent call last):
  File "train_yolo3_custom.py", line 364, in <module>
    validate(net, val_data, ctx, eval_metric)
  File "train_yolo3_custom.py", line 181, in validate
    eval_metric.update(det_bboxes, det_ids, det_scores, gt_bboxes, gt_ids, gt_difficults)
  File "C:\Users\yyyy\Anaconda3\lib\site-packages\gluoncv\utils\metrics\voc_detection.py", line 107, in update
    gt_bboxes, gt_labels, gt_difficults]]):
  File "C:\Users\yyyy\Anaconda3\lib\site-packages\gluoncv\utils\metrics\voc_detection.py", line 106, in <listcomp>
    *[as_numpy(x) for x in [pred_bboxes, pred_labels, pred_scores,
  File "C:\Users\yyyy\Anaconda3\lib\site-packages\gluoncv\utils\metrics\voc_detection.py", line 97, in as_numpy
    return np.concatenate(out, axis=0)
ValueError: zero-dimensional arrays cannot be concatenated

My goal is to make the script support any rec file. I use pikachu_train.rec in this post because I want to make sure the data is fine.

The full code is here: pastebin

Thanks for your help.

thomelane · January 8, 2019, 1:38am

Given you’re using YOLO v3 I’d expect the opposite actually! It uses a Feature Pyramid Network which is supposed to give improved performance on small objects.

Many thanks for sharing your code by the way. I’ll try running it and get back to you. Cheers, Thom

stereomatchingkiss · January 8, 2019, 2:39am

Weird, maybe the training part has some bugs.

Thanks too. By the way, the following is the command I use:

python train_yolo3_custom.py --epochs 1 --lr 0.0001 --train_dataset pikachu_train.rec --validate_dataset pikachu_train.rec --classes_list pikachu_list.txt --batch-size 4
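
The argument parser of train_yolo3_custom.py is not shown in this thread. Below is a hypothetical sketch of how the options in this command could be declared with argparse. Only the flag names are taken from the command above; the types, defaults and help strings are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the options used in the command above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Train YOLOv3 on a custom rec file (sketch).")
    # Flag names come from the command above; defaults are placeholders.
    parser.add_argument("--train_dataset", type=str, required=True,
                        help="path to the training .rec file")
    parser.add_argument("--validate_dataset", type=str, required=True,
                        help="path to the validation .rec file")
    parser.add_argument("--classes_list", type=str, required=True,
                        help="text file with one class name per line")
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--lr", type=float, default=0.0001)
    # argparse exposes "--batch-size" as args.batch_size.
    parser.add_argument("--batch-size", type=int, default=4)
    return parser.parse_args(argv)


args = parse_args([
    "--epochs", "1", "--lr", "0.0001",
    "--train_dataset", "pikachu_train.rec",
    "--validate_dataset", "pikachu_train.rec",
    "--classes_list", "pikachu_list.txt",
    "--batch-size", "4",
])
print(args.train_dataset, args.batch_size, args.lr)
```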

To make the training code work, I commented out the validation code.
You can see the results on “pikachu_test.jpg” by enabling the following code (the last 4 lines):

x, image = gcv.data.transforms.presets.yolo.load_test('pikachu_test.jpg')
cid, score, bbox = net(x)
ax = viz.plot_bbox(image, bbox[0], score[0], cid[0], class_names=classes)
plt.show()

You can download pikachu_train.rec with the following code:

from gluoncv.utils import download

url = 'https://apache-mxnet.s3-accelerate.amazonaws.com/gluon/dataset/pikachu/train.rec'
idx_url = 'https://apache-mxnet.s3-accelerate.amazonaws.com/gluon/dataset/pikachu/train.idx'
download(url, path='pikachu_train.rec', overwrite=False)
download(idx_url, path='pikachu_train.idx', overwrite=False)

stereomatchingkiss · January 8, 2019, 3:41am

pikachu_list.txt has only one line of text:

pikachu

stereomatchingkiss · January 8, 2019, 12:06pm

After some changes to the training options, the results can now compete with SSD.

I put the updated code on pastebin.

The command I use:

python train_yolo3_custom.py --epochs 10 --lr 0.001 --train_dataset pikachu_train.rec --validate_dataset pikachu_train.rec --classes_list pikachu_list.txt --batch-size 8 --no-random-shape

Notes:

yolo3 converges more slowly than SSD at the same learning rate (0.001).
With random-shape on, training uses a lot of memory and the learning rate needs to be smaller (0.0001), otherwise the loss becomes NaN.
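
A rough way to see why random-shape costs more memory: the random-shape branch of get_dataloader posted above draws input sizes from x * 32 for x in range(10, 20). The sketch below lists those sizes and compares the pixel count of the largest one with a fixed 416x416 input. The 416 baseline is an assumption (replace it with the data shape you actually train with), and pixel count is only a proxy for activation memory.

```python
# Sizes used by the random-shape branch: x * 32 for x in range(10, 20).
random_sizes = [x * 32 for x in range(10, 20)]

# Assumed fixed input size for comparison; not stated in this thread.
fixed_size = 416

largest = max(random_sizes)
pixel_ratio = (largest * largest) / (fixed_size * fixed_size)

print(random_sizes)
print("largest random size:", largest)
print("pixels relative to fixed size: %.2fx" % pixel_ratio)
```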

About the validate function: I am still looking for a way to make it work. If possible, I do not want to manipulate the arrays manually but to use the functions in the library.

Edit: found the bug in validate.

It looks like the validate failure is a bug in gluoncv (I am using gluoncv on Windows). My solution is:

Copy voc_detection from GitHub
Change the file name to voc_detection_2.py
Move it to the gluoncv.utils.metrics folder (mine is C:\my_folder\Anaconda3\Lib\site-packages\gluoncv\utils\metrics)
Change the import from "from gluoncv.utils.metrics.voc_detection import VOC07MApMetric" to "from gluoncv.utils.metrics.voc_detection_2 import VOC07MApMetric"

thomelane · January 8, 2019, 6:25pm

Glad you managed to get competitive results! Do you still need me to run the code?

Was there a particular GitHub issue you found referencing the validation issue?
If not, it might be a good idea for us to add one for this.

And an alternative to copying files from the repository and renaming is to install the nightly build using:

pip install gluoncv --pre --upgrade

stereomatchingkiss · January 8, 2019, 11:35pm

Thanks, I don’t think that’s needed anymore.

The issue is in the as_numpy function: the original implementation did not consider the case where the arrays cannot be concatenated.

def as_numpy(a):
    """Convert a (list of) mx.NDArray into numpy.ndarray"""
    if isinstance(a, (list, tuple)):
        out = [x.asnumpy() if isinstance(x, mx.nd.NDArray) else x for x in a]
        out = np.array(out)
        return np.concatenate(out, axis=0)
    elif isinstance(a, mx.nd.NDArray):
        a = a.asnumpy()
    return a

It should be changed to

def as_numpy(a):
    """Convert a (list of) mx.NDArray into numpy.ndarray"""
    if isinstance(a, (list, tuple)):
        out = [x.asnumpy() if isinstance(x, mx.nd.NDArray) else x for x in a]
        try:
            out = np.concatenate(out, axis=0)
        except ValueError:
            out = np.array(out)
        return out
    elif isinstance(a, mx.nd.NDArray):
        a = a.asnumpy()
    return a

Just catch the exception and the problem is solved.
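
To illustrate why the fallback helps, here is a minimal NumPy-only sketch. No MXNet is involved, and concat_or_stack is a hypothetical name, not part of gluoncv. np.concatenate raises a ValueError with the same message as in the traceback when its inputs are zero-dimensional, and np.array then stacks the elements instead:

```python
import numpy as np


def concat_or_stack(arrays):
    """Concatenate along axis 0; fall back to np.array when that fails."""
    try:
        return np.concatenate(arrays, axis=0)
    except ValueError:
        return np.array(arrays)


# Zero-dimensional inputs: concatenate fails, the fallback stacks them.
scalars = [np.array(1.0), np.array(2.0)]
stacked = concat_or_stack(scalars)

# Regular batches: concatenate works as before.
batches = [np.zeros((2, 4)), np.ones((3, 4))]
joined = concat_or_stack(batches)

print(stacked.shape, joined.shape)
```

Inputs that already concatenate cleanly take the same path as before; only the failing case goes through the fallback.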

Thanks for the nightly build suggestion, but I would prefer to stick with the “stable” version.

kargarisaac · March 20, 2019, 9:17am

Thank you for the great conversation. I have a question about SSDDefaultTrainTransform and YOLO3DefaultTrainTransform. Do they perform augmentation? Is it possible to select which data augmentations are applied? I saw some different (maybe parallel) functions for augmentation, such as CreateDetAugmenter. Is there any tutorial or example that shows how to use data augmentation in object detection?
Thanks
