Retraining SSD for pencils and pens

We have created a custom model with

net = gcv.model_zoo.get_model('ssd_512_resnet50_v1_custom', classes=['pencil', 'pen'], pretrained_base=True, transfer='voc')

How should we retrain the model for higher accuracy? Do we add in new labeled images into the existing dataset, append to the trainval.txt, and just use the same model as above?

Do you want your model to recognize only pens and pencils?

Just for an overview, you need to:

  • get labelled images of pens and pencils;
  • starting from these images, you have to prepare a custom dataset. I generally use LabelImg to manually draw the bounding boxes around the objects you want to detect in each image. This produces an .xml file for each image, which you can then use to generate the .lst and .rec files for the custom dataset. If you need it, I have a program to generate the .rec files automatically, but you still have to manually label all the images, which is the boring part :slight_smile: ;
  • once you have the dataset, you can finetune a pre-trained model. You can use a model pre-trained on the COCO dataset as a starting point, considering the objects should be similar;
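To make the annotation step above concrete: each image gets one Pascal VOC-style .xml file like the ones LabelImg writes out. A minimal sketch using only the standard library (the file names and box coordinates here are made up for illustration):

```python
import xml.etree.ElementTree as ET

def write_voc_xml(filename, width, height, boxes, out_path):
    """Write a minimal Pascal VOC annotation file.

    boxes is a list of (class_name, xmin, ymin, xmax, ymax) tuples
    in pixel coordinates.
    """
    root = ET.Element('annotation')
    ET.SubElement(root, 'filename').text = filename
    size = ET.SubElement(root, 'size')
    ET.SubElement(size, 'width').text = str(width)
    ET.SubElement(size, 'height').text = str(height)
    ET.SubElement(size, 'depth').text = '3'
    for name, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, 'object')
        ET.SubElement(obj, 'name').text = name
        ET.SubElement(obj, 'difficult').text = '0'
        bb = ET.SubElement(obj, 'bndbox')
        ET.SubElement(bb, 'xmin').text = str(xmin)
        ET.SubElement(bb, 'ymin').text = str(ymin)
        ET.SubElement(bb, 'xmax').text = str(xmax)
        ET.SubElement(bb, 'ymax').text = str(ymax)
    ET.ElementTree(root).write(out_path)

# Example: one "pen" box in a 512x384 image (made-up coordinates)
write_voc_xml('img_0001.jpg', 512, 384, [('pen', 30, 40, 200, 90)], 'img_0001.xml')
```

LabelImg writes a few more fields (pose, truncated, segmented), but this is the part the .lst/.rec conversion actually needs.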

If you need any help, go ahead and ask.

Hi @LewsTherin511 thanks for the overview!

The reason we chose only pencils and pens is that their shapes are quite similar, so they make a good dataset for checking how to improve accuracy with retraining.

  1. First, we used the same tool, LabelImg, to annotate the images of pencils and pens. We also used imgaug to create augmented images.

  2. Next, we trained our first model with train_ssd.py. For the custom dataset, we declared a VOCLike class to be used in get_dataset():

class VOCLike(VOCDetection):
    CLASSES = ['pencil', 'pen']

    def __init__(self, root, splits, transform=None, index_map=None, preload_label=True):
        super(VOCLike, self).__init__(root, splits, transform, index_map, preload_label)
  3. After a few predictions on different images from the training and validation datasets, we found a few images the model failed to recognize, and annotated these new images with LabelImg + augmentation, as in step 1.
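One detail worth stressing about the augmentation in step 1: when an image is flipped or shifted, the bounding boxes must be transformed the same way (imgaug does this for you if you pass it the boxes alongside the image). A minimal sketch of the geometry for a horizontal flip, in plain Python:

```python
def hflip_boxes(boxes, img_width):
    """Mirror bounding boxes for a horizontally flipped image.

    boxes is a list of (xmin, ymin, xmax, ymax) in pixel coordinates.
    After the flip, the old right edge becomes the new left edge.
    """
    flipped = []
    for xmin, ymin, xmax, ymax in boxes:
        new_xmin = img_width - xmax
        new_xmax = img_width - xmin
        flipped.append((new_xmin, ymin, new_xmax, ymax))
    return flipped

# A pen box in a 512-pixel-wide image (made-up coordinates)
print(hflip_boxes([(30, 40, 200, 90)], 512))  # [(312, 40, 482, 90)]
```

Augmented images therefore need their own .xml files with the transformed boxes, not copies of the originals.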

However, a question arises when retraining the model with more data for the same classes:

  1. Do we just add the new images to the existing dataset in the VOC folder and train using the same script, train_ssd.py?

  2. Or do we retrain the existing model with finetune_detection.py? That script uses the custom model ssd_512_mobilenet1.0_custom instead. We also noticed it calls net.reset_class(classes), which doesn't seem applicable to us, since we are training for the same classes?

Ok, I hadn't understood precisely what you wanted to do before, but now I think it's clear!

Honestly, I never tried what you’re doing here, but I guess you could:

  • create a new dataset with the new “difficult” images;

  • use finetune_detection.py;

  • load the same custom model you used during the first training, and load the last .param file you saved during the training, something like:

    classes = ["pen", "pencil"]
    net = gcv.model_zoo.get_model('ssd_512_mobilenet1.0_custom', classes=classes, pretrained_base=False, ctx=ctx)
    net.load_parameters("path_to_param_file", ctx=ctx)

At that point, you can run the training again on the new dataset, so that it doesn't start from scratch but continues from where you left off.
Again, I never really tried, so you can either try or wait for someone else’s opinion as well. :slight_smile:
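On the other part of the original question, appending to trainval.txt: whichever route you take, the split file is just a list of image IDs, one per line, so adding newly annotated images to an existing VOC-style dataset can be scripted. A minimal sketch (the file name and image IDs are hypothetical):

```python
import os

def append_to_split(split_file, new_ids):
    """Append new image IDs to a VOC split file (e.g. trainval.txt),
    skipping IDs that are already listed."""
    existing = set()
    if os.path.exists(split_file):
        with open(split_file) as f:
            existing = {line.strip() for line in f if line.strip()}
    with open(split_file, 'a') as f:
        for img_id in new_ids:
            if img_id not in existing:
                f.write(img_id + '\n')
                existing.add(img_id)

# Hypothetical: register two newly annotated "difficult" images
append_to_split('trainval.txt', ['pen_0101', 'pencil_0102'])
```

The deduplication matters if you rerun your dataset-preparation step, so the same image doesn't get listed (and sampled) twice.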
