im2rec.py: I surrender! Generated list is incorrect

Hi all,

I’m using Amazon SageMaker to build my first model with the Single Shot MultiBox Detector (SSD). I’ve run through the example that uses the COCO dataset and everything worked fine. I’m now trying to train on my own images.

I have 600 images of dogs, and for my first ever model I just want to identify whether there is a dog in the picture, that’s it. I am trying to use im2rec.py to create a RecordIO dataset from my 600 JPG images. All the JPG images are in a single directory. I run the following command to generate the list files:

python im2rec.py --num-thread 8 --list --recursive --test-ratio=0.3 --train-ratio=0.7 collie_lst ~/images/combined/

That generates 2 files, collie_lst_test.lst and collie_lst_train.lst.

When I look at the files however they do not look correct. Here are the first few lines:

571 0.000000 626.jpg
63 0.000000 158.jpg
288 0.000000 365.jpg
473 0.000000 533.jpg
614 0.000000 669.jpg
249 0.000000 329.jpg

My understanding is that there should be a lot more data for each image, but no matter what I try, this is all I get. I’ve been trying different permutations of the command for hours with no luck. If anyone can help me understand what I am doing wrong, I would be eternally grateful!

Thank you.

Hi, this looks correct for image classification: column 1 is the image index, column 2 is the class label (0 or 1), and column 3 is the image path. What makes you think it’s wrong? Note:

  • If you train an image classifier to detect the presence of dogs, that’s at least 2 classes (dog/not dog), so if your whole dataset is only dogs the model won’t be able to learn what non-dog objects look like
  • If you want to do object detection, you need to add localization metadata (bounding boxes) to your training data
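To make the column layout concrete, here is a minimal sketch that splits one of the lines from the question into its fields. The helper name `parse_lst_line` is made up for illustration, and it assumes whitespace-separated fields with no spaces in the image path.

```python
def parse_lst_line(line):
    """Split one .lst line into (index, labels, path).

    Assumes whitespace-separated fields and a path without spaces.
    """
    fields = line.split()
    index = int(fields[0])                      # column 1: image index
    labels = [float(x) for x in fields[1:-1]]   # everything in between: label values
    path = fields[-1]                           # last column: image path
    return index, labels, path


index, labels, path = parse_lst_line("571 0.000000 626.jpg")
print(index, labels, path)
```

A single label value per line, as here, is what a classification list carries. It holds no box coordinates, so there is nothing in it that says where the dog is in the image.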

Thank you for the reply.

When I try to run training in SageMaker I receive the following error:

“Not enough label packed in img_list or rec file”

When I search for the cause of the error message I get a link to a SageMaker GitHub issue where someone else hit the same problem:

The reply on that issue explains the cause as:

According to the log you posted, the input RecordIO file does not contain enough annotations for training the object detection algorithm. Before you convert the images into RecordIO format, please make sure the .lst file you generated contains all the annotation information. The annotation information for each object is represented as [class_index, xmin, ymin, xmax, ymax].

So, reading that, I assumed I need to have class_index, xmin, ymin, xmax, ymax for each image.

Does that make sense? Sorry, this is a pretty steep learning curve 🙂

Yes, basically. To train an object detection algorithm, your data needs to contain both classification (which object?) and localization (where?) information. If you are creating your own dataset, you need bounding box information (xmin, ymin, xmax, ymax) to identify where your objects are.
This page may help:
https://gluon-cv.mxnet.io/build/examples_datasets/detection_custom.html
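As a rough sketch of what a detection line looks like, the snippet below builds one in the layout I understand that page to describe: index, a small header (header length 4, label width 5, image width, image height), then class_index, xmin, ymin, xmax, ymax for each object with coordinates normalized to [0, 1], then the image path. The function name, the image size and the box values are made up for illustration, and the header SageMaker's built-in algorithm expects may differ, so check its documentation before relying on this.

```python
def detection_lst_line(idx, img_path, width, height, boxes):
    """Build one tab-separated detection .lst line.

    boxes: list of (class_index, xmin, ymin, xmax, ymax) in pixels.
    The header layout [4, 5, width, height] is an assumption taken
    from the linked GluonCV page, not a SageMaker guarantee.
    """
    header = [4, 5, width, height]
    labels = []
    for cls, xmin, ymin, xmax, ymax in boxes:
        # normalize pixel coordinates by the image size
        labels += [cls, xmin / width, ymin / height, xmax / width, ymax / height]
    fields = [idx] + header + labels + [img_path]
    return "\t".join(str(f) for f in fields) + "\n"


# hypothetical 640x480 image with one dog (class 0) and a made-up box
line = detection_lst_line(0, "626.jpg", 640, 480, [(0, 64, 48, 320, 240)])
print(line)
```

Once the .lst file has lines like this, im2rec.py would need to be run again with label packing enabled (the `--pack-label` option) so the extra values end up in the .rec file.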


OK, great, thanks again!

I’ll read the article in full, but basically I need to go through all 600 images manually and add a bounding box to each one? This could take a while 🙂