Region-based CNNs (R-CNNs)

https://d2l.ai/chapter_computer-vision/rcnn.html

Hi,
Thanks for the great intro.
It would be really appreciated if you could add some more information to this section. There are many small details missing here that can create a lot of confusion.
Below is a sample introductory explanation from Mathworks.com covering R-CNN, Fast R-CNN, and Faster R-CNN.
The figures are really nice, and the accompanying explanation clears up a lot of confusion.

Object Detection Using R-CNN Algorithms
Models for object detection using regions with CNNs are based on the following three processes:

  • Find regions in the image that might contain an object. These regions are called region proposals.
  • Extract CNN features from the region proposals.
  • Classify the objects using the extracted features.

There are three variants of an R-CNN. Each variant attempts to optimize, speed up, or enhance the results of one or more of these processes.

R-CNN

The R-CNN detector [2] first generates region proposals using an algorithm such as Edge Boxes [1]. The proposal regions are cropped out of the image and resized. Then, the CNN classifies the cropped and resized regions. Finally, the region proposal bounding boxes are refined by a support vector machine (SVM) that is trained using CNN features.
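To make the crop-and-resize step concrete, here is a minimal pure-Python sketch. It assumes a toy 2-D image stored as a list of rows and nearest-neighbour resizing; the function name `crop_and_resize` and the example boxes are hypothetical and are not part of the MathWorks API.

```python
def crop_and_resize(image, box, out_h, out_w):
    """Crop box = (x1, y1, x2, y2) from a 2-D image (list of rows) and
    resize the crop to out_h x out_w with nearest-neighbour sampling."""
    x1, y1, x2, y2 = box
    crop_h, crop_w = y2 - y1, x2 - x1
    resized = []
    for i in range(out_h):
        # Map output row i back to a source row inside the crop.
        src_y = y1 + min(crop_h - 1, i * crop_h // out_h)
        row = []
        for j in range(out_w):
            src_x = x1 + min(crop_w - 1, j * crop_w // out_w)
            row.append(image[src_y][src_x])
        resized.append(row)
    return resized


# Toy 4x6 "image" whose pixel value encodes its position (10 * row + col).
image = [[10 * r + c for c in range(6)] for r in range(4)]
proposals = [(0, 0, 2, 2), (2, 1, 6, 3)]  # hypothetical (x1, y1, x2, y2) boxes

# Every proposal becomes its own fixed-size input, so the CNN would have to
# run once per proposal.
crops = [crop_and_resize(image, box, 2, 2) for box in proposals]
```

Because each crop is a separate CNN input, the work grows with the number of proposals, which matches the "slow training and detection" remark for R-CNN in the comparison at the end.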

Use the trainRCNNObjectDetector function to train an R-CNN object detector. The function returns an rcnnObjectDetector object that detects objects in an image.

Fast R-CNN

Like the R-CNN detector, the Fast R-CNN [3] detector uses an algorithm such as Edge Boxes to generate region proposals. Unlike the R-CNN detector, which crops and resizes region proposals, the Fast R-CNN detector processes the entire image. Whereas an R-CNN detector must classify each region, Fast R-CNN pools CNN features corresponding to each region proposal. Fast R-CNN is more efficient than R-CNN because the computations for overlapping regions are shared.
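The pooling step can be sketched as follows. This is a simplified illustration, assuming a single-channel feature map stored as a list of rows and region coordinates already expressed in feature-map cells; `roi_max_pool` and `ceil_div` are hypothetical helper names, not a library API.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the region roi = (x1, y1, x2, y2) of a 2-D feature map
    into a fixed out_h x out_w grid."""
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        y_start = y1 + (i * roi_h) // out_h
        y_end = y1 + ceil_div((i + 1) * roi_h, out_h)
        row = []
        for j in range(out_w):
            x_start = x1 + (j * roi_w) // out_w
            x_end = x1 + ceil_div((j + 1) * roi_w, out_w)
            # Take the maximum over all cells that fall into this bin.
            row.append(max(feature_map[y][x]
                           for y in range(y_start, y_end)
                           for x in range(x_start, x_end)))
        pooled.append(row)
    return pooled


# Stand-in for the CNN output on the whole image, computed once.
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
rois = [(0, 0, 4, 4), (1, 1, 4, 4)]  # hypothetical proposals in feature-map cells

# Both (overlapping) proposals reuse the same feature map.
pooled = [roi_max_pool(feature_map, roi, 2, 2) for roi in rois]
```

The point of the sketch is that `feature_map` is built once and every proposal only reads from it, so overlapping regions do not trigger repeated CNN computation.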

Use the trainFastRCNNObjectDetector function to train a Fast R-CNN object detector. The function returns a fastRCNNObjectDetector that detects objects from an image.

Faster R-CNN

The Faster R-CNN [4] detector adds a region proposal network (RPN) to generate region proposals directly in the network instead of using an external algorithm like Edge Boxes. The RPN uses anchor boxes for object detection. Generating region proposals in the network is faster and better tuned to your data.
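For readers unfamiliar with anchor boxes, here is a rough sketch of how a grid of anchors could be enumerated. The stride, sizes, and aspect ratios are made-up example values, and `generate_anchors` is a hypothetical helper; in an actual RPN the network additionally predicts an objectness score and box offsets for every anchor, which this sketch leaves out.

```python
import math


def generate_anchors(fmap_h, fmap_w, stride, sizes, ratios):
    """Return anchors as (x1, y1, x2, y2) in image coordinates, one per
    (feature-map cell, size, ratio) combination."""
    anchors = []
    for row in range(fmap_h):
        for col in range(fmap_w):
            # Centre of this feature-map cell in image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for size in sizes:
                for ratio in ratios:  # ratio = width / height
                    w = size * math.sqrt(ratio)
                    h = size / math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


# Assumed example settings: a 2x3 feature map with one cell per 16 pixels.
anchors = generate_anchors(fmap_h=2, fmap_w=3, stride=16,
                           sizes=[32, 64], ratios=[0.25, 1.0, 4.0])
print(len(anchors))  # 2 * 3 cells * 2 sizes * 3 ratios = 36
```

Each anchor keeps an area of roughly `size * size` while the ratio changes its shape, so the same cell proposes tall, square, and wide candidate boxes.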

Use the trainFasterRCNNObjectDetector function to train a Faster R-CNN object detector. The function returns a fasterRCNNObjectDetector that detects objects from an image.

Comparison of R-CNN Object Detectors

This family of object detectors uses region proposals to detect objects within images. The number of proposed regions dictates the time it takes to detect objects in an image. The Fast R-CNN and Faster R-CNN detectors are designed to improve detection performance with a large number of regions.

R-CNN Detector: Description

trainRCNNObjectDetector
  • Slow training and detection
  • Allows custom region proposal

trainFastRCNNObjectDetector
  • Allows custom region proposal

trainFasterRCNNObjectDetector
  • Optimal run-time performance
  • Does not support a custom region proposal