Thanks for the great intro.
It would be really appreciated if you could add some more information to this section. There are many small details missing here that can create a lot of confusion.
Below is a sample introductory explanation from Mathworks.com covering R-CNN, Fast R-CNN, and Faster R-CNN.
The figures are really nice, and the accompanying explanation clears up a lot of confusion.
Object Detection Using R-CNN Algorithms
Models for object detection using regions with CNNs are based on the following three processes:
- Find regions in the image that might contain an object. These regions are called region proposals.
- Extract CNN features from the region proposals.
- Classify the objects using the extracted features.
There are three variants of an R-CNN. Each variant attempts to optimize, speed up, or enhance the results of one or more of these processes.
The R-CNN detector first generates region proposals using an algorithm such as Edge Boxes. The proposal regions are cropped out of the image and resized. Then, the CNN classifies the cropped and resized regions. Finally, the region proposal bounding boxes are refined by a support vector machine (SVM) that is trained using CNN features.
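To make the crop-and-resize step concrete, here is a minimal Python sketch. It assumes the image is a plain 2-D list of pixel values and that a proposal is `(x1, y1, x2, y2)` with exclusive end coordinates. The name `crop_and_resize` and the nearest-neighbor sampling are illustrative assumptions, not what MathWorks or any particular R-CNN implementation uses.

```python
def crop_and_resize(image, box, out_h, out_w):
    """Crop one proposal from a 2-D list and resize it to a fixed size.

    Assumptions (hypothetical, for illustration only):
    - image is a list of rows, each row a list of pixel values
    - box is (x1, y1, x2, y2) in pixel coordinates, end-exclusive
    - resizing uses nearest-neighbor sampling
    """
    x1, y1, x2, y2 = box
    crop = [row[x1:x2] for row in image[y1:y2]]
    h, w = len(crop), len(crop[0])
    # Map every output pixel back to its nearest source pixel in the crop.
    return [
        [crop[r * h // out_h][c * w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# A 4x4 toy image whose pixel value encodes its position (row * 4 + col).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
proposals = [(0, 0, 2, 2), (1, 1, 4, 4)]

# In R-CNN, each resized crop would be passed through the CNN separately.
crops = [crop_and_resize(image, box, 4, 4) for box in proposals]
```

Every proposal produces its own fixed-size input, so the number of CNN evaluations grows with the number of proposals.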
As in the R-CNN detector, the Fast R-CNN detector also uses an algorithm like Edge Boxes to generate region proposals. Unlike the R-CNN detector, which crops and resizes region proposals, the Fast R-CNN detector processes the entire image. Whereas an R-CNN detector must classify each region, Fast R-CNN pools CNN features corresponding to each region proposal. Fast R-CNN is more efficient than R-CNN, because in the Fast R-CNN detector, the computations for overlapping regions are shared.
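The pooling step can be sketched in a few lines of Python. This is a simplified region-of-interest max pooling, assuming a single-channel feature map stored as a 2-D list and proposals already expressed in feature-map coordinates (end-exclusive). The name `roi_max_pool` and the way bins are split are assumptions for illustration; real implementations differ in their details.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool one region of a shared feature map into an out_h x out_w grid.

    Assumptions (hypothetical, for illustration only):
    - feature_map is a 2-D list computed once for the whole image
    - roi is (x1, y1, x2, y2) in feature-map cells, end-exclusive
    - bins use floor/ceil boundaries, so neighboring bins may overlap by a cell
    """
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    pooled = []
    for r in range(out_h):
        r0 = y1 + (r * h) // out_h            # floor of the bin start
        r1 = y1 + -(-((r + 1) * h) // out_h)  # ceil of the bin end
        row = []
        for c in range(out_w):
            c0 = x1 + (c * w) // out_w
            c1 = x1 + -(-((c + 1) * w) // out_w)
            row.append(max(
                feature_map[y][x]
                for y in range(r0, r1)
                for x in range(c0, c1)
            ))
        pooled.append(row)
    return pooled


# One shared 4x4 feature map; both proposals reuse it instead of re-running the CNN.
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]
print(roi_max_pool(feature_map, (0, 0, 4, 4), 2, 2))  # [[5, 7], [13, 15]]
print(roi_max_pool(feature_map, (1, 1, 4, 4), 2, 2))  # [[10, 11], [14, 15]]
```

Both calls read from the same feature map, which is where the shared computation for overlapping regions comes from.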
The Faster R-CNN detector adds a region proposal network (RPN) to generate region proposals directly in the network instead of using an external algorithm like Edge Boxes. The RPN uses anchor boxes for object detection. Generating region proposals in the network is faster and better tuned to your data.
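For readers new to anchor boxes, here is a rough Python sketch of how a grid of anchors might be laid out over an image. It assumes one anchor center per feature-map cell, a fixed `stride` between cells, and aspect ratios defined as height divided by width. The function name `generate_anchors` and all parameter values are illustrative assumptions, not the settings of any specific Faster R-CNN implementation.

```python
import math


def generate_anchors(feat_h, feat_w, stride, scales, ratios):
    """Return anchor boxes as (x1, y1, x2, y2) tuples in image coordinates.

    Assumptions (hypothetical, for illustration only):
    - one anchor center at the middle of every feature-map cell
    - stride is the number of image pixels per feature-map cell
    - each scale is the side length of the square anchor with the same area
    - each ratio is height / width
    """
    anchors = []
    for fy in range(feat_h):
        for fx in range(feat_w):
            cx = (fx + 0.5) * stride
            cy = (fy + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    # Keep the area at scale**2 while changing the shape.
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append(
                        (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
                    )
    return anchors


anchors = generate_anchors(
    feat_h=2, feat_w=3, stride=16, scales=[32, 64, 128], ratios=[0.5, 1.0, 2.0]
)
print(len(anchors))  # 2 * 3 cells * 3 scales * 3 ratios = 54
```

The RPN then scores each anchor and adjusts its coordinates; that learned part is not shown here.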
Comparison of R-CNN Object Detectors
This family of object detectors uses region proposals to detect objects within images. The number of proposed regions dictates the time it takes to detect objects in an image. The Fast R-CNN and Faster R-CNN detectors are designed to improve detection performance with a large number of regions.