Faster R-CNN: Model Design

Architecture

Faster R-CNN Model Structure:

        +-------------------------------+
        |          Input Image          |
        +-------------------------------+
                        |
                        |
                        |
                        |
                        |
        +-------------------------------+
        |    Convolutional Backbone     |------------------|
        +-------------------------------+                  |
                        |                                  |
                        |                                  |
                        |                                  |
                        |                                  |
                        |                                  |
        +-------------------------------+                  |
        | Region Proposal Network (RPN) |                  |
        +-------------------------------+                  |
                        |                                  |
                        |                                  |
                        |----------------------------------|
                        |
                        |
        +-------------------------------+
        |       RoI Pooling Layer       |
        +-------------------------------+
                        |
                        |
                        |
                        |
                        |
        +-------------------------------+
        |     Fully Connected Layers    |------------------|
        +-------------------------------+                  |
                        |                                  |
                        |                                  |
                        |                                  |
                        |                                  |
                        |                                  |
        +-------------------------------+  +-------------------------------+
        |        Classification         |  |          Localization         |
        +-------------------------------+  +-------------------------------+

where:

  • the RoI pooling layer and the FC layers that follow it together constitute the Fast R-CNN detector [3]

  • the convolutional backbone adopts either ZF or VGG [13]
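
The data flow in the diagram can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the reference implementation: backbone, rpn and detection_head are hypothetical placeholders (a real backbone is a ZF or VGG convolutional stack, and a real RPN scores anchors and regresses boxes), the feature map has a single channel, and RoIs are given as (x1, y1, x2, y2) in feature-map cells with exclusive end coordinates. Only roi_max_pool tries to mirror the actual operation: it divides each proposal into a fixed grid of bins and takes the maximum in each bin, so that every proposal yields a fixed-size input for the FC layers.

```python
import math


def backbone(image):
    # Hypothetical placeholder: a real backbone (ZF or VGG) is a stack of
    # convolutional layers. Here the "feature map" is the input itself.
    return image


def rpn(feature_map):
    # Hypothetical placeholder: returns two fixed proposals
    # (x1, y1, x2, y2), end-exclusive, in feature-map coordinates.
    h, w = len(feature_map), len(feature_map[0])
    return [(0, 0, w, h), (1, 1, w, h)]


def roi_max_pool(feature_map, roi, output_size=2):
    """Max-pool one RoI of a 2D feature map into an output_size x output_size grid."""
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1  # assumed to be at least 1 cell each
    pooled = []
    for i in range(output_size):
        # Row range covered by bin i (floor/ceil keeps every bin non-empty).
        r0 = y1 + math.floor(i * roi_h / output_size)
        r1 = y1 + math.ceil((i + 1) * roi_h / output_size)
        row = []
        for j in range(output_size):
            c0 = x1 + math.floor(j * roi_w / output_size)
            c1 = x1 + math.ceil((j + 1) * roi_w / output_size)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


def detection_head(pooled):
    # Hypothetical placeholder for the FC layers and the two sibling outputs.
    flat = [v for row in pooled for v in row]
    class_score = sum(flat) / len(flat)   # stand-in for classification
    box_delta = (0.0, 0.0, 0.0, 0.0)      # stand-in for localization
    return class_score, box_delta


def faster_rcnn_forward(image):
    features = backbone(image)            # shared by the RPN and RoI pooling
    proposals = rpn(features)
    results = []
    for roi in proposals:
        pooled = roi_max_pool(features, roi)
        score, delta = detection_head(pooled)
        results.append((roi, score, delta))
    return results


if __name__ == "__main__":
    toy_image = [[0, 1, 2, 3],
                 [4, 5, 6, 7],
                 [8, 9, 10, 11],
                 [12, 13, 14, 15]]
    for roi, score, delta in faster_rcnn_forward(toy_image):
        print(roi, score, delta)
```

Note how the backbone output is used twice, once by the RPN and once by RoI pooling, which corresponds to the side branch in the diagram; the two values returned by detection_head correspond to the Classification and Localization boxes.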
