Novel Approach for Object Recognition using Self Attention Networks: ORSAN
DOI:
https://doi.org/10.17762/msea.v70i2.2026
Abstract
We propose BoTrNe, a conceptually simple but strong backbone architecture that incorporates self-attention for computer vision tasks such as image classification, object recognition, and instance segmentation. By simply substituting global self-attention for the spatial convolutions in the last three bottleneck blocks of a ResNet, our method substantially improves on the baselines for instance segmentation and object recognition while also reducing the parameter count, with little latency overhead. Through the architecture of BoTrNe, we also show how ResNet bottleneck blocks with self-attention may be regarded as Transformer blocks. Using the Mask R-CNN framework, BoTrNe obtains 46.2% Mask AP and 51.8% Box AP on the COCO Instance Segmentation benchmark, exceeding the previous best reported single model and weighted linear results of ResNet tested on the COCO validation set.
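The substitution described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes single-head scaled dot-product attention with three pointwise (1x1) projections and omits any positional encoding, and the names `global_self_attention`, `w_q`, `w_k`, and `w_v` are hypothetical. It shows how every spatial position of a feature map attends to every other position, and why swapping a 3x3 convolution for such a layer can lower the parameter count.

```python
import numpy as np


def global_self_attention(x, w_q, w_k, w_v):
    """Single-head global self-attention over a feature map (illustrative only).

    x:             (H, W, C) feature map
    w_q, w_k, w_v: (C, C) pointwise projections (assumed, not from the paper)
    Returns an (H, W, C) feature map.
    """
    h, w, c = x.shape
    tokens = x.reshape(h * w, c)                  # each spatial position becomes a token
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    logits = (q @ k.T) / np.sqrt(c)               # (HW, HW): all positions attend to all positions
    logits -= logits.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the key positions
    return (weights @ v).reshape(h, w, c)


rng = np.random.default_rng(0)
C = 64                                            # hypothetical channel width
x = rng.standard_normal((8, 8, C))
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))
out = global_self_attention(x, w_q, w_k, w_v)

# Weight counts for a C -> C layer, biases ignored.
conv3x3_params = 3 * 3 * C * C                    # 3x3 spatial convolution
attention_params = 3 * C * C                      # the three pointwise projections above
```

Under these assumptions the attention layer needs a third of the weights of the 3x3 convolution it replaces (3C² versus 9C²). Its cost grows with the square of the number of spatial positions, which is consistent with applying it only in the last bottleneck blocks, where the feature map is smallest.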