We present SNIPER, an algorithm for efficient multi-scale training in instance-level visual recognition tasks. Instead of processing every pixel in an image pyramid, SNIPER processes context regions around ground-truth instances (referred to as chips) at the appropriate scale. For background sampling, these context regions are generated using proposals extracted from a region proposal network trained with a short learning schedule. Hence, the number of chips generated per image during training changes adaptively with scene complexity. SNIPER processes only 30% more pixels than the commonly used single-scale training at 800x1333 pixels on the COCO dataset. However, it also observes samples from extreme resolutions of the image pyramid, such as 1400x2000 pixels. Because SNIPER operates on resampled low-resolution chips (512x512 pixels), it can use a batch size as large as 20 on a single GPU, even with a ResNet-101 backbone. Therefore, it can benefit from batch normalization during training without the need to synchronize batch-normalization statistics across GPUs. SNIPER brings the training of instance-level recognition tasks like object detection closer to the protocol for image classification, and it suggests that the commonly accepted guideline that training on high-resolution images is important for instance-level visual recognition tasks might not be correct. Our implementation, based on Faster-RCNN with a ResNet-101 backbone, obtains an mAP of 47.6% on the COCO dataset for bounding box detection and can process 5 images per second during inference with a single GPU. Code is available at https://github.com/MahyarNajibi/SNIPER/ .
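To make the pixel-budget claims concrete, the short Python sketch below works through the arithmetic using only the figures quoted in the abstract (800x1333 single-scale training, 30% more pixels, 512x512 chips, batch size 20, and the 1400x2000 example resolution). The function and constant names are illustrative and do not come from the SNIPER codebase, and the "implied chips per image" value is a back-of-the-envelope average that assumes the 30% figure is measured in chip pixels, which the abstract does not state explicitly.

```python
# Back-of-the-envelope pixel arithmetic from the figures quoted in the abstract.
# All names here are hypothetical; this is not the authors' code.

SINGLE_SCALE = (800, 1333)    # common single-scale training resolution
EXTREME_SCALE = (1400, 2000)  # example of an extreme pyramid resolution
CHIP_SIDE = 512               # chips are resampled to 512x512 pixels
EXTRA_PIXEL_PERCENT = 30      # "30% more pixels" than single-scale training
BATCH_SIZE = 20               # chips per GPU quoted in the abstract


def pixels(height: int, width: int) -> int:
    """Number of pixels in a height x width image."""
    return height * width


def implied_chips_per_image() -> float:
    """Average chips per image implied by the 30% budget.

    Assumption: the budget is spent entirely on 512x512 chips.
    """
    budget = pixels(*SINGLE_SCALE) * (100 + EXTRA_PIXEL_PERCENT) / 100
    return budget / pixels(CHIP_SIDE, CHIP_SIDE)


def batch_in_single_scale_images() -> float:
    """Pixels in one 20-chip batch, expressed in 800x1333 images."""
    batch_pixels = BATCH_SIZE * pixels(CHIP_SIDE, CHIP_SIDE)
    return batch_pixels / pixels(*SINGLE_SCALE)


def extreme_to_single_ratio() -> float:
    """How many times larger a 1400x2000 image is than an 800x1333 one."""
    return pixels(*EXTREME_SCALE) / pixels(*SINGLE_SCALE)


if __name__ == "__main__":
    print(f"single-scale pixels:      {pixels(*SINGLE_SCALE):,}")
    print(f"implied chips per image:  {implied_chips_per_image():.2f}")
    print(f"20-chip batch in images:  {batch_in_single_scale_images():.2f}")
    print(f"1400x2000 vs 800x1333:    {extreme_to_single_ratio():.2f}x")
```

Under these assumptions the budget works out to roughly five chips per image on average, and a 20-chip batch holds about as many pixels as five single-scale images, which is consistent with the abstract's point that a large batch fits on one GPU.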
Author Information
Bharat Singh (University of Maryland, College Park)
Mahyar Najibi (University of Maryland)
Larry Davis (University of Maryland)
More from the Same Authors
- 2021 Poster: Revisiting 3D Object Detection From an Egocentric Perspective
  Boyang Deng · Charles R Qi · Mahyar Najibi · Thomas Funkhouser · Yin Zhou · Dragomir Anguelov
- 2019 Poster: Adversarial training for free!
  Ali Shafahi · Mahyar Najibi · Mohammad Amin Ghiasi · Zheng Xu · John Dickerson · Christoph Studer · Larry Davis · Gavin Taylor · Tom Goldstein
- 2018 Poster: Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks
  Ali Shafahi · W. Ronny Huang · Mahyar Najibi · Octavian Suciu · Christoph Studer · Tudor Dumitras · Tom Goldstein
- 2014 Poster: A Probabilistic Framework for Multimodal Retrieval using Integrative Indian Buffet Process
  Bahadir Ozdemir · Larry Davis
- 2008 Poster: Automatic online tuning for fast Gaussian summation
  Vlad I Morariu · Balaji Vasan Srinivasan · Vikas C Raykar · Ramani Duraiswami · Larry Davis
- 2008 Poster: A "Shape Aware" Model for semi-supervised Learning of Objects and its Context
  Abhinav Gupta · Jianbo Shi · Larry Davis
- 2008 Spotlight: A "Shape Aware" Model for semi-supervised Learning of Objects and its Context
  Abhinav Gupta · Jianbo Shi · Larry Davis