PointNet++ is one of the most influential neural architectures for point cloud understanding. Although the accuracy of PointNet++ has been largely surpassed by recent networks such as PointMLP and Point Transformer, we find that a large portion of the performance gain is due to improved training strategies, i.e., data augmentation and optimization techniques, and increased model sizes rather than architectural innovations. Thus, the full potential of PointNet++ has yet to be explored. In this work, we revisit the classical PointNet++ through a systematic study of model training and scaling strategies, and offer two major contributions. First, we propose a set of improved training strategies that significantly improve PointNet++ performance. For example, we show that, without any change in architecture, the overall accuracy (OA) of PointNet++ on ScanObjectNN object classification can be raised from 77.9% to 86.1%, even outperforming the state-of-the-art PointMLP. Second, we introduce an inverted residual bottleneck design and separable MLPs into PointNet++ to enable efficient and effective model scaling, and propose PointNeXt, the next version of PointNets. PointNeXt can be flexibly scaled up and outperforms state-of-the-art methods on both 3D classification and segmentation tasks. For classification, PointNeXt reaches an overall accuracy of 87.7% on ScanObjectNN, surpassing PointMLP by 2.3% while being 10x faster in inference. For semantic segmentation, PointNeXt establishes a new state-of-the-art with 74.9% mean IoU on S3DIS (6-fold cross-validation), surpassing the recent Point Transformer. The code and models are available at https://github.com/guochengqian/pointnext.
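To make the architectural contribution concrete, below is a minimal PyTorch sketch of the inverted residual bottleneck with separable (per-point, shared) MLPs that the abstract describes. The class name `InvResMLPSketch`, the expansion ratio of 4, and the omission of the neighbor-grouping/aggregation step are illustrative assumptions, not the authors' exact implementation; the reference code is in the linked repository.

```python
# Hedged sketch of an inverted-residual MLP block in the spirit of PointNeXt.
# Assumptions (not from the paper's code): class/argument names, expansion=4,
# and that neighbor grouping/reduction happens before this block.
import torch
import torch.nn as nn


class InvResMLPSketch(nn.Module):
    """Residual block: shared per-point MLPs with a 4x inverted bottleneck."""

    def __init__(self, channels: int, expansion: int = 4):
        super().__init__()
        hidden = channels * expansion
        # 1x1 convolutions act as MLPs shared across all points ("separable"
        # in the sense that spatial aggregation is decoupled from these MLPs).
        self.pwconv = nn.Sequential(
            nn.Conv1d(channels, hidden, kernel_size=1, bias=False),  # expand
            nn.BatchNorm1d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv1d(hidden, channels, kernel_size=1, bias=False),  # project
            nn.BatchNorm1d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, num_points). The skip connection is what makes
        # stacking many such blocks (i.e., depth scaling) trainable.
        return self.act(x + self.pwconv(x))


# Usage: process 1024 points with 64-dimensional features.
block = InvResMLPSketch(channels=64)
out = block(torch.randn(2, 64, 1024))  # -> (2, 64, 1024)
```

The residual connection and the cheap pointwise MLPs are what allow the network to be scaled in depth and width without the optimization difficulties that plain stacked set-abstraction layers exhibit.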
Author Information
Guocheng Qian (KAUST)
Yuchen Li (KAUST)
Houwen Peng (Microsoft Research)
Jinjie Mai (KAUST)
Hasan Hammoud (KAUST)
Mohamed Elhoseiny (KAUST)
Bernard Ghanem (KAUST)
More from the Same Authors
- 2021 Spotlight: ASSANet: An Anisotropic Separable Set Abstraction for Efficient Point Cloud Representation Learning
  Guocheng Qian · Hasan Hammoud · Guohao Li · Ali Thabet · Bernard Ghanem
- 2022: Certified Robustness in Federated Learning
  Motasem Alfarra · Juan Perez · Egor Shulgin · Peter Richtarik · Bernard Ghanem
- 2022: A Simple Baseline that Questions the Use of Pretrained-Models in Continual Learning
  Paul Janson · Wenxuan Zhang · Rahaf Aljundi · Mohamed Elhoseiny
- 2022 Spotlight: Lightning Talks 6A-4
  Xiu-Shen Wei · Konstantina Dritsa · Guillaume Huguet · ABHRA CHAUDHURI · Zhenbin Wang · Kevin Qinghong Lin · Yutong Chen · Jianan Zhou · Yongsen Mao · Junwei Liang · Jinpeng Wang · Mao Ye · Yiming Zhang · Aikaterini Thoma · H.-Y. Xu · Daniel Sumner Magruder · Enwei Zhang · Jianing Zhu · Ronglai Zuo · Massimiliano Mancini · Hanxiao Jiang · Jun Zhang · Fangyun Wei · Faen Zhang · Ioannis Pavlopoulos · Zeynep Akata · Xiatian Zhu · Jingfeng ZHANG · Alexander Tong · Mattia Soldan · Chunhua Shen · Yuxin Peng · Liuhan Peng · Michael Wray · Tongliang Liu · Anjan Dutta · Yu Wu · Oluwadamilola Fasina · Panos Louridas · Angel Chang · Manik Kuchroo · Manolis Savva · Shujie LIU · Wei Zhou · Rui Yan · Gang Niu · Liang Tian · Bo Han · Eric Z. XU · Guy Wolf · Yingying Zhu · Brian Mak · Difei Gao · Masashi Sugiyama · Smita Krishnaswamy · Rong-Cheng Tu · Wenzhe Zhao · Weijie Kong · Chengfei Cai · WANG HongFa · Dima Damen · Bernard Ghanem · Wei Liu · Mike Zheng Shou
- 2022 Spotlight: Egocentric Video-Language Pretraining
  Kevin Qinghong Lin · Jinpeng Wang · Mattia Soldan · Michael Wray · Rui Yan · Eric Z. XU · Difei Gao · Rong-Cheng Tu · Wenzhe Zhao · Weijie Kong · Chengfei Cai · WANG HongFa · Dima Damen · Bernard Ghanem · Wei Liu · Mike Zheng Shou
- 2022 Poster: Egocentric Video-Language Pretraining
  Kevin Qinghong Lin · Jinpeng Wang · Mattia Soldan · Michael Wray · Rui Yan · Eric Z. XU · Difei Gao · Rong-Cheng Tu · Wenzhe Zhao · Weijie Kong · Chengfei Cai · WANG HongFa · Dima Damen · Bernard Ghanem · Wei Liu · Mike Zheng Shou
- 2022 Poster: Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding
  Eslam Bakr · Yasmeen Alsaedy · Mohamed Elhoseiny
- 2021 Poster: ASSANet: An Anisotropic Separable Set Abstraction for Efficient Point Cloud Representation Learning
  Guocheng Qian · Hasan Hammoud · Guohao Li · Ali Thabet · Bernard Ghanem
- 2021 Poster: Low-Fidelity Video Encoder Optimization for Temporal Action Localization
  Mengmeng Xu · Juan Manuel Perez Rua · Xiatian Zhu · Bernard Ghanem · Brais Martinez
- 2021 Poster: Searching the Search Space of Vision Transformer
  Minghao Chen · Kan Wu · Bolin Ni · Houwen Peng · Bei Liu · Jianlong Fu · Hongyang Chao · Haibin Ling
- 2021 Poster: Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training
  Hongwei Xue · Yupan Huang · Bei Liu · Houwen Peng · Jianlong Fu · Houqiang Li · Jiebo Luo
- 2020 Poster: Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search
  Houwen Peng · Hao Du · Hongyuan Yu · QI LI · Jing Liao · Jianlong Fu
- 2020 Poster: Self-Supervised Learning by Cross-Modal Audio-Video Clustering
  Humam Alwassel · Dhruv Mahajan · Bruno Korbar · Lorenzo Torresani · Bernard Ghanem · Du Tran
- 2020 Spotlight: Self-Supervised Learning by Cross-Modal Audio-Video Clustering
  Humam Alwassel · Dhruv Mahajan · Bruno Korbar · Lorenzo Torresani · Bernard Ghanem · Du Tran