Efficient Equivariant Network

Lingshen He · Yuxuan Chen · Zhengyang Shen · Yiming Dong · Yisen Wang · Zhouchen Lin

Keywords: [ Deep Learning ] [ Vision ]

Abstract: Convolutional neural networks (CNNs) have dominated the field of Computer Vision and achieved great success due to their built-in translation equivariance. Group equivariant CNNs (G-CNNs), which incorporate additional equivariance, can significantly improve on the performance of conventional CNNs. However, G-CNNs face two major challenges: the \emph{spatial-agnostic problem} and \emph{expensive computational cost}. In this work, we propose a general framework that includes previous equivariant models, such as G-CNNs and equivariant self-attention layers, as special cases. Under this framework, we explicitly decompose the feature aggregation operation into a kernel generator and an encoder, and decouple the spatial and extra geometric dimensions in the computation. As a result, our filters are essentially dynamic rather than spatial-agnostic. We further show by complexity analysis that our \emph{E}quivariant model is parameter \emph{E}fficient and computation \emph{E}fficient, and by experiments that it is data \emph{E}fficient, so we call our model $E^4$-Net. Extensive experiments verify that our model significantly improves on previous work with a smaller model size. In particular, when trained on $1/5$ of the CIFAR10 data, our model improves on G-CNNs by $5\%+$ accuracy, while using only $56\%$ of the parameters and $68\%$ of the FLOPs.
