
Hierarchical Neural Architecture Search for Deep Stereo Matching
Xuelian Cheng · Yiran Zhong · Mehrtash Harandi · Yuchao Dai · Xiaojun Chang · Hongdong Li · Tom Drummond · Zongyuan Ge

Wed Dec 09 09:00 PM -- 11:00 PM (PST) @ Poster Session 4 #1315

To reduce human effort in neural network design, Neural Architecture Search (NAS) has been applied with remarkable success to various high-level vision tasks such as classification and semantic segmentation. The underlying idea of NAS is straightforward: by allowing the network to choose among a set of operations (e.g., convolution with different filter sizes), one can find an architecture that is better adapted to the problem at hand. So far, however, low-level geometric vision tasks such as stereo matching have not enjoyed the same success. This is partly because state-of-the-art deep stereo matching networks, designed by humans, are already very large. Directly applying NAS to such massive structures is computationally prohibitive with currently available mainstream computing resources. In this paper, we propose the first end-to-end hierarchical NAS framework for deep stereo matching by incorporating task-specific human knowledge into the neural architecture search framework. Specifically, following the gold standard pipeline for deep stereo matching (i.e., feature extraction, feature volume construction, and dense matching), we optimize the architectures of the entire pipeline jointly. Extensive experiments show that our searched network outperforms all state-of-the-art deep stereo matching architectures and ranks first in accuracy on the KITTI stereo 2012, KITTI stereo 2015, and Middlebury benchmarks, as well as first on the SceneFlow dataset, with a substantial improvement in network size and inference speed. Code is available at https://github.com/XuelianCheng/LEAStereo.
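As a rough illustration of what "choosing among a set of operations" can mean in practice, the sketch below shows one common formulation: a softmax-weighted mixture of candidate operations whose weights are searchable parameters. The abstract does not say which search strategy the paper uses, so this is an assumption for illustration only. The candidate set, the names `CANDIDATE_OPS`, `mixed_op`, and `derive_op`, and the 1-D toy signals are all hypothetical and are not taken from the LEAStereo code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def conv1d_same(signal, kernel):
    """1-D correlation with zero padding; output has the input's length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


# Hypothetical candidate set: an identity op and two box filters of
# different sizes, standing in for "convolution with different filter sizes".
CANDIDATE_OPS = {
    "identity": lambda x: list(x),
    "box3": lambda x: conv1d_same(x, [1 / 3] * 3),
    "box5": lambda x: conv1d_same(x, [1 / 5] * 5),
}


def mixed_op(x, alphas):
    """Blend all candidate outputs using softmax(alphas) as weights.

    During a differentiable search the alphas would be trained together
    with the network weights; here they are simply given.
    """
    weights = softmax(alphas)
    outputs = [op(x) for op in CANDIDATE_OPS.values()]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]


def derive_op(alphas):
    """After the search, keep only the candidate with the largest alpha."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]


if __name__ == "__main__":
    signal = [0.0, 1.0, 4.0, 1.0, 0.0]
    alphas = [0.1, 2.0, -1.0]  # made-up architecture parameters
    print(mixed_op(signal, alphas))
    print(derive_op(alphas))
```

In this toy setting, the final architecture is read off by keeping the highest-weighted candidate at each position. A hierarchical search would apply the same kind of choice at more than one level of the network rather than to a single operation.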

Author Information

Xuelian Cheng (Monash University)
Yiran Zhong (Australian National University)
Mehrtash Harandi (Monash University)

I am a senior lecturer in the Department of Electrical and Computer Systems Engineering (ECSE) at Monash University. I am also a contributing research scientist at the Machine Learning Research Group (MLRG), Data61, CSIRO. Before joining Monash University, I spent 5 wonderful years at the Canberra Research Laboratory, NICTA, working with Prof. Richard Hartley and Prof. Fatih Porikli. Prior to that, I worked at the Queensland Research Laboratory, NICTA, with Prof. Brian Lovell. I am interested in various aspects of learning, especially with a flavor of visual data (see my Google Scholar page).

Yuchao Dai (Northwestern Polytechnical University)
Xiaojun Chang (Monash University)
Hongdong Li (Australian National University)
Tom Drummond (Monash University)
Zongyuan Ge (Monash University)
