Poster

LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition

Zuxuan Wu ⋅ Caiming Xiong ⋅ Yu-Gang Jiang ⋅ Larry Davis

Keywords: Applications; Video Analysis; Efficient Inference Methods; Applications -> Computer Vision; Deep Learning

2019 Poster

Abstract

This paper presents LiteEval, a simple yet effective coarse-to-fine framework for resource-efficient video recognition, suitable for both online and offline scenarios. LiteEval first extracts decent yet computationally cheap features at a coarse scale with a lightweight CNN model, then decides on the fly whether to compute more powerful features for each incoming video frame at a finer scale to capture more detail. This is achieved by a coarse LSTM and a fine LSTM operating cooperatively, together with a conditional gating module that learns when to allocate more computation. Extensive experiments on two large-scale video benchmarks, FCVID and ActivityNet, demonstrate that LiteEval requires substantially less computation while offering excellent classification accuracy for both online and offline predictions.
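
The per-frame decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the learned gating module is replaced by a fixed novelty threshold, the two LSTMs by a single running-average state, and the CNN features by toy functions. All names (`coarse_to_fine`, `GatedResult`, `coarse_cost`, `fine_cost`) and the cost units are hypothetical, chosen only to show how conditional computation saves work.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = Sequence[float]
Features = List[float]


@dataclass
class GatedResult:
    used_fine: List[bool]  # per-frame decision: was the fine branch run?
    cost: float            # total compute in hypothetical units
    state: Features        # final running-average state


def coarse_to_fine(
    frames: Sequence[Frame],
    coarse_fn: Callable[[Frame], Features],
    fine_fn: Callable[[Frame], Features],
    gate_fn: Callable[[Features, Features], bool],
    coarse_cost: float = 1.0,   # assumed cost of the lightweight model
    fine_cost: float = 10.0,    # assumed cost of the powerful model
) -> GatedResult:
    """Process frames online, running the expensive branch only when gated."""
    state: Features = []
    used_fine: List[bool] = []
    cost = 0.0
    for t, frame in enumerate(frames, start=1):
        # The cheap coarse features are always computed.
        feat = coarse_fn(frame)
        cost += coarse_cost
        if not state:
            state = [0.0] * len(feat)
        # The gate sees the coarse features and the history so far.
        go_fine = gate_fn(feat, state)
        if go_fine:
            # Pay for the fine features only on frames the gate selects.
            feat = fine_fn(frame)
            cost += fine_cost
        used_fine.append(go_fine)
        # Stand-in for the recurrent update: running mean of chosen features.
        state = [s + (f - s) / t for s, f in zip(state, feat)]
    return GatedResult(used_fine=used_fine, cost=cost, state=state)


def novelty_gate(threshold: float) -> Callable[[Features, Features], bool]:
    """Hypothetical gate: fire when coarse features deviate from the state."""
    def gate(feat: Features, state: Features) -> bool:
        return max(abs(f - s) for f, s in zip(feat, state)) > threshold
    return gate


if __name__ == "__main__":
    frames = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
    result = coarse_to_fine(
        frames,
        coarse_fn=lambda f: [sum(f) / len(f)],  # toy "coarse" feature
        fine_fn=lambda f: [max(f)],             # toy "fine" feature
        gate_fn=novelty_gate(0.5),
    )
    print(result.used_fine, result.cost)
```

In this toy run the two static frames are handled by the coarse branch alone, and only the frames that differ from the accumulated state trigger the fine branch, so the total cost stays below that of running both branches on every frame. In the paper's setting the gate is learned rather than thresholded.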
