Timezone: »
Video action detection requires annotations at every frame, which drastically increases the labeling cost. In this work, we focus on efficient labeling of videos for action detection to minimize this cost. We propose active sparse labeling (ASL), a novel active learning strategy for video action detection. Sparse labeling will reduce the annotation cost but poses two main challenges; 1) how to estimate the utility of annotating a single frame for action detection as detection is performed at video level?, and 2) how these sparse labels can be used for action detection which require annotations on all the frames? This work attempts to address these challenges within a simple active learning framework. For the first challenge, we propose a novel frame-level scoring mechanism aimed at selecting most informative frames in a video. Next, we introduce a novel loss formulation which enables training of action detection model with these sparsely selected frames. We evaluate the proposed approach on two different action detection benchmark datasets, UCF-101-24 and J-HMDB-21, and observed that active sparse labeling can be very effective in saving annotation costs. We demonstrate that the proposed approach performs better than random selection, outperforming all other baselines, with performance comparable to supervised approach using merely 10% annotations.
Author Information
Aayush Rana (University of Central Florida)
Yogesh Rawat (University of Central Florida)
More from the Same Authors
-
2023 Poster: Revealing the unseen: Benchmarking video action recognition under occlusion »
Shresth Grover · Vibhav Vineet · Yogesh Rawat -
2023 Poster: On Occlusions in Video Action Detection: Benchmark Datasets And Training Recipes »
Rajat Modi · Vibhav Vineet · Yogesh Rawat -
2022 Poster: Robustness Analysis of Video-Language Models Against Visual and Language Perturbations »
Madeline Chantry · Shruti Vyas · Hamid Palangi · Yogesh Rawat · Vibhav Vineet -
2022 Poster: Don't Pour Cereal into Coffee: Differentiable Temporal Logic for Temporal Action Segmentation »
Ziwei Xu · Yogesh Rawat · Yongkang Wong · Mohan Kankanhalli · Mubarak Shah -
2021 Poster: Reformulating Zero-shot Action Recognition for Multi-label Actions »
Alec Kerrigan · Kevin Duarte · Yogesh Rawat · Mubarak Shah -
2018 Poster: VideoCapsuleNet: A Simplified Network for Action Detection »
Kevin Duarte · Yogesh Rawat · Mubarak Shah