Unstructured social group activity recognition in web videos is a challenging task due to 1) the semantic gap between class labels and low-level visual features and 2) the lack of labeled training data. To tackle this problem, we propose a "relevance topic model" that jointly learns meaningful mid-level representations on top of bag-of-words (BoW) video representations and a classifier with sparse weights. In our approach, sparse Bayesian learning is incorporated into an undirected topic model (i.e., Replicated Softmax) to discover topics that are relevant to video classes and suitable for prediction. Rectified linear units are used to increase the expressive power of topics, so that they better explain video data with complex content, and to make variational inference tractable for the proposed model. An efficient variational EM algorithm is presented for model parameter estimation and inference. Experimental results on the Unstructured Social Activity Attribute dataset show that our model achieves state-of-the-art performance and outperforms other supervised topic models in terms of classification accuracy, particularly when only a very small number of labeled training videos is available.
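As a rough illustration of the pipeline the abstract describes (BoW counts, rectified linear topic units, then a linear classifier with sparse weights), the following is a minimal sketch and not the authors' implementation. The function names, array shapes, and toy numbers are assumptions made for illustration. Scaling the hidden bias by the number of words follows the usual Replicated Softmax convention. The sketch computes only a deterministic forward pass; it omits the sparse Bayesian priors and the variational EM training.

```python
import numpy as np

def relu_topics(counts, W, b):
    """Map a bag-of-words count vector to non-negative topic activations.

    counts: (V,) word counts for one video (hypothetical input)
    W:      (V, F) word-to-topic weights (assumed already learned)
    b:      (F,) topic biases, scaled by document length as in Replicated Softmax
    """
    counts = np.asarray(counts, dtype=float)
    n_words = counts.sum()
    pre_activation = counts @ W + n_words * b
    # Rectified linear unit: negative inputs are clipped to zero.
    return np.maximum(pre_activation, 0.0)

def class_scores(topics, eta):
    """Linear class scores from topic activations; eta: (F, C), ideally sparse."""
    return topics @ eta

# Toy example: 3 visual words, 2 topics, 2 classes (all values are made up).
W = np.array([[1.0, -1.0],
              [0.0,  2.0],
              [-1.0, 0.5]])
b = np.array([0.0, -0.1])
eta = np.array([[0.5, 0.0],
                [0.0, 2.0]])

topics = relu_topics([2, 0, 1], W, b)
scores = class_scores(topics, eta)
predicted_class = int(np.argmax(scores))
```

In this toy run the second topic receives a negative input and is switched off by the rectifier, so only the first topic contributes to the class scores.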
Author Information
Fang Zhao (Chinese Academy of Sciences)
Yongzhen Huang (Chinese Academy of Sciences)
Liang Wang (NLPR, China)
Tieniu Tan (Chinese Academy of Sciences)
More from the Same Authors
- 2022 Spotlight: MACK: Multimodal Aligned Conceptual Knowledge for Unpaired Image-text Matching
  Yan Huang · Yuming Wang · Yunan Zeng · Liang Wang
- 2022 Poster: MACK: Multimodal Aligned Conceptual Knowledge for Unpaired Image-text Matching
  Yan Huang · Yuming Wang · Yunan Zeng · Liang Wang
- 2021 Poster: Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision
  Keji He · Yan Huang · Qi Wu · Jianhua Yang · Dong An · Shuanglin Sima · Liang Wang
- 2020 Poster: Unfolding the Alternating Optimization for Blind Super Resolution
  zhengxiong luo · Yan Huang · Shang Li · Liang Wang · Tieniu Tan
- 2019 Poster: Efficient Neural Architecture Transformation Search in Channel-Level for Object Detection
  Junran Peng · Ming Sun · ZHAO-XIANG ZHANG · Tieniu Tan · Junjie Yan
- 2018 Poster: IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis
  Huaibo Huang · zhihang li · Ran He · Zhenan Sun · Tieniu Tan
- 2017 Poster: Deep Supervised Discrete Hashing
  Qi Li · Zhenan Sun · Ran He · Tieniu Tan
- 2015 Poster: Bidirectional Recurrent Convolutional Networks for Multi-Frame Super-Resolution
  Yan Huang · Wei Wang · Liang Wang