TokenLearner: Adaptive Space-Time Tokenization for Videos
Michael Ryoo · AJ Piergiovanni · Anurag Arnab · Mostafa Dehghani · Anelia Angelova

Thu Dec 09 08:30 AM -- 10:00 AM (PST) @ Virtual

In this paper, we introduce a novel visual representation learning approach that relies on a handful of adaptively learned tokens and is applicable to both image and video understanding tasks. Instead of relying on hand-designed splitting strategies to obtain visual tokens and processing a large number of densely sampled patches for attention, our approach learns to mine important tokens in visual data. This makes it possible to find a few important visual tokens efficiently and effectively, and enables modeling of pairwise attention between such tokens over a longer temporal horizon for videos or over the spatial content of image frames. Our experiments demonstrate strong performance on several challenging benchmarks for video recognition tasks. Importantly, because our tokens are adaptive, we achieve competitive results at significantly reduced computational cost. We establish new state-of-the-art results on multiple video datasets, including Kinetics-400, Kinetics-600, Charades, and AViD.
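
To make the idea of a handful of adaptively learned tokens concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each token is formed by weighting every spatial position of a feature map with an input-dependent map and then averaging over space. The function name `learn_tokens`, the single linear projection `weights`, the sigmoid, the 14x14x64 feature map, and the choice of 8 tokens are all illustrative assumptions; the abstract does not specify any of them.

```python
import numpy as np

def learn_tokens(features, weights):
    """Pool an (H, W, C) feature map into S tokens of size C.

    `weights` is a hypothetical (C, S) projection standing in for whatever
    learned function produces the per-token spatial weight maps.
    """
    h, w, c = features.shape
    flat = features.reshape(h * w, c)        # (H*W, C): one row per spatial position
    logits = flat @ weights                  # (H*W, S): one score per position per token
    maps = 1.0 / (1.0 + np.exp(-logits))     # sigmoid, so each weight lies in (0, 1)
    # Weighted spatial average: token s is map s applied across all positions.
    tokens = maps.T @ flat / (h * w)         # (S, C)
    return tokens

rng = np.random.default_rng(0)
features = rng.standard_normal((14, 14, 64))   # assumed: 196 positions, 64 channels
weights = rng.standard_normal((64, 8)) * 0.1   # assumed: 8 tokens
tokens = learn_tokens(features, weights)

# Pairwise attention cost grows with the square of the token count.
dense_pairs = (14 * 14) ** 2           # attention over every densely sampled position
adaptive_pairs = tokens.shape[0] ** 2  # attention over the few learned tokens
print(tokens.shape, dense_pairs, adaptive_pairs)
```

Because the weight maps are computed from the input itself, the tokens change with the content of each frame, and the quadratic cost of pairwise attention then depends on the small token count rather than on the number of patches.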

Author Information

Michael Ryoo (Google; Stony Brook University)
AJ Piergiovanni (Indiana University)
Anurag Arnab (University of Oxford)
Mostafa Dehghani (Google Brain)
Anelia Angelova (Google Research)

More from the Same Authors

  • 2021 Poster: Attention Bottlenecks for Multimodal Fusion
    Arsha Nagrani · Shan Yang · Anurag Arnab · Aren Jansen · Cordelia Schmid · Chen Sun
  • 2021 Poster: Compressive Visual Representations
    Kuang-Huei Lee · Anurag Arnab · Sergio Guadarrama · John Canny · Ian Fischer
  • 2020 Poster: AViD Dataset: Anonymized Videos from Diverse Countries
    AJ Piergiovanni · Michael S Ryoo
  • 2019 : Coffee + Posters
    Changhao Chen · Nils Gählert · Edouard Leurent · Johannes Lehner · Apratim Bhattacharyya · Harkirat Singh Behl · TeckYian Lim · Shiho Kim · Jelena Novosel · Błażej Osiński · Arindam Das · Ruobing Shen · Jeffrey Hawke · Joachim Sicking · Babak Shahian Jahromi · Theja Tulabandhula · Claudio Michaelis · Evgenia Rusak · WENHANG BAO · Hazem Rashed · JP Chen · Amin Ansari · Jaekwang Cha · Mohamed Zahran · Daniele Reda · Jinhyuk Kim · Kim Dohyun · Ho Suk · Junekyo Jhung · Alexander Kister · Matthias Fahrland · Adam Jakubowski · Piotr Miłoś · Jean Mercat · Bruno Arsenali · Silviu Homoceanu · Xiao-Yang Liu · Philip Torr · Ahmad El Sallab · Ibrahim Sobh · Anurag Arnab · Christopher Galias