How Transferable are Video Representations Based on Synthetic Data?
Yo-whan Kim · Samarth Mishra · SouYoung Jin · Rameswar Panda · Hilde Kuehne · Leonid Karlinsky · Venkatesh Saligrama · Kate Saenko · Aude Oliva · Rogerio Feris

Tue Nov 29 02:00 PM -- 04:00 PM (PST) @ Hall J #1033

Action recognition has improved dramatically with massive-scale video datasets. Yet, these datasets are accompanied by issues related to curation cost, privacy, ethics, bias, and copyright. By comparison, little effort has been devoted to exploring the potential of synthetic video data. In this work, as a stepping stone towards addressing these shortcomings, we study the transferability of video representations learned solely from synthetically generated video clips, instead of real data. We propose SynAPT, a novel benchmark for action recognition based on a combination of existing synthetic datasets, in which a model is pre-trained on synthetic videos rendered by various graphics simulators and then transferred to a set of downstream action recognition datasets whose categories differ from those of the synthetic data. We provide an extensive baseline analysis on SynAPT, revealing that the simulation-to-real gap is minor for datasets with low object and scene bias, where models pre-trained with synthetic data even outperform their real-data counterparts. We posit that the gap between real and synthetic action representations can be attributed to contextual bias and static objects related to the action, rather than to the temporal dynamics of the action itself. The SynAPT benchmark is available at https://github.com/mintjohnkim/SynAPT.

Author Information

Yo-whan Kim (Massachusetts Institute of Technology)
Samarth Mishra (Boston University)
SouYoung Jin (Massachusetts Institute of Technology)
Rameswar Panda (MIT-IBM Watson AI Lab)
Hilde Kuehne (Goethe University Frankfurt)
Leonid Karlinsky (Weizmann Institute of Science)
Venkatesh Saligrama (Boston University)
Kate Saenko (Boston University & MIT-IBM Watson AI Lab, IBM Research)
Aude Oliva (Massachusetts Institute of Technology)
Rogerio Feris (MIT-IBM Watson AI Lab, IBM Research)
