Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener, with implications for various applications in AR, VR, and robotics. Whereas traditional methods to estimate RIRs assume dense geometry and/or sound measurements throughout the environment, we explore how to infer RIRs based on a sparse set of images and echoes observed in the space. Towards that goal, we introduce a transformer-based method that uses self-attention to build a rich acoustic context, then predicts RIRs of arbitrary query source-receiver locations through cross-attention. Additionally, we design a novel training objective that improves the match in the acoustic signature between the RIR predictions and the targets. In experiments using a state-of-the-art audio-visual simulator for 3D environments, we demonstrate that our method successfully generates arbitrary RIRs, outperforming state-of-the-art methods and, in a major departure from traditional methods, generalizing to novel environments in a few-shot manner. Project: http://vision.cs.utexas.edu/projects/fs_rir
Author Information
Sagnik Majumder (University of Texas, Austin)
Changan Chen (University of Texas, Austin)
Ziad Al-Halah (KIT)
Kristen Grauman (University of Texas at Austin)
More from the Same Authors
- 2021 Spotlight: Shaping embodied agent behavior with activity-context priors from egocentric video
  Tushar Nagarajan · Kristen Grauman
- 2023 Poster: EgoEnv: Human-centric environment representations from egocentric video
  Tushar Nagarajan · Santhosh Kumar Ramakrishnan · Ruta Desai · James Hillis · Kristen Grauman
- 2023 Poster: Self-Supervised Visual Acoustic Matching
  Arjun Somayazulu · Changan Chen · Kristen Grauman
- 2023 Poster: Video-Mined Task Graphs for Keystep Recognition in Instructional Videos
  Kumar Ashutosh · Santhosh Kumar Ramakrishnan · Triantafyllos Afouras · Kristen Grauman
- 2023 Poster: Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment
  Zihui Xue · Kristen Grauman
- 2023 Poster: EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding
  Shuhan Tan · Tushar Nagarajan · Kristen Grauman
- 2023 Poster: Single-Stage Visual Query Localization in Egocentric Videos
  Hanwen Jiang · Santhosh Kumar Ramakrishnan · Kristen Grauman
- 2023 Poster: EgoTracks: A Long-term Egocentric Visual Object Tracking Dataset
  Hao Tang · Kevin J Liang · Kristen Grauman · Matt Feiszli · Weiyao Wang
- 2023 Oral: EgoEnv: Human-centric environment representations from egocentric video
  Tushar Nagarajan · Santhosh Kumar Ramakrishnan · Ruta Desai · James Hillis · Kristen Grauman
- 2022 Poster: SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
  Changan Chen · Carl Schissler · Sanchit Garg · Philip Kobernik · Alexander Clegg · Paul Calamia · Dhruv Batra · Philip Robinson · Kristen Grauman
- 2021 Poster: Shaping embodied agent behavior with activity-context priors from egocentric video
  Tushar Nagarajan · Kristen Grauman
- 2020 : Panel Discussion & Closing
  Yejin Choi · Alexei Efros · Chelsea Finn · Kristen Grauman · Quoc V Le · Yann LeCun · Ruslan Salakhutdinov · Eric Xing
- 2020 : Q & A and Panel Session with Dan Weld, Kristen Grauman, Scott Yih, Emma Brunskill, and Alex Ratner
  Kristen Grauman · Wen-tau Yih · Alexander Ratner · Emma Brunskill · Douwe Kiela · Daniel S. Weld
- 2020 : QA: Kristen Grauman
  Kristen Grauman
- 2020 : Invited Talk: Kristen Grauman
  Kristen Grauman
- 2020 Poster: Learning Affordance Landscapes for Interaction Exploration in 3D Environments
  Tushar Nagarajan · Kristen Grauman
- 2020 Spotlight: Learning Affordance Landscapes for Interaction Exploration in 3D Environments
  Tushar Nagarajan · Kristen Grauman
- 2017 Poster: Learning Spherical Convolution for Fast Features from 360° Imagery
  Yu-Chuan Su · Kristen Grauman
- 2014 Poster: Diverse Sequential Subset Selection for Supervised Video Summarization
  Boqing Gong · Wei-Lun Chao · Kristen Grauman · Fei Sha
- 2014 Poster: Predicting Useful Neighborhoods for Lazy Local Learning
  Aron Yu · Kristen Grauman
- 2014 Poster: Zero-shot recognition with unreliable attributes
  Dinesh Jayaraman · Kristen Grauman
- 2013 Poster: Reshaping Visual Datasets for Domain Adaptation
  Boqing Gong · Kristen Grauman · Fei Sha
- 2012 Poster: Semantic Kernel Forests from Multiple Taxonomies
  Sung Ju Hwang · Kristen Grauman · Fei Sha
- 2011 Poster: Learning a Tree of Metrics with Disjoint Visual Features
  Sung Ju Hwang · Kristen Grauman · Fei Sha
- 2010 Poster: Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning
  Prateek Jain · Sudheendra Vijayanarasimhan · Kristen Grauman
- 2008 Oral: Multi-Level Active Prediction of Useful Image Annotations for Recognition
  Sudheendra N Vijayanarasimhan · Kristen Grauman
- 2008 Poster: Multi-Level Active Prediction of Useful Image Annotations for Recognition
  Sudheendra N Vijayanarasimhan · Kristen Grauman
- 2008 Poster: Online Metric Learning and Fast Similarity Search
  Prateek Jain · Brian Kulis · Inderjit Dhillon · Kristen Grauman
- 2008 Oral: Online Metric Learning and Fast Similarity Search
  Prateek Jain · Brian Kulis · Inderjit Dhillon · Kristen Grauman
- 2006 Poster: Approximate Correspondences in High Dimensions
  Kristen Grauman · Trevor Darrell
- 2006 Spotlight: Approximate Correspondences in High Dimensions
  Kristen Grauman · Trevor Darrell