Poster

Vocal Call Locator Benchmark (VCL) for localizing rodent vocalizations from multi-channel audio

Ralph Peterson ⋅ Aramis Tanelus ⋅ Christopher Ick ⋅ Bartul Mimica ⋅ Niegil Francis Muttath Joseph ⋅ Violet Ivan ⋅ Aman Choudhri ⋅ Annegret Falkner ⋅ Mala Murthy ⋅ David Schneider ⋅ Dan Sanes ⋅ Alex Williams

2024 Poster

Abstract

Understanding the behavioral and neural dynamics of social interactions is a goal of contemporary neuroscience. Many machine learning methods have emerged in recent years to make sense of complex video and neurophysiological data that result from these experiments. Less focus has been placed on understanding how animals process acoustic information, including social vocalizations. A critical step to bridge this gap is determining the senders and receivers of acoustic information in social interactions. While sound source localization (SSL) is a classic problem in signal processing, existing approaches are limited in their ability to localize animal-generated sounds in standard laboratory environments. Advances in deep learning methods for SSL are likely to help address these limitations, however there are currently no publicly available models, datasets, or benchmarks to systematically evaluate SSL algorithms in the domain of bioacoustics. Here, we present the VCL Benchmark: the first large-scale dataset for benchmarking SSL algorithms in rodents. We acquired synchronized video and multi-channel audio recordings of 767,295 sounds with annotated ground truth sources across 9 conditions. The dataset provides benchmarks which evaluate SSL performance on real data, simulated acoustic data, and a mixture of real and simulated data. We intend for this benchmark to facilitate knowledge transfer between the neuroscience and acoustic machine learning communities, which have had limited overlap.
