Humans can infer a wide range of properties from a perceived sound, such as information about the source (e.g. what generated the sound? where is it coming from?), the information the sound conveys (this is a word that means X, this is a musical note in scale Y), and how it compares to other sounds (these two sounds come/don't come from the same source and are/aren't identical). Can any one learned representation do the same? The aim of this competition is to develop a general-purpose audio representation that provides a meaningful basis for learning in a wide variety of tasks and scenarios. We challenge participants with the following questions: Is it possible to develop a single representation that models all psychoacoustic phenomena? What approach best generalizes to a wide range of downstream audio tasks without fine-tuning? What audio representation allows researchers to formulate and solve novel and societally-valuable problems in simple, repeatable ways? We will evaluate audio representations using a benchmark suite across a variety of domains, including speech, environmental sound, medical audio, and music. In the spirit of shared exchange, all participants must submit an audio embedding model, following a common API, that is general-purpose, open-source, and freely available to use.
Author Information
Joseph Turian (MetaOptimize)
Jordan Shier (jordieshier@gmail.com)
Bhiksha Raj (Carnegie Mellon University)
Bjoern Schuller (University of Augsburg / Imperial College London)
Christian Steinmetz (Queen Mary University of London)
George Tzanetakis (University of Victoria)
Gissel Velarde (Independent)
I am an independent researcher. I hold a PhD in Computer Science and Engineering from Aalborg University, Denmark, ranked the best university in Europe and fourth in the world for engineering according to the US News World Ranking and the MIT ranking 2018. I hold a master's degree in electronic systems and engineering management from the South Westphalia University of Applied Sciences, Soest, Germany, and a Licenciatura degree in systems engineering from the Universidad Católica Boliviana, La Paz, Bolivia, ranked the third best university in Bolivia according to the Webometrics Ranking 2020.
Kirk McNally (University of Victoria)
Max Henry (McGill University)
Nicolas Pinto (Cygni Labs)
Yonatan Bisk (Carnegie Mellon University)
Camille Noufi (Stanford University)
Dorien Herremans (Singapore University of Technology and Design)
Jesse Engel (Google Brain)
Justin Salamon (Adobe Research)
Pranay Manocha (Princeton)
Philippe Esling (IRCAM - Sorbonne Université / CNRS)
Shinji Watanabe (Johns Hopkins University)
More from the Same Authors
- 2020: Randomized Overdrive Neural Networks
  Christian Steinmetz
- 2021: MIDI-DDSP: Hierarchical Modeling of Music for Detailed Control
  Yusong Wu · Ethan Manilow · Kyle Kastner · Tim Cooijmans · Aaron Courville · Cheng-Zhi Anna Huang · Jesse Engel
- 2022 Poster: USB: A Unified Semi-supervised Learning Benchmark for Classification
  Yidong Wang · Hao Chen · Yue Fan · Wang SUN · Ran Tao · Wenxin Hou · Renjie Wang · Linyi Yang · Zhi Zhou · Lan-Zhe Guo · Heli Qi · Zhen Wu · Yu-Feng Li · Satoshi Nakamura · Wei Ye · Marios Savvides · Bhiksha Raj · Takahiro Shinozaki · Bernt Schiele · Jindong Wang · Xing Xie · Yue Zhang
- 2021: WebQA Competition + Q&A
  Yingshan CHANG · Yonatan Bisk · Mridu Narang · Levi Melnick · Jianfeng Gao · Hisami Suzuki · Guihong Cao
- 2020: Panel Discussion 2
  Tom White · Jesse Engel · Aaron Hertzmann · Stephanie Dinkins · Holly Grimm
- 2020: magenta: Empowering Creative Agency with Machine Learning
  Jesse Engel
- 2020 Workshop: Self-Supervised Learning for Speech and Audio Processing
  Abdelrahman Mohamed · Hung-yi Lee · Shinji Watanabe · Shang-Wen Li · Tara Sainath · Karen Livescu
- 2020 Poster: Is normalization indispensable for training deep neural network?
  Jie Shao · Kai Hu · Changhu Wang · Xiangyang Xue · Bhiksha Raj
- 2020 Oral: Is normalization indispensable for training deep neural network?
  Jie Shao · Kai Hu · Changhu Wang · Xiangyang Xue · Bhiksha Raj
- 2020 Poster: Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals
  Jing Shi · Xuankai Chang · Pengcheng Guo · Shinji Watanabe · Yusuke Fujita · Jiaming Xu · Bo Xu · Lei Xie
- 2019 Workshop: NeurIPS Workshop on Machine Learning for Creativity and Design 3.0
  Luba Elliott · Sander Dieleman · Adam Roberts · Jesse Engel · Tom White · Rebecca Fiebrink · Parag Mital · Christine McLeavey · Nao Tokui
- 2019 Poster: Face Reconstruction from Voice using Generative Adversarial Networks
  Yandong Wen · Bhiksha Raj · Rita Singh
- 2018 Workshop: Second Workshop on Machine Learning for Creativity and Design
  Luba Elliott · Sander Dieleman · Rebecca Fiebrink · Jesse Engel · Adam Roberts · Tom White
- 2017: Poster Session Music and environmental sounds
  Oriol Nieto · Jordi Pons · Bhiksha Raj · Tycho Tax · Benjamin Elizalde · Juhan Nam · Anurag Kumar
- 2017 Demonstration: Magenta and deeplearn.js: Real-time Control of Deep Generative Music Models in the Browser
  Curtis Hawthorne · Ian Simon · Adam Roberts · Jesse Engel · Daniel Smilkov · Nikhil Thorat · Douglas Eck
- 2016 Demonstration: Interactive musical improvisation with Magenta
  Adam Roberts · Jesse Engel · Curtis Hawthorne · Ian Simon · Elliot Waite · Sageev Oore · Natasha Jaques · Cinjon Resnick · Douglas Eck
- 2012 Poster: Unsupervised Structure Discovery for Semantic Analysis of Audio
  Sourish Chaudhuri · Bhiksha Raj
- 2010 Demonstration: MetaOptimize: A Q+A site for machine learning
  Joseph Turian
- 2010 Poster: Multiparty Differential Privacy via Aggregation of Locally Trained Classifiers
  Manas A Pathak · Shantanu Rane · Bhiksha Raj
- 2009 Poster: A Sparse Non-Parametric Approach for Single Channel Separation of Known Sounds
  Paris Smaragdis · Madhusudana Shashanka · Bhiksha Raj
- 2006 Poster: Scalable Discriminative Learning for Natural Language Parsing and Translation
  Joseph Turian · Benjamin Wellington · Dan Melamed
- 2006 Spotlight: Scalable Discriminative Learning for Natural Language Parsing and Translation
  Joseph Turian · Benjamin Wellington · Dan Melamed