Fri Dec 08 08:00 AM -- 06:30 PM (PST) @ 201 A
Machine Learning for Audio Signal Processing (ML4Audio)
Hendrik Purwins · Bob L. Sturm · Mark Plumbley

Abstracts and full papers:

Audio signal processing is currently undergoing a paradigm shift, in which data-driven machine learning is replacing hand-crafted feature design. This has led some to ask whether audio signal processing is still useful in the "era of machine learning." Many challenges remain, new and old, including the interpretation of learned models in high-dimensional spaces, problems associated with data-poor domains, adversarial examples, high computational requirements, and research that is driven by companies using large in-house datasets and is therefore ultimately not reproducible.

ML4Audio aims to promote progress, systematization, understanding, and convergence in applying machine learning to audio signal processing. Specifically, we are interested in work that demonstrates novel applications of machine learning techniques to audio data, as well as methodological considerations in merging machine learning with audio signal processing. We seek contributions on topics including, but not limited to, the following:
- audio information retrieval using machine learning;
- audio synthesis with given contextual or musical constraints using machine learning;
- audio source separation using machine learning;
- audio transformations (e.g., sound morphing, style transfer) using machine learning;
- unsupervised learning, online learning, one-shot learning, reinforcement learning, and incremental learning for audio;
- applications/optimization of generative adversarial networks for audio;
- cognitively inspired machine learning models of sound cognition;
- mathematical foundations of machine learning for audio signal processing.

This workshop especially targets researchers, developers, and musicians in academia and industry working in the areas of music information retrieval (MIR), audio processing, hearing instruments, speech processing, musical HCI, musicology, music technology, music entertainment, and composition.

ML4Audio Organisation Committee:
Hendrik Purwins, Aalborg University Copenhagen, Denmark
Bob L. Sturm, Queen Mary University of London, UK
Mark Plumbley, University of Surrey, UK

Program Committee:
Abeer Alwan (University of California, Los Angeles)
Jon Barker (University of Sheffield)
Sebastian Böck (Johannes Kepler University Linz)
Mads Græsbøll Christensen (Aalborg University)
Maximo Cobos (Universitat de Valencia)
Sander Dieleman (Google DeepMind)
Monika Dörfler (University of Vienna)
Shlomo Dubnov (UC San Diego)
Philippe Esling (IRCAM)
Cédric Févotte (IRIT)
Emilia Gómez (Universitat Pompeu Fabra)
Emanuël Habets (International Audio Labs Erlangen)
Jan Larsen (Technical University of Denmark)
Marco Marchini (Spotify)
Rafael Ramirez (Universitat Pompeu Fabra)
Gaël Richard (TELECOM ParisTech)
Fatemeh Saki (UT Dallas)
Sanjeev Satheesh (Baidu SVAIL)
Jan Schlüter (Austrian Research Institute for Artificial Intelligence)
Joan Serrà (Telefonica)
Malcolm Slaney (Google)
Emmanuel Vincent (INRIA Nancy)
Gerhard Widmer (Austrian Research Institute for Artificial Intelligence)
Tao Zhang (Starkey Hearing Technologies)

Overture (Talk)
Acoustic word embeddings for speech search (Invited Talk)
Learning Word Embeddings from Speech (Talk)
Multi-Speaker Localization Using Convolutional Neural Network Trained with Noise (Talk)
Adaptive Front-ends for End-to-end Source Separation (Talk)
Poster Session Speech: source separation, enhancement, recognition, synthesis (Coffee break and poster session)
Learning and transforming sound for interactive musical applications (Invited Talk)
Compact Recurrent Neural Network based on Tensor Train for Polyphonic Music Modeling (Talk)
Singing Voice Separation using Generative Adversarial Networks (Talk)
Audio Cover Song Identification using Convolutional Neural Network (Talk)
Lunch Break (Break)
Polyphonic piano transcription using deep neural networks (Invited Talk)
Deep learning for music recommendation and generation (Invited Talk)
Exploring Ad Effectiveness using Acoustic Features (Invited Talk)
Poster Session Music and environmental sounds (Coffee break and poster session)
Sight and sound (Invited Talk)
k-shot Learning of Acoustic Context (Talk)
Towards Learning Semantic Audio Representations from Unlabeled Data (Talk)
Cost-sensitive detection with variational autoencoders for environmental acoustic sensing (Talk)
Panel: Machine learning and audio signal processing: State of the art and future perspectives (Discussion Panel)