I'll discuss the problem of transcribing polyphonic piano music with an emphasis on generalizing to unseen instruments. We optimize for two objectives: we first predict pitch onset events, and then predict pitch at the frame level conditioned on the predicted onsets. I'll describe the model architecture, which combines CNNs and LSTMs, and the challenges faced in robust piano transcription, such as obtaining enough data to train a good model. I'll also provide some demos and links to working code. This collaboration was led by Curtis Hawthorne, Erich Elsen, and Jialin Song (https://arxiv.org/abs/1710.11153).
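As a rough illustration of the onset-then-frame idea, the sketch below turns two grids of per-frame probabilities into note events, letting a note begin only where the onset predictions fire. It is a minimal sketch under stated assumptions, not the code from the paper or from Magenta: the function name `decode_notes`, the 0.5 threshold, and the list-of-lists input layout are all hypothetical, and it shows a simple decoding rule applied to model outputs rather than the conditioning inside the network itself.

```python
from typing import List, Sequence, Tuple

# (pitch_index, start_frame, end_frame_exclusive); a hypothetical note format.
Note = Tuple[int, int, int]


def decode_notes(
    onset_probs: Sequence[Sequence[float]],
    frame_probs: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed cutoff, chosen only for illustration
) -> List[Note]:
    """Convert [frame][pitch] probabilities into notes that must start on an onset."""
    notes: List[Note] = []
    n_frames = len(frame_probs)
    n_pitches = len(frame_probs[0]) if n_frames else 0

    for pitch in range(n_pitches):
        start = None         # frame where the currently open note began
        prev_onset = False   # used to detect the rising edge of an onset
        for t in range(n_frames):
            onset = onset_probs[t][pitch] >= threshold
            active = frame_probs[t][pitch] >= threshold

            if onset and not prev_onset:
                # A new onset: close any note still open (re-strike), then start.
                if start is not None:
                    notes.append((pitch, start, t))
                start = t
            elif start is not None and not active and not onset:
                # The open note has decayed below the threshold: close it.
                notes.append((pitch, start, t))
                start = None
            # Frames that are active without any preceding onset are ignored.
            prev_onset = onset

        if start is not None:
            notes.append((pitch, start, n_frames))
    return notes
```

In this sketch, a pitch whose frame probabilities are high but whose onset probabilities never cross the threshold produces no note at all, which is the intended effect of gating frame activity on onsets.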
Douglas Eck works on the Google Brain team on the Magenta project, an effort to generate music, video, images and text using machine intelligence. He previously worked on music search and recommendation for Google Play Music. Before that, he was an Associate Professor in Computer Science at the University of Montreal, in the BRAMS research center, where he also worked on music performance modeling.
Douglas Eck (Google Brain)
I’m a research scientist working on Magenta, an effort to generate music, video, images and text using machine intelligence. Magenta is part of the Google Brain team and is using TensorFlow (www.tensorflow.org), an open-source library for machine learning. The question Magenta asks is, “Can machines make music and art? If so, how? If not, why not?” The goal of Magenta is to produce open-source tools and models that help creative people be even more creative. I’m primarily looking at how to use so-called “generative” machine learning models to create engaging media. Additionally, I’m working on how to bring other aspects of the creative process into play. For example, art and music are not just about generating new pieces. They are also about drawing one’s attention, being surprising, telling an interesting story, knowing what’s interesting in a scene, and so on. Before starting the Magenta project, I worked on music search and recommendation for Google Play Music. My research goal in this area was to use machine learning and audio signal processing to help listeners find the music they want when they want it. This involves both learning from audio and learning from how users consume music. In the audio domain, the main goal is to transform the ones and zeros in a digital audio file into something where musically similar songs are also numerically similar, making it easier to do music recommendation. Similarity (a) is user-dependent: my idea of similar is not the same as yours, and (b) changes with context: my idea of similarity changes when I make a playlist for jogging versus a playlist for a dinner party. I might choose the same song (say "Taxman" by the Beatles), but for jogging it might be the tempo that drove the selection, whereas for the dinner party it might be "I like the album Revolver and want to add it to the mix." I joined Google in 2010.
Before then, I was an Associate Professor in Computer Science at the University of Montreal. I helped found the BRAMS research center (Brain Music and Sound; www.brams.org) and was involved at the McGill CIRMMT center (Centre for Interdisciplinary Research in Music Media and Technology; www.cirmmt.org). Aside from audio signal processing and machine learning, I worked on music performance modeling. What exactly does a good music performer add to what is already in the score? I treated this as a machine learning question: hypothetically, if we showed a piano-playing robot a huge collection of Chopin performances, from the best in the world all the way down to that of a struggling teenage pianist, could it learn to play well by analyzing all of these examples? If so, what’s the right way to perform that analysis? In the end I learned a lot about the complexity and beauty of human music performance, and how performance relates to and extends composition.
More from the Same Authors
2017 Workshop: Machine Learning for Creativity and Design »
Douglas Eck · David Ha · S. M. Ali Eslami · Sander Dieleman · Rebecca Fiebrink · Luba Elliott
2017 Demonstration: Magenta and deeplearn.js: Real-time Control of Deep Generative Music Models in the Browser »
Curtis Hawthorne · Ian Simon · Adam Roberts · Jesse Engel · Daniel Smilkov · Nikhil Thorat · Douglas Eck
2016 Demonstration: Interactive musical improvisation with Magenta »
Adam Roberts · Jesse Engel · Curtis Hawthorne · Ian Simon · Elliot Waite · Sageev Oore · Natasha Jaques · Cinjon Resnick · Douglas Eck
2011 Workshop: The 4th International Workshop on Music and Machine Learning: Learning from Musical Structure »
Rafael Ramirez · Darrell Conklin · Douglas Eck · Rif A. Saurous
2009 Poster: An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism »
Aaron Courville · Douglas Eck · Yoshua Bengio
2007 Poster: Automatic Generation of Social Tags for Music Recommendation »
Douglas Eck · Paul Lamere · Thierry Bertin-Mahieux · Stephen J Green