For a number of speech tasks, it can be useful to represent speech segments of arbitrary length by fixed-dimensional vectors, or embeddings. In particular, vectors representing word segments (acoustic word embeddings) can be used in query-by-example search, example-based speech recognition, or spoken term discovery. Textual word embeddings have been common in natural language processing for a number of years; the acoustic analogue has only recently begun to be explored. This talk will present our work on acoustic word embeddings and their application to query-by-example search. I will also speculate on applications across a wider variety of audio tasks.
Karen Livescu is an Associate Professor at TTI-Chicago. She completed her PhD and post-doc in electrical engineering and computer science at MIT and her Bachelor's degree in Physics at Princeton University. Karen's main research interests are at the intersection of speech and language processing and machine learning. Her recent work includes multi-view representation learning, segmental neural models, acoustic word embeddings, and automatic sign language recognition. She is a member of the IEEE Spoken Language Technical Committee, an associate editor for IEEE Transactions on Audio, Speech, and Language Processing, and a technical co-chair of ASRU 2015 and 2017.
Karen Livescu (TTI-Chicago)