Invited talk in Workshop: Machine Learning for Audio

A multi-view approach for audio-based speech emotion recognition

Dimitra Emmanouilidou

Sat 16 Dec 7:40 a.m. PST — 8:10 a.m. PST

Abstract:

The area of speech emotion recognition (SER) has seen significant advances with the wider availability of pre-trained models and embeddings, and the creation of larger publicly available corpora. In this talk, we will touch upon some of the challenges that continue to trouble audio-based SER, such as domain adaptation, data augmentation, and output generalization, and discuss the advantages of a multi-view model approach that jointly learns from both categorical and dimensional affect labels.
