

Poster in Affinity Event: Black in AI

Building a Robust Amharic ASR

Yamlak Asrat Bogale · Yohannes Haile


Abstract:

Amharic, a Semitic language widely spoken in Ethiopia, is a low-resource language with limited representation in language technologies. Automatic Speech Recognition (ASR) systems developed for Amharic struggle to improve recognition performance, as measured by Word Error Rate (WER), primarily because of frequent out-of-vocabulary errors during recognition. Because the language is low-resource, publicly available datasets for training Amharic ASR models are difficult and costly to acquire. This study uses an open-source dataset compiled from publicly available sources such as audiobooks, news readings, multi-genre radio programs, and audio Bible readings, comprising 110 hours of speech. Using this dataset, Facebook's wav2vec 2.0 ASR model is employed for Amharic speech recognition. In addition, various techniques are used to enhance its robustness.
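
For readers unfamiliar with the WER metric named in the abstract, the following is a minimal sketch of how it is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the abstract does not describe the authors' evaluation code, and Amharic text may need different normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: tokenization is plain whitespace splitting, which is
    an assumption and not taken from the abstract.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

An out-of-vocabulary word that the recognizer cannot produce typically shows up here as a substitution or deletion, which is why frequent out-of-vocabulary errors raise WER directly.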
