Riwaya-ID: Towards ML-powered Identification of Qur’anic Recitation Style from Audio
A. Anas Chentouf
Abstract
The Holy Qur’an, the scripture of Muslims, is a recited text whose transmission traditions (riwayat) encode different recitation rules. We study riwaya identification: determining the Qur’anic transmission style of a recitation directly from audio. To this end, we curate over 700 hours of recitations and segment the recordings into $12$ s windows to build a dataset. Building on pretrained speech encoders (e.g., wav2vec 2.0, Whisper), we extract frame-level embeddings and train a lightweight classifier to predict the riwaya. Our embedding-based models achieve $82\%$ accuracy in distinguishing Warsh from Hafs, outperforming text-only baselines. We hope that this work provides a first step toward scalable, audio-native tools for enriching Qur’anic digital libraries and supporting different recitation styles.
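As a rough illustration of the preprocessing described in the abstract, the following minimal sketch splits a waveform into fixed $12$ s windows and pools frame-level embeddings into one vector per window. It is not the authors' implementation: the function names `segment_windows` and `mean_pool`, the non-overlapping windows, the dropped trailing remainder, and the mean pooling are all assumptions made for illustration, and the encoder itself is left out.

```python
from typing import List, Sequence

WINDOW_SECONDS = 12  # window length stated in the abstract


def segment_windows(samples: Sequence[float], sample_rate: int,
                    window_seconds: float = WINDOW_SECONDS) -> List[Sequence[float]]:
    """Split a mono waveform into non-overlapping fixed-length windows.

    Assumption: a trailing remainder shorter than one window is dropped.
    """
    window_len = int(window_seconds * sample_rate)
    if window_len <= 0:
        raise ValueError("window must contain at least one sample")
    n_windows = len(samples) // window_len
    return [samples[i * window_len:(i + 1) * window_len] for i in range(n_windows)]


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average frame-level embeddings (frames x dims) into one vector.

    Assumption: mean pooling is only one possible way to summarise a window.
    """
    if not frames:
        raise ValueError("need at least one frame")
    n_frames = len(frames)
    return [sum(column) / n_frames for column in zip(*frames)]


# Hypothetical usage: 30 s of silence at 16 kHz yields two full 12 s windows.
windows = segment_windows([0.0] * (16000 * 30), sample_rate=16000)
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0]])  # stand-in for encoder output
```

In a full pipeline, each window would be passed through a pretrained encoder to obtain the frame-level embeddings, and the pooled vectors would then feed a small classifier over riwaya labels.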