Persian Musical Instruments Classification Using Polyphonic Data Augmentation
Abstract
Musical instrument classification is essential for music information retrieval (MIR) and generative music systems, yet research on non-Western traditions, particularly Persian music, remains limited. We address this gap by introducing a new dataset of isolated recordings covering seven traditional Persian instruments, violin, piano, and vocals. We propose a culturally informed data augmentation strategy that generates realistic polyphonic mixtures from monophonic samples. Using the MERT model (Music undERstanding with large-scale self-supervised Training) for multi-label classification, we evaluate our approach in an out-of-distribution scenario. On real-world polyphonic Persian music, the proposed method yields the best ROC-AUC (0.796), highlighting the complementary benefits of tonal and temporal coherence. These results demonstrate the effectiveness of culturally grounded augmentation for robust Persian instrument recognition and provide a foundation for culturally inclusive MIR and diverse music generation systems.
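To make the idea of building polyphonic training examples from monophonic recordings concrete, the following is a minimal sketch in Python with NumPy. It is not the culturally informed strategy proposed here: it only sums clips with random gains and builds the multi-hot target that multi-label classification requires. The function name `mix_monophonic`, the gain range, and the class indices are hypothetical, and the choice of ten classes assumes one class per instrument listed above (seven Persian instruments, violin, piano, and vocals).

```python
import numpy as np


def mix_monophonic(clips, labels, num_classes, rng, gain_db_range=(-6.0, 0.0)):
    """Sum mono clips with random gains; return the mixture and a multi-hot target.

    Illustrative only: tonal and temporal coherence between sources are NOT
    modeled here. Names and defaults are assumptions, not the paper's method.
    """
    if not clips or len(clips) != len(labels):
        raise ValueError("clips and labels must be non-empty and equal in length")

    # Truncate every clip to the shortest one so they can be summed sample-wise.
    length = min(len(clip) for clip in clips)
    mixture = np.zeros(length, dtype=np.float64)
    target = np.zeros(num_classes, dtype=np.float32)

    for clip, label in zip(clips, labels):
        gain_db = rng.uniform(*gain_db_range)          # random per-source level
        mixture += clip[:length] * 10.0 ** (gain_db / 20.0)
        target[label] = 1.0                            # mark instrument as present

    # Rescale only if the sum would clip outside [-1, 1].
    peak = np.max(np.abs(mixture))
    if peak > 1.0:
        mixture /= peak

    return mixture.astype(np.float32), target


# Usage with two synthetic tones standing in for isolated recordings.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
clip_a = np.sin(2 * np.pi * 220.0 * t)
clip_b = np.sin(2 * np.pi * 330.0 * t)[:12000]
mixture, target = mix_monophonic([clip_a, clip_b], [2, 5], num_classes=10, rng=rng)
```

A classifier trained on such pairs predicts one independent probability per instrument, which is why a threshold-free metric such as ROC-AUC is a natural fit for evaluation.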