Why Do Music Models Plagiarize? A Motif-Centric Perspective
Tatsuro Inaba · Kentaro Inui
Abstract
This paper examines plagiarism-like behavior in Transformer-based models for symbolic music generation. While these models can produce musically convincing output, they also risk copying fragments of their training data. We hypothesize that such plagiarism arises from local overfitting to motifs (short patterns that recur within a piece) rather than from global overfitting. To test this hypothesis, we analyze motif repetition in the training data and assess motif-level plagiarism through perplexity and the originality of generated samples. Experiments show that frequently repeated motifs are predicted with lower perplexity and are more likely to reappear in generated output. We also explore preliminary strategies for mitigating plagiarism (label smoothing, transposition-based data augmentation, and Top-$K$ sampling) and evaluate their effectiveness.
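The abstract gives no implementation details, so the sketch below is only a rough illustration of what the three mitigation strategies might look like in PyTorch. The token layout (`PITCH_BASE`, `NUM_PITCHES`) is a hypothetical placeholder for whatever symbolic-music vocabulary the paper actually uses; the PyTorch calls themselves (`F.cross_entropy` with `label_smoothing`, `torch.topk`, `torch.multinomial`) are standard library APIs.

```python
import torch
import torch.nn.functional as F

# Hypothetical vocabulary layout: pitch tokens occupy IDs
# PITCH_BASE..PITCH_BASE+127 (one per MIDI pitch); all other IDs are
# non-pitch events (durations, bars, ...). The paper's real tokenizer
# may differ.
PITCH_BASE = 3
NUM_PITCHES = 128

def transpose_augment(tokens: torch.Tensor, max_shift: int = 6) -> torch.Tensor:
    """Transposition-based augmentation: shift every pitch token by one
    random interval, leaving non-pitch tokens untouched."""
    shift = int(torch.randint(-max_shift, max_shift + 1, (1,)))
    is_pitch = (tokens >= PITCH_BASE) & (tokens < PITCH_BASE + NUM_PITCHES)
    shifted = tokens + shift
    # Only apply the shift where the result is still a valid pitch token.
    in_range = (shifted >= PITCH_BASE) & (shifted < PITCH_BASE + NUM_PITCHES)
    return torch.where(is_pitch & in_range, shifted, tokens)

def smoothed_loss(logits: torch.Tensor, targets: torch.Tensor,
                  smoothing: float = 0.1) -> torch.Tensor:
    """Label smoothing softens the one-hot targets so training does not
    push all probability mass onto the memorized next token."""
    return F.cross_entropy(logits, targets, label_smoothing=smoothing)

def top_k_sample(logits: torch.Tensor, k: int = 32) -> torch.Tensor:
    """Top-K sampling: draw from the K most likely tokens rather than
    decoding greedily, which can otherwise reproduce training motifs."""
    values, indices = torch.topk(logits, k)
    probs = F.softmax(values, dim=-1)
    choice = torch.multinomial(probs, num_samples=1)
    return indices.gather(-1, choice)
```

In this framing, augmentation and label smoothing act at training time (diluting motif-level memorization), while Top-$K$ sampling acts at decoding time (avoiding the single most probable, possibly memorized, continuation).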