Mispronunciation Detection and Diagnosis for Young Arabic Learners Using Transfer Learning
... ABSTRACT Improving primary school students' reading skills supports their academic growth and communication abilities. Pronunciation accuracy is central to reading, especially in Arabic, where small diacritic changes can alter meaning. This is complicated by Arabic's low-resource nature. This study developed a Mispronunciation Detection and Diagnosis system for Arabic learners, allowing teachers and learners to use Computer-Assisted Pronunciation Training for improved instruction and assessment. A pretrained self-supervised learning model was fine-tuned to detect phoneme-level pronunciation errors in Modern Standard Arabic using a unique dataset of primary school learner speech from Saudi Arabia. The data were structured, preprocessed, normalized, and aligned to phoneme sequences. The system showed improved phoneme recognition and performance approaching that of a human expert, with an F1 score of 71.4%.
I. INTRODUCTION
Reading proficiency in early education depends on accurate pronunciation, particularly in languages such as Arabic where minor diacritic variations can substantially alter lexical meaning. Therefore, enhancing pronunciation accuracy is central to developing primary school learners' literacy and communication skills.
Pronunciation training in Arabic presents distinct challenges, owing to the language's complex phonological structure and dense diacritic system. Inadequate articulation impairs reading fluency and also hinders language acquisition and comprehension. Thus, reliable automated pronunciation assessment tools are essential for supporting Modern Standard Arabic learners.
Computer-Assisted Pronunciation Training systems are effective language learning tools that deliver automated feedback and adaptive instruction. However, Arabic Computer-Assisted Pronunciation Training systems, particularly those focused on Mispronunciation Detection and Diagnosis, remain underdeveloped.
This study aimed primarily to investigate whether self-supervised transfer learning models can be fine-tuned for effective mispronunciation detection in Arabic, particularly for continuous speech from young learners in real-world settings. A Mispronunciation Detection and Diagnosis framework tailored to Modern Standard Arabic was introduced and optimized for primary school student speech. Leveraging recent advancements in self-supervised learning and speech processing, the proposed system automatically identified and analyzed phoneme-level mispronunciations. This establishes a foundation for consistent, scalable, and accurate feedback to support both pronunciation training and reading assessment in Modern Standard Arabic.
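To make the notion of phoneme-level diagnosis concrete, the following minimal sketch illustrates one common way such feedback can be derived: aligning the canonical phoneme sequence of a prompt with the sequence produced by a phoneme recognizer and labeling each mismatch. It is an illustration under stated assumptions, not the system described in this paper. The function names `align_phonemes` and `diagnose`, the Latin phoneme labels, and the use of a plain Levenshtein alignment are all hypothetical choices made for the example.

```python
from typing import List, Tuple

Op = Tuple[str, str, str]  # (label, expected phoneme, heard phoneme)


def align_phonemes(canonical: List[str], recognized: List[str]) -> List[Op]:
    """Align two phoneme sequences with unit-cost Levenshtein edit distance.

    Assumption: `recognized` comes from some phoneme recognizer; how it is
    produced is outside the scope of this sketch.
    """
    n, m = len(canonical), len(recognized)
    # cost[i][j] = edit distance between canonical[:i] and recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = int(canonical[i - 1] != recognized[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion
                cost[i][j - 1] + 1,             # insertion
            )

    # Backtrace from the bottom-right corner to recover the operations.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = int(canonical[i - 1] != recognized[j - 1])
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                label = "substitution" if mismatch else "correct"
                ops.append((label, canonical[i - 1], recognized[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("deletion", canonical[i - 1], ""))
            i -= 1
        else:
            ops.append(("insertion", "", recognized[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def diagnose(canonical: List[str], recognized: List[str]) -> List[Op]:
    """Return only the operations that indicate a pronunciation error."""
    return [op for op in align_phonemes(canonical, recognized) if op[0] != "correct"]


# Hypothetical example: the learner is prompted with "kataba" (he wrote)
# but the recognizer hears "kutiba" (it was written), a vowel-only change.
errors = diagnose(list("kataba"), list("kutiba"))
print(errors)
```

In this example the two words differ only in their short vowels, which is the kind of diacritic-level change that alters meaning in Arabic; the alignment reports two substitutions and leaves the consonants marked as correct.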