The aim of this study is to evaluate the pronunciation accuracy of Korean EFL learners by automatically comparing their pronunciation with that of native speakers and assigning pronunciation scores. Feedback on their pronunciation can also be given to the learners.
To calculate pronunciation accuracy, automatic speech recognition technology based on Hidden Markov Models (HMMs) was used. From the perspective of phonetics and phonology, pronunciation can be described in terms of segmental and suprasegmental features. Among the suprasegmental features, pitch pattern and rhythmic pattern were investigated. The pitch pattern of Korean speakers was evaluated mainly through the pattern of fundamental frequency (F0) fluctuation. For the rhythmic pattern, the regularity of the interval of each foot was examined. The pitch and rhythmic patterns of native English speakers were compared with those of Korean EFL learners.
To show how necessary pronunciation instruction is in current EFL environments, a needs analysis was carried out by giving a questionnaire to English teachers in Korea. The survey showed that pronunciation instruction needs to be given to EFL learners. The teachers also pointed out that proper pronunciation teaching tools and teaching techniques should be provided.
To calculate the pronunciation accuracy of Korean EFL learners, (1) phonetically based phone sets were established, (2) the segmental and suprasegmental properties of native English speakers' pronunciation were investigated, (3) recording scripts were created for acoustic analysis, (4) automatic labelling and segmentation of the recorded data were carried out, (5) spectrographic analysis was carried out to manually compare the Korean EFL learners' pronunciation with that of native speakers, and (6) based on these analyses, the variation in Korean EFL learners' pronunciation was described in terms of segmental and suprasegmental aspects. The recorded speech data were also used as input to the pronunciation evaluation system.
Auditory evaluation was also carried out by three native English speakers and five Korean phonetics experts. The auditory evaluation scores were compared with the automatic scores produced by the evaluation system in this study.
Segmental evaluation of pronunciation was carried out using HMMs based on the native speakers' phone sets. After training, native speakers' utterances were automatically annotated by forced alignment. The automatic annotation produced a log probability for each segment, which was divided by the number of frames the segment occupies.
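The frame normalization described above can be sketched as follows. This is a minimal illustration, not the system used in the study: the tuple layout `(phone, start_frame, end_frame, log_prob)`, the function name `frame_normalized_score`, and the sample values are all assumptions made for the example.

```python
def frame_normalized_score(log_prob: float, start_frame: int, end_frame: int) -> float:
    """Divide a segment's log probability by the number of frames it spans."""
    n_frames = end_frame - start_frame
    if n_frames <= 0:
        raise ValueError("a segment must span at least one frame")
    return log_prob / n_frames


# Hypothetical forced-alignment output: (phone, start_frame, end_frame, log_prob).
alignment = [
    ("DH", 0, 5, -42.0),
    ("AH", 5, 12, -35.0),
    ("K", 12, 20, -96.0),
]

# One normalized score per aligned segment, in utterance order.
segment_scores = [
    (phone, frame_normalized_score(log_prob, start, end))
    for phone, start, end, log_prob in alignment
]
```

Dividing by the frame count keeps long segments from being penalized simply because they accumulate log probability over more frames.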
To evaluate the rhythmic accuracy of Korean EFL learners' English pronunciation, the duration of each segment was converted to a z-score. Using z-scores rather than absolute duration values minimizes the difficulty that speaker variation poses for measuring the duration and rhythm of pronunciation. The root mean squared error (RMSE) and the correlation coefficient were used as the evaluation measures.
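A minimal sketch of this comparison is given below. It assumes that z-scores are computed over each speaker's own segment durations within an utterance and that the learner and native segments are already aligned one to one; the study does not state these details, and the function names and duration values are hypothetical.

```python
import math
from statistics import mean, pstdev


def z_scores(durations):
    """Normalize durations by the speaker's own mean and standard deviation."""
    mu = mean(durations)
    sigma = pstdev(durations)
    if sigma == 0:
        raise ValueError("durations must not all be identical")
    return [(d - mu) / sigma for d in durations]


def rmse(a, b):
    """Root mean squared error between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(mean([(x - y) ** 2 for x, y in zip(a, b)]))


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical segment durations in seconds; the learner speaks twice as slowly
# but with the same relative timing as the native speaker.
native = [0.08, 0.12, 0.20, 0.10]
learner = [0.16, 0.24, 0.40, 0.20]

z_native = z_scores(native)
z_learner = z_scores(learner)
rhythm_rmse = rmse(z_native, z_learner)
rhythm_corr = pearson(z_native, z_learner)
```

In this example the learner's absolute durations differ from the native speaker's, yet the z-scored patterns coincide, so the RMSE is close to zero and the correlation is close to one. This is the sense in which z-scores reduce the effect of speaker variation such as overall speaking rate.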
The two most important events that characterize the intonation of utterances are F0 fluctuations (pitch accents) and intonational phrase (IP) boundary tones. The F0 fluctuation was found to be too variable to serve as a consistent parameter. IP boundary tones, on the other hand, appear to be invariant enough to be extracted as a language-specific property. We therefore investigated the IP boundary tones of Korean learners' English utterances, compared with the identical utterances produced by native speakers, to evaluate the Korean learners' intonation patterns. For quantitative analysis, we employed Tilt intonation parameters, which are assumed to be more practical and reliable than the ToBI framework.
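For reference, the Tilt parameters of a single intonation event can be sketched as below. The sketch follows the commonly cited formulation of the Tilt model, in which an event is described by the amplitude and duration of its rise and fall portions; the study does not specify its exact definitions, so the function name, the argument names, and the sample values are assumptions.

```python
def tilt_parameters(a_rise: float, a_fall: float, d_rise: float, d_fall: float) -> dict:
    """Summarize one rise/fall F0 event as amplitude, duration, and tilt.

    a_rise, a_fall: F0 excursions (Hz) of the rise and fall portions.
    d_rise, d_fall: durations (s) of the rise and fall portions.
    """
    amp_sum = abs(a_rise) + abs(a_fall)
    dur_sum = d_rise + d_fall
    if amp_sum == 0 or dur_sum == 0:
        raise ValueError("event must have non-zero amplitude and duration")
    tilt_amp = (abs(a_rise) - abs(a_fall)) / amp_sum
    tilt_dur = (d_rise - d_fall) / dur_sum
    # Tilt ranges from -1 (pure fall) through 0 (symmetric) to +1 (pure rise).
    return {
        "amplitude": amp_sum,
        "duration": dur_sum,
        "tilt": (tilt_amp + tilt_dur) / 2,
    }


# Hypothetical boundary events: a pure rise and a pure fall.
rising_boundary = tilt_parameters(a_rise=20.0, a_fall=0.0, d_rise=0.15, d_fall=0.0)
falling_boundary = tilt_parameters(a_rise=0.0, a_fall=-25.0, d_rise=0.0, d_fall=0.2)
```

Because each event reduces to a few continuous values, a learner's boundary tone can be compared numerically with a native speaker's, which symbolic labels such as those of ToBI do not allow directly.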