On the Use of Speech Recognition Technology for Foreign Language Pronunciation Teaching
Hirose, Keikichi, Ishi, Carlos-T., Kawai, Goh The Korean Society Of Phonetic Sciences And Speech 2001 말소리 Vol.42 No.-
Speech technologies have recently shown notable advances and now play major roles in computer-aided language learning systems. This paper examines the use of speech recognition technologies through our system for teaching English pronunciation to Japanese speakers.
Spectral Analysis of Audio Signals with Noise Assisted Empirical Mode Decomposition
Poly Rani Ghosh,Keikichi Hirose,Md. Khademul Islam Molla 보안공학연구지원센터 2015 International Journal of Signal Processing, Image Vol.8 No.4
This paper implements a data-adaptive approach to the spectral analysis of audio signals. Audio signals are non-stationary as well as non-linear in nature, so the traditional Fourier-based spectral representation is not effective for them. Hilbert spectral analysis implemented by noise-assisted bivariate empirical mode decomposition (NA-BEMD) is introduced here as an efficient spectral representation scheme for audio signals. In BEMD, fractional Gaussian noise (fGn) and the speech signal under analysis are used as two separate variables. Both signals are decomposed together, yielding a finite set of intrinsic mode functions (IMFs) for each variable (signal). The use of fGn gives BEMD dyadic filterbank characteristics. The instantaneous frequencies of the individual IMFs are computed by applying the Hilbert transform, and the time-frequency representation is then obtained by arranging the energy with respect to time and frequency simultaneously. This representation is called the Hilbert spectrum (HS) and is analogous to a spectrogram. The marginal HS derived from the HS corresponds to the total energy at each frequency component. The experimental results show that Hilbert spectral analysis represents audio signal content better than the Fourier-based approach.
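The Hilbert-spectrum step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the NA-BEMD decomposition itself is not reproduced, and two pure sinusoids stand in for IMFs as an assumption. The function names (`analytic_signal`, `instantaneous_frequency`, `marginal_hilbert_spectrum`), the sampling rate, and the bin count are all hypothetical choices made for illustration.

```python
import cmath
import math

def analytic_signal(x):
    """Analytic signal via a direct DFT (O(N^2), fine for short signals)."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0          # keep DC and Nyquist unchanged
        elif k < (n + 1) // 2:
            gain = 2.0          # double the positive frequencies
        else:
            gain = 0.0          # discard the negative frequencies
        spectrum[k] *= gain
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def instantaneous_frequency(z, fs):
    """Instantaneous frequency in Hz from the phase step between samples."""
    return [cmath.phase(b * a.conjugate()) * fs / (2 * math.pi)
            for a, b in zip(z, z[1:])]

def marginal_hilbert_spectrum(imfs, fs, n_bins):
    """Sum the energy |z|^2 over time within each frequency bin."""
    nyquist = fs / 2
    marginal = [0.0] * n_bins
    for imf in imfs:
        z = analytic_signal(imf)
        for t, freq in enumerate(instantaneous_frequency(z, fs)):
            if 0 <= freq < nyquist:
                marginal[int(freq / nyquist * n_bins)] += abs(z[t]) ** 2
    return marginal

# Stand-ins for IMFs (assumption): a 10 Hz and a weaker 30 Hz sinusoid.
fs, n = 200, 200
imfs = [
    [math.cos(2 * math.pi * 10 * t / fs) for t in range(n)],
    [0.5 * math.cos(2 * math.pi * 30 * t / fs) for t in range(n)],
]
marginal = marginal_hilbert_spectrum(imfs, fs, n_bins=25)  # 4 Hz per bin
peak_bin = max(range(len(marginal)), key=lambda b: marginal[b])
```

With 4 Hz bins, the 10 Hz component falls in bin 2 and the 30 Hz component in bin 7, so the marginal spectrum concentrates the energy of each stand-in IMF at its own frequency.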
Unit Generation Based on Phrase Break Strength and Pruning for Corpus-Based Text-to-Speech
Kim, Sang-Hun,Lee, Young-Jik,Hirose, Keikichi Electronics and Telecommunications Research Instit 2001 ETRI Journal Vol.23 No.4
This paper discusses two important issues in corpus-based synthesis: generating synthesis units based on phrase break strength information, and pruning redundant synthesis unit instances. First, a new sentence set for recording was designed to build an efficient synthesis database that reflects the characteristics of the Korean language. To obtain prosodic-context-sensitive units, we graded major prosodic phrases into 5 distinct levels according to pause length and then discriminated intra-word triphones using these levels. Synthetic speech was generated using the synthesis units with phrase break strength information and evaluated subjectively. Second, a new pruning method based on weighted vector quantization (WVQ) was proposed to eliminate redundant synthesis unit instances from the synthesis database. WVQ takes the relative importance of each instance into account when clustering similar instances using the vector quantization (VQ) technique. The proposed method was compared with two conventional pruning methods through objective and subjective evaluations of synthetic speech quality: one that simply limits the maximum number of instances, and another based on ordinary VQ clustering. For the same reduction rate in the number of instances, the proposed method showed the best performance. Synthetic speech with a 45% reduction rate showed almost no perceptible degradation compared with synthetic speech without instance reduction.
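The abstract does not spell out how WVQ weights instances, so the following is only a rough sketch of the general idea, under stated assumptions: clustering is done with weighted k-means (centroids are weight-averaged), centroids are initialised by a farthest-point rule, and the instance nearest each centroid is kept. The function `weighted_vq_prune`, the toy feature vectors, and the weights are hypothetical and are not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def weighted_vq_prune(instances, weights, k, iters=20):
    """Return indices of the k instances kept after weighted clustering."""
    idx = range(len(instances))
    dim = len(instances[0])
    # Farthest-point initialisation, starting from the heaviest instance.
    start = max(idx, key=lambda i: weights[i])
    centroids = [list(instances[start])]
    while len(centroids) < k:
        far = max(idx, key=lambda i: min(sq_dist(instances[i], c) for c in centroids))
        centroids.append(list(instances[far]))

    def assign_all():
        return [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in instances]

    for _ in range(iters):
        assign = assign_all()
        for j in range(k):
            members = [i for i in idx if assign[i] == j]
            total = sum(weights[i] for i in members)
            if total > 0:
                # Weighted centroid: important instances pull the codeword harder.
                centroids[j] = [sum(weights[i] * instances[i][d] for i in members) / total
                                for d in range(dim)]

    assign = assign_all()
    kept = []
    for j in range(k):
        members = [i for i in idx if assign[i] == j]
        if members:
            # Keep the one instance closest to the cluster's codeword.
            kept.append(min(members, key=lambda i: sq_dist(instances[i], centroids[j])))
    return sorted(kept)

# Toy 2-D "unit instances" in three groups, with hypothetical importance weights.
instances = [(0.0, 0.0), (0.2, 0.0), (1.0, 0.0),
             (5.0, 5.0), (5.1, 5.0),
             (9.0, 0.0), (9.0, 0.4)]
weights = [5, 5, 1, 1, 2, 1, 3]
kept = weighted_vq_prune(instances, weights, k=3)
```

Here seven instances are reduced to three, one per cluster, and within each cluster the surviving instance is the one nearest the weight-averaged centroid. A real system would use acoustic feature vectors and an importance measure defined by the synthesis database, neither of which is specified in the abstract.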