Robust Speech Recognition Using an Eigen-Environment Noise Compensation Method
Song Hwa Jeon, Kim Hyung Soon — 대한음성학회 (Korean Society of Phonetic Sciences), 2004, 말소리 Vol. 52
In this paper, a new noise compensation method based on the eigenvoice framework in the feature space is proposed to reduce the mismatch between training and testing environments. The difference between clean and noisy environments is represented by a linear combination of K eigenvectors that capture the variation among environments. In the proposed method, the performance improvement of the speech recognition system depends largely on how the noisy models and the bias vector set are constructed. Two methods are proposed to construct the noisy models: one based on MAP adaptation and the other using a stereo DB. In experiments on the Aurora 2 DB, the eigen-environment method achieved a 44.86% relative improvement over the baseline system. In particular, in the clean-condition training mode, the proposed method yielded a 66.74% relative improvement, outperforming several methods previously proposed in the Aurora project.
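The core idea above — modeling the clean/noisy mismatch as a linear combination of K environment eigenvectors — can be sketched as follows. This is an illustrative toy sketch, not the paper's implementation: the dimensions, data, and the least-squares projection used to estimate the combination weights are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N_ENV, K = 13, 20, 4   # feature dim, training environments, kept eigenvectors

# Bias vectors (noisy mean minus clean mean) collected from training environments
train_biases = rng.normal(size=(N_ENV, D))

# PCA over the training biases yields the environment eigenvectors
mean_bias = train_biases.mean(axis=0)
U, s, Vt = np.linalg.svd(train_biases - mean_bias, full_matrices=False)
eigvecs = Vt[:K]                           # top-K eigenvectors, shape (K, D)

# A new environment's bias is approximated in the eigen-space by
# projecting onto the K orthonormal eigenvectors
new_bias = rng.normal(size=D)
weights = eigvecs @ (new_bias - mean_bias)   # combination weights
estimated = mean_bias + weights @ eigvecs    # estimated compensation vector

# Compensation: subtract the estimated bias from the noisy features
noisy_feat = rng.normal(size=D) + new_bias
compensated = noisy_feat - estimated
```

Because only K weights must be estimated rather than a full D-dimensional bias, very little data from the new environment is needed.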
A Study on Improving Korean Connected Digit Recognition Using Various Discriminant Analyses
Song Hwa Jeon, Kim Hyung Soon — 대한음성학회 (Korean Society of Phonetic Sciences), 2002, 말소리 Vol. 44
In Korean, each digit is monosyllabic, and some pairs are known to be highly confusable, degrading the performance of connected digit recognition systems. To improve performance, in this paper we employ various discriminant analyses (DA), including Linear DA (LDA), Weighted Pairwise Scatter LDA (WPS-LDA), Heteroscedastic Discriminant Analysis (HDA), and Maximum Likelihood Linear Transformation (MLLT). We also examine several combinations of these DA methods for additional performance improvement. Experimental results show that applying any of the above DA methods improves the string accuracy, but the amount of improvement varies with the model complexity, i.e., the number of mixtures per state. In particular, more than 20% string error reduction over the baseline system is achieved by applying MLLT after WPS-LDA, when the DA class level is defined as a tied state and one mixture per state is used.
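As a reference point for the DA variants compared above, plain LDA finds projection directions that maximize between-class scatter relative to within-class scatter. A minimal NumPy sketch on toy data (class layout, dimensions, and sample counts are illustrative, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(1)
D, P, C = 6, 2, 3                     # input dim, projected dim, classes
means = np.array([[2.0] * 6, [-2.0] * 6, [2.0, -2.0] * 3])
X = np.vstack([m + rng.normal(size=(50, D)) for m in means])
y = np.repeat(np.arange(C), 50)

overall = X.mean(axis=0)
Sw = np.zeros((D, D))                 # within-class scatter
Sb = np.zeros((D, D))                 # between-class scatter
for c in range(C):
    Xc = X[y == c]
    mc = Xc.mean(axis=0)
    Sw += (Xc - mc).T @ (Xc - mc)
    diff = (mc - overall)[:, None]
    Sb += len(Xc) * (diff @ diff.T)

# LDA directions: leading eigenvectors of Sw^{-1} Sb
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
order = np.argsort(eigvals.real)[::-1][:P]
W = eigvecs[:, order].real            # (D, P) projection matrix
Z = X @ W                             # projected features
```

WPS-LDA and HDA modify how the scatter matrices are weighted or modeled, and MLLT adds a maximum-likelihood transform on top; the basic projection machinery is the same.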
Fast Speaker Adaptation Using Sub-Stream-Based Eigenvoices
Song Hwa-Jeon, Lee Jong-Seok, Kim Hyung-Soon — 대한음성학회 (Korean Society of Phonetic Sciences), 2005, 말소리 Vol. 55
In this paper, a sub-stream-based eigenvoice method is proposed to overcome the weaknesses of the conventional eigenvoice and dimensional eigenvoice methods. In the proposed method, sub-streams are constructed automatically by statistical clustering that uses the correlation between dimensions. To obtain a reliable distance matrix from the covariance matrix for division into optimal sub-streams, MAP adaptation is applied to combine the covariance matrix of the training data with the sample covariance of the adaptation data. Our experiments show that the proposed method achieves a 41% error rate reduction when the amount of adaptation data is 50.
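The MAP-smoothed covariance and the correlation-based grouping of dimensions into sub-streams could look roughly like this. Everything here is an assumption for illustration: the prior weight `tau`, the distance definition, and the greedy pairing rule are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
D, N, tau = 8, 30, 50.0               # feature dim, adaptation samples, prior weight

cov_train = np.cov(rng.normal(size=(1000, D)), rowvar=False)   # prior covariance
adapt = rng.normal(size=(N, D))                                # adaptation data
cov_sample = np.cov(adapt, rowvar=False)                       # unreliable with small N

# MAP interpolation between the training prior and the sample covariance
cov_map = (tau * cov_train + N * cov_sample) / (tau + N)

# Distance between dimensions from the smoothed correlation matrix
std = np.sqrt(np.diag(cov_map))
corr = cov_map / np.outer(std, std)
dist = 1.0 - np.abs(corr)
np.fill_diagonal(dist, np.inf)

# Greedy pairing of the most correlated dimensions into 2-dim sub-streams
unused = set(range(D))
streams = []
while unused:
    i = unused.pop()
    j = min(unused, key=lambda k: dist[i, k]) if unused else None
    streams.append((i, j) if j is not None else (i,))
    unused.discard(j)
```

Each sub-stream then gets its own eigenvoice weights, so strongly correlated dimensions are adapted jointly while weakly correlated ones are adapted independently.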
Song, H.J., Kim, H.W., Chung, E., Oh, S., Lee, J.W., Kang, D., Jung, J.Y., Lee, Y.K. — 한국전자통신연구원 (ETRI), 2019, 전자통신동향분석 (Electronics and Telecommunications Trends) Vol. 34 No. 4
Currently, most artificial intelligence relies on securing big data, which is concentrated in a few major companies. Therefore, automatic data augmentation and efficient learning algorithms for small-scale data will become key elements of future artificial-intelligence competitiveness. In addition, it is necessary to develop techniques that learn the meanings, correlations, and temporal associations of complex multimodal knowledge, as humans do, and that expand and transfer semantic prediction and knowledge inference to unknown data. To this end, a neural memory model that imitates how knowledge is processed in the human brain needs to be developed to enable knowledge expansion through cooperative multimodal learning. Moreover, the declarative and procedural knowledge in the memory model must be self-developed through human interaction. In this paper, we review this essential methodology and briefly describe the achievements made so far.
Fast Speaker Adaptation and Environment Compensation Based on Eigenspace-Based MLLR
Song Hwa-Jeon, Kim Hyung-Soon — 대한음성학회 (Korean Society of Phonetic Sciences), 2006, 말소리 Vol. 58
Maximum likelihood linear regression (MLLR) adaptation suffers severe performance degradation with very small amounts of adaptation data. Eigenspace-based MLLR, an alternative to MLLR for fast speaker adaptation, also has a weakness: it cannot deal with the mismatch between training and testing environments. In this paper, we propose simultaneous fast speaker and environment adaptation based on eigenspace-based MLLR. We also extend sub-stream-based eigenspace-based MLLR to generalize eigenspace-based MLLR with bias compensation. A vocabulary-independent word recognition experiment shows that the proposed algorithm is superior to eigenspace-based MLLR regardless of the amount of adaptation data in diverse noisy environments. In particular, the proposed sub-stream eigenspace-based MLLR with bias compensation yields a 67% relative improvement with 10 adaptation words in a 10 dB SNR environment, compared with conventional eigenspace-based MLLR.
CNN Implementation in Terms of Minibatch DNN Training for Effective Application of Second-Order Optimization
Song, Hwa Jeon; Jung, Ho Young; Park, Jeon Gue — 한국음성학회 (Korean Society of Speech Sciences), 2016, 말소리와 음성과학 (Phonetics and Speech Sciences) Vol. 8 No. 2
This paper describes implementation schemes for CNNs from the viewpoint of mini-batch DNN training for efficient second-order optimization. The parameters of a CNN are trained with the same procedure used to update the parameters of a DNN, by simply arranging an input image as a sequence of local patches, which is equivalent to mini-batch DNN training. Through this conversion, second-order optimization, which provides higher performance, can easily be applied to train the CNN parameters. In both image recognition on the MNIST DB and syllable-based automatic speech recognition, our proposed CNN implementation scheme outperforms a DNN-based one.
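The patch-rearrangement trick described above is the familiar im2col idea: stacking every local patch of the image as a row turns the convolution into a single matrix product, i.e., the forward pass of a fully connected layer over a "mini-batch" of patches. A small sketch (shapes and kernel values are illustrative only):

```python
import numpy as np

def im2col(img, k):
    """Stack all k x k patches of a 2-D image as rows."""
    H, W = img.shape
    rows = []
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            rows.append(img[i:i + k, j:j + k].ravel())
    return np.array(rows)                 # ((H-k+1)*(W-k+1), k*k)

rng = np.random.default_rng(3)
img = rng.normal(size=(6, 6))
kernel = rng.normal(size=(3, 3))

patches = im2col(img, 3)                  # patch "mini-batch", shape (16, 9)
out_dnn = patches @ kernel.ravel()        # DNN-style matrix product

# Direct valid-mode convolution for comparison
out_conv = np.array([[np.sum(img[i:i + 3, j:j + 3] * kernel)
                      for j in range(4)] for i in range(4)]).ravel()
```

Since the patch matrix plays the role of a mini-batch of DNN inputs, any optimizer written for mini-batch DNN training, including second-order methods, applies unchanged to the convolutional weights.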