Learning Self-Informed Feature Contribution for Deep Learning-Based Acoustic Modeling
Kim, Younggwan,Kim, Myungjong,Goo, Jahyun,Kim, Hoirin IEEE 2018 IEEE/ACM transactions on audio, speech, and langua Vol.26 No.11
<P>In this paper, we introduce a new feature engineering approach for deep learning-based acoustic modeling, which utilizes input feature contributions. For this purpose, we propose an auxiliary deep neural network (DNN) called a feature contribution network (FCN) whose output layer is composed of sigmoid-based contribution gates. In our framework, the FCN tries to learn element-level discriminative contributions of input features, and an acoustic model network (AMN) is trained on gated features generated by element-wise multiplication between contribution gate outputs and input features. We also propose a regularization method for the FCN, which helps the FCN to activate the minimum number of gates. The proposed methods were evaluated on the TED-LIUM release 1 corpus. We applied the proposed methods to DNN- and long short-term memory-based AMNs. Experimental results showed that AMNs with the FCNs consistently improved recognition performance compared with AMN-only frameworks.</P>
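The gating step described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `gate_features` and `gate_sparsity_penalty` are hypothetical, the gate logits stand in for the FCN output, and the mean-activation penalty is an assumed stand-in for the paper's regularizer, whose exact form the abstract does not give.

```python
import math


def sigmoid(x):
    """Logistic function used for each contribution gate."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_features(features, gate_logits):
    """Return (gates, gated_features) for one input frame.

    gate_logits stands in for the FCN's pre-activation outputs
    (hypothetical name); each gate scales exactly one feature element.
    """
    if len(features) != len(gate_logits):
        raise ValueError("features and gate_logits must have equal length")
    gates = [sigmoid(z) for z in gate_logits]
    gated = [g * x for g, x in zip(gates, features)]
    return gates, gated


def gate_sparsity_penalty(gates, weight=0.01):
    """Assumed regularizer: weight times the mean gate activation.

    Adding this to the training loss pushes gates toward zero, which is
    one plausible way to encourage few active gates. It is not taken
    from the paper.
    """
    return weight * sum(gates) / len(gates)
```

In this sketch the AMN would consume `gated` in place of the raw features, and the penalty would be added to the AMN loss when both networks are trained jointly.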
Deployment Simulation of a Passenger Airbag with Smart Vent Holes
Younggwan Kim, Kwonhee Kim, Hyungjun Kim, Youngho Nam Korean Society of Automotive Engineers 2013 Korean Society of Automotive Engineers Joint Division Conference Vol.2013 No.5
The degree of protection provided by a passenger airbag depends on the details of airbag deployment, such as the shape and size of the airbag, the airbag pressure, and the vent hole size, as well as the size of the passenger. Larger passengers require higher airbag pressure than smaller passengers. Adaptive vent systems detect the weight of the passenger and control the vent hole size with costly mechanisms. In this work, the feasibility of an economical alternative system with a pressure-sensitive vent mechanism is explored. The relation between pressure and vent rate is explored via airbag deployment simulations with reference to drop tower tests with varying drop weight.
Kim, Myung Jong,Kim, Younggwan,Kim, Hoirin IEEE 2015 IEEE/ACM transactions on audio, speech, and langua Vol.23 No.4
<P>This paper presents a new method for automatically assessing the speech intelligibility of patients with dysarthria, which is a motor speech disorder impeding the physical production of speech. The proposed method consists of two main steps: feature representation and prediction. In the feature representation step, the speech utterance is converted into a phone sequence using an automatic speech recognition technique and is then aligned with a canonical phone sequence from a pronunciation dictionary using a weighted finite state transducer to capture pronunciation mappings such as match, substitution, and deletion. The histograms of the pronunciation mappings on a pre-defined word set are used as features. Next, in the prediction step, we propose a structured sparse linear model that incorporates phonological knowledge and simultaneously addresses phonologically structured sparse feature selection and intelligibility prediction. Evaluation of the proposed method on a database of 109 speakers consisting of 94 dysarthric and 15 control speakers yielded a root mean square error of 8.14 compared to subjectively rated scores in the range of 0 to 100. This is a promising result, suggesting that the system can be successfully applied to help speech therapists in diagnosing the degree of speech disorder.</P>
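The feature representation step can be illustrated with a small alignment sketch. Note the substitution: the paper aligns phone sequences with a weighted finite state transducer, whereas the code below uses a plain unit-cost edit-distance alignment as a simpler stand-in. The function names and the added "insertion" category are assumptions for illustration only.

```python
from collections import Counter


def align_phones(canonical, recognized):
    """Align two phone sequences with unit-cost edit distance.

    Returns a list of (kind, canonical_phone, recognized_phone) tuples,
    where kind is "match", "substitution", "deletion", or "insertion".
    """
    n, m = len(canonical), len(recognized)
    # cost[i][j]: minimum edits to align canonical[:i] with recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (canonical[i - 1] != recognized[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover the mappings.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (
            canonical[i - 1] != recognized[j - 1]
        ):
            same = canonical[i - 1] == recognized[j - 1]
            kind = "match" if same else "substitution"
            ops.append((kind, canonical[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("deletion", canonical[i - 1], None))
            i -= 1
        else:
            ops.append(("insertion", None, recognized[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def mapping_histogram(ops):
    """Count how often each mapping kind occurs in an alignment."""
    return Counter(kind for kind, _, _ in ops)
```

Summing such histograms over a fixed word set would give one feature vector per speaker, which a regression model could then map to an intelligibility score.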
SVM Based Speaker Verification Using Sparse Maximum A Posteriori Adaptation
Kim, Younggwan,Roh, Jaeyoung,Kim, Hoirin The Institute of Electronics and Information Engin 2013 IEIE Transactions on Smart Processing & Computing Vol.2 No.5
Modern speaker verification systems based on support vector machines (SVMs) use Gaussian mixture model (GMM) supervectors as their input feature vectors, and maximum a posteriori (MAP) adaptation is a conventional method for generating speaker-dependent GMMs by adapting a universal background model (UBM). MAP adaptation requires a sufficient amount of input speech because of the number of model parameters to be estimated. With limited utterances, MAP adaptation becomes unreliable and introduces adaptation noise, even though the Bayesian priors used in the MAP adaptation smooth the movements between the UBM and speaker-dependent GMMs. This paper proposes a sparse MAP adaptation method, which is known to perform well in the automatic speech recognition area. By introducing sparse MAP adaptation to the GMM-SVM-based speaker verification system, the adaptation noise can be mitigated effectively. The proposed method utilizes the L0 norm as a regularizer to induce sparsity. The experimental results on the TIMIT database showed that the sparse MAP-based GMM-SVM speaker verification system yields a 42.6% relative reduction in the equal error rate with few additional computations.
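A rough sketch of the idea for a single Gaussian mean follows. The first function is the classic relevance-MAP mean update. The second applies hard thresholding to the offset from the UBM mean, which is a common way to handle an L0 penalty, but it is an assumption here because the abstract does not give the paper's actual optimization. The function names, the relevance factor of 16, and the threshold are all illustrative.

```python
def map_adapt_mean(ubm_mean, first_order_stat, soft_count, relevance=16.0):
    """Classic relevance-MAP update for one Gaussian mean.

    first_order_stat is the posterior-weighted sum of frames assigned
    to this component; soft_count is the sum of those posteriors.
    """
    if soft_count <= 0:
        return list(ubm_mean)  # no data: stay at the UBM mean
    alpha = soft_count / (soft_count + relevance)
    return [
        alpha * (s / soft_count) + (1.0 - alpha) * m
        for s, m in zip(first_order_stat, ubm_mean)
    ]


def sparsify_offsets(ubm_mean, adapted_mean, threshold):
    """Assumed sparsification: reset small offsets to the UBM value.

    Only dimensions that moved by more than `threshold` keep their
    adapted value, so the offset vector becomes sparse.
    """
    return [
        a if abs(a - m) > threshold else m
        for a, m in zip(adapted_mean, ubm_mean)
    ]
```

With few frames, `alpha` is small and most offsets are tiny, so thresholding discards exactly the weakly supported movements that the abstract describes as adaptation noise.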
Younggwan Kim, Jusuk Lee, Ajung Kim, Jiman Hong Korean Institute of Smart Media 2021 Smart Media Journal Vol.10 No.1
As machine learning becomes more common, applications that use machine learning are being actively developed, and research on machine learning platforms that support such development is also increasing. Despite this growth, research on load balancing suited to machine learning platforms remains insufficient. In this paper, we therefore propose a load balancing scheme for distributed machine learning environments. The proposed scheme organizes distributed servers in a level hash table structure and assigns machine learning tasks to servers in consideration of the performance of each server. We implemented the distributed servers, ran experiments, and compared the performance with an existing hashing scheme. Compared with the existing hashing scheme, the proposed scheme showed an average speed improvement of about 26%, and the number of tasks waiting to be assigned to a server fell by more than 38%.
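The abstract above does not spell out the data structure, so the following is only a loose sketch of performance-aware assignment over hashed server levels. Everything in it is an assumption made for illustration: the `Server` class, the use of one hashed candidate per level, and the use of free task slots as the performance measure.

```python
import hashlib


class Server:
    """Hypothetical server with a fixed task capacity."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.tasks = []

    def free_slots(self):
        return self.capacity - len(self.tasks)


def _bucket(task_id, level_index, size):
    """Hash a task to a bucket, using the level index as a salt."""
    digest = hashlib.sha256(f"{level_index}:{task_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size


def assign(task_id, levels, waiting):
    """Place a task on the least-loaded hashed candidate.

    levels is a list of server lists (one list per level). The task
    hashes to one candidate per level; the candidate with the most
    free slots wins. If every candidate is full, the task is queued
    in `waiting` and None is returned.
    """
    candidates = [
        level[_bucket(task_id, i, len(level))]
        for i, level in enumerate(levels)
    ]
    best = max(candidates, key=lambda s: s.free_slots())
    if best.free_slots() <= 0:
        waiting.append(task_id)
        return None
    best.tasks.append(task_id)
    return best
```

Giving each task several hashed candidates instead of one is what lets the scheduler prefer a less loaded or more capable server, which is the effect the abstract attributes to the proposed scheme.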
Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition
Myungjong Kim,Younggwan Kim,Joohong Yoo,Jun Wang,Hoirin Kim IEEE 2017 IEEE transactions on neural systems and rehabilita Vol.25 No.9
<P>This paper addresses the problem of recognizing the speech uttered by patients with dysarthria, which is a motor speech disorder impeding the physical production of speech. Patients with dysarthria have articulatory limitations and therefore often have trouble pronouncing certain sounds, resulting in undesirable phonetic variation. Modern automatic speech recognition systems designed for regular speakers are ineffective for speakers with dysarthria because of this phonetic variation. To capture the phonetic variation, a Kullback-Leibler divergence-based hidden Markov model (KL-HMM) is adopted, where the emission probability of each state is parameterized by a categorical distribution using phoneme posterior probabilities obtained from a deep neural network-based acoustic model. To further reflect speaker-specific phonetic variation patterns, we propose a speaker adaptation method based on a combination of L2 regularization and confusion-reducing regularization, which can enhance discriminability between categorical distributions of the KL-HMM states while preserving speaker-specific information. Evaluation of the proposed speaker adaptation method on a database of several hundred words for 30 speakers consisting of 12 mildly dysarthric, 8 moderately dysarthric, and 10 non-dysarthric control speakers showed that the proposed approach significantly outperformed the conventional deep neural network-based speaker-adapted system on dysarthric as well as non-dysarthric speech.</P>
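The local score in a KL-HMM can be sketched as a KL divergence between a state's categorical distribution and a frame's phoneme posterior vector. The direction of the divergence used below (state distribution first) is an assumption, since the abstract does not specify it, and the function names are illustrative.

```python
import math


def kl_divergence(state_dist, posterior, eps=1e-12):
    """KL(state_dist || posterior) for two categorical distributions.

    Terms where the state probability is zero contribute nothing;
    posterior entries are floored at eps to avoid division by zero.
    """
    return sum(
        y * math.log(y / max(z, eps))
        for y, z in zip(state_dist, posterior)
        if y > 0
    )


def best_state(state_dists, posterior):
    """Index of the state whose distribution is closest to the posterior."""
    scores = [kl_divergence(dist, posterior) for dist in state_dists]
    return min(range(len(scores)), key=scores.__getitem__)
```

In a full decoder these per-frame divergences would play the role of negative log emission scores inside Viterbi search, and speaker adaptation would adjust the state distributions themselves.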