RISS Search - Dissertation Details

Multilingual Abstract

This dissertation proposes robust acoustic model adaptation for automatic speech recognition (ASR) systems that use continuous density hidden Markov models (CDHMMs) in unknown environments. We focus on an environmental adaptation method for practical applications based on linear spectral transformation (LST). Environmental noise is difficult to handle in the cepstral domain. LST deals with such noise in the linear spectral domain and can operate with a relatively small number of transformation parameters. Hence, we can develop rapid adaptation approaches that need only a small amount of adaptation data: maximum likelihood (ML)-LST mean transformation, ML-LST mean and variance transformation, and maximum mutual information (MMI)-LST. However, these approaches require iterative computation to obtain the transformation parameters. To reduce the computational complexity and achieve real-time adaptation for practical applications, we exploit an approximate ML objective function and propose a closed-form solution of ML-LST (CML-LST). To avoid adapting on erroneous transcriptions, a lattice-based confidence measure is analyzed and applied to CML-LST for unsupervised adaptation. In addition, we propose an incremental CML-LST adaptation technique that accumulates the data statistics and achieves consistently improved performance over repeated adaptation attempts. Finally, we combine the proposed methods and obtain promising results on the TIMIT/FFMTIMIT and Aurora4 evaluations.

Abstract (Korean)

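This dissertation proposes an acoustic model adaptation technique that is robust in noisy environments for speech recognition systems that use continuous density hidden Markov models (HMMs) as acoustic models. It focuses on an environmental adaptation method for practical applications based on linear spectral transformation (LST). Because LST-based acoustic model adaptation operates in the linear spectral domain, it is well suited to handling both additive noise, such as background noise, and convolutional noise, such as channel noise. As a result, it can use a relatively small number of transformation parameters, so adaptation is possible even with a small amount of data. The LST technique is extended to mean transformation of the acoustic model, joint mean and variance transformation, and maximum mutual information transformation. However, these methods incur the computational cost of iterative estimation to obtain the transformation parameters. Therefore, closed-form maximum likelihood LST (CML-LST) is proposed, an approximation of LST that reduces the computational complexity and obtains the transformation parameters in a single step. This method enables real-time adaptation suitable for practical applications. In addition, for situations in which no reference transcription is available, a lattice-based unsupervised adaptation method that uses state posterior probabilities as a confidence measure is analyzed and applied to CML-LST. Furthermore, to obtain continued performance improvement over repeated adaptation attempts, the technique is extended to incremental CML-LST, which accumulates the data statistics. Finally, the combined methods are evaluated for noise-robust acoustic model adaptation on the TIMIT/FFMTIMIT and Aurora4 data.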
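
The abstracts name three ideas that a small example may help make concrete: a closed-form (non-iterative) estimate, confidence weighting for unsupervised data, and incremental accumulation of statistics. The sketch below is not the dissertation's CML-LST derivation, which this page does not give. It assumes a much simpler stand-in model in which each linear-spectral bin of an observed frame is y = h * x + n, with x the reference (for example, model mean) spectrum, h a per-bin channel gain, and n a per-bin additive noise term, and it fits h and n by weighted least squares. The class name `IncrementalAffineStats`, its methods, and the use of a per-frame confidence as the weight are all hypothetical choices made for illustration.

```python
class IncrementalAffineStats:
    """Running weighted sums for a per-bin fit y ~ h * x + n.

    Illustrative stand-in only; not the CML-LST estimator from the thesis.
    """

    def __init__(self, num_bins):
        self.num_bins = num_bins
        self.sw = 0.0                    # sum of confidence weights
        self.sx = [0.0] * num_bins       # weighted sum of x
        self.sy = [0.0] * num_bins       # weighted sum of y
        self.sxx = [0.0] * num_bins      # weighted sum of x * x
        self.sxy = [0.0] * num_bins      # weighted sum of x * y

    def accumulate(self, reference_frames, observed_frames, confidences):
        """Add one batch; each frame is a list of num_bins linear-spectral values."""
        for x, y, w in zip(reference_frames, observed_frames, confidences):
            self.sw += w
            for k in range(self.num_bins):
                self.sx[k] += w * x[k]
                self.sy[k] += w * y[k]
                self.sxx[k] += w * x[k] * x[k]
                self.sxy[k] += w * x[k] * y[k]

    def solve(self, eps=1e-12):
        """Closed-form weighted least squares per bin; no iteration needed."""
        gains, offsets = [], []
        for k in range(self.num_bins):
            mean_x = self.sx[k] / self.sw
            mean_y = self.sy[k] / self.sw
            var_x = self.sxx[k] / self.sw - mean_x * mean_x
            cov_xy = self.sxy[k] / self.sw - mean_x * mean_y
            h = cov_xy / max(var_x, eps)  # guard against a degenerate bin
            gains.append(h)
            offsets.append(mean_y - h * mean_x)
        return gains, offsets
```

Because only the running sums are stored, a later batch of adaptation data can be folded in with another `accumulate` call followed by `solve`, without revisiting earlier frames. Low-confidence frames contribute proportionally less to every sum, which is one simple way to limit the influence of frames whose hypothesized transcription is likely wrong.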

Table of Contents