RISS Search - Thesis Details

Multilingual Abstract

In this paper, acoustic modeling and an out-of-vocabulary (OOV) rejection method were studied for a Korean vocabulary-independent speech recognizer.
To model phonemes accurately, triphones were used, and a state-tying method was introduced for robust modeling with a limited speech corpus. The problem of unseen models, which appear in the recognition phase but not in the training phase, was solved with Tree-Based Clustering (TBC), a top-down method. For TBC, several phonetic question sets were constructed, and the best recognition result was achieved with a question set that includes a wide variety of phonetic questions and excludes monophone-based questions. Therefore, a phonetic question set for TBC must cover various phonetic phenomena and need not include monophone-based questions.
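The unseen-model problem described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `left-center+right` label notation, the `sil` boundary context, the romanized phone labels, and the tiny hand-written question tree in `tied_state` are all assumptions made for illustration. In actual TBC the tree is grown from training data by choosing, at each node, the phonetic question that best splits the states.

```python
# Hypothetical phone classes, used only for this illustration.
VOWELS = {"a", "i", "u", "e", "o"}
NASALS = {"m", "n", "ng"}


def to_triphones(phones):
    """Expand a phone sequence into left-center+right context labels."""
    labels = []
    for i, center in enumerate(phones):
        left = phones[i - 1] if i > 0 else "sil"
        right = phones[i + 1] if i < len(phones) - 1 else "sil"
        labels.append(f"{left}-{center}+{right}")
    return labels


def unseen_triphones(train_words, test_words):
    """Return triphones needed at recognition time but absent from training."""
    seen = {t for word in train_words for t in to_triphones(word)}
    needed = {t for word in test_words for t in to_triphones(word)}
    return sorted(needed - seen)


def tied_state(triphone):
    """Map any triphone, seen or unseen, to a leaf of a toy question tree."""
    left, rest = triphone.split("-")
    center, _right = rest.split("+")
    if left in VOWELS:        # question: is the left context a vowel?
        leaf = "L_vowel"
    elif left in NASALS:      # question: is the left context a nasal?
        leaf = "L_nasal"
    else:
        leaf = "L_other"
    return f"{center}_{leaf}"


train = [["k", "a", "m"]]
test = [["n", "a", "m"]]
for tri in unseen_triphones(train, test):
    print(tri, "->", tied_state(tri))
```

Because every triphone can answer the tree's questions from its context alone, a triphone that never occurred in training still reaches a leaf and borrows that leaf's shared parameters.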
An OOV rejection experiment was conducted by measuring the confidence of the recognized result. Two methods were compared: one based on the utterance-level log-likelihood ratio (LLR) and the other on the frame-level LLR. For the utterance-level OOV rejection experiment, the best and second-best results were used to compute the LLR, and normalizing the result by the length of the utterance gave a better result. Frame-level OOV rejection outperformed utterance-level OOV rejection. In frame-level OOV rejection, a filler model built from context-independent (CI) models was used as the alternative hypothesis, and the number of clusters constituting the filler model was varied. With a filler model composed of two clusters, an equal error rate (EER) of 0.5% was achieved. This corresponds to rejecting one in-vocabulary (IV) word and accepting one OOV word per 200 words.
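The confidence measures and the EER figure above can be sketched in Python. This is a sketch under stated assumptions, not the thesis's scoring code: the function names, the per-frame averaging in `frame_llr`, and the threshold sweep in `equal_error_rate` are hypothetical choices, and the scores in the usage example are synthetic values chosen only to reproduce the "one error in 200" arithmetic.

```python
def utterance_llr(best_loglik, second_loglik, num_frames):
    """Utterance-level LLR between the best and second-best hypotheses,
    normalized by utterance length in frames."""
    return (best_loglik - second_loglik) / num_frames


def frame_llr(target_frame_ll, filler_frame_ll):
    """Average per-frame LLR of the recognized model against a filler model
    (assumed form of the frame-level measure)."""
    diffs = [t - f for t, f in zip(target_frame_ll, filler_frame_ll)]
    return sum(diffs) / len(diffs)


def equal_error_rate(iv_scores, oov_scores):
    """Sweep thresholds and return (eer, threshold) at the point where the
    false-rejection and false-acceptance rates are closest."""
    best = None
    for th in sorted(set(iv_scores) | set(oov_scores)):
        frr = sum(s < th for s in iv_scores) / len(iv_scores)      # IV rejected
        far = sum(s >= th for s in oov_scores) / len(oov_scores)   # OOV accepted
        gap = abs(frr - far)
        if best is None or gap < best[0]:
            best = (gap, (frr + far) / 2, th)
    return best[1], best[2]


# Synthetic scores: 1 of 200 IV words scores low, 1 of 200 OOV words scores high.
iv = [1.0] * 199 + [-1.0]
oov = [-2.0] * 199 + [2.0]
eer, threshold = equal_error_rate(iv, oov)
print(f"EER = {eer:.3%} at threshold {threshold}")
```

With these synthetic scores, the threshold that rejects one IV word also accepts one OOV word, so both error rates equal 1/200 = 0.5%.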
For future study, new methods for improved acoustic modeling should be investigated. For OOV rejection, anti-modeling based on discriminative training should also be tried. Additionally, for real-world application, noise processing and keyword spotting have to be implemented.

Table of Contents

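1. Introduction
2. Vocabulary-Independent Speech Recognition System
2.1 Speech Feature Parameter Extraction
2.2 Recognition Network Structure for the Vocabulary-Independent Speech Recognizer
3. Acoustic Modeling
3.1 Hidden Markov Model
3.2 Phoneme Modeling
3.3 Parameter Sharing
3.3.1 Tied Mixture
3.3.2 Tied-state triphones
4. Utterance Verification
4.1 Utterance Verification Methods by Verification Unit
4.1.1 Utterance-Level Utterance Verification Method
4.1.2 Frame-Level Utterance Verification Method
5. Experimental Setup and Results
5.1 Training Speech Database
5.2 Speech Database for Recognition
5.3 Modeling Methods
5.3.1 Phoneme Modeling
5.3.2 Filler Modeling
5.4 Recognition Rate by Phonetic Question Set
5.5 Experiments on OOV Rejection Methods
5.5.1 Utterance-Level Utterance Verification
5.5.2 Frame-Level Utterance Verification
6. Conclusion
References
Abstract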

한국어 가변어휘 인식기의 성능향상에 관한 연구 = (A) Study on the Performance Improvement of Korean Vocabulary-Independent Speech Recognizer
