Viseme Recognition Using Hidden Markov Models

Multilingual Abstract

This thesis describes audio-to-visual conversion techniques for efficient multimedia communication. Audio signals are automatically converted to visual images of mouth shapes. Visual images synchronized with audio signals can provide a user-friendly interface for man-machine interaction. Visual speech can be represented as a sequence of visemes, which are the generic face images corresponding to particular sounds. Hidden Markov models (HMMs) are used to convert audio signals to a sequence of visemes.
This study compares four approaches to using HMMs. In the first approach, an HMM is trained for each viseme, and the audio signals are directly recognized as a sequence of visemes. In the second approach, each phoneme is modeled with an HMM, and a general phoneme recognizer is used to produce a phoneme sequence from the audio signals. The phoneme sequence is then converted to a viseme sequence. In the third approach, an HMM is trained for each triviseme, which is a viseme with its left and right context, and the audio signals are directly recognized as a sequence of trivisemes. In the fourth approach, each triphone is modeled with an HMM, and a general triphone recognizer is used to produce a triphone sequence from the audio signals. The triviseme or triphone sequence is then converted to a viseme sequence. The performance of the four viseme recognition systems is evaluated on the TIMIT speech corpus.
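The conversion steps in the second and fourth approaches can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the phoneme-to-viseme table is a hypothetical example rather than the viseme set defined in the thesis, the `left-centre+right` label format for context-dependent units is an assumed convention, and all function names are invented for this sketch.

```python
# Illustrative many-to-one table (hypothetical; NOT the thesis's viseme set).
PHONEME_TO_VISEME = {
    "p": "V_bilabial", "b": "V_bilabial", "m": "V_bilabial",
    "f": "V_labiodental", "v": "V_labiodental",
    "aa": "V_open", "ae": "V_open",
    "sil": "V_sil",
}


def phonemes_to_visemes(phonemes):
    """Map each recognized phoneme label to its viseme class."""
    return [PHONEME_TO_VISEME[p] for p in phonemes]


def strip_context(label):
    """Return the centre unit of a label written as 'left-centre+right'.

    The label format is an assumption; labels without a left or right
    context are returned with whatever context they do have removed.
    """
    centre = label.split("-", 1)[-1]
    return centre.split("+", 1)[0]


def triphones_to_visemes(triphones):
    """Drop the context from each triphone, then map the centre phoneme."""
    return phonemes_to_visemes([strip_context(t) for t in triphones])


if __name__ == "__main__":
    print(phonemes_to_visemes(["b", "aa", "m"]))
    print(triphones_to_visemes(["sil-b+aa", "b-aa+m", "aa-m+sil"]))
```

A triviseme sequence would need only `strip_context`, since its centre unit is already a viseme.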
The viseme recognizer shows a viseme recognition error rate of 33.9%, and the phoneme-based approach shows 29.7%. The triviseme-based approach yields an error rate of 22.7%, and the triphone-based approach 17.4%. When similar viseme classes are merged, the error rates fall to 26.9%, 19.6%, 18.8%, and 10.7%, respectively. These results show that the triviseme-based system is more accurate than the monoviseme-based system.
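The abstract does not say how the error rate is computed. One common choice, assumed here, is the edit distance (substitutions, deletions, and insertions) between the reference and recognized viseme sequences divided by the reference length. The sketch below uses that assumption to show why merging similar classes can lower the score; the viseme labels and the merge table are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp, merge=None):
    """Percent error after optionally merging viseme classes.

    `merge` maps original class names to merged class names (hypothetical).
    The reference sequence must be non-empty.
    """
    merge = merge or {}
    ref = [merge.get(v, v) for v in ref]
    hyp = [merge.get(v, v) for v in hyp]
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Toy sequences: the recognizer confuses two visually similar classes.
ref = ["V_sil", "V_bilabial", "V_open", "V_labiodental", "V_sil"]
hyp = ["V_sil", "V_bilabial", "V_open", "V_bilabial", "V_sil"]
merged = {"V_bilabial": "V_labial", "V_labiodental": "V_labial"}

print(error_rate(ref, hyp))                # one substitution in five labels
print(error_rate(ref, hyp, merge=merged))  # the confusion disappears after merging
```

Merging trades descriptive detail for accuracy: confusions inside a merged class are no longer counted as errors, but the output distinguishes fewer mouth shapes.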

Table of Contents

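Abstract = ⅰ
Table of Contents = ⅲ
Chapter 1 Introduction = 1
Chapter 2 Related Work = 4
2.1 Previous Research on Visemes = 4
2.2 Hidden Markov Models = 7
2.2.1 Components = 8
2.2.2 Probability Computation = 10
2.2.3 Optimal State Sequence = 13
2.2.4 Parameter Training = 14
Chapter 3 Viseme Recognition = 18
3.1 Viseme Sets = 18
3.1.1 Monoviseme = 18
3.1.2 Triviseme = 24
3.2 Viseme Recognition = 25
3.2.1 Viseme-Based and Phoneme-Based Approaches = 27
3.2.2 Triviseme-Based and Triphone-Based Approaches = 29
Chapter 4 Experiments and Analysis of Results = 31
4.1 Experimental Environment = 31
4.2 Experimental Method = 32
4.3 Experimental Results = 36
Chapter 5 Conclusions and Future Work = 40
References = 42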