RISS (Research Information Sharing Service)


      Viseme Recognition Using Hidden Markov Models


      https://www.riss.kr/link?id=T10024810

      • Author
      • Publication

        Seoul : Korea University Graduate School, 2003

      • Thesis Information

        Master's thesis -- Korea University Graduate School, Department of Computer Science (Computer Science major), Feb. 2004

      • Publication Year

        2003

      • Language

        Korean

      • Keywords
      • KDC

        413.133 (4th ed.)

      • Country (City) of Publication

        Seoul

      • Physical Description

        vi, 44 p. : illustrations ; 26 cm

      • Notes

        References: p. 42-44

      • Holding Institutions
        • Korea University Science Library
        • Korea University Library
        • Korea University Sejong Academic Information Center

      Additional Information

      Multilingual Abstract

      In this thesis, audio-to-visual conversion techniques for efficient multimedia communications are described. The audio signals are automatically converted to visual images of mouth shapes. Visual images synchronized with the audio signals can provide a user-friendly interface for man-machine interaction. Visual speech can be represented as a sequence of visemes, the generic face images corresponding to particular sounds. HMMs (hidden Markov models) are used to convert the audio signals to a sequence of visemes.
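      The HMM conversion step described above amounts to finding the most likely hidden-state sequence for a run of acoustic observations, typically via the Viterbi algorithm. A minimal sketch with a toy model follows; the state names, probabilities, and three-frame observation sequence are all illustrative assumptions, not the models from the thesis:

```python
import numpy as np

# Hypothetical viseme classes as hidden states (illustrative only).
states = ["closed", "rounded", "spread"]
log_start = np.log([0.5, 0.25, 0.25])
log_trans = np.log([[0.6, 0.2, 0.2],
                    [0.3, 0.5, 0.2],
                    [0.3, 0.2, 0.5]])
# Per-frame emission log-likelihoods; in a real system these would
# come from acoustic models scoring MFCC feature vectors.
log_emit = np.log([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.1, 0.7]])

def viterbi(log_start, log_trans, log_emit):
    """Return the most likely hidden-state index sequence."""
    T, N = log_emit.shape
    delta = np.zeros((T, N))            # best log-score ending in each state
    psi = np.zeros((T, N), dtype=int)   # backpointers
    delta[0] = log_start + log_emit[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans  # N x N predecessor scores
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):       # backtrace
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

print([states[i] for i in viterbi(log_start, log_trans, log_emit)])
# → ['closed', 'rounded', 'spread']
```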
      This study compares four approaches to using HMMs. In the first approach, an HMM is trained for each viseme, and the audio signals are directly recognized as a sequence of visemes. In the second approach, each phoneme is modeled with an HMM, and a general phoneme recognizer produces a phoneme sequence from the audio signals; the phoneme sequence is then converted to a viseme sequence. In the third approach, an HMM is trained for each triviseme, a viseme with its left and right context, and the audio signals are directly recognized as a sequence of trivisemes. In the fourth approach, each triphone is modeled with an HMM, and a general triphone recognizer produces a triphone sequence from the audio signals. The triviseme or triphone sequence is then converted to a viseme sequence. The performance of the four viseme recognition systems is evaluated on the TIMIT speech corpus.
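      The phoneme-to-viseme conversion in the second and fourth approaches is a many-to-one table lookup, since several phonemes share the same mouth shape. A sketch under an assumed grouping (the class names and table below are illustrative, not the thesis's actual viseme inventory):

```python
# Hypothetical many-to-one phoneme-to-viseme table (illustrative only).
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "t": "alveolar", "d": "alveolar", "s": "alveolar", "z": "alveolar",
    "k": "velar", "g": "velar",
    "iy": "spread_vowel", "ih": "spread_vowel",
    "uw": "rounded_vowel", "ow": "rounded_vowel",
}

def phonemes_to_visemes(phonemes):
    """Map a recognized phoneme sequence to a viseme sequence,
    collapsing consecutive identical visemes into one."""
    visemes = []
    for ph in phonemes:
        v = PHONEME_TO_VISEME.get(ph, "neutral")
        if not visemes or visemes[-1] != v:
            visemes.append(v)
    return visemes

# "t" and "s" map to the same viseme, so they merge into one entry.
print(phonemes_to_visemes(["b", "iy", "t", "s"]))
# → ['bilabial', 'spread_vowel', 'alveolar']
```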
      The viseme recognizer shows a 33.9% viseme recognition error rate, the phoneme-based approach a 29.7% error rate, the triviseme-based approach a 22.7% error rate, and the triphone-based approach a 17.4% error rate. When similar viseme classes are merged, the error rates are reduced to 26.9%, 19.6%, 18.8%, and 10.7%, respectively. These results show that the triviseme-based system achieves better accuracy than the monoviseme models.
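      A viseme recognition error rate of this kind is conventionally computed like a word error rate: the Levenshtein edit distance (substitutions, deletions, insertions) between the recognized sequence and the reference, divided by the reference length. A minimal sketch of that standard metric (the thesis's exact scoring procedure is not reproduced here, and the example sequences are made up):

```python
def viseme_error_rate(ref, hyp):
    """(S + D + I) / len(ref), via Levenshtein distance --
    the same style of metric as phoneme or word error rate."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                           # all deletions
    for j in range(n + 1):
        d[0][j] = j                           # all insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[m][n] / m

ref = ["bilabial", "spread_vowel", "alveolar", "rounded_vowel"]
hyp = ["bilabial", "alveolar", "alveolar", "rounded_vowel"]
print(viseme_error_rate(ref, hyp))  # one substitution over four: 0.25
```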

      Table of Contents

      • Abstract = ⅰ
      • Table of Contents = ⅲ
      • Chapter 1 Introduction = 1
      • Chapter 2 Related Work = 4
      • 2.1 Previous Research on Visemes = 4
      • 2.2 Hidden Markov Models = 7
      • 2.2.1 Components = 8
      • 2.2.2 Probability Computation = 10
      • 2.2.3 Optimal State Sequence = 13
      • 2.2.4 Parameter Training = 14
      • Chapter 3 Viseme Recognition = 18
      • 3.1 Viseme Sets = 18
      • 3.1.1 Monovisemes = 18
      • 3.1.2 Trivisemes = 24
      • 3.2 Viseme Recognition = 25
      • 3.2.1 Viseme-Based and Phoneme-Based Methods = 27
      • 3.2.2 Triviseme-Based and Triphone-Based Methods = 29
      • Chapter 4 Experiments and Analysis = 31
      • 4.1 Experimental Environment = 31
      • 4.2 Experimental Method = 32
      • 4.3 Experimental Results = 36
      • Chapter 5 Conclusion and Future Work = 39
      • References = 42