RISS Search - Domestic Journal Articles

1
Group-based speaker embeddings for text-independent speaker verification

Jung, Youngmoon; Eom, Youngsik; Lee, Yeonghyeon; Kim, Hoirin. The Acoustical Society of Korea (한국음향학회) 2021 The Journal of the Acoustical Society of Korea (韓國音響學會誌) Vol.40 No.5
Recently, the deep speaker embedding approach has been widely used in text-independent speaker verification, where it outperforms the traditional i-vector approach. In this work, to improve the deep speaker embedding approach, we propose a novel method called group-based speaker embedding, which incorporates group information. We cluster all speakers in the training data into a predefined number of groups in an unsupervised manner, so that a fixed-length group embedding represents each group. A Group Decision Network (GDN) produces the group weights, and an aggregated group embedding is computed as the sum of the group embeddings weighted by those group weights. Finally, we generate a group-based embedding by adding the aggregated group embedding to the deep speaker embedding. In this way, the speaker embedding reduces the search space of the speaker identity by incorporating group information, and can therefore flexibly represent a large number of speakers. We conducted experiments on the VoxCeleb1 database to show that the proposed approach improves on previous approaches.
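The combination step described in this abstract can be illustrated with a minimal sketch. It assumes the group embeddings and the GDN's raw scores are already available, and it assumes the scores are normalized with a softmax, which the abstract does not specify. The names `softmax` and `group_based_embedding` are hypothetical and not taken from the paper.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1 (an assumed choice)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def group_based_embedding(speaker_emb, group_embs, group_scores):
    """Add the weighted sum of the group embeddings to the speaker embedding.

    speaker_emb:  deep speaker embedding, a list of floats of length D
    group_embs:   one fixed-length embedding (length D) per group
    group_scores: hypothetical raw GDN outputs, one score per group
    """
    weights = softmax(group_scores)
    dim = len(speaker_emb)
    # Aggregated group embedding: weighted sum over groups, per dimension.
    aggregated = [
        sum(w * g[d] for w, g in zip(weights, group_embs))
        for d in range(dim)
    ]
    # Group-based embedding: element-wise sum of the two vectors.
    return [s + a for s, a in zip(speaker_emb, aggregated)]


if __name__ == "__main__":
    speaker = [1.0, 2.0]
    groups = [[1.0, 0.0], [0.0, 1.0]]
    print(group_based_embedding(speaker, groups, [0.0, 0.0]))
```

With equal scores the two groups receive equal weight, so the aggregated group embedding is simply the mean of the group embeddings before it is added to the speaker embedding.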
2
A study on end-to-end synthesis methods for a Korean text-to-speech (TTS) system

Choi, Yeunju; Jung, Youngmoon; Kim, Younggwan; Suh, Youngjoo; Kim, Hoirin. The Korean Society of Speech Sciences (한국음성학회) 2018 Phonetics and Speech Sciences (말소리와 음성과학) Vol.10 No.1
A typical statistical parametric speech synthesis (text-to-speech, TTS) system consists of separate modules, such as a text analysis module, an acoustic modeling module, and a speech synthesis module. This causes two problems: 1) expert knowledge of each module is required, and 2) errors generated in each module accumulate as they pass through the subsequent modules. An end-to-end TTS system can avoid these problems by synthesizing voice signals directly from an input string. In this study, we implemented an end-to-end Korean TTS system using Google’s Tacotron, an end-to-end TTS system based on a sequence-to-sequence model with an attention mechanism. We used 4392 utterances spoken by a Korean female speaker, an amount that corresponds to 37% of the dataset Google used for training Tacotron. Our system obtained a mean opinion score (MOS) of 2.98 and a degradation mean opinion score (DMOS) of 3.25. We discuss the factors that affected the training of the system. Experiments demonstrate that the post-processing network needs to be designed with the output language and the input characters in mind, and that, depending on the amount of training data, the maximum value of n for the n-grams modeled by the encoder should be small enough.
3
A study on the power quality characteristics of a high-efficiency power transmission device

Chanhyeok Kim; Yongpeel Wang; Youngmoon Jung; Seungho Kim; Heontae Lee; Daeseok Rho. The Korean Institute of Electrical Engineers (대한전기학회) 2016 Proceedings of the Korean Institute of Electrical Engineers Conference (대한전기학회 학술대회 논문집) Vol.2016 No.11