RISS (Academic Research Information Service)

      Indexed in: KCI (listed), SCOPUS

      Proposal of speaker change detection system considering speaker overlap (Korean title: 화자 겹침을 고려한 화자 전환 검출 시스템 제안)

      https://www.riss.kr/link?id=A107876172

      Multilingual Abstract

      Speaker Change Detection (SCD) refers to finding the moment when the main speaker changes from one person to the next in a spoken conversation. SCD is difficult because of overlapping speakers, inaccurate labels, and data imbalance. To address these problems, utterances from the TIMIT corpus, which is widely used in speech recognition, were artificially concatenated to obtain a sufficient amount of training data, and speaker changes were detected after overlapping speakers had been identified. In this paper, we propose a speaker change detection system that takes speaker overlap into account. We evaluated and verified its performance using various approaches. As a result, we propose a detector similar in structure to the X-Vector model to remove speaker-overlap regions, and we select a Bi-LSTM to model speaker change detection. The experimental results show relative performance improvements of 4.6% and 13.8%, respectively, compared to the baseline system. Based on these results, we also conclude that follow-up studies that take text and speaker information into account could yield a more robust speaker change detection system.
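      The data-preparation idea in the abstract (artificially concatenating single-speaker utterances and marking overlap before detecting speaker changes) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: `Utterance`, `build_labels`, the fixed `overlap_frames` parameter, and the convention of marking a change at the first frame of the new speaker are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Utterance:
    """Hypothetical stand-in for one single-speaker utterance (e.g. from TIMIT)."""
    speaker: str
    num_frames: int


def build_labels(utts: List[Utterance], overlap_frames: int = 0) -> Tuple[List[int], List[int]]:
    """Place utterances back to back, letting each one start `overlap_frames`
    frames before the previous one ends, and return two frame-level label lists:
    overlap[t] = 1 if more than one utterance is active at frame t,
    change[t]  = 1 at the first frame of an utterance whose speaker differs
                 from the previous utterance's speaker (assumed convention).
    """
    if overlap_frames < 0 or any(u.num_frames <= overlap_frames for u in utts):
        raise ValueError("overlap_frames must be >= 0 and shorter than every utterance")

    # Compute the start frame of every utterance on the shared timeline.
    starts = []
    pos = 0
    for i, u in enumerate(utts):
        start = pos if i == 0 else pos - overlap_frames
        starts.append(start)
        pos = start + u.num_frames
    total = pos

    active = [0] * total   # number of utterances active at each frame
    change = [0] * total
    for i, (u, s) in enumerate(zip(utts, starts)):
        for t in range(s, s + u.num_frames):
            active[t] += 1
        if i > 0 and u.speaker != utts[i - 1].speaker:
            change[s] = 1

    overlap = [1 if a > 1 else 0 for a in active]
    return overlap, change


# Two utterances from different speakers, overlapping by one frame.
overlap, change = build_labels([Utterance("A", 4), Utterance("B", 3)], overlap_frames=1)
```

      In a setup like this, the `overlap` labels would serve as targets for an overlap detector and the `change` labels as targets for a sequence model such as a Bi-LSTM; how the paper actually defines its labels is not stated in the abstract.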

      References

      1 R. Yin, "Speaker change detection in broadcast TV using bidirectional long short-term memory networks" 3827-3831, 2017

      2 "WebRTC Homepage"

      3 S. C. Levinson, "Turn-taking in human communication - Origins and implications for language processing" 20 : 6-14, 2016

      4 H. Bredin, "TristouNet: Triplet loss for speaker turn embedding" 5430-5434, 2017

      5 V. Zue, "Speech database development at MIT: TIMIT and beyond" 9 : 351-356, 1990

      6 Z. Ge, "Speaker change detection using features through a neural network speaker classifier" 1111-1116, 2017

      7 L. Bullock, "Overlap-aware diarization: Resegmentation using neural end-to-end overlapped speech detection" 7114-7118, 2020

      8 N. Sajjan, "Leveraging LSTM models for overlap detection in multi-party meetings" 5249-5253, 2018

      9 H. Kim, "Framework switching of speaker overlap detection system" 17 : 101-113, 2021

      10 M. Kunesova, "Detection of overlapping speech for the purposes of speaker diarization" 247-257, 2019

      11 V. Andrei, "Detecting overlapped speech on short time frames using deep learning" 1198-1202, 2017

      12 J. Park, "Data augmentation and d-vector representation methods for speaker change detection" 67-71, 2020

      13 E. Kazimirova, "Automatic detection of multi speaker fragments with high time resolution" 1338-1392, 2018

      14 A. G. Adam, "A new speaker change detection method for two-speaker segmentation" 3908-3911, 2002

      15 D. Snyder, "X-vectors: Robust DNN embeddings for speaker recognition" 5329-5333, 2018

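      Journal History

      Date          Event                   Details                                                                  Listing status
      2026          Evaluation scheduled    Eligible to apply for re-accreditation evaluation (re-accreditation)
      2020-01-01    Evaluation              Listed-journal status maintained (re-accreditation)                      KCI-listed
      2017-01-01    Evaluation              Listed-journal status maintained (continuing evaluation)                 KCI-listed
      2013-01-01    Evaluation              Listed-journal status maintained (listing maintained)                    KCI-listed
      2010-01-01    Evaluation              Listed-journal status maintained (listing maintained)                    KCI-listed
      2008-01-01    Evaluation              Listed-journal status maintained (listing maintained)                    KCI-listed
      2006-01-01    Evaluation              Listed-journal status maintained (listing maintained)                    KCI-listed
      2004-01-01    Evaluation              Listed-journal status maintained (listing maintained)                    KCI-listed
      2001-07-01    Evaluation              Selected as listed journal (candidate, 2nd round)                        KCI-listed
      1999-01-01    Evaluation              Selected as candidate journal (new evaluation)                           KCI candidate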

      Journal Citation Information

      Base year: 2016

      Metric                               Value
      WOS-KCI combined IF (2-year)         0.23
      KCIF (2-year)                        0.23
      KCIF (3-year)                        0.22
      KCIF (4-year)                        0.2
      KCIF (5-year)                        0.18
      Centrality index (3-year)            0.398
      Immediacy index                      0.07
