구명완,Koo, Myoung-Wan The Acoustical Society of Korea 2021 韓國音響學會誌 Vol.40 No.5
We propose a speech recognition system based on the Conformer. The Conformer is a convolution-augmented transformer that combines a transformer model, which captures global information, with a Convolutional Neural Network (CNN), which exploits local features effectively. The baseline system is a transformer-based speech recognizer with a Long Short-Term Memory (LSTM)-based language model. The proposed system uses a Conformer instead of the transformer, together with a transformer-based language model. On the Electronics and Telecommunications Research Institute (ETRI) speech corpus in AI-Hub, the proposed system yields a Character Error Rate (CER) of 5.7 %, while the baseline system yields a CER of 11.8 %. Even when the speech corpus is extended to another AI-Hub domain, such as the NHNdiguest speech corpus, the proposed system performs robustly across the two domains. These experiments validate the proposed system.
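The CER figures quoted above are typically computed as the character-level edit distance between the reference and the recognized text, divided by the number of reference characters. The sketch below illustrates that general definition only; the function names and the choice to ignore spaces are assumptions made here, not details taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between two strings."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str], ignore_spaces: bool = True) -> float:
    """Corpus-level Character Error Rate in percent.

    ignore_spaces is an assumption of this sketch: whether spaces count
    as characters depends on the evaluation protocol.
    """
    errors = 0
    chars = 0
    for ref, hyp in zip(refs, hyps):
        if ignore_spaces:
            ref = ref.replace(" ", "")
            hyp = hyp.replace(" ", "")
        errors += edit_distance(ref, hyp)
        chars += len(ref)
    if chars == 0:
        raise ValueError("reference text is empty")
    return 100.0 * errors / chars
```

For example, `cer(["안녕하세요"], ["안녕하세오"])` counts one substituted syllable out of five reference characters.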
A Comparative Study of Speaker Adaptation Methods for HMM-Based Speech Recognition
구명완,은종관,이황수,Koo, Myoung-Wan,Un, Chong-Kwan,Lee, Hwang-Soo The Acoustical Society of Korea 1991 韓國音響學會誌 Vol.10 No.3
In this paper, we compare the performance of speaker adaptation algorithms that operate in two stages in an HMM-based speech recognition system. The first stage consists of VQ adaptation methods, which reduce the distortion error for a new speaker; we compare three of them: label prototype adaptation, adaptation with a VQ codebook built from the adaptation speech itself, and adaptation with a mapped codebook. The second stage consists of HMM parameter adaptation methods, which transform the HMM parameters for a new speaker; we compare four of them: adaptation by the Viterbi algorithm, by the DTW algorithm, by the iterative alignment algorithm, and by the fuzzy histogram algorithm. The results show that adaptation based on the fuzzy histogram algorithm yields the highest recognition accuracy.
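The distortion error that the first stage aims to reduce can be illustrated with a plain vector quantizer: each feature frame is mapped to its nearest codeword, and the average distance to that codeword is the distortion. The sketch below shows only this measurement, not any of the adaptation methods compared in the paper; the squared Euclidean distance and all names and values are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_codeword(frame: Vector, codebook: List[Vector]) -> Tuple[int, float]:
    """Return (index, squared Euclidean distance) of the closest codeword."""
    best_idx, best_dist = -1, float("inf")
    for idx, codeword in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(frame, codeword))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx, best_dist


def average_distortion(frames: List[Vector], codebook: List[Vector]) -> float:
    """Mean quantization distortion of the frames under the given codebook."""
    if not frames:
        raise ValueError("no frames given")
    return sum(nearest_codeword(f, codebook)[1] for f in frames) / len(frames)


# Hypothetical data: a codebook trained on a reference speaker and
# two frames from a new speaker whose features are slightly shifted.
reference_codebook = [(0.0, 0.0), (10.0, 10.0)]
new_speaker_frames = [(1.0, 1.0), (9.0, 11.0)]
print(average_distortion(new_speaker_frames, reference_codebook))
```

A codebook that matches the new speaker better gives a lower value from `average_distortion`, which is the quantity a VQ adaptation stage tries to drive down.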
구명완(Myoung Wan Koo),이우원(Woo Won Lee),임계영(Kye Young Lim) 전력전자학회 2008 전력전자학술대회 논문집 Vol.- No.-
The conventional high-efficiency inverter system (the power restoration process) has the advantage of reducing energy consumption compared with the conventional inverter (the resistance damping process). However, it requires 20~30% more power facility capacity than the resistance damping inverter, and during repair and remodeling work the power facility capacity is not easy to increase. We therefore propose a system that allows the high-efficiency inverter to be applied without increasing the power facility capacity.
구명완(Myoung Wan Koo) 한국어학회 2001 한국어학 Vol.13 No.-
In this paper, we define speech recognition technology and present its application services currently operating over the telephone network. Among the basic problems of speech recognition technology, we also highlight some research issues that relate specifically to Korean. Speech recognition technology, which will eventually change our future lifestyle, is still at an early stage. We present three services developed by Korea Telecom: voice dialing, railroad information, and name dialing. We also describe the research issues identified through these services.
A Voice Dialing System Using Distributed Speech Recognition in a Wireless LAN Environment
박성준,구명완,Park Sung-Joon,Koo Myoung-Wan 대한음성학회 2005 말소리 Vol.56 No.-
In this paper, a WiFi phone system with distributed speech recognition is implemented. The WiFi phone with voice-activated dialing and its functions are explained. Features of the input speech are extracted on the phone and sent to the interactive voice response (IVR) server over the real-time transport protocol (RTP). Feature extraction is based on the European Telecommunications Standards Institute (ETSI) standard front-end, but is modified to reduce the processing time. The front-end processing time on the WiFi phone is compared with that on a PC.
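To make the idea of sending features over RTP concrete, the sketch below wraps one feature vector in a packet with the standard 12-byte fixed RTP header. It is an illustration only: the dynamic payload type 96, the uncompressed 32-bit float payload, and all function names are assumptions made here, and the payload format actually used by the system is not described in the abstract.

```python
import struct
from typing import Dict, Sequence

RTP_VERSION = 2
HEADER_FORMAT = "!BBHII"  # flags, marker/payload type, sequence, timestamp, SSRC
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 bytes


def build_rtp_packet(features: Sequence[float], seq: int, timestamp: int,
                     ssrc: int, payload_type: int = 96) -> bytes:
    """Pack one feature vector behind a fixed RTP header (no padding, extension, or CSRCs)."""
    byte0 = RTP_VERSION << 6      # version in the top two bits, other flags zero
    byte1 = payload_type & 0x7F   # marker bit cleared
    header = struct.pack(HEADER_FORMAT, byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    # Hypothetical payload layout: big-endian 32-bit floats, one per feature.
    payload = struct.pack(f"!{len(features)}f", *features)
    return header + payload


def parse_rtp_packet(packet: bytes) -> Dict[str, object]:
    """Inverse of build_rtp_packet, as a server-side counterpart."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack(HEADER_FORMAT, packet[:HEADER_SIZE])
    count = (len(packet) - HEADER_SIZE) // 4
    features = struct.unpack(f"!{count}f", packet[HEADER_SIZE:])
    return {"version": byte0 >> 6, "payload_type": byte1 & 0x7F,
            "seq": seq, "timestamp": timestamp, "ssrc": ssrc, "features": features}
```

The sequence number and timestamp let the receiver detect lost or reordered frames, which matters when the packets travel over a wireless link.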
A Korean Menu-Ordering Sentence Text-to-Speech System Using Conformer-Based FastSpeech2
최예린,장재후,구명완,Choi, Yerin,Jang, JaeHoo,Koo, Myoung-Wan The Acoustical Society of Korea 2022 韓國音響學會誌 Vol.41 No.3
In this paper, we present a Korean menu-ordering sentence Text-to-Speech (TTS) system using Conformer-based FastSpeech2. The Conformer is a convolution-augmented transformer that was originally proposed for speech recognition. By combining two different structures, the Conformer extracts better local and global features. It comprises two half Feed Forward modules, one at the front and one at the end, sandwiching the Multi-Head Self-Attention module and the Convolution module. We introduce the Conformer to Korean TTS because it is known to work well in Korean speech recognition. To compare the transformer-based TTS model with the Conformer-based one, we train both FastSpeech2 and Conformer-based FastSpeech2. We collected a phoneme-balanced data set and used it to train our models. This corpus comprises not only general conversation but also menu-ordering conversation consisting mainly of loanwords, and it is intended to address the degradation of current Korean TTS models on loanwords. When synthesized speech was generated with Parallel WaveGAN, the Conformer-based FastSpeech2 achieved a superior Mean Opinion Score (MOS) of 4.04. We confirm that model performance improved when the structure was changed from transformer to Conformer in Korean TTS.
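The sandwich structure described above can be sketched as a chain of residual connections in which the two Feed Forward modules each contribute half of their output. The code below shows only this wiring: the four sub-modules are placeholder callables supplied by the caller, the input is simplified to a flat list of floats, and the final layer normalization follows the original Conformer design as generally described rather than anything stated in this abstract.

```python
from typing import Callable, List

Vector = List[float]
Module = Callable[[Vector], Vector]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / (var + eps) ** 0.5 for v in x]


def conformer_block(x: Vector, ffn_in: Module, self_attention: Module,
                    conv: Module, ffn_out: Module,
                    final_norm: bool = True) -> Vector:
    """Residual wiring of one Conformer block with placeholder sub-modules."""
    x = [a + 0.5 * b for a, b in zip(x, ffn_in(x))]   # first half Feed Forward
    x = [a + b for a, b in zip(x, self_attention(x))]  # Multi-Head Self-Attention
    x = [a + b for a, b in zip(x, conv(x))]            # Convolution module
    x = [a + 0.5 * b for a, b in zip(x, ffn_out(x))]   # second half Feed Forward
    return layer_norm(x) if final_norm else x


# Placeholder module standing in for real trained layers.
def identity(v: Vector) -> Vector:
    return list(v)


print(conformer_block([1.0, 2.0], identity, identity, identity, identity,
                      final_norm=False))
```

In a real model each placeholder would be a trained layer operating on a sequence of frames; the half-weighted Feed Forward residuals are what distinguish this block from a standard transformer layer with a single Feed Forward module.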
박성준,김재인,구명완,전주식,Park Sung-Joon,Kim Jae-In,Koo Myoung-Wan,Jhon Chu-Shik 대한음성학회 2004 말소리 Vol.51 No.-
A weather forecast service with speech recognition is described. With a single phone call, users can get the weather information for any city by saying the city name, which was not possible in the previous weather forecast service. Speech recognition is implemented in the intelligent peripheral (IP) of the advanced intelligent network (AIN). The AIN is a telephone network architecture that separates service logic from switching equipment, allowing new services to be added without redesigning the switches. Experiments in speech recognition show that the recognition accuracy is 90.06% for the general users' speech database. For the laboratory members' speech database, the accuracies are 95.04% in simulation and 93.81% in the test on the developed system.