Automatic Music Summarization Using Vector Quantization and Segment Similarity
Kim, Sang-Ho,Kim, Sung-Tak,Kim, Hoi-Rin The Acoustical Society of Korea 2008 韓國音響學會誌 Vol.27 No.e2
In this paper, we propose an effective method for music summarization that automatically extracts a representative part of a piece of music using signal processing technology. The proposed method uses a vector quantization technique to extract several segments that can be regarded as the most important content in the music. In general, music contains repetitive patterns, and listeners usually recognize the most important or catchy tune from those patterns. Thus, the repetition extracted using segment similarity is taken to represent the music summary. The extracted segments are then combined to generate a complete music summary. Experiments show that the proposed method captures the main theme of the music more effectively than conventional methods. The experimental results also show that the proposed method could be used in real-time applications, since it generates a music summary much faster than other methods.
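The general idea in the abstract above, quantizing frames against a codebook and then looking for repetition through segment similarity, can be sketched as follows. This is only an illustrative sketch, not the authors' algorithm: the fixed-length segmentation, the codeword-histogram representation, the cosine similarity measure, and all function names are assumptions introduced here, and the codebook is taken as given rather than trained.

```python
import math


def nearest_codeword(frame, codebook):
    """Index of the codeword closest to frame (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(frame, codebook[k])),
    )


def segment_histograms(frames, codebook, segment_len):
    """Quantize every frame, then count codeword usage per fixed-length segment.

    Assumption: segments are non-overlapping and segment_len frames long;
    trailing frames that do not fill a whole segment are ignored.
    """
    labels = [nearest_codeword(f, codebook) for f in frames]
    histograms = []
    for start in range(0, len(labels) - segment_len + 1, segment_len):
        counts = [0] * len(codebook)
        for label in labels[start:start + segment_len]:
            counts[label] += 1
        histograms.append(counts)
    return histograms


def cosine(u, v):
    """Cosine similarity of two histograms (0.0 if either one is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_repeated_segment(histograms):
    """Index of the segment with the highest total similarity to all others.

    Ties are resolved in favour of the earliest segment.
    """
    scores = [
        sum(cosine(h, other) for j, other in enumerate(histograms) if j != i)
        for i, h in enumerate(histograms)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch, the segment returned by `most_repeated_segment` plays the role of a summary candidate. A real system would work on spectral features such as MFCCs and would train the codebook, for example with k-means.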
Kim, Hoi-Cheol,Kim, In-Soo,Park, Jong-Bae,Shin, Joong-Rin The Korean Institute of Electrical Engineers 2001 KIEE International Transactions on Power Engineeri Vol.a11 No.4
The impact evaluation of a DSM program is a very important issue, since the results are used to determine the sustainability of the program. In general, estimating the impacts of a DSM program requires measuring the changes in electricity usage before and after the program. Since measurement-based approaches are costly, most conventional evaluations are based on average figures. However, average-based impact estimation can lead both to distorted results, with kW and kWh savings over- or underestimated, and to non-optimal DSM planning. In this paper, we develop a new multi-point measurement approach that can evaluate the kW and kWh savings of a DSM program more accurately. To do this, the saving rate and operating rate are defined as functions of a customer's load factor, and these rates are combined with the conventional Bass diffusion function to project the future impacts of a DSM program. A case study is performed on the inverter program of Korea using the suggested approach.
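As a rough illustration of how a Bass diffusion curve can drive a savings projection, the sketch below implements the standard Bass cumulative adoption fraction F(t) = (1 - e^{-(p+q)t}) / (1 + (q/p) e^{-(p+q)t}). The abstract does not say how the saving rate and operating rate enter the projection, so the simple product in `projected_kw_savings`, along with every parameter name, is an assumption made for illustration only.

```python
import math


def bass_cumulative_fraction(t, p, q):
    """Cumulative adoption fraction F(t) of the Bass diffusion model.

    p is the coefficient of innovation and q the coefficient of imitation.
    """
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def projected_kw_savings(t, market_size, p, q, kw_per_unit,
                         saving_rate, operating_rate):
    """Hypothetical kW savings at time t.

    Assumption (not from the paper): savings are the number of adopters times
    the unit capacity, scaled by a saving rate and an operating rate that are
    passed in as plain numbers rather than as functions of load factor.
    """
    adopters = market_size * bass_cumulative_fraction(t, p, q)
    return adopters * kw_per_unit * saving_rate * operating_rate
```

For example, `projected_kw_savings(5, 10_000, 0.03, 0.38, 2.0, 0.25, 0.6)` would project savings five periods after launch under these made-up parameter values.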
HMnet Evaluation for Phonetic Environment Variations of Training Data in Speech Recognition
Kim, Hoi-Rin The Acoustical Society of Korea 1996 韓國音響學會誌 Vol.15 No.e4
In this paper, we propose a new evaluation methodology that shows more clearly the performance of the allophone modeling algorithm generally used in large-vocabulary speech recognition. The proposed evaluation method reveals the running characteristics and limitations of the modeling algorithm by testing how variation in the phonetic environments of the training data affects the recognition performance and the desirable number of free parameters to be estimated. From the experimental results obtained with the method, we conclude that, in the vocabulary-independent recognition task, the phonetic diversity of the training data greatly affects the robustness of the model, and that it is necessary to develop a proper measure that can determine the number of states, balancing the robustness and the precision of the HMnet, better than the conventional modeling efficiency.
Robust Histogram Equalization Using Compensated Probability Distribution
Kim, Sung-Tak,Kim, Hoi-Rin The Korean Society Of Phonetic Sciences And Speech 2005 말소리 Vol.55 No.-
A mismatch between the training and test conditions often causes a drastic decrease in the performance of speech recognition systems. In this paper, non-linear transformation techniques based on histogram equalization in the acoustic feature space are studied as a way of reducing this mismatch. The purpose of histogram equalization (HEQ) is to convert the probability distribution of the test speech into the probability distribution of the training speech. Conventional histogram equalization methods consider only the probability distribution of the test speech, but when the test speech is corrupted by noise, its probability distribution is also distorted. The transformation function obtained from this distorted probability distribution may mistransform the feature vectors, which degrades the performance of histogram equalization. Therefore, this paper proposes a new method of calculating a noise-removed probability distribution, based on the assumption that the CDF of the noisy speech feature vectors consists of a speech feature component and a noise feature component; this compensated probability distribution is then used in the HEQ process. In the AURORA-2 framework, the proposed method reduced the error rate by over 44% in the clean training condition compared to the baseline system. In the multi-condition training case, the proposed method also outperforms the baseline system.
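The conventional HEQ step described above, mapping each test value through the test CDF and then through the inverse of the training CDF, can be sketched for a single feature dimension as follows. This is a minimal sketch of plain CDF matching with empirical distributions. It does not implement the compensated probability distribution the paper proposes, and the function name is an assumption.

```python
from bisect import bisect_right


def histogram_equalize(test_values, train_values):
    """Map each test value to the training value at the same CDF level.

    For a test value x, rank / n is its empirical CDF in the test data, where
    rank is the number of test samples <= x. The output is the smallest
    training sample whose empirical CDF is at least that level.
    """
    test_sorted = sorted(test_values)
    train_sorted = sorted(train_values)
    n, m = len(test_sorted), len(train_sorted)
    equalized = []
    for x in test_values:
        rank = bisect_right(test_sorted, x)    # 1..n because x is in the data
        index = (rank * m + n - 1) // n - 1    # ceil(rank * m / n) - 1, in integers
        equalized.append(train_sorted[index])
    return equalized
```

Integer rank arithmetic is used instead of floating-point CDF values so that the inverse lookup is not thrown off by rounding. In practice the mapping would be applied separately to each cepstral coefficient.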
Unseen Model Estimation Method Using an Optimal Decision Tree
Kim Sungtak,Kim Hoi-Rin The Korean Society Of Phonetic Sciences And Speech 2003 말소리 Vol.45 No.-
In recent years, decision tree-based state tying has become the most popular approach for clustering the states of context-dependent hidden Markov models in speech recognition. The aims of state tying are to reduce the number of free parameters and to predict the state probability distributions of unseen models. When performing state tying, however, the size of the decision tree is very important for word-independent recognition. In this paper, we try to construct an optimized decision tree based on the average of the feature vectors in the state pool and the number of seen models. We observed that the proposed optimal decision tree is effective in predicting the state probability distributions of unseen models.
Variable Vocabulary Word Recognizer Using Phonetic Knowledge-Based Allophone Models
Kim, Hoi-Rin,Lee, Hang-Seop The Acoustical Society of Korea 1997 韓國音響學會誌 Vol.16 No.2
In this paper, we describe the development of a variable vocabulary word recognizer that can recognize arbitrary new words that do not appear in the training speech data. A variable vocabulary word recognizer requires an on-line lexicon generator that immediately converts new candidate words into the corresponding pronunciation sequences of phones, without any large lexicon table. It also requires reliable phone and allophone models that can model each word from the lexicon generator's output. To obtain such reliable models, we define Korean allophones by triphone clustering based on phonetic knowledge of the preceding and succeeding phones of each phone. Applying this clustering method to our laboratory's POW (Phonetically Optimized Words) 3,848-word DB, we generated 1,548 allophone models. We implemented the variable vocabulary word recognizer on this basis and evaluated it on the POW 3,848 DB, the PBW (Phonetically Balanced Words) 445 DB, and a 244-word DB for a hotel reservation task. Experimental results showed a word recognition accuracy of 79.6% for the POW DB, corresponding to the vocabulary-dependent case; 79.4% with the 445-word lexicon and 88.9% with the 100-word lexicon for the PBW DB; and 71.4% for the hotel reservation DB, corresponding to the vocabulary-independent case.