Deep CNNs Along the Time Axis With Intermap Pooling for Robustness to Spectral Variations
Lee, Hwaran; Kim, Geonmin; Kim, Ho-Gyeong; Oh, Sang-Hoon; Lee, Soo-Young. IEEE, 2016. IEEE Signal Processing Letters Vol.23 No.10
<P>Convolutional neural networks (CNNs) with convolutional and pooling operations along the frequency axis have been proposed to attain invariance to frequency shifts of features. However, this is at odds with the fact that acoustic features do not simply shift but vary in shape across frequency. In this paper, we contend that convolution along the time axis is more effective. We also propose the addition of an intermap pooling (IMP) layer to deep CNNs. In this layer, filters in each group extract common but spectrally variant features, and the layer then pools the feature maps of each group. As a result, the proposed IMP CNN can achieve insensitivity to the spectral variations characteristic of different speakers and utterances. The effectiveness of the IMP CNN architecture is demonstrated on several LVCSR tasks. Even without speaker adaptation techniques, the architecture achieved a WER of 12.7% on the SWB part of the Hub5’2000 evaluation test set, which is competitive with other state-of-the-art methods.</P>
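The intermap pooling operation described in the abstract, max-pooling across the feature maps within each filter group rather than across spatial positions, can be sketched roughly as follows. The function name, NumPy setting, and group layout are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def intermap_pool(feature_maps, group_size):
    """Sketch of intermap pooling (IMP): max-pool across the feature
    maps within each group, so spectrally variant responses to the
    same underlying feature collapse into one pooled map.

    feature_maps: array of shape (num_maps, time, freq), where
    num_maps is divisible by group_size.
    Returns an array of shape (num_maps // group_size, time, freq).
    """
    n, t, f = feature_maps.shape
    # Split the maps into groups of size group_size, then take the
    # elementwise maximum over each group.
    grouped = feature_maps.reshape(n // group_size, group_size, t, f)
    return grouped.max(axis=1)
```

In a full network this pooling would sit after a convolutional layer along the time axis; only the grouping-and-max idea is shown here.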
A Calibration Method for Eye-Gaze Estimation Systems Based on 3D Geometrical Optics
Hwaran Lee; Iqbal, Nadeem; Wonil Chang; Soo-Young Lee. IEEE, 2013. IEEE Sensors Journal Vol.13 No.9
<P>A new calibration method is presented for robust eye-gaze estimation systems which are utilized to understand the human mind. Although current eye-gaze systems with one camera and a few infrared light sources have been developed to allow users' head motion, they still require one to know the relative positions between the camera and light sources with very high accuracy. The developed calibration method utilizes a three-dimensional geometrical relationship between the light source positions and corresponding camera images reflected on a mirror at several positions and rotations. The best estimates of the light source positions are obtained from noisy measurements by minimizing a cost function, which ensures the integrity of the camera and light sources. The developed calibration method makes it possible to convert many camera-and-display devices into robust eye-gaze estimation systems.</P>
Eunkyung Kang; Kyung Young Lee; Minwoo Lee; Sung-Byung Yang; Hwaran Lee. Smart Tourism Research Center, 2023. Journal of Smart Tourism Vol.3 No.2
<P>Due to population aging, a new consumer segment known as the "new-silver generation" is emerging. Unlike the previous silver generation, this generation possesses significant economic power and willingness to consume, attracting attention from consumer goods companies. However, both the new-silver generation and the elderly face challenges in adopting contactless or self-service technologies such as self-order kiosks, resulting in negative reactions. Therefore, this study investigates the attitudes and responses of the new-silver generation toward kiosks, as well as the factors influencing their resistance to such technology. By applying theoretical perspectives from the innovation resistance model, technostress theory, and the value-based model, this study identifies factors influencing innovation resistance among the new-silver generation when using contactless technologies implemented in fast-food restaurants. The findings indicate that a lower awareness of new technologies and services corresponds to decreased adoption resistance, while a higher perceived value leads to more positive behaviors and attitudes among the new-silver generation using kiosks at fast-food restaurants.</P>
Smart user interface for mobile consumer devices using model-based eye-gaze estimation
Iqbal, N.; Hwaran Lee; Soo-Young Lee. IEEE, 2013. IEEE Transactions on Consumer Electronics Vol.59 No.1
<P>A smart user interface for mobile consumer devices was developed using a robust eye-gaze system without any hand motion. Using one camera and one display already available in popular mobile devices, the eye-gaze system estimates the visual angle, which shows the area of interest on the display to indicate the position of the cursor. Three novel techniques were developed to make the system robust, user-independent, and head/device motion invariant. First, by carefully investigating the geometric relation between the device and the user's cornea, a new algorithm was developed to estimate the cornea center position, which is directly related to the optical axis of the eye. Unlike previous algorithms, it does not utilize the user-dependent cornea radius. Second, to make the system robust for practical application, an algorithm was developed to compensate for imaging position errors due to the finite camera resolution. Third, a binocular algorithm was developed to estimate the user-dependent angular offsets between the optical and visual axes with only single-point calibration. The proposed system was demonstrated to be accurate enough for many practical mobile user interfaces.</P>
Rescoring of N-Best Hypotheses Using Top-Down Selective Attention for Automatic Speech Recognition
Kim, Ho-Gyeong; Lee, Hwaran; Kim, Geonmin; Oh, Sang-Hoon; Lee, Soo-Young. IEEE Signal Processing Society, 2018. IEEE Signal Processing Letters Vol.25 No.2
<P>In this letter, we propose an <I>N</I>-best rescoring system that integrates attentional information, for locally confusing words extracted from alternative hypotheses, into a conventional speech recognition system. The attentional information is derived by adapting a test input feature for the word of interest, which is motivated by the top-down selective attention mechanism of the brain. To rescore the competing hypotheses, we define a new confidence measure that contains both the conventional posterior probability and the attentional information for the confusing words. In addition, a neural network is designed to provide different weights within the confidence measure for each utterance. The network is then optimized to minimize the word error rates. Tests on the Wall Street Journal and Aurora4 speech recognition tasks were conducted, and our best results achieve word error rates of 3.83% and 11.09%, yielding relative reductions of 5.20% and 2.55% over the baselines, respectively.</P>
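The combined confidence measure described in the abstract, a posterior term and an attentional term mixed under an utterance-dependent weight, might look roughly like the sketch below. The function name and tuple layout are illustrative assumptions; in the letter the weight is produced by a trained neural network, whereas here it is a plain parameter.

```python
def rescore_nbest(hypotheses, w):
    """Return the hypothesis text with the highest combined confidence.

    hypotheses: list of (text, posterior_score, attention_score) tuples,
                one per N-best entry (hypothetical layout).
    w: utterance-dependent interpolation weight in [0, 1]; in the letter
       it is predicted per utterance by a small network trained to
       minimize WER, here it is supplied directly for illustration.
    """
    # Hypothetical combined confidence: a weighted sum of the
    # conventional posterior term and the top-down attentional term.
    return max(hypotheses, key=lambda h: w * h[1] + (1.0 - w) * h[2])[0]
```

With `w` near 1 the ranking reduces to the conventional posterior; smaller `w` lets the attention-derived evidence for confusing words override it.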