A Robust Facial Expression Recognition Algorithm Based on Multi-Rate Feature Fusion Scheme = 다중비율 특징 혼합 기법 기반의 강인한 얼굴 표정 인식 알고리즘

Multilingual Abstract

In recent years, the importance of recognizing human emotions has grown as the field of Artificial Intelligence (AI) has developed. Facial Expression Recognition (FER) is the task of understanding human emotion through facial expressions. The technique is widely used in various fields because it can identify a person's expression from the face and support customized services.

With the growth of deep learning, several approaches have been proposed to recognize human expressions. In the early stage, most studies used classical features such as holistic or local features of the face. Over time, various neural networks appeared, such as the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM). Classical features then became inputs fed to these networks, which automatically discover the representations needed for object detection or classification.

In this thesis, we propose an algorithm that efficiently classifies expressions by feeding diverse and reinforced features to a joint fusion classifier. We design the inputs to the multi-depth networks to overlap minimally so that they provide diverse information to the networks. To utilize a 3D CNN, we propose a multi-rate 3D CNN based on a multi-rate signal processing scheme. We also normalize the input images based on the intensity of the given image and reinforce the output features with self-attention. We then concatenate the reinforced features and classify the expression with the joint fusion classifier.
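The multi-rate input and intensity normalization steps can be illustrated with a minimal NumPy sketch. The abstract does not give the sampling rates, clip length, or the normalization formula, so the strides `(1, 2, 4)`, the `clip_len` parameter, the function names, and the zero-mean, unit-variance normalization below are all assumptions made for illustration, not the thesis's actual design.

```python
import numpy as np


def multi_rate_clips(frames, rates=(1, 2, 4), clip_len=4):
    """Subsample one video at several temporal strides.

    frames: array of shape (T, H, W).
    rates and clip_len are hypothetical values, not taken from the thesis.
    Returns a dict mapping each rate to an array of shape (clip_len, H, W).
    """
    clips = {}
    for rate in rates:
        subsampled = frames[::rate]  # keep every `rate`-th frame
        if len(subsampled) < clip_len:
            raise ValueError(f"not enough frames for rate {rate}")
        clips[rate] = subsampled[:clip_len]
    return clips


def normalize_intensity(image, eps=1e-8):
    """Normalize an image with its own intensity statistics.

    Assumed formula: subtract the mean intensity and divide by the
    standard deviation; the thesis may use a different scheme.
    """
    image = image.astype(np.float64)
    return (image - image.mean()) / (image.std() + eps)
```

Each clip would then go to its own 3D CNN branch. With larger strides the clips cover a longer time span with the same number of frames, which is one plausible reading of how the inputs are kept from overlapping heavily.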

For the CK+ database, the proposed joint fusion classifier achieves an accuracy of 96.23%, comparable to other state-of-the-art models. For the MMI and GEMEP-FERA databases, it outperforms other state-of-the-art models with accuracies of 96.69% and 99.79%, respectively. For the AFEW database, the proposed algorithm achieves an accuracy of 31.02%.


Korean Abstract

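As the field of Artificial Intelligence (AI) has advanced in recent years, understanding human emotions has become increasingly important. Facial Expression Recognition (FER) is the field of understanding human emotion through facial expressions. This technology is used in various fields because it can identify a person's expression from the face and provide customized services in response.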

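With the advance of deep learning, various approaches for recognizing human emotions have been proposed. Early on, most studies used classical features such as holistic or local features. Over time, various artificial networks emerged, such as the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM). As a result, classical features became information provided to the networks so that they can automatically discover the representations needed for object detection or classification.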

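In this thesis, we propose an algorithm that can efficiently classify expressions by providing diverse and reinforced features to a joint fusion classifier. We designed the inputs to the multi-depth networks to overlap minimally so that they provide diverse information to the networks. To utilize a 3D CNN, we proposed a multi-frame-rate 3D CNN based on a multi-frame-rate signal processing scheme. We also normalized the input images according to the brightness values of the given image and reinforced the features extracted from each network through self-attention. The reinforced features are concatenated and then fed to the joint fusion classifier to classify the expression.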
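The reinforcement and fusion steps can be sketched in the same spirit. This is a simplified illustration under stated assumptions: the self-attention here is scaled dot-product attention without learned projections plus a residual connection, the per-branch features are mean-pooled, and the classifier is a single linear layer. The function names, feature shapes, and the class count used in the usage note are hypothetical; the abstract does not specify any of them.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention(tokens):
    """Reinforce features with scaled dot-product self-attention.

    tokens: array of shape (N, D), e.g. N positions of a feature map.
    Simplification: queries, keys, and values are the tokens themselves
    (no learned projection matrices), with a residual connection.
    """
    dim = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(dim)  # (N, N) similarities
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return tokens + weights @ tokens


def joint_fusion_logits(branch_features, weight, bias):
    """Concatenate reinforced branch features and apply a linear classifier.

    branch_features: list of (N_i, D_i) arrays, one per network branch.
    weight: (sum of D_i, num_classes) matrix; bias: (num_classes,) vector.
    """
    pooled = [self_attention(f).mean(axis=0) for f in branch_features]
    fused = np.concatenate(pooled)             # joint feature vector
    return fused @ weight + bias
```

For example, two branches with 8-dimensional features and a hypothetical seven expression classes would use a `weight` of shape `(16, 7)`; applying `softmax` to the returned logits gives class probabilities.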

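With the proposed algorithm, the proposed joint fusion classifier achieved an average accuracy of 96.23% on the CK+ database, comparable to other state-of-the-art models. On the MMI and GEMEP-FERA databases, it achieved average accuracies of 96.69% and 99.79%, respectively, outperforming other state-of-the-art models. On the AFEW database, it achieved an accuracy of 31.02%.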


Table of Contents