A Study on Sound-Based Emotional Expression Face Generation for Sensory Substitution

Multilingual Abstract

Visual-to-Auditory (V2A) sensory substitution, which relies on neural facilitation to convey visual information through hearing, is an active research area. However, existing V2A methods simply convert visual information into auditory information, so the information they give users is limited. Providing more requires generating and transmitting augmented visual information that integrates other sensory information. Recent methods generate augmented visual information by integrating auditory and visual information, but they do not take the intensity of the sound into account and therefore cannot generate images that reflect it. To address this, this paper proposes a method that classifies a sound and converts the classification result into a sound intensity, which is then used to generate images suited to that intensity. In experiments applying the proposed method to a face image generation model, the method generated face images that reflect the sound information when sound intensity was taken into account.
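The abstract does not say how the classification result is converted into an intensity or how that intensity enters the generator, so the following is only a minimal sketch under stated assumptions: the classifier is assumed to output one logit per sound class, the intensity is taken to be the softmax probability of the predicted class, and the intensity is assumed to scale a step along an edit direction in the generator's latent space. The names `sound_to_intensity`, `edit_latent`, and `max_strength` are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sound_to_intensity(logits, labels):
    """Return (predicted label, intensity in (0, 1]).

    Assumption: the intensity is the classifier's confidence in the
    predicted class. The paper may define it differently.
    """
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return labels[idx], probs[idx]


def edit_latent(latent, direction, intensity, max_strength=3.0):
    """Move a latent code along an edit direction, scaled by intensity.

    Assumption: a linear latent edit; max_strength is a hypothetical
    cap on how far the strongest sound may move the code.
    """
    scale = intensity * max_strength
    return [w + scale * d for w, d in zip(latent, direction)]


if __name__ == "__main__":
    labels = ["laughing", "crying", "screaming"]  # hypothetical classes
    label, intensity = sound_to_intensity([2.0, 0.0, 0.0], labels)
    edited = edit_latent([0.0, 0.0], [1.0, -1.0], intensity)
    print(label, round(intensity, 3), edited)
```

In this sketch a more confident classification moves the latent code further along the edit direction, which is one plausible way for a generated face to express a stronger emotion for a more intense sound.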

