RISS (Research Information Sharing Service)

      • KCI-listed

        Text Data Augmentation Technique Based on a Semantic Tensor Space Model

        이길재,김한준 한국정보과학회 2019 데이타베이스 연구 Vol.35 No.3

        Data augmentation is the process of generating new data by applying small variations to existing data. By ensuring data diversity, it helps prevent overfitting and improves model performance in machine learning. While data augmentation is actively used in computer vision, its use in text mining is limited. This is because text data requires embedding, so the augmentation process risks generating data whose meaning is completely different from the original. This paper proposes a text data augmentation technique based on the semantic tensor space model. The proposed technique avoids this augmentation problem of text data and, unlike existing augmentation techniques, is easy to perform because it uses only simple operations. Through document classification experiments, the paper verifies the validity of the proposed technique by showing that the data it generates improves model performance.

      • KCI-listed

        A Comparative Study on the Performance of Data Augmentation Techniques in Voice Spoofing Detection

        박관열,곽일엽 한국통계학회 2023 응용통계연구 Vol.36 No.2

        Data augmentation is used effectively to address model overfitting by allowing the training dataset to be viewed from various perspectives. In addition to image augmentation techniques such as rotation, cropping, horizontal flip, and vertical flip, occlusion-based data augmentation methods such as Cutmix and Cutout have been proposed. For models based on speech data, occlusion-based augmentation techniques can also be used after converting the 1D speech signal into a 2D spectrogram. In particular, SpecAugment is an occlusion-based augmentation technique proposed for speech spectrograms. This study compares data augmentation techniques that can be used for spoofed-voice detection. Using data from the ASVspoof2017 and ASVspoof2019 competitions, which were held to detect fake audio, the speech was converted into 2D spectrograms, the occlusion-based augmentation methods Cutout, Cutmix, and SpecAugment were applied, and the resulting datasets were used to train an LCNN model, a lightweight variant of the CNN model. All three augmentation techniques generally improved model performance, although depending on the method, performance sometimes degraded or did not change. Cutmix performed best on ASVspoof2017, Mixup on ASVspoof2019 LA, and SpecAugment on ASVspoof2019 PA. In addition, increasing the number of masks in SpecAugment helps improve performance. In conclusion, the appropriate augmentation technique appears to differ depending on the situation and the data.
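The occlusion idea behind SpecAugment-style masking can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function name `mask_spectrogram`, the zero fill value, and the default mask widths are assumptions made for illustration, and the time-warping step of the original SpecAugment method is omitted.

```python
import numpy as np


def mask_spectrogram(spec, num_freq_masks=1, num_time_masks=1, max_width=8, rng=None):
    """Return a copy of `spec` (freq_bins x time_frames) with random bands zeroed.

    Hypothetical helper: the parameter names and the zero fill are illustrative choices.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = spec.copy()
    n_freq, n_time = out.shape

    # Frequency masks: zero a horizontal band of random height and position.
    for _ in range(num_freq_masks):
        width = int(rng.integers(0, max_width + 1))
        start = int(rng.integers(0, max(1, n_freq - width + 1)))
        out[start:start + width, :] = 0.0

    # Time masks: zero a vertical band of random width and position.
    for _ in range(num_time_masks):
        width = int(rng.integers(0, max_width + 1))
        start = int(rng.integers(0, max(1, n_time - width + 1)))
        out[:, start:start + width] = 0.0

    return out
```

Raising `num_freq_masks` or `num_time_masks` corresponds to the "number of masks" that the abstract reports as helpful for SpecAugment.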

      • KCI-listed

        A Study on Improving Self-Propelled Artillery Object Detection Model Performance with Image Synthesis-Based Data Augmentation Using Mask R-CNN

        채한결,김수환 한국산학기술학회 2023 한국산학기술학회논문지 Vol.24 No.11

        The applications of AI technology have limitations in defense, such as the difficulty in obtaining sufficient high-quality big data and the lack of diversity in the acquired data. This study evaluated image data augmentation methods suitable for military operational environments to overcome these issues. First, the object selected is the "self-propelled artillery." By separating the object and synthesizing it into a specific background, an image synthesis-based data augmentation method is proposed to address the lack of data for military operational environments. Self-propelled artillery images identified in winter were used as test data to verify the performance. Three methods are used to augment the data: the baseline data augmentation method using self-propelled artillery training data identified in non-winter environments, data augmentation using Cycle-GAN, one of the image generation techniques, and the proposed image synthesis-based data augmentation method. The object detection performance was compared using the YOLOv5 model. The results show that using the image synthesis-based data augmentation method with the augmented training dataset achieved the highest performance, with an mAP (0.5) of 97%. This study shows that image synthesis-based data augmentation can address the lack of diversity in defense sector data. In addition, it provides a method to improve object detection model performance in weapon system detection.
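The compositing step of image synthesis-based augmentation, pasting a separated object onto a chosen background, might look roughly like the sketch below. It assumes a boolean object mask is already available (for example from a segmentation model); the function name `composite_object` and its arguments are hypothetical and not taken from the paper.

```python
import numpy as np


def composite_object(background, obj, mask, top, left):
    """Paste `obj` onto a copy of `background` wherever `mask` is True.

    background: (H, W, 3) array; obj: (h, w, 3) array; mask: (h, w) boolean array.
    `top` and `left` give the paste position (an illustrative convention).
    """
    h, w = mask.shape
    if top < 0 or left < 0 or top + h > background.shape[0] or left + w > background.shape[1]:
        raise ValueError("object does not fit inside the background at this position")

    out = background.copy()
    region = out[top:top + h, left:left + w]  # view into the output image
    region[mask] = obj[mask]                  # copy only the object pixels
    return out
```

Varying the background and the paste position yields training images of the same object in environments that were never photographed.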

      • Normal Data-Based Bearing Anomaly Diagnosis Using Data Augmentation

        배재웅(Jaewoong Bae),정원호(Wonho Jung),박용화(Yonghwa Park) 대한기계학회 2021 대한기계학회 춘추학술대회 Vol.2021 No.4

        Research on augmenting fault data has recently been active as a way to solve the data imbalance problem. However, fault data augmentation based on distribution learning is difficult to apply to fault diagnosis because the results vary greatly with the characteristics of the dataset. This paper proposes a bearing anomaly diagnosis method that uses normal data only, without fault data. The proposed method consists of three stages: (1) data augmentation based on normal data, (2) feature extraction using a convolutional neural network, and (3) design of an anomaly threshold. The normal data-based augmentation applies noise addition, amplitude modulation, and similar operations chosen to suit the data characteristics. ResNet is used to extract features from the normal data. Finally, the anomaly threshold is selected by calculating the distance between normal data and abnormal data using the extracted features. The proposed method was verified on the CWRU bearing fault dataset and showed a diagnostic accuracy of about 95%. This study shows that anomaly diagnosis is possible using only normal data, even under data imbalance.
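Stage (1), augmentation of normal vibration data by noise addition and amplitude modulation, can be sketched as follows. The abstract does not give the actual parameters, so the function names, the SNR-based noise level, and the sinusoidal envelope are all assumptions chosen for illustration.

```python
import numpy as np


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise at the requested signal-to-noise ratio in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def modulate_amplitude(signal, depth, mod_freq_hz, sample_rate_hz):
    """Scale the signal by a slow envelope 1 + depth * sin(2*pi*f*t)."""
    t = np.arange(signal.shape[0]) / sample_rate_hz
    envelope = 1.0 + depth * np.sin(2.0 * np.pi * mod_freq_hz * t)
    return signal * envelope
```

For example, `modulate_amplitude(add_noise(x, 20.0, rng), 0.2, 5.0, 12000.0)` produces one augmented variant of a normal signal `x`; the 20 dB, 0.2, 5 Hz, and 12 kHz values are placeholders, not values reported by the authors.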

      • Data Augmentation on Limited Biometric Data Set for M2M Authentication Model Testing

        Rin Nadia,Dana Koshen,JaeSeung Song 한국통신학회 2021 한국통신학회 학술대회논문집 Vol.2021 No.6

        Examining the performance of an artificial intelligence (AI)-based model for machine-to-machine (M2M) authentication requires a large amount of data. The accessibility of open biometric data is restricted by privacy regulations such as the General Data Protection Regulation (GDPR), because the data is commonly obtained by sensors embedded in personal wearable devices. Exploring and developing a system with biometric data as the main parameter is therefore difficult. Data augmentation is a technique that increases the size of a data set by producing derived data from the original data, and it is popular in AI, especially for computer vision. Incorporating data augmentation in the development of an AI-based authentication model could solve the shortage of data. Therefore, this study proposes a data augmentation model and analyzes its training and augmenting performance.

      • KCI-listed

        A Technique for Preventing Data Augmentation Leakage in GAN Models Using an Auxiliary Classifier

        심종화,이지은,황인준 한국전기전자학회 2022 전기전자학회논문지 Vol.26 No.2

        Data augmentation is a technique that improves the size and quality of a dataset through various data transformations and distortions, and it is a general approach to solving the overfitting of machine learning models. However, when data augmentation is applied to a GAN-based model, a deep learning image generation model, the transformations and distortions are reflected in the generated images and the quality of the generated images decreases. To prevent this problem, called augmentation leak, we propose a scheme that prevents augmentation leak regardless of the type and number of augmentations. Specifically, we analyze the conditions under which augmentation leak occurs by type and implement an auxiliary augmentation task classifier that prevents it. Through qualitative and quantitative evaluation, we show that the proposed technique prevents augmentation leak in the GAN model and, as a result, improves the quality of the generated images. We also demonstrate the superiority of the proposed scheme through an ablation study and a comparison with other representative augmentation leak prevention techniques.

      • KCI-listed

        GAN-Based Data Augmentation Technique for Reducing Data Costs in Training Industrial AI Object Detection for Robot Depalletizing

        조용우,김현우 제어·로봇·시스템학회 2022 제어·로봇·시스템학회 논문지 Vol.28 No.10

        The development of artificial intelligence (AI) has brought many changes to a variety of industrial fields. However, there are problems when applying AI due to the various environmental factors in industrial sites. In particular, the object detection model used for robot depalletizing often deteriorates over time due to the distribution of data changing every moment. While it is possible to maintain the performance of the model through additional data acquisition, continuous data acquisition is costly. Therefore, data augmentation techniques are used to reduce costs and maintain performance. However, data augmented by existing augmentation techniques do not significantly change in terms of the distribution of existing data, rendering it difficult to maintain the performance of the object detection model. In this paper, we develop a data augmentation pipeline based on generative adversarial networks, which is effective at maintaining performance and reducing the cost of object detection models. The proposed pipeline is composed of a data generator model and an object detection model. The generator model uses a small amount of training data to generate data with a new distribution, while the object detection model is trained with both training and generated data. The object detection model trained through the pipeline with 100 pieces of training data exhibited better performance on new data distribution by 9.9% AP compared to the model trained with 2000 pieces of training data. In addition, the results of the qualitative analysis confirmed that a representative error occurring in robot depalletizing could be improved by the proposed pipeline.

      • KCI-listed

        Deep Learning with Data Augmentation to Add Data Around Classification Boundaries

        Hideki Fujinami,Gendo Kumoi,Masayuki Goto 대한산업공학회 2021 Industrial Engineering & Management Systems Vol.20 No.3

        Data augmentation methods are used to improve generalization by increasing the amount of training data in image classification. However, most of these methods are not data-driven algorithms, so the degree to which they improve generalization differs between the domains of the training image data. Generative models have recently been studied for data augmentation. In particular, Generative Adversarial Networks (GANs) (Goodfellow et al., 2014), which can generate clean images, have attracted attention as an excellent innovation in machine learning. As an extension of GANs, CGANs (Mirza and Osindero, 2014) can be used for data augmentation. When not enough training data is available for each class to train the classification model, the same is true for training a CGAN. In such a case, the CGAN generates noisy images, which causes the classification model to underfit the original training data. Moreover, when a CGAN approximates the training data distribution, it generates new training data in the same regions where training data already densely exist. In such a case, the augmented data cannot reduce overfitting on the original training data. Therefore, our research aims to augment data in a way that addresses these two problems. In this study, we propose a method that generates data with a class-specific GAN from a small amount of training data and, using the entropy of the classification model, selectively adds to the training set the generated data that improve classification accuracy. The feature of the proposed method is that it focuses on the positional relationship between the data and the classification hyperplane in deep learning. The entropy of the classification model is used to measure the positional relationship between the classification boundary and the data. As a result, generalization performance is improved by adding data around the classification boundary as new training data.
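The entropy criterion can be made concrete with a small sketch. High predictive entropy means the classifier is uncertain, which is used here as a proxy for a sample lying near the classification boundary. The abstract does not state the exact selection rule, so keeping the top-k highest-entropy samples, and the function names below, are assumptions.

```python
import numpy as np


def prediction_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of class probabilities."""
    p = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)


def select_near_boundary(probs, k):
    """Indices of the k generated samples with the highest predictive entropy.

    Assumed rule: top-k by entropy stands in for "data around the boundary".
    """
    entropy = prediction_entropy(probs)
    return np.argsort(-entropy)[:k]
```

Here `probs` would be the classifier's softmax outputs for the GAN-generated images; the selected indices identify the generated samples to add to the training set.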

      • KCI-listed

        A Study on Robust Psoriasis Severity Classification Based on Data Augmentation Techniques

        이언석,문초이,백유상,최민형 대한전기학회 2022 전기학회논문지 Vol.71 No.12

        Psoriasis is a chronic recurrent disease characterized by lesions such as erythema and scale. To evaluate the severity of psoriasis, the psoriasis area and severity index (PASI) score has been used in clinical trials and studies. Because this clinical indicator is subjective, various automatic psoriasis analysis methods based on deep learning have been studied to overcome this shortcoming. However, the limited amount of data and psoriasis characteristics such as the ambiguity of severity degrade model performance. One simple and powerful method for overcoming these problems is data augmentation, which should be chosen according to the data characteristics. Therefore, we analyzed and compared the classification results obtained with five data augmentation methods (Geometric transformation, CutMix, Visual Corruptions, AutoAugment, and RandAugment) and explored which data augmentation method is suitable for psoriasis severity classification. We used EfficientNet B2 for psoriasis severity classification. As a result, RandAugment and the combination of Geometric transformation and Visual Corruptions showed the best classification performance, with an accuracy of 87.5%. In addition, we confirmed the effect of data augmentation on model performance and the difference in performance between single and multiple applications of the data augmentation methods. These results can be applied to various studies as a data augmentation method suitable for psoriasis images.
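Of the five methods compared, CutMix is simple enough to sketch in a few lines: a rectangular patch from one image replaces the same region of another, and the labels are mixed in proportion to the patch area. The sketch below follows the commonly described procedure and is not the configuration used in this study; the function name and the `alpha` default are assumptions.

```python
import numpy as np


def cutmix(image_a, label_a, image_b, label_b, rng, alpha=1.0):
    """Mix two (H, W[, C]) images and their one-hot labels with a random patch."""
    h, w = image_a.shape[:2]
    lam = rng.beta(alpha, alpha)  # target share of image_a

    # Patch size chosen so that its area is roughly (1 - lam) of the image.
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy = int(rng.integers(0, h))
    cx = int(rng.integers(0, w))
    y0, y1 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x0, x1 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    mixed = image_a.copy()
    mixed[y0:y1, x0:x1] = image_b[y0:y1, x0:x1]

    # Recompute the share from the clipped patch so the label matches the pixels.
    lam_adj = 1.0 - ((y1 - y0) * (x1 - x0)) / (h * w)
    mixed_label = lam_adj * label_a + (1.0 - lam_adj) * label_b
    return mixed, mixed_label
```

Because the patch is clipped at the image border, the label weight is recomputed from the actual pasted area rather than taken directly from `lam`.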

      • KCI-listed

        Data Augmentation for Deep Neural Network-Based Speech Enhancement

        이승관,이상민 한국멀티미디어학회 2019 멀티미디어학회논문지 Vol.22 No.7

        This paper proposes a data augmentation algorithm to improve the performance of DNN (Deep Neural Network)-based speech enhancement. Many deep learning studies explore algorithms that maximize performance with a limited amount of data. The most commonly used approach is data augmentation, a technique that artificially increases the amount of data. For an effective data augmentation algorithm, we used a formant enhancement method that assigns different weights to the formant frequencies. The DNN model trained with the proposed data augmentation algorithm was evaluated in various noise environments. Its speech enhancement performance was compared with that of DNN models trained with conventional data augmentation and without data augmentation. As a result, the proposed data augmentation algorithm showed higher speech enhancement performance than the other algorithms.
