RISS Search - Domestic Journal Articles

1
Design and Implementation of Parallel K-Means as an In-database Analytic Function on a Distributed In-Memory DBMS

Heymo Kou, Changmin Nam, Woohyun Lee, Yongjae Lee, HyoungJoo Kim. Korean Institute of Information Scientists and Engineers (KIISE), 2018, KIISE Transactions on Computing Practices, Vol.24 No.3
As data volumes grow, a single-node database can no longer handle both storage and processing. Data is therefore partitioned and stored in a distributed database made up of multiple nodes, and analysis must likewise support parallelism to stay efficient. Traditional analysis moves data out of the database to separate analysis nodes before processing it, which incurs network cost and requires the user to know both the database and the analytic framework. In this paper, we propose the design and implementation of the K-Means clustering algorithm as an in-database analytic function, written in SQL, for a distributed database environment built on a relational database and a column-based database. We also propose a method for optimizing K-Means performance within a relational database.
2
Analysis of SVM and Resampling Techniques for Anomaly Detection in a DSMS Environment

Donghyo Kim, Heymo Kou, Hyoung-Joo Kim. Korean Institute of Information Scientists and Engineers (KIISE), 2018, KIISE Transactions on Computing Practices, Vol.24 No.9
We devise an architecture for judging whether data is anomalous in a DSMS (Data Stream Management System) environment, where real-time stream data arrives continuously. A DSMS is better optimized for processing stream data than a traditional DBMS, and some products use CQL (Continuous Query Language) instead of SQL. To perform anomaly detection in a DSMS, the anomaly-detection model must therefore be registered in the DSMS as CQL. This paper assumes such a setting and implements the anomaly-detection model in CQL. With the CQL implementation in mind, we use an SVM (Support Vector Machine) as the class-prediction algorithm for anomaly detection, and we run experiments to improve its validation performance. We apply resampling techniques to address the drop in validation performance that can occur when the classes in the dataset are imbalanced, and we adjust the threshold of the trained SVM model to optimize validation performance. Finally, the threshold-tuned SVM model trained on the resampled data is converted to CQL. This conversion is implemented as two automated transformation blocks.
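The three steps in this abstract (resampling, threshold adjustment, and conversion to a query) can be sketched roughly as below. The abstract does not specify the resampling technique, the tuning metric, the SVM kernel, or the CQL syntax of the target product, so everything here is an assumption: random oversampling, F1 as the metric, a linear decision function, and a generic SQL-style comparison expression rather than real CQL. All function names are hypothetical.

```python
import random


def oversample(rows, labels, seed=0):
    """Random oversampling (assumed technique): duplicate minority-class
    rows until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def best_threshold(scores, labels):
    """Pick the decision threshold maximizing F1 (assumed metric) for the
    anomaly class, labeled 1. A score >= threshold predicts an anomaly."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


def linear_svm_predicate(weights, bias, threshold, columns):
    """Render a linear decision function w.x + b >= threshold as a generic
    SQL-style expression. Not actual CQL; real syntax varies by product."""
    terms = " + ".join(f"({w!r} * {c})" for w, c in zip(weights, columns))
    return f"{terms} + ({bias!r}) >= {threshold!r}"
```

A predicate string produced by `linear_svm_predicate` would still need to be wrapped in whatever continuous-query statement the target DSMS accepts; that wrapping is product-specific and is deliberately left out.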
3
An Efficient Supervised Credit Card Fraud Detection Method Using Autoencoders

YongHyun Lee, HeyMo Kou, Hyoung-Joo Kim. Korean Institute of Information Scientists and Engineers (KIISE), 2019, KIISE Transactions on Computing Practices, Vol.25 No.1
Credit card fraud detection can be viewed as streaming data analysis, because cards are used in real time and detection must be immediate; it therefore requires real-time analysis that is faster than batch analysis. Extracting and analyzing only the core part of the data can meet this speed requirement, and has been done with techniques such as principal component analysis. In this paper, we propose preprocessing the data with an autoencoder, a dimensionality reduction technique based on artificial neural networks, to reduce its dimensionality before applying data mining techniques. Because an autoencoder can also capture nonlinear relationships among the data dimensions, it is a more effective dimensionality reduction method. We also present a methodology for integrating the autoencoder with CQL so that it can be used for fraud detection analysis inside the database.
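Dimensionality reduction with an autoencoder can be illustrated with a very small pure-Python sketch. The abstract does not describe the network architecture or training setup, so the single tanh hidden layer, the linear decoder, per-sample gradient descent, and all hyperparameters and names below are assumptions chosen only to keep the example short.

```python
import math
import random


def train_autoencoder(data, code_dim, epochs=200, lr=0.02, seed=0):
    """Train a one-hidden-layer autoencoder (assumed architecture) with
    per-sample gradient descent on squared reconstruction error.

    Returns the encoder function and the loss after each epoch.
    """
    rng = random.Random(seed)
    d = len(data[0])
    W1 = [[rng.uniform(-0.1, 0.1) for _ in range(code_dim)] for _ in range(d)]
    b1 = [0.0] * code_dim
    W2 = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(code_dim)]
    b2 = [0.0] * d

    def encode(x):
        # Nonlinear projection of a d-dimensional row to code_dim values.
        return [math.tanh(sum(x[j] * W1[j][k] for j in range(d)) + b1[k])
                for k in range(code_dim)]

    def decode(h):
        # Linear reconstruction back to d dimensions.
        return [sum(h[k] * W2[k][j] for k in range(code_dim)) + b2[j]
                for j in range(d)]

    def loss():
        # Mean squared reconstruction error per row.
        return sum(sum((r - v) ** 2 for r, v in zip(decode(encode(x)), x))
                   for x in data) / len(data)

    history = [loss()]
    for _ in range(epochs):
        for x in data:
            h = encode(x)
            err = [r - v for r, v in zip(decode(h), x)]
            # Gradient at the hidden pre-activation, computed before
            # W2 is modified.
            g_h = [sum(err[j] * W2[k][j] for j in range(d)) * (1 - h[k] ** 2)
                   for k in range(code_dim)]
            for k in range(code_dim):
                for j in range(d):
                    W2[k][j] -= lr * err[j] * h[k]
                    W1[j][k] -= lr * g_h[k] * x[j]
                b1[k] -= lr * g_h[k]
            for j in range(d):
                b2[j] -= lr * err[j]
        history.append(loss())
    return encode, history
```

After training, `encode(row)` yields the reduced representation that a downstream classifier or query would consume in place of the original columns.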
