RISS Search - Theses and Dissertations

1
A Fast Subsequence Matching Scheme in Stock Prices (Korean title: Fast Subsequence Matching for Stock Search)

Dat Trong Nguyen, Kyungpook National University, Graduate School, 2014, Master's thesis (Korea)

RANK : 233023
Subsequence matching is similarity matching in which the data sequence is longer than the query sequence. In stock prices, subsequence matching can be used in finance applications that let users input a desired stock pattern and search for the list of stock symbols that contain it. Such applications face the problem that query patterns come in various shapes and scaling ratios and appear in different time periods. Moreover, the quality of the query results and the query time must also be carefully considered. Prior methods that perform similarity matching with an index-based approach can solve almost all of these problems except the quality of the query results, because those methods do not consider stock patterns in the data sequence. Another method, by Fu et al., uses an SB-Tree and rules based on stock patterns. It does consider stock patterns, but it lacks an effective indexing technique, which leads to poor performance, and it requires the stock patterns to be known in advance. In this paper, we propose an index-based fast subsequence matching (FSM) approach that adopts the advances of prior subsequence matching methods and accounts for stock patterns through newly proposed filters based on pivot points, which significantly enhance the quality of query results. The filters are applied to both the data subsequence and the query sequence in a post-processing step. They first require that both have the same number of pivot points, and then check trend matching and distance based on the pivot points. If a subsequence passes all filters, the dynamic time warping distance is computed as the last step to obtain the qualifying candidates. The proposed method is also implemented in a real application that adopts the client/server model. The client is an Android smartphone that accepts queries and displays the results. The server implements the FSM algorithm and manages the stock databases.
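Experimental results show that our method has the highest precision in most cases and a query time about 43 times faster than the Piecewise Aggregate Approximation (PAA) method and about 30 times faster than the Discrete Wavelet Transform (DWT) method for long periods such as 120 days. Overall, our proposed method solves the problems of subsequence matching in stock applications, delivers outstanding query result quality and query time, and is suitable for implementation in real applications.
Subsequence matching is a method for computing the similarity between two sequences when the data sequence is longer than the user's query sequence. It is widely used in many fields, such as music retrieval and electrocardiogram retrieval. In stock-related applications in particular, it can take the stock patterns a user is looking for as input and find the stocks that contain those patterns. In such applications, the pattern the user is looking for, that is, the query sequence, is expressed in various shapes and sizes at different times, and the pattern appears in various forms over long as well as short periods, so these factors must be considered when subsequence matching is used. In addition, given the nature of the stock domain, fast query processing and high performance must be considered. Existing studies tried to find similar stocks by simply performing index-based subsequence matching without considering the characteristics of the stock domain. Because they did not consider the stock-related patterns that appear in data sequences, they suffer from lower performance. To compensate, one study searched for stock patterns with subsequence matching that incorporates an SB-Tree and rules. However, that study has poor query processing performance because of inefficient indexing, and users need expert knowledge of stock patterns in order to apply the rules. This thesis proposes an index-based fast subsequence matching technique. The proposed method adds pivot-based filtering to existing subsequence matching so that it reflects the stock domain well. First, as in existing subsequence matching, the data sequence is divided into subsequences, which are then indexed. Next, pivot-based filtering is used to represent the data subsequences and the user query sequence as pivots. The proposed pivot-based filtering captures not only the various shapes but also the sizes that characterize stock sequences, and it represents both short and long sequences with only a small number of pivots. Through the proposed filtering, the query sequence and the data subsequences are represented with the same number of pivots, and because the two sequences are compared using only a few pivots, many data subsequences that are not similar to the query sequence can be eliminated efficiently. Finally, only the data subsequences that pass the pivot-based comparison are given as input to dynamic time warping to find the final candidates similar to the query sequence. To demonstrate the merits of this work, experiments were conducted on real stock market data. To evaluate whether the various shapes and sizes of stock query sequences are reflected well, eight sequences commonly used by users were used in the experiments, and to evaluate applicability across different periods, experiments were run with 20, 60, and 120 days as input. The results show that, compared with the Piecewise Aggregate Approximation and Discrete Wavelet Transform methods used in existing studies, the proposed method achieved higher precision and was about 43 and 30 times faster, respectively. This shows that the proposed method can appropriately handle the sequences of various shapes, sizes, and periods that arise in the stock domain while also processing queries quickly.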
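The post-processing pipeline described in this abstract (equal pivot count, trend check, then dynamic time warping) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not define pivot points precisely, so the sketch assumes they are local extrema plus the two endpoints, it omits the pivot-based distance filter and the index, and the names `pivot_points`, `trends`, `dtw_distance`, and `matches` are hypothetical.

```python
from typing import List, Sequence, Tuple


def pivot_points(series: Sequence[float]) -> List[Tuple[int, float]]:
    """Assumed definition: endpoints plus local maxima and minima."""
    if len(series) < 2:
        return [(i, v) for i, v in enumerate(series)]
    pivots = [(0, series[0])]
    for i in range(1, len(series) - 1):
        left, mid, right = series[i - 1], series[i], series[i + 1]
        is_peak = mid > left and mid >= right
        is_trough = mid < left and mid <= right
        if is_peak or is_trough:
            pivots.append((i, mid))
    pivots.append((len(series) - 1, series[-1]))
    return pivots


def trends(pivots: List[Tuple[int, float]]) -> List[int]:
    """Direction between consecutive pivots: +1 up, -1 down, 0 flat."""
    return [(b > a) - (b < a) for (_, a), (_, b) in zip(pivots, pivots[1:])]


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping with |x - y| cost."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        cur = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]


def matches(query: Sequence[float], candidate: Sequence[float],
            max_dtw: float) -> bool:
    """Cheap pivot filters first; DTW only for candidates that survive."""
    qp, cp = pivot_points(query), pivot_points(candidate)
    if len(qp) != len(cp):          # filter 1: same number of pivots
        return False
    if trends(qp) != trends(cp):    # filter 2: same up/down trend sequence
        return False
    return dtw_distance(query, candidate) <= max_dtw  # final DTW check
```

The ordering reflects the idea stated in the abstract: comparing a handful of pivots is much cheaper than DTW, so most dissimilar subsequences are discarded before the quadratic-time step runs. The threshold `max_dtw` is an illustrative parameter, and scale normalization of the two sequences is left out.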
2
On optimizing update propagation for modern three-tier storage architecture

Nguyen, Trong-Dat, Sungkyunkwan University, 2019, Doctoral dissertation (Korea)

RANK : 233023
The two-tier storage architecture in a disk-based Database Management System (DBMS) consists of volatile buffer pools in DRAM for caching frequently accessed data and durable storage for keeping long-term data. In this architecture, evicting pages from the buffer pool to durable storage has been considered a common bottleneck in write-intensive, highly concurrent OLTP workloads for three reasons: (1) block devices have high IO latency, (2) guaranteeing atomic page updates incurs extra IO traffic and high lock contention on the centralized cache, and (3) data becomes fragmented in flash SSD. This dissertation proposes a three-tier storage architecture (PB-NVM) that adopts emerging NVDIMM as a non-volatile write-back cache between DRAM and flash SSD to solve these problems. PB-NVM separates eviction processes from propagation processes to increase the concurrency level of the system. Eviction processes write dirty pages to partitioned buffers instead of a centralized buffer, and propagation processes asynchronously write those pages in batches from the partitioned buffers to flash SSD without stalling the eviction processes. The proposed method exploits NVDIMM byte-addressable operations to achieve high IO performance and adopts the Persistent Memory Development Kit (PMDK) to guarantee atomic page writes. For the propagation processes, we take WiredTiger as a case study, solve the data fragmentation problem in flash SSD by adopting the TRIM command, and propose a novel stream mapping scheme to exploit multi-streamed technology. We then extend the proposed stream mapping to a general solution using a machine learning framework. We implement the proposed schemes in InnoDB/MySQL and WiredTiger/MongoDB and evaluate them with the TPC-C, LinkBench, and YCSB benchmarks on a real NVDIMM server. Empirical results show that, compared with the original storage engines InnoDB and WiredTiger, the proposed methods significantly improve throughput and reduce flushing time and latency.
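The separation of eviction from propagation described in this abstract can be illustrated with a small, single-threaded Python simulation. It is only a sketch under stated assumptions: the class `PartitionedWriteCache`, its hash-by-page-id partitioning, and the `batch_size` threshold are hypothetical and are not taken from the dissertation; ordinary dictionaries stand in for NVDIMM buffers and flash SSD, and no PMDK calls or background threads are modeled.

```python
from typing import Dict, List


class PartitionedWriteCache:
    """Toy model: evictions land in per-partition buffers, and each
    partition is flushed to backing storage as one batch."""

    def __init__(self, num_partitions: int, batch_size: int) -> None:
        # One buffer per partition (stand-in for NVDIMM regions).
        self.partitions: List[Dict[int, bytes]] = [
            {} for _ in range(num_partitions)
        ]
        self.batch_size = batch_size
        self.storage: Dict[int, bytes] = {}  # stand-in for flash SSD
        self.batches_written = 0

    def evict(self, page_id: int, data: bytes) -> None:
        """Eviction path: touches only one partition, never the storage."""
        part = self.partitions[page_id % len(self.partitions)]
        part[page_id] = data  # a newer version overwrites the older one

    def propagate(self, partition_index: int) -> int:
        """Propagation path: flush one partition once it holds a full batch.
        Returns the number of pages written."""
        part = self.partitions[partition_index]
        if len(part) < self.batch_size:
            return 0
        batch = sorted(part.items())  # write pages in page-id order
        self.storage.update(batch)
        part.clear()
        self.batches_written += 1
        return len(batch)
```

In this model, `evict` touches only one small partition, so concurrent evictors in a real system would contend on separate locks rather than one central structure, while `propagate` amortizes device writes over a batch. In an actual implementation the propagation step would run asynchronously, which this sketch does not attempt to show.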
