A Technique for Improving the Processing Speed of Hadoop MapReduce by Optimizing the Shuffle Phase on SSD-Based Systems
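Gwangok Go, Daedong Park, Seongsoo Hong. The Institute of Electronics and Information Engineers (대한전자공학회), 2015 Conference of the Institute of Electronics and Information Engineers, Vol. 2015, No. 6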
MapReduce is a programming model widely used for processing big data in cloud datacenters. It consists of Map, Shuffle, and Reduce phases. Hadoop MapReduce is one of the most popular frameworks implementing MapReduce. During the Shuffle phase, Hadoop MapReduce performs an excessive number of disk I/O operations and transmits large amounts of data, which together account for about 40% of the total data processing time. To address these problems, we propose a new shuffle mechanism that exploits the characteristics of SSDs. The mechanism consists of (1) data-address-based sorting, (2) data-address-based merging, and (3) early data transmission before the Map phase completes. To demonstrate the effectiveness of our approach, we implemented the mechanism in Hadoop MapReduce 1.2.1. Our experiments show that the proposed mechanism reduces job completion time by up to 5% compared with legacy Hadoop MapReduce.
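The abstract does not define "data address based" sorting and merging. One plausible reading, offered here only as an assumption, is that map output is written once in arrival order and only a small index of (key, offset, length) entries is sorted and merged, with record bytes fetched later by random reads, which are cheap on SSDs. The Python sketch below illustrates that reading with in-memory buffers standing in for spill files. The names `spill` and `merge_by_address` are hypothetical and are not taken from the paper or from Hadoop.

```python
import heapq
import io


def spill(records):
    """Write values in arrival order; sort only the (key, offset, length) index.

    Assumption: a BytesIO buffer stands in for an on-SSD spill file.
    """
    buf = io.BytesIO()
    index = []
    for key, value in records:
        offset = buf.tell()
        buf.write(value)
        index.append((key, offset, len(value)))
    # The record bytes are never moved; only the small index is sorted.
    index.sort(key=lambda entry: entry[0])
    return buf, index


def merge_by_address(spills):
    """Merge the sorted indexes of several spills and yield (key, value) in key order.

    Values are fetched with random reads at the stored offsets instead of
    rewriting the spills into one sorted file.
    """
    tagged = [
        [(key, spill_id, offset, length) for key, offset, length in index]
        for spill_id, (_buf, index) in enumerate(spills)
    ]
    for key, spill_id, offset, length in heapq.merge(*tagged, key=lambda e: e[0]):
        buf = spills[spill_id][0]
        buf.seek(offset)
        yield key, buf.read(length)


spills = [
    spill([("b", b"2"), ("a", b"1")]),
    spill([("c", b"3"), ("a", b"0")]),
]
merged = list(merge_by_address(spills))
```

The design trade-off in this sketch is that sequential rewrites of sorted data are replaced by random reads at merge time, which is reasonable only when random access is inexpensive. Early transmission before Map completion, the third component listed in the abstract, is not modeled here.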