RISS Search - Domestic Academic Journal Articles

1
Big Data Performance Evaluation Analysis Using Apache Pig

Gal Engelberg, Oded Koren, Nir Perel 보안공학연구지원센터 2016 International Journal of Software Engineering and Vol.10 No.11
As companies' use of big data products increases, the question arises of which big data architecture best suits a company's needs. This study presents an approach that runs multiple processes simulating preliminary data processing of a sales-transaction input dataset using Apache Pig, in order to find the best-performing big data environment in terms of decentralization level over HDFS. The case study approach can give companies an additional tool for understanding the required investment in hardware or cloud computing resources. We analyze which decentralization level achieves the best processing time, and explore how performance changes with the decentralization level and with the size of the input dataset. The case study's insights are: when processing the same data flow over the same input dataset, processing time improves as the decentralization level increases; as the decentralization level increases, the difference between successive performance results decreases significantly; processing the same Pig data flow at the same decentralization level over a large input dataset performs better than processing it over a smaller input dataset, in terms of processing time per volume unit; and as the ratio of blocks to data nodes becomes higher, the processing time becomes longer, and vice versa.
2
Pig Vs. Hive Use Case Analysis

Danielle Kendal, Oded Koren, Nir Perel 보안공학연구지원센터 2016 International Journal of Database Theory and Appli Vol.9 No.12
Corporations are shifting their practices toward data-driven big data initiatives, as big data analytics has given companies the ability to grow their businesses and increase their competitiveness. As the importance of data analytics grew, so did the size of the data to analyze, demanding a more powerful data platform. This paper presents a case study of two high-level query languages built on top of Hadoop MapReduce: Pig and Hive. By writing a query in each language, both producing identical output, and by running each query 30 times on 2 files of different sizes (120 runs in total), this comparison provides a statistically significant conclusion.
