RISS Search - Domestic Journal Article Details

Korean Abstract

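BLAS is a library widely used for linear algebra and matrix operations. In particular, level-3 BLAS, which supports blocked operations, achieves maximum performance by increasing the reuse of data brought into the cache. This requires finding the settings that optimize the matrix block size matched to the processor's cache, the loop unrolling count, the prefetching distance, and similar parameters. The process in which a computer runs the code and optimizes these settings automatically is called autotuning. Because autotuning methods for BLAS optimization have scarcely been studied for recent Intel processors that use Intel AVX-512, such as Intel Knights Landing, this study presents an autotuning method for maximizing level-3 BLAS performance on Intel architectures that use Intel AVX-512.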

Multilingual Abstract

BLAS is a widely used library for linear algebra and matrix operations. Level-3 BLAS in particular supports blocked operations, which maximize performance by increasing the reuse of data held in the cache. A blocked operation performs best only when its parameters are tuned: the matrix block size matched to the processor's cache, the loop unrolling factor, and the prefetching distance. Autotuning is the process in which the computer searches for these values automatically by running the code and measuring its performance. Appropriate autotuning processes for BLAS routines have not been studied on recent Intel processors with AVX-512, such as Intel Knights Landing and Intel Xeon Scalable processors. In this study, we demonstrate an autotuning method that maximizes the performance of level-3 BLAS on Intel architectures with Intel AVX-512.
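The autotuning idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it is plain Python rather than AVX-512 code, it tunes only the block size (loop unrolling and prefetching distance cannot be expressed at this level), and the names `blocked_matmul` and `autotune_block_size` are hypothetical. It only shows the general pattern of timing each candidate setting and keeping the fastest.

```python
import random
import time


def blocked_matmul(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) block by block."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # Process one block; min() handles n not divisible by block.
                for i in range(ii, min(ii + block, n)):
                    row_a = a[i]
                    row_c = c[i]
                    for k in range(kk, min(kk + block, n)):
                        a_ik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def autotune_block_size(n, candidates, repeats=3):
    """Time every candidate block size; return (fastest block, timings)."""
    rng = random.Random(0)  # fixed seed so every candidate sees the same input
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, block)
            # Keep the minimum over repeats to reduce timing noise.
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best_block, timings = autotune_block_size(48, [4, 8, 16, 24])
    for block, seconds in timings.items():
        print(f"block={block:3d}  best time={seconds:.4f} s")
    print("selected block size:", best_block)
```

In a real autotuner the search space would also cover the unrolling factor and prefetching distance, and the kernel being timed would be compiled code, so the block size selected by this sketch says nothing about the best value on an AVX-512 processor.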

Table of Contents

Summary
Abstract
1. Introduction
2. Background
3. Implementation
4. Experiments
5. Conclusion
References

References

1 J. Bilmes, "Optimizing Matrix Multiply using PHiPAC: a Portable, High-performance, ANSI C Coding Methodology" ACM 1997

2 R. Lim, "OpenMP-based Parallel Implementation of Matrix-Matrix Multiplication on the Intel Knights Landing" ACM 63-66, 2018

3 B. Kagstrom, "GEMM-based Level 3 BLAS: High-performance Model Implementations and Performance Evaluation Benchmark" 24 (24): 268-302, 1998

4 R. Whaley, "Automatically Tuned Linear Algebra Software" IEEE Computer Society 1998

5 K. Goto, "Anatomy of High-performance Matrix Multiplication" 34 (34): 2008

6 R. Lim, "An Implementation of Matrix-Matrix Multiplication on the Intel KNL Processor with AVX-512" Springer 21 (21): 1785-1795, 2018

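Date	History type	Details
2022	Evaluation scheduled	Subject to re-accreditation evaluation application (re-accreditation)
2019-01-01	Evaluation	Listed journal status maintained (continuing evaluation)
2016-01-01	Evaluation	Listed journal status maintained (continuing evaluation)
2015-01-01	Evaluation	Listed journal status maintained (listing maintained)
2014-09-16	Journal title change	Korean title: 정보과학회논문지 : 컴퓨팅의 실제 및 레터 -> 정보과학회 컴퓨팅의 실제 논문지; foreign-language title: Journal of KIISE : Computing Practices and Letters -> KIISE Transactions on Computing Practices
2013-04-26	Journal title change	Foreign-language title: Journal of KISS : Computing Practices and Letters -> Journal of KIISE : Computing Practices and Letters
2011-01-01	Evaluation	Listed journal status maintained (listing maintained)
2009-01-01	Evaluation	Listed journal status maintained (listing maintained)
2008-10-02	Journal title change	Korean title: 정보과학회논문지 : 컴퓨팅의 실제 -> 정보과학회논문지 : 컴퓨팅의 실제 및 레터; foreign-language title: Journal of KISS : Computing Practices -> Journal of KISS : Computing Practices and Letters
2007-01-01	Evaluation	Listed journal status maintained (listing maintained)
2005-01-01	Evaluation	Listed journal status maintained (listing maintained)
2002-01-01	Evaluation	Selected as listed journal (listing candidate, second round)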

Reference year	WOS-KCI combined IF (2-year)	KCIF (2-year)	KCIF (3-year)	KCIF (4-year)	KCIF (5-year)	Centrality index (3-year)	Immediacy index
2016	0.29	0.29	0.27	0.24	0.21	0.503	0.04

Autotuning Method for Matrix Multiplication on Intel Processors Using AVX-512
