RISS Search - Domestic Journal Articles

1
An Autotuning Method for Matrix Multiplication on Intel Processors Using AVX-512

Yeongha Lee, Roktaek Lim, Raehyun Kim, Jaeyoung Choi. Korean Institute of Information Scientists and Engineers (KIISE), 2018. KIISE Conference Proceedings, Vol.2018 No.6
2
Design and Implementation of an ARG Game Editor and Game Player Using NFC Tags and Smartphones

Yeongha Lee, Seungwoo Khang, Seunghueok Baek, Minho Ha, Minsu Kang, Sangjun Lee. Korean Institute of Information Scientists and Engineers (KIISE), 2015. KIISE Conference Proceedings, Vol.2015 No.12
3
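An Autotuning Method for Matrix Multiplication on Intel Processors Using AVX-512

Yeongha Lee, Raehyun Kim, Jaeyoung Choi. Korean Institute of Information Scientists and Engineers (KIISE), 2018. KIISE Transactions on Computing Practices, Vol.24 No.12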
BLAS is a widely used library for linear algebra and matrix operations. Level-3 BLAS in particular supports blocked operations, which increase the reuse of data held in cache and so allow performance to be maximized. Achieving this requires finding settings that optimize the matrix block size matched to the processor's cache, the loop unrolling factor, and the prefetching distance. Autotuning is the process in which the computer finds these settings automatically by running the code and optimizing it. Autotuning methods for BLAS routines have barely been studied on recent Intel processors with Intel AVX-512, such as Intel Knights Landing and Intel Scalable processors. This study therefore presents an autotuning method for maximizing the performance of level-3 BLAS on Intel architectures with Intel AVX-512.
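The autotuning idea in the abstract above can be illustrated with a small sketch. It is not the authors' method: the paper targets AVX-512 kernels and also tunes unrolling and prefetching distance, while this example searches only over block size, in pure Python, where timings do not reflect real cache behavior. The names `matmul_blocked` and `autotune_block_size` and the candidate block sizes are assumptions made for illustration.

```python
import time


def matmul_blocked(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) tile by tile.

    `block` is the tile edge length; it is the parameter being tuned.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # Work on one tile; min() handles n not divisible by block.
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, n)):
                        aik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += aik * row_b[j]
    return c


def autotune_block_size(n, candidates, repeats=3):
    """Time every candidate block size and return (best_block, timings).

    This is an exhaustive search over a hypothetical candidate list;
    the best-of-`repeats` time is kept to reduce measurement noise.
    """
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            matmul_blocked(a, b, n, block)
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    best_block = min(timings, key=timings.get)
    return best_block, timings


if __name__ == "__main__":
    best_block, timings = autotune_block_size(64, [4, 8, 16, 32])
    for block, seconds in sorted(timings.items()):
        print(f"block={block:3d}  time={seconds:.4f}s")
    print("selected block size:", best_block)
```

The structure is the part that carries over to a real autotuner: generate candidate configurations, run and time each one on the target machine, and keep the fastest. A practical tuner would extend the search space with the unrolling factor and prefetching distance named in the abstract.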
