RISS (Research Information Sharing Service)

      • Yet another Hybrid Strategy for Auto-tuning SpMV on GPUs

        Zhaohui Wang, Xiaojie Qiu, Aimin Zhang, Yichao Cheng, Yi Peng, Sun Sun. 보안공학연구지원센터 (Science and Engineering Research Support Society, SERSC), 2015. International Journal of Software Engineering and Vol.9 No.5

        Sparse matrix-vector multiplication (SpMV) is a key linear algebra algorithm and is widely used in many application domains. Besides multi-core architectures, there is extensive research on accelerating SpMV on many-core Graphics Processing Units (GPUs). SpMV computations involve many indirect and irregular memory accesses, and load imbalance can occur when computations are mapped onto single-instruction, multiple-data (SIMD) GPUs. Because SpMV is highly memory-bandwidth-bound, its performance on GPUs remains unsatisfactory even though GPUs offer massive computational resources. In this paper, we present a new hybrid strategy for auto-tuning SpMV on GPUs. Our strategy combines the advantages of row-major and column-major storage. Like many other strategies, we reorder a given sparse matrix by row length in decreasing order. To be more adaptive and efficient, we propose a new hybrid Blocked CSR and JDS (BCJ) format based on the original CSR and JDS formats. BCJ splits a sparse matrix into a denser part and a sparser part after reordering and uses a different kernel to process each part. We also propose a corresponding auto-tuning framework that helps transform the matrix and launch kernels according to the sparsity characteristics of the matrix. A CUDA implementation of BCJ significantly outperforms the original formats on a broad range of unstructured sparse matrices.
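The abstract does not give the details of the BCJ format, so the following is only a rough CPU-side sketch in Python of the general idea it describes: sort rows by decreasing length, split them into a denser and a sparser part, and traverse each part differently (row-wise as in CSR, and by jagged diagonals as in JDS). The function names, the `threshold` parameter, and the split rule are assumptions made for illustration; they are not taken from the paper and do not reproduce its CUDA kernels or auto-tuning framework.

```python
def csr_spmv(indptr, indices, data, x):
    """Reference y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def hybrid_spmv(indptr, indices, data, x, threshold):
    """Illustrative hybrid traversal (not the paper's BCJ implementation).

    `threshold` is a hypothetical tuning knob: rows with at least this many
    nonzeros go to the denser part, the remaining rows to the sparser part.
    """
    n_rows = len(indptr) - 1
    lengths = [indptr[r + 1] - indptr[r] for r in range(n_rows)]

    # Reorder rows by decreasing length, as the abstract describes.
    perm = sorted(range(n_rows), key=lambda r: -lengths[r])

    # Because perm is sorted, the denser rows form a prefix of it.
    split = sum(1 for r in perm if lengths[r] >= threshold)
    y = [0.0] * n_rows

    # Denser part: process one row at a time (row-major, CSR-like).
    for r in perm[:split]:
        for k in range(indptr[r], indptr[r + 1]):
            y[r] += data[k] * x[indices[k]]

    # Sparser part: JDS-like order. Jagged diagonal d holds the d-th nonzero
    # of every row that has one; those rows form a prefix of sparse_rows.
    sparse_rows = perm[split:]
    max_len = lengths[sparse_rows[0]] if sparse_rows else 0
    for d in range(max_len):
        for r in sparse_rows:
            if lengths[r] <= d:
                break  # every later row is at least as short
            k = indptr[r] + d
            y[r] += data[k] * x[indices[k]]
    return y


# Small example: A = [[1,0,2,0],[0,3,0,0],[4,5,6,7],[0,0,0,8]]
indptr = [0, 2, 3, 7, 8]
indices = [0, 2, 1, 0, 1, 2, 3, 3]
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x = [1.0, 2.0, 3.0, 4.0]
print(hybrid_spmv(indptr, indices, data, x, threshold=2))
```

On a GPU, the row-wise loop and the diagonal-wise loop would each map to a separate kernel; the results are written back through the original row index, so no inverse permutation of `y` is needed in this sketch.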
