RISS Academic Research Information Service

      • Coordinated thread block scheduling and warp scheduler for workload distribution

        Vo Viet Tan, Jihoon Lee, Pyungkoo Park — Korea Digital Contents Society, 2020, The Journal of Contents Computing Vol.2 No.2

        Throughput-oriented applications require specialized systems such as graphics processing units (GPUs) to achieve high efficiency. Complex applications are normally divided into smaller units of work that are executed in parallel. However, the workload cannot be distributed equally across all GPU resources, leading to resource underutilization. This paper proposes a simple approach to balancing the workload distribution: new Cooperative Thread Arrays (CTAs) stop being issued to streaming multiprocessors (SMs) that have already received enough CTAs. Our proposal increases resource utilization by prioritizing the warps belonging to the last issued CTA. Experimental results show a performance improvement of 1.26% on average over the baseline GTO (Greedy Then Oldest) warp scheduler.
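The CTA-capping idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation; the function name, the round-robin issue order, and the fixed per-SM cap are all assumptions made for illustration.

```python
# Illustrative sketch of CTA-issue throttling (hypothetical, not the paper's design):
# issue CTAs round-robin across SMs, but skip any SM that already holds `cap` CTAs,
# so no SM receives more than its share of the workload.

def issue_ctas(num_ctas, num_sms, cap):
    """Assign CTA ids to SMs round-robin, capping each SM at `cap` CTAs."""
    sms = [[] for _ in range(num_sms)]
    sm = 0
    for cta in range(num_ctas):
        tried = 0
        # find the next SM with spare capacity
        while len(sms[sm]) >= cap and tried < num_sms:
            sm = (sm + 1) % num_sms
            tried += 1
        if tried == num_sms:
            break  # every SM is saturated; remaining CTAs wait for completions
        sms[sm].append(cta)
        sm = (sm + 1) % num_sms
    return sms

print(issue_ctas(10, 4, 2))  # -> [[0, 4], [1, 5], [2, 6], [3, 7]]
```

With 10 CTAs, 4 SMs, and a cap of 2, each SM ends up with exactly two CTAs and the last two CTAs wait, rather than piling extra work onto the first SMs.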

      • Exit CTA Conscious Warp Scheduling

        Viet Tan Vo, Xuan Thien Cao, Jinsul Kim — Korea Digital Contents Society, 2021, The Journal of Contents Computing Vol.3 No.1

        Modern GPU architectures achieve significant performance improvements over previous GPU generations. Unfortunately, the upgraded hardware resources are still underutilized. This paper targets this problem by reducing the time warps spend waiting as they approach the exit point at the end of a CTA (Cooperative Thread Array). Our proposed exit-CTA-conscious warp scheduler prioritizes warps in the CTA that has the largest number of warps waiting at the exit point. Experiments show that our implementation improves performance by 8.4% on average over the baseline warp scheduler.
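The selection policy described above can be sketched as follows. The warp record layout `(warp_id, cta_id, at_exit)` and the GTO-like fallback are assumptions for illustration, not the paper's hardware design.

```python
from collections import Counter

# Hypothetical sketch of exit-CTA-conscious warp selection: prefer a
# still-running warp from the CTA with the most warps stalled at its exit
# point, so that whole CTA can retire and free its resources sooner.

def pick_warp(warps):
    """warps: list of (warp_id, cta_id, at_exit). Return the warp id to schedule."""
    exit_counts = Counter(cta for _, cta, at_exit in warps if at_exit)
    if exit_counts:
        target_cta, _ = exit_counts.most_common(1)[0]
        for wid, cta, at_exit in warps:
            if cta == target_cta and not at_exit:
                return wid  # run the laggard warp of the nearly finished CTA
    return warps[0][0]  # fallback: oldest warp, GTO-style

warps = [(0, 0, False), (1, 1, True), (2, 1, True), (3, 1, False), (4, 0, False)]
print(pick_warp(warps))  # -> 3: CTA 1 has two warps already waiting at exit
```

Scheduling warp 3 first lets CTA 1 retire, after which a fresh CTA can be issued in its place.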

      • KCI-indexed

        KAWS: Coordinate Kernel-Aware Warp Scheduling and Warp Sharing Mechanism for Advanced GPUs

        Viet Tan Vo, Cheol Hong Kim — Korea Information Processing Society, 2021, Journal of Information Processing Systems Vol.17 No.6

        Modern graphics processing unit (GPU) architectures offer significant hardware resource enhancements for parallel computing. However, without software optimization, GPUs continuously exhibit hardware resource underutilization. In this paper, we indicate the need to alter warp scheduling schemes during different kernel execution periods to improve resource utilization. Existing warp schedulers are unaware of kernel progress and so cannot provide an effective scheduling policy. In addition, we identified the potential for improving resource utilization in multiple-warp-scheduler GPUs by sharing stalling warps with selected warp schedulers. To address the efficiency issues of present GPUs, we coordinated a kernel-aware warp scheduler and a warp-sharing mechanism (KAWS). The proposed warp scheduler tracks the execution progress of the running kernel and adapts to a more effective scheduling policy when the kernel progress reaches a point of resource underutilization. Meanwhile, the warp-sharing mechanism distributes stalling warps to warp schedulers whose execution pipeline units are ready. Our design achieves performance that is, on average, 7.97% higher than that of the traditional warp scheduler, with marginal additional hardware overhead.
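The two mechanisms in the abstract can be sketched together. The function names, the progress threshold, and the dictionary-based scheduler records are all hypothetical; the abstract does not specify the switch point or the alternative policy.

```python
# Sketch of the two KAWS ideas as described in the abstract (all names and
# the 0.8 threshold are assumptions for illustration):
# 1) kernel-aware scheduling: switch policy once kernel progress passes the
#    point where resources start going underutilized;
# 2) warp sharing: hand a stalling warp to another scheduler whose execution
#    pipeline unit is ready.

def choose_policy(issued_ctas, total_ctas, switch_point=0.8):
    """Return the active scheduling policy based on kernel progress."""
    progress = issued_ctas / total_ctas
    return "GTO" if progress < switch_point else "kernel-aware"

def share_warp(stalled_warp, schedulers):
    """Move a stalling warp to the first scheduler with a ready pipeline."""
    for sched in schedulers:
        if sched["pipeline_ready"]:
            sched["warps"].append(stalled_warp)
            return sched["id"]
    return None  # no scheduler ready; the warp keeps stalling in place

print(choose_policy(90, 100))  # -> kernel-aware (progress past the switch point)
schedulers = [{"id": 0, "pipeline_ready": False, "warps": []},
              {"id": 1, "pipeline_ready": True, "warps": []}]
print(share_warp("w7", schedulers))  # -> 1 (scheduler 1's pipeline is free)
```

The point of the coordination is that late in a kernel, fewer CTAs remain to hide stalls, so both adapting the policy and redistributing stalled warps target the same underutilization window.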
