RISS (Research Information Sharing Service)

      • KCI-listed

        Extended Null Pointer Check Elimination Using Specialized Methods

        최형규(Hyung-Kyu Choi),문수묵(Soo-Mook Moon) 한국정보과학회 2012 정보과학회 컴퓨팅의 실제 논문지 Vol.18 No.3

        Just-in-time compilation (JITC) and ahead-of-time compilation (AOTC) are representative techniques for improving the performance of the Java virtual machine (JVM). Besides traditional compiler optimizations, they apply Java-specific optimizations to generate efficient code. One such optimization is null pointer check elimination, which has long been considered mandatory in most JVMs because it achieves a noticeable performance improvement by eliminating the redundant overhead of checking null pointers. In this paper, we propose an extended null pointer check elimination that uses specialization. The proposed technique extends the scope of existing null pointer check elimination and eliminates additional null pointer checks. It does not require modifying the existing null pointer check elimination itself, because it preserves the semantics of that optimization, so it can be adopted in both just-in-time and ahead-of-time compilers. In our experiments with benchmark programs and real applications, the proposed optimization improved the performance of applications with many method calls and also yielded some improvement for applications dominated by general computation.
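        The abstract does not describe the mechanism in detail, so the following is only a minimal sketch of the general idea of specialization, written in Python rather than Java bytecode. It assumes that a caller that has already checked a reference can dispatch to a specialized copy of the callee that omits its own check. All names here (`null_check`, `norm1_general`, `norm1_nonnull`, and so on) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


# Counter standing in for the run-time cost of executed null checks.
_stats = {"checks": 0}


def reset_checks():
    _stats["checks"] = 0


def checks_executed():
    return _stats["checks"]


def null_check(ref):
    """Stand-in for the check a JVM performs before dereferencing a reference."""
    _stats["checks"] += 1
    if ref is None:
        raise ValueError("null pointer")


def norm1_general(p):
    # General version: the callee cannot know whether p is null, so it checks.
    null_check(p)
    return abs(p.x) + abs(p.y)


def norm1_nonnull(p):
    # Specialized version (hypothetical): valid only when the caller
    # guarantees that p is not null, so the check is omitted.
    return abs(p.x) + abs(p.y)


def sum_general(p, times):
    null_check(p)  # the caller's own check
    return sum(norm1_general(p) for _ in range(times))


def sum_specialized(p, times):
    null_check(p)  # after this point p is known to be non-null
    return sum(norm1_nonnull(p) for _ in range(times))
```

        In this toy setting, calling `sum_general` with `times=10` executes eleven checks, while `sum_specialized` executes one and returns the same value. A real compiler would have to prove the non-null fact before redirecting a call, which this sketch simply assumes.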

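      • KCI-listed (Excellent)

        HedgeHog: An Optimizing Homomorphic Encryption Compiler Specialized for Deep Neural Networks, Supporting the Fourth-Generation CKKS Homomorphic Encryption Scheme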

        이동권,이계진,김수찬,송우성,이도형,김훈,조승한,박규연,이광근 한국정보과학회 2022 정보과학회논문지 Vol.49 No.9

        We present HedgeHog, a new state-of-the-art optimizing homomorphic compiler for fourth-generation homomorphic encryption (HE) that accepts an easier-to-use high-level input language while producing high-performance code. Although homomorphic encryption enables safe and secure third-party computation, it is hard to write high-performance HE applications without expertise, which makes homomorphic compilers that automatically translate a high-level input language into HE code very important. However, most existing homomorphic compilers are based on HE schemes earlier than the fourth generation, which do not support real-number computation, so they cannot be used in areas such as neural networks or statistical analysis. The few compilers based on the fourth-generation CKKS scheme, which does support real-number computation, use input languages built on low-level operators such as addition and multiplication. This makes high-level programs such as neural network models hard to express and risks inefficient output code, because unnecessary operators are introduced in the process. To address these problems, HedgeHog's input language supports the core high-level DNN operators and is compiled automatically into HE code. In addition to being easier to use, the compiled HE code is up to 22% faster than that of EVA, the existing state-of-the-art fourth-generation homomorphic compiler.

      • Development and Analysis of an Extended-C Compiler

        송진국,장성민 진주산업대학교 2000 산업과학기술연구소보 Vol.- No.7

        This paper describes the development of hcc, an extended-C language compiler. We emphasize the importance of an object-oriented programming language and of code optimization techniques suitable for RISCs. The main objectives are support for Hangul, development of a C++ preprocessor, register allocation and optimization suitable for RISCs, design of an intermediate language and its interpreter, and development of a C compiler. We describe how a C++ program is translated to machine code through hcc. We also present the development of hcc and compare its performance with other systems.

      • KCI-listed

        Development of a Prototyping Tool for New Memory Subsystem

        조중석,조두산 한국인터넷방송통신학회 2019 International Journal of Internet, Broadcasting an Vol.11 No.1

        The compiler is the key component of a prototyping framework for a new memory system. Such compiler-centric prototyping tools comprise several components, including a compiler, linker, assembler, and standard libraries. Developing all of them from scratch takes considerable cost and manpower, so developers usually build on a development framework to create these prototyping tools efficiently. If the results are to be commercialized, the framework should be free of licensing issues. Developers should therefore look for a framework that is free of licensing issues and provides the complete development environment needed for actual execution. There are three representative compiler-centric development frameworks: GCC, Clang (LLVM), and Microsoft Visual Studio. They differ somewhat depending on the release version, and each has some limitations regarding free and commercial use. We chose LLVM here to explain the development of prototyping tools. This information will help accelerate the development of prototyping tools and reduce system development costs.

      • SCIE, SCOPUS

        Efficient embedded code generation with multiple load/store instructions

        Paek, Yunheung,Ahn, Minwook,Cho, Doosan,Kim, Taehwan John Wiley & Sons Ltd 2007 Software Vol.37 No.11

        In a recent study, we discovered that many single load/store operations in embedded applications can be parallelized and thus encoded simultaneously in a single-instruction multiple-data instruction, called the multiple load/store (MLS) instruction. In this work, we investigate the problem of utilizing MLS instructions to produce optimized machine code, and propose an effective approach to the problem. Specifically, we formalize the MLS problem, that is, the problem of maximizing the use of MLS instructions with an unlimited register file size. Based on this analysis, we show that we can solve the problem efficiently by translating it into a variant of the problem of finding a maximum weighted path cover in a dynamic weighted graph. To handle the more realistic case of a finite register file, our solution is then extended to take into account the constraints of register sequencing in MLS instructions and the limited register resources available in the target processor. We demonstrate the effectiveness of our approach experimentally by using a set of benchmark programs. In summary, our approach can reduce the number of loads/stores by 13.3% on average, compared with the code generated by existing compilers. The total code size reduction is 3.6%. This code size reduction comes at almost no cost because the overall increase in compilation time as a result of our technique remains minimal. Copyright © 2007 John Wiley & Sons, Ltd.

      • Performance Optimization of OpenMP Programs

        이명호,김용규 明知大學校 産業技術硏究所 2005 産業技術硏究所論文集 Vol.24 No.-

        OpenMP is fast becoming the standard paradigm for parallelizing applications for shared memory multiprocessor (SMP) systems. With a relatively small amount of coding effort, users can obtain scalable performance for their applications using OpenMP on SMP systems. In this paper, we present how the performance of OpenMP programs can be optimized. These optimizations, using Sun Studio compilers along with the Solaris™ 9 Operating System, have resulted in high performance and good scalability for the SPEC OMPL benchmarks on the SunFire™ line of servers. The benchmarks scale well up to 71 processors on a 72-processor SunFire 15K system.

      • Composition-based Cache simulation for structure reorganization

        Shin, K.,Han, H.,Choe, K.M. Elsevier 2010 JOURNAL OF SYSTEMS ARCHITECTURE - Vol.56 No.2

        Finding the best data layout has been an ultimate goal of memory optimization. Even with a data access profile, heuristic algorithms are needed to reorganize the data layout for better locality. The best layout could be found by running the given application with all possible data layouts and selecting the best-performing one. This approach, however, can incur too much overhead, particularly when the number of possible layouts is large. In this paper, we present a composition-based cache simulation for structure reorganization. Instead of running all possible layouts, we simulate only the primary subsets of layouts and compose the cache misses for all layouts by summing up the cache misses of component subsets. Our experiment with the composition-based cache simulation shows that the differences in cache misses are within 10% of the full cache simulation for 4-way and 8-way set associative caches. In addition to the cache miss estimation, our heuristic algorithm takes account of the extra instruction overhead incurred by structure reorganization. Our experiment with several structure-intensive benchmarks shows a 37% reduction in L1D read misses and a 28% reduction in L2 read misses. As a result, execution times are also reduced by 19% on average.
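        The composition step described above (summing the cache misses of component subsets to estimate a whole layout) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a layout is a partition of structure fields into groups, that each group's miss count has already been obtained by simulation, and that the function names and the numbers in the usage note are hypothetical.

```python
def estimate_layout_misses(layout, subset_misses):
    """Estimate the misses of a layout by summing the misses of its groups.

    layout        -- iterable of field groups, e.g. [("a", "b"), ("c",)]
    subset_misses -- dict mapping frozenset(group) to a simulated miss count
    """
    return sum(subset_misses[frozenset(group)] for group in layout)


def best_layout(layouts, subset_misses):
    """Pick the candidate layout with the lowest estimated miss count."""
    return min(layouts, key=lambda layout: estimate_layout_misses(layout, subset_misses))
```

        For example, with made-up miss counts `{a,b}: 40`, `{c}: 25`, `{a}: 30`, `{b,c}: 50`, and `{a,b,c}: 90`, the layout `[("a", "b"), ("c",)]` is estimated at 65 misses and would be preferred over `[("a",), ("b", "c")]` at 80 and the unsplit structure at 90. Which subsets count as "primary" and how the instruction overhead is weighed are not specified in the abstract and are left out here.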

      • KCI-listed

        Compiler Optimization Techniques for Next-Generation Low-Power Multibank Memory

        조두산,Cho, Doosan 한국인터넷방송통신학회 2021 한국인터넷방송통신학회 논문지 Vol.21 No.6

        Various types of memory architectures have been developed, and various compiler optimization techniques have been studied to use them efficiently. In particular, because memory is a major component that determines performance in mobile computing devices, many optimization techniques have been developed to support it. Recently, much research has been conducted on hybrid memory architectures, and compiler techniques that support them are being studied accordingly. Existing compiler optimization techniques can be used to meet the low-power constraints and the minimum required performance dictated by market requirements. However, references for gauging the low-power effect and the degree of performance improvement obtained with these optimization techniques are not yet adequately available. This study was conducted to provide experimental results for existing compiler techniques as a reference for the development of multibank memory architectures.

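      • KCI-listed

        Optimizing the Computing Environment to Improve the Computation Speed of Modeling (Meteorological and Air Quality) in the National Air Quality Forecasting System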

        명지수 ( Jisu Myoung ),김태희 ( Taehee Kim ),이용희 ( Yonghee Lee ),서인석 ( Insuk Suh ),장임석 ( Limsuk Jang ) 한국환경과학회 2018 한국환경과학회지 Vol.27 No.8

        In this study, we investigated an optimal configuration for the modeling system, because the national air quality forecast requires reliable model data to be produced very quickly. We performed optimization experiments that varied the compiler and library types and the number of CPU cores. We set up twelve experiments according to compiler (PGI and Intel), MPI (mvapich-2.0, mvapich-2.2, and mpich-3.2), and NetCDF (NetCDF-3.6.3 and NetCDF-4.1.3), and measured the wall clock time of the WRF and CMAQ models on the built computing resources. In the experiments on compiler and library type, WRF (30 min 30 s) and CMAQ (47 min 22 s) performed best with the combination of the Intel compiler, mvapich-2.0, and NetCDF-3.6.3. In the optimization by number of CPU cores, the WRF model performed best with 140 cores (five calculation servers) and the CMAQ model with 120 cores (five calculation servers). While the WRF model showed clear differences depending on the number of CPU cores rather than on the types of compilers and libraries, the CMAQ model showed the largest differences depending on the combination of compilers and libraries.
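        The twelve experiments follow from the full combination of the options listed in the abstract (2 compilers × 3 MPI libraries × 2 NetCDF versions). A minimal sketch of enumerating that matrix and picking the fastest measured configuration is shown below. The function names are hypothetical, and any timings passed to `fastest` would have to come from actual measurements, since the abstract reports only the best WRF and CMAQ times.

```python
from itertools import product

# Options as listed in the abstract.
COMPILERS = ["PGI", "Intel"]
MPIS = ["mvapich-2.0", "mvapich-2.2", "mpich-3.2"]
NETCDFS = ["NetCDF-3.6.3", "NetCDF-4.1.3"]


def experiment_matrix():
    """Return every (compiler, MPI, NetCDF) combination: 2 * 3 * 2 = 12."""
    return list(product(COMPILERS, MPIS, NETCDFS))


def fastest(timings):
    """Return the configuration with the smallest wall clock time.

    timings -- dict mapping a (compiler, MPI, NetCDF) tuple to seconds
    """
    return min(timings, key=timings.get)
```

        Keeping the matrix explicit makes it easy to confirm that no combination was skipped before comparing wall clock times.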

      • Optimization by Alias Analysis on Intermediate Code

        송진국 진주산업대학교 1999 論文集 Vol.38 No.-

        The presence of aliases makes data-flow analysis more complex and reduces readability, since they cause uncertainty regarding what is defined and used. Alias information enhances the performance of the register allocation and optimization phases. This paper describes the alias analysis, the optimization that uses the alias information, and the test results. We use TUP codes as the intermediate language. In the alias analysis phase, we use a flow graph and an iterative algorithm on this flow graph.
