Design of an Arithmetic Block Structure Supporting Kernel-wise Non-uniform Precision for Convolutional Neural Networks
Jongsung Kang and Taewhan Kim. Institute of Electronics and Information Engineers (IEIE), 2018 IEIE Conference, Vol.2018 No.11
We propose a new hardware structure for arithmetic computation that supports kernel-level non-uniform computation precision in a convolutional neural network (CNN). Since existing CNN compression techniques invariably assume uniform computation precision over all kernels in each convolutional layer, this work enables further exploration of CNN compression models, so that a compressed model can be fitted into a resource-limited architecture while minimizing the loss of prediction accuracy. We implemented the proposed design on a Virtex-7 FPGA and found that our design, which supports non-uniform kernel-level precision, achieves higher utilization of hardware resources than a design with uniform layer-level precision while using the same total number of bits to represent all kernel weights.
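The idea of kernel-level non-uniform precision can be illustrated in software. The sketch below (in Python, not from the paper) quantizes each kernel to its own bit width under a shared total bit budget; the symmetric uniform quantizer and the 6/4/2-bit allocation are illustrative assumptions, not the paper's actual scheme.

```python
import numpy as np

def quantize_kernel(weights, bits):
    """Uniformly quantize a kernel's weights to a symmetric signed
    fixed-point grid with the given bit width (sign bit included)."""
    max_abs = np.max(np.abs(weights))
    if max_abs == 0:
        return np.zeros_like(weights)
    levels = 2 ** (bits - 1) - 1          # symmetric signed range
    scale = max_abs / levels
    return np.round(weights / scale) * scale

# Three kernels sharing a 12-bit budget, allocated non-uniformly
# (6/4/2 bits) instead of the uniform layer-level split (4/4/4 bits).
rng = np.random.default_rng(0)
kernels = [rng.standard_normal((3, 3)) for _ in range(3)]
bit_alloc = [6, 4, 2]                     # hypothetical allocation
quantized = [quantize_kernel(k, b) for k, b in zip(kernels, bit_alloc)]
```

A compression search can then spend more bits on kernels whose quantization error hurts accuracy most, which is exactly the freedom a uniform per-layer precision forecloses.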
Performance Comparison of Bag of Words According to Image Feature Extraction Methods
Jongsung Kang, Jeongwoo Heo, and Taewhan Kim. Institute of Electronics and Information Engineers (IEIE), 2015 IEIE Conference, Vol.2015 No.6
Bag of Words (BoW) is an image classification method that selects representative feature points for each category and decides the category of an arbitrary image by comparing the frequency distributions of those feature points. In this method, the choice of feature extraction algorithm significantly affects both classification performance and processing speed. In this work, we evaluate the performance of BoW according to the choice of feature extraction method. In summary, we found that SIFT and SURF, which inherently suffer from long processing times on mobile SoCs, did not achieve sufficient classification performance.
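The histogram-comparison step of BoW can be sketched as follows. This is a minimal NumPy illustration (not the paper's implementation): local descriptors are assigned to their nearest codeword, counted into a normalized histogram, and the image is given the category whose reference histogram is closest; the L1 distance and the tiny 2-D descriptors are assumptions for brevity.

```python
import numpy as np

def build_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword and count
    occurrences, yielding the image's bag-of-words histogram."""
    # squared Euclidean distance from every descriptor to every codeword
    d = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()              # normalize by descriptor count

def classify(hist, category_hists):
    """Pick the category whose reference histogram is closest (L1)."""
    dists = [np.abs(hist - h).sum() for h in category_hists]
    return int(np.argmin(dists))
```

In a full pipeline, the descriptors would come from a detector such as SIFT or SURF, which is precisely the stage whose cost dominates on a mobile SoC.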
A Technique for Generating Fast Arithmetic Blocks for Convolutional Neural Network Processors
Jongsung Kang and Taewhan Kim. Institute of Electronics and Information Engineers (IEIE), 2016 IEIE Conference, Vol.2016 No.11
This work addresses the problem of generating a fast arithmetic circuit for each node computation of a convolutional neural network (CNN). Once the training phase of a CNN is done, the computation at each node can be expressed as a sum of multiplications with constant inputs (i.e., with the known weights obtained from training). Consequently, we can convert each multiplication into a sum of multiple addends, producing a sum-of-addends expression for the node computation, from which we generate a fast carry-save adder (CSA) implementation structure. Through gate-level implementation of the proposed technique, we reduce the node computation time by 8.8% on average, with a small increase in power consumption that is mainly caused by our prototype's incompletely optimized CSA circuits in comparison with commercial CSAs.
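The constant-multiplication-to-addends conversion and the CSA reduction can be demonstrated bit-level in software. The sketch below (Python, not the paper's circuit generator) expands a known nonnegative integer weight into shifted copies of the input, one per set bit, then compresses the addends with 3:2 carry-save stages; only the final two operands need a carry-propagate addition.

```python
def csa(a, b, c):
    """One 3:2 carry-save stage: compress three operands into a
    (sum, carry) pair without propagating carries between bit positions."""
    s = a ^ b ^ c
    cy = ((a & b) | (b & c) | (a & c)) << 1
    return s, cy

def constant_multiply_csa(x, weight):
    """Multiply x by a known constant weight by expanding the weight's
    set bits into shifted addends and reducing them with a CSA tree."""
    addends = [x << i for i in range(weight.bit_length()) if (weight >> i) & 1]
    while len(addends) > 2:
        a, b, c = addends.pop(), addends.pop(), addends.pop()
        s, cy = csa(a, b, c)
        addends += [s, cy]
    return sum(addends)   # final carry-propagate addition
```

Because each CSA stage has constant delay regardless of operand width, the reduction tree's depth grows only logarithmically in the number of addends, which is the source of the speedup over a chain of carry-propagate adders.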