Multilingual Abstract

Real-time operation is important for CNNs, which are widely used in video and audio processing. To improve the operating speed of CNNs, recent studies have investigated implementing them in hardware on FPGAs.
Most of these studies used Xilinx Vitis HLS, software that synthesizes CNN algorithms written in C directly into FPGA designs. This software has two drawbacks: it offers users few ways to intervene in the FPGA implementation process, and it requires an expensive FPGA to achieve high performance.
In this paper, we propose a new method for implementing a CNN on a low-cost FPGA using a hardware description language such as VHDL. To fit a low-cost FPGA, the number of data bits used for the CNN's nodes, weights, and biases was reduced, and memory duplication and parallel processing were used to improve operating speed. In addition, operating speed, FPGA resource usage, and hit ratio were compared with existing methods, showing that the proposed method is competitive.
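The abstract does not state the bit widths or the quantization scheme used, so the following is only a generic sketch of what reducing the data bits of nodes, weights, and biases can look like. It assumes a signed 8-bit fixed-point format with 6 fractional bits; the names `quantize` and `fixed_point_neuron` and all numeric values are hypothetical illustrations, not the paper's design.

```python
def quantize(values, frac_bits, total_bits):
    """Round floats to signed fixed-point integers, saturating at the bit width."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return [max(lo, min(hi, round(v * scale))) for v in values]


def fixed_point_neuron(inputs_q, weights_q, bias_q, frac_bits):
    """Integer multiply-accumulate followed by ReLU, using integer arithmetic only."""
    # Each product carries 2 * frac_bits fractional bits, so align the bias first.
    acc = bias_q << frac_bits
    for x, w in zip(inputs_q, weights_q):
        acc += x * w
    acc = max(acc, 0)        # ReLU
    return acc >> frac_bits  # truncate back to frac_bits fractional bits


# Assumed format: 8 bits total, 6 fractional bits (not taken from the paper).
FRAC_BITS, TOTAL_BITS = 6, 8
inputs = [0.5, 0.25, 0.75]
weights = [0.3, 0.8, -0.1]
bias = 0.1

x_q = quantize(inputs, FRAC_BITS, TOTAL_BITS)
w_q = quantize(weights, FRAC_BITS, TOTAL_BITS)
b_q = quantize([bias], FRAC_BITS, TOTAL_BITS)[0]

y_q = fixed_point_neuron(x_q, w_q, b_q, FRAC_BITS)

# Floating-point reference for the same neuron, to measure quantization error.
y_float = max(sum(x * w for x, w in zip(inputs, weights)) + bias, 0.0)
error = abs(y_q / (1 << FRAC_BITS) - y_float)
print(x_q, w_q, b_q, y_q, error)
```

Narrower formats need fewer multiplier and memory resources on the FPGA, at the cost of a larger `error`. Varying `FRAC_BITS` and `TOTAL_BITS` in this sketch shows that trade-off on a single neuron.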

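Abstract (translated from Korean)

CNNs, widely used in video and audio processing, depend on real-time operation. Recently, methods for implementing CNNs in hardware using FPGAs have been studied to improve CNN operating speed. Most of these studies used Xilinx Vitis HLS software, which implements CNN algorithms written in C directly on an FPGA. Such software gives the user few ways to intervene in the FPGA implementation process and needs an expensive FPGA for high performance. This paper proposes a new method for implementing CNN algorithms on a low-cost FPGA using a chip design language such as VHDL. To implement the design on a low-cost FPGA, the number of data bits for the CNN's nodes, weights, and biases is reduced, and memory duplication and parallel processing are used to improve operating speed. The proposed method is also compared with existing methods in terms of operating speed, FPGA resource usage, and hit ratio, and is shown to be competitive.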

Implementation of Convolutional Neural Networks using Low-cost FPGAs (저가형 FPGA를 사용한 합성곱 신경망의 구현)