Most recent artificial intelligence services are implemented using deep learning–based artificial neural networks. These networks are powerful tools for learning meaningful representations from unstructured data, and in recent years their architectures have largely converged on, and standardized around, convolutional neural networks. As neural-network models grow in complexity and are deployed more widely, their computational workload and energy consumption increase dramatically, which creates a need for high-performance, low-power hardware accelerators that can handle these workloads efficiently. This thesis proposes an MRAM-based analog process-in-memory (MPIM) architecture with ternary neuron outputs. The proposed architecture (1) improves array efficiency by binarizing inputs to {0, 1} instead of {−1, 1}, (2) mitigates accuracy degradation through an input-dependent current compensation circuit, and (3) reduces conversion energy overhead via a 3-bit down-scaling ADC. Simulation results using a two-layer perceptron show that the proposed MPIM achieves 90.3% inference accuracy on the MNIST dataset and a peak energy efficiency of 819.8 TOPS/W at 200 MHz.
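As a rough behavioral illustration of the data path summarized above, the following Python sketch models a single neuron with {0, 1} inputs, a 3-bit quantization step, and a ternary output. It is a minimal sketch under stated assumptions, not the thesis's circuit: the function names, the {−1, +1} weight encoding, the uniform quantizer range, and the two ternary thresholds are all hypothetical, and neither the input-dependent current compensation nor the down-scaling behavior of the ADC is modeled.

```python
# Behavioral sketch only. All names, the weight encoding, the quantizer range,
# and the thresholds are hypothetical and are not taken from the thesis.

def mac_binary_inputs(inputs, weights):
    """Sum the weights of the rows whose input bit is 1 (inputs in {0, 1}).

    Rows with input 0 contribute nothing, mimicking an inactive array row.
    """
    return sum(w for x, w in zip(inputs, weights) if x == 1)


def quantize_3bit(value, lo, hi):
    """Map a value in [lo, hi] onto one of 8 uniform codes (0..7).

    This is a plain uniform quantizer standing in for a 3-bit ADC.
    """
    clipped = min(max(value, lo), hi)
    scaled = (clipped - lo) / (hi - lo) * 7
    return int(scaled + 0.5)  # round half up; scaled is never negative


def ternary_output(code, low_code=2, high_code=5):
    """Collapse a 3-bit code to {-1, 0, +1} using two assumed thresholds."""
    if code <= low_code:
        return -1
    if code >= high_code:
        return 1
    return 0


def neuron(inputs, weights):
    """One neuron: binary-input accumulation, 3-bit conversion, ternary output."""
    n = len(weights)
    acc = mac_binary_inputs(inputs, weights)
    # Assumed weights in {-1, +1}, so the accumulated sum lies in [-n, n].
    return ternary_output(quantize_3bit(acc, -n, n))


if __name__ == "__main__":
    print(neuron([1, 0, 1, 1], [1, -1, 1, 1]))
    print(neuron([0, 0, 0, 0], [1, -1, 1, 1]))
```

In this toy model, an input of 0 simply leaves its row out of the accumulation, which is one way to picture why a {0, 1} input encoding can be simpler to realize in an array than a {−1, 1} encoding; whether the thesis obtains its array-efficiency gain in this way is not stated in the abstract.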