TY - JOUR
T1 - PIMCaffe
T2 - Functional Evaluation of a Machine Learning Framework for In-Memory Neural Processing Unit
AU - Jeon, Won
AU - Lee, Jiwon
AU - Kang, Dongseok
AU - Kal, Hongju
AU - Ro, Won Woo
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2021
Y1 - 2021
N2 - The large memory footprint of recent machine learning applications imposes a significant system burden in terms of power and processing speed. Processing-In-Memory (PIM) techniques can be applied to address this problem. In particular, the recommendation system, one of the major machine learning applications used in data centers, requires a large memory capacity and is therefore a suitable candidate for PIM. In this paper, we introduce PIMCaffe, a machine learning framework designed for in-memory neural processing units, together with its evaluation environment. PIMCaffe consists of two components: a Caffe2-based deep learning framework that supports PIM acceleration and a PIM-emulating hardware platform. We develop a suite of functions, libraries, application programming interfaces, and a device driver to support the framework. In addition, we implement a prototype Neural Processing Unit (NPU) in PIMCaffe to evaluate the performance of our platform with machine learning applications. Our prototype NPU design includes a vector processor for parallel vector processing and a systolic array unit for matrix multiplication. Using the proposed software framework, we perform a detailed analysis of the in-memory neural processing unit. PIMCaffe supports evaluations of recommendation systems and various convolutional neural network models on the in-memory neural processing unit. PIMCaffe with the NPU shows speedups of up to $2.26\times$, $5.99\times$, and $1.71\times$ over the ARM Cortex-A53 CPU for the recommendation system, AlexNet, and ResNet-50, respectively.
AB - The large memory footprint of recent machine learning applications imposes a significant system burden in terms of power and processing speed. Processing-In-Memory (PIM) techniques can be applied to address this problem. In particular, the recommendation system, one of the major machine learning applications used in data centers, requires a large memory capacity and is therefore a suitable candidate for PIM. In this paper, we introduce PIMCaffe, a machine learning framework designed for in-memory neural processing units, together with its evaluation environment. PIMCaffe consists of two components: a Caffe2-based deep learning framework that supports PIM acceleration and a PIM-emulating hardware platform. We develop a suite of functions, libraries, application programming interfaces, and a device driver to support the framework. In addition, we implement a prototype Neural Processing Unit (NPU) in PIMCaffe to evaluate the performance of our platform with machine learning applications. Our prototype NPU design includes a vector processor for parallel vector processing and a systolic array unit for matrix multiplication. Using the proposed software framework, we perform a detailed analysis of the in-memory neural processing unit. PIMCaffe supports evaluations of recommendation systems and various convolutional neural network models on the in-memory neural processing unit. PIMCaffe with the NPU shows speedups of up to $2.26\times$, $5.99\times$, and $1.71\times$ over the ARM Cortex-A53 CPU for the recommendation system, AlexNet, and ResNet-50, respectively.
KW - Deep learning framework
KW - FPGA prototyping
KW - functionality
KW - neural processing unit
KW - processing-in-memory
KW - recommendation system
KW - system verification
UR - http://www.scopus.com/inward/record.url?scp=85110813956&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85110813956&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2021.3094043
DO - 10.1109/ACCESS.2021.3094043
M3 - Article
AN - SCOPUS:85110813956
SN - 2169-3536
VL - 9
SP - 96629
EP - 96640
JO - IEEE Access
JF - IEEE Access
M1 - 9469808
ER -