Abstract
Various studies have been carried out to improve the operational efficiency of deep neural networks (DNNs). However, the importance of reliability in DNNs has generally been overlooked. As the underlying semiconductor technology decreases in reliability, the probability that some components of computing devices fail increases, preventing high accuracy in DNN operations. To achieve high accuracy, it is necessary to ensure operational reliability even when faults occur. In this paper, we introduce a DNN reliability improvement scheme in 3D die-stacked memory called DRIS-3, based on the correlation between faults in weights and accuracy loss. We analyze the fault characteristics of conventional DNN models to find the bits that cause significant accuracy loss when faults are injected into weights. On the basis of these findings, we propose a reliability improvement structure that reduces faults on the bits that must be protected for accuracy, considering the asymmetric soft error rate (SER) per layer in 3D die-stacked memory. Experimental results show that the proposed method improves fault tolerance regardless of the type of model and the pruning applied. The fault tolerance based on bit error rate (BER) for a 1% accuracy loss is increased up to 104 times over the conventional model.
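The idea of injecting faults into weight bits to see which positions matter most can be illustrated with a minimal sketch. It is not the DRIS-3 method or the paper's experimental setup: it assumes weights stored as IEEE 754 float32, independent bit flips at a given BER, and synthetic Gaussian weights, and the helper names `flip_bit` and `mean_abs_change` are hypothetical. It measures the change in weight value rather than model accuracy.

```python
import numpy as np


def flip_bit(weights, bit, ber, rng):
    """Flip bit position `bit` (0 = mantissa LSB, 31 = sign) of each
    float32 weight independently with probability `ber`."""
    # Reinterpret the float32 bit patterns as unsigned integers (on a copy).
    raw = weights.astype(np.float32).view(np.uint32).copy()
    hit = rng.random(raw.shape) < ber          # which weights get a fault
    raw[hit] ^= np.uint32(1 << bit)            # toggle the selected bit
    return raw.view(np.float32)


def mean_abs_change(weights, bit, ber, seed=0):
    """Mean absolute change in weight value after injecting faults
    into one bit position (non-finite results are ignored)."""
    rng = np.random.default_rng(seed)
    faulty = flip_bit(weights, bit, ber, rng).astype(np.float64)
    diff = np.abs(faulty - weights.astype(np.float64))
    return float(np.mean(diff[np.isfinite(diff)]))


if __name__ == "__main__":
    # Synthetic weights (an assumption; real DNN weights would be loaded here).
    w = np.random.default_rng(1).normal(0.0, 0.05, size=10_000).astype(np.float32)
    for bit in (0, 22, 30, 31):
        print(f"bit {bit:2d}: mean |delta w| = {mean_abs_change(w, bit, ber=1e-2):.3e}")
```

In this sketch, a flip in the high exponent bit perturbs a small weight far more than a flip in a low mantissa bit, which is the kind of asymmetry that motivates protecting only selected bits.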
Original language | English |
---|---|
Title of host publication | Proceedings of the 56th Annual Design Automation Conference 2019, DAC 2019 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781450367257 |
Publication status | Published - 2019 Jun 2 |
Event | 56th Annual Design Automation Conference, DAC 2019 - Las Vegas, United States. Duration: 2019 Jun 2 → 2019 Jun 6 |
Publication series
Name | Proceedings - Design Automation Conference |
---|---|
ISSN (Print) | 0738-100X |
Conference
Conference | 56th Annual Design Automation Conference, DAC 2019 |
---|---|
Country/Territory | United States |
City | Las Vegas |
Period | 2019 Jun 2 → 2019 Jun 6 |
Bibliographical note
Funding Information: This work was supported by Samsung Research Funding Center of Samsung Electronics under Project Number SRFC-TB1803-02.
Publisher Copyright:
© 2019 Association for Computing Machinery.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Control and Systems Engineering
- Electrical and Electronic Engineering
- Modelling and Simulation