Increasing exponentially with technology scaling, the soft error rate even in earth-bound embedded systems manufactured in deep subnanometer technology is projected to become a serious design consideration. Partially protected cache (PPC) is a promising microarchitectural feature to mitigate failures due to soft errors in power, performance, and cost sensitive embedded processors. A processor with PPC maintains two caches, one protected and the other unprotected, both at the same level of memory hierarchy. The intuition behind PPCs is that not all data in the application is equally prone to soft errors. By finding and mapping the data that is more prone to soft errors to the protected cache, and error-resilient data to the unprotected cache, failures induced by soft errors can be significantly reduced at aminimal power and performance penalty. Consequently, the effectiveness of PPCs critically hinges on the compiler's ability to partition application data into error-prone and error-resilient data. The effectiveness of PPCs has previously been demonstrated on multimedia applications-where an obvious partitioning of data exists, the multimedia data is inherently resilient to soft errors, and the rest of the data and the entire code is assumed to be error-prone. Since the amount of multimedia data is a quite significant component of the entire application data, this obvious partitioning is quite effective. However, no such obvious data and code partitioning exists for general applications. This severely restricts the applicability of PPCs to data caches and instruction caches in general. This article investigates vulnerability-based partitioning schemes that are applicable to applications in general and effectively reduce failures due to soft errors at minimal power and performance overheads. Our experimental results on an HP iPAQ-like processor enhanced with PPC architecture, running benchmarks from the MiBench suite demonstrate that our partitioning heuristic efficiently finds page partitions for data PPCs that can reduce the failure rate by 48% at only 2% performance and 7% energy overhead, and finds page partitions for instruction PPCs that reduce the failure rate by 50% at only 2% performance and 8% energy overhead, on average.
|Journal||ACM Transactions on Design Automation of Electronic Systems|
|Publication status||Published - 2010 Sept|
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Computer Graphics and Computer-Aided Design
- Electrical and Electronic Engineering