While single-ISA heterogeneous multi-core processors are widely used in mobile computing, typical code generations optimize the code for a single target core, leaving it less suitable for the other cores in the processor. We present a microarchitecture-aware code generation methodology to mitigate this issue. We first suggest adopting Function-Multi-Versioning (FMV) to execute application codes utilizing a core at full capacity regardless of its microarchitecture. We also propose to add a simple but powerful backend optimization pass in the compiler to further boost the performance of applicable cores. Based on these schemes, we developed an automated flow that analyzes the program and generates multiple versions of hot functions tailored to different microarchitectures. At runtime, the running core chooses an optimal version to maximize computation performance. The measurements confirm that the methodology improves the performance of Cortex-A55 and Cortex-A75 cores in Samsung's next-generation Exynos 9820 processor by 11.2% and 17.9%, respectively, while running TensorFlow Lite.
Bibliographical noteFunding Information:
This work was supported in part by the National Research Foundation of Korea under Grant NRF-2019R1C1C1004927 and Grant NRF-2016R1C1B2016072, in part by the Information Technology Research Center Support Program under Grant IITP-2019-2018-0-01421, supervised by the Institute for Information and Communications Technology Promotion, and the System LSI Division of Samsung Electronics.
© 2013 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Science(all)
- Materials Science(all)