Abstract
High-performance computing (HPC) systems are used for computation-intensive tasks, and the software for these tasks often runs for a long time. To keep system utilization high, several tasks must run simultaneously while each still finishes within its deadline. Scheduling tasks well on an HPC system therefore requires knowing the runtime of each computation-intensive task without executing it. We propose a method for predicting the runtime of MPI-based software on HPC systems using automata theory and deep learning. We first analyze the source code of an input program by representing the code as finite automata and measuring their state complexities. Next, we use a deep neural network (DNN) to learn the execution runtime of each module of our finite automata together with its state complexity. We then combine all modules into a single software (SW) runtime-prediction model. In our experiments, we train the proposed model on OSU benchmark data, HPL, and two in-house datasets, and demonstrate its usefulness. We also demonstrate the adaptability of the model by updating it for new inputs using an incremental DNN, an important feature for coping with new software or new systems.
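The abstract does not say how state complexity is measured. As a minimal sketch, assuming the standard notion (the number of states in the minimal deterministic finite automaton for a language), the quantity could be computed as below. The function name `state_complexity`, the DFA encoding, and the toy automaton are all hypothetical illustrations and are not taken from the paper.

```python
from collections import deque


def state_complexity(alphabet, delta, start, accepting):
    """Count the states of the minimal DFA equivalent to a complete DFA.

    Assumption: `delta` maps every (state, symbol) pair to a next state.
    Uses Moore-style partition refinement; this is not the paper's code.
    """
    symbols = sorted(alphabet)

    # Step 1: keep only states reachable from the start state.
    reachable = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for symbol in symbols:
            nxt = delta[(state, symbol)]
            if nxt not in reachable:
                reachable.add(nxt)
                queue.append(nxt)

    # Step 2: begin with the accepting / non-accepting split and refine it
    # until no block can be split any further.
    block = {s: int(s in accepting) for s in reachable}
    while True:
        ids = {}
        refined = {}
        for s in reachable:
            signature = (block[s],
                         tuple(block[delta[(s, a)]] for a in symbols))
            # Relabel signatures with small integers to keep them compact.
            refined[s] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined


# Toy DFA over {a, b} accepting strings that end in "a".
# q1 and q2 are equivalent, and q3 is unreachable.
toy_delta = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q2", ("q1", "b"): "q0",
    ("q2", "a"): "q1", ("q2", "b"): "q0",
    ("q3", "a"): "q3", ("q3", "b"): "q3",
}
toy_complexity = state_complexity({"a", "b"}, toy_delta, "q0", {"q1", "q2"})
```

Here the four-state toy automaton collapses to two states, since the unreachable state is dropped and the two accepting states are merged. A per-module value of this kind is what a runtime predictor could take as a feature, under the assumption stated above.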
Original language | English |
---|---|
Title of host publication | Proceedings - 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion, ACSOS-C 2020 |
Editors | Esam El-Araby, Sven Tomforde, Timothy Wood, Pradeep Kumar, Claudia Raibulet, Ioan Petri, Gabriele Valentini, Phyllis Nelson, Barry Porter |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 7-12 |
Number of pages | 6 |
ISBN (Electronic) | 9781728184142 |
Publication status | Published - 2020 Aug |
Event | 1st IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion, ACSOS-C 2020 - Virtual, Washington, United States; Duration: 2020 Aug 17 → 2020 Aug 21 |
Publication series
Name | Proceedings - 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion, ACSOS-C 2020 |
---|
Conference
Conference | 1st IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion, ACSOS-C 2020 |
---|---|
Country/Territory | United States |
City | Virtual, Washington |
Period | 2020 Aug 17 → 2020 Aug 21 |
Bibliographical note
Publisher Copyright: © 2020 IEEE.
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Hardware and Architecture
- Control and Optimization
- Information Systems
- Computer Science Applications
- Software
- Safety, Risk, Reliability and Quality
- Control and Systems Engineering
- Computer Networks and Communications