Runtime Performance Prediction for Deep Learning Models with Graph Neural Network
- Yanjie Gao,
- Xianyu Gu,
- Hongyu Zhang,
- Haoxiang Lin,
- Mao Yang
Published by IEEE/ACM
The 45th International Conference on Software Engineering: Software Engineering in Practice (SEIP) Track
Deep learning models have been widely adopted in many application domains. Predicting the runtime performance of deep learning models, such as GPU memory consumption and training time, is important for boosting development productivity and reducing resource waste, because improper configurations of hyperparameters and neural architectures can result in many failed training jobs or unsatisfactory models. However, runtime performance prediction for deep learning models is challenging due to the hybrid programming paradigm, complicated hidden factors within the framework runtime, the enormous model configuration space, and the broad differences among models. In this paper, we propose DNNPerf, a novel ML-based tool that predicts the runtime performance of deep learning models using a Graph Neural Network. DNNPerf represents a model as a directed acyclic computation graph and incorporates a rich set of performance-related features based on the computational semantics of both nodes and edges. We also propose a new Attention-based Node-Edge Encoder to encode the node and edge features. DNNPerf is evaluated on thousands of configurations of real-world and synthetic deep learning models to predict their GPU memory consumption and training time. The experimental results show that DNNPerf achieves accurate predictions, with an overall error of 7.4% for training time prediction and 13.7% for GPU memory consumption prediction, confirming its effectiveness.
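To make the graph-based formulation concrete, the sketch below is a minimal, illustrative PyTorch example, not the authors' DNNPerf implementation. It assumes hypothetical node features (e.g., FLOPs, parameter count, output tensor size) and edge features (e.g., bytes of the tensor passed between operators), and shows how an attention-based node-edge encoder could aggregate them over a computation DAG into a single scalar prediction such as training time or GPU memory consumption.

```python
# Illustrative sketch only (not the DNNPerf implementation): a minimal
# attention-based node-edge encoder over a computation-graph DAG, with
# made-up node and edge feature dimensions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionNodeEdgeEncoder(nn.Module):
    """One message-passing layer that attends over incoming edges,
    mixing the features of the source node and the connecting edge."""

    def __init__(self, node_dim: int, edge_dim: int, hidden_dim: int):
        super().__init__()
        self.node_proj = nn.Linear(node_dim, hidden_dim)
        self.edge_proj = nn.Linear(edge_dim, hidden_dim)
        self.attn = nn.Linear(2 * hidden_dim, 1)           # scores (message, target) pairs
        self.update = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, x, edge_index, edge_attr):
        # x:          [num_nodes, node_dim]   operator features
        # edge_index: [2, num_edges]          (source, target) node indices
        # edge_attr:  [num_edges, edge_dim]   tensor-flow features per edge
        h = self.node_proj(x)                                # [N, H]
        src, dst = edge_index
        msg = h[src] + self.edge_proj(edge_attr)             # per-edge message [E, H]
        score = self.attn(torch.cat([msg, h[dst]], dim=-1))  # unnormalized attention [E, 1]

        # Softmax over the incoming edges of each target node.
        alpha = torch.zeros_like(score)
        for node in dst.unique():
            mask = dst == node
            alpha[mask] = F.softmax(score[mask], dim=0)

        # Aggregate attention-weighted messages into each target node.
        agg = torch.zeros_like(h)
        agg.index_add_(0, dst, alpha * msg)
        return F.relu(self.update(torch.cat([h, agg], dim=-1)))


class PerfPredictor(nn.Module):
    """Stacks encoder layers and pools node embeddings into a single
    scalar prediction (e.g., training time or GPU memory)."""

    def __init__(self, node_dim, edge_dim, hidden_dim=64, num_layers=2):
        super().__init__()
        self.input = AttentionNodeEdgeEncoder(node_dim, edge_dim, hidden_dim)
        self.layers = nn.ModuleList(
            [AttentionNodeEdgeEncoder(hidden_dim, edge_dim, hidden_dim)
             for _ in range(num_layers - 1)]
        )
        self.readout = nn.Linear(hidden_dim, 1)

    def forward(self, x, edge_index, edge_attr):
        h = self.input(x, edge_index, edge_attr)
        for layer in self.layers:
            h = layer(h, edge_index, edge_attr)
        return self.readout(h.mean(dim=0))  # graph-level prediction


# Toy usage: a 3-node chain Conv -> ReLU -> Linear with random features.
x = torch.rand(3, 8)                          # 8 hypothetical node features per operator
edge_index = torch.tensor([[0, 1], [1, 2]])   # edges 0->1 and 1->2
edge_attr = torch.rand(2, 4)                  # 4 hypothetical edge features
model = PerfPredictor(node_dim=8, edge_dim=4)
print(model(x, edge_index, edge_attr))        # predicted scalar, e.g., seconds or MB
```

The sketch only conveys the overall structure of graph-level regression over a computation DAG; the actual feature set, encoder design, and training procedure of DNNPerf are described in the paper.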