HPAC-ML

A Programming Model for Embedding ML Surrogates in Scientific Applications

Machine learning has shown immense potential for accelerating scientific computing, but integrating ML models into existing scientific applications remains challenging. In our paper, we introduce HPAC-ML (Fink et al., 2024), a programming model that makes it easy for developers to embed ML surrogate models into scientific applications. Using simple annotations, developers can specify which parts of their code should be replaced with ML models and how data should flow between the application and these models. HPAC-ML handles all the complex details of data transformation and model execution behind the scenes. The following example uses HPAC-ML to embed NN surrogates in a five-point stencil application. With these few lines of code, the application is equipped to capture training data to train a model and use a trained model to replace the computation with NN inference.

This example replaces the main computation of a 2-D stencil code with NN surrogate inference. Green highlights the declaration of tensor functors, which are parametric mappings between application memory and ML tensors. Red highlights the declaration of tensor maps, which apply tensor functors to application memory. Blue highlights the code region that HPAC-ML either collects data from or replaces entirely with an NN model.
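To make the data flow concrete, here is a minimal sketch in plain Python of the two roles described above: mapping application memory to tensor rows, and running the region either to collect training data or to substitute a model. This is not HPAC-ML syntax. Every name in it (`gather_inputs`, `scatter_outputs`, `run_region`) is hypothetical, and the averaging kernel is an assumed stand-in for the application's stencil.

```python
# Illustrative sketch only: plain Python, not HPAC-ML annotations.
# All function names here are hypothetical.

def gather_inputs(grid):
    """Flatten each interior cell's five-point neighborhood into one feature row."""
    rows = []
    for i in range(1, len(grid) - 1):
        for j in range(1, len(grid[0]) - 1):
            rows.append([grid[i - 1][j], grid[i][j - 1], grid[i][j],
                         grid[i][j + 1], grid[i + 1][j]])
    return rows


def scatter_outputs(values, grid):
    """Write one output value per interior cell into a copy of the grid."""
    out = [row[:] for row in grid]  # boundary cells keep their original values
    it = iter(values)
    for i in range(1, len(grid) - 1):
        for j in range(1, len(grid[0]) - 1):
            out[i][j] = next(it)
    return out


def exact_kernel(row):
    """Assumed original computation: average of a cell and its four neighbors."""
    return sum(row) / 5.0


def run_region(grid, mode, model=None, dataset=None):
    """Run the region in 'collect' mode (exact kernel) or 'infer' mode (surrogate)."""
    inputs = gather_inputs(grid)
    if mode == "infer":
        outputs = [model(row) for row in inputs]   # surrogate replaces the kernel
    elif mode == "collect":
        outputs = [exact_kernel(row) for row in inputs]
        dataset.extend(zip(inputs, outputs))       # record (input, output) training pairs
    else:
        raise ValueError(f"unknown mode: {mode}")
    return scatter_outputs(outputs, grid)


grid = [[float(i * 4 + j) for j in range(4)] for i in range(4)]
training_data = []
reference = run_region(grid, "collect", dataset=training_data)
# Stand-in "model" that equals the exact kernel; a real surrogate would be a trained NN.
approx = run_region(grid, "infer", model=lambda row: sum(row) / 5.0)
```

The same region code serves both modes, which mirrors the point of the caption: the mapping between application memory and tensors is declared once, and only the choice between collection and inference changes.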

We evaluated HPAC-ML on five scientific applications, exploring over 5,000 ML models to understand the tradeoffs between model size, speed, and accuracy. We achieved speedups of up to 83.6x while maintaining high accuracy. In one case, our ML models outperformed an existing approximation algorithm in both speed and accuracy. The study also yielded insights into how ML models behave in iterative scientific simulations, where even small errors can compound over time.
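The compounding effect can be illustrated with a toy recurrence. This is an assumed model and not taken from the paper. Suppose each exact step multiplies the state by `growth`, and the surrogate reproduces that step with a fixed relative error `eps`. After n steps the relative error is (1 + eps)^n - 1, so it grows with the iteration count even though the per-step error stays constant.

```python
# Toy illustration of error compounding; the recurrence and parameters are assumptions.

def relative_errors(eps, growth, steps, x0=1.0):
    """Track the relative error of an approximate iteration against the exact one."""
    exact, approx = x0, x0
    errors = []
    for _ in range(steps):
        exact = growth * exact                   # exact update
        approx = growth * (1.0 + eps) * approx   # surrogate update with per-step error
        errors.append(abs(approx - exact) / abs(exact))
    return errors


errors = relative_errors(eps=1e-3, growth=1.05, steps=100)
print(f"after 1 step: {errors[0]:.4%}, after 100 steps: {errors[-1]:.4%}")
```

With these assumed values, a 0.1% per-step error grows to roughly 10% after 100 steps, which is why single-step accuracy alone does not predict a surrogate's accuracy inside an iterative solver.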

This work demonstrates a practical path forward for accelerating scientific computing with machine learning. By making it dramatically easier to experiment with ML surrogates in real applications, HPAC-ML enables scientists to focus on their domain problems rather than wrestling with ML integration challenges. The project is open source and available on GitHub, providing a foundation for future research at the intersection of scientific computing and machine learning.

HPAC-ML was published at SC’24 in Atlanta, and you can read the full paper here.

References

2024

  1. HPAC-ML: A Programming Model for Embedding ML Surrogates in Scientific Applications
    Zane Fink, Konstantinos Parasyris, Praneet Rathi, and 3 more authors
    In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (to appear), 2024