Performance optimizations for scalable implicit RANS calculations with SU2

Keywords: Multigrid method; Speedup; Stencil; Solver; Distributed memory; Xeon
DOI: 10.1016/j.compfluid.2016.02.003 Publication Date: 2016-02-14T05:57:25Z
ABSTRACT
In this paper, we present single- and multi-node optimizations of SU2, a widely used, open-source Computational Fluid Dynamics application, aimed at improving performance and scalability for implicit Reynolds-averaged Navier–Stokes calculations on unstructured grids. Typical industry-standard implementations are currently limited by unstructured accesses, variable degrees of parallelism, and the global synchronizations inherent in traditionally used Krylov linear solvers. Therefore, we rely on aggressive single-node optimizations, such as hierarchical parallelism, dynamic threading, compacted memory layout, and vectorization, along with a communication-friendly agglomeration (geometric) linear multigrid solver. Based on results with the well-known ONERA M6 geometry, our single-core and shared-memory optimizations result in a speedup of 2.6X on the latest 14-core Intel® Xeon™ E5-2697v3 processor when compared to the baseline SU2 implementation with 14 MPI ranks. In multi-node settings, the hybrid OpenMP+MPI multigrid implementation achieves 2X higher parallel efficiency on 256 nodes than conventional Krylov-based (GMRES) methods.
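The two headline figures above (a 2.6X single-node speedup and 2X higher parallel efficiency on 256 nodes) can be read against the usual strong-scaling definitions. The sketch below is only illustrative: the abstract does not state the exact definitions used in the paper, the function names are hypothetical, and every timing is a made-up placeholder chosen to reproduce the quoted ratios, not a measured result.

```python
def speedup(t_baseline: float, t_optimized: float) -> float:
    """Ratio of baseline wall time to optimized wall time."""
    return t_baseline / t_optimized


def parallel_efficiency(t_ref: float, n_ref: int, t_n: float, n: int) -> float:
    """Strong-scaling efficiency of a run on n nodes relative to a
    reference run on n_ref nodes (1.0 means ideal scaling)."""
    return (t_ref * n_ref) / (t_n * n)


# Hypothetical timings in seconds (assumptions, not data from the paper).
single_node_speedup = speedup(t_baseline=26.0, t_optimized=10.0)

# Same hypothetical 1-node reference time for both solvers.
eff_multigrid = parallel_efficiency(t_ref=1024.0, n_ref=1, t_n=8.0, n=256)
eff_gmres = parallel_efficiency(t_ref=1024.0, n_ref=1, t_n=16.0, n=256)

print(f"speedup: {single_node_speedup:.1f}X")
print(f"efficiency ratio: {eff_multigrid / eff_gmres:.1f}X")
```

With these placeholder numbers, the multigrid run reaches an efficiency of 0.5 against 0.25 for GMRES, which is what a "2X higher parallel efficiency" statement would mean under this definition.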