To evaluate the main IP blocks being developed in Mont-Blanc 2020, a comprehensive evaluation environment is necessary. In particular, this involves selecting a set of applications to be ported to the Arm ISA and SVE, so that the impact of the various design options on the applications can be measured.
Evaluated on the Mont-Blanc 2020 demonstrator, these applications then help test the technical requirements studied by the project. The demonstrator is an emulation platform that will enable the integration and evaluation of the RTL designs developed by the different project partners.
Mont-Blanc 2020 mini applications
The applications must be carefully chosen to be truly representative of a real-world mix of applications. Mont-Blanc 2020’s selection was based both on knowledge of how supercomputers are used today at our partners’ sites and on expectations of what a future application mix might be. The applications should also cover a broad range of computational characteristics, from stencil-type applications to highly irregular applications. Finally, they should stress the particular requirements that the project is aiming to study.
Selecting a representative set of applications
The following applications have been selected. See deliverable D3.1 for more details on each application.
- DL POLY Classic
- High Performance Linpack (HPL)
- MontBlanc Benchmarks
- RAJA Performance Suite
- Quantum Espresso
Porting the Mont-Blanc 2020 applications to the Arm ISA and SVE
The porting efforts and the lessons drawn from them are detailed in deliverable D3.5. In outline, these efforts are:
- to generate appropriate binaries for the Arm Instruction Set Architecture (ISA) by using available tools like the Arm Compiler for HPC;
- to leverage the Scalable Vector Extension (SVE) to achieve competitive performance; and
- to ensure correct execution of ported applications by using available emulation and simulation tools such as ArmIE and gem5.
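As an illustration of the last point, one common way to check a ported kernel is to run it under an emulator or simulator and compare its output against reference values produced by a scalar build. The sketch below is only an assumption about how such a check could look, not the validation procedure used in the project; the helper name count_mismatches and the tolerance policy are hypothetical. A tolerance is used instead of a bitwise comparison because vectorised code may reorder floating-point operations and therefore round differently.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without pulling in libm. */
static double abs_d(double v) { return v < 0.0 ? -v : v; }

/*
 * Hypothetical helper (not from the Mont-Blanc deliverables):
 * counts the elements of `got` that differ from `ref` by more than
 * rel_tol * max(|ref[i]|, 1.0). NaN values in `got` count as mismatches.
 */
size_t count_mismatches(const double *ref, const double *got,
                        size_t n, double rel_tol)
{
    assert(rel_tol >= 0.0);
    size_t bad = 0;
    for (size_t i = 0; i < n; ++i) {
        double mag = abs_d(ref[i]);
        double scale = mag > 1.0 ? mag : 1.0;  /* floor avoids a zero bound near 0 */
        double diff = abs_d(got[i] - ref[i]);
        if (!(diff <= rel_tol * scale))        /* negated form also catches NaN */
            ++bad;
    }
    return bad;
}
```

A return value of zero means that the ported kernel agrees with the reference within the chosen tolerance; the appropriate tolerance depends on the application.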
Exploiting SVE capabilities
Most of the effort has been devoted to porting the applications to exploit SVE capabilities. This has been achieved in different ways depending on the application, listed here from lowest to highest level of effort:
- using Arm Performance Libraries,
- using Arm C Language Extensions (intrinsics), or
- hand-tuned assembly code.
Applications based on simple loops that have regular or contiguous memory access patterns can rely on compiler auto-vectorisation, while low-level kernels that require high performance are likely to benefit from hand-tuned assembly implementations. A significant effort has been made to achieve good SVE performance; however, performance fine-tuning has not been a main objective, because there was no target architecture to optimise for.
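To make the distinction between regular and irregular access patterns concrete, here is a minimal sketch of two loops. The function names triad and scatter_add are hypothetical and are not taken from the Mont-Blanc applications. The first loop has unit-stride accesses and restrict-qualified pointers, which is the kind of code compilers can usually vectorise on their own (for example when built with -O3 -march=armv8-a+sve). The second writes through an index array; because indices may repeat, successive iterations can depend on each other, and compilers typically leave such loops scalar unless given more information.

```c
#include <assert.h>
#include <stddef.h>

/* Regular, contiguous access: c[i] = a[i] + s * b[i].
 * `restrict` promises the arrays do not overlap, so iterations are
 * independent and the loop is a good auto-vectorisation candidate. */
void triad(size_t n, double s, const double *restrict a,
           const double *restrict b, double *restrict c)
{
    for (size_t i = 0; i < n; ++i)
        c[i] = a[i] + s * b[i];
}

/* Irregular access: y[idx[i]] += x[i], with y holding m elements.
 * If an index occurs twice, the second update must see the first one,
 * which is a loop-carried dependence the compiler has to respect. */
void scatter_add(size_t n, size_t m, const size_t *restrict idx,
                 const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        assert(idx[i] < m);  /* guard against out-of-range indices */
        y[idx[i]] += x[i];
    }
}
```

Compiler vectorisation reports (for instance -fopt-info-vec with GCC or -Rpass=loop-vectorize with Clang-based compilers) can confirm which loops were actually vectorised.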
For more on the methodology employed for each application, refer to the Mont-Blanc deliverable D3.5.
Porting applications to SVE was still a challenging task at the time it was carried out within the Mont-Blanc 2020 project, because the relevant HW / SW ecosystem was not yet mature. However, as SVE-enabled hardware and tools are rapidly becoming available, SVE is likely to be adopted quickly by the HPC community:
“With respect to the SVE ISA, we have found that it is easy to reason about its vector length agnostic paradigm, which enables to write simpler and shorter code. More importantly, its C language extensions and datatypes are much more intuitive and comprehensive than other currently available solutions.”
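The vector length agnostic style mentioned in the quote can be illustrated with a short dot product written with the Arm C Language Extensions for SVE. This is a sketch for illustration only and is not code from the project; the function name dot is hypothetical. The SVE branch is compiled only when the compiler defines __ARM_FEATURE_SVE (for example with -march=armv8-a+sve); on other targets a scalar fallback is used so that the example still builds. Note that the loop never mentions a vector width: svcntd() reports the number of 64-bit lanes at run time, and the predicate from svwhilelt_b64 masks off the lanes past the end of the arrays, so no separate tail loop is needed.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Returns sum of x[i] * y[i] for i in [0, n). */
double dot(int64_t n, const double *x, const double *y)
{
    assert(n >= 0);
#if defined(__ARM_FEATURE_SVE)
    svfloat64_t acc = svdup_f64(0.0);
    for (int64_t i = 0; i < n; i += (int64_t)svcntd()) {
        svbool_t pg = svwhilelt_b64(i, n);      /* active lanes: i + lane < n */
        svfloat64_t vx = svld1_f64(pg, &x[i]);  /* inactive lanes are not loaded */
        svfloat64_t vy = svld1_f64(pg, &y[i]);
        acc = svmla_f64_m(pg, acc, vx, vy);     /* inactive lanes keep acc */
    }
    return svaddv_f64(svptrue_b64(), acc);      /* horizontal sum of all lanes */
#else
    /* Scalar fallback for targets without SVE. */
    double acc = 0.0;
    for (int64_t i = 0; i < n; ++i)
        acc += x[i] * y[i];
    return acc;
#endif
}
```

Because the SVE branch accumulates per lane and reduces at the end, its floating-point rounding can differ slightly from the scalar loop for general inputs.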