Efficiency Modeling and Analysis of 64-bit ARM Clusters for HPC

This paper investigates the use of 64-bit ARM cores to improve the processing efficiency of upcoming HPC systems. It describes a set of available tools, models, and platforms, and their combination into an efficient methodology for the design space exploration of large manycore computing clusters. Experiments with representative benchmarks establish an exploration approach for evaluating essential design options at the micro-architectural level while scaling to a large number of cores, and suggest initial directions for future system analysis and improvement.