Next Six PRACE-2IP Prototypes Selected
The latest six PRACE prototypes fall into three main focus areas: Resilience, Node Accelerators, and Low-Power Alternatives to current HPC architectures. They also differ in how soon they will influence HPC systems. Some results are directly applicable to current systems, others will affect the next generation of supercomputers, and some evaluate technologies that will need multiple iterations to become viable HPC alternatives to current hardware paradigms. The following paragraphs highlight what is unique about each prototype and what they have in common.
The AMFT prototype (GENCI/INRIA, BSC, and GENCI/CINES) will evaluate new checkpointing technologies (Fault Tolerant Interface [FTI] and Multilevel Fault Tolerance [MFT]) in combination with different storage levels and technologies. The results will be relevant to all Petascale-and-beyond systems and are directly applicable to current supercomputers. Resilience was not a focus of the PRACE-1IP WP9 prototypes, but the European Exascale Software Initiative has identified it as one of the key issues for moving towards Exascale computing.
The Scalable Hybrid (CSC, CSCS, SARA, T-Platforms Ltd), EURORA (CINECA, GRNET, IPB, NCSA) and CPU/GPU (PSNC, WCNS, Cyfronet) prototypes are investigating node accelerator technologies, which are increasingly predominant among the top 10 systems of the TOP500 list. The focus will be on evaluating the usefulness and usability of each current accelerator technology, e.g. Intel Many Integrated Cores (MIC), Nvidia GPU and AMD GPU, and on possible alternatives to InfiniBand, which should help more applications leverage the computing power that accelerators provide. The CPU/GPU prototype will investigate AMD APUs as compute node alternatives. Uniquely, Scalable Hybrid and EURORA will assess the usefulness of hot-water cooling (“free cooling”) in northern and southern Europe rather than in central Europe, where the first hot-water-cooled prototype was installed at the Leibniz Rechenzentrum (LRZ) in Germany as part of the PRACE-1IP prototypes. Preliminary results showed that, compared with standard air-cooled systems, hot-water cooling can save a significant amount of energy in central European climates by using free cooling (no water chillers, only outside heat exchangers). All systems will incorporate energy efficiency assessments and measurements.
The last two prototypes will evaluate possible low-power alternatives to current HPC systems based on System-on-Chip (SoC) solutions. The ARM+GPU prototype (BSC, GENCI, CINES) builds on the positive experience with BSC’s PRACE-1IP Tegra2 prototype and aims to improve the performance-per-watt ratio by using a more powerful ARM core plus a mobile GPU accelerator. The most far-reaching of the new prototypes is SHAVE-PRACE (SNIC, ICHEC (NUI Galway), Movidius Ltd). It uses an SoC designed for video processing that incorporates stacked memory on chip. The peak performance-per-watt ratio is very promising: the chip is projected to reach a double-precision efficiency of 140 GFLOP per joule (equivalently, 140 GFLOPS per watt) with 28 nm technology. In theory this would allow a performance of 2.8 Exaflops within a 20 MW power budget, if package power consumption is the only consideration. The goal is to evaluate the possibilities and limitations of leveraging embedded technology for future highly energy-aware high-performance computing.
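The 2.8 Exaflops figure follows directly from multiplying the projected efficiency by the power budget. The short Python sketch below reproduces that arithmetic. The function name and its parameters are illustrative only and do not come from the prototype; like the projection itself, it counts package power alone and ignores memory, interconnect, and cooling overheads.

```python
def peak_exaflops(gflop_per_joule: int, power_megawatts: int) -> float:
    """Theoretical peak in exaflops for a given efficiency and power budget.

    Hypothetical helper for illustration: assumes every watt of the budget
    goes to chip packages running at their projected peak efficiency.
    """
    flop_per_joule = gflop_per_joule * 10**9   # GFLOP/J -> FLOP/J
    watts = power_megawatts * 10**6            # MW -> W (1 W = 1 J/s)
    flops = flop_per_joule * watts             # FLOP/J * J/s = FLOP/s
    return flops / 10**18                      # FLOP/s -> exaflops


# Figures quoted for SHAVE-PRACE: 140 GFLOP/J within a 20 MW budget.
print(peak_exaflops(140, 20))
```

Because 1 GFLOP per joule equals 1 GFLOPS per watt, the same function applies whether the efficiency is quoted per joule or per watt.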