Project Start Date
1st Sep 2015
Project End Date
31st Aug 2018
High Performance Computing (HPC) has become a major instrument for many scientific and industrial fields to generate new insights and product developments. There is a continuous demand for growing compute power, leading to a constant increase in system size and complexity. Efficiently utilizing the resources provided on future Exascale systems will be a challenging task, potentially causing a large amount of underutilized resources and wasted energy. Parameters for adjusting the system to application requirements exist both on the hardware and on the system software level but are mostly unused today. Moreover, accelerators and co-processors offer a significant performance improvement at the cost of increased overhead, e.g., for data transfers.
While HPC applications are usually highly compute intensive, they also exhibit a large degree of dynamic behaviour, e.g., the alternation between communication phases and compute kernels. Manually detecting and leveraging this dynamism to improve energy-efficiency is a tedious task that is commonly neglected by developers. However, using an automatic optimization approach, application dynamism can be detected at design-time and used to generate optimized system configurations. A lightweight run-time system will then detect this dynamic behaviour in production and switch parameter configurations if beneficial for the performance and energy-efficiency of the application. The READEX project is developing an integrated tool-suite and a new programming paradigm to exploit application domain knowledge, with the goal of achieving an improvement in energy-efficiency of up to 22.5%.
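The split between design-time analysis and run-time switching described above can be illustrated with a minimal Python sketch. It is not the READEX implementation: the names `Config`, `TUNING_MODEL` and `RuntimeSwitcher`, the choice of tuning parameters, and all numeric values are assumptions made purely for illustration. The table stands in for the optimized configurations produced at design-time, and the switcher changes settings only when the application enters a phase whose configuration differs from the active one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One system configuration (hypothetical parameters, illustrative only)."""
    core_freq_ghz: float
    uncore_freq_ghz: float
    threads: int


# Fallback used for phases the design-time analysis did not cover (assumed values).
DEFAULT = Config(core_freq_ghz=2.5, uncore_freq_ghz=2.5, threads=24)

# Stand-in for the output of design-time analysis: one configuration per
# detected application phase. Values are invented for this example.
TUNING_MODEL = {
    "compute": Config(core_freq_ghz=2.5, uncore_freq_ghz=1.5, threads=24),
    "communication": Config(core_freq_ghz=1.2, uncore_freq_ghz=2.0, threads=4),
}


class RuntimeSwitcher:
    """Lightweight run-time component: looks up the configuration for the
    phase being entered and switches only if it differs from the active one."""

    def __init__(self, model, default):
        self.model = model
        self.default = default
        self.current = default
        self.switches = 0  # number of actual reconfigurations performed

    def enter_phase(self, phase):
        target = self.model.get(phase, self.default)
        if target != self.current:
            # A real system would apply hardware/software settings here.
            self.current = target
            self.switches += 1
        return self.current


if __name__ == "__main__":
    rt = RuntimeSwitcher(TUNING_MODEL, DEFAULT)
    for phase in ["compute", "compute", "communication", "compute"]:
        cfg = rt.enter_phase(phase)
        print(f"{phase}: {cfg}")
    print("reconfigurations:", rt.switches)
```

Skipping the switch when the target configuration is already active keeps the run-time overhead low, which matters because each reconfiguration has its own cost in time and energy.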
Driven by a consortium of European experts from academia, HPC resource providers, and industry, the READEX project is developing a tools-aided methodology to exploit the dynamic behaviour of applications to achieve improved energy-efficiency and performance. The developed tool-suite will be efficient and scalable to support current and future extreme scale systems.
Specific roles and responsibilities of ICHEC:
To evaluate dynamism exhibited by existing and future HPC applications.
To analyse application parameters that can be tuned for energy savings.
To define a generic specification for providing domain-level knowledge about application dynamism.
To investigate future network infrastructure tuning parameters.
To integrate all READEX components into prototypes and a software release.
To extend a performance engineering workflow tool, Pathway, for the READEX methodology.
To develop and establish appropriate measures to ensure software quality.