In 2009, Tim Hayes (TCD) and Colin MacSweeny (UL) were awarded ICHEC Summer Scholarships, under which they carried out HPC-related research projects supervised by our computational scientists. The results and end products of their projects are described below.
The mapping of virtual processes onto physical nodes is a significant issue in high-performance computing (HPC). Depending on the topology of a machine's interconnect, a poor layout of processes can lead to network congestion, increased latency, reduced bandwidth and additional CPU time spent on networking rather than computation. An ideal mapping should therefore optimise the overall process-to-process communication with respect to the physical topology of the network.
The aim of this project was to develop a set of tools to evaluate the process placement of MPI jobs on a variety of HPC systems. The end product allows its user to observe the exact point-to-point bandwidth matrix and to infer the breakdown of bandwidth by hop count, and it provides estimates of better process placements where possible. Two separate but complementary tools were created: a profiling library called prmpi and an evaluation application called pranalysis. Their efficiency has been tested mostly on IBM BlueGene machines, but the tools are portable to any system, provided that information describing the physical network topology is made available.
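To make the idea of a hop-count breakdown concrete, the following is a minimal Python sketch and not the actual prmpi or pranalysis code. It assumes a torus interconnect (the topology used by BlueGene/L and BlueGene/P systems), a traffic matrix given as bytes per (source rank, destination rank) pair, and a placement that maps each rank to torus coordinates. The function names, the example traffic and the hop-weighted cost metric are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def torus_hops(a, b, dims):
    """Minimum hop count between coordinates a and b on a torus.

    Each dimension has wrap-around links, so the distance along a
    dimension of size d is the shorter of the two ways around the ring.
    """
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def bytes_by_hop_count(traffic, placement, dims):
    """Group the bytes exchanged between rank pairs by hop distance."""
    breakdown = defaultdict(int)
    for (src, dst), nbytes in traffic.items():
        hops = torus_hops(placement[src], placement[dst], dims)
        breakdown[hops] += nbytes
    return dict(breakdown)


def hop_weighted_cost(traffic, placement, dims):
    """Illustrative placement metric (an assumption): bytes times hops, summed."""
    return sum(
        nbytes * torus_hops(placement[src], placement[dst], dims)
        for (src, dst), nbytes in traffic.items()
    )


# Hypothetical example: four ranks exchanging data with their ring neighbours
# on a 4x1x1 torus (effectively a ring of four nodes).
dims = (4, 1, 1)
traffic = {(0, 1): 100, (1, 2): 100, (2, 3): 100, (3, 0): 100}

# Placement that follows the communication pattern.
linear = {0: (0, 0, 0), 1: (1, 0, 0), 2: (2, 0, 0), 3: (3, 0, 0)}
# Placement that ignores the communication pattern.
scattered = {0: (0, 0, 0), 1: (2, 0, 0), 2: (1, 0, 0), 3: (3, 0, 0)}

for name, placement in (("linear", linear), ("scattered", scattered)):
    print(name,
          bytes_by_hop_count(traffic, placement, dims),
          hop_weighted_cost(traffic, placement, dims))
```

In this toy case the linear placement keeps all 400 bytes on single-hop links, while the scattered placement pushes 200 of them onto two-hop paths and raises the hop-weighted cost from 400 to 600. Comparing such figures across candidate placements is one simple way to estimate whether a better mapping exists.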
There are currently a number of novel approaches to addressing the perceived productivity bottleneck associated with writing parallel codes. Ambitious and, by their nature, speculative examples of such approaches include X10, Fortress and Chapel. The aim of this project was to port a real-world application to Chapel, a new high-productivity parallel language being developed by Cray Inc. Chapel provides high-level, global-view abstractions for data parallelism, task parallelism and nested parallelism, which aim to simplify the development of parallel programs. The code chosen for porting to Chapel was originally written by Dimitri Perrin, a user of ICHEC's HPC resources. The program generates a social network using an extended version of an algorithm first proposed by Keeling. The resulting network is used in agent-based simulations designed to investigate how disease spreads through a population. The original code was written in C++, with parallelism achieved using MPI. The project looked at the practical difficulties that remain in implementing a real-world application in what is still a developing language. The resulting code now forms part of Cray's Chapel regression testing suite.
Full details of this work are available in the project report.