Ireland's High-Performance Computing Centre | ICHEC

ICHEC Summer Scholarships - Past Projects

Below are the Summer Scholarship projects that were on offer in previous years.


  1. Implementation of the multiple polynomial quadratic sieve factoring algorithm on the GPU architecture

    Integer factorisation is a classic problem in number theory, and its difficulty underpins the encryption methods that make secure e-commerce possible. While factorisation may be trivial for small integers, the problem becomes much harder for extremely large numbers (e.g. over 100 digits). The multiple polynomial quadratic sieve factoring algorithm, implemented on a powerful computer, makes it feasible to factorise such large numbers. The algorithm is highly parallelisable, which makes it an ideal candidate for implementation on HPC architectures. The challenge for this project is therefore to implement this algorithm on the GPU and to evaluate its performance. Working C code for CPUs will be provided as a starting point.
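
    To give a feel for why sieve-based factoring parallelises well, here is a minimal Python sketch of the relation-collection idea behind the quadratic sieve. It is not the multiple polynomial variant and not the C code mentioned above: it uses a single polynomial x*x - n, trial division in place of a real sieve, and a hand-picked combination of relations instead of the linear algebra step. The function name and the small example number are illustrative assumptions.

```python
from math import gcd, isqrt


def smooth_relations(n, factor_base, start, count):
    """Return {x: exponents} for x in [start, start + count) where x*x - n
    factors completely over factor_base.

    Trial division stands in for a real sieve (assumption for brevity).
    """
    relations = {}
    for x in range(start, start + count):
        value = x * x - n
        if value <= 0:
            continue
        exponents = []
        for p in factor_base:
            e = 0
            while value % p == 0:
                value //= p
                e += 1
            exponents.append(e)
        if value == 1:  # fully factored over the factor base
            relations[x] = exponents
    return relations


n = 15347
factor_base = [2, 17, 23, 29]
first = isqrt(n) + 1

# Two disjoint intervals: each could be searched by a different worker,
# since no interval depends on the result of another.
relations = {}
relations.update(smooth_relations(n, factor_base, first, 36))
relations.update(smooth_relations(n, factor_base, first + 36, 36))

# Hand-picked relations whose exponent vectors sum to even numbers
# (a real implementation finds such subsets with linear algebra over GF(2)).
chosen = [124, 127, 195]
totals = [sum(relations[x][i] for x in chosen) for i in range(len(factor_base))]
assert all(t % 2 == 0 for t in totals)

# Congruence of squares: x_prod**2 == y**2 (mod n), so gcd(x_prod - y, n)
# may reveal a non-trivial factor.
x_prod = 1
for x in chosen:
    x_prod *= x
y = 1
for p, t in zip(factor_base, totals):
    y *= p ** (t // 2)
factor = gcd(x_prod - y, n)
print(factor, n // factor)
```

    The independent intervals are the part that maps naturally onto many GPU threads; how best to organise the sieving itself on the GPU is the open question the project addresses.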

  2. Development of a General Matrix Multiply (GEMM) library for hybrid CPU-GPU architectures

    General Matrix Multiply (GEMM) is a matrix multiplication subroutine in the Basic Linear Algebra Subprograms (BLAS) that is often optimised for HPC architectures. It has a significant impact on performance, as it is the foundation of many other subroutines, scientific codes and the LINPACK benchmark. As hybrid CPU-GPU architectures become more prevalent in HPC, Massimiliano Fatica [1] developed a host library that intercepts double-precision DGEMM calls and distributes the workload so that it is executed on both CPUs and GPUs simultaneously. The results showed that improved LINPACK performance is readily achievable on hybrid CPU-GPU architectures.

    Following on from that work, the goal of this project is to develop a more general-purpose version of the host library that will handle single-precision (SGEMM) and complex double-precision (ZGEMM) operations as well as DGEMM. It should also ensure that the best performance is obtained regardless of matrix dimension and form. As the [SDZ]GEMM subroutines are widely used, this project has the potential to benefit a large community.
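
    The idea of sharing one GEMM call between two devices can be sketched as follows. This is a pure-Python illustration under stated assumptions, not the host library described above: the column-wise split, the `gpu_fraction` parameter and both function names are hypothetical, and the two partial products are simply computed one after the other rather than concurrently on a GPU and a CPU.

```python
def matmul(a, b):
    """Plain triple-loop product of two matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def split_gemm(a, b, gpu_fraction):
    """Compute a @ b by splitting the columns of b into two blocks.

    In a hybrid library the first block would go to the GPU and the second
    to the CPU, running at the same time; here both blocks call matmul.
    """
    cols = len(b[0])
    cut = int(cols * gpu_fraction)          # columns assigned to the "GPU"
    b_gpu = [row[:cut] for row in b]
    b_cpu = [row[cut:] for row in b]
    c_gpu = matmul(a, b_gpu)
    c_cpu = matmul(a, b_cpu)
    # Stitch the two column blocks of the result back together.
    return [left + right for left, right in zip(c_gpu, c_cpu)]


a = [[1, 2], [3, 4]]
b = [[5, 6, 7], [8, 9, 10]]
print(split_gemm(a, b, 0.5))
```

    Because each column block of the result depends only on the matching column block of the second operand, the split needs no communication between the two halves. Choosing the fraction so that both devices finish at about the same time, for any matrix shape, is the tuning problem a general-purpose library has to solve.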

    [1] Fatica, M. (2009) Accelerating Linpack with CUDA on heterogeneous clusters. ACM ICPS 383: 46-51.

  3. Porting EXCITON for the GPU architecture

    EXCITON is an electronic structure code under development in the School of Physics, Trinity College Dublin and ICHEC. Its purpose is the computation of excitations in solids using state-of-the-art methods such as the GW approximation and the Bethe-Salpeter Equation. It is written in C and is parallelised with MPI. We are investigating how the numerically intensive parts of EXCITON can be accelerated on Graphics Processing Unit (GPU) architectures. This project will involve porting EXCITON to the GPU using CUDA.

  4. Bioinformatics on the GPU

    Many bioinformatics software tools follow the single instruction, multiple data (SIMD) paradigm. Hence GPGPU has the potential to provide relatively inexpensive, scalable solutions to increasingly data-intensive problems in biology. A number of algorithms have already been implemented on the GPU (e.g. Smith-Waterman, BLASTP), most of which report significant speed-ups over CPU implementations.

    The goal of this project is to implement some of the existing bioinformatics GPGPU codes on ICHEC hardware and to assess their behaviour, applicability and usability. In collaboration with the Molecular Evolution and Bioinformatics Unit at NUI Maynooth, there are opportunities to carry out real scientific analyses and to enable previously infeasible computations. The codes will cover a range of areas including biological sequence comparisons, molecular phylogenetics and high-throughput DNA sequencing.

  5. Parallelism improvements of the GIPAW Nuclear Magnetic Resonance module for HPC users

    GIPAW is a module within the Quantum ESPRESSO distribution that models NMR experiments (e.g. chemical shifts) from first principles. Until now, NMR modelling has been applied to relatively simple systems with at most tens of atoms in the unit cell, and GIPAW currently scales to a few hundred cores at most. In order to make GIPAW useful in biomedical and industrial applications, we need to extend its range to systems with hundreds to thousands of atoms, and to extend its scalability to thousands of cores. The aim of this project is to improve the parallelism inside GIPAW by further distributing large arrays across processors and by adding a further parallelisation level over electronic states.

  6. Parallelisation of mesh-free computational fluid dynamics (CFD) code

    Methods for Computational Fluid Dynamics (CFD) have become increasingly important in many aspects of engineering. Traditionally, methods for CFD have been mesh-based, i.e. the computational nodes are interconnected and fixed in space. Mesh-free methods for CFD are a relatively recent development. These methods offer greater flexibility than traditional mesh-based approaches because the computational nodes can move with the fluid and have no pre-defined connectivity.

    Code for mesh-free CFD has been developed in Mechanical and Biomedical Engineering at NUI Galway. This project will involve the parallelisation of certain elements of this mesh-free CFD code and will be conducted in collaboration with the research group of Dr. Nathan Quinlan at NUI Galway.

  7. Visualisation of nested climate and weather datasets

    Nested models are increasingly used in climate and weather, with a regional model running within a larger global model. Incorrectly nested models can have issues due to artificial noise at the model boundaries, or incorrectly filtered dynamics that remove the desired signals in the model.

    The aim of the project is to investigate and implement techniques for showing the results of two nested models within VTK or ParaView, enabling users to distinguish the different datasets and to investigate potential issues due to resolution, artificial noise or filtering within the nested system.

  8. Development of wizard interfaces for efficient job submission on HPC systems

    The input file syntax required for running HPC jobs, while reasonably straightforward and deterministic, can be confusing for new users. Similarly, debugging and scaling work can require modifications to normal production jobs, which can be error-prone if it is not a day-to-day activity. The aim of this project is to develop a series of 'Wizard'-based interfaces, probably web-based, which generate job submission and related files in a user-friendly manner. The generated files could then be used directly or as a basis for further customisation.

  9. Instrumentation of taskfarming on ICHEC systems

    One method commonly used for large-scale parameter studies and other ideally-parallel workloads is so-called taskfarming. ICHEC's current taskfarm utility is light-weight and adaptable, but it assumes that users have a good understanding of the system load and the run-times of their jobs. A more sophisticated taskfarming approach could automatically harvest and log performance data, producing a summary, both textual and graphical, of the properties of a given run. This could benefit users by helping them to readily spot problematic tasks or inefficiencies.
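
    As a rough illustration of what such instrumentation might record, the sketch below times each task in a small farm and flags failures and unusually slow tasks. It is not ICHEC's taskfarm utility: the task representation (Python callables), the record fields and the `slow_factor` threshold are all assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(name, func):
    """Run one task and record its wall-clock time and outcome."""
    start = time.perf_counter()
    try:
        func()
        status = "ok"
    except Exception:
        status = "failed"
    return {"task": name, "seconds": time.perf_counter() - start, "status": status}


def run_taskfarm(tasks, workers=2):
    """Run a dict of name -> callable on a small worker pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(timed, name, func) for name, func in tasks.items()]
        return [f.result() for f in futures]


def summarise(records, slow_factor=3.0):
    """Flag failed tasks and tasks slower than slow_factor times the median."""
    times = sorted(r["seconds"] for r in records)
    median = times[len(times) // 2]
    return {
        "total": len(records),
        "failed": [r["task"] for r in records if r["status"] != "ok"],
        "slow": [r["task"] for r in records if r["seconds"] > slow_factor * median],
    }


tasks = {
    "a": lambda: time.sleep(0.05),
    "b": lambda: time.sleep(0.05),
    "c": lambda: time.sleep(0.05),
    "slow": lambda: time.sleep(0.5),
    "bad": lambda: 1 / 0,          # deliberately failing task
}
records = run_taskfarm(tasks)
summary = summarise(records)
print(summary)
```

    A production tool would wrap real job commands rather than callables and would likely also log memory use and node placement, but the same per-task record is enough to drive both the textual and the graphical summary.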


  1. Assisted debugging and profiling tool usage

    There are a large number of tools for error checking and profiling code, such as Marmot, Vampir, Scalasca, gprof, Valgrind and Lint. Each has its strengths, and they are often complementary. We could better exploit these and similar tools. The aim of this project would be to explore the notion of creating a "wizard" which, given some basic information, could help users to configure some or all of these tools for use with their code in one step. In some cases the full testing process might be amenable to automation; in others it may only be practical to generate the required job scripts and similar files. Even this would eliminate an error-prone process. In tandem, technical-report-style documentation tailored to the ICHEC environment would be produced for the tools concerned.

  2. Implementing a real world application in a next generation HPC language

    Chapel is a new and very interesting language designed by Cray. The language is not yet ready for production use; however, porting existing applications should be possible if they can be made to work without third-party libraries. It is proposed that a code be identified from amongst those used by the ICHEC userbase, ideally open source, which would be practical to port to Chapel. This may prove useful as a benchmark application for the Chapel community in the future. In the short term, it would highlight the differences in the language and act as a measure of how tractable porting existing code would be, were Chapel to be fully developed as a first-tier HPC language.

  3. Development of a Chapel Distribution(s) useful to HPC users

    Chapel is a new and very interesting language designed by Cray. The language is not yet ready for production use; however, it has been designed from the ground up with parallel programming in mind. It incorporates a notion called a "distribution", which describes how data is mapped from an index space to an endpoint (locale) without specifying exact details of data indices. The aim of this project is to select and implement one or more non-trivial distributions relevant to the HPC community.
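
    The mapping from an index space to locales can be illustrated with two textbook distributions, block and cyclic. The sketch below is a Python stand-in written for this description; it is not Chapel code and does not reflect Chapel's actual distribution interface, and both function names are hypothetical.

```python
def block_locale(index, size, num_locales):
    """Block distribution: contiguous chunks of ceil(size / num_locales) indices."""
    chunk = -(-size // num_locales)   # ceiling division
    return index // chunk


def cyclic_locale(index, num_locales):
    """Cyclic distribution: indices are dealt out round-robin."""
    return index % num_locales


size, num_locales = 10, 4
block = [block_locale(i, size, num_locales) for i in range(size)]
cyclic = [cyclic_locale(i, num_locales) for i in range(size)]
print(block)
print(cyclic)
```

    Block keeps neighbouring indices on the same locale, which suits stencil-like access, while cyclic spreads uneven work more evenly. A non-trivial distribution of the kind the project has in mind would define a more elaborate mapping of the same shape.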

  4. Development of an MPI profiling library on the IBM BlueGene system

    The aim of the project is to develop a light-weight MPI profiling library that differentiates between local communication (intra-node or nearby nodes) and distant communication (> 1 hop) within the 3D torus of BlueGene systems. The library should use the standard PMPI profiling interface and be as portable as possible. The work involves instrumenting, in C, all the point-to-point MPI functions (MPI_Send, MPI_Isend, MPI_Ssend, MPI_Bsend, MPI_Recv and MPI_Irecv) and communicator creation (to keep track of the actual MPI ranks). It is also necessary to identify a suitable trace file format and to develop a tool that parses and analyses the trace files in a high-level programming language (e.g. Python). The impact of process placement on the torus will be assessed on systems such as BG/L, BG/P and Cray XT4 (Hector, if the library is portable).
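
    The local/distant distinction comes down to a hop count on the torus. The sketch below shows one way an analysis tool might compute it, assuming each rank has already been mapped to (x, y, z) torus coordinates; how that mapping is obtained is system-specific and not shown. The function names and the one-hop threshold parameter are illustrative.

```python
def torus_hops(a, b, dims):
    """Minimum number of hops between coordinates a and b on a torus.

    Each dimension wraps around, so the distance along an axis of length d
    is the shorter of going directly or going through the wraparound link.
    """
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def classify(a, b, dims, local_hops=1):
    """Label a message as 'local' (at most local_hops hops) or 'distant'."""
    return "local" if torus_hops(a, b, dims) <= local_hops else "distant"


dims = (8, 8, 8)
print(torus_hops((0, 0, 0), (7, 0, 0), dims))   # neighbours via wraparound
print(classify((0, 0, 0), (2, 0, 0), dims))
```

    In the library itself the equivalent test would sit inside the C wrappers around the point-to-point calls; doing it in the post-processing tool instead keeps the wrappers light-weight.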

  5. Qualitative study of data assimilation algorithms and their implementation on HPC platforms

    Data assimilation involves accurate re-analysis, estimation and prediction of an unknown, true state by merging observed information into a model. This issue arises in all scientific areas that enjoy a profusion of data. The problem is fundamental yet challenging, as it does not naturally afford a clean solution. Areas where data assimilation is particularly important include weather forecasting and hydrology. The aim of the project is to review the literature on existing data assimilation techniques applicable to weather forecasting. The simplest forms of the existing algorithms will be implemented and their computational behaviour analysed.

  6. Parallelisation of Climate Data Operators

    The Climate Data Operators (CDO) is a tool that is used extensively in the climate community for manipulating climate datasets: conversions from one format to another, and obtaining averages / min / max, etc. across fields. This project will involve the optimisation and parallelisation of the CDO code (e.g. it should be possible to load large fields in parallel and parallelise summary tasks).
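
    Summary operations such as mean, min and max parallelise naturally because partial results from separate chunks can be merged. The Python sketch below shows that decomposition on a flat list of values; it is not CDO code, the function names are hypothetical, and Python threads are used only to show the structure rather than to obtain a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partial_stats(chunk):
    """Per-chunk partial result: (sum, count, min, max)."""
    return (sum(chunk), len(chunk), min(chunk), max(chunk))


def merge(a, b):
    """Combine two partial results into one."""
    return (a[0] + b[0], a[1] + b[1], min(a[2], b[2]), max(a[3], b[3]))


def parallel_summary(values, num_chunks=4):
    """Mean, min and max of a non-empty list, computed chunk by chunk."""
    size = -(-len(values) // num_chunks)   # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=num_chunks) as pool:
        parts = list(pool.map(partial_stats, chunks))
    total, count, lowest, highest = reduce(merge, parts)
    return {"mean": total / count, "min": lowest, "max": highest}


field = [3.0, -1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
summary = parallel_summary(field)
print(summary)
```

    The same merge step works whether the chunks are parts of one field held in memory or pieces of a large file read in parallel, which is why loading and summarising are the natural first targets.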

  7. Study and implementation of parallel graph-theory algorithms in bioinformatics

    Biological data, relationships and networks can often be represented as graphs, which can be manipulated and analysed using various algorithms. However, large graphs (e.g. those with hundreds of millions of nodes or more) pose a significant challenge for conventional algorithms. The project will involve the implementation of existing parallel algorithms or the parallelisation of sequential algorithms to solve relevant graph-based problems in bioinformatics.

  8. Numerical solution of the Lotka-Volterra system

    The Lotka-Volterra equations are of practical interest as they are frequently used to describe the dynamics of biological systems. In this project the student will:

    • analyse how well numerical approximations of the Lotka-Volterra system preserve its physical properties, using the symplectic Euler method and an explicit variant of it.
    • analyse the Poisson integration for this system with the symplectic Verlet scheme.
    • compare the numerical methods with constant/variable step sizes.
    • simulate long-time execution and assess the performance of each numerical method.
    Candidates are expected to have some exposure to scientific computing and have completed a course on partial differential equations.
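
    As a hedged illustration of the kind of comparison listed above, the sketch below integrates the Lotka-Volterra system u' = u(alpha - beta v), v' = v(delta u - gamma) with the explicit Euler method and with one symplectic (partitioned) Euler variant, and tracks the drift of the conserved quantity delta u - gamma ln u + beta v - alpha ln v. The parameter values, step size, initial state and the choice of variant (implicit in u, which can be solved in closed form) are assumptions made for the example, not the project's prescribed setup.

```python
import math

# Illustrative parameters (assumed): all rate constants set to 1.
ALPHA = BETA = GAMMA = DELTA = 1.0


def invariant(u, v):
    """Quantity conserved by the exact flow of the Lotka-Volterra system."""
    return DELTA * u - GAMMA * math.log(u) + BETA * v - ALPHA * math.log(v)


def explicit_euler_step(u, v, h):
    """Standard explicit Euler: both right-hand sides use the old state."""
    return (u + h * u * (ALPHA - BETA * v),
            v + h * v * (DELTA * u - GAMMA))


def symplectic_euler_step(u, v, h):
    """Partitioned Euler: u is updated implicitly, v uses the new u.

    The implicit equation u_new = u + h * u_new * (ALPHA - BETA * v) is
    linear in u_new, so it is solved in closed form.
    """
    u_new = u / (1.0 - h * (ALPHA - BETA * v))
    v_new = v + h * v * (DELTA * u_new - GAMMA)
    return u_new, v_new


def max_drift(step, u, v, h, steps):
    """Largest deviation of the invariant from its initial value."""
    start = invariant(u, v)
    worst = 0.0
    for _ in range(steps):
        u, v = step(u, v, h)
        worst = max(worst, abs(invariant(u, v) - start))
    return worst


explicit_drift = max_drift(explicit_euler_step, 2.0, 1.0, 0.01, 10000)
symplectic_drift = max_drift(symplectic_euler_step, 2.0, 1.0, 0.01, 10000)
print(explicit_drift, symplectic_drift)
```

    With explicit Euler the invariant drifts steadily, so the computed orbit spirals outwards, whereas the partitioned variant keeps it within a small band. The same `max_drift` harness could be reused with other steppers or with variable step sizes.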


Further Information

For further information regarding the Summer Scholarships, please contact our Education & Training Coordinator.