The Dublin Institute for Advanced Studies, in partnership with all the Irish universities and major research institutions, and supported by the HEA under PRTLI cycles 3 and 4, led a national initiative in collaboration with ICHEC and HEAnet to provide the Irish research community with access to true capability high-performance computing. The outcome was an IBM solution based on local Blue Gene/L and Blue Gene/P supercomputers along with remote access to additional Blue Gene facilities abroad. Access to these systems is granted through the National Capability Service, which has been operated by ICHEC since the service opened in February 2008.
Because of the performance limitations of the older Lanczos (Blue Gene/L) system, it was decided that it would be decommissioned on July 1st 2010.
Schrödinger is a single cabinet of Blue Gene/P. This provides 1024 nodes, each with four fully cache-coherent cores and 2GB RAM. The cores run at 850MHz. These nodes provide three modes of operation for jobs:
Running on 1024 nodes in virtual-node mode will give you 4096 MPI tasks.
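The task count quoted above follows from simple arithmetic, which the sketch below spells out for all three modes. The mode names (`smp`, `dual`, `vn`) and the ranks-per-node values are assumptions taken from general Blue Gene/P conventions rather than from this page, and the memory figure is only an upper bound since the node kernel also needs some RAM. The function name `mpi_layout` is hypothetical.

```python
# Ranks per node for each execution mode. These values follow the usual
# Blue Gene/P convention (an assumption; this page only states the
# virtual-node result of 4096 tasks on 1024 nodes).
TASKS_PER_NODE = {"smp": 1, "dual": 2, "vn": 4}

CORES_PER_NODE = 4       # from the hardware description above
RAM_PER_NODE_MB = 2048   # 2GB per node, from the hardware description above


def mpi_layout(nodes, mode):
    """Return the MPI task count and per-task resources for a job."""
    tasks_per_node = TASKS_PER_NODE[mode]
    return {
        "mpi_tasks": nodes * tasks_per_node,
        # Cores left over for threads inside each MPI task.
        "threads_per_task": CORES_PER_NODE // tasks_per_node,
        # Upper bound: ignores memory used by the node kernel.
        "mem_per_task_mb": RAM_PER_NODE_MB // tasks_per_node,
    }


if __name__ == "__main__":
    for mode in TASKS_PER_NODE:
        print(mode, mpi_layout(1024, mode))
```

Under these assumptions, the trade-off is between more MPI tasks with less memory each (virtual-node mode) and fewer tasks that can each use several threads and more memory.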
A front-end login node (bgp.ichec.ie) with 16 1.8GHz Power5+ cores and 64GB RAM is provided for development, along with some pre- and post-processing of data.
Storage is provided by 33TB (formatted) of tightly integrated SAN running the IBM GPFS filesystem. In ideal cases the storage should be able to provide 1GB/sec of I/O from the Blue Gene cabinet. This should make the use of large checkpoint files both feasible and relatively efficient.
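To give a rough sense of what this bandwidth means for checkpointing, the sketch below estimates the time to write a checkpoint. It assumes the ideal 1GB/sec is actually sustained and that the job dumps a chosen fraction of its node memory; real rates depend on the access pattern and on other filesystem load. The function name and parameters are hypothetical.

```python
def checkpoint_seconds(nodes, gb_per_node=2.0, bandwidth_gb_per_s=1.0,
                       fraction=1.0):
    """Estimate the time to write a checkpoint, in seconds.

    nodes              -- number of compute nodes in the job
    gb_per_node        -- RAM per node (2GB on this system)
    bandwidth_gb_per_s -- sustained I/O rate (ideal case: 1GB/sec)
    fraction           -- share of node memory written to the checkpoint
    """
    data_gb = nodes * gb_per_node * fraction
    return data_gb / bandwidth_gb_per_s


if __name__ == "__main__":
    full = checkpoint_seconds(1024)
    quarter = checkpoint_seconds(1024, fraction=0.25)
    print(f"full memory dump:    {full:.0f} s (~{full / 60:.0f} min)")
    print(f"quarter memory dump: {quarter:.0f} s (~{quarter / 60:.1f} min)")
```

Under these assumptions, dumping the full 2TB of memory on 1024 nodes takes about 34 minutes, so it is worth writing only the state actually needed for a restart.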
Schrödinger provides two different scheduling policies. The first is a production scheduling policy that permits large jobs with long runtimes. The second is a flexible debug scheduling policy that is made available on request to allow for debugging and testing.
The production scheduling policy provides the following classes:
| Class | Job Size (nodes) | Maximum Walltime | Maximum Running Jobs |
|---|---|---|---|
| 1024_48hrs | 1024 | 48 hours | 1 |
| 512_48hrs | 512 | 48 hours | 2 |
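As an illustration of how the table might be used when planning a run, the sketch below picks the smallest production class that fits a job. It assumes that a job requesting fewer nodes than a class's size may still run in that class, which this page does not confirm. The `CLASSES` structure and the `pick_class` helper are hypothetical and are not part of any ICHEC tooling.

```python
# Production classes transcribed from the table above.
CLASSES = [
    {"name": "1024_48hrs", "nodes": 1024, "max_hours": 48, "max_running": 1},
    {"name": "512_48hrs", "nodes": 512, "max_hours": 48, "max_running": 2},
]


def pick_class(nodes_needed, hours_needed):
    """Return the name of the smallest class that fits the job, or None.

    Assumption: a job may use a class whose node count is at least the
    number of nodes it needs.
    """
    for cls in sorted(CLASSES, key=lambda c: c["nodes"]):
        if nodes_needed <= cls["nodes"] and hours_needed <= cls["max_hours"]:
            return cls["name"]
    return None  # No production class fits; the Helpdesk is the next step.


if __name__ == "__main__":
    print(pick_class(512, 24))
    print(pick_class(1024, 48))
    print(pick_class(1024, 72))
```

A result of `None` means the job exceeds the production limits, for example a walltime above 48 hours.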
The debug scheduling policy isn't restricted to any specific set of classes and can be tailored to a user's requirements. For example, a user experiencing an issue that appears only on 1024 nodes can arrange quick-turnaround access to a suitable class to debug it. Arrangements to use the debug scheduling policy should be made in advance via the Helpdesk. One of the ICHEC computational scientists will be able to make the arrangements and also provide assistance in debugging the problem.
Some photos of the Schrödinger system: