
User Mailing

ICHEC mail #22


Posted: 2006-05-12

Dear ICHEC users,

Contents:

1. Scheduled downtime
2. Changes to our programming environment
3. Supported applications and benchmarking exercise
4. Termination of the Transitional Service
5. Monitoring jobs' efficiency
6. Monthly reports
7. IAHPC workshop

------------------------------------------------------------------------
1 – Scheduled downtime
------------------------------------------------------------------------

Our next scheduled downtime will take place on Tuesday 16 and Wednesday 17 May. Both Walton and Hamilton will be offline from 10:00 on Tuesday morning and are expected back up by 17:00 the following day; they may return to service earlier. See http://www.ichec.ie/status for further information.

------------------------------------------------------------------------
2 – Changes to our programming environment
------------------------------------------------------------------------

ICHEC has recently purchased licences for the PathScale EKOPath Compiler Suite for AMD64 (see http://www.pathscale.com), including C, C++, and Fortran 77/90/95 compilers. This suite is currently undergoing thorough testing on our test cluster. New versions of the AMD Core Math Library (ACML) and MPICH will be deployed as part of this new environment. We recommend that you switch to this new environment rather than the current Portland Group environment; the latter will remain on the cluster for backwards compatibility but will no longer be actively supported.
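
Once deployed, compiling with the new suite and linking against ACML might look something like the sketch below. The compiler driver names, flags, and executable names shown here are illustrative assumptions only and will be confirmed when the environment goes live:

pathf90 -O3 -o mysim mysim.f90 -lacml      (Fortran 90/95, linking ACML)
pathcc  -O3 -o mytool mytool.c             (C)
mpif90  -O3 -o mympi mympi.f90             (MPI code, via the MPICH compiler wrapper)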

ICHEC has also purchased the Intel Trace Analyser and Collector, a powerful tool to analyse and optimise parallel applications on Hamilton. See http://www.intel.com/cd/software/products/asmo-na/eng/cluster/tanalyzer/index.htm for more details.

Both products will be available for use from 1st June.

------------------------------------------------------------------------
3 – Supported applications and benchmarking exercise
------------------------------------------------------------------------

ICHEC supports a wide range of software, details of which can be found at http://www.ichec.ie/software. We have recently published the results of various application benchmarks to help you determine the optimal number of CPUs for your application. Applications tested so far include CPMD, DL_POLY, GROMACS, NAMD, SIESTA, and VASP. We intend to extend this study to a larger number of packages this summer; please contact us if you have datasets and/or simulation parameters we could use to benchmark realistic problems. See http://www.ichec.ie/incl/ICHEC_software_benchmarks.pdf.

Access to supported software requires prior registration with the helpdesk. Simply let us know which packages you would like to use.

Finally, we would like to point out that we are gradually moving supported packages under the module environment on Walton. As a reminder:

- module avail: list all the modules available to load
- module list: list all the modules you have currently loaded in your environment
- module load <module_name>: load the module with name <module_name>
- module unload <module_name>: unload the module with name <module_name>
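
For example, a typical session might look like the following (the package name is only an illustration; "module avail" will show the exact names installed on Walton):

module avail                    (see which packages are available)
module load gromacs             (load the GROMACS module, assuming that name)
module list                     (confirm which modules are loaded)
module unload gromacs           (remove it again when finished)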

------------------------------------------------------------------------
4 – Termination of the Transitional Service
------------------------------------------------------------------------

NOTE: The changes described in this section will not affect CosmoGrid projects. Contact Thibaut Lery at DIAS should you have any queries regarding CosmoGrid access.

The Transitional Service will be terminated on 31st May as initially announced. A number of changes will take place on 1st June:

a/ Any job running or queued under a Transitional Service project will be terminated. In other words, the only jobs allowed in the queuing system and running on our systems will be those submitted with #QSUB -A <project_code>, where <project_code> corresponds to a Class A, B or C project (or, of course, a CosmoGrid project).
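
In practice, the only change needed in an existing job script is to make sure the project is specified; the fragment below is purely illustrative:

#QSUB -A <project_code>
(the rest of the script, i.e. resource requests, mpirun line, etc., is unchanged)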

b/ Users who have not yet gained access under the Full National Service will be able to log on to their accounts on Walton and Hamilton until Friday 16th June, but will not be able to submit jobs. This two-week window will allow these users to transfer their files back to their home institutions before their scratch/work directories are deleted and their logins disabled on Monday 19th June. Home directories and Web accounts will be preserved to facilitate the return of users who intend to gain access through the Full National Service at a later date.

c/ Unallocated storage space will be configured as a global scratch area which all users may use for temporary storage (maximum one month). This space is not subject to quotas, but an automatic clean-up mechanism will delete any data older than 30 days.

d/ The Class A, B and C QoS levels will be abolished, as job priorities will instead be determined by the fair share mechanism.

e/ Along with fair share, we are incorporating a job's Expansion Factor (XFACTOR) into priority calculations. The XFACTOR is defined as:

XFACTOR = 1 + <EFFQUEUETIME> / <WALLCLOCKLIMIT>

The effect is that jobs with shorter requested wall times gain priority at a faster rate. If jobs are supplied with reasonably accurate wall time estimates, this should improve system utilisation through better use of backfill, and turnaround times should be more in line with expected runtimes. To have your job run as soon as possible, use the shortest reasonable wall time estimate. A number of measures are currently under consideration to reward users who provide accurate runtime estimates.
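
As a purely illustrative example, consider two jobs that have each already been queued for 2 hours:

Job 1 (wall clock limit of 2 hours):  XFACTOR = 1 + 2/2  = 2.00
Job 2 (wall clock limit of 24 hours): XFACTOR = 1 + 2/24 = 1.08 (approximately)

Job 1 therefore accumulates priority much faster than Job 2, even though both have waited the same length of time.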

f/ A 256 CPU queue will be set up for use exclusively by Class A projects.

------------------------------------------------------------------------
5 – Monitoring jobs' efficiency
------------------------------------------------------------------------

This is a reminder: we still encounter too many jobs running with low efficiency. Users are encouraged to use the command "qutil" to investigate the performance of their running jobs.

Usage: qutil [ -u username | -j jobid,... | -a | -h ] [ -s ]

For instance:

qutil -a (shows all of your jobs)
qutil -j 5675,5677 (lists the jobs with IDs 5675 and 5677, provided they belong to you)

Users who intend to run jobs on a large number of CPUs are expected to undertake a proper scalability study of their application before submitting a large number of jobs for production. More specifically, we recommend against running any application with a parallel efficiency below 60%.
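
For reference, parallel efficiency on N CPUs is commonly defined as follows (this is the standard definition, not an ICHEC-specific one):

efficiency(N) = T(1) / ( N x T(N) )

where T(1) is the runtime on a single CPU and T(N) the runtime on N CPUs. For example, an application that takes 100 minutes on 1 CPU and 5 minutes on 32 CPUs has a parallel efficiency of 100 / (32 x 5) = 62.5%, just above the recommended threshold.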

You may also use the "pgprof" profiler for a more in-depth analysis of the performance of your application on Walton (see http://www.pgroup.com/doc/pgitools.pdf), and soon the Intel Trace Analyser and Collector on Hamilton (see item 2 in this mailing).

------------------------------------------------------------------------
6 – Monthly reports
------------------------------------------------------------------------

Monthly reports describing our systems' availability and utilisation can be found at http://www.ichec.ie/reports. The April 2006 report is now available on-line.

------------------------------------------------------------------------
7 – IAHPC workshop
------------------------------------------------------------------------

The Irish Association for High-Performance Computing (IAHPC) will be holding a two-day workshop on 8th and 9th June at the Tyndall Institute in Cork.

This meeting will include a series of presentations by HPC experts from EPCC and Daresbury Laboratory; anyone with an interest in HPC is warmly invited to take part in this event.

Further information will appear at www.iahpc.ie.
