Ireland's High-Performance Computing Centre | ICHEC

User Mailing

ICHEC mail #16


Posted: 2006-02-14

Dear ICHEC users,

Contents:

1. National survey of HPC users in Ireland
2. Walton’s acceptance tests
3. Transition to the full national service
4. New scheduling policies
5. Disk quotas
6. Quality of Service (QoS)
7. Misc useful information
8. Monthly reports

------------------------------------------------------------------------
1 - National survey of High Performance Computing users in Ireland
------------------------------------------------------------------------

The Trinity Centre for High Performance Computing (TCHPC) recently published the results of a survey of High Performance Computing users in Ireland, conducted in late 2005. This survey, based on the results of an online questionnaire, raised a number of important issues which SFI has asked ICHEC to investigate further. A follow-up national survey of High Performance Computing users in Ireland has therefore recently been started as a collaboration between ICHEC and TCHPC. The new survey consists of three phases:

1. An exhaustive search to identify all researchers (post-doctoral level and above) with a need for HPC in Third Level Institutes in Ireland;

2. A request that each researcher fills in an online questionnaire;

3. A request that each researcher participates in a detailed one-on-one interview with HPC experts from ICHEC or TCHPC.

If you have not already been invited by a member of ICHEC staff to take part in this survey, please contact j-c.desplat (our e-mail addresses are formed by appending the suffix @ichec.ie) as soon as possible to receive an invitation. Note that the survey is only open to researchers at postdoctoral level and above. PhD students should be represented by their supervisors.

------------------------------------------------------------------------
2 - Walton’s acceptance tests
------------------------------------------------------------------------

We are pleased to announce that Walton has now passed its stability tests. ICHEC therefore accepted ownership of the cluster on Thursday 26th January.

------------------------------------------------------------------------
3 - Transition to the full national service
------------------------------------------------------------------------

The full national service officially started on 1st February 2006 as planned. Users currently making use of the system under the terms of the transitional service will keep their access, in order to minimise disruption to their work and to make the most effective use of our facilities. However, jobs submitted by users granted access under the full service will be given a higher priority, and will therefore benefit from a faster turn-around time.

Users of the transitional service are therefore advised not to wait before submitting their application to the full service, as their job turn-around time will degrade substantially over time. For further information on how to apply to the full service, see http://www.ichec.ie/full_national_service and carefully read the application guidelines at http://www.ichec.ie/application_guidelines. Access to the transitional service will be discontinued on 31st May.

Timetable:

- Class C applications considered since 01/02/2006 (some already accepted);

- Class B applications to be considered at the inaugural meeting of the Scientific Council, to be held in Dublin on ***23/02/2006***;

- Class A applications: first call to be issued in March/April 2006.

------------------------------------------------------------------------
4 - New scheduling policies
------------------------------------------------------------------------

A number of changes will be implemented on 1st March to ensure a faster turn-around time on Walton:

- the maximum run-time will be brought down from 5 days to 4 days (96 hours). Please remember to adjust your PBS scripts accordingly if you are specifying longer run-times, or your jobs will be rejected by the queuing system. We strongly encourage users who have not yet implemented checkpoint/restart functionality within their production code to do so.

- the soft limit on the maximum number of jobs a single user may run will be brought down from 8 to 6. The hard limit remains unchanged at 12 jobs.

These changes are necessary to ensure a fairer share of the resources among users. We would also urge users to specify better estimates of the maximum run-time within their batch scripts, rather than systematically specifying the maximum allowed runtime.
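
As an illustration of the new run-time limit, the sketch below checks a requested run-time against the 96-hour maximum before a job is submitted. It assumes that run-times are requested with the standard PBS directive "#PBS -l walltime=HH:MM:SS"; the function name walltime_ok is hypothetical and is not an ICHEC-provided tool.

```shell
#!/bin/sh
# Sketch only: walltime_ok is a hypothetical helper, not part of PBS.
# It assumes run-times are requested in a job script with a line such as:
#   #PBS -l walltime=72:00:00

# walltime_ok HH:MM:SS
# Succeeds (exit status 0) if the run-time is at most 96 hours.
walltime_ok() {
    printf '%s\n' "$1" |
        awk -F: -v max_hours=96 '{
            # Convert HH:MM:SS to seconds; awk reads "08" as decimal 8.
            seconds = $1 * 3600 + $2 * 60 + $3
            exit ((seconds <= max_hours * 3600) ? 0 : 1)
        }'
}
```

For instance, "walltime_ok 72:00:00 && echo within limit" prints the message, whereas a request of 120:00:00 (the previous 5-day maximum) fails the check.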

------------------------------------------------------------------------
5 - Disk quotas
------------------------------------------------------------------------

Quotas will be enforced on Walton and Hamilton from 1st March. Users can check the current usage of their home directory by typing the command:

$ du -sk ~

For example:

jcdesplat@l2cu27:~> du -sk ~
720416 /ichec/home/staff/jcdesplat


Quotas will be set to 4GB on Walton and 2GB on Hamilton. Note that the contents of your home directory (under /ichec/home/users) will be backed up from 1st March. Home directories should only contain source code and analysis data. Raw data should instead be saved under /ichec/work/project where project corresponds to your project code. You can find out your project code(s) by using the command "groups", e.g.:

jcdesplat@l2cu27:~> groups
ichec science management icphy001 ulo ddt


My project code is therefore "icphy001". Then type the following to find the current usage on this directory:

$ du -sk /ichec/work/project

For instance:

jcdesplat@l2cu27:~> du -sk /ichec/work/icphy001
523200 /ichec/work/icphy001


The command "mmlsquota" can be used if a more complete report is required.
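
The figures reported by "du -sk" are in kilobytes, so they can be compared with a quota expressed in GB. The sketch below does this conversion; the function name usage_report is hypothetical, and it assumes that 1GB is counted as 1024 x 1024 kilobytes, which may differ from the way the quota system itself counts.

```shell
#!/bin/sh
# Sketch only: usage_report is a hypothetical helper.
# Assumption: 1GB = 1024 * 1024 KB.

# usage_report USED_KB QUOTA_GB
# Prints the usage as a whole-number percentage of the quota (rounded down).
usage_report() {
    used_kb=$1
    quota_kb=$(( $2 * 1024 * 1024 ))
    echo "$(( $used_kb * 100 / $quota_kb ))% of ${2}GB used"
}

# Example call for a home directory against the 4GB Walton quota:
#   usage_report "$(du -sk ~ | cut -f1)" 4
```

With the home directory figure shown earlier (720416 KB) and the 4GB Walton quota, this reports 17%.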

The amount of disk space available to users of the transitional service will be progressively reduced to make space for users supported under the full service.

------------------------------------------------------------------------
6 - Quality of Service (QoS)
------------------------------------------------------------------------

Quality of Service (QoS) is being implemented to provide a suitable level of access to CosmoGrid users (as defined in our Service Level Agreement), and to provide a better service to groups supported under the full service. In order to benefit from the QoS, you will need to include the following line in your PBS scripts:

#PBS -W x=QOS:cosmogrid

if you are a CosmoGrid user (your project code will be of the form cgXXX, where XXX is a three-digit number), or

#PBS -W x=QOS:ClassC

if your project is a class C project (supported as part of the full service). In that case, your project code will be of the form aabbbxxxc, i.e., a 9-character code with "c" as the final letter.
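
The two rules above can be sketched as a small shell function that maps a project code to the QoS name to use. The function name qos_for_project is hypothetical, and the patterns only encode the two code formats described in this mail: any 9-character code ending in "c" is treated as class C.

```shell
#!/bin/sh
# Sketch only: qos_for_project is a hypothetical helper.

# qos_for_project PROJECT_CODE
# Prints the QoS name for the project code, or fails if none applies.
qos_for_project() {
    case "$1" in
        cg[0-9][0-9][0-9]) echo "cosmogrid" ;;  # cgXXX, XXX = three digits
        ????????c)         echo "ClassC" ;;     # 9 characters ending in "c"
        *)                 return 1 ;;          # no QoS line applies
    esac
}
```

A code such as icphy001 (8 characters, no trailing "c") matches neither pattern, so no QoS line would be added for it.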

------------------------------------------------------------------------
7 - Misc useful information
------------------------------------------------------------------------

First, the automated mailing on start or completion of a batch job is now fully working. See http://www.ichec.ie/user_documentation.php#7 for further information.

A problem has also recently been reported to us where a failing job kept on re-submitting itself. Although in most cases jobs do die properly, we have observed cases where, if a job is allocated a 'problem node' that causes it to crash, it gets rescheduled to run again. This behaviour can be avoided by inserting the following directive within your PBS scripts:

#PBS -r n
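
The directive must be typed with a plain hyphen; a typographic dash copied from a formatted message is unlikely to be recognised as an option. As a rough check, the sketch below tests whether a job script contains the directive in that exact form. The function name has_no_rerun is hypothetical.

```shell
#!/bin/sh
# Sketch only: has_no_rerun is a hypothetical helper.

# has_no_rerun SCRIPT
# Succeeds if SCRIPT contains a line of the form "#PBS -r n",
# written with an ASCII hyphen.
has_no_rerun() {
    grep -Eq '^#PBS[[:space:]]+-r[[:space:]]+n[[:space:]]*$' "$1"
}
```

For example, "has_no_rerun myjob.pbs || echo 'missing #PBS -r n'" prints a warning for a script (here a hypothetical myjob.pbs) that lacks the directive.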

------------------------------------------------------------------------
8 - Monthly reports
------------------------------------------------------------------------

And finally, monthly reports are now published by ICHEC at http://www.ichec.ie/reports. These reports include information such as utilisation figures, usage profiles, and helpdesk and training activities.
