Ref. No.
TBC
Closing Date
TBC

HPC System Administrator

Applications are invited from suitably qualified candidates for a full-time, fixed-term position as an HPC System Administrator with the Irish Centre for High End Computing (ICHEC) at the University of Galway.

This position is available from January 2023 to work on high-performance computing and data management projects. The candidate will be based at our offices in Dublin or Galway, with the option of a hybrid work-from-home arrangement.

Interested candidates with the qualifications specified below should contact recruit@ichec.ie for further details.

Job Description:

Selected responsibilities and duties for this post include, but are not limited to:

  1. Diagnosing and resolving faults on HPC and associated systems
    • Interact with users and computational scientists in diagnosing and resolving problems with applications
    • Understand the complex software and hardware stack comprising HPC clusters in order to troubleshoot and fix underlying faults or performance issues
    • Work with supplier technical support when required to resolve issues
  2. Operation of HPC and associated systems
    • Automation of lifecycle management of user accounts and projects
    • Installation and configuration of software along with security and bugfix updates
    • Configuration of batch scheduling and accounting systems
    • Development of comprehensive system monitoring and alerting
  3. Commissioning of new platforms and services
    • Contribute to technical specification and tender evaluation of new HPC and other infrastructure platforms
    • Plan and implement the various stages of new platforms and services from commissioning and testing through to migration to production status

 

Essential Requirements

  1. Applicants must have a higher degree (Level 8) in computational science, computer science or a related discipline, or equivalent experience (minimum 3 years) in a similar technical environment.
  2. Advanced Linux systems administration skills with at least 3 years' practical experience.
  3. Good knowledge of, and experience in managing, fault-tolerant clustered services and cluster management software.
  4. Good knowledge of local area networking, including Layer 2 switch and VLAN configuration, and of network services such as DNS and the Apache and Nginx web servers.
  5. Good knowledge of security principles and practices, including deploying firewalls and configuring SELinux, and experience using security monitoring and intrusion detection tools.
  6. Experience deploying configuration management tools (e.g. Ansible, SaltStack) and monitoring tools (e.g. Nagios, Icinga).
  7. Systematic approach to troubleshooting and problem solving.

 

Desirable Requirements

  1. Experience managing HPC-specific parallel filesystems such as Lustre
  2. Knowledge of, and experience managing, private cloud technology such as OpenStack
  3. Knowledge of federated identity and authentication management systems
  4. Knowledge of software-defined storage clusters (e.g. Ceph)

Salary

Administrative Officer, Grade 4. Salary €44,659 to €50,031 per annum, pro rata for shorter and/or part-time contracts (public sector pay policy rules pertaining to new entrants will apply).

Supported By

Department FHERIS
NUI Galway
HEA