Parallel

Valgrind can also be used to debug parallel programs. Debugging of POSIX pthreads programs is supported through the Helgrind tool (see the Valgrind User Manual). Debugging of distributed-memory applications that use the MPI message-passing standard, as is common in high-performance computing environments, is also possible. This support consists of a library of wrapper functions for the PMPI_* interface. When incorporated into the application's address space, either by direct linking or by LD_PRELOAD, the wrappers intercept calls to PMPI_Send(), PMPI_Recv(), etc. They then use client requests to inform Valgrind of memory state changes caused by the function being wrapped. This reduces the number of false positives that Memcheck would otherwise typically report for MPI applications.

The wrappers also take the opportunity to carefully check the size and defined-ness of buffers passed as arguments to MPI functions, hence detecting errors such as passing undefined data to PMPI_Send(), or receiving data into a buffer which is too small.

Using Valgrind with MPI in this way requires a PBS script so that the batch processing system can orchestrate the execution. In the --log-file option, %p is replaced with the current process ID, so each process writes its own log file. This is very useful for programs that invoke multiple processes. You need to compile your application with the same compiler and MPI module that are used in the script. Using a different MPI library will generate a lot of spurious messages in your output file.
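One way to check that the binary really uses the MPI library from the loaded module is to look at the shared libraries it resolves to. The following is a minimal sketch, not part of Valgrind or PBS: the function name list_mpi_libs is hypothetical, and it assumes the application links MPI dynamically and that the library names start with "libmpi".

```shell
#!/bin/bash
# Hypothetical helper (illustrative name): reads `ldd` output on stdin and
# prints the MPI shared libraries together with the paths they resolve to.
# Assumption: MPI is linked dynamically and its libraries start with "libmpi".
list_mpi_libs() {
    # ldd lines look like: "libmpi.so.12 => /path/to/libmpi.so.12 (0x...)"
    awk '$1 ~ /^libmpi/ && $2 == "=>" {
        if ($3 == "not") print $1, "not found"   # unresolved library
        else             print $1, $3            # library name and path
    }'
}

# Usage inside the job script, after the "module load" line:
#   ldd ./mpi_hello.x | list_mpi_libs
```

If the printed paths do not point into the MPI installation that the module in the script provides, the application was probably built against a different MPI library and should be recompiled.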

Sample PBS script:

#!/bin/bash
#PBS -l nodes=1:ppn=24
#PBS -l walltime=00:05:00
#PBS -A sci_test
#PBS -o test_valg.out
#PBS -e test_valg.err
#For gcc
#module load dev valgrind/gcc/3.10.1
#export LD_PRELOAD=/ichec/packages/valgrind/gcc/3.10.1/lib/valgrind/libmpiwrap-amd64-linux.so

#For Intel
module load dev valgrind/intel/3.10.1
export LD_PRELOAD=/ichec/packages/valgrind/intel/3.10.1/lib/valgrind/libmpiwrap-amd64-linux.so
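# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# %p in --log-file expands to the process ID, giving one log file per process
mpirun -n 1 valgrind --leak-check=full --log-file=Valgrind.%p ./mpi_hello.x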

You should see something similar to the following in the output file, repeated for every process in the group.

valgrind MPI wrappers 31855: Active for pid 31855
valgrind MPI wrappers 31855: Try MPIWRAP_DEBUG=help for possible options
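After the job finishes, it can be handy to confirm how many processes reported active wrappers and to collect the error summary from each per-process log. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the log prefix "Valgrind." matches the --log-file option in the sample script, and each log ends with Valgrind's usual "ERROR SUMMARY" line.

```shell
#!/bin/bash
# Hypothetical helpers for inspecting a finished job (names are illustrative).

# Count the processes that reported active MPI wrappers in a job output file.
count_active_wrappers() {
    grep -c 'valgrind MPI wrappers [0-9]*: Active for pid' "$1"
}

# Print the last ERROR SUMMARY line of every log whose name starts with the
# given prefix (default "Valgrind.", as produced by --log-file=Valgrind.%p).
summarise_valgrind_logs() {
    local prefix="${1:-Valgrind.}"
    local log summary
    for log in "$prefix"*; do
        [ -f "$log" ] || continue            # skip if nothing matched
        # Strip the leading "==PID== " marker that Valgrind adds to each line.
        summary="$(grep 'ERROR SUMMARY' "$log" | tail -n 1 | sed 's/^==[0-9]*== //')"
        printf '%s: %s\n' "$log" "${summary:-no ERROR SUMMARY line found}"
    done
}

# Usage after the job has completed (file names taken from the sample script):
#   count_active_wrappers test_valg.err
#   summarise_valgrind_logs Valgrind.
```

Depending on how the batch system captures the streams, the wrapper messages may land in the file given by -o or the one given by -e, so it may be necessary to check both.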
