Cheyenne job efficiency and the select statement

February 8, 2018

What’s the difference between running Cheyenne jobs efficiently and inefficiently? The CISL Consulting Services Group (CSG) recently encountered a case where revising a batch script select statement made a huge difference.

A WRF user was running simulations on 60 Cheyenne nodes, intending to use all 36 cores of each node with 4 MPI processes and 9 OpenMP threads per process. The select statement below would likely have been fine, except that the user had compiled WRF with the dmpar option, which enables only distributed-memory MPI support, rather than dm+sm, which enables both MPI and OpenMP support:

#PBS -l select=60:ncpus=36:mpiprocs=4:ompthreads=9

With an assist from CSG, the user modified the select statement as follows to use 36 MPI processes per node. Jobs that had run at 10.8% efficiency now run at more than 99%:
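That 10.8% figure is consistent with what a dmpar (MPI-only) build would do with the original request: the 9 OpenMP threads each rank asked for never run, so only the 4 MPI ranks per node keep cores busy. A quick back-of-the-envelope check:

```shell
# With an MPI-only (dmpar) build, ompthreads=9 is ignored, so only
# 4 of each node's 36 cores do work: roughly 11% utilization,
# in line with the observed 10.8% job efficiency.
echo $((100 * 4 / 36))   # prints 11
```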

#PBS -l select=60:ncpus=36:mpiprocs=36:ompthreads=1
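For context, a minimal batch script built around the corrected request might look like the sketch below. The job name, project code, queue, wall-clock limit, and executable name are placeholders, not details from the original case:

```shell
#!/bin/bash
#PBS -N wrf_run               # placeholder job name
#PBS -A PROJECT_CODE          # placeholder project/allocation code
#PBS -q regular               # placeholder queue
#PBS -l walltime=02:00:00     # placeholder wall-clock limit
#PBS -l select=60:ncpus=36:mpiprocs=36:ompthreads=1

# Pure-MPI run to match a dmpar build:
# 60 nodes x 36 ranks per node = 2,160 MPI ranks, one per core
mpiexec_mpt ./wrf.exe         # wrf.exe is a placeholder executable name
```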

Improvements like that can make your allocation go a lot farther. Do some of your jobs run significantly slower than you think they should? Do you unexpectedly run out of wall-clock time? Take another look at how you’re requesting resources in your job script (and how you compiled your code), and don’t hesitate to contact CSG for assistance.