Casper V100 nodes updated - temporary performance reduction likely

December 30, 2019

Several Casper nodes were updated with new high-speed network hardware earlier this week. The new hardware required updates to the network software on those nodes, which include all nodes with NVIDIA V100 GPUs. Most of Casper’s nodes have not yet been updated, and their network is unchanged. The updated nodes are listed below.

The updates required Open MPI 3.1.4 to be rebuilt, and older versions of Open MPI will likely no longer work on the updated nodes. Intel MPI usage should be unaffected by the network upgrade. However, until the remainder of Casper’s nodes are updated with the new network hardware, multi-node jobs that run on a mix of updated and non-updated nodes will likely run more slowly than expected with any MPI library. Further changes to Casper’s Open MPI software stack are likely in the next several weeks. Please watch for related announcements in the Daily Bulletin.

Updated Casper nodes: casper08, casper09, casper23, casper24, casper25, casper27, casper28
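
For users who want to check whether a multi-node job landed on a mix of updated and non-updated nodes, the following is a minimal sketch rather than an official tool. It assumes you already have the job's hostnames as a list of strings; how you obtain them depends on the scheduler and is not covered here. The function name classify_allocation is hypothetical, and the node set is copied from the list above.

```python
# Updated nodes, copied from the list in this announcement.
UPDATED_NODES = {
    "casper08", "casper09", "casper23", "casper24",
    "casper25", "casper27", "casper28",
}


def classify_allocation(hostnames):
    """Return 'updated', 'non-updated', or 'mixed' for a job's node list.

    Hypothetical helper for illustration; not part of any Casper tooling.
    """
    # Drop any domain suffix so a fully qualified name still matches.
    short_names = {h.strip().split(".")[0] for h in hostnames if h.strip()}
    if not short_names:
        raise ValueError("no hostnames given")

    updated = short_names & UPDATED_NODES
    if updated == short_names:
        return "updated"
    if not updated:
        return "non-updated"
    # A mixed allocation is the case expected to run more slowly.
    return "mixed"
```

A result of "mixed" corresponds to the case described above in which reduced MPI performance is likely. Any hostname not in the list, such as an assumed "casper10", is treated as non-updated.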