Daily Bulletin Archive

May. 21, 2018

No downtime: Cheyenne, GLADE, Geyser_Caldera and HPSS

May. 17, 2018

Funders and publishers increasingly require scientists to make their data open, and an AGU webinar at noon MDT on Thursday, May 17, can help you meet these evolving requirements. Shelley Stall, Director of Data Programs at AGU, will present “Data Management Fundamentals” to explore data management resources and proven strategies for making research more discoverable and understandable. See the AGU announcement to register.

May. 15, 2018

HPSS Disaster Recovery downtime: Tuesday, May 16th, 7:30 a.m. to 11:30 a.m.

No downtime: Cheyenne, GLADE, Geyser_Caldera

May. 11, 2018

Plans are under way to implement major system updates to Geyser, Caldera and the GLADE file system in June. The Geyser and Caldera clusters’ operating systems will be upgraded to CentOS 7, and GLADE will be upgraded to a new version of GPFS.

Users will need to recompile their executables to run on Geyser and Caldera. Job scripts may need to be updated to run in the CentOS 7 environment as version numbers of most software on the systems will be different.

CISL has begun building all of the required system software such as compilers, the module environment, and critical libraries including NetCDF, MATLAB and IDL. Once those tasks are completed later this month, several Geyser and Caldera CentOS 7 nodes will be made available for users to begin testing and rebuilding their applications.

GLADE will be upgraded to GPFS version 5.0, which will provide improved stability, performance and maintainability. This upgrade will not require users to make any changes.

Updates and more details, including target dates, will be posted in the Daily Bulletin as they become available.

May. 11, 2018

Video and slides are now available from a recent CISL Seminar Series presentation by Dave Hart, NCAR/CISL User Services Manager, on the new storage systems and practices being put into place this year to reflect the new reality facing users of CISL's petascale computational environments. With input from users, and taking into account recent changes in the storage technology landscape, CISL has begun to transform its storage offerings while helping users understand the trade-offs that they face now and will continue to face.

More information and training support will be coming over the next several weeks and months.

May. 8, 2018

The Cheyenne cluster will be unavailable from 8 a.m. to 6 p.m. MDT today, May 8, to allow CISL staff and HPE engineers to perform hardware maintenance and address several known issues with the PBS job scheduler.

Users will be unable to log in or submit new jobs during the maintenance period. Running jobs that have not finished when maintenance begins will continue executing to completion. Jobs that have been submitted and are queued for execution will be dispatched by PBS as they normally would be.

Users will be informed via the CISL Notifier service when Cheyenne is returned to service.

May. 8, 2018

HPSS downtime: Tuesday, May 8th, 7:30 a.m. to 3:00 p.m. for a library management system upgrade

DAV maintenance: Monday, May 7th, 12:00 p.m. to 1:00 p.m.

Cheyenne planned maintenance: Tuesday, May 8th, 8:00 a.m. to 6:00 p.m.

No downtime: GLADE

May. 7, 2018

The Geyser and Caldera clusters will be unavailable from noon to 1 p.m. MDT today, May 7, to allow CISL system administrators to perform maintenance on the Slurm job scheduler.

Users will be unable to log in to either cluster during the maintenance period and new job submissions will not be possible. No interruptions are expected to existing login sessions or batch jobs that are already running or queued for execution.

We apologize for any inconvenience this might cause. Users will be informed via the CISL Notifier service when the systems are returned to service.

May. 1, 2018

Batch jobs running on the Cheyenne systems sometimes fail when they create large stdout or stderr files that overflow the spool directory on the first compute node used. Failures from this condition are more likely with MPI jobs. To avoid the problem, CISL recommends redirecting job output to a file as described in this newly updated documentation regarding job scripts.
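As a rough illustration of the redirection described above, the sketch below sends a program's stdout and stderr to a log file from inside a job script. The job name, resource request, output directory, and the `run_model` stand-in function are all hypothetical placeholders, not values taken from the CISL documentation; a real script would launch the actual executable and write to a user-writable file system location.

```shell
#!/bin/bash
# Hypothetical PBS job script; directive values are placeholders only.
#PBS -N example_job
#PBS -l select=1:ncpus=1
#PBS -l walltime=00:05:00
#PBS -j oe

# Stand-in for the real executable (for example, an MPI launch line).
run_model() {
    echo "step 1 complete"
    echo "warning: sample diagnostic" >&2
}

# Assumed output location; OUTDIR is a hypothetical variable for this sketch.
outdir="${OUTDIR:-${TMPDIR:-/tmp}}"
logfile="$outdir/example_job.$$.log"

# Send both stdout and stderr to the log file so that large output
# does not accumulate in the spool directory on the compute node.
run_model > "$logfile" 2>&1
```

With this pattern, only output produced outside the redirected command still reaches the scheduler's own stdout and stderr files.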

May. 1, 2018

The recommended way to set up your Cheyenne user environment (or, in some cases, environments) is to load the desired modules after logging in or to create customized environments as described here.

Some users recently reported problems that resulted from loading environment modules in their personalized start files (.bashrc, .cshrc, .kshrc, .tcshrc, .login, .profile, and so on) instead of following the recommended procedures.

As advised in our Personalizing start files documentation, Cheyenne users should not set environment modules in their start files by using commands such as:

  • module load nco

  • module load netcdf
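By contrast, the same commands can be run explicitly after logging in or placed in a job script. The following is only a minimal sketch of that idea: the `wanted_modules` variable is a hypothetical name introduced here, and the check for the `module` command is included only so the fragment also runs harmlessly on a machine without an environment-modules installation.

```shell
#!/bin/bash
# Sketch: load modules explicitly in a session or job script rather than
# in a start file such as .bashrc or .profile.
# wanted_modules is a hypothetical variable name used for illustration.
wanted_modules="nco netcdf"

for m in $wanted_modules; do
    if command -v module >/dev/null 2>&1; then
        # The module command is available; load the requested module.
        module load "$m" || echo "could not load module: $m" >&2
    else
        # No module command on this machine; report what would be loaded.
        echo "module command not available; would load: $m"
    fi
done
```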

Please contact the CISL Consulting Services Group if you have questions.