Daily Bulletin Archive

April 16, 2019

Use relative paths and environment variables instead of hardcoding directory names in your job scripts. Hardcoded paths, in scripts and elsewhere, can make your code harder to debug. They also complicate matters when others need to copy your directories to build and run your code under their own accounts.

See this CISL page for a simple example and more information.
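As a rough illustration of the advice above (not the example from the CISL page), the following Python sketch builds job directories from environment variables and a script-relative location instead of hardcoded names. The function name `resolve_job_paths`, the `input` and `output` directory names, and the choice of the `TMPDIR` and `USER` variables are assumptions made for this example; substitute whatever variables your system actually defines.

```python
import os
from pathlib import Path


def resolve_job_paths(env=None, script_file=None):
    """Build job directories from the environment, not from hardcoded names.

    All names here (TMPDIR, USER, "input", "output") are illustrative
    assumptions, not settings documented by CISL.
    """
    env = os.environ if env is None else env

    # Locate inputs relative to the script itself, so the directory tree
    # can be copied elsewhere and still work.
    script_dir = Path(script_file).resolve().parent if script_file else Path.cwd()

    # Take the scratch location from the environment; fall back to a
    # "tmp" directory next to the script if TMPDIR is not set.
    scratch_root = Path(env.get("TMPDIR", script_dir / "tmp"))

    # Use the current user's name rather than a hardcoded one.
    user = env.get("USER", "unknown")

    return {
        "input": script_dir / "input",
        "output": scratch_root / user / "output",
    }
```

Calling `resolve_job_paths(script_file=__file__)` from a job script returns paths that follow whoever runs the script and wherever the directory has been copied, so another user can run the same code without editing it.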

April 15, 2019

No scheduled downtime: Cheyenne, Casper, Campaign Storage, HPSS, and GLADE

April 15, 2019

The CISL website, the Systems Accounting Manager, Notifier service, ExtraView helpdesk ticketing system, and some other support services may be unavailable intermittently. Thank you for your patience as we work to resolve some network issues.

April 10, 2019

Batch jobs that fail tend to have much in common. While some fail for reasons that are beyond users’ control, many failures can be prevented with minor changes to batch scripts or by adopting best practices. This CISL web page – Common causes of job failures – points out several actions users can take to identify potential problems and ensure that jobs run successfully.

April 9, 2019

The HPSS Disaster Recovery service at the Mesa Lab will be down from 2 p.m. on Friday, April 12, until 9 a.m. on Monday, April 15.

The Cheyenne and Casper license server will be down Thursday, April 11, from 12 p.m. to 1 p.m. for a MATLAB upgrade.

No downtime for GLADE or Campaign Storage.

April 9, 2019

A semi-annual NCAR Mesa Lab building maintenance power-down is scheduled for Saturday, April 13, but it should have little impact on university users of CISL’s high-end resources. Some Boulder-based UCAR/NCAR staff will be unable to log in to the Cheyenne system or other services, but sessions that start before the power-down will not be affected. The maintenance work is scheduled to begin at 4 a.m. and conclude by early evening.

The Cheyenne and Casper clusters, the GLADE system, Campaign Storage, and HPSS will remain in service at the NCAR-Wyoming Supercomputing Center (NWSC) in Cheyenne. Services that will be unavailable during the power-down include the SAM accounting system, the CISL website, license servers for Mathematica and the PGI compilers, and the ExtraView help desk ticketing system. The license server that supports MATLAB users on Cheyenne will not be affected.

Users who have urgent help requests during this time should call 303-497-2400 or 307-996-4300 to reach the NWSC operations center.

April 8, 2019

The release of MATLAB version R2019a, previously scheduled for April 4, is now scheduled for this Thursday, April 11, at noon MDT. The update will apply to both the Cheyenne and Casper clusters. After the update, the default MATLAB version will remain at R2016b for several weeks to allow users time to update their scripts and workflows.

The update will require a restart of the license server, which is expected to take less than 60 minutes. The license server also manages the Intel and PGI compilers and IDL software. During the license server restart period users will not be able to access new instances of those licenses. Batch jobs and interactive processes that are already running when the update begins are not expected to be affected.

April 5, 2019

An update of MATLAB to version R2019a on both Cheyenne and Casper that was scheduled for Thursday, April 4, has been postponed because of issues with the new MATLAB license. Another announcement will be made when the update is rescheduled.

April 4, 2019

This planned update has been postponed.

The newest version of MATLAB – R2019a – will be released on both Cheyenne and Casper this Thursday, April 4, at noon MDT. At that time the default version of MATLAB will be switched from R2016b to R2019a on both systems. MATLAB versions R2015b, R2016b, and R2018a will remain available after the update.

The update will require a restart of the license server, which is expected to take less than 60 minutes. The license server also manages the Intel and PGI compilers and IDL software. During the license server restart period users will not be able to access new instances of those licenses. Batch jobs and interactive processes that are already running when the update begins are not expected to be affected.

April 4, 2019

The Cheyenne system and Casper nodes are configured for distinct purposes. Cheyenne is best used for running climate and weather models and simulations while the heterogeneous Casper cluster of nodes is for other specialized work. Most Casper nodes are used for analyzing and visualizing data while others feature large-memory, dense GPU configurations that support explorations in machine learning and deep learning.

This documentation explains how to get jobs running on the most appropriate system for your work and on the individual types of nodes that will best meet your needs.

For expert assistance or guidance in using these resources, contact the CISL Consulting Services Group. See this web page for additional best practices.
