Daily Bulletin Archive

February 8, 2019

Batch script examples for users of the Casper cluster have been updated to use the long form of Slurm directives, and several new examples have been added. The long-form syntax is clearer than the single-letter identifiers that were used previously.

See these documentation pages for new and updated script examples:

Some of the updated scripts include other revisions, such as removal of the module purge command. Contact cislhelp@ucar.edu if you need assistance or have questions about job scripts.
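
For illustration only, a minimal sketch of a job script written with long-form Slurm directives follows. It is not one of the CISL examples. The job name, project code, partition name, and resource values are placeholders chosen for this sketch, so substitute the values that apply to your own project and system. The comment above each directive notes the single-letter form it replaces.

```shell
#!/bin/bash
# Hypothetical example: all values below are placeholders, not CISL settings.

# Long form of -J
#SBATCH --job-name=example_job
# Long form of -A (placeholder project code)
#SBATCH --account=PROJECT0000
# Long form of -p (placeholder partition name)
#SBATCH --partition=PARTITION_NAME
# Long form of -n
#SBATCH --ntasks=1
# Long form of -t (10 minutes of wall-clock time)
#SBATCH --time=00:10:00
# Long form of -o
#SBATCH --output=example_job.out

# The shell treats the directives above as comments; only sbatch reads them.
job_label="example_job"
echo "Starting ${job_label} on ${HOSTNAME}"
```

A script like this would typically be submitted with the sbatch command, which reads the #SBATCH lines before the first executable command.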

 

February 7, 2019

The next regular maintenance operations on NCAR’s HPC systems are scheduled for Tuesday, March 5. The Cheyenne and Casper clusters and the GLADE file system are expected to be unavailable from 7 a.m. until 6 p.m. MST, but every effort will be made to restore the systems to users earlier if possible. More details on the outage will be published in the Daily Bulletin later this month.

February 6, 2019

NCAR’s Computational and Information Systems Laboratory (CISL) is seeking users’ input as part of the process of procuring a system to follow the Cheyenne cluster currently in production in the NCAR-Wyoming Supercomputing Center. Users’ input will help ensure procurement of a system that best meets the community’s future needs.

Users are asked to provide their input by completing this survey by March 1: User Survey for CISL's Next HPC Procurement. The survey includes questions about users’ experience with the Cheyenne environment and priorities for the next-generation environment, which is expected to be in production by mid-2021.

Two of the three sections of the survey can be completed in about 10 minutes. Users are welcome and encouraged to respond to some open-ended questions in the third section if they can take the time.

 

February 5, 2019

Additional CMIP6 data are now available on GLADE through the CMIP Analysis Platform, including monthly averages from the historical, 1pctCO2, and piControl experiments. Data sets are available from the BCC, CNRM, IPSL, NASA, and NOAA modeling centers. Users can request the addition of other data sets as modeling centers publish them. NCAR's initial CMIP6 data products are expected to be available in the first quarter of this calendar year.

More information: CMIP Analysis Platform.

 

February 5, 2019

Cheyenne: Tuesday, noon to 1 p.m. (details)

No scheduled downtime: Casper, Campaign Storage, GLADE and HPSS

February 1, 2019

The Casper cluster will be expanded soon with the addition of two nodes. The new nodes are similar to the two existing Supermicro nodes with eight NVIDIA Tesla V100 GPUs. They will support ongoing and future machine learning and deep learning efforts.

The new nodes have been received and are being installed by CISL staff at the NCAR-Wyoming Supercomputing Center. When the installation is complete, CISL system administrators and software engineers will begin acceptance testing, which is expected to take several weeks. More details will be published when the new nodes are ready for users.

 

January 30, 2019

CISL system administrators will update the PBS workload management server at noon MST on Tuesday, February 5. The update is expected to take less than 60 minutes to complete. The new version of PBS, 18.2.3, provides performance and stability improvements and a number of important bug fixes.

Most PBS commands, including qstat, will not work during the update, and new Cheyenne job submissions will not be possible. Jobs that are executing when the maintenance begins will continue to run without interruption. Jobs that are queued for execution or in a hold state will remain in those states until PBS is returned to service. Access to Cheyenne’s login nodes will not be interrupted.

January 29, 2019

No scheduled downtime: Cheyenne, Casper, Campaign Storage, GLADE and HPSS

January 28, 2019

CISL plans to roll out a new Jira Service Desk system with an integrated Confluence Knowledge Base to help HPC users, CISL staff, and others quickly find the solutions or assistance they need. The new system is expected to be ready in February and will replace the ExtraView ticketing system that has been in place for most of the past decade.

Service Desk features a friendlier user interface, simplified request forms, and a knowledge base of articles to answer common questions. Users will also be able to log in to track the status of their in-progress tickets.

UCAR/NCAR personnel already have the CIT passwords that are required to log in to Jira Service Desk, as do users who have Duo two-factor authentication rather than YubiKey tokens. Users who do not yet have a CIT password can call 303-497-2400 for assistance.

More information on implementation of the new service desk will be available soon.

 

January 25, 2019

The maintenance operations on NCAR’s HPC systems that were scheduled for Tuesday, February 5, have been canceled. To minimize inconvenience to users, the work that was scheduled for that day will be combined with other system maintenance on Tuesday, March 5. More details on the March 5 outage will be published in the Daily Bulletin next month.

 
