Daily Bulletin Archive

July 14, 2020

All work scheduled for Monday at the NCAR-Wyoming Supercomputing Center (NWSC) was completed on time and without incident. Nortek engineers and CISL’s Cheyenne Admin Support Group completed repairs on several of Cheyenne’s cooling distribution units. CISL’s Infrastructure Support Group Cheyenne and external contractors began this week’s repair efforts on the facility’s power infrastructure. CISL HPC systems engineers updated critical components of the system software stack, and HPE replaced 11 of the 16 manifolds identified as faulty.

The Casper cluster, GLADE and Campaign Storage file systems, HPSS, data-access nodes, and remote access services FastX and VNC will be unavailable today from 7 a.m. until approximately 7 p.m. CISL’s pre-production implementation of JupyterHub will be available after Cheyenne is restored to service later this week.

Cheyenne is expected to be returned to service on Saturday, July 18. Watch for regular updates in CISL’s Daily Bulletin and Notifier service.

July 13, 2020

The Cheyenne supercomputing system is unavailable this week to allow CISL staff and external contractors to perform power infrastructure repairs at the NCAR-Wyoming Supercomputing Center. CISL high-performance computing specialists and HPE engineers will also perform several repairs and updates while the system is down. The maintenance began at 5 a.m. MDT today. The system is expected to be returned to service on Saturday, July 18. 

The Casper cluster, GLADE and Campaign Storage file systems, HPSS, data-access nodes, and remote access services FastX and VNC will remain available throughout the week except for Tuesday, July 14, from 7 a.m. until approximately 7 p.m. While Cheyenne is unavailable, users can access GLADE by logging in to casper.ucar.edu. CISL’s pre-production implementation of the JupyterHub platform on Cheyenne and Casper will be available after Cheyenne is restored to service.

Watch for regular updates in the Daily Bulletin and Notifier service.

July 9, 2020

CISL staff recently added a new section to the Casper documentation about using remote desktops with virtual network computing (VNC). The new material describes how to use the vncmgr script to customize both the Casper session in which the VNC server will run and the server itself. It also clarifies that all VNC jobs are automatically placed on nodes with NVIDIA Quadro GP100 GPUs, so there is no need to specify GPU resources in those cases.

The update was prompted by user feedback, which is always appreciated. Users are encouraged to provide feedback and suggestions via this form or the CISL Help Desk.

July 9, 2020

The Cheyenne system is unavailable July 13-18 for scheduled maintenance, but other resources can still be used during the week to migrate data holdings from the HPSS tape archive. The Casper cluster, GLADE and Campaign Storage file systems, HPSS, and data-access nodes will all remain available except for Tuesday, July 14, from 7 a.m. until approximately 7 p.m.

HPSS will be decommissioned on October 1, 2021, and the tape archives have limited bandwidth. Users who wait too long to migrate their HPSS data holdings to other spaces such as Campaign Storage may run out of time to migrate all of their data, and any data still on HPSS will be lost when the system is shut down.

CISL documentation and this tutorial describe recommended processes for identifying and organizing HPSS holdings, for copying files that need to be preserved to another storage resource, and for deleting files that are no longer needed. Please contact CISL for advice on individual workflows and storage options.

July 8, 2020

No scheduled downtime: Cheyenne, Casper, GLADE, Campaign Store, Object Store, or HPSS

June 30, 2020

Accidentally deleting data can ruin your whole day and then some. One key to avoiding this is to double- or triple-check that you have not specified the same files as both source and destination before you execute a transfer.

When using the Globus web interface to transfer files, for example, activate the sync option in the Transfer & Sync Options menu. Sync can be set to allow a file transfer only if the file does not exist on the destination, if there’s a difference in checksum or file size, or if the source copy is newer than the destination copy. If you use the Globus command line interface, include the --verify-checksum and --sync-level checksum options when executing a transfer command.
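
As a rough illustration of the precautions above, the following Python sketch assembles a globus transfer command with the two options mentioned and refuses to proceed when the source and destination are the same location. The function name, endpoint IDs, and paths are hypothetical placeholders, not part of CISL's documentation, and the check only compares normalized path strings on matching endpoints.

```python
import posixpath


def build_transfer_command(src_endpoint, src_path, dst_endpoint, dst_path):
    """Return an argument list for `globus transfer`.

    Raises ValueError if source and destination point at the same location.
    This helper is a hypothetical illustration, not a CISL or Globus tool.
    """
    same_endpoint = src_endpoint == dst_endpoint
    # normpath removes trailing slashes and redundant separators before comparing.
    same_path = posixpath.normpath(src_path) == posixpath.normpath(dst_path)
    if same_endpoint and same_path:
        raise ValueError("source and destination are the same location")
    return [
        "globus", "transfer",
        "--verify-checksum",
        "--sync-level", "checksum",
        f"{src_endpoint}:{src_path}",
        f"{dst_endpoint}:{dst_path}",
    ]


if __name__ == "__main__":
    # Placeholder endpoint IDs and paths, for illustration only.
    cmd = build_transfer_command(
        "SRC_ENDPOINT_ID", "/path/to/source/",
        "DST_ENDPOINT_ID", "/path/to/dest/",
    )
    print(" ".join(cmd))
```

Printing the command before running it gives one more chance to confirm the source and destination by eye.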

Review this documentation for additional cautions and instructions for using Globus to transfer files.

Figure: Sync settings in the Globus web interface

June 26, 2020

No scheduled downtime: Cheyenne, Casper, GLADE, Campaign Store, Object Store, or HPSS

June 25, 2020

Video and slides from the June 24 tutorial "Casper Basics for New Users" are now available here in the CISL training library. The 30-minute presentation by Shiquan Su of the CISL Consulting Services Group provides a basic understanding of how to run jobs on the Casper cluster. Casper is a heterogeneous system of specialized data analysis and visualization resources and large-memory, multi-GPU nodes.

June 22, 2020

Users are reminded that the High Performance Storage System (HPSS) will reach its end of life and be decommissioned in 2021. HPSS file owners and project leads have been contacted and instructed on how to access lists of their files. The lists are updated weekly.

For reference, the lists can be found here:

  • /glade/work/csgteam/hpssreports/current/byusers/<userID>.data.gz
  • /glade/work/csgteam/hpssreports/current/byprojects/<projectID>.data.gz
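
The lists are gzip-compressed, so one way to take a first look at them is a short Python sketch like the one below. The internal format of the files is not described here, so the code only previews raw lines and assumes they are text; the function names and the example user ID are hypothetical.

```python
import gzip
from itertools import islice

# Directory taken from the paths listed above.
REPORT_ROOT = "/glade/work/csgteam/hpssreports/current"


def report_path(kind, identifier):
    """Build the report path for a user ID ("byusers") or project ID ("byprojects")."""
    if kind not in ("byusers", "byprojects"):
        raise ValueError("kind must be 'byusers' or 'byprojects'")
    return f"{REPORT_ROOT}/{kind}/{identifier}.data.gz"


def preview_report(path, max_lines=10):
    """Return the first max_lines lines of a gzip-compressed text file."""
    # Assumption: the report is text; undecodable bytes are replaced, not fatal.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, max_lines)]


if __name__ == "__main__":
    # "jdoe" is a placeholder user ID; this only works where the path exists.
    for line in preview_report(report_path("byusers", "jdoe")):
        print(line)
```

Reading only the first few lines avoids decompressing a large list just to see how it is laid out.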

Writing HPSS files is no longer possible, but users can perform most common metadata operations on their HPSS holdings, including deleting, renaming, and moving files. Those who have not already done so should begin moving their data to alternative storage systems and deleting files that are no longer needed.

Documentation and training are available on recommended processes for identifying and organizing HPSS holdings; copying files that need to be preserved to another storage resource; and deleting files that are no longer needed. 

Please contact CISL for advice on individual workflows and storage options.

June 19, 2020

No scheduled downtime: Cheyenne, Casper, GLADE, Campaign Store, Object Store, or HPSS
