Daily Bulletin Archive

November 13, 2018

An NCAR/CISL tutorial scheduled for 9:30 a.m. MST on November 29 will show participants how the Globus command line interface (CLI) can be used both interactively and within scripts as part of larger workflows. Much like the HSI interface to tape, the Globus CLI enables programmatic access to NCAR and external data endpoints, including the new NCAR Campaign Storage archive. The one-hour tutorial will be at the Foothills Lab (FL2-2001) in Boulder and online.

Topics will include:

  • Initiating, monitoring, and modifying transfers between endpoints

  • Incorporating CLI commands into data mover scripts

  • Best practices for storing data at NCAR

  • Strategies for archiving research data on the new Campaign Storage platform
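As a rough illustration of using CLI commands in a data mover script, the sketch below assembles the argument lists for a recursive transfer and for waiting on the resulting task. It only builds and prints the commands and does not contact Globus. The endpoint IDs, paths, and label are hypothetical placeholders, and the options shown (--recursive, --label, and --timeout for "globus task wait") should be checked against "globus transfer --help" for the installed CLI version.

```python
import shlex


def build_transfer_command(src_endpoint, src_path, dst_endpoint, dst_path, label):
    """Return the argv list for submitting a recursive `globus transfer`."""
    return [
        "globus", "transfer",
        "--recursive",            # copy a directory tree rather than a single file
        "--label", label,         # human-readable name shown when monitoring the task
        f"{src_endpoint}:{src_path}",
        f"{dst_endpoint}:{dst_path}",
    ]


def build_wait_command(task_id, timeout_seconds=3600):
    """Return the argv list for blocking until a transfer task finishes."""
    return ["globus", "task", "wait", "--timeout", str(timeout_seconds), task_id]


if __name__ == "__main__":
    # Hypothetical endpoint IDs and paths, for illustration only.
    transfer = build_transfer_command(
        "SOURCE-ENDPOINT-ID", "/path/on/source/",
        "DEST-ENDPOINT-ID", "/path/on/destination/",
        "example transfer",
    )
    print(shlex.join(transfer))
    print(shlex.join(build_wait_command("TASK-ID")))
```

In a real script, each list could be passed to subprocess.run() once the endpoint IDs and paths have been replaced with actual values.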

Registration forms will be available in the next few days. Watch for another announcement in the Daily Bulletin.

November 12, 2018

No downtime: Cheyenne, GLADE, Geyser_Caldera, Casper and HPSS

November 9, 2018

The frequency and severity of hardware failures on the Geyser and Caldera clusters and the /glade/p_old file system have increased as they approach end of life. These failures pose a risk to users who have not yet moved files that they need from /glade/p_old to one of the new storage spaces or have not begun to migrate their workflows to the new Casper data analysis and visualization cluster.

Several Geyser and Caldera nodes and /glade/p_old disks have already suffered irrecoverable hardware failures. They have been removed and will not be replaced. CISL storage engineers have been able to recover all files from failed /glade/p_old disks and will continue their data recovery efforts until December 15. After that date, all files on /glade/p_old/ disks that fail will be lost permanently.

As announced previously, the /glade/p_old/ project and work spaces will be decommissioned December 31 and all files will be permanently removed from the system. CISL provides example scripts to help users move their files from old to new GLADE file spaces and documentation for using the new Campaign Storage archive. A tutorial planned for November 29 will demonstrate how to use the Globus CLI to work with Campaign Storage. Watch the Daily Bulletin for more information about the tutorial.

November 7, 2018

User documentation about how to start interactive jobs and batch jobs on NCAR’s new Casper data analysis and visualization cluster now includes additional detail. The “Starting jobs on Casper nodes” page has a new section on how to take advantage of the system’s local NVMe solid-state disk (SSD) storage. It also provides more information to help users specify which of the system’s multiple types of nodes to use for running their jobs.

CISL encourages users of the Geyser and Caldera systems, which will be decommissioned at the end of the year, to migrate their work to the Casper cluster soon. In addition to contacting cislhelp@ucar.edu for assistance, users can learn more about the system in a tutorial scheduled for November 14. See this CISL Daily Bulletin announcement to get more information and register.

November 5, 2018

CISL has canceled this month’s scheduled full-system downtime, both because the Cheyenne, Geyser, Caldera, and GLADE systems have become more stable and to improve system availability. Next month’s scheduled maintenance downtime has been moved from December 4 to December 11 to accommodate users’ preparations for this year’s AGU conference.

Users can check the HPC Events Calendar any time to see when significant events are scheduled, including maintenance downtimes, software releases, and new hardware installations. NCAR and UCAR users can add it to their own calendars with the “+ Google Calendar” button in the lower right corner of the calendar.

November 5, 2018

No downtime: Cheyenne, GLADE, Geyser_Caldera, HPSS and Casper

October 30, 2018

No downtime: Cheyenne, Casper, GLADE, Geyser_Caldera

October 26, 2018

Registration is now open for the NCAR/CISL Consulting Services Group’s 45-minute Casper user tutorial at 2:30 p.m. MST on Wednesday, November 14. The tutorial will introduce the capabilities of the new Casper system, describe how to access its features, and provide some best practices. These topics will be covered in detail:

  • The four types of Casper nodes and their features

  • Accessing Casper resources using Slurm

  • Interactive jobs and remote virtual desktops (VNC)

  • Using the GPU capabilities of Casper

Register to attend in person – in the Damon Conference Room at NCAR’s Mesa Lab in Boulder – or online.

October 24, 2018

A new lightweight, easy-to-use command is available for reporting Cheyenne job failures. The reportfailure command is intended for reporting failures that users suspect were caused by system issues such as node problems.

To report a suspected failure, from your Cheyenne command prompt enter reportfailure followed by a one-line description that includes the job ID. Here are two examples:

  • reportfailure  Job 1234567  MPT Warning, timeout with communication to r3i7n21

  • reportfailure  MPT: shepherd terminated: r2i2n21.ib0.cheyenne.ucar.edu   JobID: 2705231
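For workflows that detect failed jobs automatically, the one-line description could be assembled in a script. The sketch below is only an assumption about how one might do that: the helper name is hypothetical, and it builds the command without running it. It collapses any line breaks in the message so the report stays on one line and always includes the job ID.

```python
import shlex


def build_reportfailure_command(job_id, description):
    """Return argv for `reportfailure` with a one-line description that includes the job ID."""
    # Splitting on whitespace drops newlines and repeated spaces,
    # so the description always ends up on a single line.
    words = description.split()
    return ["reportfailure", "Job", str(job_id), *words]


if __name__ == "__main__":
    cmd = build_reportfailure_command(
        1234567, "MPT Warning, timeout with\ncommunication to r3i7n21"
    )
    print(shlex.join(cmd))
```

On Cheyenne, the returned list could be handed to subprocess.run() to submit the report.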

The command is not intended to be a substitute for opening ExtraView tickets, and users will not receive a response. Rather, use of reportfailure will help CISL more quickly identify and respond to potential system issues. If you have questions about this, contact CISL at 303-497-2400 or cislhelp@ucar.edu.


October 23, 2018

GLADE users who have not already done so need to take appropriate action soon as a result of several recent and important changes to GLADE’s project and work spaces.

As announced previously, the GLADE project and work spaces in /glade/p_old/ are now read-only and users can no longer move or delete files from those spaces. They will be decommissioned December 31 and all files will be permanently removed from the system. Users who still have files in /glade/p_old/ or /glade/p_old/work should copy them to one of the new storage systems as soon as possible.

Project spaces

CISL recommends moving active project data to /glade/p/<entity>/<project_code> where entity can be univ, uwyo, cesm, mmm, nsc, or other designated NCAR lab or special program. A one-year purge policy will be enforced on files in those new spaces, meaning files that are not accessed for more than one year will be deleted.

Project data that are not active but need to be preserved should be moved to the Campaign Storage archive. Users access and manage their Campaign Storage files with Globus services. A five-year purge policy will be enforced on Campaign Storage effective from the date files are created in that archive.
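The two purge policies can be pictured as simple date checks. The following is a minimal sketch, not CISL's implementation: the function names are hypothetical, and treating "one year" as 365 days is an assumption, since the announcement does not say how the interval is measured.

```python
from datetime import date, timedelta

# Assumption: "one year" without access is treated as 365 days.
PROJECT_PURGE_AFTER = timedelta(days=365)
CAMPAIGN_PURGE_AFTER_YEARS = 5


def project_file_is_purgeable(last_access, today):
    """True if a /glade/p file has gone more than one year without being accessed."""
    return today - last_access > PROJECT_PURGE_AFTER


def campaign_purge_date(created):
    """Return the date five years after a file was created in Campaign Storage."""
    target_year = created.year + CAMPAIGN_PURGE_AFTER_YEARS
    try:
        return created.replace(year=target_year)
    except ValueError:
        # February 29 has no counterpart in a non-leap target year; use February 28.
        return created.replace(year=target_year, day=28)


if __name__ == "__main__":
    today = date(2018, 10, 23)
    print(project_file_is_purgeable(date(2017, 10, 1), today))
    print(campaign_purge_date(today))
```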

Work spaces

All users now have individual directories in /glade/work with 1-TB quotas. Files in those directories are not purged. Users should copy any files they need from their /glade/p_old/work/ directories to their new /glade/work directories before December 31.
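Before copying, it can help to confirm that the files will fit within the quota. The sketch below is one possible approach and is not one of CISL's example scripts: the function names are hypothetical, and counting 1 TB as 10^12 bytes is an assumption. It adds the size of the source tree to whatever is already in the destination and copies only if the total stays under the quota.

```python
import os
import shutil

# Assumption: the 1-TB quota is counted as 10**12 bytes.
WORK_QUOTA_BYTES = 10**12


def directory_size_bytes(root):
    """Sum the sizes of regular files under root, skipping symbolic links."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def copy_if_within_quota(src, dst, quota_bytes=WORK_QUOTA_BYTES):
    """Copy src into dst only if the result would stay within the quota.

    Returns True if the copy was made and False if it was skipped.
    """
    needed = directory_size_bytes(src)
    used = directory_size_bytes(dst) if os.path.isdir(dst) else 0
    if needed + used > quota_bytes:
        return False
    # symlinks=True copies links as links instead of following them;
    # dirs_exist_ok=True (Python 3.8+) allows copying into an existing directory.
    shutil.copytree(src, dst, symlinks=True, dirs_exist_ok=True)
    return True
```

A call might look like copy_if_within_quota("/glade/p_old/work/username", "/glade/work/username"), where "username" stands in for the user's own directory name.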

These and other updates to storage systems have been published in table format here.

Contact cislhelp@ucar.edu with questions or for help copying files.