Daily Bulletin Archive

October 24, 2019

NCAR’s next supercomputer is slated to enter production early in 2022. The procurement process is moving right along, but this next-generation HPC system doesn’t have a name! We need your help coming up with something suitable.

Our goal is to select a name that reflects the value of the computational services we provide in support of our users' work in Earth system science. Our critically important services are among the core activities that enable NCAR to provide knowledge and information for the benefit of society, so it's worth giving some thought to what we will call this new system.

The best name will be easy to spell and pronounce, align with our mission, and inspire some strong visuals/graphics that can become the skins of the computer and even a logo we can use more generally. Suggestions should not conflict with other supercomputer names (e.g., Aurora at ANL, Frontier at ORNL), so you may want to do a quick web search to confirm they aren't already in use elsewhere.

Submissions are due by Friday, November 22. You can suggest as many as three names. Please use this form to submit them:

What should we call our next supercomputer?

We look forward to seeing what you come up with. 

October 23, 2019

NCAR and UCAR MATLAB users are invited to attend a free online instructional session from 2 to 4 p.m. MDT on Wednesday, October 30. A MathWorks application engineer will provide instruction on using interactive tools to analyze, visualize, explore, and model data, and on how to create an executable notebook using MATLAB’s Live Editor. Participants will also learn how to build customized reports of their output or programs, and how to create interactive desktop and standalone apps with App Designer.

Go to http://www.mathworks.com/ncar-ucar to read a complete class description and register for the training session.

October 22, 2019

The Cheyenne cluster was fully restored to service Monday at approximately 6:15 p.m. MDT. CISL staff and vendor service engineers identified a faulty pressure sensor in one of the system’s cooling units that caused an overheating event early Sunday morning. Cheyenne’s automatic controls immediately responded as designed and shut the system down to prevent serious widespread damage. The pressure sensor was replaced in the afternoon and CISL’s System Services and Consulting Services groups then began the reboot and system verification processes.

CISL thanks everyone for their patience, understanding, and cooperation throughout the outage.

October 21, 2019

Cheyenne’s compute nodes remain down this morning following a cooling system failure on Sunday. CISL continues to work with HPE to determine the root cause of the failure and the steps necessary to safely restore the system as soon as possible. The situation is at the highest severity level with HPE support and engineering. No ETA for returning Cheyenne to full service was available as of 9 a.m. today.

Cheyenne’s login nodes, the Casper cluster, the GLADE file system, Campaign Storage, HPSS, and the data-access nodes all remain available to users. More information will be provided via Notifier as it becomes available.

October 21, 2019

No scheduled downtime: Cheyenne, Casper, Campaign Storage, GLADE, HPSS

October 16, 2019

The GLADE file system, the Cheyenne and Casper clusters, the HPSS tape library, the data access nodes, and the JupyterHub portal will be unavailable on Tuesday, October 29, to allow CISL staff to update key storage system software components. The downtime will begin at 7 a.m. MDT and is expected to last until approximately 7 p.m. HPSS is expected to be back in service by 11 a.m.

All active Globus transfers will be suspended during the outage and will resume when GLADE is returned to service. System reservations on Cheyenne and Casper will prevent most batch jobs from executing after 7 a.m. that day. All batch jobs and login sessions that are running at that time will be killed.

October 16, 2019

University project leads are being asked to submit information about FY2019 publications based on the use of NCAR/CISL supercomputing, analysis, and storage resources. The annual survey was launched this week in emails to project leads, who are asked to respond no later than November 6.

The survey also asks for information on graduate students who have used these resources and for feedback on various services CISL provides. This information contributes to the CISL Annual Report and helps demonstrate to the National Science Foundation (NSF) the impact and value of NCAR/CISL resources to our user community.

All users are reminded of the importance of acknowledging NCAR and CISL computing support in their research papers. This, too, helps ensure continued support from NSF and other sources of funding for future high-performance computing systems. It is also a requirement of receiving an allocation, as noted in award letters. The reporting requirements, how to cite your use of various systems, and recommended wording of acknowledgments can be found on this CISL web page. The content of citations and acknowledgments varies depending on the type of allocation that was awarded.

October 14, 2019

Given several factors, including that our tape archive vendor, Oracle, is exiting the hardware market, NCAR’s HPSS tape archive will be shut down on October 1, 2021. Our archival strategy has been moving away from tape, and for user data the NCAR Campaign Storage file system is now the recommended location for longer-lived data. Our annual storage budget will be focused primarily on adding capacity to Campaign Storage on a regular basis.

Move data now

CISL will not be migrating the 90 petabytes of data currently in the HPSS tape archive; it will be up to you, the user, to assess which data are worth saving and to move your files to the Campaign Storage system and/or other institutional storage resources available to you. We also expect that a substantial volume of the data is no longer needed and can be deleted.

It’s important to note that tape archives have limited bandwidth. If you wait too long, you will run out of time to migrate your data, and anything left on tape will be lost when the system is shut down. To prevent this, complete the following action items as soon as possible:

User action items

  1. Review the data you have on tape and decide what you need to keep beyond October 1, 2021.

  2. Delete data that you do not need to keep.

  3. Review your workflows and stop writing data to tape as soon as possible. HPSS will be put into read-only mode shortly.

  4. Copy the data you need to keep from tape as soon as possible and transfer it to your target storage system (see the example sketch at the end of this item).

CISL is working on a number of resources, documentation, and tools to ease the transition. Information about these will be published as soon as it becomes available. Nevertheless, you should take action immediately: none of these tools will help with the difficult decisions about which data are scientifically worth keeping and which will fit within the storage resources available to you.

Please contact CISL for advice on individual workflows, accessing data movers, and storage options. Additional information and documentation will be provided in the CISL Daily Bulletin as it becomes available.
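As one illustration of action item 4, the following is a minimal sketch (not an official CISL tool) of retrieving files from the tape archive with the HSI command-line client and placing them on Campaign Storage. It assumes HSI is available where you run it (for example, on a data-access node); the HPSS directory, Campaign Storage project path, and file names are hypothetical placeholders to be replaced with your own.

    #!/usr/bin/env python3
    """Sketch: copy selected files from the HPSS tape archive to Campaign
    Storage using the HSI command-line client. All paths below are
    hypothetical examples -- substitute your own HPSS holdings and
    Campaign Storage project space."""

    import subprocess
    from pathlib import Path

    HPSS_DIR = "/home/username/model_output"        # hypothetical HPSS directory
    CAMPAIGN_DIR = Path("/glade/campaign/univ/uabc0001/model_output")  # hypothetical target

    # Files you decided to keep after reviewing your tape holdings (action item 1).
    files_to_keep = ["run01.tar", "run02.tar"]

    # Create the destination directory on Campaign Storage if it does not exist.
    CAMPAIGN_DIR.mkdir(parents=True, exist_ok=True)

    for name in files_to_keep:
        local_path = CAMPAIGN_DIR / name
        hpss_path = f"{HPSS_DIR}/{name}"
        # HSI one-line form: hsi "get <local> : <hpss>" retrieves a file from tape.
        subprocess.run(["hsi", f"get {local_path} : {hpss_path}"], check=True)
        print(f"Retrieved {hpss_path} -> {local_path}")

The one-file-at-a-time loop keeps the example simple; given the limited tape bandwidth noted above, grouping related retrievals and starting well before the shutdown date matters more than any particular tooling.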

October 14, 2019

No scheduled downtime: Cheyenne, Casper, Campaign Storage, GLADE, HPSS

October 9, 2019

The GLADE file system will be unavailable on Tuesday, October 29, to allow CISL staff to update key file system software components. The Cheyenne and Casper clusters, the data access nodes, and JupyterHub will also be unavailable to users throughout the outage. All active Globus transfers will be suspended during the outage but will resume when GLADE is returned to service. The downtime will begin at 7 a.m. MDT and is expected to last until approximately 7 p.m. System reservations will be created on Cheyenne and Casper to prevent most batch jobs from running past 7 a.m. that day. All running batch jobs and login sessions on Cheyenne, Casper, and the data access nodes will be killed at 7 a.m.
