Daily Bulletin Archive

Apr. 16, 2018

The location of the Research Data Archive (RDA) on NCAR’s GLADE file system has changed.  NCAR users are advised to access RDA data from the new production location at: /glade2/collections/rda/data.

Note that the previous location of RDA data, /glade/p/rda/data, has been moved to /glade/p/rda/data_old but is no longer being maintained and will be purged later this year.
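
For scripts that still contain the old location, the path change above can be sketched in Python as follows. This is only an illustration: the function name to_new_rda_path and the dataset directory used in the usage note are hypothetical, and only the two root paths come from this bulletin.

```python
from pathlib import PurePosixPath

# Root paths taken from the bulletin text.
OLD_RDA_ROOT = PurePosixPath("/glade/p/rda/data")
NEW_RDA_ROOT = PurePosixPath("/glade2/collections/rda/data")


def to_new_rda_path(path: str) -> str:
    """Rewrite a path under the old RDA root to the new root.

    Paths outside the old root, including /glade/p/rda/data_old,
    are returned unchanged.
    """
    try:
        # relative_to compares whole path components, so "data_old"
        # is not mistaken for a path under "data".
        relative = PurePosixPath(path).relative_to(OLD_RDA_ROOT)
    except ValueError:
        return path
    return str(NEW_RDA_ROOT / relative)
```

For example, to_new_rda_path("/glade/p/rda/data/ds000.0/file.nc") returns the same file path under /glade2/collections/rda/data, where ds000.0 is a made-up dataset directory.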

Please contact rdahelp@ucar.edu with any questions or concerns.

Apr. 13, 2018

A semi-annual Mesa Lab building maintenance power-down scheduled for Saturday, April 14, should have little impact on university users of CISL’s high-end resources. Some Boulder-based UCAR/NCAR staff will be unable to log in to the Cheyenne system or other services with their authentication tokens, but sessions that start before the power-down will not be affected.

The power-down should otherwise not affect the Cheyenne, Geyser, and Caldera clusters, the GLADE system, or HPSS, which will remain in service at the NCAR-Wyoming Supercomputing Center (NWSC) in Cheyenne. The maintenance work is scheduled to begin at 6 a.m. and conclude by 6 p.m.

Some HPC support services and the HPSS disaster recovery resources that are housed at the Mesa Lab will be unavailable during the power-down. The affected services include the license servers for Mathematica and the PGI compilers, the CISL website, the ExtraView help desk ticketing system, and the SAM accounting system. The license server supporting MATLAB users on Cheyenne will not be affected.

Users who have urgent help requests during this time should call 303-497-2400 or 307-996-4300 to reach the NWSC operations center.

Apr. 11, 2018

Cheyenne system administrators will perform important system tests on Cheyenne’s login node, cheyenne1, today beginning at noon MDT.  The tests are expected to take approximately one hour to complete.

cheyenne1 will be unavailable throughout the testing period. To minimize the impact on users, access to cheyenne1 has been restricted ahead of today’s tests. Active user login sessions and all running processes on cheyenne1 will be killed when the testing procedures begin. Access to Cheyenne’s other five login nodes will not be interrupted, and no impact on other Cheyenne components is expected.

Apr. 11, 2018

Acknowledging the support of NCAR and CISL computing when you publish research results helps ensure continued support from the National Science Foundation and other sources of funding for future high-performance computing (HPC) systems. It is also a requirement of receiving an allocation, as noted in your award letter.

The reporting requirements, how to cite your use of various systems, and recommended wording of acknowledgments can be found on this CISL web page. The content of citations and acknowledgments varies depending on the type of allocation that was awarded.

Apr. 9, 2018

No downtime: Cheyenne, GLADE, Geyser/Caldera, and HPSS

Apr. 3, 2018

HPSS downtime: Tuesday, Apr. 3, 8:00 a.m. - 2:00 p.m., for library consolidation work

No downtime: Cheyenne, GLADE, Geyser/Caldera

Apr. 2, 2018

Cheyenne system administrators will perform important maintenance procedures on PBS today beginning at noon MDT. The maintenance is expected to take approximately one hour to complete.

Most PBS commands will not work during the maintenance outage, including qstat and qsub, and new job submissions will not be possible. Access to Cheyenne’s login nodes will not be interrupted.

No batch jobs are expected to be lost as a result of the maintenance.  Jobs that are executing when the maintenance begins will continue to run without interruption. Jobs that are queued for execution or in a hold state will remain in those states until PBS is returned to service.

Users will be notified when PBS is returned to service.

Apr. 2, 2018

Users of the NCAR/CISL High Performance Storage System (HPSS) whose storage allocations are overspent as of Monday, April 2, will receive error messages when they try to write files to that system and those transfers will fail. Once an allocation is overspent, users will need to reduce their holdings before they can write additional files. Some users may need to modify their workflows to ensure that archive space is available, detect error messages, and confirm execution of transfers to HPSS.
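
One way a workflow might detect a failed transfer is sketched below in Python. It is a minimal example under stated assumptions, not CISL guidance: the function name run_transfer is hypothetical, the transfer command is supplied by the caller, and the sketch assumes the transfer client reports failure through a nonzero exit status and writes its error message to standard error.

```python
import subprocess
from typing import List, Tuple


def run_transfer(command: List[str], timeout_s: float = 3600.0) -> Tuple[bool, str]:
    """Run a transfer command and report whether it succeeded.

    Returns (True, "") on success, or (False, error_text) on failure.
    Assumption: the client exits with a nonzero status when a write
    is rejected, for example because the allocation is overspent.
    """
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The command could not be started or did not finish in time.
        return False, str(exc)
    if result.returncode != 0:
        message = result.stderr.strip() or f"exit status {result.returncode}"
        return False, message
    return True, ""
```

A workflow could call run_transfer with its own transfer command and stop, or alert the user, when the first element of the result is False, rather than assuming that the file reached the archive.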

To check the status of your HPSS allocation, log in to the Systems Accounting Manager (sam.ucar.edu) and select Reports, then My Account Statements. The accounting statements are updated weekly, so the most recent writes or deletions may not be reflected until several days after they are made.

Additional details and guidance will be available soon.

Mar. 31, 2018

The application deadline for the Women in IT Networking at SC (WINS) program has been extended to March 31. Awardees will receive funding to participate as SCinet team members during the SC18 conference in November in Dallas, Texas. Interested and qualified women are encouraged to apply.

See the WINS site for more information and a link to the application. WINS is a three-year, National Science Foundation-funded program that selects up to five early- to mid-career women in IT from diverse regions of the U.S. research and education community to participate in the ground-up construction of SCinet, one of the fastest and most advanced computer networks in the world. WINS is a joint effort between the Energy Sciences Network (ESnet), the Keystone Initiative for Network Based Education and Research (KINBER), and the University Corporation for Atmospheric Research (UCAR).

Mar. 30, 2018

The PBS qstat command will be modified during this week's maintenance outage. When Cheyenne is returned to service late this week, users will be able to query only for information about their own jobs and not jobs submitted by other users. The reason for this change is to reduce demands on the PBS server, which has frequently been overloaded, resulting in poor system performance and job failures. Users should be aware that this change may affect some existing scripts and workflow managers.

CISL learned recently that some users’ scripts were issuing multiple qstat commands, which can be highly resource intensive, every minute or every second. Limiting qstat to return information only for jobs belonging to the user will significantly reduce demands on the system. Before this change, the command’s default behavior was to return information on all jobs in the PBS database.

Users can further help reduce demands on the system by adopting the following changes wherever possible:

  • Use “qstat <jobid>” instead of just “qstat”

  • Avoid using “qstat -f -x”

  • Limit the number and frequency of qstat commands. Multiple calls every minute provide little extra information and adversely affect overall system performance.
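
The recommendations above could be applied in a script along the following lines. This Python sketch is illustrative only: the function names, the five-minute default interval, and the polling limit are hypothetical choices, and it assumes that "qstat <jobid>" exits with a nonzero status once the job is no longer queued or running.

```python
import subprocess
import time
from typing import Callable


def job_is_listed(job_id: str) -> bool:
    """Query a single job with `qstat <jobid>` instead of a bare `qstat`.

    Assumption: qstat exits with a nonzero status when the job is no
    longer queued or running.
    """
    result = subprocess.run(["qstat", job_id], capture_output=True, text=True)
    return result.returncode == 0


def wait_for_job(
    job_id: str,
    interval_s: float = 300.0,  # hypothetical default: one query every 5 minutes
    max_polls: int = 288,       # hypothetical cap: about one day at that interval
    is_listed: Callable[[str], bool] = job_is_listed,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll one job at a low, fixed rate; return the number of queries issued."""
    for polls in range(1, max_polls + 1):
        if not is_listed(job_id):
            return polls
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} still listed after {max_polls} queries")
```

The is_listed and sleep parameters exist so the loop can be tried without a PBS server. In normal use a script would call wait_for_job with only the job ID and, if desired, a longer interval.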

CISL thanks all users for their cooperation. Please contact cislhelp@ucar.edu if you have any questions or would like help in this matter.
