Daily Bulletin Archive

September 6, 2013

Commands for creating and managing customized collections of Yellowstone environment modules have changed.

To save a customized environment as your default environment, load the modules that you want to use in that environment, then simply run module save. The save command replaces the now-deprecated setdefault command. Similarly, restore replaces the getdefault command.
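
As a rough illustration of the workflow described above, the sketch below loads a set of modules, saves them as the default environment, and later restores that default. The module names intel and netcdf are placeholders, not a statement of what is installed on Yellowstone, and the status variable and the guard around the commands are additions for illustration so that the snippet does nothing on a machine without the module command.

```shell
# Placeholder module names (an assumption); substitute the ones you actually use.
wanted_modules="intel netcdf"

if type module >/dev/null 2>&1; then
    # Left unquoted on purpose so each module name is passed as its own argument.
    module load $wanted_modules   # build the environment you want to keep
    module save                   # store it as your default (replaces setdefault)
    module restore                # in a later session: reload the saved default (replaces getdefault)
    status="saved"
else
    status="no-module-command"
    echo "module command not found; run this on a system that provides environment modules"
fi
```

In day-to-day use only the three module commands matter; module save is run once after loading the modules, and module restore is run whenever the saved default is needed again.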

For information about other changes, see the updated CISL Environment modules documentation.

August 29, 2013

The upgrade to LSF 9.1.1 and IBM Parallel Environment (PE) on Yellowstone was completed Tuesday, and following testing of the nodes, the system was returned to users at approximately 7 p.m.

Although not technically required, CISL's consultants strongly recommend that users recompile their codes in the updated environment.

August 28, 2013

CISL staff will be conducting a two-part update to key Yellowstone system software components August 20 and August 27. As part of this update, Yellowstone, Geyser, and Caldera will be taken out of service August 27 from 6 a.m. to 6 p.m. MT. The updates include fixes for a number of issues experienced by users.

Although not technically required, CISL's consultants strongly recommend that users recompile their codes following the August 27 downtime.

Of most interest to users, the updates to LSF and the IBM Parallel Environment (PE) include:

* corrected wrappers for the PGI compiler;

* a fix for a bug with MPI_IN_PLACE in MPI_Allgather that some users have encountered;

* a fix that will allow Fortran codes with "USE MPI" statements to compile correctly under PGI and GNU compilers; and

* the LSF and PE versions needed to complete integration of the Pronghorn Xeon Phi cluster into the environment.

On August 20, CISL will perform the first part of the update, upgrading the xCAT administration software, which is a prerequisite to the LSF and PE updates. No outage will be needed if the upgrade process goes as planned. However, users should be aware of the slight chance that CISL staff may need to take the system down should they encounter problems.

On August 27, CISL staff will take the system down to upgrade LSF to version 9.1.1 and to update the IBM PE. The downtime is necessary because all the nodes must be rebooted to propagate the changes.

During this period a number of other system firmware and software components will be brought up to date, but these will largely be invisible to users.

GLADE and HPSS will not be affected by the update process and are expected to remain in service throughout this period.

August 28, 2013

Yellowstone, Geyser, Caldera: Downtime Tuesday, August 27 6:00am - 6:00pm

No Scheduled Downtime: HPSS, GLADE, Lynx

August 23, 2013

NCAR researchers and eligible university researchers can now request "small" Janus allocations of up to 200,000 core-hours at any time, an increase from the previous limit of 50,000 core-hours for small allocations.

University researchers can request allocations of more than 200,000 core-hours as part of the semi-annual large allocation process. The next deadline is Sept. 16. See University Large Allocation Request Form. For small allocations, use the University Small Allocation Request Form.

NCAR staff can also request both small and large allocations on Janus via the Janus allocation request form. Large allocations require a brief write-up of the technical readiness and justification of the computational request. NCAR researchers should use the Alternative Allocation Request Form.

More information is available here: http://www2.cisl.ucar.edu/resources/janus/allocations

August 20, 2013

An XSEDE training session for beginning and intermediate Linux/Unix users will be webcast from 1 to 4 p.m. Central time on Friday, September 6.

The Texas Advanced Computing Center will present the training session “Linux/Unix Basics.” XSEDE described it as an interactive lecture that will emphasize common strategies for interacting with clusters and HPC resources. It will include hands-on exercises. There are no prerequisites.

To register, see https://www.xsede.org/web/xup/course-calendar

August 16, 2013

Users are asked to plan around the 2013 Community Earth System Modeling (CESM) Tutorial schedule August 12 to 16 to reduce potential contention for Intel compiler licenses.

Tutorial participants will be using Yellowstone’s six login nodes and four Caldera nodes for compilation between these hours:

  • 2:30 and 5 p.m. Mountain time on Monday, Tuesday, and Thursday

  • 1 and 3 p.m. on Friday

During these windows, 80 attendees will work in two-person teams, compiling and submitting CESM jobs. They will not be using PGI, GNU, or PathScale compilers, so those will not be affected.

The results of the tutorial compilations on most days will be small, short compute jobs that should have minimal impact on the availability of batch nodes for other users.

August 14, 2013

Starting Friday and over the weekend, users may have experienced issues with interactive sessions on Yellowstone due to problems on two of the six login nodes.

Yslogin2 will be taken out of service today, Monday, August 12, from 2 p.m. to 4 p.m., so that IBM can replace the system board on the node. The other five login nodes will remain available.

Yslogin4 was taken out of service Friday evening through Saturday morning to replace a failing InfiniBand adapter. User sessions were interrupted to complete the fix, and the node has been returned to service.

August 9, 2013

HPSS: Downtime Tuesday, August 13, 7:00am - 9:00am

No Scheduled Downtime: Yellowstone, Geyser, Caldera, GLADE, Lynx

August 5, 2013

This week, CISL staff are performing a rolling upgrade to the Yellowstone, Geyser, and Caldera systems to bring the GPFS client software on the clusters up to version 3.5.

Sets of nodes have been placed under several system reservations and will be taken out of service and restarted with the new client software. After passing health checks, the nodes will be returned to service.

Users should not be affected by the updates, other than perhaps slightly longer queue waits as the reservations and upgrade process reduce the number of nodes available to jobs. Users should consult CISL's documentation on backfill windows to maximize their throughput around the reservations; see http://www2.cisl.ucar.edu/resources/yellowstone/using_resources/runningjobs#bslots

These updates complete the transition to the most recent version of GPFS, which provides Yellowstone and GLADE with a number of features that improve the management of disk resources.

UPDATE, Aug. 1, 11:00 am MT: The upgrades to the login nodes have been completed and the nodes returned to users.

Two of the six Yellowstone login nodes have already been upgraded. The remaining four login nodes are scheduled to be updated between 10 a.m. and noon on Thursday, August 1. We will issue a screen message before bringing those nodes down. To avoid this disruption, we recommend that you log in to yslogin3.ucar.edu or yslogin5.ucar.edu instead of yellowstone.ucar.edu on Thursday morning.