Daily Bulletin Archive

September 26, 2012

Starting shortly before midnight (11:45 p.m.) on Monday, September 24, 2012, the HPSS system will be down while the metadata server is transitioned to NWSC.

We anticipate having HPSS back up within 24 hours. Except for the downtime, the change will be transparent to users.

September 24, 2012

Registration open for October 2012 OpenACC GPU Programming Workshop

One hundred registrants will be accepted for the OpenACC GPU Programming Workshop, to be held October 16 and 17, 2012. The workshop includes hands-on access to Keeneland, the newest XSEDE resource, which is managed by the Georgia Institute of Technology (Georgia Tech) and the National Institute for Computational Sciences, an XSEDE partner institution.

Based on demand, the workshop is scheduled to be held at ten different sites around the country. Anyone interested in participating is asked to follow the link below and then register by clicking on the preferred site. Only the first 100 registrants will be accepted.

The workshop is offered by the Pittsburgh Supercomputing Center, the National Institute for Computational Sciences, and Georgia Tech.

Questions? Contact Tom Maiden at tmaiden@psc.edu.

Register and read more about the workshop at:

OpenACC GPU Programming Workshop

September 24, 2012

As of 8 a.m., Sept. 4, Yellowstone officially entered its acceptance test period. While this represents a major milestone, the first week was not without its challenges. The primary issues were ensuring that system state can be preserved on the diskless nodes across a cold start, stabilizing the FDR InfiniBand interconnect, and reducing interference in the communications as the workload approached the full 4,500-node capacity of Yellowstone.

IBM and Mellanox have resolved several sources of problems, and since 04:15 Sept. 12, CISL staff have been running the full system workload, composed of six different benchmark codes, with a 99.94% success rate. IBM benchmark runs have shown compute performance very close to the expected 28.9 "Bluefire-equivalents," and GLADE benchmark performance has shown better than 80 GB/s for reads and better than 90 GB/s for writes.

The ATP workload testing will continue for the coming weeks, and IBM and Mellanox will continue to troubleshoot problem nodes, cables, and software configurations to improve the stability and performance of the system. While it is still too early to identify a specific date for Yellowstone to pass acceptance testing, CISL remains confident that early October is the likely timeframe.

September 18, 2012

As a reminder, NCAR's Computational and Information Systems Laboratory (CISL) invites NSF-supported university researchers in the atmospheric, oceanic, and related sciences to submit large allocation requests for the petascale Yellowstone system by September 17, 2012. Revised instructions have been posted for the next round of Large University Allocations, and all requesters are strongly encouraged to review the instructions before preparing their submissions.

These requests will be reviewed by the CISL High-Performance Computing Advisory Panel (CHAP), and there must be a direct linkage between the NSF award and the computational research being proposed. Please visit http://www2.cisl.ucar.edu/docs/allocations for more university allocation instructions and opportunities.

Allocations will be made on Yellowstone, NCAR's new 1.5-petaflops IBM iDataPlex system, the new data analysis and visualization clusters (Geyser and Caldera), the 11-PB GLADE disk resource, and the HPSS archive. Please see https://www2.cisl.ucar.edu/resources/yellowstone for more system details.

For the much larger Yellowstone resource, the threshold for Small University Allocations has been increased to 200,000 core-hours. Researchers with smaller-scale needs can now submit small allocation requests; see http://www2.cisl.ucar.edu/docs/allocations/university.
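As a rough illustration of how a request compares with the 200,000 core-hour threshold, the sketch below estimates core-hours as nodes × cores per node × wall-clock hours × number of runs. The function names, the example job sizes, and the cores-per-node value are hypothetical and not taken from CISL documentation; whether a request at exactly the threshold counts as small is also an assumption here.

```python
# Threshold for Small University Allocations, as stated in the bulletin.
SMALL_ALLOCATION_LIMIT = 200_000  # core-hours


def core_hours(nodes, cores_per_node, wall_hours, runs=1):
    """Estimate core-hours for `runs` identical jobs (hypothetical helper)."""
    return nodes * cores_per_node * wall_hours * runs


def fits_small_allocation(requested, limit=SMALL_ALLOCATION_LIMIT):
    """Return True if the estimate is within the small-allocation limit.

    Assumption: a request at exactly the limit still counts as small.
    """
    return requested <= limit


if __name__ == "__main__":
    # Hypothetical campaign: 10 runs on 64 nodes for 12 hours each,
    # with cores_per_node=16 used purely as an illustrative value.
    estimate = core_hours(nodes=64, cores_per_node=16, wall_hours=12, runs=10)
    print(estimate, fits_small_allocation(estimate))
```

Substitute the actual core count per node from the system documentation before relying on such an estimate.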

Questions may be addressed to: David Hart, User Services Manager, 303-497-1234, dhart@ucar.edu

September 13, 2012

Bluefire downtime Tuesday, September 11 from 6:00am - 1:00pm

HPSS downtime Wednesday, September 12 from 7:00am - 11:00am

No Scheduled Downtime: DAV, GLADE, Lynx

September 10, 2012

With Yellowstone soon to enter its acceptance testing period, CISL will no longer be accepting new project requests for the Bluefire environment from university PIs or NCAR labs. CISL will use the opportunity to help ensure a smooth transition to the new accounting system for Yellowstone, migrate all recently created projects to Yellowstone, and focus on setting up users and projects for the new environment.

After Sept. 7, a project lead may still add users to existing projects, and most other updates to existing projects will be accommodated.

Starting Monday, Sept. 10, university users will be able to submit small allocation requests for the Yellowstone system. New project requests will be queued and prepared for Yellowstone.

We apologize for any inconvenience.

September 7, 2012

The upcoming transition to the new Yellowstone environment is an opportunity to implement CISL best practices if you haven’t already done so. Consider this one in particular as you prepare for the new system:

Organize your files and keep them that way. Arrange them in same-purpose trees, for example. Say you have 20 TB of Mount Pinatubo volcanic aerosols data. Keep the files in a subdirectory such as /glade/home/username/pinatubo rather than scattered among unrelated files or in multiple directories. Specialized trees are easier to share with other users and to transfer to other users or projects as necessary.
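One way to gather scattered files into a single same-purpose tree is sketched below. It is a minimal example rather than a CISL-provided tool: the function name `consolidate` and the file names in the usage note are hypothetical, and the script assumes every file should land directly under one directory. It refuses to overwrite an existing file so that nothing is lost silently.

```python
import shutil
from pathlib import Path


def consolidate(files, tree_root):
    """Move each path in `files` into `tree_root`, creating it if needed.

    Hypothetical helper for illustration. Returns the list of new paths
    and raises FileExistsError instead of overwriting an existing file.
    """
    tree_root = Path(tree_root)
    tree_root.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in map(Path, files):
        dest = tree_root / src.name
        if dest.exists():
            raise FileExistsError(f"{dest} already exists; not overwriting")
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

For example, `consolidate(["run1/aerosol_1991.nc", "tmp/aerosol_1992.nc"], "/glade/home/username/pinatubo")` would place both (hypothetical) files under the single pinatubo subdirectory mentioned above.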

Getting organized will also help you transition smoothly to the new Yellowstone system and bring along only the files you need. Once Yellowstone is generally available, users will have an opportunity to migrate essential files to the new GLADE environment as described on our Transition from Bluefire page.

September 6, 2012

The Yellowstone timeline has continued to slip despite long hours put in by CISL, IBM and Mellanox staff. At this writing, the most optimistic timeline has the three-week acceptance test period beginning late this week (the tail end of August), which pushes first user access at least to late September.

While the compute and storage hardware looks good and has demonstrated itself to be more stable than anticipated, with little “infant mortality” observed so far, IBM and Mellanox are continuing to address challenges to achieving the expected performance of 90 GB/s between the compute and storage systems.

The performance tuning involves complex hardware, software, and firmware interactions among the more than 4,500 compute nodes on Yellowstone; the 4,500 disk drives, 76 disk controllers, and 20 GPFS servers of the GLADE resource; and the InfiniBand interconnect composed of nine core switches, 250 leaf switches, and more than 9,500 copper and fiber cables.

CISL is monitoring the deployment process closely and is in ongoing contact with the IBM team, which is providing regular updates. Given the extent of the delays thus far, CISL is watching the system's stability and performance results and looking for the earliest possible opportunity to move into acceptance testing. If IBM's performance results are not quite at the promised levels, CISL may elect to pursue acceptance despite the shortfall and discuss with IBM alternate methods of achieving the performance targets later.

When the Yellowstone timeline solidifies, CISL will also re-evaluate the schedule for Bluefire. With any Yellowstone delays, users can expect Bluefire’s decommissioning date to be extended accordingly.

August 27, 2012

Registration is now open for “Linux/Unix Basics,” a webcast training course presented by XSEDE. The class, for beginners and intermediate users, will cover the basic Linux/Unix command line environment and feature hands-on exercises. It will emphasize common strategies for interacting with clusters and HPC resources. There are no prerequisites. Participants must register here by Sept. 18.

August 20, 2012

The CISL User Services Section has published additional documentation to help users prepare for computing with the Yellowstone, Geyser, and Caldera resources that are being tested at the NCAR-Wyoming Supercomputing Center.

The documentation includes compilation commands for the Intel, PGI, PathScale, and GNU compilers to be used in the new system. It also describes the Yellowstone environment’s file format and mathematical libraries.

Feel free to use the Feedback link on our Support & Training menu to let us know what you think.