Daily Bulletin Archive

September 24, 2012

As of 8 a.m., Sept. 4, Yellowstone officially entered its acceptance test period. While this represents a major milestone, the first week was not without its challenges. The primary issues were ensuring that system state is preserved on the diskless nodes across a cold start, stabilizing the FDR InfiniBand interconnect, and reducing communications interference as the workload approached the full 4,500-node capacity of Yellowstone.

IBM and Mellanox have resolved several sources of problems, and since 04:15 Sept. 12, CISL staff have been running the full system workload, comprising six different benchmark codes, with a 99.94% success rate. IBM benchmark runs have shown compute performance very close to the expected 28.9 "Bluefire-equivalents," and GLADE benchmarks have also sustained better than 80 GB/s for reads and better than 90 GB/s for writes.

The ATP workload testing will continue for the coming weeks, and IBM and Mellanox will continue to troubleshoot problem nodes, cables, and software configurations to improve the stability and performance of the system. While it is still too early to identify a specific date for Yellowstone to pass acceptance testing, CISL remains confident that early October is the likely timeframe.

September 18, 2012

As a reminder, NCAR's Computational and Information Systems Laboratory (CISL) invites NSF-supported university researchers in the atmospheric, oceanic, and related sciences to submit large allocation requests for the petascale Yellowstone system by September 17, 2012. Revised instructions have been posted for the next round of Large University Allocations, and all requesters are strongly encouraged to review the instructions before preparing their submissions.

These requests will be reviewed by the CISL High-performance computing Advisory Panel (CHAP), and there must be a direct link between the NSF award and the computational research being proposed. Please visit http://www2.cisl.ucar.edu/docs/allocations for university allocation instructions and related opportunities.

Allocations will be made on Yellowstone, NCAR's new 1.5-petaflops IBM iDataPlex system, the new data analysis and visualization clusters (Geyser and Caldera), the 11-PB GLADE disk resource, and the HPSS archive. Please see https://www2.cisl.ucar.edu/resources/yellowstone for more system details.

For the much larger Yellowstone resource, the threshold for Small University Allocations has been increased to 200,000 core-hours. Researchers with smaller-scale needs can now submit small allocation requests; see http://www2.cisl.ucar.edu/docs/allocations/university.

Questions may be addressed to: David Hart, User Services Manager, 303-497-1234, dhart@ucar.edu

September 13, 2012

Bluefire downtime: Tuesday, September 11, 6:00 a.m. to 1:00 p.m.

HPSS downtime: Wednesday, September 12, 7:00 a.m. to 11:00 a.m.

No Scheduled Downtime: DAV, GLADE, Lynx

September 10, 2012

With Yellowstone soon to enter its acceptance testing period, CISL will no longer be accepting new project requests for the Bluefire environment from university PIs or NCAR labs. CISL will use this opportunity to help ensure a smooth transition to the new accounting system for Yellowstone, migrate all recently created projects to Yellowstone, and focus on setting up users and projects for the new environment.

After Sept. 7, a project lead may still add users to existing projects, and most other updates to existing projects will be accommodated.

Starting Monday, Sept. 10, university users will be able to submit small allocation requests for the Yellowstone system. New project requests will be queued and prepared for Yellowstone.

We apologize for any inconvenience.

September 7, 2012

The upcoming transition to the new Yellowstone environment is an opportunity to implement CISL best practices if you haven’t already done so. Consider this one in particular as you prepare for the new system:

Organize your files and keep them that way. Arrange them in same-purpose trees, for example. Say you have 20 TB of Mount Pinatubo volcanic aerosol data. Keep the files in a subdirectory such as /glade/home/username/pinatubo rather than scattered among unrelated files or across multiple directories. Specialized trees are easier to share and to transfer to other users or projects as necessary.
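
A minimal sketch of this kind of cleanup in Python (the username, the /glade/home path, and the *pinatubo* file-name pattern are all hypothetical; adjust them to your own data and naming conventions):

    #!/usr/bin/env python
    """Gather scattered same-purpose files into one subdirectory tree."""
    import shutil
    from pathlib import Path

    home = Path("/glade/home/username")   # hypothetical home directory
    dest = home / "pinatubo"              # one same-purpose tree for the dataset
    dest.mkdir(exist_ok=True)

    # Move every matching file found under the home tree into the new subtree.
    # (A production script would also guard against name collisions.)
    for f in home.rglob("*pinatubo*"):
        if f.is_dir() or dest in f.parents:
            continue                      # skip directories and files already in place
        shutil.move(str(f), str(dest / f.name))
        print("moved", f, "->", dest / f.name)

The same layout pays off later: a single recursive copy of /glade/home/username/pinatubo moves the whole dataset at once.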

Getting organized will also help you transition smoothly to the new Yellowstone system and bring along only the files you need. Once Yellowstone is generally available, users will have an opportunity to migrate essential files to the new GLADE environment as described on our Transition from Bluefire page.

September 6, 2012

The Yellowstone timeline has continued to slip despite long hours put in by CISL, IBM and Mellanox staff. At this writing, the most optimistic timeline has the three-week acceptance test period beginning late this week (the tail end of August), which pushes first user access at least to late September.

While the compute and storage hardware looks good and has demonstrated itself to be more stable than anticipated, with little “infant mortality” observed so far, IBM and Mellanox are continuing to address challenges to achieving the expected performance of 90 GB/s between the compute and storage systems.

The performance tuning involves complex hardware, software, and firmware interactions among the more than 4,500 compute nodes of Yellowstone; the 4,500 disk drives, 76 disk controllers, and 20 GPFS servers of the GLADE resource; and the InfiniBand interconnect comprising nine core switches, 250 leaf switches, and more than 9,500 copper and fiber cables.

CISL is monitoring the deployment process closely, with ongoing interactions with and updates from the IBM team. Given the extent of the delays thus far, CISL is watching the system's stability and performance results for the earliest possible opportunity to move into acceptance testing. If IBM's performance results fall short of the promised levels, CISL may elect to pursue acceptance despite the shortfall and discuss with IBM alternate methods of achieving the performance targets later.

When the Yellowstone timeline solidifies, CISL will also re-evaluate the schedule for Bluefire. With any Yellowstone delays, users can expect Bluefire’s decommissioning date to be extended accordingly.

August 27, 2012

Registration is now open for “Linux/Unix Basics,” a webcast training course presented by XSEDE. The class, for beginners and intermediate users, will cover the basic Linux/Unix command line environment and feature hands-on exercises. It will emphasize common strategies for interacting with clusters and HPC resources. There are no prerequisites. Participants must register here by Sept. 18.

August 20, 2012

The CISL User Services Section has published additional documentation to help users prepare for computing with the Yellowstone, Geyser, and Caldera resources that are being tested at the NCAR-Wyoming Supercomputing Center.

The documentation includes compilation commands for the Intel, PGI, PathScale, and GNU compilers to be used on the new system. It also describes the Yellowstone environment's file-format and mathematical libraries.

Feel free to use the Feedback link on our Support & Training menu to let us know what you think.

August 17, 2012
Date and time: Thursday, Aug. 16, 2012, 10:00 a.m. MT

Who should attend:

  • UCAR users who move big data: lots of small files, very large files, or anywhere in between
  • Users who access NCAR data nodes on XSEDE, DOE, or U. Colorado endpoints
  • Campus computing resource admins

Location: Webinar URL will be provided after registration

Presenter: Steve Tuecke, Deputy Director of the Computation Institute, University of Chicago and Argonne National Laboratory

Globus Online is a convenient interface for transferring files between two endpoints – for example, between NCAR resources, the University of Colorado, and XSEDE facilities or other sites. Globus Online also offers a feature called Globus Connect, which enables you to move files easily to and from your laptop or desktop computer and other endpoints.
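
For scripted transfers, Globus also provides a Python SDK (globus_sdk, a later addition not covered in this bulletin). Here is a minimal sketch, assuming you already hold an OAuth access token and the UUIDs of the two endpoints; all values shown are hypothetical placeholders:

    import globus_sdk

    # Hypothetical placeholders: supply your own access token and endpoint UUIDs.
    TOKEN = "REPLACE_WITH_ACCESS_TOKEN"
    SRC = "source-endpoint-uuid"
    DST = "destination-endpoint-uuid"

    tc = globus_sdk.TransferClient(
        authorizer=globus_sdk.AccessTokenAuthorizer(TOKEN))

    # Describe a transfer task, add a directory to it, and submit it.
    task = globus_sdk.TransferData(tc, SRC, DST, label="example transfer")
    task.add_item("/glade/home/username/pinatubo/",  # source path (hypothetical)
                  "/home/me/pinatubo/",              # destination path (hypothetical)
                  recursive=True)
    print("task id:", tc.submit_transfer(task)["task_id"])

The web interface and Globus Connect described above require no scripting at all; the SDK is useful mainly for automating recurring transfers.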

The Consulting Services Group at NCAR will host a two-hour online workshop to help UCAR users set up and work with Globus Online. You are encouraged to use your own laptop and follow along with the presenter.

Click here to register for this workshop. Please contact Si Liu (siliu@mail.ucar.edu) if you have any questions or concerns.

August 9, 2012

"ISTeC at CSU is hosting the Front Range Consortium for Research Computing's (FRCRC's) Second Annual Front Range High Performance Computing Symposium, August 13-14, 2012, at Colorado State University, Fort Collins, CO 

www.frcrc.org/events/hpc-2012

The program is available on the above website. 

The FRCRC membership includes CSU, CU-Boulder, the University of Wyoming, the Colorado School of Mines, NCAR, NREL, and NOAA. The symposium includes HPC tutorials, birds-of-a-feather sessions, student presentations, and more! This is a great opportunity to network with your colleagues in computational science on the Front Range and learn new skills.

Advance registration is due by August 6, 2012, at 12:00 noon. Registration at the door will not be accepted.

We hope you will attend. Please feel free to contact Rich Loft at loft@ucar.edu for more information.

