Latest on Yellowstone and its Acceptance Testing Period

September 24, 2012

As of 8 a.m., Sept. 4, Yellowstone officially entered its acceptance test period. While this represents a major milestone, the first week was not without its challenges. The primary issue was assuring that system state can be preserved on the diskless nodes across a cold start, stabilizing the FDR InfiniBand interconnect and reducing interference in the communications as the workload approached the full 4,500-node capacity of Yellowstone.

IBM and Mellanox have resolved several sources of problems, and since 04:15 Sept. 12, CISL staff have been running the full system workload comprised of six different benchmark codes with a 99.94% success rate. IBM benchmark runs have shown compute performance very close to the expected 28.9 "Bluefire-equivalents," and GLADE benchmark performance has also shown better than 80 GB/s for reads and better than 90 GB/s for writes.

The ATP workload testing will continue for the coming weeks, and IBM and Mellanox will continue to troubleshoot problem nodes, cables, and software configurations to improve the stability and performance of the system. While it is still too early to identify a specific date for Yellowstone to pass acceptance testing, CISL remains confident that early October is the likely timeframe.