Daily Bulletin Archive

Feb. 20, 2018

The Cheyenne “standby” batch queue has been removed from the system until further notice due to recently discovered difficulties with scheduling jobs in that queue. The other batch queues remain available to users: premium, regular, economy, and share. See Job-submission queues and charges for more complete information on Cheyenne’s batch queues.

Feb. 20, 2018

No downtime: Cheyenne, GLADE, Geyser_Caldera and HPSS

Feb. 16, 2018

All research projects are undertaken in the hope of producing findings and products of lasting value. It can seem unthinkable that someone could forget the details of a project, especially how the results were produced. However, a data set often becomes “unloved” unintentionally over time. Specifically, if a research project loses sight of data management, its results and products are at risk of being forgotten or “unloved” when the team moves on to new projects.

The Data Stewardship Engineering Team (DSET) is a cross-organizational team formed by the NCAR Directors. DSET’s charter specifies that the DSET leads the organization’s efforts to provide enhanced, comprehensive digital data discovery and access, and the team is focused on providing a user-focused, integrated system for the discovery and access of digital scientific assets.

The DSET and the DASH services are here to help promote NCAR’s scientific results and make them usable, so that they are valued for the long term.

If you would like to learn more about DSET/DASH and their services after LYD Week, please contact us at datahelp@ucar.edu.

Thank you for participating in Love Your Data Week by reading this and the previous four posts. If you have missed any of the five posts during this week, they are available in Staff Notes as well as the Daily Bulletin archive, or please feel welcome to contact the Data Curation & Stewardship Coordinator.

Feb. 15, 2018

XSEDE is offering introductory and advanced training sessions this Thursday and Friday via webcast from the Texas Advanced Computing Center. The focus of these training sessions will be on programming for manycore architectures such as Intel's Xeon Phi and Xeon Scalable processors. Both classes run from 7 a.m. to 11 a.m. MST. See these links for registration and class details:

Feb. 15, 2018

Finding the right data for a particular data story depends on many factors, including which research questions produced the data, who was on the research project team, what terms and conditions govern access to the data, which data formats are available, and so on. Ultimately, whether a data set is “right” for a data story depends both on the information provided by the original data producers and on the information that potential data users are able to access and understand.

Making NCAR’s data accessible to NCAR’s immediate communities is a significant first step. As the Digital Asset Services Hub (DASH) services develop, DASH would like to continue helping the NCAR community realize the full potential of NCAR’s research data. This can include contributing to data efforts outside of NCAR, including assisting in education, communication, and raising awareness of the Earth sciences as a whole.

To learn more about how DASH is participating in data initiatives outside of NCAR, such as having the Data Curation & Stewardship Coordinator serve as a mentor and be on data advisory boards, please contact us at datahelp@ucar.edu.

The last LYD Week post is tomorrow and will be about “We are Data.”

Feb. 14, 2018

Data stories can be told by anyone who can understand and work with data, and the stories can be about any issue that is pertinent to the storyteller. The diversity of the data used by a broad range of data users is a key factor that makes data stories engaging.

It is important to note that a storyteller is also a data user, and before anyone can be a data user, data must first be shared and made accessible. The more types of data that are made available, the more likely it is that someone can create a compelling story by using data.

The DASH Search system from the Digital Asset Services Hub (DASH) is NCAR’s new metadata registry that facilitates the discovery, identification, and understanding of the research products and output from NCAR labs via a centralized system. The DASH Search system uses the NCAR Dialect to describe and record the resources that are available from NCAR. Once the metadata records of the available resources are submitted to DASH Search, a potential user can efficiently locate the desired data using the information in the metadata records. Continuing to increase access to NCAR’s data via the DASH Search system will help in communicating our science to our community and beyond, including through data stories.

To learn more about DASH Search, please visit https://data.ucar.edu/ or if you would like to submit a metadata record of your data to DASH Search, please contact us at datahelp@ucar.edu.

Day 4’s post will discuss “Connected conversations.”

Feb. 13, 2018

Before data is used to tell a story, its quality should be evaluated. Although data quality can be difficult to measure, quality attributes of the data, including completeness, accuracy, credibility, and consistency, are key for building a trustworthy story. Without high-quality data, readers could easily lose confidence in the story, or worse yet, quickly dismiss the story and its data as hearsay.

To achieve high-quality data and reduce the chance that the data will be misused, it is critical to also have high-quality documentation or metadata for the data. At NCAR, the NCAR Dialect is the designated metadata standard used by the Digital Asset Services Hub (DASH) services, including the DASH Search system. The NCAR Dialect is a customized metadata schema that is designed based on international metadata standards for scientific data. The NCAR Dialect is capable of recording in-depth descriptions to assist with data understandability as well as capturing information that is essential for identification and discovery of the assets. The DASH Search Request to Submit Form demonstrates the elements that are included in the NCAR Dialect.

To learn more about the NCAR Dialect or if you would like to submit a metadata record of your data to DASH Search, please contact us at datahelp@ucar.edu.

Coming up for Day 3 is a post on “Telling Stories with Data.”

Feb. 12, 2018

While sharing research results with one’s identified discipline(s) is crucial for advancing specific studies, communicating scientific discoveries outside of one’s immediate science community can often bring major breakthroughs beyond the initial design or intent of the original research. In particular, allowing the public to understand and even participate in science could help promote support for science, including the development of new policies, funds, and education programs.

Among the different options for communicating science, using data to tell a story or telling a story that is backed up by data can help a scientific issue to become more personal and relatable, and therefore, more actionable. In order to begin telling a data story, however, one needs to know what data are available, and data management is a vital method for organizing data and allowing data to be preserved for use/re-use.

The Digital Asset Services Hub (DASH) offers a variety of data management services, including the DMP Preparation Guidance and Template Document and DMP Checklist for Awarded Proposals. The Data Curation & Stewardship Coordinator could also help in providing consultation for data management questions or issues. Please contact us at datahelp@ucar.edu if you are interested in learning more.

Stay tuned for the Day 2 post, “Stories about data”!

Feb. 9, 2018

Love Your Data (LYD) Week 2018 is an international event coordinated by academic libraries and data archives to promote research data as being “the foundation of the scholarly record and crucial for advancing our knowledge of the world around us.”

This year’s theme for LYD week is “data stories.” In support of LYD week (Monday, February 12, to Friday, February 16), NCAR’s Data Curation & Stewardship Coordinator will share one post a day discussing how the Digital Asset Services Hub (DASH) as well as its resources and services could help with the following topics:

  • Monday: Why data stories?

  • Tuesday: Stories about data

  • Wednesday: Telling stories with data

  • Thursday: Connected conversations

  • Friday: We are data

Stay tuned for these LYD posts next week, and please feel welcome to get in touch with DASH at datahelp@ucar.edu if you have any questions, need additional information, or would like to talk more about data and data-related topics with the Data Curation & Stewardship Coordinator.

Feb. 8, 2018

What’s the difference between running Cheyenne jobs efficiently and inefficiently? The CISL Consulting Services Group (CSG) recently encountered a case where revising a batch script select statement made a huge difference.

A WRF user was running simulations on 60 Cheyenne nodes, intending to use all 36 cores of each node with 4 MPI processes and 9 OpenMP threads per process. The following select statement likely would have been fine if the user hadn’t compiled WRF with the dmpar option, which enables only distributed-memory MPI support, instead of dm+sm, which enables both MPI and OpenMP support:

#PBS -l select=60:ncpus=36:mpiprocs=4:ompthreads=9

With an assist from CSG, the user modified the select statement as follows to use 36 MPI processes, and jobs that ran at 10.8% efficiency now run at more than 99%:

#PBS -l select=60:ncpus=36:mpiprocs=36:ompthreads=1

Improvements like that can make your allocation go a lot farther. Ask yourself if some of your jobs run significantly slower than you think they should. Do you unexpectedly run out of wall-clock time? Take another look at how you’re requesting resources in your job script (and how you compiled your code), and don’t hesitate to contact CSG for assistance.
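
As a rough check on the numbers in the WRF case above, the short Python sketch below estimates what fraction of a node's requested cores can have work to do under each select statement. The function name and the simple model behind it are assumptions made for illustration: a build without OpenMP support is taken to run one thread per MPI process, whatever ompthreads says. This is not how CSG measured efficiency, only an upper bound on core usage.

```python
def active_core_fraction(ncpus, mpiprocs, ompthreads, openmp_enabled):
    """Estimate the fraction of requested cores per node that can be busy.

    Hypothetical helper: assumes one thread per MPI process when the
    code was built without OpenMP support (for example, WRF with dmpar).
    """
    threads_per_rank = ompthreads if openmp_enabled else 1
    busy_cores = min(ncpus, mpiprocs * threads_per_rank)
    return busy_cores / ncpus


# select=60:ncpus=36:mpiprocs=4:ompthreads=9 with an MPI-only build
before = active_core_fraction(36, 4, 9, openmp_enabled=False)

# select=60:ncpus=36:mpiprocs=36:ompthreads=1 with the same build
after = active_core_fraction(36, 36, 1, openmp_enabled=False)

print(f"before: {before:.1%}")  # before: 11.1%
print(f"after:  {after:.1%}")   # after:  100.0%
```

Under this model, at most 4 of each node's 36 cores (about 11.1%) could do useful work with the original statement, which is in line with the 10.8% efficiency reported above. The same request with a dm+sm build would give 4 x 9 = 36 busy cores per node.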