DATA AND METADATA ONTOLOGY AND FORMATS

Another challenge lamented at the workshop was the lack of standards and terminologies for data and metadata. Several participants noted the absence of a formal ontology for materials science and the need for a practical set of identifiers and descriptors. One discussion raised the possibility that the materials community suffers from a dearth of conversation about ontology.

One participant complained that developing standard terminology is difficult and slow, pointing to the former ASTM International committee that once developed standards in this area. In general, companies do not like to pay their employees to do this type of work, and the ASTM International committee folded because the community provided no funding for it. The same participant noted that companies may be reluctant to commit to a standard format out of concern that they will be forced to share information they would prefer to keep proprietary in their own formats. During his presentation, Mr. Crichton suggested that the international community should agree on standard models and on a consistent peer review process. He pointed out that agreement on how to represent data can be very difficult because different scientists emphasize different aspects of the same data set. Mr. Shepherd also pointed out the problems posed by the many formats, resolutions, and source locations of large data sets.

To move forward, one participant suggested looking to the NSF program EarthCube (see footnote 3) as a model for working across different communities to develop ontologies and names. Dr. Margiotta also reported that DARPA, along with the Army Research Laboratory (ARL) and other program partners, is developing methods to standardize data fields and metadata fields for materials and materials processing.

METADATA AND MODEL AVAILABILITY

The concept of metadata availability had several meanings at the workshop. Mainly it referred to access to knowledge and information about a particular experiment: the models used, the starting conditions, and other “meta” information about the data. Such information is generally not available in a journal article, and its absence limits one’s ability to replicate the experiment. Many participants repeatedly stressed the need to capture and report metadata.

In other cases, metadata availability referred to broader access to the models themselves; there were several separate discussions about the need to have a stan-

_________________

3 The NSF EarthCube program is an integrated data management program in the geosciences. See https://www.nsf.gov/geo/earthcube/ for more information. Accessed February 24, 2014.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.