Suggested Citation:"2 Keynote Addresses." National Academies of Sciences, Engineering, and Medicine. 2021. Data Analytics and What It Means to the Materials Community: Proceedings of a Workshop. Washington, DC: The National Academies Press. doi: 10.17226/25628.

2

Keynote Addresses

DATA ANALYTICS AND WHAT IT MEANS TO THE MATERIALS COMMUNITY

Gareth Conduit, Intellegens, focused on the promise that machine learning (ML) brings to new materials design, shared case studies from his company’s work, and outlined future opportunities in this space.

The Promise of Machine Learning

ML has enormous promise for the materials community. Taking advantage of experimental, computational, and analytical sources of data, ML algorithms can quickly discover unexpected correlations between material properties, dramatically reduce the number of experiments required, and reduce overall costs. All of this can lead to more efficient discovery and design of new materials.

Traditional materials design is expert-driven and involves a lengthy trial-and-iteration process, with many years and millions of dollars invested in developing and licensing a new material. Standard ML algorithms speed up this process by using composition-property relationships to predict a new material’s properties. However, most algorithms work well only with a full spectrum of data to train the model, and often fall short when faced with significant data gaps.

Intellegens’s proprietary algorithm, the Alchemite ML engine, exploits not only composition-property but also property-property correlations and simulations. It can handle large data sets, specialized databases with few data points, and first-principles computer simulation results. Critically, it can fill data gaps by using ML to iteratively enhance the algorithm’s inputs, such that predictions on one property can then be used to inform predictions for another property in a subsequent cycle. Conduit said the tool has been demonstrated to reduce experimentation by 90 percent and cut the development timeline for new materials from 2 decades to 2 years.
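
The gap-filling idea described here, in which predictions for one property feed the predictions for another on the next cycle, resembles iterative imputation. The sketch below is not the Alchemite algorithm, whose internals are not described in the talk; it is a generic illustration that assumes linear property relationships, ordinary least squares, and synthetic data. The function name `iterative_impute` and the column layout are hypothetical.

```python
import numpy as np

def iterative_impute(X, n_cycles=10):
    """Fill NaN entries by repeatedly regressing each column on the others."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    # Initial guess: replace each gap with the mean of its column.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])
    for _ in range(n_cycles):
        for j in range(X.shape[1]):
            rows = missing[:, j]
            if not rows.any():
                continue
            others = np.delete(filled, j, axis=1)
            A = np.column_stack([np.ones(len(others)), others])
            # Fit only on rows where column j was actually measured.
            coef, *_ = np.linalg.lstsq(A[~rows], filled[~rows, j], rcond=None)
            # Predicted gaps become inputs to the other columns' fits.
            filled[rows, j] = A[rows] @ coef
    return filled

# Synthetic example: two properties that depend linearly on composition.
rng = np.random.default_rng(0)
composition = rng.uniform(0.0, 1.0, size=50)
prop_a = 2.0 * composition + 1.0
prop_b = 3.0 * prop_a - 0.5
truth = np.column_stack([composition, prop_a, prop_b])
data = truth.copy()
data[:10, 1] = np.nan     # prop_a unmeasured for 10 samples
data[10:25, 2] = np.nan   # prop_b unmeasured for 15 other samples

completed = iterative_impute(data)
max_error = np.abs(completed - truth).max()
print(f"largest imputation error: {max_error:.2e}")
```

Real materials data are noisy and nonlinear, so a practical tool would replace the least-squares step with a more flexible model and track the uncertainty of each filled value.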

Case Studies

Conduit shared two case studies illustrating how Intellegens uses ML both to merge different sources of data and to merge data with first-principles computer simulations.

In one project, Intellegens merged multiple data sources to help Rolls-Royce find a new alloy that could withstand jet engine temperatures, be three-dimensional (3D)-printable with direct laser deposition, and meet other performance parameters for factors such as cost and fatigue. The task was challenging because of the enormous number of variables involved, and because of a dearth of data for some of the properties, particularly those produced by direct laser deposition. The team trained the Alchemite system on several merged data sources. For properties lacking data, the ML system was able to extrapolate from systems for which more data were available—in this case, learning from weldability data to predict properties relevant to 3D printing. Ultimately, the algorithm identified a new candidate composition, a nickel superalloy, having the desired specifications for heat treatment, exposure parameters, and other properties while also meeting targets such as cost.1 The next step was to experimentally validate the material’s properties; it is now being tested in real jet engines.

The second case study illustrates how ML can merge experimental data with first-principles calculations. This project’s goal was to design a new material that could be used as a thermometer while avoiding the calibration costs associated with traditional thermometers. Conduit’s team used quantum simulations to predict electrical resistance and other properties for which there were no existing data, identifying candidate materials that could meet the specifications. One proposed material proved successful in experimental validation studies, and it is currently in the process of being commercialized.

___________________

1 B.D. Conduit, T. Illston, S. Baker, D. Vadegadde Duggappa, S. Harding, H.J. Stone, and G.J. Conduit, 2019, Probabilistic neural network identification of an alloy for direct laser deposition, Materials and Design 168:107644, https://doi.org/10.1016/j.matdes.2019.107644.

Future Opportunities

ML techniques can be used to create new materials for a vast array of applications, including batteries, concrete additives, metal-organic frameworks, and pharmaceutical agents. Conduit suggested that this process could even usher in an era of concurrent design, where material design happens in parallel with product design to ensure the right material for the product. He noted that Intellegens has created an integrated software product that allows customers to load their own data, train the ML model, and design a new material all in one workflow. That workflow can be shared by individual scientists and business managers, and between scientists to foster new collaborations.

Any ML model relies on the data used to train it; one important barrier in the materials field is that data is siloed in separate databases. Conduit described an effort by the Open Databases Integration for Materials Design (OPTiMaDe) Consortium2 to link multiple materials databases together for better data mining. The preliminary version uses a unified RESTful (representational state transfer) application programming interface (API), enabling users to query multiple databases at one time. With funding from the Centre Européen de Calcul Atomique et Moléculaire (CECAM), the effort is expanding to molecular dynamics and biosimulations databases.
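
To illustrate what a unified REST API makes possible, the sketch below builds the same query for several databases. It assumes the `/v1/structures` endpoint and the `filter` and `page_limit` query parameters from the consortium's published API specification; the provider base URLs are placeholders, not real services, and the function name `build_queries` is hypothetical. No network request is made.

```python
from urllib.parse import urlencode

# Placeholder base URLs (assumptions); real providers publish their own.
PROVIDERS = [
    "https://db-one.example.org/optimade",
    "https://db-two.example.org/optimade",
]

def build_queries(filter_expr, page_limit=5, providers=PROVIDERS):
    """Return one GET URL per provider for the same structure filter."""
    query = urlencode({"filter": filter_expr, "page_limit": page_limit})
    return [f"{base}/v1/structures?{query}" for base in providers]

# Ask every database for structures containing Ni and Al
# with at most three elements in total.
urls = build_queries('elements HAS ALL "Ni","Al" AND nelements<=3')
for url in urls:
    print(url)
```

Because every provider accepts the same filter string, the only thing that changes between databases is the base URL.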

Q&A

Responding to questions from Ichiro Takeuchi, University of Maryland, and Carlos Levi, University of California, Santa Barbara, Conduit discussed how ML addresses data gaps by generating reasonable approximations from related parameters for which more data is available. He noted that while the shape of the product was not a critical consideration for the 3D-printed jet engine component he discussed in his talk, shape could be a factor that would warrant greater consideration for other applications; in such cases, additional parameters would need to be incorporated into the analysis to support the necessary properties.

Prompted by Surya Kalidindi, Georgia Institute of Technology, Conduit discussed how considerations related to structure factor into the ML approach. Although the ML workflow may seem to jump from process to properties, structure is actually incorporated within other physical property parameters. ML can make inferences about structure and other physical properties from the data it gathers or generates.

Dane Morgan, University of Wisconsin, Madison, asked how ML is different from integrated computational materials engineering (ICME). Conduit answered that ML is not as complicated as ICME. It is similar to a Taylor expansion, but can handle missing data, better understand model uncertainties, and produce better results.

___________________

2 See http://www.optimade.org.

Last, Conduit agreed with June Lau, National Institute of Standards and Technology (NIST), that determining meaningful correlations is challenging. He noted that it is critical to rely on subject matter expertise to determine which variables will result in the most meaningful correlations.

AI-BASED KNOWLEDGE SYSTEMS FOR SUPPORTING MATERIALS-MANUFACTURING INNOVATIONS

Surya Kalidindi, Georgia Institute of Technology, detailed his vision for building materials knowledge systems based on artificial intelligence (AI). He argued that materials knowledge systems can accelerate discovery, development, and deployment of new or improved materials for advanced technologies, as well as mediate between the materials design community and those who use materials to make products.

Creating an AI-Based Materials Knowledge System

An AI-based materials knowledge system could support various functions, including diagnosis (e.g., by finding outliers in the context of machine failure) and prediction (answering “what if” questions—e.g., what would happen if more of an element is added to a material), as well as recommend next steps toward a stated goal. Kalidindi is working on a system formulated on Bayesian statistical frameworks, which work with small data and quantify uncertainty more formally than neural networks, regression, and other techniques. The system’s computational framework will be designed to dynamically evolve as data is continuously added from disparate sources. Kalidindi emphasized that to be useful, a materials knowledge system must have a highly efficient and user-friendly front end.

Process-structure-property (PSP) linkages, covering all relevant materials classes, structure length scales, and time scales, are foundational to materials knowledge systems, Kalidindi said. While future technologies may enable process-property capabilities, he believes structure remains an essential component of the workflow today. PSP linkages cover a hierarchy of several material structure scales, each of which has its own variables at the macro-, meso-, and microscale—one reason why materials science is so complex.

While process and property variables have rigorous mathematical definitions and no ambiguities, defining structure is more difficult because it is so highly dimensional. An effective materials knowledge system needs a framework for high-value, low-dimensional representations of the material structure that is broadly applicable to various materials classes at different length scales, Kalidindi said. He noted that balancing the trade-offs between the number of dimensions and the amount of information included is an important challenge to creating PSP linkages.

The Role of Simulations and Experiments

PSP linkages, covering many materials across many length scales, are at the heart of the materials knowledge system. To generate integrated knowledge systems, one must draw from two classes of sources of information about process, structure, and properties: physics-based simulations and physical experiments.

Physics-based simulations use the governing rules of physics to generate expected values and variances under a given set of circumstances. The governing physics can be thought of as a variable by parameterizing materials phenomena through mathematics such as Green’s functions. Applying ML to enhance physics-based simulations with a small amount of data allows the creation of surrogate models, such as Gaussian process models, that can effectively sample large distributions within practical computational budgets.
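
As a rough illustration of a surrogate model, the sketch below fits a Gaussian process to a handful of runs of a stand-in "simulation" and then predicts, with an uncertainty estimate, at many new inputs. The simulation function, the squared-exponential kernel, and its hyperparameters are all assumptions chosen for the example; they are not taken from Kalidindi's work.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance between two sets of 1D inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def expensive_simulation(x):
    # Stand-in for a costly physics-based simulation (assumption).
    return np.sin(2.0 * np.pi * x)

# Only eight simulation runs are used to train the surrogate.
x_train = np.linspace(0.0, 1.0, 8)
y_train = expensive_simulation(x_train)

# The surrogate is then evaluated cheaply on a dense grid.
x_test = np.linspace(0.0, 1.0, 101)
mean, std = gp_predict(x_train, y_train, x_test)
max_abs_error = np.abs(mean - expensive_simulation(x_test)).max()
print(f"largest surrogate error on the grid: {max_abs_error:.3f}")
```

The predicted standard deviation is what lets a surrogate of this kind flag where another simulation run would be most informative.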

By contrast, physical experiments are typically small, and governing physics is the central unknown, but not a variable. Said differently, the physics governing a given phenomenon may be undetermined, and understanding the underlying physics is often a key goal of experimentation. This can be done by first making an assumption about the governing physics, then generating enough data to enable new observations, and last calculating the likelihood of making those observations if the assumed physics is correct. This is a new application of Bayesian laws in materials science.
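
The procedure in this paragraph (assume the governing physics, collect observations, then compute how likely those observations are under the assumption) can be sketched with Bayes' rule. The two candidate models, the measurements, and the noise level below are invented for illustration only.

```python
import math

def gaussian_log_likelihood(observed, predicted, sigma):
    """Log-probability of the observations given a model's predictions."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * sigma**2)
        - 0.5 * ((o - p) / sigma) ** 2
        for o, p in zip(observed, predicted)
    )

def posterior_probabilities(log_likelihoods, priors):
    """Bayes' rule over a discrete set of hypotheses (log-sum-exp form)."""
    log_post = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Two hypothetical candidates for the governing physics of a response y(x).
models = {
    "linear": lambda x: 2.0 * x,
    "quadratic": lambda x: 2.0 * x**2,
}
x_values = [0.5, 1.0, 1.5, 2.0]
observations = [0.6, 2.1, 4.4, 8.1]  # made-up measurements
sigma = 0.3                          # assumed measurement noise

log_liks = [
    gaussian_log_likelihood(observations, [f(x) for x in x_values], sigma)
    for f in models.values()
]
posterior = dict(zip(models, posterior_probabilities(log_liks, [0.5, 0.5])))
for name, prob in posterior.items():
    print(f"{name}: {prob:.4f}")
```

Starting from equal priors, the few observations are enough to shift nearly all of the probability onto the quadratic model in this toy case.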

In Kalidindi’s view, physics-based simulations and experiments have different, complementary roles: Models are used to evaluate the likelihood function, and experiments are used to gain information. Materials knowledge systems could help researchers determine when a simulation is needed and when it is time to experiment more.

Progress Toward Materials Knowledge Systems

Kalidindi described several developments representing progress toward materials knowledge systems. The first relates to microstructure quantification. Kalidindi and colleagues successfully employed advanced pattern recognition for microstructures through fast computations of n-point spatial correlations on large data sets.3 They achieved this by taking advantage of discrete Fourier transforms, which are applied to large data sets with batch processing and parallel programming to rapidly quantify and assess microstructures.

___________________

3 S. Kalidindi, 2015, Hierarchical Materials Informatics: Novel Analytics for Materials Data, Butterworth-Heinemann, Oxford, UK.

Building on that work, the team extended the calculations to target other parameters such as orientation, dislocation densities, and plasticity. The team also applied its fast and efficient statistical distribution computations at the atomic scale, converting coordinates from public databases to a voxelated microstructure. This in turn enables a feature engineering approach in which, using filters, features of interest can be defined in a rigorous statistical framework.
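
The Fourier-transform route to spatial correlations can be illustrated for the simplest case, the two-point autocorrelation of one phase in a voxelated microstructure. The sketch below assumes periodic boundaries and a random synthetic microstructure; it is a generic NumPy illustration, not the team's implementation.

```python
import numpy as np

def two_point_autocorrelation(phase):
    """Periodic two-point autocorrelation of a phase indicator array.

    Entry r gives the probability that a voxel and the voxel shifted
    by r both belong to the phase.
    """
    phase = np.asarray(phase, dtype=float)
    F = np.fft.fftn(phase)
    # Correlation theorem: autocorrelation is the inverse FFT of |F|^2.
    return np.fft.ifftn(F * np.conj(F)).real / phase.size

# Synthetic two-phase microstructure, roughly 30 percent phase 1.
rng = np.random.default_rng(1)
micro = (rng.random((64, 64)) < 0.3).astype(float)

stats = two_point_autocorrelation(micro)
volume_fraction = micro.mean()

# Cross-check one shift against the direct (slow) definition.
direct = np.mean(micro * np.roll(micro, -3, axis=1))
print(stats[0, 0], volume_fraction, stats[0, 3], direct)
```

At zero shift the statistic equals the phase volume fraction, which gives a quick sanity check on the computation.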

These tools are available for download as the Materials Knowledge System in Python (PyMKS),4 which is a broadly applicable framework to extract PSP linkages (Figure 2.1). The framework has three core steps: preprocessing data generation, segmentation and microstructure quantification, and postprocessing dimensionality reduction. Applied to image analysis, the PyMKS workflow is based on one of five “recipes” that are system-selected (but can be tweaked) and allow the user to batch-process hundreds of images in a mostly automated way. In addition to segmenting image features across large image collections according to states of interest, the system can be used to segment every unit individually if needed.
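
The general shape of such a workflow (generate data, quantify microstructure, reduce dimensionality, then fit a linkage) can be sketched in plain NumPy. The code below deliberately does not use the PyMKS API; the synthetic microstructures, the toy rule-of-mixtures property, and the choice of three principal components are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def autocorrelation(phase):
    """Periodic two-point autocorrelation via FFT."""
    F = np.fft.fftn(phase)
    return np.fft.ifftn(F * np.conj(F)).real / phase.size

# Step 1: data generation (synthetic stand-ins for segmented images).
target_fractions = rng.uniform(0.2, 0.8, size=40)
micros = [(rng.random((32, 32)) < f).astype(float) for f in target_fractions]
fraction = np.array([m.mean() for m in micros])
# Toy property: rule of mixtures with hypothetical phase values 1 and 10.
prop = 1.0 * (1.0 - fraction) + 10.0 * fraction

# Step 2: microstructure quantification, one row of statistics per image.
X = np.array([autocorrelation(m).ravel() for m in micros])

# Step 3: dimensionality reduction by PCA (SVD of the centered data).
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
n_components = 3
scores = U[:, :n_components] * S[:n_components]
explained = (S**2 / np.sum(S**2))[:n_components]

# Structure-property linkage: least-squares fit on the PC scores.
A = np.column_stack([np.ones(len(scores)), scores])
coef, *_ = np.linalg.lstsq(A, prop, rcond=None)
residual = prop - A @ coef
r_squared = 1.0 - np.sum(residual**2) / np.sum((prop - prop.mean()) ** 2)
print(f"variance explained by PC1: {explained[0]:.3f}, R^2: {r_squared:.3f}")
```

Here 1,024 correlation values per image collapse to three scores, which is the kind of trade-off between dimensionality and retained information that the framework has to manage.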

Example Applications

Kalidindi discussed how this approach has been applied to real-world materials problems. By eliminating model retraining, his team has greatly sped up the prediction of composite stress-strain responses based on a 3D microstructure and user-specified hardening laws for a material.5 An answer for any microstructure is given immediately, illustrating the benefits of a broadly applicable database with PSP linkages.

The team also applied process-structure linkages to predict microstructure changes in 27 superalloy samples based on time and temperature. The knowledge system allows predictions to be made from process to structure and also the inverse, from structure to process. In this case, the model identified the time-temperature equivalence, a phenomenon known to physicists but not explicitly programmed into the model. Kalidindi described several other applications, including quality control for steel certification and the application of atomMKS to grain boundary structures.

___________________

4 See http://www.pymks.org.

5 M.I. Latypov, L.S. Toth, and S.R. Kalidindi, 2019, Materials knowledge system for nonlinear composites, Computer Methods in Applied Mechanics and Engineering 346:180-196, https://doi.org/10.1016/j.cma.2018.11.034.

FIGURE 2.1 PyMKS templated workflow for extracting process-structure-property (PSP) linkages. SOURCE: Surya Kalidindi, Georgia Institute of Technology, presentation to the workshop.

For localization, the team has applied its approach to study stress fields in polycrystals, creating predictions in under a minute on a standard desktop computer.6 The process also works for predicting plastic strain rates in two-phase composites.7 Kalidindi noted that localization studies are important because they provide an opportunity for learning about fatigue performance without running expensive simulations.8

After highlighting various ways to extract material properties at the microscale and macroscale, Kalidindi concluded by emphasizing the value of AI-based materials knowledge systems to provide objective decision support for materials innovation, as well as the continued need for high-throughput, cost-effective, multiresolution mechanical measurement protocols to generate the data to feed into these knowledge systems.

___________________

6 Y.C. Yabansu and S.R. Kalidindi, 2015, Representation and calibration of elastic localization kernels for a broad class of cubic polycrystals, Acta Materialia 94:26-35, https://doi.org/10.1016/j.actamat.2015.04.049.

7 D. Montes de Oca Zapiain, E. Popova, and S.R. Kalidindi, 2017, Prediction of microscale plastic strain rate fields in two-phase composites subjected to an arbitrary macroscale strain rate using the materials knowledge system framework, Acta Materialia 141:230-240, https://doi.org/10.1016/j.actamat.2017.09.016.

8 N.H. Paulson, M.W. Priddy, D.L. McDowell, and S.R. Kalidindi, 2018, Data-driven reduced-order models for rank-ordering the high cycle fatigue performance of polycrystalline microstructures, Materials and Design 154:170-183, https://doi.org/10.1016/j.matdes.2018.05.009.

Q&A

Josh Peek, Space Telescope Science Institute, asked if deep learning could replace feature engineering. Kalidindi replied that there are not yet enough data to facilitate deep learning, and that materials also have classification complexities that make mere image recognition inadequate.

Brian Storey, Olin College and Toyota Research Institute (TRI), asked about the process of classifying microstructure from images. Kalidindi answered that it is possible to track microstructure from images, but there are problems moving from two dimensions to three dimensions. In addition, a lack of data in this space increases the uncertainty. Carla Gomes, Cornell University, noted that this lack of data underscores the current limitations of ML more broadly. She and Kalidindi agreed that integrating physics-based models will be key to scaling up ML approaches.

Ward Plummer, Louisiana State University, asked how the materials knowledge system framework interfaces with specific parameters set by materials users. Kalidindi stressed that any such work is a collaboration, and that his team must work with the appropriate domain experts to select baseline materials and target definitions in order for the approach to yield useful results. In response to another participant’s question, Kalidindi agreed that time series expansions of his model show correlations between microstructures over different time scales very well.

Last, a participant asked if the model could still work if the underlying physics—for example, the physics of nanocomposites—is not well understood. Kalidindi clarified that uncertainty is always present in any underlying physics. The materials knowledge system is a means to quantify a state, with its uncertainty, in order to recommend the best next step to get closer to the goal. In essence, the system’s purpose is to provide decision support, not to solve every problem in existence.


Emerging techniques in data analytics, including machine learning and artificial intelligence, offer exciting opportunities for advancing scientific discovery and innovation in materials science. Vast repositories of experimental data and sophisticated simulations are being utilized to predict material properties, design and test new compositions, and accelerate nearly every facet of traditional materials science. How can the materials science community take advantage of these opportunities while avoiding potential pitfalls? What roadblocks may impede progress in the coming years, and how might they be addressed?

To explore these issues, the Workshop on Data Analytics and What It Means to the Materials Community was organized as part of a workshop series on Defense Materials, Manufacturing, and Its Infrastructure. Hosted by the National Academies of Sciences, Engineering, and Medicine, the 2-day workshop was organized around three main topics: materials design, data curation, and emerging applications. Speakers identified promising data analytics tools and their achievements to date, as well as key challenges related to dealing with sparse data and filling data gaps; decisions around data storage, retention, and sharing; and the need to access, combine, and use data from disparate sources. Participants discussed the complementary roles of simulation and experimentation and explored the many opportunities for data informatics to increase the efficiency of materials discovery, design, and testing by reducing the amount of experimentation required. With an eye toward the ultimate goal of enabling applications, attendees considered how to ensure that the benefits of data analytics tools carry through the entire materials development process, from exploration to validation, manufacturing, and use. This publication summarizes the presentations and discussion of the workshop.
