National Academies Press: OpenBook
Suggested Citation:"Front Matter." National Academies of Sciences, Engineering, and Medicine. 2020. Life Cycle Decisions for Biomedical Data: The Challenge of Forecasting Costs. Washington, DC: The National Academies Press. doi: 10.17226/25639.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

Prepublication Copy: Subject to Further Editorial Correction

Life Cycle Decisions for Biomedical Data: The Challenge of Forecasting Costs

Committee on Forecasting Costs for Preserving and Promoting Access to Biomedical Data

Board on Mathematical Sciences and Analytics
Committee on Applied and Theoretical Statistics
Computer Science and Telecommunications Board
Division on Engineering and Physical Sciences

Board on Life Sciences
Division on Earth and Life Studies

Board on Research Data and Information
Policy and Global Affairs

A Consensus Study Report of the National Academies of Sciences, Engineering, and Medicine

THE NATIONAL ACADEMIES PRESS
500 Fifth Street, NW
Washington, DC 20001

This activity was supported by Contract No. HHSN263002 with the National Institutes of Health. Any opinions, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect the views of any organization or agency that provided support for the project.

International Standard Book Number-13: 978-0-309-XXXXX-X
International Standard Book Number-10: 0-309-XXXXX-X
Digital Object Identifier: https://doi.org/10.17226/25639

Additional copies of this publication are available from the National Academies Press, 500 Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313; http://www.nap.edu.

Copyright 2020 by the National Academy of Sciences. All rights reserved.

Printed in the United States of America

Suggested citation: National Academies of Sciences, Engineering, and Medicine. 2020. Life Cycle Decisions for Biomedical Data: The Challenge of Forecasting Costs. Washington, DC: The National Academies Press. https://doi.org/10.17226/25639.

The National Academy of Sciences was established in 1863 by an Act of Congress, signed by President Lincoln, as a private, nongovernmental institution to advise the nation on issues related to science and technology. Members are elected by their peers for outstanding contributions to research. Dr. Marcia McNutt is president.

The National Academy of Engineering was established in 1964 under the charter of the National Academy of Sciences to bring the practices of engineering to advising the nation. Members are elected by their peers for extraordinary contributions to engineering. Dr. John L. Anderson is president.

The National Academy of Medicine (formerly the Institute of Medicine) was established in 1970 under the charter of the National Academy of Sciences to advise the nation on medical and health issues. Members are elected by their peers for distinguished contributions to medicine and health. Dr. Victor J. Dzau is president.

The three Academies work together as the National Academies of Sciences, Engineering, and Medicine to provide independent, objective analysis and advice to the nation and conduct other activities to solve complex problems and inform public policy decisions. The National Academies also encourage education and research, recognize outstanding contributions to knowledge, and increase public understanding in matters of science, engineering, and medicine.

Learn more about the National Academies of Sciences, Engineering, and Medicine at www.nationalacademies.org.

Consensus Study Reports published by the National Academies of Sciences, Engineering, and Medicine document the evidence-based consensus on the study’s statement of task by an authoring committee of experts. Reports typically include findings, conclusions, and recommendations based on information gathered by the committee and the committee’s deliberations. Each report has been subjected to a rigorous and independent peer-review process and it represents the position of the National Academies on the statement of task.

Proceedings published by the National Academies of Sciences, Engineering, and Medicine chronicle the presentations and discussions at a workshop, symposium, or other event convened by the National Academies. The statements and opinions contained in proceedings are those of the participants and are not endorsed by other participants, the planning committee, or the National Academies.

For information about other products and activities of the National Academies, please visit www.nationalacademies.org/about/whatwedo.

COMMITTEE ON FORECASTING COSTS FOR PRESERVING AND PROMOTING ACCESS TO BIOMEDICAL DATA

DAVID S.C. CHU, Institute for Defense Analyses, Chair
ILKAY ALTINTAS, University of California, San Diego
G. SAYEED CHOUDHURY, Johns Hopkins University
MARGARET C. LEVENSTEIN, University of Michigan
CLIFFORD A. LYNCH, Coalition for Networked Information
DAVID MAIER, Portland State University
CHARLES F. MANSKI, NAS [1], Northwestern University
MARYANN MARTONE, University of California, San Diego
ALEXA T. MCCRAY, NAM [2], Harvard Medical School
MICHELLE N. MEYER, Geisinger
WILLIAM W. STEAD, NAM, Vanderbilt University Medical Center
LARS VILHUBER, Cornell University

Staff

SAMMANTHA L. MAGSINO, Senior Program Officer, Board on Earth Sciences and Resources, Study Director
SELAM ARAIA, Senior Program Assistant, Board on Mathematical Sciences and Analytics
LINDA CASOLA, Associate Program Officer, Board on Mathematical Sciences and Analytics (until December 2019)
CHRISTOPHER FU, Research Associate, Board on Mathematical Sciences and Analytics (until August 2019)
ADRIANNA HARGROVE, Financial Manager
TYLER KLOEFKORN, Program Officer, Board on Mathematical Sciences and Analytics
MICHELLE SCHWALBE, Director, Board on Mathematical Sciences and Analytics
LINDA WALKER, Program Coordinator, Board on Physics and Astronomy
BEN WATZAK, Research Assistant, Board on Mathematical Sciences and Analytics

[1] Member, National Academy of Sciences.
[2] Member, National Academy of Medicine.

BOARD ON MATHEMATICAL SCIENCES AND ANALYTICS

MARK L. GREEN, University of California, Los Angeles, Chair
HÉLÈNE BARCELO, Mathematical Sciences Research Institute
JOHN R. BIRGE, NAE [1], University of Chicago
RUSSEL E. CAFLISCH, NAS [2], New York University
W. PETER CHERRY, NAE, Independent Consultant
DAVID S.C. CHU, Institute for Defense Analyses
RONALD R. COIFMAN, NAS, Yale University
JAMES (JIM) CURRY, University of Colorado, Boulder
SHAWNDRA HILL, Microsoft Research
LYDIA KAVRAKI, NAM [3], Rice University
TAMARA KOLDA, NAE, Sandia National Laboratories
RACHEL KUSKE, Georgia Institute of Technology
JOSEPH A. LANGSAM, University of Maryland, College Park
DAVID MAIER, Portland State University
LOIS CURFMAN MCINNES, Argonne National Laboratory
JILL PIPHER, Brown University
ELIZABETH A. THOMPSON, NAS, University of Washington
CLAIRE TOMLIN, NAE, University of California, Berkeley
LANCE WALLER, Emory University
KAREN E. WILLCOX, University of Texas, Austin

Staff

MICHELLE SCHWALBE, Director
SELAM ARAIA, Senior Program Assistant
LINDA CASOLA, Associate Program Officer (until December 2019)
CHRISTOPHER FU, Research Associate (until August 2019)
ADRIANNA HARGROVE, Finance Business Partner
TYLER KLOEFKORN, Program Officer
BEN WATZAK, Research Assistant, Board on Mathematical Sciences and Analytics

[1] Member, National Academy of Engineering.
[2] Member, National Academy of Sciences.
[3] Member, National Academy of Medicine.

COMMITTEE ON APPLIED AND THEORETICAL STATISTICS

ALFRED O. HERO III, University of Michigan, Chair
ALICIA CARRIQUIRY, NAM, Iowa State University
RONG CHEN, Rutgers University, The State University of New Jersey
MICHAEL J. DANIELS, University of Florida
KATHERINE BENNETT ENSOR, Rice University
AMY H. HERRING, Duke University
TIM HESTERBERG, Google, Inc.
NICHOLAS J. HORTON, Amherst College
DAVID MADIGAN, Columbia University
XIAO-LI MENG, Harvard University
JOSÉ M.F. MOURA, NAE, Carnegie Mellon University
RAQUEL PRADO, University of California, Santa Cruz
NANCY M. REID, NAS, University of Toronto
CYNTHIA RUDIN, Duke University
AARTI SINGH, Carnegie Mellon University
ALYSON G. WILSON, North Carolina State University

Staff

TYLER KLOEFKORN, Director
SELAM ARAIA, Senior Program Assistant
LINDA CASOLA, Associate Program Officer (until December 2019)
CHRISTOPHER FU, Research Associate (until August 2019)
ADRIANNA HARGROVE, Financial Manager
BEN WATZAK, Research Assistant, Board on Mathematical Sciences and Analytics

COMPUTER SCIENCE AND TELECOMMUNICATIONS BOARD

FARNAM JAHANIAN, Carnegie Mellon University, Chair
STEVEN M. BELLOVIN, NAE, Columbia University
DAVID CULLER, NAE, University of California, Berkeley
EDWARD FRANK, NAE, Cloud Parity, Inc.
LAURA HAAS, NAE, University of Massachusetts, Amherst
MARK HOROWITZ, NAE, Stanford University
ERIC HORVITZ, NAE, Microsoft Corporation
BETH MYNATT, Georgia Institute of Technology
CRAIG PARTRIDGE, Colorado State University
DANIELA RUS, NAE, Massachusetts Institute of Technology
FRED B. SCHNEIDER, NAE, Cornell University
MARGO SELTZER, NAE, University of British Columbia
MOSHE VARDI, NAS/NAE, Rice University

Staff

JON EISENBERG, Senior Board Director
SHENAE BRADLEY, Administrative Assistant
RENEE HAWKINS, Financial and Administrative Manager
LYNETTE I. MILLETT, Associate Director
KATIRIA ORTIZ, Associate Program Officer
BRENDAN ROACH, Program Officer

BOARD ON LIFE SCIENCES

JAMES P. COLLINS, Arizona State University, Chair
A. ALONSO AGUIRRE, George Mason University
VALERIE H. BONHAM, Ropes and Gray LLP
DOMINIQUE BROSSARD, University of Wisconsin, Madison
NANCY D. CONNELL, Johns Hopkins Center for Health Security
SEAN M. DECATUR, Kenyon College
JOSEPH R. ECKER, NAS, Howard Hughes Medical Institute
SCOTT V. EDWARDS, NAS, Harvard University
GERALD L. EPSTEIN, National Defense University
ROBERT J. FULL, University of California, Berkeley
MARY E. MAXON, Lawrence Berkeley National Laboratory
ROBERT NEWMAN, Independent Consultant
STEPHEN J. O’BRIEN, NAS, Nova Southeastern University
LUCILA OHNO-MACHADO, NAM, Nova Southeastern University
CLAIRE POMEROY, NAM, The Albert and Mary Lasker Foundation
MARY E. POWER, NAS, University of California, Berkeley
SUSAN R. SINGER, Rollins College
LANA SKIRBOLL, Sanofi
DAVID R. WALT, NAE/NAM, Brigham and Women’s Hospital
PHYLLIS M. WISE, NAM, University of Colorado

Staff

FRAN SHARPLES, Director
KATHERINE BOWMAN, Senior Program Officer
JESSICA DE MOUY, Senior Program Assistant
ANDREA HODGSON, Program Officer
JO HUSBANDS, Scholar/Senior Project Director
STEVEN MOSS, Associate Program Officer
KEEGAN SAWYER, Senior Program Officer
AUDREY THEVENON, Program Officer
KOSSANA YOUNG, Senior Program Assistant

BOARD ON RESEARCH DATA AND INFORMATION

ALEXA T. MCCRAY, NAM [1], Harvard Medical School, Chair
AMY BRAND, Massachusetts Institute of Technology Press
STUART FELDMAN, Schmidt Futures
SALMAN HABIB, Argonne National Laboratory
JAMES HENDLER, Rensselaer Polytechnic Institute
MARY LEE KENNEDY, Association of Research Libraries
BAREND MONS, Leiden University Medical Centre
SARAH NUSSER, Iowa State University
MICHAEL STEBBINS, Science Advisors, LLC

Staff

TOM ARRISON, Director
GEORGE STRAWN, Scholar
ESTER SZTEIN, Deputy Director
EMI KAMEYAMA, Associate Program Officer

[1] Member, National Academy of Medicine.

Acknowledgment of Reviewers

This Consensus Study Report was reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise. The purpose of this independent review is to provide candid and critical comments that will assist the National Academies of Sciences, Engineering, and Medicine in making each published report as sound as possible and to ensure that it meets the institutional standards for quality, objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.

We thank the following individuals for their review of this report:

Helen Berman, Rutgers, The State University of New Jersey
David Browdy, Fred Hutchinson Cancer Research Center
Dr. Mercè Crosas, Harvard University
Brandi Davis-Dusenbery, Seven Bridges
Mark Ellisman, National Center for Microscopy and Imaging Research
Adam Ferguson, University of California, San Francisco
Aaron Friedman, Amazon Web Services
Peter Jones, NAS [1], Van Andel Institute
Douglas Sicker, Carnegie Mellon University
Margo Seltzer, NAE [2], University of British Columbia
Sharon Terry, Genetic Alliance
Carol Thompson, Allen Institute

Although the reviewers listed above provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations of this report nor did they see the final draft before its release. The review of this report was overseen by Susan J. Curry, NAM [3], University of Iowa. She was responsible for making certain that an independent examination of this report was carried out in accordance with the standards of the National Academies and that all review comments were carefully considered. Responsibility for the final content rests entirely with the authoring committee and the National Academies.

[1] Member, National Academy of Sciences.
[2] Member, National Academy of Engineering.
[3] Member, National Academy of Medicine.

Contents

SUMMARY

1 INTRODUCTION
    The Charge to the National Academies and the Study Committee
    Committee Information Gathering and Approach to Its Task
    Federal Context
    Biomedical Data Landscape
    FAIR Data
    Report Organization
    Beneficiaries of this Report
    References

2 FRAMEWORK FOUNDATION: DATA STATES AND ASSOCIATED ACTIVITIES
    State 1: The Primary Research and Data Management Environment
    State 2: The Active Repository and Platform
    State 3: Long-Term Preservation Platform
    Personnel and Their Relative Salary Levels
    References

3 COST AND THE VALUE OF DATA
    Economic Issues in Forecasting Costs
    Assessing the Value of Data
    Approaches to Data Valuation
    References

4 THE COST FORECASTING FRAMEWORK: IDENTIFYING COST DRIVERS IN THE BIOMEDICAL DATA LIFE CYCLE
    Consulting Widely to Conduct a Cost Forecast
    Mapping Cost Drivers to Activities in Each Data State
    Individual Cost Drivers in the Development and Operation of a Biomedical Information Resource
    Attaching Dollars to the Cost Forecast
    Infrastructural Elements Not Considered in the Cost Model
    References

5 APPLYING THE FRAMEWORK TO A NEW DATA RESOURCE
    Use Case 1: Estimating Costs Associated with Setting up a New Data Archive for the U.S. BRAIN Initiative
    References

6 APPLYING THE FRAMEWORK TO A NEW DATA SET
    Use Case 2: Estimating Costs Associated With a Primary Research Data Set
    References

7 POTENTIAL DISRUPTORS TO FORECASTING COSTS
    Biomedical Data Volume and Variety
    Advances in Machine Learning and Artificial Intelligence
    Developments with Potential Cost Savings
    Workforce-Development Challenges
    Legal and Policy Disruptors
    Changing Understanding of Human Subjects Policy
    Other Potential Disruptors
    References

8 FOSTERING THE DATA MANAGEMENT ENVIRONMENT
    Strategies
    Actions
    Advances for Practice
    Factors for Successful Adoption of Data Forecasting Approaches
    References

APPENDIXES
    A Meetings and Presentations
    B Active Data Management Plans as a Planning Tool
    C Identifying Salary Ranges for Jobs Relevant to the Data Life Cycle
    D Soft Costs for Digital Preservation
    E Template to Map Cost Drivers to Data Resource Properties
    F Comparison of the Contents Across the Three Data States
    G Committee Biographical Information
    H Acronyms


Biomedical research results in the collection and storage of increasingly large and complex data sets. Preserving those data so that they are discoverable, accessible, and interpretable accelerates scientific discovery and improves health outcomes, but requires that researchers, data curators, and data archivists consider the long-term disposition of data and the costs of preserving, archiving, and promoting access to them.

Life Cycle Decisions for Biomedical Data examines and assesses approaches and considerations for forecasting costs for preserving, archiving, and promoting access to biomedical research data. This report provides a comprehensive conceptual framework for cost-effective decision making that encourages data accessibility and reuse for researchers, data managers, data archivists, data scientists, and institutions that support platforms that enable biomedical research data preservation, discoverability, and use.
