1 Introduction
Pages 1-6



From page 1...
... Tools that forecast the costs of long-term data preservation could be useful as the cost to curate and manage these data in meaningful ways continues to increase, as could stewardship to assess and maintain data that have future value. The National Academies of Sciences, Engineering, and Medicine's Board on Mathematical Sciences and Analytics (in cooperation with the Computer Science and Telecommunications Board, the Board on Life Sciences, and the Board on Research Data and Information)
From page 2...
... In so doing, the committee will examine and evaluate the following considerations:
• Economic factors to be considered when examining the life-cycle cost for data sets (e.g., data acquisition, preservation, and dissemination);
• Cost consequences of various practices in accessioning and de-accessioning data sets;
• Economic factors to be considered in designating data sets as high value;
• Assumptions built into the data collection and/or modeling processes;
• Anticipated technological disruptors and future developments in data science over a 5- to 10-year horizon; and
• Critical factors for successful adoption of data forecasting approaches by research and program management staff.
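The first consideration's life-cycle framing (acquisition, preservation, dissemination) can be made concrete with a minimal cost sketch. Note this is an illustrative assumption only: the function name, cost categories, and use of discounting are not a model presented at the workshop.

```python
# Hedged sketch: a minimal life-cycle cost estimate for a data set,
# covering the acquisition, preservation, and dissemination phases
# named in the statement of task. All parameters and the net-present-
# cost discounting are illustrative assumptions, not a workshop model.

def lifecycle_cost(acquisition: float,
                   annual_preservation: float,
                   annual_dissemination: float,
                   years: int,
                   discount_rate: float = 0.03) -> float:
    """Net present cost of acquiring and keeping a data set for `years` years."""
    total = acquisition  # one-time up-front cost
    for t in range(1, years + 1):
        yearly = annual_preservation + annual_dissemination
        total += yearly / (1 + discount_rate) ** t  # discount future spending
    return total

# Example: $50k to acquire, $5k/yr to preserve, $2k/yr to serve, 10 years.
cost = lifecycle_cost(50_000, 5_000, 2_000, 10)
```

A sketch like this makes the committee's question operational: changing the preservation horizon or discount rate shows how sensitive total stewardship cost is to decisions made at accessioning time.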
From page 3...
... Chu noted that NLM serves as an important resource for biomedical discovery through its substantial data and information resources. Patricia Flatley Brennan, NLM, stated that 5 million people interact with NIH's data repositories, resources, data sets, and literature each day; these activities benefit clinicians, patients, researchers, industry, government agencies, and pharmaceutical companies.
From page 4...
... Brennan said that NIH actively encourages the use of open access data repositories for data generated throughout the course of the research process and oversees several data storage activities. PubMed Central, which currently hosts more than 5 million articles and adds between 5,000 and 7,000 data sets each month, is best suited for investigator-curated data sets up to 2 GB.
From page 5...
... In July 2019, NIH's National Center for Biotechnology Information uploaded 5 PB of a nonhuman sequence read archive into the cloud system, which will be available via Google Cloud and Amazon Web Services for public access. Brennan explained that each year, NIH spends $30 billion to generate data, more than $1 billion to manage NIH data in various repositories, and approximately $250 million to support data repositories in postsecondary institutions. She noted that there are political, sociological, and scientific questions embedded in decisions about the allocation of funds toward data sustainability in particular, and there are substantial hidden costs in data management.
From page 6...
... SOURCE: Patricia Flatley Brennan, National Library of Medicine, presentation to the workshop, July 11, 2019.
