1 Introduction and Background¹

Now is an exciting time for brain research due to enormous global investments that have enabled creation of the infrastructure required to generate great pools of neuroscience data and develop novel techniques. This has been facilitated in part by a shift over the past decade in the way research data are shared (see Figure 1-1). In the traditional model, data generated by one group of investigators may be shared with one or more other research groups, each of which builds its own tools to analyze and manipulate the data, resulting in a proliferation of datasets and research tool versions. This contrasts with the cloud model, where data and tools are co-located in a platform that enables multiple investigators to work with a single copy of the dataset using common tools. The cloud model has led to a vast increase in the quantity and complexity of data and expanded access to these data, which has attracted many more researchers, enabled multi-national neuroscience collaborations, and facilitated the development of many new tools. Yet, the cloud model has also produced new challenges related to data storage, organization, and protection. Merely switching the technical infrastructure from local repositories to cloud repositories is not enough to optimize data use.

¹ The planning committee's role was limited to planning the workshop, and the Proceedings of a Workshop was prepared by the workshop rapporteurs as a factual summary of what occurred at the workshop. Statements, recommendations, and opinions expressed are those of individual presenters and participants; have not been endorsed or verified by the Health and Medicine Division (HMD) of the National Academies of Sciences, Engineering, and Medicine; and should not be construed as reflecting any group consensus.

PREPUBLICATION COPY - Uncorrected Proofs
NEUROSCIENCE DATA IN THE CLOUD

FIGURE 1-1 A shift in the research model. In the cloud model, rather than data being dispersed to multiple researchers for analysis with different sets of tools, data and tools are co-located in the cloud, allowing multiple researchers to access these data using a common set of analytical tools. SOURCE: Presented by Nick Weber, September 24, 2019.

Thirty years ago, the National Institute of Mental Health (NIMH), National Institute on Drug Abuse (NIDA), and National Science Foundation (NSF) commissioned the Institute of Medicine (IOM) to consider the future of digital and networked neuroscience, recalled Michael Huerta, associate director of the National Library of Medicine (NLM). It is fitting that 30 years later, a group reconvened at the National Academies of Sciences, Engineering, and Medicine (the National Academies), which incorporates the former IOM, to explore the burgeoning use of cloud computing in neuroscience, said Huerta. On September 24, 2019, the National Academies' Forum on Neuroscience and Nervous System Disorders hosted a workshop on neuroscience data in the cloud, co-chaired by Huerta and Deanna Barch, chair of the Department of Psychological and Brain Sciences at Washington University in St. Louis.² Box 1-1 provides definitions for some of the core concepts related to cloud computing discussed throughout the workshop. The intention of the workshop, said Barch, was to focus on maximizing the benefits that can be realized from neuroscience data.

² For further information about the workshop, including slides presented by speakers, see http://www.nas.edu/NeuroForum (accessed January 17, 2020).
BOX 1-1
Definition of Cloud Computing and Select Related Concepts

Cloud computing: as defined by the National Institute of Standards and Technology, "is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction" (Mell and Grance, 2011).

Data integration: "the process of combining data generated using a variety of different research methods to enable detection of underlying themes…."ᵃ

Data model: "organize data elements and standardize how the data elements relate to one another. Since data models document real life people, places, and things and the events between them, the data model represents reality."ᵇ

Federation: a data organizational scheme that supports "the sharing of arbitrary resources from arbitrary application domains with arbitrary consumer groups across multiple domains.… Any type of organizational collaboration could be facilitated by a secure method to selectively share data with specific partners" (Lee et al., 2019).

Interoperability: the ability to exchange and use information from various sources and of different types in computer systems.ᶜ

Platform: a grouping of software or hardware upon which other technologies are developed.ᶜ

ᵃ From Nature Research, available at https://www.nature.com/subjects/data-integration (accessed January 16, 2020).
ᵇ From the Princeton University Center for Data Analytics & Reporting, available at https://cedar.princeton.edu/understanding-data/what-data-model (accessed January 17, 2020).
ᶜ From the National Institutes of Health Strategic Plan for Data Science, available at https://datascience.nih.gov/strategicplan (accessed January 16, 2020).
WORKSHOP OBJECTIVES

The workshop brought together a broad range of stakeholders involved in cloud-based neuroscience initiatives and research to explore the use of cloud technology to advance neuroscience research and share approaches to address current barriers. These stakeholders represented academia, government, foundations, the pharmaceutical and information technology industries, and the legal system. They were tasked not only with identifying challenges, but also with suggesting solutions and best practices that can
help optimize the utility and increase the efficiency of cloud-based neuroscience initiatives, support ongoing efforts, and share information about the work of others, said Barch. In addition to cloud-specific issues, the workshop covered a number of topics related to encouraging data sharing and open science, which are integrally relevant for, but not specific to, cloud-based platforms. Many discussions at the workshop covered issues, such as privacy protection, that are common across many types of data, not just neuroscience. The workshop provided a venue for members of the neuroscience community to come together to discuss approaches for tackling these common challenges, as well as challenges that are specific to neuroscience data and the cloud-based platforms that are dedicated to neuroscience or are frequently used by this community. Box 1-2 provides the workshop Statement of Task.

BOX 1-2
Statement of Task

• Review the landscape of major neuroscience cloud-based initiatives and other uses of cloud technology within neuroscience research.
• Discuss aspirational goals for maximizing benefit from data and compute in the cloud by empowering broad and meaningful data sharing and fostering open science.
• Consider best practices and policies that would increase efficiencies within and across cloud resources, including aspects such as:
    o Consent and data use agreements
    o Authorization for and accessibility to a variety of data types by a variety of users
    o Protection of privacy
    o Assignment of credit, ownership, and licensing
    o Technical issues
    o Researcher support and training
• Explore potential next steps to move the field forward to develop and deploy best practices in the service of achieving aspirational goals.

ORGANIZATION OF PROCEEDINGS

These proceedings reflect the organization of the meeting. Chapter 2 summarizes talks about the landscape of cloud-based technologies for neuroscience research. Two sets of breakout sessions are summarized in Parts 1 and 2, which organize issues by content area and types of data, respectively. In Part 1, Chapter 3 covers issues related to the protection of privacy; Chapter 4 addresses data management and interoperability issues; Chapter 5 examines issues related to assignment of credit and data ownership; and Chapter 6 discusses platform governance. In Part 2, challenges related to different types of neuroscience data are examined: Chapter 7, clinical trial and research data; Chapter 8, genetic data; Chapter 9, neuroimaging data; and Chapter 10, real-world data. Chapter 11 concludes with a discussion of future directions, including identifying tangible next steps and promising areas for future action.