DATA SCIENCE FOR UNDERGRADUATES
OPPORTUNITIES AND OPTIONS
Committee on Envisioning the Data Science Discipline: The Undergraduate Perspective
Computer Science and Telecommunications Board
Board on Mathematical Sciences and Analytics
Committee on Applied and Theoretical Statistics
Division on Engineering and Physical Sciences
Board on Science Education
Division of Behavioral and Social Sciences and Education
A Consensus Study Report of
THE NATIONAL ACADEMIES PRESS
Washington, DC
www.nap.edu
THE NATIONAL ACADEMIES PRESS 500 Fifth Street, NW Washington, DC 20001
This activity was supported by Award No. 1626983 from the National Science Foundation (Directorate for Computer and Information Science and Engineering; Directorate for Education and Human Resources; Directorate for Mathematical and Physical Sciences/Division of Mathematical Sciences; and Directorate for Social, Behavioral and Economic Sciences). Any opinions, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect the views of any organization or agency that provided support for the project.
International Standard Book Number-13: 978-0-309-47559-4
International Standard Book Number-10: 0-309-47559-7
Digital Object Identifier: https://doi.org/10.17226/25104
Additional copies of this publication are available for sale from the National Academies Press, 500 Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313; http://www.nap.edu.
Copyright 2018 by the National Academy of Sciences. All rights reserved.
Printed in the United States of America
Suggested citation: National Academies of Sciences, Engineering, and Medicine. 2018. Data Science for Undergraduates: Opportunities and Options. Washington, DC: The National Academies Press. https://doi.org/10.17226/25104.
The National Academy of Sciences was established in 1863 by an Act of Congress, signed by President Lincoln, as a private, nongovernmental institution to advise the nation on issues related to science and technology. Members are elected by their peers for outstanding contributions to research. Dr. Marcia McNutt is president.
The National Academy of Engineering was established in 1964 under the charter of the National Academy of Sciences to bring the practices of engineering to advising the nation. Members are elected by their peers for extraordinary contributions to engineering. Dr. C. D. Mote, Jr., is president.
The National Academy of Medicine (formerly the Institute of Medicine) was established in 1970 under the charter of the National Academy of Sciences to advise the nation on medical and health issues. Members are elected by their peers for distinguished contributions to medicine and health. Dr. Victor J. Dzau is president.
The three Academies work together as the National Academies of Sciences, Engineering, and Medicine to provide independent, objective analysis and advice to the nation and conduct other activities to solve complex problems and inform public policy decisions. The National Academies also encourage education and research, recognize outstanding contributions to knowledge, and increase public understanding in matters of science, engineering, and medicine.
Learn more about the National Academies of Sciences, Engineering, and Medicine at www.nationalacademies.org.
Consensus Study Reports published by the National Academies of Sciences, Engineering, and Medicine document the evidence-based consensus on the study’s statement of task by an authoring committee of experts. Reports typically include findings, conclusions, and recommendations based on information gathered by the committee and the committee’s deliberations. Each report has been subjected to a rigorous and independent peer-review process and it represents the position of the National Academies on the statement of task.
Proceedings published by the National Academies of Sciences, Engineering, and Medicine chronicle the presentations and discussions at a workshop, symposium, or other event convened by the National Academies. The statements and opinions contained in proceedings are those of the participants and are not endorsed by other participants, the planning committee, or the National Academies.
For information about other products and activities of the National Academies, please visit www.nationalacademies.org/about/whatwedo.
COMMITTEE ON ENVISIONING THE DATA SCIENCE DISCIPLINE: THE UNDERGRADUATE PERSPECTIVE
LAURA HAAS, NAE,1 University of Massachusetts Amherst, Co-Chair
ALFRED O. HERO III, University of Michigan, Co-Chair
ANI ADHIKARI, University of California, Berkeley
DAVID CULLER, NAE, University of California, Berkeley
DAVID DONOHO, NAS,2 Stanford University
E. THOMAS EWING, Virginia Polytechnic Institute and State University
LOUIS J. GROSS, University of Tennessee, Knoxville
NICHOLAS J. HORTON, Amherst College
JULIA LANE, New York University
ANDREW MCCALLUM, University of Massachusetts Amherst
RICHARD MCCULLOUGH, Harvard University
REBECCA NUGENT, Carnegie Mellon University
LEE RAINIE, Pew Research Center
ROB RUTENBAR, University of Pittsburgh
KRISTIN TOLLE, Microsoft Research
TALITHIA WILLIAMS, Harvey Mudd College
ANDREW ZIEFFLER, University of Minnesota, Minneapolis
Staff
MICHELLE K. SCHWALBE, Director, Board on Mathematical Sciences and Analytics (BMSA), Study Director
JON EISENBERG, Director, Computer Science and Telecommunications Board (CSTB)
BEN WENDER, Director, Committee on Applied and Theoretical Statistics
AMY STEPHENS, Program Officer, Board on Science Education
LINDA CASOLA, BMSA, Associate Program Officer and Editor
RENEE HAWKINS, CSTB, Financial Manager
JANKI PATEL, CSTB, Senior Program Assistant
___________________
1 Member, National Academy of Engineering.
2 Member, National Academy of Sciences.
COMPUTER SCIENCE AND TELECOMMUNICATIONS BOARD
FARNAM JAHANIAN, Carnegie Mellon University, Chair
LUIZ ANDRÉ BARROSO, Google, Inc.
STEVEN M. BELLOVIN, NAE,1 Columbia University
ROBERT F. BRAMMER, Brammer Technology, LLC
DAVID CULLER, NAE, University of California, Berkeley
EDWARD FRANK, Cloud Parity, Inc.
LAURA HAAS, NAE, University of Massachusetts Amherst
MARK HOROWITZ, NAE, Stanford University
ERIC HORVITZ, NAE, Microsoft Corporation
VIJAY KUMAR, NAE, University of Pennsylvania
BETH MYNATT, Georgia Institute of Technology
CRAIG PARTRIDGE, Raytheon BBN Technologies
DANIELA RUS, NAE, Massachusetts Institute of Technology
FRED B. SCHNEIDER, NAE, Cornell University
MARGO SELTZER, Harvard University
MOSHE VARDI, NAS2/NAE, Rice University
Staff
JON EISENBERG, Director
LYNETTE I. MILLETT, Associate Director
EMILY GRUMBLING, Program Officer
KATIRIA ORTIZ, Associate Program Officer
RENEE HAWKINS, Financial and Administrative Manager
JANKI PATEL, Senior Program Assistant
SHENAE BRADLEY, Administrative Assistant
___________________
1 Member, National Academy of Engineering.
2 Member, National Academy of Sciences.
BOARD ON MATHEMATICAL SCIENCES AND ANALYTICS
STEPHEN M. ROBINSON, NAE,1 University of Wisconsin, Madison, Chair
JOHN R. BIRGE, NAE, University of Chicago
W. PETER CHERRY, Independent Consultant
DAVID CHU, Institute for Defense Analyses
RONALD R. COIFMAN, NAS,2 Yale University
JAMES CURRY, University of Colorado Boulder
CHRISTINE FOX, Johns Hopkins Applied Physics Laboratory
MARK L. GREEN, University of California, Los Angeles
PATRICIA A. JACOBS, Naval Postgraduate School
JOSEPH A. LANGSAM, Morgan Stanley (Retired)
SIMON A. LEVIN, NAS, Princeton University
ANDREW W. LO, Massachusetts Institute of Technology
DAVID MAIER, Portland State University
LOIS CURFMAN MCINNES, Argonne National Laboratory
FRED S. ROBERTS, Rutgers, The State University of New Jersey
ELIZABETH A. THOMPSON, NAS, University of Washington
CLAIRE TOMLIN, University of California, Berkeley
LANCE WALLER, Emory University
KAREN WILLCOX, Massachusetts Institute of Technology
DAVID YAO, NAE, Columbia University
Staff
MICHELLE K. SCHWALBE, Director
BEN WENDER, Program Officer
LINDA CASOLA, Associate Program Officer and Editor
BETH DOLAN, Financial Manager
RODNEY N. HOWARD, Administrative Assistant
___________________
1 Member, National Academy of Engineering.
2 Member, National Academy of Sciences.
COMMITTEE ON APPLIED AND THEORETICAL STATISTICS
ALFRED O. HERO III, University of Michigan, Chair
ALICIA CARRIQUIRY, NAM,1 Iowa State University
MICHAEL J. DANIELS, University of Florida
KATHERINE BENNETT ENSOR, Rice University
AMY HERRING, Duke University
NICHOLAS J. HORTON, Amherst College
DAVID MADIGAN, Columbia University
JOSÉ M.F. MOURA, NAE,2 Carnegie Mellon University
NANCY REID, NAS,3 University of Toronto
CYNTHIA RUDIN, Duke University
AARTI SINGH, Carnegie Mellon University
Staff
BEN WENDER, Director
LINDA CASOLA, Associate Program Officer and Editor
BETH DOLAN, Financial Manager
RODNEY N. HOWARD, Administrative Assistant
___________________
1 Member, National Academy of Medicine.
2 Member, National Academy of Engineering.
3 Member, National Academy of Sciences.
BOARD ON SCIENCE EDUCATION
ADAM GAMORAN, William T. Grant Foundation, Chair
SUNITA V. COOKE, MiraCosta College
MELANIE COOPER, Michigan State University
RODOLFO DIRZO, NAS,1 Stanford University
RUSH D. HOLT, American Association for the Advancement of Science
MATTHEW KREHBIEL, Achieve, Inc.
MICHAEL LACH, University of Chicago
LYNN LIBEN, Pennsylvania State University
CATHRYN (CATHY) MANDUCA, Carleton College
JOHN MATHER, NAS, NASA Goddard Space Flight Center
TONYA M. MATTHEWS, Michigan Science Center
BRIAN REISER, Northwestern University
MARSHALL (MIKE) SMITH, Carnegie Foundation for the Advancement of Teaching
ROBERTA TANNER, Thompson School District (Retired)
SUZANNE WILSON, Michigan State University
Staff
HEIDI SCHWEINGRUBER, Director
KERRY BRENNER, Senior Program Officer
MARGARET HILTON, Senior Program Officer
KENNE DIBNER, Program Officer
AMY STEPHENS, Program Officer
MATTHEW LAMMERS, Program Coordinator
LETICIA GARCILAZO GREEN, Senior Program Assistant
MARGARET KELLY, Senior Program Assistant
COREETHA ENTZMINGER, Program Assistant
___________________
1 Member, National Academy of Sciences.
Preface
The National Academies of Sciences, Engineering, and Medicine established the Committee on Envisioning the Data Science Discipline: The Undergraduate Perspective to set forth a vision for the emerging discipline of data science at the undergraduate level (see Box P.1 for the committee’s statement of task).
This study was sponsored by the National Science Foundation. The Committee on Envisioning the Data Science Discipline: The Undergraduate Perspective (see Appendix A for biographical sketches of the committee members) conducted a number of information-gathering activities and engaged a broad community in its conversations to address the statement of task shown in Box P.1 (see Appendix B for a list of the presentations given during these meetings and Appendix C for a list of those who contributed). In December 2016, the committee met to discuss possible future directions based on progress with current data science programs; societal implications of the evolving field of data science; approaches to expand diversity and inclusion in data science among students, staff, and topic areas; and perspectives on envisioning the future of data science for undergraduates. In April 2017, the committee organized a webinar to collect further input from the public on topics of importance for this study.
In May 2017, the committee convened a workshop in which participants discussed educational models to build relevant foundational, translational, and professional skills for data scientists in various roles; the use of high-impact educational practices in the delivery of data science education; and strategies for broad participation in data science education that rely on formal modes of evaluation and assessment. Participants focused on the ways in which students, institutions, and programs could change in the coming decade, as well as how these changes will affect future plans for data science education.
The committee also held nine webinars throughout fall 2017 as another means to engage the public in conversations about various aspects of data science education. The webinars addressed the following topics:
- Building data acumen;
- Incorporating real-world applications;
- Training faculty and developing curriculum;
- Enhancing communication and teamwork skills;
- Fostering interdepartmental collaboration and institutional organization;
- Considering ethics;
- Assessing and evaluating data science programs;
- Emphasizing diversity, inclusion, and increased participation; and
- Exploring 2-year colleges and institutional partnerships.
Although these nine webinars focused specifically on applications to data science programs, many of the discussions highlighted best practices relevant for all types of academic programming. The committee met for a final session in December 2017 to prepare for the writing of this report. During this session, the committee synthesized discussions from the webinar series and results from activities under way in the data science community. This final report, which was preceded by a September 2017 interim report, explores key questions about the future of the field of data science.
Acknowledgments
This Consensus Study Report has been reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise. The purpose of this independent review is to provide candid and critical comments that will assist the National Academies of Sciences, Engineering, and Medicine in making each published report as sound as possible and to ensure that it meets the institutional standards for quality, objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.
We thank the following individuals for their review of this report:
Richard (Dick) De Veaux, Williams College,
Natalie M. Evans Harris, BrightHive,
Charles Isbell, Jr., Georgia Institute of Technology,
Iain Johnstone, NAS,1 Stanford University,
Brian Kotz, Montgomery College,
Peter Norvig, Google, Inc.,
Renata Rawlings-Goss, South Big Data Regional Innovation Hub and Georgia Institute of Technology,
Ali Sayed, NAE,2 University of California, Los Angeles,
Margo Seltzer, Harvard University, and
Sharon Wood, NAE, University of Texas, Austin.
___________________
1 Member, National Academy of Sciences.
2 Member, National Academy of Engineering.
Although the reviewers listed above provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations presented in the report, nor did they see the final draft of the report before its release. The review of this report was overseen by Alicia L. Carriquiry, NAM,3 Iowa State University. She was responsible for making certain that an independent examination of this report was carried out in accordance with the standards of the National Academies and that all review comments were carefully considered. Responsibility for the final content of this report rests entirely with the authoring committee and the National Academies.
The committee would like to thank Andy Burnett from Knowinnovation for facilitating the committee’s May 2017 workshop as well as the following staff members from the National Science Foundation for their input, assistance, and support of this study: Stephanie August, Chaitan Baru, Eva Campo, Vandana Janeja, Nandini Kannan, Sara Kiesler, Gabriel Perez-Giz, Earnestine Psalmonds-Easter, and Elena Zheleva. The committee would also like to thank the many individuals who provided input to this study; the full list of these individuals is included in Appendix C.
___________________
3 Member, National Academy of Medicine.