National Academies Press: OpenBook

Frontiers in Massive Data Analysis (2013)

Chapter: Front Matter

Suggested Citation:"Front Matter." National Research Council. 2013. Frontiers in Massive Data Analysis. Washington, DC: The National Academies Press. doi: 10.17226/18374.

FRONTIERS IN MASSIVE DATA ANALYSIS

Committee on the Analysis of Massive Data

Committee on Applied and Theoretical Statistics

Board on Mathematical Sciences and Their Applications

Division on Engineering and Physical Sciences

NATIONAL RESEARCH COUNCIL
OF THE NATIONAL ACADEMIES

THE NATIONAL ACADEMIES PRESS
Washington, D.C.
www.nap.edu


THE NATIONAL ACADEMIES PRESS     500 Fifth Street, NW     Washington, DC 20001

NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.

This project was supported by the National Security Agency under contract number NSA H98230-09-C-0407. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the organizations or agencies that provided support for the project.

International Standard Book Number 13: 978-0-309-28778-4
International Standard Book Number 10: 0-309-28778-2
Library of Congress Control Number: 2013944743

Cover: Image courtesy of Jonathan Bachrach, University of California, Berkeley.

Additional copies of this report are available from the National Academies Press, 500 Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313; http://www.nap.edu.

Suggested citation: National Research Council. 2013. Frontiers in Massive Data Analysis. Washington, D.C.: The National Academies Press.

Copyright 2013 by the National Academy of Sciences. All rights reserved.

Printed in the United States of America


THE NATIONAL ACADEMIES

Advisers to the Nation on Science, Engineering, and Medicine

The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Ralph J. Cicerone is president of the National Academy of Sciences.

The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. C. D. Mote, Jr., is president of the National Academy of Engineering.

The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Harvey V. Fineberg is president of the Institute of Medicine.

The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy’s purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Ralph J. Cicerone and Dr. C. D. Mote, Jr., are chair and vice chair, respectively, of the National Research Council.

www.national-academies.org


COMMITTEE ON THE ANALYSIS OF MASSIVE DATA

MICHAEL I. JORDAN, University of California, Berkeley, Chair

KATHLEEN M. CARLEY, Carnegie Mellon University

RONALD R. COIFMAN, Yale University

DANIEL J. CRICHTON, Jet Propulsion Laboratory

MICHAEL J. FRANKLIN, University of California, Berkeley

ANNA C. GILBERT, University of Michigan

ALEX G. GRAY, Georgia Institute of Technology

TREVOR J. HASTIE, Stanford University

PIOTR INDYK, Massachusetts Institute of Technology

THEODORE JOHNSON, AT&T Labs Research

DIANE LAMBERT, Google, Inc.

DAVID MADIGAN, Columbia University

MICHAEL W. MAHONEY, Stanford University

F. MILLER MALEY, Institute for Defense Analyses

CHRISTOPHER OLSTON, Google, Inc.

YORAM SINGER, Google, Inc.

ALEXANDER SANDOR SZALAY, Johns Hopkins University

TONG ZHANG, Rutgers, The State University of New Jersey

Staff

SUBHASH KUVELKER, Study Director (until October 17, 2011)

SCOTT WEIDMAN, Study Director (after October 17, 2011)

BARBARA WRIGHT, Administrative Assistant

 


COMMITTEE ON APPLIED AND THEORETICAL STATISTICS

CONSTANTINE GATSONIS, Brown University, Chair

MONTSERRAT FUENTES, North Carolina State University

ALFRED O. HERO III, University of Michigan

DAVID M. HIGDON, Los Alamos National Laboratory

IAIN JOHNSTONE, Stanford University

ROBERT E. KASS, Carnegie Mellon University

JOHN LAFFERTY, University of Chicago

XIHONG LIN, Harvard University

SHARON-LISE T. NORMAND, Harvard Medical School

GIOVANNI PARMIGIANI, Dana-Farber Cancer Institute

RAGHU RAMAKRISHNAN, Microsoft Corporation

ERNEST SEGLIE, Office of the Secretary of Defense (retired)

LANCE WALLER, Emory University

EUGENE WONG, University of California, Berkeley

Staff

MICHELLE SCHWALBE, Director

BARBARA WRIGHT, Administrative Assistant

 


BOARD ON MATHEMATICAL SCIENCES AND THEIR APPLICATIONS

DONALD G. SAARI, University of California, Irvine, Chair

GERALD G. BROWN, U.S. Naval Postgraduate School

LOUIS ANTHONY COX, JR., Cox Associates, Inc.

BRENDA L. DIETRICH, IBM T.J. Watson Research Center

CONSTANTINE GATSONIS, Brown University

DARRYLL HENDRICKS, UBS Investment Bank

ANDREW W. LO, Massachusetts Institute of Technology

DAVID MAIER, Portland State University

JAMES C. McWILLIAMS, University of California, Los Angeles

JUAN MEZA, University of California, Merced

JOHN W. MORGAN, Stony Brook University

VIJAYAN N. NAIR, University of Michigan

CLAUDIA NEUHAUSER, University of Minnesota, Rochester

J. TINSLEY ODEN, University of Texas, Austin

FRED ROBERTS, Rutgers, The State University of New Jersey

J.B. SILVERS, Case Western Reserve University

CARL P. SIMON, University of Michigan

EVA TARDOS, Cornell University

KAREN L. VOGTMANN, Cornell University

BIN YU, University of California, Berkeley

Staff

SCOTT WEIDMAN, Director

NEAL GLASSMAN, Senior Program Officer

MICHELLE SCHWALBE, Program Officer

BARBARA WRIGHT, Administrative Assistant

BETH DOLAN, Financial Associate


Acknowledgments

This report has been reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise, in accordance with procedures approved by the National Research Council’s Report Review Committee. The purpose of this independent review is to provide candid and critical comments that will assist the institution in making its published report as sound as possible and to ensure that the report meets institutional standards for objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process. We wish to thank the following individuals for their review of this report:

Amy Braverman, Jet Propulsion Laboratory,

John Bruning, Corning Tropel Corporation (retired),

Jeffrey Hammerbacher, Cloudera,

Iain Johnstone, Stanford University,

Larry Lake, University of Texas,

Richard Sites, Google, Inc., and

Hal Stern, University of California, Irvine.

Although the reviewers listed above have provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations nor did they see the final draft of the report before its release. The review of this report was overseen by Michael Goodchild of the University of California, Santa Barbara. Appointed by the National Research Council, he was responsible for making certain that an independent examination of this report was carried out in accordance with institutional procedures and that all review comments were carefully considered. Responsibility for the final content of this report rests entirely with the authoring committee and the institution.

The committee also acknowledges the valuable contribution of the following individuals, who provided input at the meetings on which this report is based or through other communications:

Léon Bottou, NEC Laboratories,

Jeffrey Dean, Google, Inc.,

John Gilbert, University of California, Santa Barbara,

Jeffrey Hammerbacher, Cloudera,

Patrick Hanrahan, Stanford University,

S. Muthu Muthukrishnan, Rutgers, The State University of New Jersey,

Ben Shneiderman, University of Maryland,

Michael Stonebraker, Massachusetts Institute of Technology, and

J. Anthony Tyson, University of California, Davis.

Data mining of massive data sets is transforming the way we think about crisis response, marketing, entertainment, cybersecurity, and national intelligence. Collections of documents, images, videos, and networks are being thought of not merely as bit strings to be stored, indexed, and retrieved, but as potential sources of discovery and knowledge, requiring sophisticated analysis techniques that go far beyond classical indexing and keyword counting, aiming to find relational and semantic interpretations of the phenomena underlying the data.

Frontiers in Massive Data Analysis examines the frontier of analyzing massive amounts of data, whether in a static database or streaming through a system. Data at that scale, terabytes and petabytes, is increasingly common in science (e.g., particle physics, remote sensing, genomics), Internet commerce, business analytics, national security, communications, and elsewhere. The tools that work to infer knowledge from data at smaller scales do not necessarily work, or work well, at such massive scale. New tools, skills, and approaches are necessary, and this report identifies many of them, plus promising research directions to explore. Frontiers in Massive Data Analysis discusses pitfalls in trying to infer knowledge from massive data, and it characterizes seven major classes of computation that are common in the analysis of massive data. Overall, this report illustrates the cross-disciplinary knowledge (from computer science, statistics, machine learning, and application disciplines) that must be brought to bear to make useful inferences from massive data.
