National Academies Press: OpenBook

Frontiers in Massive Data Analysis (2013)

Chapter: Appendix B: Biographical Sketches of Committee Members

Suggested Citation:"Appendix B: Biographical Sketches of Committee Members." National Research Council. 2013. Frontiers in Massive Data Analysis. Washington, DC: The National Academies Press. doi: 10.17226/18374.

B

Biographical Sketches of Committee Members

MICHAEL I. JORDAN, Chair, is the Pehong Chen Distinguished Professor in the Department of Electrical Engineering and Department of Statistics at the University of California, Berkeley. He received a B.S. in psychology in 1978 from Louisiana State University, an M.S. in mathematics in 1980 from Arizona State University, and a Ph.D. in cognitive science in 1985 from the University of California, San Diego. His research interests are in the field of statistical machine learning, a field that bridges computation and statistics, with ties to information theory, signal processing, algorithms, control theory and optimization theory. One area of his research focus has been probabilistic graphical models, which blend probability theory and graph theory to represent uncertainty on interdependent collections of random variables. He developed new graphical model architectures that have had impact in various applied fields, including bioinformatics, computational vision, speech, natural language processing and information retrieval, and has contributed to the development of a new framework for inference in graphical models based on variational representations of probability distributions. Another area of focus has been nonparametric inference, including both Bayesian nonparametrics, where he developed new models based on the area of stochastic processes known as completely random measures, and frequentist nonparametrics, where he focused on kernel machines, spectral methods, dimension reduction and classification. He has also been interested in developing applications of machine learning to problems in distributed computer systems. In 2010, Dr. Jordan was elected to both the National Academy of Sciences and the National Academy of Engineering.

KATHLEEN M. CARLEY is a professor in the School of Computer Science at Carnegie Mellon University (CMU). She is the director of the Center for Computational Analysis of Social and Organizational Systems, a university-wide interdisciplinary center that brings together network analysis, computer science and organization science and has an associated National Science Foundation (NSF)-funded training program for Ph.D. students. Her research combines cognitive science, social networks, and computer science to address complex social and organizational problems. Her specific research areas are dynamic network analysis, computational social and organization theory, adaptation and evolution, text mining, and the impact of telecommunication technologies and policy on communication, information diffusion, disease contagion, and response within and among groups, particularly in disaster or crisis situations. She and her team have developed infrastructure tools for analyzing large-scale dynamic networks and various multi-agent simulation systems. The infrastructure tools include ORA, a statistical toolkit for analyzing and visualizing multi-dimensional networks. Another tool is AutoMap, a text-mining system for extracting semantic networks from texts and then cross-classifying them using an organizational ontology into the underlying social, knowledge, resource, and task networks. She is the founding co-editor of Computational Organization Theory and has co-edited several books in the computational organizations and dynamic network area.

RONALD R. COIFMAN is a professor of mathematics and computer science at Yale University. His research interests include nonlinear Fourier analysis, wavelet theory, singular integrals, numerical analysis and scattering theory, real and complex analysis, and new mathematical tools for efficient computation and transcriptions of physical data, with applications to numerical analysis, feature extraction recognition, and de-noising. He is currently developing analysis tools for spectrometric diagnostics and hyperspectral imaging. Dr. Coifman is a member of the American Academy of Arts and Sciences and the National Academy of Sciences. He is a recipient of the 1996 DARPA Sustained Excellence Award, the 1996 Connecticut Science Medal, the 1999 Pioneer Award of the International Society for Industrial and Applied Science, and the 1999 National Medal of Science.

DANIEL J. CRICHTON is a principal computer scientist and program manager for the Earth Science Data System and Technology Directorate and the Solar System Exploration Directorate at NASA’s Jet Propulsion Laboratory (JPL), where he provides leadership in the development of large-scale, scientific data systems for planetary, Earth, and other data-intensive technology projects. He has served in numerous roles including as principal investigator supporting the research and implementation of novel approaches for dealing with the capture, management, distribution, and analysis of massive scientific data. He conceived of and built an open-source software framework to enable large-scale data management and sharing of scientific data across organizations that has been accepted into the Apache Software Foundation. He has served on a number of committees for NASA, the National Institutes of Health, and other agencies. He has authored more than 100 book chapters and papers on the topic of data-intensive systems. He has a B.S. in information and computer science from the University of California, Irvine, and an M.S. in computer science from the University of Southern California.

MICHAEL J. FRANKLIN is the Thomas M. Siebel Professor of Computer Science at the University of California, Berkeley, specializing in large-scale data management applications and infrastructure. Dr. Franklin is a member of the Database and Operating Systems and Networking Technology groups. He is director of the Algorithms, Machines, and People Laboratory (AMPLab), where he collaborates with students, postdoctoral researchers, and faculty who specialize in cloud computing, statistical machine learning, networking, and other important topics necessary for building scalable data-intensive systems. He is a co-founder of Truviso, a high-performance analytics software company in Foster City, California.

ANNA C. GILBERT is a professor in the Department of Mathematics at the University of Michigan. She has an S.B. degree from the University of Chicago and a Ph.D. from Princeton University, both in mathematics. In 1997 Dr. Gilbert was a postdoctoral fellow at Yale University. From 1998 to 2004 she was a member of technical staff at AT&T Labs-Research in Florham Park, New Jersey. Her research interests include analysis, probability, networking, and algorithms, and she is especially interested in randomized algorithms with applications to harmonic analysis, signal and image processing, networking, and massive data sets.

ALEX GRAY is director of the Fundamental Algorithmic and Statistical Tools Laboratory (FASTlab) at the Georgia Institute of Technology. Dr. Gray received bachelor’s degrees in applied mathematics and computer science from the University of California, Berkeley, and a Ph.D. in computer science from Carnegie Mellon University. He worked in the Machine Learning Systems Group of NASA’s JPL for 6 years. FASTlab works on the problem of how to perform machine learning/data mining/statistics on massive data sets and related problems in scientific computing and applied mathematics. Employing a multidisciplinary array of technical ideas (from machine learning, nonparametric statistics, convex optimization, linear algebra, discrete algorithms and data structures, computational geometry, computational physics, Monte Carlo methods, distributed computing, and automated theorem proving), his laboratory has developed the current fastest algorithms for several fundamental statistical methods, and is in the process of developing new machine learning methods for difficult aspects of real-world data, such as in astrophysics and biology. This work has enabled high-profile scientific results that have been featured in Science and Nature. Dr. Gray has received an NSF CAREER award, two best-paper awards, and two best-paper award nominations.

TREVOR HASTIE is a professor in the Department of Statistics at Stanford University and the Division of Biostatistics of the Health, Research, and Policy Department in the Stanford School of Medicine. His main research contributions have been in the field of applied nonparametric regression and classification, and he has co-written two books in this area, Generalized Additive Models and Elements of Statistical Learning. He has also made contributions to statistical computing, co-editing a large software library on modeling tools in the S language (Statistical Models in S, 1992), which forms the basis for much of the statistical modeling in R and S-plus. His current research focuses on applied problems in biology and genomics, medicine, and industry, in particular data mining, prediction, and classification problems.

PIOTR INDYK is a professor in the Department of Electrical Engineering and Computer Science at Massachusetts Institute of Technology (MIT). He joined MIT in September 2000 after earning a Ph.D. from Stanford University. Earlier, he received a magister degree from Uniwersytet Warszawski, Poland, in 1995. Dr. Indyk’s research interests include computational geometry (especially in high-dimensional spaces), algorithms using sublinear time and/or space, and streaming algorithms. He is also interested in algorithmic coding theory and pattern-matching problems.

THEODORE JOHNSON is a research scientist in the Database Research Department at AT&T Labs-Research. He received a B.S. in mathematics from the Johns Hopkins University in 1986 and a Ph.D. in computer science from the Courant Institute of New York University in 1990. From 1990 through 1996, he was an assistant professor, and then an associate professor, in the Computer and Information Science and Engineering Department at the University of Florida. In 2004 he received an AT&T Science and Technology Award for his work on the Bellman database browser, and in 2010 he was made an AT&T fellow. He has co-authored two books, Distributed Operating Systems and Algorithms and Exploratory Data Mining and Data Cleaning.

DIANE LAMBERT is a research scientist at Google, Inc. She previously served as head of statistics and data mining research at Bell Laboratories from 1997 to 2005 and was a member of its technical staff from 1994 to 1997. She was a tenured member of the faculty at CMU from 1980 to 1986 and was also a visiting associate professor at the University of Chicago from 1984 to 1986. She has held numerous editorial and program committee positions.

DAVID MADIGAN is a professor of statistics at Columbia University. He received a B.A. in mathematical sciences (1984) and Ph.D. (1990) in statistics from Trinity College in Dublin, Ireland. He was previously the dean of physical and mathematical sciences at Rutgers, The State University of New Jersey. His numerous honors include serving as an Institute of Mathematical Statistics Medallion Lecturer, being elected a fellow of the Institute of Mathematical Statistics, and being named the “36th Most Cited Mathematician in the World, 1995-2005.”

MICHAEL MAHONEY is an engineering research associate in the Department of Mathematics at Stanford University. His research interests are algorithmic and statistical aspects of modern large-scale data analysis; design and analysis of algorithms for matrix, graph, and regression problems; statistical data analysis in large-scale scientific and Internet applications; applications to the analysis of large social and information networks; applications to DNA microarray and single nucleotide polymorphism data; and randomized algorithms for large linear algebra problems. Much of his current research focuses on geometric network analysis, developing approximate computation and regularization methods for large informatics graphs; applications in large social and information networks; and statistical data analysis of extremely large data sets. Recently, this work led to improved algorithms for two classical linear algebra problems.

F. MILLER MALEY is a researcher on the staff at the Communications Research Center (CRC), Princeton, a division of the Institute for Defense Analyses that supports National Security Agency research interests. He is also co-chair of the CRC’s SCAMP program on supercomputing. He was a visiting research fellow at Princeton University from 1987 to 1990. Dr. Maley received a B.S. in mathematics and physics from Amherst College in 1983, a Ph.D. in computer science from MIT in 1987, and a Ph.D. in mathematics from Princeton University in 1996. He is the author or co-author of 17 classified papers. His awards include the NSF Mathematical Sciences Postdoctoral Research Fellowship (1987-1990) and Office of Naval Research Graduate Fellowship (1983-1987).

CHRISTOPHER OLSTON is a staff research scientist at Google, Inc. Previously, he was a principal research scientist at Yahoo! Research. His research interest is data management, focusing especially on Web data management challenges. Dr. Olston received his Ph.D. in computer science in 2003 from Stanford University, supported by fellowships from the university and NSF. He received his bachelor’s degree in electrical engineering and computer sciences from the University of California, Berkeley, with highest honors. He has previously held teaching and research positions at Yahoo! Research, Carnegie Mellon University, Stanford University, Xerox Palo Alto Research Center, the University of California, Berkeley, and Informix Software, Inc.

YORAM SINGER is a senior research scientist at Google, Inc. Before joining Google, he was an associate professor at the School of Computer Science and Engineering of Hebrew University of Jerusalem, and before that he was a member of the technical staff at AT&T-Research. Dr. Singer received his B.Sc. and M.Sc. degrees in computer science from the Technion and his Ph.D. in computer science from Hebrew University.

ALEXANDER SANDOR SZALAY is a professor in the Department of Physics and Astronomy of Johns Hopkins University. His research interests are theoretical astrophysics and galaxy formation, as well as multicolor properties of galaxies, galaxy evolution, the large-scale power spectrum of fluctuations, gravitational lensing, pattern recognition and classification problems, the Sloan Digital Sky Survey (SDSS) project, and large scalable databases. He is a leader in the use of massive data as input for scientific research.

TONG ZHANG is a professor of statistics at Rutgers University. His research interests are machine learning, statistical and numerical computation, and design and theoretical analysis of statistical algorithms. He has worked extensively in large-scale data analysis and statistical modeling, especially in text mining, natural language processing, search, and various other Web applications. Dr. Zhang received a Ph.D. in computer science from Stanford University in 1998. After graduation, he worked at IBM T.J. Watson Research Center and then Yahoo! Research in New York.

Data mining of massive data sets is transforming the way we think about crisis response, marketing, entertainment, cybersecurity and national intelligence. Collections of documents, images, videos, and networks are being thought of not merely as bit strings to be stored, indexed, and retrieved, but as potential sources of discovery and knowledge, requiring sophisticated analysis techniques that go far beyond classical indexing and keyword counting, aiming to find relational and semantic interpretations of the phenomena underlying the data.

Frontiers in Massive Data Analysis examines the frontier of analyzing massive amounts of data, whether in a static database or streaming through a system. Data at that scale--terabytes and petabytes--is increasingly common in science (e.g., particle physics, remote sensing, genomics), Internet commerce, business analytics, national security, communications, and elsewhere. The tools that work to infer knowledge from data at smaller scales do not necessarily work, or work well, at such massive scale. New tools, skills, and approaches are necessary, and this report identifies many of them, plus promising research directions to explore. Frontiers in Massive Data Analysis discusses pitfalls in trying to infer knowledge from massive data, and it characterizes seven major classes of computation that are common in the analysis of massive data. Overall, this report illustrates the cross-disciplinary knowledge--from computer science, statistics, machine learning, and application disciplines--that must be brought to bear to make useful inferences from massive data.
