Items for Ongoing Consideration
Data Preparation
- Elevation of the status of the data preparation and data quality stages within professional societies
- Clear articulation of what is meant by a massive data set
- Development of rigorous, theory-based methods for reduction of dimensionality
- Systematic study of how, when, and why methods used with small and medium-sized data sets break down with large data sets; understanding of how far current methods, both statistical and computational, can be pushed; articulation of the variety of models that might be useful
- Development of methods for integration of tools and techniques
- Development of specialized tools in general "packages" for non-standard (e.g., sensor-based) data
- Establishment of better links between statistics and computer science
- Exploration of the use of "infinite" data sets to stimulate methods for massive data sets
- Creation of a richer language for describing structure in data
- Educational opportunities, both for nonstatisticians who use some statistical techniques and for statisticians, to broaden the knowledge base and provide better links to computer science
Models and Data Presentation Research Issues
- Discovery and comparison of homogeneous groups
- Communication and display of variability and bias in models
- Better design of hierarchical visual displays
- New modeling metaphors and a richer class of presentation approaches
- Methods to help "generalize" and "match" local models (e.g., automated agents)
- Robust or multiple models; sequential and dynamic models