finished form by nature. Questions determine the types of information that will be gathered, and many aspects of data coding and structuring also depend on the questions asked.

Data are inherently abstract, as they are observations that stand for concrete events. Data may take many forms: a linear distance may be represented by a number of standard units, a video recording can stand in for an observation of human interaction, or a reading on a thermometer may represent a sensation of heat.

Collecting data often requires tools, and students frequently have a fragile grasp of the relationship between an event of interest and the operation or output of a tool used to capture data about that event. Whether the tool is a microscope, a pan balance, or a simple ruler, students need help understanding the purpose of the tool and of measurement. Some students, for example, accustomed to relying on sensory observations of “felt weight,” may find a pan balance confusing, because they do not, at first, understand the value of using one object to determine the weight of another.

Data do not come with an inherent structure. Rather, a structure must be imposed on data. Scientists and students impose structure by selecting categories with which to describe and organize the data. Young learners often fail to grasp this, however, as evidenced by their tendency to believe that new questions can be addressed only with new data. They rarely think of querying existing data sets to explore questions that had not been conceived when the data were collected. For example, earlier we described a biodiversity unit in which children cataloged a number of species in a woodlot adjacent to their school. The data generated in this activity could later be queried to determine the spread of a given population or which species of plants and animals tend to cluster together in certain areas of the woodlot.
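
The idea of putting new questions to data already in hand can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the species names, the plot labels, and the two helper functions are hypothetical and are not drawn from the classroom unit described above. The sketch imposes one simple structure on the catalog (a species paired with the plot where it was seen) and then asks two questions that need no new fieldwork.

```python
from collections import defaultdict

# Hypothetical catalog: (species, plot) pairs, invented for illustration only.
observations = [
    ("oak", "A"), ("fern", "A"), ("beetle", "A"),
    ("oak", "B"), ("fern", "B"),
    ("maple", "C"), ("beetle", "C"), ("oak", "C"),
]

def plots_by_species(records):
    """Question 1: in which plots was each species found (its spread)?"""
    spread = defaultdict(set)
    for species, plot in records:
        spread[species].add(plot)
    return dict(spread)

def co_occurrence(records):
    """Question 2: how many plots does each pair of species share?"""
    by_plot = defaultdict(set)
    for species, plot in records:
        by_plot[plot].add(species)
    counts = defaultdict(int)
    for species_in_plot in by_plot.values():
        ordered = sorted(species_in_plot)
        # Count every unordered pair of species seen in the same plot.
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                counts[(first, second)] += 1
    return dict(counts)

spread = plots_by_species(observations)
pairs = co_occurrence(observations)
print(sorted(spread["oak"]))   # plots where oak was recorded
print(pairs[("fern", "oak")])  # number of plots shared by fern and oak
```

Both answers come from the same list of records; only the categories used to organize them change. A different choice of structure, such as recording the date of each sighting, would open other questions and close off none of these.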

Finally, data are represented in various ways in order to see, understand, or communicate different aspects of the phenomenon being studied. For example, a bar graph of children’s heights may provide a quick visual sense of the range of heights. In contrast, a scatterplot of children’s height against their age can reveal how height changes with age, a relationship that may appear roughly linear. An important goal for students, one that extends over several years, is to come to understand the conventions and properties of different kinds of data displays. There are many different kinds of representational displays, including tables, graphs of various kinds, and distributions. Not only should students understand the procedures for generating and reading displays, but they should also be able to critique them and to grasp the advantages and disadvantages of different displays for a given purpose.
