"4 Factors That Affect the Accuracy of NAEP's Estimates of Achievement." Keeping Score for All: The Effects of Inclusion and Accommodation Policies on Large-Scale Educational Assessments. Washington, DC: The National Academies Press, 2004.
NAEP SAMPLING PROCEDURES
Because NAEP is designed to provide estimates of the performance of large groups of students in more than five separate subject areas and at three different stages of schooling, it would not be practical to test all of the students about whom data are sought in all subjects. Not only would each student be subjected to a prohibitively large amount of testing time in order to cover all of the targeted subject matter, but schools would also be unacceptably disrupted by such a burden. The solution is to assess only a fraction of the nation’s students, evaluating each participating student on only a portion of the targeted subject matter. In order to be sure all of the material in each subject area is covered, developers design the assessment in blocks, each representing only a portion of the material specified in the NAEP framework for that subject. These blocks are administered according to a matrix sampling procedure, through which each student takes only two or three blocks in a variety of combinations. Statistical procedures are then used to link these results and project the performance to the broader population of the nation’s students (U.S. Department of Education, National Center for Education Statistics, and Office of Educational Research and Improvement, 2001).
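The idea behind matrix sampling can be illustrated with a minimal sketch. It is not NAEP's operational booklet design. The block labels, student identifiers, and the function names `assign_booklets` and `block_coverage` are hypothetical, and the sketch assumes that every pairing of blocks forms one booklet and that booklets are handed out in rotation.

```python
from itertools import combinations, cycle


def assign_booklets(student_ids, blocks, blocks_per_student=2):
    """Give each student one combination of blocks, rotating through all combinations.

    Illustrative assumption: every possible combination of blocks is a valid
    booklet. An operational assessment would use a more constrained design.
    """
    booklets = list(combinations(blocks, blocks_per_student))
    # zip stops when the students run out; cycle repeats the booklets as needed.
    return dict(zip(student_ids, cycle(booklets)))


def block_coverage(assignment):
    """Count how many students were administered each block."""
    counts = {}
    for booklet in assignment.values():
        for block in booklet:
            counts[block] = counts.get(block, 0) + 1
    return counts


# Hypothetical data: four blocks covering one subject, twelve sampled students.
blocks = ["B1", "B2", "B3", "B4"]
students = [f"S{i:02d}" for i in range(12)]

assignment = assign_booklets(students, blocks, blocks_per_student=2)
coverage = block_coverage(assignment)

for student, booklet in assignment.items():
    print(student, booklet)
print(coverage)
```

In this sketch no student sees more than two of the four blocks, yet every block is taken by the same number of students, which is what allows results to be linked across blocks and projected to the full group.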
NAEP’s estimates of proficiency are based on scientific samples of the population of interest, such as fourth grade students nationwide. In other words, the percentage of students in the total group of fourth graders who fall into each of the categories about which data are sought—such as girls, boys, members of various ethnic groups, and residents of urban, rural, or suburban areas—is calculated. A sample—a much smaller number of children—can then be identified whose proportions approximate those of the target population. Data are collected about other kinds of characteristics as well, including such information as parents’ education levels, the type of school in which students are enrolled (public/private, large/small), and whether students have disabilities or are English language learners. In this way, NAEP reports can provide answers to a wide variety of questions about the percentages of students in each of a variety of groups, the relative performance of different groups, and the relationships among achievement and a wide variety of academic and background characteristics.
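As a rough illustration of drawing a sample whose proportions approximate those of the target population, the following sketch uses simple proportional allocation across a single characteristic. It is an assumption-laden simplification, not NAEP's sampling procedure. The population counts, the `region` field, and the function name `proportional_sample` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def proportional_sample(population, key, sample_size, seed=0):
    """Draw a sample whose group proportions approximate the population's.

    Each group (stratum) receives a share of the sample proportional to its
    share of the population. Because shares are rounded, the total drawn can
    differ slightly from sample_size when the shares are not whole numbers.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[key(person)].append(person)

    total = len(population)
    sample = []
    for members in strata.values():
        n = round(sample_size * len(members) / total)
        sample.extend(rng.sample(members, min(n, len(members))))
    return sample


# Hypothetical population of 1,000 fourth graders classified by type of area.
population = (
    [{"id": i, "region": "urban"} for i in range(600)]
    + [{"id": i, "region": "suburban"} for i in range(600, 900)]
    + [{"id": i, "region": "rural"} for i in range(900, 1000)]
)

sample = proportional_sample(population, key=lambda p: p["region"], sample_size=100)
print(Counter(p["region"] for p in sample))
```

With these hypothetical counts, the 60/30/10 split of the population across urban, suburban, and rural areas is reproduced in the 100-student sample. Stratifying on several characteristics at once would require grouping on a combination of fields rather than a single one.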
The sampling for NAEP is based on data received from schools about their students’ characteristics as well as other factors. The selection of students in each school identified for NAEP participation is crucial to the representativeness of the overall sampling and the resulting estimates of performance. Local administrators are given lists of students who are to participate and instructions as to what adjustments to this list are permitted in response to absences and other factors that may affect participation. However, in the case of both students with disabilities and English language learners, which students ultimately remain in the sample depends in part on decisions made at the local level. These decisions are discussed in greater detail below.