For more than three decades, under Title I of the Elementary and Secondary Education Act of 1965, program evaluation through large-scale testing has been an integral part of federal support for the education of low-achieving children in poor neighborhoods. The minimum competency testing movement, beginning in the 1970s, gave large-scale, standardized achievement tests a visible and popular role in holding students (and sometimes schools) accountable. Such tests are widely used in decisions about promotion and graduation; their role in tracking—that is, assigning students to a course of study based on perceived achievement or skill level—is less clear. Tracking decisions are usually made at the school level, based on multiple sources of evidence.
By the mid-1980s, 33 states had mandated some form of minimum competency testing (Office of Technology Assessment, 1992). A decade later, 18 states had test-based requirements for high school graduation (Bond et al., 1996). In many states, both schools and students are held accountable for achievement-test performance. Almost all states administer standardized assessments in several core areas and report findings at the school level; in most states, these findings are supplemented by state-representative samples from the National Assessment of Educational Progress (NAEP). In almost half the states, students' test performance can have serious consequences for their schools, including funding gains or losses, loss of autonomy or accreditation, and even external takeover (Bond et al., 1996). In some places, like Chicago, the same achievement test is used both to hold schools accountable and to make individual student promotion decisions.
The political debate about the proposed voluntary national tests (VNT) has focused on the inevitable tensions between uniform national standards and traditions of state and local school governance. But other important questions about the VNT proposal have been raised: Do we need new tests to hold American students to uniform high standards, or could the results of existing tests be reported in a common metric? The VNT proposal calls for public release of all test items soon after each test is administered; can new tests be developed each year that meet high technical demands for validity, reliability, fairness, and comparability? How should the VNT or similar tests be designed in order to measure achievement accurately and encourage higher academic performance by all students? How can potential misuses of the VNT or other tests be identified, remedied, and prevented?