A primary goal of value-added modeling is to make causal inferences by identifying the component of a student’s test score trajectory that can be credibly associated with a particular teacher, school, or program. In other words, the purpose is to determine how the achievement of students in their assigned school, classroom, or program differs from what would have been observed had they been taught in another school, by another teacher, or in the absence of the program. This is often referred to as the estimation of counterfactual quantities: for example, the expected outcomes for students taught by teacher A had they instead been taught by teacher B, and vice versa.
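One common way to write such a counterfactual comparison uses potential-outcomes notation. The sketch below is offered only as an illustration; the symbols Y_i(A), Y_i(B), and T_i are assumptions introduced here and are not notation from the workshop.

```latex
% Hypothetical notation (requires amsmath for \text):
%   Y_i(t) : the score student i would obtain if taught by teacher t
%   T_i    : the teacher to whom student i was actually assigned
\[
\underbrace{E\bigl[\,Y_i(A) \mid T_i = A\,\bigr]}_{\text{observed}}
\;-\;
\underbrace{E\bigl[\,Y_i(B) \mid T_i = A\,\bigr]}_{\text{counterfactual}}
\]
```

The first term can be estimated directly from the scores of teacher A's students. The second cannot be observed, because those students were not in fact taught by teacher B, so it has to be inferred under the assumptions of whatever model is used.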
The ideal research design for obtaining evidence about effectiveness is one in which students are randomly assigned to schools, teachers, or programs. With random assignment and sufficiently large samples, differences in achievement among schools, teachers, or programs can be directly estimated and inferences drawn regarding their relative effectiveness. However, in the real world of education, random assignment is rarely possible or even desirable. There are many ways that classroom assignments depart from randomness, and some are quite purposeful (e.g., matching individual students to teachers’ instructional styles). Different schools and teachers often serve very different student populations, and programs are typically targeted at particular groups of students, so straightforward comparisons may be neither fair nor useful.
As workshop presenter Dale Ballou explained, to get around the problem of nonrandom assignment, value-added models adjust for preexisting differences among students using their starting levels of achievement. Sometimes a gain score model is used, so the outcome measure is students’ growth from their own starting point a year prior; sometimes prior achievement is included as a predictor or control variable in a regression or analysis of covariance; and some models use a more extensive history of student test scores as control variables, as in William Sanders’s work.
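The three kinds of adjustment described above can be compared on simulated data. The following is a minimal sketch under strong assumptions, not any presenter's actual model: two hypothetical teachers, a known teacher effect, assignment that depends only on the prior-year score, and a current score that is the prior-year score plus a constant gain, the teacher effect, and noise. All function names, variable names, and numeric values are invented for illustration.

```python
import numpy as np


def simulate(n=20000, effect=3.0, seed=0):
    """Simulate scores with nonrandom assignment (all values are illustrative)."""
    rng = np.random.default_rng(seed)
    score_2yr = rng.normal(50.0, 10.0, n)             # score two years ago
    score_1yr = score_2yr + rng.normal(2.0, 4.0, n)   # score one year ago
    # Nonrandom assignment: students with higher prior scores are more
    # likely to be placed with hypothetical teacher B.
    p_b = 1.0 / (1.0 + np.exp(-(score_1yr - 52.0) / 5.0))
    teacher_b = (rng.random(n) < p_b).astype(float)
    # Assumed data-generating process: prior score + common gain + teacher effect.
    current = score_1yr + 5.0 + effect * teacher_b + rng.normal(0.0, 4.0, n)
    return score_2yr, score_1yr, teacher_b, current


def teacher_coef(outcome, teacher_b, controls):
    """Least-squares coefficient on the teacher indicator, given control variables."""
    X = np.column_stack([np.ones_like(outcome), teacher_b, *controls])
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return beta[1]


def compare_estimates(score_2yr, score_1yr, teacher_b, current):
    """Return the teacher B minus teacher A estimate under each approach."""
    return {
        # Unadjusted difference in mean current scores.
        "raw": current[teacher_b == 1].mean() - current[teacher_b == 0].mean(),
        # Gain score model: outcome is growth from the prior-year score.
        "gain": teacher_coef(current - score_1yr, teacher_b, []),
        # Prior achievement as a control variable (analysis of covariance).
        "covariate": teacher_coef(current, teacher_b, [score_1yr]),
        # A longer test score history as control variables.
        "history": teacher_coef(current, teacher_b, [score_1yr, score_2yr]),
    }


if __name__ == "__main__":
    for name, value in compare_estimates(*simulate()).items():
        print(f"{name:>10}: {value:.2f}")
```

In this simulation the unadjusted comparison overstates the effect of teacher B, because that teacher's students started out ahead, while the three adjusted estimates land near the simulated effect. That agreement is built into the simulated data: assignment here depends on nothing except an observed prior score, which real assignment processes need not satisfy.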
Many researchers believe that controlling for students’ prior achievement is not enough—that more needs to be done to statistically adjust for differences between the groups of students assigned to different schools, teachers, or programs. That is, the question is whether the test score history incorporated into the model is sufficient to account for differences among students on observed—and unobserved (e.g., systematic differences in