construct (Willingham, 1998). In contexts in which tests are used to make predictions of subsequent performance (e.g., grades), fairness also requires comparability of predictions for different groups. The latter concern is particularly important in the case of tests used for placement, such as tracking and some types of promotion decisions. For such uses, there should be evidence that the relationships between scores on the test and subsequent performance in certain tracks or at a certain grade level are comparable from group to group. 1
In conclusion, what needs to be comparable across groups and settings for fair test use is score meaning and the actions that follow. That is, test fairness derives from comparable construct validity (which may draw on all six aspects of validity discussed earlier). These issues of fairness surrounding test use are explored in greater detail in Chapters 5, 6, and 7.
The issue of testing standards is not new, and there have been a number of useful documents over the years attempting to codify the principles of good practice. The most recent efforts bearing on the educational uses of tests include the Standards for Educational and Psychological Testing of the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education (1985), currently under revision; the Code of Fair Testing Practices in Education (Joint Committee on Testing Practices, 1988); Responsibilities of Users of Standardized Tests (Association for Measurement
Considerable attention has been given to developing fair selection models in the context of college admissions and job entry. These models put a heavy emphasis on predictive validity (the extent to which test scores predict some desired future performance) but at the expense of other aspects of construct validity. In one way or another, all of the fair selection models address the possibility of differences in the predictor-criterion relationship for different groups (Cleary, 1968; Cole, 1973; Linn, 1973; Thorndike, 1971). With the recognition that fundamental value differences are at issue in fair selection, several utility models were developed that go beyond these selection models in that they require specific value positions to be articulated (Cronbach, 1976; Gross and Su, 1975; Petersen and Novick, 1976; Sawyer et al., 1976). In this way, social values are incorporated explicitly into the measurement technology involved in selection models. The need to make values explicit does not, however, determine or make easier the hard choices among them.
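To make the idea of a group difference in the predictor-criterion relationship concrete, the following is a minimal sketch of one simple regression-style check. It is not an implementation of any of the cited models in full. It fits a single pooled line predicting the criterion (for example, later grades) from test scores, then reports the average prediction error for each group. The function names, group labels, and data are hypothetical and made up purely for illustration.

```python
from statistics import mean


def fit_line(scores, outcomes):
    """Ordinary least-squares fit: outcome ~ intercept + slope * score."""
    mx, my = mean(scores), mean(outcomes)
    sxx = sum((x - mx) ** 2 for x in scores)
    sxy = sum((x - mx) * (y - my) for x, y in zip(scores, outcomes))
    slope = sxy / sxx
    return my - slope * mx, slope


def mean_prediction_error(groups):
    """Mean residual per group under one pooled regression line.

    `groups` maps a group label to a (scores, outcomes) pair.
    A positive value means the pooled line under-predicts that
    group's outcomes on average; a negative value means it over-predicts.
    """
    all_scores = [x for scores, _ in groups.values() for x in scores]
    all_outcomes = [y for _, outcomes in groups.values() for y in outcomes]
    intercept, slope = fit_line(all_scores, all_outcomes)
    return {
        label: mean(y - (intercept + slope * x) for x, y in zip(scores, outcomes))
        for label, (scores, outcomes) in groups.items()
    }


# Made-up data: same score-outcome slope in both groups, different intercepts.
example_groups = {
    "group_a": ([1, 2, 3, 4], [1.5, 2.0, 2.5, 3.0]),
    "group_b": ([1, 2, 3, 4], [1.0, 1.5, 2.0, 2.5]),
}
errors = mean_prediction_error(example_groups)
```

Mean errors near zero for every group would be consistent with comparable predictions. Systematic errors of opposite sign, as in this made-up data, suggest that a single prediction line treats the groups differently. Such a check speaks only to predictive comparability. It does not settle the value choices the utility models require to be made explicit.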