that student athletes apparently studied harder in school and took courses that would prepare them better for college. This is a potentially important positive exemplar of test use because the introduction of higher standards through testing parallels broader proposals for standards-based educational reform—including some of the hopes for VNTs.
History provides equally striking examples of the actual or potential misuse of standardized tests to make decisions about individuals. Unhappy with the increasing numbers of immigrants living in New York City, the president of Columbia University in 1917 embraced the use of the Thorndike Tests for Mental Alertness "to limit the number of Jewish students without a formal policy of restriction" (Crouse and Trusheim, 1988:20). In one well-known California case (Larry P. v. Riles, 1984), the court found that inadequately validated IQ tests had been used to discriminate against black schoolchildren, who were assigned disproportionately to classes for the educable mentally retarded, and that California's classes for such students were often an educational dead end. In a Florida case, the state was enjoined from using a high school graduation test because black students, forced to attend segregated, inferior schools, had not been taught the material covered in the test (Debra P. v. Turlington, 1981). And in Rockford, Illinois, testing was recently used to rationalize the assignment of some black high school students to lower tracks, even when their test scores were higher than the scores of some whites assigned to higher tracks (People Who Care v. Rockford Board of Education, 1997).
The case of Debra P. offers an especially clear illustration of a crucial distinction between appropriate and inappropriate test use. Is it ever appropriate to test students on material they have not been taught? Yes, if the test is used to find out whether the schools are doing their job. But if that same test is used to hold students "accountable" for the failure of the schools, most testing professionals would find such use inappropriate. It is not the test itself that is the culprit in the latter case; results from a test that is valid for one purpose can be used improperly for other purposes.
In the examples above, it seems easy with the advantage of hindsight to identify the appropriate and inappropriate uses of tests. In practice, the distinction is often not at all obvious, and the judgment may well depend on the position of the observer. Some population groups see their low scores on achievement tests as a stigmatizing and discriminatory obstacle to educational progress. Other groups, with high scores on the same tests, view