lems related to dieting (de la Rocha, 1986). Men who successfully handicapped horse races could not apply the same skill to securities in the stock market (Ceci and Liker, 1986; Ceci, 1996).
In short, the available cross-cultural literature suggests that variations from the cultural norms embedded in tests and testing situations may significantly influence the judgments about intellectual ability and performance that result from their use. Researchers have documented how these sociocultural contexts in the homes of different ethnic, racial, and linguistic groups in the United States can vary significantly from those of mainstream homes (Goldenberg et al., 1992; Heath, 1983, 1989). In light of differences in the fit between home and school culture for many minority children and differences in the school experiences provided (see Chapter 5), these results bear directly on IQ testing of minority children.
In contrast to the cross-cultural and sociocultural research just described, work from a psychometric framework has centered on the issue of test bias. As early as the mid-1970s, questions were raised about the effects of cultural differences on standardized tests and their interpretation (Mercer, 1973a). Some researchers have considered the long-standing patterns of disproportionate representation of certain racial, ethnic, and English language learner groups in special education as de facto evidence of test bias (Bermudez and Rakow, 1990; Hilliard, 1992; Patton, 1992). The general argument has been that the content, structure, format, or language of standardized tests tends to be biased in favor of individuals from mainstream or middle- and upper-class backgrounds. Miller (1997) argues that all measures of intelligence are culturally grounded because performance depends on individual interpretations of the meaning of situations and their background presuppositions, rather than on pure g.
A contrasting approach to test bias is based on a more statistical or psychometric view. That is, a test is considered biased if quantitative indicators of validity differ for different groups (Jensen, 1980). A common procedure has been to conduct item analyses of specific tests to examine construct validity. A test would be considered biased if there is a significant “item-by-group interaction,” that is, if a specific item deviates significantly from the overall profile for any group. Several researchers have concluded that there is no evidence of test bias using such procedures (Jensen, 1974; Sandoval, 1979), a view that was embraced by a 1982 National Research Council (NRC) committee. Other investigators have noted, however, that cultural factors may serve to depress the scores of a particular group in a more generalized or comprehensive fashion, so that individual items would not stand out, even though cultural