recommend that the department broaden the scope of its GATB improvement plan to encompass a wider range of technological options. Against this general backdrop, we turn to specific elements of the GATB improvement plan.

THE GATB IMPROVEMENT PLAN: SPECIFIC EVALUATION AND RECOMMENDATIONS

In the absence of more evidence regarding the quality of the products to emerge from the improvement program, the board is reluctant to offer a final determination on its technical adequacy. Responses to the department's requests for proposals and preliminary outlines and findings from research in progress, while helpful, would fall short of the data required to fully evaluate the adequacy of the program. Nevertheless, with the information provided, the board offers comments and suggestions on specific topics.

GATB Validity Research

The improvement plan addresses validity research separately from research on reducing score differences among racial and ethnic groups. However, in keeping with the NRC committee report, these issues should not be conceptually separated.

With respect to validity research, the improvement plan calls for various issue papers, validation studies, and research on validation methods. The issue papers address job clustering and score weighting, the assessment implications of the Dictionary of Occupational Titles, and the assessment of job performance. The board finds that these topics should remain in the department's research program, but only if the department is reasonably confident that the costs of that research will be justified by sufficient improvements in validity and reduction of adverse impact.

As outlined in the improvement plan, the validation research activities appear too narrowly focused on the current version of GATB and on a unidimensional criterion measure. One potentially fruitful avenue for research would be to explore the implications of recent (and ongoing) job performance measurement research. For example, research on job performance methods in the military could hold important lessons for the measurement of job performance in other jobs and occupations. GATB validation research could be much improved by emphasizing expansion of the predictor battery, including hands-on tasks, simulations, and other methods of assessing individual competencies.

Another area for research is the magnitude of the validity coefficients that would be needed to meet the legal standard of “business necessity” under the Civil Rights Act. This would link validity research to questions of test utility and public policy. The choice of acceptable validity levels, which can depend on the nature of jobs, is largely a matter of policy. Evaluation of the trade-off between predictive validity and adverse impact could be informed by the development of a taxonomy of jobs and occupations, categorized in terms of this trade-off.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.



Evaluation of the U.S. Employment Service Workplan for the GATB Improvement Project

Reduction of GATB Score Differences Among Racial and Ethnic Groups

The board finds that reduction of score differences that do not reflect performance differences and, more generally, research on testing methods that provide better representations of the performance capabilities of all groups, are critical to the future of GATB specifically and to employment testing generally. The basic rationale for the board's emphasis on this line of research has already been stated.

The board endorses the department's intention, as stated in the improvement plan, to consider new approaches to assessment (and reliance on additional sources of data, such as biographical information, measurements of work values and attitudes, etc.) and to view the GATB as part of a more comprehensive assessment program “in which the GATB is only one of a series of instruments used for assessment purposes.” Thus, papers and reviews that address possible modifications to the structure and format of the GATB, that identify methods and approaches for alternative assessments, and that investigate the inclusion of alternative data should be initiated and encouraged. (Alternative methods of scoring the GATB, such as “banding,” should be included.) The board recommends that the department monitor the progress of these efforts closely, to ascertain the likelihood of substantial improvements.

An important rationale for this research is that the GATB (and similar tests) may cause employers (and society) to overlook and underutilize talent in the applicant pool. As the 1989 NRC report made clear, modest predictive validity (of any test) means that many qualified applicants are rejected; moreover, misclassification errors fall disproportionately on qualified minority applicants, who are therefore unfairly screened out of employment opportunities.

Maintaining the GATB

As the improvement plan correctly notes, the 1989 report made several recommendations regarding maintenance of GATB. However, these recommendations were made on the assumption that score adjustment methods would reduce adverse impact. The prohibition against score adjustments provides a new context in which to consider the utility of the various maintenance activities listed in the improvement plan. Further investigations into test speededness, for example, should be considered in the context of whether there is reason to believe that changes would bring about significant improvements in validity or reductions in adverse impact. Test security, on the other hand, remains a significant concern if, in fact, the GATB is used as a major determinant of employee referrals or selection. Aesthetics and computerized tracking of test materials appear to be less fruitful targets of department research funds now that more basic questions about validity and adverse impact will require substantial attention.