able to the program itself. Events or processes outside the program may be the real cause of the observed changes (for employment training programs, for example, outcomes may reflect changes in the broader economy). Another challenge with this type of evaluation is that the program has an incentive to select candidates with the strongest skills rather than those with the greatest need, so that it achieves the best outcomes. Often the available data do not allow the evaluator to separate the program’s effects on participants from the effects of extraneous factors, or to distinguish its effects on the broader population from its effects on a particular subpopulation. We return to these issues in subsequent chapters as we discuss the findings from our evaluation.

Generally, a program evaluation involves collecting several kinds of data using both qualitative and quantitative methods. Drawing on a wide range of data helps the evaluator identify where the results agree about a program’s effectiveness and where additional research is needed. Guidelines for conducting program evaluations are documented in The Program Evaluation Standards: How to Assess Evaluations of Educational Programs, 2nd edition (Joint Committee on Standards for Educational Evaluation, 1994). These standards describe accepted practices that reflect the consensus of practitioners in the field of program evaluation.

Evaluating Credentialing Tests

The national board’s program consists primarily of a certification assessment, and several sets of standards exist for guiding evaluations of assessment programs. The best known are the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999), the Principles for the Validation and Use of Personnel Selection Procedures (Society for Industrial and Organizational Psychology, 2003), and the Standards for the Accreditation of Certification Programs (National Commission for Certifying Agencies, 2004). In addition to the program evaluation standards, we relied on these sets of standards to formulate our framework and to guide our evaluation.

A certification test, such as the national board’s assessment, falls into a category of examinations known as credentialing tests. Credentialing tests include those used in the process of initial licensure of new professionals and the voluntary certification of professionals (see Box 2-1 for an explanation of these terms as they are used in this report). Evaluation of these kinds of assessments typically focuses on a review of the processes used to develop the assessment and its psychometric properties. The review includes the methods for determining the content to be assessed and the


