Evaluation of the Voluntary National Tests

Phase 1

Lauress L. Wise,

Robert M. Hauser,

Karen J. Mitchell, and

Michael J. Feuer

Board on Testing and Assessment

Commission on Behavioral and Social Sciences and Education

National Research Council

NATIONAL ACADEMY PRESS
Washington, D.C.
1999







NATIONAL ACADEMY PRESS 2101 Constitution Avenue, N.W. Washington, D.C. 20418

NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The co-principal investigators responsible for the report were chosen for their special competence.

The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Bruce M. Alberts is president of the National Academy of Sciences.

The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. William A. Wulf is president of the National Academy of Engineering.

The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Kenneth I. Shine is president of the Institute of Medicine.

The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy’s purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Bruce M. Alberts and Dr. William A. Wulf are chairman and vice chairman, respectively, of the National Research Council.

The study was supported by Contract/Grant No. RJ97184001 between the National Academy of Sciences and the U.S. Department of Education. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the organizations or agencies that provided support for this project.

International Standard Book Number 0-309-06277-2

Additional copies of this report are available from: National Academy Press 2101 Constitution Avenue N.W. Washington, D.C. 20418 Call 800-624-6242 or 202-334-3313 (in the Washington Metropolitan Area).

This report is also available on line at http://www.nap.edu

Printed in the United States of America

Copyright 1999 by the National Academy of Sciences. All rights reserved.

Project on the Evaluation of the Voluntary National Tests

Co-Principal Investigators

ROBERT M. HAUSER, University of Wisconsin, Madison
LAURESS L. WISE, Human Resources Research Organization, Alexandria, Virginia

Staff, Board on Testing and Assessment

MICHAEL J. FEUER, Director
KAREN J. MITCHELL, Senior Program Officer
STEPHEN E. BALDWIN, Senior Program Officer
MARILYN DABADY, Research Associate
VIOLA C. HOREK, Administrative Associate
DOROTHY R. MAJEWSKI, Senior Project Assistant

Board on Testing and Assessment

ROBERT L. LINN (Chair), School of Education, University of Colorado, Boulder
CARL F. KAESTLE (Vice Chair), Department of Education, Brown University
RICHARD C. ATKINSON, President, University of California
IRALINE BARNES, The Superior Court of the District of Columbia
PAUL J. BLACK, School of Education, King's College, London, England
RICHARD P. DURÁN, Graduate School of Education, University of California, Santa Barbara
CHRISTOPHER F. EDLEY, JR., Harvard Law School
PAUL W. HOLLAND, Graduate School of Education, University of California, Berkeley
MICHAEL W. KIRST, School of Education, Stanford University
ALAN M. LESGOLD, Learning Research and Development Center, University of Pittsburgh
LORRAINE McDONNELL, Departments of Political Science and Education, University of California, Santa Barbara
KENNETH PEARLMAN, Lucent Technologies, Inc., Warren, New Jersey
PAUL R. SACKETT, Department of Psychology, University of Minnesota, Minneapolis
RICHARD J. SHAVELSON, School of Education, Stanford University
CATHERINE E. SNOW, Graduate School of Education, Harvard University
WILLIAM L. TAYLOR, Attorney at Law, Washington, D.C.
WILLIAM T. TRENT, Associate Chancellor, University of Illinois, Urbana-Champaign
JACK WHALEN, Xerox Palo Alto Research Center, Palo Alto, California
KENNETH I. WOLPIN, Department of Economics, University of Pennsylvania, Philadelphia

MICHAEL J. FEUER, Director
VIOLA C. HOREK, Administrative Associate

Foreword

President Clinton's 1997 proposal to create voluntary national tests in reading and mathematics catapulted testing to the top of the national education agenda. The proposal turned up the volume on what had already been a contentious debate and drew intense scrutiny from a wide range of educators, parents, policy makers, and social scientists. Recognizing the important role science could play in sorting through the passionate and often heated exchanges in the testing debate, Congress and the Clinton administration asked the National Research Council, through its Board on Testing and Assessment (BOTA), to conduct three fast-track studies over a 10-month period.

This report and its companions—Uncommon Measures: Equivalence and Linkage Among Educational Tests and High-Stakes: Testing for Tracking, Promotion, and Graduation—are the result of truly heroic efforts on the part of the BOTA members, the study committee chairs and members, two co-principal investigators, consultants, and staff, who all understood the urgency of the mission and rose to the challenge of a unique and daunting timeline.

Michael Feuer, BOTA director, deserves the special thanks of the Board for keeping the effort on track and shepherding the report through the review process. His dedicated effort, long hours, sage advice, and good humor were essential to the success of this effort. Robert Hauser and Lauress Wise deserve our deepest appreciation for their outstanding commitment of time, energy, and intellectual firepower that made this evaluation possible.

These reports are exemplars of the Research Council's commitment to scientific rigor in the public interest: they provide clear and compelling statements of the underlying issues, cogent answers to nettling questions, and highly readable findings and recommendations. These reports will help illuminate the toughest issues in the ongoing debate over the proposed Voluntary National Tests. But they will do much more as well. The issues addressed in this and the other two reports go well beyond the immediate national testing proposal: they have much to contribute to knowledge about the way tests—all tests—are planned, designed, implemented, reported, and used for a variety of education policy goals.

I know the whole board joins me in expressing our deepest gratitude to the many people who worked so hard on this project. These reports will advance the debate over the role of testing in American education, and I am honored to have participated in this effort.

ROBERT L. LINN, CHAIR
BOARD ON TESTING AND ASSESSMENT

Acknowledgments

This project would not have been possible without the generosity of many individuals and the contributions of several institutions. The sage counsel of Bob Linn and Carl Kaestle, chair and vice chair of the Board on Testing and Assessment (BOTA), helped us frame the evaluation and test our findings and conclusions. Other BOTA members contributed in important ways by participating in briefings and making invaluable suggestions for improved analysis and discussion.

The Office of Planning and Evaluation Services, U.S. Department of Education, administered the contract for this evaluation. Director Allen Ginsburg provided assistance in planning the evaluation, and Audrey Pendleton served as an exemplary contracting officer's technical representative during this first phase of the evaluation. We thank them for their guidance and support.

Staff from the National Assessment Governing Board (NAGB), under the leadership of Roy Truby, executive director, and the NAGB prime contractor, the American Institutes for Research (AIR), with Archie LaPointe's guidance, were a valuable source of information and data on the design and development of the Voluntary National Tests (VNT). Sharif Shakrani, Raymond Fields, and Mary Crovo of NAGB and Mark Kutner, Steven Ferrara, John Olsen, Clayton Best, Roger Levine, Terry Salinger, Fran Stancavage, and Christine Paulson of AIR provided us with important information on occasions that are too numerous to mention. We benefited tremendously by attending and learning from discussions at meetings of the National Assessment Governing Board and meetings of its contractors; we thank them for opening their meetings to us and for sharing their knowledge and perspectives.

We extend thanks to the staff of the cognitive laboratories and of Harcourt Brace Educational Measurement and Riverside Publishing for access to important information and their perspectives throughout the course of our work.
We relied heavily on the input and advice of a cadre of testing and disciplinary experts, who provided helpful and insightful presentations at our workshops: they are listed in Appendices A–C, and we thank them. Our work was enriched by the stimulating intellectual exchange at the meetings to which they contributed greatly.

William Morrill, Rebecca Adamson, and Donald Wise of Mathtech, Inc., provided important help and perspective throughout. They attended and reported on workshops, cognitive laboratories, and bias review sessions, provided important insight into VNT development, and were valuable members of the evaluation team.

This report has been reviewed by individuals chosen for their diverse perspectives and technical expertise, in accordance with procedures approved by the Report Review Committee of the National Research Council (NRC). The purpose of this independent review is to provide candid and critical comments that will assist the authors and the NRC in making the published report as sound as possible and to ensure that the report meets institutional standards for objectivity, evidence, and responsiveness to the study charge. The content of the review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.

We wish to thank the following individuals, who are neither officials nor employees of the NRC, for their participation in the review of this report: Arthur S. Goldberger, Department of Economics, University of Wisconsin; Lyle V. Jones, L.L. Thurstone Psychometric Laboratory, University of North Carolina, Chapel Hill; Michael J. Kolen, Iowa Testing Programs, University of Iowa; Henry W. Riecken, Professor of Behavioral Sciences (emeritus), University of Pennsylvania School of Medicine; Alan H. Schoenfeld, School of Education, University of California, Berkeley; Richard Shavelson, School of Education, Stanford University; Ross M. Stolzenberg, Department of Sociology, University of Chicago. Although these individuals provided many constructive comments and suggestions, responsibility for the final content of this report rests solely with the authors and the NRC.
Above all, we are grateful to the many individuals at the National Research Council who provided guidance and assistance at many stages of the evaluation and during the preparation of the report. Barbara Torrey, executive director of the Commission on Behavioral and Social Sciences and Education (CBASSE), helped and encouraged our work—and the companion VNT studies—throughout. Sandy Wigdor, director of CBASSE's Division on Education, Labor, and Human Performance, also has been a source of great encouragement and paved many paths in the conduct of the study. We are indebted, also, to the whole CBASSE staff for indulging our scheduling exigencies. Thanks also to Sally Stanfield and the whole Audubon team at the National Academy Press, for their creative and speedy support. We are especially grateful to Eugenia Grohman, associate director for reports of CBASSE, for her advice on structuring the content of the report, for her expert editing of the manuscript, for her wise advice on the exposition of the report's main messages, and for her patient and deft guidance of the report through the NRC review process. We also are immensely grateful to Stephen Baldwin, Patricia Morison, and Naomi Chudowsky of the BOTA staff and Marilyn Dabady, a Yale Ph.D. candidate and BOTA summer intern, who made valuable contributions to our research and report. We express our gratitude to NRC administrative staff Adrienne Carrington and Lisa Alston. We are especially grateful to Dorothy Majewski and Viola Horek, who capably and admirably managed the operational aspects of the evaluation—arranging meeting and workshop logistics, producing multiple iterations of drafts and report text, and being available to assist with our requests, however large or small. We recognize the special contributions of Michael Feuer, BOTA director, and Karen Mitchell, senior staff officer, as our coauthors of this report. 
Michael guided the project, coordinated our work with the companion VNT projects on linkage and appropriate test use, and, most important, made frequent contributions to the discussion and the framing of our questions and conclusions. Karen was a principal source of expertise in both the substance and process of the evaluation, and she provided cheerful and continuous liaison between the two of us and the staff of NAGB and AIR. Without her help, we could not have completed our work in time and to the NRC's rigorous standards.

Lastly, we thank Winnie and Tess for their patience, help, understanding, and good humor during our work on this project. We'll be home for dinner.

LAURESS WISE AND ROBERT HAUSER, CO-PRINCIPAL INVESTIGATORS
EVALUATION OF THE VOLUNTARY NATIONAL TESTS


Contents

Executive Summary
1 The Proposed Voluntary National Tests and Their Evaluation
2 Test Specifications
3 Item Development and Review
4 VNT Pilot and Field Test Plans
5 Inclusion and Accommodation
6 Reporting
References

Appendices
A Workshop on Item and Test Specifications for VNT
B Workshop to Review VNT Pilot and Field Test Plans
C Workshop on VNT Item Development
D Source Documents
E Descriptions of Achievement Levels for Basic, Proficient, and Advanced
F Revised Item Development and Review Schedule for VNT
G Observations of Cognitive Labs and Bias Reviews
H Biographical Sketches

Public Law 105-78, enacted November 13, 1997

SEC. 308. STUDY—The National Academy of Sciences shall, not later than September 1, 1998, submit a written report to the Committee on Education and the Workforce of the House of Representatives, the Committee on Labor and Human Resources of the Senate, and the Committees on Appropriations of the House and Senate that evaluates all test items developed or funded by the Department of Education or any other agency of the Federal Government pursuant to contract RJ97153001, any subsequent contract related thereto, or any contract modification by the National Assessment Governing Board pursuant to section 307 of this Act, for—

(1) the technical quality of any test items for 4th grade reading and 8th grade mathematics;
(2) the validity, reliability, and adequacy of developed test items;
(3) the validity of any developed design which links test results to student performance;
(4) the degree to which any developed test items provide valid and useful information to the public;
(5) whether the test items are free from racial, cultural, or gender bias;
(6) whether the test items address the needs of disadvantaged, limited English proficient and disabled students; and
(7) whether the test items can be used for tracking, graduation or promotion of students.