High Stakes

Testing for Tracking, Promotion, and Graduation

Jay P. Heubert and Robert M. Hauser, Editors

Committee on Appropriate Test Use

Board on Testing and Assessment

Commission on Behavioral and Social Sciences and Education

National Research Council

NATIONAL ACADEMY PRESS
Washington, D.C.
1999








NATIONAL ACADEMY PRESS
2101 Constitution Avenue, N.W.
Washington, D.C. 20418

NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.

The study was supported by Contract/Grant No. ED-98-CO-0005 between the National Academy of Sciences and the U.S. Department of Education. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the organizations or agencies that provided support for this project.

Library of Congress Cataloging-in-Publication Data

High stakes : testing for tracking, promotion, and graduation / Jay P. Heubert and Robert M. Hauser, editors ; Committee on Appropriate Test Use.
p. cm.
Includes bibliographical references and index.
ISBN 0-309-06280-2 (pbk.)
1. Educational tests and measurements—United States. 2. Educational accountability—United States. 3. Education and state—United States. I. Heubert, Jay Philip. II. Hauser, Robert Mason. III. National Research Council (U.S.). Committee on Appropriate Test Use.
LB3051 .H475 1999
371.26′0973—dc21
98-40215

Additional copies of this report are available from National Academy Press, 2101 Constitution Avenue, N.W., Washington, D.C. 20418
Call (800) 624-6242 or (202) 334-3313 (in the Washington metropolitan area)

This report is also available on line at http://www.nap.edu

Printed in the United States of America

Copyright 1999 by the National Academy of Sciences. All rights reserved.

COMMITTEE ON APPROPRIATE TEST USE

ROBERT M. HAUSER (Chair), Department of Sociology, University of Wisconsin, Madison
LIZANNE DeSTEFANO, Department of Education, University of Illinois, Urbana-Champaign
PASQUALE J. DeVITO, Office of Assessment and Information Services, Rhode Island Department of Education, Providence
RICHARD P. DURÁN, Graduate School of Education, University of California, Santa Barbara
JENNIFER L. HOCHSCHILD, Woodrow Wilson School of Public and International Affairs, Princeton University
STEPHEN P. KLEIN, RAND Corporation, Santa Monica
SHARON LEWIS, Council of the Great City Schools, Washington, D.C.
LORRAINE M. McDONNELL, Department of Political Science, University of California, Santa Barbara
SAMUEL MESSICK, Educational Testing Service, Princeton, New Jersey
ULRIC NEISSER, Department of Psychology, Cornell University
ANDREW C. PORTER, Wisconsin Center for Education Research, University of Wisconsin, Madison
AUDREY L. QUALLS, Iowa Testing Program, University of Iowa, Iowa City
PAUL R. SACKETT, Department of Psychology, University of Minnesota, Minneapolis
CATHERINE E. SNOW, Graduate School of Education, Harvard University
WILLIAM T. TRENT, Department of Educational Policy Studies, University of Illinois, Urbana-Champaign
ROBERT L. LINN, ex officio, Board on Testing and Assessment; School of Education, University of Colorado, Boulder

JAY P. HEUBERT, Study Director
MICHAEL J. FEUER, Director, Board on Testing and Assessment
PATRICIA MORISON, Senior Program Officer
NAOMI CHUDOWSKY, Senior Program Officer
ALLISON M. BLACK, Research Associate
MARGUERITE CLARKE, Technical Consultant
EDWARD MILLER, Editorial Consultant
VIOLA C. HOREK, Administrative Associate
KIMBERLY D. SALDIN, Senior Project Assistant

BOARD ON TESTING AND ASSESSMENT

ROBERT L. LINN (Chair), School of Education, University of Colorado, Boulder
CARL F. KAESTLE (Vice Chair), Department of Education, Brown University
RICHARD C. ATKINSON, President, University of California
IRALINE BARNES, The Superior Court of the District of Columbia
PAUL J. BLACK, School of Education, King's College, London
RICHARD P. DURÁN, Graduate School of Education, University of California, Santa Barbara
CHRISTOPHER F. EDLEY, JR., Harvard Law School, Harvard University
PAUL W. HOLLAND, Graduate School of Education, University of California, Berkeley
MICHAEL W. KIRST, School of Education, Stanford University
ALAN M. LESGOLD, Learning Research and Development Center, University of Pittsburgh
LORRAINE McDONNELL, Department of Political Science, University of California, Santa Barbara
KENNETH PEARLMAN, Lucent Technologies, Inc., Warren, New Jersey
PAUL R. SACKETT, Department of Psychology, University of Minnesota, Minneapolis
RICHARD J. SHAVELSON, School of Education, Stanford University
CATHERINE E. SNOW, Graduate School of Education, Harvard University
WILLIAM L. TAYLOR, Attorney at Law, Washington, D.C.
WILLIAM T. TRENT, Associate Chancellor, University of Illinois, Urbana-Champaign
JACK WHALEN, Xerox Palo Alto Research Center, Palo Alto, California
KENNETH I. WOLPIN, Department of Economics, University of Pennsylvania

MICHAEL J. FEUER, Director
VIOLA C. HOREK, Administrative Associate

FOREWORD

President Clinton's 1997 proposal to create voluntary national tests in reading and mathematics catapulted testing to the top of the national education agenda. The proposal turned up the volume on what had already been a contentious debate and drew intense scrutiny from a wide range of educators, parents, policymakers, and social scientists. Recognizing the important role science could play in sorting through the passionate and often heated exchanges in the testing debate, Congress and the Clinton administration asked the National Research Council, through its Board on Testing and Assessment (BOTA), to conduct three fast-track studies over a 10-month period.

This report and its companions—Uncommon Measures: Equivalence and Linkage Among Educational Tests and Evaluation of the Voluntary National Tests: Phase 1—are the result of truly heroic efforts on the part of the BOTA members, the study committee chairs and members, two co-principal investigators, consultants, and staff, who all understood the urgency of the mission and rose to the challenge of a unique and daunting timeline.

Michael Feuer, BOTA director, deserves the special thanks of the board for keeping the effort on track and shepherding the report through the review process. His dedicated effort, long hours, sage advice, and good humor were essential to the success of this effort. Robert Hauser deserves our deepest appreciation for his superb leadership of the committee that produced this report.

These reports are exemplars of the Research Council's commitment to scientific rigor in the public interest: they provide clear and compelling statements of the underlying issues, cogent answers to nettling questions, and highly readable findings and recommendations. These reports will help illuminate the toughest issues in the ongoing debate over the proposed voluntary national tests. But they will do much more as well. The issues addressed in this and the other two reports go well beyond the immediate national testing proposal: they have much to contribute to knowledge about the way tests—all tests—are planned, designed, implemented, reported, and used for a variety of education policy goals.

I know the whole board joins me in expressing our deepest gratitude to the many people who worked so hard on this project. These reports will advance the debate over the role of testing in American education, and I am honored to have participated in this effort.

ROBERT L. LINN, CHAIR
BOARD ON TESTING AND ASSESSMENT

DEDICATION

In early October 1998, after the public release of this report but before its formal publication, we were saddened to learn of the death of our fellow committee member, Samuel Messick. Sam spent almost all of his career at the Educational Testing Service, and he made legendary contributions to the science and profession of educational measurement. Even had he not been a member of the committee, Sam would have guided the committee's deliberations through his earlier National Research Council work on the use of tests to make decisions about students with mental retardation—which provided the overarching framework of our report—and his creative reconstruction of the concept of test validity. As it was, Sam made even greater contributions to the project through his drafts of major sections of the text as well as his cordial, but ever crisp, incisive, and often wryly humorous contributions to our discussions. Sam was a wonderful scholar, intellect, and friend, and we dedicate this book to him.


ACKNOWLEDGMENTS

The Committee on Appropriate Test Use wishes to thank the many people who helped make possible the preparation of this report on an accelerated schedule.

An important part of the committee's work was to gather data about testing research, policy, and practice in states and school districts. Many people gave generously of their time, at meetings and workshops of the committee, in interviews with committee staff, and by drafting short papers to assist the committee's thinking. Lorrie A. Shepard, University of Colorado, Boulder, provided an excellent overview of educational issues in high-stakes testing of individual students. Floraline Stevens, of Los Angeles, provided insights into state and local high-stakes test policies.

At a workshop on testing of English-language learners, Jamal Abedi, University of California, Los Angeles, shared his experimental findings on effects of question wording and format among English-language learners. Toni Marsnik, Language Acquisition and Bilingual Development Branch, Los Angeles Unified School District, and Lynn Winters, assistant superintendent for research, planning, and evaluation, Long Beach Unified School District, offered perspectives on practices for testing English-language learners in their districts and in California more generally.

At a committee workshop in Washington, D.C., six leading educational policymakers offered local, state, and national perspectives on the use of high-stakes tests for promotion or retention; the presenters included Arlene Ackerman, superintendent of schools, Washington, D.C.; Philip Hansen, chief accountability officer, Chicago Public Schools; Nancy Grasmick, superintendent of schools, State of Maryland; Jim Watts, vice president for state services, Southern Regional Education Board; Michael Cohen, special assistant to the president for educational policy; and Bella Rosenberg, assistant to the president, American Federation of Teachers.

The committee also commissioned short papers to assist in deliberations about alternate strategies for promoting appropriate test use. Those who prepared such papers include: Tyler Cowan, George Mason University; Ernest House, University of Colorado, Boulder; Don Kettl, University of Wisconsin, Madison; Henry Levin, Stanford University; Theodore Marmor, Yale University; and Anne Schneider, Arizona State University. We are grateful to David Klahr, Carnegie Mellon University, for his insights.

Jennifer C. Day, Population Division, U.S. Bureau of the Census, provided access to unpublished tabulations of school enrollment data from the October Current Population Survey. In addition, staff of several state education agencies provided valuable information about state retention rates: Alabama, Arizona, California, Delaware, District of Columbia, Florida, Georgia, Indiana, Kentucky, Louisiana, Maryland, Massachusetts, Michigan, Mississippi, New Mexico, New York, North Carolina, Ohio, South Carolina, Tennessee, Texas, Vermont, Virginia, West Virginia, and Wisconsin.

We are also grateful to those who served as consultants to the committee. Marguerite Clarke, research associate at Boston College, provided invaluable contributions during all phases of the study, especially on psychometric issues. Edward Miller joined the project midway as editor, and he skillfully, tirelessly pulled our bits, scraps, and—sometimes—avalanches of text into clear, concise prose.
Diane August provided important advice and assistance on the testing of English-language learners and prepared early drafts of Chapter 9 of the report. Susan E. Phillips, Michigan State University, and William L. Taylor, a member of the Board on Testing and Assessment, provided valuable advice on legal issues in testing. Taissa S. Hauser volunteered to collect and assemble statistical data on school retention and age-grade retardation, and her good company and quiet advice were a source of support to all on the project staff.

We owe an important debt of gratitude to the scientific and professional staff of the Commission on Behavioral and Social Sciences and Education (CBASSE), without whose guidance, support, and hard work we could not conceivably have completed this report. Barbara B. Torrey, executive director of the commission, and Sandy Wigdor, director of the Division on Education, Labor, and Human Performance, have been enthusiastic supporters of the project and a timely source of gracious reminders that we keep our priorities in line. Michael J. Feuer, director of the Board on Testing and Assessment (BOTA), brought our research team together, created staff support and resources whenever we needed them, and was our most valuable guide, sounding board, and humorist as we pondered the complexities of educational policy analysis.

Patricia Morison made major contributions to our work on students with disabilities and English-language learners and was a constant source of support and thoughtful ideas. Allison Black contributed to many phases of the project; she developed many of the background materials for the committee, and her structured interviews with school administrators were a key source of information about local testing policies and practices. Naomi Chudowsky took major responsibility for the investigation of high school graduation and also contributed to the presentation of psychometric concepts, and Robert Rothman made important contributions to the analysis of policy alternatives. During her summer internship, Yale University doctoral student Marilyn Dabady was a careful and critical in-house reader of our drafts.

National Research Council (NRC) staff were always available to pitch in when expertise or energy were called for. They were key members of the study team, and it is hard to see how the study could have been completed without their expert help. Kimberly Saldin served unflappably and flawlessly as the committee's senior project assistant.
She dealt smoothly with the logistics of our four committee meetings in five months, with our voluminous collections and distributions of published and unpublished research materials, and with a seemingly endless stream of text files, e-mail file attachments, and file revisions in seemingly incompatible word-processing formats. Other BOTA staff—Steve Baldwin, Alix Beatty, Meryl Bertenthal, Cadelle Hemphill, Lee Jones, Karen Mitchell—offered advice, help, and support at key stages of the process. Kimberly Saldin received support when she needed it from other wonderful project assistants to the board: Lisa Alston, Dorothy Majewski, Jane Phillips, and Holly Wells. Viola Horek, administrative associate to BOTA, was always there, instrumental in seeing that the entire project ran smoothly.

We are deeply grateful to Eugenia Grohman, associate director for reports of CBASSE. Genie has and shares enormous knowledge and experience in keeping a committee on track and putting a report together from beginning to end. We also appreciate the superb work of Christine McShane, to whom fell the responsibility for final editing of the full report. We are indebted, also, to the whole CBASSE staff for indulging our scheduling exigencies. Thanks also to Sally Stanfield and the whole Audubon team at the National Academy Press for their creative and speedy support.

Several members of the Board on Testing and Assessment were not members of the committee but attended our meetings ex officio and were constant sources of wisdom and encouragement: Robert L. Linn, University of Colorado at Boulder, chair of the Board on Testing and Assessment, and committee member ex officio; William L. Taylor, Attorney at Law; and Carl F. Kaestle, Brown University.

Individual committee members have made outstanding contributions to the study. Several of them drafted sections on particular topics, prepared background materials, or helped to organize workshops and committee discussions. Everyone contributed constructive, critical thinking, serious concern about the difficult and complex issues that we faced, and an open-mindedness that was essential to the success of the project.

A word of acknowledgment to the sponsors of this study. We have benefited from supportive and collegial relations with members of the various House and Senate committee staffs—on both sides of the aisle—for whom the results of our work have such important implications. We thank them all for understanding and respecting the process of the NRC. Our contracting officer's technical representative, Holly Spurlock, of the U.S. Department of Education, has been a most effective project officer; we thank her for her patience and guidance throughout.
Many other officials in the department, the National Assessment Governing Board, and in numerous private and public organizations involved in testing also deserve our thanks and recognition for their cooperation in providing information.

This report has been reviewed by individuals chosen for their diverse perspectives and technical expertise, in accordance with procedures approved by the NRC's Report Review Committee. The purpose of this independent review is to provide candid and critical comments that will assist the authors and the NRC in making the published report as sound as possible and to ensure that the report meets institutional standards for objectivity, evidence, and responsiveness to the study charge. The content of the review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.

We wish to thank the following individuals, who are neither officials nor employees of the NRC, for their participation in the review of this report:

Lloyd Bond, School of Education, University of North Carolina, Greensboro;
Wayne J. Camara, The College Board, New York, New York;
John Fremer, Educational Testing Service, Princeton, New Jersey;
Adam Gamoran, Wisconsin Center for Education Research, University of Wisconsin;
Arthur S. Goldberger, Department of Economics, University of Wisconsin;
Lyle V. Jones, L.L. Thurstone Psychometric Laboratory, University of North Carolina, Chapel Hill;
Jeannie Oakes, Graduate School of Education and Information Studies, University of California, Los Angeles;
Diana Pullin, School of Education, Boston College;
Henry W. Riecken, Professor of Behavioral Sciences (emeritus), University of Pennsylvania School of Medicine.

Although the individuals listed above have provided many constructive comments and suggestions, responsibility for the final content of this report rests solely with the authoring committee and the NRC.

The two of us were unacquainted when we began the project, and—one a legal scholar and the other a demographer—we had little in common beyond our shared belief in the importance of our mandate. Each of us has benefited from the other's strengths, and working together has been an unalloyed pleasure.

JAY HEUBERT, STUDY DIRECTOR
ROBERT M. HAUSER, CHAIR
COMMITTEE ON APPROPRIATE TEST USE

The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Bruce M. Alberts is president of the National Academy of Sciences.

The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. William A. Wulf is president of the National Academy of Engineering.

The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Kenneth I. Shine is president of the Institute of Medicine.

The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy's purposes of furthering knowledge and advising the federal government.
Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Bruce M. Alberts and Dr. William A. Wulf are chairman and vice chairman, respectively, of the National Research Council.

CONTENTS

Executive Summary

Part I: Background and Context
1   Introduction
2   Assessment Policy and Politics
3   Legal Frameworks
4   Tests as Measurements

Part II: Uses of Tests to Make High-Stakes Decisions About Individuals
5   Tracking
6   Promotion and Retention
7   Awarding or Withholding High School Diplomas
8   Students with Disabilities
9   English-Language Learners
10   Use of Voluntary National Test Scores for Tracking, Promotion, or Graduation Decisions

Part III: Ensuring Appropriate Uses of Tests
11   Potential Strategies for Promoting Appropriate Test Use
12   Findings and Recommendations

Biographical Sketches
Index


Public Law 105-78, enacted November 13, 1997

SEC. 309.

(a) STUDY—The National Academy of Sciences shall conduct a study and make written recommendations on appropriate methods, practices, and safeguards to ensure that—

(1) existing and new tests that are used to assess student performance are not used in a discriminatory manner or inappropriately for student promotion, tracking or graduation; and

(2) existing and new tests adequately assess student reading and mathematics comprehension in the form most likely to yield accurate information regarding student achievement of reading and mathematics skills.

(b) REPORT TO CONGRESS—The National Academy of Sciences shall submit a written report to the White House, the National Assessment Governing Board, the Committee on Education and the Workforce of the House of Representatives, the Committee on Labor and Human Resources of the Senate, and the Committees on Appropriations of the House and Senate not later than September 1, 1998.