and test scores equivalent across different forms. See Kolen and Brennan (2004) or Holland and Dorans (2006) for additional explanation of linking.

Thus, the test developer often faces a number of dilemmas. Constructed-response and performance-based tasks may be the most authentic way to assess 21st century skills; however, achieving high technical standards with these item types is challenging. When tests do not meet high technical standards, the results should not be used for high-stakes decisions with important consequences for students. But when the results do not affect students’ lives in important ways (i.e., “they do not count”), students may not try their best. Raising the stakes therefore requires raising the technical quality of the tests. Test developers must confront these issues and set priorities among the most important aspects of the assessment: Is it more important to have authentic test items or to meet high reliability standards? Because these priorities compete, tradeoffs are unavoidable, and decisions about them will need to be guided by the goals and purposes of the assessment as well as by practical constraints, such as the resources available.
