Measuring What Counts: A Conceptual Guide for Mathematics Assessment
2
A VISION OF MATHEMATICS ASSESSMENT
Assessment is the means by which we determine what students know and can do. It tells teachers, students, parents, and policymakers something about what students have learned: the mathematical terms they recognize and can use, the procedures they can carry out, the kind of mathematical thinking they do, the concepts they understand, and the problems they can formulate and solve. It provides information that can be used to award grades, to evaluate a curriculum, or to decide whether to review fractions. Assessment can help convince the public and educators that change is needed in the short run and that the efforts to change mathematics education are worthwhile in the long run. Conversely, it can thwart attempts at change. Assessment that is out of synchronization with curriculum and instruction gives the wrong signals to all those concerned with education.
Mathematics assessments are roughly divided into two categories: internal and external. Internal assessments provide information about student performance to teachers for making instructional decisions. These assessments may be for high or low stakes, but they exert their chief influence within the walls of the classroom. External assessments provide information about mathematics programs to state and local agencies, funding bodies, policymakers, and the public. That information can be used either to hold program managers accountable or to monitor the program's level of performance. These assessments are used primarily by people outside the immediate school community. Although internal assessment is perhaps more obviously and directly connected with the improvement of mathematics learning than external assessment, both types of assessment should advance mathematics education.

THE ROLE OF ASSESSMENT IN REFORM
Assessment can play a powerful role in conveying, clearly and directly, the outcomes toward which reform in mathematics is aimed. As assessment changes along with instruction, it can help teachers and students keep track of their progress toward higher standards. Many reformers see assessment as much more than a signpost, viewing it as a lever for propelling reform forward.1 It is essential that mathematics assessment change in ways that will both support and be consistent with other changes under way in mathematics education.
From their beginnings in the last century, standardized achievement tests have been used in American schools not only to determine what students have learned but also to induce better teaching. The written tests administered by the Boston School Committee in 1845 led to rankings of schools by level of achievement and to recommended changes in instructional methods.2 The New York State Regents Examinations were set up primarily to maintain standards by showing teachers what their students needed to know.3 The traditional view of many Americans that tests and examinations can do more than measure achievement is reflected in this quotation from a 1936 book on assessment prepared for the American Council on Education: "Recently increasing emphasis has been placed upon examinations as means for improving instruction, and as instruments for securing information that is indispensable for the constructive educational guidance of pupils."4
Researchers are beginning to document more thoroughly the effects of assessment, determining, in effect, whether this traditional view is justified. A 1992 study by the Center for the Study of Testing, Evaluation, and Educational Policy at Boston College examined the content of the most commonly used tests embedded in textbooks and standardized tests in mathematics and science for grades 4 to 12 and how they influence instruction. The authors noted that the tests fell far short of the reform vision and concluded that
Since textbook tests were found to be similar to standardized tests in the skills they measure, and since these tests are widely used, an emphasis on low level thinking skills extends beyond the instructional time spent preparing for state and district mandated standardized tests. The tests most commonly taken by students, both standardized tests and textbook tests, emphasize and mutually reinforce low level thinking and knowledge, and were found to have an extensive and pervasive influence on math and science instruction nationwide.5
Proponents of mathematics education reform have expressed the view that the goal of more and better mathematics learning for all students cannot be realized if assessment remains wedded to what is easy to measure and what has traditionally been taught. The messages sent by new views of content, teaching, and learning will be contradicted by the values that such assessment practices communicate. Some teachers may attempt to respond to both messages by preparing students for the tests while, simultaneously, trying to offer students some taste of a richer, more challenging curriculum. Other teachers may continue to teach as they have always taught, reasoning that the tests used to make important decisions about students' lives, teachers' salaries, and educational quality indicate what is truly important for students to learn. If current assessment practices prevail, reform in school mathematics is not likely to succeed.
Suppose assessment practice were to change in American mathematics classes. What if, at the end of a unit, students wrote an essay explaining how two situations could both be modeled with the same exponential function, instead of being tested on skills such as solving equations and choosing among definitions? Imagine students being assessed not only with a final examination taken in class but also on how well they could conduct and report a group investigation of the symmetries in a wallpaper pattern. Suppose students were allowed to use calculators, computers, and other resources on most tests and examinations, including those administered by external agencies. If such changes were to occur, many mathematics teachers would shift their instruction to prepare their students adequately for such assessments.
Reformers have proposed a host of innovative approaches to assessment, many of which are described in subsequent sections of this report. Leaders in the educational policy community are joining the chorus, arguing that minimum competence tests and basic skill assessments, like those commonly seen today, work against efforts to improve schools.6 Low-level tests give coarse and deceptive readings of educational progress. Worse, they send the wrong message about what is important.7 Assessments need to record genuine accomplishments in reasoning and in formulating and solving mathematical problems, not feats of rote memorization or proficiency in recognizing correct answers.8
The content of assessment needs to change along with its form if the vision of mathematics teaching and learning is to be attained. Portfolios of mathematical work can contribute to better teaching and learning only if the collections reflect work on meaningful tasks that require use of higher-level mathematical processes. Write-ups of mathematical investigations can support the vision only if the mathematics addressed is important rather than trivial. New forms of assessment can support efforts to change instruction and curriculum only if constructed in ways that reflect the philosophy and the substance of the reform vision of school mathematics. Obviously, changing the content and forms of assessment will not be sufficient to bring about reform. Such changes will have meaning only if curriculum changes and professional development for teachers are attended to as part of the reform process.
PRINCIPLES FOR ASSESSING MATHEMATICS LEARNING
In this chapter, three educational principles based on content, learning, and equity are set forth to guide changes in mathematics assessment. Underlying these three principles is the fundamental premise that assessment makes sense only if it is in harmony with the broad goals of mathematics education reform.
THE CONTENT PRINCIPLE
Assessment should reflect the mathematics that is most important for students to learn.
Any assessment of mathematics learning should first and foremost be anchored in important mathematical content. It should reflect topics and applications that are critical to a full understanding of mathematics as it is used in today's world and in students' later lives, whether in the workplace or in later studies. Assessments should reflect processes that are required for doing mathematics: reasoning, problem solving, communication, and connecting ideas. Consensus has been achieved within the discipline of mathematics and among organizations representing mathematics educators and teachers on what constitutes important mathematics. Although such consensus is a necessary starting point, it is important to obtain public acceptance of these ideas and to preserve local flexibility to determine how agreed-upon standards are reflected in assessments as well as in curricula.
As uses of mathematics change over time, visions of school mathematics and assessment must evolve in consonant ways. No existing conception of important content should constitute an anchor, preventing changes in assessment that are warranted by changing times. Thus, assessment development will require more significant collaboration between content and measurement experts than has been characteristic in the past. The goal of the content principle is to ensure that assessments are based on well-reasoned conceptions of what mathematics students will need to lead fully informed lives. Only if the mathematics assessed is important can the mathematics be justified as significant and valuable for students to know, and the assessment justified as supportive of good instruction and a good use of educational resources.
THE LEARNING PRINCIPLE
Assessment should enhance mathematics learning and support good instructional practice.
Although assessments can be undertaken for various purposes and used in many ways, proponents of standards-based assessment reform have argued for the use of assessments that contribute very directly to student learning. The rationale is that challenging students to be creative and to formulate and solve problems will not ring true if all students see are quizzes, tests, and examinations that dwell on routine knowledge and skill. Consciously or unconsciously, students use assessments they are given to determine what others consider to be significant.
There are many ways to accomplish the desired links between assessment and learning. Assessment tasks can be designed so that they are virtually indistinguishable from good learning tasks by attending to factors that are critical to good instructional design: motivation, opportunities to construct or extend knowledge, and opportunities to receive feedback and revise work. Assessment and instruction can be combined, either by seamlessly weaving the two kinds of activities together or by taking advantage of opportunities for assessment as instruction proceeds. Assessments can also be designed in ways that help communicate the goals of learning and the products of successful learning. In each of these approaches, the teacher's role is critical for both facilitating and mediating learning.
THE EQUITY PRINCIPLE
Assessment should support every student's opportunity to learn important mathematics.
The equity principle aims to ensure that assessments are designed to give every student a fair chance to demonstrate his or her best work and are used to provide every student with access to challenging mathematics.
Equity requires careful attention to the many ways in which understanding of mathematics can be demonstrated and the many factors that may color judgments of mathematical competence from a particular collection of assessment tasks.
Equity also requires attention to how assessment results are used. Often assessments have been used inappropriately to filter students out of educational opportunity. They might be used instead to empower students: to provide students the flexibility needed to do their best work, to provide concrete examples of good work so that students will know what to aim for in learning, and to elevate the students' and others' expectations of what can be achieved.
Equity also requires that policies regarding use of assessment results make clear the schools' obligations to educate students to the level of new content and performance standards.
EDUCATIONAL PRINCIPLES IN CONTEXT
Time spent on assessment is increasing in classrooms across the country.9 Separate assessments are often administered to answer a wide array of questions, from what the teacher should emphasize in class tomorrow to what the school system should do to improve its overall mathematics program. Whether the sheer number of assessments is reduced is not the primary issue. What is more critical is that any time spent on assessment be time used in pursuit of the goal of excellent education.

The principles described above provide criteria that aim to ensure that assessments foster the goal of excellent mathematics education. For decades, educational assessment in the United States has been driven largely by practical and technical concerns rather than by educational priorities. Testing as we know it today arose because very efficient methods were found for assessing large numbers of people at low cost. A premium was placed on assessments that were easily administered and that made frugal use of resources. The constraints of efficiency meant that mathematics assessment tasks could not tap a student's ability to estimate the answer to an arithmetic calculation, construct a geometric figure, use a calculator or ruler, or produce a complex deductive argument.
A narrow focus on technical criteria—primarily reliability—also worked against good assessment. For too long, reliability meant that examinations composed of a small number of complex problems were devalued in favor of tests made up of many short items. Students were asked to perform large numbers of smaller tasks, each eliciting information on one facet of their understanding, rather than to engage in complex problem solving or modeling, the mathematics that is most important.
In the absence of expressly articulated educational principles to guide assessment, practical and technical criteria have become de facto ruling principles. The content, learning, and equity principles are proposed not to challenge the importance of these criteria, but to challenge their dominance and to strike a better balance between educational and measurement concerns. An increased emphasis on validity—with its attention to fidelity between assessments, high-quality curriculum and instruction, and consequences—is the tool by which the necessary balance can be achieved.
In attempting to strike a better balance between educational and measurement concerns, many of the old measurement questions must be re-examined. For example, standardization has usually been taken to mean that assessment procedures and conditions are the same for every student. But from the perspective of fairness and equity, it might be more critical to assure that every student has the same level of understanding about the context and requirements of an assessment or task. The latter interpretation requires that some accommodation be made to differences among learners. For example, the teacher or examination proctor might be allowed to explain instructions when needed, a procedure that would be proscribed under prevailing practices. Standardization will remain important, but how it is viewed and how it is operationalized may require rethinking, as the new principles for assessment are put in place.10
Putting the content, learning, and equity principles first will present different kinds of challenges for different constituencies. It will mean finding new approaches for creating, scoring, and evaluating mathematics assessments. Some new approaches are being tried in schools today. Techniques are being developed that allow students to show what they know and can do and not simply whether they recognize a correct answer when they see one. These changes imply new roles for teachers. Much of the impulse behind the movement toward standardized testing over this century arose from a mistrust of teachers' ability to make fair, adequate judgments of their students' performance.11 Teachers will have to be accorded more professional credibility as they are given increased responsibility for conducting and evaluating student responses on assessments developed to meet the three principles. Teachers will need systematic support in their efforts to meet these new professional responsibilities and challenges.
The principles also present challenges for assessment developers and researchers. Some issues that need clarification relate to the broader definitions of important content now embraced by the mathematics education community. Processes such as communication and reasoning, for example, previously have been classified as nonmathematical skills. Broadening the domain of important mathematics to include these skills may make it difficult to separate general cognitive skills from the outcomes of mathematics instruction, which may undermine validity as it is traditionally understood.12
Other open technical issues relate to the difficulty of establishing that assessment tasks actually evoke the higher-order processes they were designed to tap.13 The array of solutions to high-quality mathematics tasks is potentially so rich that expert judgments will not be sufficient. Students may need to be interviewed about their solution approaches during or at the conclusion of a task. Student work also will need to be examined. A number of researchers are exploring different approaches for making process determinations from student work. As yet, however, there are no well-established procedures for translating this kind of information into forms that are useful for evaluating how well assessments meet the content and learning principles or how well they satisfy the more traditional criterion of content validity.
Reordering priorities so that these new principles provide a foundation on which to develop new assessments puts student learning of mathematics ahead of other purposes for assessment. It is bound to have dramatic implications for mathematics assessment, not all of which can be foreseen now. The purpose of the remainder of this report, however, is to examine what is known from research, what questions still await answers, and what the wisdom of expert practice suggests about the principles and their implementation.

Endnotes
1
Peter W. Airasian, "Symbolic Validation: The Case of State-Mandated, High-Stakes Testing," Educational Evaluation and Policy Analysis 10:4 (1988), 301-313; Eva L. Baker and Regie Stites, "Trends in Testing in the USA," in Susan H. Fuhrman and Betty Malen, eds., The Politics of Curriculum and Testing, 1990 Yearbook of the Politics of Education Association (London, England: The Falmer Press, 1990), 139-157; National Commission on Testing and Public Policy, From Gatekeeper to Gateway: Transforming Testing in America (Chestnut Hill, MA: Boston College, 1990); Grant Wiggins, "Teaching to the (Authentic) Test," Educational Leadership 46:7 (1989), 41-47; Dennie P. Wolf et al., "To Use Their Minds Well: Investigating New Forms of Student Assessment," in Gerald Grant, ed., Review of Research in Education (Washington, D.C.: American Educational Research Association, 1991), 31-74; Daniel Resnick and Lauren Resnick, "Assessing the Thinking Curriculum: New Tools for Educational Reform," in Bernard R. Gifford and Mary C. O'Connor, eds., Changing Assessments: Alternative Views of Aptitude, Achievement, and Instruction (Boston, MA: Kluwer Academic Publishers, 1992), 37-75; U.S. Congress, Office of Technology Assessment, Testing in American Schools: Asking the Right Questions, OTA-SET-519 (Washington, D.C.: U.S. Government Printing Office, 1992).
2
O. W. Caldwell and S. A. Courtis, Then and Now in Education, 1845-1923: A Message of Encouragement from the Past to the Present (Yonkers-on-Hudson, NY: World Book, 1925), 180-181.
3
Harlan H. Horner, Education in New York State, 1784-1954 (Albany, NY: University of the State of New York, State Education Department, 1954), 70.
4
Herbert E. Hawkes, E. F. Lindquist, and C. R. Mann, The Construction and Use of Achievement Examinations: A Manual for Secondary School Teachers (Boston, MA: Houghton Mifflin, 1936), iv.
5
George F. Madaus et al., The Influence of Testing on Teaching Math and Science in Grades 4-12: Executive Summary (Boston, MA: Boston College, Center for the Study of Testing, Evaluation, and Educational Policy, 1992), 1.
6
Testing in American Schools: Asking the Right Questions, 64-65; National Council on Education Standards and Testing, Raising Standards for American Education: A Report to Congress, the Secretary of Education, the National Education Goals Panel, and the American People (Washington, D.C.: U.S. Government Printing Office, 1992), 12; See also The Influence of Testing on Teaching Math and Science in Grades 4-12: Executive Summary, 18.
7
Raising Standards for American Education; Edward Silver, "Assessment and Mathematics Education Reform in the United States," International Journal of Educational Research, in press; Andrew C. Porter, "Assessing National Goals: Some Measurement Dilemmas," The Assessment of National Educational Goals: Invitational Conference Proceedings (Princeton, NJ: Educational Testing Service, 1970), 21-42; Lorrie Shepard, "Inflated Test Score Gains: Is the Problem Old Norms or Teaching to the Test?" Educational Measurement: Issues and Practice 9 (1990): 15-22; Testing in American Schools: Asking the Right Questions.
8
Thomas Romberg, E. Anne Zarinnia, and Kevin F. Collis, "A New World View of Assessment in Mathematics," in Gerald Kulm, ed., Assessing Higher Order Thinking in Mathematics (Washington, D.C.: American Association for the Advancement of Science, 1990), 21-38; Thomas Romberg, "Evaluation: A Coat of Many Colors" (Paper presented at the Sixth International Congress on Mathematical Education, Budapest, Hungary, July 27-August 3, 1988), Division of Science, Technical and Environmental Education, UNESCO; Edward Silver and Patricia Kenney, "Sources of Assessment Information for Instructional Guidance in Mathematics," in Thomas Romberg, ed., Reform in School Mathematics and Authentic Assessment (in press); Edward Silver, Patricia Kenney, and Leslie Salmon-Cox, The Content and Curricular Validity of the 1990 NAEP Mathematics Items: A Retrospective Analysis (Pittsburgh, PA: Learning Research and Development Center, University of Pittsburgh, 1991); Richard Lesh and Susan J. Lamon, eds., Assessment of Authentic Performance in School Mathematics (Washington, D.C.: American Association for the Advancement of Science, 1992).
9
From Gatekeeper to Gateway; Walter M. Haney, George F. Madaus, and Robert Lyons, The Fractured Marketplace for Standardized Testing (Boston, MA: Kluwer Academic Publishers), 319-323.
10
Eva L. Baker and Harold F. O'Neil, Jr., "Diversity, Assessment, and Equity in Educational Reform" (Paper presented at the Ford Foundation Symposium on Equity and Educational Testing and Assessment, Washington, D.C., 11-12 March 1993).
11
"Symbolic Validation"; Daniel M. Koretz et al., Statement before the Subcommittee on Elementary, Secondary, and Vocational Education, Committee on Education and Labor, U.S. House of Representatives (19 February 1992); Eva L. Baker, Harold F. O'Neil, and Robert L. Linn, "What Works in Performance Assessment" (Sherman Oaks, CA: Horace Design Information, Inc., Draft final report, September 1992); Eva L. Baker, "The Role of Domain Specifications in Improving the Technical Quality of Performance Assessment" (Los Angeles, CA: National Center for Research on Evaluation, Standards, and Student Testing, 1992); Norman Webb and E. Yasui, "Alternative Approaches to Assessment in Mathematics and Science: The Influence of Problem Context on Mathematics Performance," CSE Technical Report 346 (Los Angeles, CA: National Center for Research on Evaluation, Standards, and Student Testing, 1992).
12
Stephen B. Dunbar and Elizabeth A. Witt, "Design Innovations in Measuring Mathematics Achievement" (Paper commissioned by the Mathematical Sciences Education Board, September 1993, appended to this report).
13
For a discussion of relevant technical issues, see "Design Innovations in Measuring Mathematics Achievement" and others; Stephen B. Dunbar, Daniel M. Koretz, and H. D. Hoover, "Quality Control in the Development and Use of Performance Assessments," Applied Measurement in Education 4:4 (1991), 289-304; M. Magone et al., "Validity Evidence for Cognitive Complexity of Performance Assessments: An Analysis of Selected QUASAR Tasks" (Paper presented at the annual meeting of the American Educational Research Association, San Francisco, CA, April 1992); Daniel M. Koretz et al., "The Effects of High-Stakes Testing on Achievement: Preliminary Findings about Generalization Across Tests" (Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL, April 1991); Robert Glaser, Kalyani Raghavan, and Gail P. Baxter, Cognitive Theory as the Basis for Design of Innovative Assessment (Los Angeles, CA: The Center for Research on Evaluation, Standards, and Student Testing, 1993).
