Student Thinking and Related Assessment:
Creating a Facet-Based Learning Environment
Jim Minstrell
From the research literature we know that students come to our classes with
preconceptions. Over the past 30 years there has been considerable research on
students' conceptions. In a classic popularized article, McCloskey et al. (1980)
identified several misconceptions in mechanics that they described as being con-
sistent with the impetus theory, which predominated before Newton's synthesis.
More recently, summaries of students' conceptual difficulties across the sciences
have been published (Driver et al., 1994; Gabel et al., 1994; Project 2061, 1993).
There is even at least one summary of international research on students' conceptions
(Duit et al., 1991). How can these research results be incorporated into
mainline assessment, curriculum, and instruction?
In topics new to their experience and thinking, learners construct understand-
ing during class activities. The list of students' conceptions and reasoning has
grown to be quite extensive and continues to grow. Consider the following
student ideas:
· Are these ideas wrong?
· To find the average speed, divide the final position by the final time.
· Heavier things fall faster. Extremely light things don't even fall.
· A forward force is necessary to keep an object moving in the forward
direction at a constant speed.
· Objects don't weigh anything in space.
· Balanced forces can't apply to both an at-rest object and an object moving
at a constant velocity.
· In an interaction the bigger/heavier object exerts the greater force.
· The more pulleys the greater the mechanical advantage, or the less force
one will need to exert.
· More batteries will make the bulb brighter.
Most of these statements seem valid on the surface. Several are true, depending
on the context in which they are used. How can we honor the "sense making"
learners have done and yet help them move toward a more scientific understanding?
What can research reveal about students' thinking, and what are the implica-
tions for instruction and assessment? This chapter illustrates some aspects of
students' thinking, suggests a "facets of thinking" approach to organizing stu-
dents' thinking, and shows that the facets approach can be useful to teachers in
diagnosing student difficulties and designing or choosing instruction to address
those difficulties. If it can be useful to teachers to effect better learning, it makes
sense to incorporate the perspective into classroom assessment and even large-
scale assessment in order to inform decisions at the program and policy levels.
The purpose of the chapter is to demonstrate that research on learning and teach-
ing can be used effectively to inform curriculum, instruction, and assessment at
both the policy and especially the classroom levels.
THINKING ABOUT STUDENTS' THINKING
Background
Consider the following question:
A huge, strong magnet and a tiny, weak magnet are brought near each
other. Which of the following statements makes the most sense to you?
A. The huge magnet exerts no force on the small one, which exerts no
force on the large one.
B. The huge magnet exerts more force on the small magnet than the
small one exerts on the large one.
C. The huge magnet exerts the same force on the small magnet as the
small magnet exerts on the large one.
D. The huge magnet exerts less force on the small magnet than the
small magnet does on the large one.
E. The huge magnet exerts no force on the small magnet, which does
exert force on the large one.
Briefly explain how you decided.
Readers can most likely predict which is the most popular answer. In our classes,
prior to instruction, nearly 85 percent of the students pick B and justify the choice
by citing the fact that the one magnet is larger and stronger and therefore capable
of exerting the larger force. It is also interesting that about 15 percent choose C.
In this case their rationale comes from authority: "I remember that for every
action there is an equal reaction." Asked to cite experience consistent with this
idea, students report remembering "reading it in a book" or "hearing it from a
former teacher." This does not represent an adequate understanding.
Consider a second question:
Sam is taller, stronger, and heavier than Shirley. They are both standing
on level ground and lean on each other back to back without falling.
Which seems to make the most sense with respect to the forces they
exert on each other?
A. Sam exerts a greater force on Shirley.
B. Sam and Shirley exert equal forces on each other.
C. Shirley exerts a greater force on Sam.
D. Neither exerts a force on the other.
Briefly explain.
With the Sam and Shirley problem the reader may have more difficulty predict-
ing the outcomes. In our classes about 50 percent of the students suggest that
Sam will exert the larger force "because he is bigger and/or stronger." About 20
percent of the students suggest Shirley will exert the greater force, citing such
evidence as, "she has the angle on Sam" or "he is just leaning [passive], but she
will have to be pushing [active] to keep them from falling over." Nearly 30
percent suggest they will exert equal forces. While some students cite knowledge
learned from authority, many cite as evidence the fact that "nobody is winning"
and "they are not falling over" [no effect].
From these and similar questions it appeared that students were attending to
surface features of problem situations rather than understanding and applying
principles. From a formal physics perspective, it is clear the students are not
being consistent. After all, these are both "third law" [Newton] questions and the
students are not answering them the same. On the other hand, looking at the
questions from the students' viewpoints, the questions are very different. The
salient features in the two situations are different. In constructing their solutions
the learners were considering such features as size, strength, "winning" or result-
ing movement effects, and level of activity or passivity of the interacting objects.
A tenet from cognitive psychology is that learners are naturally mentally
active (Bruer, 1993). As humans, we try to make sense of the natural world and
human-made artifacts in it. We organize it initially by surface features and then
react on the basis of recognition of patterns. We see what we perceive to be a
similar situation and make a similar prediction or action. If something does not
work out as expected, we attempt to reorganize our understanding. It is around
these impasses, where ideas do not work, that change in our thinking results.
Making the leap to abstract scientific principles, like Newton's Third Law, to
organize the world phenomena does not come naturally or quickly. It takes
opportunities for development and time to develop our thinking to that level of
principled performance.
To better understand my students' thinking so that I could create better
instruction to address their cognition, I tried to think about the physical world like
a student does. I assumed my students were trying to make sense of their world.
I read their solutions and listened to their ideas with an eye and ear tuned to
search for features that seemed to make sense to them in limited contexts.
From the field of research on students' conceptions and reasoning, I began
identifying and organizing student thinking associated with various problematic
situations. I identified the individual sorts of thinking (which I call facets) and
clustered them around certain situations or ideas. I call these facet clusters
(Minstrell, 1992). The term facets was used to avoid the "baggage" that goes
with such terms as misconceptions or alternative conceptions. In fact, much of
the thinking of students is useful and can be built upon, but it does not appear to
be theoretically based, such as what would be part of an impetus theory or
Newtonian theory. It seems rather to be based on salient features and a construction
of explanations from "pieces" of understanding (diSessa, 1993).
Facets of Thinking
Facets are used to describe students' thinking as it is seen or heard in the
classroom. Facets of students' thinking are individual pieces or constructions of
a few pieces of knowledge and/or strategies of reasoning. While the facets approach assumes
a "knowledge in pieces" perspective like that of diSessa (1993), the pieces are
generally not as small as the phenomenological primitives (p-prims) assumed by
diSessa. Facets have been derived from research on students' thinking and from
classroom observations by teachers. They are convenient units of thought for
characterizing and analyzing students' thinking in the interest of making deci-
sions to effect specific reform of curriculum, instruction, and assessment. Since
facets are only slight generalizations from what students actually say or do in the
classroom, they can be identified by teachers and used by them to discuss the
phenomena of students' ideas. Some are content specific; for example,
"horizontal movement makes a falling object fall more slowly." Others are
strategic, like "average velocity is half the sum of the initial and final velocities"
(in any situation). Still others are generic ("more implies more"), such as "the
more batteries, the brighter the bulb." Typically they are (or seem to be) valid,
depending on the context of usage.
Facet clusters are sets of related facets, grouped around a physical situation
(e.g., forces on interacting objects) or around some conceptual idea (e.g., meaning
of average velocity). Within the cluster, facets are sequenced in an approximate
order of development and for recording purposes are coded numerically (see
Figures 3-1 and 3-2). Those ending with 0 or 1 in the units digit tend to be
appropriate, acceptable understandings for introductory physics. The facets ending
in 9, 8, or so tend to be the more problematic facets in that, if they are not dealt
with during instruction, the student will likely have a great deal of trouble with
this cluster and with ideas in related clusters. For example, if students do not
differentiate average speed from a change in position (facet 229-3), they will
have great difficulty understanding many other ideas about motion. For some
facets there are several "subspecies." For example, 229 has three ways that it
represents what students do when they do not separate average rate (speed/
velocity) from amount of distance or displacement. Those facets with middle
digits frequently arise from formal instruction, but the student may have overgeneralized
or undergeneralized the application of an appropriate principle. The
numerical code is intended as a descriptive aid. Thus, rather than simply providing a score,
the codes suggest what specifically needs to be addressed and where
specific deficiencies exist. For additional information on facets and clusters see
the following two Web sites: http://weber.u.washington.edu/~huntlab/diagnoser/facetcode.html
and www.talariainc.com.
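As a rough illustration of how the units digit of a facet code carries diagnostic meaning, the following Python sketch sorts codes into three bands. The function name `facet_band`, the band labels, and the cutoff of 8 for the "problematic" band are assumptions made here for illustration only; the chapter says just "9, 8, or so."

```python
def facet_band(code: str) -> str:
    """Classify a facet code by the units digit of its base number.

    Assumed bands (illustrative, not an official scheme):
      0 or 1 -> "goal"          acceptable understanding
      8 or 9 -> "problematic"   needs attention during instruction
      other  -> "intermediate"  often arises from formal instruction
    """
    base = code.lstrip("*").split("-")[0]  # "*470" -> "470", "229-3" -> "229"
    units = int(base[-1])
    if units in (0, 1):
        return "goal"
    if units >= 8:
        return "problematic"
    return "intermediate"


for example in ["470", "229-3", "225-1", "310-2"]:
    print(example, "->", facet_band(example))
```

Because the band depends only on the base number, subspecies such as 229-3 inherit the band of facet 229.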
FIGURE 3-1 Cluster 470: forces on interacting objects.
*470 All interactions involve equal magnitude and oppositely directed action and reaction
forces that are on the separate interacting bodies
474 Effects (such as damage or resulting motion) dictate relative magnitudes of forces
during interaction.
    At rest, therefore interaction forces balance.
    "Moves," therefore interacting forces unbalanced.
475 Equal force pairs are identified as action and reaction but are on the same object
476 Stronger exerts more force
477 One with more motion exerts more force
478 More active/energetic exerts more force
479 Bigger/heavier exerts more force
FIGURE 3-2 Cluster 220: meaning of average speed or average velocity.
*220 avg. speed = (total distance covered)/(total amount of time)
*221 avg. velocity = Δx/Δt (together with a direction)
225 Rate expression is over-generalized
225-1 avg. v = (vf + vi)/2 unless compensation between low and high values occurs,
e.g., acceleration is constant
225-2 avg. v = xf/tf
226 Rate expression misstated
226-1 avg. v = Δt/Δx, i.e., change in time divided by change in position.
226-2 avg. v = Δv/2
226-3 avg. v = vf/2
226-4 avg. v = (vf + vi)/Δt
228 Average rate not differentiated from another rate
228-1 avg. v means constant velocity
228-2 Velocity = speed. Student doesn't differentiate between velocity and speed.
228-3 avg. v = vf, i.e., average v is the same as the final v.
228-31 greatest avg. vel. = greatest vf during any part of trip
228-4 avg. v = avg. a
228-5 avg. v = Δv or Δv divided by a quantity other than Δt
229 Average rate (speed/velocity) not differentiated from amount of distance or
displacement.
229-2 avg. v = pf, i.e., the final position
229-21 avg. v = avg. p
229-3 avg. v = Δp
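To make the contrast between facet 220 and the over-generalized facet 225-1 concrete, here is a small Python sketch comparing "total distance over total time" with the "half the sum of initial and final velocities" shortcut. The two trips and all numbers are hypothetical examples chosen for easy arithmetic, not data from the chapter.

```python
def average_speed(total_distance, total_time):
    """Facet 220: average speed = total distance / total time."""
    return total_distance / total_time


def half_sum(v_initial, v_final):
    """Shortcut behind facet 225-1: (vf + vi) / 2."""
    return (v_final + v_initial) / 2


# Trip 1 (hypothetical): constant acceleration from 0 to 20 m/s over 10 s,
# which covers 100 m. Here the shortcut agrees with the definition.
trip1_true = average_speed(100.0, 10.0)
trip1_shortcut = half_sum(0.0, 20.0)

# Trip 2 (hypothetical): 10 m/s for 30 s, then 30 m/s for 10 s.
# Acceleration is not constant, so the shortcut no longer applies.
legs = [(10.0, 30.0), (30.0, 10.0)]  # (speed in m/s, duration in s)
distance = sum(speed * duration for speed, duration in legs)
elapsed = sum(duration for _, duration in legs)
trip2_true = average_speed(distance, elapsed)
trip2_shortcut = half_sum(legs[0][0], legs[-1][0])

print("Trip 1:", trip1_true, "vs shortcut", trip1_shortcut)
print("Trip 2:", trip2_true, "vs shortcut", trip2_shortcut)
```

The shortcut is valid in the first trip and misleading in the second, which is the sense in which such a facet can be "valid, depending on the context of usage."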
INSTRUCTIONAL DESIGN BASED ON STUDENTS' THINKING
Using Facets to Create a Facet-Based Learning Environment
This section demonstrates how having information from facet assessment
can inform instructional decisions. Whether an assessment is done in the class-
room or on a larger scale, such as state or national assessments, the results and
implications must eventually be fed back to teachers to affect programs and
instruction. Thus, the facet assessment examples presented here are at the class-
room interface between teacher, student, and curriculum. Likewise, assessment
implications can also affect curriculum development or adaptation to better
address targeted learning difficulties with respect to particular learning goals
(e.g., standards).
I will describe how the research on facets is used to create a facet assessment-
based learning environment. The purpose of the environment will be to build
from assessments of students' initial and developing ideas toward a more prin-
cipled understanding. Facets are used to diagnose students' ideas and to direct
the choice or design of instructional activities (Minstrell, 1989; Minstrell and
Stimpson, 1996). The main body of this paper discusses the value of teachers
having, and being able to use, facets and facet clusters. A particular facet cluster
is used to demonstrate the creation of such a facet assessment-based learning
environment.
Goals in our introductory physics course include understanding the nature of
gravity and its effects and understanding the effects of ambient fluid (e.g., air or
water) mediums on objects in them, whether the objects are at rest or moving
through the fluid. For many introductory physics students, an initial difficulty
involves a confusion between which effects are effects of gravity and which are
effects of the surrounding medium. When one attempts to weigh something, does
it weigh what it does because the air pushes down on it? Or is the scale reading
that would give the true weight of the object distorted somehow because of the
air? Or is there absolutely no effect by air? Because these have been issues for
beginning students, the students are usually highly motivated to engage in
thoughtful discussion of the issues.
Assessment for Eliciting Students' Ideas Prior to Instruction in Order to
Build an Awareness of the Initial Understanding
At the beginning of several units or subunits, a preinstruction quiz is admin-
istered. One purpose is to provide the teacher with knowledge of the related
issues in the class in general and to provide specific knowledge of which students
exhibit what sorts of ideas. A second reason is to help students become more
aware of the content and issues involved in the upcoming unit.
To get students involved in separating effects of gravity from effects of the
ambient medium, we use the following question associated with Figure 3-3.
"First, suppose we weigh some object on a large spring scale, not unlike the ones
we have at the local market. The object apparently weighs ten pounds, according
to the scale. Now we put the same apparatus, scale, object and all, under a very
large glass dome, seal the system around the edges, and pump out all the air. That
is, we use a vacuum pump to allow all the air to escape out from under the glass
dome. What will the scale reading be now? Answer as precisely as you can at
this point in time. [pause] And, in the space provided, briefly explain how you
decided." Thus, students' ideas are elicited. (I encourage the reader to answer
this question now as best, and as precisely, as possible.)
FIGURE 3-3 Preinstruction question. Nature and Effects of Gravity Diagnostic Quiz, Problem 1. [Drawing: the scale and object in air, "Scale reading = 10.0 lbs."; the same apparatus under a glass dome with air removed, "Scale reading = ____ lbs."; "Briefly explain how you decided."]
Students write their answers and rationale. From their words a facet diagnosis
can be made relatively easily. The facets associated with this cluster, "Separating
medium effects from gravitational effects," can be seen in Figure 3-4. Students
who give an answer of zero pounds for the scale reading in a vacuum usually are
thinking that air only presses down and that "without air there would be no
weight, like in space" (facet 319). Other students suggest a number "a little less
than 10" because "air is very light, so it doesn't press down very hard, but it does
press down some"; thus, taking the air away will only decrease the scale reading
slightly (facet 318). Other students suggest there will be no change at all. "Air
has absolutely no effect on scale reading." This answer could result either from
a belief that mediums do not exert any forces or pressures on objects in them
(facet 314) or that fluid pressures on the top and bottom of an object are equal
(facet 315). A few students suggest that while there are pressures from above and
below there is a net upward pressure by the fluid. "There is a slight buoyant
force" (facet 310, an acceptable workable idea at this point). Finally, a few
students answer that there will be a large increase in the scale reading "because of
the [buoyant] support by the air" (facet 317).
FIGURE 3-4 Separating medium effects from gravitational effects.
*310 Pushes from above and below by a surrounding fluid medium lend a slight
support (net upward push due to differences in depth pressure gradient)
*310-1 The difference between the upward and downward pushes by the surrounding air
results in a slight upward support or buoyancy.
*310-2 Pushes above and below an object in a liquid medium yield a buoyant upward force
due to the larger pressure from below.
*311 A mathematical formulaic approach (e.g., rho × g × h1 - rho × g × h2 = net buoyant
pressure)
314 Surrounding fluids don't exert any forces or pushes on objects
315 Surrounding fluids exert equal pushes all around an object
315-1 Air pressure has no up or down influence (neutral)
315-2 Liquid presses equally from all sides regardless of depth
316 Whichever surface has greater amount of fluid above or below the object has
the greater push by the fluid on the surface.
317 Fluid mediums exert an upward push only
317-1 Air pressure is a big up influence (only direction)
317-2 Liquid presses up only
317-3 Fluids exert bigger up forces on lighter objects
318 Surrounding fluid mediums exert a net downward push
318-1 Air pressure is a down influence (only direction)
318-2 Liquid presses (net press) down
319 Weight of an object is directly proportional to medium pressure on it
319-1 Weight is proportional to air pressure.
319-2 Weight is proportional to liquid pressure
The numbering scheme for the facets allows for more than simply marking
the answers "right" or "wrong." The codes ending with a high digit (9, 8, and
sometimes 7) represent common facets used by our students at the beginning of
instruction. Codes ending in 0 or 1 are used to represent goals of instruction. The
latter abstractions represent the sort of reasoning or understanding that would be
productive at this level of learning and instruction. Middle number codes represent
some learning. When data are coded, the teacher/researcher can visually
scan the class results to identify dominant targets for the focus of instruction.
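The scan of coded class results described above could also be done with a simple tally. The sketch below is one possible way to do it in Python; the list of coded responses is invented for illustration (it is not class data from the chapter), and the helper name `tally_facets` is hypothetical.

```python
from collections import Counter

# Hypothetical facet codes assigned to ten students' written answers.
coded_responses = [
    "319", "318", "318-1", "314", "315",
    "310", "319", "318", "317", "315",
]


def tally_facets(codes):
    """Count responses per facet, folding subspecies (e.g., 318-1) into the base facet."""
    return Counter(code.split("-")[0] for code in codes)


counts = tally_facets(coded_responses)
dominant_facet, dominant_count = counts.most_common(1)[0]

for facet in sorted(counts):
    print(facet, counts[facet])
print("Most common facet:", dominant_facet, "with", dominant_count, "responses")
```

Folding subspecies into the base facet is a design choice made here to keep the summary short; a teacher interested in the air-versus-liquid distinction would keep the full codes.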
Benchmark Instruction to Initiate Change in Understanding and Reasoning
By committing their answers and rationale to paper, students express greater
interest in coming to some resolution, in finding out what is "right." Students are
now motivated to participate in activities that can lead to resolution. In the
classroom this benchmark lesson usually begins with a discussion of students'
ideas. We call this stage "benchmark instruction" since the lesson tends to be a
reference point for subsequent lessons (diSessa and Minstrell, 1998). It unpacks
the issues in the unit and provides clues to potential resolution of those issues. In
this stage, students are encouraged to share their answers and associated ratio-
nales. Teachers attempt to maintain neutrality in leading the discussion, both to
allow issues to be brought forth by students to maintain a focus on their thinking
and to honor the potential validity of students' facets of knowledge and reasoning
(van Zee and Minstrell, 1997).
Note that many of the ideas and their corresponding facets have validity.
Facet 319: Some students have suggested a valid correlation between no air in
space and no apparent weight in space. What they have not realized is that in an
earth-orbiting shuttle one would likely get a zero spring scale reading, whether in
the breathable air inside the shuttle or the airless environment outside. Facet 318:
It is true that air is light, that is, its density is low relative to most objects we put
in it. Air does push downward, but it also pushes in other directions. Facet 317:
Air does help buoy things up, but the buoyant force involves a resolution of the
upward and downward forces by the fluid, and that effect is relatively small on
most objects in air (not so for a helium balloon). Facet 315: For many situations
the difference between the up and down forces by air is so small that even the
physicist chooses to ignore it. Thus, there is validity to most of the facets of
understanding and reasoning used by students as they attempt to understand and
reason about this problem situation.
By now many threads of students' present understanding of the situation are
unraveled and lie on the table for consideration. The next phase of the discussion
moves toward allowing fellow students to identify strengths and limitations of the
various suggested individual threads. "Is this idea ever true? When and in what
contexts? Is this idea valid in this context? Why or why not?" After seeing the
various threads unraveled, students are motivated to know "what is the truth."
The teacher asks: "How can we find out what happens?" Students readily sug-
gest: "Try it. Do the experiment and see what happens." The experiment is run,
air is evacuated, and the result is "no detectable difference" in the scale reading in
the vacuum versus in air.
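A back-of-the-envelope estimate suggests why the difference is too small to detect on a market-style spring scale. The sketch below uses the standard result that the buoyant force on a fully immersed object equals the weight of the displaced fluid, so the ratio of buoyant force to weight is the ratio of fluid density to object density. The air density of about 1.2 kg/m^3 is a typical room-condition value, and the object density of 1000 kg/m^3 is a hypothetical choice; neither number comes from the chapter.

```python
AIR_DENSITY = 1.2        # kg/m^3, approximate value at room conditions (assumed)
OBJECT_DENSITY = 1000.0  # kg/m^3, hypothetical object about as dense as water


def buoyant_fraction(fluid_density, object_density):
    """Buoyant force divided by weight for a fully immersed object."""
    return fluid_density / object_density


reading_in_air_lb = 10.0
fraction = buoyant_fraction(AIR_DENSITY, OBJECT_DENSITY)

# In air the scale supports (weight - buoyant force) = weight * (1 - fraction),
# so the reading in a vacuum is the reading in air divided by (1 - fraction).
reading_in_vacuum_lb = reading_in_air_lb / (1.0 - fraction)
change_lb = reading_in_vacuum_lb - reading_in_air_lb

print("Buoyant force as a fraction of weight:", fraction)
print("Estimated change in scale reading (lb):", round(change_lb, 4))
```

Under these assumptions the reading rises by roughly a hundredth of a pound, consistent with facet 310 (a slight net upward push by the air) and with the classroom observation of no detectable difference.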
Facet-Informed Elaboration Instruction to Explore Contexts of
Application of Other Threads Related to New Understanding and Reasoning
The initial activity was to address facet 319, considered the problematic
understanding. But many of the students also thought that air only pushed down
or only pushed up. Additional discussion and laboratory investigations allow
students to test the contexts of validity for other threads of understanding and
reasoning. Other activities involving ordinary daily experiences are brought out
for investigation: an inverted glass of water with a plastic card over the opening
(the water does not come out), a vertical straw dipped in water and a finger placed
over the upper end (the water does not come out of the lower end until the finger
is removed from the top), an inverted cylinder is lowered into a larger cylinder of
water (it "floats" and as the inverted cylinder is pushed down, one can see the
water rise relative to the inside of the inverted cylinder), and a 2-liter, water-
filled, plastic soda bottle with three holes at different levels down the side
(uncapped, water from the lowest hole comes out fastest; capped, air goes in the
top hole and water comes out the bottom hole). These activities address students'
hypotheses consistent with facets 318 and 317.
While each experiment is a new specific context, the teacher encourages the
students to come to general conclusions about the effects of the surrounding fluid.
"What can each experiment tell us that might relate to all of the other situations,
including the original benchmark problem?" In addition to encouraging addi-
tional investigation of issues, the teacher can help students note the similarity
between what happens to an object submerged in a container of water and what
happens to an object submerged in the "ocean of air" around the earth.
A final experiment for this subunit affords students the opportunity to try
their new understanding and reasoning in another more specific context. A solid
metal slug is "weighed" successively in air, partially submerged in water (scale
reading is slightly less), totally submerged just below the surface of the water
(scale reading is even less), and totally submerged deep in a container of water
(scale reading is the same as any other position, as long as it is totally sub-
merged). From the scale reading in air, students are asked to predict (qualita-
tively compare) each of the other results, do the experiment, record their results,
and, finally, interpret those results. This activity specifically addresses the
students' hypotheses associated with facets 316, 315, and 314. This task asks the
students to relate these results and the results of the previous experiments to the
original benchmark experience.
By seeing that air and water have similar fluid properties, students are pre-
pared to build an analogy between results. Weighing in water is to weighing out
of water (in air) as weighing in the ocean of air is to weighing out of the ocean of
air (in a vacuum). Thus, students are now better prepared to answer the original
question about weighing in a surround of air, and they have developed a more
principled view of the situation. Since students' cognition is associated with the
specific features of each situation, a paramount task for instruction is to help
students recognize the common features that cross the various situations. Part of
coming to understand physics is coming to see the world differently, but the
general principled view can be constructed inductively from experiences and
from the ideas that apply across a variety of specific situations.
The facets are our representation of the students' ideas. They originate and
are used by the students, although they may be elicited from the students by a
skilled instructor or within the design of assessment items. Thus, the generalized
understandings and explanations are constructed by students from their own
earlier ideas. In this way I am attempting to bridge from students' ideas to the
formal ideas of physics.
FIGURE 3-9 Examples of other relevant DIAGNOSER questions.
TABLE 3-1 Student Preinstruction Predictions for Scale Reading
Scale Reading(a)    Percent    Facet Code
s ≥ 20 lbs.            2       317
20 > s > 11           11       317
11 ≥ s > 10            3       310
s = 10                35       314/315
10 > s ≥ 9            12       318
9 > s > 1             17       318
1 ≥ s ≥ 0             20       319
Note: Table is ordered by predicted scale reading answer followed by the inferred facet associated
with that answer.
(a) Represents the predicted scale reading.
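The percentages in Table 3-1 can be grouped by facet to see how the class was distributed before instruction. The short Python sketch below does that grouping; the percent and facet values are taken from the table, while the variable and function names are hypothetical.

```python
# (percent of students, inferred facet code) for each row of Table 3-1
rows = [
    (2, "317"), (11, "317"), (3, "310"), (35, "314/315"),
    (12, "318"), (17, "318"), (20, "319"),
]


def percent_by_facet(table_rows):
    """Sum the row percentages for each facet code."""
    totals = {}
    for percent, facet in table_rows:
        totals[facet] = totals.get(facet, 0) + percent
    return totals


totals = percent_by_facet(rows)
pressure_down = totals["318"] + totals["319"]

print(totals)
print("Facets 318 and 319 combined:", pressure_down, "percent")
```

The combined share for facets 318 and 319 is the 49 percent cited in the discussion of preinstruction results.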
after students completed the elaboration experiences for a similar multiple-choice
question and reasoning combination, 81 percent of the answers to the phenom-
enological question and 59 percent of the answers to the reasoning were coded
310.
Apparently revisiting the "object in fluid" context in subsequent instruction
helped maintain the most productive level of understanding and reasoning about
buoyancy at nearly the 60 percent level. By the end of the first semester, the class
had integrated force-related ideas (statics and dynamics) into the context of fluid
effects on objects submerged in the fluid medium. On a question in this area 60
percent of the students chose, and then briefly defended in writing, an answer
coded 310. On the end-of-year final, 55, 56, and 63 percent of students chose the
answer coded 310 on three related questions.
At the other end of the understanding and reasoning spectrum of facets is a
substantial development away from believing that "downward pressure causes
gravitational effects" (facet 319) and "fluid mediums push mainly in the downward
direction" (facet 318). On the free-response preinstruction assessment,
these two facets accounted for 49 percent of the data (see Table 3-1). In the
DIAGNOSER those facets accounted for about 5 to 20 percent of the data.
Similar results were achieved on both the first- and second-semester finals. Much
of this movement away from the problematic "pressure down" facets did not
make it all the way to the most productive facet. Much student thinking moved to
intermediate facets that involve thinking that there are no pushes by the surround-
ing fluid of air (facet 314) or that the pushes up and down by the surrounding air
are equal (facet 315). Most of the students were not stuck on these intermediate
facets in the water context. This makes sense since they have direct evidence that
water pressure at different depths causes a difference in the scale reading. In the
air case the preponderance of the evidence is that if there is any difference
because of depth it does not matter (e.g., force diagrams on a metal slug hung in
the classroom do not usually include forces by the surrounding air). Even low-achieving
students made significant gains (see Tables 3-2 and 3-3). The semester
test questions used were similar to the DIAGNOSER questions shown earlier.
Also from Tables 3-2 and 3-3 it can be seen that individual students do not
always answer in consistent fashion. Across items and across time individual
students exhibit various facets of thinking. Which pieces of their knowledge and
understanding are brought to a particular problem depend on the features of the
problem. Early in the instruction it is the salient physical or verbal features of the
problem. At this time there is considerable inconsistency between their answers
to problems that might be seen as similar when the questions are organized by
formal topic (recall the questions about Newton's Third Law). Later in the
TABLE 3-2 Development of Understanding and Reasoning: Forces by
Surrounding Air on Objects
Facet Code Preair Prewater Seml Sem2
310 16, 55 64, 72, 74 5, 16, 19, 21, 25,
64, 72
315 66, 74 16, 31, 53, 8, 27, 31, 53, 55,
66, 69 66, 69, 74
317 21
318 7, 19, 25, 27, 5, 7, 8, 19, 7
64, 69 21, 27, 55
319 5, 8, 31, 53, 72 25
Note: Numbers at right are identification numbers for 16 low-achieving students.
TABLE 3-3 Development of Understanding and Reasoning: Forces by Surrounding Water on Objects (four days after preair)
Facet Code  Preair  Prewater                  Sem1                                             Sem2
310                 7, 8, 16, 25, 55, 64, 74  5, 7, 8, 16, 19, 21, 27, 53, 55, 64, 66, 72, 74  5, 7, 16, 19, 21, 25, 27, 31, 53, 55, 64, 66, 72, 74
315                 5, 19, 27, 66             25, 31, 69                                       69
317
318
319
21, 31, 53, 69
72
Note: Numbers at right are identification numbers for 16 low-achieving students.
instruction, as students become more expert-like, their answers are based on
threads of experience and understanding that are more principle based (Chi et al.,
1981). Their answers become more consistent and converge on the target
understandings.
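The entries in Table 3-2 can be turned into simple counts per testing occasion. The following Python sketch is illustrative only. The dictionary layout and the function names facet_counts and share_at are hypothetical, and reading the three populated columns of the table as Preair, Sem1, and Sem2 is an assumption about the printed layout. It counts how many of the 16 low-achieving students were coded at each facet in the air context and what share were coded at the target facet 310.

```python
# Student IDs by facet code for the air context, transcribed from Table 3-2.
# Assumption: the three populated columns are Preair, Sem1, and Sem2.
AIR = {
    "Preair": {
        310: [16, 55],
        315: [66, 74],
        317: [21],
        318: [7, 19, 25, 27, 64, 69],
        319: [5, 8, 31, 53, 72],
    },
    "Sem1": {
        310: [64, 72, 74],
        315: [16, 31, 53, 66, 69],
        318: [5, 7, 8, 19, 21, 27, 55],
        319: [25],
    },
    "Sem2": {
        310: [5, 16, 19, 21, 25, 64, 72],
        315: [8, 27, 31, 53, 55, 66, 69, 74],
        318: [7],
    },
}


def facet_counts(by_facet):
    """Return the number of students coded at each facet, ordered by facet code."""
    return {facet: len(ids) for facet, ids in sorted(by_facet.items())}


def share_at(by_facet, facet):
    """Return the fraction of coded students who were coded at the given facet."""
    total = sum(len(ids) for ids in by_facet.values())
    return len(by_facet.get(facet, [])) / total


for occasion, by_facet in AIR.items():
    print(occasion, facet_counts(by_facet), share_at(by_facet, 310))
```

With so few students the shares are descriptive only, but the same bookkeeping would apply to any group whose responses have been coded by facet.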
Apparently about half of our students came to physics instruction believing
that air and perhaps even water pressure effects are mainly in the downward
direction. By the end of the year, through early specific instruction and later
revisiting, this belief was greatly reduced, and over half of the students were able
to demonstrate good productive understanding of buoyant effects. Given that this
is a difficult topic conceptually even for many physics teachers, these results are
encouraging.
Similar facet-based instruction is now being used by many physics teachers
and some curriculum developers (Hunt and Minstrell, 1994). Facet-based instruction
has also been effective in the learning of introductory statistics and in training
health care providers in the management of pain.
IMPLICATIONS FOR LARGE-SCALE ASSESSMENT
The examples given above are primarily from the classroom. That is the
source of most of our specific experience with facet-based assessment. But the
classroom is also where the results of large-scale assessments must make sense
and be useful if the large-scale assessments are to help effect reform and result in
better learning. We are beginning to explore the application of facet-based assess-
ment to large-scale assessment. Large-scale tests like the National Assessment of
Educational Progress or the state assessments could include facet-indexed foils
that could inform policy, program, and practice. While the preceding material is
based on many years of research and practice, below are some speculations as we
begin our exploration.
Implications for Policy, Program, and Practice
The National Science Education Standards advocate reform in assessment as
well as curriculum and instruction (National Research Council, 1996). The test
items and ranking purposes of the typical normative-based assessment system
will not be sufficient. Universities and employers may still need to rank appli-
cants against each other, and that has been accomplished reasonably well by
normative testing, such as the SAT (Scholastic Assessment Test). But in a
standards-based system, large-scale assessment needs to compare the perfor-
mance of the unit (state, district, school, or individual) with the standard. There
is a choice to be made for the criteria for making the comparison. One could set
the large-scale standard to be a certain score that is deemed sufficient for certifi-
cation. But such action would sidestep the intent of the standards effort. We
would not know what the troubles are at a level of specificity that can help decide
what to do about them. This would not be much different from what we presently
have with respect to assessment.
Suppose instead that the learning target standards are integrated with the
problematic understandings in facet clusters. Multiple-choice foils, or the rubrics
for coding open-response items, could be tuned to the facets. Such a large-scale
assessment system would be able to check on accountability for policy and pro-
gram revision, but it would also allow sufficiently rich feedback to inform the
system about what troubles exist. From identification of specific troubles, teach-
ers and others creating or adapting a curriculum could design or choose lessons to
address the problematic issues.
What might a test based on facets be like? To characterize thinking in any
one cluster for a group of students would likely require incorporating two
DIAGNOSER-type items, like those shown earlier, into each form of the test. If
the two items incorporate the reasoning as well as the phenomenological question,
that is like having four subitems per cluster. In our experience, responding
to these items takes about 1 minute per subitem, for a total of about 4 minutes
per cluster. At that rate we could test for 15 clusters per 60-minute test. For our
physics program there are about 40 clusters, but several are not unique to physics.
If a large-scale test is to cover the learning in science over a three-year period, I
estimate that would represent about 100 clusters. (Note: that is not 100 topics.
For example, the topics of force and motion would be represented by about eight
clusters.)
For large-scale assessment in which not every student needs to take the same
test, sampling procedures could be used to cover all clusters. Analysis from such
an assessment could provide information about specifically where students were
having trouble in each cluster. This is the sort of feedback that can inform
curriculum and instruction decisions as well as teachers about what needs to be
focused on in the classroom. It seems that something like this procedure could be
used for NAEP and some state tests.
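The time budget described above reduces to a few lines of arithmetic. The Python sketch below restates it; the constant and function names are hypothetical. The number of test forms it reports is not a figure from the text. It is a lower bound that assumes each sampled form covers a disjoint set of clusters.

```python
import math

MINUTES_PER_SUBITEM = 1   # about 1 minute per subitem
SUBITEMS_PER_ITEM = 2     # phenomenological question plus reasoning question
ITEMS_PER_CLUSTER = 2     # two DIAGNOSER-type items per cluster
TEST_MINUTES = 60         # length of one test form
TOTAL_CLUSTERS = 100      # estimate for three years of science learning


def clusters_per_form(test_minutes=TEST_MINUTES):
    """Number of clusters that fit on one form at about 4 minutes per cluster."""
    minutes_per_cluster = (
        MINUTES_PER_SUBITEM * SUBITEMS_PER_ITEM * ITEMS_PER_CLUSTER
    )
    return test_minutes // minutes_per_cluster


def forms_needed(total_clusters=TOTAL_CLUSTERS, test_minutes=TEST_MINUTES):
    """Minimum number of forms to cover every cluster once.

    Assumption (not from the text): forms do not overlap in the clusters
    they cover.
    """
    return math.ceil(total_clusters / clusters_per_form(test_minutes))


print(clusters_per_form())  # 15
print(forms_needed())       # 7
```

Any overlap between forms, for example to link results across forms, would raise the number of forms above this lower bound.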
What about large-scale assessments where all students are to take equivalent
forms of the same test? For example, in Washington state all students at grade 10
need to obtain a certificate of mastery in science. Would that imply that all
students would have to be tested over the same clusters and that from one year to
the next the clusters must be the same? Presently test developers include items in
topical headings. If each topic contains several clusters, perhaps test developers
could have the freedom to choose items from within clusters under the given
topic heading. For example, the test contractor for the state of Washington was to
choose or design two or three items associated with each topical strand. There
are about 40 topical strands in the state science standards. Within each topical
strand, I estimate there would be two or three clusters. Thus, I believe a facets
and facet cluster base could be used as the basis for constructing and choosing
items instead of using traditional methods or current ones. In this way the state
would be able to certify students as meeting the general standard for science
using a score from reduced test data. But the school could get facet cluster-based
data from which to make program decisions, and teachers could get facet-based
data from which they could make instructional decisions to improve practice and
learning.
A facet-based system can also be used to tune expected learning targets. The
setting of our present national and state standards is based to a considerable
extent on what we "want our students to know and be able to do." To a much
lesser extent, standards efforts have incorporated some implications from research
on what students "do know and are able to do," especially when we set goals for
"all" students. We could consider these goals as the top-level facets, but much
more research is needed to determine the problematic constructions by students
on their way to the goal (Minstrell, in preparation). This sets an item on the
agenda for research. In the past, research on learning was set largely in clinical or
classroom situations designed to teach particular topics, not particularly tuned to
learning the standards. To the extent that we collectively believe the standards
that have been set are the goals we want to achieve, we need to direct research on
learning in the disciplines toward identifying the problematic issues and under-
standing on the way to the goals. Then in our teaching experiments the problem-
atic ideas become the focus of our design of curriculum and instruction as we
attempt to guide students toward the standards.
Facet-based assessment can provide information from which we can decide
expected levels of understanding. If we had characterizations of various under-
standing and reasoning for students nationally, we might be better able to identify
reasonable targets for learning. For example, using the previously stated results,
is it reasonable to assume that all high school students can achieve the 310 level
of understanding for the air contexts as well as the water contexts? For air
contexts we might be willing to set the standard bar at 317 (air has some buoyant
upward force somehow) or 315 (air pushes from above and below are equal). Yet
requiring a 310 standard for all with respect to understanding water contexts
seems reasonable, since we see (from Tables 3-2 and 3-3) that the water context
is more achievable, even by lower achievers. Thus, a facet-like system can
provide information for making cost-benefit decisions. For example, knowing
that low-achieving students were diagnosed at 310 on the water cluster but at 315
in the air cluster would suggest that better activities are needed for demonstrating
the similarity of fluid characteristics of air and water. Can we afford the extra
instructional time to get from one level of understanding to the higher level?
Should we invest the extra time?
For making practical classroom "next day" decisions, one or more facet-
based questions can be used during one class period to inform the teacher about
tomorrow's needs. More questions per cluster will be needed in the long run for
periodic monitoring of learning by the teacher. Except on unit exams, the results
of the monitoring can be low-stakes assessment, with grades assigned only on the
basis of honest effort. Meanwhile, the results provide data from which teachers
can make decisions on what might happen next.
Developing students' understanding in a cluster takes instructional time.
Deep learning cannot be hurried. Judging from our experience in classes, it took
four to five hours in class for students to develop their understanding in one
cluster, like the clusters already demonstrated. Other clusters, such as the three
for developing ideas of length, area, and volume, can be taken together as part of
coming to understand spatial extent, about five hours at the high school level.
Still other clusters involve the processes of scientific thinking and can be assessed
across some of the other more subject-matter-oriented clusters. For example, the
cluster for the meaning of explanation in science (Figure 3-10) can be applied
across items that ask for explanation of specific events (e.g., explaining falling
bodies or interpreting the resulting offspring from parent plants). As can be seen
from Tables 3-2 and 3-3, not all of the understanding comes during the four days
of instruction in that cluster. Some comes through revisiting the ideas and issues
in subsequent subunits around related clusters. Thus, districts, departments, and
individual teachers need to decide which clusters are more important or more
difficult for their students and choose or design instruction to develop the more
important ideas.
Need for Ongoing Research on Learning and Teaching
Although we have a good start for developing facets as they apply to high
school physics, substantially more research needs to be done to characterize
students' thinking across sciences and across grade levels. Consistent with this
FIGURE 3-10 Cluster: explanations or interpretations of phenomena.
*050 Explanations or interpretations involve conceptual modeling of multiple related
science or math concepts, using experimental evidence and rational argument to address
questions of "how do you know . . . ?" or "why do you believe the results, observation, or
prediction?"
*051 Explanation involves a mathematical modeling approach, incorporating principles
subsumed under that model.
053 Explanation involves identifying possible mechanisms involving a single concept
causing the result.
055 Explanation involves identifying and stating a relevant concept.
057 Explanation constitutes a description of procedures that led to the result.
059 Explanations or interpretations are given by repeating
the observation or result to be explained.
vision, we initiated an investigation into students' facets of thinking in probability
and statistics at the introductory level at the university (Schaffner et al., 1997).
In collaboration with the University of Washington, the State Commission
on Student Learning, selected school districts, and Talaria, Inc., Earl Hunt and I are
directing the building of an assessment system to serve teachers as they focus on
students' learning. This project involves identifying facets and developing a facet-
based system for classroom assessment in the physical sciences and mathematics
relevant to the quantitative sciences for grades 6 through 10 for Washington state.
To follow this development, see the Web sites at
http://weber.u.washington.edu/huntlab/diagnoser/facetcode.html and www.talariainc.com.
Building a base of facets and facet clusters involves setting particular learn-
ing goals and doing the research to describe students' thinking in intermediate
positions on the way to those goals. The top-level facets need to be described at
a level of specificity that includes all of the "pieces" of knowledge and process-
ing necessary to operationally define the goal. For our example 310 facet, the
description fully written out is about a third of a page long. Defining these goals
at this level requires deep knowledge of the content domain. For a large-scale
facet assessment, the goals of learning will need to be carefully and specifically
articulated.
To identify the other facets requires research. What do learners say and do
when confronted with situations relevant to the learning goal? Some of the
research on students' conceptions exists in the literature, but much more needs to
be done in the context of the classroom. When we were building our present
version of the facets, we identified situations or tasks we thought students should
be able to explain if they had the goal understanding. Ideally, the tasks also
involved many of the key issues related to the cluster. We collected 50 or so
student responses to each task. As we read the responses, we sorted them accord-
ing to similarities in answers and reasoning. Then we attempted to characterize
the similarities among the several responses in one pile. Each characterization
was the first try at identifying a facet. Next, using another task that was relevant
to the same learning goal, the process was repeated for the responses to that
second task. If the characterization of one of the piles for this set seemed similar
to the characterization for a stack from the other set, we began to think we had
validity and reliability for identifying that particular facet. But since particular
tasks elicit particular ideas, not finding a similar pile for the second task analysis
did not mean that the facet was not valid. The showing of a particular facet
typically depends on context as well as content. To validate the facets associated
with large-scale assessment would necessitate substantially more research on
students' understanding of critical ideas in multiple contexts.
Once several facets in a particular cluster are identified, they can be used to
predict typical responses on other tasks related to the cluster. It takes creativity to
come up with novel problematic situations, but then the facets can be used to
suggest responses to open-ended questions or to create foils for multiple-choice
questions.
A facets perspective offers an opportunity to apply statistical analyses to
determine prerequisite knowledge for the development of understanding of more
complex ideas. Participation in large-scale assessment offers the opportunity to
do research to determine what development is dependent on the development of
what other ideas. Research on learning and teaching can benefit from develop-
ment of understanding of students' facets of thinking resulting from large-scale
assessment. Statistical analyses of large-scale test data could yield information
on what facet in one cluster is related to what facets in other clusters. Thus,
research on learning can identify what facets are in an ecological relationship
(one of mutual existence and support) with other facets. Such research could
inform curriculum program designers about what ideas to address as a set.
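As a small illustration of relating facets across clusters, the Python sketch below cross-tabulates the Sem2 facet codes of the low-achieving students in the air context (Table 3-2) against their codes in the water context (Table 3-3). The names invert and cross_tab are hypothetical, and treating the last populated column of each table as Sem2 is an assumption about the printed layout. With so few students this shows only the bookkeeping, not a statistical result.

```python
from collections import Counter

# Sem2 student IDs by facet code, transcribed from Tables 3-2 and 3-3.
# Assumption: the last populated column of each table is Sem2.
AIR_SEM2 = {
    310: [5, 16, 19, 21, 25, 64, 72],
    315: [8, 27, 31, 53, 55, 66, 69, 74],
    318: [7],
}
WATER_SEM2 = {
    310: [5, 7, 16, 19, 21, 25, 27, 31, 53, 55, 64, 66, 72, 74],
    315: [69],
}


def invert(by_facet):
    """Map each student ID to the facet code recorded for that student."""
    return {sid: facet for facet, ids in by_facet.items() for sid in ids}


def cross_tab(first, second):
    """Count (first facet, second facet) pairs over students coded in both."""
    facet_a, facet_b = invert(first), invert(second)
    shared = facet_a.keys() & facet_b.keys()
    return Counter((facet_a[sid], facet_b[sid]) for sid in shared)


table = cross_tab(AIR_SEM2, WATER_SEM2)
for (air_facet, water_facet), count in sorted(table.items()):
    print(f"air {air_facet} / water {water_facet}: {count}")
```

Students who appear in only one of the two tables are left out of the cross-tabulation rather than assigned a code.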
Computerized tools can assist teachers or large-scale assessment systems in
diagnosing facets and handling electronically posted data from students. Univer-
sity of Washington colleagues Adam Carlson and Steve Tanimoto are building a
computerized system for facet coding of electronically submitted open responses
to questions and problems. Another colleague, Aurora Graf, has designed a
DIAGNOSER-type module to address facets of thinking about ratio reasoning for
middle-level students.
SUMMARY
Through a better understanding of students' thinking, we can characterize
the sorts of problematic understandings that students exhibit on their way to
learning goals. We can create facet clusters and individual facets. Using facet
assessment can help teachers identify needs for particular learning activities.
Curriculum developers or teachers adapting curriculum can better know and
understand the targets for the lessons they engineer. Facet assessment can be
used to monitor students' progress in the classroom. Large-scale facet-based
assessments can identify particular curricular needs or suggest the need to revise
standards or learning goals to make them more appropriate developmentally or
with respect to time and other available resources. Finally, large-scale facet-
based assessment will require support to clearly specify learning goals and
research to identify more than just the "right" answers.
Through facets and tasks related to targeted facet clusters the thinking of
large groups of students can be characterized and reported. From facet descrip-
tions of groups of learners, policy and program decisions can be informed. Feed-
back and recommendations, specific to the facets, can be presented to teachers in
the classroom and they can be better informed about what specifically to do to
effect better learning.
ACKNOWLEDGMENTS
Several colleagues over the years have influenced this work or assisted in its
progress. Arnold Arons, John Clement, Andrea diSessa, Virginia Stimpson,
Dorothy Simpson, Emily van Zee, and Earl Hunt have contributed to the genera-
tion or revision of the ideas. They deserve much credit. Tens of other teachers
and thousands of students have tested the ideas. I also want to thank the
administrations of the school districts, especially Mercer Island School District, for their
willingness to allow their teachers to think about facet assessment and the effects
it can have on teaching and learning in the classroom.
The research and development described in this paper were supported by
grants to Mercer Island School District and the University of Washington from
the James S. McDonnell Foundation Program for Cognitive Studies for Educa-
tional Practice and the National Science Foundation: Program for Research in
Teaching and Learning. Preparation of this paper was supported in part by a
grant from the National Science Foundation to Talaria, Inc., a small research and
development company that creates facet-based assessment and learning environ-
ments. The ideas expressed here are those of the author and do not necessarily
reflect the beliefs of the sponsoring foundations.
REFERENCES
Bruer, J.
1993 Schools for Thought: A Science of Learning in the Classroom. Cambridge, Mass.: MIT
Press.
Chi, M., P. Feltovich, and R. Glaser
1981 Categorization and representation of physics problems by experts and novices. Cognitive
Science 5:121-152.
diSessa, A.
1993 Toward an epistemology of physics. Cognition and Instruction 10(2-3):105-226.
diSessa, A., and J. Minstrell
1998 Cultivating conceptual change with benchmark lessons. In Thinking Practices in Learning
and Teaching Science and Mathematics, J.G. Greeno and S. Goldman, eds. Mahwah,
N.J.: Lawrence Erlbaum Associates.
Driver, R., A. Squires, P. Rushworth, and V. Wood-Robinson
1994 Making Sense of Secondary Science: Research into Children's Ideas. New York:
Routledge.
Duit, R., F. Goldberg, and H. Niedderer (eds.)
1991 Research in Physics Learning: Theoretical Issues and Empirical Studies: Proceedings of
an International Workshop held in Kiel, Germany. Institute for Science Education.
Gabel, D. (ed.)
1994 Handbook of Research on Science Teaching and Learning. New York: MacMillan.
Hunt, E., and J. Minstrell
1994 A cognitive approach to the teaching of physics. In Classroom Lessons, K. McGilly, ed.
Cambridge, Mass.: MIT Press.
Levidow, B., E. Hunt, and C. McKee
1991 The Diagnoser: A HyperCard tool for building theoretically based tutorials. Behavior
Research Methods, Instruments, and Computers 23(2):249-252.
McCloskey, M., A. Caramazza, and B. Green
1980 Curvilinear motion in the absence of external forces: Naive beliefs about the motion of
objects. Science 210:1139-1141.
Minstrell, J.
1989 Teaching science for understanding. In Toward the Thinking Curriculum: Current
Cognitive Research, L. Resnick and L. Klopfer, eds. 1989 Yearbook of the Association for
Supervision and Curriculum Development, Alexandria, Virginia.
1992 Facets of students' knowledge and relevant instruction. Pp. 110-128 in Research in
Physics Learning: Theoretical Issues and Empirical Studies: Proceedings of an
International Workshop held in Kiel, Germany. R. Duit, F. Goldberg, and H. Niedderer, eds.
Kiel, Germany: Institute for Science Education.
Minstrell, J., and V. Stimpson
1996 A classroom environment for learning: Guiding students' reconstruction of understand-
ing and reasoning. In Innovations in Learning: New Environments for Education, L.
Schauble and R. Glaser, eds. Mahwah, New Jersey: Lawrence Erlbaum Associates.
National Research Council
1996 National Science Education Standards. Washington, D.C.: National Academy Press.
Project 2061
1993 Benchmarks for Science Literacy. New York: Oxford University Press.
Schaffner, A., D. Madigan, A. Graf, E. Hunt, J. Minstrell, and M. Nason
1997 Benchmark lessons and the World Wide Web: Tools for teaching statistics. In: Proceed-
ings of the Second International Conference on the Learning Sciences, D.C. Edelson and
E.A. Domeshek (eds.). Evanston, Ill: Northwestern University.
van Zee, E., and J. Minstrell
1997 Reflective discourse: Developing shared understanding in a physics classroom. Inter-
national Journal of Science Education 19(2):209-228.