An Assessment Primer
Although few, if any, assessments of technological literacy are available in this country, many good assessment tools have been developed in other areas, from reading and writing to science and mathematics. Indeed, over a period of many years, a number of principles and procedures have been developed for obtaining reliable results. Although assessing technological literacy has some special requirements, the general principles developed for assessments in other areas are applicable. Thus, a logical place to begin the development of an assessment of technological literacy is with a review of what has been learned.
The overview of the field of assessments in this chapter lays the groundwork for the remainder of the report, which zeroes in on the assessment of technological literacy. The first section lays out the basics of testing and measurement—definitions, key ideas, and underlying concepts. The middle section focuses on what researchers have learned about cognition, that is, how people think and learn generally. The last section summarizes research on how people learn technological concepts and processes. Unfortunately, a great deal is still not known in this last area, a circumstance that is addressed in the committee’s recommendations in Chapter 8.
Nevertheless, readers of the report, particularly those planning to design an assessment instrument for technological literacy, will want to familiarize themselves with this literature, because a clear idea of the cognitive processes involved in learning is crucial to the development of assessments and the interpretation of the results (NRC, 2001a):
[A] well-developed and empirically validated model of thinking and learning in an academic domain can be used to design and select assessment tasks that support the analysis of various kinds of student performance. Such a model can also serve as the basis for rubrics for evaluating and scoring pupils’ work, with discriminating features of expertise defining the specific targets of assessment.
Testing and Measurement
Like any other field of knowledge, assessment has a specialized vocabulary. The terms “test” and “instrument,” for instance, which are often used interchangeably, refer to a set of items, questions, or tasks presented to individuals under controlled conditions. “Testing” is the administration of a test, and “measurement” is the process of assigning numbers, attributes, or characteristics—according to established rules—to determine the test taker’s level of performance on an instrument. The current emphasis on accountability in public schools, which entails accurate measurements of student performance, has renewed interest in measurement theory, which became a formal discipline in the 1930s.
“Assessment,” derived from the Latin assidere (to sit beside), is defined as the process of collecting data to describe a level of functioning. Never an end in itself, an assessment provides information about what an individual knows or can do and a basis for decision making, for instance about a school curriculum. A related term, “evaluation,” implies a value judgment about the level of functioning.
“Reliability” is a critical aspect of an assessment. An instrument is considered reliable if it provides consistent information over multiple administrations. For example, on a reliable test, a person’s score should be the same regardless of when the assessment was completed, when the responses were scored, or who scored the responses (Moskal and Leydens, 2000). Reliability is necessary, but not sufficient, to ensure that a test serves the purpose for which it was designed. Statistically, indices of test reliability typically range from zero to one, with reliabilities of 0.85 and above signifying test scores that are likely to be consistent from one test administration to the next and thus highly reliable (Linn and Gronlund, 2000). Assuming other aspects of an assessment remain
constant, reliability generally increases as the number of items or number of individuals participating increases.
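The effect of test length on reliability is commonly estimated with the Spearman-Brown prophecy formula. The short sketch below (an illustration only; the starting reliability of 0.70 is a hypothetical figure, not drawn from this report) shows how lengthening a test with parallel items raises its predicted reliability:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor
    (e.g., 2.0 means twice as many parallel items)."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# A test with reliability 0.70, lengthened with parallel items:
print(round(spearman_brown(0.70, 2.0), 3))  # doubled length
print(round(spearman_brown(0.70, 3.0), 3))  # tripled length
```

Doubling the test raises the predicted reliability to about 0.82 and tripling it to 0.875, consistent with the general point that, other things being equal, more items yield more reliable scores.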
“Errors of measurement” can compromise the reliability of an assessment. Even if an instrument is carefully designed and found to be highly reliable, it can never be completely free of errors of measurement (OERL, 2006). That is, a test taker’s observed score is the sum of his or her true score and some amount of measurement error. Errors can relate to the characteristics of the test taker (e.g., anxiety), the test administrator (e.g., inattention to proper test procedures), or the test environment (e.g., insufficient light or excessive noise), as well as to the accuracy of scoring.
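The relationship between true scores, observed scores, and measurement error can be made concrete with a small simulation (a hypothetical sketch; the score scale and the size of the error are invented for illustration):

```python
import random
import statistics

random.seed(7)

# Classical test theory: observed score = true score + measurement error.
true_scores = [random.gauss(500, 100) for _ in range(10_000)]
observed = [t + random.gauss(0, 50) for t in true_scores]  # error SD = 50

# Reliability is the share of observed-score variance due to true scores.
reliability = statistics.variance(true_scores) / statistics.variance(observed)
print(round(reliability, 2))  # close to 100**2 / (100**2 + 50**2) = 0.80
```

Because the error term never disappears entirely, the ratio falls short of 1.0; shrinking the error (clearer items, better testing conditions, more consistent scoring) pushes reliability upward.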
“Validity” refers to the soundness and appropriateness of the conclusions based on test scores. Validity answers questions such as “Is the test fair?”, “Does the test measure what it purports to measure?”, and “Are the test results useful for the intended purpose?” (Sireci, 2005). According to current measurement theory, a test or an assessment instrument in and of itself is not considered valid or invalid. Only the inferences based on the test results are valid or invalid. Various types of evidence may be used to determine validity, and all of them must relate to the underlying concept, or construct, being measured (AERA et al., 1999; Messick, 1989).
One of the most important types of evidence for determining validity is how well the themes, wording, and format of test items relate to a specified target-content domain, which may be based on specific learning objectives, such as those spelled out in educational standards (e.g., ITEA’s Standards for Technological Literacy). A second type of evidence hinges on the relationship between test results and an external criterion, such as later success in college. A third type is based on a test taker’s response processes. For a test of technological decision making, for example, determining the content-specific problem-solving skills used by examinees to arrive at answers could provide important evidence of validity. When test scores are used or interpreted in more than one way or in different settings, each intended use or interpretation must be validated.
In order to be valid, an assessment must be reliable, but reliability does not guarantee validity. That is, an instrument may produce highly stable results over multiple administrations but not accurately measure the desired knowledge or skill. Data from assessments should be reliable, and the inferences drawn from the data should be valid.
In the course of this study, the committee returned again and again to several ideas of central importance to the development of high-quality assessment instruments. Although these themes are not the only important concepts in the field of assessment, they are given special emphasis in this report, which will be read by many people outside the field. The central themes are: (1) defining purpose; (2) selecting content; (3) avoiding bias; and (4) ensuring fairness.
Any assessment instrument can only assess a small part of what a person or group of people knows, believes, or can do. Thus, before starting the design process, it is important to define the purpose of the assessment. Although an assessment may serve more than one purpose, the most effective assessments are designed to serve only one purpose; different purposes all but imply different kinds of assessments. Completely different designs would be used, for instance, to test how well museum-goers understand the lessons of a technology exhibit and to determine how well graduates of a school of education have been prepared to teach technology to elementary school students.
A designer must first establish what test takers will be expected to know about technology and what they should be able to demonstrate that they know. For students, these questions have often been answered in the form of standards. ITEA (2000) has developed content standards for K–12 students that address technological literacy. AAAS (1993) and NRC (1996) have developed national science education standards that include references to technological literacy. However, because none of these technology-related standards has been widely accepted or incorporated into education programs in the United States, the issue of assessment design can be very complicated.
In the K–12 setting, researchers have identified a number of purposes for assessments, ranging from program evaluation and instructional planning to pupil diagnosis (e.g., Brandt, 1998; McTighe and Ferrara, 1996; Stiggins, 1995). Assessments of technological literacy have two primary purposes in the K–12 setting: (1) to provide a measure of what students and teachers know about technology and how well they are
able to apply it; and (2) to identify strengths and weaknesses in students’ understanding, so that changes in teaching and the curriculum can be made to address those weaknesses. For an assessment of technological literacy, the designer must ask what types of information the results will provide and to whom; how the results will be interpreted; and how useful the results will be.
In contrast, the primary purpose of assessing the technological literacy of out-of-school adults should be to determine what the general populace knows and thinks about technology. At this point, little is known about the level of knowledge or practical skills of adults, and only slightly more is known about their attitudes toward technology. By contrast, a great deal is known about their political affiliations, television and movie viewing habits, health patterns, and buying trends. Assessments of technological literacy will provide information that can be used in a variety of ways, from designing museum exhibits to informing the design of new technologies.
Because there are no explicit standards or expectations for what teachers and out-of-school adults should know or be able to do with respect to technology, assessment developers may wish to consider using a matrix like the one presented in Chapter 3, which is based in part on student standards, as a starting point for selecting appropriate content.
Theories of cognitive learning based on a constructivist approach to knowledge acquisition suggest that the most valuable assessment instruments for students—at both the K–12 and post-secondary levels (i.e., pre-service teachers)—are integrated with instructional outcomes and curriculum content. Developers of assessments must have an understanding of instructional goals before they can design assessments to measure whether students have indeed met those goals. However, beyond the specific outcomes of learning, assessments must also take into account learning processes, that is, how students learn; this is an important gauge of what students can do once they leave the classroom. By integrating assessments with instruction, curriculum, and standards, assessments can not only provide valuable feedback about a student’s progress, but can also be used diagnostically to route students through instruction.
Assessment developers must be alert to the possibility of inequities in an assessment. An item is biased if it elicits different levels of performance from individuals with the same ability but from different ethnic, gender, cultural, or religious groups (Hambleton and Rogers, 1995). Bias can take various forms. If one group uses a familiar term as slang for another concept, for example, the use of that word on an assessment might cause members of that group to give the wrong answer even if they understand the concept correctly.
Pilot testing assessment items in small, sample populations is the best way to rule out bias. Suppose, for instance, that two questions seem identical, but the first has a correct response rate of 80 percent by all groups, and the second has an 80 percent correct response rate from all groups but one. Even if the bias is not apparent, the second question should not be used in the assessment.
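The pilot-testing logic above can be sketched as a simple screen that flags any item whose correct-response rates diverge across groups (the data and the 10-percentage-point threshold here are hypothetical; operational testing programs use formal differential-item-functioning methods that also control for ability):

```python
# Per-group correct-response rates from a hypothetical pilot test.
pilot_results = {
    "item_1": {"group_a": 0.80, "group_b": 0.79, "group_c": 0.81},
    "item_2": {"group_a": 0.80, "group_b": 0.81, "group_c": 0.52},
}

def flag_suspect_items(results, max_gap=0.10):
    """Flag items whose correct rates differ across groups by more than max_gap."""
    flagged = []
    for item, rates in results.items():
        if max(rates.values()) - min(rates.values()) > max_gap:
            flagged.append(item)
    return flagged

print(flag_suspect_items(pilot_results))  # ['item_2']
```

Like the 80-percent example in the text, item_2 performs comparably for two groups but much worse for a third, so it would be dropped even though the source of the bias is not apparent.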
Another kind of bias may be present for low-income students who may lack experiences that other students take for granted (e.g., family vacations, travel, visits to movie theaters and restaurants, and exposure to a variety of toys and tools). These students may present novel difficulties for developers of assessments trying to measure their knowledge, skills, and understanding.
The issue of fairness is closely related to bias. If no photos, illustrations, or given names of people of a student’s ethnicity or race are included in a test, the student may not be motivated to do well on the test. If the only representation of a student’s background has a negative connotation, the student’s score may be adversely affected. Every effort should be made to avoid stereotypes and include positive examples of all groups (AERA et al., 1999; Nitko, 1996).
Assessment developers must also take into account the extent to which those being assessed have had opportunities to acquire the knowledge or practice the skills that are the subject of the test. In the classroom setting, opportunities to learn may include access to instruction and instructional materials; time to review, practice, or apply a particular concept; teacher competence; and school environment and culture (Schwartz, 1995).
Ideally, all test takers, whether they are students, teachers, or out-of-school adults, should be able to participate in an assessment. For test takers with special needs, the test may have to be adjusted through accommodations, modifications, or, in rare instances, the use of alternative items or tasks. Adjustments may vary according to the particular case. For example, individuals with visual impairments require different modifications than individuals with dyslexia, although both may have trouble reading the text of a question. When making adjustments, test developers must ensure that the modified assessment measures the same knowledge or skills as the original assessment.
Assessments can include many different types of questions and exercises, from true/false questions to the construction of a physical model that performs a certain function. Each measurement method has advantages and disadvantages, and test developers must select the ones that serve the purpose of the assessment. Additional measurement issues may arise depending on the amount of knowledge or number and types of skills an assessment attempts to capture.
Selected-response items present test takers with a selection of responses to choose from. Formats include true/false, multiple-choice, and matching questions. One advantage of selected-response items is that they generally require less response time from test takers and are easy to score. This does not mean they are easier to develop, however. Multiple-choice items, when developed to ensure validity and reliability, can not only probe for facts, dates, names, and isolated ideas, but can also provide an effective measure of higher-order thinking skills and problem-solving abilities. Indeed, well-constructed multiple-choice items can measure virtually any level of cognitive functioning.
One weakness of the selected-response format is that test takers can sometimes arrive at correct answers indirectly by eliminating incorrect choices, rather than directly by applying the knowledge intended by the test developer. In such cases, an assessment is measuring test-taking skill rather than knowledge or capability.
In constructed-response questions, such as short-answer or essay questions, the test taker must generate a response rather than select one. In general, constructed-response items provide a more in-depth assessment of a person’s knowledge and ability to apply that knowledge than selected-response items. That advantage is counterbalanced, however, by the disadvantage that constructed-response questions are more difficult, time consuming, and subjective to score (Lukhele et al., 1994).
Performance assessments include exhibits, hands-on experiments, and other performance tasks, such as the construction of a device out of given materials that meets specified requirements. One advantage of performance assessments is that they can measure the capability—or “doing”—dimension of technological literacy. A disadvantage is that they are generally more time-consuming and expensive to develop and administer than other types of assessment items. In addition, if the use of one or more performance tasks significantly reduces the total number of items in an assessment, the overall reliability of the assessment may be adversely affected (Custer et al., 2000).
Effective, Practical Formats
Many effective assessments, including some large-scale, statewide tests, combine at least two formats. In assessing technological literacy, multiple-choice and short-answer questions might be used to measure facts, knowledge, and concepts related to technological literacy, as well as the types of knowledge that can be applied in different situations. Depending on the objective of the assessment, however, applied knowledge might also be measured by performance tasks. Real or simulated performance tasks may be the best way to determine how well an individual can apply knowledge and concepts to solving a particular problem.
Domain of Knowledge
Often educators or researchers are interested in finding out what people know and can do related to a wide-ranging domain of knowledge.
Because the time and costs of testing would be extensive, it is usually not feasible to develop a single test to measure a very large body of knowledge. Assessment experts have devised a solution to this dilemma—giving only a fraction of the total number of items to each test subject. Dividing a large test into smaller segments and administering each segment to a portion of the population of interest is called “matrix sampling.” The results are reliable at the level of the total population tested as well as for certain subgroups (e.g., by gender or age) but not at the level of the individual, and individual results are not reported. The National Assessment of Educational Progress and the Trends in International Mathematics and Science Study use matrix-sampling techniques.
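A matrix-sampling design can be sketched in a few lines (the item pool, block count, and sample of test takers below are all hypothetical):

```python
# Divide a 60-item pool into 4 blocks and give each test taker one block.
items = [f"item_{i}" for i in range(1, 61)]
n_blocks = 4
block_size = len(items) // n_blocks
blocks = [items[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

test_takers = [f"student_{i}" for i in range(1, 201)]
# Round-robin assignment spreads each block evenly across the sample.
assignment = {person: blocks[i % n_blocks] for i, person in enumerate(test_takers)}

print(len(assignment["student_1"]))  # each person answers 15 of the 60 items
```

Each test taker answers only a quarter of the pool, yet every item is administered to a quarter of the sample, so the full domain is covered and population-level (though not individual-level) estimates remain possible.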
So-called census testing involves giving the same test to all members of the target population. Because testing time is generally limited, an entire domain of knowledge cannot be assessed in this way. The advantage of census testing is that the results are reliable and can be reported at the level of the individual. State grade-level assessments are examples of testing by the census approach.
Reporting of Results
The way the results of an assessment are reported depends on the purpose of the assessment and the methods used in its development. The most common presentation of results is basic and descriptive—for example, the percentage of individuals who correctly respond to an item or perform a task. Other types of reporting methods include: norm-referenced interpretation; criterion-referenced interpretation; and standards-based interpretation.
Norm-referenced results are relative interpretations based on an individual’s position with respect to a group, often called a normative sample. For example, a student might score in the 63rd percentile, which means that he or she scored better than 63 percent of the other students who took the test or, perhaps, better than 63 percent of a previous group of students who are the reference group (the norm) against which the test was standardized. Because norm-referenced results are relative, by definition some individuals score poorly, some average, and some well.
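The percentile computation behind a norm-referenced report can be sketched as follows (the normative sample here is hypothetical, chosen so the result mirrors the 63rd-percentile example above):

```python
from bisect import bisect_left

def percentile_rank(score, norm_sample):
    """Percentage of the normative sample scoring below `score`."""
    ordered = sorted(norm_sample)
    return 100 * bisect_left(ordered, score) / len(ordered)

# Hypothetical normative sample: 100 test takers with scores 1 through 100.
norm_sample = list(range(1, 101))
print(percentile_rank(64, norm_sample))  # scored better than 63 of 100
```

Note that the result depends entirely on the reference group: the same raw score yields a different percentile against a different normative sample, which is why some individuals must, by definition, score poorly.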
Criterion-referenced interpretations are presented in absolute rather than relative terms: they indicate how well individuals perform against a fixed standard, not how well they perform relative to others. The criterion is a desired learning outcome, often based on educational standards, and assessment items measure how well the test taker demonstrates knowledge or skill related to that goal. Criterion-referenced results may be presented as a number on a scale, a grade, or a rubric (e.g., novice, adequate, proficient). Thus, depending on the assessment and the group being assessed, few, half, or a large number of individuals (or groups) could meet the established criteria.
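A criterion-referenced rubric amounts to a fixed mapping from scale scores to levels (the cut scores and labels below are hypothetical, purely for illustration):

```python
# Hypothetical cut scores on a 0-100 scale, checked from highest to lowest.
CUT_SCORES = [(85, "proficient"), (70, "adequate"), (0, "novice")]

def rubric_level(scale_score):
    """Report a criterion-referenced level, independent of other test takers."""
    for cutoff, level in CUT_SCORES:
        if scale_score >= cutoff:
            return level
    raise ValueError("score below scale minimum")

print(rubric_level(91))  # 'proficient'
print(rubric_level(72))  # 'adequate'
```

Unlike a percentile, this classification does not change when the rest of the group performs better or worse; everyone, or no one, can reach “proficient.”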
Standards-based interpretation is closely related to criterion-referenced interpretation. The No Child Left Behind Act of 2001 requires that each state develop an assessment program based on a standards-based interpretation of results, with the ultimate goal that 100 percent of students, overall and disaggregated by subgroup, reach proficiency in reading, mathematics, and, starting in 2007, science.
To define proficiency, each state education agency was required to submit a workbook plan to the U.S. Department of Education for approval based on accepted standards-setting techniques, such as Bookmark or Modified Angoff (Kiplinger, 1997). Standards-based interpretation, like criterion-referenced interpretation, has a proficiency-defining “cut-off” score.
In the assessment triangle described in Knowing What Students Know (NRC, 2001b), one corner of the triangle is cognition. In the context of the present report, cognition is a theory or set of beliefs about how people represent knowledge and develop competence in a subject domain. To test an individual’s learning and knowledge, assessment designers must first understand how people learn and know things. An explicit, well conceived cognitive model of learning is the basis of any sound assessment design; the model should reflect the most scientifically credible evidence about how learners represent knowledge and develop expertise.
Most experienced teachers have an understanding of how their students learn, although that understanding may not be scientifically formulated. As researchers learn more about how people learn and understand, the new understanding should be incorporated into assessments. An assessment should not be static but should be constantly evolving to reflect the latest and best research.
Nature of Expertise
Many principles about thinking and learning are derived from studies of the nature of expertise and how it is developed. Experts have a great deal of both declarative (knowing that) and procedural (knowing how) knowledge that is highly organized and can be efficiently retrieved to solve problems. Thus, cognitive scientists have focused considerable efforts on studying expert performance in the hope of gaining insights into thinking, learning, and problem solving. These studies reveal marked differences between experts and novices (defined as individuals in the early stages of acquiring expertise).
To become an expert, a person must have many years of experience and practice in a given domain. During those years, the individual collects and stores in memory huge amounts of knowledge, facts, and information about his or her domain of expertise. For this knowledge to be useful, however, it must be organized in ways that are efficient for recall and application (Bransford et al., 1999; Chi and Glaser, 1981; Ericsson and Kintsch, 1995). Researchers have found that expert knowledge is organized hierarchically; fundamental principles and concepts are located on the higher levels of the hierarchy and are interconnected with ancillary concepts and related facts on the lower levels of the hierarchy. In addition, procedures and contexts for applying knowledge are bundled with the knowledge so that experts can retrieve knowledge in “chunks” with relatively little cognitive effort. This so-called “conditionalized knowledge” makes it possible for experts to perform high-level cognitive tasks rapidly (Anderson, 1990).
Thanks to this highly organized store of knowledge, experts can focus their short-term memory on analyzing and solving problems, rather than on searching long-term memory for relevant knowledge and procedures. In addition, experts can integrate new knowledge into their existing knowledge framework with relatively little effort. For an expert, “knowing more” means having (1) more conceptual chunks of knowledge in memory,
(2) more relations and features defining each chunk, (3) more interrelations among chunks, and (4) effective methods of retrieving and applying chunks (Chi and Glaser, 1981). In contrast, novices do not have highly organized stores of knowledge or links to related knowledge and procedures. Thus, novices must spend more cognitive effort looking for and retrieving knowledge from memory, which leaves less short-term memory for high-level tasks, such as problem solving.
In a telling experiment by Egan and Schwartz (1979), expert and novice electronic technicians were shown a complex circuit diagram for just a few seconds and asked to reproduce as much of the diagram as they could from memory. The experts accurately reproduced much of the circuit diagram, whereas the novices could not. The experts were capable of such remarkable recall because they recognized the elements of the circuit as members of recognizable groups, rather than as individual elements. For example, they noticed that a particular set of resistors, capacitors, and other elements formed an amplifier of a certain typical structure and then recalled the arrangement of this amplifier chunk. When both groups were shown circuit diagrams with the elements arranged randomly, the experts had no way of identifying chunks, or functional units. In this test, experts scored no better than novices.
Experts and novices also focus on different attributes to decide on a strategy for solving a problem. In physics and mathematics, for instance, research has shown that shortly after reading a problem skilled problem solvers cue in on the underlying principles or concepts that could be applied to solve it (Chi et al., 1981; Hardiman et al., 1989; Schoenfeld and Herrmann, 1982). In contrast, unskilled problem solvers cue in on the objects and terminology, searching for a method of attack. For example, skilled problem solvers in physics decide that two problems could be solved with a similar strategy if the same principle (e.g., Newton’s Second Law) applies to both problems. By contrast, unskilled problem solvers base their decisions on whether the two problems share the same surface characteristics (e.g., both contain inclined planes). Focusing on surface characteristics is not very useful because two problems that look similar may require entirely different approaches.
Once an expert decides on the concepts that apply to a problem, he or she then decides on a procedure by which the concepts can be applied. Unskilled problem solvers must resort to finding and manipulating equations that contain the quantities given in the problem until they isolate the quantity or variable being asked for (Chi et al., 1981; Larkin, 1981, 1983; Mestre, 1991). Experts are also often flexible in ways novices
are not. Even when experts are asked to solve a complex problem outside their immediate knowledge base, they can often use strategies (e.g., metacognition, knowledge building), or disciplinary dispositions, to come up with a solution (Wineburg, 1998).
In short, experts have a tendency to carry out qualitative analyses of problems prior to executing the quantitative solution, whereas novices tend to rely on a formulaic approach. For novices, the formulaic approach is more a necessity than a choice, because they have not yet mastered the principles and concepts of the subject and are not adept at knowing when and how to apply them. In the physical sciences, novices with reasonable skills in algebra find it easier to begin by manipulating equations, which enables them to narrow the field to the equations that might be useful by knowing which portion of the textbook the problem came from and by matching the variables in the equations to the “givens” in the problem. Only after considerable experience solving problems in this way do unskilled problem solvers begin to realize that this approach cannot be “generalized.” At that point, they may begin to shift to concept-based problem-solving strategies.
Cognitive research related to expertise raises a number of questions relevant to assessments of technological literacy:
What assumptions can be made about the conditions and time necessary to acquire technological literacy?
How can technological literacy be assessed in ways that do not encourage short-term exam coaching?
What defines the key principles/concepts and procedural knowledge in different areas of technology, and what types of assessments can test for these high-level constructs?
How should naïve and skilled problem solving in technology be characterized, and what types of assessments can distinguish between them?
What constitutes cuing on surface characteristics and cuing on deep structures in technological problem solving, and how does one assess an individual’s place along this spectrum?
One dimension of technological literacy is the ability to reason about technology coherently and abstractly from a broad perspective as a
basis for making informed decisions about environmental, health, economic, political, scientific, and other issues that affect society. Thus, assessing technological literacy is largely about measuring the ability to transfer and apply knowledge in different contexts. In fact, knowledge transfer is a major goal in all education. Teaching in school and university classrooms and lecture halls is based on the premise that what a student learns in school will be useful in other settings both in and out of school (e.g., other courses, other disciplines, the workplace).
All of the research indicates that knowledge transfer is difficult to achieve. For example, classic studies of analogical transfer illustrate that transferring relevant knowledge from one situation to another in a different context, even if the tasks are isomorphic (i.e., they share the same structure), is not routine (Gick and Holyoak, 1980; Hayes and Simon, 1977; Reed et al., 1974, 1985). Most students can transfer knowledge only after being given hints pointing out that the two situations are isomorphic. Recently, Blanchette and Dunbar (2002) found that even students who spontaneously draw analogical inferences from one domain to another do not infer enough similarities to support a full-fledged transfer of knowledge. These studies suggest that the ability to transfer knowledge is context-bound, which presents educators with the challenge of structuring lessons to encourage transfer.
Research shows that several factors affect transfer. First, although it seems obvious, there must be an initial acquisition of knowledge (Brown et al., 1983; Carey and Smith, 1993; Chi, 2000). In many studies, a failure to transfer knowledge was attributable to insufficient initial learning (e.g., Brown, 1990; Klahr and Carver, 1988; Littlefield et al., 1988). The quality of initial learning is also important for transfer. Rote learning does not facilitate transfer; learning with understanding does (Bransford et al., 1983; Mandler and Orlich, 1993; see also the review of the literature in Barnett and Ceci, 2002). If students try to learn too many topics quickly, they may simply memorize isolated facts and have little opportunity to organize the material in a meaningful way or to link new knowledge to related knowledge.
The context of learning also affects transfer. If students perceive that knowledge is tightly bound to the context in which it is learned, transfer to contexts with even superficial differences becomes significantly more difficult (Bjork and Richardson-Klavehn, 1989; Carraher, 1986; Eich, 1985; Lave, 1988; Mestre, 2005; Saxe, 1989). For example, students who learn to solve arithmetic-progression problems can transfer the method
to similar physics problems involving velocity and distance, but students who learn to solve the physics problems first have difficulty transferring the method to arithmetic-progression problems involving the same basic principles (Bassok and Holyoak, 1989). Apparently, after learning physics equations in a specific context, students are unable to recognize that they are applicable in a different context.
Prior knowledge can affect transfer and can lead to the application of inappropriate knowledge to a situation (referred to as negative transfer). There is a great deal in the literature on misconceptions in the sciences indicating that students come to science classes with fragmented knowledge and many misconceptions about how the physical and biological world works (diSessa and Sherin, 1998; Etkina et al., 2005; McDermott, 1984). For example, when children who believe Earth is flat are told that it is round, they may understand this to mean that Earth is round like a pancake, with people standing on top of the pancake (Vosniadou and Brewer, 1992). When told that Earth is round like a ball, children may envision a ball with a pancake on top, upon which people could stand. Thus, misconceptions can adversely affect learning (because students may misconstrue new knowledge that conflicts with prior knowledge) and problem solving (because inappropriate knowledge may be applied).
Cognitive research related to knowledge transfer has raised several questions relevant to the assessment of technological literacy:
What test items would assess the transfer of key technological principles from one context to another?
Are teaching practices and curricula in technology education guided by our best understanding of how to promote knowledge transfer?
Cognitive scientists use the term “metacognition” to refer to the process of consciously keeping track of thinking processes and adjusting understanding while learning to solve problems. Learners develop metacognitive strategies, such as self-regulation, planning, monitoring success, and correcting errors, to assess their readiness for high-level performance and to become more aware of themselves as learners (Bransford et al., 1999). Reflecting on one’s learning,
a major component of metacognition, does not typically occur in the classroom, possibly because of a lack of opportunity, because instructors do not emphasize its importance, or because metacognition develops slowly. In the physical sciences, for example, if students are unable to make any progress in solving a problem and are asked to identify the difficulty, they tend to say only that they are “stuck” rather than analyzing what they need to make progress. In short, they have a metacognitive awareness of their level of understanding but are unable to bring conditional knowledge of learning strategies to bear on the task.
There are some notable examples of metacognitive strategies being used to improve learning in various domains. In mathematics, for example, teachers have had success with techniques that combine problem-solving instruction with control strategies for generating alternative problem-solving approaches, choosing among several courses of action, and assessing progress (Schoenfeld, 1985). And in science, a middle-school curriculum that incorporates metacognitive strategies, such as scaffolded inquiry, reflection, and generalization, has met with considerable success in teaching force and motion (White and Frederiksen, 1998).
Metacognition can be important in the development of technological literacy. Students whose instruction and curricula in technology education include metacognitive components should show observable improvement in technological literacy over time. The development of metacognitive strategies for technology education thus has indirect implications for the assessment of technological literacy.
Cognitive research related to metacognition has raised a number of questions relevant to the assessment of technological literacy:
How does metacognition develop in specific technology content areas?
How is self-monitoring accomplished for technology, and does it differ from self-monitoring in other domains?
What modes of instruction encourage self-monitoring?
Cognitive scientists have also examined how people form concepts and how they give up one concept in favor of another. In science learning, for example, although no consensus has been reached on the
ontological status of students’ emerging conceptual knowledge, some theories are emerging. One theory posits that portions of students’ knowledge have the qualities of “naïve theories,” which have an impact on students’ scientific explanations and judgments (Carey, 1999; Chi et al., 1994; Hatano and Inagaki, 1996; Ioannides and Vosniadou, 2002; McCloskey, 1983a,b; Smith et al., 1997; Vosniadou and Brewer, 1992; Vosniadou and Ioannides, 1998). Although proponents of this theory do not argue that students’ naïve theories have the robustness and consistency of scientists’ theories, they do argue that some of children’s knowledge is organized into cognitive entities that are activated as bundled units and applied remarkably consistently in similar contexts. According to this theory, encouraging conceptual change requires eliciting and exposing counterproductive knowledge, confronting and refuting that knowledge, and finally offering new ideas to replace the erroneous information (Strike and Posner, 1985).
Others argue that students’ knowledge of science is sensitive to context and unstructured to the point that it cannot be described as a “theory.” According to this view, students’ knowledge is composed of diverse, fine-grained elements that lack the coherence and integration necessary for theories. This granular knowledge is variously described as “resources” (Hammer and Elby, 2003; Hammer et al., 2005) and “knowledge in pieces” (diSessa, 1988, 1993; diSessa and Sherin, 1998; diSessa and Wagner, 2005; diSessa et al., in press). In the knowledge-in-pieces view, instead of activating and applying precompiled knowledge bundles (as suggested in the naïve-theories view), students activate and combine knowledge pieces to reason about scientific situations; however, the knowledge pieces are highly sensitive to contextual variations, and if a context is changed slightly, a new or modified set of knowledge pieces is activated (Mestre et al., 2004). Thus, conceptual change is more correctly described as conceptual development or refinement because concepts are fluid rather than well formed. To encourage conceptual change, then, instructors must both help students develop knowledge pieces or resources relevant to the situation and activate existing productive resources that students may not have considered relevant (Smith et al., 1994).
Cognitive research related to conceptual change has raised several questions relevant to the assessment of technological literacy:
What is the “conceptual ecology” (diSessa, 2002; Smith et al., 1994) of technological knowledge among different age groups,
and how does that knowledge affect technological problem solving and knowledge transfer?
What counterproductive knowledge about technology do students and adults possess, and how difficult is it to restructure this knowledge in ways that support effective reasoning?
How well do current theories of conceptual change in science map to what occurs in technological learning?
Research on Technological Learning
Technological literacy is a dynamic characteristic developed over a lifetime. To understand how individuals learn to design, solve technological problems, and make decisions and judgments about technological issues, in other words, how they become technologically literate, we must attend to research on how children and adults learn technological concepts and processes.
To inform the committee’s deliberations, two reviews of the literature related to how people learn technology-related concepts were commissioned. One review focused on work in the field of technology education (Petrina et al., 2004); the other examined research in the field of engineering (Waller, 2004). (For selected bibliographies from these reviews, see Appendix D.) As noted in the preceding section, there are a number of unanswered questions about key aspects of how people think and learn in the realm of technology. But some useful work has been done, and those interested in designing assessments for technological literacy will benefit by taking it into account.
Learning Related to Technology
Very few empirical studies have been done on learning related to technology using a conceptual framework of forms, levels, and the development of competence and expertise. This can be attributed to two factors: (1) the field of technology education is young compared with the field of cognitive psychology; and (2) cognitive scientists, psychologists, and science-education researchers have conducted few studies of any kind on learning related to technology.
One of the few studies that refers explicitly to the insights of
cognitive science was published in 1997 by Thomson, who used concept mapping to investigate how students conceptualize technology. Concept mapping is a method of organizing knowledge hierarchically, showing cause-effect relationships among different knowledge components. Concept maps by experts reflect the organization and structure of their knowledge, whereas concept maps by novices tend to reflect their less integrated, less structured knowledge. Thomson concluded that, although concept mapping might be useful for assessing student knowledge structures, a great deal of work remains to be done to validate the use of concept mapping in technology-education research.
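Thomson’s use of concept maps can be made concrete with a small sketch. The Python fragment below represents a concept map as a directed graph and uses link density as one crude proxy for how integrated a map is; the node names, relation labels, and metric are illustrative assumptions for this report, not elements of Thomson’s study.

```python
from collections import defaultdict

class ConceptMap:
    """A concept map: concepts as nodes, labeled links as relationships."""

    def __init__(self):
        # concept -> list of (relation, target concept)
        self.links = defaultdict(list)

    def add_link(self, source, relation, target):
        self.links[source].append((relation, target))

    def concepts(self):
        found = set(self.links)
        for edges in self.links.values():
            found.update(target for _, target in edges)
        return found

    def link_density(self):
        """Links per concept: a rough proxy for how integrated the map is."""
        n_links = sum(len(edges) for edges in self.links.values())
        n_nodes = len(self.concepts())
        return n_links / n_nodes if n_nodes else 0.0

# A novice map might link "technology" mainly to tangible objects...
novice = ConceptMap()
novice.add_link("technology", "is", "computers")
novice.add_link("technology", "is", "machines")

# ...while an expert map also captures processes and their relationships.
expert = ConceptMap()
expert.add_link("technology", "includes", "designed objects")
expert.add_link("technology", "includes", "design processes")
expert.add_link("design processes", "produce", "designed objects")
expert.add_link("designed objects", "have", "trade-offs")

# The expert map is more densely interconnected than the novice map.
assert expert.link_density() > novice.link_density()
```

Validating any such structural measure against what students actually know is precisely the open problem Thomson identified.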
Other studies indicate that young students readily identify tangible objects as technology. They not only commonly associate technology with computers, but they also recognize buildings, machines, and vehicles as technology (Hill and Anning, 2001a,b; Rennie and Jarvis, 1995). Based on the number of examples of technology children identified in images, texts, and words, Jarvis and Rennie (1996, 1998) concluded that conceptions of technology became more sophisticated with increasing age.
Davis, Ginns, and McRobbie (2002) investigated the conceptual understanding of particular aspects of technology, such as material properties. In one study, seven- and eight-year-olds described features of materials they used in a bridge-building lesson. Although they had difficulty expressing features of composition, such as strength, they understood that by increasing the volume of materials they could increase the strength of the structure.
Children understand the concept of technology to be primarily objects, but they understand design to be a process. Children learn the processes of technology by participating in design activities (e.g., Foster and Wright, 2001; Hmelo et al., 2000; Roth, 1998). Using a classic “apprenticeship” model of situated cognition, Druin (1999, 2002) and other researchers have been investigating how students apply their expertise to the design of common children’s artifacts—animation, fantasy spaces, games, storybooks, and toys. These studies have shown that children are capable of playing different roles on design teams (e.g., user, tester of technology, design inventor, and critic). Other studies have shown that children prefer participatory models to independent-inventor models and that they feel most creative when they embed their design work in narratives or stories (Bers, 2001; Druin, 2002; Druin and Fast, 2002; Druin and Hendler, 2001; Kafai et al., 1997; Orr, 1996; Taxen et al., 2001).
Researchers have also studied the development of visualization and spatial skills in adolescents and younger teens, who are capable of working with simple symbolic, mathematical models but who respond most readily to computer models and concrete, three-dimensional (3-D) models. Although these children tend to have complex imaginations, unless they have sketching and drawing skills, they have difficulty representing the designs in their mind’s eye in two-dimensional space. For example, Welch et al. (2000) found that 12- to 13-year-old novice designers approach sketching differently than professional designers, who use sketching to explore ideas and solutions. Although adolescents and teens may not be adept at sketching and drawing, they tend to develop design ideas by working with 3-D models. These and other observations by researchers raise questions about the differences between school-based design and professional design (Hill and Anning, 2001a,b).
Novices and experts approach technological design tasks differently, just as they do in other domains of learning. Both novice and expert designers manage a range of concurrent cognitive actions, but novices lack metacognitive strategies for organizing their activities (Kavakli and Gero, 2002). In a study of how expert, novice, and naïve designers approach the redesign of simple mechanical devices, Crismond (2001) found that all three groups relied more heavily on analytic strategies than on evaluation or synthesis. Not surprisingly, expert designers were able to generate more redesign ideas than designers with less experience.
Research on the development of expertise has also focused on the relationship between procedural and conceptual knowledge, both of which appear to be necessary for successful design, for novices as well as experts. In addition, the content of the procedural knowledge is determined by the design problem to be solved. In other words, different design problems require different approaches. The connection between procedural and conceptual knowledge in educational settings was investigated by Pomares-Brandt (2003) in a study of students’ skills in retrieving information from the Internet. In this study, a lack of conceptual knowledge of what the Internet is and how it functions had a negative impact on information-retrieval skills.
Critical thinking and decision making in children, as in adults, suggest the level of reasoning necessary for making sensible choices about technological issues. Taking advantage of the comfortable distance afforded by virtual reality, researchers have used digital simulations to prompt students to reason through a variety of moral dilemmas
(e.g., Bers, 2001; Wegerif, 2004). Researchers in Germany found that when ethics was taught in school it was often perceived to be “just another school subject” or misunderstood to be religious instruction (Schallies et al., 2002). About 60 percent of the more than 3,000 high school students surveyed in this study did not think they had been prepared in school to deal with the types of ethical decisions that commonly face practitioners in science and technology.
Researchers are also beginning to explore the roles students negotiate in relation to technology (Jenson et al., 2003; Selwyn, 2001, 2003; Upitis, 1998; Zeidler, 2003). Students’ identities are increasingly defined through these roles in terms of competence, interests, and status. Ownership of cell phones or MP3 players, for example, confers status in a culture in which students are heavily influenced by media pressure from one direction and peer pressure from another.
Ethical decision making by adults may be commonplace, but research suggests it is difficult to specify how ethical decisions are made (Petrina, 2003). Research on software piracy reveals that moral reasoning about technological issues is contingent on circumstances. For instance, university students typically recognize unethical behavior but make decisions that serve their own desires. Nearly three-quarters of 433 students in one study acknowledged participating in software piracy, and half of these said they did not feel guilty about doing so (Hinduja, 2003).
Learning Related to Engineering
Historically, research on engineering education has mostly been done by engineering faculty and has focused on changing curricula, classrooms, and content rather than on measuring the impact of these changes on what students know and can do. Recently, however, as academic engineering faculty increasingly collaborate with faculty in other disciplines, such as education, psychology, and sociology, the types of research questions being asked and the assumptions being made are beginning to change. Because of the shift toward investigating how people learn engineering, most of the available research is based on qualitative methodologies, such as verbal protocol analysis and open-ended questionnaires.
Research on how individuals learn the engineering design process
is focused mostly on comparing novice and experienced designers. These studies indicate that novices use a trial-and-error approach, consider fewer issues when describing a problem, ask for fewer kinds of information, use fewer types of design activities, make fewer transitions between design activities, and produce designs of lower quality than experienced designers (e.g., Adams et al., 2003; Ahmed et al., 2003; Atman et al., 1999; Mullins et al., 1999).
Other findings are also relevant to assessing design activity: (1) the choice of task affects problem-solving behavior; (2) more evaluation occurs in the solving of complex problems than simple problems; (3) students draw on personal experiences with the problem situation to generate solutions; and (4) sketching not only allows the problem solver to store information externally, but also allows him or her to experiment with reality, iterate the solution space, and reason at the conceptual and systems level.
Assessing mental models can be very tricky because questions about different, but parallel, situations evoke different explanations. In addition, people who have more than one model for a concept (e.g., electricity as flow and electricity as a field phenomenon) may use the simpler model to explain a situation unless they are asked specifically for the most technically precise explanation. Since the early 1980s, researchers have been trying to capture the mental models children, students, and adults use to understand concepts and processes, such as combustion, electricity, and evaporation (e.g., Borges and Gilbert, 1999; Tytler, 2000; Watson et al., 1997). However, because the vast majority of studies on conceptual change involve single or comparative designs, rather than longitudinal designs, the conclusions require assumptions of equivalence of samples and populations.
Taken together, these studies indicate several features of mental models: (1) they are developed initially through everyday experiences; (2) they are generally simple, causal models of observable phenomena; and (3) they are applied consistently according to the individual’s rules of logic (which may not match those accepted in the scientific community). In addition, individuals can hold alternative conceptions simultaneously without apparent conflict. Thus, different questions may elicit different models from the same individual.
One way of measuring students’ conceptual understanding, rather than their ability to apply formulae, is through a concept inventory. First
developed in physics education in the Force Concept Inventory (Hestenes et al., 1992), concept inventories consist of multiple-choice questions that require a substantial understanding of concepts rather than simple calculation skills or commonsense understanding. By including a variety of distractors, such assessments reveal the extent and nature of student misconceptions about a topic. In engineering education, 15 concept inventories are in various stages of development (Box 4-1).
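The logic of a concept inventory, multiple-choice items whose distractors are keyed to known misconceptions, can be sketched in a few lines of Python. The items, answer keys, and misconception labels below are invented for illustration; real inventories such as the Force Concept Inventory are constructed and validated empirically.

```python
from collections import Counter

# Each item: the correct choice plus distractors keyed to misconceptions.
# These two items and their labels are hypothetical examples.
items = {
    "Q1": {"key": "B", "misconceptions": {"A": "motion implies force",
                                          "C": "heavier falls faster"}},
    "Q2": {"key": "C", "misconceptions": {"A": "current is used up",
                                          "B": "battery supplies fixed current"}},
}

def score(responses):
    """Return (number correct, Counter of misconceptions flagged by wrong answers)."""
    correct = 0
    misconceptions = Counter()
    for qid, choice in responses.items():
        item = items[qid]
        if choice == item["key"]:
            correct += 1
        else:
            # A distractor choice points to the specific misconception it encodes.
            label = item["misconceptions"].get(choice, "unclassified")
            misconceptions[label] += 1
    return correct, misconceptions

# One right answer; the wrong answer reveals a particular misconception.
total, flags = score({"Q1": "A", "Q2": "C"})
```

The point of the distractor analysis is that an incorrect response carries diagnostic information, revealing the nature of a misconception rather than merely the fact of an error.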
Thus far, no studies have addressed general engineering concepts, such as systems, boundaries, constraints, trade-offs, goal setting, estimation, and safety. Some of these are obliquely included in analyses of design behavior, but no study addresses how participants specifically include these concepts. In addition, not a single study investigates what the general public understands about these concepts, much less how they come to understand them.
Several caveats must be observed when applying the findings of studies of how people learn engineering design and content. First, engineering students and practitioners are not a random sample of the general population; therefore findings based on this specialized population may not apply to other populations. Second, learning preferences affect not only the way people learn but also the way they interact with assessment instruments, and engineering concepts can be expressed in many different ways (e.g., mathematics, diagrams, analogies, and verbal descriptions). Thus, a robust assessment instrument should accept several different expressions of a concept as “correct.” Third, engineering design is ultimately a collaborative process, with goals, boundaries, constraints, and criteria negotiated by a wide variety of stakeholders. Therefore, an authentic assessment of design skills should include a component that reflects the negotiation, teamwork, and communication skills necessary for successful design processes.

Box 4-1. Concept Inventories Under Development, by Topic. Source: Waller, 2004.
Fourth, because design is context sensitive, researchers must be cautious in comparing results across cultures (including cultures within the United States) (Herbeaux and Bannerot, 2003). The values underlying choices and trade-offs may differ greatly between groups, as may communication and negotiation processes. And, because understanding depends in part on everyday experiences, assessors must be careful to select examples and situations that do not presuppose particular socioeconomic or cultural experiences. For example, some children may not have experience with clothes drying on a line, while others may never have seen a light wired to a dimmer switch. If such items are used, an assessment instrument may indicate differences in conceptual understanding that actually reflect socioeconomic or cultural differences among study participants.
AAAS (American Association for the Advancement of Science). 1993. Benchmarks for Science Literacy. Project 2061. New York: Oxford University Press.
Adams, R.S., J. Turns, and C.J. Atman. 2003. Educating effective engineering designers: the role of reflective practice. Design Studies 24(2003): 275–294.
AERA (American Educational Research Association), APA (American Psychological Association), and NCME (National Council on Measurement in Education). 1999. Fairness in Testing and Test Use: Standards 7.3 and 7.4. Pp. 79–82 in Standards for Educational and Psychological Testing. Washington, D.C.: AERA.
Ahmed, S., K.M. Wallace, and L.T.M. Blessing. 2003. Understanding the differences between how novice and experienced designers approach design tasks. Research in Engineering Design 14(2003): 1–11.
Anderson, J.R. 1990. Cognitive Psychology and Its Implications. San Francisco: Freeman.
Atman, C.J., J.R. Chimka, K.M. Bursic, and H. Nachtmann. 1999. A comparison of freshman and senior engineering design processes. Design Studies 20(2): 131–152.
Barnett, S.M., and S.J. Ceci. 2002. When and where do we apply what we learn?: a taxonomy for far transfer. Psychological Bulletin 128(4): 612–637.
Bassok, M., and K.J. Holyoak. 1989. Interdomain transfer between isomorphic topics in algebra and physics. Journal of Experimental Psychology: Learning, Memory, and Cognition 15(1): 153–166.
Bers, M.U. 2001. Identity construction environments: developing personal and moral values through design of a virtual city. Journal of the Learning Sciences 10(4): 365–415.
Bjork, R.A., and A. Richardson-Klavehn. 1989. On the Puzzling Relationship Between Environment Context and Human Memory. Pp. 313–344 in Current Issues in Cognitive Processes: The Tulane Flowerree Symposium on Cognition, edited by C. Izawa. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Blanchette, I., and K. Dunbar. 2002. Representational change and analogy: how analogical inferences alter target representations. Journal of Experimental Psychology: Learning, Memory, and Cognition 28(4): 672–685.
Borges, A.T., and J.K. Gilbert. 1999. Mental models of electricity. International Journal of Science Education 21(1): 95–117.
Brandt, R. 1998. Assessing Student Learning: New Rules, New Realities. Washington, D.C.: Educational Research Services.
Bransford, J.D., B.S. Stein, N.J. Vye, J.J. Franks, P.M. Auble, K.J. Mezynski, and G.A. Perfetto. 1983. Differences in approaches to learning: an overview. Journal of Experimental Psychology: General 3: 390–398.
Bransford, J.D., A.L. Brown, and R.R. Cocking. 1999. How People Learn: Brain, Mind, Experience, and School. Washington, D.C.: National Academy Press.
Brown, A.L. 1990. Domain-specific principles affect learning and transfer in children. Cognitive Science 14(1): 107–133.
Brown, A.L., J.D. Bransford, R.A. Ferrara, and J. Campione. 1983. Learning, Remembering, and Understanding. Pp. 78–166 in Cognitive Development, vol. 3, edited by J. Flavell and E.M. Markman. New York: John Wiley and Sons.
Carey, S. 1999. Sources of Conceptual Change. Pp. 293–326 in Conceptual Development: Piaget’s Legacy, edited by E.K. Scholnick, K. Nelson, S. Gelman, and P. Miller. Mahwah, N.J.: Lawrence Erlbaum Associates.
Carey, S., and C. Smith. 1993. On understanding the nature of scientific knowledge. Educational Psychologist 28(3): 235–251.
Carraher, T.N. 1986. From drawings to buildings: mathematical scales at work. International Journal of Behavioral Development 9(4): 527–544.
Chi, M.T.H. 2000. Self-explaining: The Dual Processes of Generating Inference and Repairing Mental Models. Pp. 161–238 in Advances in Instructional Psychology: Educational Design and Cognitive Science, vol. 5, edited by R. Glaser. Mahwah, N.J.: Lawrence Erlbaum Associates.
Chi, M.T.H., and R. Glaser. 1981. The Measurement of Expertise: Analysis of the Development of Knowledge and Skills as a Basis for Assessing Achievement. Pp. 37–47 in Design, Analysis and Policy in Testing, edited by E.L. Baker and E.S. Quellmalz. Beverly Hills, Calif.: Sage Publications.
Chi, M.T.H., P.J. Feltovich, and R. Glaser. 1981. Categorization and representation of physics problems by experts and novices. Cognitive Science 5(2): 121–152.
Chi, M.T.H., J.D. Slotta, and N. De Leeuw. 1994. From things to processes: a theory of conceptual change for learning science concepts. Learning and Instruction 4(1): 27–43.
Crismond, D. 2001. Learning and using science ideas when doing investigate-and-redesign tasks: a study of naïve, novice, and expert designers doing constrained and scaffolded design work. Journal of Research in Science Teaching 38(7): 791–820.
Custer, R.L., M. Hoepfl, B. McAlister, J. Schell, and J. Scott. 2000. Using Alternative Assessment in Vocational Education. Information Series No. 381. ERIC Clearinghouse on Adult, Career, and Vocational Education.
Davis, R.S., I.S. Ginns, and C.J. McRobbie. 2002. Elementary school students’ understandings of technology concepts. Journal of Technology Education 14(1): 35–50.
diSessa, A.A. 1988. Knowledge in Pieces. Pp. 49–70 in Constructivism in the Computer Age, edited by G. Forman and P. Pufall. Hillsdale, N.J.: Lawrence Erlbaum Associates.
diSessa, A.A. 1993. Toward an epistemology of physics. Cognition and Instruction 10(2-3): 105–225.
diSessa, A.A. 2002. Why “conceptual ecology” is a good idea. Pp. 29–60 in Reconsidering Conceptual Change: Issues in Theory and Practice, edited by M. Limón and L. Mason. Dordrecht, The Netherlands: Kluwer Academic Publishers.
diSessa, A.A., and B.L. Sherin. 1998. What changes in conceptual change? International Journal of Science Education 20(10): 1155–1191.
diSessa, A.A., and J.F. Wagner. 2005. What Coordination Has to Say About Transfer. Pp. 121–154 in Transfer of Learning from a Modern Multidisciplinary Perspective, edited by J.P. Mestre. Greenwich, Conn.: Information Age Publishing.
diSessa, A.A., N. Gillespie, and J. Esterly. In press. Coherence vs. fragmentation in the development of the concept of force. Cognitive Science.
Druin, A., ed. 1999. The Design of Children’s Technology. San Francisco: Morgan Kaufman.
Druin, A. 2002. The role of children in the design of new technology. Behaviour and Information Technology 21(1): 1–25.
Druin, A., and C. Fast. 2002. The child as learner, critic, inventor, and technology design partner: an analysis of three years of Swedish student journals. International Journal of Technology and Design Education 12(3): 189–213.
Druin, A., and J. Hendler, eds. 2001. Robots for Kids: Exploring New Technologies for Learning Experiences. San Francisco: Morgan Kaufman.
Egan, D.E., and B.J. Schwartz. 1979. Chunking in recall of symbolic drawings. Memory and Cognition 7: 149–158.
Eich, E. 1985. Context, memory, and integrated item/context imagery. Journal of Experimental Psychology: Learning, Memory, and Cognition 11(4): 764–770.
Ericsson, K.A., and W. Kintsch. 1995. Long-term working memory. Psychological Review 102(2): 211–245.
Etkina, E., J. Mestre, and A. O’Donnell. 2005. The Impact of the Cognitive Revolution on Science Learning and Teaching. Pp. 119–164 in The Cognitive Revolution in Educational Psychology, edited by J.M. Royer. Greenwich, Conn.: Information Age Publishing.
Foster, P., and M. Wright. 2001. How children think and feel about design and technology: two case studies. Journal of Industrial Teacher Education 38(2): 40–64.
Gick, M.L., and K.J. Holyoak. 1980. Analogical problem solving. Cognitive Psychology 12(3): 306–355.
Hambleton, R., and J. Rogers. 1995. Item bias review. Practical Assessment, Research and Evaluation 4(6). Available online at: http://pareonline.net/getvn.asp?v=4&n=6.
Hammer, D., and A. Elby. 2003. Tapping epistemological resources for learning physics. Journal of the Learning Sciences 12(1): 53–91.
Hammer, D., A. Elby, R. Scherr, and E. Redish. 2005. Resources, Framing, and Transfer. Pp. 89–119 in Transfer of Learning from a Modern Multidisciplinary Perspective, edited by J.P. Mestre. Greenwich, Conn.: Information Age Publishing.
Hardiman, P.T., R. Dufresne, and J.P. Mestre. 1989. The relation between problem categorization and problem solving among experts and novices. Memory and Cognition 17(5): 627–638.
Hatano, G., and K. Inagaki. 1996. Cognitive and Cultural Factors in the Acquisition of Intuitive Biology. Pp. 683–708 in Handbook of Education and Human Development: New Models of Learning, Teaching and Schooling, edited by D.R. Olson and N. Torrance. Malden, Mass.: Blackwell Publishers Inc.
Hayes, J.R., and H.A. Simon. 1977. Psychological Differences Among Problem Isomorphs. Pp. 21–41 in Cognitive Theory, vol. 2, edited by N.J. Castellan Jr., D.B. Pisoni, and G.R. Potts. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Herbeaux, J.-L., and R. Bannerot. 2003. Cultural Influences in Design. Paper presented at the American Society for Engineering Education Annual Conference and Exposition, June 22–25, 2003, Nashville, Tenn. Available online at: http://asee.org/acPapers/2003-549_Final.pdf (April 5, 2006).
Hestenes, D., M. Wells, and G. Swackhammer. 1992. Force concept inventory. The Physics Teacher 30(3): 141–158.
Hill, A.M., and A. Anning. 2001a. Comparisons and contrasts between elementary/primary “school situated design” and “workplace design” in Canada and England. International Journal of Technology and Design Education 11(2): 111–136.
Hill, A.M., and A. Anning. 2001b. Primary teachers’ and students’ understanding of school situated design in Canada and England. Research in Science Education 31(1): 117–135.
Hinduja, S. 2003. Trends and patterns among online software pirates. Ethics and Information Technology 5(1): 49–61.
Hmelo, C., D. Holton, and J. Kolodner. 2000. Designing to learn about complex systems. Journal of the Learning Sciences 9(3): 247–298.
Ioannides, C., and S. Vosniadou. 2002. The changing meanings of force. Cognitive Science Quarterly 2(1): 5–62.
ITEA (International Technology Education Association). 2000. Standards for Technological Literacy: Content for the Study of Technology. Reston, Va.: ITEA.
Jarvis, T., and L. Rennie. 1996. Understanding technology: the development of a concept. International Journal of Science Education 18(8): 979–992.
Jarvis, T., and L. Rennie. 1998. Factors that influence children’s developing perceptions of technology. International Journal of Technology and Design Education 8(3): 261–279.
Jenson, J., S. de Castell, and M. Bryson. 2003. “Girl talk”: gender equity and identity discourses in a school-based computer culture. Women’s Studies International Forum 26(6): 561–573.
Kafai, Y., C.C. Ching, and S. Marshall. 1997. Children as designers of educational multimedia software. Computers and Education 29(2-3): 117–126.
Kavakli, M., and J. Gero. 2002. The structure of concurrent actions: a case study on novice and expert designers. Design Studies 23(1): 25–40.
Kiplinger, V.L. 1997. Standard-setting procedures for the specification of performance levels on a standards-based assessment. Available online at: http://www.cde.state.co.us/cdeassess/csap/asperf.htm (October 14, 2005).
Klahr, D., and S.M. Carver. 1988. Cognitive objectives in a LOGO debugging curriculum: instruction, learning, and transfer. Cognitive Psychology 20: 362–404.
Larkin, J.H. 1981. Enriching Formal Knowledge: A Model for Learning to Solve Problems in Physics. Pp. 311–334 in Cognitive Skills and Their Acquisition, edited by J.R. Anderson. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Larkin, J.H. 1983. The Role of Problem Representation in Physics. Pp. 75–98 in Mental Models, edited by D. Gentner and A.L. Stevens. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Lave, J. 1988. Cognition in Practice: Mind, Mathematics and Culture in Everyday Life. Cambridge, U.K.: Cambridge University Press.
Linn, R.L., and N.E. Gronlund. 2000. Measurement and Assessment in Teaching, 8th ed. Upper Saddle River, N.J.: Prentice-Hall.
Littlefield, J., V. Delclos, S. Lever, K. Clayton, J. Bransford, and J. Franks. 1988. Learning LOGO: Method of Teaching, Transfer of General Skills, and Attitudes Toward School and Computers. Pp. 111–136 in Teaching and Learning Computer Programming, edited by R.E. Mayer. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Lukhele, R., D. Thissen, and H. Wainer. 1994. On the relative value of multiple-choice, constructed-response, and examinee-selected items on two achievement tests. Journal of Educational Measurement 31(3): 234–250.
Mandler, J.M., and F. Orlich. 1993. Analogical transfer: the roles of schema abstraction and awareness. Bulletin of the Psychonomic Society 31(5): 485–487.
McCloskey, M. 1983a. Intuitive physics. Scientific American 248(4): 122–130.
McCloskey, M. 1983b. Naive Theories of Motion. Pp. 299–313 in Mental Models, edited by D. Gentner and A. Stevens. Hillsdale, N.J.: Lawrence Erlbaum Associates.
McDermott, L.C. 1984. Research on conceptual understanding in mechanics. Physics Today 37(7): 24–32.
McTighe, J., and S. Ferrara. 1996. Assessing Learning in the Classroom. Washington, D.C.: National Education Association.
Messick, S. 1989. Validity. Pp. 13–103 in Educational Measurement, 3rd ed., edited by R.L. Linn. New York: Macmillan.
Mestre, J.P. 1991. Learning and instruction in pre-college physical science. Physics Today 44(9): 56–62.
Mestre, J.P., ed. 2005. Transfer of Learning from a Modern Multidisciplinary Perspective. Greenwich, Conn.: Information Age Publishing.
Mestre, J., T. Thaden-Koch, R. Dufresne, and W. Gerace. 2004. The Dependence of Knowledge Deployment on Context among Physics Novices. Pp. 367–408 in Proceedings of the International School of Physics, “Enrico Fermi” Course CLVI, edited by E.F. Redish and M. Vicentini. Amsterdam: IOS Press.
Moskal, B.M., and J.A. Leydens. 2000. Scoring rubric development: validity and reliability. Practical Assessment, Research and Evaluation 7(10). Available online at: http://pareonline.net/getvn.asp?v=7&n=10 (January 26, 2006).
Mullins, C.A., C.J. Atman, and L.J. Shuman. 1999. Freshman engineers’ performance when solving design problems. IEEE Transactions on Education 42(4): 281–287.
Nitko, A.J. 1996. Bias in Educational Assessment. Pp. 91–93 in Educational Assessment of Students, 2nd ed. Englewood Cliffs, N.J.: Merrill.
NRC (National Research Council). 1996. National Science Education Standards. Washington, D.C.: National Academy Press.
NRC. 2001a. The Science and Design of Educational Assessment. Pp. 44–51 in Knowing What Students Know: The Science and Design of Educational Assessment. Washington, D.C.: National Academy Press.
NRC. 2001b. Knowing What Students Know: The Science and Design of Educational Assessment. Washington, D.C.: National Academy Press.
OERL (Online Evaluation Resource Library). 2006. Alignment Table for Instrument Characteristics—Technical Quality. Available online at: http://oerl.sri.com/instruments/alignment/instralign_tq.html (January 26, 2006).
Orr, J.E. 1996. Talking About Machines: An Ethnography of a Modern Job. Ithaca, N.Y.: ILR Press.
Petrina, S. 2003. “Two cultures” of technical courses and discourses: the case of computer-aided design. International Journal of Technology and Design Education 13(1): 47–73.
Petrina, S., F. Feng, and J. Kim. 2004. How We Learn (About, Through, and for Technology): A Review of Research. Paper commissioned by the Committee on Assessing Technological Literacy. Unpublished.
Pomares-Brandt, P. 2003. Les nouvelles technologies de l’information et de la communication dans les enseignements technologiques [New information and communication technologies in technology education]. Unpublished Ph.D. dissertation. Université de Provence, Aix-Marseille.
Reed, S.K., G.W. Ernst, and R. Banerji. 1974. The role of analogy in transfer between similar problem states. Cognitive Psychology 6(3): 436–450.
Reed, S.K., A. Dempster, and M. Ettinger. 1985. Usefulness of analogous solutions for solving algebra word problems. Journal of Experimental Psychology: Learning, Memory, and Cognition 11(1): 106–125.
Rennie, L., and T. Jarvis. 1995. Three approaches to measuring children’s perceptions about technology. International Journal of Science Education 17(6): 755–774.
Roth, W.-M. 1998. Designing Communities. Dordrecht, The Netherlands: Kluwer Academic Publishers.
Saxe, G.B. 1989. Transfer of learning across cultural practices. Cognition and Instruction 6(4): 325–330.
Schallies, M., A. Wellensiek, and A. Lembens. 2002. The development of mature capabilities for understanding and valuing in technology through school project work: individual and structural preconditions. International Journal of Technology and Design Education 12(1): 41–58.
Schoenfeld, A.H. 1985. Mathematical Problem Solving. New York: Academic Press.
Schoenfeld, A.H., and D.J. Herrmann. 1982. Problem perception and knowledge structure in expert and novice mathematical problem solvers. Journal of Experimental Psychology: Learning, Memory, and Cognition 8(4): 484–494.
Schwartz, W. 1995. Opportunity to Learn Standards: Their Impact on Urban Students. ERIC/CUE Digest no. 110. Available online at: http://www.eric.ed.gov/ERICDocs/data/ericdocs2/content_storage_01/0000000b/80/2a/24/a0.pdf (January 26, 2006).
Selwyn, N. 2001. Turned on/switched off: exploring children’s engagement with computers in primary school. Journal of Educational Computing Research 25(3): 245–266.
Selwyn, N. 2003. Doing IT for the kids: re-examining children, computers and the “information society.” Media, Culture and Society 25(4): 351–378.
Sireci, S.G. 2005. The Most Frequently Unasked Questions About Testing. Pp. 111–121 in Defending Standardized Testing, edited by R.P. Phelps. Mahwah, N.J.: Lawrence Erlbaum Associates.
Smith, C., D. Maclin, L. Grosslight, and H. Davis. 1997. Teaching for understanding: a study of students’ preinstruction theories of matter and a comparison of the effectiveness of two approaches to teaching about matter and density. Cognition and Instruction 15(3): 317–393.
Smith, J.P., A.A. diSessa, and J. Roschelle. 1994. Misconceptions reconceived: a constructivist analysis of knowledge in transition. Journal of the Learning Sciences 3(2): 115–163.
Stiggins, R. 1995. Assessment literacy for the 21st century. Phi Delta Kappan 77(3): 238–245.
Strike, K.A., and G.J. Posner. 1985. A Conceptual Change View of Learning and Understanding. Pp. 211–231 in Cognitive Structure and Conceptual Change, edited by L.H.T. West and A.L. Pines. New York: Academic Press.
Taxen, G., A. Druin, C. Fast, and M. Kjellin. 2001. KidStory: a technology design partnership with children. Behaviour and Information Technology 20(2): 119–125.
Thomson, C. 1997. Concept Mapping as an Aid to Learning and Teaching. Pp. 97–110 in Shaping Concepts of Technology: from Philosophical Perspectives to Mental Images, edited by M.J. de Vries and A. Tamir. Dordrecht, The Netherlands: Kluwer Academic Publishers. Reprinted from International Journal of Technology and Design Education 7(1-2).
Tytler, R. 2000. A comparison of year 1 and year 6 students’ conceptions of evaporation and condensation: dimensions of conceptual progression. International Journal of Science Education 22(5): 447–467.
Upitis, R. 1998. From hackers to Luddites, game players to game creators: profiles of adolescent students using technology. Journal of Curriculum Studies 30(3): 293–318.
Vosniadou, S., and W.F. Brewer. 1992. Mental models of the Earth: a study of conceptual change in childhood. Cognitive Psychology 24(4): 535–585.
Vosniadou, S., and C. Ioannides. 1998. From conceptual development to science education: a psychological point of view. International Journal of Science Education 20(10): 1213–1230.
Waller, A. 2004. Final Report on a Literature Review of Research on How People Learn Engineering Concepts and Processes. Paper commissioned by the Committee on Assessing Technological Literacy. Unpublished.
Watson, J.R., T. Prieto, and J.S. Dillon. 1997. Consistency of students’ explanations about combustion. Science Education 81(4): 425–443.
Wegerif, R. 2004. The role of educational software as a support for teaching and learning conversations. Computers and Education 43: 179–191.
Welch, M., D. Barlex, and H.S. Lim. 2000. The strategic thinking of novice designers: discontinuity between theory and practice. Journal of Technology Studies 25(2): 34–44.
White, B.Y., and J.R. Frederiksen. 1998. Inquiry, modeling, and metacognition: making science accessible to all students. Cognition and Instruction 16(1): 3–118.
Wineburg, S. 1998. Reading Abraham Lincoln: an expert/expert study in historical cognition. Cognitive Science 22(3): 319–346.
Zeidler, D., ed. 2003. The Role of Moral Reasoning on Socioscientific Issues and Discourse in Science Education. Dordrecht, The Netherlands: Kluwer Academic Publishers.