Suggested Citation:"Appendix." National Research Council. 1996. Evaluation of "Redesigning the National Assessment of Educational Progress". Washington, DC: The National Academies Press. doi: 10.17226/5419.

Appendix: Redesigning the National Assessment of Educational Progress: Draft For Public Comment

A slightly modified version was adopted on August 2, 1996.

National Assessment Governing Board

National Assessment of Educational Progress

Redesigning The National Assessment of Educational Progress

Draft for Public Comment

The National Assessment Governing Board seeks your comments on this proposed policy to change the National Assessment of Educational Progress. The National Assessment provides the American public with information about student achievement nationally and state-by-state. The proposed policy describes changes that will make the National Assessment a more effective monitor of student achievement and make it more useful to the public.

Written comments should be submitted to Ray Fields, National Assessment Governing Board, Suite 825, 800 North Capitol Street, NW, Washington, DC 20002 for receipt by June 28, 1996. Your comments will be used to help refine the proposed policy before final action by the National Assessment Governing Board on August 3, 1996.

Prepared by The Work Group on Planning

Mark Musick, Chair

Marilyn McConachie

Jason Millman

Richard Mills

William Moloney

Michael Nettles

William Randall

Staff: Daniel Taylor, Ray Fields

800 North Capitol Street, N. W.

Suite 825, Mailstop 7583

Washington, D.C. 20002-4233

Phone: (202) 357-6938

Fax: (202) 351-6945

Redesigning the National Assessment of Educational Progress

Overview

The National Assessment is the only means for the American public to know with accuracy how its students are achieving nationally and state-by-state. However, in its current form, the National Assessment provides too little information, too infrequently and too late. Over the years, the National Assessment has been asked to do more and more beyond its central purpose. Additions have been made without changing its basic design, making the National Assessment overly complex and costly.

The National Assessment must be changed in order to provide the public the information it needs about student achievement within the funding available. The National Assessment must be simplified. Its funding should be focused on its central purpose: reporting on student achievement in ten required subjects. Useful but less essential activities should be cut back or carried out by others.

The audience for the National Assessment is the American public. Reports should be timely, easy to use, understandable, and widely available. Results should describe both changes over time and whether student achievement on the National Assessment is “good enough.”

While change is in order, many current policies should not change. For example, reliability, validity, and accuracy of the data will remain hallmarks of the National Assessment. Students who are tested will be as representative as possible of the students in that grade; exclusions because of disability or limited English proficiency will be kept to a minimum.

The proposals recommended below will make the National Assessment more useful and allow it to report on more subjects, more frequently, and more quickly. The recommendations include the following:

test annually according to a publicly released schedule
provide state-level results in reading, writing, math and science at grade 4 and grade 8, according to a predictable schedule
use performance standards for reporting whether student achievement is “good enough”
use international comparisons where feasible
help states and others link their assessments with the National Assessment
vary the amount of detail in testing and reporting
simplify the National Assessment test design
keep test frameworks and specifications stable for at least ten years
simplify how student achievement trends are reported
emphasize grade-based reporting over age-based reporting
make use of innovations in testing and reporting
use an appropriate mix of multiple choice and performance test questions

Redesigning the National Assessment of Educational Progress

A Better Way to Measure Educational Progress in America

An effective democracy and a strong economy require well-educated citizens. A good education lays a foundation for getting a good job, leading a fulfilling life, and participating constructively in society.

But is the education provided in your state and in America good enough? How do our 12th graders compare with students in other nations in mathematics and science? Do our 8th grade students have an adequate understanding of the workings of our constitutional democracy? How well do our 4th grade students read, write, and compute? The National Assessment of Educational Progress is the only way for the public to know with accuracy how American students are achieving nationally and state-by-state.

The National Assessment tests at grades 4, 8 and 12. By law, it covers ten subjects, including reading, writing, math and science. The National Assessment has performance standards that indicate whether student achievement is “good enough.” The National Assessment is not a national exam taken by all students. In fact, only several thousand students are tested per grade, comprising carefully drawn samples that represent the nation and the participating states. Since its first test in 1969, the National Assessment has earned a trusted reputation for its quality and credibility. That reputation must be maintained.

The National Assessment is unique because of its national, state-by-state, and 12th grade results. State and local test results cannot be used to provide a national picture of student achievement. States and local schools use different tests that vary in many ways. The results cannot simply be “added up” to get a national score nor can state scores on their different tests be compared. Virtually no state tests 12th graders, so the only source of information about 12th grade achievement is the National Assessment. College entrance tests such as the ACT and the SAT are taken only by students planning on higher education; the results do not represent the achievement of the total 12th grade class. Twelfth grade achievement is important to monitor because it marks the end of elementary and secondary education, the transition point for most students from school to work, to college, or to technical training.

While there is much about the National Assessment that is working well, there is a problem. Under its current design, the National Assessment tests too few subjects, too infrequently, and reports achievement results too late--as much as 18 to 24 months after testing. Testing occurs every other year. During the 1990's, only reading and mathematics will be tested more than once using up-to-date tests and performance standards. Six subjects will be tested only once and two subjects not at all during the 1990's.

Why is the National Assessment testing so few subjects and fewer subjects now than years ago? Over the years, the National Assessment has become increasingly complex. Its quality and integrity have led to a multitude of demands and expectations beyond its central purpose. Meeting those expectations was done with good intentions and seemed right for the situation at the time. However, additions to the National Assessment have been “tacked on” without changing the basic design, reducing the number of subjects that can be tested and driving up costs.

For example, where a single 120 page mathematics report once sufficed, mathematics reporting in 1992 consisted of seven volumes totalling almost 1,800 pages, not including individual state reports. Also, there are now two separate testing programs for reading, writing, math and science. One monitors trends using tests developed during the 1970's; the other reflects current views on instruction and uses performance standards to report whether achievement is good enough. In addition, there are separate samples for reporting national and state results, even when the state samples may be adequate for some national reports.

The current National Assessment design is overburdened, inefficient and redundant. It is unable to provide the frequent, timely reports on student achievement the American public needs. The challenge is to supply more information, more quickly, with the funding available.

To meet this challenge, the National Assessment design must be changed, building on its strengths while making it more efficient. The design of the National Assessment must be simplified. The purpose of the National Assessment must be sharply focused and its principal audience clearly defined. Because the National Assessment cannot do all that some would have it do, trade-offs must be made among desirable activities. Useful but less important activities may have to be reduced, eliminated, or carried out by others. The National Assessment must “stick to its knitting” in order to be more cost-effective, reach more of the public, provide more information more promptly, and maintain its integrity.

[On the pages that follow are preliminary proposals for new policies for the National Assessment being offered for public comment by the National Assessment Governing Board. The intent of these proposals is to specify purposes, audiences, and changes that will make the National Assessment a more effective monitor of student achievement.]

Purpose of the National Assessment of Educational Progress

The purpose of the National Assessment is stated in its legislation:

to provide a fair and accurate presentation of educational achievement in reading, writing, and the other subjects included in the third National Education Goal, regarding student achievement and citizenship.

Thus, the central concern of the National Assessment is to inform the nation on the status of student achievement. The National Assessment Governing Board believes that this should be accomplished through the following objectives:

to measure national and state progress toward the third National Education Goal and provide timely, fair and accurate data about student achievement at the national level, among the states, and in comparison with other nations;
to develop, through a national consensus, sound assessments to measure what students know and can do as well as what students should know and be able to do; and
to help states and others link their assessments with the National Assessment and use National Assessment data to improve education performance.

The Audience for the National Assessment

The primary audience for National Assessment results is the American public, including the general public in states that receive their own results from the National Assessment. Reports should be written for this audience. Results should be released within 6 months of testing. Reports should be understandable, jargon free, easy to use, and widely disseminated.

Principal users of National Assessment data are state policymakers and educators concerned with student achievement, curricula, testing and standards. National Assessment data should be available to these users in forms that support their efforts to interpret results to the public and to improve education performance.

What the National Assessment Is Not

The National Assessment is intended to describe how well students are performing, but not to explain why. The National Assessment only provides group results; it is not an individual student test. The National Assessment tests academic subjects and does not collect information on individual students' personal values or attitudes. Each National Assessment test is developed through a national consensus process. This national consensus process takes into account education practices, the results of education research, and changes in the curricula. However, the National Assessment is independent of any particular curriculum and does not promote specific ideas, ideologies, or teaching techniques. Nor is the National Assessment an appropriate means, by itself, for improving instruction in individual classrooms, evaluating the effects of specific teaching practices, or determining whether particular approaches to curricula are working.

Recommended Changes to the National Assessment

To provide the American public with more frequent information in more subjects about the progress of student achievement, changes must be made in the way that the National Assessment is designed and the results are reported. Many current policies should continue. Reliability, validity, and quality of data will remain hallmarks of the National Assessment. The sample of tested students will be as representative as possible, keeping to a minimum the number of students excluded because of disability or limited English proficiency. Tests and test frameworks will be kept stable to measure progress in student achievement over time.

The recommended changes relate to the three objectives outlined above. Current contracts for conducting the National Assessment extend through 1998. Changes can be incorporated in assessments in the year 1999 and thereafter. Where feasible, these recommendations should be used to guide decisions under current contracts.

OBJECTIVE 1: To measure national and state progress toward the third National Education Goal and provide timely, fair and accurate data about student achievement at the national level, among the states, and in comparison with other nations.

Test all subjects specified by Congress: reading, writing, mathematics, science, history, geography, civics, the arts, foreign language, and economics

The gap must be closed between the number of subjects the National Assessment is required to test and the number of subjects it can test under the current design. By law, the National Assessment is required to test ten subjects and report results and trends. In order to chart progress and report trends, subjects must be tested more than once. However, during the 1990's only reading and mathematics will have been tested more than once using up-to-date tests and performance standards to report how well students are doing.

Recommendations:

the National Assessment should be conducted annually;
reading, writing, mathematics and science should be given priority, with testing in these subjects conducted according to a publicly released 10-year schedule adopted by the National Assessment Governing Board;
history, geography, the arts, civics, foreign language, and economics also should be tested on a reliable basis according to a publicly released schedule adopted by the National Assessment Governing Board.

Vary the amount of detail in testing and in reporting results

More subjects can be tested if different strategies are used. But each time the National Assessment is conducted, it uses a similar approach, regardless of the nature of the subject or the number of times a subject has been tested. This approach is locked in through 1998 under current contracts. Under this approach, a larger number of students is tested in order to provide not just overall results, but fine-grained details as well (e.g. the achievement scores of 4th grade students whose teachers that year had five hours or more of in-service training). The National Assessment also collects “background” information through questionnaires completed by students, teachers, and principals. The questionnaires ask about teaching practices, school policies, and television watching, to name a few. Data analyses are elaborate. Reports are detailed and exhaustive, involving as many as seven separate reports per subject. Although the National Assessment has been praised for this thoroughness, it comes at the cost of testing more subjects, more frequently, with more timely reporting.

The different strategies needed might include several approaches to testing and reporting. For example, these approaches could take the form of “standard report cards,” “comprehensive reports,” and special, focused assessments. A standard report card would provide overall results in a subject with performance standards and average scores. Results for standard report cards would be reported by sex, race/ethnicity, socio-economic status, and for public and private schools, but would not be broken down further. This may reduce the number of students needed for testing and may reduce associated costs. Student, teacher and principal survey questionnaires, if collected at all, would be limited and selective, with reports of results focused on only the most essential issues. Generally, subcategories within a subject (e.g. algebra, measurement and geometry within mathematics) would not be reported. However, data from the National Assessment would continue to be available to state and local educators and policymakers for additional analysis. Most National Assessment reports would use this strategy.

Comprehensive reports, like the current approach, would be an in-depth look at a subject, perhaps using a newly adopted test framework, many students, many test questions, and ample background information. In addition to overall results using performance standards and average scores, subcategories within a subject could be reported. Results would be reported by sex, race/ethnicity, socio-economic status, and for public and private schools, and might be broken down further as well. In some cases, more than one report may be issued in a subject. However, comprehensive reporting would occur infrequently, perhaps once in ten years in any one subject.

Special, focused assessments in a subject would be scheduled as needed. They would explore a particular question or issue and may be limited to particular grades. Generally, the cost would be less than the cost of a standard report card. Examples of these smaller-scale, focused assessments include: (1) assessing subjects using targeted approaches (e.g. 8th grade arts), (2) testing special populations (e.g. in-school 12th graders vs. out-of-school youth), and (3) examining skills and knowledge across several subjects (e.g. readiness for work).

Recommendations:

National Assessment testing and reporting should vary, using standard report cards most frequently, comprehensive reporting in selected subjects about once every ten years, and special, focused assessments as needed;
National Assessment results should be timely, with the goal being to release results within 6 months of the completion of testing.

Simplify the National Assessment design

The current design of the National Assessment is very complex. No student takes the complete set of test questions in a subject and as many as twenty-six different test booklets are used within each grade. Students, teachers, and principals complete separate questionnaires and may submit them for scoring at different times. Scores are not calculated directly from the test booklets, but are estimated using statistical procedures known as “conditioning,” “drawing plausible values,” and “imputation.” The estimates are calculated in part by using the questionnaire data collected from the students, teachers, and principals, in addition to the student answers to the test questions. Although using these procedures helps make the data accurate, it also increases the possibility of mistakes. Under these procedures, each time a problem arises in analyzing the data, everything must be redone. It is not unusual for data to be re-calculated hundreds of times. The current complex design of the National Assessment lengthens the time from testing to reporting and adds significantly to its cost.
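The idea that group results can be reported even though no student takes every question can be illustrated with a minimal sketch. It is not the conditioning, plausible values, or imputation procedure named above; it only pools answers across booklets to get a percent correct per question. The booklet contents, question identifiers, and the function name `item_percent_correct` are hypothetical.

```python
from collections import defaultdict


def item_percent_correct(responses):
    """Pool (item_id, is_correct) pairs from all booklets into percent correct per item."""
    attempts = defaultdict(int)
    correct = defaultdict(int)
    for item_id, is_correct in responses:
        attempts[item_id] += 1
        if is_correct:
            correct[item_id] += 1
    return {item: 100.0 * correct[item] / attempts[item] for item in attempts}


# Hypothetical data: three students, each given a different booklet
# containing only two of the three questions.
booklet_a = [("q1", True), ("q2", False)]   # student 1
booklet_b = [("q2", True), ("q3", True)]    # student 2
booklet_c = [("q1", False), ("q3", True)]   # student 3

group_results = item_percent_correct(booklet_a + booklet_b + booklet_c)
```

Each question still receives a group-level result, although no single student answered all three. An operational design would also need sampling weights and a common score scale, which this sketch leaves out.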

Recommendation:

options should be identified to simplify the design of the National Assessment and reduce reliance on conditioning, plausible values, and imputation to estimate group scores.

Simplify the way the National Assessment reports trends in student achievement

From its beginning in 1969, monitoring achievement trends has been a central mission of the National Assessment of Educational Progress. Since 1990, the National Assessment has reported achievement trends using two unconnected testing programs. The tests, criteria for selecting students, and reporting are all different. The first program, “the main National Assessment,” tests at grades 4, 8 and 12 and covers ten subjects. The tests are based on a national consensus representing current views of each subject. Performance standards are used to report whether student achievement on the National Assessment is “good enough.” The schedule of subjects to be tested in the main National Assessment is unrelated to the schedule of subjects tested under the second testing program.

The second testing program reports long-term trends that go as far back as 1970. Only four subjects are covered: reading, writing, mathematics and science. The tests are based on views of the curricula prevalent during the 1970's and have not been changed. Testing is at ages 9, 13 and 17 except for writing, which tests at grades 4, 8 and 11. Trends are reported by average score; performance standards are not used. The long-term trend program has been valuable for documenting declines and increases in student achievement over time and a decrease in the achievement gap between minority and non-minority students.

It may be impractical and unnecessary to operate two separate testing programs. However, it also is likely that curricula will continue to change and that current test frameworks may be less relevant in the future. The tension between the need for stable measures of student achievement and changing curricula must be addressed carefully.

Recommendations:

a carefully planned transition should be developed to enable “the main National Assessment” to become the primary way to measure trends in reading, writing, mathematics and science in the National Assessment program;
as a part of the transition, the National Assessment Governing Board will review the tests now used to monitor long-term trends in reading, writing, mathematics and science to determine how they might be used now that new tests and performance standards have been developed during the 1990's for “the main National Assessment.” The Governing Board will decide how to continue the present long-term trend assessments, how often they would be used, and how the results would be reported.

Use performance standards to report whether student achievement is “good enough”

In reporting on “educational progress,” the National Assessment has, until recently, only considered current student performance compared to student achievement in previous years. Under this approach, the only standard was how well students had done previously, not how well they should be doing on what is measured by the National Assessment. Although this approach has been useful, it began to change in 1988 from a sole focus on “where we have been” to include “where we want to be” as well.

In 1988, Congress created a non-partisan citizen's group--the National Assessment Governing Board--and authorized it to set explicit performance standards, called achievement levels, for reporting National Assessment results.

The achievement levels describe “how good is good enough” on the various tests that make up the National Assessment. Previously, it might have been reported that the average math score of 4th graders went up (or down) four points on a five-hundred-point scale. There was no way of knowing whether the previous score represented strong or weak performance and whether the amount of change should give cause for concern or celebration. In contrast, the National Assessment now also reports the percentage of students who are performing at or above “basic,” “proficient,” and “advanced” levels of achievement. Proficient, the central level, represents “competency over challenging subject matter,” as demonstrated by how well students perform on the questions on each National Assessment test. Basic denotes partial mastery and advanced signifies superior performance on the National Assessment. Using achievement levels to report results and track changes allows readers to make judgments about whether performance is adequate, whether “progress” is sufficient, and how the National Assessment standards and results compare to those of other tests, such as state and local tests.
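The arithmetic behind reporting the percentage of students "at or above" each achievement level might look like the sketch below. The cut scores and sample scores are invented for illustration and are not actual National Assessment values. The calculation is unweighted, whereas a real sample-based report would apply sampling weights.

```python
# Hypothetical cut scores on a 0-500 scale; NOT actual National Assessment values.
CUT_SCORES = {"basic": 200, "proficient": 250, "advanced": 300}


def percent_at_or_above(scores, cut_scores=CUT_SCORES):
    """Return, for each level, the percentage of scores at or above its cut score."""
    if not scores:
        raise ValueError("scores must not be empty")
    total = len(scores)
    return {
        level: 100.0 * sum(1 for s in scores if s >= cut) / total
        for level, cut in cut_scores.items()
    }


# Hypothetical scale scores for ten tested students.
sample_scores = [180, 210, 240, 255, 260, 275, 290, 305, 195, 230]
report = percent_at_or_above(sample_scores)
```

Because the levels are cumulative, a student counted at or above "proficient" is also counted at or above "basic." The percentages therefore do not sum to 100.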

Recommendation:

the National Assessment should continue to report student achievement results based on performance standards.

Use international comparisons

Looking at student performance and curriculum expectations in other nations is yet another way to consider the adequacy of U.S. student performance. The National Assessment is, and should be, a domestic assessment. However, decisions on the content of National Assessment tests, the achievement standards, and the interpretation of test results, where feasible, should be informed, in part, by the expectations for education set by other countries, such as Japan, Germany, and England. This, in turn, should take into account problems in making international comparisons truly comparable. In addition, the National Assessment should promote “linking” studies with international assessments, as has been done with the Third International Mathematics and Science Study, so that states that participate in the National Assessment can have state, national and international comparisons.

Recommendations:

National Assessment test frameworks, test specifications, achievement levels and data interpretations should take into account, where feasible, curricula, standards, and student performance in other nations;
the National Assessment should promote “linking” studies with international assessments.

Emphasize reporting for grades 4, 8 and 12

An aspect of the National Assessment design that needs reconsideration is age-based versus grade-based reporting. At its inception, the National Assessment tested only by age. Current law requires testing both by age (ages 9, 13 and 17) and by grade (grades 4, 8 and 12). Grade-based results are generally more useful than age-based results. Schools and curricula are organized by grade, not by age. Grades 4, 8 and 12 mark key transition points in American education. Grade 12 performance is particularly important as an “exit” measure from the K-12 education system. Grades 4, 8 and 12 are specified for monitoring in National Education Goal 3. Age-based samples may be more appropriate with respect to international comparisons and, given high school drop-out rates, would be more inclusive for age 17 than for grade 12 samples, which are limited to youth enrolled in school. However, assessing the knowledge and skills of out-of-school youth may properly fall under the purpose of another program, such as the National Adult Literacy Survey.

Although grade-based reporting is generally preferable, there is a problem with the accuracy of grade 12 National Assessment results. At grade 12, a smaller percentage of the schools and students that are invited actually participate in testing than is the case with 4th and 8th graders. Also, more 12th graders fail to complete their tests than do 4th and 8th graders. In addition, when asked “How hard did you try on this test?” and “How important is doing well on this test?” many more 12th graders than 4th or 8th graders say that they didn't try hard and that the test wasn't important. Low participation rates, low completion rates, and indicators of low motivation suggest that the National Assessment may be underestimating what 12th graders know and can do.

One possible reason for low response and low motivation is that schools and students receive very little in return for their participation in the National Assessment beyond the knowledge that they are performing a public service. They do not receive test scores nor do they receive other information from the National Assessment that teachers and principals might wish to use as a part of the instructional program. This should be changed. The National Assessment design should use meaningful, practical incentives that will give school principals and teachers a greater reason to participate and students more of a reason to try harder. The underlying idea is clear: if principals and teachers see direct benefits, they are more likely to agree to participate in the National Assessment. Students may be more likely to take the assessment seriously if they see that their teachers and principals are enthusiastic about participating.

Recommendations:

the National Assessment should continue to test in and report results for grades 4, 8 and 12; however, in selected subjects, one or more of these grades may not be tested;
age-based testing and reporting should continue only to the extent necessary for international comparisons and for long-term trends, should the Governing Board decide to continue long-term trends in their current form;
grade 12 results should be accompanied by clear, highlighted statements about school and student participation, student motivation, and cautions, where appropriate, about interpreting 12th grade achievement results;
the National Assessment design should seek to improve school and student participation rates and student motivation at grade 12.

National Assessment results for states

In 1988, testing at the state level was added to the National Assessment. Previously, the National Assessment reported only national and regional results. For the first time, the information was relevant to individuals in states who make decisions about education funding, governance and policy. As a result, states now are major users of National Assessment data.

Participation was strong in the first state-level assessment in 1990 and has grown to include even more states. In 1996, 44 states and 3 jurisdictions participated in the math assessments at grades 4 and 8 and the science assessment at grade 8.

Currently, the National Assessment draws a separate sample to obtain national results in addition to the samples drawn for individual state reports. Testing separate national samples increases costs and creates additional burdens on states, particularly small states. If this practice can be discontinued, savings should be possible.
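To illustrate how national results might be estimated from state samples, the sketch below combines state mean scores into a single figure, weighting each state by its share of total enrollment. This is a simplified illustration, not the National Assessment's actual estimation procedure. The function name, the enrollment counts, and the scores are all hypothetical, and the sketch assumes that every state participates.

```python
def national_mean(state_results):
    """Combine state mean scores into one national estimate.

    state_results: list of (enrollment, mean_score) pairs, one per state.
    Each state's mean is weighted by its share of total enrollment.
    """
    total_enrollment = sum(enrollment for enrollment, _ in state_results)
    if total_enrollment <= 0:
        raise ValueError("total enrollment must be positive")
    weighted_sum = sum(enrollment * mean for enrollment, mean in state_results)
    return weighted_sum / total_enrollment


# Hypothetical enrollments and mean scores, for illustration only.
example_states = [
    (100_000, 270.0),
    (300_000, 260.0),
    (600_000, 280.0),
]
example_estimate = national_mean(example_states)
print(example_estimate)
```

Because larger states carry more weight, the combined estimate is pulled toward their results. Where some states do not participate, a supplementary sample would presumably still be needed to cover them.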

States participate in the National Assessment for many reasons, including the desire for an unbiased, external benchmark to help them make judgments about their own tests and standards. National Assessment data are used to make comparisons with other states, to help determine whether curriculum and standards are rigorous enough, to develop questions about curricular strengths and weaknesses, to make state-to-international comparisons, and to provide a general indicator of achievement.

States have a strong interest in using the National Assessment to get state-level information in reading, writing, science and mathematics. The level of interest in using the National Assessment varies with respect to the other subjects. State education officials are most interested in the National Assessment testing at grades 4 and 8. They say that obtaining cooperation from high schools and 12th grade students is difficult. Also, from their perspective, 12th grade testing comes at the end of compulsory schooling, after which remediation is not feasible within the elementary and secondary system.

States are active partners in the National Assessment program. States help develop National Assessment test frameworks, review test items, and assist in conducting the tests. The National Assessment program is effective, to a great degree, because of the involvement of the states.

Because it is useful to them, and because they invest time and resources in it, states want a dependable schedule for National Assessment testing. With a dependable schedule, states that wish to do so will be better able to coordinate the National Assessment with their own state testing programs and make better use of the National Assessment as an external reference point.

Recommendations:

National Assessment state-level assessments should be conducted on a reliable, predictable schedule according to a 10-year plan adopted by the Governing Board;
reading, writing, mathematics, and science at grades 4 and 8 should be given priority for National Assessment state-level testing;
testing in other subjects and at grade 12 should be permitted at state option and cost;
where possible, national results should be estimated from state samples in order to reduce burden on states, increase efficiency and save costs.

Use innovations in measurement and reporting

The National Assessment has a record of innovations in large-scale testing. These include the early use of performance items, sampling both students and test questions, using standards describing what students should know and be able to do, and employing computers for such things as inventory control, scoring, data analysis and reporting. The National Assessment should continue to incorporate promising innovative approaches to test administration and improved methods for measuring and reporting student achievement.

Technology can help improve National Assessment reporting and testing. For example, reports could be put on computer disc, transmitted electronically, and made available through the World-Wide Web. Test questions could be catalogued and made available on-line for use by state assessment personnel and classroom teachers. Also, the National Assessment could be administered by computer, eliminating the need for costly test booklet systems and reducing steps related to data entry of student responses. Students could answer “performance items” in cost-effective, computerized formats. The increasing use of computers in schools may make it feasible to administer some parts of the National Assessment by computer under the next contract for the National Assessment, beginning around the year 2000.

Other examples of promising methods for measuring and reporting student achievement include adaptive testing and domain-score reporting. In adaptive testing, each student is given a short “pre-test” to estimate that student's level of achievement. On the basis of the pre-test, higher achieving students are given tougher questions; students who know and can do less are given easier questions. Since the test is “adapted” to the individual, it is more precise and can be markedly more efficient than regular test administration. In domain-score reporting, a subject (or “domain”) is well-defined, a goodly number of test questions are developed that encompass the subject, and student results are reported as a percentage of the “domain” that students “know and can do.” This is in contrast to reporting results using an arbitrary scale, such as the 0-500 scale used in the National Assessment.
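The two methods described above can be sketched in a few lines. The example is a minimal illustration under stated assumptions, not a description of any procedure used by the National Assessment. The function names and the 50 percent cutoff are hypothetical. Operational adaptive tests typically rely on more elaborate statistical models than this simple two-stage routing.

```python
def route_by_pretest(pretest_correct, pretest_total, cutoff=0.5):
    """Choose the second-stage question set from a short pre-test.

    The cutoff is a hypothetical threshold: students at or above it
    receive harder questions, and students below it receive easier ones.
    """
    if pretest_total <= 0:
        raise ValueError("pre-test must contain at least one question")
    proportion = pretest_correct / pretest_total
    return "harder" if proportion >= cutoff else "easier"


def domain_score(responses):
    """Percentage of sampled domain questions answered correctly.

    responses: list of booleans, one per question drawn from the domain.
    The percentage on the sample serves as an estimate for the whole domain.
    """
    if not responses:
        raise ValueError("at least one response is required")
    return 100.0 * sum(responses) / len(responses)


print(route_by_pretest(4, 5))                   # routed to harder questions
print(domain_score([True, True, False, True]))  # 75.0
```

Unlike a position on the 0-500 scale, a domain score of 75 can be read directly as an estimate that students know and can do about three-quarters of the defined subject.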

Recommendations:

the National Assessment should assess the merits of advances related to technology and the measurement and reporting of student achievement;
where warranted, the National Assessment should implement such advances in order to reduce costs and/or improve test administration, measurement and reporting;
the next competition for National Assessment contracts, for assessments beginning around the year 2000, should ask bidders to provide a plan for (1) conducting testing by computer in at least one subject at one grade, and (2) making use of technology to improve test administration, measurement, and reporting.

OBJECTIVE 2: To develop, through a national consensus, sound assessments to measure what students know and can do as well as what students should know and be able to do.

Keep test frameworks and specifications stable

Test frameworks spell out in general terms how a test will be put together. The test frameworks also determine what will be reported and influence how expensive an assessment will be. Should 8th grade mathematics include algebra questions? Should there be both multiple-choice questions and questions in which students show their work? What is the best mix of such types of questions for each grade? Which grades are appropriate for testing in a subject area? Test specifications provide detailed instructions to the test writers about the specific content to be tested at each grade, how test questions will be scored, and the format for each test question (e.g., multiple-choice or essay).

Test frameworks and specifications are developed through a national consensus process conducted by the Governing Board. The national consensus process involves hundreds of teachers, curriculum experts, directors of state and local testing programs, administrators, and members of the public. The national consensus process helps determine what is important for the National Assessment to test, how it should be measured, and how much of what is measured by the National Assessment students should know and be able to do in each subject.

Through the national consensus process, both current classroom teaching practices and important developments in each subject area are considered for inclusion in the National Assessment. In order to ensure that National Assessment data fairly represent student achievement, the test frameworks and specifications are subjected to wide public review before adoption and all test questions developed for the National Assessment are reviewed for relevance and quality by representatives from each participating state.

An important role of the National Assessment is to report on trends in student achievement over time. For the National Assessment to be able to measure trends, the frameworks (and hence the tests) must remain stable. However, as new knowledge is gained in subject areas and as teaching practices change and evolve, pressures arise to change the test frameworks and tests to keep them current. But if frameworks, specifications and tests change too frequently, trends may be lost, costs may rise, and reporting time may increase.

Recommendations:

test frameworks and test specifications developed for the National Assessment generally should remain stable for at least ten years;
to ensure that trend results can be reported, the pool of test questions developed in each subject for the National Assessment should provide a stable measure of student performance for at least ten years;
in rare circumstances, such as where significant changes in curricula have occurred, the Governing Board may consider making changes to test frameworks and specifications before ten years have elapsed;
in developing new test frameworks and specifications, or in making major alterations to approved frameworks and specifications, the cost of the resulting assessment should be estimated. The Governing Board will consider the effect of that cost on the ability to test other subjects before approving a proposed test framework and/or specifications.

Use an appropriate mix of multiple-choice and “performance” questions

To provide information about “what students know and can do,” the National Assessment uses both multiple-choice questions and questions in which students are asked to provide their own answers, such as writing a response to an essay question or explaining how they solved a math problem. Questions of the latter type are sometimes called “performance items.” The two types of questions may require students to demonstrate different kinds of skills and knowledge.

Performance items are desired because they provide direct evidence of what students can do. Individuals confronted with problems in the real world are seldom handed four possible answers, one of which is correct. Although they may be desirable, performance items are more expensive than multiple-choice to develop, administer, and score.

Multiple-choice questions are desired because, given the time available for testing, they make it more practical to draw conclusions about the kinds of skills and knowledge they assess. However, multiple-choice questions are more subject to guessing than are performance items.
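The effect of guessing can be made concrete with a short calculation. The sketch below assumes that every question offers the same number of equally likely answer choices and that the student guesses blindly on all of them. The function name and the 40-question test length are hypothetical.

```python
def expected_correct_by_guessing(num_questions, num_options=4):
    """Expected number of correct answers from blind guessing alone.

    Assumes each question has num_options equally likely choices,
    exactly one of which is correct.
    """
    if num_questions < 0 or num_options < 1:
        raise ValueError("invalid question or option count")
    return num_questions / num_options


# A hypothetical 40-question test with four choices per question.
print(expected_correct_by_guessing(40))  # 10.0
```

Under these assumptions, a student who knows none of the material would still be expected to answer one question in four correctly. Performance items offer no comparable chance score.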

Currently, all students tested by the National Assessment are given both types of questions. Generally, about half the testing time is devoted to each type of question, but the amount of time for each differs based on the skills and knowledge to be assessed, as established in the National Assessment test frameworks. For example, in a writing assessment, all students are asked to write their responses to specific “prompts.” In other subjects, the appropriate mix of multiple-choice and performance items varies.

Recommendations:

both multiple-choice and performance items should continue to be used in the National Assessment;
in developing new test frameworks, specifications, and questions, decisions about the appropriate mix of multiple-choice and performance items should take into account the nature of the subject, the range of skills to be assessed, and cost.

OBJECTIVE 3: To help states and others link their assessments with the National Assessment and use National Assessment data to improve education performance.

The primary job of the National Assessment is to report frequently and promptly to the American public on student achievement. The resources of the National Assessment must be focused on this central purpose if it is to be achieved. However, the products of the National Assessment (test questions, test data, frameworks, and specifications) are widely regarded as being of high quality. They are developed with public funds and, therefore, should be available for public use as long as such uses do not threaten the integrity of the National Assessment or its ability to report regularly on student achievement.

The National Assessment should be designed in a way that permits its use by others while protecting the privacy of students, teachers, and principals who have participated in the National Assessment. This should include making National Assessment test questions and data easy to access and use, and providing related technical assistance upon request. Generally, the costs of a project should be borne by the individual or group making the proposal, not by the National Assessment. Examples of areas in which particular interest has been expressed for using the National Assessment include linking state and local tests with the National Assessment and performing in-depth analysis on National Assessment data. States that link their tests to the National Assessment would have an unbiased external benchmark to help make judgments about their own tests and standards and would also have a means for comparing their tests and standards with those of other states.

Recommendations:

the National Assessment should develop policies, practices and procedures that enable states, school districts and others who want to do so, at their own cost, to conduct studies to link their test results to the National Assessment;
the National Assessment should be designed so that others may access and use National Assessment test questions, test data and background information;
the National Assessment should employ safeguards to protect the integrity of the National Assessment program, prevent misuse of data, and ensure the privacy of individual test takers.

NATIONAL ACADEMY PRESS

The National Academy Press was created by the National Academy of Sciences to publish the reports issued by the Academy and by the National Academy of Engineering, the Institute of Medicine, and the National Research Council, all operating under the charter granted to the National Academy of Sciences by the Congress of the United States.