Three Designs for Embedding
Pages 40-55

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 40...
... As is noted there, if the national test is administered following the procedures that were used when the test was standardized, and if the inferences drawn from the national test are appropriate (two major conditions), the approach can provide comparable national scores for individual students in different states. Embedding creates at least one and often two changes, compared with two freestanding tests.
From page 41...
... State test items are not physically embedded in the national test, but they are administered close in time under whatever conditions the state determines to be appropriate. This administration procedure protects the national test from context effects.
From page 42...
... The state education agency develops scores that meet its needs, such as a state-specific scale score or performance level system or state norms. The state items provide no scores referenced to national norms or performance levels.
From page 43...
... Assuming that the state items are administered close in time to the national test, rather than at the same time, they are unlikely to change responses to the national test appreciably. For these reasons, the double-duty approach provides national scores for individual students that are essentially the same in quality as those that would be obtained in the absence of embedding.
From page 44...
... Finally, in the current environment of varying but often intense accountability pressures, the national information obtained through the double-duty scenario will sometimes be suspect. That is, if teachers and students feel pressure to raise state scores and if part of the national test contributes to state scores, there may be incentives to engage in the types of inappropriate teaching to the test that can inflate scores.
From page 45...
... States insert the same three unaltered NAEP blocks into their tests. The state tests may vary in content, format, length, level of difficulty, usage, stakes, administration practices, and policies for the inclusion and accommodation of students with disabilities or limited English proficiency.
From page 46...
... The state score is based on either the state test alone or the state test in conjunction with any NAEP items the state considers appropriate. The national score is based on the NAEP items alone.
From page 47...
... Individual NAEP blocks are not constructed to represent the entirety of the assessment, and even a set of three blocks is likely to provide an unbalanced or less than complete representation of the NAEP assessment. This lack of representativeness would likely be exacerbated if states are restricted to using publicly released NAEP blocks, which is likely, because allowing ... (Footnote 2) NAEP blocks include open-ended responses that must be scored by trained raters, placing a burden on the state to train the raters using the NAEP procedures or to hire the NAEP scoring subcontractor to score them.
From page 48...
... Consequently, states might be required to administer their state assessments at a time that is more appropriate for the embedded test than for their own tests and to follow NAEP guidelines for inclusion and accommodation of students with disabilities and limited proficiency in English. Motivational differences are another threat to the comparability of scores.
From page 49...
... For example, if some teachers tailor instruction directly to secure NAEP items because they expect them to appear on state tests, the result could be distortions of comparisons that are based on NAEP scores. NAEP trends might appear more favorable than they really are, and some comparisons among states could be biased.
From page 50...
... State testing agencies choose items from the bank, individually or in sets, and include the selected items in their state tests. Although item banks can be used in various ways, in this scenario it is assumed that states choose items on the basis of a match with their curricula or other considerations, with no consideration given to maintaining comparability across states in the items selected.
From page 51...
... The shading of the symbols indicates that the items may vary along important dimensions other than content, such as difficulty and format. There are variations among subsets of the item bank that might be embedded into different state tests.
From page 52...
... If the embedded items come from a secure source, such as nonreleased NAEP blocks or commercial test publishers' item banks, embedding them repeatedly in state assessments undermines their security. If the national
From page 53...
... EVALUATION OF THE SCENARIOS: The three scenarios differ along several dimensions: the representativeness of the embedded material versus added testing burden for students; the amount of standardization in administration versus the degree of local control; and the extent of the burden placed on states. A major purpose of embedding is to provide two scores, a national and a state score, without significantly adding to the amount of time a student spends taking tests.
From page 54...
... If national items are physically embedded in state tests, the various accommodations that are made available for the state tests would have to be available for the embedded items. Suppose each state makes all of its accommodations available for the national items (which at best may involve considerable work and expense and at worst may not be possible)
From page 55...
... In the double-duty scenario, the national score can be provided very quickly, but the local data might be slow in coming, because student scores on the state items cannot be provided before national test results are made available. Although embedding appears at first to be a practical answer to policy makers' goal of obtaining data on student achievement relative to national standards with little or no added test burden for students and minimum disruption of state testing programs, myriad problems, as illustrated by these three scenarios, make that goal elusive.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.