Reliability and Validity

Information about the technical qualities of the assessment is provided in annually published technical digests.17 The technical digests are prepared for each administration cycle by TEA in conjunction with Pearson, the state’s testing contractor. The version used for the panel’s review covered the 2008-2009 school year (Texas Education Agency, 2009c). This digest contains detailed information about test specifications, item and form development, item and form analysis, and statistical procedures for equating the reading test. The holistically rated assessments are not statistically equated; instead, comparable difficulty is maintained through the use of consistent rating rubrics developed to define the proficiency levels and through consistent training and qualifying procedures for the raters. Details about standard setting appear in the report for the 2007-2008 school year.

The technical report contains the results of analyses conducted to evaluate reliability and validity. Reliability analyses include the standard types used for tests with multiple-choice items (i.e., estimates of internal consistency) as well as those used for open-ended items (i.e., interrater agreement). Estimates of classification accuracy (e.g., the accuracy of student classifications into performance categories) are also provided.

Some validity evidence has been collected. Content-related validity evidence consists primarily of expert review of the extent to which the items conform to the item specifications and the performance-level descriptions. TEA indicates that construct-related validity evidence is provided through estimation of internal consistency reliability for the multiple-choice components and through the training and administration procedures for the holistically rated components. Evidence of criterion-related validity was collected by examining the degree of correspondence between performance on the TELPAS reading component and performance on the state’s reading assessment, the Texas Assessment of Knowledge and Skills (TAKS). For the study, the average TAKS reading score was calculated for students at each grade level and at each performance level (for example, the mean TAKS score for 3rd graders classified on the TELPAS as beginning, intermediate, advanced, or advanced high, and so on for each grade). Rating audits of the other language domains are conducted to provide evidence that the internal structure of the assessments is intact and that teachers administer the holistically rated assessments and apply the rating rubrics as intended. No information is provided about attempts to evaluate the assessment for fairness or bias.
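The two kinds of evidence described above rest on standard computations. As a rough illustration only (using synthetic data and invented score values, not TELPAS or TAKS results), the following sketch shows how an internal consistency estimate (Cronbach’s alpha) and the mean-criterion-score-by-performance-level comparison might be computed:

```python
from statistics import pvariance, mean

def cronbach_alpha(scores):
    """Cronbach's alpha for an item-score matrix (rows = examinees, cols = items)."""
    k = len(scores[0])  # number of items
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def mean_score_by_level(records):
    """Mean criterion score for each proficiency level.

    records: (level, score) pairs, e.g. a student's TELPAS proficiency
    level paired with that student's criterion (reading test) score.
    """
    by_level = {}
    for level, score in records:
        by_level.setdefault(level, []).append(score)
    return {level: mean(vals) for level, vals in by_level.items()}

# Synthetic 0/1 item responses for five examinees on four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(responses), 3))

# Synthetic proficiency-level / criterion-score pairs for one grade.
pairs = [("beginning", 2100), ("beginning", 2050),
         ("intermediate", 2200), ("advanced", 2310),
         ("advanced high", 2400), ("advanced high", 2380)]
print(mean_score_by_level(pairs))
```

If the performance levels are functioning as intended, the level means should increase monotonically from beginning to advanced high, which is the pattern the criterion study looks for at each grade.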


17 The digests are available at [December 2010].

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.