APPENDIX C
Using the Prospects Data to Report on the Achievement of Students with Disabilities
INTRODUCTION
In 1988, Congress mandated a "national longitudinal study of eligible children" to assess the effect of Chapter 1 (now renamed Title I) on students' academic achievement and other measures of school success. This study, titled Prospects: The Congressionally Mandated Study of Educational Growth and Opportunity, was designed to evaluate the short- and long-term consequences of Chapter 1 program participation by following large national samples of public schoolchildren in three grade cohorts, as well as their parents, teachers, and principals. Baseline data were collected in spring 1991 for third and seventh grade students and in fall 1991 for first grade students.
There were three stages of sampling for Prospects: (1) selection of a sample of school districts, (2) selection of a sample of schools within sampled districts, and (3) subsampling of students, but only in very large schools. Within most sampled schools, all students enrolled in all classrooms containing the target sample grades were included in the sample. Thus, the Prospects study includes all enrolled students within designated grades, with no exclusions on the basis of disability, lack of English proficiency, or any other reason. As a result, Prospects was designed to include approximately 7 to 10 percent more students than other national studies. If a student with a disability was excused from participating in some activities on which data were gathered (e.g., achievement testing, self-administered questionnaire), every attempt was made to complete the remainder of the data collection protocol for that student.
A rich collection of information was gathered, including responses to a district Chapter 1 coordinator questionnaire, a school and program questionnaire (completed by the principal or another staff member), a classroom teacher questionnaire, a student questionnaire, the Comprehensive Test of Basic Skills, a parent questionnaire, as well as student record information and a student profile (ratings completed by the teacher).
The study was designed as a six-year longitudinal study for evaluating Chapter 1. However, funding for the study was terminated before it was completed. In these analyses, we therefore use only the first two years of the study, 1991 and 1992, and only the data for the third grade cohort. The design was national in scope and focused on cohorts in grades 1, 3, and 7, with oversampling of low-income districts and schools. The sample includes 337 schools, with 10,333 students in the third grade cohort. For a detailed description of the study, see U.S. Department of Education (1993). In this appendix we refer to the program as Chapter 1.
SIMPLE POINT ESTIMATES OF ACHIEVEMENT
By far the most common method of assessing and reporting achievement based on standardized tests is to report single point estimates or cohort scores, perhaps broken down by group categories. The statistics most commonly reported are median or mean scores for selected grades. Because the reported scores are usually based on a national probability distribution, individual student scores are measured relative to the national population of students in a given grade. Institutional scores (by district or school) are aggregates of individual scores and allow for the same comparisons, while ignoring within-institution variation.
Examples of third and fourth grade Comprehensive Test of Basic Skills reading and math scores from the Prospects study appear in Table C1.^{1} Normal curve equivalents (NCEs) are used as the basic metric.^{2} The table provides means and standard deviations for the total population of students tested and for relevant subpopulations. The third set of columns (change scores) reports individual change scores based only on students who took both the third and fourth grade tests.
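Because NCEs are the metric used throughout this appendix, a brief sketch of how they relate to national percentile ranks may be useful. It assumes the standard definition of the NCE scale (a normal distribution with mean 50 and standard deviation 21.06, so that NCEs of 1, 50, and 99 coincide with the same percentile ranks). The function name is illustrative and is not drawn from the Prospects documentation.

```python
from statistics import NormalDist

NCE_MEAN = 50.0
NCE_SD = 21.06  # standard NCE scaling constant (assumed definition, not taken from the Prospects files)


def percentile_to_nce(percentile_rank: float) -> float:
    """Convert a national percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile_rank / 100.0)  # standard normal deviate
    return NCE_MEAN + NCE_SD * z


if __name__ == "__main__":
    for pr in (1, 25, 50, 75, 99):
        print(pr, round(percentile_to_nce(pr), 1))
```

Under this definition a percentile rank of 25 corresponds to an NCE of about 35.8, which illustrates that NCEs are an equal-interval scale while percentile ranks are not.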
TABLE C1 Third and Fourth Grade Prospects Achievement Test Data, 1991–1992

| Group | Statistic | Third Grade Reading | Third Grade Math | Fourth Grade Reading | Fourth Grade Math | Change Score Reading | Change Score Math |
|---|---|---|---|---|---|---|---|
| Total Population | Mean | 46.8 | 47.7 | 45.4 | 45.4 | 1.1 | 2.7 |
| | Standard Deviation | 20.6 | 20.2 | 20.5 | 22.0 | 12.8 | 14.6 |
| | N | 13,431 | 13,167 | 10,584 | 10,584 | 7,906 | 7,692 |
| Free Lunch | Mean | 41.0 | 43.0 | 39.0 | 39.9 | 1.6 | 3.1 |
| | Standard Deviation | 19.0 | 18.9 | 18.1 | 20.1 | 12.5 | 14.3 |
| | N | 4,752 | 4,696 | 4,304 | 4,282 | 3,109 | 3,064 |
| Non-Free Lunch | Mean | 55.1 | 54.9 | 53.8 | 53.1 | 0.9 | 2.1 |
| | Standard Deviation | 19.2 | 19.2 | 19.7 | 21.4 | 12.8 | 14.3 |
| | N | 5,890 | 5,744 | 4,817 | 4,674 | 3,891 | 3,757 |
| Females | Mean | 48.6 | 48.4 | 47.5 | 46.4 | 0.8 | 2.3 |
| | Standard Deviation | 19.6 | 19.4 | 20.0 | 21.1 | 12.2 | 13.6 |
| | N | 6,683 | 6,562 | 5,223 | 5,125 | 4,001 | 3,897 |
| Males | Mean | 45.2 | 47.1 | 43.6 | 44.7 | 1.5 | 3.0 |
| | Standard Deviation | 21.5 | 20.9 | 20.8 | 22.8 | 13.3 | 15.0 |
| | N | 6,625 | 6,489 | 5,204 | 5,101 | 3,903 | 3,793 |
| African American | Mean | 37.3 | 38.3 | 36.0 | 35.6 | 1.7 | 3.0 |
| | Standard Deviation | 18.3 | 18.4 | 17.3 | 19.1 | 12.9 | 14.4 |
| | N | 2,824 | 2,801 | 1,984 | 1,976 | 1,524 | 1,507 |
| Asian American | Mean | 47.4 | 55.5 | 48.8 | 59.2 | 1.6 | 1.2 |
| | Standard Deviation | 19.6 | 20.0 | 19.1 | 21.7 | 11.4 | 13.8 |
| | N | 604 | 596 | 469 | 461 | 376 | 369 |
| Hispanic American | Mean | 37.4 | 41.2 | 36.2 | 38.3 | 1.2 | 3.9 |
| | Standard Deviation | 19.0 | 19.1 | 18.0 | 19.5 | 12.7 | 14.0 |
| | N | 2,125 | 2,078 | 1,920 | 1,889 | 1,398 | 1,366 |
| Other American | Mean | 46.2 | 47.0 | 43.9 | 45.3 | 2.2 | 2.1 |
| | Standard Deviation | 19.1 | 19.1 | 19.0 | 20.8 | 11.4 | 13.2 |
| | N | 283 | 278 | 195 | 191 | 150 | 142 |
| White American | Mean | 53.4 | 52.8 | 52.1 | 50.8 | 1.2 | 2.7 |
| | Standard Deviation | 19.5 | 19.3 | 19.8 | 21.5 | 12.9 | 14.5 |
| | N | 6,605 | 6,423 | 5,132 | 4,992 | 4,027 | 3,880 |
| Disabled | Mean | 41.7 | 42.7 | 40.2 | 39.9 | 1.2 | 2.9 |
| | Standard Deviation | 22.6 | 21.4 | 20.4 | 22.4 | 12.4 | 14.7 |
| | N | 1,152 | 1,124 | 821 | 796 | 582 | 562 |
| Emotional Disability | Mean | 36.3 | 35.2 | 33.5 | 31.5 | 1.6 | 3.4 |
| | Standard Deviation | 21.0 | 21.7 | 18.5 | 22.9 | 12.6 | 17.0 |
| | N | 95 | 98 | 85 | 81 | 56 | 52 |
| Learning Disability | Mean | 27.1 | 30.3 | 29.6 | 29.3 | 0.5 | 2.3 |
| | Standard Deviation | 18.6 | 17.5 | 15.8 | 17.1 | 12.1 | 12.2 |
| | N | 286 | 278 | 217 | 209 | 133 | 127 |
| Physical Disability | Mean | 44.7 | 45.7 | 43.3 | 43.5 | 1.2 | 2.4 |
| | Standard Deviation | 21.4 | 21.1 | 19.1 | 20.5 | 12.7 | 15.2 |
| | N | 203 | 189 | 130 | 126 | 100 | 98 |
| Speech Disability | Mean | 41.1 | 43.5 | 41.5 | 42.7 | 0.1 | 1.2 |
| | Standard Deviation | 21.2 | 20.3 | 20.6 | 22.9 | 12.2 | 15.6 |
| | N | 307 | 303 | 234 | 225 | 168 | 164 |
| Other Health Disability | Mean | 48.6 | 48.5 | 45.5 | 44.4 | 1.6 | 4.3 |
| | Standard Deviation | 22.2 | 20.9 | 20.6 | 22.8 | 12.8 | 14.6 |
| | N | 399 | 388 | 266 | 258 | 195 | 189 |

The information conveyed is certainly relevant. The total population, which is a sample of students in schools with high concentrations of Chapter 1 students, is below the national mean of 50 on each test, as expected. And between the third and fourth grades, students decline relative to the national norms, more in math (2.7 NCEs) than in reading (1.1 NCEs). The group differences are also relevant and often quite stark. For example, at this age girls do better than boys on all tests and drop behind the national population over the year less than boys do. Asian American students score lower than whites on reading but somewhat higher on math; however, Asian American students improve more than the national population, or any other racial group, on both reading and math. The differences between African and Hispanic Americans and whites and Asians are considerable in both grades on both tests, at times approaching a full standard deviation.
The variance within groups also provides useful information. First, as is typical of large-sample test data, the dispersion around the mean does not differ much between groups. For example, across the five racial groups and the four tests over the two grades, the standard deviations range from 17.3 (African American, fourth grade, reading) to 21.7 (Asian American, fourth grade, math), and all but 4 of the 20 lie between 18 and 20. However, and critically, these within-group variances may be very misleading for assessing both achievement levels and educational progress. And because the variances are misrepresented by such simple reporting, so are the differences in the means between groups. This can be illustrated by using relatively simple and then more complex multivariate estimates of group differences.
MULTIVARIATE ESTIMATES OF ACHIEVEMENT
A range of more complex estimation models can be used to provide a more accurate and richer picture of educational achievement than is obtained by reporting simple mean point estimates of achievement. The problem is that these estimates require increasingly complex statistical procedures and more elaborate and costly data. In Tables C2a, C2b, C3a, and C3b, data complexity increases across the columns marked Model I to Model III (for Tables C2a and C3a) and Model IV to Model VI (for Tables C2b and C3b). The first level of complexity (Models I and IV) requires multivariate estimates. The variables in these models are: (1) a student income measure (qualifying for free lunch or not), (2) student gender, (3) student race, and (4) student disability status.
Models II and V add variables on family status. These variables (family income, parent education, parent employment, and marital status) were acquired in the Prospects study through parent surveys. Models III and VI add behavioral and attitude data for individual families, also obtained from parent surveys. For purposes of these analyses, the variables include a measure of parents' educational expectations for their child, an index of satisfaction with the school their child attends, the number of school contacts, and three questions on parental involvement (at home, through participation in school organizations, and through attendance at school events).
Finally, Tables C2a and C2b differ from Tables C3a and C3b in whether fourth grade student achievement is modeled with or without controlling for prior achievement (third grade achievement test scores). Tables C2a and C2b omit prior achievement; Tables C3a and C3b include prior test scores as independent variables. The latter models allow change-score assessments of achievement progress.
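The two families of models can be summarized in equation form. The notation below is an assumption introduced here for illustration and does not appear in the Prospects reports: Y is a student's fourth grade NCE in reading or math, R and M are the student's third grade reading and math NCEs, and the X terms are the student and family variables entered in a given model.

```latex
% Cohort (point-estimate) models, Tables C2a and C2b
Y_i^{(4)} = \beta_0 + \sum_{k} \beta_k X_{ik} + \varepsilon_i

% Value-added models, Tables C3a and C3b
Y_i^{(4)} = \beta_0 + \gamma_R R_i^{(3)} + \gamma_M M_i^{(3)} + \sum_{k} \beta_k X_{ik} + \varepsilon_i
```

In the second form, each remaining coefficient describes a difference in fourth grade scores among students with the same third grade scores, which is why it can be read as a difference in progress rather than in level.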
Cohort, Point-Estimate Models
Increasingly complex and more accurate estimates of point or cohort scores (when reporting by grade) are depicted in Tables C2a and C2b. Table C4 provides descriptive statistics for variables used in Tables C2 and C3. The differences between Tables C2a and C2b (and later Tables C3a and C3b) are in the modeling of students with disabilities. In Table C2a, a general indicator variable for being disabled or not is included; in Table C2b, indicator variables are included for each of five types of disability.^{3}
In Model I, for both reading and math, all the variables are indicator variables and the coefficients can be interpreted as differences in means between the relevant categories. Thus, the coefficient for free lunch eligibility for reading means that students whose family income qualifies for government-provided free lunch (1.35 times the poverty line), on average, and controlling for other gender and racial differences, scored 9.88 normal curve equivalent points less than students who did not qualify for the subsidy. Similarly, girls scored 3.85 points more than boys, African Americans 11.26 points less than whites, Asian Americans 1.78 points less than whites, etc. The size of these differences can be compared with the standard deviations for the fourth grade tests for these groups reported in Table C4.
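The interpretation of indicator coefficients can be made concrete with a small worked example. The sketch below uses the Model I reading constant and coefficients from Table C2a; the signs are an assumption taken from the direction described in the text (for example, "9.88 ... points less"), and only effects whose direction the text states are included. The function name is hypothetical.

```python
# Table C2a, Model I (reading). Signs follow the direction stated in the text;
# the table itself lists the coefficient values.
CONSTANT = 54.48
EFFECTS = {
    "free_lunch": -9.88,
    "female": +3.85,
    "african_american": -11.26,
    "disabled": -6.90,
}


def predicted_reading_nce(**indicators):
    """Add to the constant the coefficient of every indicator set to a true value."""
    total = CONSTANT + sum(EFFECTS[name] for name, on in indicators.items() if on)
    return round(total, 2)


if __name__ == "__main__":
    print(predicted_reading_nce())                          # reference group
    print(predicted_reading_nce(free_lunch=1, female=1))    # girl eligible for free lunch
```

With every indicator at zero the prediction is simply the constant, 54.48, which describes the reference group (here, white boys without disabilities who are not eligible for free lunch). A girl who is eligible for free lunch is predicted at 54.48 - 9.88 + 3.85 = 48.45.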
There are several important differences between the results derived from these models and the simple descriptive group differences as reported in Table C1. First, if one computes the crude differences in means in Table C1 for any category (free lunch vs. non-free lunch; African American vs. white), in each case the indicator variables in Table C2a represent smaller differences. The reason is that several of the independent variables are correlated, and thus failure to control for that correlation produces artificially higher estimates of group differences. Specifically, by simply reporting racial group means, we fail to account for the considerable diversity in group populations, which in this simple model means the differences in income and gender of students within racial groups.^{4}
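The comparison described above can be checked directly from the published figures. The sketch below takes fourth grade means from Table C1 and the Model I coefficient values from Table C2a; the variable and function names are illustrative only.

```python
# Fourth grade means from Table C1, as (reading, math) in NCEs.
table_c1_means = {
    "Free Lunch": (39.0, 39.9),
    "Non-Free Lunch": (53.8, 53.1),
    "African American": (36.0, 35.6),
    "White American": (52.1, 50.8),
}

# Model I coefficient values from Table C2a, as (reading, math).
model_1_gaps = {
    "Free Lunch": (9.88, 9.00),
    "African American": (11.26, 10.36),
}


def crude_gap(group, reference):
    """Absolute difference in Table C1 means between a group and its reference group."""
    pairs = zip(table_c1_means[group], table_c1_means[reference])
    return tuple(round(abs(a - b), 1) for a, b in pairs)


comparisons = {
    "Free Lunch": crude_gap("Free Lunch", "Non-Free Lunch"),
    "African American": crude_gap("African American", "White American"),
}

if __name__ == "__main__":
    for group, crude in comparisons.items():
        print(group, "crude:", crude, "adjusted:", model_1_gaps[group])
```

The crude gaps are 14.8 (reading) and 13.2 (math) for free lunch, and 16.1 and 15.2 for African American versus white students, each larger than the corresponding adjusted coefficient.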
Models II and III add precision and explanatory information, but also reduce sample sizes. In this national sample, the reduction in sample sizes results from the failure of sample families to complete surveys. In addition, reduction in sample sizes may affect the accuracy of the estimates for subpopulations, such as students with a given disability. Despite these problems, the added information provides insights into factors affecting achievement, and potentially useful data for specifying realistic expectations for schools and districts. For example, the effect of parent education is obvious and, as we shall see, robust to the inclusion of almost every other variable. Regardless of race, income, employment, or marital status, and despite attitudes and direct parent support of education, having a parent who has more education is a significant predictor of higher student achievement.
The same is true of educational expectations held by parents for their children. As measured by a question querying how many years of education they expect their child to complete, "expectations" are a very significant and strong predictor of higher test scores. This result also carries over into more complex models.
These results tell us something not only about the puzzle of education, but also about how to assess educational systems and specify institutional expectations. They also illustrate the variances within groups of students and the policy implications of excluding such control variables from assessments.
TABLE C2a Fourth Grade Cohort Regression Models, 1992: Disability Indicator Variable

| Variable | Reading: Model I | Reading: Model II | Reading: Model III | Math: Model I | Math: Model II | Math: Model III |
|---|---|---|---|---|---|---|
| Prior Tests | | | | | | |
| Prior Reading | -- | -- | -- | -- | -- | -- |
| Prior Math | -- | -- | -- | -- | -- | -- |
| District SES | | | | | | |
| Free Lunch (1 = Yes) | 9.88*** | 4.78*** | 2.81*** | 9.00*** | 3.72*** | 1.27 |
| Gender (1 = Female) | 3.85*** | 4.69*** | 3.44*** | 1.48*** | 2.00*** | 0.35 |
| African American | 11.26*** | 9.52*** | 8.34*** | 10.36*** | 8.65*** | 7.28*** |
| Asian American | 1.78 | 3.69*** | 6.57*** | 9.31*** | 6.88*** | 4.36*** |
| Hispanic American | 11.87*** | 9.80*** | 9.74*** | 8.69*** | 7.49*** | 7.09*** |
| Other American | 5.21*** | 5.76*** | 4.07* | 2.34 | 3.11 | 2.02 |
| Disabled (1 = Yes) | 6.90*** | 6.44*** | 3.19*** | 6.97*** | 6.55*** | 3.62*** |
| Family SES | | | | | | |
| Income | -- | 0.87*** | 0.61*** | -- | 0.91*** | 0.63*** |
| Respondent Education | -- | 2.24*** | 0.90*** | -- | 2.15*** | 0.86*** |
| Respondent Employment | -- | 0.54* | 0.04 | -- | 0.17 | 0.61 |
| Respondent Marital Status | -- | 0.80 | 0.02 | -- | 1.12* | 0.13 |
| Family Attitudes/Behavior | | | | | | |
| Expectations | -- | -- | 3.39*** | -- | -- | 3.54*** |
| School Dissatisfaction | -- | -- | 0.14*** | -- | -- | 0.19*** |
| Parental Involvement: Home | -- | -- | 0.49*** | -- | -- | 0.62*** |
| Parental Involvement: Attendance | -- | -- | 0.20 | -- | -- | 0.09 |
| Parental Involvement: Organizations | -- | -- | 0.66*** | -- | -- | 1.04*** |
| School Contacts | -- | -- | 0.60*** | -- | -- | 0.58*** |
| Constants | 54.48*** | 39.02*** | 19.66*** | 54.12*** | 37.63*** | 14.65*** |
| R^{2} | .20 | .23 | .28 | .15 | .18 | .25 |
| SE | 18.12 | 17.89 | 16.84 | 20.07 | 19.90 | 18.65 |
| F | 305.54 | 187.00 | 90.28 | 205.78 | 133.05 | 73.43 |
| (df) | (8,393, 7) | (6,681, 11) | (3,807, 17) | (8,234, 7) | (6,545, 11) | (3,741, 17) |

*** probability that B = 0 < .001; ** probability that B = 0 < .01; * probability that B = 0 < .05

TABLE C2b Fourth Grade Cohort Regression Models, 1992: Categories of Disability

| Variable | Reading: Model IV | Reading: Model V | Reading: Model VI | Math: Model IV | Math: Model V | Math: Model VI |
|---|---|---|---|---|---|---|
| Prior Tests | | | | | | |
| Prior Reading | -- | -- | -- | -- | -- | -- |
| Prior Math | -- | -- | -- | -- | -- | -- |
| District SES | | | | | | |
| Free Lunch (1 = Yes) | 9.93*** | 4.76*** | 2.95*** | 9.05*** | 3.69*** | 1.34 |
| Gender (1 = Female) | 3.72*** | 4.58*** | 3.41*** | 1.38** | 1.92*** | 0.40 |
| African American | 11.25*** | 9.51*** | 8.49*** | 10.34*** | 8.62*** | 7.36*** |
| Asian American | 1.81 | 3.65*** | 6.38*** | 9.12*** | 6.86*** | 4.42** |
| Hispanic American | 11.78*** | 9.72*** | 9.57*** | 8.63*** | 7.42*** | 6.92*** |
| Other American | 4.80** | 5.40*** | 4.12* | 2.18 | 2.84 | 2.28 |
| Disabled | | | | | | |
| Emotional | 9.82*** | 8.93*** | 5.00 | 11.34*** | 10.05*** | 6.98 |
| Learning | 16.81*** | 17.44*** | 17.49*** | 16.44*** | 16.78*** | 16.12*** |
| Physical | 2.56 | 1.80 | 2.44 | 1.00 | 0.53 | 4.80 |
| Speech | 2.99* | 2.36 | 1.30 | 2.36 | 1.32 | 2.45 |
| Other | 0.54 | 0.54 | 1.49 | 1.92 | 2.34 | 0.54 |
| Family SES | | | | | | |
| Income | -- | 0.89*** | 0.64*** | -- | 0.94*** | 0.66*** |
| Respondent Education | -- | 2.22*** | 0.90*** | -- | 2.15*** | 0.88*** |
| Respondent Employment | -- | 0.52* | 0 | -- | 0.18 | 0.54 |
| Respondent Marital Status | -- | 0.68 | 0.21 | -- | 1.24 | 0.03 |
| Family Attitudes/Behavior | | | | | | |
| Expectations | -- | -- | 3.33*** | -- | -- | 3.49*** |
| School Dissatisfaction | -- | -- | 0.15*** | -- | -- | 0.20*** |
| Parental Involvement: Home | -- | -- | 0.48*** | -- | -- | 0.60*** |
| Parental Involvement: Attendance | -- | -- | 0.26 | -- | -- | 0.16 |
| Parental Involvement: Organization | -- | -- | 0.59*** | -- | -- | 0.97*** |
| School Contacts | -- | -- | 0.57*** | -- | -- | 0.57*** |
| Constants | 54.53*** | 38.97*** | 20.42*** | 54.17*** | 37.55*** | 15.18*** |
| R^{2} | .21 | .24 | .29 | .16 | .19 | .26 |
| SE | 18.02 | 17.75 | 16.70 | 19.99 | 19.81 | 18.56 |
| F | 202.84 | 143.58 | 76.13 | 136.55 | 101.61 | 61.27 |
| (df) | (8,264, 11) | (6,583, 15) | (3,741, 21) | (8,106, 11) | (6,448, 15) | (3,676, 21) |

*** probability that B = 0 < .001; ** probability that B = 0 < .01; * probability that B = 0 < .05

TABLE C3a Fourth Grade Value-Added Regression Models, 1991–1992: Disability Indicator Variable

| Variable | Reading: Model I | Reading: Model II | Reading: Model III | Math: Model I | Math: Model II | Math: Model III |
|---|---|---|---|---|---|---|
| Prior Tests | | | | | | |
| Prior Reading | 0.62*** | 0.61*** | 0.61*** | 0.21*** | 0.20*** | 0.17*** |
| Prior Math | 0.21*** | 0.21*** | 0.19*** | 0.65*** | 0.66*** | 0.66*** |
| District SES | | | | | | |
| Free Lunch (1 = Yes) | 2.49*** | 1.66*** | 0.24 | 1.52*** | 0.50 | 1.06 |
| Gender (1 = Female) | 1.81*** | 2.06*** | 1.75*** | 0.36 | 0.41 | 0.42 |
| African American | 2.25*** | 1.84*** | 1.54* | 1.16* | 0.79 | 0.62 |
| Asian American | 0.55 | 0.14 | 1.04 | 6.61*** | 5.55*** | 4.76*** |
| Hispanic American | 2.53*** | 2.39*** | 2.91*** | 1.46** | 1.42* | 1.59* |
| Other American | 1.53 | 2.29 | 1.77 | 1.26 | 0.51 | 0.43 |
| Disabled (1 = Yes) | 0.86 | 0.88 | 0.57 | 1.17 | 1.35 | 0.42 |
| Family SES | | | | | | |
| Income | -- | 0 | 0.17 | -- | 0 | 0.21 |
| Respondent Education | -- | 0.81*** | 0.42* | -- | 0.91*** | 0.54*** |
| Respondent Employment | -- | 0.30 | 0.24 | -- | 0.25 | 0.19 |
| Respondent Marital Status | -- | 0.25 | 0.27 | -- | 0.62 | 0.12 |
| Family Attitudes/Behavior | | | | | | |
| Expectations | -- | -- | 0.73*** | -- | -- | 0.76*** |
| School Dissatisfaction | -- | -- | 0.06* | -- | -- | 0.07** |
| Parental Involvement: Home | -- | -- | 0.08 | -- | -- | 0.17** |
| Parental Involvement: Attendance | -- | -- | 0.11 | -- | -- | 0.06 |
| Parental Involvement: Organizations | -- | -- | 0.19 | -- | -- | 0.44*** |
| School Contacts | -- | -- | 0.44*** | -- | -- | 0.47*** |
| Constants | 8.24*** | 5.58*** | 5.32* | 5.47*** | 2.35* | 0.82 |
| R^{2} | .67 | .68 | .67 | .61 | .62 | .63 |
| SE | 11.55 | 11.58 | 11.29 | 13.45 | 13.39 | 12.87 |
| F | 1,438.60 | 814.63 | 325.23 | 1,109.74 | 643.35 | 268.69 |
| (df) | (6,323, 9) | (5,067, 13) | (2,967, 19) | (6,320, 9) | (5,062, 13) | (2,967, 19) |

*** probability that B = 0 < .001; ** probability that B = 0 < .01; * probability that B = 0 < .05

TABLE C3b Fourth Grade Value-Added Regression Models, 1991–1992: Categories of Disability

| Variable | Reading: Model IV | Reading: Model V | Reading: Model VI | Math: Model IV | Math: Model V | Math: Model VI |
|---|---|---|---|---|---|---|
| Prior Tests | | | | | | |
| Prior Reading | 0.62*** | 0.61*** | 0.61*** | 0.21*** | 0.20*** | 0.16*** |
| Prior Math | 0.21*** | 0.21*** | 0.19*** | 0.65*** | 0.66*** | 0.66*** |
| District SES | | | | | | |
| Free Lunch (1 = Yes) | 2.54*** | 1.73*** | 0.37 | 1.54*** | 0.54 | 0.98 |
| Gender (1 = Female) | 1.71*** | 1.99*** | 1.68*** | 0.30 | 0.38 | 0.36 |
| African American | 2.26*** | 1.84*** | 1.68* | 1.16* | 0.77 | 0.70 |
| Asian American | 0.68 | 0 | 0.91 | 6.55*** | 5.54*** | 4.67*** |
| Hispanic American | 2.43*** | 2.30*** | 2.79*** | 1.43** | 1.38* | 1.54* |
| Other American | 1.26 | 2.02 | 1.61 | 1.30 | 0.60 | 0.32 |
| Disabled | | | | | | |
| Emotional | 2.57 | 3.56 | 1.98 | 2.58 | 4.23 | 4.51 |
| Learning | 3.20** | 3.27* | 4.09* | 1.59 | 1.97 | 3.54 |
| Physical | 0.02 | 0 | 0.46 | 0.41 | 0.86 | 0.68 |
| Speech | 0.53 | 1.00 | 3.01* | 0.57 | 0.86 | 2.77 |
| Other | 0.34 | 0.12 | 2.13 | 2.19* | 2.48* | 0.95 |
| Family SES | | | | | | |
| Income | -- | 0 | 0.15 | -- | 0.01 | 0.22 |
| Respondent Education | -- | 0.82*** | 0.43** | -- | 0.94*** | 0.56*** |
| Respondent Employment | -- | 0.28 | 0.26 | -- | 0.26 | 0.22 |
| Respondent Marital Status | -- | 0.21 | 0.37 | -- | 0.59 | 0.19 |
| Family Attitudes/Behavior | | | | | | |
| Expectations | -- | -- | 0.77*** | -- | -- | 0.78*** |
| School Dissatisfaction | -- | -- | 0.06* | -- | -- | 0.07** |
| Parental Involvement: Home | -- | -- | 0.07 | -- | -- | 0.16** |
| Parental Involvement: Attendance | -- | -- | 0.12 | -- | -- | 0.02 |
| Parental Involvement: Organization | -- | -- | 0.17 | -- | -- | 0.44** |
| School Contacts | -- | -- | 0.43*** | -- | -- | 0.48*** |
| Constants | 8.34*** | 5.67*** | 5.83* | 5.50*** | 2.38* | 1.10 |
| R^{2} | .67 | .68 | .68 | .61 | .62 | .63 |
| SE | 11.53 | 11.56 | 11.25 | 13.46 | 13.41 | 12.88 |
| F | 985.78 | 616.76 | 266.77 | 755.12 | 483.25 | 218.50 |
| (df) | (6,217, 13) | (4,988, 17) | (2,913, 23) | (6,214, 13) | (4,983, 17) | (2,913, 23) |

*** probability that B = 0 < .001; ** probability that B = 0 < .01; * probability that B = 0 < .05

TABLE C4 Fourth Grade Cohort and Value-Added Regressions: Variable Definitions and Statistics

| Variable | Mean | Standard Deviation | Range | (N) |
|---|---|---|---|---|
| Reading NCE (1992) | 45.42 | 20.48 | 98.00 | 10,584 |
| Math NCE (1992) | 45.41 | 21.96 | 98.00 | 10,388 |
| Free Lunch (1 = Yes) | 0.47 | 0.50 | 1.00 | 9,221 |
| Gender (1 = Female) | 0.50 | 0.50 | 1.00 | 10,542 |
| African American | 0.20 | 0.40 | 1.00 | 9,810 |
| Asian American | 0.05 | 0.21 | 1.00 | 9,810 |
| Hispanic American | 0.20 | 0.40 | 1.00 | 9,810 |
| Other American | 0.02 | 0.14 | 1.00 | 9,810 |
| Disabled (1 = Yes) | 0.08 | 0.27 | 1.00 | 10,543 |
| Emotional | 0.01 | 0.09 | 1.00 | 9,791 |
| Learning | 0.02 | 0.15 | 1.00 | 9,925 |
| Physical | 0.01 | 0.12 | 1.00 | 9,837 |
| Speech | 0.02 | 0.15 | 1.00 | 9,945 |
| Other | 0.03 | 0.16 | 1.00 | 9,977 |
| Income | 6.71 | 2.72 | 9.00 | 8,696 |
| Respondent Education | 3.31 | 1.77 | 7.00 | 8,364 |
| Respondent Employment | 2.15 | 0.91 | 2.00 | 9,178 |
| Respondent Marital Status | 0.68 | 0.47 | 1.00 | 9,387 |
| Expectations | 5.12 | 1.70 | 6.00 | 7,697 |
| School Dissatisfaction | 39.77 | 9.35 | 62.00 | 6,501 |
| Parental Involvement: Home | 24.04 | 4.78 | 36.00 | 7,015 |
| Parental Involvement: Attendance | 12.26 | 2.72 | 16.00 | 6,998 |
| Parental Involvement: Organization | 8.65 | 2.19 | 14.00 | 7,520 |
| School Contacts | 11.64 | 2.46 | 18.00 | 8,763 |

Finally, Models I through III provide useful insights into how students with disabilities could be included in systemic reform assessment systems. Model I contains a simple indicator variable for disability. The result, controlling for income, gender, and race, is a highly significant difference of 6.90 normal curve equivalents in reading and 6.97 in math. As expected, students with disabilities do less well. However, when we control for more variables, the differential scores are partly dissipated. Controlling for family socioeconomic status has only a modest effect, but controlling for expectations, satisfaction with the school, and parental involvement substantially reduces the predicted differential for students with disabilities. Although this does not suggest a causal explanation, it is clear that, as before, within-group variance is considerable and must be taken into account in assessment systems.
The differences in results between Tables C2a and C2b highlight this fact. In Table C2b a series of indicator variables is used to represent different types of disabilities. The effects are quite startling. Essentially, when suitable controls are employed, only emotionally disturbed and learning disabled students have cohort test scores significantly below the rest of the population.^{5} Learning disabled student scores are close to a standard deviation below the rest of the population; emotionally disturbed students are less far behind.
The effects of including control variables seem to differ for students with emotional disturbances and learning disabilities. The differences in estimated test scores for students with emotional disturbances are smaller than the expected differences computed for fourth graders in Table C1. And, as more variables are added to explain the variance, the effect of an emotional disability declines to the point that it may not be significant when we include family attitude and behavior effects. In contrast, for both reading and math, the effects of a learning disability are not reduced very much by inclusion of any control variables. The differences in means for fourth graders computed from Table C1 are very close to the sizes of the effects for students with learning disabilities in Table C2b, and the size and significance of the coefficient do not change much as more variables are added.
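These comparisons can be reproduced approximately from the published tables. The sketch below makes one simplifying assumption that should be kept in mind: the crude gap is measured against the Table C1 total-population mean, which itself includes students with disabilities, rather than against students without the disability in question. All names are illustrative.

```python
# Fourth grade figures, in NCEs.
TOTAL_MEAN = {"reading": 45.4, "math": 45.4}            # Table C1, total population
GROUP_MEAN = {                                          # Table C1, by disability
    "Emotional": {"reading": 33.5, "math": 31.5},
    "Learning": {"reading": 29.6, "math": 29.3},
}
MODEL_IV = {                                            # Table C2b, Model IV coefficient values
    "Emotional": {"reading": 9.82, "math": 11.34},
    "Learning": {"reading": 16.81, "math": 16.44},
}
FOURTH_GRADE_SD = {"reading": 20.48, "math": 21.96}     # Table C4


def crude_gap(group, test):
    """Total-population mean minus the group mean (an approximation, see lead-in)."""
    return round(TOTAL_MEAN[test] - GROUP_MEAN[group][test], 1)


def in_sd_units(group, test):
    """Model IV coefficient expressed as a fraction of the fourth grade standard deviation."""
    return round(MODEL_IV[group][test] / FOURTH_GRADE_SD[test], 2)


if __name__ == "__main__":
    for group in GROUP_MEAN:
        for test in ("reading", "math"):
            print(group, test, crude_gap(group, test), MODEL_IV[group][test], in_sd_units(group, test))
```

On these figures the crude gaps for learning disabilities (15.8 in reading, 16.1 in math) are close to the Model IV coefficients (16.81 and 16.44), whereas the crude gaps for emotional disabilities (11.9 and 13.9) exceed the coefficients (9.82 and 11.34). The learning disability coefficients amount to roughly 0.8 of a standard deviation.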
Value-Added Models
Value-added achievement models are based on the assumption that to adequately measure educational progress and the varying contribution of educational institutions, one must control for prior student achievement. Various measures of change can then be constructed and, controlling for relevant student, family, and institutional differences, reasonable expectations based on student progress can be established.
Tables C3a and C3b present results of such models for the Prospects study. The tables duplicate the cohort models depicted in Tables C2a and C2b, with the addition of third grade reading and math test scores as measures of prior achievement. As expected, the prior tests are very good predictors of fourth grade tests, and the coefficients are quite stable across models. All are significant at the .001 level; the prior tests in the same subject have coefficients between .61 and .66, and the prior tests in the other subject have coefficients of approximately .2.
What is more interesting are the changes in the coefficients for the remaining independent variables. The most obvious difference is that the coefficients for all independent variables are much smaller. This is to be expected because we are now essentially estimating the variance in changes in achievement, not simply the variance between scores. And changes are smaller numbers.^{6} What is more relevant is the statistical significance of the coefficients.
^{5} "Physical disabilities" include categories for physical, hearing, speech, orthopedic, and deafness disabilities. The other categories were coded as they appear in the tables. Mental retardation was excluded from the analysis because only 5 students with mental retardation were given standardized tests.
^{6} The reader can verify the differences by taking the means of fourth grade scores in Table C4 and subtracting from them the appropriate Bs times the third grade scores for both reading and math. That is the equivalent of what we are estimating once prior achievement has been controlled.
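The arithmetic suggested in footnote 6 can be sketched as follows. Fourth grade means come from Table C4 and the prior-test coefficients from Table C3a, Model I. As an assumption, the third grade means are taken from the total-population rows of Table C1, which cover all third grade test takers rather than only the students in the regression sample, so the result is approximate. The names are illustrative.

```python
fourth_grade_mean = {"reading": 45.42, "math": 45.41}    # Table C4
third_grade_mean = {"reading": 46.8, "math": 47.7}       # Table C1, total population (approximation)

# Table C3a, Model I: coefficients on the prior reading and prior math tests,
# keyed first by the fourth grade outcome being modeled.
prior_coef = {
    "reading": {"reading": 0.62, "math": 0.21},
    "math": {"reading": 0.21, "math": 0.65},
}


def residual_mean(outcome):
    """Fourth grade mean minus the part predicted from third grade scores."""
    predicted = sum(prior_coef[outcome][s] * third_grade_mean[s] for s in ("reading", "math"))
    return round(fourth_grade_mean[outcome] - predicted, 2)


if __name__ == "__main__":
    for outcome in ("reading", "math"):
        print(outcome, residual_mean(outcome))
```

The remainders are about 6.4 NCEs for reading and 4.6 for math, far smaller than the fourth grade means of about 45, which is consistent with the much smaller constants and coefficients in Tables C3a and C3b.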
We leave it to the reader to explore the totality of differences emerging from a comparison of Tables C2 and C3. We note several interesting observations. Gender differences in math scores are significant in Tables C2a and C2b for Models I and II, but none of the value-added differences in Tables C3a and C3b are significant, and the Model VI coefficients have a negative sign. This may indicate that the absolute advantage of girls in math in the early years is not matched by greater progress. Similarly, Asian American reading scores are below white scores as indicated in Tables C2a and C2b, but those differences disappear once prior achievement is controlled. This means that, in terms of progress on reading, Asian Americans and whites do approximately the same.
Variables that remain significant with value-added measures include: (1) some racial effects, which are extraordinarily positive for Asian Americans on math and negative for African and Hispanic Americans on reading and at times on math; (2) parents' education, which remains positively related to increased learning; and (3) parental expectations, with higher parental educational expectations having a positive effect. The only parental involvement measure that seems to matter is the scale measuring the frequency of school contacts, whose effect is to depress fourth grade test scores.
The implications of value-added models for students with disabilities are quite striking. If all students with disabilities are considered together, Table C3a indicates that there is no reason to expect statistically significant differences between the populations with and without disabilities. In fact, controlling for the full range of variables (Table C3a, Model III), the coefficients vary closely around zero.
Different disabilities suggest different value-added results. The only consistent significant effect is for reading scores for students with learning disabilities. As reported above, cohort score estimates for students with learning disabilities were consistently close to a standard deviation behind students without learning disabilities, regardless of which model was estimated. It also appears that, between the third and fourth grades, students with learning disabilities fall further behind and, surprisingly, the effect seems to increase in size as more control variables are included in the equation. The only other effects that approach conventional levels of significance are a positive effect for students with speech impairments on the full reading model (Model VI), and small negative effects on math for those with "other" disabilities or health problems.
CONCLUSIONS
If further analysis confirms these results, they suggest several conclusions. First, different types of disabilities need to be treated separately. Second, the conclusions differ when one models cohort and value-added achievement measures. Third, students with learning disabilities need to be studied further and perhaps treated quite differently in assessment systems. Results of these analyses suggest that students with learning disabilities may show persistently poor test scores and poor progress despite variation in a host of exogenous factors, which in other populations are related to achievement success.
Despite differences in opinion about what students should know and what is a valid form for testing that knowledge, policy makers undertaking standards-based reforms still need to compare student achievement over time, across populations, and between organizations. This requires internal and test-retest reliability for the instruments as well as the conversion of scores into a known probability distribution so that unbiased trend and intergroup comparisons can be made.^{7} Value-added models, which control for prior achievement, offer promise as a valid method for reporting achievement scores and should be considered by policy makers.
REFERENCE
U.S. Department of Education 1993. Prospects: The Congressionally Mandated Study of Educational Growth and Opportunity: The Interim Report. July. Washington, DC: U.S. Department of Education.
^{7} Although many probability distributions could produce desired comparisons, normal or Student's t distributions are by far the most commonly assumed ones. They allow the use of numerous, common statistical techniques for analyzing results. This explains the requirement under Chapter 1 funding that tests be administered that can be converted into normal curve equivalents.