Current and Proposed Measures
What was the high school dropout rate last year? What was the graduation rate? Most people believe that quantifying these rates should be a straightforward task. Intuitively, it might seem that calculating the dropout rate simply means dividing the number of students who drop out by the total number of students in the cohort and, similarly, that calculating the graduation rate simply means dividing the number of students who graduate by the total number in the cohort. Intuitively, it might also seem that once one of the rates is obtained, the other can be calculated by subtracting it from 100.
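The intuitive calculation described above can be sketched in a few lines. All of the counts below are hypothetical and purely illustrative:

```python
# Naive dropout and graduation rates for one hypothetical cohort.
cohort_size = 1000   # students who entered grade 9 together (illustrative)
graduates = 730      # earned a credential by the end of the period
dropouts = cohort_size - graduates  # naive assumption: everyone else dropped out

graduation_rate = 100 * graduates / cohort_size
dropout_rate = 100 * dropouts / cohort_size

# Only under these naive definitions do the two rates sum to exactly 100;
# the rest of the chapter explains why published rates do not.
```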
As those who work in this area can attest, calculating the rates is not that simple. The rates can be calculated from a variety of different data sources using a variety of different methods. These differences can lead to discrepant estimates of the rates. For instance, consider the high school completion rates published for the 2005-06 school year. The U.S. Department of Education reported that approximately 73 percent of public high school students graduated on time that year (Snyder, Dillow, and Hoffman, 2009, p. 3), and Editorial Projects in Education (2008, p. 28) reported a similar figure of 69 percent. However, the U.S. Department of Education also reported a dropout rate for 16- to 24-year-olds of only 9 percent in 2006 (Snyder, Dillow, and Hoffman, 2009, Table 105), and the Annie E. Casey Foundation (2009, p. 64) reported a dropout rate for 16- to 19-year-olds of only 7 percent. How should a completion rate of between 69 and 73 percent be interpreted in light of a dropout rate below 10 percent?
Why do estimates of the high school completion rate and the high school dropout rate differ so much from one another? How can these disparate estimates be reconciled? The disparities can be traced to three sources:
(1) differences in what various estimates are designed to accomplish, (2) differences in the conceptual and technical definition of both the numerator and the denominator used to calculate the rates, and (3) differences in the accuracy of the data used to produce them. Understanding these three sources of differences is key to making sense of the resulting rates.
This chapter first discusses the different purposes of the estimates and the sources of data used in calculating the rates. We then discuss the different types of rates that can be calculated. We close the chapter by discussing the importance of aligning the choice of a rate with the purpose it will serve. The chapter draws extensively from papers prepared for the workshop by Rob Warren, with the University of Minnesota, and Elaine Allensworth, with the Consortium on Chicago School Research (Allensworth, 2008; Warren, 2008).
DIFFERENT PURPOSES OF THE ESTIMATES
One major source of differences in estimates of dropout and completion rates is the question they were designed to answer. Analysts operationalize high school completion and dropout rates in different ways because they have different conceptual or practical reasons for making those measurements. There are three chief uses of high school completion and dropout rates.
The first use is to describe the amount (or lack) of human capital in a population. In this case, the rates characterize an attribute of society: they quantify the share of people who bring particular credentials and skills to the labor force. The second use is to describe the “holding power” of schools. In this case, the rates answer questions about schools; they characterize their success at moving young people from the first day of high school to successful completion (see Hartzell, McKay, and Frymier, 1992). The third use is to describe students’ success at navigating high school from beginning to end. For this purpose, the rates answer questions about individual students themselves; they measure how successful students are in progressing from the first day of high school to successful completion.
If the goal of the rate is to describe the amounts of human capital in a population, the timing of high school completion—how long ago or at what age people completed high school—is not of critical importance. Nor, for some purposes, does it matter exactly how young people complete high school—by obtaining a diploma, a General Educational Development (GED) credential, or a certificate of completion, completing an adult education program, or some other way. For other purposes, however, how students complete high school is critical because research suggests that students who fail to earn a regular high school diploma are less competitive in the labor market compared with graduates (Heckman and Rubinstein, 2001; Tyler, 2003). Any young person who has completed high school is considered to have acquired marketable capital, regardless of his or her age at the time of crossing that educational threshold.
For the latter two uses, however, both the timing of high school completion and the manner in which young people complete high school can be important. For instance, schools may be deemed successful at moving young people through to completion only if they obtain regular diplomas “on time,” typically within four years.
Given these differences in intended purpose, it becomes less puzzling to read in the Digest of Education Statistics that “73.4 percent of public high school students graduated on time,” despite the fact that only 9 percent of 16- to 24-year-olds were dropouts in 2006 (Snyder, Dillow, and Hoffman, 2009, p. 3 and Table 109). The former estimate is explicitly intended to describe the share of a cohort of students that has completed high school on time and by obtaining a diploma—essentially an attribute of schools. The latter estimate is clearly intended to describe the share of young people who are not gaining the human capital associated with high school completion. Presumably many of the 26.6 percent of ninth graders in fall 2002 who did not go on to graduate from high school with a diploma by spring 2006 were still enrolled or will complete high school later, via a GED or another alternative credential.1 Given the reported 9 percent dropout rate, one might presume that about 91 percent of young people will eventually complete high school one way or another.2
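The arithmetic behind this reconciliation can be made explicit. Assuming, as the text does, that the two published figures can be compared directly, the gap between them is the share of students who finish late or through alternative routes (a back-of-the-envelope sketch, not an NCES method):

```python
on_time_graduation = 73.4  # percent of public high school students graduating on time
status_dropout = 9.0       # percent of 16- to 24-year-olds who are dropouts

not_on_time = 100 - on_time_graduation              # ~26.6 percent
late_or_alternative = not_on_time - status_dropout  # ~17.6 percent: still enrolled,
                                                    # late graduates, GEDs, etc.
eventual_completion = 100 - status_dropout          # ~91 percent, one way or another
```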
DIFFERENT DATA SOURCES
Another source of differences in the rates is the data used in the calculations. A number of available data sources can be used for calculating the rates. These data were collected for different reasons using different types of designs—cross-sectional sample surveys, longitudinal sample surveys, cross-sectional administrative data, and longitudinal administrative data. The collection method and the reasons for collecting the data can affect the rates that are calculated. In this section, we describe the major data sources used in this country to compute high school dropout and completion rates and discuss their strengths and weaknesses in relation to the three purposes listed above.
Cross-Sectional Sample Surveys
The data most widely used for measuring high school dropout and completion come from the Current Population Survey (CPS),3 the U.S. decennial
census, and (in more recent years) the American Community Survey (ACS). Because the census is conducted every 10 years and the ACS is a relatively new resource, the CPS has served as the central cross-sectional data resource for decades.4
The CPS is conducted monthly by the Census Bureau for the Bureau of Labor Statistics and surveys more than 50,000 households. Households are selected in such a way that it is possible to generalize to the nation as a whole and, in recent years, to individual states and other specific geographic areas. Individuals in the CPS are broadly representative of the civilian, noninstitutionalized population of the United States. In addition to the basic demographic and labor force questions that are included in each monthly administration of the CPS, questions on selected topics are included in most months. Since 1968, the October CPS has obtained basic monthly data as well as information about school enrollment—including current enrollment status, public versus private school enrollment, grade attending if enrolled, most recent year of enrollment, enrollment status in the preceding October, grade of enrollment in the preceding October, and high school completion status. In recent years, the October CPS has also ascertained whether high school completers earned diplomas or GED certificates.
There are a number of conceptual and technical problems with CPS-derived measures of high school dropout and completion, particularly when computed at the state level. Most importantly, the sample sizes are not large enough to produce reliable estimates of rates of high school completion or dropout at the state or substate levels (Kaufman, 2001; Winglee et al., 2000). Even when data are aggregated across years—for instance, in the Annie E. Casey Foundation’s Kids Count (2008) measure—the standard errors of estimates for some states are frequently so large that it is difficult to make meaningful comparisons across states or over time. Moreover, because the CPS data are typically tabulated by age rather than by grade level, measures aggregated across years usually cannot be tied to specific cohorts of incoming students. As a result, CPS-based measures are not useful, except at the national level and possibly aggregated across survey years, for assessing schools’ holding power or for describing the dropout or completion rates of school entry cohorts.
Second, until 1987, it was not possible to distinguish high school completers from GED recipients in the CPS. Since 1988, October CPS respondents who recently completed high school have been asked whether they obtained a diploma or a GED, but there are serious concerns about the quality of the resulting data (Chaplin, 2002; Kaufman, 2001). Third, as noted by Greene and Forster (2003), “[status] dropout statistics derived from the [CPS] are
based on young people who live in an area but who may not have gone to high school in that area.” To the extent that young people move from state to state after age 18, estimates of state high school dropout rates based on CPS data—particularly status dropout rates based on 16- to 24-year-olds—may be of questionable validity (see also Kaufman, McMillen, and Bradby, 1992).
Fourth, there are concerns about population coverage with the CPS, particularly for racial/ethnic minorities. The CPS is representative of the civilian, noninstitutionalized population of household residents in the United States, so young people who are incarcerated, in the military, or homeless are not represented. To the extent that these populations differ from the rest of the population with respect to frequency and method of high school completion, there is the potential for CPS-based estimates to differ from those based on other data sets that capture these populations. Finally, substantial changes over time in CPS questionnaire design, administration, and survey items have made year-to-year comparisons difficult (Hauser, 1997; Kaufman, 2001).
It is possible to overcome some, but not all, of these limitations of the CPS by using data from the ACS or the decennial census. Sample sizes are larger, enhancing the reliability of state- and urban-level estimates. Both the ACS and the decennial census include individuals who are institutionalized or in the military, which enhances the generalizability of the reported statistics. However, it is still not clear how accurately ACS respondents report whether they obtained GEDs or regular high school diplomas.
In addition, the ACS and the decennial census share with the CPS the limitation that sampled young people are not asked to indicate the state(s) in which they attended high school or the school they attended. As a result, the ACS and the census are useful for constructing rates that describe the human capital of populations; however, measures derived from the CPS, the ACS, and the census are not well suited to describing schools’ holding power, because they never refer to specific schools at all, or to young people’s success in navigating the secondary school system.
Longitudinal Sample Surveys
Although a number of longitudinal sample surveys are used for constructing dropout and completion rates (e.g., the National Longitudinal Surveys), the most widely used are those conducted periodically by the Bureau of Labor Statistics (see http://www.bls.gov/nls/) and the National Center for Education Statistics (NCES) (see http://nces.ed.gov/surveys/SurveyGroups.asp?group=1). The NCES surveys include the following:
The 1972 sample of seniors in the National Longitudinal Study of the High School Class of 1972 (NLS).
The 1980 and 1982 samples of sophomores and seniors in High School & Beyond (HS&B).
The sample of eighth graders in the 1988 National Educational Longitudinal Study (NELS).
The sample of sophomores in the 2002 Educational Longitudinal Study (ELS).
Of these four data sources, NELS has been at the center of a great deal of research and debate on the measurement of high school dropout and completion rates in recent years (e.g., see Greene, Winters, and Swanson, 2006; Kaufman, 2004; Mishel, 2006; Mishel and Roy, 2006). Thus, we focus on NELS in the discussion below.
NELS is a longitudinal survey of the grade 8 student cohort of 1988. In the base year, the sample included approximately 25,000 randomly selected students in 1,000 public and private schools across the United States. In addition to the data collected from student interviews, NELS contains information from parents, school administrators, teachers, and student transcripts. The initial student cohort has been resurveyed on four occasions, in 1990, 1992, 1994, and 2000. Students who dropped out of school between surveys were also interviewed. In the early follow-up surveys, the sample was “freshened” with new sample members in order to make the first and second follow-up surveys cross-sectionally representative of 1990 sophomores and 1992 seniors, respectively. The content of the surveys includes students’ school, work, and home experiences; educational resources and support; parental and peer influences; educational and occupational plans and aspirations; delinquency; and many others (Curtin et al., 2002).
For the purposes of measuring high school dropout and completion rates, the key feature of NELS (and of other longitudinal sample surveys as well) is that it includes information about whether and when cohort members dropped out of school and whether and how they obtained secondary school credentials. A key design feature of NELS is the availability of transcript data on high school enrollment, dropout, and completion. In the absence of coverage bias and nonparticipation, NELS data would provide very accurate estimates of high school dropout and completion rates at the national (but not state or district) level—albeit for a single cohort of young people.
Despite the advantages associated with its longitudinal design, a number of technical issues raise questions about the accuracy of dropout and completion rates based on NELS (Kaufman, 2004); these issues also arise in the context of other longitudinal surveys based on samples. First, the base-year NELS sample excluded many students with limited English proficiency or mental or physical disabilities. NCES gathered supplementary information from these students later, but it is not clear how often this supplemental information is used in calculating NELS-based dropout and completion rates. Second, as noted by
Kaufman (2004, p. 119), “[s]ince NELS is a sample survey, it is subject to the same potential for bias due to non-response and undercoverage that CPS has.” Third, transcript data are frequently unavailable for dropouts or alternative completers. This is due in part to the logistical difficulties inherent in collecting such data and in part to nonresponse by schools (Ingels et al., 1995). Some of these problems are overcome by the use of sample weights in NELS; nevertheless, NELS—like all longitudinal sample surveys—has a difficult time retaining hard-to-follow individuals like high school dropouts.
Cross-Sectional Administrative Data

Each state maintains its own system for counting the numbers of students who are enrolled in each grade (usually at the beginning of each academic year), the numbers of students who drop out of school, and the numbers of students who obtain regular diplomas and other high school completion credentials. These counts are usually aggregated up from the schoolhouse level, and they are increasingly linked to longitudinal data systems. At the national level, cross-sectional administrative data on enrollments and numbers of completers are compiled as part of the Common Core of Data (CCD).
Compiled by NCES, the CCD is the federal government’s primary database on public elementary and secondary education. Each year the CCD survey collects information about all public elementary and secondary schools from local and state education agencies. One component of the CCD—the State Nonfiscal Survey—provides basic, annual information on public elementary and secondary school students and staff for each state and the District of Columbia. The State Nonfiscal Survey includes counts of the number of students enrolled in each grade in the fall of each academic year as well as the number of students who earned regular diplomas, earned other diplomas, or completed high school in some other manner in the spring of each academic year. Although the State Nonfiscal Survey has collected counts of public school dropouts since the 1991-92 school year, many states have not provided this information or have provided it in a manner inconsistent with the standard CCD definition of dropout (Winglee et al., 2000; Young and Hoffman, 2002).
One obvious limitation of CCD data—and indeed of all state administrative data—is that they pertain exclusively to public school students. When high school dropout and completion rates are used for the purposes of describing levels of human capital in a population or for describing young people’s success at navigating the secondary education system, this limitation is important. In 2009, 8.4 percent of secondary school students were enrolled in private schools (Snyder and Dillow, 2010, Table 55), which gives a sense of the extent of the population not represented by CCD and state administrative data.5 The CCD
is also limited because it excludes secondary credentials awarded outside the K-12 education system, such as those earned through Adult Education or Job Corps programs.6
Longitudinal Administrative Data

Beyond these cross-sectional, state-produced enrollment counts, states also frequently make use of longitudinal administrative data to produce high school dropout and completion rates. Each state uses somewhat different procedures for data collection, reporting, and aggregation, but in general there have been few concerns about states’ reports of the numbers of students in each grade or the numbers of students obtaining regular diplomas. The most prominent controversies pertain to decisions about how to handle the data. For instance, although there is little concern about states’ abilities to accurately count the number of students who begin high school as ninth graders, there is frequently concern about how states account for factors like migration, incarceration, expulsion, and enrollment in alternative educational programs.
DIFFERENT TYPES OF RATES
There is no one best measure of high school dropout or completion. Different methods of calculating graduation, completion, and dropout rates will be more or less useful for different purposes and more or less valid and reliable for different types of students (National Institute of Statistical Sciences and the Education Statistics Services Institute, 2005; Swanson, 2003). Below we discuss three types of rates: status rates, event rates, and cohort rates.7 We distinguish between cohort rates based on individual data (which we refer to as “individual cohort rates”) and cohort rates based on aggregated data (which we refer to as “aggregate cohort rates”).
5 …have been used in conjunction with the CCD to calculate national graduation rates. See Chaplin and Klasik (2006) at http://www.uark.edu/ua/der/EWPA/Research/Accountability/1790.html.

6 Adult Education, for example, awarded 62,598 diplomas and equivalency credentials in 2008-09 (data retrieved July 19, 2010, from http://wdcrobcolp01.ed.gov/CFAPPS/OVAE/NRS/reports/index.cfm).

7 See also National Institute of Statistical Sciences and the Education Statistics Services Institute (2005) for a more technical description of the various types of rates.

Status Rates

A status rate reports the fraction of a population that falls into a certain category at a given point in time. The most common example is the status dropout rate, although status enrollment rates and status completion rates are also occasionally reported. For instance, in Dropout Rates in the United States, 2006 the U.S. Department of Education reported that 9.3 percent of 16- to 24-year-olds were not enrolled in school and did not have any high school credential in October of that year (Laird et al., 2008). In that same month, 87.8 percent of 16- to 24-year-olds were status completers; that is, they were not enrolled in high school and held some sort of high school credential.
The numerator of the status dropout rate reflects the number of people who have not obtained any high school credential and are not working toward one. The fact that many dropouts subsequently re-enroll in high school, obtain a GED credential, or earn high school credentials in other ways is immaterial in calculating the rate, as is the age at which young people complete high school.
Status dropout and completion rates are usually calculated using cross-sectional data on individuals in the target population. All that is required is information about individuals’ ages, enrollment status,8 and high school completion status. All status dropout and completion rates are measures of the amount (or lack) of human capital in a population. Status rates do not differentiate between those with a diploma and those with a GED or other credentials, however, and do not consider when the credential was earned. As such, they are poor measures of schools’ holding power or of young people’s success at navigating the secondary school system and persisting in school. A low status dropout rate may reflect very high holding power of schools, or it may obscure a situation in which schools have very low holding power and many young people obtain alternate credentials in their late teens or early twenties.
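Given cross-sectional microdata of the kind just described, a status dropout rate takes only a few lines to compute. The records and field layout below are hypothetical, not drawn from any actual survey:

```python
# Each record: (age, currently_enrolled, has_any_hs_credential)
people = [
    (17, True,  False),  # still enrolled: neither a dropout nor a completer
    (18, False, False),  # not enrolled, no credential: status dropout
    (19, False, False),  # status dropout
    (22, False, True),   # completer (diploma or GED; status rates don't distinguish)
    (24, False, True),   # completer
]

target = [p for p in people if 16 <= p[0] <= 24]
status_dropouts = [p for p in target if not p[1] and not p[2]]
status_completers = [p for p in target if not p[1] and p[2]]

status_dropout_rate = 100 * len(status_dropouts) / len(target)
status_completion_rate = 100 * len(status_completers) / len(target)
```

Note that the two rates need not sum to 100, because some people in the target age range are still enrolled.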
Moreover, status rates do not account for the location of schools. A geographic area may have a low status dropout rate because its schools have high holding power, or the area may attract people who have high school credentials. For instance, counties with high technology industries or large postsecondary institutions tend to have relatively low status dropout rates. This probably says more about the human capital of people who move to those counties than about the holding power of the schools there.
Event Rates

An event rate reports the fraction of a population that experiences a particular event over a given time interval; by definition, everyone in the population is at risk of experiencing that event during the period. The most frequently reported example is the event dropout rate—the proportion of students who exit school during a given academic year without completing high school.
Event dropout and completion rates can be calculated using either cross-sectional or longitudinal data. All that is required is information about individuals’ enrollment status in two consecutive academic years, their completion status in the second of those years, and (under some formulations) age. Enrollment status is typically measured at the beginning of each academic year
so the rate can more clearly represent the incidence of dropping out during a well-defined period of time.
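An event dropout rate needs only those three pieces of information. A sketch over hypothetical student records:

```python
# Each record: (enrolled_fall_t, enrolled_fall_t_plus_1, completed_by_t_plus_1)
students = [
    (True,  True,  False),  # persisted to the next year
    (True,  False, True),   # completed high school during the year
    (True,  False, False),  # event dropout
    (True,  True,  False),  # persisted
    (False, False, False),  # not enrolled in fall of year t: not at risk
]

at_risk = [s for s in students if s[0]]  # everyone enrolled in fall of year t
events = [s for s in at_risk if not s[1] and not s[2]]
event_dropout_rate = 100 * len(events) / len(at_risk)
```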
For instance, Kominski (1990) developed an event dropout rate that could be calculated from data collected in the October School Enrollment Supplement of the Current Population Survey; that rate could be estimated separately at grades 10, 11, and 12 or combined across all three grade levels. Research on national trends and differentials in the event dropout rate was undertaken by Hauser (1997) and by Hauser, Simmons, and Pager (2004). The rates estimated in those studies fall well below the more extreme (high) estimates of status dropout and, perhaps partly for that reason, have received little public attention. Misinterpretation of event rates as cohort rates often leads people to believe that dropout rates are lower than they really are. Because event rates are sometimes reported for students at all grade levels and ages, who have very different risks of dropout, and because the CPS provides very small samples at the relevant ages, they are rarely sufficiently sensitive for gauging the effects of changes in school practices, even at the state or regional level. Another disadvantage of this rate, shared by the status dropout rate when estimated from the CPS, is that it excludes those outside the civilian, noninstitutionalized household population, such as young people in prison or in the military. There are also population coverage problems in the CPS, especially for minorities.
Because they are measures of the share of a population that experiences a particular event over the course of a specific time interval, event dropout and completion rates can be used to describe schools’ holding power or young people’s ability to successfully navigate the school system. Whether an event dropout rate is a fair characterization depends on (a) how “success” and “failure” are defined in the numerator and (b) how the population is defined in the denominator. If the goal is to measure schools’ holding power, the numerator is determined by how schools define success (e.g., they are explicit about whether GEDs and other alternative credentials are treated as equivalent to regular diplomas), and the denominator is restricted to those continuously residing in a well-defined geographic area (typically a school district or state). The resulting event dropout rate thus describes the experiences of only those students for whom the school district or state is formally responsible. If the goal is to measure the rate at which students succeed in navigating the secondary school system, the denominator need not be restricted to those who continuously reside in a particular geographic area.
Individual Cohort Rates
Individual cohort rates are derived from longitudinal (or retrospective) data on individuals, all of whom were the same age or in the same grade at a certain point in time (e.g., students entering high school in a given year). Individual cohort rates report the fraction of individuals who transition into a
particular status (e.g., dropouts, graduates, completers) by a subsequent point in time, typically within four years.
There are two general sources for the longitudinal data used to compute cohort dropout and completion rates. The first source is administrative data collected by the school system, district, and/or state. In this case, the number of first-time ninth graders in the fall of a given academic year is the denominator for the rate. Depending on the kind of rate desired, the numerator is formed by counting the number of students who obtain diplomas (for graduation rates), obtain any secondary credential (for completion rates), or leave school without obtaining any credential (for dropout rates). As with event dropout rates, states and districts differ with respect to what counts as success and failure in the numerator. Some agencies count only regular diplomas as successes, while others also count GEDs and alternative credentials.
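With administrative counts of this kind, the various individual cohort rates differ only in the numerator. The counts below are hypothetical, and whether GEDs belong in the completion numerator is exactly the policy choice noted above:

```python
# Final four-year status of a hypothetical cohort of first-time ninth graders.
cohort = {
    "regular_diploma": 700,
    "ged_or_alternative": 60,
    "dropout": 180,
    "still_enrolled": 60,
}
n = sum(cohort.values())  # denominator: all first-time ninth graders

graduation_rate = 100 * cohort["regular_diploma"] / n        # diplomas only
completion_rate = 100 * (cohort["regular_diploma"]
                         + cohort["ged_or_alternative"]) / n  # any credential
dropout_rate = 100 * cohort["dropout"] / n
```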
The second source is surveys like the National Longitudinal Surveys and the set of longitudinal surveys administered by NCES (i.e., NLS, HS&B, NELS, and ELS, all described in the previous section), which are based on samples of students. In each of its longitudinal surveys, NCES began by selecting stratified, nationally representative samples of students in the focal grade(s) in the base year. Those students were then followed periodically, allowing for the computation of individual cohort dropout and completion rates. The data sets include information about whether, when, and how students completed school. Here, the denominator of the cohort rate consists of all sampled students who were in the same grade at the same point in time. The numerator can be defined to answer the question of interest, depending on whether the desired rate is for graduates, dropouts, all completers (including GED recipients), on-time graduates, eventual graduates, and so on.
Individual cohort dropout and completion rates obtained from administrative data differ in several ways from those computed using data on samples of students (such as from NCES or elsewhere). Most importantly, rates based on administrative data are typically used to characterize dropout and completion in a particular state, district, or school. As such, the denominator must be adjusted to account for entry and exit into the population of interest, such as when students transfer into or out of a school, to other educational settings (GED or adult education programs), are incarcerated or expelled, or die.
Individual cohort dropout and completion rates that are derived from the NCES samples are typically intended to characterize large populations of students, such as all students or all African American students in the United States. Rates based on state administrative data, in contrast, estimate how well schools, districts, and states are “holding” their students and, ultimately, how many students succeed or fail in the jurisdiction. As such, they are more useful than NCES-based rates for determining the effects of programs and policies on students’ risk of graduating or dropping out. Unlike rates based on samples, these rates include all students, so there is less risk of sample selection bias.
The difference in purpose is partly attributable to the nature of the longitudinal cohort samples themselves. The NCES cohorts, for instance, have been conducted at roughly 10-year intervals and are not sufficiently large to characterize school districts or even states. They are thus not useful for school accountability purposes. Because student-level characteristics are publicly available in these data sets, analyses generally focus on understanding the student-level correlates of high school completion or dropping out. Researchers are also at liberty to use the data to construct cohort dropout and completion rates that suit their own purposes and that differ with respect to the technical definitions of both the numerator and the denominator. Finally, because NCES samples include students in both public and private schools, the findings generalize to all students, unlike administrative data, which are collected only for public school students. However, the validity of findings from the NCES studies is compromised to the extent that there is differential attrition of students from the samples across time. Moreover, students in the baseline sample are not representative of students in high school grades in later years because of in-migration and grade retention. Thus, in NELS, the sample of eighth graders from 1988 was augmented in grade 12 in 1992 to make it representative of all high school seniors in that year.
Aggregate Cohort Rates
Aggregate cohort rates are designed to approximate true cohort rates (i.e., cohort rates based on individuals tracked over time) when longitudinal data are not available. Beginning with a count of the number of individuals who share a common characteristic at one point in time (e.g., students entering high school), aggregate cohort rates estimate the percentage of individuals who transition into a new status (e.g., high school completion) by a subsequent point in time. With these rates, the numerator is the number of dropouts or completers in a cohort, and the denominator is the number of students at risk of dropping out or completing.
Aggregate cohort rates are primarily based on aggregated administrative data collected from schools. All that is required is the number of students completing and/or dropping out at a point in time and the number of students at risk of doing so; this information is usually tied to specific cohorts of incoming students. No data are obtained to link observations of individual students across time.
Determining the denominator of an aggregate cohort rate can be difficult. For most purposes, the denominator should include only first-time ninth graders (in Chapter 3, we discuss the impact of including only first-time ninth graders versus students who are repeating grade 9) and should account for student migration into and out of the cohort. However, this is difficult to do in the absence of longitudinal data. Many states have calculated an aggregate cohort rate by dividing the number of completers in the spring of a given academic year by a denominator that is the sum of the number of students who completed high school in the spring of that year plus the number of dropouts over the four prior years. Such rates have the disadvantage of counting more than once those dropouts who entered and left high school multiple times, and they fail to count as completers those dropouts from one school who later enrolled in and completed high school at another school. They also omit students whose final status cannot be determined and may therefore discourage school systems from trying to identify the final status of such students, who are typically dropouts.
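The state calculation just described can be sketched in a few lines of code. This is an illustration only: the function name and all enrollment counts are hypothetical, not taken from any actual state's data.

```python
# Illustrative sketch of the state aggregate cohort completion rate
# described above. All counts are hypothetical.

def aggregate_completion_rate(completers, dropouts_by_year):
    """Completers in the spring of a given year, divided by the sum of
    those completers plus the dropouts reported over the four prior years."""
    denominator = completers + sum(dropouts_by_year)
    return 100.0 * completers / denominator

# Hypothetical counts: 700 completers and the dropout counts for each of
# the four prior years. A student who enters and leaves more than once is
# counted in each year's dropout total, one source of the bias noted above.
rate = aggregate_completion_rate(700, [80, 70, 60, 40])
print(round(rate, 1))  # 700 / 950 = 73.7
```

Note that nothing in this calculation links individual students across years, which is why re-entering dropouts inflate the denominator.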
Several alternative methods have been proposed for dealing with the problem in the denominator of these rates (i.e., correctly determining the number of students in the cohort). Prominent examples of alternatives, all of which use data from the State Nonfiscal Survey of the CCD, include (1) the Cumulative Promotion Index (CPI) developed by Swanson (Editorial Projects in Education, 2008; Swanson and Chaplin, 2003); (2) the Averaged Freshman Graduation Rate (AFGR) used by NCES (Seastrom et al., 2006a, 2006b); (3) the Adjusted Completion Rate (ACR) developed by Greene (Greene and Forster, 2003; Greene and Winters, 2002); and (4) the Estimated Completion Rate (ECR) developed by Warren (2005; Warren and Halpern-Manners, forthcoming).
All of these measures make some effort to adjust the denominator—which is based on the total number of ninth graders—to account for migration into and out of the “at risk” population and/or to account for bias introduced by the fact that some ninth graders are not first-time ninth graders.10 For instance, the AFGR uses a “smoothed” denominator that attempts to account for higher grade retentions in grade 9 by forming an average of grade 8, 9, and 10 enrollments. Alternatively, the CPI rate is calculated by first dividing the number of regular diploma recipients in the spring of a given year by the grade 12 enrollment in the fall of that year and then multiplying this proportion by a promotion index for the three prior years. The promotion index is intended to estimate the likelihood of a “9th grader from a particular district completing high school with a regular diploma in four years given the conditions in that district during the school year” (Swanson, 2003, p. 15).
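The AFGR and CPI adjustments described above can be sketched as follows. This is a hedged illustration of our reading of the text, not the official NCES or Swanson specifications: all enrollment counts and promotion ratios are hypothetical, and the exact year lags used in practice are governed by the published definitions.

```python
# Hedged sketches of the AFGR and CPI calculations described above.
# All figures are hypothetical.

def afgr(diplomas, grade8_enroll, grade9_enroll, grade10_enroll):
    """Averaged Freshman Graduation Rate: diplomas divided by a
    'smoothed' freshman class, the mean of grade 8, 9, and 10
    enrollments from the appropriately lagged school years."""
    averaged_freshmen = (grade8_enroll + grade9_enroll + grade10_enroll) / 3
    return 100.0 * diplomas / averaged_freshmen

def cpi(diplomas, grade12_fall, promotion_ratios):
    """Cumulative Promotion Index: diplomas in the spring over grade 12
    enrollment in the fall of that year, multiplied by grade-to-grade
    promotion ratios (9->10, 10->11, 11->12) for the prior grades."""
    rate = diplomas / grade12_fall
    for ratio in promotion_ratios:
        rate *= ratio
    return 100.0 * rate

# Hypothetical district: 850 diplomas, smoothed over 1,150 / 1,300 / 1,100
# eighth, ninth, and tenth graders in the lagged years.
print(round(afgr(850, 1150, 1300, 1100), 1))

# Hypothetical fall grade 12 enrollment of 900 and promotion ratios
# of 0.90, 0.95, and 0.97 for the three prior grade transitions.
print(round(cpi(850, 900, [0.90, 0.95, 0.97]), 1))
```

The smoothed AFGR denominator dampens, but does not eliminate, the inflation of grade 9 enrollment caused by retention, which is precisely the residual bias Warren (2008) documents.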
As Warren (2008) demonstrates (see footnote 10), the adjustments used for the AFGR, the CPI, and the ACR do not actually compensate for these biases. The resulting estimates of the graduation rate are in some cases upwardly biased and in some cases downwardly biased, and the extent of bias worsens when moving from rates reported at the state level to rates reported for substate levels (i.e., districts). Warren's own rate (the ECR) shows less bias than the other three at the state level, but it is also biased at the substate level, where it is subject to sampling error.
These aggregate cohort rates also have problems with the numerator. That is, they are unable to distinguish on-time graduates from other graduates—all that is known is the total number of graduates. Although these rates are frequently used to characterize the holding power of states, districts, and schools, they are conceptually imperfect for this use. Because of this limitation, they do not meet the graduation rate definition spelled out by the National Governors Association (NGA) Compact or in the most recent regulations for the No Child Left Behind (NCLB) Act and hence are not useful for accountability purposes.
Despite these weaknesses, aggregate cohort rates have one major advantage. They can be computed for every state and every local education agency in the country in a technically consistent manner, and they are available annually for many prior years. This allows for meaningful comparisons over time and across locales. The same cannot be said of true cohort rates. When based on longitudinal administrative data, true cohort rates are generally not computed in a consistent manner across locales or over time (although the NGA Compact and the NCLB regulations may change this). When based on longitudinal survey data on students, true cohort rates usually cannot be generalized to districts or even (in many cases) states. For any analyses of change in completion or dropout rates over time and/or across locales, aggregate cohort rates are all that are available.
The problems with aggregate cohort rates are important because the rates have been widely reported, have received considerable attention, and have been used to make judgments about the quality of education in specific states, districts, and schools. For instance, the AFGR is routinely reported by NCES. The CPI is routinely used for the graduation rates reported in Editorial Projects in Education's Diplomas Count publication (e.g., 2008), which receives considerable publicity. The other two (ACR and ECR) have received somewhat less attention, in part because they have been used primarily for research purposes.
In addition, Balfanz and Legters (2004) describe and use a Promoting Power Index (PPI), which they argue serves as an indirect indicator of the rate at which a school system graduates students. The measure is based on the ratio of the number of enrolled twelfth graders in one year to the number of enrolled ninth graders three years earlier (or else the number of tenth graders enrolled two years earlier when school systems do not enroll ninth graders).11 These authors, as well as other users of this measure (e.g., the Alliance for Excellent Education), are forthcoming about its potential weaknesses. For one thing, it does not actually include the number of graduates from a school system in its calculation; thus it does not account for individuals who are enrolled in grade 12 in the fall but who do not go on to graduate. For another, it does nothing to adjust for the known biases in such measures that arise as the result of grade retention or student migration.
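The PPI calculation described above reduces to a single ratio; the sketch below makes the weaknesses concrete. The counts are hypothetical.

```python
# A minimal sketch of the Promoting Power Index as described above:
# the ratio of current grade 12 enrollment to grade 9 enrollment
# three years earlier. All counts are hypothetical.

def promoting_power_index(grade12_now, grade9_three_years_ago):
    # Note: the index uses enrollment counts only. Actual graduate
    # counts never enter the calculation, and no adjustment is made
    # for grade retention or student migration.
    return grade12_now / grade9_three_years_ago

print(round(promoting_power_index(780, 1200), 2))  # 780 / 1200 = 0.65
```

A twelfth grader who never graduates still counts in the numerator, and a retained ninth grader inflates the denominator, which is why the PPI is only an indirect indicator of holding power.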
The problems with these aggregate cohort rates lead us to the following conclusion:
CONCLUSION 4-1: Aggregate cohort indicators, such as the Averaged Freshman Graduation Rate (AFGR), the Promoting Power Index (PPI), and the Cumulative Promotion Index (CPI), are useful as rough approximations of graduation rates. However, the rates are too imprecise to be used to make fine distinctions, such as to compare graduation rates across states, districts, or schools or across time.
There is no one best measure of high school dropout or completion. Different methods of calculating graduation, completion, and dropout rates will be more or less useful for different purposes and more or less valid and reliable for different types of students. The technical issues need to be considered in the context of how the resulting statistics are to be used, knowing that they will serve different purposes. For instance, some rates are more appropriate for providing information about the human capital of the country’s population, some are more appropriate for characterizing schools’ holding power, and some are more appropriate for characterizing students’ success at navigating through high school. Once the purpose is established, some methods are more appropriate than others. We therefore recommend:
RECOMMENDATION 4-1: The choice of a dropout or completion indicator should be based on the purpose and uses of the indicator.
We note that when dropout and graduation rates are reported in documents that are used for multiple purposes, multiple types of rates should be reported. Decisions about the types of rates to report should be based on the intended audience and uses of the information.
Our review also suggests that cohort rates based on aggregate data are not sufficiently accurate for research, policy, or accountability decisions. When these rates are used to make fine distinctions, such as comparisons across states, districts, or schools or across time, they may lead to erroneous conclusions. Three methods for calculating aggregate cohort rates—the Promoting Power Index, the Averaged Freshman Graduation Rate, and the Cumulative Promotion Index—are commonly used and receive wide attention. The PPI is used by the Alliance for Excellent Education and others. The AFGR is used by the National Center for Education Statistics to report district- and state-level graduation rates and, by virtue of being produced by the federal government, carries an implicit stamp of legitimacy that is not justified. The CPI is used in Diplomas Count, the annual publication by Editorial Projects in Education that summarizes states' and districts' progress in graduating their students. Use of these rates should be phased out in favor of true cohort rates. The most accurate cohort rates are those based on individual longitudinal data, and whenever possible, longitudinal data should be used to calculate them. We therefore recommend:
RECOMMENDATION 4-2: Whenever possible, dropout and completion rates should be based on individual student-level data. This allows for the greatest flexibility and transparency with respect to how data analysts handle important methodological issues that arise in defining the numerator and the denominator of these rates.
RECOMMENDATION 4-3: The Averaged Freshman Graduation Rate, the Cumulative Promotion Index, the Promoting Power Index, and similar measures based on aggregate-level data produce demonstrably biased estimates. These indicators should be phased out in favor of true longitudinal rates, particularly to report district-level rates or to make comparisons across states, districts, and schools or over time.
If additional information were collected through the ACS, it would be possible to calculate robust individual cohort rates nationally and for individual states. The ACS already ascertains whether people complete high school via a GED or diploma, but questions could be added to determine the state and year in which people first entered ninth grade and the state and year in which they completed high school. Using this information, one could reliably estimate the percentage of first-time ninth graders who obtained high school diplomas and/or GEDs (on time or otherwise) for multiple cohorts of students. These rates could be calculated nationally and for states, although sample size restrictions in the ACS would prevent drawing conclusions at the district level. We therefore recommend:
RECOMMENDATION 4-4: The U.S. Department of Education should explore the feasibility of adding several questions to the American Community Survey so the survey data can be used to estimate state graduation rates. This can be accomplished by ascertaining the year and state in which individuals first started high school, the year and state in which they exited high school, and the method of exiting high school (i.e., diploma, GED, dropping out). These additional questions could be asked about all individuals over age 16, but, in order to minimize problems associated with recall errors and selective mortality, we suggest that these items be asked only of individuals between the ages of 16 and 45.
In the past few years, dropout and graduation rates have received much attention, in part because of discrepancies in the reported rates. These discrepancies have arisen as a result of different ways of calculating the rates, different purposes for the rates, and different ways of defining terms and populations of interest. The federal government can do much to help ameliorate the confusion about the rates. For instance, in 2008, it provided regulatory guidance about how the rates were to be calculated and reported to meet the requirements of NCLB. The National Governors Association’s definition of graduation rates provides a good starting point for standardizing practice in the way that these rates are determined. However, the definition is not specific enough to ensure that rates are comparable across states. We therefore recommend:
RECOMMENDATION 4-5: The federal government should continue to promote common definitions of a broad array of dropout, graduation, and completion indicators and also to describe the appropriate uses and limitations of each statistic.