Using Licensure Tests for Accountability
Earlier chapters examine the use of teacher licensure tests in identifying candidates with some of the knowledge and skills needed for minimally competent beginning practice. This chapter looks at policy makers’ other ambitions for teacher licensure tests, such as using them to help teacher education institutions focus on the knowledge and skills considered critical for beginning teaching and to hold higher-education institutions and states accountable for the quality of teacher preparation and licensure programs. In the current policy context, licensure tests are being used to identify competent teacher candidates, communicate what beginning teachers need to know and be able to do, and evaluate the quality of teacher preparation and licensure programs. These are broad and ambitious goals.
FOCUSING TEACHER EDUCATION ON IDENTIFIED COMPETENCIES
Teacher licensure tests have the potential to influence teacher preparation institutions in several ways. Some assert—and the committee agrees—that initial licensure tests can have positive effects on teacher education if the tests support states’ teaching and learning standards and if performance on them relates to teachers’ performance in the classroom (Melnick and Pullin, 2000). Initial licensure tests signal to teacher education programs the content and pedagogical knowledge considered prerequisite to minimally competent beginning teaching. They can draw attention to the advances in research and professional knowledge recognized by the test. The committee has tried to examine these issues.
As noted in Chapter 6, there is as yet little information on the relationship between results on initial licensure tests and other indicators of candidates’ competence. There is, however, some evidence on the relationship between the content of licensure tests and states’ teaching standards. The test adoption process followed by states using Educational Testing Service (ETS) tests and the test construction procedures used by National Evaluation Systems (NES) suggest that licensure tests have some correspondence to states’ standards. The adoption process for ETS tests calls for comparisons of states’ needs and tested content, and states administering these tests are asked to judge that the tests they select correspond to their teaching and learning goals. In states that contract with NES for test development, tests are developed according to the states’ specifications. Alignment between initial licensure tests and state teaching and learning standards is an important prerequisite to coherent developmental systems for teacher preparation, assessment, and support.
It is also possible that licensure tests can have negative effects on teacher education. Logically, if licensure tests oversimplify teaching knowledge, or if they emphasize kinds of knowledge or practice that are not universally associated with effective teaching in ways that discourage teachers from learning to diagnose different students’ needs, they could do harm. Data on such negative effects are scarce, however. Some data are available on possible testing effects in Massachusetts. Flippo and Riccards (2000) report changes by some Massachusetts colleges and universities in response to the recent disappointing performance of Massachusetts candidates on the state’s new licensure exam (Haney et al., 1999) and to recent federal attention to initial teacher licensure testing. These institutions report aligning course content to test specifications, adding workshops on test preparation, and imposing testing requirements for admission to teacher education programs. These changes would not necessarily be cause for concern, except that in Massachusetts there may be a perceived misalignment between the content of the licensing test and the knowledge and skills identified as important for teaching (Haney et al., 1999; Melnick and Pullin, 2000). As noted earlier, the first administration of Massachusetts’ new test gave notable weight in scoring to a test item asking candidates to transcribe one of the Federalist Papers from audiotape, a task not closely related to the central tasks of teaching. Poorly designed tests could thus water down teacher education curricula in institutions whose officials care more about having graduates pass the test than about preparing well-qualified teachers (Flippo and Riccards, 2000).
The committee does not have enough evidence to judge whether, on balance, current tests and the regulations that surround them are likely to improve teacher preparation or to divert teacher education from untested content in problematic ways. Given currently available data on the impact of initial licensure tests on teacher education programs, it is unclear whether, and under what conditions, licensure tests can and do improve teacher education. It will be important to monitor the impact of new licensure tests and licensure requirements on teacher education curricula and to examine the conditions under which tests are used.
HOLDING STATES AND HIGHER EDUCATION INSTITUTIONS ACCOUNTABLE
In 1998 Congress amended Title II of the Higher Education Act (HEA) to promote teacher quality in two ways (P.L. 105–244). Teacher quality enhancement grants were authorized for states and for partnerships including at least an institution of higher education and a high-need school district. Additionally, public reporting and accountability requirements were established for states receiving funds authorized by the law and for institutions of higher education with teacher preparation programs enrolling students receiving aid under the HEA.
The law has several purposes (see Box 7–1). The overriding purpose is to improve student achievement. To this end, the federal government seeks improvement in the preparation of prospective teachers and the quality of professional development for current teachers. The law also holds institutions of higher education with teacher preparation programs accountable for preparing teachers who have the necessary teaching skills and content knowledge. Additionally, it encourages efforts to recruit qualified individuals into teaching, including those now in other occupations.
BOX 7–1 Title II Purposes
SOURCE: Higher Education Reauthorization Act, 1998.
The teacher quality enhancement grants for states and partnerships may be used to address the purposes of Title II in several ways. States are permitted to use the funds for implementing reforms to hold institutions of higher education with teacher preparation programs accountable for preparing new teachers who will be competent to teach in their planned areas. Funds can be used to reform teacher certification or licensure requirements. Grants can be used to establish, expand, or improve alternatives to traditional routes to certification or licensure in teacher preparation programs. Additionally, funds can be used to develop and implement new ways to recruit highly qualified teachers, provide financial rewards to those who are highly effective, and remove those who are not competent.
Partnerships can use their grants for an array of activities. A partnership can address problems in integrating the efforts of schools of arts and sciences and schools of education in preparing new teachers who are highly knowledgeable in relevant content areas. A partnership can work to provide clinical experiences in preservice teacher programs. A partnership can also create and implement improved professional development programs and can work to improve the recruitment of highly qualified individuals into teaching.
The federal government is investing $75 million in these new initiatives in the first year and $98 million in each of the second and third fiscal years.
Accountability and Evaluation Provisions
Title II created a new accountability system of reports on the quality of teacher preparation. This system requires institutions of higher education to report annually to states, states to report annually to the U.S. Department of Education, and the Secretary of Education to report annually to the Congress and the public on the quality of teacher preparation. Institutions are to report by April 7 of each year the passing rates of their teacher education graduates on state-required assessments; the average passing rates of all graduates in the state; and selected program characteristics, such as the numbers of enrolled students, faculty/student ratios, and whether a teacher preparation program has been identified as low performing by the state. All of this information is to be included in institutional publications such as promotional materials, school catalogs, and information sent to prospective employers. Institutions can be fined $25,000 for failing to report in a timely and accurate manner.
States are to aggregate institutional reports and report to the U.S. Department of Education by October 7 of each year. Each state report is to rank teacher preparation programs by quartiles on the passing rates of their graduates on licensure tests. Information on state requirements for teacher certification or licensure is to be presented, and states must list any institutions found to be low performing, or at risk of being so designated, based on criteria the states select. See Box 7–2.
Finally, the Secretary of Education is to prepare an annual report to Congress and the public on the quality of teacher preparation starting in April 2002. This report is to include information on state requirements for certification or licensure as well as information on efforts to improve the quality of teacher preparation and the teacher force.
BOX 7–2 State Reporting Provisions
The law specifies that state report cards should include:
SOURCE: Higher Education Reauthorization Act, 1998.
To receive funds under the HEA, states are to develop procedures to identify and assist, through the provision of technical assistance, low-performing programs of teacher preparation in institutions of higher education. As Box 7–2 shows, states are to include in their annual reports a list of any such institutions identified, and any institution so identified must report this designation in its
Title II and other public reports. States are responsible for selecting the criteria they use to identify low-performing institutions. Title II says that states may use the passing rates of graduates in determining low performance but are not required to do so. The law also says that, if an institution loses state approval or state funding for its teacher preparation program because of low performance, it will lose its eligibility for certain federal funds. Such institutions will be ineligible for funding by the U.S. Department of Education for professional development. Further, their teacher preparation programs will not be permitted to accept or enroll any student receiving aid under Title IV of the HEA.
The stakes for noncompliance with the law and for poor performance are high. Institutions can be fined $25,000 for failing to report in a timely and accurate manner, and non-reporting states can lose funding authorized under the Higher Education Act. If states withdraw program approval or terminate funding, teacher education programs can lose grant funds and become ineligible to enroll teacher candidates who receive federal aid. At stake are federal fines and federal aid to teacher education programs and students. Declining student enrollments and diminished support from the professional community and the public are also possible consequences for low-performing schools.
The new law has met with notable resistance from states and the higher education community (Interstate New Teacher Assessment and Support Consortium, 1999; American Association of State Colleges and Universities, 1999, 2000a; American Association of Colleges for Teacher Education, 2000; Teacher Education Accreditation Council, 2000). Critics argue that teacher licensure tests provide only a limited view of program quality; that passing rates on licensure tests are not comparable across states and across institutions; and that the data are difficult to collect, verify, and report.
The U.S. Department of Education (2000b) has developed a reference guide to help states and institutions of higher education learn about and comply with the law. The Reference and Reporting Guide for Preparing State and Institutional Reports on the Quality of Teacher Preparation includes definitions, forms, and timetables; suggests data-reporting procedures; provides the questionnaires that states and institutions must use to report; and offers guidance on data interpretation. The Guide also includes a list of possible supplementary indicators of program quality for states and institutions.
Concerns About the Accountability and Evaluation Provisions
The National Center for Education Statistics of the U.S. Department of Education has worked closely with representatives from the states and higher education to develop standard definitions and uniform reporting methods for the data required by the law. Procedures to track candidates’ testing histories, check institutional affiliation data, and verify data have been instituted and are described in the Guide. However, there remain a number of important data collection and
reporting problems for states and institutions (Interstate New Teacher Assessment and Support Consortium, 1999; American Association of State Colleges and Universities, 1999, 2000a; American Association of Colleges for Teacher Education, 2000; Teacher Education Accreditation Council, 2000). Some of the problems are fairly fundamental and raise legitimate questions about the interpretability of the data that will be reported. These are described next. The Reference and Reporting Guide includes some similar cautionary statements about data interpretation.
For institutions and states, collecting and reporting required data are made difficult by:
the need to report degree and licensure requirements, waiver categories, and types of licenses in standardized ways;
the need to verify the institutional affiliations reported by test takers;
the need to track over time the testing records of individuals who retest;
the need to determine the pertinent institutional affiliations of students who have attended multiple institutions;
the large numbers of tests for which states have data;
insufficient numbers of examinees per test and population group to obtain reliable aggregated results;
the need to aggregate data across categories of tests; and
the absence of previous state and institutional data collections on these federally required data.
These difficulties may make it hard to interpret the reported data and hard to draw meaningful conclusions about candidates’ mastery of tested content.
The law requires states to report passing rate data for all tests that are used in more than one state and that have sufficient numbers of test takers. Institutions are required to report passing rate data for all tests for which they have adequate numbers of candidates. It is likely that policy makers and others will use these data to make inferences about the relative quality and rigor of preparation and licensure systems; however, comparability across states and institutions is questionable.
Differences among state testing systems are likely to make comparisons of passing rates misleading; these differences include:
considerable variability among states in the tests required for licensure;
even when tests are the same, states set different passing scores;
states administer tests at different points in teacher education (e.g., before admission, before graduation, after degree conferral); and
states attach different stakes to passing (e.g., no passing requirements, passage required for student teaching, passage required for licensure).
This diversity across states makes passing rate comparisons among them misleading. For states using different examinations, comparing passing rates is highly problematic because differences in the content, format, and margins of error of the tests limit reasonable inferences about relative performance (National Research Council, 1999a). Comparing results for states using the same tests with different passing scores also is problematic because the numbers passing in each state are partly determined by the passing scores in effect. Furthermore, differences in the way tests are used by states make it virtually impossible to meaningfully compare passing rates from different licensure tests (National Research Council, 1999). Passing rates are partially determined by the decisions about teacher candidates that the scores support. States that require Praxis I for entry into teacher preparation will have 100 percent passing rates on Praxis I, while states that require it at licensure are likely to have lower rates. Passing rates are comparable only for states using the same tests with the same passing scores to support the same decisions.
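The effect of passing scores alone can be illustrated with a hypothetical calculation. The score distribution and the two cut scores below are invented for illustration and do not describe any actual test or state; the point is simply that identical candidate performance yields very different reported passing rates under different cut scores:

```python
# Hypothetical illustration: identical candidate scores, different state cut scores.
# All numbers are invented; they do not describe any actual test or state.
scores = [150, 155, 160, 162, 165, 168, 170, 172, 175, 180]  # same candidates in both states

def passing_rate(scores, cut_score):
    """Fraction of candidates scoring at or above the state's passing (cut) score."""
    return sum(s >= cut_score for s in scores) / len(scores)

rate_state_a = passing_rate(scores, 160)  # State A sets a lower cut score
rate_state_b = passing_rate(scores, 170)  # State B sets a higher cut score

print(f"State A passing rate: {rate_state_a:.0%}")  # 80%
print(f"State B passing rate: {rate_state_b:.0%}")  # 40%
```

Comparing the two states' reported rates would suggest State A prepares candidates better, when in fact the candidates performed identically.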
For the tests reviewed in Chapter 5, only three allow comparisons across states, that is, multiple states use them with the same passing scores to support the same candidate decisions. When states’ current passing scores and uses are considered for the Mathematics: Proofs, Models, and Problems, Part 1 test, only two states can compare their own passing rates to those of one other state. The same is true for the Biology: Content Knowledge, Part 1 test. No comparisons are possible for the Principles of Learning and Teaching test, the Middle School English/Language Arts test, or the Biology: Content Knowledge, Part 2 test. On the Pre-professional Skills Test (PPST) in Reading, four states can each compare their own passing rates to those of one other state; 12 states can compare their own passing rates to those of two other states; and five states can compare their own passing rates to those of four others. However, the PPST tests provide little information about the quality of teacher preparation; they examine basic skills considered prerequisite to, not the result of, teacher education. Because the committee selected these tests from among ETS’s more commonly administered tests, even fewer comparisons will generally be possible on the other Praxis tests. Comparisons of passing rates for NES states or others using state-specific tests are not meaningful.
In addition to passing rates for individual tests, the Reference and Reporting Guide says passing rates are to be aggregated and reported across tests within various test categories, i.e., basic skills tests, subject matter tests, pedagogy tests, special populations tests (special education, English as a second language), other content tests, and performance tests. Passing rates are also to be aggregated and reported across all required tests. The committee is highly doubtful that these aggregated and summary scores can support meaningful conclusions about program quality. Combining data from different tests, with different passing scores, given at different times, to support different candidate decisions will create summary scores with unknown meaning.
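A hypothetical calculation suggests one reason such aggregated rates are hard to interpret. All figures below are invented for illustration: two programs with identical passing rates on every test category nonetheless report different pooled summary rates, simply because their candidates take the tests in different proportions:

```python
# Hypothetical illustration of aggregating passing rates across test categories.
# All counts are invented for illustration only.
# (passed, attempted) per test category for each program:
program_x = {"basic_skills": (90, 100), "subject_matter": (40, 50)}
program_y = {"basic_skills": (18, 20), "subject_matter": (104, 130)}

def summary_rate(program):
    """Aggregate passing rate pooled across all test categories."""
    passed = sum(p for p, a in program.values())
    attempted = sum(a for p, a in program.values())
    return passed / attempted

# Per-category rates are identical (90% basic skills, 80% subject matter),
# yet the pooled summary rates differ because the test mixes differ.
print(f"Program X summary rate: {summary_rate(program_x):.1%}")  # 86.7%
print(f"Program Y summary rate: {summary_rate(program_y):.1%}")  # 81.3%
```

The summary score thus reflects the mix of tests taken as much as candidate performance, which is one way a combined rate acquires unknown meaning.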
Institutional comparisons within states pose similar problems. Though passing rates provide a useful point of entry for investigating possible sources of strong and weak performance, they can be taken at face value only in some cases. Passing rates may not be comparable across teacher preparation programs within a state because of institutional factors such as:
different missions and recruiting practices,
differences in the entry-level characteristics of students,
different entry and exit testing requirements in teacher education,
variability in the procedures used to determine appropriate affiliations for students who attend multiple institutions or who do not identify institutional affiliations,
differences in the numbers of students who retest, and
differential score instability associated with small test-taking pools.
Because some teacher candidates take subject matter coursework at different institutions, passing rates on the subject matter tests may not indicate very much about the quality of a teacher preparation program. Even within institutions, faculty in the arts and sciences departments who teach subject matter courses to teacher candidates may be differentially willing to attend to the content of licensing tests.
Differences in the entry-level characteristics of teacher candidates at different institutions also pose an important problem for comparability. Higher passing rates at one institution may simply indicate that admitted students had better prior educational opportunities than those admitted to another institution. For example, a teacher education institution that draws students from underrepresented groups, some of whom score relatively poorly on college entrance tests of basic skills, will likely need to provide instruction in basic skills. Even with this instruction, it might be more difficult for such teacher candidates to pass the basic skills and other teacher licensure tests. According to Title II criteria, however, such an institution could appear to be a poor or even a failing institution even though students may have had more learning opportunities and accomplished more while in the program than, say, students in a program with higher admissions requirements. Passing rates may say very little about the quality of education at an institution in the absence of information about candidates’ mastery levels when entering the program.
Differences in the percentage of students who retest at different institutions pose additional problems for comparability. As discussed in Chapter 5, the test-taking population of every school includes initial testers who pass, initial testers who fail and never retry the licensing test, test takers who initially fail and eventually pass the licensing test, and test takers who repeat but never pass the licensing test. Institutions are asked to report passing rates just after program completion and, starting in 2004, three years after completion. The proportion
of candidates in each of the first-time and repeating groups in a reporting year is likely to be related to the institution’s overall passing rate. Differences in their proportions across institutions will make passing rate data difficult to compare.
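The arithmetic behind this point can be sketched with invented numbers. In the hypothetical calculation below, two institutions have identical pass rates within each group of test takers, but the institution whose reporting-year pool contains more repeaters reports a lower overall rate:

```python
# Hypothetical illustration: how the mix of first-time and repeat testers in a
# reporting year can shift an institution's reported passing rate.
# All counts and rates are invented for illustration only.

def cohort_rate(first_time, first_pass_rate, repeaters, repeat_pass_rate):
    """Overall passing rate for a reporting year's mixed test-taking pool."""
    passed = first_time * first_pass_rate + repeaters * repeat_pass_rate
    return passed / (first_time + repeaters)

# Both institutions pass 80% of first-time testers and 40% of repeaters;
# only the proportion of repeaters in the reporting year differs.
inst_a = cohort_rate(first_time=90, first_pass_rate=0.8, repeaters=10, repeat_pass_rate=0.4)
inst_b = cohort_rate(first_time=60, first_pass_rate=0.8, repeaters=40, repeat_pass_rate=0.4)

print(f"Institution A reported rate: {inst_a:.0%}")  # 76%
print(f"Institution B reported rate: {inst_b:.0%}")  # 64%
```

The twelve-point gap here reflects only the composition of the test-taking pool, not any difference in how well either institution prepares its candidates.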
The data collection and interpretation problems posed by Title II are substantial. On their own, initial licensure tests fall short as indicators of program quality. In complying with Title II, states and institutions should supplement test score information with other information about the characteristics and quality of programs. The U.S. Department of Education (2000b) suggests that data on other measures be reported, including demographic data on teacher education students and program completers, job placement rates in fields of eligibility, numbers of completers with National Board for Professional Teaching Standards certification, and information on the goals of teacher education programs. Other supplementary indicators might include data on the entry-level academic characteristics of teacher education students and program completers, job retention rates for graduates, numbers of placements in high-need schools and fields, and employer evaluations.
The Teacher Quality Enhancement Grants for States and Partnerships program was enacted to achieve four goals: to improve student achievement; to improve the quality of the current and future teaching force by improving the preparation of prospective teachers and enhancing professional development activities; to hold institutions of higher education accountable for preparing teachers who have the necessary skills and are highly competent in the academic content areas in which they plan to teach; and to recruit highly qualified individuals, including individuals from other occupations, into the teaching force.
Given its analysis of the objectives and requirements of the law, the committee concludes that:
It is reasonable to hold teacher education institutions accountable for the quality of teacher preparation programs.
By their design and as currently used, initial teacher licensure tests fall short of the intended policy goals for their use as accountability tools and as levers for improving teacher preparation and licensing programs. The public reporting and accountability provisions of Title II may encourage erroneous conclusions about the quality of teacher preparation.
Although the percentage of graduates who pass initial licensure tests provides an entry point for evaluating an institution’s quality, simple comparisons among institutions based on their passing rates are difficult to interpret for many reasons. These include the fact that institutions have different educational missions and recruiting practices, their students have different entry-level qualifications, teacher education programs have different entry and exit testing requirements, and programs have different procedures for determining the institutional affiliations of their candidates. By themselves, passing rates on licensure tests do not provide adequate information on which to judge the quality of teacher education programs.
Simple comparisons of passing rates across states are misleading. Many states use different tests for initial licensure or set different passing scores on the same tests. States have different policies about when a test is given or what decisions it supports.
To fairly and accurately judge the quality of teacher education programs, federal and state officials need data on a wide variety of program characteristics from multiple sources. Other indicators of program quality might include assessment data for students in relation to course and program benchmarks, employer evaluations, and district or state evaluations of beginning teaching. Other indicators might include information on course requirements and course quality, measures of the amount and quality of field experiences, and evidence of opportunities to work with students with special learning needs and students with diverse backgrounds. Data on the qualifications of program faculty, the allocation of resources, and the adequacy of facilities might be considered. The qualifications of students at entry to teacher education programs also should be included.