11
Guidance on Outcomes and Assessments

This report centers around two key principles. First, all assessments should be integrated into a larger coherent system of early childhood care and education that they are designed to support. This is not a new idea, but the committee is convinced that it bears repeating, because it is fundamental to worthwhile assessment. A system of early childhood care and education must have well-articulated goals and objectives, documented in standards, guidelines, and frameworks, that can inform the design and implementation of early care and education programs. The same set of goals should drive all assessment of whether the objectives are being met—by programs, by teachers, and by children. This supports the coherence necessary for an effective system.

Second, and also a key point not new in this report, the purposes for assessment must be clearly articulated before the assessment is designed, developed, selected, or implemented. Different purposes require different types of assessments, and an assessment designed for one purpose should never be converted to another without careful consideration of its appropriateness to the new purpose. This is really an extension of the first principle, but it is especially important for building trust among the people and organizations involved in an assessment effort. Poorly articulated purposes and assessments used for inappropriate purposes
can lead to decisions that are unfair or unclear, and they may do harm to programs, teachers, and, most importantly, children.

In this chapter, we present a set of guidelines that should be useful to a broad range of organizations charged with the assessment of children and of programs providing care and education to young children. These guidelines are organized around the major themes of the report and flow from the perspective that any assessment decision should be made in the context of a larger, coherent assessment system, which is in turn embedded in a network of medical, educational, and family support systems designed to ensure optimal development for all children. Thus, though we briefly recap our rationale, based on our review of the literature, and present our guidelines following the order of topics in the volume, we hope the reader interprets our discussion of purposes, targets, and procedures for assessment as different specific topics subordinated to the notion of an assessment system.

In compliance with our charge, we have also included a section presenting a recommended agenda for research on the assessment of young children, following the detailed guidelines. These guidelines should be useful to anyone contemplating the selection or implementation of an assessment for young children, including medical and educational service providers, classroom practitioners, federal, state, and local governments and private agencies operating or regulating child care and early childhood education programs, and those interested in expanding the knowledge base about child development and the conditions of childhood. To make our guidance more pointed and practical, the chapter ends with a list of high-priority actions by members of specific groups engaged in the assessment of young children, which can be taken quickly and should provide maximum payoffs.

PURPOSES AND USES OF ASSESSMENT

Rationale

In recent years, the purposes for which young children are being assessed have expanded, with more children being assessed than ever before. Young children have been assessed to screen for and identify possible developmental problems for many years, but with advances in knowledge and new technologies the number of potential problems for which screening can be done has increased. The use of assessment to plan and guide instruction with young children also has been a recognized purpose of assessment for many years but has received more attention lately, as it has become widely acknowledged as a key component of a high-quality early childhood program. Making decisions about early childhood programs is a purpose for assessment for which an increasing number of children are being assessed lately, and for which even more children are likely to be assessed in the future. These decisions can be the result of a program evaluation or as part of ongoing accountability procedures. This last area has generated much discussion because of the technical challenges involved and because of the potential for misuse of assessment information.

Despite the greatly increased amount of assessment in which young children are engaged, it is not always clear why assessments are undertaken or what rationale exists for the form of assessment selected. Assessments are often chosen and used that do not match their purpose well. The process of developing any assessment system involving young children needs to begin with a clearly articulated statement of purpose. Clearly thinking through the purpose involves defining the question the assessment process is designed to answer, as well as defining in advance how the information to be collected will be used.

The problem of mismatch between assessment purpose and assessment use is evidenced in several ways:

• Assessments designed and developed for one purpose are adopted for different purposes, without consideration of the match of information generated to the goal or to the validity of inferences with the novel use.
Whoever selects the assessment instrument should consider the goal and seek an instrument with proven validity when used for that goal. If such an instrument does not exist, then firm conclusions cannot be drawn.
• There are not many tools designed for large-scale program evaluation, so tools designed for other purposes often are adapted (e.g., shortened or administered differently) out of necessity, without sufficiently investigating the validity of the adapted tools in their new form and for their new purpose.
• There is considerable worry in the field that an absence of the funding needed to develop effective measures is driving people to use simple, unaligned, poorly developed measures or to use well-developed psychometrically sound measures to assess constructs for which they are not well designed.

Purposes for assessment range widely, and some measures can be used for more than one purpose. Child-focused assessments can be used for child-specific purposes, such as screening and diagnosis, as well as for program monitoring and improvement purposes or for program evaluation. Similarly, with care, classroom quality assessments can be used for purposes of program monitoring, as formative input to guide program decisions, as an outcome in program evaluations, or in order to serve as moderating or mediating variables in predicting child outcomes in research. Nonetheless, not all instruments are appropriate for all purposes, and those selecting an assessment need to review the purposes for which it was designed to determine if it can be appropriately used for their intended purpose.

It is not uncommon that inferences about program effectiveness are based on end-of-program performance of individual children. Such inferences are inappropriate without attention to the environments children experience both inside and outside the program, as well as to the characteristics at entry of the children served by the program. In the systems perspective we adopt, child performance should be viewed developmentally, and the complexity of factors influencing child performance or growth in any particular domain should be understood.
Threats to the validity of inferences about program effectiveness that are based purely on child performance are reduced if measures reflect child progress rather than just end-of-program status, as well as if direct indicators of quality in the environment are also collected. Of course, information from these various sources about program effectiveness then also needs to be contextualized in information about resources (funding, longevity, administrative support, professional development) available to the program before it could possibly justify any decisions about restructuring or defunding.

There is a responsibility to articulate the purpose of any assessment in a responsible way to those who participate and who might be influenced by outcomes. For example, if a program is being evaluated, program staff should understand whether there are plans to use the assessments to evaluate their performance on an individual level. They should also know whether the information will be made available to guide decisions about the program and individual children.

Consequences of assessment vary. Ideally, of course, assessment information benefits children by providing information that can be used to inform their caregivers, to improve the quality of their care and education environments, and to identify child risk factors that could be remediated. Particularly in assessing young children, care is needed to ensure that they are not negatively affected (unintentionally frightened or made to feel incompetent) by the process of assessment, and that the value of the information gathered through assessment outweighs any negative effects (e.g., time taken away from instruction, disruption of normal routine, boredom or disengagement with the tasks, decisions that may negatively affect them).

Guidelines on Purposes of Assessment

(P-1) Public and private entities undertaking the assessment of young children should make the purposes of assessment explicit and public.

(P-2) The assessment strategy—which assessments to use, how often to administer them, how long they should be, how the domain of items or children or programs should be sampled—should match the stated purpose and require the minimum amount of time to obtain valid results for that purpose.
Even assessments that do not directly involve children, such as classroom observations, teacher rating forms, and collection of work products, impose a burden on adults and will require advance planning for using the information.

(P-3) Those charged with selecting assessments need to weigh options carefully, considering the appropriateness of candidate assessments for the desired purpose and for use with all the subgroups of children to be included. Although the same measure may be used for more than one purpose, prior consideration of all potential purposes is essential, as is careful analysis of the actual content of the assessment instrument. Direct examination of the assessment items is important because the title of a measure does not always reflect the content.

DOMAINS AND MEASURES OF DEVELOPMENTAL OUTCOMES

Rationale

During infancy and toddlerhood in particular, frequently assessed domains include those implicated by the agenda of screening for medical, developmental, or environmental risk. Across the entire preschool period, a critical issue is what aspect of young children’s skills or behavior to measure. Research on the developing child has traditionally conceived of development as proceeding in different domains, for example, language or motor or socioemotional development. These distinctions have served science well and are helpful for assessment purposes, but in reality the distinctions among children’s skills and behaviors are somewhat artificial and not as clear-cut as the organization of research or assessment tools would suggest.

Developmental domains are intertwined, especially in the very young child, making it challenging or even impossible to interpret measures in some domains without also measuring the influence of others. Health, socioemotional functioning and cognitive functioning are closely interconnected in infancy, as for example when sleeping difficulties affect both socioemotional and cognitive functioning. For somewhat older preschoolers, the domains may be more readily differentiated operationally and theoretically, but they remain interdependent; for example, socioemotional (e.g., capacity to regulate negative emotion) and cognitive measures are interrelated and appear to have linked neural bases.

Nevertheless, a conceptualization is needed that identifies the areas of development society wants to track and that programs and services for young children are trying to impact. Convergent sources of information suggest that five major domains of child functioning recur in discussions of development during the preschool period. Following the usage established by the National Education Goals Panel (1995) on school readiness, we use the following terms to describe them:

1. physical well-being and motor development,
2. social and emotional development,
3. approaches toward learning,
4. language development (including emergent literacy), and
5. cognition and general knowledge (including mathematics and science).

These domains are themselves at different levels of development in defining the constructs they encompass and in the range and sophistication of the associated measures, and they differ as well in the amount of attention they get in policies for young children. It is relatively easy to converge on a set of general domains, but disagreement is common when specifics are needed. Social and emotional development, for example, encompasses emotion labeling in some assessments, but not others. Attentiveness is classified as social/emotional in some assessments, but under approaches toward learning in others. Also, the operationalization of the larger constructs evolves over time; fitness as an aspect of physical well-being, for example, is only recently emerging as a focus of policy attention in the preschool period, and it is not widely included in state standards. For the domains of social and emotional development and approaches to learning and for the subdomain of fitness, this is a period of active measures development, including both direct assessment and further work on parent and teacher reports. While important work in these areas is under way, both measures development and consensus about key constructs remain less advanced than for such subdomains as language, literacy, and mathematics.

Some domains important to many parents and perhaps to others are minimally represented in standards, research, or assessment—such as art, music, morality. Those concerned with promoting good outcomes for children differ in their beliefs about what domains are most important, as evidenced by the variation among states’ early learning standards and the focus on basic skills in the federal program Good Start, Grow Smart. Furthermore, a policy focus on a domain is likely to generate pressures to develop associated measures, which in turn increases the likelihood that the domain will be included in subsequent assessment activities.

One basis for identifying particular domains as outcomes worthy of being tracked in young children is the values of parents, educators, policy makers, and traditional forces in society; these forces are clearly historical, and thus the basis may need to be expanded as the composition of society changes. Another is predictive data that show relationships to school achievement or other important long-term outcomes (e.g., staying out of the juvenile justice system); these, too, represent relationships to traditionally valued outcomes, but as the goals of education change, they, too, might need to be adjusted. Evidence is not available about the relative relevance of the domains currently emphasized in assessment systems to groups increasing their representation in the society rather than those traditionally most numerous.

Although domains are an easy way to think about outcomes, they may not be the right approach for all purposes. A notable example is assessment of children with disabilities, for whom the recommended practice is to write functional rather than domain-based outcomes on individualized service plans (e.g., dressing oneself, participating in family mealtime). To support this emphasis in service provision, the Office of Special Education Programs in the U.S. Department of Education adopted three functional outcomes for national accountability reporting on programs serving children from birth to age 3 and ages 3 through 5 with delays and disabilities.

Guidelines on Domains and Measures of Developmental Outcomes

(D-1) Domains included when assessing child outcomes and the quality of education programs should be expanded beyond those traditionally emphasized (language, literacy, and mathematics) to include others, such as affect, interpersonal interaction, and opportunities for self-expression.

(D-2) Support is needed to develop measures of approaches to learning and socioemotional functioning, as well as other currently neglected domains, such as art, music, creativity, and interpersonal skills.

(D-3) Studies of the child outcomes of greatest importance to parents, including those from ethnic minority and immigrant groups, are needed to ensure that assessment instruments are available for domains (and thinking about domains) emphasized in different cultural perspectives, for example, proficiency in the native language as well as in English.

(D-4) For children with disabilities and special needs, domain-based assessments may need to be replaced or supplemented with more functional approaches.

(D-5) Selecting domains to assess requires first establishing the purposes of the assessment, then deciding which of the various possible domains dictated by the purposes can best be assessed using available instruments of proven reliability and validity, and considering what the costs will be of omitting domains from the assessment system (e.g., reduction of their importance in the eyes of practitioners or parents).

SELECTING AND IMPLEMENTING ASSESSMENTS

Rationale

A wide array of instruments and approaches can be used to collect information about young children and their environments, ranging from interviews with caregivers to ratings of child performance by caregivers or observers, to observations in naturally occurring or structured settings to direct assessments. Assessments of any type must be selected and implemented with care, but special attention is needed when using direct assessments with young children.
It requires greater attention to establishing a relationship with the child, to ascertaining whether the task is familiar and comprehensible to him or her, to limiting length of the session and the child’s discomfort, to recognizing the role of conditions like hunger or fatigue, and to recognizing the possibility of bias if the tester is a caregiver or otherwise connected to the child.

Instruments that have the most user-appeal often do not have the best psychometric properties. For example, portfolios of children’s artistic productions contain rich information but are hard to rate reliably. In the experience of committee members, selection of instruments is often more influenced by cost, by ease of administration, and by use in other equivalent programs than by the criteria proposed here.

Those charged with selecting assessment instruments need to carefully review the information provided in the instrument’s technical manual. Although test publishers may provide extensive psychometric information about their products, additional evidence beyond that provided in manuals should also be considered in instrument selection. Those selecting assessments should be familiar with the assessment standards contained in the standards document produced by the American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (1999). Important questions to ask are: Has this assessment been developed and validated for the purpose for which it is being considered? If a norm-referenced measure is being considered, has the assessment been normed with children like those with whom it will be used? For example, if the assessment is to be used as part of a program evaluation with minority children, were like children included in the development studies, including any norming studies?

There is typically more robust evidence for inferences based on early childhood measures when used for normally developing, white, English-speaking children than for children from ethnic or language minorities or children with disabilities. Validity evidence is quite sparse for these special groups on most extant measures.
Conducting valid assessments with language-minority children and children with special needs is especially challenging, and the reader is referred to Part III for more discussion of these topics. As explained in Chapter 7, one cannot say that measurement instruments either possess or lack validity; rather, inferences from the use of particular measurements for particular purposes may be supported or not supported by validity evidence.

There are many special considerations when using existing assessments for language-minority and cultural-minority children and children with disabilities. Key issues for children learning English include whether to assess in both the child’s home language and English and in what order the assessments should occur. If the child’s primary caregivers intend to raise the child bilingually, or if the early care and education setting is intentionally bilingual, then assessing the child in both languages reflects both the goals and the context of development. Typically, a young child should be assessed first in the higher proficiency language, if that is known.

Information of importance in drawing inferences about young children’s functioning can be derived from many sources: collection of children’s work products (drawings or stories told), observation of the child in natural settings while engaged in a task or while interacting with peers, interviews with and surveys of parents and teachers, and direct child assessments. Each of these assessment modes has its own strengths and potential pitfalls. For example, work products are highly informative, but selecting equivalent “performances” across children is difficult. Teacher ratings reflect the ability to compare across children, but they are subject to bias if collected in circumstances in which there may be serious consequences for the teachers. Parent reports are based on rich knowledge of the child, but they are subject to social desirability biases. Observational measures provide information about real-world functioning, but they have to be contextualized in an understanding of how typical the observed behavior is. Direct assessments often provide information about norms or criteria for performance, but they can generate misleading results if the child being tested is shy, unfamiliar with the tester, or resistant to direction.

Implementing a state-level early childhood assessment system is a relatively new process for any state that has undertaken it. States have approached this task in different ways, with some making decisions that would be supported by research and recommended practice and others making decisions that would not. There is enormous variation across settings in the care with which decisions about early childhood assessments are made. New Jersey, for example, has developed effective assessment decision processes, which were described in Chapter 9.

[...] assessment process for young children; such work is hampered by disagreement about what constitutes bias and how it operates with different populations. Research on how to address these issues is needed to be able to move forward.

More work is needed to explore the influence of sampling and norming in reducing bias.

More work is needed to understand the effects of the examiner, rater, or the testing situation on all children, but especially on populations subject to bias.

Work is needed to expand the universal design characteristics of extant testing instruments, to make them optimally useful for all children, including children with special needs and children from cultural and language minorities, and to consider universal design characteristics in the development of new instruments.

Work is needed on the functionality of various instruments with different populations (e.g., for minority and nonminority children) in different settings (e.g., in a Head Start program and a private, for-profit, preschool program).

English Language Learners

Research is needed to develop psychometrically sound native language assessments for English language learners (ELLs). This will require the expertise of several disciplines, including linguistics, cognitive psychology, education, and psychometrics.

Further empirical research is needed to evaluate the reliability and validity of traditional cognitive measures for English language learners and intelligence tests developed for specific ELL populations.

For English language learners, empirical research is needed to inform decisions about which accommodations to use, for whom, and under what conditions.

There is a need for ongoing implementation research in the area of professional development and training for assessing young English language learners.
This research needs to identify the substance of professional development to improve staff competencies necessary to work as a part of a professional team; inform how staff works with interpreters; guide how to choose and administer appropriate assessment batteries; and train practitioners to develop their competence in second language acquisition, acculturation, and the evaluation of educational interventions.

More research documenting the current scenarios for the assessment of young ELLs across the country is needed, including more work to evaluate assessment practices in various localities; survey research and observational approaches to document practices in preassessment and assessment planning, conducting the assessment, analyzing and interpreting the results, reporting the results (in written and oral format), and determining eligibility and monitoring; and a focus on the development of strategies to train professionals with the skills necessary to serve young ELL children.

Research is needed to develop assessment tools normed especially for young English language learners using a bottom-up approach, so that assessment tools, procedures, and constructs assessed are aligned with cultural and linguistic characteristics of ELL children.

Children with Special Needs

More research is needed on what the various practitioners who assess young children with special needs—early interventionists, special education teachers, speech therapists, psychologists, etc.—actually do.

More research is needed on the use of accommodations with children with disabilities. What are appropriate guidelines for decision making about what kind of accommodations to use with what kind of child under what conditions? Research is needed on the impact of accommodations on the validity of the assessment results.

Accountability and Program Quality

There is a need for the development of assessment instruments designed for the purpose of accountability and program evaluation. Instruments that are developed for federal studies such as the Early Childhood Longitudinal Study, Kindergarten-First Grade Waves (ECLS-K) or national studies of Head Start should become publicly available, so they can be used by others.

There is a need for research on the implementation of accountability systems and the tracking of positive and negative consequences at all levels of the system:

• How strong is the research base for the accountability system? What is the impact on practice? Is that impact in line with what could be reasonably expected from the prior research?
• Does the system have the intended impact?
• Are there any negative consequences of the accountability system (e.g., narrowing of the curriculum, exclusion of high-risk children)?
• If data are meant to improve programs or direct allocation of resources, does this happen?
• How familiar are teachers and child care providers with the purpose of a program evaluation or a state accountability system?
• How does information need to be packaged to ensure it is understood by program administrators, teachers or child care providers, and parents?

There is a need for a compilation of experiences with different measures for accountability purposes. What are we learning about which measures or types of measures work well?

There is a need for research on the development of accountability standards for types of information reported about assessments and accountability for early childhood programs.

Increased consideration of and research on system-level effects of various assessment approaches are needed. Detailed case studies of coherent comprehensive assessment systems serving well-integrated systems of child care and education should be developed to serve as models for programs, districts, and states attempting to develop such systems.

There is a need for research on the overall validity and consequences of particular approaches, such as:

• Direct assessment with sampling—Where this has been used, what have been the program-level impacts?
• If data are provided at the center but not the classroom level, does this create negative reactions?
• What level of training and follow-up monitoring is required to ensure the assessment is administered consistently?

Different reporting formats should be evaluated with usability studies to determine which are best understood and most likely to be used accurately by typical audiences.

PUTTING GUIDANCE INTO PRACTICE

We conclude this volume by addressing our most urgent advice to the most likely agents. Different agents almost inevitably have somewhat different purposes for assessment, as well as different responsibilities and different levels of control. Here we attempt to clarify what actions can be taken by each of the major agents to implement the guidelines we have provided. In this way we hope to jumpstart actions to improve the care and education of young children.

Pediatricians and Primary Health Care Personnel

• Pediatricians and health care personnel should be aware of the full range of information sources useful in screening children for developmental and medical risk; those responsible for the education of such professionals should include such information in medical training and in-service training.
• Health care personnel should use effective strategies to convey information to parents and other caregivers of infants and children to whom they administer assessments.
• Health care personnel should be aware of the educational implications of the risks they might identify through screening assessment, in order to help guide the search for services.
• Health care providers need to be aware of the resources available in the community, such as Individuals with Disabilities Education Act Part C early intervention and preschool special education programs for children who are in need of additional developmental assessment and services.

Classroom Teachers in Early Childhood Settings

• Teachers should work with colleagues and coaches/professional development personnel/program administrators to select or devise and implement formative (classroom- or curriculum-based) assessments to guide their own teaching.
• If assessment information of any kind is collected in the classroom, the teacher should be fully informed about why the assessment is being conducted and for what purposes the data will be used. The teacher should be able to explain the purpose, process, and results of the assessment to parents.
• Teachers should seek information about the psychometric properties of any assessments being used with children, in particular for direct assessments, and exercise caution in using direct assessment results from assessments with low reliability or tests not normed on children like those in their classrooms.
• Teachers should make sure that they understand the meaning of children’s scores, both in relative terms (who is scoring highest, lowest in the class) and in relation to standards or expectations (who is scoring at or below expected levels for the age) if age-based norms are available.
• Teachers should work with colleagues and coaches/professional development personnel/program administrators to determine the best ways of sharing information about child performance with primary caregivers, and encourage the program they work in to be systematic about sharing assessment findings with parents.
• If the information collected as part of formal assessments (for program evaluation purposes, for example) ignores important domains, teachers should seek out ways to collect and record supplementary information on their own group of children.
For example, if only early mathematical and literacy skills are formally tested, teachers should be systematic about collecting a wider array of developmental indicators, e.g., by using systematic observations during peer play sessions to collect information about children’s
socioemotional development, asking the child to select artistic products for placement in an art portfolio, taking 90 seconds to elicit and write down a story from each child to reflect oral language skills, or in other ways being systematic about collecting a wider array of developmental indicators.

Early Childhood Program Administrators

• Program administrators should support their classroom personnel in selecting or developing and implementing formative (classroom- or curriculum-based) assessments to guide their own teaching.
• Program administrators should ensure that they are fully informed about assessment information of any kind being collected in their program by external agents. It is their responsibility (and their right) to know why the assessment is being conducted and for what purposes the data will be used.
• Program administrators should seek information about the psychometric properties of any assessments being used with their children, in particular direct assessments, and exercise caution in using direct results from assessments with low reliability or tests not normed on children like those in their program.
• Program administrators should make sure they understand the meaning of children’s scores, what they say both about how children in the program are progressing and whether they are meeting age-based or standards-based expectations.
• Program administrators should work to ensure that their own level of assessment literacy is appropriate to the types of assessment taking place in their classroom. They should promote the assessment literacy of their staff through professional development opportunities.
• Program administrators should work with the practitioners in their program to establish and practice the best ways of sharing information about child performance with primary caregivers and ensure that the program is systematic about sharing assessment findings.
• If the information collected as part of formal assessments ignores important domains, program administrators should encourage their staff to find assessments that cover the other domains or collect and record supplementary information on their own.
• Program administrators should make systematic observations of classrooms to assess the quality of teaching and the social context, using their own or an available measure, and use the findings to coach and provide professional support for teachers.
• If no information related to the effectiveness of the program is being collected by external agencies, program administrators should undertake their own regular systematic evaluation of the program and use the results to improve its overall effectiveness. The evaluation should include data on program quality (e.g., features of the classrooms, teacher-child interaction) and assessments that document the progress being made by children in the program.

District, State, and Federal Officials with Responsibility for Early Childhood Programs

• Officials should ensure the availability of professional development to support program personnel in interpreting and using information from assessments and in selecting or developing formative (classroom- or curriculum-based) assessments to guide their own continual improvement.
• Officials should be clear about the purposes for which they are recommending or mandating assessments and ensure that the assessments and assessment strategies recommended or mandated fit those purposes well.
• When selecting or developing assessment instruments or strategies for use with any purpose in their programs, officials should maintain a record (audit trail) of the decisions made and the factors that influenced those decisions.
• Officials should ensure that the psychometric properties of any direct tests they select or develop are adequate, both
in general and in particular for children like those being served in their programs.
• Officials should build funding and planning for progress monitoring and evaluation into the budgets for program implementation.
• Officials should consider the larger system when making specific decisions about assessment. They should select assessment instruments that are aligned with standards and that complement one another in the kinds of information they provide, plan in advance for informing program personnel about the nature and the purposes of the assessments, and plan in advance how the information generated will be shared and used.
• Officials should reexamine regularly the standards to which their assessments are aligned, the domains that are included in their assessment system, and the degree of coherence (horizontal, vertical, and developmental) across the assessment system and early childhood care and education structure.
• Officials should become informed about the risks associated with assessing young children.
• Officials should not make high-stakes decisions for children or for programs unless a number of criteria have been met. These criteria include:
1. A clearly articulated purpose for the testing.
2. Identification of why particular assessments were selected in relation to the purpose.
3. A clear connection between the assessment results and quality of care.
4. Observation of quality of instruction and definition of what would need to be focused on for improvement.
5. A clear plan for following up to improve program quality.
6. Careful decisions about how to achieve the purposes of the assessment while minimizing the assessment burden, for example by sampling children, domains, or items.
7. Careful decisions about how to balance standardizing the administration of direct assessments with threats to
optimal test performance because of unnaturalness or nonresponsiveness.

Researchers

• Researchers should work with early childhood practitioners and programs to learn about the full array of child outcomes of interest to them, to analyze the adequacy of the extant array of assessment instruments, to improve existing assessment procedures, and to develop assessment procedures for understudied or poorly instrumented domains.
• Researchers should work to expand the universal design characteristics of extant testing instruments, to make them optimally useful for all children, including those with disabilities and cultural and language minority children.
• Researchers should study the development of linguistic and cultural minority children in order to inform the development of assessments that would adequately reflect their capacities.
• Researchers should develop detailed case studies of coherent comprehensive assessment systems serving well-integrated systems of child care and education, to serve as models for programs, districts, and states attempting to develop such systems.

CONCLUSION

Writing a report about assessment, especially about assessment in early childhood, almost inevitably has to anticipate two quite different audiences. A significant proportion of the audience will start reading the report armed with a negative view of the idea of assessing preschoolers, alert to the complexities of assessing young children in ways that are informative and reliable, aware that testing can produce stress or discomfort in the child, and worried that the full array of skills and capacities the child has is unlikely to be represented. This portion of the audience will be integrating the new information in the report with assessment horror stories: children who were identified as low IQ when in fact they were second language learners, programs that were threatened
with loss of funding because the children in them failed to meet some external standard even though they had progressed enormously, programs subjected to evaluation using tests of capacities that had not been included in the curriculum. Such readers will be particularly sensitive to the notion that child assessment might be included as a basis for program accountability.

Another large portion of the audience will filter the information in a report like this through a generally much more positive view of assessment in early (and later) childhood. These readers are thinking of the value to parents of the procedures for screening infants to identify those who need services. They would cite the value to taxpayers of evaluating early childhood programs to ensure they are of high quality and the value to practitioners, to parents, and to children of having both progress monitoring and formative assessments available to support program improvement. They would cite standards and associated assessments as levers for program improvement, as well as the need to hold publicly financed programs accountable for meeting their goals of providing young children with supportive and stimulating environments. They would point out how much has been learned about child development from assessment, and how much more we need to know.

Of course, quite a lot of the readers, like many members of this committee, constitute a third group: those who understand the opportunities that well-thought-out and effective assessment offers to inform teaching and program improvement, but who are simultaneously acutely aware that poor practices abound even in the face of the best information about how to do better. Representing the views of this latter group, this report attempts to take neither a positive nor a negative view of assessment, although we recognize the credibility of specific claims on both sides of the controversy.
The committee members represent the full range of gut feelings about assessment. Some of us, reading early drafts of these chapters, wrote comments suggesting that more warnings and cautions were needed, whereas others wrote comments indicating that the view of assessment presented was much too bleak, that the value of assessment in educational improvement needed to be more robustly emphasized. We conclude, not that the very positive or the very negative views are wrong, but that
both are correct and that both are limited. The final version of the report, thus, explicitly does not take the position that assessment is here to stay and we’d better learn to live with it. Rather, it takes the position that assessments can make crucial contributions to the improvement of children’s well-being, but only if they are well designed, implemented effectively and in the context of systematic planning, and interpreted and used appropriately. Otherwise, assessment of children and programs can have negative consequences for both. We conclude that the value of assessments themselves cannot be judged without attention to the design of the larger systems in which they are used.