National Academies Press: OpenBook

Early Childhood Assessment: Why, What, and How (2008)

Chapter: 11 Guidance on Outcomes and Assessments

Suggested Citation:"11 Guidance on Outcomes and Assessments." National Research Council. 2008. Early Childhood Assessment: Why, What, and How. Washington, DC: The National Academies Press. doi: 10.17226/12446.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

11 Guidance on Outcomes and Assessments

This report centers around two key principles. First, all assessments should be integrated into a larger coherent system of early childhood care and education that they are designed to support. This is not a new idea, but the committee is convinced that it bears repeating, because it is fundamental to worthwhile assessment. A system of early childhood care and education must have well-articulated goals and objectives, documented in standards, guidelines, and frameworks, that can inform the design and implementation of early care and education programs. The same set of goals should drive all assessment of whether the objectives are being met (by programs, by teachers, and by children). This supports the coherence necessary for an effective system.

Second, and also a key point not new in this report, the purposes for assessment must be clearly articulated before the assessment is designed, developed, selected, or implemented. Different purposes require different types of assessments, and an assessment designed for one purpose should never be converted to another without careful consideration of its appropriateness to the new purpose. This is really an extension of the first principle, but it is especially important for building trust among the people and organizations involved in an assessment effort. Poorly articulated purposes and assessments used for inappropriate purposes can lead to decisions that are unfair or unclear, and they may do harm to programs, teachers, and, most importantly, children.

In this chapter, we present a set of guidelines that should be useful to a broad range of organizations charged with the assessment of children and of programs providing care and education to young children. These guidelines are organized around the major themes of the report and flow from the perspective that any assessment decision should be made in the context of a larger, coherent assessment system, which is in turn embedded in a network of medical, educational, and family support systems designed to ensure optimal development for all children. Thus, though we briefly recap our rationale, based on our review of the literature, and present our guidelines following the order of topics in the volume, we hope the reader interprets our discussion of purposes, targets, and procedures for assessment as different specific topics subordinated to the notion of an assessment system. In compliance with our charge, we have also included a section presenting a recommended agenda for research on the assessment of young children, following the detailed guidelines.

These guidelines should be useful to anyone contemplating the selection or implementation of an assessment for young children, including medical and educational service providers, classroom practitioners, federal, state, and local governments and private agencies operating or regulating child care and early childhood education programs, and those interested in expanding the knowledge base about child development and the conditions of childhood. To make our guidance more pointed and practical, the chapter ends with a list of high-priority actions by members of specific groups engaged in the assessment of young children, which can be taken quickly and should provide maximum payoffs.

Purposes and Uses of Assessment

Rationale

In recent years, the purposes for which young children are being assessed have expanded, with more children being assessed than ever before. Young children have been assessed to screen for and identify possible developmental problems for many years, but with advances in knowledge and new technologies the number of potential problems for which screening can be done has increased. The use of assessment to plan and guide instruction with young children also has been a recognized purpose of assessment for many years but has received more attention lately, as it has become widely acknowledged as a key component of a high-quality early childhood program. Making decisions about early childhood programs is a purpose for assessment for which an increasing number of children are being assessed lately, and for which even more children are likely to be assessed in the future. These decisions can be the result of a program evaluation or as part of ongoing accountability procedures. This last area has generated much discussion because of the technical challenges involved and because of the potential for misuse of assessment information.

Despite the greatly increased amount of assessment in which young children are engaged, it is not always clear why assessments are undertaken or what rationale exists for the form of assessment selected. Assessments are often chosen and used that do not match their purpose well. The process of developing any assessment system involving young children needs to begin with a clearly articulated statement of purpose. Clearly thinking through the purpose involves defining the question the assessment process is designed to answer, as well as defining in advance how the information to be collected will be used. The problem of mismatch between assessment purpose and assessment use is evidenced in several ways:

• Assessments designed and developed for one purpose are adopted for different purposes, without consideration of the match of information generated to the goal or to the validity of inferences with the novel use.
  Whoever selects the assessment instrument should consider the goal and seek an instrument with proven validity when used for that goal. If such an instrument does not exist, then firm conclusions cannot be drawn.
• There are not many tools designed for large-scale program evaluation, so tools designed for other purposes often are adapted (e.g., shortened or administered differently) out of necessity, without sufficiently investigating the validity of the adapted tools in their new form and for their new purpose.
• There is considerable worry in the field that an absence of the funding needed to develop effective measures is driving people to use simple, unaligned, poorly developed measures or to use well-developed psychometrically sound measures to assess constructs for which they are not well designed.

Purposes for assessment range widely, and some measures can be used for more than one purpose. Child-focused assessments can be used for child-specific purposes, such as screening and diagnosis, as well as for program monitoring and improvement purposes or for program evaluation. Similarly, with care, classroom quality assessments can be used for purposes of program monitoring, as formative input to guide program decisions, as an outcome in program evaluations, or in order to serve as moderating or mediating variables in predicting child outcomes in research. Nonetheless, not all instruments are appropriate for all purposes, and those selecting an assessment need to review the purposes for which it was designed to determine if it can be appropriately used for their intended purpose.

It is not uncommon that inferences about program effectiveness are based on end-of-program performance of individual children. Such inferences are inappropriate without attention to the environments children experience both inside and outside the program, as well as to the characteristics at entry of the children served by the program. In the systems perspective we adopt, child performance should be viewed developmentally, and the complexity of factors influencing child performance or growth in any particular domain should be understood.
Threats to the validity of inferences about program effectiveness that are based purely on child performance are reduced if measures reflect child progress rather than just end-of-program status, as well as if direct indicators of quality in the environment are also collected. Of course, information from these various sources about program effectiveness then also needs to be contextualized in information about resources (funding, longevity, administrative support, professional development) available to the program before it could possibly justify any decisions about restructuring or defunding.

There is a responsibility to articulate the purpose of any assessment in a responsible way to those who participate and who might be influenced by outcomes. For example, if a program is being evaluated, program staff should understand whether there are plans to use the assessments to evaluate their performance on an individual level. They should also know whether the information will be made available to guide decisions about the program and individual children.

Consequences of assessment vary. Ideally, of course, assessment information benefits children by providing information that can be used to inform their caregivers, to improve the quality of their care and education environments, and to identify child risk factors that could be remediated. Particularly in assessing young children, care is needed to ensure that they are not negatively affected (unintentionally frightened or made to feel incompetent) by the process of assessment, and that the value of the information gathered through assessment outweighs any negative effects (e.g., time taken away from instruction, disruption of normal routine, boredom or disengagement with the tasks, decisions that may negatively affect them).

Guidelines on Purposes of Assessment

(P-1) Public and private entities undertaking the assessment of young children should make the purposes of assessment explicit and public.

(P-2) The assessment strategy (which assessments to use, how often to administer them, how long they should be, how the domain of items or children or programs should be sampled) should match the stated purpose and require the minimum amount of time to obtain valid results for that purpose.
Even assessments that do not directly involve children, such as classroom observations, teacher rating forms, and collection of work products, impose a burden on adults and will require advance planning for using the information.

(P-3) Those charged with selecting assessments need to weigh options carefully, considering the appropriateness of candidate assessments for the desired purpose and for use with all the subgroups of children to be included. Although the same measure may be used for more than one purpose, prior consideration of all potential purposes is essential, as is careful analysis of the actual content of the assessment instrument. Direct examination of the assessment items is important because the title of a measure does not always reflect the content.

Domains and Measures of Developmental Outcomes

Rationale

During infancy and toddlerhood in particular, frequently assessed domains include those implicated by the agenda of screening for medical, developmental, or environmental risk. Across the entire preschool period, a critical issue is what aspect of young children’s skills or behavior to measure. Research on the developing child has traditionally conceived of development as proceeding in different domains, for example, language or motor or socioemotional development. These distinctions have served science well and are helpful for assessment purposes, but in reality the distinctions among children’s skills and behaviors are somewhat artificial and not as clear-cut as the organization of research or assessment tools would suggest. Developmental domains are intertwined, especially in the very young child, making it challenging or even impossible to interpret measures in some domains without also measuring the influence of others.

Health, socioemotional functioning, and cognitive functioning are closely interconnected in infancy, as for example when sleeping difficulties affect both socioemotional and cognitive functioning. For somewhat older preschoolers, the domains may be more readily differentiated operationally and theoretically, but they remain interdependent; for example, socioemotional (e.g., capacity to regulate negative emotion) and cognitive measures are interrelated and appear to have linked neural bases.
Nevertheless, a conceptualization is needed that identifies the areas of development society wants to track and that programs and services for young children are trying to impact. Convergent sources of information suggest that five major domains of child functioning recur in discussions of development during the preschool period. Following the usage established by the National Education Goals Panel (1995) on school readiness, we use the following terms to describe them:

1. physical well-being and motor development,
2. social and emotional development,
3. approaches toward learning,
4. language development (including emergent literacy), and
5. cognition and general knowledge (including mathematics and science).

These domains are themselves at different levels of development in defining the constructs they encompass and in the range and sophistication of the associated measures, and they differ as well in the amount of attention they get in policies for young children. It is relatively easy to converge on a set of general domains, but disagreement is common when specifics are needed. Social and emotional development, for example, encompasses emotion labeling in some assessments, but not others. Attentiveness is classified as social/emotional in some assessments, but under approaches toward learning in others. Also, the operationalization of the larger constructs evolves over time; fitness as an aspect of physical well-being, for example, is only recently emerging as a focus of policy attention in the preschool period, and it is not widely included in state standards. For the domains of social and emotional development and approaches to learning and for the subdomain of fitness, this is a period of active measures development, including both direct assessment and further work on parent and teacher reports. While important work in these areas is under way, both measures development and consensus about key constructs remain less advanced than for such subdomains as language, literacy, and mathematics.
Some domains important to many parents and perhaps to others are minimally represented in standards, research, or assessment (such as art, music, morality). Those concerned with promoting good outcomes for children differ in their beliefs about what domains are most important, as evidenced by the variation among states’ early learning standards and the focus on basic skills in the federal program Good Start, Grow Smart. Furthermore, a policy focus on a domain is likely to generate pressures to develop associated measures, which in turn increases the likelihood that the domain will be included in subsequent assessment activities.

One basis for identifying particular domains as outcomes worthy of being tracked in young children is the values of parents, educators, policy makers, and traditional forces in society; these forces are clearly historical, and thus the basis may need to be expanded as the composition of society changes. Another is predictive data that show relationships to school achievement or other important long-term outcomes (e.g., staying out of the juvenile justice system); these, too, represent relationships to traditionally valued outcomes, but as the goals of education change, they, too, might need to be adjusted. Evidence is not available about the relative relevance of the domains currently emphasized in assessment systems to groups increasing their representation in the society rather than those traditionally most numerous.

Although domains are an easy way to think about outcomes, they may not be the right approach for all purposes. A notable example is assessment of children with disabilities, for whom the recommended practice is to write functional rather than domain-based outcomes on individualized service plans (e.g., dressing oneself, participating in family mealtime). To support this emphasis in service provision, the Office of Special Education Programs in the U.S. Department of Education adopted three functional outcomes for national accountability reporting on programs serving children from birth to age 3 and ages 3 through 5 with delays and disabilities.
Guidelines on Domains and Measures of Developmental Outcomes

(D-1) Domains included when assessing child outcomes and the quality of education programs should be expanded beyond those traditionally emphasized (language, literacy, and mathematics) to include others, such as affect, interpersonal interaction, and opportunities for self-expression.

(D-2) Support is needed to develop measures of approaches to learning and socioemotional functioning, as well as other currently neglected domains, such as art, music, creativity, and interpersonal skills.

(D-3) Studies of the child outcomes of greatest importance to parents, including those from ethnic minority and immigrant groups, are needed to ensure that assessment instruments are available for domains (and thinking about domains) emphasized in different cultural perspectives, for example, proficiency in the native language as well as in English.

(D-4) For children with disabilities and special needs, domain-based assessments may need to be replaced or supplemented with more functional approaches.

(D-5) Selecting domains to assess requires first establishing the purposes of the assessment, then deciding which of the various possible domains dictated by the purposes can best be assessed using available instruments of proven reliability and validity, and considering what the costs will be of omitting domains from the assessment system (e.g., reduction of their importance in the eyes of practitioners or parents).

Selecting and Implementing Assessments

Rationale

A wide array of instruments and approaches can be used to collect information about young children and their environments, ranging from interviews with caregivers to ratings of child performance by caregivers or observers, to observations in naturally occurring or structured settings to direct assessments. Assessments of any type must be selected and implemented with care, but special attention is needed when using direct assessments with young children.
It requires greater attention to establishing a relationship with the child, to ascertaining whether the task is familiar and comprehensible to him or her, to limiting length of the session and the child’s discomfort, to recognizing the role of conditions like hunger or fatigue, and to recognizing the possibility of bias if the tester is a caregiver or otherwise connected to the child.

Instruments that have the most user-appeal often do not have the best psychometric properties. For example, portfolios of children’s artistic productions contain rich information but are hard to rate reliably. In the experience of committee members, selection of instruments is often more influenced by cost, by ease of administration, and by use in other equivalent programs than by the criteria proposed here.

Those charged with selecting assessment instruments need to carefully review the information provided in the instrument’s technical manual. Although test publishers may provide extensive psychometric information about their products, additional evidence beyond that provided in manuals should also be considered in instrument selection. Those selecting assessments should be familiar with the assessment standards contained in the standards document produced by the American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (1999). Important questions to ask are: Has this assessment been developed and validated for the purpose for which it is being considered? If a norm-referenced measure is being considered, has the assessment been normed with children like those with whom it will be used? For example, if the assessment is to be used as part of a program evaluation with minority children, were like children included in the development studies, including any norming studies?

There is typically more robust evidence for inferences based on early childhood measures when used for normally developing, white, English-speaking children than for children from ethnic or language minorities or children with disabilities. Validity evidence is quite sparse for these special groups on most extant measures.
Conducting valid assessments with language-minority children and children with special needs is especially challenging, and the reader is referred to Part III for more discussion of these topics. As explained in Chapter 7, one cannot say that measurement instruments either possess or lack validity; rather, inferences from the use of particular measurements for particular purposes may be supported or not supported by validity evidence.

There are many special considerations when using existing assessments for language-minority and cultural-minority children and children with disabilities. Key issues for children learning English include whether to assess in both the child’s home language and English and in what order the assessments should occur. If the child’s primary caregivers intend to raise the child bilingually, or if the early care and education setting is intentionally bilingual, then assessing the child in both languages reflects both the goals and the context of development. Typically, a young child should be assessed first in the higher proficiency language, if that is known.

Information of importance in drawing inferences about young children’s functioning can be derived from many sources: collection of children’s work products (drawings or stories told), observation of the child in natural settings while engaged in a task or while interacting with peers, interviews with and surveys of parents and teachers, and direct child assessments. Each of these assessment modes has its own strengths and potential pitfalls. For example, work products are highly informative, but selecting equivalent “performances” across children is difficult. Teacher ratings reflect the ability to compare across children, but they are subject to bias if collected in circumstances in which there may be serious consequences for the teachers. Parent reports are based on rich knowledge of the child, but they are subject to social desirability biases. Observational measures provide information about real-world functioning, but they have to be contextualized in an understanding of how typical the observed behavior is. Direct assessments often provide information about norms or criteria for performance, but they can generate misleading results if the child being tested is shy, unfamiliar with the tester, or resistant to direction.
Implementing a state-level early childhood assessment system is a relatively new process for any state that has undertaken it. States have approached this task in different ways, with some making decisions that would be supported by research and recommended practice and others making decisions that would not. There is enormous variation across settings in the care with which decisions about early childhood assessments are made. New Jersey, for example, has developed effective assessment decision processes, which were described in Chapter 9.

Guidelines on Instrument Selection and Implementation

(I-1) Selection of a tool or instrument should always include careful attention to its psychometric properties.
(a) Assessment tools should be chosen that have been shown to have acceptable levels of validity and reliability evidence for the purposes for which they will be used and the populations that will be assessed.
(b) Those charged with implementing assessment systems need to make sure that procedures are in place to examine validity data as part of instrument selection and then to examine the data being produced with the instrument to ensure that the scores being generated are valid for the purposes for which they are being used.
(c) Test developers and others need to collect and make available evidence about the validity of inferences for language and cultural minority groups and for children with disabilities.
(d) Program directors, policy makers, and others who select instruments for assessments should receive instruction in how to select and use assessment instruments.
(I-2) Assessments should not be given without clear plans for follow-up steps that use the information productively and appropriately.
(I-3) When assessments are carried out, primary caregivers should be informed in advance about their purposes and focus. When assessments are for screening purposes, primary caregivers should be informed promptly about the results, in particular whether they indicate a need for further diagnostic assessment.
(I-4) Pediatricians, primary medical caregivers, and other qualified personnel should screen for maternal or family factors that might impact child outcomes: child abuse risk, maternal depression, and other factors known to relate to later outcomes.
(I-5) Screening assessment should be done only when the available instruments are informative and have good predictive validity.

(I-6) Assessors, teachers, and program administrators should be able to articulate the purpose of assessments to parents and others.
(I-7) Assessors should be well trained to meet a clearly specified level of expertise in administering assessments, should be monitored systematically, and should be reevaluated occasionally. Teachers or program staff may administer assessments if they are carefully supervised and if reliability checks and monitoring are in place to ensure adherence to approved procedures.
(I-8) States or other groups selecting high-stakes assessments should leave an audit trail: a public record of the decision making that was part of the design and development of the assessment system. These decisions would include why the assessment data are being collected, why a particular set of outcomes was selected for assessment, why the particular tools were selected, how the results will be reported and to whom, as well as how the assessors were trained and the assessment process was monitored.
(I-9) For large-scale assessment systems, decisions regarding instrument selection or development for young children should be made by individuals with the requisite programmatic and technical knowledge and after careful consideration of a variety of factors, including existing research, recommended practice, and available resources. Given the broad-based knowledge needed to make such decisions wisely, they cannot be made by a single individual or by fiat in legislation. Policy and legislation should allow for the adoption of new instruments as they are developed and validated.
(I-10) Assessment tools should be constructed and selected for use in accordance with principles of universal design, so they will be accessible to, valid, and appropriate for the greatest possible number of children. Children with disabilities may still need accommodations, but this need should be minimized.
(I-11) Extreme caution needs to be exercised in reaching conclusions about the status, progress, and effectiveness of programs serving young children with special needs, children from language-minority homes, and other children from groups not well represented in norming or validation samples, until more information about assessment use is available and better measures are developed.

The Assessment System

Rationale

In its use of the term “system,” the committee intends that:
• the assessment system and assessing within that system be seen as part of the larger structure of early childhood care and education, including child outcomes as well as program standards, constructs, measures, indicators, decision making, and follow-up;
• selection of assessments be intimately linked to goals defined in the context of that larger system;
• procedures for sharing information about and using information from assessments be considered as part of the process of selecting and administering assessments; and
• different parts of the assessment system itself (standards, constructs, measures, indicators) work together.

Many, if not most, early childhood assessment programs currently in use lack elements of well-integrated systems. At least some partially integrated systems exist and constitute models for how to design assessment systems (in New Zealand and in New Jersey, for example), but there are many barriers to doing this universally. The knowledge and resources are available to do a better job of integrating information from a range of early child outcome and program assessments into efforts to improve the quality of services to young children.

Assessments that are not integrated into well-designed systems often are ill-suited to the purposes to which they are put, are not well aligned with program standards and goals, and do not contribute as they should to the improvement of children’s learning and development. Evaluation and accountability are separate goals; integrating them takes explicit planning and great care to avoid potential risks to children, teachers, and programs.

Good, systematic use of assessment for program improvement, evaluation, or accountability purposes implies integrating information from a range of assessments focusing on different elements of the system. It implies as well a procedure for providing assistance in addressing problems with classroom resources and challenges, as well as for providing resources and support (including corrective action) before the imposition of any negative consequences for teachers or programs. Furthermore, any use of assessment for drawing conclusions about program effectiveness requires meeting the criteria for a systematic and coherent approach to assessment-based decision making. These criteria include:
1. Clearly articulated purpose for the testing. Identification of why particular assessments were selected in relation to the purpose.
2. Clear connection between the assessment results and quality of care.
3. Observation of quality of instruction and definition of what would need to be focused on for improvement.
4. Clear plan for following up to improve program quality.
5. Careful decisions about how to achieve the purposes of the assessment while minimizing the assessment burden, e.g., by sampling children, domains, or items.
6. Careful decisions about how to balance standardizing the administration of direct assessments with threats to optimal test performance because of unnaturalness or nonresponsiveness.

The NRC Committee on High Stakes (National Research Council, 1999) articulated a list of criteria that had to be met before any high stakes were imposed on students or on schools; that committee’s work was primarily relevant to the K-12 system, but its general tone of caution about using assessments to make crucial decisions is equally applicable to the early childhood years.
For example, its report concluded that educational decisions about individual children should never be based on a single test, that different kinds and sources of information about child performance were needed. Similarly, high-stakes decisions about teachers and programs should never be based on a single source of information. Information from child assessments should be contextualized in an understanding of the children’s care and education environments, as well as child-specific factors (fatigue, hunger, illness) that may undermine the validity of inferences. Information about program effectiveness should be contextualized in an understanding of the resources and supports available. Decisions about teacher effectiveness cannot be validly based on information from child performance or even from direct observation without also knowing about access to resources, to professional development, to mental health consultation, to supervision, and so on.

Lack of further investigation and follow-up to screening are violations of good practice. There are high stakes associated with being identified for retention or special services, but also with failure to identify those with possible problems and follow up with appropriate in-depth diagnostic assessment and, when appropriate, services.

Doing accountability well for early childhood programs is expensive and occurs only if the accountability work is funded as part of funding the program. As states invest increasingly in prekindergarten programs, it is important to recognize that building in a process of accountability takes thoughtful planning as well as resources. There are models of carefully designed accountability processes built into a few such programs.

Guidelines on Systems

(S-1) An effective early childhood assessment system must be part of a larger system with a strong infrastructure to support children’s care and education. The infrastructure is the foundation on which the assessment system rests and is critical to its smooth and effective functioning.
The infrastructure should encompass several components that together form the system:
(a) Standards: A comprehensive, well-articulated set of standards for both program quality and children’s learning that are aligned to one another and that define the constructs of interest as well as child outcomes that demonstrate that the learning described in the standard has occurred.
(b) Assessments: Multiple approaches to documenting child development and learning and reviewing program quality that are of high quality and connect to one another in well-defined ways, from which strategic selection can be made depending on specific purposes.
(c) Reporting: Maintenance of an integrated database of assessment instruments and results (with appropriate safeguards of confidentiality) that is accessible to potential users, that provides information about how the instruments and scores relate to standards, and that can generate reports for the varied audiences and purposes.
(d) Professional development: Ongoing opportunities provided to those at all levels (policy makers, program directors, assessment administrators, practitioners) to understand the standards and the assessments and to learn to use the data and data reports with integrity for their own purposes.
(e) Opportunity to learn: Procedures to assess whether the environments in which children are spending time offer high-quality support for development and learning, as well as safety, enjoyment, and affectively positive relationships, and to direct support to those that fall short.
(f) Inclusion: Methods and procedures for ensuring that all children served by the program will be assessed fairly, regardless of their language, culture, or disabilities, and with tools that provide useful information for fostering their development and learning.
(g) Resources: The assurance that the financial resources needed to ensure the development and implementation of the system components will be available.
(h) Monitoring and evaluation: Continuous monitoring of the system itself to ensure that it is operating effectively and that all elements are working together to serve the interests of the children.
This entire infrastructure must be in place to create and sustain an assessment subsystem within a larger system of early childhood care and education.
(S-2) A successful system of assessments must be coherent in a variety of ways. It should be horizontally coherent, with the curriculum, instruction, and assessment all aligned with the early learning and development standards and with the program standards, targeting the same goals for learning, and working together to support children’s developing knowledge and skill across all domains. It should be vertically coherent, with a shared understanding at all levels of the system of the goals for children’s learning and development that underlie the standards, as well as consensus about the purposes and uses of assessment. It should be developmentally coherent, taking into account what is known about how children’s skills and understanding develop over time and the content knowledge, abilities, and understanding that are needed for learning to progress at each stage of the process. The California Desired Results Developmental Profile provides an example of movement toward a multiply coherent system. These coherences drive the design of all the subsystems. For example, the development of early learning standards, curriculum, and the design of teaching practices and assessments should be guided by the same framework for understanding what is being attempted in the classroom that informs the training of beginning teachers and the continuing professional development of experienced teachers. The reporting of assessment results to parents, teachers, and other stakeholders should also be based on this same framework, as should the evaluations of effectiveness built into all systems. Each child should have an equivalent opportunity to achieve the defined goals, and the allocation of resources should reflect those goals.
(S-3) Following the best assessment practices is especially crucial in cases in which assessment can have significant consequences for children, teachers, or programs. The NRC report High Stakes: Testing for Tracking, Promotion, and Graduation (National Research Council, 1999) urged extreme caution in basing high-stakes decisions on assessment outcomes, and we conclude that even more extreme caution is needed when dealing with young children from birth to age 5 and with the early care and education system. We emphasize that a primary purpose of assessing children or classrooms is to improve the quality of early childhood care and education by identifying where more support, professional development, or funding is needed and by providing classroom personnel with tools to track children’s growth and adjust instruction.
(S-4) Accountability is another important purpose for assessment, especially when significant state or federal investments are made in early childhood programs. Program-level accountability should involve high stakes only under very well-defined conditions: (a) data about input factors are fully taken into account, (b) quality rating systems or other program quality information has been considered in conjunction with child measures, (c) the programs have been provided with all the supports needed to improve, and (d) it is clear that restructuring or shutting the program down will not have worse consequences for children than leaving it open. Similarly, high stakes for teachers should not be imposed on the basis of classroom functioning or child outcomes alone. Information about access to resources and support for teachers should be gathered and carefully considered in all such decisions, because sanctioning teachers for the failure of the system to support them is inappropriate.
(S-5) Performance (classroom-based) assessments of children can be used for accountability, if objectivity is ensured by checking a sample of the assessments for reliability and consistency, if the results are appropriately contextualized in information about the program, and if careful safeguards are in place to prevent misuse of information.
(S-6) Minimizing the burdens of assessment is an important goal; being clear about purpose and embedding any individual assessment decision into a larger system can limit the time and money invested in assessment.
(S-7) It is important to establish a common way of identifying children for services across the early care and education, family support, health, and welfare sectors.

(S-8) Implementing assessment procedures requires skilled administrators who have been carefully trained in the assessment procedures to be implemented; because direct assessments with young children can be particularly challenging, more training may be required for such assessments.
(S-9) Implementation of a system-level approach requires having services available to meet the needs of all children identified through screening, as well as requiring follow-up with more in-depth assessments.
(S-10) If services are not available, it can be appropriate to use screening assessments and then use the results to argue for expansion of services. Failure to screen when services are not available may lead to underestimation of the need for services.

Research Agenda

Among the tasks of the committee was the development of a research agenda to improve the quality and suitability of developmental assessment, across a wide array of purposes and for the benefit of all the various children who will eventually be entering kindergarten. References to the need for research on assessment tools and the building of an assessment system, distributed throughout this volume, especially in connection with concerns about the adequacy of current instruments and processes, are gathered together here. These recommendations relate specifically to research needs in connection with assessment tools and the building of an assessment system, the committee’s specific charge. However, research related to assessment is dependent on continued support for other basic research in child development (especially as related to children of cultural and linguistic minorities), family functioning, effective programming for children and families, and community supports.
The research base that can guide the development of assessments is based on theories of learning that are also evolving (see National Research Council, 2006); it would be short-sighted to proceed as though everything needed to do this well is already known. The relationship between assessment tools and knowledge of child development is highly intertwined. Advances in knowledge will proceed in tandem with advances in assessment because a primary way that researchers learn about what children know and can do or how one area of development relates to another involves administering the currently available assessment tools. As assessment tools improve, the knowledge base will expand; at the same time, innovations in assessment will emerge from the expanding knowledge base.

Given the current state of assessment tools and how much more understanding is needed about the development of young children, especially those from other cultures or who speak other languages, it is imperative that both the strengths and limitations of any given set of assessments be acknowledged. Because very young children are at even greater risk than older ones of bad consequences resulting from the misuse of assessment, great care must be taken not to impose the incomplete understandings in the K-12 system on this more vulnerable population (National Research Council and Institute of Medicine, 2000).

Instrument Development

The various assessments available for use with young children have their origins in a variety of theoretical frameworks and purposes. Some were developed many years ago and thus do not incorporate what is now known about development and learning. Principles of assessment development and psychometric theory also have advanced in recent years and these are not reflected in older tools. Assessment development is a lengthy and resource-intensive process, but it is critically important that it be undertaken. Assessments are used to make a variety of decisions about young children, including screening, diagnosis, and instructional planning. With the emergence of more programs for young children and the need for accountability for those resources, assessment will become even more widespread.
The quality of the assessment tools must match the various demands being placed on them, and that requires an investment in research on the development of new techniques.

Basic Considerations About Assessment

The field presently lacks conceptual frameworks and the measures necessary to move this research forward to systematically improve children’s learning. Preliminary research on the role of context in learning suggests that environmental factors can increase children’s engagement and participation (Christenson, 2004; Goldenberg, Rueda, and August, 2006), which in turn can lead to increased learning, and that the influence of contextual contingencies on learning outcomes is mediated by children’s motivation to learn (Rueda, 2007; Rueda and Yaden, 2006; Rueda et al., 2001). Meaningful empirical work in this area will require the convergence of research methods (e.g., multilevel statistics and the mixing of qualitative approaches with experimental and quasi-experimental designs) and social science disciplines (e.g., cognitive psychology, educational anthropology, the sociology of education).

Conceptual and empirical research on child assessment is needed to move beyond the individual level to understand that processes outside the individual, in the classroom (e.g., teacher-child interactions, peer-to-peer interactions), the home (e.g., frequency of words spoken, number of books), and the school (e.g., language instruction policies), affect learning.

Research is needed to apply the latest technical advances, such as item response theory, to assessment development, to ensure that assessments are providing good measurement for all children. Most direct assessment tools and observation methods are developed conceptually, without sufficient attention to ensuring adequate measurement at all ranges of the scale and for children from diverse backgrounds.

Development research is needed on assessments that span a broader age range, ideally from birth to ages 6 or 7.
Assessments with a broader age span are needed for research to allow children’s learning and development to be tracked longitudinally, through the transition into the primary grades. They also are important for program continuity, as children move from one early childhood classroom to the next, and for relating children’s learning to early learning guidelines. Finally, for children with developmental delays, assessments that span the entire early childhood period allow growth to be tracked on the same assessment, even if children are performing significantly below their age peers.

Recently developed tools for examining social emotional development need further work to generate evidence about their reliability, validity, and sensitivity to intervention approaches. More work is needed to develop key constructs within the domain of approaches to learning, as well as tools to measure those constructs and their role in children’s learning and development. The shortcomings of current measures, especially standardized norm-referenced measures for young children and those with special needs, have been extensively documented, yet it is precisely these kinds of measures that are often employed in large-scale data collections. New measures are needed that accurately capture children’s growth toward being able to meaningfully participate in the variety of settings that make up their day-to-day lives.

Research is needed on how to effectively use technology in all forms of early childhood assessment. Some assessments currently provide for online entry of data and computerized scoring and automatic report generation, but more work is needed. More research is needed on the use of computer adaptive procedures for establishing floor and ceiling levels, to allow more in-depth assessment at the child’s current performance level. Computer-adaptive assessment could be applicable to both direct and observation-based measures.

For the Improvement of Screening

Research is needed to validate screening tools for the full range of children represented in early childhood programs. There is a need to continue to collect information on who currently conducts screenings, including consideration of the barriers working against more widespread screening.
There is a need for information on how many are screened, fail the screen, receive follow-up testing, and receive treatment or intervention based on whether a problem is verified. (Newborn hearing screening data are a model for this; the dismal results on measures of follow-up have become clear only because the data were systematically collected.)

For the Improvement of Diagnostic Tests

More information is needed on the validity of currently available tools to identify the presence of a developmental delay or atypical development (Are the right children being identified?). Tools are needed for identifying developmental delay in children from other cultures and those who are speakers of other languages.

For the Improvement of Observation-Based and Curriculum-Based Child Measures

More research is needed in the use of authentic assessment tools for program evaluation and accountability, including consideration of what level of training (and retraining) is necessary to ensure that teachers reliably administer the assessment initially and over time, whether the use of observation-based tools in an accountability system leads to inflated scores or otherwise reduces its usefulness in the classroom, and what level of monitoring and supervision is required to ensure that the assessment is administered consistently.

Information is needed about how to train teachers efficiently and effectively in the administration and use of curriculum-based assessments. Further work is needed to determine whether psychometric methods to address differences in how teachers use rating scales need to be routinely applied when these approaches are used for evaluation. There is a need for research on the impact on practice of ongoing assessment in the classroom, on the barriers to effective implementation and use of ongoing assessment, and on the use of progress monitoring for ensuring that all children are receiving appropriate instruction.

Assessment Processes

Response to Intervention

Much more research and model development are needed on the application of response to intervention (RTI) to identification and service delivery in early childhood, especially as it relates to developmentally appropriate practice. These questions are critical:
• How can RTI be applied effectively to preschoolers?
• Will it allow for the earlier identification and intervention for children with learning problems?
• What types of assessment tools are needed to apply RTI to early childhood?
• Can these tools be used to plan instruction?
• How can screening for RTI be integrated with ongoing assessment for instructional planning? (The Institute of Education Sciences will be funding an RTI center for preschoolers; proposals are now under review.)

Research is needed on the types of tools and types of information most useful to teachers for ongoing assessment.

Child Outcomes and Program Quality Standards

Research is needed on tools and processes to tap children’s knowledge and skill in such domains as art, music, creativity, science, and ethics. There is need for consistent definitions and measures for key constructs in early social and emotional competencies, self-regulation, and the absence of serious behavior problems. Parallel work is needed to establish their relationship to early participation in learning activities and to academic achievement. Further research is needed to identify fruitful domain structures and optimal content and formats for early learning standards to serve as a model for states as they revise initial work. Research should continue to identify program quality elements that strengthen child outcomes.

Use of Assessment Tools and Processes with Special Populations

Addressing Bias

Little work has been done to address the effect of bias in the assessment process for young children; such work is hampered by disagreement about what constitutes bias and how it operates with different populations. Research on how to address these issues is needed to be able to move forward. More work is needed to explore the influence of sampling and norming in reducing bias. More work is needed to understand the effects of the examiner, rater, or the testing situation on all children, but especially on populations subject to bias. Work is needed to expand the universal design characteristics of extant testing instruments, to make them optimally useful for all children, including children with special needs and children from cultural and language minorities, and to consider universal design characteristics in the development of new instruments. Work is needed on the functionality of various instruments with different populations (e.g., for minority and nonminority children) in different settings (e.g., in a Head Start program and a private, for-profit preschool program).

English Language Learners

Research is needed to develop psychometrically sound native language assessments for English language learners (ELLs). This will require the expertise of several disciplines, including linguistics, cognitive psychology, education, and psychometrics. Further empirical research is needed to evaluate the reliability and validity of traditional cognitive measures for English language learners and intelligence tests developed for specific ELL populations. For English language learners, empirical research is needed to inform decisions about which accommodations to use, for whom, and under what conditions.

There is a need for ongoing implementation research in the area of professional development and training for assessing young English language learners.
This research needs to identify the substance of professional development to improve staff competencies necessary to work as a part of a professional team; inform how staff works with interpreters; guide how to choose and administer appropriate assessment batteries; and train practitioners to develop their competence in second language acquisition, acculturation, and the evaluation of educational interventions.

More research documenting the current scenarios for the assessment of young ELLs across the country is needed, including more work to evaluate assessment practices in various localities; survey research and observational approaches to document practices in preassessment and assessment planning, conducting the assessment, analyzing and interpreting the results, reporting the results (in written and oral format), and determining eligibility and monitoring; and a focus on the development of strategies to train professionals with the skills necessary to serve young ELL children.

Research is needed to develop assessment tools normed especially for young English language learners using a bottom-up approach, so that assessment tools, procedures, and constructs assessed are aligned with cultural and linguistic characteristics of ELL children.

Children with Special Needs

More research is needed on what the various practitioners who assess young children with special needs (early interventionists, special education teachers, speech therapists, psychologists, etc.) actually do. More research is needed on the use of accommodations with children with disabilities. What are appropriate guidelines for decision making about what kind of accommodations to use with what kind of child under what conditions? Research is needed on the impact of accommodations on the validity of the assessment results.

Accountability and Program Quality

There is a need for the development of assessment instruments designed for the purpose of accountability and program evaluation. Instruments that are developed for federal studies such as the Early Childhood Longitudinal Study, Kindergarten-First Grade Waves (ECLS-K) or national studies of Head Start should become publicly available, so they can be used by others.
There is a need for research on the implementation of account- ability systems and the tracking of positive and negative conse- quences at all levels of the system:

• How strong is the research base for the accountability system? What is the impact on practice? Is that impact in line with what could be reasonably expected from the prior research?
• Does the system have the intended impact?
• Are there any negative consequences of the accountability system (e.g., narrowing of the curriculum, exclusion of high-risk children)?
• If data are meant to improve programs or direct allocation of resources, does this happen?
• How familiar are teachers and child care providers with the purpose of a program evaluation or a state accountability system?
• How does information need to be packaged to ensure it is understood by program administrators, teachers or child care providers, and parents?

There is a need for a compilation of experiences with different measures for accountability purposes. What are we learning about which measures or types of measures work well?

There is a need for research on the development of accountability standards for types of information reported about assessments and accountability for early childhood programs.

Increased consideration of and research on system-level effects of various assessment approaches are needed. Detailed case studies of coherent comprehensive assessment systems serving well-integrated systems of child care and education should be developed to serve as models for programs, districts, and states attempting to develop such systems.

There is a need for research on the overall validity and consequences of particular approaches, such as:

• Direct assessment with sampling: Where this has been used, what have been the program-level impacts?
• If data are provided at the center but not the classroom level, does this create negative reactions?
• What level of training and follow-up monitoring is required to ensure the assessment is administered consistently?

Different reporting formats should be evaluated with usability studies to determine which are best understood and most likely to be used accurately by typical audiences.

Putting Guidance into Practice

We conclude this volume by addressing our most urgent advice to the most likely agents. Different agents almost inevitably have somewhat different purposes for assessment, as well as different responsibilities and different levels of control. Here we attempt to clarify what actions can be taken by each of the major agents to implement the guidelines we have provided. In this way we hope to jumpstart actions to improve the care and education of young children.

Pediatricians and Primary Health Care Personnel

• Pediatricians and health care personnel should be aware of the full range of information sources useful in screening children for developmental and medical risk; those responsible for the education of such professionals should include such information in medical training and in-service training.
• Health care personnel should use effective strategies to convey information to parents and other caregivers of infants and children to whom they administer assessments.
• Health care personnel should be aware of the educational implications of the risks they might identify through screening assessment, in order to help guide the search for services.
• Health care providers need to be aware of the resources available in the community, such as Individuals with Disabilities Education Act Part C early intervention and preschool special education programs for children who are in need of additional developmental assessment and services.

Classroom Teachers in Early Childhood Settings

• Teachers should work with colleagues and coaches/professional development personnel/program administrators to select or devise and implement formative (classroom- or curriculum-based) assessments to guide their own teaching.
• If assessment information of any kind is collected in the classroom, the teacher should be fully informed about why the assessment is being conducted and for what purposes the data will be used. The teacher should be able to explain the purpose, process, and results of the assessment to parents.
• Teachers should seek information about the psychometric properties of any assessments being used with children, in particular for direct assessments, and exercise caution in using direct assessment results from assessments with low reliability or tests not normed on children like those in their classrooms.
• Teachers should make sure that they understand the meaning of children’s scores, both in relative terms (who is scoring highest, lowest in the class) and in relation to standards or expectations (who is scoring at or below expected levels for the age) if age-based norms are available.
• Teachers should work with colleagues and coaches/professional development personnel/program administrators to determine the best ways of sharing information about child performance with primary caregivers, and encourage the program they work in to be systematic about sharing assessment findings with parents.
• If the information collected as part of formal assessments (for program evaluation purposes, for example) ignores important domains, teachers should seek out ways to collect and record supplementary information on their own group of children. For example, if only early mathematical and literacy skills are formally tested, teachers should be systematic about collecting a wider array of developmental indicators, e.g., by using systematic observations during peer play sessions to collect information about children’s socioemotional development, asking the child to select artistic products for placement in an art portfolio, or taking 90 seconds to elicit and write down a story from each child to reflect oral language skills.

Early Childhood Program Administrators

• Program administrators should support their classroom personnel in selecting or developing and implementing formative (classroom- or curriculum-based) assessments to guide their own teaching.
• Program administrators should ensure that they are fully informed about any assessment information of any kind being collected in their program by external agents. It is their responsibility (and their right) to know why the assessment is being conducted and for what purposes the data will be used.
• Program administrators should seek information about the psychometric properties of any assessments being used with their children, in particular direct assessments, and exercise caution in using direct assessment results from assessments with low reliability or tests not normed on children like those in their program.
• Program administrators should make sure they understand the meaning of children’s scores, what they say both about how children in the program are progressing and whether they are meeting age-based or standards-based expectations.
• Program administrators should work to ensure that their own level of assessment literacy is appropriate to the types of assessment taking place in their classroom. They should promote the assessment literacy of their staff through professional development opportunities.
• Program administrators should work with the practitioners in their program to establish and practice the best ways of sharing information about child performance with primary caregivers and ensure that the program is systematic about sharing assessment findings.

• If the information collected as part of formal assessments ignores important domains, program administrators should encourage their staff to find assessments that cover the other domains or collect and record supplementary information on their own.
• Program administrators should make systematic observations of classrooms to assess the quality of teaching and the social context, using their own or an available measure, and use the findings to coach and provide professional support for teachers.
• If no information related to the effectiveness of the program is being collected by external agencies, program administrators should undertake their own regular systematic evaluation of the program and use the results to improve its overall effectiveness. The evaluation should include data on program quality (e.g., features of the classrooms, teacher-child interaction) and assessments that document the progress being made by children in the program.

District, State, and Federal Officials with Responsibility for Early Childhood Programs

• Officials should ensure the availability of professional development to support program personnel in interpreting and using information from assessments and in selecting or developing formative (classroom- or curriculum-based) assessments to guide their own continual improvement.
• Officials should be clear about the purposes for which they are recommending or mandating assessments and ensure that the assessments and assessment strategies recommended or mandated fit those purposes well.
• When selecting or developing assessment instruments or strategies for use with any purpose in their programs, officials should maintain a record (audit trail) of the decisions made and the factors that influenced those decisions.
• Officials should ensure that the psychometric properties of any direct tests they select or develop are adequate, both in general and in particular for children like those being served in their programs.
• Officials should build funding and planning for progress monitoring and evaluation into the budgets for program implementation.
• Officials should consider the larger system when making specific decisions about assessment. They should select assessment instruments that are aligned with standards and that complement one another in the kinds of information they provide, plan in advance for informing program personnel about the nature and the purposes of the assessments, and plan in advance how the information generated will be shared and used.
• Officials should reexamine regularly the standards to which their assessments are aligned, the domains that are included in their assessment system, and the degree of coherence (horizontal, vertical, and developmental) across the assessment system and early childhood care and education structure.
• Officials should become informed about the risks associated with assessing young children.
• Officials should not make high-stakes decisions for children or for programs unless a number of criteria have been met. These criteria include:
  1. A clearly articulated purpose for the testing.
  2. Identification of why particular assessments were selected in relation to the purpose.
  3. A clear connection between the assessment results and quality of care.
  4. Observation of quality of instruction and definition of what would need to be focused on for improvement.
  5. A clear plan for following up to improve program quality.
  6. Careful decisions about how to achieve the purposes of the assessment while minimizing the assessment burden, for example by sampling children, domains, or items.
  7. Careful decisions about how to balance standardizing the administration of direct assessments with threats to optimal test performance because of unnaturalness or nonresponsiveness.

Researchers

• Researchers should work with early childhood practitioners and programs to learn about the full array of child outcomes of interest to them, to analyze the adequacy of the extant array of assessment instruments, to improve existing assessment procedures, and to develop assessment procedures for understudied or poorly instrumented domains.
• Researchers should work to expand the universal design characteristics of extant testing instruments, to make them optimally useful for all children, including those with disabilities and cultural and language minority children.
• Researchers should study the development of linguistic and cultural minority children in order to inform the development of assessments that would adequately reflect their capacities.
• Researchers should develop detailed case studies of coherent comprehensive assessment systems serving well-integrated systems of child care and education, to serve as models for programs, districts, and states attempting to develop such systems.

Conclusion

Writing a report about assessment, especially about assessment in early childhood, almost inevitably has to anticipate two quite different audiences. A significant proportion of the audience will start reading the report armed with a negative view of the idea of assessing preschoolers, alert to the complexities of assessing young children in ways that are informative and reliable, aware that testing can produce stress or discomfort in the child, and worried that the full array of skills and capacities the child has is unlikely to be represented. This portion of the audience will be integrating the new information in the report with assessment horror stories: children who were identified as low IQ when in fact they were second language learners, programs that were threatened with loss of funding because the children in them failed to meet some external standard even though they had progressed enormously, programs subjected to evaluation using tests of capacities that had not been included in the curriculum. Such readers will be particularly sensitive to the notion that child assessment might be included as a basis for program accountability.

Another large portion of the audience will filter the information in a report like this through a generally much more positive view of assessment in early (and later) childhood. These readers are thinking of the value to parents of the procedures for screening infants to identify those who need services. They would cite the value to taxpayers of evaluating early childhood programs to ensure they are of high quality and the value to practitioners, to parents, and to children of having both progress monitoring and formative assessments available to support program improvement. They would cite standards and associated assessments as levers for program improvement, as well as the need to hold publicly financed programs accountable for meeting their goals of providing young children with supportive and stimulating environments. They would point out how much has been learned about child development from assessment, and how much more we need to know.

Of course, quite a lot of the readers, like many members of this committee, constitute a third group: those who understand the opportunities that well-thought-out and effective assessment offers to inform teaching and program improvement, but who are simultaneously acutely aware that poor practices abound even in the face of the best information about how to do better. Representing the views of this latter group, this report attempts to take neither a positive nor a negative view of assessment, although we recognize the credibility of specific claims on both sides of the controversy.
The committee members represent the full range of gut feelings about assessment. Some of us, reading early drafts of these chapters, wrote comments suggesting that more warnings and cautions were needed, whereas others wrote comments indicating that the view of assessment presented was much too bleak, that the value of assessment in educational improvement needed to be more robustly emphasized. We conclude, not that the very positive or the very negative views are wrong, but that both are correct and that both are limited. The final version of the report, thus, explicitly does not take the position that assessment is here to stay and we’d better learn to live with it. Rather, it takes the position that assessments can make crucial contributions to the improvement of children’s well-being, but only if they are well designed, implemented effectively and in the context of systematic planning, and interpreted and used appropriately. Otherwise, assessment of children and programs can have negative consequences for both. We conclude that the value of assessments themselves cannot be judged without attention to the design of the larger systems in which they are used.

Early Childhood Assessment: Why, What, and How

The assessment of young children's development and learning has recently taken on new importance. Private and government organizations are developing programs to enhance the school readiness of all young children, especially children from economically disadvantaged homes and communities and children with special needs.

Well-planned and effective assessment can inform teaching and program improvement, and contribute to better outcomes for children. This book affirms that assessments can make crucial contributions to the improvement of children's well-being, but only if they are well designed, implemented effectively, developed in the context of systematic planning, and interpreted and used appropriately. Otherwise, assessment of children and programs can have negative consequences for both. The value of assessments therefore requires fundamental attention to their purpose and the design of the larger systems in which they are used.

Early Childhood Assessment addresses these issues by identifying the important outcomes for children from birth to age 5 and the quality and purposes of different techniques and instruments for developmental assessments.
