Summary
The assessment of young children's development and learning has recently taken on new importance. Private and government organizations are developing programs to enhance the school readiness of all young children, especially children from economically disadvantaged homes and communities and children with special needs. These programs are designed to enhance social, language, and academic skills through responsive early care and education. In addition, they constitute a site where children with developmental problems can be identified and receive appropriate interventions. Societal and government initiatives have also promoted accountability for these educational programs, especially those that are publicly funded. These initiatives focus on promoting standards of learning and monitoring children's progress in meeting those standards. In this atmosphere, Congress has enacted such laws as the Government Performance and Results Act and the No Child Left Behind Act. School systems and government agencies are asked to set goals, track progress, analyze strengths and weaknesses in programs, and report on their achievements, with consequences for unmet goals. Likewise, early childhood education and intervention programs are increasingly being asked to prove their worth.
EARLY CHILDHOOD ASSESSMENT
In 2006, Congress requested that the National Research Council (NRC) conduct a study of developmental outcomes and appropriate assessment of young children. With funding from the Office of Head Start in the U.S. Department of Health and Human Services, the specific charge to this committee was the identification of important outcomes for children from birth to age 5 and the quality and purposes of different techniques and instruments for developmental assessments. The committee's review highlights two key principles. First, the purpose of an assessment should guide assessment decisions. Second, assessment activity should be conducted within a coherent system of medical, educational, and family support services that promote optimal development for all children. Our focus on the need for purposefulness and systematicity is particularly important at this time, because young children are currently being assessed for a wide array of purposes, across a wide array of domains, and in multiple service settings. The increase in the amount of assessment raises understandable worries about whether assessments are selected, implemented, and interpreted correctly. Assessments of children may be used for purposes as diverse as determining the level of functioning of individual children, guiding instruction, or measuring functioning at the program, community, or state level. Different purposes require different types of assessments, and the evidentiary base that supports the use of an assessment for one purpose may not be suitable for another. As the consequences of assessment findings become weightier, the accuracy and quality of the instruments used to provide findings must be more certain.
Decisions based on an assessment that is used to monitor the progress of one child can be important to that child and her family and thus must be taken with caution, but they can also be challenged and revisited more easily than assessments used to determine the fate or funding for groups of children, such as those attending a local child care center, an early education program, or a nationwide program like Head Start. When used for purposes of program evaluation and accountability, often called high stakes,* assessments can have major consequences for large numbers of children and families, for the community served by the program, and for policy. If decisions about individual children or about programs are to be defended, the system of assessment must reflect the highest standards of evidence in three domains: the psychometric properties of the instruments used in the assessment system; the evidence supporting the appropriateness of the assessment instruments for different ethnic, racial, language, functional status, and age group populations; and the domains that serve as the focus of the assessment. In addition, resources need to be directed to the training of assessors, the analysis and reporting of results, and the interpretation of those results. Such attention is especially warranted when making decisions about whether programs will continue to be funded by tax monies.
* We have adopted the following definition of high-stakes assessment (see Appendix A): Tests and/or assessment processes for which the results lead to significant sanctions or rewards for children, their teachers, administrators, schools/programs, and/or school systems. Sanctions may be direct (e.g., retention in grade for children, reassignment for teachers, reorganization for schools) or unintended (e.g., narrowing of the curriculum, increased dropping out).
The purpose and system principles apply as well to the interpretation, use, and communication of assessment data. Collecting data should be preceded by planning how the data will be used, who should have access to them, in what decisions they will play a role, and what stakeholders need to know about them. Ideally, any assessment activity benefits children by providing information that can be used to inform their caregivers and teachers, to improve the quality of their care and educational environments, and to identify child risk factors that can be remedied. But assessments may also have adverse consequences. Direct assessments may make children feel anxious, incompetent, or bored, and indirect assessments may constitute a burden on adults. An assessment activity may also deflect time and resources from instruction, and assessments cost money. It is therefore important to ensure that the value of the information gathered through assessments outweighs any negative effects on adults or children and that it merits the investment of resources.
Purposeful and systematic assessment requires decisions about what to assess. In this study, the committee focuses on five domains that build on the school readiness work of the National Education Goals Panel (1995):
1. physical well-being and motor development,
2. social and emotional development,
3. approaches toward learning,
4. language development (including emergent literacy), and
5. cognition and general knowledge (including mathematics and science).
This list reflects state early learning standards, guidelines from organizations focused on the welfare of young children, and the status of available assessment instruments. The domains are not specific about many areas of potential interest to parents, to educators, and to society, such as art, music, creativity, prosocial behavior, and morality. Also, for some purposes and for some children, including infants and preschool children with disabilities, a functional rather than a domain-specific approach to assessment may be appropriate.
Once a purpose has been established and a set of domains selected, the next challenge is to identify the best assessment instrument; this may be one that is widely used, or an adaptation of a previously used instrument, or in some cases a newly developed instrument. The varied available approaches, which include conducting direct assessments, interviewing parents or teachers, observing children in natural or slightly structured settings, and analyzing their work, all constitute rich sources of information. Issues of psychometric adequacy, in particular the validity of the instrument chosen for all the subgroups of children to be considered, are paramount, for observational and interview instruments as well as direct assessments.
The remainder of this summary presents guidelines for assessment related to four issues: purposes, domains and measures, implementation, and systems. The summary concludes with key points for a future research agenda.
Guidelines on Purposes of Assessment
(P-1) Public and private entities undertaking the assessment of young children should make the purposes of assessment explicit and public.
(P-2) The assessment strategy (which assessments to use, how often to administer them, how long they should be, how the domain of items or children or programs should be sampled) should match the stated purpose and require the minimum amount of time to obtain valid results for that purpose. Even assessments that do not directly involve children, such as classroom observations, teacher rating forms, and collection of work products, impose a burden on adults and will require advance planning for using the information.
(P-3) Those charged with selecting assessments need to weigh options carefully, considering the appropriateness of candidate assessments for the desired purpose and for use with all the subgroups of children to be included. Although the same measure may be used for more than one purpose, prior consideration of all potential purposes is essential, as is careful analysis of the actual content of the assessment instrument. Direct examination of the assessment items is important because the title of a measure does not always reflect the content.
Guidelines on Domains and Measures of Developmental Outcomes
(D-1) Domains included when assessing child outcomes and the quality of education programs should be expanded beyond those traditionally emphasized (language, literacy, and mathematics) to include others, such as affect, interpersonal interaction, and opportunities for self-expression.
(D-2) Support is needed to develop measures of approaches to learning and socioemotional functioning, as well as other currently neglected domains, such as art, music, creativity, and interpersonal skills.
(D-3) Studies of the child outcomes of greatest importance to parents, including those from ethnic minority and immigrant groups, are needed to ensure that assessment instruments are available for domains (and thinking about domains) emphasized in different cultural perspectives, for example, proficiency in the native language as well as in English.
(D-4) For children with disabilities and special needs, domain-based assessments may need to be replaced or supplemented with more functional approaches.
(D-5) Selecting domains to assess requires first establishing the purposes of the assessment, then deciding which of the various possible domains dictated by the purposes can best be assessed using available instruments of proven reliability and validity, and considering what the costs will be of omitting domains from the assessment system (e.g., reduction of their importance in the eyes of practitioners or parents).
Guidelines on Instrument Selection and Implementation
(I-1) Selection of a tool or instrument should always include careful attention to its psychometric properties.
A. Assessment tools should be chosen that have been shown to have acceptable levels of validity and reliability evidence for the purposes for which they will be used and the populations that will be assessed.
B. Those charged with implementing assessment systems need to make sure that procedures are in place to examine validity data as part of instrument selection and then to examine the data being produced with the instrument to ensure that the scores being generated are valid for the purposes for which they are being used.
C. Test developers and others need to collect and make available evidence about the validity of inferences for language and cultural minority groups and for children with disabilities.
D. Program directors, policy makers, and others who select instruments for assessments should receive instruction in how to select and use assessment instruments.
(I-2) Assessments should not be given without clear plans for follow-up steps that use the information productively and appropriately.
(I-3) When assessments are carried out, primary caregivers should be informed in advance about their purposes and focus. When assessments are for screening purposes, primary caregivers should be informed promptly about the results, in particular whether they indicate a need for further diagnostic assessment.
(I-4) Pediatricians, primary medical caregivers, and other qualified personnel should screen for maternal or family factors that might impact child outcomes: child abuse risk, maternal depression, and other factors known to relate to later outcomes.
(I-5) Screening assessment should be done only when the available instruments are informative and have good predictive validity.
(I-6) Assessors, teachers, and program administrators should be able to articulate the purpose of assessments to parents and others.
(I-7) Assessors should be trained to meet a clearly specified level of expertise in administering assessments, should be monitored systematically, and should be reevaluated occasionally. Teachers or other program staff may administer assessments if they are carefully supervised and if reliability checks and monitoring are in place to ensure adherence to approved procedures.
(I-8) States or other groups selecting high-stakes assessments should leave an audit trail: a public record of the decision making that was part of the design and development of the assessment system. These decisions would include why the assessment data are being collected, why a particular set of outcomes was selected for assessment, why the particular tools were selected, how the results will be reported and to whom, as well as how the assessors were trained and the assessment process was monitored.
(I-9) For large-scale assessment systems, decisions regarding instrument selection or development for young children should be made by individuals with the requisite programmatic and technical knowledge and after careful consideration of a variety of factors, including existing research, recommended practice, and available resources. Given the broad-based knowledge needed to make such decisions wisely, they cannot be made by a single individual or by fiat in legislation. Policy and legislation should allow for the adoption of new instruments as they are developed and validated.
(I-10) Assessment tools should be constructed and selected for use in accordance with principles of universal design, so they will be accessible to, valid, and appropriate for the greatest possible number of children. Children with disabilities may still need accommodations, but this need should be minimized.
(I-11) Extreme caution needs to be exercised in reaching conclusions about the status and progress of, as well as the effectiveness of programs serving, young children with special needs, children from language-minority homes, and other children from groups not well represented in norming or validation samples, until more information about assessment use is available and better measures are developed.
Guidelines on Systems
(S-1) An effective early childhood assessment system must be part of a larger system with a strong infrastructure to support children's care and education. The infrastructure is the foundation on which the assessment system rests and is critical to its smooth and effective functioning. The infrastructure should encompass several components that together form the system:
A. Standards: A comprehensive, well-articulated set of standards for both program quality and children's learning that are aligned to one another and that define the constructs of interest as well as child outcomes that demonstrate that the learning described in the standard has occurred.
B. Assessments: Multiple approaches to documenting child development and learning and reviewing program quality that are of high quality and connect to one another in well-defined ways, from which strategic selection can be made depending on specific purposes.
C. Reporting: Maintenance of an integrated database of assessment instruments and results (with appropriate safeguards of confidentiality) that is accessible to potential users, that provides information about how the instruments and scores relate to standards, and that can generate reports for varied audiences and purposes.
D. Professional development: Ongoing opportunities provided to those at all levels (policy makers, program directors, assessment administrators, practitioners) to understand the standards and the assessments and to learn to use the data and data reports with integrity for their own purposes.
E. Opportunity to learn: Procedures to assess whether the environments in which children are spending time offer high-quality support for development and learning, as well as safety, enjoyment, and affectively positive relationships, and to direct support to those that fall short.
F. Inclusion: Methods and procedures for ensuring that all children served by the program will be assessed fairly, regardless of their language, culture, or disabilities, and with tools that provide useful information for fostering their development and learning.
G. Resources: The assurance that the financial resources needed to ensure the development and implementation of the system components will be available.
H. Monitoring and evaluation: Continuous monitoring of the system itself to ensure that it is operating effectively and that all elements are working together to serve the interests of the children.
This entire infrastructure must be in place to create and sustain an assessment subsystem within a larger system of early childhood care and education.
(S-2) A successful system of assessments must be coherent in a variety of ways.
It should be horizontally coherent, with the curriculum, instruction, and assessment all aligned with the early learning and development standards and with the program standards, targeting the same goals for learning, and working together to support children's developing knowledge and skill across all domains. It should be vertically coherent, with a shared understanding at all levels of the system of the goals for children's learning and development that underlie the standards, as well as consensus about the purposes and uses of assessment. It should be developmentally coherent, taking into account what is known about how children's skills and understanding develop over time and the content knowledge, abilities, and understanding that are needed for learning to progress at each stage of the process. The California Desired Results Developmental Profile provides an example of movement toward a multiply coherent system. These coherences drive the design of all the subsystems. For example, the development of early learning standards, curriculum, and the design of teaching practices and assessments should be guided by the same framework for understanding what is being attempted in the classroom that informs the training of beginning teachers and the continuing professional development of experienced teachers. The reporting of assessment results to parents, teachers, and other stakeholders should also be based on this same framework, as should the evaluations of effectiveness built into all systems. Each child should have an equivalent opportunity to achieve the defined goals, and the allocation of resources should reflect those goals.
(S-3) Following the best possible assessment practices is especially crucial in cases in which assessment can have significant consequences for children, teachers, or programs. The 1999 NRC report High Stakes: Testing for Tracking, Promotion, and Graduation urged extreme caution in basing high-stakes decisions on assessment outcomes, and we conclude that even more extreme caution is needed when dealing with young children from birth to age 5 and with the early care and education system.
We emphasize that a primary purpose of assessing children or classrooms is to improve the quality of early childhood care and education by identifying where more support, professional development, or funding is needed and by providing classroom personnel with tools to track children's growth and adjust instruction.
(S-4) Accountability is another important purpose for assessment, especially when significant state or federal investments are made in early childhood programs. Program-level accountability should involve high stakes only under very well-defined conditions: (a) data about input factors are fully taken into account, (b) quality rating systems or other program quality information has been considered in conjunction with child measures, (c) the programs have been provided with all the supports needed to improve, and (d) it is clear that restructuring or shutting the program down will not have worse consequences for children than leaving it open. Similarly, high stakes for teachers should not be imposed on the basis of classroom functioning or child outcomes alone. Information about access to resources and support for teachers should be gathered and carefully considered in all such decisions, because sanctioning teachers for the failure of the system to support them is inappropriate.
(S-5) Performance (classroom-based) assessments of children can be used for accountability, if objectivity is ensured by checking a sample of the assessments for reliability and consistency, if the results are appropriately contextualized in information about the program, and if careful safeguards are in place to prevent misuse of information.
(S-6) Minimizing the burdens of assessment is an important goal; being clear about purpose and embedding any individual assessment decision into a larger system can limit the time and money invested in assessment.
(S-7) It is important to establish a common way of identifying children for services across the early care and education, family support, health, and welfare sectors.
(S-8) Implementing assessment procedures requires skilled administrators who have been carefully trained in the assessment procedures to be implemented; because direct assessments with young children can be particularly challenging, more training may be required for such assessments.
(S-9) Implementation of a system-level approach requires having services available to meet the needs of all children identified through screening, as well as requiring follow-up with more in-depth assessments.
(S-10) If services are not available, it can be appropriate to use screening assessments and then use the results to argue for expansion of services. Failure to screen when services are not available may lead to underestimation of the need for services.
Research Agenda
Among the tasks of the committee was the development of a research agenda to improve the quality and suitability of developmental assessment, across a wide array of purposes and for the benefit of all the various subgroups of children who will eventually be entering kindergarten. References to the need for research on assessment tools and the building of an assessment system are distributed throughout this document. Major topics of recommended research, with details in Chapter 11, are
• research related to instrument development,
• research related to assessment processes,
• research on the use of assessment tools and processes with special populations, and
• research related to accountability.
Conclusion
Well-planned and effective assessment can inform teaching and program improvement, and contribute to better outcomes for children. Current assessment practices do not universally reflect the available information about how to do assessment well. This report affirms that assessments can make crucial contributions to the improvement of children's well-being, but only if they are well designed, implemented effectively, developed in the context of systematic planning, and interpreted and used appropriately. Otherwise, assessment of children and programs can have negative consequences for both. Realizing the value of assessments therefore requires fundamental attention to their purpose and to the design of the larger systems in which they are used.