Evaluating Early Childhood Demonstration Programs

INTRODUCTION

During the last two decades, public and private programs for young children and their families have undergone profound changes. Programs and philosophies have proliferated. Program objectives have broadened. Federal support has increased: Projected expenditures for child care and preschool education alone neared $3 billion several years ago. Target populations have expanded and diversified, as have the constituencies affected by programs; such constituencies reach beyond the target populations themselves.

A sizable evaluation enterprise has grown along with the expansion in programs. Formal outcome measurement has gained increasing acceptance as a tool for policy analysis, as a test of accountability, and to some extent as a guide for improving program practices. Programs have been subjected to scrutiny from all sides, as parents, practitioners, and politicians have become increasingly sophisticated about methods and issues that once were the exclusive preserve of the researcher. At the same time, evaluation has come under attack--some of it politically motivated, some of it justified. Professionals question the technical quality of evaluations, while parents, practitioners, and policy makers complain that studies fail to address their concerns or to reflect program realities. Improvements in evaluation design and outcome measurement have failed to keep pace with the evolution of programs, widening the gap between what is measured and what programs actually do.

This report attempts to take modest steps toward rectifying the situation. Rather than recommend specific instruments, its aims are (1) to characterize recent
developments in programs and policies for children and families that challenge traditional approaches to evaluations and (2) to trace the implications for outcome measurement and for the broader conduct of evaluation studies. We have attempted to identify various types of information that evaluators of early childhood programs might collect, depending on their purposes. Our intent is not so much to prescribe how evaluation should be done as to provide a basis for intelligent choice of data to be collected.

Two related premises underlie much of our argument. First, policies and programs, at least those in the public domain, are shaped by many forces. Constituencies with conflicting interests influence policies or programs and in turn are affected by them. Policies and programs evolve continuously, in response to objective conditions and to the concerns of constituents. Demonstration programs, the subject of this report, are particularly likely to change as experience accumulates. Consequently, evaluation must address multiple concerns and must shift focus as programs mature or change and as new policy issues emerge. Any single study is limited in its ability to react to changes, but a single study is only a part of the larger evaluation process.

Second, the role of the evaluator is to contribute to public debate, to help make programs and policies more effective by informing the forensic process through which they are shaped. Though the evaluator might never actually engage in public discussion or make policy recommendations, he or she is nevertheless a participant in the policy formation process, a participant whose special role is to provide systematic information and to articulate value choices, rather than to plead the case for particular actions or values. Note that we distinguish between informing the policy formation process and being co-opted by it--between research and advocacy. Research is characterized by systematic inquiry, concern with the reduction and control of bias, and commitment to addressing all the evidence. Nothing that we say is intended to relax the need for such rigor.

There are many views of the evaluator's role. Relevant discussions appear in numerous standard sources on evaluation methodology, such as Suchman (1967), Weiss (1972), Rossi et al. (1979), and Goodwin and Driscoll (1980). Some of these views are consonant with ours, and some partially contrast. For example, one widely held view
is that the role of the evaluator is, ideally, to provide definitive information to decision makers about the degree to which programs or policies are achieving their stated goals. (Strictly speaking, this view applies only to "summative" evaluations, as distinguished from "formative" evaluations, which are intended to provide continuous feedback to program participants for the purpose of improving program operations.) Though we agree that evaluation should inform decision makers (among others) and should strive for clear evidence on whether goals are being met, we argue that this view is insufficiently attuned to the pluralistic, dynamic process through which most programs and policies are formed and changed.

Sometimes the most valuable lesson to be learned from a demonstration is whether a particular intervention has achieved a specified end. Often, however, other lessons are equally or more important. An intervention can succeed for reasons that have little import for future programs or policies--for example, because of the efforts of uniquely talented staff. Conversely, a demonstration that fails, overall, may contain successful elements deserving replication in other contexts, and it may succeed in identifying practices that should be amended or avoided. Or a demonstration may shift its goals and "treatments" in response to local needs and resources, thereby failing to achieve its original ends but succeeding in other important respects.

By the same token, a randomized field experiment, with rigorous control of treatment and subject assignment, is sometimes the most appropriate way to answer questions salient for policy formation or program management. In such situations, government should be encouraged to provide the support necessary to implement experimental designs. There are situations, however, in which experimental rigor is impractical or premature, or in which information of a different character is likely to be more useful to policy makers and program managers. Preoccupation with prespecified goals and treatments can cause evaluators to overlook important changes in the aims and operations of programs as well as important outcomes that were not part of the original plan. If demonstrations have been allowed to adapt to local conditions, thoughtful documentation of the process of
change can be far more useful in designing future programs than a report on whether original goals were met. Even if change in goals and treatments is not at issue, understanding the mechanisms by which programs work or fail to work is likely to be more helpful than simply knowing whether they have achieved their stated goals. These mechanisms are often complex, and the evaluator's understanding of them often develops gradually. To elucidate mechanisms of change, it may be necessary to modify an initial experimental design, to perform post hoc analyses without benefit of experimental control, or to supplement quantitative data collection with qualitative accounts of program operations.

In short, we believe that evaluation is best conceived as a process of systematic learning from experience--the experience of the demonstration program itself and the experience of the evaluator as he or she gains increasing familiarity with the program. It is the systematic quality of evaluation that distinguishes it from advocacy or journalism. It is the need to bring experience to bear on practice that distinguishes evaluation from other forms of social scientific inquiry.

A Word on Definitions

This is a report about the evaluation of demonstration programs for young children and their families. Each word or phrase in the foregoing sentence is subject to multiple interpretations. The substance of this report is intimately bound up with our choice of definitions.

By evaluation we mean systematic inquiry into the operations of a program--the services it delivers, the process by which those services are provided, the costs of services, the characteristics of the persons served, relations with relevant community institutions (e.g., schools or clinics), and, especially, the outcomes for program participants. By outcomes we mean any changes in program participants or in the contexts in which they function. The latter is a deliberately broad definition, which includes yet extends far beyond the changes in individual children that are usually thought of as program outcomes. We believe that the definition is appropriate, given the nature of contemporary programs, and we endeavor to support this claim in some detail.
By demonstration programs we mean any programs installed at least in part for the purpose of generating practical knowledge--such as the effectiveness of particular interventions; the costs, feasibility, or accessibility of services under alternative approaches to delivery; or the interaction of a program with other community institutions. This definition goes beyond traditional concerns with program effectiveness. We believe that it is an appropriate definition in light of the policy considerations that surround programs for young children today.

Finally, by young children we mean children from birth to roughly age eight, although some of our discussion applies to older children as well. We take very seriously the inclusion of families as recipients of services; we emphasize the fact that many contemporary programs attempt to help the child through the family and that outcome measures should reflect this emphasis.

Plan of the Report

We begin by tracing the historical evolution of demonstration programs from 1960 to the mid-1970s, and of the evaluations undertaken in that period. Although children's programs and formal evaluation have histories beginning long before 1960, the programs and evaluations of the early 1960s both prefigure and constrain our thinking about outcome measurement today. Following this historical overview is a section that examines in some detail the policy issues and programs that have evolved in recent years and that appear to be salient for the 1980s. The next section--the heart of the report--identifies some important implications of these programs and policy developments for outcome measurement and evaluation design. The final section points to implications for dissemination and utilization of results, for the organization and conduct of applied research, and, finally, for the articulation between applied research and basic social science.

PROGRAMS FOR CHILDREN AND FAMILIES, 1960-1975

Programs for children and families have come a long way since 1960, but it is fair to say that the earliest demonstration programs of the 1960s, precursors of Head
Start, still have a hold on the imagination of the public as well as many researchers. It is perhaps an oversimplification--but nevertheless one with a large grain of truth--to say that outcome measurement, which was reasonably well adapted to the early demonstrations, has stood still while programs have changed radically. To illustrate, let us consider the experience of a "typical" child in a "typical" demonstration program at various points from 1960 to the present, and let us briefly survey the kinds of measures that have been used at each point to assess the effects of programs.

In the early 1960s it would have been easy to characterize a typical child and a typical program. Prototypical demonstrations of that period were primarily preschool education programs, designed to enhance the cognitive skills of "culturally disadvantaged" children from low-income families, in order to prepare them to function more effectively as students and, ultimately, as workers and citizens. It was only natural to measure as outcomes children's school performance, academic ability, and achievement. Some practitioners had misgivings about the fit between available measures and the skills and attitudes they were attempting to teach, and many lamented the lack of good measures of social and emotional growth. There was fairly widespread consensus, however, that preacademic instruction was the heart of early childhood demonstrations. (Horowitz and Paden, 1973, provide one of several useful reviews of these early projects.)

By 1965 the typical child would have been one of more than half a million children to participate in the first Head Start program. Despite its scale, Head Start was and still is termed a "demonstration" in its authorizing legislation. Moreover, Head Start has constantly experimented with curricula and approaches to service delivery, and it has spawned a vast number of evaluations. For these reasons it dominates our discussion of demonstrations from 1965 until very recently. (A collection of papers edited by Zigler and Valentine, 1979, reviews the history of Head Start. See in particular Datta's paper in that volume (Datta, 1979) for a discussion of Head Start research.) The program originally consisted of eight weeks of preschool during the summer and was soon extended to a full year. Proponents had stressed "comprehensive services," and many teachers viewed socialization rather than academic instruction as their primary goal. Many of the federal managers and local practitioners did not
conceive Head Start exclusively as a cognitive enrichment program. Nevertheless, Head Start was widely perceived--by the public, by Congress, and by many participants--as a way to correct deficiencies in cognitive functioning before a child entered the school system. Early Head Start programs involved many enthusiastic parents, but the educational mission and direction of the program was set by professional staff and local sponsoring organizations. Programs and developmental theories were numerous and diverse; no uniform curriculum was set. Yet there seems to have been consensus and a high level of confidence with respect to one key point--that early intervention would be effective, regardless of the particular approach.

In some quarters this confidence was severely shaken by the first national evaluation of Head Start's impact on children, the Westinghouse-Ohio study (Westinghouse Learning Corp. and Ohio University, 1969). The study reported that Head Start graduates showed only modest immediate gains on standardized tests of cognitive ability and that these gains disappeared after a few years in school. However, for others the results testified only to the narrowness of the study's outcome measures and to other inadequacies of design. Some partisans of Head Start and critics of the Westinghouse-Ohio study, claiming that the program was much more than an attempt at compensatory education or cognitive enrichment, argued that the study had measured Head Start against a standard more appropriate to its precursors. These advocates argued that Head Start enhanced social skills (to which the Westinghouse-Ohio study paid limited attention) and provided food, medical and dental checkups, and corrective services to children who were badly in need of them. Thus its justification lay in part in the provision of immediate benefits to low-income populations, not solely in expected future gains. Furthermore, argued advocates of Head Start, many local programs had mobilized parents and become a focus for community organization and political action. To be sure, some of the criticism of the Westinghouse-Ohio study was rhetorical and politically motivated. However, many of the critics' points were supported empirically, for example, by an evaluation by Kirschner Associates (1970), which documented the impact of the program on services provided by the community.

By 1970, Head Start had begun to experiment with systematic variations in curriculum. Now the typical preschool child might be served according to any of a
dozen models, ranging from highly structured academic drill to global, diffuse support for social and emotional growth. Models were viewed as fixed treatments, to be applied more or less uniformly across sites. Parallel models were also put in place in elementary schools that received Head Start graduates, as part of the National Follow Through experiment. Under most models, treatment was still directed primarily to individual children, not families or communities. Some models made an effort to integrate parents; others did not. Noneducational program components, such as health, nutrition, and social services, had expanded but were still widely viewed as subordinate to the various developmental approaches. Comparative evaluations continued to stress a relatively narrow range of educational outcomes. As a result, programs with a heavy cognitive emphasis tended to fare better than others, although no single approach proved superior on all measures, and there were large differences in the effectiveness of a given model at different sites. Dissatisfaction with the narrowness of outcome measures continued to grow, as programs broadened their goals and came to be seen as having distinctive approaches and outcomes, not necessarily reflected by the measures being used.

By 1975, Head Start had changed and diversified significantly. Program standards were put in place, mandating comprehensive services and parent involvement nationwide. In 1975 more than 300 Head Start programs were gearing up to provide home-based services as supplements to, or even substitutes for, center-based services. The home-based option was permitted in the national guidelines following an evaluation of Home Start, a 16-site demonstration project (Love et al., 1975). The evaluation, which involved random assignment of children to home treatment and control conditions, found that the home treatment group scored significantly above the control group on a variety of measures, including a standardized cognitive test, and that the home treatment group did as well as a nonrandom comparison group of children in Head Start centers. In addition, several offshoot demonstrations, some of them dating from the 1960s, began to get increased attention, notably the Child and Family Resource Program, the Parent-Child Centers, and Parent-Child Development Centers. These projects extend services to children much younger than age three or four, the normal age for Head Start entrants. These programs work through the mother or the family
rather than serving the child alone. They combine home visits with center sessions in various mixes. Although these programs even today serve only about 8 percent of the total number of children served in Head Start, they represent significant departures from traditional approaches. We have a good deal more to say about these programs below.

Thus by 1975 the experience of the typical Head Start child had become difficult to characterize. The child might be served at home or in a center; he or she might receive a concentrated dose of preacademic instruction or almost no instruction at all. In the face of this diversity, it is apparent that standardized tests, measuring aspects of academic skill and ability, capture only a part of what Head Start was trying to accomplish. Evaluations of Head Start's components, such as health services, and offshoot demonstrations, such as the Child and Family Resource Program, have been conducted or are currently in progress. Head Start's research division in 1977 initiated a multimillion-dollar procurement to develop a new comprehensive assessment battery that stresses health and social as well as cognitive measures.

By the late 1970s other programs, mostly federal in origin, were beginning to take their places beside Head Start as major providers of services to children. In addition, federal evaluation research began to concentrate on other children's programs, such as day care, which had existed for many years but had begun to assume new importance for policy in the 1970s. In the next section we attempt to characterize some of the recent program initiatives as well as the policy climate that surrounds programs for young children and their families in the early 1980s.

THE PROGRAM AND POLICY CONTEXT OF THE 1980s

Public policy both creates social change and responds to it. The evolution of policies toward children and families must be understood in the context of general societal change. Demographic shifts in the number of young children, the composition of families, and the labor force participation of mothers in recent years have increased and broadened the demand for services. They have also heightened consciousness about policy issues surrounding child health care, early education, and social services. Policy makers and evaluators in the
1980s are coping with the consequences of these broad changes. Contemporary policy issues and program characteristics constitute the environment in which evaluators ply their trade, and they pose challenges with which new evaluations and outcome measures must deal.

To understand the policy context surrounding demonstration programs for children in the 1980s, it is useful to begin by outlining some general considerations that affect the formation of policy. These generic considerations apply to virtually all programs and public issues but shift in emphasis and importance as they are applied to particular programs and issues, at particular times, under particular conditions.

The most fundamental consideration is whether the program or policy in question (whether newly proposed or a candidate for modification or termination) accords with the general philosophy of some group of policy makers and their constituents. Closely related is the question of tangible public support for a program or policy: Can the groups favoring a particular action translate their needs into effective political pressure?

Assuming that basic support exists, issues of access, equity, effectiveness, and efficiency arise. Will a program reach the target population(s) that it is intended to affect (access)? Will it provide benefits fairly, without favoring or denying any eligible target group--for example, by virtue of geographic location, ethnicity, or any other characteristics irrelevant to eligibility? And will its costs, financial and nonfinancial, be apportioned fairly (equity)? Will it achieve its intended objectives (effectiveness)? Will it do so without excessively cumbersome administrative machinery, and will cost-effectiveness and administrative requirements compare favorably with alternative programs or policies (efficiency)?

Two related concerns have to do with the unintended consequences of programs and policies and their interplay with existing policies and institutions. Will the policy or program have unanticipated positive or negative effects? Will it facilitate or impede the operations of existing policies, programs, or agencies? How will it affect the operations of private, formal, and informal institutions?

Programs for children and families are not exempt from any of these concerns. Some have loomed larger than others at times in the past two decades, and the current configuration is rather different from the one that prevailed when the first evaluations of compensatory
education were initiated. The policy climate of the early 1960s was one of concern over poverty and inequality and of faith in the effectiveness of government-initiated social reform. The principal policy initiative of that period directed toward children and families--namely, the founding of Head Start--exemplified this concern and this faith. Head Start was initially administered by the now defunct Office of Economic Opportunity (OEO), and many local Head Start centers were affiliated with OEO-funded Community Action Programs. Thus, while it was in the first instance a service to children, Head Start was also part of the government's somewhat paradoxical attempt to stimulate grass roots political action "from the top down." The national managers made a conscious, concerted effort to distinguish Head Start from other children's services, notably day care. The latter was seen as controversial--hence, a politically risky ally.

The early 1960s was a time of economic and governmental expansion. Consequently, questions of cost and efficiency did not come to the fore. The principal concerns of the period were to extend services--to broaden access--and to demonstrate the effectiveness of the program. As noted earlier, effectiveness in the public mind was largely equated with cognitive gains. Despite the political character of the program, studies documenting its effectiveness as a focus for community organization and political action received little attention or weight--perhaps because the political activities of OEO-funded entities, such as the Community Action Programs and Legal Services, were sensitive issues even in the 1960s. Yet it was precisely the effectiveness of Head Start at mobilizing parents (together with the political skills of its national leaders) that saved the program when the Westinghouse-Ohio study produced bleak results and a new administration dismantled OEO.

During the 1970s the policy climate changed markedly. Economic slowdown and growing disillusionment with what were seen as excesses and failures of the policies of the 1960s brought about a concern for accountability and fiscal restraint, a concern that is still present and growing. Head Start responded by establishing national performance standards in an effort at quality control. Expansion was curtailed as the program fought to retain its budget in the face of inflation and congressional skepticism. (In fiscal 1977 only 15-18 percent of eligible children were actually served by Head Start.) Policy makers and program managers began to demand that

of handicapped children exercise their rights to change their children's educational placement, there is no guarantee that the educational experiences of the child will in fact be improved, either by the lengthy process of appeals that may be involved or by the ultimate outcome. In such a situation, legitimate values compete: Is it more important for parents to have such rights or for children to have steady, uninterrupted, and relaxed educational experiences? Such conflicts create delicate situations in which evaluators, sponsors of evaluations, practitioners, and clients must negotiate the choice and weighting of outcomes. Our point is that the scope of an evaluation, the breadth of the audience for which it provides at least some relevant information, and the likelihood that its findings will be put to use will all be enhanced if the perspectives of the various constituencies are considered.

Communicating with Multiple Audiences

We have argued consistently that if evaluation is to accomplish its goal of helping to improve programs and shape policies, it must be attuned to practical issues, not only to the interests of discipline-based researchers and methodologists. Beyond this first and most important step, evaluators can, by virtue of the way in which they present their work, take further measures to ensure the dissemination and utilization of their results.

Basic researchers are usually trained to speak only to other researchers. Buttressed with statistics and hedged with caveats, their reports typically have a logic and an organization aimed at persuading professional colleagues of the accuracy of carefully delimited empirical claims. However, applied researchers must address many audiences who make very different uses of their findings. Policy makers, government program managers, advocacy groups, practitioners, and parents are among their many audiences. Each group has its own concerns and requires a special form of communication. However, all these groups have some common needs and aims, quite different from those of the research audience. They all want information to guide action, rather than information for its own sake. They have limited interest and sophistication with respect to research methods and statistics.

This situation poses practical and ethical problems for the evaluator. The practical problem is simply that
of finding ways to communicate findings clearly, with a minimum of jargon and technical detail. One strategy that has proved effective in this regard is organizing presentations around the questions of concern to nontechnical audiences, rather than around the researcher's data-collection procedures and analyses. Adoption of this strategy of course presumes that the research itself has been designed at least in part to answer the questions of policy makers and practitioners. In addition, the impact of a report, however well written, can be enhanced by adroit management of other aspects of the dissemination process--public presentations, informal discussions with members of the intended audience, and the like--which can help create a climate of realistic advance expectations and appropriate after-the-fact interpretation.

The ethical problem is that of drawing the line between necessary qualification and unnecessary detail. One can always write a report with a clear message by ignoring inconsistent data and problematic analyses. The difficulty is to maintain scientific integrity without burying the message in methodological complexities and caveats. There is no general formula for solving this problem, any more than there is a formula for writing accurately and forcefully. It is important, however, that the problem be recognized--that researchers not allow themselves to fall back on comfortable obscurantism or to strain for publicity and effect at the price of scientific honesty.

Building in Familiarity and Flexibility

The considerations about design and measurement discussed above have practical implications for the way in which applied research is conducted. One implication is that both researchers and the people who manage applied research--particularly government project officers and perhaps even program officers in foundations--need to develop intimate familiarity with the operations of service programs as well as basic understanding of the policy context surrounding those programs. Technical virtuosity and substantive excellence in an academic discipline do not alone make an effective evaluator. Over and above these kinds of knowledge, a practical, experiential awareness of program realities and policy concerns is essential if evaluation is to deal with those realities and to address those concerns. When third-party
evaluations are conducted by organizations other than the service program or its funding agency, a preliminary period of familiarization may be needed by the outside evaluator. Moreover, that individual or organization should remain in close enough touch with the service program throughout the evaluation to respond to changes in focus, clientele, or program practices.

A second, related implication is that the evaluation process must be flexible enough to accommodate the evolution of programs and the researcher's understanding. Premature commitment to a particular design or set of measures may leave an evaluation with insufficient resources to respond to important changes, ultimately resulting in a report that speaks only to a program's past and not to its future. Such a report fails disastrously in meeting what we see as the primary responsibility of the evaluator, namely to teach the public and the policy maker whatever there is to learn from the program's experience.

There is danger, too, in the evaluator's being familiar with programs and flexible in responding to program changes as we have advocated. Too much intimacy with a program can erode an evaluator's intellectual independence, which is often threatened in any case by his or her financial dependence on the agency sponsoring the program in question. (Most evaluations are funded and monitored by federal mission agencies or private sponsors that also operate demonstration programs themselves.) We see no easy solution to this serious dilemma, but at the same time we can point to mechanisms that limit any distortions introduced by too close a relationship between evaluator and program. Most important among them are the canons of science, which require that the evaluator collect, analyze, and present data in a way that opens the conclusions to scrutiny. The political process can also act as a corrective force, in that it exposes the evaluator's conclusions to criticism from many value perspectives. Finally, as some researchers have urged, it may sometimes be feasible to deal with advocacy in evaluation by establishing concurrent evaluations of the same program, perhaps funded by separate agencies, but in any case deliberately designed to reflect divergent values and presuppositions.

This report does not discuss in detail the institutional arrangements that might lead to more effective program evaluations, nor does it examine current arrangements critically. Such an examination would be a major
report in itself. Relevant reports have been written under the aegis of the National Research Council, e.g., Raizen and Rossi (1981). However, we observe that many major evaluations are funded by the federal government through contracts with universities or private research organizations. The contracting process is rather tightly controlled. Subject to the approval of the funding agency, the contractor is typically required to choose designs, variables, and measures early in the course of the study, then stick to them. It is rare that contractors are given adequate time to assimilate preliminary information or to develop and pretest study designs and methods. Sometimes the overall evaluation process is segmented into separate contracts for design, data collection, statistical analysis, and policy analysis.

It is perfectly understandable that the government is reluctant to give universities or contract research organizations carte blanche, especially in large evaluations, which may cost millions of dollars. Even the fragmentation of evaluation efforts may be partially justifiable, on the grounds that it allows the government to purchase the services of organizations with complementary, specialized expertise. Whatever the merits of these policies, it seems clear that in some respects the contracting process is at odds with the needs we have identified for gradual accretion of practical understanding and for flexibility in adapting designs and measures to changes in programs.

Drawing on and Contributing to Basic Social Science

In some respects, evaluation stands in the same relationship to traditional social science disciplines as do engineering, medicine, and other applied fields to the physical and biological sciences. Evaluation draws on the theories, findings, and methods of anthropology, economics, history, political science, psychology, sociology, statistics, and kindred basic research fields. At the same time, evaluation "technology" can also contribute to basic knowledge. The approach to the evaluation of children's programs set forth in this report has implications both for the kinds of basic social science that are likely to give rise to the most useful applications and for the kinds of contributions that evaluation can make to fundamental research.
Traditionally, evaluation has borrowed most heavily from basic research fields that emphasize formal designs and quantitative analytic techniques--statistics, economics, experimental psychology, survey research in sociology, and political science. The approach to evaluation we suggest implies that quantitative techniques can usefully be supplemented--not supplanted--by ethnographic, historical, and clinical techniques. These qualitative approaches are well suited to formulating hypotheses about orderly patterns underlying complex, multidetermined, constantly changing phenomena, although not to rigorous establishment of causal chains. There is nothing scientific about adherence to forms and techniques that have proved their usefulness elsewhere but fail to fit the phenomena at hand. Science instead adapts and develops techniques to fit natural and social phenomena. When a field is at an early stage of development, available techniques are likely to have severe limitations. But the use of all the techniques available, with candid admission of their limitations, is preferable to Procrustean distortion of phenomena to fit preferred methods in pursuit of spurious rigor.

Our proposed approach also suggests that global, systemic approaches to theory, of which the ecological approach to human development is an example, are potentially useful. Ad hoc empirical "theories" that specify relationships among small numbers of variables, whatever their merits in terms of clarity and precision, simply omit too much. Theories that explicate relationships among variables describing individual growth, family dynamics, and ties between families and other institutions have greater heuristic value, even if they are too ambitious to be precise at this early stage in their development.

It should be clear that we favor precision, rigor, and quantitative techniques. Each has its place, even given the present state of the evaluation art, and that place is likely to become larger and more secure as the art advances. We argue, however, that description and qualitative understanding of social programs are in themselves worthwhile aims of evaluation and are essential to the development of useful formal approaches.

We have indicated some of the directions in which we think evaluation technology is likely to lead social science. Because understanding social programs requires a judicious fusion of qualitative and quantitative methods, evaluation may stimulate new methodological work
articulating the two approaches. We may, for example, learn better ways to bring together clinical and experimental studies of individual children or ethnographic and survey-based studies of the family. Because understanding programs requires an appreciation of interlocking social systems, evaluation may contribute to the expansion and refinement of ecological, systemic theories. Thinking about children's programs may lead to a deeper understanding of the ways in which individual development is shaped by social systems of which the child is a part. Finally, because programs are complex phenomena that cannot be fully comprehended within the intellectual boundaries of a single discipline, evaluation may open up fruitful areas of interdisciplinary cooperation.

We are well aware that science often proceeds analytically rather than holistically; for example, it is useful for some purposes to isolate the circulatory system as an object of study, even though it is intimately linked to many other bodily systems. Nevertheless it is also useful now and then to examine interrelationships among previously defined systems to see if new insights and new areas of study--new systems--emerge. It is our hope that evaluation research can play this role vis-a-vis the social sciences. By focusing on concrete, real-world phenomena that do not fit neatly into existing theoretical or methodological boxes, evaluation may stimulate the development of both theory and method.

REFERENCES

Ainsworth, M. D. S., and Wittig, B. A. (1969) Attachment and exploratory behavior of one-year-olds in a strange situation. In B. M. Foss, ed., Determinants of Infant Behavior, Volume 4. London: Methuen.

Anderson, S., and Messick, S. (1974) Social competency in young children. Developmental Psychology 10:282-293.

Belsky, J. (1980) Child maltreatment: an ecological integration. American Psychologist 35(4):320-335.

Belsky, J., and Steinberg, L. D. (1978) The effects of day care: a critical review. Child Development 49:929-949.
Boruch, R. F., and Cordray, D. S. (1980) An Appraisal of Educational Program Evaluations: Federal, State and Local Agencies. Report prepared for the U.S. Department of Education, Contract No. 300-79-0467. Northwestern University (June 30).

Brim, O. G. (1959) Education for Child Rearing. New York: Russell Sage Foundation.

Bronfenbrenner, U. (1974) A Report on Longitudinal Evaluations of Preschool Programs. Vol. II: Is Early Intervention Effective? U.S. Department of Health, Education, and Welfare, Publication No. OHD 75-25. Washington, D.C.: U.S. Department of Health, Education, and Welfare.

Bronfenbrenner, U. (1979) The Ecology of Human Development. Cambridge, Mass.: Harvard University Press.

Bureau of Education for the Handicapped (1979) Progress Toward a Free, Appropriate Public Education. A Report to Congress on the Implementation of Public Law 94-142: The Education for All Handicapped Children Act. HEW Publication No. (OE) 79-05003. Washington, D.C.: U.S. Department of Health, Education, and Welfare.

Connell, D. C., and Carew, J. V. (1980) Infant Activities in Low-Income Homes: Impact of Family-Focused Intervention. International Conference on Infant Studies, New Haven, Conn. (April).

Datta, L. E. (1979) Another spring and other hopes: some findings from national evaluations of Project Head Start. In E. Zigler and J. Valentine, eds., Project Head Start: A Legacy of the War on Poverty. New York: Free Press.

Farran, D., and Ramey, C. (1980) Social class differences in dyadic involvement during infancy. Child Development 51:254-257.

General Accounting Office (1979) Early Childhood and Family Development Programs Improve the Quality of Life for Low-Income Families. Report to the Congress by the Comptroller General. HRD-79-40 (February).
Goodson, B. D., and Hess, R. D. (1978) The effects of parent training programs on child performance and parent behavior. In B. Brown, ed., Found: Long-Term Gains From Early Education. Boulder, Colo.: Westview Press.

Goodwin, W. L., and Driscoll, L. A. (1980) Handbook for Measurement and Evaluation in Early Childhood Education. San Francisco, Calif.: Jossey-Bass, Inc., Publishers.

Horowitz, F. D., and Paden, L. Y. (1973) The effectiveness of environmental programs. In B. Caldwell and H. N. Ricciuti, eds., Review of Child Development Research. Vol. 3: Child Development and Social Policy. Chicago, Ill.: University of Chicago Press.

Johnson, O. G. (1976) Tests and Measurements in Child Development: Handbook II. Vols. 1 and 2. San Francisco, Calif.: Jossey-Bass, Inc., Publishers.

Johnson, O. G., and Bommarito, J. W. (1971) Tests and Measurements in Child Development: A Handbook. San Francisco, Calif.: Jossey-Bass, Inc., Publishers.

Kirschner Associates, Albuquerque, N.M. (1970) A National Survey of the Impacts of Head Start Centers on Community Institutions. (ED045195) Washington, D.C.: Office of Economic Opportunity.

Lazar, I., and Darlington, R. B. (1978) Lasting Effects After Preschool. A report of the Consortium for Longitudinal Studies. U.S. Department of Health, Education, and Welfare, Office of Human Development Services, Administration for Children, Youth, and Families.

Lindsey, W. E. (1976) Instrumentation of OCD Research Projects on the Family. Mimeographed report prepared under contract HEW-105-76-1120, U.S. Department of Health, Education, and Welfare. Social Research Group, The George Washington University, Washington, D.C.

Love, J. M., Nauta, M. J., Coelen, C. G., and Ruopp, R. R. (1975) Home Start Evaluation Study: Executive Summary--Findings and Recommendations. Ypsilanti, Mich., and Cambridge, Mass.: High/Scope Educational Research Foundation and Abt Associates, Inc.
Raizen, S. A., and Rossi, P. H., eds. (1981) Program Evaluation in Education: When? How? To What Ends? Committee on Program Evaluation in Education, Assembly of Behavioral and Social Sciences, National Research Council. Washington, D.C.: National Academy Press.

Ramey, C., and Mills, J. (1975) Mother-Infant Interaction Patterns as a Function of Rearing Conditions. Paper presented at the biennial meeting of the Society for Research in Child Development, Denver, Colo. (April).

Rossi, P. H., Freeman, H. E., and Wright, S. R. (1979) Evaluation: A Systematic Approach. Beverly Hills, Calif.: Sage Publications.

Ruopp, R., Travers, J., Coelen, C., and Glantz, F. (1979) Children at the Center. Final report of the National Day Care Study, Volume I. Cambridge, Mass.: Abt Books.

Smith, M. S., and Bissell, J. S. (1970) Report analysis: the impact of Head Start. Harvard Educational Review 40:51-104.

Sroufe, L. A. (1979) The coherence of individual development: early care, attachment and subsequent developmental issues. American Psychologist 34:834-841.

Stallings, J. (1975) Implementation and child effects of teaching practices in Follow Through classrooms. Monographs of the Society for Research in Child Development 40(7-8), Serial No. 163.

Stebbins, L. B., et al. (1977) Education as Experimentation: A Planned Variation Model. Vol. IV. Cambridge, Mass.: Abt Associates, Inc. Also issued by the U.S. Office of Education as National Evaluation: Patterns of Effects, Vol. II of the Follow Through Planned Variation Series.

Suchman, E. A. (1967) Evaluation Research: Principles and Practice in Public Service and Social Action Programs. New York: Russell Sage Foundation.

Walker, D. K. (1973) Socioemotional Measures for Preschool and Kindergarten Children. San Francisco, Calif.: Jossey-Bass, Inc., Publishers.
Weber, C. U., Foster, P. S., and Weikart, D. P. (1977) An economic analysis of the Ypsilanti Perry Preschool Project. Monographs of the High/Scope Educational Research Foundation, Series No. 5.

Weiss, C. H. (1972) Evaluating Action Programs: Readings in Social Action and Education. Boston, Mass.: Allyn & Bacon, Inc.

Westinghouse Learning Corporation and Ohio University (1969) The Impact of Head Start: An Evaluation of the Effects of Head Start on Children's Cognitive and Affective Development. Executive Summary. Report to the Office of Economic Opportunity (ED036321). Washington, D.C.: Clearinghouse for Federal Scientific and Technical Information.

Zigler, E., and Trickett, P. (1978) IQ, social competence and evaluation of early childhood intervention programs. American Psychologist 33:789-798.

Zigler, E., and Valentine, J., eds. (1979) Project Head Start: A Legacy of the War on Poverty. New York: The Free Press.