Evaluating Early Childhood
Demonstration Programs
INTRODUCTION
During the last two decades, public and private
programs for young children and their families have
undergone profound changes. Programs and philosophies
have proliferated. Program objectives have broadened.
Federal support has increased: Projected expenditures
for child care and preschool education alone neared
$3 billion several years ago. Target populations have
expanded and diversified, as have the constituencies
affected by programs; such constituencies reach beyond
the target populations themselves.
A sizable evaluation enterprise has grown along with
the expansion in programs. Formal outcome measurement
has gained increasing acceptance as a tool for policy
analysis, as a test of accountability, and to some extent
as a guide for improving program practices. Programs have
been subjected to scrutiny from all sides, as parents,
practitioners, and politicians have become increasingly
sophisticated about methods and issues that once were the
exclusive preserve of the researcher. At the same time,
evaluation has come under attack--some of it politically
motivated, some of it justified. Professionals question
the technical quality of evaluations, while parents,
practitioners, and policy makers complain that studies
fail to address their concerns or to reflect program
realities. Improvements in evaluation design and outcome
measurement have failed to keep pace with the evolution
of programs, widening the gap between what is measured
and what programs actually do.
This report attempts to take modest steps toward
rectifying the situation. Rather than recommend specific
instruments, its aims are (1) to characterize recent
developments in programs and policies for children and
families that challenge traditional approaches to
evaluations and (2) to trace the implications for outcome
measurement and for the broader conduct of evaluation
studies. We have attempted to identify various types of
information that evaluators of early childhood programs
might collect, depending on their purposes. Our intent
is not so much to prescribe how evaluation should be done
as to provide a basis for intelligent choice of data to
be collected.
Two related premises underlie much of our argument.
First, policies and programs, at least those in the public
domain, are shaped by many forces. Constituencies with
conflicting interests influence policies or programs and
in turn are affected by them. Policies and programs
evolve continuously, in response to objective conditions
and to the concerns of constituents. Demonstration
programs, the subject of this report, are particularly
likely to change as experience accumulates. Consequently,
evaluation must address multiple concerns and must shift
focus as programs mature or change character and as new
policy issues emerge. Any single study is limited in its
ability to react to changes, but a single study is only
a part of the larger evaluation process.
Second, the role of the evaluator is to contribute to
public debate, to help make programs and policies more
effective by informing the forensic process through which
they are shaped. Though the evaluator might never
actually engage in public discussion or make policy
recommendations, he or she is nevertheless a participant
in the policy formation process, a participant whose
special role is to provide systematic information and to
articulate value choices, rather than to plead the case
for particular actions or values.
Note that we distinguish between informing the policy
formation process and being co-opted by it--between
research and advocacy. Research is characterized by
systematic inquiry, concern with the reduction and
control of bias, and commitment to addressing all the
evidence. Nothing that we say is intended to relax the
need for such rigor.
There are many views of the evaluator's role. Relevant
discussions appear in numerous standard sources on
evaluation methodology, such as Suchman (1967), Weiss (1972),
Rossi et al. (1979), and Goodwin and Driscoll (1980).
Some of these views are consonant with ours, and some
partially contrast with ours. For example, one widely held view
is that the role of the evaluator is, ideally, to provide
definitive information to decision makers about the
degree to which programs or policies are achieving their
stated goals.[1] Though we agree that evaluation should
inform decision makers (among others) and should strive
for clear evidence on whether goals are being met, we
argue that this view is insufficiently attuned to the
pluralistic, dynamic process through which most programs
and policies are formed and changed.
Sometimes the most valuable lesson to be learned from
a demonstration is whether a particular intervention has
achieved a specified end. Often, however, other lessons
are equally or more important. An intervention can
succeed for reasons that have little import for future
programs or policies--for example, because of the efforts
of uniquely talented staff. Conversely, a demonstration
that fails, overall, may contain successful elements
deserving replication in other contexts, and it may
succeed in identifying practices that should be amended
or avoided. Or a demonstration may shift its goals and
"treatments" in response to local needs and resources,
thereby failing to achieve its original ends but
succeeding in other important respects.
By the same token, a randomized field experiment, with
rigorous control of treatment and subject assignment, is
sometimes the most appropriate way to answer questions
salient for policy formation or program management. In
such situations, government should be encouraged to
provide the support necessary to implement experimental
designs. There are situations, however, in which
experimental rigor is impractical or premature, or in
which information of a different character is likely to
be more useful to policy makers and program managers.
Preoccupation with prespecified goals and treatments can
cause evaluators to overlook important changes in the
aims and operations of programs as well as important
outcomes that were not part of the original plan. If
demonstrations have been allowed to adapt to local
conditions, thoughtful documentation of the process of
change can be far more useful in designing future programs
than a report on whether original goals were met.
[1] Strictly speaking, this view applies only to "summative"
evaluations, as distinguished from "formative"
evaluations, which are intended to provide continuous
feedback to program participants for the purpose of
improving program operations.
Even if change in goals and treatments is not at
issue, understanding the mechanisms by which programs
work or fail to work is likely to be more helpful than
simply knowing whether they have achieved their stated
goals. These mechanisms are often complex, and the
evaluator's understanding of them often develops
gradually. To elucidate mechanisms of change, it may be
necessary to modify an initial experimental design, to
perform post hoc analyses without benefit of experimental
control, or to supplement quantitative data collection
with qualitative accounts of program operations.
In short, we believe that evaluation is best conceived
as a process of systematic learning from experience--the
experience of the demonstration program itself and the
experience of the evaluator as he or she gains increasing
familiarity with the program. It is the systematic
quality of evaluation that distinguishes it from advocacy
or journalism. It is the need to bring experience to
bear on practice that distinguishes evaluation from other
forms of social scientific inquiry.
A Word on Definitions
This is a report about the evaluation of demonstration
programs for young children and their families. Each
word or phrase in the foregoing sentence is subject to
multiple interpretations. The substance of this report
is intimately bound up with our choice of definitions.
By evaluation we mean systematic inquiry into the
operations of a program--the services it delivers, the
process by which those services are provided, the costs
of services, the characteristics of the persons served,
relations with relevant community institutions (e.g.,
schools or clinics), and, especially, the outcomes for
program participants.
By outcomes we mean any changes in program participants
or in the contexts in which they function. The latter is
a deliberately broad definition, which includes yet
extends far beyond the changes in individual children
that are usually thought of as program outcomes. We
believe that the definition is appropriate, given the
nature of contemporary programs, and we endeavor to
support this claim in some detail.
By demonstration programs we mean any programs
installed at least in part for the purpose of generating
practical knowledge--such as the effectiveness of
particular interventions; the costs, feasibility, or
accessibility of services under alternative approaches to
delivery; or the interaction of a program with other
community institutions. This definition goes beyond
traditional concerns with program effectiveness. We
believe that it is an appropriate definition in light of
the policy considerations that surround programs for
young children today.
Finally, by young children we mean children from birth
to roughly age eight, although some of our discussion
applies to older children as well. We take very
seriously the inclusion of families as recipients of
services; we emphasize the fact that many contemporary
programs attempt to help the child through the family and
that outcome measures should reflect this emphasis.
Plan of the Report
We begin by tracing the historical evolution of
demonstration programs from 1960 to the mid-1970s, and of
the evaluations undertaken in that period. Although
children's programs and formal evaluation have histories
beginning long before 1960, the programs and evaluations
of the early 1960s both prefigure and constrain our
thinking about outcome measurement today. Following this
historical overview is a section that examines in some
detail the policy issues and programs that have evolved
in recent years and that appear to be salient for the
1980s. The next section--the heart of the report--
identifies some important implications of these programs
and policy developments for outcome measurement and
evaluation design. The final section points to
implications for dissemination and utilization of
results, for the organization and conduct of applied
research, and, finally, for the articulation between
applied research and basic social science.
PROGRAMS FOR CHILDREN AND FAMILIES, 1960-1975
Programs for children and families have come a long
way since 1960, but it is fair to say that the earliest
demonstration programs of the 1960s, precursors of Head
Start, still have a hold on the imagination of the public
as well as many researchers. It is perhaps an
oversimplification--but nevertheless one with a large
grain of truth--to say that outcome measurement, which
was reasonably well adapted to the early demonstrations,
has stood still while programs have changed radically.
To illustrate, let us consider the experience of a
"typical" child in a "typical" demonstration program at
various points from 1960 to the present, and let us
briefly survey the kinds of measures that have been used
at each point to assess the effects of programs. In the
early 1960s it would have been easy to characterize a
typical child and a typical program.
Prototypical
demonstrations of that period were primarily preschool
education programs, designed to enhance the cognitive
skills of "culturally disadvantaged" children from
low-income families, in order to prepare them to function
more effectively as students and, ultimately, as workers
and citizens. It was only natural to measure as outcomes
children's school performance, academic ability, and
achievement. Some practitioners had misgivings about the
fit between available measures and the skills and
attitudes they were attempting to teach, and many lamented
the lack of good measures of social and emotional growth.
There was fairly widespread consensus, however, that
preacademic instruction was the heart of early childhood
demonstrations. (Horowitz and Paden, 1973, provide one
of several useful reviews of these early projects.)
By 1965 the typical child would have been one of more
than half a million children to participate in the first
Head Start program. Despite its scale, Head Start was
and still is termed a "demonstration" in its authorizing
legislation. Moreover, Head Start has constantly
experimented with curricula and approaches to service delivery,
and it has spawned a vast number of evaluations. For
these reasons it dominates our discussion of
demonstrations from 1965 until very recently. (A collection
of papers edited by Zigler and Valentine, 1979, reviews
the history of Head Start. See in particular Datta's
paper in that volume (Datta, 1979) for a discussion of
Head Start research.)
The program originally consisted of eight weeks of
preschool during the summer and was soon extended to a
full year. Proponents had stressed "comprehensive
services," and many teachers viewed socialization rather
than academic instruction as their primary goal. Many of
the federal managers and local practitioners did not
conceive Head Start exclusively as a cognitive enrichment
program. Nevertheless, Head Start was widely perceived--
by the public, by Congress, and by many participants--as
a way to correct deficiencies in cognitive functioning
before a child entered the school system. Early Head
Start programs involved many enthusiastic parents, but
the educational mission and direction of the program were
set by professional staff and local sponsoring organizations.
Programs and developmental theories were numerous
and diverse; no uniform curriculum was set. Yet there
seems to have been consensus and a high level of
confidence with respect to one key point--that early
intervention would be effective, regardless of the
particular approach.
In some quarters this confidence was severely shaken
by the first national evaluation of Head Start's impact
on children, the Westinghouse-Ohio study (Westinghouse
Learning Corp. and Ohio University, 1969). The study
reported that Head Start graduates showed only modest
immediate gains on standardized tests of cognitive ability
and that these gains disappeared after a few years in
school. However, for others the results testified only
to the narrowness of the study's outcome measures and to
other inadequacies of design. Some partisans of Head
Start and critics of the Westinghouse-Ohio study, claiming
that the program was much more than an attempt at
compensatory education or cognitive enrichment, argued that the
study had measured Head Start against a standard more
appropriate to its precursors. These advocates argued
that Head Start enhanced social skills (to which the
Westinghouse-Ohio study paid limited attention) and
provided food, medical and dental checkups, and corrective
services to children who were badly in need of them.
Thus its justification lay in part in the provision of
immediate benefits to low-income populations, not solely
in expected future gains. Furthermore, argued advocates
of Head Start, many local programs had mobilized parents
and become a focus for community organization and
political action. To be sure, some of the criticism of
the Westinghouse-Ohio study was rhetorical and politically
motivated. However, many of the critics' points were
supported empirically, for example, by an evaluation by
Kirschner Associates (1970), which documented the impact
of the program on services provided by the community.
By 1970, Head Start had begun to experiment with
systematic variations in curriculum. Now the typical
preschool child might be served according to any of a
dozen models, ranging from highly structured academic
drill to global, diffuse support for social and emotional
growth. Models were viewed as fixed treatments, to be
applied more or less uniformly across sites. Parallel
models were also put in place in elementary schools that
received Head Start graduates, as part of the National
Follow Through experiment. Under most models, treatment
was still directed primarily to individual children, not
families or communities. Some models made an effort to
integrate parents; others did not. Noneducational program
components, such as health, nutrition, and social
services, had expanded but were still widely viewed as
subordinate to the various developmental approaches.
Comparative evaluations continued to stress a relatively
narrow range of educational outcomes. As a result,
programs with a heavy cognitive emphasis tended to fare
better than others, although no single approach proved
superior on all measures, and there were large differences
in the effectiveness of a given model at different sites.
Dissatisfaction with the narrowness of outcome measures
continued to grow, as programs broadened their goals and
came to be seen as having distinctive approaches and
outcomes, not necessarily reflected by the measures being
used.
By 1975, Head Start had changed and diversified
significantly. Program standards were put in place,
mandating comprehensive services and parent involvement
nationwide. In 1975 more than 300 Head Start programs
were gearing up to provide home-based services as
supplements to, or even substitutes for, center-based services.
The home-based option was permitted in the national
guidelines following an evaluation of Home Start, a
16-site demonstration project (Love et al., 1975). The
evaluation, which involved random assignment of children
to home treatment and control conditions, found that the
home treatment group scored significantly above the
control group on a variety of measures, including a
standardized cognitive test, and that the home treatment
group did as well as a nonrandom comparison group of
children in Head Start centers. In addition, several
offshoot demonstrations, some of them dating from the
1960s, began to get increased attention, notably the
Child and Family Resource Program, the Parent-Child
Centers, and Parent-Child Development Centers. These
projects extend services to children much younger than
age three or four, the normal age for Head Start entrants.
These programs work through the mother or the family
rather than serving the child alone. They combine home
visits with center sessions in various mixes. Although
these programs even today serve only about 8 percent of
the total number of children served in Head Start, they
represent significant departures from traditional
approaches. We have a good deal more to say about these
programs below.
Thus by 1975 the experience of the typical Head Start
child had become difficult to characterize. The child
might be served at home or in a center; he or she might
receive a concentrated dose of preacademic instruction or
almost no instruction at all. In the face of this
diversity, it is apparent that standardized tests, measuring
aspects of academic skill and ability, capture only a
part of what Head Start was trying to accomplish.
Evaluations of Head Start's components, such as health
services, and offshoot demonstrations, such as the Child
and Family Resource Program, have been conducted or are
currently in progress. Head Start's research division in
1977 initiated a multimillion-dollar procurement to
develop a new comprehensive assessment battery that
stresses health and social as well as cognitive measures.
By the late 1970s other programs, mostly federal in
origin, were beginning to take their places beside Head
Start as major providers of services to children. In
addition, federal evaluation research began to concentrate
on other children's programs, such as day care, which had
existed for many years but had begun to assume new
importance for policy in the 1970s. In the next section
we attempt to characterize some of the recent program
initiatives as well as the policy climate that surrounds
programs for young children and their families in the
early 1980s.
THE PROGRAM AND POLICY CONTEXT OF THE 1980s
Public policy both creates social change and responds
to it. The evolution of policies toward children and
families must be understood in the context of general
societal change. Demographic shifts in the number of
young children, the composition of families, and the
labor force participation of mothers in recent years have
increased and broadened the demand for services. They
have also heightened consciousness about policy issues
surrounding child health care, early education, and
social services. Policy makers and evaluators in the
1980s are coping with the consequences of these broad
changes. Contemporary policy issues and program
characteristics constitute the environment in which
evaluators ply their trade, and they pose challenges with
which new evaluations and outcome measures must deal.
To understand the policy context surrounding
demonstration programs for children in the 1980s, it is useful to
begin by outlining some general considerations that affect
the formation of policy. These generic considerations
apply to virtually all programs and public issues but
shift in emphasis and importance as they are applied to
particular programs and issues, at particular times, under
particular conditions.
The most fundamental consideration
is whether the program or policy in question (whether
newly proposed or a candidate for modification or
termination) accords with the general philosophy of some group
of policy makers and their constituents. Closely related
is the question of tangible public support for a program
or policy: Can the groups favoring a particular action
translate their needs into effective political pressure?
Assuming that basic support exists, issues of access,
equity, effectiveness, and efficiency arise. Will a
program reach the target population(s) that it is intended
to affect (access)? Will it provide benefits fairly,
without favoring or denying any eligible target group--for
example, by virtue of geographic location, ethnicity, or
any other characteristics irrelevant to eligibility? And
will its costs, financial and nonfinancial, be apportioned
fairly (equity)? Will it achieve its intended objectives
(effectiveness)? Will it do so without excessively
cumbersome administrative machinery, and will cost-
effectiveness and administrative requirements compare
favorably with alternative programs or policies
(efficiency)?
Two related concerns have to do with the unintended
consequences of programs and policies and their interplay
with existing policies and institutions. Will the policy
or program have unanticipated positive or negative
effects? Will it facilitate or impede the operations of
existing policies, programs, or agencies? How will it
affect the operations of private, formal, and informal
institutions?
Programs for children and families are not exempt from
any of these concerns. Some have loomed larger than
others at times in the past two decades, and the current
configuration is rather different from the one that
prevailed when the first evaluations of compensatory
education were initiated. The policy climate of the early
1960s was one of concern over poverty and inequality and
of faith in the effectiveness of government-initiated
social reform. The principal policy initiative of that
period directed toward children and families--namely, the
founding of Head Start--exemplified this concern and this
faith. Head Start was initially administered by the now
defunct Office of Economic Opportunity (OEO), and many
local Head Start centers were affiliated with OEO-funded
Community Action Programs. Thus, while it was in the
first instance a service to children, Head Start was also
part of the government's somewhat paradoxical attempt to
stimulate grass roots political action "from the top
down." The national managers made a conscious, concerted
effort to distinguish Head Start from other children's
services, notably day care. The latter was seen as
controversial--hence, a politically risky ally.
The early 1960s was a time of economic and governmental
expansion. Consequently, questions of cost and efficiency
did not come to the fore. The principal concerns of the
period were to extend services--to broaden access--and to
demonstrate the effectiveness of the program. As noted
earlier, effectiveness in the public mind was largely
equated with cognitive gains. Despite the political
character of the program, studies documenting its
effectiveness as a focus for community organization and
political action received little attention or weight--
perhaps because the political activities of OEO-funded
entities, such as the Community Action Programs and Legal
Services, were sensitive issues even in the 1960s. Yet
it was precisely the effectiveness of Head Start at
mobilizing parents (together with the political skills of
its national leaders) that saved the program when the
Westinghouse-Ohio study produced bleak results and a new
administration dismantled OEO.
During the 1970s the policy climate changed markedly.
Economic slowdown and growing disillusionment with what
were seen as excesses and failures of the policies of the
1960s brought about a concern for accountability and
fiscal restraint, a concern that is still present and
growing. Head Start responded by establishing national
performance standards in an effort at quality control.
Expansion was curtailed as the program fought to retain
its budget in the face of inflation and congressional
skepticism. (In fiscal 1977 only 15-18 percent of
eligible children were actually served by Head Start.)
Policy makers and program managers began to demand that
[Pages 14-43 are not included in this excerpt.]
of handicapped children exercise their rights to change
their children's educational placement, there is no
guarantee that the educational experiences of the child
will in fact be improved, either by the lengthy process
of appeals that may be involved or by the ultimate
outcome. In such a situation, legitimate values compete:
Is it more important for parents to have such rights or
for children to have steady, uninterrupted, and relaxed
educational experiences? Such conflicts create delicate
situations in which evaluators, sponsors of evaluations,
practitioners, and clients must negotiate the choice and
weighting of outcomes. Our point is that the scope of an
evaluation, the breadth of the audience for which it
provides at least some relevant information, and the
likelihood that its findings will be put to use will all
be enhanced if the perspectives of the various
constituencies are considered.
Communicating with Multiple Audiences
We have argued consistently that if evaluation is to
accomplish its goal of helping to improve programs and
shape policies, it must be attuned to practical issues,
not only to the interests of discipline-based researchers
and methodologists.
Beyond this first and most important
step, evaluators can, by virtue of the way in which they
present their work, take further measures to ensure the
dissemination and utilization of their results.
Basic researchers are usually trained to speak only to
other researchers. Buttressed with statistics and hedged
with caveats, their reports typically have a logic and an
organization aimed at persuading professional peers of
the accuracy of carefully delimited empirical claims.
However, applied researchers must address many audiences
who make very different uses of their findings. Policy
makers, government program managers, advocacy groups,
practitioners, and parents are among their many audiences.
Each group has its own concerns and requires a special
form of communication.
However, all these groups have
some common needs and aims, quite different from those of
the research audience. They all want information to guide
action, rather than information for its own sake. They
have limited interest and sophistication with respect to
research methods and statistics.
This situation poses practical and ethical problems
for the evaluator. The practical problem is simply that
of finding ways to communicate findings clearly, with a
minimum of jargon and technical detail. One strategy
that has proved effective in this regard is organizing
presentations around the questions of concern to
nontechnical audiences, rather than around the researcher's
data-collection procedures and analyses. Adoption of
this strategy of course presumes that the research itself
has been designed at least in part to answer the questions
of policy makers and practitioners. In addition, the
impact of a report, however well written, can be enhanced
by adroit management of other aspects of the dissemination
process--public presentations, informal discussions with
members of the intended audience, and the like--which can
help create a climate of realistic advance expectations
and appropriate after-the-fact interpretation.
The ethical problem is that of drawing the line between
necessary qualification and unnecessary detail. One can
always write a report with a clear message by ignoring
inconsistent data and problematic analyses. The
difficulty is to maintain scientific integrity without
burying the message in methodological complexities and
caveats. There is no general formula for solving this
problem, any more than there is a formula for writing
accurately and forcefully. It is important, however,
that the problem be recognized--that researchers do not
allow themselves to fall back on comfortable obscurantism
or to strain for publicity and effect at the price of
scientific honesty.
Building in Familiarity and Flexibility
The considerations about design and measurement
discussed above have practical implications for the way
in which applied research is conducted. One implication
is that both researchers and the people who manage applied
research--particularly government project officers and
perhaps even program officers in foundations--need to
develop intimate familiarity with the operations of
service programs as well as basic understanding of the
policy context surrounding those programs. Technical
virtuosity and substantive excellence in an academic
discipline do not alone make an effective evaluator.
Over and above these kinds of knowledge, a practical,
experiential awareness of program realities and policy
concerns is essential if evaluation is to deal with those
realities and to address those concerns. When third-party
evaluations are conducted by organizations other than the
service program or its funding agency, a preliminary
period of familiarization may be needed by the outside
evaluator. Moreover, that individual or organization
should remain in close enough touch with the service
program throughout the evaluation to respond to changes
in focus, clientele, or program practices.
A second, related implication is that the evaluation
process must be flexible enough to accommodate the
evolution of programs and the researcher's understanding.
Premature commitment to a particular design or set of
measures may leave an evaluation with insufficient
resources to respond to important changes, ultimately
resulting in a report that speaks only to a program's
past and not to its future. Such a report fails
disastrously in meeting what we see as the primary
responsibility of the evaluator, namely to teach the
public and the policy maker whatever there is to learn
from the program's experience.
There is danger, too, in the evaluator's being familiar
with programs and flexible in responding to program
changes as we have advocated. Too much intimacy with a
program can erode an evaluator's intellectual independ-
ence, which is often threatened in any case by his or her
financial dependence on the agency sponsoring the program
in question. (Most evaluations are funded and monitored
by federal mission agencies or private sponsors that also
operate demonstration programs themselves.) We see no
easy solution to this serious dilemma, but at the same
time we can point to mechanisms that limit any distor-
tions introduced by too close a relationship between
evaluator and program. Most important among them are the
canons of science, which require that the evaluator
collect, analyze, and present data in a way that opens
the conclusions to scrutiny. The political process can
also act as a corrective force, in that it exposes the
evaluator's conclusions to criticism from many value
perspectives. Finally, as some researchers have urged,
it may sometimes be feasible to deal with advocacy in
evaluation by establishing concurrent evaluations of the
same program, perhaps funded by separate agencies, but in
any case deliberately designed to reflect divergent
values and presuppositions.
This report does not discuss in detail the institu-
tional arrangements that might lead to more effective
program evaluations nor does it examine current arrange-
ments critically. Such an examination would be a major
report in itself. Relevant reports have been written
under the aegis of the National Research Council, e.g.,
Raizen and Rossi (1981). However, we observe that many
major evaluations are funded by the federal government
through contracts with universities or private research
organizations. The contracting process is rather tightly
controlled. Subject to the approval of the funding
agency, the contractor is typically required to choose
designs, variables, and measures early in the course of
the study, then stick to them. It is rare that contrac-
tors are given adequate time to assimilate preliminary
information or to develop and pretest study designs and
methods. Sometimes the overall evaluation process is
segmented into separate contracts for design, data
collection, statistical analysis, and policy analysis.
It is perfectly understandable that the government is
reluctant to give universities or contract research
organizations carte blanche, especially in large evalua-
tions, which may cost millions of dollars. Even the
fragmentation of evaluation efforts may be partially
justifiable, on the grounds that it allows the government
to purchase the services of organizations with complementary,
specialized expertise. Whatever the merits of these
policies, it seems clear that in some respects the
contracting process is at odds with the needs we have
identified for gradual accretion of practical under-
standing and for flexibility in adapting designs and
measures to changes in programs.
Drawing on and Contributing to Basic Social Science
In some respects, evaluation stands in the same
relationship to traditional social science disciplines as
do engineering, medicine, and other applied fields to the
physical and biological sciences. Evaluation draws on
the theories, findings, and methods of anthropology,
economics, history, political science, psychology,
sociology, statistics, and kindred basic research fields.
At the same time, evaluation "technology" can also
contribute to basic knowledge. The approach to the
evaluation of children's programs set forth in this
report has implications both for the kinds of basic
social science that are likely to give rise to the most
useful applications and for the kinds of contributions
that evaluation can make to fundamental research.
Traditionally, evaluation has borrowed most heavily
from basic research fields that emphasize formal designs
and quantitative analytic techniques--statistics,
economics, experimental psychology, survey research in
sociology, and political science. The approach to
evaluation we suggest implies that quantitative
techniques can usefully be supplemented--not supplanted--
by ethnographic, historical, and clinical techniques.
These qualitative approaches are well suited to formu-
lating hypotheses about orderly patterns underlying
complex, multidetermined, constantly changing phenomena,
although not to rigorous establishment of causal chains.
There is nothing scientific about adherence to forms and
techniques that have proved their usefulness elsewhere
but fail to fit the phenomena at hand. Science instead
adapts and develops techniques to fit natural and social
phenomena. When a field is at an early stage of develop-
ment, available techniques are likely to have severe
limitations. But the use of all the techniques available,
with candid admission of their limitations, is preferable
to Procrustean distortion of phenomena to fit preferred
methods in pursuit of spurious rigor.
Our proposed approach also suggests that global,
systemic approaches to theory, of which the ecological
approach to human development is an example, are
potentially useful. Ad hoc empirical "theories" that
specify relationships among small numbers of variables,
whatever their merits in terms of clarity and precision,
simply omit too much. Theories that explicate relation-
ships among variables describing individual growth,
family dynamics, and ties between families and other
institutions have greater heuristic value, even if they
are too ambitious to be precise at this early stage in
their development.
It should be clear that we favor precision, rigor, and
quantitative techniques. Each has its place, even given
the present state of the evaluation art, and that place
is likely to become larger and more secure as the art
advances. We argue, however, that description and
qualitative understanding of social programs are in
themselves worthwhile aims of evaluation and are
essential to the development of useful formal approaches.
We have indicated some of the directions in which we
think evaluation technology is likely to lead social
science. Because understanding social programs requires
a judicious fusion of qualitative and quantitative
methods, evaluation may stimulate new methodological work
articulating the two approaches. We may, for example,
learn better ways to bring together clinical and experi-
mental studies of individual children or ethnographic and
survey-based studies of the family. Because understanding
programs requires an appreciation of interlocking social
systems, evaluation may contribute to the expansion and
refinement of ecological, systemic theories. Thinking
about children's programs may lead to a deeper understanding
of the ways in which individual development is
shaped by social systems of which the child is a part.
Finally, because programs are complex phenomena that
cannot be fully comprehended within the intellectual
boundaries of a single discipline, evaluation may open up
fruitful areas of interdisciplinary cooperation.
We are well aware that science often proceeds analyti-
cally rather than holistically; for example, it is useful
for some purposes to isolate the circulatory system as an
object of study, even though it is intimately linked to
many other bodily systems. Nevertheless it is also
useful now and then to examine interrelationships among
previously defined systems to see if new insights and new
areas of study--new systems--emerge. It is our hope that
evaluation research can play this role vis-a-vis the
social sciences. By focusing on concrete, real-world
phenomena that do not fit neatly into existing theoretical
or methodological boxes, evaluation may stimulate the
development of both theory and method.
REFERENCES
Ainsworth, M. D. S., and Wittig, B. A.
(1969) Attachment and exploratory behavior of one-
year-olds in a strange situation. In B. M.
Foss, ed., Determinants of Infant Behavior,
Volume 4. London: Methuen.
Anderson, S., and Messick, S.
(1974) Social competency in young children.
Developmental Psychology 10:282-293.
Belsky, J.
(1980) Child maltreatment: an ecological integration.
American Psychologist 35(4):320-335.
Belsky, J., and Steinberg, L. D.
(1978) The effects of day care: a critical review.
Child Development 49:929-949.
Boruch, R. F., and Cordray, D. S.
(1980) An Appraisal of Educational Program
Evaluations: Federal, State and Local
Agencies. Report prepared for the U.S.
Department of Education, Contract No.
300-79-0467. Northwestern University (June
30).
Brim, O. G.
(1959) Education for Child Rearing. New York:
Russell Sage Foundation.
Bronfenbrenner, U.
(1974) A Report on Longitudinal Evaluations of
Preschool Programs. Vol. II: Is Early
Intervention Effective? U.S. Department of
Health, Education, and Welfare, Publication
No. OHD 75-25. Washington, D.C.: U.S.
Department of Health, Education, and Welfare.
(1979) The Ecology of Human Development. Cambridge,
Mass.: Harvard University Press.
Bureau of Education for the Handicapped
(1979) Progress Toward a Free, Appropriate Public
Education. A Report to Congress on the
Implementation of Public Law 94-142: The
Education for All Handicapped Children Act.
HEW Publication No. (OE) 79-05003. Washington,
D.C.: U.S. Department of Health, Education,
and Welfare.
Connell, D. C., and Carew, J. V.
(1980) Infant Activities in Low-Income Homes: Impact
of Family-Focused Intervention. International
Conference on Infant Studies, New Haven, Conn.
(April).
Datta, L. E.
(1979) Another spring and other hopes: some findings
from National Evaluations of Project Head
Start. In E. Zigler and J. Valentine, eds.,
Project Head Start: A Legacy of the War on
Poverty. New York: Free Press.
Farran, D., and Ramey, C.
(1980) Social class differences in dyadic involvement
during infancy. Child Development 51:254-257.
General Accounting Office
(1979) Early Childhood and Family Development
Programs Improve the Quality of Life for
Low-Income Families. Report to the Congress
by the Comptroller General. HRD-79-40
(February).
Goodson, B. D., and Hess, R. D.
(1978) The effects of parent training programs on
child performance and parent behavior. In B.
Brown, ed., Found: Long-Term Gains From Early
Education. Boulder, Colo.: Westview Press.
Goodwin, W. L., and Driscoll, L. A.
(1980) Handbook for Measurement and Evaluation in
Early Childhood Education. San Francisco,
Calif.: Jossey-Bass, Inc., Publishers.
Horowitz, F. D., and Paden, L. Y.
(1973) The effectiveness of environmental programs.
In B. Caldwell and H. N. Ricciuti, eds.,
Review of Child Development Research. Vol.
3: Child Development and Social Policy.
Chicago, Ill.: University of Chicago Press.
Johnson, O. G.
(1976) Tests and Measurements in Child Development:
Handbook II. Vols. 1 and 2. San Francisco,
Calif.: Jossey-Bass, Inc., Publishers.
Johnson, O. G., and Bommarito, J. W.
(1971) Tests and Measurements in Child Development: A
Handbook. San Francisco, Calif.: Jossey-Bass,
Inc., Publishers.
Kirschner Associates, Albuquerque, N.M.
(1970) A National Survey of the Impacts of Head Start
Centers on Community Institutions. (ED045195)
Washington, D.C.: Office of Economic
Opportunity.
Lazar, I., and Darlington, R. B.
(1978) Lasting Effects After Preschool. A report of
the Consortium for Longitudinal Studies. U.S.
Department of Health, Education, and Welfare,
Office of Human Development Services,
Administration for Children, Youth, and
Families.
Lindsey, W. E.
(1976) Instrumentation of OCD Research Projects on
the Family. Mimeographed report prepared
under contract HEW-105-76-1120, U.S.
Department of Health, Education, and Welfare.
Social Research Group, The George Washington
University, Washington, D.C.
Love, J. M., Nauta, M. J., Coelen, C. G., and Ruopp, R. R.
(1975) Home Start Evaluation Study: Executive
Summary--Findings and Recommendations.
Ypsilanti, Mich., and Cambridge, Mass.:
High/Scope Educational Research Foundation and
Abt Associates, Inc.
Raizen, S. A., and Rossi, P. H., eds.
(1981) Program Evaluation in Education: When? How?
To What Ends? Committee on Program Evaluation
in Education, Assembly of Behavioral and
Social Sciences, National Research Council.
Washington, D.C.: National Academy Press.
Ramey, C., and Mills, J.
(1975) Mother-Infant Interaction Patterns as a
Function of Rearing Conditions. Paper
presented at the biennial meeting of the
Society for Research in Child Development,
Denver, Colo. (April).
Rossi, P. H., Freeman, H. E., and Wright, S. R.
(1979) Evaluation: A Systematic Approach. Beverly
Hills, Calif.: Sage Publications.
Ruopp, R., Travers, J., Coelen, C., and Glantz, F.
(1979) Children at the Center. Final report of the
National Day Care Study, Volume I. Cambridge,
Mass.: Abt Books.
Smith, M. S., and Bissell, J. S.
(1970) Report analysis: the impact of Head Start.
Harvard Educational Review 40:51-104.
Sroufe, L. A.
(1979) The coherence of individual development:
early care, attachment and subsequent
developmental issues. American Psychologist
34:834-841.
Stallings, J.
(1975) Implementation and child effects of teaching
practices in Follow Through classrooms.
Monographs of the Society for Research in
Child Development 40(7-8), Serial No. 163.
Stebbins, L. B., et al.
(1977) Education as Experimentation: A Planned
Variation Model. Vol. IV. Cambridge, Mass.:
Abt Associates, Inc. Also issued by the U.S.
Office of Education as National Evaluation:
Patterns of Effects. Vol. II of the Follow
Through Planned Variation Series.
Suchman, E. A.
(1967) Evaluation Research: Principles and Practice
in Public Service and Social Action Programs.
New York: Russell Sage Foundation.
Walker, D. K.
(1973) Socioemotional Measures for Preschool and
Kindergarten Children. San Francisco,
Calif.: Jossey-Bass, Inc., Publishers.
Weber, C. U., Foster, P. S., and Weikart, D. P.
(1977) An economic analysis of the Ypsilanti Perry
Preschool Project. Monographs of the
High/Scope Educational Research Foundation.
Series No. 5.
Weiss, C. H.
(1972) Evaluating Action Programs: Readings in
Social Action and Education. Boston, Mass.:
Allyn & Bacon, Inc.
Westinghouse Learning Corporation and Ohio University
(1969) The Impact of Head Start: An Evaluation of
the Effects of Head Start on Children's
Cognitive and Affective Development.
Executive Summary. Report to the Office of
Economic Opportunity (ED036321). Washington,
D.C.: Clearinghouse for Federal Scientific
and Technical Information.
Zigler, E., and Trickett, P.
(1978) IQ, social competence and evaluation of early
childhood intervention programs. American
Psychologist 33:789-798.
Zigler, E., and Valentine, J., eds.
(1979) Project Head Start: A Legacy of the War on
Poverty. New York: The Free Press.