G Assessing Program Effectiveness and Cost-Effectiveness: Supplement to Chapter 8
Mark R. Montgomery
Member, Committee on Unintended Pregnancy
This appendix discusses several principles of evaluation that can be applied to family planning programs. As the body of Chapter 8 makes clear, the formal tools of program evaluation have only occasionally been brought to bear on U.S. family planning programs. This is surprising, especially as there exists a lively literature on family planning programs in developing countries, and many of the issues are precisely the same, although the contexts in which these issues emerge are, of course, quite different.
The evaluation concepts discussed in this appendix can be distinguished according to the objectives of the evaluation and the specific tools that are applied in the course of the evaluation. Although other forms of evaluation exist, this appendix concentrates on the evaluation of program outcomes. A fundamental objective is to determine whether a given program is effective, that is, whether it exerts a measurable influence on the outcome or outcomes of interest. A program's effectiveness can be determined through the use of experimental methods where these are feasible, but more commonly, effectiveness must be assessed by an application of statistical methods to non-experimental data. The first main section below describes what is involved in establishing program effectiveness with such non-experimental data. In this area, a good deal can be learned from the literature on family planning programs in developing countries.
Once given a set of programs whose effectiveness has been demonstrated, a second evaluation objective then comes into play: to determine their cost-effectiveness. Cost-effectiveness is a difficult and much-misunderstood concept in program evaluation. It is a relative concept, in that it involves a comparison between at least two programs that achieve the same level of effectiveness or "output." To assess cost-effectiveness, one asks which of the two programs achieves this output at lower total social cost; it is this program that can be described as "cost-effective." In practice, formidable conceptual and empirical issues confront even the simplest of cost-effectiveness analyses, particularly when unit costs vary with the level of output or when outcomes have multiple dimensions that must be considered. These issues are discussed in the second main section below. Only the most rudimentary forms of cost-effectiveness analysis have been applied to family planning programs, whether in the United States or in developing countries. As is made clear in what follows, perhaps the most informative literature on these issues is that concerned with health care costs in the United States, also reviewed in the second main section.
Finally, it is important to mention yet an additional objective of evaluation that arises in some circumstances: to determine the net social benefits generated by a particular program. This is termed a cost-benefit analysis. Cost-benefit analysis requires a means of translating outcome measures—such as pregnancies averted—into a single metric or standard that permits comparisons among quite different programs operating, perhaps, in quite different sectors. Even if attention is confined to the health sector, controversy must inevitably accompany any effort to force very dissimilar outcomes into a common metric or index for evaluation. A well-known recent attempt is that of the World Bank (1993), which sought in the concept of DALYS (disability-adjusted life-year saved) an index for assessing a great range of programs in developing-country health care.1 Such approaches are commented on only briefly below.2
The DALYS index relies on a combination of subjective rankings and empirical data to represent the degree of health disability associated with various illnesses and conditions. If these rankings are accepted one has a means of comparing the benefits from investing in (for example) family planning programs with those derived from investments in malaria control.
As discussed in the body of Chapter 8, the term cost-benefit analysis is sometimes misapplied in reference to the net effects of public investments in family planning service delivery on public budgets in the health and social service sectors in general. Levey et al. (1988) term this a "taxpayer's benefit-cost" perspective. A taxpayer's benefit-cost analysis asks whether public expenditures are likely to be reduced, on net, by public funding of family planning services. This perspective frames the evaluation issue very narrowly, being concerned only with the impact of one form of public expenditure on another form. There is no clear or necessary relationship between the claims that programs make on government budgets and their cost-effectiveness or social desirability. Thus, the terms "benefit" and "cost" that appear in a taxpayer's benefit-cost analysis bear no obvious correspondence to social benefits and costs.
Assessing Program Effectiveness
This section discusses some of the issues regarding evaluation of program outcomes when no randomized, experimental data are available. It is assumed that individuals have access, to a greater or lesser degree, to a variety of contraceptive methods supplied by the private market as well as by various programs. The focus of evaluation is on the net contribution made by one or perhaps a set of these programs. The outcome of principal interest is the degree of protection against unintended pregnancy. This analysis begins by considering the issues from the viewpoint of a representative woman.
Figure G-1 depicts the environment in which her decisions regarding contraception are made, and shows the potential ramifications of these decisions over the near term. (The highlighted boxes represent features of particular interest to this report.) The figure is meant to represent one period in the sequence of decision periods that make up a woman's reproductive life cycle. One may think of this decision period as being as short as a month, although the consequences of decisions made within that period are played out over the ensuing nine months and beyond (the later events are not shown). The individual woman entering this decision period is assumed not to be pregnant; she may or may not be sexually active, and may or may not believe herself to be capable of conceiving.
The information and constraints bearing on her contraceptive choices encompass a range of factors that have been discussed in several chapters of this report. Here only a few of the major dimensions are indicated. The woman's age (a), her current marital status, and her level of current income or poverty status (as denoted by y) are important, not only because of the obvious connections to contraceptive motivation, but also because of the fact that in the United States, access to service subsidies is conditioned on income. Under the current system of public funding, a woman's income relative to the poverty line determines the fee schedule that she faces for services in Title X clinics and establishes her rights to free services through Medicaid.3
In choosing among alternative providers, a woman will rely on the information available to her regarding the characteristics Xpriv of private physicians or other non-program sources of contraceptive methods and the out-of-pocket prices of these methods ppriv(y), the latter of which will vary according to her current income; likewise, she will take into account the characteristics Xclinic and prices pclinic(y) of clinic or program sources. Included in Xpriv and Xclinic are the time and money costs that a woman faces to gain access to services.4 Another factor of potential significance is whether private insurance is available to the woman and, if so, whether it covers contraceptive and pregnancy-related services.5 The woman's current knowledge about methods, which may have been derived from previous method use on her part and from program and non-program counseling, will certainly affect her choices among methods and may also influence her propensity to use any method. Finally, various features of the woman's attitudes and preferences are important, including her subjective expectations about future income opportunities, the likelihood, as she perceives it, of sexual activity in the decision period, her perceived ability to conceive, her skill and self-confidence in negotiations with sexual partners, and (especially for teens) her attitudes toward risk-taking.
Although the woman may have been informed by previous decisions she has made about contraception, she may also find herself constrained by these decisions. A constraint may arise if the woman has adopted methods such as sterilization, Norplant, or the IUD, which are either very difficult to reverse or costly (in money terms or otherwise) to remove.
All these factors, when taken in combination, express themselves in a woman's current wantedness status, that is, in her desire either to conceive within the decision period, or to defer conception to some later date, or to avert conception altogether.6 The wantedness categories are, in fact, little more than markers of the current intensity with which a woman wishes to avoid conception; no doubt there is great variability in desires within as well as between categories. The categories are nevertheless convenient in that they correspond to the data gathered in fertility surveys such as the NSFG.7
The desire to avoid conception is acted upon through adoption or continued use of a contraceptive method.8 For reasons that remain poorly understood (see Chapter 4), not all women who wish to delay or avert pregnancy make use of a contraceptive method.9 This disjuncture between expressed preferences and behavior has been termed a "KAP-gap." Women who act on their stated preferences may obtain contraceptive methods either from program sources, which are mainly family planning clinics, or from non-program sources, which are mainly private physicians in the case of prescription methods and pharmacies in the case of non-prescription methods such as condoms and foam.
Users of contraception face some risk of conception, as do non-users, although of course method users enjoy a greater degree of protection. A woman in either category is at risk of unintended pregnancy, whether mistimed or unwanted, and this may result in a miscarriage, an abortion, or an unintended birth.
A reproductive career may be viewed as a sequence of decision periods like the one sketched in Figure G-1, in which the decisions and outcomes in one period contribute to the information, and add to the set of constraints, that form the basis for decisions in the next period. For the purpose of program evaluation, it is important to realize that many elements in the figure are variable, and some highly variable, over a reproductive career. Income y may fluctuate above and below the poverty line, so that the expected length of a spell in poverty and other aspects of poverty dynamics must be taken into consideration.10 Other elements, including marital status and expectations about future income, also vary over the life cycle. This variation in what is termed information and constraints may then influence wantedness status. Depending on circumstances, a woman may switch among categories of wantedness over time. Hence, her wantedness status at a point in time cannot be taken as a constant or be interpreted as fully predictive of her status in the future.
The individual perspective adopted above is useful in clarifying the net effects of family planning programs on individual contraceptive use. Consider the following hypothetical but instructive situations. A user of a given program might, in its absence, have obtained her information and methods elsewhere, without there being any net impact on her contraceptive use. Conversely, a well-run program might attract a woman who had formerly relied upon other programs or on private sources, again with no net change in use. An increase in one program's out-of-pocket fees for methods or services might either reduce overall contraceptive method use, or induce substitution among methods, or encourage substitution of non-program sources for program sources. The information and counseling provided by a family planning clinic at one point in time might influence a woman's contraceptive choices at some later date; but the change in her behavior, although stimulated by the program, might not ever be expressed in the choice of a method supplied by that program. All this suggests the difficulty of determining the net effectiveness of any given program in an environment where a number of program and non-program sources are operating.
Consider now the upper part of Figure G-2, which depicts the reproductive career of an individual woman. In the course of this reproductive career, she might have come into contact with a number of different program and non-program sources. As of a given point in time (say, at the time of a demographic survey), the woman will have accumulated information about contraceptive methods and related aspects of reproductive health, will have enjoyed a certain amount of cumulative protection from unintended pregnancy, and will currently use or not use a method. Each of the programs will have made some contribution to her accumulated knowledge, contraceptive protection, and current use. To isolate the net contribution of any given program, however, is clearly a difficult task, one that calls for retrospective or prospective data on program contacts and careful statistical modeling. The evaluation task is further complicated because of the aspect of self-selection in program contacts. For a general review of the statistical issues and a comparison of statistical approaches to alternatives using experimental or quasi-experimental designs, see Maddala (1985), Foster (1989), Heckman and Hotz (1989), and Hausman and Wise (1985).
The fundamental problem in establishing program effectiveness is to predict what the program clients might have done, had that program not existed, or had its characteristics been different from what they are. Such "what if?" or "counter-factual" questions can only be rigorously addressed through a statistical model of the factors that induce individual women to participate in the program. As Huntington and Connell (no date) have pointed out in another context, the detail available in the empirical data and the statistical model brought to bear on these data are critical elements in producing defensible estimates of net effectiveness.
The ideal data include information on the nature of both program and non-program sources in the areas in which these women live. Note that when access and fees for service are income-conditioned, as they are for Title X clinics and Medicaid services, predictions about net program effects depend on the price-responsiveness of demands for contraception among women of differing income levels.11 The committee is not aware of any detailed data concerning such issues for the United States, although there are data available for developing countries.
To understand some of the issues, consider the following equations (G.1 and G.2), which represent, respectively, the motivation to use contraception and the choice of program versus non-program method sources. The first equation represents the woman's "wantedness" status, that is, the intensity of her motivation to use contraception:
Here Yi,j* represents the intensity of motivation on the part of woman i living in area j. The Zi,j are her own individual characteristics (age, income, and the like). Program characteristics (Xclinic, Pclinic) and non-program characteristics (Xpriv, Ppriv), or more precisely the woman's knowledge of these characteristics, also affect her motivation to use. For simplicity, assume that there is only one program and one non-program source in area j. The equation also includes a term vi for unmeasured traits of the woman, or constraints upon her, that may affect her level of motivation.
Provided that motivation exceeds a certain threshold, say Yi,j* > 0, the woman will seek out the source of contraceptive supply that, on the basis of her own knowledge, level of income and the like, best meets her needs. This source may or may not be the program in question. The probability that she makes use of the program can be represented as
which is a general function of the same set of variables as in the motivation equation. The program's own service statistics can be viewed as the summation of such probabilities across all women i in its geographic catchment area j.
From the viewpoint of evaluation, there are at least three difficulties brought to the fore by these equations. First, it is unusual for the evaluator to have any listing of all program and non-program sources in a given area, to say nothing of women's subjective knowledge about such sources. Second, it is equally unusual for the evaluator to know which program or programs the woman chose, unless it happens to have been the program under evaluation. Third, there is the problem of self-selection, by which is meant that there will always exist unmeasured variables, represented here by vi, which affect both women's motivations to use and their program choices. If a particular program is eliminated, or if its characteristics are altered, one must trace the net impact first on motivation (equation G.1), and then on the probability that the program is accessed (equation G.2). A proper assessment requires data on all potential sources of supply, and also requires a means of controlling statistically for unmeasured but confounding variables such as vi.
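As a reading aid, equations G.1 and G.2 can be sketched in a general form that agrees with the variable definitions given above. The function symbols f and g, the symbol S_j for the program's service statistics, and the decision to leave the functional forms unspecified are assumptions introduced here for illustration; they are not a statement of the forms used in the original analysis.

```latex
% Requires amsmath for \tag and \text.
% G.1 (sketch): latent intensity of motivation of woman i in area j.
\[
Y_{i,j}^{*} = f\bigl(Z_{i,j},\ X_{\mathrm{clinic}},\ P_{\mathrm{clinic}},\ X_{\mathrm{priv}},\ P_{\mathrm{priv}},\ v_{i}\bigr)
\tag{G.1}
\]
% G.2 (sketch): probability that woman i uses the program, given Y* > 0.
% It depends on the same variables as G.1, including the unmeasured v_i.
\[
\Pr\bigl(\text{woman } i \text{ uses the program}\bigr)
  = g\bigl(Z_{i,j},\ X_{\mathrm{clinic}},\ P_{\mathrm{clinic}},\ X_{\mathrm{priv}},\ P_{\mathrm{priv}},\ v_{i}\bigr)
\tag{G.2}
\]
% Service statistics (hypothetical symbol S_j): the sum of the G.2
% probabilities over all women i in catchment area j.
\[
S_{j} = \sum_{i \in j} \Pr\bigl(\text{woman } i \text{ uses the program}\bigr)
\]
```

Because v_i appears in both equations, a change in program characteristics shifts motivation and source choice together, which is the self-selection problem described above.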
To sum up this much-abbreviated discussion, it is no simple matter to document program effectiveness in a context of multiple programs and individual self-selection among programs. The evaluation literature can provide some guidance on these issues, but much depends on the availability of data at both the individual and the areal level. In view of these difficulties, some progress can be made with aggregated areal-level data. For attempts at evaluation using aggregated areal data (in this case, data on large U.S. counties), see Joyce, Corman, and Grossman (1988); Corman, Joyce, and Grossman (1987); and Frank, Strobino, Salkever, and Jackson (1992). The recent Title X evaluation of Meier and McFarlane (1994) is of this type. These authors are frank about the limitations of aggregated data and the possibility of mistaken inference owing to ecological fallacies.
The principles of cost-effectiveness come into play when one wishes to establish a ranking among alternative means of mobilizing resources to achieve a given objective x. Here x is defined principally in terms of protection from unintended pregnancy, but other dimensions will also prove to be of interest. The ranking is expressed in terms of the full social costs C(x) of the resources used to produce x, with these costs being denominated in money terms. "Social costs," in this context, means the value of these resources when they are put to their best alternative use.
To engage in a cost-effectiveness analysis, one must first define the objective x in some quantifiable and measurable way, consider at least two alternative programs or ways of achieving the objective, make an accounting of the associated social costs C1(x) and C2(x) of the two programs, and assign a higher ranking to the program having lower total social costs. Often the comparison between programs is implicit, with Program 1 representing the status quo and Program 2 a newly initiated alternative.
Cost-effectiveness analysis should be carefully distinguished from cost-benefit analysis. In the latter, a social value V(x) is attached to the objective x. As with the social costs, V(x) is expressed in money terms, and this makes possible a direct comparison of social costs to benefits. The level of x is said to be socially optimal at the point when marginal social costs equal marginal social benefits, or C'(x) = V'(x). The ideal cost-benefit analysis aims to identify the optimum level of x, or at the least, to establish whether the current level of x should be increased or reduced. A cost-effectiveness analysis, by contrast, makes no effort to assign a social value to x, and is therefore silent on its socially appropriate level. The cost-effectiveness approach leaves these larger issues to be decided by other criteria.
In spite of its limitations, cost-effectiveness is more suitable than cost-benefit analysis for the evaluation of family planning programs. These programs are necessarily concerned with the prevention of unintended pregnancies and births. To assign a social value to the prevention of an unwanted birth, in particular, is to engage in an ethical and religious debate about the valuation of life in general.12 These are not scientific matters, nor are they matters about which economic analysis is especially informative. Thus, for the purpose at hand, the limited aims of a cost-effectiveness approach are entirely appropriate.
As it has been described to this point, cost-effectiveness analysis may seem to be little more than an exercise in accounting. Quite to the contrary: it is a difficult endeavor, requiring a blend of economic theory, expert judgment, and statistical sophistication. Consider these complications, which arise even in the most mundane of program evaluations. If two programs produce different levels of the output x, say x1 and x2, they are not directly comparable. That is, a comparison of total costs C1(x1) to C2(x2) is not informative, in itself, about cost-effectiveness. (The exception is when one of the programs can produce higher levels of x with the same or lower total costs.) It has been common practice to use average costs, C(x)/x, to make comparisons between programs, but this too is misleading if the average costs of a program are not constant with respect to the level of x. Furthermore, if x is defined in terms of multiple dimensions or joint outputs, then the total costs of a program with output mix x1 are not directly comparable to the total costs of a program having output mix x2. Programs differing in scale or output mix can only be compared indirectly through the device of cost functions, which provide a statistical means of predicting what social costs would be if the programs generated the same level of output or had the same output mix. Thus, the estimation of cost functions, itself a difficult task, is central to an evaluation of program cost-effectiveness.
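The point about average costs can be illustrated numerically. The sketch below assumes two hypothetical cost functions, each a fixed cost plus a constant marginal cost; the function names, parameter values, and output levels are invented for illustration and are not drawn from any program data.

```python
def cost_program_1(x):
    """Hypothetical total social cost: high fixed cost, low marginal cost."""
    return 50_000 + 20 * x


def cost_program_2(x):
    """Hypothetical total social cost: low fixed cost, high marginal cost."""
    return 10_000 + 45 * x


def average_cost(cost_fn, x):
    """Average cost C(x)/x at output level x (x must be positive)."""
    return cost_fn(x) / x


def cheaper_at(x):
    """Name the program with lower total cost at a COMMON output level x."""
    c1, c2 = cost_program_1(x), cost_program_2(x)
    if c1 == c2:
        return "tie"
    return "program 1" if c1 < c2 else "program 2"


# Assumed observed output levels: the programs operate at different scales.
x1, x2 = 1_000, 4_000

# Naive comparison of average costs at each program's own output level.
naive_ac_1 = average_cost(cost_program_1, x1)
naive_ac_2 = average_cost(cost_program_2, x2)

# Comparison through the cost functions, holding output fixed.
ranking_at_x1 = cheaper_at(x1)
ranking_at_x2 = cheaper_at(x2)

print(naive_ac_1, naive_ac_2, ranking_at_x1, ranking_at_x2)
```

In this illustration the naive average-cost comparison favors Program 2, yet at the larger common output level Program 1 has the lower total cost. The ranking depends on the output level at which the programs are compared, which is why the cost functions themselves are needed.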
There is also the question of units of analysis: whether these should be individuals within a set of well-defined geographic areas, program units such as the clinics that operate within these areas, or the reproductive health system through which individuals pass and within which a variety of programs are situated. As argued above, the effectiveness of programs is perhaps most easily established at the individual level, with data on the reproductive careers of individual women. Accurate measurement of costs, however, requires both individual-level data and data on program units. The individual data are required because important components of total social costs are borne by individuals, most notably the value of their travel time, their waiting time, and if referred elsewhere for additional services, the time that they spend in locating the next service point (Jonsson, 1985; Warner and Luce, 1982). From a broad conceptual point of view, it is the reproductive health system, taken as a whole, whose organization should be subjected to cost-effectiveness analysis.
The different units of analysis are exhibited in Figure G-2. The individual level, depicted in the upper part of the figure, has already been described. The programs themselves, each intersected by a variety of women in different stages of their reproductive careers, are shown in the middle portion of the figure. The bottom part of the figure contrasts two reproductive health systems, one having family planning programs A and B and a non-program actor or agency C (for instance, a private physician), and the other system having, in addition to these, a new program D. These two arrangements yield outputs x1 and x2, respectively. The full social costs C1(x1) and C2(x2) of the two health systems are the proper subjects of a cost-effectiveness evaluation.
The contrast between health systems need not take the form pictured in Figure G-2. For example, two organizational schemes might be envisioned, one of these being characterized by closer ties among program units linked by a specialized referral system. Or, one can envision two health systems that differ only in respect to the organization of resources within a given program unit.
The recent history of publicly supported family planning programs shows why the appropriate comparison is between health systems rather than between lower-level program units. In the latter part of the 1980s (see Donovan 1991, and Levine and Tsoflias, 1993:42-50), family planning clinics in the United States found themselves caught between two forces: a decline in public funding, on the one side, and on the other side an increasing demand among poor clients for reproductive and other health services. The response on the part of many clinics, according to Donovan, was to increase reliance on fees for family planning and other services, and to reduce the range of services supplied in-house. The increasing use of referral by clinics, as opposed to direct service provision, has implications for the level and mix of output x, the total social costs C(x) associated with this output, and the incidence of these costs among population subgroups, as between taxpayers in general and poor clients. In the late 1980s, clients were being asked, in effect, to take on a greater share of the full social costs of service provision. They were now using more of their time—a valuable resource—in order to obtain services. Clinics were directing fewer resources, on the whole, to contraceptive service delivery. The net effect on the mix of outputs x and total social costs is unknown, but it is at least conceivable that total costs C(x), for any given level and mix of outputs x, were driven up. An evaluation of the reproductive health system, rather than a narrow focus on clinic costs alone, would be required to determine this.
In making the case for a comparison of health systems, the focus is on a conceptual ideal in evaluation. Many difficulties stand between this ideal and practical implementation. To begin, one must face the issue of how and where to draw the line in defining a health system. One simply cannot include all health services in a meaningful evaluation, as the range of services is too great; even within the sphere of reproductive health, it may be that attention should be concentrated on a fairly narrow range of services. The key, in the committee's view, is to restrict analysis to the set of services within which there are potential economies of scope. To understand the concept of economies of scope, let q represent the quantity of contraceptive services and let r represent the quantity of all other reproductive health services, so that the objective x is defined in two dimensions, as x = (q,r). Economies of scope are said to exist if C*(q,r) < C1(q,0) + C2(0,r), where C* represents total costs under an integrated service delivery scheme, and C1 and C2 represent costs when the services are not integrated.13 To put it differently, economies of scope exist if there are potential cost savings to be secured by integrating service delivery in some fashion.14 If these savings exist, or if the savings are believed to be possible under an alternative organization of the health system, then q and r should be considered jointly in a proper cost-effectiveness evaluation.
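The economies-of-scope condition C*(q,r) < C1(q,0) + C2(0,r) can be checked mechanically once cost functions are in hand. The sketch below is a minimal illustration under assumed cost functions: each delivery scheme has a fixed cost plus constant marginal costs, and the integrated scheme shares part of the fixed cost. All function names and numbers are hypothetical, and the functions are intended only for positive q and r.

```python
def cost_integrated(q, r):
    """Hypothetical C*(q, r): one shared fixed cost for both service types."""
    return 30_000 + 25 * q + 40 * r


def cost_contraceptive_only(q):
    """Hypothetical C1(q, 0): stand-alone contraceptive services."""
    return 20_000 + 25 * q


def cost_other_only(r):
    """Hypothetical C2(0, r): stand-alone other reproductive health services."""
    return 20_000 + 40 * r


def scope_savings(q, r):
    """Cost saved by integration: C1(q, 0) + C2(0, r) - C*(q, r)."""
    return cost_contraceptive_only(q) + cost_other_only(r) - cost_integrated(q, r)


def has_economies_of_scope(q, r):
    """True when integrated delivery is strictly cheaper than separate delivery."""
    return scope_savings(q, r) > 0


# Assumed output mix x = (q, r).
q, r = 1_000, 500
print(scope_savings(q, r), has_economies_of_scope(q, r))
```

Under these assumptions the saving equals the fixed cost that integration avoids duplicating. With estimated rather than assumed cost functions, the same comparison would indicate whether q and r belong in a joint evaluation.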
One final point should be made in respect to cost functions and evaluation of relatively new programs. In general, one expects the cost curve associated with a new program or system to be an unreliable guide to the costs that will be exhibited over the long term. New programs are unlikely to have on hand precisely the right mix of fixed inputs (such as capital) to minimize their long-run costs in the neighborhood of the output level and mix they will produce over the long run (Cowing et al., 1983). There are also learning effects to be considered, which would tend to reduce the cost curves over time, as well as differences in manager and worker motivation in pilot as compared to mature projects, which may tend either to reduce or to increase costs (Jensen 1991; Kenney and Lewis, 1991; Kristein, 1983). Thus, even if the estimated cost functions show a new program to be more costly (for given output levels) than an established program, some margin for error should be allowed.
Cost-Effectiveness in Family Planning
The issues discussed to this point represent common themes in program evaluation in general. Family planning programs, or intervention programs having a significant family planning component, present two additional features that require special consideration.
First, they are concerned with prevention, and in this case, mainly with the prevention of unintended pregnancy. Second, family planning programs produce what should be regarded as intermediate outputs, as opposed to final outputs x, in that they provide interested clients with contraceptive methods accompanied by information on their use, the potential side-effects, and so on.
Programs aimed at prevention have impacts and benefits that take place in the future, for the most part, so that in evaluating program impact a decision must be made regarding the discounting of future benefits. Prevention programs typically cannot guarantee that a benefit will materialize or that an undesirable outcome will be avoided; rather, programs affect only the probabilities of these future events. Thus the concept of uncertainty comes into play in an evaluation of prevention programs (Warner and Luce, 1982). In some contrast to prevention programs in other areas of health, in which the illness in question can always be presumed to be unwanted, not all pregnancies are unwanted. As has been discussed above, a woman's wantedness status at a point in time (that is, whether she desires to have no more births or simply to defer pregnancy, and how intensely these desires are felt) can be expected to vary over the remainder of her reproductive life cycle. The desire to avert or delay pregnancy may change with her socioeconomic circumstances, with marital status, with stage of the life cycle, and with all manner of unforeseen events. This inherent randomness in individual circumstances and desires introduces a number of subtle complications.
Consider the issues associated with discounting of future benefits. If a program supplies a woman with a 6-month allotment of oral pills, this may appear to be equivalent to 6 months of protection against unintended pregnancy, where the degree of protection is defined by the failure rate of the pill. But setting aside the possibility of method failure, do 6 months of protection necessarily result? If the woman's wantedness status remains unchanged over the 6-month period, she will make use of her full allocation of pills. But if her wantedness status happens to change, let us say after 3 months' time, she may discontinue the pill and strive to conceive. Has the program then supplied her with 6 months of protection or with only 3 months? From the woman's point of view, the program has supplied her with the means to avoid unintended pregnancy for as many as 6 months, although in the event, she chooses not to make full use of this. At the time she receives her allotment of pills, their value to her is affected by the probability, as she perceives it, that her wantedness status will change. The amount of protection supplied by the program thus depends on the expected span of time over which her wantedness status will remain unchanged.15 In short, the service rendered by the program depends on the characteristics of the method and of the individual.
To further develop this point, consider the example of a non-reversible method. Envision a woman who, after counseling and referral from a family planning program, decides to undergo sterilization. She has all but guaranteed herself protection from unwanted pregnancy for the remainder of her reproductive career. (Evidently some risk of conception remains, see Trussell and Kost (1987), but the risk is very small.) Yet because her wantedness status may change in the future, perhaps owing to changes in marital or economic circumstances, a woman who has been sterilized may find herself later regretting the operation. She has been well-protected against unwanted conception, to be sure, but in selecting a method that is very difficult to reverse, she has also constrained her ability to have a wanted conception.
How should the contribution of the program be assessed in this instance? Wantedness status is to a great degree an individual matter, something that lies beyond the reach and the responsibilities of family planning programs. Programs play an important role, however, in providing the information on which individual decisions are based. A woman who at present wishes to have no more children, and who then makes inquiries regarding the option of sterilization, can reasonably expect to learn about all the costs and risks of the operation, the possibilities for reversal, and the characteristics of other effective methods (e.g., Norplant) which are more easily reversed. Having been fully informed as to the options facing her, she will then weigh the benefits of sterilization against the strength of her desires to have no more children and the likelihood, known only to her, that these desires may someday change.16
Two larger points may be extracted from these examples. First, although the business of family planning programs can be defined as the prevention of unintended pregnancy, from a broader point of view their role is to assist women in the regulation of conception risks. Programs should aim to facilitate conception when conception is desired, and to help in the prevention of conception when it is not.
Second, the provision of information about methods and reproductive health has an immediate individual and social value, irrespective of whether the woman decides to use a method on the basis of what she has heard. Of course it is difficult to formulate objective and quantifiable measures of the quality of care and counseling that a program makes available, but these are fundamental outputs that must be distinguished from the provision of methods and recognized in any accounting for program cost-effectiveness.
Indeed, there is a clear link between the provision of information and the individual and social benefit attached to the supply of methods. In practice, contraceptive methods are used with greatly varying degrees of effectiveness relative to their theoretical benchmarks. The degree of use-effectiveness depends on the socioeconomic characteristics and motivations of the user, and on the information that has been made available to her. In providing contraceptive methods, a program produces an intermediate output that must be, as it were, further processed by the client to yield a final output: the level of protection against unintended pregnancy. Since different programs provide their clients with different amounts and qualities of information, they affect the way methods are used and the protection they ultimately supply against unintended pregnancy.
Dimensions of Cost-Effectiveness
Given the importance of cost functions C(x) to program evaluation, and the many subtleties that arise in their specification and estimation, a brief discussion of the issues is in order. Several issues merit particular attention: (1) how to account for the costs borne by individuals, these being the value of time and money expenditures they make in order to gain access to services; (2) how to deal with the multiple dimensions of services; and (3) how to make a translation from the service statistics collected by programs to measures of final outputs. These issues are seldom addressed in any rigorous manner in the literature on family planning programs.17 For insight into the issues, one must look instead to the broader field of health economics, which in recent years has seen a profusion of cost studies. Table G-1 provides a guide to the notation used to organize this discussion.
To begin, it is important to re-emphasize here a point made earlier: family planning programs are engaged in the provision of a number of distinct services, and the supply of contraceptive methods is only one of these. Table G-1 identifies the four broad areas in which services are delivered. First, there are community outreach and informational services O, which provide information to the surrounding community and assist in bringing clients into contact with the program. These outreach services are in part responsible for the number of clients N who present themselves to the clinic or program. Second, an average amount of information I is made available per client during a typical visit. Third, contraceptive methods Q are provided to the clients who request them, either directly or through referral. Fourth, other reproductive services R are also supplied. (For notational convenience, I, Q, and R are expressed as averages per client, or in terms of proportions of clients receiving the service. Total services supplied are therefore NI, NQ, and NR.) The total social costs associated with these services may be summarized in a cost function Ci(O,N,I,Q,R;w), where the i subscript is used to indicate that the program or health system is of "type i" in its organization.
The variable w appears in the cost function to denote the set of input prices facing the program, such as the prices of methods, wage and salary levels, rent, and the like. These input prices must be adjusted, if necessary, to reflect the true social costs of the resources employed. Many small or newly formed programs rely on volunteer labor or donated services. Although such resources are free in an accounting sense, they should be valued at their market-price equivalents in determining total social costs. Likewise, services made available to a program on a subsidized basis should be valued at their true social costs.
TABLE G-1 Dimensions of Program Output

| Notation | Description |
|---|---|
| $O^1, \dots, O^K$ | Information provided by community outreach efforts |
| $N^1, \dots, N^K$ | Clients, in total ($N$) and by age and socioeconomic type ($N^k$, $k = 1, \dots, K$) |
| $I^1, \dots, I^K$ | Information provided to clients on average ($I$) and average by client type ($I^1, \dots, I^K$) |
| $Q_1, \dots, Q_C$; $Q_1^1, \dots, Q_1^K$; $Q_C^1, \dots, Q_C^K$ | Contraceptive methods supplied per client, average by method ($Q_1, \dots, Q_C$) and by method and client type |
| $R = (R_1, \dots, R_S)$; $R_1^1, \dots, R_1^K$; $R_S^1, \dots, R_S^K$ | Other reproductive health services supplied per client, average by type of service ($R_1, \dots, R_S$) and by service and client type |

The cost function Ci(O,N,I,Q,R;w) is to be viewed as a summary of all social costs associated with the delivery of the set of services (O,N,I,Q,R). These costs must include the value of travel time and waiting time on the part of the N clients (Jonsson, 1985; Warner and Luce, 1982). Such time and travel costs must generally be imputed from information on a client's characteristics, such as her age, education, labor market experience, and the like. The distinction between the costs borne by clients and the costs tallied in program accounts is important, because different forms of program organization imply different divisions of social costs between the program and its clients. For example, a program that does little in the way of community outreach reduces its own administrative costs in comparison to more ambitious programs, but in so doing might increase total social costs. Without the guidance provided by outreach, potential clients might spend greater amounts of their time and resources in learning about and locating the program.
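The accounting described above can be illustrated with a minimal sketch. All class names, field names, and dollar figures below are hypothetical and chosen only for illustration; the point is simply that a program with lower budget outlays can still carry higher total social costs once volunteer labor, subsidies, and client time and travel are valued.

```python
from dataclasses import dataclass

@dataclass
class ProgramAccounts:
    budget_outlays: float    # costs that appear in the program's own accounts
    volunteer_hours: float   # unpaid labor, free in an accounting sense
    market_wage: float       # assumed market-price equivalent per volunteer hour
    subsidy_value: float     # social cost of subsidized services not billed to the program

@dataclass
class ClientBurden:
    n_clients: int
    hours_per_client: float  # travel plus waiting time per client
    value_of_time: float     # imputed hourly value of a client's time
    travel_outlay: float     # out-of-pocket travel cost per client

def total_social_cost(p: ProgramAccounts, c: ClientBurden) -> float:
    """Program-side costs at social prices plus costs borne by clients."""
    program_side = (p.budget_outlays
                    + p.volunteer_hours * p.market_wage
                    + p.subsidy_value)
    client_side = c.n_clients * (c.hours_per_client * c.value_of_time
                                 + c.travel_outlay)
    return program_side + client_side

# Hypothetical comparison: little outreach versus more outreach.
low_outreach = total_social_cost(
    ProgramAccounts(100000.0, 0.0, 0.0, 0.0),
    ClientBurden(1000, 3.0, 10.0, 5.0))
high_outreach = total_social_cost(
    ProgramAccounts(110000.0, 0.0, 0.0, 0.0),
    ClientBurden(1000, 1.5, 10.0, 5.0))
print(low_outreach, high_outreach)
```

In this made-up case the low-outreach program spends less from its own budget, yet its clients lose enough additional time that its total social cost is the higher of the two.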
It is obviously misleading to compare the total costs C1(O1,N1,I1,Q1,R1;w1) for one program, with its own particular mix and level of services, to the total costs C2(O2,N2,I2,Q2,R2;w2) of another. Neither is it defensible to focus attention on only one dimension of services (say, the provision of contraceptive methods Q) and to interpret the relative costs of the two programs in light of this dimension alone. Perhaps it goes without saying that output mix and scale must matter a great deal to program costs; yet the family planning literature is full of misleading statements about costs that ignore both mix and scale.18 Likewise, it is inappropriate to make cost comparisons without recognizing that programs face different prices w for their inputs, as would be the case when one program operates in an urban area and another in a rural area.19 But by estimating cost functions, through which the implications of different service levels, service mixes, and input prices can be explored, one can lay down a proper foundation for meaningful cross-program comparisons.
As Table G-1 suggests, it may be useful to further disaggregate the service dimensions (O,N,I,Q,R) according to the socioeconomic characteristics of the clients being served and by type of services delivered (e.g., types of contraceptive methods) within each broad category. The various population subgroups (as indexed by the superscript k) could be defined on the basis of age and socioeconomic status, and perhaps additionally by the expressed motivation (spacing, stopping) given for the family planning visit.
This disaggregation across subgroups serves several purposes. By indicating who the recipients of services are, it permits the equity dimension of service delivery to be examined. It also allows for differences in the costs of providing service to certain population subgroups, who may (for example) require greater outreach effort, counseling, or care.20 It allows for meaningful imputation of the time and travel costs borne by clients. Disaggregation also provides the information required to convert data on the supply of contraceptives of various types into more refined measures of protection against unintended pregnancy.
Translating Service Statistics into Measures of Contraceptive Protection
A long-standing problem in the evaluation of family planning programs concerns the link between program service statistics, which are represented in this document by (O,N,I,Q,R), and protection from unintended pregnancy. The essence of the problem is this: contraceptive methods are heterogeneous, so that a way must be found to aggregate across methods; and individuals themselves make use of any given method with different degrees of effectiveness, and will discontinue use as wantedness status changes. Protection from unintended pregnancy thus depends on both the method that is used and the characteristics of the individual user.
In progressing from service statistics to more refined measures of final services, two issues require discussion: (1) whether and how to adjust contraceptive failure rates for differences in program-supplied information and client characteristics; and (2) how to adjust for variations over time in wantedness status, another characteristic of clients. Curiously, the family planning literature has been more concerned with the first of these issues than with the second.21
To understand the difficulties, consider Figure G-3. Imagine that a woman begins in "state 1," defined as not pregnant and desiring not to conceive. In this condition, she contacts or begins her participation in a program. Upon contacting the program, she may decide to use no contraceptive method, or she may accept an allotment of s months of method type c. If the method adopted is sterilization, for instance, then s represents the length of time remaining in the woman's reproductive career. If the method is Norplant, s represents 5 years of coverage. For the pill, s corresponds to the length of the prescription issued.22
Newly armed with this length and type of contraceptive protection, the woman may then enjoy s full periods of protection from unintended pregnancy (a path represented in Figure G-3 by the long solid line) or she may experience a contraceptive failure and become pregnant (this is represented by the dashed lines showing a transition at a1 from state 1 to state 3) or she may have a change of heart in respect to wantedness, discontinue the method, and hope to conceive (a path indicated by the dashed lines showing a transition at a2 to state 2). Three transition rates govern the movements among states: r12 and r21 determine transitions between the wantedness statuses, and r13 represents the method failure rate. All three transition rates depend implicitly on the woman's characteristics (her subgroup k), and r13 may also depend on the contraceptive counseling that has been supplied by the program.
When a program supplies a woman with method c in allotment s, it provides her with an expected span of protection from unintended pregnancy. The ex ante value of the service she receives is indexed, in part, by the expected length of time $E_1(c,s;k,I_c^k)$ that the woman will spend in state 1. This quantity $E_1$ is, in general, less than s; how much less depends on the effectiveness of the method and the variability of conception desires. If method c is difficult to remove or reverse, then in addition to $E_1$ one must also consider the expected span of time $E_2(c,s;k,I_c^k)$ over which the woman finds herself in state 2, wanting to conceive but being prevented (at least temporarily) from doing so. The quantities $E_1$ and $E_2$ may be calculated by increment-decrement life tables or related methods.23
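A minimal sketch of such a calculation follows. It is not a full increment-decrement life table: it assumes discrete monthly steps, constant monthly transition probabilities r12, r21, and r13, a woman who starts in state 1, and an absorbing state 3. It also assumes that a woman in state 2 cannot conceive during the allotment, which fits a method that is hard to reverse. The function name and the example rates are hypothetical.

```python
def expected_months(s, r12, r21, r13):
    """Expected months spent in state 1 (E1) and state 2 (E2) over s months.

    State 1: not pregnant, does not wish to conceive.
    State 2: wishes to conceive.
    State 3: unintended pregnancy (absorbing, entered only from state 1).
    Occupancy is counted at the start of each month.
    """
    if min(r12, r21, r13) < 0 or r12 + r13 > 1 or r21 > 1:
        raise ValueError("rates must be probabilities with r12 + r13 <= 1")
    p1, p2 = 1.0, 0.0          # the woman begins in state 1
    e1 = e2 = 0.0
    for _ in range(s):
        e1 += p1
        e2 += p2
        # Update both state probabilities simultaneously for the next month.
        p1, p2 = (p1 * (1.0 - r12 - r13) + p2 * r21,
                  p1 * r12 + p2 * (1.0 - r21))
    return e1, e2

# Hypothetical 60-month allotment with illustrative monthly rates.
e1, e2 = expected_months(s=60, r12=0.01, r21=0.02, r13=0.002)
print(round(e1, 2), round(e2, 2))
```

With all three rates set to zero the function returns the full allotment s as E1, which is the benchmark against which the shortfall can be read.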
Given values of $E_1$ and $E_2$ for each method allotment (c,s), client type k, and program counseling strategy $I_c^k$, total services provided are calculated by summing $E_1$ and $E_2$ over these categories. What results is a pair of indicators of total services supplied ($E_1$, $E_2$). The $E_1$ dimension, representing total expected protection against unintended pregnancy, should be positively valued from a social point of view, whereas the $E_2$ dimension, representing total constraints on desired conception, should be negatively valued.
As the discussion suggests, rather detailed data are required to undertake calculations of E1 and E2. Given this, it may be useful to view E1 and E2 as conceptual ideals against which current standard practice can be measured.
Perhaps the most common approach now found in the family planning evaluation literature is the use of couple-years of protection, or CYP. This approach is based on two assumptions: that wantedness status is fixed (i.e., r12=0), which removes the E2 dimension; and that method failure rates (r13) can be set equal to their theoretical values, unaffected either by client type or by the information provided by the program. Under these strong assumptions, the calculation of E1 then becomes a trivial matter.
An alternative and better-justified measure, but one that is more demanding of data, is that of use-effectiveness. The use-effectiveness measure retains the assumption of fixed wantedness status, but allows contraceptive failure rates r13 to depend on the characteristics of the client served (her subgroup k, in this document) and permits the failure rates to vary (at least in principle) with the information provided by the program. Forrest and Singh (1990) give a good illustration of the use-effectiveness approach in family planning evaluation, whereas Shelton (1991) defends the CYP approach on practical grounds. As just noted, neither of these conventional approaches deals with the variability of wantedness status. Hence, neither approach can be fully justified when applied to programs that supply irreversible methods or methods that are difficult to remove.
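The contrast between the two conventional measures can be sketched as follows, taking the characterization given above at face value: both hold wantedness fixed (r12 = 0), and they differ only in the failure rate applied. The monthly failure rates, method names, and subgroup labels are hypothetical placeholders, not estimates from the literature, and the sketch does not reproduce the conversion factors used in applied CYP work.

```python
def protected_months(s, monthly_failure):
    """Expected months in state 1 when wantedness is fixed (r12 = 0)."""
    return sum((1.0 - monthly_failure) ** t for t in range(s))

# Hypothetical monthly failure rates, for illustration only.
THEORETICAL_RATE = {"pill": 0.001}
USE_RATE = {("pill", "teen"): 0.010, ("pill", "adult"): 0.004}

def cyp_style(method, s):
    """Years of protection with the theoretical failure rate for every client."""
    return protected_months(s, THEORETICAL_RATE[method]) / 12.0

def use_effectiveness_style(method, s, subgroup):
    """Years of protection with a failure rate specific to the client subgroup."""
    return protected_months(s, USE_RATE[(method, subgroup)]) / 12.0

print(cyp_style("pill", 12))
print(use_effectiveness_style("pill", 12, "teen"))
```

Because the subgroup rates here exceed the theoretical rate, the use-effectiveness figure is the smaller of the two; neither function allows wantedness to change.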
Predicting Net Program Effects Without Individual Data
Cost-effectiveness is inherently a relative concept: a program or health system is said to be cost-effective in relation to an alternative program or system. In practice, however, one may not have access to the cost and service level information required to compare two distinct systems or programs. Evaluation must then proceed on a hypothetical basis. The conceptual experiment is to imagine a health system much like the current system, except that the program of interest has been removed, scaled back, or otherwise altered. What would be the service level and mix x, and total social costs C(x), in this hypothetical world?
It is argued above that the availability of detailed data on individuals, and an appropriate statistical model, would permit such questions to be addressed. But how should evaluation proceed if these ideal data are unavailable? The family planning literature presents no consensus, and offers surprisingly little guidance, on this central evaluation question. Essentially, one must set out different assumptions about individual choices that might be made in the absence of the program (or under an altered version of the program) and summarize these assumptions in a set of probabilities about contraceptive use and method choice.
Let $p_i^k$ represent the probability that, in the absence of the program being evaluated, an individual woman (in socioeconomic subgroup k) would have chosen contraceptive method i. One should allow for the possibility that no method (i=0) might have been used. With all else held constant, her predicted $E_1$ becomes

$$\hat{E}_1 = \sum_{i} p_i^k \, E_1(i, \bar{s}_i; k, \bar{I}_i^k),$$

where the bars over method allotments $s_i$ and information $I_i^k$ represent assumptions about the hypothetical values that these quantities assume in the absence of the program. An analogous expression can be specified for the expected or predicted value of $E_2$ in the absence of the program. Couple-years of protection (CYP) or use-effectiveness can also be predicted in this manner.
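As a rough numerical illustration of this prediction, the sketch below weights method-specific values of E1 by assumed choice probabilities and subtracts the result from the E1 observed with the program. The function name, the method labels, and all numbers are hypothetical.

```python
def predicted_e1(choice_probs, e1_by_method):
    """Probability-weighted E1 in the hypothetical absence of the program.

    choice_probs: method -> probability the woman would have chosen it
                  (the key "none" stands for i = 0, no method).
    e1_by_method: method -> E1 under the assumed allotment and information.
    """
    if abs(sum(choice_probs.values()) - 1.0) > 1e-9:
        raise ValueError("choice probabilities must sum to one")
    return sum(p * e1_by_method[m] for m, p in choice_probs.items())

# Hypothetical inputs for one client in subgroup k (months of protection).
probs_without_program = {"none": 0.5, "pill": 0.5}
e1_without_program = {"none": 2.0, "pill": 6.0}
e1_with_program = 5.5

counterfactual = predicted_e1(probs_without_program, e1_without_program)
net_effect = e1_with_program - counterfactual
print(counterfactual, net_effect)
```

The net effect is only as credible as the assumed probabilities, which is why the choice of those probabilities matters so much.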
Evidently, then, the prediction issue can be reduced to the set of assumptions required to motivate a particular configuration of choice probabilities $p_i^k$. Some authors (e.g., Fitzgibbons, 1993) propose that information on a woman's method choices prior to her contact with the program be used to define $p_i^k$. The prescription is as follows. If the woman had previously used method c, then set $p_c^k = 1$ and set the probabilities associated with all other methods to zero. If the woman had used no method, then set $p_0^k = 1$. This approach ignores the woman's level of motivation, that is, the factors that brought her to seek out the program in question. If that program had not existed, would she not have found other, non-program sources of services? Why is it reasonable to assume that the woman simply would have gone on doing what she was doing?
An attractive alternative in selecting a set of $p_i^k$, suggested by Forrest and Singh (1990) and used by them in an evaluation, is to employ the distribution of use and method choices that characterizes women who do not access the program in question. Forrest and Singh make a statistical adjustment to account for differences in social and economic characteristics among those who used the program (e.g., Title X clinics) as compared to the women who relied on non-program sources. The statistical adjustments described by Forrest and Singh do not take client self-selection into account, owing to a lack of data, but in other respects represent a reasonable compromise.24
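The two ways of setting the choice probabilities can be contrasted in a short sketch. The first function encodes the prior-use prescription; the second takes the method distribution among women who do not use the program, stratified by subgroup. Simple stratification is used here only as a crude stand-in and is not the statistical adjustment that Forrest and Singh actually apply. All names and records are hypothetical.

```python
from collections import Counter

METHODS = ["none", "pill", "condom"]

def probs_from_prior_use(prior_method, methods=METHODS):
    """Prior-use rule: probability one on whatever the woman used before."""
    return {m: 1.0 if m == prior_method else 0.0 for m in methods}

def probs_from_comparison_group(records, subgroup, methods=METHODS):
    """Method shares among non-program women in the same subgroup.

    records: iterable of (subgroup, method) pairs for women who rely
             on non-program sources (hypothetical survey data).
    """
    counts = Counter(m for k, m in records if k == subgroup)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no comparison women in this subgroup")
    return {m: counts[m] / total for m in methods}

# Hypothetical comparison-group records.
non_program_women = [("k1", "pill"), ("k1", "none"), ("k1", "none"),
                     ("k1", "condom"), ("k2", "pill")]

print(probs_from_prior_use("none"))
print(probs_from_comparison_group(non_program_women, "k1"))
```

The prior-use rule always returns a degenerate distribution, whereas the comparison-group rule spreads probability across methods; neither corrects for self-selection into the program.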
Estimating the Cost Functions
Given all that has been said above, how can cost functions be estimated and used to establish cost-effectiveness of programs? Remarkably little guidance on this question can be found in the family planning literature. Within the broader field of health economics, however, numerous studies exist that employ cost functions, most of these studies being focused on the estimation of cost functions for hospitals. Cowing, Holtmann, and Powers (1983) provide a survey of the literature, and Cowing and Holtmann (1983), Grannemann, Brown, and Pauly (1986), and Nyman and Dowd (1991) present empirical applications. Much of this literature is concerned with the cost implications associated with the delivery of a range of services. The methods are therefore of considerable relevance to the evaluation of family planning service delivery, which is also characterized by multiple service dimensions.
Cost functions are estimated from a base of cost and services data covering either a number of similar programs operating in different environments or a given program observed over time. Each data point provides information on total social costs C and the level and mix of services provided, whether these are expressed in terms of service statistics (O,N,I,Q,R) or, for the contraceptive component of services, in more refined measures of services such as E1 and E2 discussed above. Input prices w should also be available for each data point. Finally, each program under consideration should be classified according to its organizational type. In their studies of hospitals, for example, Cowing and Holtmann (1983), Grannemann, Brown, and Pauly (1986), and Nyman and Dowd (1991) specify cost functions that incorporate shift factors representing different forms of organization, as for example, not-for-profit versus proprietary hospitals.
With such data in hand, one then selects a functional form for the cost function Ci(O,N,I,Q,R;w). At this stage a decision needs to be made about how to aggregate services within the broad categories of outreach, information, provision of methods, and provision of other reproductive health services. Cowing, Holtmann, and Powers (1983) discuss the formal criteria that justify service aggregation, and of course data availability must also be taken into account. As a general rule, the less aggregation applied to the services data, the better; but some compromise is inevitable. Nyman and Dowd (1991) have estimated a cost function having as many as seven dimensions of service delivery. In principle, at least, even more dimensions could be incorporated, up to the limits imposed by the data.25
The estimation procedure takes the form of non-linear regression, with total costs (or the natural log of costs) being the dependent variable and with the set of explanatory variables defined in terms of levels of services and input prices. For example, suppose that a homothetic functional form is selected,

$$C_i(O,N,I,Q,R;w) = e^{\beta_i D_i} \, Q(O,N,I,Q,R) \, F(w),$$

where $D_i$ is a dummy variable representing program type and the sign of the coefficient $\beta_i$ indicates whether, with all else constant, this type of program is associated with greater or lesser costs. A specification such as the above would typically be estimated in log form:

$$\ln C_i = \beta_i D_i + \ln Q(O,N,I,Q,R) + \ln F(w),$$

and non-linearity arises through the functions of services Q and input prices F on the right-hand side of the equation. If the estimated $\beta_i > 0$, this indicates that a program of type i is cost-inefficient relative to the benchmark program.
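To make the estimation step concrete, the sketch below fits a log-form cost function with a program-type dummy by ordinary least squares. It rests on strong simplifying assumptions that are not in the text: services are collapsed to client counts N alone, the service and price functions are taken to be log-linear (so the regression is linear in logs rather than non-linear), and the data are synthetic and noise-free, generated from known coefficients so that the fit can be checked. All names and numbers are hypothetical.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(x_rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)

def regressors(d, n_clients, wage):
    """Row of [constant, type dummy D, ln N, ln w] for one program."""
    return [1.0, float(d), math.log(n_clients), math.log(wage)]

# Synthetic programs: (type dummy, clients, wage). Purely illustrative.
programs = [(0, 100, 10), (0, 200, 12), (0, 400, 15), (1, 150, 10),
            (1, 300, 14), (1, 500, 11), (0, 250, 9), (1, 120, 16)]
true_coefs = [2.0, 0.3, 0.8, 0.5]   # intercept, beta, ln N, ln w
X = [regressors(*p) for p in programs]
ln_cost = [sum(c * x for c, x in zip(true_coefs, row)) for row in X]

beta_hat = ols(X, ln_cost)
print([round(b, 6) for b in beta_hat])
```

Because the synthetic costs contain no noise, the estimates reproduce the generating coefficients; the positive dummy coefficient is the quantity that would be read as a cost disadvantage for that program type.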
How can this approach be employed to assess the cost-effectiveness of a new program, on which perhaps only one data point is available? If an estimated cost function is available for a range of already established programs, then the cost-effectiveness of the new program can be evaluated using the estimated function as a benchmark. In other words, by substituting into a previously estimated cost function the service levels (O,N,I,Q,R) for the new program, and the input prices w faced by that program, one can derive the predicted costs of the new program. These predicted costs can be compared to the actual costs exhibited by the new program. Although no strong conclusions about cost-effectiveness can be drawn on the basis of a single data point, a divergence of predicted costs from actual costs may nevertheless prove informative.
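The benchmarking step might look like the following sketch, which assumes a previously estimated log-linear cost function in client counts and wages only. The coefficient values, the dictionary keys, and the new program's figures are all hypothetical.

```python
import math

def predicted_cost(coefs, n_clients, wage):
    """Predicted total cost from a log-linear benchmark cost function."""
    ln_c = (coefs["intercept"]
            + coefs["ln_clients"] * math.log(n_clients)
            + coefs["ln_wage"] * math.log(wage))
    return math.exp(ln_c)

def cost_ratio(actual, predicted):
    """Actual cost relative to the benchmark; above one means costlier than predicted."""
    return actual / predicted

# Hypothetical benchmark coefficients and a single observation on a new program.
benchmark = {"intercept": 2.0, "ln_clients": 0.8, "ln_wage": 0.5}
new_program = {"n_clients": 300, "wage": 12.0, "actual_cost": 2500.0}

benchmark_cost = predicted_cost(benchmark, new_program["n_clients"], new_program["wage"])
ratio = cost_ratio(new_program["actual_cost"], benchmark_cost)
print(round(benchmark_cost, 2), round(ratio, 3))
```

A ratio away from one is suggestive rather than conclusive, since a single observation cannot separate inefficiency from noise or from unmeasured service dimensions.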
To sum up, the assessment of program cost-effectiveness is a demanding task, particularly so in respect to the data that are required to support a rigorous analysis. If the data are not available, evaluation can proceed only on an informal basis, by invoking strong assumptions on the nature of the cost functions. Much of the cost-effectiveness literature in family planning has rested on two exceedingly strong yet rarely scrutinized assumptions: (1) that the multiple dimensions of output can somehow be collapsed into a single output indicator; and (2) that average costs, defined as total costs divided by (composite) output, are constant over the range of output. If both conditions are met, then a single observation on average costs can provide a basis for program comparisons. But in the absence of supporting evidence—the committee finds none in the literature—these strong assumptions are not well justified and may be misleading as a guide to policy.
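The danger in the constant-average-cost assumption can be seen in a small worked example. It assumes, purely for illustration, a cost function with a fixed component and a constant marginal cost; the figures are invented.

```python
def average_cost(fixed_cost, marginal_cost, output):
    """Average cost when total cost is fixed_cost + marginal_cost * output."""
    if output <= 0:
        raise ValueError("output must be positive")
    return (fixed_cost + marginal_cost * output) / output

# Two hypothetical programs with the SAME cost function, observed at different scales.
small = average_cost(fixed_cost=50000.0, marginal_cost=20.0, output=100)
large = average_cost(fixed_cost=50000.0, marginal_cost=20.0, output=1000)
print(small, large)
```

The two programs are equally efficient by construction, yet a comparison of single average-cost observations would make the larger one look far cheaper per unit of output.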
This discussion and analysis has attempted to provide an introduction or guide to the evaluation literature on effectiveness and cost-effectiveness, with emphasis on those issues which are central to family planning programs. One of the more alarming facts about the state of family planning evaluation in the United States is the scarcity of research, and the degree to which techniques that have become standard practice in other fields have yet to be applied. Some of these are difficult techniques, to be sure, and results based on them lack the feel of certainty that attaches to randomized-experiments research. Nevertheless, their application to family planning evaluation is long overdue.
Corman H, Joyce T, Grossman M. Birth outcome production function in the United States. J Hum Resour. 1987;22:339–360.
Cowing T, Holtmann A. Multiproduct short-run hospital cost functions: Empirical evidence and policy implications from cross-section data. Southern Econ J. 1983;49:637–653.
Cowing T, Holtmann A, Powers S. Hospital cost analysis: A survey and evaluation of recent studies. Adv Health Econ Health Serv Res. 1983;4:257–303.
Donovan P. Family planning clinics: Facing higher costs and sicker patients. Fam Plann Perspect. 1991;23:198–203.
Enke S. The gains to India from population control: Some money measures and incentive schemes. Rev Econ Stat. 1960;42:175–181.
Fitzgibbons E. Benefit: Cost Analysis of Family Planning in Washington State. Unpublished Master's thesis. University of Washington; 1993.
Forrest J, Singh S. Public-sector savings resulting from expenditures for contraceptive services. Fam Plann Perspect. 1990;22:6–15.
Foster R. Identifying experimental program effects with confounding price changes and selection bias. J Hum Resour. 1989;24:253–279.
Frank R, Strobino D, Salkever D, Jackson C. Updated estimates of the impact of prenatal care on birthweight outcomes by race. J Hum Resour. 1992;27:629–642.
Grannemann T, Brown R, Pauly M. Estimating hospital costs: A multiple-output analysis. J Health Econ. 1986;5:107–127.
Hausman J, Wise D. Technical problems in social experimentation: Cost versus ease of analysis. In Social Experimentation. Hausman J, Wise D, eds. Chicago, IL: The University of Chicago Press; 1985.
Heckman J, Hotz VJ. Choosing among alternative nonexperimental methods for estimating the impact of social programs: The case of manpower training. J Am Stat Assoc. 1989;84:862–880.
Huntington J, Connell F. "For every dollar spent …" The cost-savings argument for prenatal care. Department of Health Services, University of Washington; no date.
Janowitz B, Bratt J. Costs of family planning services: A critique of the literature. Int Fam Plann Perspect. 1992;18:137–144.
Jensen E. Cost-effectiveness and financial sustainability in family planning operations research. In Operations Research: Helping Family Planning Programs Work Better. Seidman M, Horn M, eds. New York, NY: Wiley-Liss; 1991.
Jonsson B. The value of prevention: Economic aspects. In The Value of Preventive Medicine. London, England: Pitman. (Ciba Foundation symposium 110); 1985.
Joyce T, Corman H, Grossman M. A cost-effectiveness analysis of strategies to reduce infant mortality. Med Care. 1988;26:348–360.
Kenney G, Lewis M. Cost analysis in family planning: Operations Research programs and beyond. In Operations Research: Helping Family Planning Programs Work Better. Seidman M, Horn M, eds. New York, NY: Wiley-Liss; 1991.
Kristein M. Using cost-effectiveness and cost-benefit analysis for health care policymaking. Adv Health Econ Health Serv Res. 1983;4:199–224.
Ku L. Financing of family planning services. In Publicly Supported Family Planning in the United States. Washington, DC: The Urban Institute and Child Trends, Inc.; 1993.
Levey L, Nyman J, Haugaard J. A benefit-cost analysis of family planning services in Iowa. Eval Health Prof. 1988;11:403–424.
Levine R, Tsoflias L. Use in the 1980s. In Publicly Supported Family Planning in the United States. Washington, DC: The Urban Institute and Child Trends, Inc.; 1993.
Long D. Analyzing social program production: An assessment of Supported Work for Youths. J Hum Resour. 1988;22:551–562.
Maddala GS. A survey of the literature on selectivity bias as it pertains to health care markets. Adv Health Econ Health Serv Res. 1985;6:3–18.
Meier K, McFarlane D. State family planning and abortion expenditures: Their effect on public health. Am J Public Health. 1994;84:1468–1472.
Montgomery M. Dynamic behavioral models and contraceptive use. Dynamics of Contraceptive Use. J Biosoc Sci Suppl No. 11. 1989;17–40.
National Research Council. Risking the Future: Adolescent Sexuality, Pregnancy, and Childbearing. Vol I. Hayes C. (ed.). Washington, DC: National Academy Press; 1987.
Nyman J, Dowd B. Cost function analysis of Medicare policy: Are reimbursement limits for rural home health agencies sufficient? J Health Econ. 1991;10:313–327.
Shelton J. What's wrong with CYP? Stud Fam Plann. 1991;22:332–335.
Trussell J, Kost K. Contraceptive failure in the United States: A critical review of the literature. Stud Fam Plann. 1987;18:237–283.
Tsui A, Herbertson M, eds. Dynamics of Contraceptive Use. J Biosoc Sci Suppl No. 11. 1989.
United Nations. Manual IX: The methodology of measuring the impact of family planning programs on fertility. Popul Stud. No. 66. New York, NY: United Nations. 1979.
Vincent M, Lepro E, Baker S, Garvey D. Projected public sector savings in a teen pregnancy prevention project. J Health Educ. 1991;22:208–212.
Warner K, Luce B. Cost-Benefit and Cost-Effectiveness Analysis in Health Care. Ann Arbor, MI: Health Administration Press; 1982.
World Bank. World Development Report, 1993: Investing in Health. Washington, DC: The World Bank; 1993.