G Assessing Program Effectiveness and Cost-Effectiveness: Supplement to Chapter 8

Mark R. Montgomery

Member, Committee on Unintended Pregnancy

This appendix discusses several principles of evaluation that can be applied to family planning programs. As the body of Chapter 8 makes clear, the formal tools of program evaluation have only occasionally been brought to bear on U.S. family planning programs. This is surprising, especially as there exists a lively literature on family planning programs in developing countries, and many of the issues are precisely the same, although the contexts in which these issues emerge are, of course, quite different.

The evaluation concepts discussed in this appendix can be distinguished according to the objectives of the evaluation and the specific tools that are applied in the course of the evaluation. Although other forms of evaluation exist, this appendix concentrates on the evaluation of program outcomes. A fundamental objective is to determine whether a given program is effective, that is, whether it exerts a measurable influence on the outcome or outcomes of interest. A program's effectiveness can be determined through the use of experimental methods where these are feasible, but more commonly, effectiveness must be assessed by an application of statistical methods to non-experimental data. The first main section below describes what is involved in establishing program effectiveness with such non-experimental data. In this area, a good deal can be learned from the literature on family planning programs in developing countries.

Once given a set of programs whose effectiveness has been demonstrated, a second evaluation objective then comes into play: to determine their cost-effectiveness. Cost-effectiveness is a difficult and much-misunderstood concept in program evaluation. It is a relative concept, in that it involves a comparison between at least two programs that achieve the same level of effectiveness or
"output." To assess cost-effectiveness, one asks which of the two programs achieves this output at lower total social cost; it is this program that can be described as "cost-effective." In practice, formidable conceptual and empirical issues confront even the simplest of cost-effectiveness analyses, particularly when unit costs vary with the level of output or when outcomes have multiple dimensions that must be considered. These issues are discussed in the second main section below.

Only the most rudimentary forms of cost-effectiveness analysis have been applied to family planning programs, whether in the United States or in developing countries. As is made clear in what follows, perhaps the most informative literature on these issues is that concerned with health care costs in the United States, also reviewed in the second main section.

Finally, it is important to mention yet an additional objective of evaluation that arises in some circumstances: to determine the net social benefits generated by a particular program. This is termed a cost-benefit analysis. Cost-benefit analysis requires a means of translating outcome measures, such as pregnancies averted, into a single metric or standard that permits comparisons among quite different programs operating, perhaps, in quite different sectors. Even if attention is confined to the health sector, controversy must inevitably accompany any effort to force very dissimilar outcomes into a common metric or index for evaluation. A well-known recent attempt is that of the World Bank (1993), which sought in the concept of DALYS (disability-adjusted life-year saved) an index for assessing a great range of programs in developing-country health care.1 Such approaches are commented on only briefly below.2

1   The DALYS index relies on a combination of subjective rankings and empirical data to represent the degree of health disability associated with various illnesses and conditions. If these rankings are accepted, one has a means of comparing the benefits from investing in (for example) family planning programs with those derived from investments in malaria control.

2   As discussed in the body of Chapter 8, the term cost-benefit analysis is sometimes misapplied in reference to the net effects of public investments in family planning service delivery for public budgets in health and social service sectors in general. Levey et al. (1988) term this a "taxpayer's benefit-cost" perspective. A taxpayer's benefit-cost analysis asks whether public expenditures are likely to be reduced, on net, by public funding of family planning services. This perspective frames the evaluation issue very narrowly, being concerned only with the impact of one form of public expenditure on another form. There is no clear or necessary relationship between the claims that programs make on government budgets and their cost-effectiveness or social desirability. Thus, the terms "benefit" and "cost" that appear in a taxpayer's benefit-cost analysis bear no obvious correspondence to social benefits and costs.

Assessing Program Effectiveness

This section discusses some of the issues regarding evaluation of program outcomes when no randomized, experimental data are available. It is assumed that individuals have access, to a greater or lesser degree, to a variety of contraceptive methods supplied by the private market as well as by various programs. The focus of evaluation is on the net contribution made by one or perhaps a set of these programs. The outcome of principal interest is the degree of protection against unintended pregnancy.

This analysis begins by considering the issues from the viewpoint of a representative woman. Figure G-1 depicts the environment in which her decisions regarding contraception are made, and shows the potential ramifications of these decisions over the near term. (The highlighted boxes represent features of particular interest to this report.) The figure is meant to represent one period in the sequence of decision periods that make up a woman's reproductive life cycle. One may think of this decision period as being as short as a month, although the consequences of decisions made within that period are played out over the ensuing nine months and beyond (the later events are not shown). The individual woman entering this decision period is assumed not to be pregnant; she may or may not be sexually active, and may or may not believe herself to be capable of conceiving.

The information and constraints bearing on her contraceptive choices encompass a range of factors that have been discussed in several chapters of this report. Here only a few of the major dimensions are indicated. The woman's age (a), her current marital status, and her level of current income or poverty status (as denoted by y) are important, not only because of the obvious connections to contraceptive motivation, but also because of the fact that in the United States, access to service subsidies is conditioned on income. Under the current system of public funding, a woman's income relative to the poverty line determines the fee schedule that she faces for services in Title X clinics and establishes her rights to free services through Medicaid.3

3   See Ku (1993) and Levine and Tsoflias (1993:50–54) for reviews of Title X and Medicaid funding and a discussion of typical fee schedules for women at various percentiles above and below the federal poverty line.

In choosing among alternative providers, a woman will rely on the information available to her regarding the characteristics X_priv of private physicians or other non-program sources of contraceptive methods and the out-of-pocket prices of these methods p_priv(y), the latter of which will vary according to her current income; likewise, she will take into account the characteristics X_clinic and prices p_clinic(y) of clinic or program sources. Included in X_priv and X_clinic
are the time and money costs that a woman faces to gain access to services.4 Another factor of potential significance is whether private insurance is available to the woman and, if so, whether it covers contraceptive and pregnancy-related services.5 The woman's current knowledge about methods, which may have been derived from previous method use on her part and from program and non-program counseling, will certainly affect her choices among methods and may also influence her propensity to use any method. Finally, various features of the woman's attitudes and preferences are important, including her subjective expectations about future income opportunities, the likelihood, as she perceives it, of sexual activity in the decision period, her perceived ability to conceive, her skill and self-confidence in negotiations with sexual partners, and (especially for teens) her attitudes toward risk-taking.

4   The X variables should also include indices of perceived quality of care; see Levine and Tsoflias (1993:33–34). Confidentiality of care is one aspect of quality that appears to be particularly important for adolescents; see National Research Council (1987:164–165). Note that access to care may be determined by the age and marital status of the woman, in that some private physicians will not accept unmarried teenage clients without parental consent (National Research Council, 1987:155–159). Nor will all private physicians accept Medicaid clients. See Ku (1993:32ff) for further discussion of Medicaid eligibility criteria in the context of family planning.

5   See Ku (1993:4–9).

Although the woman may have been informed by previous decisions she has made about contraception, she may also find herself constrained by these decisions. A constraint may arise if the woman has adopted methods such as sterilization, Norplant, or the IUD, which are either very difficult to reverse or costly (in money terms or otherwise) to remove.

All these factors, when taken in combination, express themselves in a woman's current wantedness status, that is, in her desire either to conceive within the decision period, or to defer conception to some later date, or to avert conception altogether.6 The wantedness categories are, in fact, little more than markers of the current intensity with which a woman wishes to avoid conception; no doubt there is great variability in desires within as well as between categories. The categories are nevertheless convenient in that they correspond to the data gathered in fertility surveys such as the NSFG.7

6   Women who currently wish to conceive, but are prevented from doing so by an earlier sterilization or the costs of removing a method, face a somewhat different set of decisions and consequences; these are not shown in Figure G-1.

7   Although the desire to avoid conception is something that is expressed in the current decision period, a woman's expectations about future events figure into this desire. Consider a woman who, at present, wishes to avert conception altogether. She envisions the remainder of her reproductive career and can imagine no realistic scenario in which she would desire to conceive. A woman who simply wishes to defer pregnancy, on the other hand, has in mind some future decision period in which she is likely to want to conceive. These different expectations regarding the future affect the intensity of current desires to avoid conception. See Montgomery (1989) for further discussion of such dynamic and life-cycle issues.

The desire to avoid conception is acted upon through adoption or continued use8 of a contraceptive method. For reasons that remain poorly understood (see Chapter 4), not all women who wish to delay or avert pregnancy make use of a contraceptive method.9 This disjuncture between expressed preferences and behavior has been termed a "KAP-gap." Women who act on their stated preferences may obtain contraceptive methods either from program sources (mainly family planning clinics) or from non-program sources (mainly private physicians in the case of prescription methods, and pharmacies for non-prescription methods such as condoms and foam). Users of contraception face some risk of conception, as do non-users, although of course method users enjoy a greater degree of protection. A woman in either category is at risk of unintended pregnancy, whether mistimed or unwanted, and this may result in a miscarriage, an abortion, or an unintended birth.

8   The informational costs, and perhaps the money costs, associated with a woman's first use of a given method will differ from the costs of continued use of the method. The costs of adoption, continuation, and method switching are undoubtedly important in contraceptive use dynamics. See the issue of the Journal of Biosocial Sciences edited by A. Tsui and M. Herbertson (1989) for further discussion.

9   According to Levine and Tsoflias (1993:ii,22), some 37 percent of poor women (with income below 150 percent of the poverty line) who were sexually active, believed themselves to be fertile, were not pregnant, and did not want to become pregnant nevertheless used no contraceptive method. Even among better-off women (with incomes above 150 percent of the poverty line) in these circumstances, 22 percent did not use a method.

A reproductive career may be viewed as a sequence of decision periods like the one sketched in Figure G-1, in which the decisions and outcomes in one period contribute to the information, and add to the set of constraints, that form the basis for decisions in the next period. For the purpose of program evaluation, it is important to realize that many elements in the figure are variable, and some highly variable, over a reproductive career. Income y may fluctuate above and below the poverty line, so that the expected length of a spell in poverty and other aspects of poverty dynamics must be taken into consideration.10 Other elements, including marital status and expectations about future income, also vary over the life cycle. This variation in what is termed information and constraints may then influence wantedness status.

10   See Levine and Tsoflias (1993:15).

Depending on circumstances, a woman may switch among categories of
wantedness over time. Hence, her wantedness status at a point in time cannot be taken as a constant or be interpreted as fully predictive of her status in the future.

The individual perspective adopted above is useful in clarifying the net effects of family planning programs on individual contraceptive use. Consider the following hypothetical but instructive situations. A user of a given program might, in its absence, have obtained her information and methods elsewhere, without there being any net impact on her contraceptive use. Conversely, a well-run program might attract a woman who had formerly relied upon other programs or on private sources, again with no net change in use. An increase in one program's out-of-pocket fees for methods or services might either reduce overall contraceptive method use, or induce substitution among methods, or encourage substitution of non-program sources for program sources. The information and counseling provided by a family planning clinic at one point in time might influence a woman's contraceptive choices at some later date; but the change in her behavior, although stimulated by the program, might not ever be expressed in the choice of a method supplied by that program. All this suggests the difficulty of determining the net effectiveness of any given program in an environment where a number of program and non-program sources are operating.

Consider now the upper part of Figure G-2, which depicts the reproductive career of an individual woman. In the course of this reproductive career, she might have come into contact with a number of different program and non-program sources. As of a given point in time (say, at the time of a demographic survey), the woman will have accumulated information about contraceptive methods and related aspects of reproductive health, will have enjoyed a certain amount of cumulative protection from unintended pregnancy, and will currently use or not use a method. Each of the programs will have made some contribution to her accumulated knowledge, contraceptive protection, and current use. To isolate the net contribution of any given program, however, is clearly a difficult task, one that calls for retrospective or prospective data on program contacts and careful statistical modeling. The evaluation task is further complicated because of the aspect of self-selection in program contacts. For a general review of the statistical issues and a comparison of statistical approaches to alternatives using experimental or quasi-experimental designs, see Maddala (1985), Foster (1989), Heckman and Hotz (1989), and Hausman and Wise (1985).

The fundamental problem in establishing program effectiveness is to predict what the program clients might have done, had that program not existed, or had its characteristics been different from what they are. Such "what if?" or "counter-factual" questions can only be rigorously addressed through a statistical model of the factors that induce individual women to participate in the program. As Huntington and Connell (no date) have pointed out in another context, the
detail available in the empirical data and the statistical model brought to bear on these data are critical elements in producing defensible estimates of net effectiveness. The ideal data include information on the nature of both program and non-program sources in the areas in which these women live. Note that when access and fees for service are income-conditioned, as they are for Title X clinics and Medicaid services, predictions about net program effects depend on the price-responsiveness of demands for contraception among women of differing income levels.11 The committee is not aware of any detailed data concerning such issues for the United States, although there are data available for developing countries.

11   See Foster (1989) for an illustrative analysis dealing with the introduction of a new form of health clinic in a low income neighborhood.

To understand some of the issues, consider the following equations (G.1 and G.2), which represent, respectively, the motivation to use contraception and the choice of program versus non-program method sources. The first equation represents the woman's "wantedness" status, that is, the intensity of her motivation to use contraception:

[Equation G.1]

Here Y*_{i,j} represents the intensity of motivation on the part of woman i living in area j. The Z_{i,j} are her own individual characteristics (age, income, and the like). Program (X_clinic, p_clinic) and non-program (here denoted by X_priv and p_priv) characteristics, or more precisely, the woman's knowledge of these program characteristics, also affect her motivation to use. For simplicity, assume that there is only one program and one non-program source in area j. There is also a place in the equation for certain unmeasured traits or constraints v_i of the woman that may affect her level of motivation.

Provided that motivation exceeds a certain threshold, say Y*_{i,j} > 0, the woman will seek out the source of contraceptive supply that, on the basis of her own knowledge, level of income, and the like, best meets her needs. This source may or may not be the program in question. The probability that she makes use of the program can be represented as

[Equation G.2]

which is a general function of the same set of variables as in the motivation equation. The program's own service statistics can be viewed as the summation of such probabilities across all women i in its geographic catchment area j.

From the viewpoint of evaluation, there are at least three difficulties brought to the fore by these equations. First, it is unusual for the evaluator to have any listing of all program and non-program sources in a given area, to say nothing of women's subjective knowledge about such sources. Second, it is equally unusual for the evaluator to know which program or programs the woman chose, unless it happens to have been the program under evaluation. Third, there is the problem of self-selection, by which is meant that there will always exist unmeasured variables, represented here by v_i, which affect both individual motivations to use and their program choices.
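The self-selection problem can be illustrated with a small simulation. This is only a sketch under stated assumptions: the functional forms of equations G.1 and G.2 are not shown in this text, so the linear motivation index, the logistic program-choice probability, every coefficient, and the names `simulate`, `logistic`, and `program_effect` are hypothetical choices made for illustration. The unmeasured trait `v` raises both the chance that a woman chooses the program and her motivation to use contraception, so a simple contrast between program users and non-users mixes the program's effect with the effect of `v`.

```python
import math
import random


def logistic(z):
    """Map a real-valued index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n=20000, program_effect=0.5, seed=1):
    """Compare a naive user/non-user contrast with the true average program effect.

    All coefficients are hypothetical; v plays the role of the unmeasured trait v_i.
    """
    rng = random.Random(seed)
    n_prog = used_prog = n_other = used_other = 0
    effect_sum = 0
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)  # unmeasured trait, unseen by the evaluator
        e = rng.gauss(0.0, 1.0)  # idiosyncratic shock to motivation
        # Women with higher v are more likely to choose the program (self-selection).
        in_program = rng.random() < logistic(-0.5 + v)
        # Latent motivation without and with the program; use occurs above the threshold 0.
        base = -0.2 + v + e
        uses_without = base > 0
        uses_with = base + program_effect > 0
        # Counterfactual difference for this woman (known only inside a simulation).
        effect_sum += int(uses_with) - int(uses_without)
        uses = uses_with if in_program else uses_without
        if in_program:
            n_prog += 1
            used_prog += int(uses)
        else:
            n_other += 1
            used_other += int(uses)
    naive = used_prog / n_prog - used_other / n_other
    true_effect = effect_sum / n
    return naive, true_effect


if __name__ == "__main__":
    naive, true_effect = simulate()
    print(f"naive contrast: {naive:.3f}, true average effect: {true_effect:.3f}")
```

In this setup the naive contrast exceeds the true average effect, because program users differ from non-users in `v` before the program does anything. A real evaluation cannot observe both counterfactual outcomes for the same woman, which is why a statistical model of program participation is needed.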

If a particular program is eliminated, or if its characteristics are altered, one must trace the net impact first on motivation (equation G.1), and then on the probability that the program is accessed (equation G.2). A proper assessment requires data on all potential sources of supply, and also requires a means of controlling statistically for unmeasured but confounding variables such as v_i.

To sum up this much-abbreviated discussion, it is no simple matter to document program effectiveness in a context of multiple programs and individual self-selection among programs. The evaluation literature can provide some guidance on these issues, but much depends on the availability of data at both the individual and the areal level. In view of these difficulties, some progress
can be made with aggregated areal-level data. For attempts at evaluation using aggregated areal data (in this case, data on large U.S. counties), see Joyce, Corman, and Grossman (1988); Corman, Joyce, and Grossman (1987); and Frank, Strobino, Salkever, and Jackson (1992). The recent Title X evaluation of Meier and McFarlane (1994) is of this type. These authors are frank about the limitations of aggregated data and the possibility of mistaken inference owing to ecological fallacies.

Assessing Cost-Effectiveness

The principles of cost-effectiveness come into play when one wishes to establish a ranking among alternative means of mobilizing resources to achieve a given objective x. Here x is defined principally in terms of protection from unintended pregnancy, but other dimensions will also prove to be of interest. The ranking is expressed in terms of the full social costs C(x) of the resources used to produce x, with these costs being denominated in money terms. "Social costs," in this context, means the value of these resources when they are put to their best alternative use.

To engage in a cost-effectiveness analysis, one must first define the objective x in some quantifiable and measurable way, consider at least two alternative programs or ways of achieving the objective, make an accounting of the associated social costs C_1(x) and C_2(x) of the two programs, and assign a higher ranking to the program having lower total social costs. Often the comparison between programs is implicit, with Program 1 representing the status quo and Program 2 a newly initiated alternative.

Cost-effectiveness analysis should be carefully distinguished from cost-benefit analysis. In the latter, a social value V(x) is attached to the objective x. As with the social costs, V(x) is expressed in money terms, and this makes possible a direct comparison of social costs to benefits. The level of x is said to be socially optimal at the point when marginal social costs equal marginal social benefits, or C'(x) = V'(x). The ideal cost-benefit analysis aims to identify the optimum level of x, or at the least, to establish whether the current level of x should be increased or reduced. A cost-effectiveness analysis, by contrast, makes no effort to assign a social value to x, and is therefore silent on its socially appropriate level. The cost-effectiveness approach leaves these larger issues to be decided by other criteria.

In spite of its limitations, cost-effectiveness is more suitable than cost-benefit analysis for the evaluation of family planning programs. These programs are necessarily concerned with the prevention of unintended pregnancies and births. To assign a social value to the prevention of an unwanted birth, in particular, is to engage in an ethical and religious debate about the valuation of life in
general.12 These are not scientific matters, nor are they matters about which economic analysis is especially informative. Thus, for the purpose at hand, the limited aims of a cost-effectiveness approach are entirely appropriate.

As it has been described to this point, cost-effectiveness analysis may seem to be little more than an exercise in accounting. Quite to the contrary: it is a difficult endeavor, requiring a blend of economic theory, expert judgment, and statistical sophistication. Consider these complications, which arise even in the most mundane of program evaluations:

If two programs produce different levels of the output x, say x_1 and x_2, they are not directly comparable. That is, a comparison of total costs C_1(x_1) to C_2(x_2) is not informative, in itself, about cost-effectiveness. (The exception is when one of the projects can produce higher levels of x with the same or lower total costs.)

It has been common practice to use average costs, C(x)/x, to make comparisons between programs, but this too is misleading if the average costs of a program are not constant with respect to the level of x.

Furthermore, if x is defined in terms of multiple dimensions or joint outputs, then the total costs of a program with output mix x_1 are not directly comparable to the total costs of a program having output mix x_2.

Programs differing in scale or output mix can only be compared indirectly through the device of cost functions, which provide a statistical means of predicting what social costs would be if the programs generated the same level of output or had the same output mix. Thus, the estimation of cost functions, a difficult task, is central to an evaluation of program cost-effectiveness.
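The point about average costs can be made concrete with a short numerical sketch. The two cost functions below are invented for illustration (a program with a fixed cost and low marginal cost, and a program with constant unit cost); neither the functional forms nor the figures come from the text. The sketch shows that average costs observed at different scales can favor one program even though the other is cheaper once both are evaluated at the same level of output x.

```python
def cost_program_1(x):
    """Hypothetical total social cost: a fixed cost plus a low cost per unit of output."""
    return 50_000 + 20 * x


def cost_program_2(x):
    """Hypothetical total social cost: constant cost per unit of output, no fixed cost."""
    return 35 * x


def average_cost(cost_fn, x):
    """Average cost C(x)/x at output level x."""
    return cost_fn(x) / x


def cheaper_at(x):
    """Name the program with the lower total cost when both produce the same output x.

    Ties are reported as program 2; that choice is arbitrary.
    """
    return "program 1" if cost_program_1(x) < cost_program_2(x) else "program 2"


if __name__ == "__main__":
    # Observed scales differ: program 1 operates at x = 1000, program 2 at x = 4000.
    print("average cost, program 1 at x=1000:", average_cost(cost_program_1, 1000))
    print("average cost, program 2 at x=4000:", average_cost(cost_program_2, 4000))
    # Comparison at common output levels, which is what cost-effectiveness requires.
    for x in (1000, 4000):
        print(f"cheaper at common output x={x}:", cheaper_at(x))
```

With these assumed functions, program 2 has the lower observed average cost, yet program 1 has the lower total cost when both are evaluated at the larger output level. The ranking also reverses at the smaller output level, which is why a cost function, rather than a single observed average, is needed for the comparison.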

There is also the question of units of analysis: whether these should be individuals within a set of well-defined geographic areas, program units such as the clinics that operate within these areas, or the reproductive health system through which individuals pass and within which a variety of programs are situated. As argued above, the effectiveness of programs is perhaps most easily established at the individual level, with data on the reproductive careers of individual women. Accurate measurement of costs, however, requires both individual-level data and data on program units. The individual data are required because important components of total social costs are borne by individuals, most notably the value of their travel time, their waiting time, and if referred elsewhere for additional services, the time that they spend in locating the next service point (Jonsson, 1985; Warner and Luce, 1982). From a broad conceptual [...]

12   Although such valuations have certainly been made (Enke, 1960), they are inherently controversial. The debate about valuation is not further clarified by the need, in a cost-benefit calculus, to summarize all values in money terms. The use of money indices of benefits and costs, while perhaps suitable for quantifying an individual's balance of production net of consumption over a life span, tends to direct attention to the measurable economic benefits and to divert attention from the non-economic or non-quantifiable dimensions of the problem.

[...] of the N clients (Jonsson, 1985; Warner and Luce, 1982). Such time and travel costs must generally be imputed from information on a client's characteristics, such as her age, education, labor market experience, and the like.

The distinction between the costs borne by clients and the costs tallied in program accounts is important, because different forms of program organization imply different divisions of social costs between the program and its clients. For example, a program that does little in the way of community outreach reduces its own administrative costs in comparison to more ambitious programs, but in so doing might increase total social costs. Without the guidance provided by outreach, potential clients might spend greater amounts of their time and resources in learning about and locating the program.

It is obviously misleading to compare the total costs C_1(O_1, N_1, I_1, Q_1, R_1; w_1) for one program, with its own particular mix and level of services, to the total costs C_2(O_2, N_2, I_2, Q_2, R_2; w_2) of another. Neither is it defensible to focus attention on only one dimension of services, say, the provision of contraceptive methods Q, and to interpret the relative costs of the two programs in light of this dimension alone. Perhaps it goes without saying that output mix and scale must matter a great deal to program costs; yet the family planning literature is full of misleading statements about costs that ignore both mix and scale.18 Likewise, it is inappropriate to make cost comparisons without recognizing that programs face different prices w for their inputs, as would be the case when one program operates in an urban area and another in a rural area.19 But by estimating cost functions, through which the implications of different service levels, service mixes, and input prices can be explored, one can lay down a proper foundation for meaningful cross-program comparisons.
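As a rough illustration of what estimating a cost function involves, the sketch below fits a log-linear cost function by ordinary least squares and then predicts costs for two program types at a common output level and a common input price. Several simplifications are assumptions made here, not features of the text: a single output Q stands in for the full service vector (O, N, I, Q, R), the log-linear form ln C = a + b ln Q + c ln w is only one of many possible specifications, and the clinic records are synthetic and noise-free. The helper names (`solve_linear`, `fit_log_cost`, `predict_cost`, `synthetic_records`) are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_log_cost(records):
    """Least-squares fit of ln C = a + b ln Q + c ln w; records are (Q, w, C) tuples."""
    rows = [(1.0, math.log(q), math.log(w)) for q, w, _ in records]
    y = [math.log(c) for _, _, c in records]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear(xtx, xty)  # normal equations: (X'X) coef = X'y


def predict_cost(coef, q, w):
    """Predicted total cost at output q and input price w."""
    a, b, c = coef
    return math.exp(a + b * math.log(q) + c * math.log(w))


def synthetic_records(scale, b, c, points):
    """Generate noise-free (Q, w, C) records from C = scale * Q**b * w**c."""
    return [(q, w, scale * q ** b * w ** c) for q, w in points]


# Hypothetical clinics: type A faces higher input prices (say, urban wages) than type B.
records_a = synthetic_records(400.0, 0.8, 0.6, [(500, 20), (1000, 22), (2000, 25), (4000, 21)])
records_b = synthetic_records(150.0, 1.0, 0.6, [(300, 12), (600, 15), (1200, 11), (2400, 14)])
coef_a = fit_log_cost(records_a)
coef_b = fit_log_cost(records_b)

if __name__ == "__main__":
    # Compare the two program types at the same output level and the same input price.
    q_common, w_common = 1500, 18
    print("type A predicted cost:", round(predict_cost(coef_a, q_common, w_common)))
    print("type B predicted cost:", round(predict_cost(coef_b, q_common, w_common)))
```

The design choice worth noting is that the comparison is made on predicted costs at a common (Q, w), not on the costs each clinic happened to report. With real data the fit would be noisy, and the choice of functional form and of output measures would itself need to be defended.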
As Table G-1 suggests, it may be useful to further disaggregate the service dimensions (O,N,I,Q,R) according to the socioeconomic characteristics of the clients being served and by type of services delivered (e.g., types of contraceptive methods) within each broad category. The various population subgroups (as indexed by the superscript k) could be defined on the basis of age and socioeconomic status, and perhaps additionally by the expressed motivation (spacing, stopping) given for the family planning visit. This disaggregation across subgroups serves several purposes. By indicating who the recipients of services are, it permits the equity dimension of service delivery to be examined. It also allows for differences in the costs of providing service to certain population subgroups, who may (for example) require greater outreach effort, counseling, or care.20 It allows for meaningful imputation of the time and travel costs borne by clients. Disaggregation also provides the information required to convert data on the supply of contraceptives of various types into more refined measures of protection against unintended pregnancy.

18   The analogy would be to statements about hospital costs that ignore differences in case mix and numbers of cases in each category.

19   Nyman and Dowd (1991) explore this issue in the context of Medicare. Note that programs operating in socioeconomically disadvantaged areas may need to pay higher wages and salaries to recruit qualified personnel.

20   See Nyman and Dowd (1991) for an exploration of the link between outpatient characteristics and program costs.

Translating Service Statistics into Measures of Contraceptive Protection

A long-standing problem in the evaluation of family planning programs concerns the link between program service statistics, which are represented in this document by (O,N,I,Q,R), and protection from unintended pregnancy. The essence of the problem is this: contraceptive methods are heterogeneous, so that a way must be found to aggregate across methods; and individuals themselves make use of any given method with different degrees of effectiveness, and will discontinue use as wantedness status changes. Protection from unintended pregnancy thus depends on both the method that is used and the characteristics of the individual user.

In progressing from service statistics to more refined measures of final services, two issues require discussion: (1) whether and how to adjust contraceptive failure rates for differences in program-supplied information and client characteristics; and (2) how to adjust for variations over time in wantedness status, another characteristic of clients. Curiously, the family planning literature has been more concerned with the first of these issues than with the second.21

To understand the difficulties, consider Figure G-3. Imagine that a woman begins in "state 1," defined as not pregnant and desiring not to conceive. In this condition, she contacts or begins her participation in a program. Upon contacting the program, she may decide to use no contraceptive method, or she may accept an allotment of s months of method type c. If the method adopted is sterilization, for instance, then s represents the length of time remaining in the woman's reproductive career. If the method is Norplant, s represents 5 years of coverage. For the pill, s corresponds to the length of the prescription issued.22

21   For a review of some of these issues in the developing-country context, see United Nations (1979).

22   A slightly more elaborate framework is required to deal with the possibility of unintentional expulsion of the method, or with health side effects severe enough to cause discomfort or to warrant its removal. The side effects can be considered by assigning a distinct "state" to each distinct (and measurable) degree of discomfort associated with method use. This is analogous to the method used to calculate disability-adjusted life years.

Newly armed with this length and type of contraceptive protection, the woman may then enjoy s full periods of protection from unintended pregnancy (a path represented in Figure G-3 by the long solid line), or she may experience a contraceptive failure and become pregnant (this is represented by the dashed lines showing a transition at a1 from state 1 to state 3), or she may have a change of heart in respect to wantedness, discontinue the method, and hope to conceive (a path indicated by the dashed lines showing a transition at a2 to state 2). Three transition rates govern the movements among states: r12 and r21 determine transitions between the wantedness statuses, and r13 represents the method failure rate. All three transition rates depend implicitly on the woman's characteristics (her subgroup k), and r13 may also depend on the contraceptive counseling that has been supplied by the program.

When a program supplies a woman with method c in allotment s, it provides her with an expected span of protection from unintended pregnancy. The ex ante value of the service she receives is indexed, in part, by the expected length of time $E_1(c,s;k,I_c^k)$ that the woman will spend in state 1. This quantity E1 is, in general, less than s; how much less depends on the effectiveness of the method and the variability of conception desires. If method c is difficult to remove or reverse, then in addition to E1 one must also consider the expected span of time $E_2(c,s;k,I_c^k)$ over which the woman finds herself in state 2, wanting to conceive but being prevented (at least temporarily) from doing so.

The quantities E1 and E2 may be calculated by increment-decrement life tables or related methods.23 Given values of E1 and E2 for each method allotment (c,s), client type k, and program counseling strategy $I_c^k$, total services provided are calculated by summing E1 and E2 over these categories. What results is a pair of indicators of total services supplied (E1,E2). The E1 dimension, representing total expected protection against unintended pregnancy, should be positively valued from a social point of view, whereas the E2 dimension, representing total constraints on desired conception, should be negatively valued.

As the discussion suggests, rather detailed data are required to undertake calculations of E1 and E2. Given this, it may be useful to view E1 and E2 as conceptual ideals against which current standard practice can be measured. Perhaps the most common approach now found in the family planning evaluation literature is the use of couple-years of protection, or CYP. This approach is based on two assumptions: that wantedness status is fixed (i.e., r12=0), which removes the E2 dimension; and that method failure rates (r13) can be set equal to their theoretical values, unaffected either by client type or by the information provided by the program. Under these strong assumptions, the calculation of E1 then becomes a trivial matter. An alternative and better-justified measure, but one that is more demanding of data, is that of use-effectiveness.
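The calculation of E1 and E2 can be illustrated with a minimal sketch. It is not the increment-decrement life table procedure itself but a discrete-time approximation, and it rests on several assumptions that are not in the text: time is measured in months, the rates r12, r21, and r13 are replaced by constant monthly probabilities (`p12`, `p21`, `p13`), state 3 is treated as absorbing, and a woman in state 2 remains covered by the method for the rest of the allotment (the hard-to-remove case). The function name and all numerical values are hypothetical.

```python
import numpy as np


def expected_months(s, p12, p21, p13):
    """Expected months in state 1 and state 2 during an allotment of s months.

    Index 0 = state 1 (not pregnant, wants to avoid pregnancy)
    Index 1 = state 2 (wants to conceive, still covered by the method)
    Index 2 = state 3 (unintended pregnancy; treated as absorbing)
    p12, p21, p13 are assumed constant monthly transition probabilities.
    """
    if p12 + p13 > 1.0:
        raise ValueError("p12 + p13 must not exceed 1")
    P = np.array([
        [1.0 - p12 - p13, p12, p13],
        [p21, 1.0 - p21, 0.0],
        [0.0, 0.0, 1.0],
    ])
    dist = np.array([1.0, 0.0, 0.0])  # the woman starts in state 1
    e1 = 0.0
    e2 = 0.0
    for _ in range(s):
        e1 += dist[0]  # probability of occupying state 1 in this month
        e2 += dist[1]  # probability of occupying state 2 in this month
        dist = dist @ P  # advance the state distribution by one month
    return e1, e2


# Hypothetical inputs: a 60-month allotment of a long-acting method.
full = expected_months(60, p12=0.01, p21=0.005, p13=0.002)
# CYP-style simplification: wantedness fixed (p12 = 0), failure rate preset.
cyp_like = expected_months(60, p12=0.0, p21=0.0, p13=0.002)
print("E1, E2 with changing wantedness:", full)
print("E1, E2 with fixed wantedness:   ", cyp_like)
```

Setting `p12` to zero reproduces the fixed-wantedness assumption, under which E2 vanishes and E1 depends only on the allotment and the failure probability. Letting `p13` vary by client subgroup would correspond to the use-effectiveness refinement.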
The use-effectiveness measure retains the assumption of fixed wantedness status, but allows contraceptive failure rates r13 to depend on the characteristics of the client served (her subgroup k, in this document) and permits the failure rates to vary (at least in principle) with the information provided by the program. Forrest and Singh (1990) give a good illustration of the use-effectiveness approach in family planning evaluation, whereas Shelton (1991) defends the CYP approach on practical grounds. As just noted, neither of these conventional approaches deals with the variability of wantedness status. Hence, neither approach can be fully justified when applied to programs that supply irreversible methods or methods that are difficult to remove.

23   See earlier footnotes. E1 and E2 can be calculated if data are available on changes in wantedness status. Surveys such as the NSFG collect reliable data on wantedness status at survey, and the distribution of wantedness at a point in time can therefore be linked to various socioeconomic characteristics that we have subsumed under the superscript k. Reliable retrospective data on changes in wantedness status, however, are less common. The NSFG collects such information prior to the survey date, but only with respect to previous pregnancies. If a pregnancy took place, then its wantedness (in retrospect) can be ascertained via a standard set of survey questions. But since wantedness influences contraceptive motivation, which in turn affects the likelihood of pregnancy, these data may be subject to considerable selectivity bias.

Predicting Net Program Effects Without Individual Data

Cost-effectiveness is inherently a relative concept: a program or health system is said to be cost-effective in relation to an alternative program or system. In practice, however, one may not have access to the cost and service level information required to compare two distinct systems or programs. Evaluation must then proceed on a hypothetical basis. The conceptual experiment is to imagine a health system much like the current system, except that the program of interest has been removed, scaled back, or otherwise altered. What would be the service level and mix x, and total social costs C(x), in this hypothetical world?

It is argued above that the availability of detailed data on individuals, and an appropriate statistical model, would permit such questions to be addressed. But how should evaluation proceed if these ideal data are unavailable? The family planning literature presents no consensus, and offers surprisingly little guidance, on this central evaluation question. Essentially, one must set out different assumptions about individual choices that might be made in the absence of the program (or under an altered version of the program) and summarize these assumptions in a set of probabilities about contraceptive use and method choice. Let $p_i^k$ represent the probability that, in the absence of the program being evaluated, an individual woman (in socioeconomic subgroup k) would have chosen contraceptive method i. One should allow for the possibility that no method (i=0) might have been used. With all else held constant, her predicted E1 becomes

$$\hat{E}_1 = \sum_i p_i^k \, E_1(i, \bar{s}_i; k, \bar{I}_i^k),$$

where the bars over method allotments $s_i$ and information $I_i^k$ represent assumptions about the hypothetical values that these quantities assume in the absence of the program.
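The prediction can be read as a probability-weighted average of method-specific values of E1. The short sketch below works through the arithmetic for one hypothetical woman; the function name, the method labels, the choice probabilities, and the E1 values are all illustrative assumptions rather than figures from any study.

```python
def predicted_e1(choice_probs, e1_by_method):
    """Probability-weighted E1 in the absence of the program.

    choice_probs: method -> probability the woman would choose it (sums to 1)
    e1_by_method: method -> expected months in state 1 under that choice
    """
    total = sum(choice_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("choice probabilities must sum to 1")
    return sum(p * e1_by_method[m] for m, p in choice_probs.items())


# Hypothetical values for one subgroup k over a 12-month horizon.
probs = {"none": 0.30, "condom": 0.25, "pill": 0.45}
e1_without = {"none": 6.0, "condom": 10.5, "pill": 11.5}

counterfactual = predicted_e1(probs, e1_without)
e1_with_program = 11.6  # hypothetical E1 for the method actually supplied
print("predicted E1 without the program:", counterfactual)
print("net program effect on E1:", e1_with_program - counterfactual)
```

The net effect is only as credible as the assumed choice probabilities, which is why the way they are chosen deserves scrutiny.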
An analogous expression can be specified for the expected or predicted value of E2 in the absence of the program. Couple-years of protection (CYP) or use-effectiveness can also be predicted in this manner. Evidently, then, the prediction issue can be reduced to the set of assumptions required to motivate a particular configuration of choice probabilities $p_i^k$.

Some authors (e.g., Fitzgibbons, 1993) propose that information on a woman's method choices prior to her contact with the program be used to define $p_i^k$. The prescription is as follows. If the woman had previously used method c, then set $p_c^k = 1$ and set the probabilities associated with all other methods to zero. If the woman had used no method, then set $p_0^k = 1$. This approach ignores the woman's level of motivation, that is, the factors that brought her to seek out the program in question. If that program had not existed, would she not have found other, non-program sources of services? Why is it reasonable to assume that the woman simply would have gone on doing what she was doing?

An attractive alternative in selecting a set of $p_i^k$, suggested by Forrest and Singh (1990) and used by them in an evaluation, is to employ the distribution of use and method choices that characterizes women who do not access the program in question. Forrest and Singh make a statistical adjustment to account for differences in social and economic characteristics among those who used the program (e.g., Title X clinics) as compared to the women who relied on non-program sources. The statistical adjustments described by Forrest and Singh do not take client self-selection into account, owing to a lack of data, but in other respects represent a reasonable compromise.24

Estimating the Cost Functions

Given all that has been said above, how can cost functions be estimated and used to establish cost-effectiveness of programs? Remarkably little guidance on this question can be found in the family planning literature. Within the broader field of health economics, however, numerous studies exist that employ cost functions, most of these studies being focused on the estimation of cost functions for hospitals. Cowing, Holtmann, and Powers (1983) provide a survey of the literature, and Cowing and Holtmann (1983), Grannemann, Brown, and Pauly (1986), and Nyman and Dowd (1991) present empirical applications. Much of this literature is concerned with the cost implications associated with the delivery of a range of services.
The methods are therefore of considerable relevance to the evaluation of family planning service delivery, which is also characterized by multiple service dimensions.

Cost functions are estimated from a base of cost and services data covering either a number of similar programs operating in different environments or a given program observed over time. Each data point provides information on total social costs C and the level and mix of services provided, whether these are expressed in terms of service statistics (O,N,I,Q,R) or, for the contraceptive component of services, in more refined measures of services such as E1 and E2 discussed above. Input prices w should also be available for each data point. Finally, each program under consideration should be classified according to its organizational type. In their studies of hospitals, for example, Cowing and Holtmann (1983), Grannemann, Brown, and Pauly (1986), and Nyman and Dowd (1991) specify cost functions that incorporate shift factors representing different forms of organization, as for example, not-for-profit versus proprietary hospitals.

24   One questionable aspect of the Forrest-Singh analysis concerns their treatment of sterilization, which is assumed to be beyond the means of any poor or near-poor woman.

With such data in hand, one then selects a functional form for the cost function Ci(O,N,I,Q,R;w). At this stage a decision needs to be made about how to aggregate services within the broad categories of outreach, information, provision of methods, and provision of other reproductive health services. Cowing, Holtmann, and Powers (1983) discuss the formal criteria that justify service aggregation, and of course data availability must also be taken into account. As a general rule, the less aggregation applied to the services data, the better; but some compromise is inevitable. Nyman and Dowd (1991) have estimated a cost function having as many as seven dimensions of service delivery. In principle, at least, even more dimensions could be incorporated, up to the limits imposed by the data.25

The estimation procedure takes the form of non-linear regression, with total costs (or the natural log of costs) being the dependent variable and with the set of explanatory variables defined in terms of levels of services and input prices. For example, suppose that a homothetic functional form is selected,

$$C_i = \exp(\beta_i D_i)\, Q(x)\, F(w),$$

where x = (O,N,I,Q,R) is the level and mix of services, Di is a dummy variable representing program type, and the sign of the coefficient βi indicates whether, with all else constant, this type of program is associated with greater or lesser costs. A specification such as the above would typically be estimated in log form:

$$\ln C_i = \beta_i D_i + \ln Q(x) + \ln F(w),$$

and non-linearity arises through the functions of services Q and input prices F on the right-hand side of the equation. If the estimate of βi is positive, this indicates that a program of type i is cost-inefficient relative to the benchmark program.
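A minimal sketch of log-form estimation follows. It deliberately swaps the non-linear regression described above for ordinary least squares on a log-linear special case, in which ln Q and ln F are assumed to be linear in the logs of services and input prices. The data are synthetic, and the function names, the two service dimensions, the single input price, and every parameter value are assumptions made for illustration only. Taking logs also requires strictly positive service levels, so the sketch cannot handle zero outputs.

```python
import numpy as np


def design_matrix(services, prices, dummy):
    """Columns: intercept, program-type dummy, ln(services), ln(input prices)."""
    n = len(dummy)
    return np.column_stack([np.ones(n), dummy, np.log(services), np.log(prices)])


def fit_log_cost(cost, services, prices, dummy):
    """Least-squares fit of ln C on the design matrix; returns the coefficients."""
    X = design_matrix(services, prices, dummy)
    coef, *_ = np.linalg.lstsq(X, np.log(cost), rcond=None)
    return coef


def predict_cost(coef, services, prices, dummy):
    """Predicted cost at given services, prices, and program type.

    Exponentiates the fitted log cost; no retransformation adjustment is made.
    """
    return np.exp(design_matrix(services, prices, dummy) @ coef)


def simulate(n=200, beta=0.15, seed=0):
    """Synthetic data: two service dimensions, one input price, one dummy."""
    rng = np.random.default_rng(seed)
    services = rng.uniform(50.0, 500.0, size=(n, 2))
    prices = rng.uniform(10.0, 30.0, size=(n, 1))
    dummy = rng.integers(0, 2, size=n)
    log_cost = (1.0 + beta * dummy
                + 0.6 * np.log(services[:, 0])
                + 0.3 * np.log(services[:, 1])
                + 1.0 * np.log(prices[:, 0])
                + rng.normal(0.0, 0.05, size=n))
    return np.exp(log_cost), services, prices, dummy


cost, services, prices, dummy = simulate()
coef = fit_log_cost(cost, services, prices, dummy)
print("estimated shift coefficient:", coef[1])
```

The second coefficient is the estimated shift for the program type coded by the dummy; a positive value corresponds to higher costs than the benchmark type at the same services and input prices. `predict_cost` evaluates the fitted function at any chosen service levels and prices, which gives a predicted cost to set against an observed one.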
How can this approach be employed to assess the cost-effectiveness of a new program, on which perhaps only one data point is available? If an estimated cost function is available for a range of already established programs, then the cost-effectiveness of the new program can be evaluated using the estimated function as a benchmark. In other words, by substituting into a previously estimated cost function the service levels (O,N,I,Q,R) for the new program, and the input prices w faced by that program, one can derive the predicted costs of the new program. These predicted costs can be compared to the actual costs exhibited by the new program. Although no strong conclusions about cost-effectiveness can be drawn on the basis of a single data point, a divergence of predicted costs from actual costs may nevertheless prove informative.

25   There are additional considerations, in particular concerning programs that provide zero levels of some service dimension (e.g., outreach). Some of the standard functional forms used in cost function estimation, such as the translog form, cannot deal with zero output levels. See Grannemann, Brown, and Pauly (1986) for discussion and an alternative that can incorporate zero outputs.

Summary

To sum up, the assessment of program cost-effectiveness is a demanding task, particularly so in respect to the data that are required to support a rigorous analysis. If the data are not available, evaluation can proceed only on an informal basis, by invoking strong assumptions on the nature of the cost functions. Much of the cost-effectiveness literature in family planning has rested on two exceedingly strong yet rarely scrutinized assumptions: (1) that the multiple dimensions of output can somehow be collapsed into a single output indicator; and (2) that average costs, defined as total costs divided by (composite) output, are constant over the range of output. If both conditions are met, then a single observation on average costs can provide a basis for program comparisons. But in the absence of supporting evidence (the committee finds none in the literature), these strong assumptions are not well justified and may be misleading as a guide to policy.

Conclusions

This discussion and analysis has attempted to provide an introduction or guide to the evaluation literature on effectiveness and cost-effectiveness, with emphasis on those issues which are central to family planning programs.
One of the more alarming facts about the state of family planning evaluation in the United States is the scarcity of research, and the degree to which techniques that have become standard practice in other fields have yet to be applied. Some of these are difficult techniques, to be sure, and results based on them lack the feel of certainty that attaches to randomized-experiments research. Nevertheless, their application to family planning evaluation is long overdue.

References

Corman H, Joyce T, Grossman M. Birth outcome production function in the United States. J Hum Resour. 1987;22:339–360.
Cowing T, Holtmann A. Multiproduct short-run hospital cost functions: Empirical evidence and policy implications from cross-section data. Southern Econ J. 1983;49:637–653.
Cowing T, Holtmann A, Powers S. Hospital cost analysis: A survey and evaluation of recent studies. Adv Health Econ Health Serv Res. 1983;4:257–303.
Donovan P. Family planning clinics: Facing higher costs and sicker patients. Fam Plann Perspect. 1991;23:198–203.
Enke S. The gains to India from population control: Some money measures and incentive schemes. Rev Econ Stat. 1960;42:175–181.
Fitzgibbons E. Benefit: Cost Analysis of Family Planning in Washington State. Unpublished Master's thesis. University of Washington; 1993.
Forrest J, Singh S. Public-sector savings resulting from expenditures for contraceptive services. Fam Plann Perspect. 1990;22:6–15.
Foster R. Identifying experimental program effects with confounding price changes and selection bias. J Hum Resour. 1989;24:253–279.
Frank R, Strobino D, Salkever D, Jackson C. Updated estimates of the impact of prenatal care on birthweight outcomes by race. J Hum Resour. 1992;27:629–642.
Grannemann T, Brown R, Pauly M. Estimating hospital costs: A multiple-output analysis. J Health Econ. 1986;5:107–127.
Hausman J, Wise D. Technical problems in social experimentation: Cost versus ease of analysis. In Social Experimentation. Hausman J, Wise D, eds. Chicago, IL: The University of Chicago Press; 1985.
Heckman J, Hotz VJ. Choosing among alternative nonexperimental methods for estimating the impact of social programs: The case of manpower training. J Am Stat Assoc. 1989;84:862–880.
Huntington J, Connell F. "For every dollar spent …" The cost-savings argument for prenatal care. Department of Health Services, University of Washington; no date.
Janowitz B, Bratt J. Costs of family planning services: A critique of the literature. Int Fam Plann Perspect. 1992;18:137–144.
Jensen E. Cost-effectiveness and financial sustainability in family planning operations research. In Operations Research: Helping Family Planning Programs Work Better. Seidman M, Horn M, eds. New York, NY: Wiley-Liss; 1991.
Jonsson B. The value of prevention: Economic aspects. In The Value of Preventive Medicine. London, England: Pitman. (Ciba Foundation symposium 110); 1985.
Joyce T, Corman H, Grossman M. A cost-effectiveness analysis of strategies to reduce infant mortality. Med Care. 1988;26:348–360.
Kenney G, Lewis M. Cost analysis in family planning: Operations Research programs and beyond. In Operations Research: Helping Family Planning Programs Work Better. Seidman M, Horn M, eds. New York, NY: Wiley-Liss; 1991.
Kristein M. Using cost-effectiveness and cost-benefit analysis for health care policymaking. Adv Health Econ Health Serv Res. 1983;4:199–224.
Ku L. Financing of family planning services. In Publicly Supported Family Planning in the United States. Washington, DC: The Urban Institute and Child Trends, Inc.; 1993.
Levey L, Nyman J, Haugaard J. A benefit-cost analysis of family planning services in Iowa. Eval Health Prof. 1988;11:403–424.
Levine R, Tsoflias L. Use in the 1980s. In Publicly Supported Family Planning in the United States. Washington, DC: The Urban Institute and Child Trends, Inc.; 1993.
Long D. Analyzing social program production: An assessment of Supported Work for Youths. J Hum Resour. 1988;22:551–562.
Maddala GS. A survey of the literature on selectivity bias as it pertains to health care markets. Adv Health Econ Health Serv Res. 1985;6:3–18.
Meier K, McFarlane D. State family planning and abortion expenditures: Their effect on public health. Am J Public Health. 1994;84:1468–1472.
Montgomery M. Dynamic behavioral models and contraceptive use. Dynamics of Contraceptive Use. J Biosoc Sci Suppl No. 11. 1989;17–40.
National Research Council. Risking the Future: Adolescent Sexuality, Pregnancy, and Childbearing. Vol I. Hayes C. (ed.). Washington, DC: National Academy Press; 1987.
Nyman J, Dowd B. Cost function analysis of Medicare policy: Are reimbursement limits for rural home health agencies sufficient? J Health Econ. 1991;10:313–327.
Shelton J. What's wrong with CYP? Stud Fam Plann. 1991;22:332–335.
Trussell J, Kost K. Contraceptive failure in the United States: A critical review of the literature. Stud Fam Plann. 1987;18:237–283.
Tsui A, Herbertson M, eds. Dynamics of Contraceptive Use. J Biosoc Sci Suppl No. 11. 1989.
United Nations. Manual IX: The methodology of measuring the impact of family planning programs on fertility. Popul Stud. No. 66. New York, NY: United Nations. 1979.
Vincent M, Lepro E, Baker S, Garvey D. Projected public sector savings in a teen pregnancy prevention project. J Health Educ. 1991;22:208–212.
Warner K, Luce B. Cost-Benefit and Cost-Effectiveness Analysis in Health Care. Ann Arbor, MI: Health Administration Press; 1982.
World Bank. World Development Report, 1993: Investing in Health. Washington, DC: The World Bank; 1993.
