Session III:
Evaluating Program Impact and Performance

In this session participants discussed USAID's data needs for evaluating program performance and program impact. Twenty years of lively debate has yet to lay to rest the question of whether family planning programs induce, make important contributions to, or are basically superfluous to fertility declines (see, for example, Freedman and Berelson, 1976; Cutright and Kelly, 1981; Lapham and Mauldin, 1987; Bongaarts et al., 1990; Pritchett, 1994a, 1994b; Bongaarts, 1994). Past research on the effect of family planning on fertility has indicated that little can be concluded from trends in indicators alone. Complex multivariate analyses, panel studies, and experiments are the preferred, but by no means conclusive, evaluation methods.

For monitoring and evaluating USAID strategic objectives, Diamond supported the use of Demographic and Health Surveys (DHS) programs for national-level reporting on fertility rates and infant and child mortality rates. Self-reported survey data on STD prevalence, especially for women, are not very reliable, although the DHS in Ethiopia is testing the idea of collecting body fluids to estimate STD/HIV prevalence.

A DHS survey, in Diamond's view, is not needed annually for fertility or infant and child mortality rates. For local-level information, he mentioned using quick, rapid assessment-type surveys that target particular localities. Eckhard Kleinau noted that program managers find it difficult to choose the data collection method that works best for program monitoring and is affordable. Program managers are constantly faced with the dilemma of deciding whether to do a rapid assessment of a program or to undertake a full survey to collect the requisite data for program management.

Workshop participants agreed that maternal mortality rates are not needed for analysis of program impact. Koblinsky noted that a better indicator would focus on complications of delivery, or identify where women with complications actually deliver. Researchers are beginning to look at the postpartum period because many maternal deaths occur during that time. One potentially useful indicator of complications for women during delivery is perinatal mortality, which is highly correlated with maternal mortality, but 10 times more prevalent, and thus easier to measure in feasible samples.
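
The sample-size point above can be illustrated with a rough calculation. The following is a minimal sketch, assuming simple random sampling of births, a normal approximation, and both outcomes treated as simple proportions of births. The rates are illustrative values chosen only to reflect the tenfold difference; they are not figures from the workshop, and `births_needed` is a hypothetical helper name.

```python
import math

def births_needed(rate, relative_margin=0.2, z=1.96):
    """Births required to estimate `rate` (a proportion of births) to within
    +/- relative_margin of its own value at roughly 95% confidence,
    assuming simple random sampling and a normal approximation."""
    return math.ceil(z**2 * (1 - rate) / (relative_margin**2 * rate))

# Illustrative values only (assumptions, not workshop data).
maternal_rate = 0.004                 # 400 maternal deaths per 100,000 births
perinatal_rate = 10 * maternal_rate   # "10 times more prevalent"

n_maternal = births_needed(maternal_rate)
n_perinatal = births_needed(perinatal_rate)
print(n_maternal, n_perinatal, round(n_maternal / n_perinatal, 1))
```

Under these assumptions the rarer outcome requires about ten times as many births for the same relative precision, which is why the more prevalent indicator is easier to measure in feasible samples.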

In his presentation, Robert Black distinguished between the needs of international agencies for indicators of impact and the needs of program managers. For international purposes, it is usually sufficient to have a nationally representative estimate for each indicator. For national program management, the usefulness of the one overall indicator of program status is limited. Program managers typically want estimates for subnational areas so they can determine which areas are doing well. They may also want to identify underserved clients, who can be targeted for new efforts, or underperforming health services, so they can be the subject of additional training or managerial attention. Black urged USAID to
take care that its legitimate information needs do not get out of balance with the direct information needs for program management. Because programs evolve over time, and new elements, such as integrated management of childhood illnesses, require new or modified indicators, the appropriateness of particular indicators needs continual reevaluation.

Black presented his recommendations for health-related indicators serving two purposes: impact evaluation and program performance evaluation. For the former, he proposed an emphasis on assessing medium-term trends in age-specific child mortality rates, HIV or STD infection rates, and in some settings maternal mortality rates. Given the well-documented relationship of child nutrition with mortality, he expressed surprise that child anthropometry had not been included among the outcome indicators sought by USAID. It would be sufficient, in Black's view, to assess such outcome indicators at a national level every 5 years or so. More detailed studies, to assess how these outcome measures are linked with program performance measures, could be done on a case-study basis in selected countries.

Black listed several criticisms of the program objective indicators listed in USAID's strategic plan. He argued that the proposed set of indicators of program performance is too extensive, more than is needed for USAID accountability or monitoring of national programs. Indicators should be limited in complexity to ensure high quality and feasibility of measurement. Limiting the number of indicators, and their complexity, and dropping the evaluation of outcomes from requirements for routine monitoring could allow samples to be large enough to produce province- or even district-level estimates for a few key performance indicators. Black proposed that USAID sponsor work on development of health facility surveys as well as population-based surveys, because much of the information for performance monitoring should be facility or provider based.
He called for giving greater priority to helping develop national capabilities to produce data and to use data for policy decisions. Lastly, Black proposed that USAID cooperate with other donors to maintain a single database, containing the most current information for all countries on an agreed, limited set of impact and performance indicators. Annual reports could be produced using the most up-to-date information, but this would not require annual measurement of every indicator.

Data on the Accessibility and Quality of Services Provided

In recent years, increased emphasis has been placed on assessing and improving the quality of health and family planning services. Wayne Stinson and Jane Bertrand, in their presentations, argued that quality as well as the availability of services have to be well understood when assessing how and why programs affect health and demographic outcomes in the population. For example, it is misleading to assume that the nearest facility is the one most often used by the respondent. Understanding and measuring program impact is a critical component of the effective monitoring of programs.

Service Availability Modules

The service availability module (SAM) to the DHS has been used to provide a basis for assessing the relationship between the availability and use of services (Wilkinson et al., 1993). The SAM collects data on family planning services provided to a sample of women who have responded to a DHS interview. Bertrand pointed out that the SAM has been a useful tool for measuring access by measuring time and distance to the nearest service delivery point, and measuring the density or the number of such points per population and geographic area.

The SAM collects data on different aspects of facilities (similar to situation analysis discussed below). Data include: time and mode of transport to nearest family planning facility; location of facilities offering specific types of family planning and reproductive health services; contraceptive prevalence rates and method by distance to nearest facility; and percentage of women with an unmet need for family planning who reside near a facility.

The significant benefit of the SAM is that it provides one of the few data sets that link information on facilities with behaviors of populations. Workshop participants favored further research on facility-based information linked to population-based surveys. Use of the modules has been constrained by a number of limitations, such as difficulties in identifying which facility women consider to be the nearest to them and whether or not the data show the actual practices at a facility. A comparison study of SAMs from a facility in Tanzania in 1991 and 1994, however, offers a favorable view of SAM data (Agallba et al., 1994). The EVALUATION Project is currently considering the use of SAM data in a multilevel panel design to examine the links between family planning programs and increased contraceptive prevalence in clusters at two points in time.
Participants supported the idea of establishing a task force of experts to examine a facility-based survey, especially one using a multilevel panel design.

Situation Analysis

Situation analysis was developed in connection with the Population Council's Africa Operations Research/Technical Assistance Project as an operations research methodology that can rapidly assess the strengths and weaknesses of family planning programs in developing countries (see, for example, Miller et al., 1991). The demand for situation analysis has increased greatly in recent years as the demand for information on services provided and client perceptions and behaviors has grown. Situation analysis, in Bertrand's view, is useful for describing the ability of programs to provide quality services to clients and for describing and comparing the quality of services actually provided.

The core set of procedures for data collection used in situation analysis includes: examining a representative sampling of facilities within a geographic area; taking a complete inventory of equipment and supplies and collecting service statistics over some period; interviewing all service providers; and observing provider-client interactions and conducting exit interviews of clients.

Andrew Fisher argued that the great appeal of situation analysis is that it identifies problems that managers can address immediately. For example, it can provide extensive information on subsystems at a facility, such as logistics and commodities; staffing issues, including training and experience; supervision and management; data on information, education, and communication (IEC) material and activities; and record keeping. Situation analysis has been an example of a valuable tool that assists researchers in assessing how programs work and why. James Shelton noted that data collection methods such as situation analysis offer a "window to what's going on" at a facility or in determining client preferences.

The limitations of situation analysis include the complexity that has accrued over the years to what was originally thought of as a "quick and clean" analysis. Studies now often have multiple modules and extensive data, requiring complex analysis and evaluation. For example, ensuring that the sample of facilities is a representative sample is not always straightforward. Another limitation is that situation analysis typically does not link service facilities with a specified household population.
But there have been several promising examples of studies linking situation analysis data with the DHS information (in Peru, Brazil, and one under way in Tanzania). Fisher and other participants cautioned that the analysis of data from a combined population-based DHS and the facility-based situation analysis is difficult, and more methodological work is needed. Situation analysis is a large and complex instrument, so merging population-based data will require extensive work.

Quality Assurance

Stinson drew a distinction between data needed for monitoring and evaluation and data needed for program management. USAID requires measurements partly for cross-sectional comparisons, while data for program management focus on utility at the service delivery level that supports local ownership and responsibility. A comprehensive data management strategy as outlined by Stinson would include: quality of service delivery and managerial processes; support for problem-solving capacity and routine management; effective data use at all levels; and encouragement of local level ownership of data and management.

To increase ownership at the program level and improve the quality of services provided, Stinson identified three elements of a health program: quality design, quality monitoring, and quality improvement. Client-focused programs with effective service delivery and management can be assessed through data on clients' needs, social acceptability, and provider preferences. These types of data can be obtained through focus groups as well as through routine monitoring systems. Quality control depends on adherence to standard processes by verifying that specific procedures are followed and client needs met. Quality control should include critical managerial and service delivery procedures, but it should concentrate on quality of care and client satisfaction.

Exit interviews with fixed questions and rating scales have not generally been useful, in Stinson's view, except where focus groups have previously clarified what clients consider to be the key characteristics of quality. Providers and others in direct contact with clients may gain more useful information (though not quantitative data) by routinely asking empathetic questions and periodically discussing responses with colleagues. Motivating service-level staff and managers to do this may be a critical element for improving service quality. Hassig noted in this regard that the manual for quality assessment of STD/HIV services of the Global Program on AIDS includes in its checklist whether managers have demonstrated an ability to address problems.

Micro-level, client-oriented, provider-efficient (COPE) methods of data collection may have a major influence on quality assurance at the facility level.
However, assessing access and quality of services at a national program level often requires a macro-level approach that includes the use of situation analysis or service availability modules.

In the discussion, Anne Pebley noted that in the area of reproductive health, the role of traditional birth attendants who are located in the village rather than at a health center is often not included in a situation analysis or a SAM. Oot indicated that USAID is seeking to identify a core set of indicators that can serve as useful proxy measures for total service availability and accessibility.

Data collected from survey respondents in households can be linked to data on nearby facilities if the location of both households and facilities can be recorded, either by use of maps or through use of signals from Global Positioning System (GPS) satellites. With some information about local transportation systems and travel time estimates, these linked data can be used to analyze issues related to physical accessibility of different types of facility and service to different groups of potential users. Ronald Rindfuss discussed the need to monitor infrastructure changes, such as road and transportation changes, to enable programs to better target the placement of facilities.

A common problem with facility data used for studies of accessibility, quality of care, and other issues is that they rely on the assumption that clients use the facility of a given type nearest their homes. Amy Tsui described an alternate procedure, used in the PERFORM surveys in India, in which facilities are selected for inclusion in the study based on household respondents' reports of which facilities they actually use.

Data on Costs and Expenditures

Although some estimates exist at global levels, there is relatively little information at the country level about how much programs cost or about household expenditures for particular services. It is very rare to have usable information on costs at the level of an individual facility.

Barbara Janowitz commented that determining the costs of providing family planning services at a macro level, such as a national family planning program, involves a review of financial records and aggregate budgets. But because most developing countries do not allocate expenditures on salaries, facilities, and other shared inputs among the various purposes (such as family planning or maternal and child health services), it is difficult to disaggregate costs for these services. One way of allocating costs of shared resources among family planning and health services is by the proportion of visits that are for each purpose. The EVALUATION Project is currently supporting a study using this method to determine family planning costs in three countries.
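
The visit-proportion approach to allocating shared costs amounts to simple arithmetic, sketched below. This is a minimal illustration rather than the method used in the study mentioned above: the function name `allocate_shared_costs`, the cost categories, and all figures are hypothetical.

```python
def allocate_shared_costs(shared_costs, visits_by_service):
    """Split the total of shared costs across services in proportion
    to the number of visits recorded for each service."""
    total_visits = sum(visits_by_service.values())
    if total_visits <= 0:
        raise ValueError("need at least one recorded visit")
    total_shared = sum(shared_costs.values())
    return {
        service: total_shared * visits / total_visits
        for service, visits in visits_by_service.items()
    }

# Hypothetical facility figures, for illustration only.
shared_costs = {"salaries": 60_000, "building": 15_000, "utilities": 5_000}
visits = {"family_planning": 2_000, "maternal_child_health": 6_000}

allocated = allocate_shared_costs(shared_costs, visits)
for service, cost in allocated.items():
    print(f"{service}: {cost:,.0f}")
```

With these made-up numbers, family planning accounts for one quarter of visits and is therefore assigned one quarter of the shared costs. The allocation treats every visit as equally costly, which is the simplification that time-based weighting is meant to address.
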
Janowitz recommended greater effort be devoted to collecting family planning cost data at individual service delivery points: that is, using a "bottom-up" approach rather than the more common method of estimating expenditures at the macro level and allocating them among services. Family Health International has developed methods to collect and analyze cost data on different types of visits. Information such as the number of visits to service delivery sites and continuation rates can be expressed in common terms (for example, costs per couple-year of protection) and compared across contraceptive methods or among regions.

Diamond and Susan Hassig suggested that in allocating costs of staff time, it is important to include an estimate of the time spent by providers on consultations for a particular service, rather than relying only on counts of the numbers of visits. Consultations for STD/HIV prevention, for example, could be more intensive than consultations for continuing contraceptive users. Time-and-motion studies, patient flow analyses, or client and provider interviews can be used to collect information on how staff time is divided among seeing clients, performing administrative duties, and waiting for clients (or unallocated time). A high percentage of unallocated staff time may be an indicator of low demand or low quality of services provided. Information on staff allocation ultimately helps identify areas for improvement that could reduce program costs.

For child survival programs, Kleinau noted that there is a need to understand the implications of staff time allocation, especially as integrated case management for health services is being promoted. Time-and-motion analyses would help determine to what extent health workers could take on added service responsibilities. Family Health International is working with a project in Latin America (INOPAL III) to test less costly and nonintrusive means of collecting data on allocation of staff time. Methods under review include patient-flow analysis, staff interviews, use of time sheets, and a combination of these.

Peter Berman proposed greater investments in building capacity at the national level of host countries to monitor costs and expenditures, and he suggested priorities for data collection and analysis. Periodic assessments of total annual national expenditures for services would allow program managers to identify priority program areas and use expenditures as indicators of program outcomes. Analysis of factors influencing change in expenditures is critical for adequate interpretation. With a "sources and uses" matrix, a comprehensive picture of the resource allocation for each program, such as family planning, child survival, or HIV prevention, can be examined at the country, district, or local levels and by source of funding (government, nongovernmental organization, donor, user fees). Program budgets can be used to monitor national expenditures disaggregated within categories, such as outreach for STD education or diarrheal treatment centers. Budget tracking systems are currently being implemented in Egypt through the Data for Decision Making Project (Research Triangle Institute, 1995).
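
A "sources and uses" matrix of the kind described above can be represented as a table of expenditures indexed by funding source and by program. The sketch below is an illustration only: the layout, the helper names (`totals_by_source`, `totals_by_use`, `source_shares`), and every figure are assumptions, not data from the workshop or from any budget tracking system.

```python
# Hypothetical expenditures in a common currency unit, for illustration only.
# Rows are funding sources; columns are programs (the "uses").
expenditures = {
    "government": {"family_planning": 40, "child_survival": 70, "hiv_prevention": 20},
    "ngo":        {"family_planning": 10, "child_survival": 15, "hiv_prevention": 5},
    "donor":      {"family_planning": 30, "child_survival": 25, "hiv_prevention": 35},
    "user_fees":  {"family_planning": 20, "child_survival": 10, "hiv_prevention": 0},
}

def totals_by_source(matrix):
    """Total spending contributed by each funding source (row sums)."""
    return {source: sum(uses.values()) for source, uses in matrix.items()}

def totals_by_use(matrix):
    """Total spending on each program across all sources (column sums)."""
    totals = {}
    for uses in matrix.values():
        for program, amount in uses.items():
            totals[program] = totals.get(program, 0) + amount
    return totals

def source_shares(matrix, program):
    """Fraction of one program's spending that comes from each source."""
    total = totals_by_use(matrix)[program]
    return {source: uses.get(program, 0) / total for source, uses in matrix.items()}

print(totals_by_source(expenditures))
print(totals_by_use(expenditures))
print(source_shares(expenditures, "family_planning"))
```

The same structure could be kept separately for each district or locality, and comparing matrices from successive years would show where changes in expenditure originate.
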
Berman suggested that the data could be drawn from some of the information mentioned by Janowitz on costs of services provided.

Berman suggested that new health sector technologies for USAID-supported programs should be subject to cost-effectiveness tests prior to use. Current gaps in information include: cost-effectiveness estimates for certain interventions; bias toward estimates from clinical trials rather than population-based services; lack of estimates for groups of services (in contrast to isolated interventions); and lack of comparisons between existing programs. Several participants agreed that much more research is needed in the area of cost-effectiveness.

Rodney Knight questioned whether data on expenditures could be collected without an exceptionally long survey instrument. There was some discussion about fixed costs associated with any survey and even higher costs for follow-up studies, so the length of a survey is not necessarily a negative factor. However, many of the workshop participants seemed more inclined to support a second round of surveys rather than trying to load every item and special topic into an initial survey. Attempting to collect comprehensive information about disease prevalence and incidence, facility utilization, and health expenditures would be a difficult task. Not only would additional questions be needed, but also a different interview approach, particularly if information such as expenditures on traditional providers is desired. Several participants mentioned that the household expenditure module developed for the DHS in Indonesia might be worth exploiting further.

There was much discussion about the advantages of using existing data from national household consumption surveys (NHCS) or other surveys to examine expenditure information. Some participants agreed that making better use of large surveys such as the NHCS should be considered, but others noted that surveys such as the NHCS provide only aggregate information on health care expenditures at the household level, rather than attributing expenditures to particular programs or activities.