National Academies Press: OpenBook

Improving Democracy Assistance: Building Knowledge Through Evaluations and Research (2008)

Chapter: 2 Evaluation in USAID DG Programs: Current Practices and Problems

Suggested Citation: "2 Evaluation in USAID DG Programs: Current Practices and Problems." National Research Council. 2008. Improving Democracy Assistance: Building Knowledge Through Evaluations and Research. Washington, DC: The National Academies Press. doi: 10.17226/12164.


Introduction

To make decisions about the best ways to assist the spread of democracy and governance (DG), the U.S. Agency for International Development (USAID) must address at least two broad questions:

1. Where to intervene. In what countries and in what sectors within countries? Selecting the targets for DG programming requires a theory, or at least a hypothesis, about the relationships among different institutions and processes and how they contribute to shaping overall trajectories toward democracy and governance. It also requires strategic assessment, that is, the ability to identify the current quality of democratic institutions and processes in various countries and set reasonable goals for their future development.

2. How to intervene. Which DG projects will work best in a given country under current conditions? Learning how well various projects work in specific conditions requires well-designed impact evaluations that can determine how much specific activities contribute to desired outcomes in those conditions.

The two questions are clearly connected. To decide where to intervene (Question 1), one wants to know which interventions can work (Question 2) in the conditions facing particular countries. Indeed, in the current state of scientific knowledge, answers to Question 2 may provide the most helpful guidance to answering Question 1.

This chapter therefore focuses on USAID’s policies and practices for monitoring and evaluation (M&E) of its DG projects. To provide a context, we begin with a brief description of the current state of evaluations of development assistance programs in general. Then existing USAID assessment, monitoring, and evaluation practices for DG programs are described. Since such programs are called into existence and bounded by U.S. laws and policies, the key laws and policies that shape current USAID DG assessment and evaluation practices are examined, to lay the foundation for the changes recommended later in the report. The chapter concludes with a discussion of three key problems that USAID encounters in its efforts to decide where and how to intervene.

Current Evaluation Practices in Development Assistance: General Observations

As Chapter 5 discusses later in detail, there is a widely recognized set of practices for how to make sound and credible determinations of how well specific programs have worked in a particular place and time (see, e.g., Shadish et al. 2001; Wholey et al. 2004). The goal of these practices is to determine not merely what happened following a given assistance program, but how much what happened differs from what would be observed in the absence of that program. The final phrase is critical, because many factors other than the given policy intervention (including ongoing long-term trends and influences from other sources) are generally involved in shaping observed outcomes. Without attention to these other factors and some attempt to account for their impact, it is easy to be misled regarding how much an aid program really is contributing to an observed outcome, whether positive or negative.
The practices used to make this determination generally have three parts: (1) collection of baseline data before a program begins, to determine the starting point of the individuals, groups, or communities who will be receiving assistance; (2) collection of data on the relevant desired outcome indicators, to determine conditions after the program has begun or operated for a certain time; and (3) collection of these same “before and after” data for a comparison set of appropriately selected or assigned individuals, groups, or communities that will not receive assistance, to estimate what would have happened in the absence of such aid.

Footnote: The ideal comparison group is achieved by random assignment, and if full randomization is achieved, a “before” measurement may not be required, as randomization effectively sets the control and intervention groups at the same starting point. However, both because randomization is often not achievable, requiring the use of matched or baseline-adjusted comparison groups, and because baseline data collection itself often yields valuable information about the conditions that policymakers desire to change, we generally keep to the three-part model of sound evaluation design.
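
The three-part design described above can be made concrete with a small numerical sketch. The Python below is illustrative only: the function name `difference_in_differences`, the outcome index, and all of the scores are hypothetical and do not come from any USAID project. It assumes a simple "double difference" comparison, which is only one of several ways to use before-and-after data from an assisted group and a comparison group.

```python
from statistics import mean


def difference_in_differences(treated_before, treated_after,
                              comparison_before, comparison_after):
    """Change in the assisted group minus change in the comparison group."""
    treated_change = mean(treated_after) - mean(treated_before)
    comparison_change = mean(comparison_after) - mean(comparison_before)
    return treated_change - comparison_change


# Hypothetical scores on a 0-100 outcome index (invented for illustration).
# Part 1: baseline data for the communities that will receive assistance.
treated_before = [40, 42, 38, 44]
# Part 2: outcome data for the same communities after the program.
treated_after = [55, 57, 50, 58]
# Part 3: the same before-and-after data for comparison communities.
comparison_before = [41, 39, 43, 41]
comparison_after = [47, 45, 48, 48]

# Looking only at the assisted group credits the program with all change.
naive_change = mean(treated_after) - mean(treated_before)

# Subtracting the comparison group's change removes the background trend.
impact = difference_in_differences(
    treated_before, treated_after, comparison_before, comparison_after
)

print(f"Before-and-after change in assisted group: {naive_change}")
print(f"Estimated program impact (double difference): {impact}")
```

With these made-up numbers, the assisted group improves by 14 points, but the comparison group improves by 6 points without any assistance, so the estimated contribution of the program is 8 points rather than 14. A real evaluation would also need to account for sampling error and for how the comparison group was selected.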

Wide recognition of these practices for determining project impacts does not mean that they are widely or consistently applied, however. Nor does it mean that policy professionals or evaluation specialists agree that the three elements are feasible or appropriate in all circumstances, especially for highly diverse and politically sensitive programs such as democracy assistance or other social programs. Thus, while some areas of development assistance, such as public health, have a long history of using impact evaluation designs to assess whether policy interventions have their intended impact, social programs are generally much less likely to employ such methods.

In 2006 the Center for Global Development (CGD), a think tank devoted to improving the effectiveness of foreign assistance in reducing global poverty and inequality, released the report of an “Evaluation Gap Working Group” convened to focus on the problem of improving evaluations in development projects. Their report concludes:

“Successful programs to improve health, literacy and learning, and household economic conditions are an essential part of global progress. Yet . . . it is deeply disappointing to recognize that we know relatively little about the net impact of most of these social programs. . . . [This is because] governments, official donors, and other funders do not demand or produce enough impact evaluations and because those that are conducted are often methodologically flawed. Too few impact evaluations are being carried out. Documentation shows that UN agencies, multilateral development banks, and developing country governments spend substantial sums on evaluations that are useful for monitoring and operational assessments, but do not put sufficient resources into the kinds of studies needed to judge which interventions work under given conditions, what difference they make, and at what cost.” (Savedoff et al. 2006:1-2)

Although not a focus for the CGD analysis, democracy assistance reflects this general weakness. As a recent survey of evaluations in democracy programming noted: “Lagging behind our programming, however, is research focusing on the impact of our assistance, knowledge of what types of programming is (most) effective, and how programming design and effectiveness vary with differing conditions” (Green and Kohl 2007:152). The Canadian House of Commons recently investigated Canada’s DG programs and came to similar conclusions:

“[W]eaknesses . . . have been identified in evaluating the effectiveness of Canada’s existing democracy assistance funding. . . . Canada should invest more in practical knowledge generation and research on effective democratic development assistance.” (House of Commons 2007)

As discussed in more detail below, there are many reasons why DG projects, and social development programs more generally, are not routinely subject to the highest standards of impact evaluation. One reason is that “evaluation” is a broad concept, of which impact evaluations are but one type (see, e.g., World Bank 2004). On more than one occasion committee members found themselves talking past USAID staff and implementers because they lacked a shared vocabulary and understanding of what was meant by “evaluation.”

Diverse Types of Evaluations

Because the term “evaluation” is used so broadly, it may be useful to review the various types of evaluations that may be undertaken to review aid projects.

The type of evaluation most commonly called for in current USAID procedures is process evaluation. In these evaluations investigators are chosen after the project has been implemented and spend several weeks visiting the program site to study how the project was implemented, how people reacted, and what outcomes can be observed. Such an evaluation often provides vital information to DG missions, such as whether there were problems with carrying out program plans due to unexpected obstacles, or “spoilers,” or unanticipated events or other actors who became involved. They are the primary source of “lessons learned” and “best practices” intended to inform and assist project managers and implementers. They may reveal factors about the context that were not originally taken into account but that turned out to be vital for program success. Process evaluations focus on “how” and “why” a program unfolded in a particular fashion, and if there were problems, why things did not go as originally planned.

However, such evaluations have a difficult time determining precisely how much any observed changes in key outcomes can be attributed to a foreign assistance project.
This is because they often are unable to re-create appropriate baseline data if such data were not gathered before the program started and because they generally do not collect data on appropriate comparison groups, focusing instead on how a given DG project was carried out for its intended participants.

A second type of evaluation is participatory evaluation. In these evaluations the individuals, groups, or communities who will receive assistance are involved in the development of project goals, and investigators interview or survey participants after a project is carried out to determine how valuable the activity was to them and whether they were satisfied with the project’s results. Participatory evaluation is an increasingly important part of both process and impact evaluations. In regard to all evaluations, aid agencies have come to recognize that input from participants is vital in defining project goals and understanding what constitutes success for activities that are intended to affect them. This focus on building relationships and engaging people as a project goal means this type of evaluation may also be considered part of regular project activity and not just a tool to assess its effects.

Using participatory evaluations to determine how much a DG activity contributed to democratic progress, or even to more modest and specific goals such as reducing corruption or increasing legislative competence, can pose problems. Participants’ views of a project’s value may rest on their individual perceptions of personal rewards. This may bias their perception of how much has actually changed because of the program, as they may be inclined to overestimate the impact of an activity if they benefited from it personally and hope to have it repeated or extended. Thus participatory evaluations should be combined with collection of data on additional indicators of project outcomes to provide a full understanding of project impacts.

Another type of evaluation is an output evaluation (generally equivalent to “project monitoring” within USAID). These evaluations consist of efforts to document the degree to which a program has achieved certain targets in its activities. Targets may include spending specific sums on various activities, giving financial support or training to a certain number of nongovernmental organizations (NGOs) or media outlets, training a certain number of judges or legislators, or carrying out activities involving a certain number of villagers or citizens. Output evaluations or monitoring are important for ensuring that activities are carried out as planned and that money is spent for the intended purposes. USAID thus currently spends a great deal of effort on such monitoring, and under the new “F Process,” missions report large numbers of output measures to USAID headquarters (more on this below).
Finally, impact evaluation is the term generally used for those evaluations that aim to establish, with maximum credibility, the effects of policy interventions relative to what would be observed in the absence of such interventions. These require the three parts noted above: collection of baseline data; collection of appropriate outcome data; and collection of the same data for comparable individuals, groups, or communities that, whether by assignment or for other reasons, did and did not receive the intervention.

The most credible and accurate form of impact evaluation uses randomized assignments to create a comparison group; where feasible this is the best procedure to gain knowledge regarding the effects of assistance projects. However, a number of additional designs for impact evaluations exist, and while they offer somewhat less confidence in inferences about program effects than randomized designs, they have the virtue of being applicable in conditions when randomization cannot be applied (e.g., when aid goes to a single group or institution or to a small number of units where the donor has little or no control over selecting who will receive assistance).

Impact evaluations pose challenges to design, requiring skill and not merely science to identify and collect data from an appropriate comparison group and match the best possible design to the conditions of the particular assistance program. The need for baseline data on both the group receiving the policy intervention and the comparison group usually means that the evaluation procedures must be designed before the project is begun and carried out as the project itself is implemented. Finally, the need to collect baseline data and comparison group data may increase the costs of evaluation.

For these reasons, among others, impact evaluations of DG programs are at present the most rarely carried out of the various kinds of evaluations described here. Indeed, many individuals throughout the community of democracy assistance donors and scholars have doubts about the feasibility and utility of conducting rigorous impact evaluations of DG projects. Within the committee, Larry Garber has strongly expressed concerns in this regard, and the committee as a whole has given a great deal of attention to these worries. However, as discussed in Chapters 6 and 7, there are a number of practical ways to deal with these issues, and these were explored in the field by the committee’s consultants in partnership with several missions. In addition, a good evaluation design is not necessarily more expensive or time-consuming than routine monitoring or a detailed process evaluation.

The differences among these distinct kinds of evaluations are often obscured by the way in which the term “evaluation” is used in DG and foreign assistance discussions.
“Evaluation” is often used to imply any estimate or appraisal of the effects of donor activities, ranging from detailed counts of participants in specific programs to efforts to model the aggregate impact of all DG activities in a country on that country’s overall level of democracy. This catch-all use of the term “evaluation” undermines consideration of whether there is a proper balance among various kinds of evaluations, how various types of evaluations are being used, and whether specific types of evaluations are being done or are needed. As another CGD report notes:

“Part of the difficulty in debating the evaluation function in donor institutions is that a number of different tasks are implicitly simultaneously assigned to evaluation: building knowledge on processes and situations in receiving countries, promoting and monitoring quality, informing judgment on performance, and, increasingly, measuring actual impacts. Agencies still need their own evaluation teams, as important knowledge providers from their own perspective and as contributors to quality management. But these teams provide little insight into our actual impacts and, although crucial, their contribution to knowledge essentially focuses on a better understanding of operational constraints and local institutional and social contexts. All these dimensions of evaluations are complementary. For effectiveness and efficiency reasons, they should be carefully identified and organized separately: some need to be conducted in house, some outside in a cooperative, peer review, or independent manner. In short, evaluation units are supposed to kill all these birds with one stone, while all of them deserve specific approaches and methods.” (Jacquet 2006)

Efforts to Improve Assessments and Evaluations by Donor Agencies

There are encouraging signs of efforts to put greater emphasis on impact evaluations for improving democracy and governance programs. The basic questions motivating USAID’s Strategic and Operational Research Agenda (SORA) project are also motivating other international assistance agencies and organizations. The desire to understand “what works and what doesn’t and why” in an effort to make more effective policy decisions and to be more accountable to taxpayers and stakeholders has led a host of agencies to consider new ways to determine the effects of foreign assistance projects.

This focus on impact evaluations in particular has increased since the creation of the Millennium Challenge Corporation (MCC) and the 2005 Paris Declaration on Aid Effectiveness. Yet while there is wide agreement that donors need more knowledge of the effects of their assistance projects, and there are increased efforts to coordinate and harmonize the approaches and criteria employed in pursuit of that knowledge, donors are far from consensus on how best to answer the fundamental questions at issue.
As the Organization for Economic Cooperation and Development (OECD) has stated:

“There is strong interest among donors, NGOs and research institutions in deepening understanding of the political and institutional factors that shape development outcomes. All donors are feeling their way on how to proceed.” (OECD 2005:1)

Several donors have focused on the first question posed above, the question of where to intervene in the process of democratization to help further that process. In the committee’s view this is a question that the current state of knowledge on democratic development cannot answer. It is an essential question, however, and Chapters 3 and 4 suggest specific research programs that might help bring us closer to answers. These issues are more a matter of strategic assessment of a country’s condition and potential for democratic development, rather than evaluation, a term the committee thinks is better reserved for studying the effects of specific DG programs. Nonetheless, several national development assistance agencies have, under the general rubric of improving evaluation, sought to improve their strategic assessment tools.

What all of the following donor programs have in common is an increased effort at acquiring and disseminating knowledge about how development aid works in varied contexts. The broad range of current efforts to revise and improve evaluation procedures undertaken by national and international assistance agencies described below is aimed at better understanding the fundamental questions of interest to all: “what works and what doesn’t and why,” although at present only some involve the use of impact evaluations.

Perhaps the most visible leader in efforts to increase the use of impact evaluations is MCC, which has set a high standard for the integration of impact evaluation principles into the design of programs at the earliest stages and for the effective use of baseline data and control groups:

“There are several methods for conducting impact evaluations, with the use of random assignment to create treatment and control groups producing the most rigorous results. Using random assignment, the control group will have, on average, the same characteristics as the treatment group. Thus, the only difference between the two groups is the program, which allows evaluators to measure program impact and attribute the results to the MCC program. For this reason, random assignment is a preferred impact evaluation methodology. Because random assignment is not always feasible, MCC may also use other methods that try to estimate results using a credible comparison group, such as double difference, regression discontinuity, propensity score matching, or other type of regression analysis.” (MCC 2007:19)

The World Bank has also embarked on the use of impact evaluations for aid programs through its Development Impact Evaluation (DIME) project. Many of the DIME studies involve randomized-experimental evaluations; moreover, “rather than drawing policy conclusions from one-time experiments, DIME evaluates portfolios of similar programs in multiple countries to allow more robust assessments of what works” (Banerjee 2007:30).

A major symposium on economic development aid also recently explored the pros and cons of conducting impact evaluations of specific programs (Banerjee 2007). While there were numerous objections to the unrestrained use of such methods (which are explored in more detail in Chapters 6 and 7 below), many eminent contributors urged that foreign aid cannot become more effective if we are unwilling to subject our assumptions about how well various assistance programs work to credible tests. The lead author argued that ignorance of general principles to guide successful economic development (a situation that applies as much or more to our knowledge of democratization) is a powerful reason to take the more humble step of simply trying to determine which aid projects in fact work best in attaining their specific goals.

Footnote: The CGD has also created the International Initiative for Impact Evaluation to encourage greater use of this method. See http://www.cgdev.org/section/initiatives/_active/evalgap/calltoaction.

The Department for International Development (DfID) of the United Kingdom has developed the “Drivers of Change” approach because “donors are good at identifying what needs to be done to improve the lives of the poor in developing countries. But they are not always clear about how to make this happen most effectively” (DfID 2004:1). By focusing on the incorporation of “underlying political systems and the mechanics of pro-poor change . . . in particular the role of institutions, both formal and informal” into their analysis, this approach attempts to uncover more clearly what fosters change and reduces poverty. This approach is currently being widely applied to multiple development contexts and is being taught to numerous DfID country offices (OECD 2005:1).

Multipronged approaches to evaluation are being employed by the German Agency for Technical Cooperation (Deutsche Gesellschaft für Technische Zusammenarbeit, GTZ). The range of instruments currently being employed is based on elements of self-evaluation as well as independent and external evaluations. Evaluations aim to address questions of relevance, effectiveness, impact, efficiency, and sustainability. These questions are addressed throughout the project’s life span as a means of better understanding the links between inputs and outcome.
Commitment by the GTZ to evaluations is demonstrated by the agency’s increased spending on these activities, “roughly 1.2 percent of its public benefit turnover on worldwide evaluations, some EUR 9 million a year” (Schmid 2007).

Footnote: For further information, see “Working on Sustainable Results: Evaluation at GTZ.” Available at: http://www.gtz.de/en/leistungsangebote/6332.htm. Accessed on September 12, 2007.

The Swedish Agency for International Development Cooperation (SIDA) is also actively considering ways to improve its evaluation tools. Since 2005, SIDA has shifted from post-hoc project evaluations to a focus on underlying assumptions and theories; specifically, SIDA is currently conducting a project that “looks at the program theory of a number of different projects in the area. This evaluation focuses on the theoretical constructs that underpin these projects and tries to discern patterns of ideas and assumptions that recur across projects and contexts.” Building on these initial efforts, SIDA hopes to combine the results of this study with others to “make an overall assessment of the field.”

The Norwegian Agency for Development Cooperation (NORAD) has also initiated a new strategy for evaluating the effectiveness of its programs in the area of development assistance. The intent of this new strategy, undertaken in 2006, is to “help Norwegian aid administrators learn from experience by systematizing knowledge, whether it is developed by (themselves), in conjunction with others, or entirely by others. Additionally, the evaluation work has a control function to assess the quality of the development cooperation and determine whether resources applied are commensurate with results achieved.” Additional attention is being paid to communicating the results of such evaluations with other agencies and stakeholders; this emphasis on communicating results is widely shared in the donor community.

The Danish Ministry of Foreign Affairs has embarked on an extensive study of both its own and multilateral agencies’ evaluations of development and democracy assistance (Danish Ministry of Foreign Affairs 2005). It has found that evaluations vary greatly in method and value, with many evaluations failing to provide unambiguous determinations of program results. In regard to the United Nations Development Program’s central evaluation office, “its potential for helping strengthen accountability and performance assessment is being underexploited, both for the purpose of accountability and as an essential basis for learning” (Danish Ministry of Foreign Affairs 2005:4).

Finally, the Canadian International Development Agency (CIDA) has been involved in recent efforts to improve evaluation and learning from collective experiences at international assistance in the area of democracy and governance.
In April 1996, as part of its commitment to becoming more results- oriented, CIDA’s President issued the “Results-Based Management in CIDA—Policy Statement.” This statement consolidated the agency’s experience in implementing Results-Based Management (RBM) and established some of the key terms, basic concepts and implementation principles. It has since served as the basis for the development of a variety of management tools, frameworks, and training programs. The Agency Accountability Framework, approved in July 1998, is another   For more information on this project, see SIDA, “Sida’s Work with Democracy and Hu- man Rights.” Available at: http://www.sida.se/sida/jsp/sida.jsp?d=1509&a=32056&language=en _US. Accessed on September 12, 2007.   For more information, see NORAD’s Web site: http://www.norad.no/default.asp?V_ITEM_ ID=5704. The new strategy discussed here can be found at http://www.norad.no/items/5704/38/ 7418198779/EvaluationPolicy2006-2010.pdf. Accessed on September 12, 2007.

EVALUATION IN USAID DG PROGRAMS 53 key component of the results-based management approach practiced in CIDA. (CIDA 2007) The CIDA report makes an important distinction, however: “The frame- work articulates CIDA’s accountabilities in terms of developmental results and operational results at the overall agency level, as well as for its various development initiatives. This distinction is crucial . . . since the former is defined in terms of actual changes achieved in human development through CIDA’s development initiatives, while the latter represents the administration and management of allocated resources (organisational, human, intellectual, physical/material, etc.) aimed at achieving develop- ment results.” In short, there is growing agreement—across think tanks, blue-ribbon panels, donor agencies, and foreign ministries—that current evaluation practices in the area of foreign assistance in general, and of democracy assistance in particular, are inadequate to guide policy and that substan- tial efforts are needed to improve the knowledge base for policy planning. Thus, USAID is not alone in struggling with these issues. Current Policy and Legal Framework for USAID DG Assessments and Evaluations Current DG policies regarding project assessment and evaluation are shaped in large part by broader USAID and U.S. government policies and regulations. Official USAID polices and procedures are set forth in the Automated Directives System (ADS) on its Web site; Series 200 on “Pro- gramming Policy” covers monitoring and evaluation in Section 203 on “Assessing and Learning” (USAID ADS 2007). Of particular importance for this report, in 1995 the USAID leadership decided to eliminate the requirement of a formal evaluation for every major project; instead evalu- ations would be “driven primarily by management need” (Clapp-Wincek and Blue 2001:1). 
The prior practice of conducting mainly post-hoc evaluations (which were almost entirely process evaluations), often done by teams of consultants brought in specifically for the task, was seen as too expensive and time consuming to be applied to every project.

As a result of the change, the number of evaluations for all types of USAID assistance, not just DG, has declined, and the approach to evaluation has evolved over time (Clapp-Wincek and Blue 2001). ADS 203.3.6.1 (“When Is an Evaluation Appropriate?”) lists a number of situations that should require an evaluation:

• A key management decision is required, and there is inadequate information;
• Performance information indicates an unexpected result (positive or negative) that should be explained (such as gender differential results);
• Customer, partner, or other informed feedback suggests that there are implementation problems, unmet needs, or unintended consequences or impacts;
• Issues of sustainability, cost effectiveness, or relevance arise;
• The validity of Results Framework hypotheses or critical assumptions is questioned (e.g., due to unanticipated changes in the host country environment);
• Periodic Portfolio Reviews have identified key questions that need to be answered or that need consensus; or
• Extracting lessons is important for the benefit of other Operating Units or future programming (USAID ADS 2007:24).

These evaluations generally remain traditional process evaluations, undertaken by teams of outside experts while a project is under way or after it has been completed.

The second significant policy shaping USAID evaluation practices is the Government Performance and Results Act (GPRA) of 1993. GPRA “establishes three types of ongoing planning, evaluation, and reporting requirements for executive branch agencies: strategic plans . . . , annual performance plans, and annual reports on program performance. In complying with GPRA, agencies must set goals, devise performance measures, and then assess results achieved” (McMurtry 2005:1). GPRA has led to the development of an elaborate performance monitoring system across the federal government. Performance monitoring is different from evaluation; as defined by USAID, for example:

    Performance monitoring systems track and alert management as to whether actual results are being achieved as planned. They are built around a hierarchy of objectives logically linking USAID activities and resources to intermediate results and strategic objectives through cause-and-effect relationships. For each objective, one or more indicators are selected to measure performance against explicit targets (planned results to be achieved by specific dates). Performance monitoring is an ongoing, routine effort requiring data gathering, analysis, and reporting on results at periodic intervals.

    Evaluations are systematic analytical efforts that are planned and conducted in response to specific management questions about performance of USAID-funded development assistance programs or activities. Unlike performance monitoring, which is ongoing, evaluations are occasional, conducted when needed. Evaluations often focus on why results are or are not being achieved. Or they may address issues such as relevance, effectiveness, efficiency, impact, or sustainability. Often, evaluations provide management with lessons and recommendations for adjustments in program strategies or activities. (USAID 1997:1)

Footnote: Clapp-Wincek and Blue (2001), for example, define evaluation as “any empirically-based analysis of problems, progress, achievement of objectives or goals, and/or unintended consequences for missions” (p. 2).

To implement the system required by GPRA, every USAID operating unit (missions overseas, bureaus or offices in Washington) must develop strategic objectives (SOs). The DG office created a process for strategic assessments that is often used to inform the development of mission strategies (USAID 2000). Typically, a team of experts, which may include a mix of contractors and USAID personnel, spends several weeks evaluating current conditions in a country with respect to key aspects of democracy and governance and analyzing the opportunities for intervention and impact. This assessment is not quite keyed to the four elements in USAID’s DG goals described in Chapter 1, however. Rather, strategic assessments deal with five areas: consensus, rule of law, competition, inclusion, and good governance. After surveying the degree to which the country has these elements, the assessment considers the key actors and institutions whose behavior or condition needs to change to improve democratic development and then suggests policies (with explicit attention to feasibility given the resources of USAID in that country and country conditions) to promote advances in some or all of these areas. Not every country is assessed, and some country assessments may be updated if conditions change enough to warrant a reexamination. Since the formal assessment tool was adopted in late 2000, more than 70 assessments have been conducted in 59 countries.
Footnote: Interviews with USAID personnel, August 1, 2007 and March 3, 2008. Not all of the assessments are made public because missions sometimes consider the judgments politically sensitive.

To achieve their strategic objectives, all USAID operating units develop a Results Framework and a Program Monitoring Plan that include subobjectives that tie more closely to specific projects (see Figure 2-1 for an illustrative results framework). Depending on the size of its budget and other factors, a mission might have anywhere between one and a dozen SOs, of which one or perhaps two will relate to democracy and governance. Indicators are used to track progress from the project level through intermediate objectives up to the SO. Missions are required to report their performance against these indicators annually, but below the SO level they can choose which indicators to report and can change the indicators they report each year. Generally, each contract or grant must have an approved performance monitoring plan, which includes both targets and the indicators that will be used to determine whether the project meets its objectives (USAID ADS 2007). Some implementers also develop and track additional indicators, usually to provide further evidence of achieving project goals. In DG alone, thousands of indicators are used every year to track project performance. Most of them are related to the outputs of specific activities or very proximate project outcomes.

Figure 2-1 Illustrative Results Framework. SOURCE: USAID 2004. Integrated Strategic Plan for USAID’s Program in Uganda, 2002-2007. Vol. 1: Assisting Uganda to reduce mass poverty. Washington, DC: USAID.

This process, supplemented by occasional evaluations, constitutes the largest portion of what USAID refers to as “monitoring and evaluation.” The result of this process is that USAID DG missions spend a large amount of time and money acquiring and transmitting the most basic accounting-type information on their projects (what is described above as “output” evaluations); far less time and money are spent in determining which projects really work and how efficient they are at producing desired results.

In January 2006, Secretary of State Condoleezza Rice initiated a series of reforms, centered on the budget and program planning process, intended to bring greater coherence to U.S. foreign assistance programs (USAID 2006). As part of these reforms the USAID administrator was designated the director of foreign assistance (DFA) and provided with a staff in the State Department to supplement the staff of USAID in implementing the reforms. Instead of a largely bottom-up process that collected, coordinated, and eventually reconciled budget and program requests from individual offices and missions, the new F Process exercised an unprecedented degree of centralized control, setting common objectives for State and USAID and bringing most budget and programming decisions to Washington. Eventually, the first joint State-USAID budget was submitted to Congress for FY2008, with significant changes in aid allocations for a number of countries (Kessler 2007).
Footnote: A number of projects, however, including the MCC and the President’s Emergency Fund for AIDS, were not included in the F Process for the FY2008 budget.

Creation of the DFA structure in the State Department led to the dissolution of the separate policy planning apparatus in USAID. As part of this change, the Center for Development Information and Evaluation (CDIE), which served as a clearinghouse for all evaluations in USAID and had also commissioned the series of independent evaluations of USAID DG programs discussed above, was dissolved and its personnel were transferred into the new DFA Office of Strategic Information in the State Department.

The F Process also resulted in the creation of a set of common indicators collected for all programs in all missions. Most of these are output measures, which for the first time provided a comprehensive look at USAID activities worldwide (U.S. Department of State 2006). Their use in DG is examined in greater detail below. While these output indicators are designed to reflect the overall level of USAID DG activity in a country, they are not intended to provide a strategic assessment of levels of democracy in a country or evidence of the impact of specific DG projects.

Any recommendations for changing the approach to evaluation of DG programs will have to operate within this broader context in USAID and the wider donor community. Within USAID the GPRA-required structure of SOs for programs and performance monitoring for projects is a legal mandate that USAID can adapt but not eliminate. How much of the F Process will endure is unclear at present, but it does illustrate how much can happen, and how quickly, with high-level leadership and support.

Three Key Problems with Current USAID Monitoring and Evaluation Practices

Focusing on Appropriate Measures Regarding DG Activities

As noted above, USAID has developed many good indicators to track the results of its DG projects. USAID is clearly aware of the important differences between various levels of indicators: those dealing with attaining targeted outputs, those dealing with the institutional or behavioral changes sought by the program, those dealing with broad sectoral changes at the country level, and those dealing with national levels of democracy. The Handbook of Democracy and Governance Program Indicators, developed by the Center for Democracy and Governance (USAID 1998) as part of the implementation of GPRA, is the most comprehensive collection of indicators in this area of which the committee is aware. It sets forth detailed suggestions on how to measure outputs and outcomes in the four areas of concern to the DG office: rule of law, elections and political processes, civil society, and governance.
It provides a valuable resource to missions and subcontractors as they develop appropriate indicators to assess the impact of specific programs in these sectors.

The development of output measures, especially in some program areas, has continued. The following is taken from the draft of a handbook on support for decentralization programming, currently being prepared for use by USAID:

    A distinction should be drawn at the outset between two different kinds of M&E [monitoring and evaluation] activities. One kind of M&E seeks to assess progress on program implementation, that is, the process of implementing decentralization reforms. To this end, one might gather and analyze data on what are sometimes called output indicators: the number of meetings and workshops held, officials trained, and so on. These kinds of indicators can help to document whether necessary steps are being taken towards effective support of decentralization programs, and they may be especially useful as management tools for program implementation.

    Another kind of M&E, however, seeks to assess the impact of decentralization programming on the broader goals described in this handbook: enhancing stability, promoting democracy, and fostering economic development. The key questions are whether and how we can attribute outcomes along these dimensions, or aspects of these dimensions, to the effect of USAID initiatives in support of decentralization programming. This kind of M&E is crucial, for it is the only way to assess what works and what does not in decentralization programming. (USAID 2007)

A few of the democracy indicators recommended by this handbook include:

• Ease with which political parties can register to participate in elections;
• Ability of independent candidates to run for office;
• Number of human rights violations, as tracked by civil society organizations (CSOs) or ombudsman’s office;
• Proportion of citizens who positively evaluate government responsiveness to their demands;
• Existence of competitive local elections;
• Percentage of total subnational budget under the control of participatory bodies.

USAID has also funded various agencies to collect valuable data on outcome indicators. For example, a recent national survey in Afghanistan conducted by the Asia Foundation (2007) and underwritten by USAID collected data on the following indicators and many others:

• Do you agree or disagree with the statement that some people make: “I don’t think the government cares much about what people like me think.”
• How would you rate the security situation in your area: Excellent, good, fair, or poor?
• Compared to a year ago, do you think the amount of corruption overall in your neighborhood has increased, stayed the same, or decreased? In your province? In Afghanistan as a whole?
• Would you participate in the following activities with no fear, some fear, or a lot of fear: voting, participating in a peaceful demonstration, running for public office?

Such survey questions make excellent baseline indicators on outcome measures for many DG assistance projects. USAID could then survey assisted and nonassisted groups on the same questions a year later to help determine the impact of DG assistance. This is an example where USAID can make use of extant surveys that already provide baseline data on a variety of relevant outcome measures.

A more centralized set of indicators was developed as part of the F Process. As mentioned above, the Foreign Assistance Performance Indicators are intended to measure “both what is being accomplished with U.S. foreign assistance funds and the collective impact of foreign and host-government efforts to advance country development” (U.S. Department of State 2006). Indicators are divided into three levels: (1) the Objective level, which usually consists of country-level outcomes, as collected by other agencies such as the World Bank, United Nations Development Program, and Freedom House; (2) the Area level, measuring performance of subsectors such as “governing justly and democratically,” which captures most of the objectives pursued by the DG office; and (3) the Element level, which seeks to measure outcomes that are directly attributable to USAID programs, projects, and activities, using data collected primarily by USAID partners in the field (U.S. Department of State 2006).

Clearly, USAID has taken the task of performance-based policymaking seriously. The central DG office, the various missions throughout the world, and the implementers who support USAID’s work in the field are all acutely aware of the importance of measurement and the various obstacles encountered. The concerns the committee heard were often not that USAID lacks the right measures to track the outcomes of its programs.
Although this can be a major problem for some areas of DG, the committee also saw evidence that USAID field missions and implementers have, and seek to use, appropriate measures for program outcomes. Rather, the problem is that the demands to supply detailed data on basic output measures or to show progress on more general national-level measures overwhelm or sidetrack efforts that might go into collecting data on the substantive outcomes of projects.

Matching Tasks with Appropriate Measurement Tools

Broadly speaking, USAID is concerned with three measurement-related tasks: (1) project monitoring, (2) project evaluation, and (3) country assessment. The first concerns routine oversight (e.g., whether funds are being properly allocated and implementers are adhering to the terms of a contract). The second concerns whether the program is having its intended effect on society. The third concerns whether a given country is progressing or regressing in a particular policy area with regard to democratization (USAID 2000).

Corresponding to these different tasks are three basic types of indicators: outputs, outcomes, and meso- and macro-level indicators. Output measures track the specific activities of a project, such as the number of individuals trained or the organizations receiving assistance. Outcome measures track policy-relevant factors that are expected to flow from a particular project (e.g., a reduction in corruption in a specific agency, an increase in the autonomy and effectiveness of specific courts, an improvement in the fairness and accuracy of election vote counts). Meso- and macro-level measures are constructed to assess country-level features of specific policy areas and are often at levels of abstraction that are particularly difficult to determine with any exactness. Examples include “judicial autonomy,” “quality of elections,” “strength of civil society,” and “degree of political liberties.” For purposes of clarification, these concepts are included, along with an illustrative example, in Table 2-1.

TABLE 2-1 Measurement Tools and Their Uses

1. Output Indicator
   Definition: Indicator focused on counting activities or immediate results of a program
   Level: Generally subnational
   Example (improving elections): Number of polling stations with election observers
   Objective: Monitoring

2. Outcome Indicator
   Definition: Indicator focused on policy-relevant impacts of a program
   Level: Generally subnational
   Example (improving elections): Reduction in irregularities at the polls (bribing, intimidation)
   Objective: Evaluation

3. Meso-Level Indicator
   Definition: Indicator focused on broad national characteristics of a policy area or sector
   Level: National
   Example (improving elections): Quality of election
   Objective: Assessment

4. Macro-Level Indicator
   Definition: Indicator focused on national levels of democracy
   Level: National
   Example (improving elections): Level of democracy (e.g., Freedom House Index of Political Rights)
   Objective: Assessment

As noted, USAID has made extensive efforts to identify indicators at all levels and across a wide range of sectors of democratic institutions. Nonetheless, in practice a mismatch often arises between the chosen measurement tools and the tasks these tools are expected to perform. Two problems, in particular, stand out. First, based on the committee’s discussions with USAID staff and implementers and further discussions and reviews of project documents during the three field visits described in Chapter 7, there is continuing concern about judging the effectiveness of specific USAID DG projects on the basis of meso- or macro-level indicators, such as the overall quality of elections or even changes in national-level indicators of democracy. Second is whether current practices lead to overinvestment in generating and collecting basic output measures, as opposed to policy-relevant indicators of project results.

The F Process indicators reflect both of these problems, although they had little impact on day-to-day project implementation during the course of this study. As noted above, these indicators mandate collecting data at the “Objective” and “Area” levels, which correspond to macro- and meso-level indicators in the table, and at the “Element” level, which corresponds mostly to the output level. Data at the outcome level, which seems crucial to evaluating how well specific projects actually achieve their immediate goals, thus suffer relative neglect.

USAID mission staff and program implementers complained that the success of their projects was being judged (in part) on the basis of macro-level indicators that bore very little or no plausible connection to the projects they were running, given the limited funds expended and the macro nature of the indicator. The most common example given was the use of changes in the Freedom House Political Rights or Civil Liberties Index as evidence of the effectiveness or ineffectiveness of their projects, even though movements in these national-level indices were often quite evidently beyond their control. One implementer commented that his group had benefited from an apparent perception that his project had contributed to improvements in the country’s Freedom House scores over the past several years.
While this perception worked in his firm’s favor, he made it clear that the timing was purely coincidental; he was also concerned that if the government policies that currently helped his work changed and made it more difficult, this would be taken as evidence that his project had “failed.”

This is a poor way to measure project effectiveness. To use the example in Table 2-1, although USAID may contribute to better elections or even more democracy in a nation as a whole, there are always multiple forces and often multiple donors at work pursuing these broad goals. USAID may be very successful in helping a country train and deploy election monitors and thus reduce irregularities at the polling stations. But if the national leaders have already excluded viable opposition candidates from running, or deprived them of media access, the resulting flawed elections should not mean that USAID’s specific election project was not effective. As a senior USAID official with extensive experience in many areas of foreign assistance has written regarding this problem:

    To what degree should a specific democracy project, or even an entire USAID democracy and governance programme, be expected to have an independent, measurable impact on the overall democratic development in a country? Th[at] sets a high and perhaps unreasonable standard of success. Decades ago, USAID stopped measuring the success of its economic development programmes against changes in the recipient countries’ gross domestic product (GDP). Rather, we look for middle-level indicators: we measure our anti-malaria programmes in the health sector against changes in malaria statistics, our support for legume research against changes in agricultural productivity. What seems to be lacking in democracy and governance programmes, as opposed to these areas of development, is a set of middle-level indicators that have two characteristics: (a) we can agree that they are linked to important characteristics of democracy; and (b) we can plausibly attribute a change in those indicators to a USAID democracy and governance programme. It seems clear that we need to develop a methodology that is able to detect a reasonable, plausible relationship between particular democracy activities and processes of democratic change. (Sarles 2007:52)

The appropriate standard for evaluating the effectiveness of specific DG projects and even broader programs is how much of the targeted improvement in behavior and institutions can be observed compared to conditions in groups not supported by such projects or programs. It is in identifying how much difference specific programs or projects made, relative to the investment in such programs, that USAID can learn what works best in given conditions.

Of course, it is hoped that such projects do contribute to broader processes of democracy building.
But these broader processes are subject to so many varied forces (from strategic interventions to ongoing conflicts to other donors’ actions and the efforts of various groups in the country to obtain or hold on to power) that macro-level indicators are a misleading guide to whether or not USAID projects are in fact having an impact. USAID efforts in such areas as strengthening election commissions, building independent media, or supporting opposition political parties may be successful at the project level but only become of vital importance to changing overall levels of democracy much later, when other factors internal to the country’s political processes open opportunities for political change (McFaul 2006). Learning “what works” requires that USAID focus its efforts to gather and analyze data on outcomes at the appropriate level for evaluating specific projects, that is, what are labeled “outcome” measures in Table 2-1.

The committee wants to stress that there are good reasons for employing meso- and macro-level indicators of democracy and working to improve them. They are important tools for strategic assessment of a country’s current condition and long-term trajectory regarding democratization. But these indicators are usually not good tools for project evaluation. For the latter purpose, what is needed, as Sarles noted, are measures that are both policy relevant and plausibly linked to a specific policy intervention sponsored by USAID. The committee discusses these policy-relevant outcome measures and provides examples from our field visits in Chapter 7.

If one concern regarding USAID’s evaluation processes is that they may rely too much on meso- and macro-measures to judge program success, the committee also found a related concern regarding USAID’s data collection for M&E: USAID devotes by far the bulk of its M&E effort to data at the “output” level, the first category in Table 2-1.

Current M&E Practices and the Balance Among Types of Evaluations

In the current guidelines for USAID’s M&E activities given earlier, only monitoring is presented as “an ongoing, routine effort requiring data gathering, analysis, and reporting on results at periodic intervals.” Evaluation, by contrast, is presented as an “occasional” activity to be undertaken “only when needed.” The study undertaken for SORA by Bollen et al. (2005), discussed in Chapter 1, found that most USAID evaluations were process evaluations. These can provide valuable information and insights but, as already discussed, do not help assess whether a project had its intended impact.

Although it cannot claim to have made an exhaustive search, the committee asked repeatedly for examples of impact evaluations for DG projects. The committee learned about very few. One example was a well-designed impact evaluation of a project to support CSOs in Mali (Management Systems International 2000). Here the implementers had persuaded USAID to make use of annual surveys being done in the country, and to use those surveys to measure changes in attitudes toward democracy in three distinct areas: those that received the program, those that were nearby but did not receive the program (to check for spillover effects), and areas that were distant from the sites of USAID activity.
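A three-area design of this kind can be read, under simple assumptions, as a difference-in-differences comparison: the change in the outcome measure in program areas (or nearby areas) minus the change over the same period in distant areas. The sketch below is illustrative only. The source does not say which estimator the Mali evaluation used, and the area labels, function names, and survey scores are all hypothetical, invented to show the arithmetic.

```python
from statistics import mean

# Hypothetical survey scores (e.g., percent agreeing with a pro-democracy
# statement) by area type and survey wave. These numbers are invented for
# illustration; they are NOT results from the Mali evaluation.
SURVEYS = {
    "program": {"baseline": [40, 42, 38], "followup": [46, 47, 45]},
    "nearby":  {"baseline": [41, 39, 40], "followup": [43, 42, 44]},
    "distant": {"baseline": [39, 41, 40], "followup": [41, 42, 43]},
}


def change(area):
    """Mean follow-up score minus mean baseline score for one area type."""
    waves = SURVEYS[area]
    return mean(waves["followup"]) - mean(waves["baseline"])


def diff_in_diff(area, comparison="distant"):
    """Change in `area` net of the change in the comparison area."""
    return change(area) - change(comparison)


# Estimated program effect: program areas versus distant areas.
program_effect = diff_in_diff("program")
# Estimated spillover: nearby, non-assisted areas versus distant areas.
spillover = diff_in_diff("nearby")

print(f"program effect: {program_effect}, spillover: {spillover}")
```

The distant areas stand in for what would have happened without the program, so country-wide shifts in attitudes are netted out of both estimates. A real evaluation would also need comparable areas, adequate sample sizes, and a measure of statistical uncertainty, none of which this sketch addresses.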
The results of this evaluation suggested that USAID programs were not having as much of an impact as the implementers and USAID had hoped to see.

The response within USAID was informative. Some USAID staff members were concerned that a great deal of money had been spent to find little impact; complaints were thus made that the evaluation design had not followed changes made while the program was in progress or was not designed to be sensitive to the specific changes USAID was seeking. On the other hand, there were also questions about whether annual surveys were too frequent or too early to capture the results of investments that were likely to pay off only in the longer term. And the project, by funding hundreds of small CSOs, might have suffered from its own design flaws; some of those who took part in the project suggested that fewer and larger investments in a select set of CSOs might have had a greater impact. All of these explanations might have been explored further as a way to understand when and how impact evaluations work best. But from the committee’s conversations, the primary “lessons” taken away by some personnel at USAID were that such rigorous impact evaluations either did not work or were not worth the time, effort, and money given what they expected to get from them.

While certainly only a limited number of projects should be subject to full evaluations, proper impact evaluations cannot be carried out unless “ongoing and routine efforts” to gather appropriate data on policy-relevant outcomes before, during, and after the project are designed into an M&E plan from the inception of the project. Current guidelines for M&E activity tend to hinder making choices between impact and process evaluations and in particular make it very difficult to plan the former. Chapter 7 discusses, based on the committee’s field visits to USAID DG missions, the potential for improving, in some cases, USAID M&E activities simply by focusing more efforts on obtaining data at the policy outcome level.

Using Evaluations Wisely: USAID as a Learning Organization

Even if USAID were to complete a series of rigorous evaluations with ideal data and obtain valuable conclusions regarding the effectiveness of its projects, these results would be of negligible value if they were not disseminated through the organization in a way that led to substantial learning and were not used as inputs to planning and implementation of future DG projects. Unfortunately, much of USAID’s former learning capacity has been reduced by recent changes in agency practice. A longstanding problem is that much project evaluation material is simply maintained in mission archives or lost altogether (Clapp-Wincek and Blue 2001).
For example, the committee found that when project evaluations involved surveys, while the results might be filed in formal evaluation reports, the underlying raw data were discarded or kept by the survey firm after the evaluation was completed. While many case studies of past projects, as well as many formal evaluations, are supposed to be available to all USAID staff online, not all evaluations were easy to locate.

Moreover, simply posting evaluations online does not facilitate discussion, absorption, and use of lessons learned. Without a central evaluation office to identify key findings and organize conferences or meetings of DG officers to discuss those findings, the information is effectively lost. As mentioned above, CDIE is no longer active. USAID also formerly had conferences of DG officers to discuss not only CDIE-sponsored evaluations but also research and reports on DG assistance undertaken by NGOs, academics, and other donors. These activities appear to have significantly atrophied.

The committee is concerned about the loss of these learning activities. Even the best evaluations will not be used wisely if their lessons are not actively discussed and disseminated in USAID and placed in the context of lessons learned from other sources, including research on DG assistance from outside the agency and the experience of DG officers themselves. The committee discusses the means to help USAID become a more effective learning organization in Chapters 8 and 9.

Conclusions

This review of current evaluation practices regarding development assistance in general and USAID’s DG programs in particular leads the committee to a number of findings:

• The use of impact evaluations to determine the effects of many parts of foreign assistance, including DG, has been historically weak across the development community. Within USAID the evaluations most commonly undertaken for DG programs are process and participatory evaluations; impact evaluations are a comparatively underutilized element in the current mix of M&E activities.
• Some donors and international agencies are beginning to implement more impact evaluations. Nonetheless, considerable concerns and skepticism remain regarding the feasibility and appropriateness of applying impact evaluations to DG projects. These need to be taken seriously and addressed in any effort to introduce them to USAID.
• Current practices regarding measurement and data collection show a tendency to emphasize collection of output measures rather than policy-relevant outcome measures as the core of M&E activities. There is also a tendency, in part because of the lack of good meso-level indicators, to judge the success of DG programs by changes in macro-level measures of a country’s overall level of democracy, rather than by achieving outcomes more relevant to a project’s plausible impacts.
• Much useful information aside from evaluations, such as survey data and reports, detailed spending breakdowns, and mission director and DG staff reports, remains dispersed and difficult to access.
• USAID has made extensive investments in developing outcome measures across all its program areas; these provide a sound basis for improving measurements of the policy-relevant effects of DG projects.
• Once USAID evaluations are completed, there are few organizational mechanisms for broad discussion of them among DG officers or for integration of evaluation findings with the large range of research on democracy and democracy assistance being carried on outside the agency.
• Many of the mechanisms and opportunities for providing organizational learning were carried out under the aegis of the CDIE. The dissolution of this unit, combined with the longer term decline in regular evaluation of projects, means that USAID’s capacity for drawing and sharing lessons has disappeared. The DG office’s own efforts to provide opportunities for DG officers and implementers to meet and learn from one another and outside experts have also been eliminated.
• Evaluation is a complex process, so improving the mix of evaluations and their use, and in particular increasing the role of impact evaluations in that mix, will require a combination of changes in USAID practices. Gaining new knowledge from impact evaluations will depend on developing good evaluation designs (a task that requires special skills and expertise), acquiring good baseline data, choosing appropriate measures, and collecting data on valid comparison groups. Determining how to feasibly add these activities to the current mix of M&E activities will require attention to the procedures governing contract bidding, selection, and implementation. The committee’s recommendations for how USAID should address these issues are presented in Chapter 9.

Moreover, better evaluations are but one component of an overall design for learning, as making the best use of evaluations requires placing the results of all evaluations in their varied contexts and historical perspectives. This requires regular activities within USAID to absorb and disseminate lessons from case studies, field experience, and research from outside USAID on the broader topics of democracy and social change. The committee’s recommendations on these issues are presented in Chapter 8.
These recommendations are intended to improve the value of USAID’s overall mix of evaluations, to enrich its strategic assessments, and to enhance its capacity to share and learn from a variety of sources, both internal and from the broader community, about what works and what does not in efforts to support democratic progress.

REFERENCES

Asia Foundation. 2007. Afghanistan in 2007: A Survey of the Afghan People. Available at: http://www.asiafoundation.org/pdf/AG-survey07.pdf. Accessed on February 23, 2008.
Banerjee, A.V. 2007. Making Aid Work. Cambridge, MA: MIT Press.
Bollen, K., Paxton, P., and Morishima, R. 2005. Assessing International Evaluations: An Example from USAID’s Democracy and Governance Programs. American Journal of Evaluation 26:189-203.

CIDA (Canadian International Development Agency). 2007. Results-Based Management in CIDA: An Introductory Guide to the Concepts and Principles. Available at: http://www.acdi-cida.gc.ca/CIDAWEB/acdicida.nsf/En/EMA-218132656-PPK#1. Accessed on September 12, 2007.
Clapp-Wincek, C., and Blue, R. 2001. Evaluation of Recent USAID Evaluation Experience. Washington, DC: USAID, Center for Development Information and Evaluation.
Danish Ministry of Foreign Affairs. 2005. Peer Assessment of Evaluation in Multilateral Organizations: United Nations Development Programme, by M. Cole et al. Copenhagen: Ministry of Foreign Affairs of Denmark.
DfID (Department for International Development). 2004. Public Information Note: Drivers of Change. Available at: http://www.gsdrc.org/docs/open/DOC59.pdf. Accessed on September 16, 2007.
Green, A.T., and Kohl, R.D. 2007. Challenges of Evaluating Democracy Assistance: Perspectives from the Donor Side. Democratization 14(1):151-165.
House of Commons (Canada). 2007. Advancing Canada’s Role in International Support for Democratic Development. Ottawa: Standing Committee on Foreign Affairs and International Development.
Jacquet, P. 2006. Evaluations and Aid Effectiveness. In Rescuing the World Bank: A CGD Working Group Report and Collected Essays, N. Birdsall, ed. Washington, DC: Center for Global Development.
Kessler, G. 2007. Where U.S. Aid Goes Is Clearer, But System Might Not Be Better. Washington Post, p. A1.
McFaul, M. 2006. The 2004 Presidential Elections in Ukraine and the Orange Revolution: The Role of U.S. Assistance. Washington, DC: USAID, Office for Democracy and Governance.
McMurtry, V.A. 2005. Performance Management and Budgeting in the Federal Government: Brief History and Recent Developments. Washington, DC: Congressional Research Service.
Management Systems International. 2000. Third Annual Performance Measurement Survey: Data Analysis Report. USAID/Mali Democratic Governance Strategic Objective. Unpublished.
Millennium Challenge Corporation. 2007. Fiscal Year 2007 Guidance for Compact Eligible Countries, Chapter 29, Guidelines for Monitoring and Evaluation Plans, p. 19. Available at: http://www.mcc.gov/countrytools/compact/fy07guidance/english/29-guidelinesformande.pdf. Accessed on September 12, 2007.
OECD (Organization for Economic Cooperation and Development). 2005. Lessons Learned on the Use of Power and Drivers of Change Analyses in Development Operation. Review commissioned by the OECD DAC Network on Governance, Executive Summary. Available at: http://www.gsdrc.org/docs/open/DOC82.pdf. Accessed on September 12, 2007.
Sarles, M. 2007. Evaluating the Impact and Effectiveness of USAID’s Democracy and Governance Programmes, in Evaluating Democracy Support: Methods and Experiences, P. Burnell, ed. Stockholm: International Institute for Democracy and Electoral Assistance and Swedish International Development Cooperation Agency.
Savedoff, W.D., Levine, R., and Birdsall, N. 2006. When Will We Ever Learn? Improving Lives Through Impact Evaluation. Washington, DC: Center for Global Development.
Schmid, A. 2007. Measuring Development. Available at: http://www.gtz.de/de/dokumente/ELR-en-30-31.pdf. Accessed on September 12, 2007.
Shadish, W.R., Cook, T.D., and Campbell, D.T. 2001. Experimental and Quasi-Experimental Designs for Generalized Causal Inference, 2nd ed. Boston: Houghton Mifflin.
USAID ADS. 2007. Available at: http://www.usaid.gov/policy/ads/200/. Accessed on August 2, 2007.
USAID (U.S. Agency for International Development). 1997. The Role of Evaluation in USAID. TIPS 11. Washington, DC: USAID.

USAID (U.S. Agency for International Development). 1998. Handbook of Democracy and Governance Program Indicators. Washington, DC: Center for Democracy and Governance. USAID. Available at: http://www.usaid.gov/our_work/democracy_and_governance/publications/pdfs/pnacc390.pdf. Accessed on August 1, 2007.
USAID (U.S. Agency for International Development). 2000. Conducting a DG Assessment: A Framework for Strategy Development. Available at: http://www.usaid.gov/our_work/democracy_and_governance. Accessed on August 26, 2007.
USAID (U.S. Agency for International Development). 2006. U.S. Foreign Assistance Reform. Available at: http://www.usaid.gov/about_usaid/dfa/. Accessed on August 2, 2007.
USAID (U.S. Agency for International Development). 2007. Decentralization and Democratic Local Governance (DDLG) Handbook. Draft.
U.S. Department of State. 2006. U.S. Foreign Assistance Performance Indicators for Use in Developing FY2007 Operational Plans, Annex 3: Governing Justly and Democratically: Indicators and Definitions. Available at: http://www.state.gov/f/releases/factsheets2007/78450.htm. Accessed on August 25, 2007.
Wholey, J.S., Hatry, H.P., and Newcomer, K.E., eds. 2004. Handbook of Practical Program Evaluation, 2nd ed. San Francisco: Jossey-Bass.
World Bank. 2004. Monitoring & Evaluation: Some Tools, Methods, and Approaches. Washington, DC: World Bank.
de Zeeuw, J., and Kumar, K. 2006. Promoting Democracy in Postconflict Societies. Boulder: Lynne Rienner.


Over the past 25 years, the United States has made support for the spread of democracy to other nations an increasingly important element of its national security policy. These efforts have created a growing demand to find the most effective means to assist in building and strengthening democratic governance under varied conditions.

Since 1990, the U.S. Agency for International Development (USAID) has supported democracy and governance (DG) programs in approximately 120 countries and territories, spending an estimated total of $8.47 billion (in constant 2000 U.S. dollars) between 1990 and 2005. Despite these substantial expenditures, our understanding of the actual impacts of USAID DG assistance on progress toward democracy remains limited--and is the subject of much current debate in the policy and scholarly communities.

This book, by the National Research Council, provides a roadmap to enable USAID and its partners to assess what works and what does not, both retrospectively and in the future through improved monitoring and evaluation methods and rebuilding USAID's internal capacity to build, absorb, and act on improved knowledge.
