Improving Democracy Assistance: Building Knowledge through Evaluations and Research

Appendix E: Field Visit Summary Report (1)

OVERVIEW OF NATIONAL ACADEMIES' MISSION AND TASKS

The field visits were part of a larger project conducted by the National Academies (NA) for the U.S. Agency for International Development (USAID). Its purpose was to develop an overall research and analytic design leading to specific findings and recommendations for the Strategic and Operational Research Agenda (SORA) of USAID's democracy and governance (DG) programs. These findings and recommendations were developed by vetting a variety of methodologies for assessing and evaluating democracy assistance programs.

OBJECTIVES OF FIELD VISITS

In support of these overall project objectives, the field visits were intended to serve two major purposes:

1. To collect information for the NA committee to inform its recommendations, in particular to increase members' understanding of:
   - how USAID programs are developed and implemented in the field, as background for recommendations to improve program evaluation and for understanding program successes and failures;
   - what data, evidence, and other resources are primarily or uniquely available in the mission or in country to support improved program evaluation;
   - the perspectives of mission personnel and USAID implementers regarding the feasibility of potential options for improving program evaluation.

2. To provide an opportunity to explore a "proof of concept" of the committee's preliminary recommendations, in particular the feasibility of introducing more rigorous approaches to program evaluation.

1. Some of the material in this appendix also appears in Chapters 6 and 7.

SELECTION OF FIELD VISIT SITES

Three countries were selected as the sites of the field visits conducted by teams of consultants and staff: Albania, Peru, and Uganda. The selection was based primarily on the stage of program development within each country's DG portfolio and on the breadth and depth of USAID programming (as determined by long-term funding in multiple program areas of interest; see "Current and Recent USAID Projects at the Time of Field Visits" at the end of this appendix for a list of the major DG projects in each country). In each country selected, DG staff were at the stage of developing new projects, offering an optimal opportunity to explore options for program design that may be more or less suited to various research methodologies. The NA field team members (see "Consultant Biographies" at the end of this appendix) were thus able to examine a variety of projects at their inception, the point at which new methodologies can most effectively be designed to maximize confidence about the impact of projects and the conditions under which they succeed. These considerations guided the selection of cases across geographically and politically distinct regions of the world (Central Europe/Post-Communist, Latin America/Post-Military Rule, Africa/Post-Conflict).
While there is no single point at which DG programs can be most effectively designed, implemented, or evaluated, the initial stages of development and design provide the most fruitful points at which innovative yet feasible options may be considered. Each field team therefore selected one or more projects and worked closely with USAID Mission DG officers, project implementers, and local partners through a series of in-depth conversations to understand the various opportunities and challenges presented by newly proposed program designs, data collection, and more rigorous evaluation techniques. A fuller discussion of these proposed program designs in each country visited follows.
KEY OBSERVATIONS AND FINDINGS FROM FIELD VISITS (2)

There are ample opportunities for improving the methodology of program monitoring and evaluation within the DG sector, in large part because of USAID's well-developed existing evaluation procedures. To maximize these opportunities, approaches to evaluation must be selected based on program goals and program designs. This should involve the provision of assistance (e.g., visits by specialists in program monitoring and evaluation (M&E) from USAID/Washington to missions) during the project conceptualization stage as well as during subsequent stages of M&E development.

Improvements in program evaluation need not be expensive. Maximizing existing mechanisms (surveys and other data collection systems) and strategically targeting sample populations and control groups can produce more robust findings at an overall cost savings.

By improving program evaluation, the impact of USAID programs can be more accurately assessed and documented. Creating knowledge of program impacts through rigorous evaluation is the best way to identify and take advantage of lessons learned. Institutional knowledge gained through these experiences should be shared within and beyond the mission to support learning on a broader, agency-wide basis.

Building on Current Tools and Approaches

Several current practices of mission staff demonstrate the willingness needed to seize reasonable opportunities for learning and provide the basis for more solid inferences over time. As part of ongoing DG programs, mission staff currently collect regular and systematic information about those who receive training through USAID-funded programs. This approach to data collection should be encouraged and expanded to complement the more rigorous methodologies described below.
Similarly, implementers working with USAID have developed elaborate mechanisms for quarterly data collection pertinent to their programs. To maximize the potential of these mechanisms, the data collected should be directed toward understanding outcomes and impacts rather than outputs. Data gathered by local implementers should likewise be strategically collected and analyzed to maximize cost benefits and efficiencies. For example, collecting local government data in the form of smaller, cost-effective samples from municipalities would be beneficial. Furthermore, this information should be fully transferable to USAID for learning purposes. Most important, these mechanisms should be consistent with key program design elements that require consideration at the initial stages of program development.

2. This text is drawn from memos prepared for the committee by three of its field consultants: Thad Dunning, Yale University (Peru); Devra Cohen Moehler, Cornell University (Uganda); and Dan Posner, University of California at Los Angeles (Albania). It reflects their judgments and assessments.

Measurement of Outcome Indicators

Indicators gathered in connection with past programs tend to be measures of "outputs" or very proximate outcomes. Examples of such output indicators include, in the context of a decentralization program, the number of relevant municipal officials trained by the implementer or the percentage of target municipalities that agree to an assistance plan. Although these output measures may be useful and necessary for monitoring the performance of local implementers or for assessing short-term progress in implementing a program, they are less helpful for measuring the outcomes that the programs hope to promote. To improve assessment of the impact of USAID programs on their ultimate objectives, it is important to gather data, to the extent possible, on outcome variables. One example gathered in connection with the decentralization program was the percentage of local citizens who rate the quality of local government services as "good" or "very good."

Controls

Most program evaluations involve indicators gathered only or mostly on "treated" units (the groups, individuals, or organizations assisted by USAID). Sometimes this is unavoidable, as when a program works with only one unit or actor (e.g., the Congress). At other times, however, it is possible to find comparison units that would be useful for assessing the impact of U.S. interventions. Using control groups is invaluable for attributing impact to a USAID program.
For example, without a control group it is impossible to know if the change in local party development is a result of a USAID intervention or another factor such as change in national party law, economic growth, or better media coverage. Gathering outcome measurements on control units need not be prohibitively costly. The cost of modifying the 2003 and 2005 national surveys in Peru conducted by the Latin American Public Opinion Project (LAPOP) to include a sample of residents in control group municipalities would likely have run around $15,000 per survey, a small investment when compared to the $20 million cost of the program over five years.
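The logic of using a control group can be sketched with a small difference-in-differences calculation. This is an illustration only: all figures below are invented for the sketch and do not come from the Peru surveys or any USAID program data.

```python
# Hypothetical illustration of why a control group matters when attributing a
# change in an outcome (e.g., percent rating local services "good") to a
# program. All numbers are invented; none come from the LAPOP surveys.

def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Program effect = change in treated units minus change in controls.
    Subtracting the control-group change removes background trends (a new
    party law, economic growth, better media coverage) that would have
    affected treated municipalities even without the intervention."""
    naive_change = treated_after - treated_before
    background_change = control_after - control_before
    return naive_change - background_change

# Treated municipalities improve 12 points, but controls improve 7 points
# over the same period, so only about 5 points are attributable to the program.
effect = diff_in_diff(treated_before=40, treated_after=52,
                      control_before=41, control_after=48)
print(effect)  # 5
```

Without the control-group measurements, the naive before/after comparison (12 points) would overstate the program's impact.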
Opportunities for Randomization

Comparisons between units or groups with which USAID partners worked and those with which they did not are only partially informative about the impact of USAID interventions. Differences across these groups could reflect preexisting differences and unobserved confounders rather than the impact of the intervention; similarly, selection bias could account for variation in performance between the treatment and control groups. One way social scientists approach this difficulty is through random assignment of units to treatment. In the context of decentralization, for example, the municipalities with which USAID implementers work could be determined by lottery. Subsequent differences between treated and untreated municipalities are then likely to be due to the intervention, since other factors will be roughly balanced across the two groups. Randomization is not feasible for many kinds of programs, and there can be a range of practical obstacles, yet these are often surmountable. In addition, experimental designs need not be expensive; additional costs can be offset by savings from appropriate designs.

SAMPLE PROPOSED PROGRAM EVALUATION DESIGNS FROM THREE FIELD VISITS (3)

Selected Designs from Albania: Rule of Law Programs

A major part of USAID's DG-related activities in Albania involved increasing the effectiveness and fairness of legal sector institutions. With one possible exception, none of these rule of law activities is amenable to randomized evaluation.
This is because each deals with (a) technical assistance to a single unit (e.g., the Inspectorate of the High Council of Justice, the Inspectorate of the Ministry of Justice, the High Inspectorate for the Declaration and Audit of Assets, the Citizen's Advocacy Office, and the National Chamber of Advocates), (b) support for the preparation of a particular piece of legislation (e.g., the Freedom of Information Act and Administrative Procedures Code, a new conflict of interest law, and a new press law), or (c) support for a single activity, such as the implementation of an annual corruption survey. For a randomized evaluation of these activities to be possible, they would have to be implementable, in principle, across a large number of units, which these are not. There is only one Inspectorate of the High Council of Justice, only one conflict of interest law being prepared, and only one National Chamber of Advocates being supported, so it is not possible to compare the impact of support for these activities where they are and are not being supported, and certainly not across multiple units.

3. In addition to the selected projects discussed here, several others were analyzed by the field teams.

The best (indeed, only) way to evaluate the success of these activities is to identify the outcomes they are designed to affect, measure those outcomes both before and after the activities have been undertaken, and compare the measures. The trick, however, is to find appropriate measures of the outcomes that the activities are designed to affect, and this is frequently far from straightforward. For example, the goal of the technical assistance to the Inspectorates of the High Council of Justice and the Ministry of Justice is to improve the transparency and accountability of the judiciary and to increase public confidence in judicial integrity. The latter can be measured fairly easily using public opinion polls that probe respondents' trust in the judiciary and perceptions of its integrity; these would be administered before and after the period during which technical assistance was offered, and the results compared. Measuring the degree to which the judiciary is transparent and accountable, however, is much more difficult. Part of the problem stems from the fact that transparency and accountability can be ascertained only relative to an (unknown) set of activities that should be brought to light and an (unknown) level of malfeasance that needs to be addressed. Suppose, for example, that following the implementation of the programs designed to support the Inspectorate of the High Council of Justice, we observe that three judges are brought up on charges of corruption.
Should this be taken as a sign that the activities worked in generating greater accountability? Compared with a baseline of no prosecutions, the answer is probably yes, at least to some degree. But knowing just how effective the activities were depends on whether there were only three corrupt judges who should have been prosecuted or whether there were, in fact, twenty, in which case prosecuting three only scratched the surface of the problem, or whether the prosecutions were selective, with targets chosen for political reasons. Parallel problems affect other rule of law initiatives, such as efforts to improve the ability of lawyers to police themselves.

A slightly different evaluation problem arises with respect to the activities designed to support the drafting of various pieces of legislation. One fairly straightforward measure of success in this area is simply whether or not the law was actually drafted and, if so, whether it included language that will demonstrably strengthen the rule of law. But assessing whether USAID's support had any impact requires weighing a counterfactual question: Would the legislation have been drafted without USAID's support, and what would it have looked like? If the answers are that the legislation would not have been drafted, or that the language of the resulting law would not have been optimal, then we can judge USAID's support to have been successful to the extent that the result we observe is better than this counterfactual outcome. The broader problem, however, is that achieving the overarching strategic objective of strengthening the rule of law involves more than getting legislation drafted; the legislation must also be passed and then enforced. The measurable outcome of the USAID-sponsored activity is thus several steps removed from the true goals of the intervention, and any assessment of "success" in these areas must be interpreted in this light. The same is true of other activities, such as technical assistance to help the Albanian government establish a copyright office or an office of patents and trademarks. Whether these institutions, once created, will have any impact on protecting intellectual property will depend on much more than whether a formal office designed to do so has been established.

The larger point this discussion hints at is that many activities in the rule of law area involve the creation of laws or the strengthening of institutions whose existence is a prerequisite for a legal system that works and that supports democracy and market reform. Whether these laws and institutions actually have a positive impact on those outcomes can be ascertained only after they have been created or made strong enough to work properly. In this context, evaluating the efficacy of the resources spent on such activities may not make much sense, since the impact will become meaningful only after this initial, necessary foundation-building stage.
Supporting the writing of laws and the setting up of institutions such as inspectorates, citizens' advocacy offices, and attorneys' associations may simply be necessary investments, even if it is very difficult to know whether they have had, or will have, an impact on the ultimate outcomes that USAID wants to affect.

The one activity area within rule of law that might be amenable to randomized evaluation, at least in principle, is support for rule of law-oriented nongovernmental organizations (NGOs). The problem here is that the preferred method of selecting NGOs for support is a small grants competition, whereas a truly rigorous evaluation of the impact of support would require randomly choosing NGOs for funding. One possible solution would be to hold a small grants competition and, having ranked the applications from best to worst, work down the list funding every other one. Data would then need to be collected on the quality of the performance, and/or the impact in its area of focus, of every NGO on the list, both those that were funded and those that were not, and a comparison could then be made across the two groups. The problem, again, is to figure out precisely what to measure (which will depend, in any case, on the particular goals each NGO sets for itself). Also, unless the small grants competition generates a very large number of high-quality applications, this method is not likely to produce very useful results. The need for a large number of funded and nonfunded NGOs is increased by the likelihood that the NGOs will propose different sets of activities: "success" will then have two possible sources, the difficulty of the tasks an NGO sets out to accomplish and the benefits of having received the small grant, and the sample of NGOs analyzed will need to be large enough to permit detection of the impact of funding through the "noise" of random variation in task difficulty.

Selected Designs from Peru: Decentralization, Rule of Law, and Political Parties

Decentralization

USAID/Peru launched a program in 2002 to support national decentralization policies initiated by the Peruvian government. Over a five-year period, the Pro-Decentralization (PRODES) program was intended to support the implementation of mechanisms for citizen participation in subnational governments (such as "participatory budgeting"); strengthen the management skills of subnational governments in selected regions of Peru; and increase the capacity of nongovernmental organizations in these same regions to interact with their local governments.
With the exception of some activities relating to national-level policies, all interventions under the program took place in seven selected subnational regions (also called departments): Ayacucho, Cusco, Huanuco, Junin, Pasco, San Martin, and Ucayali.(4) These seven regions contain 61 provinces, which in turn contain 536 districts.(5) Workshops on participatory budgeting, training of civil society organizations, and other interventions took place at the regional, provincial, and district levels.(6)

4. As discussed elsewhere, the regions were nonrandomly selected for the programs because they share high poverty rates, significant indigenous populations, and narcotics-related activities, and because a number of the departments were strongholds of the Shining Path movement in the 1980s.

5. Peru has 24 departments plus one "constitutional province"; the 24 departments in turn comprise 194 provinces and 1,832 districts. Provinces and districts are often both called "municipalities" in Peru, and both have mayors. Sometimes, however, two or more districts combine to form a city.

The ultimate goal of the program was to promote "increased responsiveness of sub-national elected governments to citizens at the local level in selected regions." This outcome is potentially measurable on different units of observation. For example, government capacity and responsiveness could be measured at the district or provincial level (through expert appraisals or other means), while citizens' perceptions of government responsiveness may be measured at the individual level (through surveys). Experimental designs could be used to study the impact of the decentralization program, and the cost of appropriately designed experimental evaluations could in fact be far below the actual amounts spent on monitoring and evaluation.

Best-possible designs. We discuss best-possible designs from the perspective of program evaluation. First, we discuss what an ideal ex ante design for the decentralization program might have been in 2002, when the program began. Second, we discuss how an experimental design might be employed in a second phase of the program, given that all the municipalities in the seven regions were already treated in the first phase.

A "tabula rasa" design. We assume that the decentralization program will be implemented in the seven nonrandomly chosen regions in which USAID commonly works; inferences about the effect of the intervention will then be made to the districts and provinces that comprise these regions. The simplest design would involve randomization of treatment at the district level.
Districts in the treatment group would be invited to receive the full bundle of interventions associated with the decentralization program (e.g., training in participatory budgeting, assistance for civil society groups, and so on); control districts would receive no interventions. There are two disadvantages to randomizing at the district level, however. One is that some of the relevant interventions in fact take place at the provincial level.(7) Another is that district mayors and other actors may more easily become aware of treatments in neighboring districts. For both of these reasons, it may be useful to randomize instead at the provincial level. All districts in a province randomly selected for treatment would then be invited to receive the bundle of interventions.

6. Relevant subnational authorities include members of regional councils, provincial mayors, and mayors of districts.

7. Some interventions also occurred at the regional level, particularly toward the end of the program, yet these constitute a relatively minor part of the program.

Several different kinds of outcome measures can be gathered. Survey evidence on citizens' perceptions of local government responsiveness will be useful; so may evaluations of municipal governance capacity taken across all municipalities in the seven regions (both treated and untreated). A difference in average outcomes across groups at the end of the program (for example, in the percentage of residents who say government services are "good" or "very good," or the percentage who say the government responds "almost always" or "on the majority of occasions" to what the people want) can then be reliably attributed to the effect of the bundle of interventions, if the difference is bigger than might reasonably arise by chance.(8)

One feature of this design that may be perceived as a disadvantage is that treated municipalities are subject to a bundle of interventions; thus, if we observe a difference across treated and untreated groups, we may not know which particular intervention was responsible (or most responsible) for the difference. Did training in participatory budgeting matter most? Assistance to civil society groups? Some other element of the bundle? This problem arises as well in some medical trials and other experiments involving complex treatments, where it may not be clear exactly what aspect of treatment is responsible for differences in average outcomes across treatment and control groups. It seems preferable at this stage to design an evaluation plan that would allow USAID to know with some confidence whether a program financed by USAID makes any difference. Bundling the interventions may provide the best chance to estimate a causal effect of treatment.
Once this question is answered, one might then ask what aspect of the bundle of interventions made the difference, using further experimental designs. Another possibility, discussed below, is to implement a more complex design in which different municipalities would be randomized to receive different bundles of interventions.

The intention-to-treat principle can be used to analyze the results of the experiment. Some municipalities assigned to treatment may refuse to sign participation agreements or otherwise fail to cooperate with the local contractor; these municipalities are akin to noncompliers in a medical trial. In this context, estimating the "effect of treatment on the treated" may be of interest.

It may be worth choosing pilot districts at random as well. In the first phase of the implemented decentralization program, only 145 municipalities were incorporated in the first year, out of the 536 that were eventually incorporated. Comparing municipal capacity across incorporated and unincorporated municipalities at the end of the pilot period may not lead to useful results, because the incorporated municipalities were chosen for their high degree of capacity. It would be much more meaningful to randomly assign municipalities for inclusion in the pilot phase. To the extent that it is necessary to include some municipalities with high ex ante management capacity and resources, this may be accomplished through stratified sampling of municipalities.

8. Standard errors may need to be adjusted to account for the clustering of treated districts within provinces.

Second-phase design. USAID/Peru is preparing to roll out a second five-year phase of the decentralization program, again in the seven regions in which it typically works. At this point, all municipalities in the seven regions have already been treated (or at least targeted for treatment) in the first phase, which raises special considerations for the second-phase design. Our understanding is that there are at least two possibilities for the actual implementation of the second phase; which option is chosen will depend on the available budget and other factors. One is that all 536 municipalities are again targeted for treatment. As in the first phase, this would not allow municipalities in the seven regions to be partitioned into a treatment group and controls. In this case, the best option for an experimental design may be to randomly assign different treatments, that is, different bundles of interventions, to different municipalities. While such an approach will not allow us to compare treated and untreated cases, it will allow us to assess the relative effects of different bundles of interventions.
This may be quite useful, particularly for assessing the question raised above about which aspect of a given bundle of interventions has the most impact on outcomes. Do workshops on participatory budgeting matter more than training civil society organizations (CSOs)? Randomly assigning workshops to some municipalities and training to others would allow us to find out.

A second possibility for the second phase of the program is to reduce the number of municipalities treated, for budgetary reasons. Suppose the number of municipalities were to be reduced by half. The best option in this case is probably to randomize municipalities out of treatment, leaving half assigned to treatment and the other half to control. Those municipalities assigned to treatment would be offered the full menu of interventions in the decentralization program. Of course, randomizing some municipalities out of treatment is sure to cause displeasure among authorities in control municipalities. Yet if the budget allows for only 268 municipalities assigned to treatment and 268 to control, this displeasure will arise whether or not the allocation of
[...]
terns, background of key officials, location, ethnic composition, number and type of health facilities, and infection rates. The most important criteria for ensuring comparability should be determined in consultation with experts. Grouped subcounties might be next to each other, but immediate proximity is not necessary (or even desirable).(13) In each subcounty, one CSO working in HIV/AIDS will be selected, with the aim of finding similar CSOs across the three subcounties in the group. One subcounty in each group will be randomly assigned to receive a large CSO grant to monitor HIV/AIDS services in the subcounty. Another subcounty in the group will be randomly selected to receive a small CSO grant for HIV/AIDS. The remaining subcounty in the group will act as the pure control and receive no grant. This will be repeated for at least 50 groups, and preferably more.(14) It is important to ensure (1) that the large grant provides a significant increase to the existing budget of the CSO while the small grant does not, and (2) that the CSOs spend their grants entirely on HIV/AIDS activities within the selected subcounty and that there is no contamination (sharing of resources or expertise) across subcounties. It would probably work best to select CSOs that work only in a single subcounty, to prevent the grant from causing funds to be supplemented at, or siphoned away from, the treatment sites. CSOs in both the treatment and partial control groups should receive equivalent technical assistance and training on how to use the grant money and how to monitor and improve service delivery. USAID interactions with the CSOs in the treatment and partial control groups should be equivalent throughout.

Evaluation. The primary question for evaluation purposes is: What are the effects of monetary grants on the organizational capacity of CSOs and on their ability to monitor and improve government service delivery? The best possible evaluation for this type of project would be a large-N randomized controlled field experiment. Because a large-N study would require sizeable grants to at least 50 CSOs, plus additional monitoring and measurement, the costs are greater than currently envisioned for CSO grants within the Linkages program. However, this design offers substantial benefits over a small-N experiment and is of general interest to USAID.

13. Instead of grouping subcounties in sets of three, it might be more feasible to use an alternative stratified sampling procedure whereby all the subcounties in the sample are stratified into types according to key factors and subcounties within each stratum are then randomly assigned to each of the three categories.

14. Depending on the districts chosen for Linkages, it may be possible to randomly select all the treatment and control subcounties from within the 10 districts.
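The matched-triplet assignment described above can be sketched as follows. The subcounty names and the number of groups here are placeholders for illustration; the real design would use at least 50 groups of comparable Ugandan subcounties identified with expert input.

```python
import random

# Sketch of the triplet design described above: subcounties are grouped into
# matched sets of three, and within each set one member is randomly assigned
# a large CSO grant, one a small grant, and one no grant (pure control).
# "sc_0_0" etc. are placeholder names, not actual Ugandan subcounties.

def randomize_triplets(groups, seed=0):
    """For each matched group of three comparable subcounties, shuffle the
    members and assign them to the large-grant, small-grant, and control
    arms. Matching before randomizing improves balance on key factors."""
    rng = random.Random(seed)   # fixed seed so the assignment is auditable
    assignments = []
    for group in groups:
        members = list(group)
        rng.shuffle(members)
        assignments.append(dict(zip(
            ["large_grant", "small_grant", "control"], members)))
    return assignments

# 50 placeholder groups, each containing three matched subcounties
groups = [[f"sc_{g}_{j}" for j in range(3)] for g in range(50)]
arms = randomize_triplets(groups)
print(len(arms))  # 50
```

Because each arm appears exactly once in every matched set, outcomes can be compared within sets, which is the source of the design's efficiency relative to unmatched assignment.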
Measurement. Data should be collected before the grants are awarded, after the money is given (or at several points during the grant period), and two years after the end of the grant in order to assess both the short-term and medium-term effects of the monetary infusion. Equivalent data should be collected about CSOs and service delivery in the treatment, partial-control, and full-control subcounties. The ability of USAID to collect comparable data in the partial control group should be facilitated by the fact that those CSOs are receiving some funds from USAID. USAID may have to provide a small fee or incentive to the CSOs not receiving grants to enable the collection of similarly intrusive and time-consuming data from the CSOs in the pure control group. In order to study the effect of grants and increased resources on the organizational capacity of the CSOs, data should be collected on the budget, activities, operations, and planning of the CSOs. In addition, pre- and postintervention surveys can be conducted with CSO employees, volunteers, government officials and employees, and stakeholders to evaluate changes in the activities, effectiveness, and reputation of the CSOs. In order to evaluate the effect of the grants on government service delivery, data can be collected on HIV/AIDS services and outcomes within each subcounty. Many of these data may already be collected by the government (such as the periodic National Service Delivery Survey conducted by the Uganda Bureau of Statistics (UBOS), though perhaps USAID would need to fund an oversampling in treatment and control subcounties), or perhaps they can be collected in collaboration with other donor projects such as the President's Emergency Plan for AIDS Relief.
Special attention should be given during the research design stage to determining the government activities that are likely to be affected by greater CSO involvement and how those activities might be accurately measured. Additional data collection could be done through surveys of service recipients or randomized checks on facilities and services. In addition, money-tracking studies of local government and government agencies could be conducted to evaluate the level of corruption in HIV/AIDS projects within the selected subcounties. Possible alternatives. The grants could be given for an issue other than HIV/AIDS. Selected issues must be ones where (a) the government plays a major role in providing services and (b) there are measurable outcomes of service delivery. The intervention can be carried out at either the district level or the village level instead of at the middle subcounty level. At higher levels of local government, CSOs are denser and better organized. While the ability
of CSOs to effect change in government may be greater at higher levels, the size of the grant needed to make a detectable difference will also be larger. Furthermore, it may be too difficult to find similar groups and to protect units from contamination by other donors at higher levels of government. If additional funds cannot be secured to conduct a large N randomized controlled experiment, a small N experiment could be conducted with the available funds, although with significantly less power to accurately evaluate the effects of CSO grants. In order to increase the number of possible comparisons, and to help control for the effect of context with a small number of treatment sites, a variation on the above design may be warranted. The inclusion of a second issue area may facilitate analysis in a small N context. For example, in each subcounty, one CSO working on education and one working on HIV/AIDS will be selected, with the aim of finding similar CSOs across subcounty groups and issues. One subcounty will be randomly assigned to receive a large education grant and a small HIV/AIDS grant, and another subcounty will receive a large HIV/AIDS grant and a small education grant. Figure E-1 provides an illustration.

FIGURE E-1 Comparison of large and small grants to education and HIV/AIDS CSOs.
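The crossed two-issue variation can be sketched along the following lines. This is a rough illustration under stated assumptions: the subcounty names, sector labels, and seed are hypothetical, and the third subcounty in each group is treated as the pure control that receives no grant in either sector.

```python
import random

def assign_crossed(triples, seed=0):
    """Within each matched triple: one subcounty gets a large education
    grant plus a small HIV/AIDS grant, one gets the reverse, and one is a
    pure control with no grant in either sector."""
    rng = random.Random(seed)
    out = {}
    for triple in triples:
        order = list(triple)
        rng.shuffle(order)  # randomize which subcounty gets which profile
        out[order[0]] = {"education": "large", "hiv_aids": "small"}
        out[order[1]] = {"education": "small", "hiv_aids": "large"}
        out[order[2]] = {"education": "none", "hiv_aids": "none"}
    return out

# Hypothetical matched triples
design = assign_crossed([("A1", "A2", "A3"), ("B1", "B2", "B3")], seed=7)
```

Crossing the grant sizes over two sectors is what yields the extra within-subcounty comparison (large-grant CSO versus small-grant CSO in the same local context) even when the number of subcounties is small.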
This research design affords several useful comparisons. Within a single subcounty, changes in the education CSO versus the HIV/AIDS CSO (one of which got a large grant and the other of which got a small grant) can be compared, and the degree of change in each sector can be evaluated. Within each subcounty group, the three education CSOs (one with a large grant, one with a small grant, and one with no grant) can be compared, as can the changes in educational outcomes across the grouped subcounties. Likewise, within each subcounty group, the three HIV/AIDS CSOs (one with a large grant, one with a small grant, and one with no grant) can be compared, as can the changes in HIV/AIDS outcomes across the grouped subcounties. The repetition of these comparisons across a number of different groups will help the researchers parse out the effects of the grants from contextual factors.

Training and Assistance for a Random Selection of New Members of Parliament

The Strengthening Democratic Linkages in Uganda program seeks to enhance the knowledge, expertise, and resources of members of parliament (MPs) so they can more effectively operate in a multiparty parliament, legislate and perform oversight functions, foster sustainable development, and engage constituents, civil society, and local governments. The entire group of new MPs (approximately 150) will be randomly divided into two groups. USAID can explain that it has only enough resources to work with half the group at a time and that the fairest way to decide is by lottery. To ensure that the partisan makeup of the treated group is equivalent to that of the control group, USAID will probably want to stratify by party affiliation.
USAID may also want to stratify by other key factors, such as previous political experience, committee assignment, and gender, and randomly assign MPs within strata to ensure that the treatment and control groups are equivalent along critical dimensions. The treatment group will receive intensive personalized training and assistance from technical personnel. This assistance may take the form of group trainings on key issues, weekly or bi-monthly individual meetings with trained legal assistants, regular research assistance on topics chosen by the MP, secretarial services, and/or repeated meetings with CSO representatives. The control group will not receive these additional services (at least initially). It is important to ensure that the intervention (1) is deemed useful by the MPs so that they continue to participate fully in the program for its duration; (2) is significant enough that the effects, if there are any, can be measured; and (3) is limited to the MPs in the treatment group alone and not easily passed on to those in the control group. For example, if the treatment were the distribution of a newsletter each week
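Stratified random assignment of this kind can be sketched as follows. This is a minimal illustration: the roster, the party labels, and the seed are hypothetical, and a real design would likely stratify on several factors at once (the `strata_key` function can return a tuple of attributes).

```python
import random
from collections import defaultdict

def stratified_split(mps, strata_key, seed=0):
    """Randomly split MPs into treatment and control within each stratum
    (e.g., party, gender, prior experience) so the two arms stay balanced
    on those factors."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for mp in mps:
        strata[strata_key(mp)].append(mp)  # bucket MPs by stratum
    treatment, control = [], []
    for members in strata.values():
        members = members[:]
        rng.shuffle(members)
        half = len(members) // 2  # half of each stratum to treatment
        treatment.extend(members[:half])
        control.extend(members[half:])
    return treatment, control

# Hypothetical roster: 40 new MPs from two parties
roster = [{"id": i, "party": "P1" if i < 24 else "P2"} for i in range(40)]
treated, held_out = stratified_split(roster, lambda m: m["party"], seed=3)
```

Because the randomization is done separately inside each stratum, the partisan composition of the treatment group matches the control group by construction rather than only in expectation.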
to the treatment group, then it is very likely that many legislators in the control group would gain access to the newsletter and receive the same treatment as those in the treatment group. Measurement. Jeremy Weinstein and Macartan Humphreys, in cooperation with the African Leadership Initiative, are currently producing annual scorecards for all of Uganda's MPs recording their behavior in parliament, in committee, and in their constituencies. These scorecards could be used to compare the behavior of MPs in the treatment and control groups. In addition, surveys could be conducted with MPs to measure the knowledge and reported behavior of new MPs and to assess the perceptions of fellow MPs. Surveys could also be conducted with parliamentary staff, civil service leaders, key stakeholders, or constituents to assess the reputation and influence of different legislators. Other measures of MP involvement (such as visits to the library) could perhaps be collected as well. Eventually, for those who run for reelection, the vote results could be used to evaluate popularity. Evaluation. For the purposes of evaluation, the most important question is: What are the effects of technical training and assistance on the ability of individual legislators to operate more actively, effectively, and independently in parliament? Possible alternatives. To reduce the costs of the intervention, a smaller number of MPs can be selected for the treatment group. The required number depends on the intensity of the intervention, the quality of the measures, and the heterogeneity of the group, but a treatment group of 50 MPs may be sufficient. If it is not politically feasible to provide benefits to only some of the new MPs, then the treatment could be conducted in a rollout fashion.
Half (or one-third) of the MPs would receive the treatment for the first several years, and the other group would receive the treatment in the later part of the term. The interventions with each group would have to be timed to fit with the collection of data for the scorecards. Returning MPs could also be included in the experiment, although returning MPs are more experienced and thus less likely to be affected by additional assistance. Their inclusion also adds to the heterogeneity of the population. The intervention activities (and the associated costs) would have to be greater, and/or more widespread, in order to discern an effect.
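The rollout alternative amounts to randomly assigning MPs to treatment waves, with later waves serving as controls until their turn comes. A minimal sketch, with a hypothetical seed and wave count:

```python
import random

def rollout_waves(mps, n_waves=2, seed=0):
    """Randomly order the MPs and deal them into waves. Earlier waves are
    treated first; later waves serve as temporary controls, and every MP
    is eventually treated."""
    rng = random.Random(seed)
    order = list(mps)
    rng.shuffle(order)
    # Dealing by stride keeps wave sizes within one of each other
    return [order[i::n_waves] for i in range(n_waves)]

# Hypothetical cohort of 150 new MPs, split into two rollout waves
waves = rollout_waves(range(150), n_waves=2, seed=11)
```

Setting `n_waves=3` would implement the one-third variant mentioned above; the key constraint, as noted, is timing each wave's start against the scorecard data collection.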
Revised Remuneration Policies to Fight Corruption

The Strengthening Capacity to Fight Corruption in Uganda program suggests that "the Government of Uganda will consider increased pay for key personnel, through the implementation of an enhanced remuneration package for anti-corruption investigators and prosecutors." The revised remuneration policies would "enable performance (job evaluation) based salary structures for anti-corruption prosecutors, investigators, and other officers within GOU entities such as the DEI, DPP and the CID fraud squad." The effects of changes in remuneration policies are of general interest to USAID. Although the implementation of the program cannot be manipulated to create contemporaneous control or comparison groups, the effects can still be evaluated effectively with a temporal comparison: before and after the intervention. The main consideration is to try to ensure that exogenous shocks do not take place during the period of measurement. For that reason we suggest that such an intervention could be accurately evaluated only if it took place some time before the other proposed reforms in the Request for Proposal for Strengthening Capacity to Fight Corruption in Uganda. Perhaps the changes in remuneration could be implemented immediately, while the other interventions are still in the planning stage. Measurement. The main comparison is before the change in remuneration policies versus after the change. To evaluate the effect of the changes on recruitment and retention, the qualifications of current employees will be assessed. In addition, the qualifications of all those who apply and of former employees who sought alternative employment should also be assessed.
To evaluate the effect of the changes in remuneration policies on the effectiveness of anticorruption activities, the number of malpractices that are detected, effectively investigated, prosecuted, punished, and publicized before and after the changes can be compared. Evaluation. The primary question from the perspective of evaluation is: How do changes in remuneration policies affect the recruitment and retention of qualified personnel and the performance of employees? Possible alternatives. If time permits, it would be better to stagger the changes in remuneration policies by type of civil servant or grade. For example, prosecutors could receive the new remuneration packages several months before the investigators. Thus, if there is an external shock, it is less likely to similarly affect the outcomes of every subject of the study.
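The staggered schedule and the before-versus-after comparison can both be sketched in a few lines. This is an illustration only: the group names, month offsets, and outcome figures below are hypothetical.

```python
def staggered_schedule(groups, start_month=0, gap_months=3):
    """Each employee grade switches to the new remuneration package a fixed
    number of months after the previous one, so a single external shock is
    unlikely to coincide with every group's transition."""
    return {g: start_month + i * gap_months for i, g in enumerate(groups)}

def before_after_change(before, after):
    """Simple pre/post comparison of a mean outcome (e.g., cases
    effectively prosecuted per month)."""
    return sum(after) / len(after) - sum(before) / len(before)

# Hypothetical rollout order and spacing
schedule = staggered_schedule(["prosecutors", "investigators", "other officers"])
```

If an external shock occurred in, say, month 4 of this schedule, only the groups that had already transitioned would show it in their post-change series, which is what makes the staggered comparison informative.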
CURRENT AND RECENT USAID PROJECTS AT THE TIME OF FIELD VISITS

Albania (March 2007)

New
Local Government—RFP issued

Current/Recently Ended
Local Government (2004–end July 2007)—Urban Institute
Rule of Law (2004–end July 2007)—Casals
Political Parties and Civic Participation (2004–September 2007)—NDI/IREX/Partners Albania
Anti-Corruption/MCC Threshold (2006–2008)—Chemonics

Peru (June 2007)

Current
Pro Decentralization (PRODES)—ARD, Inc.
Political Parties/Elections—NDI/Transparencia
Congress Program—United Nations Development Program and George Washington University
LAPOP Survey "Democracy Political Culture in Peru, 2006"—Vanderbilt University

Not Included in Field Visit
Conflict Mitigation in Mining—CARE
Human Rights National Coordinator Institutional Development and Therapy Attention to Victims of Torture and Political Violence—Human Rights National Coordinator and Center for Psycho-Social Attention
Trafficking in Persons—Capital Humano y Social Alternativo

Uganda (June 2007)

New
Democratic Linkages (within and among parliament, selected local governments, and CSOs)—Center for Legislative Development, SUNY Albany
MCC Threshold (anti-corruption and civil society to improve procurement systems and build capacity to more effectively investigate and prosecute corruption cases)
Political Parties and Politically Active CSOs (capacity building)—designing project
Recent/Soon to End
Decentralization (to end December 2007)—ARD

Not Included in Field Visit
Community Resilience and Dialogue (September 2002–September 2007)—International Rescue Committee

CONSULTANT BIOGRAPHIES

Albania Team Members: David Black, USAID; Rita Guenther, National Academies; Jo Husbands, National Academies; Karen Otto, consultant; Daniel Posner, consultant.

Karen Otto, a former USAID direct hire, is a monitoring and evaluation specialist and consultant with a strong background in democracy and governance (especially rule of law). She has developed 70 performance monitoring plans for proposals and ongoing development projects in a wide array of areas, particularly DG. She has evaluated the performance of many development projects and the operations of all federal courts in the United States, and she developed a formal evaluation system for the Administrative Office of the U.S. Courts to review the courts under its jurisdiction. Ms. Otto has been a court administrator in federal, state, and municipal courts in the United States. She has been a rule of law advisor in USAID and a project manager for DG projects overseas. She has personal experience in many of the areas involved in DG activities: court administration (as a court administrator), media (as a journalist), judicial discipline (as an inspector in a judicial inspection service), and others.

Daniel Posner, associate professor of political science at the University of California, Los Angeles, conducts research in four broad areas: ethnic politics, ethnicity and economic development, political change in Africa, and social capital and civil society. His research is motivated by a number of questions: When and why do some ethnic identities (and ethnic cleavages) matter for politics, and when do they not?
Why, when people think about who they are, do they see themselves (and others) as members of particular ethnic groups, and why do the groups that they see themselves as part of have the sizes and physical locations that they do? How can we reconcile what we know about the fluidity and context dependence of ethnic identities and ethnic cleavages with the need to measure social diversity and code individuals by their
group affiliations? Why does ethnicity matter for collective action? How well are people able to identify the ethnic backgrounds of others? He approaches each of these questions with a combination of theory and the collection of original data (including experimental data).

Peru Team Members: Moises Arce, consultant; Tabitha Benney, National Academies; David Black, USAID; Thad Dunning, consultant; Rita Guenther, National Academies.

Moises Arce is an associate professor in the Department of Political Science at the University of Missouri. His research focuses on the politics of market reform, comparative political economy, and Latin American politics (Peru). He has received funding from the National Science Foundation, the Social Science Research Council, and the Fulbright Scholar Program. His publications include the book Market Reform in Society: Post-Crisis Politics and Economic Change in Authoritarian Peru and articles in the Journal of Politics, Comparative Politics, Comparative Political Studies, and the Latin American Research Review. He previously taught at Louisiana State University. He received his Ph.D. in 2000 from the University of New Mexico.

Thad Dunning is an assistant professor of political science and a research fellow at the Whitney and Betty MacMillan Center for International and Area Studies at Yale. His current research focuses on the influence of natural resource wealth on political regimes; other recent articles investigate the influence of foreign aid on democratization and the role of information technology in economic development. He conducts field research in Latin America and has also written on a range of methodological topics, including econometric corrections for selection effects and the use of natural experiments in the social sciences.
Dunning’s previous work has appeared in International Organization, the Journal of Conflict Resolution, Studies in Comparative International Development, Geopolitics, and a forthcoming Handbook of Methodology (Sage Publications). In 2006-2007 he taught an undergraduate lecture course and a seminar on ethnic politics and a graduate seminar on formal models of comparative politics. He received a Ph.D. in political science and an M.A. in economics from the University of California, Berkeley.
Uganda Team Members: Mark Billera, USAID; Mame-Fatou Diagne, consultant; John Gerring, committee member; Jo Husbands, National Academies; Devra Cohen Moehler, consultant.

Mame-Fatou Diagne is a Ph.D. candidate in economics at the University of California, Berkeley. A native of Senegal, she graduated from the Institut d’Etudes Politiques de Paris and received a Master of International Affairs from Columbia University. She has worked as an emerging markets economist for Societe Generale in Paris and for Standard & Poor’s in London, where she was the principal analyst for South Africa and other African-rated sovereigns. Her current areas of research are development, public and labor economics, and, particularly, the economics of education and political economy in Africa.

Devra Cohen Moehler is an assistant professor of political science at Cornell University. She recently returned to Cornell after two years as a Harvard Academy Scholar at the Harvard Academy for International and Area Studies. Her research interests include political communication, education and democratization, the consequences of political participation, political behavior, comparative constitution making, law and development, cross-national survey research, and the international refugee regime. Her dissertation, based on research conducted in Uganda, focused on the effects of citizen participation in Ugandan constitution making in creating “distrusting democrats.” She received her Ph.D. in political science from the University of Michigan and a B.A. in development studies from the University of California, Berkeley.