The importance of a strong evidence base of outputs and outcomes from investments in science, technology, and innovation (STI) cannot be overstated. Lessons for scaling can be gained only through systematic tracking and analysis of how specific programs and projects apply science and technology. The U.S. Agency for International Development (USAID) has taken steps to significantly strengthen its monitoring and evaluation (M&E) capacity by expanding the science-based evaluation toolkit for agency staff, making M&E more systematic and science-based, and establishing an evaluation culture at the agency. USAID Forward (2010), the administrator’s primary strategic document, states, “Evaluations inform decisions including designing follow-on projects, making mid-course corrections, developing country strategies, scaling-up projects, budget allocations and other related decisions.” This emphasis led to creation of the Bureau for Policy, Planning and Learning (PPL) in 2010, development of an agency-wide Evaluation Policy in 2011, and increased resources for evaluation.
The policy states evaluation will be integrated into project design, and should be unbiased, relevant, based on the best methods, oriented toward reinforcing local capacity, and transparent.1 The value of the agency’s M&E investment can only be confirmed if the organization learns from evaluations about what works and what does not, and why, and then applies the evaluation findings to implement more agile midcourse corrections and to develop new tools and approaches. A key challenge relates to the evaluation culture itself: ensuring the right evaluation approach and the appropriate indicators for the project or program under review, transparency about the evaluation goals and process, and use of the results to maximum benefit. A five-year review showed improvements in the number of evaluations, staff trained, and quality and use of evaluation, but USAID leadership has acknowledged it will take much longer to ensure that staff fully use the findings from a robust M&E program for current and future programs.2
1 USAID Evaluation Policy, January 2011, https://www.usaid.gov/sites/default/files/documents/1868/USAIDEvaluationPolicy.pdf.
The agency leadership has also emphasized expanded data collection and sharing, and thus has, in principle, increased access to a key building block for evaluation.3 The Development Experience Clearinghouse is an online repository for reports and data from the evaluations. With an inventory of more than 11,000 reports, it provides the opportunity for users both inside and outside the agency to draw lessons from a remarkable accumulation of experience. Making the data available as quickly and clearly as possible (e.g., through summaries) will ensure its most effective use.
2 Remarks by Administrator Gayle Smith, March 9, 2016, https://www.usaid.gov/news-information/speeches/mar-9-2016-administrator-gayle-smith-us-leadership-international-development.
3 Background at “Announcing USAID’s Open Data Policy,” https://blog.usaid.gov/2014/10/announcing-usaids-open-data-policy/.
Monitoring and evaluation at USAID have three primary purposes: real-time tracking of progress and problems to ensure rapid midcourse corrections, accountability to stakeholders (such as Congress and the public), and learning to improve effectiveness. The value of monitoring is described below under Adaptive Management and has become increasingly important as the leadership urges the agency to become more agile. Accountability includes ensuring funds are used efficiently, measuring effectiveness, disclosing findings, and using evaluation findings to inform budget decisions. To the extent possible from current methodologies, evaluation can help the agency to better understand which kinds of investments, including in STI, yield the greatest benefits. Learning encompasses generating and sharing knowledge, and using that knowledge to improve program design. As a learning tool, evaluation can track results and the impacts of programs; lead to understanding why programs succeed or fail; and suggest ways to adapt a project to improve performance. Evaluations can improve Country Development Cooperation Strategy (CDCS) planning, project design, and spending decisions.
USAID has a long and extensive history of incorporating monitoring and evaluation in its work. The use of impact evaluations surged in the 1980s, then declined until the last decade, when a committed leadership supported better tools. True impact evaluations remain limited in number on an annual basis: in 2014, USAID self-reported eight impact evaluations, a subset of which were randomized controlled trials. One evaluation with striking results examined crime reduction programs in Central America.4 The goal was to determine which interventions resulted in improvements, and the results, as determined through intensive sociological analysis, were strikingly positive: “The USAID approach to crime prevention under CARSI [Central America Regional Security Initiative] has been shown by the Vanderbilt LAPOP [Latin American Public Opinion Project] impact evaluation reported on here to reduce violence, crime, and fear of crime across communities at-risk in four countries in Central America….” This evaluation reinforces the idea that STI processes and tools can benefit many sectors, in this case crime and violence prevention.
USAID’s Scientific Research Policy describes the interaction between impact evaluations and agency-funded research as a virtuous cycle. As the policy states, “Research priorities help formulate and refine impact evaluation questions so that these can advance the state of knowledge around a particular subject. In turn, impact evaluations ground-truth research findings: they test innovative strategies and approaches in a real-world setting before they are scaled up with USAID funding, and in doing so, reveal new areas of research.”5
4 Susan Berk-Seligson, Diana Orcés, Georgina Pizzolitto, Mitchell A. Seligson, and Carole J. Wilson. Impact Evaluation of USAID’s Community-Based Crime and Violence Prevention Approach in Central America: Regional Report for El Salvador, Guatemala, Honduras and Panama. Agency for International Development. October 2014.
5 USAID, Scientific Research Policy, December 2014, p. 5.
Evaluations can also reveal how development is helping or hurting women or other target groups. The Economic Growth, Education, and Environment (E3) Bureau provides an example of how the interaction between evaluation and project planning can develop, with an emphasis on gender. It commissioned a review of its evaluations with regard to the integration of gender into all its projects. The findings included the following:
- A growing number of evaluations address gender differentials (67 percent of evaluations conducted in 2014, more than a fourfold increase from 2011) and provide sex-disaggregated data (53 percent of evaluations in 2014, more than a sevenfold increase from 2011).
- Evaluations highlighted the importance of sex-disaggregated project data in contextualizing and understanding project results.
- Evaluations noted the need to consider the implications of gender norms during project design.
- Evaluations highlighted the benefits of including women in project planning, leadership, and implementation.6
6 USAID, Gender Integration in E3 Sector Evaluations, 2013–2014, April 2016, p. vi.
The role of gender in STI evaluation also provides an example of the challenge in ensuring the entire organization plays a role in implementing a strategic priority. In a 2015 external study of how USAID and other international development organizations are implementing gender as a mainstream issue, the authors found donors performed well in setting out strategies over the last decade, but had accomplished little in incorporating gender into their evaluation cultures. Instead of using impact evaluations, the study noted, the institutions tend to develop qualitative stories.7 The study, and the research on which it relied, did not break out STI, but did differentiate progress by traditional development sector. Progress in evaluating gender aspects of projects was greatest in agriculture, health, finance, and education, whereas the authors could cite little progress in infrastructure, transportation, private-sector development, and public management. The study points to an opportunity for USAID to take the lead by showing how gender strategies and programs can be rigorously evaluated, especially in some of the STI-based sectors that have been most challenging to other development agencies. While USAID has taken steps to implement its Gender Equality and Female Empowerment Policy (see Chapter 7), it has not fully developed the roles and responsibilities related to evaluation of gender.
7 As the study notes, agriculture has made the greatest progress incorporating gender, and USAID’s Feed the Future demonstrates the feasibility of integrating a strong focus on gender in its evaluation efforts. See https://agrilinks.org/events/increasing-feed-future-impacts-through-targeted-gender-integration.
Finally, as noted in the external review, evaluations produce unanticipated benefits. A significant number of respondents noted that the mere engagement in using evaluations gave them a greater understanding of the purpose of, and approaches to, commissioning them. The agency cites 1,600 staff as participating in evaluation training in the last six years, but training alone cannot create an appreciation for the potential value of evaluations, especially among staff whose principal responsibilities are not in evaluation. It is important to recognize the link between the use of completed evaluations and the selection of evaluation tools at the start of projects. Evaluation needs to be built into project or program design at the time of inception.
In a 2016 external review of progress since issuance of USAID’s Evaluation Policy,8 the agency identified an increase in the annual number of evaluations from 130 to 230. While a numerical increase is commendable, ensuring that the right kinds of evaluations are deployed and the results utilized is of equal, if not greater, importance in improving an evaluation culture. The review found progress, but it also identified five steps for further improvement: (1) expansion of impact evaluation clinics to enable missions to fill gaps in the appropriate use of randomized controlled trials; (2) expansion of training so program managers gain more in-depth competence in evaluation oversight; (3) expansion of “gap maps” to target key opportunities to commission systemic reviews of the evidence; (4) greater attention on measuring the quality of evaluations; and (5) establishment of country-level learning plans to use evaluation findings within the CDCS process. USAID is updating its evaluation policy based on the experiences of the last five years and the external review.
8 USAID, Strengthening Evidence-Based Development: Five Years of Better Evaluation Practice at USAID, 2011–2016. https://www.usaid.gov/documents/1870/strengthening-evidence-based-development-five-years-better-evaluation-practice-usaid.
The sectors that draw on STI are highly amenable to systematic evaluation, although STI was not called out specifically in the 2011 policy or the 2016 review. This is demonstrated by the Learning Agenda program in the Bureau for Food Security (BFS), designed to improve implementation of the administration’s Feed the Future Initiative (see Box 6-1).
The Global Development Lab has assumed a role in advancing state-of-the-art evaluation approaches and tools throughout the agency, many of which have particular application to science, technology, innovation, and partnerships (STIP) (e.g., through the Monitoring, Evaluation, Research, and Learning Innovations [MERLIN] tools discussed later in this chapter). The Lab mirrors other offices within USAID in using a framework with objectives and intermediate indicators to measure progress in its own projects. But it also looks beyond disaggregated outputs, as evidenced by the Center for Agency Integration, which is expected to deliver Lab support to increase STIP integration.
USAID Contributions to the Science of Monitoring, Evaluation, and Learning
Evaluation and research are similar in that they try to understand how something works. However, as differentiated by one evaluation expert, “Evaluation determines the merit, worth or value of things…Social science research by contrast does not aim for or achieve evaluative conclusions.”9 Evaluation is itself a science that can provide tools and approaches within and across sectors.
9 Michael Scriven, Evaluation Exchange, Volume IX, No. 4, http://www.hfrp.org/evaluation/the-evaluation-exchange/issue-archive/reflecting-on-the-past-and-future-of-evaluation/michael-scriven-on-the-differences-between-evaluation-and-social-science-research.
In its 2008 report, the National Research Council’s Committee on Evaluation of USAID Democracy Assistance Programs urged the agency to better use impact evaluations to understand whether and how its democracy and governance programs were achieving their goals.10 The committee proposed a five-year experiment in the use of randomized controlled trials (RCTs), an experiment that led to a greater emphasis on RCTs and on evaluation in general within USAID, beyond democracy and governance programs. But the National Research Council committee also pointed out that “such designs are not always feasible or appropriate, and a number of other designs also provide useful information, but with diminishing degrees of confidence, for determining the impact of many different kinds of assistance projects.”11 In other words, a comprehensive, resource-intensive RCT is not, nor should it be, the only evaluation route. The risks of applying RCT approaches to poverty-alleviation projects are well described in an International Initiative for Impact Evaluation (3ie) study from 2011.12 Some projects need a different approach.
Finding 6.1: The reestablishment of a proactive approach to monitoring and evaluation has been a hallmark of the current administration. Recent assessments of progress demonstrate the gains, and at the same time, point to remaining gaps where remedies could have significant payoff. The USAID experience with evaluation also shows that one size does not fit all activities. Basic or applied research investments may be monitored in aggregate, while innovation investments may need monitoring and evaluation at a project level.
Adaptive Management Approach
The USAID program cycle (planning, project design and implementation, and evaluation and monitoring) acknowledges that development is rarely linear, and therefore stresses the need to assess and reassess through regular monitoring, evaluation, and learning. Allowing for flexibility, for differing paces of change and progress, and for the sharing of knowledge and resources between partners is important.
10 National Research Council, Improving Democracy Assistance, 2008.
12 Eric Roetman, A Can of Worms? Implications of Rigorous Impact Evaluations for Development Agencies. 3ie, 2011.
USAID deploys STI in complex environments where politics, culture, and key individuals can facilitate or frustrate a well-designed project. It is particularly important to know who can be trusted, and to what degree, and who cannot. Within USAID itself, a more scientific approach to evaluation of STI projects could yield significant benefits. A more effective rapid-feedback process would identify “early failure” of programs that are not gaining traction, and allow the reallocation of resources into more promising initiatives. It is important that USAID develop a more robust approach for using M&E evidence to terminate programs that are not achieving their goals.
Adaptive management is an approach that seeks to better achieve desired results and impacts through the planned use of periodic M&E information throughout the implementation of programs and projects. Monitoring can systematically test assumptions in a project, and adjustments can then be made based on the learning.13 This approach allows for experimentation with a variety of interventions and, via learning and feedback, can adjust a project as managers gain a better understanding of setbacks.14 Aid agencies, multinational nongovernmental organizations, and international organizations have adopted the adaptive management approach.15 The Global Development Lab developed evaluation approaches based on adaptive management, especially for STIP projects.
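The feedback logic sketched above, periodic monitoring of whether a project's assumptions are holding, followed by reallocation away from programs that are not gaining traction, can be illustrated with a toy decision rule. Everything in this sketch (project names, the "traction" metric, the 0.3 threshold, the even-split reallocation) is a hypothetical assumption for illustration; it does not represent any actual USAID system or data.

```python
# Illustrative sketch only: a toy "early failure" decision rule applied to
# periodic monitoring data, with freed budget redistributed to projects
# that remain on track. All names, thresholds, and figures are hypothetical.
from dataclasses import dataclass

@dataclass
class ProjectStatus:
    name: str
    budget: float           # remaining budget (USD)
    milestones_planned: int
    milestones_met: int

def traction(p: ProjectStatus) -> float:
    """Share of planned milestones met so far."""
    return p.milestones_met / p.milestones_planned if p.milestones_planned else 0.0

def reallocate(portfolio: list[ProjectStatus], threshold: float = 0.3) -> dict[str, float]:
    """Zero out projects below the traction threshold and split their
    remaining budget evenly across the rest of the portfolio."""
    failing = [p for p in portfolio if traction(p) < threshold]
    healthy = [p for p in portfolio if traction(p) >= threshold]
    freed = sum(p.budget for p in failing)
    result = {p.name: 0.0 for p in failing}
    for p in healthy:
        result[p.name] = p.budget + freed / len(healthy)
    return result

portfolio = [
    ProjectStatus("A", 100_000, 10, 6),  # on track
    ProjectStatus("B", 80_000, 10, 1),   # flagged as early failure
    ProjectStatus("C", 60_000, 10, 4),   # on track
]
print(reallocate(portfolio))
```

In practice, of course, the monitoring signal would be richer than a milestone count and the reallocation decision would involve judgment, not a fixed rule; the point is only that a periodic metric plus an explicit threshold makes "early failure" actionable rather than anecdotal.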
MERLIN as a Tool for Adaptive Management Approaches to Evaluation
USAID’s standard approaches are limited when the potential outputs and outcomes are not easily identifiable upfront, and where episodic changes in project direction may occur. This is common for projects in the complex environments of developing countries.16 Recognizing the challenges, the Lab announced a new approach in 2015, the MERLIN Program.
13 N. Salafsky, R. Margoluis, and K. Redford, Adaptive Management: A Tool for Conservation Practitioners. Washington, DC: Biodiversity Support Program, 2001.
14 N. Salafsky, R. Margoluis, and K. Redford, Adaptive Management: A Tool for Conservation Practitioners. Washington, DC: Biodiversity Support Program, 2001.
15 L. Rist, B.M. Campbell, and P. Frost, Adaptive management: where are we now? Environmental Conservation, 2013, 40(01), pp. 5–18.
16 USAID Global Development Lab, MERLIN fact sheet, https://www.usaid.gov/sites/default/files/documents/15396/MERLINProgramFactSheet.pdf.
The Global Development Lab developed MERLIN in partnership with two other bureaus: PPL and Global Health.17 It allows USAID to work with partners, such as universities, companies, innovation labs, and nongovernmental organizations, to collaboratively identify, design, and test new solutions. USAID currently has four active MERLIN mechanisms (see Box 6-2) managed by the Lab. Given their recent debut, the mechanisms do not have a systematic track record for broader adoption across the missions, but the committee did review evidence of first adoptions of the MERLIN tools in various sectors and missions.18
17 MERLIN Addendum 0000008, http://dai.com/sites/default/files/rfps/cdi008_addendum.pdf; MERLIN overview, https://www.usaid.gov/GlobalDevLab/fact-sheets/monitoring-evaluation-research-and-learning-innovations-program-merlin.
18 Interviews carried out by the committee at the May 2016 STIP summit, Pretoria.
Improved data collection and analysis will strengthen the evaluation process, whatever the methodology and sector. Evaluation can ensure that the impact of innovations is documented more rapidly and that project managers take corrective actions in real time, improving the odds of program success and informing resource allocation decisions. Establishing advanced evaluation systems can develop local capacity to collect appropriate evidence and use it to make evidence-based decisions about development. Impact evaluations can broaden the suite of questions asked about STI projects and the extent of follow-on gains, if any. Innovation in evaluation approaches can spur more rapid program adjustments throughout the program cycle with the intention of improving development outcomes.
Four challenges in current approaches to M&E emerge from USAID’s experience:
First is the tension between the need to evaluate short-term results and the reality that many aspects of development do not lend themselves to such time horizons. The political climate in Washington creates an incentive to demonstrate rapid progress from taxpayer funds as they are expended. Many projects are therefore set up with short time frames, with the implicit message that support will cease in the absence of results to show for the investment. This may lead to indicators that measure “success” in terms of activities (e.g., numbers of people trained, numbers of devices distributed, numbers of web visits logged), but that do not measure relevant impacts, development outcomes, and sustainability. This is a particular problem in evaluating innovation, where the desired outcomes typically occur well into the future; hence the default position of measuring outputs, not outcomes. For STI investments by USAID, the need for longer-term results is critical. The agency could usefully make the case for evaluating such programs and projects over a time frame long enough to capture full results.
Moreover, each organization has its own particular approach to monitoring and evaluation of innovation. A group of 12 development innovation funders (including USAID) called the International Development Innovation Alliance (IDIA) has responded to this challenge by developing a shared framework for M&E of innovation that includes modeling future outcomes. In 2015, the IDIA published “A Call for Innovation in International Development,” outlining the need for innovation to address pressing development challenges and laying out six core principles to facilitate development innovation.19 That same year, IDIA supported the establishment of USAID’s Global Innovation Exchange, an open platform developed by the Global Development Lab to focus on data harmonization, interoperability, governance, and financing.
19 Further information is available at http://www.resultsfordevelopment.org/about-us/press-room/call-innovation-international-development.
The second challenge that adaptive management and real-time monitoring can help address is the difficulty of implementing midcourse corrections. Although encouraging change is important, making changes in midcourse can in reality be disruptive, especially where projects are developed by partnerships, and the result can be an unwillingness to change course.
Third is the need to build into a project the time and money for evaluations from the start. Of primary importance, baseline data needs to be collected at the start of a project. If implementing partners are either expected to conduct or commission evaluations, the resources to carry out those responsibilities are a precondition for a project, not an afterthought. A Request for Proposals, and the subsequent award, can stipulate what is required.
Fourth is the challenge of translating evaluations into evidence-based decision making, which raises the issue of feedback loops. What is USAID doing now, and what should it do, to ensure that the results of evaluations are translated into future policy choices and program designs? The agency has embraced the principle of feedback loops to capture lessons from evaluations for future projects and to inform midcourse corrections.
The judicious application of best practices implies selecting the appropriate evaluation approach, not necessarily the most sophisticated one. Over-evaluation wastes time, money, and staff effort. Small grants may not need individual evaluations; evaluation aggregated at the program level may instead merit the cost and time. Innovation investments may need due diligence at the project level. Identifying appropriate M&E budgets in the context of different STI investments will make for a more comprehensive approach that provides results to guide progress.
Recommendation 6.1: With agency evaluation policies in place, the leadership needs to emphasize implementation and push for greater testing of new tools and lessons from recent experience in evaluation. USAID needs to ensure compliance with its policies on collecting relevant baseline data, and that midcourse reviews are fully utilized to enable managers to adapt or pivot in order to achieve success.
Recommendation 6.2: USAID should develop clear guidelines on the intensity of evaluation for each kind of programmatic activity, incorporating appropriate M&E tools that would help project developers to better calibrate their investments with an appropriate balance between cost burden and potential program gains. Much could be learned about appropriate monitoring and evaluation design from the current pilot efforts being implemented, with the engagement of program staff at all levels and across all missions/bureaus.
Recommendation 6.3: USAID should, perhaps with other development agencies and institutions, develop robust, state-of-the-art methods for assessing the impact of longer-term interventions, including investments in adaptive research and human and institutional capacity development.