This chapter therefore focuses on USAID’s policies and practices for monitoring and evaluation (M&E) of its DG projects. To provide a context, we begin with a brief description of the current state of evaluations of development assistance programs in general. Then existing USAID assessment, monitoring, and evaluation practices for DG programs are described. Since such programs are called into existence and bounded by U.S. laws and policies, the key laws and policies that shape current USAID DG assessment and evaluation practices are examined, to lay the foundation for the changes recommended later in the report. The chapter concludes with a discussion of three key problems that USAID encounters in its efforts to decide where and how to intervene.


As Chapter 5 discusses later in detail, there is a widely recognized set of practices for how to make sound and credible determinations of how well specific programs have worked in a particular place and time (see, e.g., Shadish et al 2001, Wholey et al 2004). The goal of these practices is to determine, not merely what happened following a given assistance program, but how much what happened differs from what would be observed in the absence of that program. The final phrase is critical, because many factors other than the given policy intervention—including ongoing long-term trends and influences from other sources—are generally involved in shaping observed outcomes. Without attention to these other factors and some attempt to account for their impact, it is easy to be misled regarding how much an aid program really is contributing to an observed outcome, whether positive or negative.

The practices used to make this determination generally have three parts: (1) collection of baseline data before a program begins, to determine the starting point of the individuals, groups, or communities who will be receiving assistance; (2) collection of data on the relevant desired outcome indicators, to determine conditions after the program has begun or operated for a certain time; and (3) collection of these same “before and after” data for a comparison set of appropriately selected or assigned individuals, groups, or communities that will not receive assistance, to estimate what would have happened in the absence of such aid.1


The ideal comparison group is achieved by random assignment, and if full randomization is achieved, a “before” measurement may not be required, as randomization effectively sets the control and intervention groups at the same starting point. However, both because randomization is often not achievable, requiring the use of matched or baseline-adjusted comparison groups, and because baseline data collection itself often yields valuable information about the conditions that policymakers desire to change, we generally keep to the three-part model of sound evaluation design.

The National Academies of Sciences, Engineering, and Medicine
500 Fifth St. N.W. | Washington, D.C. 20001

Copyright © National Academy of Sciences. All rights reserved.
Terms of Use and Privacy Statement