Naomi Goldstein (Administration for Children and Families [ACF]) provided background on ACF’s evaluation policy. She noted that the agency’s leaders encouraged the evaluation office to develop the policy and that the process to establish it—which included reviewing existing policies from other federal agencies, as well as the American Evaluation Association Roadmap—was fairly straightforward. The policy (published in 2012) confirms the agency’s commitment not only to conducting evaluations, but also to using evidence from evaluations to inform policy and practice. It was intended to clarify a few key governing principles, disseminate them both internally and externally, bolster their implementation, and protect them against potential threats.
Goldstein reminded the workshop participants that evidence is just one component of decision making, and evaluation is but one form of evidence, along with such factors as descriptive research studies, performance measures, financial and cost data, survey statistics, and program administrative data. While the ACF policy focuses primarily on evaluation, many of the principles also apply to the development and use of other types of evidence.
Goldstein discussed the five principles in ACF’s policy: rigor, relevance, transparency, independence, and ethics.
Rigor means getting as close as possible to the truth and being committed to using the most appropriate methods to do so. Rigor is not restricted to impact evaluations; it is also necessary in implementation evaluations, process evaluations, descriptive studies, outcome evaluations, formative evaluations, and in both qualitative and quantitative approaches. Goldstein noted that rigor does not automatically mean the use of randomized controlled trials, although such trials are generally considered to have the greatest internal validity for questions about cause and effect and are therefore preferred for addressing such questions.
Rigor requires appropriate resources, a workforce with appropriate training and experience, a competitive acquisition process, and, for impact studies, robust implementation components that enable evaluators to identify why a program did or did not work and which elements were associated with greater impacts.
Relevance means setting evaluation priorities that consider many factors, including legislative requirements, the originating agency’s interests, and those of other stakeholders: state and local grantees, tribes, advocates, and researchers. Relevance can be strengthened by the presence of strong internal and external partnerships and by embedding an evaluation plan into the initial program planning. It is also important to disseminate the findings in useful ways. Goldstein stressed that rigor without relevance could yield studies that are accurate but not useful.
Transparency means operating in a way that supports credibility of the findings and allows for critique and replication of the methods used in an evaluation. It promotes accessibility and reinforces a commitment to share evaluation plans in advance and release results regardless of the findings. Goldstein said evaluation reports should: describe the methods used, including strengths and weaknesses; discuss the generalizability of the findings; present comprehensive results, including unfavorable and null results; and be released in a timely manner. She noted that ACF also archives evaluation data for secondary use.
Goldstein said that independence and transparency are the protective goals of ACF’s evaluation policy. They help create a culture in which broad dissemination of results becomes the standard. Independence, particularly when coupled with objectivity, is a core principle of evaluation, she said: although many parties should contribute to identifying evaluation questions and priorities, study methods and findings should be insulated from bias and undue influence.
Ethics, Goldstein emphasized, means recognizing the importance of safeguarding the dignity, rights, safety, and privacy of the participants in evaluation studies.
Goldstein closed by noting that having a policy has helped the agency clarify its goals and principles and that disseminating the policy helped make the agency’s principles a shared set of values both in the organization and with its program partners.
Demetra Nightingale (Urban Institute), who previously worked at the Department of Labor (DOL), began by emphasizing the importance for professional evaluators of tapping into their professional networks and sharing their knowledge. She described DOL’s mission, which includes promoting the welfare and protecting the rights of wage earners, job seekers, and retirees in the United States. She said that many of DOL’s dozen or so operating agencies, such as the Employment and Training Administration and the Occupational Safety and Health Administration, have their own evaluation offices. As chief evaluation officer, her role was not to centralize evaluation but to raise the quality of and consciousness around evaluation, increase awareness of evaluation methodology, and improve the use and dissemination of results to support these smaller entities.
As such, Nightingale said, her office ensured that the policy applied throughout the department and across administrations. She noted that her office drew from the work of other prominent evaluation agencies when creating the policy and took pride in the fact that the policy has been accepted and supported throughout the department. DOL has also created an evidence-based Clearinghouse for Labor Evaluation and Research (CLEAR), which contains guidelines for methodological rigor to which the agency expects both staff and contractors to adhere.
Nightingale reiterated Goldstein’s point about the importance of rigor and said that an evaluation policy should contain principles of rigor that apply to all types of evaluation and research. She said the focus should be on building and accumulating evidence—that is, basing decisions on a body of evidence rather than on a single study—and on continuous improvement and innovation. Nightingale touched on transparency by reiterating the need to place dissemination protocols in legislation, and she closed by reminding the group that ethics should apply both to the protection of study subjects and to the integrity with which evaluations are conducted.
Ruth Neild (Research for Action), who previously worked at the Institute of Education Sciences (IES), started her presentation by acknowledging that IES uses strategies very similar to those of ACF and DOL to promote transparency, rigor, independence, relevance, and ethics in its evaluations. The difference for IES, Neild explained, is that it also incorporates formal peer review of its evaluation reports to promote rigor and scientific integrity. This peer review process applies to evaluations of programs conducted by the agency and its contractors, but not to field-initiated grants.
IES was established in 2002 by the Education Sciences Reform Act (ESRA).3 What makes IES unique, Neild pointed out, is that ESRA charges the director of IES with ensuring that the agency’s activities are “objective, secular, neutral, non-ideological, free of partisan political influence, [and] free of racial, cultural, gender, or regional bias,” and it authorizes the director to publish scientific reports without approval from the Secretary or any other office—what is referred to as IES’s independent publication authority (ESRA Section 186). Neild said that this authority, in addition to rigorous peer review (which is also mandated by ESRA), makes it much more challenging for the results of IES’s scientific studies to be suppressed or changed to support political objectives.
Neild noted that the current evaluation budget at IES is approximately $40 million. In response to ESRA, the National Board for Education Sciences, IES’s advisory board, established a Standards and Review Office (SRO) to manage the peer review process. IES evaluation staff work with program offices to identify needs for evaluation and then conceptualize what those evaluations will look like. The evaluation staff provide direction to the contractors who carry out the evaluations, review draft reports, and determine when a report is ready for external peer review. Once a report is ready, the commissioner transmits the manuscript to the SRO, which coordinates a peer review process similar to that used by scholarly journals. Neild believes that the time and staff needed to conduct the peer review process are worthwhile tradeoffs for obtaining a valuable product.
Neild shifted the discussion to factors that threaten the five major principles for evaluation. Some threats are external, such as suppression or manipulation of results for political purposes. Other threats can come from within: hasty work or a desire to tell a story in a specific way that may not be exactly what the data show. She said that external peer review helps to mitigate these risks and increase public trust in the agency’s findings. Neild said she also believes that peer review incentivizes high-quality work by staff because they know that publication is not a given: it has to be earned by producing work that meets rigorous and objective standards. Peer review pushes evaluators to provide clear explanations of the purpose, background, methods, and findings of a study. Lastly, Neild said she thinks that it contributes to increased overall credibility for evaluation products originating from federal agencies.
3 See https://www2.ed.gov/policy/rschstat/leg/PL107-279.pdf [May 2017].
Jack Molyneaux (Millennium Challenge Corporation [MCC]) explained that MCC is a small, independent federal agency, founded in 2004, committed to reducing poverty in well-governed low-income countries through investments in sustained economic growth. It was created in response to bipartisan interest in international assistance, which was seen as ultimately beneficial to the U.S. economy. MCC’s authorizing legislation contains provisions to ensure that international investments are used in the right way, for the right purpose, and yield the expected results—all of which is contingent on having a credible evaluation strategy. Congress also requires that MCC’s compacts—grants provided to partner countries’ governments—contain specific benchmarks, strategies, and plans for annual progress updates.
MCC’s Board of Directors consists of the agency’s chief executive officer, four executive branch members, and four congressional appointees. The agency’s evaluation policy, first proposed in 2009 and formally adopted in 2012, mirrors those of prominent evaluation agencies in many ways, but it also has key differences. One such difference is the requirement that every project, regardless of size, undergo independent evaluation. About 97 percent of MCC’s projects are subjected to independent evaluations, which account for about 98.5 percent of the funding; the exceptions are very small studies and a few canceled projects.
To manage the cost and scope of evaluations, Molyneaux explained, the policy is structured to promote developing the evaluation design in tandem with the program design. The operations staff who work with MCC’s foreign counterparts to create, implement, and maintain the projects work in country teams with MCC’s evaluation staff but report administratively to a separate department. In practice, MCC implements its independent evaluations by contracting reputable evaluators who are given authority over the contents of their evaluations (subject to ethical protection of respondents’ confidentiality). Although there is a process in place by which staff can provide feedback if they think a factual error or a methodological problem has arisen, evaluators have editorial independence in reporting their results; they can choose to accept or reject any feedback from the project’s sponsor.
Molyneaux said the goal of the evaluations is to measure attributable impact whenever feasible and when the costs are considered warranted. Because of the nature of MCC’s projects (infrastructure projects such as building roads, for example), this cannot always be accomplished with rigorous impact evaluations. He reiterated Goldstein’s point about the need to use methods that are appropriate to each program and emphasized that rigor is still at the forefront. He explained that this process is not always smooth, but it is guided by principles of cost-efficiency: just as MCC uses economic logic and cost-benefit analysis to inform evaluation design, the agency must also ensure that the cost of an evaluation can be justified by the value of the accountability and learning it is expected to yield.
Molyneaux described some early evaluations MCC conducted, including a series of evaluations of farmer training programs. When the staff pulled together a critical mass of the evaluation results for publication, they were disappointed to find that the programs had not had the desired effects. He said it is often a challenge to develop and implement successful new programs: most of the simple interventions known to work have already been exploited in MCC’s partner countries, and the real challenge is improving on those already exploited opportunities. Some other early problems MCC faced were due to a lack of integration between evaluation planning and program design: in one instance, a farmer training program was executed and evaluated even though procurement issues had delayed completion of the irrigation system the farmers were trained to use until several years after the training and the evaluation were completed. Molyneaux said that MCC is forthright with its evaluation results, even when they are disappointing, and that the ensuing open dialogue has helped the agency improve both evaluation and program design. He added that he is impressed with the increased due diligence he has seen in MCC’s agriculture and road development sectors as a result of the sectoral evaluation reviews, and that the irrigation infrastructure sector is on a similar positive path.