This open forum session focused on two questions: What more do we need to understand about federal evaluation? What’s missing from current evaluation principles and practices? Russ Whitehurst (chair, steering committee) invited workshop participants to comment on the policies and policy-making processes at their respective agencies and organizations.
Mark Shroder (Department of Housing and Urban Development [HUD]) said that HUD’s policy statement on evaluation was created after having seen the success of the Department of Labor’s (DOL) and other major agencies’ policy documents and, as such, closely resembles several of them. He particularly mentioned the previous discussants’ points on transparency and on publishing reports regardless of findings. He noted that while he agrees that every methodologically valid report (as determined by agency staff) should indeed be published, he personally does not believe that reports of evaluations that were not found to be methodologically sound should have to be released simply because they were done. Shroder referenced the Information Quality Act (IQA), under which he said there is a little-known directive to agencies not to publish findings they do not believe to be true.1
Whitehurst asked if this issue could be addressed on the front end instead of after the fact—if an agency could determine, before money has been spent, that an evaluation will not be methodologically valid. Shroder responded that changes in a program’s expectations or scope can sometimes lead to unforeseen problems. Jack Molyneaux (Millennium Challenge Corporation [MCC]) agreed that some evaluations are indeed stronger or weaker than others, but said that MCC prefers not to act as a censor: it registers all evaluations, publishes all results in the agency’s evaluation catalog, and often encourages peer reviewers to weigh in on methodological quality. He added that the methodology should be appropriate to the project rather than written generically into the evaluation requirements.
1 Pursuant to the IQA, agencies are to ensure the quality, objectivity, utility, and integrity of the information they disseminate. They are also to provide administrative mechanisms allowing affected persons to seek and obtain correction of information that does not meet those standards: see Section 515 of the Treasury and General Government Appropriations Act, 2001 [Pub. L. No. 106-554, 44 U.S.C. § 3516 note].
Clinton Brass (Congressional Research Service)2 commented on the pros and cons of incorporating policies into a statute, as perceived by practitioners and advocates. Although some people believe it is useful to place methods into statutes to ensure they remain part of the discussion, others believe the inclusion yields too narrow a focus. Brass gave an example of a tiered evidence initiative that narrowly defined “evidence” (for internal and external validity) as primarily coming from impact evaluations—a point of controversy in the evaluation field. Whitehurst echoed Brass’ observation, noting similar issues at the Department of Education (DoED), for example, in the Elementary and Secondary Education Act and the Education Sciences Reform Act: a strongly worded congressional preference for the use of randomized controlled trials quickly became synonymous with a need to carry them out to reinforce scientific rigor. As education program evaluation has continued to mature, Whitehurst said, he has seen the language transform into wording that calls for use of the most rigorous method appropriate to the question being asked.
Judith Gueron noted two gaps in current principles and practices. One is how the policies for evaluation are written in a “one-off” way that does not promote replication, despite evidence that replication bolsters findings. The other is the need for more focus on communication and educating the general public on the importance of evaluation, beyond simply putting reports on the internet. She said that evaluation is a big investment and agencies need to build a constituency to educate the general public and government officials on the results from and the importance of evaluation.
Thomas Feucht (National Institute of Justice) asked the panelists about independence in terms of funding: When an agency places a requirement for evaluation in its policy, does that consequently yield the type of one-off evaluations to which Gueron referred? Conversely, how are other programs’ evaluations funded when they do not have the same written provisions? Is independence tied to appropriation? He mentioned that evaluations of crime prevention programs he used to manage suffered when, without appropriated provisions, the agency’s core funding was stretched too thin to fully support them.
2 Brass reminded participants that all of his comments throughout the workshop reflect his views and not those of the Congressional Research Service.
Bethanne Barnes (Washington State Institute for Public Policy), formerly of the U.S. Office of Management and Budget (OMB), noted a paper OMB wrote for the Commission on Evidence-Based Policymaking on the uses of evidence, which discusses how funding structures can affect the development of a portfolio of evidence.3 Nightingale briefly described DOL’s funding strategy for evaluation, which includes drawing small percentages from the department’s operating budget, as well as from program funding and discretionary grants. The evaluation office prioritizes appropriations on the basis of its agencies’ needs and the need to build on evidence from previous studies, and then uses that information to create an annual evaluation agenda for the department. She said that discussing funding for evaluation requires a “professional balancing act”: highlighting the importance of evaluation in light of other mission-critical activities at operating agencies. Whitehurst recalled having observed a similar system at DoED and agreed that planning for appropriations should not come solely from within the evaluation agencies; rather, it should incorporate the positions and needs of all the stakeholders within the agency.
Before the session was brought to a close, George Cave (Summit Consulting) noted that although randomized controlled trials are an improvement over prior evaluation methods, there are different types. In addition to the “thumbs up, thumbs down” trials, there are also theory-of-change trials, in which the timing and sequencing of events in a particular treatment are used to gain insight into impacts observed in a program. He suggested that both methods be given equal consideration.