The objective of benefit-cost analysis is to bring evidence to bear on the policy making process. Four speakers at the workshop explored that objective from the perspective of the users of benefit-cost analyses. All agreed that benefit-cost analyses have provided valuable guidance to policy makers.
For decades the U.S. Office of Management and Budget (OMB) has sought to make the best use of the marginal dollar of federal funding, said Kathy Stack. Yet in only a few areas does evidence exist for how to do that. According to Stack, government programs are marked by immense inertia. Federal agencies are quite comfortable doing what they did the year before, unless somebody tells them to do something differently. Also, Congress places a priority on maintaining political support. As a result, taking money away from any particular entity is very difficult unless a compelling reason exists for doing so. Finally, organizational silos cause communications to break down, creating missed opportunities where organizations are not talking to each other about ways of doing things better.
The budgeting process is also characterized by what Stack called the “wrong pockets” problem. For example, interventions in housing could produce savings in health care, but appropriations subcommittees are focused on housing and on health care, not on the connections between the two. Similarly, appropriators for discretionary programs in government are unlikely to change existing priorities to produce savings in mandatory programs. The same thing happens between the federal government and the states, which often operate at cross-purposes rather than collaboratively.
1 This section summarizes information presented by Kathy Stack, U.S. Office of Management and Budget, Washington, District of Columbia.
A Brief History of Evaluation in the Federal Government
A major challenge to breaking out of entrenched practices has been the lack of robust measurement and evaluation tools, Stack observed. In the 1980s the effect of a program focused on children was measured simply by the number of children served. In the 1990s the Government Performance and Results Act shifted the emphasis to inputs, outputs, and outcomes. Operationalizing those concepts within agencies turned out to be exceedingly difficult. In seeking to do comprehensive assessments of all federal programs, the Bush administration developed the Program Assessment Rating Tool, which required agencies to evaluate every program, largely on the basis of performance data. This generated a lot of work on defining outcomes and outputs, Stack reported, but the data produced often did not say much about impacts or cost effectiveness.
At the beginning of the Obama administration, stimulus money became available to think about new program designs, which led leaders in the Executive Office of the President to look for opportunities to embed research in new program designs, according to Stack. For example, a presentation by David Olds about the Nurse–Family Partnership program to a group of staffers at the OMB during the Bush administration contributed eventually to the development of a $1.5 billion program by the Obama administration. Similarly, the OMB has emphasized ways of building evidence through comparative cost effectiveness in its communications with agencies.
Successive administrations also have sought to build a clearinghouse to create a repository of knowledge about impacts and cost effectiveness. The What Works Clearinghouse (WWC) is one product of this emphasis, and several others have been developed. However, the clearinghouses lack some forms of data, such as cost data and comparative data. Standards that could serve as a “north star” for the convergence of agency discussions and actions would be helpful, Stack said.
A Waiting Audience
Benefit-cost analysis researchers have a “waiting audience” for their work, said Stack. Both the OMB and the budget committees on Capitol Hill have policy levers they can use to drive funding to programs that are more cost effective, but mustering the political will to change requires evidence. Federal policy makers tend to do things in 5- and 10-minute chunks, Stack explained. To have an effect, information needs to be simple and clear, even when the underlying research is complex. A simple presentation that grabs people’s attention and focuses on outcomes can be influential.
In response to a question, Stack also pointed out that the OMB is eager to enable researchers to conduct evaluations of programs, whether through random assignment, quasi-experiments, the use of administrative data, or some other methodology. Some agencies are doing this well, while others are lagging behind. The idea is to seek out variation within programs and figure out which variants make the most difference. Though most policy makers are not focused on doing random assignment studies, virtually every large program has opportunities for doing such studies if program managers can be connected with researchers at the appropriate time.
The OMB is encouraging programs to generate data about effectiveness, Stack concluded, but evidence standards, along with incentives to adhere to them, could reinforce that effort. If programs had strong evaluation components, they could learn as they go. Data on cost effectiveness and return on investment could redirect money at the state and local levels, Stack indicated.
The term policy maker is usually interpreted to include legislators who make laws and fund programs, but it actually includes a wider range of people, said Jacqueline Jones, former Deputy Assistant Secretary for Policy in Early Learning at the Department of Education. Among these people are the program specialists who write regulations, implement programs, and monitor progress within the executive branch of government.
2 This section summarizes information presented by Jacqueline Jones, Ph.D., independent consultant, Princeton, New Jersey.
The advocates and stakeholders who support or oppose legislation and regulation and aim to influence policy are also included. These policy influencers can be a powerful force on legislators and program specialists, said Jones.
Benefit-Cost Analyses in Practice
Before going to work at the U.S. Department of Education, Jones was assistant commissioner for the Division of Early Childhood Education in the state of New Jersey. In 2006 and 2007, New Jersey was implementing a court-ordered high-quality preschool program charged with providing all 3- and 4-year-olds in 31 of the state’s poorest districts with a full-day, full-year preschool program. The governor of the state and the commissioner of education were very interested in expanding the program beyond the 31 districts to the more than 600 districts in the state and in changing the way the program was funded.
The preschool program had a number of components, including its full-day duration, a maximum of 15 children per class, a requirement that teachers have a bachelor’s degree and a P-3 certification, the use of master teachers with at least 5 to 7 years of experience, assistance from family workers, and transportation. Cost was a major consideration, which required weighing the benefits of several of the program’s components. For example, do data exist showing that a preschool class of 15 children is better than one of 17? Do the teachers need to have bachelor’s degrees right away, or can they earn their bachelor’s degrees while they are teaching? Which parts of the program should be implemented first and which later to ensure high quality? Meanwhile, legislators did not know how they were going to pay for the program, the governor was determined to make it happen, and policy influencers were threatening court action to see that certain components were implemented.
As an example of the inevitable complications, Jones pointed out that teacher salaries were the biggest driver of cost, and the program was paying teachers at parity with public school teachers. But each district in New Jersey negotiates on its own with its union, so each district had a different pay scale. New Jersey “is a very complicated place,” Jones said.
Interventions involve multiple actions and have multiple outcomes. A major question, therefore, is to what extent the components of a complex preschool program function independently and to what extent they are interrelated in ways that are not yet understood. For example, would removing the master teachers have a major adverse effect on the program? What is the role of teacher preparation in the program? What contribution do family workers make to preschool education, and does that contribution differ from place to place?
At times, policy makers knew that the promises they were making were not entirely borne out by research. But pressures can be so intense that people will do whatever they can to make something happen, Jones said. Everyone wants better outcomes for children, but how to pay for programs and see results within the time frame of a particular administration is not easy.
Jones called for more conversation between policy makers and researchers. Policy makers usually need information right away, not a week from now. Researchers can benefit by knowing what policy makers need and what kinds of stresses they are under. For their part, advocates, in their zest to do the right thing, sometimes promise more than can actually be delivered, Jones observed. Social scientists could help policy makers and advocates understand what works for whom under what circumstances.
Suggestions for Action
Jones made several suggestions for increasing the value of benefit-cost analyses to policy makers. Clear and accurate description of what is happening with control groups is critical, she said. Also, gathering baseline data before starting to look for effects can improve the quality and usefulness of data. Then, standardizing the presentation of data can be tremendously important by providing a common conceptual framework with which people can interpret results. The world has changed since some of the landmark studies in the field were conducted. New studies are needed that reflect modern circumstances, Jones said.
Legislators can expect variations in program quality, especially in the early stages of implementation. No program is monolithic, even those that have sets of standards. Moreover, benefits do not appear immediately after a program is instituted.
Finally, researchers could ask policy makers what they think would be helpful. Ongoing conversations can help policy makers learn about the complexity of an issue, encouraging them to ask more informed questions, while helping researchers understand the issues that policy makers face. Such relationship building can foster a sense of trust and support on both sides.
Good policy results from a combination of three inputs, said Linda Smith. The first is good research about what works and what does not work. The second is good data. The third is human stories. Policy makers may know the research and have the data, but they may not take action without knowing about the human aspects of a program.
The Complexity of Using Research to Inform Policy
Benefit-cost analyses tend to focus on particular aspects of a program and ignore other aspects, Smith said. For example, studies of the Perry Preschool Project have focused on its effects on children. But an important aspect of that program was the parent involvement fostered through its weekly home visits, which has not received as much attention. These home visits changed the parent–child relationship, but they also changed the parents and their relationships with their communities. For example, such interventions can affect parents’ work and their ability to get out of poverty. These kinds of effects require a different lens to detect, said Smith, whether examining data from past studies or planning future studies.
Child development is extremely complex and cuts across social, emotional, cognitive, and other domains, Smith stated. Programs need both horizontal and vertical alignment to be maximally effective. No matter where a child is, according to Smith, that child needs and deserves consistent interventions and a certain level of quality of care.
3 This section summarizes information presented by Linda Smith, Deputy Assistant Secretary and Inter-Departmental Liaison for Early Childhood Development for the Administration for Children and Families at the U.S. Department of Health and Human Services, Washington, District of Columbia.
Research results also can be misinterpreted, Smith observed. Researchers may assume that their results have clear implications. Unfortunately, this assumption is not always warranted. Researchers can help policy makers avoid using research to justify bad decisions.
Making Policy Decisions with Limited Budgets
Benefit-cost analyses can be hugely important in implementing and sustaining a program, and more are needed, particularly as decisions are made about the future of Head Start and child care programs in the United States, stated Smith. Policy makers must decide with the information and resources available, and these can be very difficult decisions to make. In child care, for example, quality is declining as funding fails to keep up with the need, because no one is willing to cut the number of child care slots, Smith said.
The reauthorization of Head Start in the next few years will entail making these kinds of hard decisions. For example, mandating a longer day, a longer week, or a longer year has enormous budget implications. Which of these three options would return the most benefits? Today, teachers in Head Start make approximately half as much as teachers in the public school system. A tough decision that has to be made is whether to increase the pay of teachers or cut the number of children served. The results of sequestration have been severe in Head Start, Smith reported, and program managers have trimmed as much as they can, leaving wages not much above the poverty line.
Smith indicated that another decision involves how much of a program an individual child should receive. Is it better for 1 child to get 2, 3, or 4 years of child care, or is it better for more children to get 1 year? Similarly, are programs targeted to poor or mostly minority children preferable to programs that include children from more advantaged backgrounds? These are the kinds of decisions that managers are now facing.
In almost all of these areas, program managers do not have much data with which to make decisions. For example, Head Start has retained its emphasis on parent engagement, but what about parent engagement is most important? Could a different form of engagement have greater benefits?
Communicating Research Results
For research to influence policy, it needs to be understandable, Smith emphasized. Whenever research is not translated into simple language, an opportunity is lost. Policy makers and policy implementers struggle with interpreting the results of benefit-cost analyses for the general public. Yet, without public backing, better policies are hard to implement.
Gary VanLandingham, director of the Pew-MacArthur Results First Initiative, which is a joint program of the Pew Charitable Trusts and the John D. and Catherine T. MacArthur Foundation, said that the initiative is essentially trying to replicate a model developed by the Washington State Institute for Public Policy. The initiative is working with 14 states and 2 counties in California to replace Washington State–specific data with data specific to other locations to provide benefit-cost analyses to policy makers in a form that they can use.
The Results First Initiative also has conducted a nationwide assessment of the field, focusing on the state level. It recently issued a report based on a comprehensive assessment of cost-benefit analyses produced by the 50 states and the District of Columbia during a 4-year period. This assessment identified about 1,000 studies that looked like cost-benefit analyses and then closely analyzed a third of those. It looked for six features:
1. Did they measure program cost and benefits across some type of baseline?
2. Did they assess both direct and indirect costs?
3. Did they discount future costs and benefits to current year values?
4. Did they monetize tangible and intangible benefits?
5. Did they disclose assumptions?
6. Did they do some form of sensitivity analysis?
4 This section summarizes information presented by Gary VanLandingham, director of the Pew-MacArthur Results First Initiative, a joint program of the Pew Charitable Trusts and the John D. and Catherine T. MacArthur Foundation, Washington, District of Columbia.
Of the 384 studies assessed, only 11 percent met all six criteria, leading VanLandingham to conclude that substantial room for improvement exists. Full monetization of tangible and intangible benefits was by far the least frequently met criterion.
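Two of these criteria, discounting to current-year values (criterion 3) and sensitivity analysis (criterion 6), lend themselves to a brief illustration. The sketch below uses made-up program figures, not data from the Results First assessment, to show how a benefit-cost analysis discounts future dollars to present value and then re-checks the result across a range of plausible discount rates:

```python
def present_value(amount, years_from_now, discount_rate):
    """Discount a future dollar amount to current-year value (criterion 3)."""
    return amount / (1 + discount_rate) ** years_from_now

def net_present_value(cash_flows, discount_rate):
    """Sum discounted benefits minus costs.

    cash_flows is a list of (year, benefits, costs) tuples, with year 0
    meaning the current year.
    """
    return sum(
        present_value(benefits - costs, year, discount_rate)
        for year, benefits, costs in cash_flows
    )

# Hypothetical program (illustrative figures only): $1,000 in costs up
# front, then $400 in benefits in each of the next four years.
flows = [(0, 0.0, 1000.0)] + [(y, 400.0, 0.0) for y in range(1, 5)]

# A simple sensitivity analysis (criterion 6): recompute the bottom
# line under several plausible discount rates instead of just one.
for rate in (0.03, 0.05, 0.07):
    print(f"discount rate {rate:.0%}: NPV = {net_present_value(flows, rate):,.2f}")
```

Because net present value falls as the discount rate rises, reporting results across a range of rates, as in the loop above, is one simple form of the sensitivity analysis the assessment looked for.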
Finding a Baseline of Practice to Inform Policy
Policy makers cannot wait several years for research results to be available, according to VanLandingham. They need to make decisions in real time. VanLandingham indicated that clearinghouses therefore can play a critical role by collecting and disseminating information being produced by cost-benefit analyses. However, a challenge of using clearinghouses is the differing nomenclature they use. Of eight clearinghouses reviewed by the Results First Initiative, the best tier of programs is variously called well supported, top tier, effective, proven, positive, or strongly positive, or is given a score of 3 to 4. Agreeing on what to call the good programs would be a step forward, said VanLandingham.
Policy makers have a great hunger for this kind of information, observed VanLandingham. In addition, advocates have latched onto it as a way to promote favored programs. They often cite studies that demonstrate returns on investments as a way to influence policy.
However, that influence has a flip side, VanLandingham warned. The credibility of the field could be destroyed unless a baseline of practice is established. Already, there is little relationship between the technical quality of benefit-cost analyses and their use in the policy process. Policy makers often treat all studies alike, regardless of how they are regarded within the field. Even if consensus in some areas is difficult to achieve, agreement on several fairly basic things would help maintain the field’s credibility by establishing standards for practice.
VanLandingham pointed out that other organizations engage in this kind of standard setting. For example, the Governmental Accounting Standards Board establishes standards for accounting to help people agree on the validity of an accounting statement. Standard setting does not happen overnight, but if the field can get started and move through an iterative process, it can maximize its impact on policy.