Proceedings of a Workshop to Review PATH Strategy, Operating Plan, and Performance Measures
5 Defining Success and Performance Measures for the Evaluation and Management of PATH
Melvin Mark
Pennsylvania State University
A topic that arises from the discussion of program evaluation involves logic models and how to look at the draft PATH strategy, operating plan, and performance measures. I am going to talk about how these translate into benefits for the program. In this context, we are going to talk about some selected aspects of the current version of the PATH model and raise some questions that should recur in the three PATH goal discussion panels. I do not want to say “here is the answer,” but “here are some of the questions.”
Very often, when evaluators are brought to the table, or when performance measurement people come to the table, they start with the development of a logic model. This is a means of capturing and communicating a theory of change, a theory of action, and a notion of what it is that the program does and how that translates into benefits. There are multiple variants that use different terms, but for our conversation we will use the term logic models.
An evaluation logic model usually includes inputs, activities, outputs, and outcomes. PATH has adopted a frequently used form of a logic model that measures performance by tracing the flow of inputs, activities, outputs, and outcomes. Inputs are the program’s resources, such as its budget, staffing, and physical resources that are allocated to the program or that come from partnerships. Activities are the things that are done. Sometimes when people are talking about this, they use verbs. These are programs that have been created, things that are done in one way or another to get to the objective. Outputs are products. For example, an activity may be creating a curriculum for a training program for builders. The outputs then would be the sessions that are conducted and how many people received the training, similar to McDonald’s count of how many hamburgers they have served. Then we consider outcomes, the effect the output has on the goal.
Sometimes I like to think about a logic model as a set of dominoes. It is like knocking over a series of dominoes. Eventually the last domino falls or the program achieves a certain goal. Sometimes these logic models include facilitating conditions that make success easier or inhibiting conditions that may make the program less likely to be successful.
When people talk about logic models, what they are suggesting is an “if-then” logic. If we do this, then this other thing will result. If we put these training programs out there, certain kinds of learning will occur, and if that learning occurs, then certain kinds of changes will occur in practices. For example, if we train builders about R&D, they will understand its value and they will undertake R&D activities that will provide innovative technologies that improve the value of houses. One of the problems that one sees as an evaluator is that people sometimes list these things, but there is not much of a logic to connect them. The dominoes do not all fall.
PATH has applied a logic model to its operating plan and performance measures. Looking at a page from the metrics, there are inputs such as staff time and industry expert time; activities such as forums; and outputs or products that lead to goals such as reduced or eliminated barriers.
The reason evaluators and others go through these exercises, known as a “formative evaluation,” is that the process is supposed to make the program better. There are benefits in just doing this kind of logic modeling, first, because it imparts a better understanding of the goals and the processes being managed and, second, because it gets people rowing in the same direction. Various partners and staff members sometimes have very different perspectives on what a program is supposed to be doing. That means they are unlikely to be coordinating their actions to achieve the same objectives. The effort gets diffused, sometimes in conflicting directions. Simply having an agreement on what the program is about, what it is doing, and perhaps most importantly, where it is trying to go, can be beneficial.
Similarly, there is sometimes a formative function in making the program plan more rational. Joe Wholey, who is one of the pioneers in evaluation logic modeling, showed in one of his first examples that the program managers were trying to do too many things. When they looked at a picture of the whole program plan, they saw that they could not reasonably have all of the components in the plan, given the resources that were available. That may also be true for PATH. This kind of revelation does not always happen, but it is not an uncommon consequence of going through an evaluation logic model exercise.
Evaluators use logic models and move from logic models to various indicators, measures, and metrics because this provides a way of guiding a summative evaluation or bottom-line judgment. Does the program work? Is it functioning effectively? Is it beneficial? Without some specificity about the objective and without some prior agreement about what kinds of measures might capture the objective, it is difficult to know whether the program works.
It is hard to have agreement if there is no rational basis for judgment. This is, of course, one of the motivations for the Government Performance and Results Act (GPRA) and various other initiatives that have pushed agencies to undertake performance measurement. There are some complications and challenges, but our time is limited, so I am going to focus on the potential benefits.
A good evaluation system is one that supports results-oriented management. If Joe Wholey were here, he might tell a story about the U.S. Coast Guard (USCG) that illustrates the benefits of evaluations. The Coast Guard developed a performance measurement system using data of a kind they had never collected before. From this performance indicator system, they observed that there were unusually high rates of injuries and even fatalities in certain aspects of the seaborne industries. I don’t know how many of you have read the book Tommy Tugboat to your children, but it turns out that Tommy Tugboat is a very dangerous place to work. USCG had never collected data that allowed them to slice and dice by the different parts of the industry. Once they had the performance data, they saw where the problem was, and it guided them to create new programs and regulations that resulted in a precipitous decline in injuries.
We started with a quote from Peter Drucker: “If you are not measuring it, how can you manage it?” If you have no idea, how do you know if you need to make changes or stay the course? How do you know which things need your attention the most? Results-oriented management is one of the reasons that one tries to get performance metrics, despite the challenges.
A list of some of the criteria that we might use to think about evaluation models includes:
Practicable: Is it feasible to implement with the given resources?
Plausible chain: Is there a logical sequence that is likely to achieve goals?
Quality: Are measurements available for assessment of goal achievement that are valid and not easy to game?
Adequate: Does it provide short-term outcome measurements needed by management?
I have the word “practicable,” which essentially means the extent to which it can be carried out in practice. When we think about this criterion, we are asking if we can implement the plan, given the resources. Simply, is the program doable in terms of these planned activities?
A second criterion is the plausibility of the logic chain. There is a set of activities that the resources will support. They are supposed to result in certain products, which we call outputs, which in turn are supposed to lead to medium- and long-term outcomes. Considering the logic chain as a column
of dominoes, are those dominoes lined up in a way that if the first one is knocked over, the rest of them will fall, or are one or two dominoes out of place so that the series of events will not be carried forward? How plausible is the chain? Simply because a chain has been laid out does not mean that it is likely to lead to the stated results.
The third criterion is quality of measurement. Will the quality of measurement actually assess achievement of the goal? This is particularly important for long-term goals. Begin by examining the specific metrics and performance indicators to ensure they are countable. How is PATH going to show it has reduced X percent of the barriers or reduced the severity of the barriers by X percent? This is obviously critical because communicating successes requires a means to demonstrate them. The validity of the measure is also a critical criterion. There should be no question that it is measuring what it purports to measure. If a metric is about reduction of barriers, is it really capturing in some honest sense a reduction of barriers, or is it just a number that does not tell much about barriers?
Evaluators are increasingly concerned that the measures do not become an end in themselves. One of the things that first brought public attention to the economist who co-wrote Freakonomics was a statistical algorithm to detect teachers who were cheating on their students’ standardized tests. They were basically giving out the answers to manipulate the result, which means the test scores had no relation to what the children were learning. Obviously, measures that do not allow that kind of manipulation are desired.
An issue that sometimes gets lost is the adequacy of the measurement for supporting day-to-day management decisions.
Sometimes we focus all of our attention on the long-term objectives, which we are not going to reach this year or perhaps next year or perhaps the year after that. If we are trying to implement management by results, we need to have indicators that are shorter term and that have certain other characteristics that allow us to make day-to-day decisions based on feedback about how things are going. A measure might be sufficient to tell if long-term results are attained, but not be useful for day-to-day management because it is too distant in time or too general.
Looking at the draft PATH plan in terms of a logic model suggests that there are three primary parts or goals that facilitate completing the program mission. One of those has to do with removing barriers, another with technology transfer through dissemination of information, and the third with facilitating R&D. For each goal there are three objectives. Each of the objectives is described by something that looks like a conventional logic model. For each objective there are inputs (the resources it takes), the activities to be undertaken, the outputs or products of these activities, and then the short- and long-term outcomes. The organization implies that we are trying to get to the mission with multiple pathways. Each of these pathways has specific outcomes and the resources we have to achieve those longer-term outcomes.
PATH should be applauded for having separated input, activities, outputs, and outcomes in a way that links them to specific long-term outcomes and objectives. Often an evaluator will walk into an organization that has gone through a planning process like this to find they have lumped input, activities, and so on into five buckets. The evaluator cannot determine which activities are supposed to achieve which outcomes.
The problem now is to examine each performance measure in terms of the criteria mentioned earlier to determine to what extent it is feasible to implement this entire plan given the resources that are available. We already heard that the resources are now less than when this plan was first developed and many of the inputs are coming from various partners with unknown levels of commitment. In the discussion following Dr. Slaughter’s presentation there were several interesting points about the different audiences and types of builders, as well as various roles within the large homebuilding organizations, and about architects, consumers, and suppliers. Dr. Slaughter noted a variety of mechanisms including technology push and demand pull. The role of branding came up as well as research following the money in a variety of ways. There are obviously many opportunities for program activities that fit the PATH mission, which is both a blessing and a curse. PATH is faced with the problem of determining which is likely to be most beneficial. The issues of practicability and the consequences of practicability are worth some discussion.
We are beginning to describe what is called an aspirational model, that is, what PATH could be. There is a potential danger with any model that is built on what is possible or what might be possible at some point in the future. The program may be judged according to those standards and metrics regardless of the level of resources that are currently allocated.
PATH also needs to ensure the plausibility of the logic chains. This judgment needs to be made by people with knowledge and expertise in housing and innovation in the housing industry. For example, how likely is it that PATH can develop a branding capability? Does PATH have the necessary focus and size to get enough exposure that branding is a plausible activity? Do the dominoes in the PATH model connect or are additional steps required to get to a single long-term outcome? If the objective is to create pull by building demand from consumers, maybe there are activities that need to go together to converge on that one single long-term outcome.
Quality measurements to assess attainment of a goal need to support rational management decisions. Long-term achievements and immediate management require somewhat different kinds of metrics. In either case they should have validity to ensure they are measuring what they claim to be measuring. If the objective is reduced barriers, the user of that metric should be convinced that it actually reflects the value of reduced barriers. Another consideration that can be important is whether there is a comparison standard to measure changes over time.
Cost of the performance measurement activity is also important. A $5 million program cannot use a set of metrics that is going to cost $7 million to implement. It is essential to determine the feasibility of the measures given budget constraints.
For results-oriented management, there is a set of criteria that come up more strongly than when considering the long-term objectives. That is, did we get the job done? It should be possible to disaggregate the measures to examine different regions of the country or different sectors of the industry to see the trends. Large national homebuilding companies may be responding differently than smaller builders, and builders in the South may respond differently than builders in the West. An additional complexity is that management decisions require current information. If data are not available until three years later, the data will not help make the current decision. Managers often turn to proxies or indirect measures, but it is often difficult to know if they accurately represent the intended objective. As an example, the National Science Foundation wants to increase the nation’s scientific proficiency. How valid are third grade test scores as a predictor of long-term human capacity? Does this indicator work well enough to help make decisions?
The questions we have been addressing will recur during the panel discussions. Is the activity plan commensurate with the level of resources and if not, what can be done? Where do you make finer choices? What do you give up? Is it plausible? If PATH has a great set of activities that are not plausible, what good is it to use resources on those activities?
After focusing on the details of the logic model, it is important to determine if the program actually accomplishes its mission. Do the individual long-term objectives and the long-term outcomes support the mission? There are sets of evaluation questions that help determine the level of confidence in the program. There is a need to know if the changes that are measured are the result of the program activities. If all the large builders create internal research units, can we be confident that PATH made some difference or are the observed changes due to these other activities?
DISCUSSION
MR. KASTARLAK: It seems to me that perhaps we can add one more word to the lexicon of housing. In addition to sustainability and affordability, there is also attainability. PATH can build its logic model beginning with that as the end objective. Start from there and walk backwards.
DR. MARK: Absolutely; I started on the left-hand side because that is how we read in this country. Another approach is to start on the downstream side with the objective and then work backwards to plan how to get there. In fact, when it is done well, planning typically is an iterative
process. It works backwards to determine how the program is going to get there and then forward to determine if there are resources to do that. Are the dominoes going to fall?
MR. KASTARLAK: Yes, but there is more to that because you might end up changing your goal.
DR. MARK: Absolutely, this process can change the goals in ways that are desirable or in ways that are not desirable. Sometimes this results in “goal displacement,” for example, if we are interested in children’s education, but we get fixated on test scores because we can measure them. Test scores are good for some purposes, but maybe they are not the be-all and the end-all. The goal can be changed in ways that are not commensurate with the mission. For example, the mission statement is so broad and vague that it would enable you to do anything, which means it is really not the best mission statement. The process will help to highlight such inconsistencies.
MR. ENGEL: On the one hand, the program wants to show all the pieces that need to be done and on the other hand, as you pointed out, there is a resource constraint. Is there a method to show both aspirational and plausible goals and measures? I would hate to submit something that was only a piece of the puzzle and didn’t show the whole complexity, but the issue you raise of resource constraints is very appropriate.
DR. MARK: I am going to let a couple of my evaluation colleagues jump in if they will, but first I will say I think you have actually hit it precisely. The draft demonstrates the big picture. Here are the levers that we have adequate resources to push and that we can make a case are most likely to make a difference. The plan can then show next steps PATH would include. I do not know if OMB likes that, but it certainly can be part of a presentation. Here are our key priorities given where we are now.
Subsequent activities would likely involve other activities. If the plan indicates the program is going to do everything right now, it is a bit like hoisting oneself on one’s own petard.
MR. FREEDBERG: Ultimately all of these short- and long-term outcomes must relate back to the larger mission. The new mission statement really says that the mission is to improve housing technology innovation in order to improve housing values, affordability, energy efficiency, and so forth. Looking at the outcomes, I don’t see specific references to those values or the components of the values that are the mission of the program. Those are very difficult things to measure. How much more affordable is the housing as a result of these activities? How do you address that in a logic model?
DR. MARK: You do clearly want those long-term outcomes to be in the service of that mission. If they are not, you have mission creep. In your question, you have shifted, perhaps inaccurately, from what the mission is. I suggest that we not answer that question now, but when we look at each of the three pieces over the next three sets of panels, that should be a question that is in your mind.
DR. SLAUGHTER: I think that goes directly to the previous question about working your way backwards. If the goal is to increase availability, the program should be able to increase the speed at which houses are produced. By working backwards, determine how to do it. Then there is the issue of prioritization. There is the issue of how effective PATH will be in achieving those various elements. Industries that are revenue and profit motivated are going to be effective at reducing the cost of a specific unit, especially if they can increase their profit margin. I think prioritization of the long-term goals is the justification for federal expenditures in this area.
DR. MARK: I am not going to argue with that.
What I will say is that my understanding of the purpose of this session is to provide input to PATH about the current draft plan in ways that it can take into consideration before that plan goes forward organizationally. I am not sure we need to answer every question. I agree with you that we could do this in multiple ways, but we can’t go through all of those ways today.
DR. MARTIN: Just to answer some of your questions about the process: in most of this discussion, the background document defines the history of what happened, but the mission actually was established by the PATH Industry Committee two and a half years ago. Then that was translated to the goal and the three sub-goals.
MR. GONZALEZ: The criteria of practicability, plausibility, and so forth will help to focus the
discussions as the workshop addresses the different goals. As was noted, the objectives of the discussions are not to eliminate parts of the draft operating plan, but rather to give PATH input and help to determine how it can be most effective and where it can have the most impact.