APPENDIX B

Peer-Grouping Methodology Details

Introduction

This appendix presents the details of the peer-grouping methodology developed and tested by TCRP Project G-11. A summary version of the methodology appears in the body of the report as part of the description of Step 3 in Chapter 3. The peer-grouping methodology was developed in conjunction with the project's broader benchmarking methodology. The general process used to develop the peer-grouping methodology was as follows:

• Prior to the start of the project, the oversight panel for TCRP Project G-11 specified their desired key characteristics for the peer-grouping methodology.
• The research team conducted outreach to the transit industry on the industry's desired aspects of a benchmarking methodology and reviewed the literature to determine what methodologies had been tried before.
• The research team developed initial concepts for the peer-grouping and benchmarking methodologies, conducted internal tests on the reasonableness of the results, and presented the concept to the panel for comment.
• Based on the panel's feedback, a second version of the benchmarking methodology was developed. No changes to the peer-grouping aspect of the methodology needed to be made at this point. A second outreach effort was conducted to obtain industry feedback on the reasonableness of the approach described in the methodology.
• The outreach feedback was incorporated into a third version of the methodology, which was implemented in spreadsheet form. Agencies were recruited to participate in a small-scale test of steps 2–4 of the benchmarking methodology, with the peer-grouping methodology being part of this test. The agencies provided a performance-measurement question to be answered, and the research team applied the methodology to form peer groups, identify appropriate performance measures, and present results. At each step of the process, agency feedback was solicited on the reasonableness of the results. At the end of the testing effort, the feedback on the peer groups that were created was incorporated into a fourth version of the peer-grouping methodology.
• The peer-grouping methodology was incorporated into the FTIS software. Additional agencies were recruited for a large-scale test of the benchmarking methodology. This time, the agencies performed the work themselves (with the research team available to answer questions) and provided feedback. Their feedback on the peer groups was incorporated into the final version of the peer-grouping methodology presented here.

The remainder of this appendix describes the development of the peer-grouping methodology, including aspects of the methodology that were considered but discarded during the process, and provides the calculation details for the "likeness scores" used in the peer-grouping process.

Peer-Grouping Philosophy

Many of the overarching aspects of the peer-grouping methodology were determined by the project oversight panel at the start of the project. The panel's desired characteristics for the methodology were the following:

• Robustness – able to work with different transit modes, agency sizes, and operating environments.
• Practicality – relevant to and usable by transit agencies, state departments of transportation, and other interested stakeholders.
• Transparency – having an understandable process, with visible inputs, outputs, and intermediate results (i.e., not a black box).
• Uniformity – using readily available, uniformly defined and reported data.
• Innovation – going beyond traditional performance measures while avoiding previous peer-grouping approaches that have not been adopted by the industry.

The practicality and uniformity characteristics, in particular, drove the way the methodology was developed. The methodology is not intended to represent the best theoretical way that transit agency peer groups could be developed. Instead, it is designed to do the best possible job of meeting the industry's needs within the constraints of available data and tools. During the course of the project, the research team identified peer-grouping factors that could be improved if better data were available, and some of the project's recommendations in Chapter 5 reflect these data-definition and reporting needs.

During the course of developing the methodology, a few other desirable aspects were identified from the project's industry outreach efforts and incorporated into the methodology:

• Adaptability – Not every user will share the same philosophy that underlies the methodology; therefore, users should be able to adapt the methodology for their own use and not be locked into a single approach to peer grouping.
• Accessibility – Easy-to-use tools for applying the methodology should be made readily available to the industry.
• Updateability – To make the methodology usable into the future, a process should be described for calculating any peer-grouping factors not directly available from a national database.

Early in the project, the project oversight panel decided that rural and demand-response service was not to be a particular focus of the project. The final methodology can accommodate demand-response mode comparisons, although it has not been specifically tested on demand-response service (except as part of an agency's overall service). The methodology should be adaptable to rural service; however, rural NTD data were not yet available when the project was completed, so no testing could occur.

One aspect of the methodology that not all stakeholders agreed with is the philosophy that agencies that operate bus service but not rail service (with the exception of vintage trolleys or downtown streetcar circulators) should not be compared to agencies that operate rail service. Rail lines substitute for what would otherwise be an agency's highest-demand, most productive bus routes. Therefore, the scale, function, and productivity of bus service in a city that also operates rail service would be expected to be different than in a comparably sized city that only operates bus service. However, two of the three largest bus-only operators in the United States have compared themselves in the past to rail-operating agencies and are comfortable doing so (the third has only compared itself to bus-only operators). The TCRP Project G-11 methodology will not directly produce the peer groups those two large bus-only operators are used to seeing, since it screens out rail operators as potential peers even when doing a motorbus-specific mode comparison. Nevertheless, it is possible for those two operators to take the peer-grouping spreadsheet that FTIS can export, remove the rail-related factors from the peer-grouping calculation, and re-calculate a likeness score. Agencies can make similar adjustments for other portions of the methodology they may not agree with, without having to abandon the entire methodology. Thus, the adaptability criterion is met.

Another aspect of the methodology that not all may agree with is the deliberate exclusion of certain outcome measures as peer-grouping factors, even though such measures have been incorporated into other peer-grouping methodologies in the past. The philosophy here is that outcomes should be the focus of a benchmarking effort: for example, if one looks solely at agencies with similar ridership, one is unlikely to find anything that could be used to improve one's own ridership. On the other hand, if two agencies are similar in a number of input characteristics but have divergent outcomes (e.g., ridership, number of employees required for a given level of service, distance between mechanical failures), one is more likely to find something that can improve one's own performance. At the same time, it is recognized that some peer-grouping factors included in the methodology, such as operating budget, vehicle miles operated, or amount of contracted service, could also be considered as outcomes for certain performance questions.

Finally, the methodology was not designed to be used as a means of ranking agencies to determine on a national basis the "best" agencies overall, or best at a particular aspect of service (although nothing prevents it from being applied that way). That approach has been tried before [e.g., Hartgen and Horner (B-1), Perk and Kamp (B-2)], but has not been widely accepted by the industry. Rather, peer grouping and performance measurement are intended to serve as a starting point for an agency to ask questions and identify areas of possible improvement. That course, a true benchmarking process, holds much greater potential for long-term performance improvement.

Methodology Development

Initial and Outreach Versions

Description

The first two versions of the methodology used a three-step screening process to arrive at an initial peer group. In the first screening step, potential peers were screened on the basis of urban area population: to remain in the pool, a peer's urban area population had to be within ±25% of the target agency's for urban areas under 1 million population, and within ±50% for larger urban areas. The larger range for larger urban areas was used to keep a reasonably large pool of potential peers available for subsequent steps, since there are fewer large urban areas. Urban area population was determined from the U.S. Census Bureau's American Community Survey.
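As an illustration of this initial screening rule, a minimal sketch is shown below. The function name and the assumption that the ±25%/±50% band is measured around the target agency's population are illustrative; they are not part of the published methodology, which later replaced this screen with likeness scores.

```python
def passes_population_screen(target_pop: int, peer_pop: int) -> bool:
    """Initial-version screen: keep a potential peer only if its urban area
    population is within +/-25% of the target's (target under 1 million
    population) or within +/-50% (target of 1 million or more)."""
    tolerance = 0.25 if target_pop < 1_000_000 else 0.50
    lower = target_pop * (1 - tolerance)
    upper = target_pop * (1 + tolerance)
    return lower <= peer_pop <= upper

# Example: a 300,000-population target keeps peers between 225,000 and 375,000.
print(passes_population_screen(300_000, 360_000))  # True
print(passes_population_screen(300_000, 400_000))  # False
```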

Service area population would in theory be preferred to urban area population because it would allow agencies with similar market sizes to be compared. Unfortunately, although this variable is uniformly defined and collected for the NTD, it is not uniformly reported. (For example, a county-based system might report the entire county's population as its service area, although the actual population within the specified distance of transit service might be considerably less.) In addition, only one service area is reported to the NTD, even though the service areas of the various modes operated by an agency might be considerably different. As an alternative, the combination of urban area population and service type is used to identify agencies providing similar types of service within similar-sized urban areas.

In the second screening step, peers were screened out on the basis of three factors:

1. Modes operated – for consistency in the mix of modes operated. NTD data were used.
2. Service area type (e.g., region-wide, suburban only, central city only) – for consistency in the types of routes operated and markets served. Agencies had to exactly match service types to continue as potential peers. This variable was developed by TCRP Project G-11.
3. Proximity of adjacent comparably sized or larger urban areas – to account for commuting differences between a stand-alone urban area (e.g., Boise) and two or more urban areas in close proximity (e.g., Raleigh and Durham). A threshold of 45 miles was used to determine if the target agency's urban area had another urban area in close proximity. Agencies had to match (i.e., either both have or both do not have an adjacent urban area) to continue as potential peers. This variable was determined based on U.S. Census Bureau data for the geographic coordinates of the center of each urban area.

In the third screening step, a set of variables was used to further refine the peer group. These variables covered a number of factors that could differentiate one agency or region from another and that could account for observed differences in ridership or other outcomes. The variables were identified through a combination of the literature review, project oversight panel input, and project team brainstorming. A larger pool of variables was developed at this stage than was intended to be included in the final methodology, in order to see which of several variables did the best job of distinguishing between regions and agencies. The exact variables used in the screening depended on the type of application. Four types of peer-comparison applications were identified, depending on the type of performance question being asked:

• Operations – questions relating to the service provided on the street, taken from the agency's viewpoint.
• Planning – questions in support of mid- to long-term planning efforts, often with policy and funding implications.
• Market focus – questions related to the service provided, taken from the viewpoint of the broad range of customers, including riders, non-riders, local jurisdictions, and policymakers.
• Financial – questions related to the agency's financial performance.

The variables investigated are discussed below:

• Agency Proximity. This variable serves multiple functions. First, it serves as a proxy for other factors, such as climate, that are more difficult to quantify but tend to become more different the farther apart two agencies are. Second, agencies located within the same state are more likely to operate under similar legislative requirements and have similar funding options available to them. Finally, for benchmarking purposes, closer agencies are easier to visit, and stakeholders in the process are more likely to be familiar with nearby agencies and regions. Some past peer-grouping efforts grouped agencies by region of the country; however, that method is somewhat arbitrary and may not be useful for agencies located near the border of a region. Instead, proximity was based on distance, using the distance between the centers of the agencies' respective urban areas, as determined from U.S. Census Bureau data. This variable was used for all applications.
• State Capital. State capitals are typically associated with large employment centers that are frequently located in small to mid-sized communities. Because of the singular nature of capitals, a yes/no variable was used. This variable was developed by TCRP Project G-11 and used for all applications.
• Percent College Students. Urban areas with large college populations typically have more transit service and higher ridership, since colleges provide natural activity centers on which to focus transit systems and often assist in funding transit service. However, the effect of a university on ridership depends on the size of the urban area (e.g., UC Berkeley influences travel patterns in the San Francisco Bay Area less than UC Davis does for Davis). The variable is derived from the U.S. Census Bureau's American Community Survey; data include community colleges as well as 4-year universities. It was used for all applications.
• Population Growth Rate. Transit agencies located in areas that are growing quickly often experience different challenges than those with more stable populations, including the need to expand service to keep pace with growth. Agencies in regions that are shrinking face another set of challenges. This variable used the urban area's average growth rate between 2000 and 2006, using U.S. Census Bureau data. It was used for all applications.
• Population Density. This is a well-recognized factor in attracting transit ridership and increasing transit viability, and is readily calculated for urban areas using U.S. Census Bureau data. Because population density can be lowered by the existence of large open spaces within the urban area boundary, other density-related factors were also tested, as described below. Population density was used for all applications.
• Census Block Density. The number of census blocks per square mile within the urban area can be used as a proxy variable to measure network connectivity and, by extension, pedestrian access to transit. The variable was determined from U.S. Census Bureau data and was used for planning and market focus applications.
• Percent Low-Income Population. The proportion of an urban area's residents that are "low-income" (defined by the U.S. Census Bureau as members of a family with income less than 150% of the poverty threshold) affects transit ridership because those residents are more likely to use transit. Low-income statistics reflect both household size and configuration in determining poverty status and are therefore a more robust measure than either household income or automobile ownership, which are other factors known to influence ridership. This variable was used for all applications, based on U.S. Census Bureau American Community Survey data.
• Population Dispersion. While population density provides an overall measure of land-use intensity in an urban area, it does not reflect the homogeneity of land use. Urban areas with high-density cores and centers but low-density outlying areas may be more transit-friendly than those with population spread evenly throughout the area. Population dispersion is calculated by dividing an urban area's population density by its weighted density. Weighted density is calculated by multiplying each census block's population density by its proportion of total urban area population and summing across all census blocks (see the computational sketch following this list). A dispersion value of 1 means that population is distributed evenly across all census blocks, while a value closer to 0 means that residents are concentrated in specific areas. This measure was tested for planning and market focus applications using 2000 Census data.
• Employment Dispersion. This measure follows the same general principle and calculation methodology as population dispersion. It was tested for planning and market focus applications using 2000 Census data.
• Transit-Supportive Area. The amount of transit-supportive land use found in an urban area plays a large role in transit operations. The Transit Capacity and Quality of Service Manual (TCQSM, B-3) uses the concept of "transit-supportive area" (areas capable of supporting at least hourly weekday transit service, based on population or employment density) to make apples-to-apples comparisons of different agencies' service coverage areas. While population density and population dispersion also address this issue to some extent, this measure can potentially do the work of both. Converting the TCQSM's household-based threshold to a population-based threshold suggested a value of 7.5 persons per acre as the minimum value to test. The measure was tested for planning and market focus applications using 2000 Census data.
• Congestion Delay Per Capita. Highway congestion has a large effect on bus operating conditions and may provide a greater incentive for persons to use all forms of transit for peak-period trips. Congestion is more likely to be an issue in larger urban areas. Data for this measure are available from the Urban Mobility Report (B-4) for larger urban areas. The measure was used for all applications, except financial, for agencies located in urban areas with at least 1 million population.
• Freeway Lane-Miles Per Capita. The extent of a region's freeway network may indicate the level of priority given to roadway investments compared to transit investments. It may also influence service design; for example, systems focused on large park-and-ride lots. In small and mid-sized urban areas, freeways may serve more intercity travel than intra-city travel and therefore have less of an influence on commuting patterns. The measure was based on Urban Mobility Report data and was used for all applications, except financial, for agencies located in urban areas with at least 1 million population.
• Total Vehicle Miles Operated. The total amount of service provided by an agency influences a number of transit service factors. This variable was used for operations and financial applications and was based on NTD data. For the other two applications, this variable was felt to be an outcome and was therefore not included.
• Total Operating Budget. Total operating budget influences many aspects of transit service. Structurally, operating budget is a measure of the scale of a transit agency's operations; agencies with similar budgets may face similar challenges. This variable was used for operations and financial applications and was based on NTD data. For the other two applications, this variable was felt to be an outcome and was therefore not included.
• Mean Wage Rate. Typical wages vary between regions. Higher wages will typically be associated with higher labor costs for transit agencies. This variable was used for operations and financial applications, and was based on Bureau of Labor Statistics data for metropolitan areas.
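The dispersion calculation described above can be expressed compactly. The sketch below is illustrative only; the census-block data structure and function names are assumptions, and block areas are assumed to be in the same units used for the density calculation.

```python
def weighted_density(blocks):
    """Weighted density: each census block's population density, weighted by
    the block's share of total urban area population, summed over all blocks.
    `blocks` is a list of (population, area) tuples for one urban area."""
    total_pop = sum(pop for pop, _ in blocks)
    return sum((pop / area) * (pop / total_pop) for pop, area in blocks if area > 0)

def population_dispersion(blocks):
    """Dispersion = overall population density / weighted density.
    A value of 1 means population is spread evenly; values near 0 mean
    population is concentrated in a few blocks."""
    total_pop = sum(pop for pop, _ in blocks)
    total_area = sum(area for _, area in blocks)
    overall_density = total_pop / total_area
    return overall_density / weighted_density(blocks)

# Example: two blocks of equal area with population split evenly -> dispersion = 1.0
print(population_dispersion([(500, 1.0), (500, 1.0)]))  # 1.0
# Population concentrated mostly in one block -> dispersion < 1
print(population_dispersion([(900, 1.0), (100, 1.0)]))  # ~0.61
```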

A "likeness score" approach was used for each variable (factor) included in an application. The factor likeness scores were added together to form a (non-normalized) likeness score for an agency, with lower likeness score values indicating a greater degree of similarity. The factor likeness scores were, for the most part, calculated similarly to the scores used in the final methodology (described later). Whether or not to weight certain factors more heavily was considered at this stage, but not implemented, pending the results of more widespread methodology testing later in the project.

Factors Considered but Not Included in the Methodology

The following variables were considered but were dropped from further consideration after initial internal testing by the research team:

• Median Household Income and Percent of Households Earning Less than $35,000 (U.S. Census Bureau). Dropped because they do not reflect the size and composition of the household, unlike Percent Low-Income Population.
• Automobiles Per Capita and Percent Zero-Car Households (U.S. Census Bureau). The former was dropped because it had the lowest variation between urban areas of any of the tested measures. The latter also showed lower variation than most other measures. A lack of variation limits the ability of a variable to distinguish differences between regions or agencies. Poverty-related measures (e.g., Percent Low-Income Population) and density-related measures capture similar demographic characteristics.
• Percent of Population Less Than 18 and Percent of Population 65 or Older (U.S. Census Bureau). Dropped because of low variability between urban areas in the tests. While age may be a key consideration when making local service planning decisions, it does not provide much benefit when distinguishing between urban areas. The Agency Proximity variable can also help account for any regional differences that may exist.
• Arterial Miles Per Capita and Freeway Miles Per Capita [FHWA Highway Performance Monitoring System (HPMS)]. Dropped because of the work that would be involved each year deriving these measures from raw HPMS data to keep the database current. The Urban Mobility Report provides freeway miles data (B-4), albeit for a smaller set of urban areas.
• Parking Cost (Collier's Parking Cost Survey). Parking costs influence the decision to use transit. This measure was dropped because parking cost data are not available for most smaller urban areas. The Urban Area Population variable helps control for parking costs since larger urban areas will tend to have higher parking costs.
• Sprawl Index (Smart Growth America). More-sprawling regions are more difficult to serve with transit. SGA's Sprawl Index provides a comprehensive, national source of data based on objective research. This measure was dropped because data are only available by county, making it difficult to assess the degree of sprawl for urban areas that span multiple counties or only a small portion of a single county.
• USDA Plant Hardiness Zones and Annual Precipitation (U.S. Department of Agriculture and National Oceanic and Atmospheric Administration). These are surrogates for climate. The former is based exclusively on average annual low temperature, which masks differences in summer extremes. The latter does not account for the distribution of precipitation throughout the year. The Agency Proximity variable helps control for climatic differences, as nearby agencies are more likely to have similar climates (although it is recognized that topography also plays a role).
• Cost of Living Index (ACCRA cost-of-living index and others). Cost-of-living differences between regions can influence agency costs. This variable was dropped because data are not available for all areas and require payment of a fee to obtain and distribute. Median wage was used as a surrogate for differences in costs between regions.
• Park-and-Ride Spaces (no standard source). This variable helps describe service structure, but was dropped due to a lack of a national data source.
• Bicycle Friendly Community Rating (League of American Bicyclists). Helps describe ease of access to transit, since bicycle-friendly communities are typically also pedestrian-friendly. Dropped because ratings are only generated by request and are by jurisdiction, making them difficult to use for transit agencies serving multiple jurisdictions.

Small-Scale Testing

Description

During this stage of testing, the methodology was tested by 16 agencies: 10 transit agencies, 5 state departments of transportation (DOTs), and the Regional Transportation Authority in Chicago. The participating agencies were as follows:

• Transit agencies
  – Denver RTD (Denver, CO)
  – Utah Transit Authority (Salt Lake City, UT)
  – Santa Clara Valley Transportation Authority (San Jose, CA)
  – Lane Transit District (Eugene, OR)
  – Knoxville Area Transit (Knoxville, TN)
  – Triangle Transit Authority (Durham, NC)
  – Rochester Genesee RTA (Rochester, NY)
  – Greater Cleveland RTA (Cleveland, OH)
  – Greater Bridgeport Transit Authority (Bridgeport, CT)
  – Bay County Council on Aging (Panama City, FL)

• State DOTs
  – Florida DOT
  – Indiana DOT
  – Pennsylvania DOT
  – Texas DOT
  – Washington State DOT
• Other
  – RTA (Chicago, IL)

Each agency developed a performance measurement question or topic to be addressed, while the research team applied the methodology and presented the results to the agencies. Several feedback points with agency staff were built into the process to obtain feedback on particular steps of the peer-grouping and benchmarking methodologies and to make sure staff were comfortable with the results before continuing. These feedback points consisted of:

• Identifying a performance measurement topic of interest to the agency;
• Identifying an initial set of peers for the agency and an initial set of performance measures relating to the topic;
• Identifying a final set of peers and performance measures; and
• Discussing the performance results and the usefulness of the methodology.

Methodological Changes

As originally proposed, the FTIS software was going to be used for this round of testing. The methodology was programmed into FTIS, and a user interface was developed. This allowed more extensive testing of the initial methodology than had previously been possible. An initial research team observation was that the portion of the screening process that screened out potential peers based on modes operated, service area type, and proximity of comparably sized or larger urbanized areas did too good a job of screening and left too small a pool of potential peers.

Rather than continually update FTIS, the research team decided it would be faster and more cost-effective to implement a spreadsheet version of the methodology for the team's use during the small-scale testing, and then to update FTIS prior to beginning the large-scale tests, where agencies themselves would be applying the method. The original spreadsheet contained all of the data needed to use the peer-grouping methodology and, for each application type (operations, planning and market focus, and financial), produced summary lists of the 20 peers most similar to the target agency. There were five major versions of the spreadsheet, which was updated to add data for additional modes as the need to analyze them came up during the testing. In addition, each successive version also corrected errors (mostly in the way agencies were assigned a service area type) in the non-NTD portions of the database.

Implementing the peer-grouping methodology in a spreadsheet required some alterations to the original methodology. The main change was that all peer-grouping variables generated a likeness score instead of completely screening out agencies from further consideration. This had the positive side effect of allowing the other 643 reporters in the NTD database to be assigned a likeness score and a likeness ranking relative to a given target agency. The original intent of using certain variables to screen out agencies from further consideration was retained by assigning a high factor likeness score for differences in those variables [rail operator (yes/no) and rail-only operator (yes/no)].

Other changes to the methodology that were implemented prior to the small-scale testing, based on the FTIS testing, were:

• The variable on distance to the nearest comparably sized or larger urbanized area was dropped because testing showed it screened out too many potential peers.
• Agency proximity was weighted twice as heavily as before.
• Changes were made to the way the likeness score was calculated for the population growth rate variable.
• Agencies were defined as being rail operators if they operated more than 100,000 vehicle miles of rail service annually. This threshold was selected to distinguish between operators of vintage trolleys and downtown streetcar circulators and operators of full-scale light rail and commuter rail lines and systems.

Testing Feedback

The version of the methodology used for the small-scale applications identified four types of applications: operations, financial, planning, and market focus (the latter two applications shared the same peer-grouping variables). The performance topics selected by the participating agencies included three operations topics, three planning and market focus topics, and ten financial topics. (Some of the operations topics could also have been classified as financial topics and vice versa.)

The "financial" peer-grouping method resulted in the fewest requested changes to peers among the participating agencies. "Operations" performed reasonably well but generated more requested changes (particularly with Denver). When given a choice between the "operations" and "financial" sets of peers, Indiana DOT chose the "financial" set. The "financial" method included all of the peer-grouping variables used by the "operations" method, plus three others: vehicle miles operated, total operating budget, and mean wage rate. The "planning" method tended to identify a large number of agencies within the same region, without regard to agency size.

Service area population was added as an additional screening variable but did not help much, regardless of the weight assigned to it (often because the reported service area population reflected the urban area population). The range of average wage rates among agencies was not significant enough to cause that variable to influence the peer group selection (i.e., the list of peers might be shuffled a little, but the same peers would generally appear with or without the variable).

Two variables that were suggested as additional screening variables were the ratio of demand-responsive vehicles operated in maximum service to motorbuses operated in maximum service, and the ratio of purchased service to directly operated service (requested twice). Agencies that operate entirely demand-response service are accounted for in the methodology by the "service area type" screening variable, but not agencies that operate mostly demand-response service.

Ability to Obtain Local (non-NTD) Data

A significant peer-comparison challenge to overcome is the ability to obtain data not included in the NTD. The research team spent considerable time trying to track down such information. Particular issues include (a) the availability of staff at the target agency to find contacts at the peer agencies and request information from them, (b) the availability of staff at the peer agencies to track down the requested information, (c) the existence of the data, and (d) compatibility of measure definitions between agencies. We were, eventually, able to gather sufficient customer-satisfaction data from peers to be able to conduct a comparison. However, we were not able to gather absenteeism data or performance data specific to regional or express bus routes or to bus divisions (suburban vs. urban).

NTD Data Reliability and Detail

As expected, participants raised questions about the reliability of some of the NTD data. The most common problem that appeared was agency definitions of service area population. Some followed the FTA definition based on population within a certain distance of transit service, while others simply used the urbanized area population, regardless of whether they served the entire area. Agencies were also inconsistent from year to year in reporting population: for example, one agency reduced its service area population by 75% from 2005 to 2006, which caused obvious problems with "per capita" trends. Being able to compare performance on a per-capita basis can be very useful, but much more work appears to be needed to get agencies to report their service area population in a standard way.

Case study participants also commented about potential differences in how agencies reported passenger-mile and vehicle-malfunction data and the general lack of detail of the maintenance data. The research team noticed problems at individual agencies (sometimes only in one year, sometimes every year) with some cost categories and in breaks/allowance time.

Large-Scale Testing

Description

During the final stage of testing, the methodology was tested by 22 agencies: 19 transit agencies, 2 state DOTs, and the Regional Transportation Authority in Chicago. The participating agencies are listed below. Agencies that also participated in the small-scale test are shown with an asterisk (*).

• Central Oklahoma Transportation and Parking Authority, Oklahoma City, OK
• Greater Bridgeport Transit Authority, Bridgeport, CT (*)
• Hillsborough Area Rapid Transit, Tampa, FL
• King County Metro, Seattle, WA
• Knoxville Area Transit, Knoxville, TN (*)
• MARTA, Atlanta, GA
• Metrolink, Los Angeles, CA
• North County Transit District, Oceanside, CA
• Oahu Transit Service, Honolulu, HI
• Orange County Transportation Authority, Orange, CA
• Pennsylvania DOT, Harrisburg, PA (*)
• Pinellas Suncoast Transit Authority, St. Petersburg, FL
• Regional Transportation Authority, Chicago, IL (*)
• Regional Transportation District, Denver, CO (*)
• San Joaquin Regional Transit District, Stockton, CA
• San Mateo County Transit, San Carlos, CA
• Sarasota County Area Transit, Sarasota, FL
• SEPTA, Philadelphia, PA
• StarMetro, Tallahassee, FL
• Texas DOT, Austin, TX (*)
• Utah Transit Authority, Salt Lake City, UT (*)
• Virginia Railway Express, Alexandria, VA

Each agency applied the methodology using instructions provided to them by the research team. The instructions provided background information on the purpose of the project and described the process for applying the methodology (including detailed instructions for using FTIS). At a minimum, agencies were instructed to provide feedback on the following questions:

1. Do you feel the peer group identified for your agency was reasonable? Were peer agencies identified that you feel are inappropriate (and if so, why)? Were agencies not identified that you feel should have been (and if so, why)?
2. Do you feel that the performance results were reasonable (i.e., reflect reality, to the best of your knowledge)? Were there any observed issues with the data (i.e., missing data, illogical trends, unexplainable results) that could affect the credibility of the results?
3. How easy was it to follow the instructions in this document and apply the software?

The research team was available throughout the process to answer any questions as they arose, but did not otherwise participate in the application of the methodology during the large-scale test.

Methodological Changes

Based on the results of the small-scale tests, only a single peer-grouping method was used for the large-scale test. This method was based on the "financial" peer-grouping application used in the small-scale test, with the following changes:

• Average wage rate was eliminated as a peer-grouping variable. There was not enough variation in the wage rate between regions to make a substantial difference in the peer-grouping results. The wage data were retained in FTIS to allow agencies to manually adjust costs based on wage rate differences if they so desired.
• The percentage of service that is demand-response was added as a peer-grouping variable for agency-wide and bus-mode comparisons. This variable helps distinguish agencies that mostly operate demand-response service from those that mostly operate fixed-route service.
• The percentage of service that is purchased was added as a peer-grouping variable for all types of comparisons. Agencies that purchase their service will typically have different organization and cost structures than those that directly operate service.
• Being a heavy rail operator (yes/no) was added as a third screening variable. A mismatch between the target agency and the peer agency for this variable resulted in a likeness score of 20 being assigned for this variable, effectively eliminating the potential peer from further consideration.
• The likeness score was reported as a normalized value, calculated by dividing the sum of the peer-grouping factor scores by the number of factors used (excluding the three rail-related screening factors). Guidance was provided that, in general, a total likeness score less than 0.50 was considered a very good match, a score of 0.50–0.74 was considered satisfactory, and a score of 0.75–0.99 indicated an agency that might be usable as a peer but with caution, since there could be significant differences that need to be considered. A score of 1.00 or higher indicated that there were probably too many differences to make the agency a good peer.
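The guidance in the last bullet maps directly onto a simple classification. The short sketch below is illustrative only; the function name and labels are not part of the FTIS software.

```python
def interpret_likeness_score(score: float) -> str:
    """Classify a normalized total likeness score using the report's guidance."""
    if score < 0.50:
        return "very good match"
    elif score < 0.75:
        return "satisfactory match"
    elif score < 1.00:
        return "usable with caution (investigate differences)"
    else:
        return "probably too many differences to be a good peer"

print(interpret_likeness_score(0.42))  # very good match
```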
Testing Feedback

Most of the feedback on the peer-grouping aspect of the methodology related to a problem with implementing the methodology in the FTIS software (since corrected), where the software assigned a likeness score for factors with missing data (indicating a very close match) rather than the intended value of 1,000 (to effectively drop the potential peer from further consideration). This problem resulted in inappropriate (mostly small) peers appearing in agencies' lists of potential peers. At least nine test applications were affected by this issue. After fixing the issue, the lists of potential peers generally seemed appropriate.

As discussed previously in the Peer-Grouping Philosophy section, two large bus-only operators felt that the potential peers identified by the TCRP Project G-11 methodology did not match the ones they had used in previous efforts and were comfortable with. In both cases, the TCRP Project G-11 methodology identified a mix of smaller suburban operators from the same region and/or state, plus some larger national bus-only peers. In comparison, the peer groups these agencies had used previously did not include suburban operators and did include national peers that operated light rail systems. A third large bus-only operator only uses bus operators in its peer group and was comfortable with the results once the FTIS missing-data issue described above was addressed. These concerns were addressed by providing guidance on how to work with the peer-grouping data exported by FTIS to include rail operators as potential peers.

In some cases, agencies felt that a potential peer was inappropriate because one particular factor (e.g., urban area population, agency budget) was too big or too small relative to the target agency. As described above, prior versions of the methodology considered setting absolute cutoffs (e.g., population within ±25% of the target agency), but these often reduced the pool of potential peers too much. Instead, agencies that are substantially different in one characteristic need to be quite similar in a number of other respects in order to end up with a total likeness score low enough to be considered as a potential peer. The concerns expressed by these agencies were addressed by providing guidance that agencies should identify limits for factors of concern to them prior to conducting the peer grouping and should then apply those criteria as part of a secondary screening process.

Other changes to the guidance related to the interpretation of likeness scores and how to address special cases. One special case involved a transit operator in Hawaii, where some additional spreadsheet work was needed to adjust the likeness scores to account for the long distances between the target agency and any potential peer.

Final Methodological Changes

The following changes were made to the methodology to address the feedback from the large-scale test:

• Distance was removed as a peer-grouping factor for mode-specific comparisons involving rail service. Rail operators tend to be widely spread apart outside the Northeast, and there is little expectation that peers will be located nearby. Removing distance as a factor for these comparisons allows the general guidance on interpreting likeness scores to be applied more consistently.
• The weights applied to different combinations of service types were adjusted in order to make it less likely that suburban operators would be matched as peers to central-city operators.
• More weight was applied to differences in service types between operators within the same urban area, in order to compensate for the fact that these operators will be alike on all of the factors that are based on urban area characteristics. This change was designed to make it less likely that suburban operators would be matched as peers to central-city operators within the same urban area.
• The definition of a rail operator was adjusted to count only those operating more than 150,000 vehicle miles annually, since a downtown streetcar operator approached 100,000 vehicle miles in the 2007 NTD.

Likeness Score Calculation

Total Likeness Score

The heart of the peer-grouping methodology is the calculation of a total likeness score that indicates the degree of similarity between a target agency and a potential peer, based on a variety of factors that account for many of the differences between agencies and regions that can impact performance results. A score of 0 indicates a perfect match between two agencies (and is unlikely to ever occur). Higher scores indicate greater levels of dissimilarity between two agencies. In general, a total likeness score under 0.50 indicates a good match, a score between 0.50 and 0.74 represents a satisfactory match, and a score between 0.75 and 0.99 represents a potential peer that may be usable but for which care should be taken to investigate potential differences that may make it unsuitable. In some cases, peers with scores of 1.00 or higher may also be usable (with even greater caution) or, in a few cases, may be the only candidates available.

A total likeness score of about 70 or higher may indicate that a potential peer had missing data for one of the peer-grouping factors. In some cases, suitable peers may be found in this group by manually re-calculating the total likeness score in a spreadsheet, removing that factor from consideration if the user determines that the factor is not essential for the performance questions being asked. Missing congestion-related factors, for example, may be more easily ignored than a missing total operating budget.

The total likeness score is calculated as follows:

Total likeness score = (Sum of screening factor scores) + (Sum of peer-grouping factor scores) / (Count of peer-grouping factors)
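Expressed as code, the calculation looks roughly like the following. This is a minimal sketch, not the FTIS implementation; the function name and data structures are assumptions, and the treatment of missing data (a factor score of 1,000) follows the description in the Peer-Grouping Factors section below.

```python
MISSING_DATA_SCORE = 1000  # assigned when the potential peer lacks data for a factor

def total_likeness_score(screening_scores, factor_scores):
    """Total likeness score = sum of screening factor scores
    + (sum of peer-grouping factor scores) / (count of peer-grouping factors).

    `screening_scores`: 0-or-20 values for the three rail-related screens.
    `factor_scores`: peer-grouping factor scores (up to 14), with any factors
    the target agency itself lacks already dropped from the list."""
    if not factor_scores:
        raise ValueError("at least one peer-grouping factor is required")
    return sum(screening_scores) + sum(factor_scores) / len(factor_scores)

# Example: all three screens match (0, 0, 0) and 14 factor scores averaging 0.4.
scores = [0.2, 0.5, 0.1, 0.9, 0.3, 0.4, 0.6, 0.2, 0.5, 0.3, 0.4, 0.7, 0.2, 0.3]
print(round(total_likeness_score([0, 0, 0], scores), 2))  # 0.4 -> very good match

# A single missing-data factor (scored 1,000) pushes the total to roughly 71 or
# more, which is the "about 70 or higher" flag described above.
print(round(total_likeness_score([0, 0, 0], scores[:13] + [MISSING_DATA_SCORE]), 1))  # ~71.8
```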
Screening Factors

Three screening factors are used in the process. These are used to distinguish bus-only operators from rail operators, and types of rail operators from each other.

• Rail operator (yes/no). A rail operator is defined as one operating 150,000 or more rail vehicle miles annually. A match on this factor produces a likeness score of 0; a mismatch produces a likeness score of 20. This factor is derived from the NTD.
• Rail-only operator (yes/no). A rail-only operator operates rail and has no bus service. A match on this factor produces a likeness score of 0; a mismatch produces a likeness score of 20. This factor is derived from the NTD.
• Heavy-rail operator (yes/no). A heavy-rail operator operates the heavy rail mode. A match on this factor produces a likeness score of 0; a mismatch produces a likeness score of 20. This factor is derived from the NTD.

Peer-Grouping Factors

Up to 14 peer-grouping factors are used in the process, depending on the type of analysis (rail-specific vs. bus-specific or agency-wide) and the target agency's urban area size (which determines whether the two Urban Mobility Report factors are included).

Most factor likeness scores are determined from the percentage difference between a potential peer's value for the factor and the target agency's value. A score of 0 indicates that the peer and target agency values are exactly alike, while a score of 1 indicates that one agency's value is twice the amount of the other. For example, if the target agency was in a region with an urbanized area population of 100,000, while the population of a potential peer agency's region was 150,000, the likeness score would be 0.5 because one population is 50% higher than the other.
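One way to read the percent-difference rule, consistent with the examples in the text (100,000 vs. 150,000 gives 0.5, and a value twice the other gives 1), is the absolute difference divided by the smaller of the two values. The sketch below is an interpretation under that assumption; the function name is illustrative.

```python
def percent_difference_score(target_value: float, peer_value: float) -> float:
    """Likeness score for percent-difference factors: the absolute difference
    divided by the smaller of the two values. 0 = identical values;
    1 = one value is twice the other."""
    smaller = min(target_value, peer_value)
    if smaller <= 0:
        return float("inf")  # undefined for zero or negative values in this sketch
    return abs(target_value - peer_value) / smaller

print(percent_difference_score(100_000, 150_000))  # 0.5
print(percent_difference_score(100_000, 200_000))  # 1.0
```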

For the factors that cannot be compared by percentage difference (e.g., state capital or agency proximity), the factor likeness scores are based on formulas that are designed to produce similar types of results: a score of 0 indicates identical characteristics, a score of 1 indicates a difference, and a score of 2 or more indicates a substantial difference. For example, if one agency serves a state capital and the other does not, the likeness score for the state capital factor would be 1, while if both served or both did not serve state capitals, the likeness score would be 0. The exact calculation process is provided within the description of the peer-grouping factors below for those factors that do not use percent difference as the method for determining likeness.

Not all agencies have a complete set of values for their peer-grouping factors. Typically this occurs when a value was not reported to the NTD for vehicle miles operated or annual operating budget, but it can also occur for some mid-sized agencies in urban areas that lack Urban Mobility Report congestion data. In cases where the target agency has data for a peer-grouping factor and a potential peer does not, the potential peer is assigned a factor likeness score of 1,000 for that factor. (The high score is used to help identify agencies with missing data when reviewing total likeness scores.) If the target agency is missing data for a peer-grouping factor, then that factor is simply dropped from consideration.

The peer-grouping factors are as follows:

• Urban Area Population. Likeness scores are determined by the percent-difference method. Data come from the U.S. Census Bureau's American Community Survey.
• Total Annual Vehicle Miles Operated. Likeness scores are determined by the percent-difference method. Data come from the NTD.
• Annual Operating Budget. Likeness scores are determined by the percent-difference method. Data come from the NTD.
• Population Density. Likeness scores are determined by the percent-difference method. Data are derived from the U.S. Census Bureau's American Community Survey, dividing urban area population by urban area size in square miles.
• Service Area Type. Likeness scores are determined from the matrix shown in Table B1. Transit agencies were assigned one of eight service types by the research team, as shown below the table, depending on the characteristics of their service (e.g., entire urban area vs. central city only). The likeness score is multiplied by 3 if the peer agency and target agency are based in the same urban area (to compensate for the fact that the two agencies will be identical on all of the factors based on urban area characteristics).
• State Capital (yes/no). If both agencies match on this factor (i.e., both serve or both do not serve a state capital), a likeness score of 0 is assigned; otherwise, a value of 1 is assigned.
• Percent College Students. Likeness scores are determined by the percent-difference method. Data come from the U.S. Census Bureau's American Community Survey.
• Population Growth Rate. The likeness score is calculated by dividing the difference between the target and peer agencies' urban area population growth rates by 5. For example, if one agency has a +3% growth rate and the other has a +1% growth rate, the likeness score would be (3 − 1)/5 = 0.4. The growth rate is based on the urban area's 2000 population (from the decennial census) and the current population (based on the U.S. Census Bureau's American Community Survey).
• Percent Low-Income Population. Likeness scores are determined by the percent-difference method. Data come from the U.S. Census Bureau's American Community Survey.
• Annual Delay (Hours) per Traveler. Likeness scores are determined by the percent-difference method. Data come from the Urban Mobility Report (B-4). This factor is only used for target agencies in urban areas with populations of 1 million or more.
• Freeway Lane Miles (Thousands) Per Capita. Likeness scores are determined by the percent-difference method. Data come from the Urban Mobility Report. This factor is only used for target agencies in urban areas with populations of 1 million or more.
• Percent Service Demand-Responsive. Likeness scores are determined by multiplying the difference between the two agencies' percentages (expressed as decimals) by 2. Data are derived from the NTD. This factor is only used for agency-wide and bus-mode comparisons.
• Percent Service Purchased. Likeness scores are determined by multiplying the difference between the two agencies' percentages (expressed as decimals) by 2. Data are derived from the NTD.
• Distance. Likeness scores are calculated as the distance between the two agencies' urban areas (in miles), divided by 500. The urban area centroid is derived from U.S. Census Bureau data.

Table B1. Likeness scores by service type combination.

Peer Agency        Target Agency Service Type
Service Type     1    2    3    4    5    6    7    8
     1           0   10   10   10   10    3    5  100
     2          10    0    3    4    4    5    2  100
     3          10    3    0    2    5    5    1  100
     4          10    4    2    0    4    5    3  100
     5          10    4    5    4    0    2    3  100
     6           3    5    5    5    2    0    5  100
     7           5    2    1    3    3    5    0  100
     8         100  100  100  100  100  100  100    0

Service types are defined as follows:

1. Agency provides service only to non-urbanized areas.
2. Agency provides service to multiple urban areas (may also include non-urban areas) and is the primary service provider within at least one urban area central city.
3. Only agency operating within an urban area and has no non-urban service.
4. Agency is the primary service provider in the urban area's central city, where other agencies also provide service to portions of the urban area. Urban areas with multiple central cities (e.g., Tampa–St. Petersburg) may have more than one type 4 agency.
5. Agency provides service into an urban area's central city, but its primary service area does not include a central city.
6. Agency provides service within an urban area but does not provide service to a central city.
7. Only agency operating within an urban area and also provides non-urban service.
8. Other (e.g., special needs transportation service only, ferry-only, monorail-only, agency in Puerto Rico, agency provides funds to another NTD reporter that operates the service).
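For the factors that do not use the percent-difference method, the rules above can be summarized in code. This is an illustrative sketch of the calculations as described in this appendix; the matrix is Table B1, and the function names and argument conventions (growth rates in percentage points, service shares as decimals) are assumptions.

```python
# Table B1: likeness scores by service type combination (peer agency row, target agency column).
SERVICE_TYPE_MATRIX = [
    [0, 10, 10, 10, 10, 3, 5, 100],
    [10, 0, 3, 4, 4, 5, 2, 100],
    [10, 3, 0, 2, 5, 5, 1, 100],
    [10, 4, 2, 0, 4, 5, 3, 100],
    [10, 4, 5, 4, 0, 2, 3, 100],
    [3, 5, 5, 5, 2, 0, 5, 100],
    [5, 2, 1, 3, 3, 5, 0, 100],
    [100, 100, 100, 100, 100, 100, 100, 0],
]

def service_type_score(target_type, peer_type, same_urban_area):
    """Table B1 lookup; tripled when both agencies are based in the same urban area."""
    score = SERVICE_TYPE_MATRIX[peer_type - 1][target_type - 1]
    return score * 3 if same_urban_area else score

def state_capital_score(target_is_capital, peer_is_capital):
    """0 if both (or neither) serve a state capital, 1 otherwise."""
    return 0 if target_is_capital == peer_is_capital else 1

def growth_rate_score(target_rate_pct, peer_rate_pct):
    """Difference in urban area growth rates (percentage points) divided by 5."""
    return abs(target_rate_pct - peer_rate_pct) / 5

def percent_service_score(target_share, peer_share):
    """Used for percent demand-responsive and percent purchased service:
    difference between the two shares (expressed as decimals), multiplied by 2."""
    return abs(target_share - peer_share) * 2

def distance_score(distance_miles):
    """Distance between the two urban area centroids, in miles, divided by 500."""
    return distance_miles / 500

# Example from the text: +3% vs. +1% growth -> 0.4.
print(growth_rate_score(3, 1))   # 0.4
# Illustrative: urban areas 250 miles apart -> 0.5.
print(distance_score(250))       # 0.5
```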

References

B-1. Hartgen, David T., and Mark W. Horner. Transportation Publication Report 163: Comparative Performance of Major U.S. Bus Transit Systems: 1988–1995. University of North Carolina at Charlotte, May 1997.
B-2. Perk, Victoria, and Nilgün Kamp. Benchmark Rankings for Transit Systems in the United States. National Center for Transit Research at the Center for Urban Transportation Research, University of South Florida, Tampa, Fla., December 2004.
B-3. Kittelson & Associates, Inc.; KFH Group, Inc.; Parsons Brinckerhoff Quade & Douglass, Inc.; and Katherine Hunter-Zaworski. TCRP Report 100: Transit Capacity and Quality of Service Manual, 2nd ed. Transportation Research Board of the National Academies, Washington, D.C., 2003.
B-4. Schrank, David, and Tim Lomax. 2007 Urban Mobility Report. Texas Transportation Institute, Texas A&M University System, College Station, Tex., September 2007.
