SUMMARY
Using Archived AVL-APC Data to Improve Transit Performance and Management
Automatic vehicle location (AVL) and automatic passenger counter (APC) systems are capable of gathering an enormous quantity and variety of operational, spatial, and temporal data that, if captured, archived, and analyzed properly, holds substantial promise for improving transit performance by supporting improved management practices in areas such as service planning, scheduling, and service quality monitoring. Historically, however, such data has not been used to its full potential. Many AVL systems, designed primarily for real-time applications, fail to capture and/or archive data items that would be valuable for off-line analysis. And where good quality data is captured, new analysis tools are needed that take advantage of this resource. Recent technological advances have created new opportunities for improving the quantity, variety, and quality of data captured and for analyzing it in meaningful ways.
The objective of this research was to develop guidance for the effective collection, archiving, and use of AVL-APC data to improve the performance and management of transit systems.
This project yielded three types of products: a survey of practice, guidance on AVL-APC systems and data analysis, and prototype tools for analysis of archived AVL-APC data. The state of the practice in AVL-APC data capture and analysis was ascertained by means of literature review, widespread telephone interviews, intensive case studies of nine transit agencies in three countries, and a workshop for suppliers. The case studies (published as TCRP Web Document 23) are from five transit agencies in the United States (Seattle [WA], Portland [OR], Chicago [IL], New Jersey, and Minneapolis [MN]); two agencies in Canada (Ottawa, Montreal); and two agencies in the Netherlands (The Hague and Eindhoven).
This report offers guidance on five subjects:
· Analyses that can use AVL-APC data to improve management and performance
· AVL-APC system design that facilitates the capture of data with the accuracy and detail needed for off-line data analysis
· Data structures and analysis software for facilitating analysis of AVL-APC data
· Screening, parsing, and balancing automatic passenger counts
· Use of APC systems for estimating passenger-miles for National Transit Database (NTD) reporting
The tools developed for analyzing AVL and APC data are described in this report; in addition, provision has been made for their distribution. Prototype analyses of passenger waiting time (using AVL data) and passenger crowding (using APC data), developed on a spreadsheet platform, are available from the project description web page for TCRP Project H-28 on the TRB website (www.trb.org). Tools for analyzing running times and designing scheduled running times were created as an extension of the already existing software TriTAPT (Trip Time Analysis in Public Transport), a product of the Delft University of Technology. TriTAPT was also used to demonstrate one of the advanced data structures recommended, that of a "virtual route" consisting of multiple overlapping routes serving the same street. Under the terms of this project, TriTAPT is available without license fee to U.S. and Canadian transit agencies for 4 years.
AVL and APC System Design
For their primary use, AVL systems include a reliable means of location and APC systems include sensors and algorithms that count passengers entering and exiting. However, archived data analysis requires data beyond what is needed for the primary purpose of AVL system design, real-time monitoring.
Most important for archived data analysis is the ability to match raw location data to a base map and schedule. Inability to match data is the primary cause for rejecting data from database archives; data recovery rates as poor as 25% to 75% have been reported for APCs, although rates are generally far better for AVL. Success in matching depends to a large extent on the data captured. If the AVL system is integrated with a radio, operator sign-in including route-run number can be captured, which aids matching. Systems that detect door openings and correlate them with stops have an advantage over those without door sensors, because stops are natural match points.
The most difficult part of matching is correctly identifying the end of the line, because vehicle behavior at the end of a line is less predictable. Better matching algorithms are needed especially for end-of-line detection. With detailed information about vehicle movements and door openings in the terminal area, algorithms can better determine when a bus arrives at the end of a line and when a bus begins a trip.
AVL systems produce three kinds of frequent records: polling records, stop records, and timepoint records. Polling records indicate a bus's location when queried by a central computer doing round-robin polling.
Polling is intended mainly for real-time knowledge of vehicle location; its records can be called "location-at-time" data as opposed to stop and timepoint records, which are "time-at-location" data. Most analyses of AVL-APC data need time at specific locations, for example, to analyze running time or schedule adherence. While estimation of time-at-location is possible by interpolation from polling data, it has so far proven impractical. Polling data's only value for off-line analysis is for incident investigation, using "playback."
Stop and timepoint records include the time at which the bus departed and/or arrived at specified points. Different system designs define the time being recorded differently; for example, arrival time may be when the bus enters within a zone of a 10-m radius about a stop or when the first door opens (or the bus passes the stop if the door is not opened). Door switch and odometer connections enable more precise sensing of arrival and departure times, improving data accuracy. Because some analyses use arrival time while others use departure time, recording both increases the usefulness of the data archive.
Stop records offer greater geographic detail than timepoint records. On buses with APC, records are always made at the stop level; with AVL, timepoint records are more common. However, there are advantages to making stop records even when there are no passenger counts. The advantages include the ability to detect holding (i.e., a bus remaining at a stop with doors closed or with doors open but for an unusually long time, as when a bus is ahead of schedule); geographic detail for analyzing delays; and the ability to make stop-level schedules, whether for publishing, for computer-based trip planners, or for next bus arrival prediction systems.
AVL-APC systems also make types of records other than these basic, frequent records.
Radio-based systems create event records, sometimes for more than 100 event types, all stamped with time and location. They can include events that are generated automatically (e.g., engine turned on or off) and events that are operator initiated and sent by data radio (e.g., pass-up, railroad crossing delay). Event records useful for data analysis include pass-ups, various types of delay (e.g., drawbridge, railroad crossing), indications of special user types (e.g., wheelchair lift users, bicycle rack users), and events that help with matching the trip. If the on-board computer monitors bus speed and heading, records can be written for noteworthy changes (e.g., when speed crosses a crawl speed threshold) or for the maximum speed achieved between stops. AVL systems can also be configured to make records very frequently (e.g., every 2 s). That kind of data is helpful for mapping trajectories and for capturing speed-related details.
AVL systems have two options for data recording: (1) via an on-board computer that is uploaded nightly, usually using an automated, high-speed link, or (2) via real-time radio transmission, with records stored in the receiving computer. Because data radio has limited transmission capacity, this mode limits the frequency (and to a lesser extent, length) of records that can be made. On-board storage, in contrast, presents no practical capacity limitation and is therefore inherently better suited to data collection.
On-board devices that can be integrated with an AVL system include APCs, radio control head, odometer (transmission), gyroscope, door sensors, wheelchair lift sensor, farebox, and stop annunciator. In general, the more devices included, the richer the data stream, which can aid both in matching and in offering new kinds of information. Integration with the radio control head enables the system to record operator-initiated messages as events and to capture valuable sign-in information. Integration with the odometer provides a backup to the geographic positioning system (GPS) and information on speed and can be used to detect when a bus starts and stops moving, which aids in determining arrival and departure times and delay between stops. Door sensors are valuable for detecting arrival and departure times at stops, as well as for matching.
Integration with a newer farebox or other fare collection system that makes transaction records offers the possibility of getting location-stamped farebox records, which can be a valuable source of ridership data, especially where APCs are lacking.
Integration with a stop announcement system does not add any new data to the system, but because stop announcements demand careful matching, it helps ensure that the location and matching system has high reliability. Likewise, integration with real-time radio offers the possibility of detecting and correcting, with operator or dispatcher assistance, errors such as invalid sign-in information (e.g., a run number that does not agree with where a bus is found to be operating).
Analyses Using Archived AVL-APC Data
A large number of uses for archived AVL-APC data were identified, and their various needs for record type and data detail were analyzed.
Targeted investigations apply to passenger complaints, legal claims, and payroll disputes, among others. They require only the ability to, in effect, play back a bus's trajectory and, therefore, can be done with polling data. The greater the detail of the data stream, including the frequency of records, the more its potential uses for targeted investigations.
One of the richest application areas for archived AVL data is in running time analysis, including designing scheduled running times and monitoring schedule adherence. Traditional scheduling methods, created in an era of expensive manual data collection, are based on mean observed running times, which are estimated from small sample sizes. Now, however, AVL data offers the possibility of using extreme values such as 85- and 95-percentile running times as a basis for scheduling. Extreme values are important to passengers, who care less about mean schedule deviation than about avoiding extreme deviations such as buses that are early or very late.
Extreme values are also important to planning, because a goal of cycle time selection (sum of scheduled running time and scheduled layover) is to limit the probability that a bus finishes one trip so late that its next trip starts late. For transit agencies that use holding to prevent early departures, data analysis can try to identify holding time, allowing the analysis of net running time for making schedules. Most running time analyses can be performed with timepoint-level data, but there are advantages to using stop records, as mentioned earlier.
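The percentile-based approach to running times described above can be illustrated with a minimal Python sketch. It is not taken from the project's tools; the function names, the linear-interpolation percentile rule, and the assumption that every trip departs on time are illustrative choices only.

```python
import math


def percentile(values, p):
    """Return the p-th percentile (0-100) of values, interpolating
    linearly between neighboring order statistics."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def running_time_summary(running_times_min):
    """Mean plus the 85- and 95-percentile running times, in minutes."""
    return {
        "mean": sum(running_times_min) / len(running_times_min),
        "p85": percentile(running_times_min, 85),
        "p95": percentile(running_times_min, 95),
    }


def late_start_share(running_times_min, cycle_time_min):
    """Share of observed trips whose running time exceeds the cycle time
    (scheduled running time plus scheduled layover). Assumes, for
    simplicity, that every trip departed on time."""
    late = sum(1 for t in running_times_min if t > cycle_time_min)
    return late / len(running_times_min)
```

With a large archive of observed running times for one route, direction, and period, `running_time_summary` gives candidate values for the scheduled running time, and `late_start_share` indicates how often a proposed cycle time would be exceeded.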
As transit agencies take a more active role in improving operating speed and protecting their routes from congestion, analyses of speed and delay along a route become valuable tools. Therefore, stop-level detail is important for determining where delays are occurring and monitoring the effects of local changes to traffic conditions.
AVL data also can be applied to schedule adherence, headway regularity, and passenger waiting time. In this arena, too, extreme values are at least as important as mean values. Also, because extreme values can only be estimated reliably from a large sample size, the availability of archived AVL data offers new opportunities for analysis. In particular, transit agencies are looking for measures of service quality that reflect passengers' experience and viewpoint. Tools developed by this project for measuring waiting time, service reliability impacts, and crowding demonstrate this possibility.
APC data lends itself to a variety of passenger demand analyses including determining load profiles and using demand rates to set headways and departure times. While traditional analyses rely on mean values, APC data also offers the possibility of focusing on extreme values of crowding, as well as relatively rare events such as wheelchair lift and bicycle rack use.
AVL data containing stop records can be used to verify and update base maps. By analyzing where buses actually stop, one can update stop locations. Some AVL systems deliberately include a "learning mode" in which bus location is recorded frequently enough to give the bus's path where route information is needed for the base map, such as through a shopping center or a new subdivision.
Exploring archived AVL-APC data can enable transit agencies to find hidden trends that help explain irregularities in operations and suggest new avenues for improvement.
As an example, one agency found that a surprising amount of schedule deviation could be explained by the operator--that is, some operators consistently depart late or run slow--which suggests the need for improved methods of operator training and supervision.
An example of an advanced analysis using a highly detailed AVL data stream is calculating and monitoring measures of ride smoothness. A smooth ride is certainly important to passengers, and AVL data with either very frequent observations or accelerometers can allow ride smoothness to be measured objectively.
For any of the detailed analyses mentioned, there is also interest in higher level analysis involving tracking trends over time, comparing routes or periods of time, and so forth. Another example is geographic information system (GIS)-based analyses that take as input passenger use and service quality statistics.
Prototype Analysis Tools: Waiting Time and Crowding
Analysis tools were developed for passenger waiting time on short- and long-headway services, for crowding, and for designing scheduled running time.
The tools developed for waiting time analysis use some newly proposed measures that reflect the amount of time passengers budget for waiting, which is particularly sensitive to service reliability. These proposed measures are based on extreme values of the headway and schedule deviation distribution, which can only be estimated using the large sample sizes that AVL datasets provide.
For short-headway service, for which passengers can be assumed to arrive independent of the schedule, the distribution of passenger waiting time can be determined from headway data. The tools developed include graphing the distribution of waiting time, determining mean and 95-percentile waiting time, and determining the percentage of passengers whose waiting time falls into various user-defined ranges.
The latter is useful for supporting a service quality standard such as "no more than 5% of passengers should have to wait more than 2 minutes longer than the scheduled headway."
For short-headway service, "budgeted waiting time" is taken to be the 95-percentile waiting time. The difference between budgeted waiting time and the mean time passengers actually spend waiting is called "potential waiting time." While potential waiting time is not spent waiting on the platform, it still involves a real cost to passengers. "Equivalent waiting time" is a proposed measure of passenger waiting cost, being a weighted sum of platform waiting time (with proposed weight = 1) and potential waiting time (with proposed weight = 0.5). Example analyses and reports illustrate the concept and show how improved headway regularity reduces passenger waiting cost.
Waiting time and its components also can be divided between a part that is "ideal" (i.e., what it would be if service exactly followed the schedule) and the remainder ("excess" waiting time). This division separates the effects of planning and operations on passenger waiting.
For long-headway service, excess platform waiting time is defined as the difference between the mean and the 2-percentile departure deviation, based on the idea that experienced passengers will arrive early enough to limit to 2% or less their probability of missing the bus. Potential waiting time is the difference between the 95-percentile and mean departure deviation, and excess equivalent waiting time is a weighted sum of these components. Example analyses and reports show how these waiting time measures are sensitive to improvements in service reliability (in this case, on-time performance).
Traditional measures of crowding are mean maximum load and the percentage of trips in various crowding ranges. Neither reflects well the impact of crowding on passengers. With APC data, it is possible to determine the number of passengers experiencing various levels of crowding ranging from "seated next to an unoccupied seat" to "standing at an unacceptable level of crowding."
The analysis tools used for waiting time and crowding are included on a spreadsheet platform on the project description web page for TCRP Project H-28 on the TRB website (www.trb.org).
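The short-headway waiting time measures defined above can be sketched in Python as follows. This is an illustration, not the project's spreadsheet tool: it assumes passengers arrive uniformly in time and board the first bus, in which case the share of passengers waiting no more than w is the sum of min(h, w) over all observed headways h, divided by the sum of the headways. The function names are hypothetical; the weights 1 and 0.5 are the proposed weights quoted in the text.

```python
def mean_waiting_time(headways):
    """Mean platform wait for uniformly arriving passengers:
    sum(h^2) / (2 * sum(h))."""
    return sum(h * h for h in headways) / (2.0 * sum(headways))


def waiting_time_percentile(headways, p):
    """Waiting time w at which a share p/100 of passengers has boarded,
    i.e. sum(min(h, w)) = (p / 100) * sum(h)."""
    ordered = sorted(headways)
    n = len(ordered)
    target = p / 100.0 * sum(ordered)
    prefix = 0.0  # total of the headways shorter than the current candidate
    for k, h in enumerate(ordered):
        # For w <= h, the k shorter headways count in full and the
        # remaining n - k headways each contribute w.
        w = (target - prefix) / (n - k)
        if w <= h:
            return w
        prefix += h
    return ordered[-1]


def equivalent_waiting_time(headways, platform_weight=1.0, potential_weight=0.5):
    """Weighted sum of mean platform waiting time and potential waiting time."""
    platform = mean_waiting_time(headways)
    budgeted = waiting_time_percentile(headways, 95)  # 95-percentile wait
    potential = budgeted - platform
    return platform_weight * platform + potential_weight * potential
```

For example, perfectly regular 10-minute headways give a mean wait of 5 minutes and an equivalent waiting time of 7.25 minutes, while alternating headways of 5 and 15 minutes (the same average frequency) give 6.25 and 10.125 minutes, which shows how irregularity raises passenger waiting cost under these assumptions.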
Prototype Analysis Tools: Scheduled Running Time
Analysis tools were also developed for running time. One is an interactive tool for determining scheduled running times (allowed times) across the day, including selecting boundaries between periods of homogeneous running times. This tool includes an automated portion, in which boundaries and allowed times are selected based on user-supplied criteria regarding feasibility (probability that trips can be completed in their allowed time) and tolerance. On a graphical interface, users can modify both period boundaries and allowed times by simple drag-and-drop; the program will respond immediately with the feasibility of the proposed changes.
The second main tool is for allocating running time along the route, thereby determining running time on each segment. It uses the Passing Moments method of maintaining a given probability of completing a trip on time, in order to give operators an incentive to hold when they are ahead of schedule, thus improving schedule adherence.
The analysis tools for running time are part of the TriTAPT package, which is being distributed without license fee to U.S. and Canadian transit agencies as part of this project.
Processing and Using Automatic Passenger Counts
Accuracy of passenger counts is always a concern with automated systems. Analysis of the various dimensions of accuracy confirmed theoretical findings with published findings of APC data accuracy. The report shows that systematic under- or overcounting is a more serious problem than random errors, because of the large sample size afforded by APCs.
The accuracy of load and passenger-miles measures can be substantially worse than the accuracy of on and off counts because of the way load calculations allow errors to accumulate.
Getting accurate load and passenger-miles estimates from automatic passenger counts demands not only relatively accurate counts, but also good methods of parsing the data into trips and balancing on-off discrepancies. Therefore, checking the accuracy of APC-measured load against manual counts is a good system test.
To prevent drift in load estimates, a day's data stream of automated counts has to be parsed at points of known load. Those points are usually layover points; therefore, parsing usually means dividing the data into trips. This need requires the end of the line to be clearly defined for APC systems. Failing to parse data at the trip level can bias load and passenger-miles estimates upwards because downward errors cause negative loads, which are routinely corrected when any trip's data is processed, while upward errors are not readily apparent except at the block (vehicle assignment) level.
Where load does not necessarily become zero at the end of a trip, whether due to a route ending in a loop or to through-routing, data structures must account for passengers inherited from a previous trip, either by adding dummy stop records or using a field in a trip header record.
Algorithms are presented for parsing data for trips that end with short loops, making the assumptions that (1) nobody rides all the way around the loop and (2) nobody's trip lies entirely within the loop. Within the loop, passengers alighting are attributed to the trip entering the loop, and passengers boarding are attributed to the trip leaving the loop. With these assumptions, ons and offs can be balanced just as if load were known to be zero at the end of the trip, even if actual load on the bus never goes to zero. If a stop within the loop is designated as the route endpoint, inherited passengers can be determined at that point.
An algorithm for balancing ons and offs at the trip level is presented. It includes calculation of the most likely value of total ons and offs, proportional corrections to stop-level counts, and rounding to integer values. It accounts for inherited passengers and checks not only against negative departing load, but also against negative through load, which is often a tighter condition. It also makes a first attempt to account for operator movements off and on the bus.
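The general shape of trip-level balancing can be sketched in Python as below. This is a simplified illustration and not the algorithm presented in the report: it substitutes a plain average of total ons and total offs for the "most likely value," rounds by the largest-remainder method, only detects (rather than repairs) negative through load, and ignores operator movements. All function names are hypothetical.

```python
def round_preserving_total(values, total):
    """Round non-negative floats to integers that sum to `total`,
    giving leftover units to the largest fractional parts."""
    floors = [int(v) for v in values]
    shortfall = total - sum(floors)
    order = sorted(range(len(values)),
                   key=lambda i: values[i] - floors[i], reverse=True)
    for i in order[:shortfall]:
        floors[i] += 1
    return floors


def scale_counts(counts, new_total):
    """Apply a proportional correction so the counts sum to new_total."""
    old_total = sum(counts)
    if old_total == 0:
        if new_total != 0:
            raise ValueError("cannot distribute passengers over zero counts")
        return list(counts)
    scaled = [c * new_total / old_total for c in counts]
    return round_preserving_total(scaled, new_total)


def balance_trip(ons, offs, inherited=0):
    """Return (balanced_ons, balanced_offs, departing_loads) for one trip,
    assuming the load is zero after the last stop."""
    total_in = inherited + sum(ons)
    total_out = sum(offs)
    # Assumption: a simple average stands in for the most likely total.
    target = max(round((total_in + total_out) / 2), inherited)
    new_ons = scale_counts(ons, target - inherited)
    new_offs = scale_counts(offs, target)
    loads, load = [], inherited
    for on, off in zip(new_ons, new_offs):
        through = load - off  # passengers who stay on board at this stop
        if through < 0:
            raise ValueError("negative through load; counts need review")
        load = through + on   # non-negative whenever through load is
        loads.append(load)
    return new_ons, new_offs, loads
```

For a trip with raw ons of [10, 5, 5, 0] and raw offs of [0, 4, 6, 12], the sketch settles on a total of 21 and returns ons [11, 5, 5, 0], offs [0, 4, 6, 11], and departing loads [11, 12, 11, 0].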
Because APCs are normally deployed on only a percentage of the fleet that is rotated around the system, sample size can vary substantially between routes and trips. Data analysis using APC data should account for varying sample size, weighting not each observation, but each operated trip, equally.
Most uses of passenger counts require only moderate sample sizes, which can be met with the typical 10% to 15% fleet penetration. The only exception is monitoring crowding levels on crowded routes; because of the importance of estimating extreme values of crowding, that application requires a large sample size.
For NTD reporting, estimates of passenger-miles made from APC data can easily meet the specified accuracy level, provided systematic under- or overcounts are limited. The report shows the relationship between required sample size and bias and shows that, for all but the smallest transit systems, NTD accuracy can be obtained even if only a small percentage of the fleet is instrumented with APCs.
Data Structures That Facilitate Analysis
Many analyses are tied to a route pattern, that is, a sequence of stops. Examples are load profiles, headway irregularity, and running time. For such analyses, the basic unit is the stop or timepoint record. Data structures are needed to indicate the sequence of stops and the schedule.
When the data stream includes information on events that happened on segments between stops or timepoints, that information can be used in a pattern-based analysis only if it can be associated with a stop or timepoint. One approach is to include a summary of segment information (e.g., total delay on the segment or maximum speed on the segment) in fields in the stop or timepoint records. An alternative and more flexible approach is to include in each event record a field for the stop with which it can be associated.
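The second approach, in which each event record carries a field naming its associated stop, might look like the following Python sketch. The record layouts and field names (`trip_id`, `stop_id`, `duration_s`, and so on) are hypothetical and chosen only for illustration; real AVL-APC archives will differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopRecord:
    """One stop-level observation (time-at-location data)."""
    trip_id: str
    stop_id: str
    arrival_s: int      # seconds after midnight
    departure_s: int
    ons: int = 0
    offs: int = 0


@dataclass
class EventRecord:
    """One event (e.g., a railroad crossing delay) stamped with time."""
    trip_id: str
    time_s: int
    event_type: str
    duration_s: int = 0
    stop_id: Optional[str] = None  # stop with which the event is associated


def delay_by_stop(pattern, events):
    """Total event duration attributed to each stop of a route pattern.

    `pattern` is the ordered sequence of stop ids; events that are tied
    to no stop, or to a stop outside the pattern, are ignored."""
    totals = {stop_id: 0 for stop_id in pattern}
    for event in events:
        if event.stop_id in totals:
            totals[event.stop_id] += event.duration_s
    return totals
```

Because the association lives in the event record itself, new event types can be added without changing the layout of stop or timepoint records, which is the flexibility the text refers to.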
Header records for trips, blocks, and days can speed analysis by allowing selection filters to be applied to a much smaller number of records. Making summary records at the trip level--containing such summary measures for a trip as total ons and offs, maximum load, passenger-miles, and running time--will speed analysis of the standard measures that are summarized. Summary records also can be created over standard date ranges (e.g., every month or quarter) for higher level analysis, such as trends or historical comparisons. However, some agencies find that their database systems are able to generate summaries on the fly quickly enough that summary records are not needed.
Transit agencies often want analyses to cover multiple route patterns (e.g., all the patterns that compose a line or all the patterns operating along a certain corridor or in a certain geographic area). If the analysis does not relate to a particular sequence of stops, the data can be analyzed by simple aggregation; for example, total boardings or total number of early departures can simply be aggregated over any group of stops. However, an analysis that involves relating the data to a sequence of stops used by multiple patterns--such as running time, headway, and load on a trunk--requires a data structure specifying the sequence of stops. A data structure called a "virtual route" was developed and implemented in TriTAPT as a proof of concept. It allows users to do almost any analysis on all the patterns that serve, in part or whole, a given sequence of stops.
The development of software tools for analyzing AVL-APC data is still in its infancy. There are advantages both to solutions customized by or for individual transit agencies and to solutions that are common to multiple transit agencies. Customized solutions have worked well for some transit agencies, but require a level of database programming expertise unavailable to many transit agencies. Market mechanisms for shared solutions--which offer the possibility of greater expertise, economy, and continual upgrading (a necessity with today's information technology)--include AVL and APC equipment suppliers, scheduling software suppliers, and third-party software suppliers.
Ideally, externally supplied software should be modular with respect to the native data format and ultimate report formats. It should provide an open interface for data input, so that it is not restricted to a single type of data collection system. While it may be programmed to deliver various types of reports, it also should offer the possibility of exporting tables as a result of its analysis, giving agencies the freedom to format as they wish. In addition, transit agencies should always maintain the freedom to manipulate and explore their archived data beyond the tools provided by any externally supplied software.