The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.
OCR for page 70
CHAPTER 11
Data Structures That Facilitate Analysis

As mentioned in earlier chapters, original APC data consists chiefly of stop records, plus possible sign-in records. Original AVL data consists of stop or timepoint records, sign-in records, and records of various other events. It may also include polling records.

For analysis, these data records have to be screened and possibly corrected. Data that is not matched to a route and schedule should be matched. Beyond cleaning and matching, certain data structures may need to be created in the analysis database in order to facilitate analysis. Header and summary records offer some convenience for queries and analyses involving aggregation. Special data structures are needed to deal with multiple pattern analyses that are more than simple aggregations. Modularity in analysis procedures can also be enhanced by using standard, specialized database formats.

11.1 Analysis Software Sources

Software used in practice to analyze archived AVL-APC data can come from five different sources: in house, the AVL-APC vendor, a scheduling software vendor, a third party with a standard product, and a custom software developer. Each arrangement has its advantages and drawbacks.

11.1.1 Software Developed in House

Much of the current analysis of archived AVL-APC data uses home-grown software tools. This arrangement has worked well for some agencies, allowing them the flexibility to adapt to their particular needs and enterprise databases and ensuring that tool development is closely tied to need and likely use. For pioneering agencies, developing their own software was a necessity.

Since the mid-1990s, self-developed database and reporting software for AVL-APC data has used commercial off-the-shelf (COTS) database platforms on PC networks. COTS platforms have the advantage of being less expensive and benefit from regular upgrades, necessary in this age of technological advance. Coding for standard and ad hoc reports is prepared either in a database query language or using report-generating software such as Crystal Reports and Brio. Analyses that demand more complex calculations are often performed with spreadsheets or statistical analysis packages, with database queries used as a front end to select the data for analysis. One disadvantage of COTS database platforms and reporting software is that they can be slow when a lot of data is involved. Some agencies have found that powerful report-generating tools (available at 3 to 10 times the cost of their low-end counterparts) help overcome this problem by periodically pre-staging the data most likely to be used in reports and analyses. Response time for large datasets can also be reduced by using special data structures optimized for fast data retrieval.

Tri-Met is an example of an agency whose AVL-APC data analysis software was developed in house. Data is stored and managed in an Oracle database. Using a query language, selected data (e.g., by route, direction, times, dates) can be extracted. Extracted data is then imported to a commercial statistical analysis system (SAS) for numerical analysis. Scripts for standard queries and analyses are stored and reused. Sometimes results are imported to Microsoft Access for easier formatting.

King County Metro, with separate AVL and APC systems, uses multiple databases and applications. Its AVL data is stored in an Informix database. For schedule deviation analysis, scripts coded in Microsoft Access provide a friendly user interface for selecting AVL route, direction, time, date range, and so forth. The analyses themselves were programmed in a query language and are performed by Informix, which produces output in the form of Microsoft Excel tables and graphs. Analysts may do further manipulations of the Excel tables. For running time analysis, a query language program runs every 2 weeks on the AVL Informix database, extracting data that is then input to their scheduling package, Hastus, which includes the add-on software product ATP for running

time analysis. Raw APC data is kept in an Oracle database. Using programs prepared in the Focus query language, summary records are created and exported to a Microsoft Access database, which has been programmed to offer a friendly user interface and nice reports. There are also standard reports created using query language from the original databases.

Metro Transit, a third example, analyzed running time data from its now obsolete AVL system using macros written in Microsoft Excel, once the analyst had extracted the data of interest from the database. In its new AVL system (now in implementation), Metro Transit is working with the AVL vendor to define analysis and reporting needs; they plan to share responsibility for development of analysis software.

Two final examples are NJ Transit and Broward County Transit, whose APC/event recorder and AVL systems (respectively) are operational and expanding. They are using COTS database platforms for data management and COTS report-generating software Brio and Crystal Reports for analysis.

Unfortunately, developing one's own database and reporting software demands resources and expertise that are beyond the reach of many transit agencies. Because of differences in software platforms and data formats, tools developed at one agency are usually not transferable to another.

11.1.2 Software Supplied by Equipment Vendors

Software supplied by some APC vendors provides useful reports including on/off/load profiles, running time distributions, and on-time performance. However, it usually lacks flexible query capabilities.

Historically, AVL vendors provided software related to real-time applications only; for archived data analysis, their job ended when they handed the transit agency the data. Often, the only archived analysis tool is the ability to play back the AVL data stream. Some AVL suppliers include a genuine database and analysis function, but tend to offer only elementary analyses such as on-time performance percentages and reports on how often various event codes were transmitted. For two of our case study agencies, AVL vendors are developing more comprehensive database and analysis capabilities as part of their procurement contracts.

Software that is coupled to on-board equipment limits the flexibility to add other on-board equipment or to replace aging equipment with equipment from a different vendor. The vendor may go out of business or may not continue to improve the software. Furthermore, a note of caution comes from reviewing 20 years of industry experience with farebox data. While the major electronic farebox vendors also supply software for analyzing farebox data, most larger U.S. agencies who rely on farebox data for monitoring ridership have found that they had to export the data to a database developed in house to run their own reports because the vendors' software did not provide the flexibility they needed.

11.1.3 Software Supplied by Scheduling System Vendors

Analysis programs offered by scheduling system vendors focus on analyzing running time data to suggest scheduled running times. An example is the tool used at King County Metro. Because it is tied to the scheduling system, its suggested scheduled running times can be semi-automatically entered into the scheduling system database. Ironically, for the version seen in the 2002 case study, its running time analysis is performed without reference to scheduled departure times or headways and, therefore, cannot analyze schedule or headway adherence, or report results for particular scheduled trips.

Software coupled to the scheduling system has many of the same disadvantages as software coupled to an equipment vendor. One case study agency that uses such a tool for running time analysis has to use its own database and software tools for other analyses and ad hoc queries.

However, one advantage of this source of software is that for scheduling system vendors, software development is their business. If they take on AVL-APC data analysis seriously, they are well positioned to develop some very good tools and to maintain them. With many customers worldwide, they are in a good market position if they choose to exercise it.

11.1.4 Third-Party Software

In the Netherlands, Delft University of Technology's Transportation Engineering Laboratory has developed the database and reporting software TriTAPT for detailed analyses of AVL and APC data. Various editions have been applied over the last 20 years to several Dutch transit agencies; the current edition is being used in Eindhoven and in The Hague. It features many useful single-route reports; excellent graphical representations, including proportional scaling to represent distance and time intervals; attention to distributions and extreme values as well as mean values; a graphical user interface for selecting days and times to be included in an analysis; edit capability that allows an analyst to suppress outliers; and practical tools for suggesting scheduled running times. It has been applied with data gathered using a variety of automated data collection equipment, including APCs, event recorders, and AVL systems of different makes. Interfaces have been developed to scheduling system databases. It uses a custom database to speed processing, but includes an export and import utility so that data tables can be transferred to and from text files.

In Germany, the Hannover transit system Uestra developed its own database and reporting software for AVL data; a related spin-off company has recently commercialized it.
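Several of the tools described in this section follow the same general workflow: select observations by route, direction, and time period; suppress outliers; examine the distribution as well as the mean; and suggest a scheduled running time. The sketch below illustrates that workflow in Python. It is a minimal illustration only and does not reproduce the method of TriTAPT, Hastus ATP, or any agency's software. The record layout (TripObservation), the function names, the 1.5 x IQR outlier rule, and the choice of the 85th percentile as the suggested running time are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from statistics import mean, quantiles


@dataclass(frozen=True)
class TripObservation:
    """Hypothetical record layout; real AVL archives differ by agency."""
    route: str
    direction: str
    start_hour: int           # hour of day in which the trip departed
    running_time_min: float   # observed end-to-end running time, minutes


def select(observations, route, direction, hour_from, hour_to):
    """Return running times for one route and direction, hour_from <= hour < hour_to."""
    return [
        o.running_time_min
        for o in observations
        if o.route == route
        and o.direction == direction
        and hour_from <= o.start_hour < hour_to
    ]


def suppress_outliers(times, k=1.5):
    """Drop values outside the quartile fences (assumed rule: k times the IQR)."""
    if len(times) < 4:
        return list(times)    # too few points to estimate quartiles sensibly
    q1, _, q3 = quantiles(times, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [t for t in times if low <= t <= high]


def percentile(times, p):
    """Nearest-rank percentile of a non-empty list, for p in (0, 100]."""
    ordered = sorted(times)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(times, p=85):
    """Report the spread as well as the mean, plus a percentile-based suggestion."""
    return {
        "n": len(times),
        "min": min(times),
        "mean": mean(times),
        "max": max(times),
        "suggested": percentile(times, p),   # assumed policy, not a standard
    }


if __name__ == "__main__":
    # Made-up observations for illustration only.
    sample = [TripObservation("10", "inbound", 7, t)
              for t in (30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 90.0)]
    sample.append(TripObservation("22", "outbound", 7, 55.0))

    am_peak = select(sample, "10", "inbound", 7, 9)
    print(summarize(suppress_outliers(am_peak)))
```

Keeping selection, outlier handling, and summarization as separate steps mirrors the modular arrangement described above, in which a database query acts as the front end and a separate tool performs the numerical analysis.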