
The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

Below are the first 10 and last 10 pages of uncorrected machine-read text (when available) of this chapter, followed by the top 30 algorithmically extracted key phrases from the chapter as a whole.
Intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text on the opening pages of each chapter. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.


OCR for page 63
CHAPTER 9

APC Sampling Needs and National Transit Database Passenger-Miles Estimates

Two common questions regarding APC system design are how much accuracy is needed from APCs and how many APC units are needed to obtain an adequate sample size. Answering those questions requires facing a related question: when an agency generates measures such as peak load, passenger-miles, and route boardings from APC data, how timely and precise must those estimates be?

9.1 Sample Size and Fleet Penetration Needed for Load Monitoring

According to the Transit Data Collection Design Manual (16), the passenger-count-based statistic requiring the greatest precision is average peak load on heavy-demand routes, a measure used to adjust headway. A reasonable target precision to ensure that the route is neither overcrowded nor overserved is 5% or 6%, effectively limiting permissible load bias on crowded segments to about 5%.

Sample size needed to achieve this target precision depends on the bias and cv of load estimates. Fleet penetration needed, in turn, depends on the number of daily trips on the route-direction-period being analyzed, the data recovery rate, and how the instrumented fleet is distributed. Fleet penetration of 10% will afford about 20r observations per quarter for a route-direction-period with five trips per day, where r is the data recovery rate. (For example, if r = 80%, then 20r = 16 observations would be obtained.) If needed, greater sample sizes can be achieved by simply concentrating equipped vehicles on heavy-demand routes, at the expense of low-demand routes, for which less precision in load estimates is needed.

We have posited elsewhere that APCs make possible a more precise method of scheduling and service quality monitoring focused on extreme values of load rather than mean values. Extreme values reflect the impacts of load variability and service regularity as well as frequency and better reflect the quality of service as felt by passengers. Estimating extreme values requires a far greater sample size than estimating mean values, which is an argument favoring instrumenting the entire fleet with APCs, a course being pursued by Tri-Met.

9.2 Accuracy and Sample Size Needed for Passenger-Miles

All U.S. agencies receiving federal assistance and operating in urban areas are required to report annual systemwide passenger-miles by mode to the NTD. Traditionally, these estimates are made from a sample of manually counted ons and offs. Agencies can use a standard sampling and estimation procedure that requires on-off counts on 549 or more trips (39), or they can use any other sampling method that achieves a precision of 10% or smaller at the 95% confidence level. Because manual on-off counts are labor intensive, there is a natural desire to find less burdensome measurement and estimation methods, including using APC-generated counts (40).

One factor in using counts measured by APCs is the accuracy of the counts themselves, which, as the previous chapter shows, depends not only on sensor accuracy but also on data processing techniques used for parsing, screening, and balancing. The second factor is having an adequate sample size. The two factors are related; the less accurate the counts, the larger a sample is needed. This section deals with that accuracy/sample size trade-off.

For all but the smallest transit agencies, sampling requirements for NTD passenger-miles reporting are considerably less demanding than are other uses of the data such as monitoring load or boardings by route, because the NTD precision requirement is only applied to a whole year's sample aggregated systemwide. Therefore, meeting the NTD requirement should be easy for almost any transit system with APCs. However, because the NTD requires that alternative sampling methods be statistically justified, the following section examines passenger-miles sampling and estimation with APCs in detail.

9.2.1 Standard Error Targets in the Presence of Bias

Let

    Y   = mean passenger-miles per trip
    b   = relative bias in the passenger-miles estimate (b = bias/Y)
    y   = estimated mean passenger-miles
    se  = standard error of the passenger-miles estimate
    rse = se/Y = relative standard error

The precision specification can be interpreted as:

$$P(y - 0.1Y \le Y \le y + 0.1Y) = P(Y - 0.1Y \le y \le Y + 0.1Y) \ge 0.95 \qquad (5)$$

Subtracting E[y] = Y(1 + b) and then dividing by se,

$$P\left(\frac{-0.1Y - bY}{\mathrm{se}} \le \frac{y - Y - bY}{\mathrm{se}} \le \frac{0.1Y - bY}{\mathrm{se}}\right) \ge 0.95$$

By the Central Limit Theorem, the middle term approaches a standard normal variate as sample size increases; therefore, using the notation Φ() = cumulative standard normal distribution, the precision requirement becomes

$$\Phi\left(\frac{0.1 - b}{\mathrm{rse}}\right) - \Phi\left(\frac{-0.1 - b}{\mathrm{rse}}\right) \ge 0.95 \qquad (6)$$

From relation 6, selected values of permitted relative standard error for a given value of relative bias are shown in Table 11. For manual data collection, assumed bias-free, the permitted relative standard error is 0.051; with 8% relative bias, the permitted relative standard error falls to 0.012. To be safe, a transit agency would do well to limit the permissible bias in passenger-miles or load to less than 8%.

Table 11. Relative standard error required versus measurement bias.

    Measurement Bias*    Permitted Relative Standard Error*
    0.00                 0.0510
    0.01                 0.0500
    0.02                 0.0471
    0.03                 0.0423
    0.04                 0.0365
    0.05                 0.0304
    0.06                 0.0243
    0.07                 0.0182
    0.08                 0.0122
    0.09                 0.0061

    *Relative to mean passenger-miles per trip

9.2.2 Sample Size and Coverage Requirements

The determination of sample size requirements assumes three stages of sampling: in stage 1 all routes are selected; in stage 2, for each route, certain timetable trips are selected; and in stage 3, for each selected timetable trip, certain days are observed. The assumed cv's for trip-level passenger-miles at stages 2 and 3 are:

    cv2 = 0.9 = cv of timetable trip means (within route)
    cv3 = 0.3 = cv of daily passenger-miles (within a given timetable trip)

The assumed values are conservative estimates based on experience with data from many transit agencies. The values reflect the fact that, for a given route, most variation in trip-level passenger-miles is due to differences in where trips fall within the timetable (peak/off-peak, inbound/outbound), rather than random differences between days. Sample size requirements derived in this section are based on the weekday sample only; the addition of weekends, sampled with the same degree of fleet penetration as on weekdays, will improve precision, although not by much.

The effective penetration rate (f3) is defined as the expected fraction of the daily schedule observed each day. It is the product of fleet penetration rate and data recovery rate.

Covering Every Weekday Trip

With an effective fleet penetration rate as small as 1% and careful rotation, every weekday timetable trip can be observed at least once per year. The annual estimate is determined by calculating average passenger-miles for each timetable trip, expanding by number of days that trip was operated, and summing over all timetable trips. Stratifying to this level is a very effective estimation technique because it eliminates the effect of variability between timetable trips. The weekday sample size requirement is

$$n \ge \max\left(N_2, \left(\frac{0.3}{\mathrm{rse}}\right)^2\right) \qquad (7)$$

where N2 equals the number of weekday timetable trips and rse is the permitted relative standard error from Table 11. For bias up to 8% and for all but the smallest transit systems, the N2 term will control; that is, it is sufficient to simply observe every timetable trip once.

Covering Most Weekday Trips (Two-Stage Sampling)

Logistics and data recovery problems can frustrate plans to observe every weekday timetable trip. The following plan assumes that only a percentage (f2) of the timetable trips is covered. The estimation procedure is to get an average for each timetable trip that was observed, determine the route average (per trip), expand each route average by the number of trips operated per year, and then sum over all routes.

The relative standard error of the estimate is given by

$$\mathrm{rse}^2 = (1 - f_2)\,\frac{\mathrm{cv}_2^2}{f_2 N_2} + \frac{\mathrm{cv}_3^2}{D f_3 N_2} \qquad (8)$$

where D is the number of weekdays in the year (about 252). For all but the smallest transit systems, the stage 3 term will be insignificant, and the size of the relative standard error will depend mostly on f2.

Using equation 8, Figure 20 shows the required timetable coverage f2 versus the number of trips in the timetable (N2) for selected values of bias and effective penetration rate. Degree of coverage is restricted to values of 85% or greater, because lower coverage rates suggest poor logistical management with likely sampling biases (e.g., whole routes being missed or seriously undersampled). With 5% bias, 85% coverage is sufficient even for an agency with only 200 trips in the weekday timetable and 1% effective penetration. Only smaller systems with moderate to large bias will need greater timetable coverage or effective penetration.

Figure 20. Timetable coverage rate required versus timetable size. (Horizontal axis: number of trips in weekday timetable, 0 to 1,200. Vertical axis: required coverage rate, 0.80 to 1.00. Curves: 8% bias at 1% and 4% effective penetration; 5% bias at 1% and 4% effective penetration; 2% bias at 1% effective penetration.)

9.2.3 Intentional Sampling

The recommended estimation procedures just described involve unintentional sampling: the APCs collect data all year long, and the agency just rolls it up. This approach assumes that instrumented buses, for reasons beyond NTD passenger-miles estimation, are being circulated in a manner that covers the entire schedule regularly. Intentional sampling methods with limited sample sizes are clearly inferior, unless data processing procedures are still so undeveloped that each trip's data must be manually checked.
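
As a rough numerical check on relations 6 through 8 in Sections 9.2.1 and 9.2.2, the following Python sketch solves relation 6 for the permitted relative standard error, then evaluates the sample size rule (7) and the two-stage relative standard error (8). The function names are hypothetical, and the use of bisection to invert relation 6 is an assumption made here for illustration; the report does not say how the Table 11 values were computed. The default cv values (0.9 and 0.3) and D = 252 weekdays are the ones assumed in the text.

```python
import math


def phi(z):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def coverage_probability(rse, bias, tol=0.1):
    """Left-hand side of relation 6 for a given rse and relative bias."""
    return phi((tol - bias) / rse) - phi((-tol - bias) / rse)


def permitted_rse(bias, tol=0.1, conf=0.95):
    """Largest rse that satisfies relation 6, found by bisection.

    Bisection is an illustrative choice, not the report's stated method.
    Valid for 0 <= bias < tol, where the probability falls as rse grows.
    """
    if not 0.0 <= bias < tol:
        raise ValueError("bias must satisfy 0 <= bias < tol")
    lo, hi = 1e-9, 1.0  # relation 6 holds at lo and fails at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if coverage_probability(mid, bias, tol) >= conf:
            lo = mid
        else:
            hi = mid
    return lo


def min_weekday_sample(n_trips, rse, cv3=0.3):
    """Relation 7: n >= max(N2, (cv3 / rse)^2), rounded up to whole trips."""
    return max(n_trips, math.ceil((cv3 / rse) ** 2))


def two_stage_rse(f2, n_trips, f3, cv2=0.9, cv3=0.3, weekdays=252):
    """Equation 8: rse under partial timetable coverage f2 and
    effective penetration f3."""
    variance = ((1.0 - f2) * cv2 ** 2 / (f2 * n_trips)
                + cv3 ** 2 / (weekdays * f3 * n_trips))
    return math.sqrt(variance)


# Compare with Table 11 and with the 200-trip example in the text.
for b in (0.00, 0.05, 0.08):
    print(f"bias={b:.2f}  permitted rse={permitted_rse(b):.4f}")
print("n for 1000 trips, 8% bias:", min_weekday_sample(1000, permitted_rse(0.08)))
print("rse at 85% coverage, 200 trips, 1% penetration:",
      round(two_stage_rse(0.85, 200, 0.01), 4))
```

Under these assumptions the computed values agree with Table 11 to the precision shown there, and the 200-trip case at 85% coverage and 1% effective penetration falls just inside the limit for 5% bias, consistent with the statement accompanying Figure 20.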