The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.
TMC. The analyst should monitor (or acquire from the TMC) any incident reports (accidents, work zones, bad weather, police actions, etc.) for the time periods of the data collection effort.

3.5 Processing/Quality Control

Before proceeding to the computation of means, variances, and confidence intervals, the analyst should first review the travel time and delay data for errors.

3.5.1 Identification and Treatment of Errors and Outliers in Data

An error is an obvious mistake in the measured data. An outlier is an observation that lies so far from the other observations that the investigator suspects that it might be an error.

The analyst should first evaluate the measured travel times or delay data to eliminate obvious errors. The analyst should reject any data that violate physical limitations, such as negative travel times or speeds more than twice the design speed of the roadway. Any measured travel times or delays that are greater than the duration of the study are suspect as well. Most data analysis software packages or spreadsheet programs can be used to automatically flag any data record whose value violates a defined minimum or maximum.

The analyst also should check records for unusual events occurring during data collection, including accidents, work zones, police actions, and bad weather. In most cases the analyst will not want to eliminate travel time and delay data gathered during unusual events, as this variation is what allows the data to describe variability in travel time or delay. If data are being collected to calibrate the speed-volume relationship in a transportation planning model, however, removing the incidents and events that regularly occur may be appropriate.

The search for outliers is more subtle. A scatter plot of the data can be very helpful to the analyst in quickly spotting the few data points that do not seem to belong with the rest. A more mechanical search for outliers can be made by identifying all points more than 3 standard deviations above or below the mean travel time (or delay).

Statistically, these outliers are not necessarily invalid observations; they are, however, unlikely. The analyst should review the raw data sheets for the outlier observations and verify that no simple arithmetic error was made. If an error is found, it can be corrected. If no obvious error is found, the analyst must make a judgment call whether or not to retain the outliers in the data set.

The analyst has four options for dealing with errors and outliers in the data:

1. Correct the error;
2. Repeat the field measurement;
3. Replace the outlier with a maximum or minimum acceptable value; or
4. Drop the data point from the data set.

3.5.2 Computation of Mean and Variance (Travel Time and Delay)

The mean travel time is equal to the sum of the measured travel times (T) divided by the number of measurements (N).

$$\mathrm{Mean}(T) = \frac{\sum_{i=1}^{N} T_i}{N} \qquad \text{(Eq. 3.2)}$$

The variance is a measure of the spread of the distribution of observed travel times.

$$\mathrm{Var}(T) = \frac{\sum_{i=1}^{N} T_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N} T_i\right)^2}{N - 1} \qquad \text{(Eq. 3.3)}$$

where
Mean(T) = The mean of the measured travel times;
Var(T) = The estimated variance of the measured travel times;
$T_i$ = The measured travel time for observation number i; and
N = Total number of observations of travel time.

The mean delay and its variance are computed similarly, using the above two equations, substituting delay for travel time. Both the mean and variance can be estimated for travel times (or delay) from any desired sampling timeframe (e.g., throughout a 24-hour period, from the peak period or peak hour only, etc.). Similarly, they can be computed including or excluding unusual incident-generated data points that lie outside the typical range of observations for periods of recurring congestion (i.e., nonincident).

3.5.3 Computation of Confidence Intervals (Travel Time and Delay)

The confidence interval is the range of values within which the true mean value may lie. The confidence interval for the mean travel time or the mean delay is given by the following equation:

$$CI_{(1-\alpha)\%} = 2\, t_{(1-\alpha/2),\,N-1} \sqrt{\frac{\mathrm{Var}(T)}{N}} \qquad \text{(Eq. 3.4)}$$
where
$CI_{(1-\alpha)\%}$ = The confidence interval for the true mean with probability of $(1-\alpha)$%, where $\alpha$ equals the probability of the true mean not lying within the confidence interval;
$t_{(1-\alpha/2),\,N-1}$ = The Student's t statistic for the probability of two-sided error summing to $\alpha$ with N - 1 degrees of freedom, where N equals the number of observations; and
Var(T) = The variance in the measured travel times.

The confidence interval for the true mean of delay also can be estimated with the above equation by substituting delay for travel time.

Chapter 6 contains more detailed directions for data sampling and calculation of travel time/delay variance and reliability measures using estimated data, rather than observed or TMC data.
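
The screening and computation steps of Section 3.5 can be sketched in a few lines of Python. This is an illustrative sketch only, not a procedure prescribed by the text: the function names, the `min_ok`/`max_ok` limits, and the sample travel times (in minutes) are all assumptions made for the example. The Student's t value is passed in as `t_crit` so that the analyst can take it from a standard t table for N - 1 degrees of freedom.

```python
import math


def screen_travel_times(times, min_ok, max_ok):
    """Split observations into kept values and flagged errors.

    min_ok and max_ok are analyst-chosen limits (hypothetical parameters),
    for example 0 and the duration of the study.
    """
    kept = [t for t in times if min_ok <= t <= max_ok]
    flagged = [t for t in times if not (min_ok <= t <= max_ok)]
    return kept, flagged


def mean_travel_time(times):
    """Eq. 3.2: sum of the measured travel times divided by N."""
    return sum(times) / len(times)


def variance_travel_time(times):
    """Eq. 3.3: sample variance with N - 1 in the denominator."""
    n = len(times)
    sum_of_squares = sum(t * t for t in times)
    square_of_sum = sum(times) ** 2
    return (sum_of_squares - square_of_sum / n) / (n - 1)


def three_sigma_outliers(times):
    """Return points more than 3 standard deviations from the mean."""
    mean = mean_travel_time(times)
    std_dev = math.sqrt(variance_travel_time(times))
    return [t for t in times if abs(t - mean) > 3 * std_dev]


def confidence_interval_width(times, t_crit):
    """Eq. 3.4: full interval width, 2 * t * sqrt(Var(T) / N).

    t_crit is the Student's t value for (1 - alpha/2) and N - 1 degrees
    of freedom, supplied by the caller (e.g., from a t table).
    """
    n = len(times)
    return 2 * t_crit * math.sqrt(variance_travel_time(times) / n)


# Made-up sample: 12 runs in minutes, two of them obvious errors.
raw = [12, 14, -5, 13, 15, 11, 13, 400, 14, 12, 13, 13]
kept, flagged = screen_travel_times(raw, min_ok=0, max_ok=120)

mean = mean_travel_time(kept)
width = confidence_interval_width(kept, t_crit=2.262)  # 95%, 9 d.f.

print("flagged as errors:", flagged)
print("3-sigma outliers:", three_sigma_outliers(kept))
print("mean:", mean)
print("variance:", variance_travel_time(kept))
print("95% interval:", mean - width / 2, "to", mean + width / 2)
```

The value 2.262 is the tabulated two-sided 95 percent Student's t value for 9 degrees of freedom, which matches the 10 observations that survive screening. Because Eq. 3.4 gives the full width of the interval, the bounds are placed half that width on either side of the mean. The 3-standard-deviation check only lists candidate outliers; whether to correct, remeasure, replace, or drop them remains the analyst's judgment call.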