
The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

Summary of Recommended Standardized Procedures and Guidelines

Question 2: Did someone else in your household complete the survey? (yes or no) If "yes," go to Question 3 below. If "no," terminate the validation survey.

Question 3: Select a trip that the respondent is likely to remember from among the trips reported in the initial survey and note the time spent at the destination. Ask the respondent to recall the trip in question and to report the approximate time spent at the destination.

3. A statistic should be prepared indicating the percent of validated surveys that provided a negative answer to each of the first two questions or a mismatch on the third question.

4. The commissioning agency should establish at the outset what is considered to be a tolerable level of failure on validation. Acceptance of a 1% failure on the first two questions and 5% on the third might be considered to represent reasonably good quality.

2.7.6 Q-7: Data Cleaning Statistics

Data cleaning or data checking is an activity that is conducted almost routinely in travel surveys. It involves checking and, where possible, correcting data values that can be identified as being incorrect. It is usually performed as soon as possible after the data are retrieved, so that queries can be made while the information is still fresh in the memories of the respondents. For errors that are caused or accentuated by the survey process, it also allows timely correction. This is elaborated on in Section 10.6 of the Technical Appendix.
The following data cleaning statistic (DCS) provides a mechanism to measure the incidence of cleaned data items in a data set:

\[
\mathrm{DCS} = \frac{\sum_{n=1}^{N} \sum_{i=1}^{I} \mathrm{count}(x_{i,n})}{N \times I}
\]

where:

$x_{i,n}$ = $i$th data item of respondent $n$
$\mathrm{count}(x_{i,n})$ = 1 if the $i$th data item of respondent $n$ was cleaned, 0 otherwise
$N$ = number of respondents in survey
$I$ = number of minimum (core) questions

It is recommended that all transportation surveys compute and report the DCS statistic and that, based on experience with this statistic, future ranges be established to indicate the quality of the data based on the amount of cleaning required.

2.7.7 Q-8: Number of Missing Values

The number of missing values in a data set is a measure of how much information was not collected. If expressed as a proportion of the total number of data items in the data set, it serves as a measure of the relative information content of the data. Thus, it could be used as a measure of data quality.

It is important to define what a missing data item is and what it is not. As described in Section 2.5.3, recommended coding practice is to distinguish among three kinds of non-response: refusals, those in which a respondent does not know the answer to the question, and those in which a response would not be applicable. Among these categories, only responses where a respondent either refuses or does not know the answer are truly missing values. Further information is to be found in Section 10.7 of the Technical Appendix.
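
As a rough illustration of the two statistics in Sections 2.7.6 and 2.7.7, the sketch below computes a DCS value and a missing-value proportion in Python. It is not taken from the report. The function names, the numeric codes for refusals, "don't know" answers, and not-applicable responses, and the layout of the data (one row per respondent, one entry per core item) are all assumptions made for the example. The DCS denominator is taken to be N × I, and the missing-value denominator is taken to be every data item in the set, including not-applicable items.

```python
from typing import Sequence

# Hypothetical non-response codes, chosen for this example only.
REFUSED = -1
DONT_KNOW = -2
NOT_APPLICABLE = -3


def data_cleaning_statistic(cleaned_flags: Sequence[Sequence[bool]]) -> float:
    """Return DCS = (number of cleaned core items) / (N * I).

    cleaned_flags[n][i] is True if core item i of respondent n was cleaned.
    """
    n_respondents = len(cleaned_flags)
    if n_respondents == 0:
        raise ValueError("at least one respondent is required")
    n_items = len(cleaned_flags[0])
    if n_items == 0 or any(len(row) != n_items for row in cleaned_flags):
        raise ValueError("each respondent needs the same non-empty set of core items")
    cleaned = sum(1 for row in cleaned_flags for flag in row if flag)
    return cleaned / (n_respondents * n_items)


def missing_value_proportion(responses: Sequence[Sequence[object]]) -> float:
    """Return the share of all data items that are refusals or "don't know".

    Not-applicable items count toward the total but are not treated as missing.
    """
    total = sum(len(row) for row in responses)
    if total == 0:
        raise ValueError("at least one data item is required")
    missing = sum(
        1 for row in responses for value in row if value in (REFUSED, DONT_KNOW)
    )
    return missing / total


# Three respondents, four core items each; two items were cleaned in total.
flags = [
    [True, False, False, False],
    [False, False, False, False],
    [False, True, False, False],
]
dcs = data_cleaning_statistic(flags)

# Two respondents, four items each; one refusal and one "don't know".
answers = [
    [3, REFUSED, 2, NOT_APPLICABLE],
    [1, 4, DONT_KNOW, 2],
]
missing_share = missing_value_proportion(answers)
```

In this made-up data, 2 of the 12 core items were cleaned and 2 of the 8 data items are truly missing. A survey with a different coding scheme would need its own mapping from codes to the refusal, "don't know," and not-applicable categories.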