
Suggested Citation:"Chapter 4 - Procedures and Measures for Further Research." National Academies of Sciences, Engineering, and Medicine. 2008. Standardized Procedures for Personal Travel Surveys. Washington, DC: The National Academies Press. doi: 10.17226/13805.


CHAPTER 4

Procedures and Measures for Further Research

The items in this section could not be examined in this project and require further research. These items can be classified into the following three groups:

1. Items that were initially identified to be beyond the scope of the project;
2. Items that were included in the original project plan but were not analyzed because of time and budgetary constraints; and
3. Other areas identified during the course of the project.

These are shown in Table 1 and are discussed in the following sections. An overview is provided of each item, together with a discussion of its relative importance. Recommendations are made about specific research areas that should be examined in the future. The items detailed in Section 4.1 were originally discussed in the interim report, but have been reproduced here for convenience. Items in Section 4.2 have also been extracted from the interim report, but have been modified and shortened in most cases. The information presented in Section 4.3 is based mainly on comments made by members of the research team.

4.1 Items Initially Identified as Beyond the Scope of this Project

4.1.1 D-11: GPS Surveys

There is growing interest in the use of GPS devices to collect data on sub-samples of households in household travel surveys. GPS is capable of providing very precise information about the locations to and from which people travel, the times of their travel, the routes used, and even the traffic conditions along the route of travel at the time of travel. At present, this is largely an experimental procedure although it is moving rapidly forward as a mainstream activity in household travel surveys. There are at least 20 ongoing surveys that have a GPS component in the United States at the time of this report.

Clearly there is potential for defining standardized procedures and providing guidance on a number of aspects of such surveys. This includes sample sizes and methods of drawing samples, geographic and socio-demographic distribution of the sample, the number of days for which GPS data should be collected, minimum hardware specifications for the GPS devices, the use of incentives, methods for deployment of the devices, methods of return of the devices, etc. However, at this time, it is probably too early in the development of such surveys, and there is too little experience to define standardized procedures. Therefore, this is an area that should be considered as being currently out of scope, but necessary to add within the next 2 or 3 years. It also may require extensive field experimentation to develop good standardized procedures through comparative studies that clearly show which are the preferred methods. Also, as personal GPS devices (as opposed to in-vehicle GPS devices) become more practicable and available, the nature of the survey may change quite rapidly.

4.1.2 D-12: Internet Surveys

Internet surveys are similar to GPS surveys in that they are a data collection mechanism that is emerging at present and has yet to undergo extensive field testing. Nevertheless, like GPS surveys, this is likely to be a technique that will evolve rapidly and, if successful, be incorporated more and more frequently as a potential means for a household to provide the data for a household travel survey. Again, it is an experimental procedure that is not yet in the mainstream although several current surveys are working to offer the Internet as an alternative means of response for a number of households.

Again, there is enormous potential for defining standardized procedures and providing guidelines. These may address such issues as how to provide access to websites, the type of graphics and other materials to be provided, building in cross-checks on data and cross-referencing travel of other household members, encryption, and a variety of ethical issues that will arise with Internet surveys. As with GPS, however, this area is considered too under-developed for this project. Standardized procedures should be considered during the next 2 or 3 years and may, again, require a number of comparative studies in order to determine what consistent practice should be.

Table 1. Procedures and measures for further research.

Category                                         Original reference   Item
Items beyond scope of project                    D-11                 GPS surveys
                                                 D-12                 Internet surveys
                                                 I-8                  SP data
Items originally identified and not researched   D-2                  Who should be surveyed?
                                                 D-9                  Times of day for contacts
                                                 E-6                  Retention of data on incomplete households
                                                 E-7                  Cross checks in data collection and data review
                                                 E-8                  Days and periods to avoid data collection
                                                 I-3                  Collection of in-home activities
                                                 I-4                  Ordering of questions
                                                 I-6                  Instrument design
                                                 I-7                  Multitasking of activities
                                                 S-1                  Sample size
                                                 S-2                  Sizes and procedures for surveying augment samples
                                                 S-3                  Collecting augment samples
                                                 S-4                  Stratification options for samples
                                                 S-5                  Specification of sampling error requirements
                                                 S-6                  Development of default variances
                                                 P-1                  Focus groups
                                                 P-5                  Reporting of pretests and pilot surveys
                                                 Q-4                  Sampling error
Other items identified during research           -                    Cell phones
                                                 -                    Incentives
                                                 -                    Personalized interview techniques
                                                 -                    Geocoding methods
                                                 -                    Impacts of the national "do not call" registry
                                                 -                    Initial contacts
                                                 -                    Refusal and non-contact conversions
                                                 -                    Effect of interview mode on recruitment and non-response rates
                                                 -                    Unknown eligibility rates
                                                 -                    Data archiving in transportation

4.1.3 I-8: SP Data

Many recent travel surveys have included collection of stated-choice data, more commonly referred to as "stated-preference" or SP data. Assuming that such data will become more and more a standard element of many surveys, standardized procedures and guidelines are almost certainly needed relating to these data. These would relate to the size of the task that can and should be presented to respondents (Stopher and Hensher, 2000), as well as issues of how alternative attribute levels are set in a stated-choice survey (Stopher, 1998). There is also a need to determine whether attribute levels should be generated in real time or can be pre-set and committed to a printed survey.

Consistent instrument designs for the collection of stated-choice data are clearly needed. Many survey firms understand this area relatively poorly, and the whole field of stated-choice research is subject to potential discredit if poor designs are fielded and erroneous conclusions drawn from the data. There are enormous differences of opinion in such areas as

• The need for contextual data to be collected at the same time;
• The number of possible alternatives that respondents can be asked to handle;
• The number of attributes that can be included in the design;
• The number of levels of each attribute that can be included;
• How far the levels of the attributes can depart from current experience of the respondent;
• The number of treatments that an individual respondent can be asked to handle;
• Whether the order in which treatments are offered has an effect on choices;
• The need for orthogonality in the design; and
• How to administer the SP experiment (that is, by paper and pencil, on laptop computer, etc.).

In addition, there are some survey researchers who do not believe that stated-choice experiments are valid and would argue against their use.

Research is clearly needed into these various issues. In this case, it appears from the literature that transportation applications of stated-choice surveys are ahead of marketing and other fields that may also use the techniques.
As a result of a review of the literature on this topic, the transportation field appears to be addressing issues that other researchers have not considered (Louviere et al., 2000). However, these listed issues have not been researched in transportation or elsewhere to date. Hence, to develop standardized procedures for SP data, it will be necessary to undertake research on all of these issues. For the most part, this will require a battery of alternative SP survey designs to test various options in each of the bullets listed above. Several of these can be tested together; the results, in the form of some measure of the quality of the SP survey, can be analyzed through models that seek to explain differences in the quality as a function of the various design variants. At the outset this area was considered to be beyond the scope of this project; it is, therefore, up to future research to establish standards.

4.2 Items Originally Identified and Not Researched

4.2.1 D-2: Who Should Be Surveyed?

There is no general consensus about the minimum age for persons included as part of a household travel survey. Traditionally, data have been collected on all household members over the age of 5 years on the assumption that any young children will travel with the non-working mother, who would, therefore, provide complete data on the movements of any very young children. In current society, both parents now work in most households, and it is becoming increasingly difficult to deduce the travel of younger children in the household. In light of this, more household travel surveys are collecting data on all family members, irrespective of age.

Another issue that arises in household travel surveys is whether to survey persons living in group quarters. In many instances, those living in group quarters do not travel (e.g., prison inmates, those in hospital, some types of elderly and infirm care facilities); however, other types of group quarters may produce large amounts of travel (e.g., university dormitories and military facilities). Some guidance is needed on whether to survey these persons or whether some group quarters should be included and others not.

It is recommended that research be conducted on these issues using existing surveys for analysis. By examining a survey for which there is no minimum age, the data obtained on children under 5 years of age could be compared with data from adults in the same household. From this, it would be possible to determine the extent to which the infant's data could have been inferred from the parents' data. It would also be useful to determine whether trip rates and other related information are ever corrected for infants when analyzing those data sets that did not include infants in the collection of travel data because this may have significant impact on mode-choice and automobile occupancy. Parents with infants are often restricted to using an automobile available to the household, and this decreases the potential for transit use. Failure to include infants will result in incorrectly low average automobile occupancy rates that will probably not match occupancy rates from other sources.

It is also recommended that analyses be done on surveys that include persons from group quarters. Specifically, it is suggested that level of tripmaking be compared between persons living and not living in group quarters. It would also be worthwhile to look at what fraction of total trips are represented by people living in group quarters (through examining census data) to determine the effect of inclusion or exclusion on overall regional travel statistics. It is anticipated that standardized procedures would suggest specific conditions that need to be met to warrant the inclusion of group quarters in surveys.
These might include situations in which the retired elderly people exceed a certain fraction of the population of the study region or where there might be a military or other mobile institutional presence (e.g., colleges or universities with dormitory accommodations) in excess of some proportion of total population.

4.2.2 D-9: Times of Day for Contacts

Within telephone surveys, the time of day when contact is attempted has a critical influence on response rates. There is a wide range of practices in existing surveys, however, and these have never been formally documented. In some surveys, the client agency may stipulate the hours between which telephone contacts can be made by the contract firm. Because different cities show markedly different habits with respect to work times and times at which people retire for the night, this may not be an area in which consistency of practice will be possible. It may be possible, however, to specify a core period of time when calling would normally be productive and to specify other times when calling is almost certainly not productive. For example, calling is generally productive between 6 and 8:30 P.M. on weekday evenings.

It is recommended that recent surveys be reviewed to determine what has been set as appropriate times. By examining call attempts and outcomes in call histories, it would be possible to determine the relative productivity of calls made at different times of the day. Particular attention should be paid to determining the most productive and acceptable hours for calling on weekends.

A second issue that needs to be addressed in this area is how to determine when to re-contact households that have either requested a non-specific call-back or are considered to be soft refusals. It seems possible that some consistent rules can be established on how to distribute times for call-backs to try to resolve previously incomplete surveys.
There appears to be a lack of common practice on when to make a subsequent attempt after finding the number is busy, there is no answer, or an answering machine picks up the call. In some instances, the protocol appears to be to call the household again at least once, and sometimes more than once, on the same evening as the initial call. In other cases, the call may be re-rostered for the same time on the next day or the same time in the next week. By reviewing procedures that have been used in prior surveys and also those that may be used in other areas of market research, it may be possible to recommend guidelines for re-contacting sample units.
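The review of call histories suggested in this section could start from a simple tabulation of contact rates by time slot. The following is a minimal sketch only: the record layout, the outcome labels, and the sample data are all hypothetical and are not taken from any actual survey or CATI package.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical call-history records: (time of attempt, outcome label).
# Outcome labels are illustrative, not those of any specific CATI system.
CALL_HISTORY = [
    ("2004-03-02 17:45", "no_answer"),
    ("2004-03-02 18:10", "contact"),
    ("2004-03-02 18:40", "contact"),
    ("2004-03-02 18:55", "answering_machine"),
    ("2004-03-02 19:20", "contact"),
    ("2004-03-02 19:50", "busy"),
    ("2004-03-02 20:15", "contact"),
    ("2004-03-02 21:05", "no_answer"),
    ("2004-03-06 10:30", "contact"),
    ("2004-03-06 10:50", "no_answer"),
]


def contact_rate_by_slot(history):
    """Return {(day_type, hour): (attempts, contacts, contact_rate)}."""
    attempts = defaultdict(int)
    contacts = defaultdict(int)
    for stamp, outcome in history:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        # weekday() is 0 for Monday through 6 for Sunday
        day_type = "weekend" if when.weekday() >= 5 else "weekday"
        slot = (day_type, when.hour)
        attempts[slot] += 1
        if outcome == "contact":
            contacts[slot] += 1
    return {
        slot: (n, contacts[slot], contacts[slot] / n)
        for slot, n in attempts.items()
    }


rates = contact_rate_by_slot(CALL_HISTORY)
for slot in sorted(rates):
    n, c, rate = rates[slot]
    print(f"{slot[0]:8s} {slot[1]:02d}:00  attempts={n}  contacts={c}  rate={rate:.2f}")
```

With real call histories, the same tabulation could be extended to separate first attempts from call-backs, which would bear on the re-contact rules discussed above.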

4.2.3 E-6: Retention of Data on Incomplete Households

Data on incomplete households have the potential to provide extremely useful information that can be used in analysis of survey results and to improve the quality of surveys in the future. With these data, it is possible to examine the design of certain questions that may result in premature terminations of interviews and information on the biases in non-respondents. Despite the apparent usefulness of such data, in many surveys they are destroyed after the full sample is obtained, either because this is done automatically by CATI software or because of specific desires of survey firms or clients. Many agencies are ignorant of the value of partial data and will either not specify in the contract that such data should be turned over or may even specify that such data are to be destroyed. In addition to this, many agencies would not know what to do with such data if retained and need help in knowing how to make optimal use of them.

Again, there was insufficient time in this project to develop standardized procedures on the retention of data on incomplete households. It is recommended that several tasks be performed in any future research. As a starting point, one needs to define what constitutes a partially complete household. This would not be difficult in light of the work done in this project to define a complete household (see Section 2.2.3). At a minimum, households could be classified into the following basic categories:

1. Refused recruitment;
2. Terminated recruitment prematurely;
3. Completed recruitment, but refused mail-out survey;
4. Completed recruitment, accepted mail-out survey, but refused diary completion or retrieval of diary data;
5. Partially completed diaries and related information; and
6. Completed all survey materials.

There is also a need to determine whether all incomplete household records should be retained or only those meeting some minimum criterion of completion.
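One way to make the six categories and a minimum retention criterion concrete is to treat them as an ordered scale. The sketch below is illustrative only: the flag names on the household record are hypothetical, and the default threshold (retain everything) is an assumption rather than a recommended standard.

```python
from enum import IntEnum


class CompletionStage(IntEnum):
    """Ordered stages mirroring the six categories listed above."""
    REFUSED_RECRUITMENT = 1
    TERMINATED_RECRUITMENT = 2
    REFUSED_MAILOUT = 3
    REFUSED_DIARY = 4
    PARTIAL_DIARY = 5
    COMPLETE = 6


def classify_household(record):
    """Assign a stage from hypothetical boolean flags on a household record."""
    if not record.get("recruitment_started"):
        return CompletionStage.REFUSED_RECRUITMENT
    if not record.get("recruitment_completed"):
        return CompletionStage.TERMINATED_RECRUITMENT
    if not record.get("mailout_accepted"):
        return CompletionStage.REFUSED_MAILOUT
    if not record.get("diary_data_provided"):
        return CompletionStage.REFUSED_DIARY
    if not record.get("all_materials_completed"):
        return CompletionStage.PARTIAL_DIARY
    return CompletionStage.COMPLETE


def should_retain(record, minimum=CompletionStage.REFUSED_RECRUITMENT):
    """Retain a record if it reached at least the `minimum` stage.

    The default keeps every record; a stricter criterion is a design choice.
    """
    return classify_household(record) >= minimum


household = {
    "recruitment_started": True,
    "recruitment_completed": True,
    "mailout_accepted": True,
    "diary_data_provided": True,
}
print(classify_household(household).name, should_retain(household))
```

Storing the stage with each retained record would also make it straightforward to tabulate how many households dropped out at each point.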
To do this, one would need to demonstrate the potential uses of such data through analysis of incomplete records from a variety of surveys. This might include examining the questions at which surveys are terminated and the distribution of household characteristics for households that are partially complete and those that are fully complete. A few key areas for analysis should be recommended to help determine what specific data should be retained. In developing data retention standards, it may be necessary to specify modifications that need to be made to some commercial CATI software packages. While subsequent analysis may determine that there is no useful information to be gained from some partially complete surveys, it would be prudent to err on the side of keeping too much rather than too little data. With the current low costs for data storage and the small overheads resulting from increasing the overall size of data sets, there is no reason to try to minimize retention of data by throwing out such data as that on incomplete households.

4.2.4 E-7: Cross-Checks in Data Collection and Data Review

In any survey, cross-checks should always be undertaken on data to ensure that results are meaningful and certain information is not contradictory. For example, a survey in California a few years ago reported a substantial proportion of school children, under the legal minimum age for holding a driver's license in California, apparently driving alone to school. There are other problems to be avoided: almost every travel survey includes instances of people forgetting to report a trip back to home at the end of the day or failing to report an activity at home after the last trip of the day. Work trips by people who report that they are not workers are another common occurrence in surveys.
Another problem in activity and time-use diaries arises when people do not include activities at a place between trip segments, for instance, waiting at a bus stop or parking a vehicle.
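Several of the inconsistencies described above lend themselves to simple rule-based checks on each person's reported travel day. The following is a minimal sketch under stated assumptions: the field names, mode and purpose labels, and the minimum driving age of 16 are all hypothetical and would need to match the actual survey instrument and study area.

```python
MIN_DRIVING_AGE = 16  # assumption; set to the legal minimum for the study area


def cross_check_person(person, trips):
    """Return a list of warning strings for one person's travel day.

    `person` and `trips` use hypothetical field names chosen for this sketch.
    """
    warnings = []
    for i, trip in enumerate(trips, start=1):
        # Drive-alone trip reported by someone too young to hold a license
        if trip["mode"] == "drive_alone" and person["age"] < MIN_DRIVING_AGE:
            warnings.append(f"trip {i}: drive-alone trip by person under driving age")
        # Work trip reported by someone who said they are not a worker
        if trip["purpose"] == "work" and not person["is_worker"]:
            warnings.append(f"trip {i}: work trip reported by non-worker")
    # Possible forgotten trip back home at the end of the day
    if trips and trips[-1]["destination"] != "home":
        warnings.append("last trip of the day does not end at home")
    return warnings


child = {"age": 12, "is_worker": False}
child_trips = [
    {"mode": "drive_alone", "purpose": "school", "destination": "school"},
]
for message in cross_check_person(child, child_trips):
    print(message)
```

A day that does not end at home can be legitimate (an overnight stay, for example), so a check of this kind is better treated as a prompt for the interviewer or reviewer to confirm than as an automatic rejection.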

In most cases, these problems are completely avoidable with appropriate checks. CATI and CAPI surveys offer enormous potential for cross-checks on data quality in real time as a survey progresses and, in most such surveys, at least limited cross-checks are usually programmed in. Anecdotal information and recent experience of some of the research team suggest, however, that cross-checks are not always built in to survey data-collection procedures or that they are built in, but overridden or ignored by interviewers.

Because of time constraints, it was not possible to develop consistent procedures in this project for cross-checks that should be built in to CATI or other types of interviews or to develop standardized procedures for checking data after they have been retrieved. In our opinion, future research should focus on two main tasks. First, there is a need to develop a general list of the various checks that should be included in any travel survey. In part, these would need to be based on the minimum question specifications already developed as part of this project (see Section 2.1.1). Second, once a list has been compiled, standards for cross-checks that can apply to CATI and CAPI surveys should then be developed. Based on the experiences of the team working on this project, some of the better known problems include the following:

1. Children below minimum driving age reporting a drive-alone trip;
2. Children below minimum working age reporting work activities and travel;
3. People failing to report trips back to home both during and at the end of the day;
4. People failing to include activities at a place in transit trips (e.g., waiting and transferring) in a time-use or full activity survey;
5. People who are not employed reporting trips to or from work;
6. People failing to report other family members who accompanied them on travel;
7. Head of household reporting being under the age of 16;
8. People reporting more workers in the household than adults; and
9. People reporting more adults or more children in the household than the total household size.

It is recommended that unprocessed data from recent surveys be reviewed to compile as complete a list of these types of problems as possible. To support the development of cross-check standards, it is recommended that available CATI scripts and programs be reviewed to determine whether checks have been built in and to examine the effectiveness of these checks. Structuring the received data into a snapshot of the actual behavior of a person over the course of the survey day is likely to be a very productive way of detecting errors and illogical responses. In most instances, the same checks that would be appropriate in a CATI or CAPI survey can also be used in a non-computerized survey to review data as they are obtained on paper diaries or other media. This may not be possible in all situations, however, and it is likely that some standards developed for CATI and CAPI surveys will need to be adjusted for application in paper-and-pencil interviewing (PAPI) and related surveys.

4.2.5 E-8: Days and Periods to Avoid Data Collection

While there are unwritten conventions about days on which household surveys should or should not be undertaken, no specific guidelines exist on this issue. Most household travel surveys are conducted in the Spring and Fall, but in some areas of the South, Spring may be defined as beginning earlier in February or even mid-January once schools are back in session. Most surveys generally avoid Thanksgiving and New Year because of the perception that travel is abnormal at these times of the year. Similarly, there is usually an attempt to avoid the period from the end of May through the middle of August because people are taking annual vacations and schools are not in session.
There are inconsistencies, however, on whether surveying should be temporarily suspended for such times as Spring Break (either for schools or universities), Columbus Day, and Presidents Day. In addition to this, there is a more general issue that relates to whether data from just Fall or Spring, or a combination of both, are really appropriate for modeling purposes and for the decisions to be made from data and subsequent models.

Whether this is an appropriate item for standardized procedures is somewhat questionable although it would appear that guidance would, at least, be appropriate on this issue. The extent to which travel differs during holiday periods and at certain times of the year is not entirely clear. If travel is significantly different during these times then it may be appropriate to avoid these periods in the interests of ensuring comparability among surveys. To the extent that it is possible to obtain data, it would be worthwhile to examine the effect of such time periods on survey findings and determine whether they present a problem in relation to the usual goals of household travel surveys. While guidelines may suggest periods that should be avoided during data collection, they may, on the other hand, recommend that no period needs to be excluded. There is a real question as to whether this issue of "atypical travel" is appropriate and whether the exclusion of certain days will result in serious biases in survey findings and transport models.

4.2.6 I-3: Collection of In-Home Activities

While there appears to be general agreement among most travel-demand modelers that more detail needs to be collected about in-home activities, many agencies avoid collecting in-home information based on the perception or expectation that it would reduce response rates and lead to (greater) incompleteness of data. There are fears of how the public would react to a transportation agency asking questions about what people do in their homes. As a result, most surveys do not ask about in-home activities or ask only about work at home and everything else at home. The perceptions associated with this issue have never been verified in any structured test.
It would be worthwhile, in our opinion, to conduct a side-by-side survey in which some respondents are asked for full details of in-home activities while others are asked only for abbreviated data on working at home and everything else.

When information is collected on in-home activities, there are great inconsistencies in the level of detail of information that is obtained. For example, the Oregon and Southwest Washington household travel survey, which attempted to collect detailed in-home data, set a minimum time of 30 minutes for an activity to be reported in detail. Another strategy, used in the Baton Rouge Area Household Time-Use Survey (Stopher and Wilmot, 2001), was to instruct people to use "Other at Home" to designate any personal and intimate activities that they do not wish to report on in detail.

While both of these approaches are valid, there appears to be significant potential for consistency in this area. It is recommended that recent activity surveys be examined to evaluate the different options that have been used to collect in-home activities (e.g., time limit in Portland and the minimal description of in-home activities in Dallas-Fort Worth). The usefulness of the activity data that resulted from these alternative procedures should be evaluated before any standardized procedures are suggested. It may also be useful to examine recent surveys for additional evidence as to whether requests for this detail appear to have had impacts on response rates. The literature on time use (Robinson, 1977 and 1991; Robinson and Godbey, 1997) should also be helpful in this regard because this is presumably an issue that has been faced and dealt with in time-use surveys in sociology and psychology.

4.2.7 I-4: Ordering of Questions

The ordering of questions can be crucial in obtaining good responses in a survey.
Although little empirical research has been done on the ordering of questions, there are a few basic principles that are considered good practice in most survey settings. Sensitive questions (income, race, etc.) are generally placed as near to the end of the survey as practicable to minimize the potential of non-response. "Fun" questions, particularly those that ask respondents for their opinion on a certain issue or satisfaction with a service, should be asked as early as possible to make respondents feel as though their input and participation are valued. It is also considered good practice to ensure that questions follow a logical and appealing sequence that helps respondents understand what is being sought from them. For example, in asking about travel or activity details, one should begin with the starting time of the travel or activity, then the location of the activity or the means of travel used, and so proceed through a logical sequence of details. Sequencing is also important for questions on occupation and working at home. There are many occupations (retail clerk, air-traffic controller, sanitation worker, etc.) that do not permit working at home. Therefore, care needs to be taken not to ask a question about working at home following a question on occupation. These ordering procedures are valid for all types of surveys because, even in self-administered surveys, respondents generally work through the survey from the beginning and answer as they go, much as they would under any other survey method.

Although these issues are generally well understood in the transportation planning community, very little research has ever been done to investigate the extent to which ordering of questions appears to be correlated with non-response. It was intended that such research would be conducted in this project using a collection of survey instruments dating back to the 1960s. However, the scale of this task became very large, and there was insufficient time to conduct a thorough investigation of the area.

It is recommended that future research be focused on meeting two main objectives. First, research should determine what aspects of question ordering are important to the creation of respondent-friendly surveys and what question ordering seems to be most beneficial to response.
Second, where applicable, a practical list of "do's and don'ts" should be developed on the ordering of questions which can be observed by practitioners during the survey design process. Standardized procedures should suggest an order for certain blocks of questions within a survey (e.g., those relating to recruitment, travel/activity recall) and should provide guidance on what questions should be considered as part of each group, for example, the household information, vehicle information, person information, and travel/activity information. It is also recommended that work be done to develop some alternative orderings of sensitive questions and to include these within some comparative pilot tests. Future research should also consider the possibility that some questions should be asked more than once and in different ways, such as asking income in both recruitment and retrieval calls in a CATI survey, and asking one time with categories and one time with a more than/less than question format.

4.2.8 I-6: Instrument Design

Developing consistency in instrument design is not a trivial task, and it was known from the outset of this project that there would probably not be enough time for sufficient research on this item. The potential for variations in instrument design is unlimited. There are many different formats that can be adopted (booklet, leaflet, two-sided card, etc.) and the length of the instruments themselves can vary, depending on the level of information sought. Tests to date of different formats in this respect have been inconclusive, and it seems likely that rather extensive further tests will be needed to provide any type of conclusive results on this issue.

In addition to considerations about the physical form of the instruments themselves, there is also the issue of what fonts should and should not be used.
Hundreds, if not thousands, of differ- ent fonts are available in modern word processing programs, and there is limitless potential for other formatting features to be used for directing respondentsâbold, underline, italics, use of color, arrows, boxes, and other devices. One of the main difficulties in defining consistent designs relates to the fact that design is a relatively subjective process and relies heavily on personal prefer- ences. A design considered by one person as bad may be considered good by another. While it may be important to develop consistency in this area, recommendations should not be prescriptive about the way instruments should be designed because instrument design is an area that should Procedures and Measures for Further Research 55

see much innovation in the future. There is a diversity of opinion about some specific aspects of instrument design, which are very difficult to resolve without extensive research.

In the planning stages of this project, it was suggested that standardized procedures be developed around three main areas. First, it was considered necessary to address some basics of design, such as typefaces and sizes; use of color, arrows, boxes, and other devices to direct respondents; the use of clip art; and the survey instrument length. It was intended that a primer document would be developed to provide some basic guidelines in survey instrument design, which would be supported by an example survey instrument developed in accordance with such guidelines. In our opinion, this idea has considerable merit and is worth pursuing in the future. It is important to note that any such work should incorporate the results of other tasks performed as part of this project such as minimum questions, consistent categories for answers, and consistent question wordings. Guidelines on ordering of questions, although not developed as part of this project, should be observed even if only in the form of the basic principles outlined in the previous section.

The second major issue that needs to be examined is whether printed surveys used in CATI or CAPI surveys should contain all questions that will be asked in the interview or only a subset, with the remaining questions being asked at the time of the interview. To test this, it would appear to be necessary to conduct some focus-group testing, together with a series of pilot tests of the two options, to see both what respondents prefer and whether there is any noticeable difference in the responses obtained.
Evidence from focus groups conducted for surveys in Dallas and Southern California suggests that respondents prefer not to have all questions in travel diaries and that omitting some of them might increase response rates. Further research is required to examine trade-offs in completeness of responses and response rates.

Finally, there is a need for some consistency to be developed in the design of instructions for respondents. Many past transportation surveys have included extensive written instructions, which a review of the survey results shows either were not read or at least were not understood and applied by respondents. It appears to be clear that people simply will not bother to read extensive instructions, and intuitively this suggests there is a need to move toward more graphic instructions, requiring fewer specific instructions to be read. It is recommended that a specific survey be developed to evaluate the impact of different types of instructions on responses.

4.2.9 I-7: Multitasking of Activities

All survey instruments in transportation continue to ask questions as though people only undertake a single activity at a time. It is very apparent that people perform various multitasked activities throughout the day. These include such activities as driving and talking on a cellular phone; eating and watching TV; traveling on public transit and performing work activities such as reading, reviewing, using a laptop, etc. By asking questions on a single activity only, much information is missing from typical surveys, and purposes are probably misstated by this simplification.

This item was not considered in any detail in this project and it is suggested that recent and current travel surveys and the literature on time use (Robinson, 1977 and 1991; Robinson and Godbey, 1997) be consulted to determine whether it is possible and reasonable to define a standard question format for obtaining information on multitasking of activities.
If such standards are pursued, it will be necessary to undertake field testing and possibly focus-group testing of the question structure and wording and to investigate its overall effect on instrument design and complexity. From the viewpoint of the blurring of work and other activities, the increasing ability of people to multitask as a result of technological advances, and the potential impacts of these on daily travel and activity patterns, this would appear to be an important area for further research and the development of standardized procedures.

4.2.10 S-1: Sample Sizes

Sample size is probably the single most controversial item in household travel surveys and one on which there is virtually no agreement, as evidenced by samples ranging from a few hundred to as many as 20,000 households. Even though there have been a number of documents providing guidance on sampling (TMIP, 1996b; Smith, 1979; Stopher, 1982), there seems to be either ignorance of the existence of these documents or a failure to accept the guidance they offer. It was hoped to develop minimum sample sizes, based on the purpose of the personal travel survey (model estimation, model updating, regional description, and policy testing and formulation), that would be different from previous guidance, which either offered formulas for calculating minimum samples or provided some possible default values to use in sample-size calculations.

Procedures to develop appropriate sample sizes are not lacking either in the transportation field or in the survey sampling literature. Clearly, the fact that there is such a wide variation in chosen sample sizes for household travel surveys arises from at least two issues: (1) available budget and (2) political rather than statistical justification of a particular sample size. Costs for household travel surveys are large compared with any other planning activity. Many smaller MPOs will undertake a household travel survey because the staff feels it is essential, but the sample size will be dictated by available funds. This often leads to a decision to collect data with an inadequate sample because it is felt to be a better option to collect less than the optimal amount of data than to collect no data at all. Furthermore, even though an inadequate sample size may result in modeling problems, models will still be built with what data are available, and too rarely are problems with the models and their forecasts correctly attributed to lack of sufficient data in the first place.
It is very possible that no amount of effort in defining adequate or minimum sample sizes will ever completely change this situation.

Political issues may range from multiple jurisdictional contributions to the survey costs, resulting in pressures for the sample to be large enough for each contributing jurisdiction to obtain reliable results, to a belief that neither politicians nor the public will accept that a statistically adequate sample will actually be sufficient for the purposes of the survey. An example of both of these issues arose in Southern California in 1990. A statistically adequate sample of the region would be in the range of 3,500 to 5,000 households. However, because funds were being derived from various counties and other jurisdictions in the region, it was essential that each of those jurisdictions receive sufficient sample to be able to conduct independent analyses and, in some cases, modeling. At the same time, it was felt that people in the region would not accept that adequate information could be provided for a region with a population of 12 million from a sample of 5,000 or fewer households. The end result was a decision to draw a political sample of about 15,000 households rather than a statistical sample of 3,500 to 5,000 households.

Notwithstanding that such situations will arise, it still seems reasonable to specify standardized procedures in sample design that are based on statistical requirements rather than unknown political requirements. To proceed with this task, it will be necessary to take into account the issues of stratification, error levels, and augment samples and develop simple guidance for sample size from this. Sample sizes should be examined from recent surveys, particularly those that have been used for model estimation, model updating, and policy testing and formulation, and a determination made of the adequacy of the sample for these purposes.
Again, we note here that the 15,000 household sample in Southern California turned out to be less than adequate for mode-choice modeling in that region because there were no augment samples and the decision on how to stratify the sample resulted in very few transit trips in the final data set: too few, in fact, to allow reliable mode-choice models to be built with the intended specifications.

One of the important issues to consider in setting the sample-size standards is to deviate from previous guidance and not tie the sample size to regional size, except in very broad terms. The

reason for this is that unless the finite population correction factor is large, which will rarely be the case in urban area surveys, the error levels of a sample will not be dependent on the regional population. The specifics of the sample size will be dependent, however, on the use to which the data will be put and the sample design (that is, stratification, clustering, or other sampling method).

4.2.11 S-2: Sizes of and Procedures for Surveying Augment Samples

Household travel surveys often require augmentation because of a lack of rare behaviors in the collected data and the problems of sampling to include them. Rare behaviors in the United States and other countries include transit riding and bicycling, among others. In most metropolitan areas in the United States, transit accounts for between about 0.5% and 5% of all trips. These low percentages may mean that small samples will contain too few transit trips to generalize to the entire population, and certainly too few for modeling mode choice.

Research is needed to determine when an augment sample is necessary. A review of various regional statistics and also past reported problems with insufficient observations on specific aspects of a household travel survey would help identify the types of situations where an augment sample would be required and how such data could be used. There is also a need for guidance on the size of the augment samples. Because augment samples are generally collected for modeling purposes, there is usually a focus on collecting data on specific rare mode choices for estimating mode-choice models. In light of this, guidelines would probably need to be based on sample sizes required for reliable estimation of current mode-choice models. It would be important to consider that the sample needed must support segmentation by trip purpose, at least into home-based work, home-based non-work, and non-home-based.
It is therefore suggested that research examine the split of purposes within such trips as transit, bicycle, and walk and develop recommendations on sample sizes from this.

For example, it has been suggested in the past that at least 300 observations are needed on each mode to obtain reasonable estimates of mode-choice model parameters in a logit model. Assume that models of the three trip purposes mentioned above are to be estimated and that approximately 60% of transit trips are home-based work, 25% are home-based non-work, and 15% are non-home-based. In this case, the binding requirement is 300 observations in the non-home-based category, the smallest of the three, which means that 2,000 transit trips (300 / 0.15) must be measured in total. If the average rate of transit trip making by transit-riding households is 4 transit trips per day, this translates to a requirement for a total sample of 500 transit-riding households (2,000 / 4). If it is further assumed that the general household sample will produce about 50 transit-riding households, then the augment sample would need to be 450 transit-riding households. This provides an example of how the guidance would be developed for augment samples.

4.2.12 S-3: Collecting Augment Samples

In addition to the sample sizes and procedures for surveying augment samples, there is also a need to examine how data should be collected on the augment sample. For example, a number of past household surveys have used an on-board bus survey to augment the sample for transit trips. However, the nature of the on-board survey is usually significantly different from the nature of the household travel survey. In the event that such a mechanism is to be used, there are certain requirements that need to be spelled out for the on-board transit survey. Similar issues would apply if special surveys are conducted with other subgroups of the population on a choice-based or other sampling basis.
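The augment-sample arithmetic worked through in Section 4.2.11 can be restated as a short calculation. The following is a minimal sketch, not a procedure proposed in this report: the function names are hypothetical, the inputs are the illustrative figures from the example (300 observations per model, a 60/25/15 purpose split, 4 transit trips per household per day, and about 50 transit-riding households in the general sample), and the purpose shares are given as whole percentages so that integer arithmetic can be used.

```python
def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-numerator // denominator)


def augment_households(min_obs: int, purpose_share_pct: dict,
                       trips_per_household: int,
                       households_in_main_sample: int):
    """Return (total trips, households needed, augment households).

    Hypothetical helper: it assumes each trip purpose gets its own model
    and that every model needs at least `min_obs` observations.
    """
    # The purpose with the smallest share of trips is the binding one.
    smallest_pct = min(purpose_share_pct.values())
    total_trips = ceil_div(min_obs * 100, smallest_pct)
    households_needed = ceil_div(total_trips, trips_per_household)
    # Households already expected in the general sample reduce the augment.
    augment = max(0, households_needed - households_in_main_sample)
    return total_trips, households_needed, augment


# Illustrative figures taken from the worked example in Section 4.2.11.
shares = {"home-based work": 60, "home-based non-work": 25, "non-home-based": 15}
print(augment_households(300, shares, 4, 50))  # (2000, 500, 450)
```

Changing any single assumption, such as the share of non-home-based trips, shows how sensitive the augment requirement is to the rarest trip purpose.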

There are generally five approaches one can take to the problem of under-representation of rare behaviors in a random household survey. The first is to over-sample in certain sub-areas of the region. The second and similar approach is to target a portion of the sample into such areas as would have been over-sampled. A third approach is to use a secondary sampling procedure, such as an intercept survey, to find transit riders (or other relevant rare behaviors), and to obtain telephone numbers for the households of those encountered in the intercept survey. The fourth approach is to organize an independent survey, such as an on-board transit survey, and obtain the augment sample from this source. The fifth approach is to stratify the population into the groups of interest, and then use screening to fill the samples for each stratum.

While benefits and problems associated with each of these methods are generally well understood, there is a need to review recent practice and productivity of the different methods of augmentation. There is a need to look at other possible ways to augment household and personal travel surveys for rare travel behaviors and specific rare socio-demographic characteristics. Future research would need to examine the costs of the different approaches and determine some measure of cost-effectiveness for them.

4.2.13 S-4: Stratification Options for Samples

Although the usual aim of stratification in household and personal travel surveys is to ensure coverage of household characteristics, it will generally have the effect of reducing the sampling error. This aspect of stratification has been largely ignored in travel survey sample designs.
While the aim of stratification is to ensure that the sample contains households in specific geographic subdivisions of the region and that each household size and vehicle ownership combination of significance is represented in the final sample, there does not appear to have been any investigation of the effects of this on model estimation.

It would appear to be useful and valuable to provide guidance on the stratification designs. First, as far as the literature reveals, little attention has been paid to the effects of stratification on the error properties of modeling steps beyond trip generation. Second, it has not been established that stratifying on the variables of trip generation necessarily produces more efficient samples and samples with desirable error properties. Third, there has been little or no investigation of whether there may be good alternative stratification schemes that can be used. Fourth, there is little guidance on what sample sizes to choose for each cell of the stratification matrix. In the absence of information on the variances in trip rates for each cell, there is no guidance on whether choosing equal samples in each cell is appropriate or whether there is some other possible method of determining an appropriate sample size for each cell. Finally, the relative advantages and disadvantages of stratified sampling versus simple random sampling have not been investigated for household travel surveys. Stratified sampling carries a cost: because households generally cannot be identified in advance by the stratum to which each belongs, it usually requires contacting households to determine membership in a stratum and then qualifying or disqualifying the household on the basis of the required sample in a cell. The costs of this method over increasing the sample size for a simple random sample are not known for household travel surveys.
Standardized procedures are probably not appropriate in this area. Rather, what appears to be needed is guidance. As discussed above, it is recommended that future research examine the impacts of stratification by the variables of trip generation modeling on both subsequent modeling steps and on the sample properties. Recent surveys should be reviewed to determine whether other stratification schemes have been used and to determine what effect these have had on sample properties. Research should develop guidance as to how to choose sampling strategies and how to choose the sample sizes in the cells of a stratification matrix.
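The open question raised above, whether equal samples in each cell are appropriate when cell variances differ, can be illustrated with a small calculation. The sketch below is an assumption-laden illustration, not guidance from this project: it uses Neyman (optimum) allocation from the general sampling literature as one possible alternative to equal allocation, ignores finite population corrections and screening costs, and relies on invented population shares and trip-rate standard deviations for a hypothetical four-cell stratification.

```python
def stratified_variance(weights, std_devs, allocation):
    """Variance of a stratified mean: sum of (W_h * S_h)^2 / n_h.

    Finite population corrections are ignored in this sketch.
    """
    return sum((w * s) ** 2 / n
               for w, s, n in zip(weights, std_devs, allocation))


def equal_allocation(total_n, weights):
    """Same sample size in every cell of the stratification matrix."""
    return [total_n / len(weights)] * len(weights)


def neyman_allocation(total_n, weights, std_devs):
    """Cell sample sizes proportional to W_h * S_h (Neyman allocation)."""
    products = [w * s for w, s in zip(weights, std_devs)]
    total = sum(products)
    return [total_n * p / total for p in products]


# Hypothetical inputs: population share and trip-rate standard deviation
# for four household-size/vehicle-ownership cells. Not survey data.
weights = [0.10, 0.25, 0.40, 0.25]
std_devs = [3.0, 5.0, 7.0, 9.0]
total_n = 2000

for name, alloc in [("equal", equal_allocation(total_n, weights)),
                    ("neyman", neyman_allocation(total_n, weights, std_devs))]:
    variance = stratified_variance(weights, std_devs, alloc)
    print(name, [round(n) for n in alloc], round(variance, 5))
```

Fractional cell sizes are left unrounded in the variance calculation for simplicity. The default variances discussed in Section 4.2.15 would be one possible source for the standard deviations that such a calculation requires.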

4.2.14 S-5: Specification of Sampling Error Requirements

Frequently, RFPs specify that the required sample must provide no more than, say, ±10% error with 95% confidence in something such as a trip rate. Generally, this appears to be specified with little understanding of what it means. It would be reasonable to question whether 10% error is acceptable compared with 5% error and whether the confidence level should be set to 90% or 95%. Also, the error is almost always specified for trip rates, while the data will be used for much more than trip-rate estimation. The implications of a particular error level for trip rates on estimation of such elements as mode choice or network volumes are largely unknown.

The first issue that needs to be addressed is to determine an appropriate specification of the level of error and confidence level to be used in designing samples. This issue could be researched by using existing data sets. In this case, variations in data caused by differences in survey protocols, firms, survey instruments, etc., would be irrelevant. It may be most useful to present graphs showing the effects of changing each of the error levels and the confidence level so that the implications of each can be seen. This should, ideally, be done using actual computations of sampling error from recent surveys. The implications of the level of error can be investigated by examining simple trip-production models and showing the implications in terms of ability to distinguish statistically between the trip rates for different population sub-groups.

The second issue is to determine the implications for other variables that may be estimated from the data of setting an error level on trip rates.
To do this, one would need to select certain other variables of interest (the proportions of trips by mode and purpose, the average trip length by purpose, trip rates by purpose, average household size, average vehicle ownership, etc.) and estimate the sampling errors on these attributes. These would need to be related to the sampling errors for the overall trip rates to show how the sampling errors on the other attributes relate to the trip-rate sampling error. If there is insufficient variability in the overall error of trip rates, it may be necessary to sub-sample from some existing surveys since the sub-samples will have much larger sampling error for all characteristics.

The third issue is to investigate the potential to use other attributes, such as mode shares, for the design sampling error. Existing data sets could be used to determine the error properties of such attributes as mode shares and possibly other attributes like average trip lengths by purpose. If the attribute on which the sampling error is specified is changed, then a different type of sampling will be required to achieve the desired sample. This would require investigation of what would be required and how it could be attained. Existing data sets could be used for this: for example, a secondary data source like Public Use Microdata Sample (PUMS) could be sampled to replicate the procedure that would need to be used.

4.2.15 S-6: Development of Default Variances

Estimation of error requires an estimate of the variance of crucial variables. One of the issues that has made sampling strategies relatively simplistic in household travel surveys is the lack of information on variances for those variables that are normally considered crucial in transportation planning analysis. This has implications on all aspects of sampling because the error levels are determined by the variance; hence, sample size and stratification procedures are also determined by the variance.
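The dependence of sample size on variance, error level, and confidence level described above can be made concrete with the standard simple-random-sampling approximation n = (z × cv / e)², where cv is the coefficient of variation and e is the relative error. The sketch below is illustrative only: the function name is hypothetical, the coefficient of variation of 1.0 is an assumed placeholder rather than a measured value, and design effects from stratification or clustering are ignored. An optional finite population correction is included to illustrate the remark in Section 4.2.10 that error levels depend very little on regional population.

```python
import math
from statistics import NormalDist


def required_sample_size(relative_error, confidence, cv, population=None):
    """Households needed to estimate a mean within +/- relative_error.

    Simple random sampling approximation; `cv` is the coefficient of
    variation (standard deviation / mean) of the variable of interest.
    """
    # Two-sided normal critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    n0 = (z * cv / relative_error) ** 2
    if population is not None:
        # Finite population correction; negligible unless n0 is a
        # sizable fraction of the population.
        n0 = n0 / (1.0 + (n0 - 1.0) / population)
    return math.ceil(n0)


ASSUMED_CV = 1.0  # hypothetical placeholder, not a default variance

for error in (0.10, 0.05):
    for confidence in (0.90, 0.95):
        n = required_sample_size(error, confidence, ASSUMED_CV)
        print(f"error {error:.0%}, confidence {confidence:.0%}: n = {n}")

# Same specification for regions of very different size (households).
for households in (50_000, 4_000_000):
    n = required_sample_size(0.10, 0.95, ASSUMED_CV, population=households)
    print(f"population {households}: n = {n}")
```

Halving the error level roughly quadruples the required sample, whereas moving from 90% to 95% confidence has a smaller effect, which is the kind of trade-off the proposed graphs in Section 4.2.14 would display.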
In the absence of information about the variance, survey designers either assume constant variances across all strata in a sampling scheme or make some other working assumption that will allow sample size calculations to be made.

Default variances could be used to determine appropriate sample sizes and other issues in the absence of actual local data on these values. They could also be used subsequently to assist in

assessing the quality of any given survey by comparing the variance measured in a specific survey to the default value for each attribute of interest. Variances either much smaller or much larger might indicate potential problems in the survey.

It is recommended that research on this issue be undertaken in conjunction with work on stratification options (see Section 4.2.13) and the specification of sampling error requirements (see Section 4.2.14). It is suggested that variances be estimated for a variety of relevant variables and from as many different data sets as possible. These could include trip rates by purpose and overall per person and per household, mode shares by purpose, and average trip lengths by purpose. Recommendations should suggest a mean or median variance that could be used as a default for sample design for each appropriate variable. The implications of using default variances for setting sample sizes would need to be checked by comparing them with the results of using actual variances for several recent surveys. In the absence of any local information, these variances could be used to estimate stratification, sampling rates, and sampling errors. Perhaps of even greater use would be to determine default values of coefficients of variation (cv) that could be used in determining sample size because sample size and error computation also require knowledge of the mean.

4.2.16 P-1: Focus Groups

The transportation profession has only recently begun to understand and appreciate the potential of focus groups. These have been a mainstay of the marketing profession for quite some time and have enormous applicability to various aspects of survey design. While some personal travel surveys are conducted by marketing firms that may be familiar with focus groups, many surveys are conducted by transportation engineering and planning firms who are not familiar with them.
Only a small minority of transportation surveys have used focus groups to help with the design of the survey; yet, this is a powerful mechanism to improve the design and quality of a survey. In the design process, one or more focus groups can provide important information in an effective manner and may be much more cost-effective than a number of pretests. While extremely useful, focus groups can probably be considered as good or better practice rather than basic practice in transportation surveys.

Guidelines or a primer on how focus groups could be set up and used in household and personal travel surveys would appear to be very useful. Among the issues that need to be addressed are the following:

• How many focus groups are needed?
• What is the optimum size of a focus group?
• How should focus groups be used to test a travel survey?
• How can a focus group be recruited?
• How much is it necessary or desirable to pay focus group members to participate?
• Should focus group members receive survey materials prior to meeting?
• How should a meeting location and time be arranged?
• What qualifications are needed to facilitate a focus group?
• Should focus group discussions be recorded?
• What benefits arise from using focus groups?
• How is a focus group conducted?

Literature from marketing and other areas should be consulted to prepare responses to these and other important questions and could help determine the extent to which focus groups are subject to standardized procedures. If possible, it may be useful to field test a small focus group to provide additional information for any proposed standards or guidelines.

4.2.17 P-5: Reporting of Pretests and Pilot Surveys

From a review of previous surveys, it appears that there is no consistency in reporting whether a pretest or pilot survey was performed. This would lead one to suppose that pretests or pilot surveys have not been conducted. There should be a standardized procedure here that the final report of a survey should document whether a pilot survey or any pretests were conducted. If none was conducted, there should also be a clear statement as to the reason why this was the case.

The other major issue relates to what should be reported from a pretest or pilot survey: for example, details on how the sampling was done, sample sizes determined, elements tested and results of the tests, and any specific statistical tests of significance that were performed. There is a need for minimum reporting standards to be developed here. It is suggested that reports on recent surveys be reviewed to determine what has been documented in the past. Some of the items to be considered here should be

• Sample sizes and methods of drawing the samples for any pretests and pilot surveys;
• Nature of the design that was tested;
• Results of the tests, including response rate(s) and other measures of quality; and
• Conclusions drawn from any pretests and pilot surveys and changes implemented as a result of the pretests or pilot surveys.

The documentation should include any statistical test performed to establish whether to make changes to the final survey, and anecdotal information should also be included that may have led to changes in the design of the survey and its protocols. For example, problems encountered by interviewers in using the scripts provided and questions raised by prospective respondents are all appropriate items to be included in the documentation. A report outline should be developed as the means to convey the standard for documentation of any pretests and pilot surveys conducted.
4.2.18 Q-4: Sampling Error

Sampling error not only is a part of the specification of the required sample size and an input to the design of the sample, but also is an important measure of the quality of the resulting survey. Sampling error of individual variable estimates is measured by the Standard Error of the Estimate (SEE). However, the magnitude of the measure is affected by the units of measurement of the variable under consideration, making interpretation of the value and comparison of values among data sets difficult. To eliminate this effect, the coefficient of variation (SEE divided by the estimate) provides a dimensionless measure of variation of the estimate about the mean and allows meaningful comparison among data sets. However, this does not alter the fact that sampling errors need to be calculated separately for each variable in question.

Given the difficulties that survey planners have in communicating information about sample-size calculations to clients (Richardson et al., 1995), one would ideally like to obtain one measure of sampling error for a data set as a whole, which could be derived from an average or weighted average value calculated for a number of key variables. Unfortunately, it was not possible to devise such a measure in this project. A practical approach for assessing overall survey quality would be to use the highest sampling error obtained among a list of key variables. This idea is consistent with the idea of "total design" promoted by Dillman (1978), which suggests that the quality of a process is only as good as the weakest link in the process.

It is recommended that research be conducted to determine the most appropriate variables for a combined measure of sampling error.
These may include activity rates per person and household; trip rates by purpose per person and household; mode shares by trip purpose; and selected household and person attributes such as vehicle ownership, household size, driver's license status, etc. Two specific approaches could be taken to determine such variables. One approach could involve

selecting a set of key variables from among the core variables required in any survey. An alternative approach could be to identify those variables most relevant to the purpose, or purposes, of the survey and to measure the sampling error on each variable (TMIP, 1996a). Regardless of the approach taken, the determination of the key variables should be related back also to the minimum specification of questions already developed as part of this project (see Section 2.1.1). To illustrate the effects of the standardized procedure and its interpretation, it is recommended that sampling errors be calculated for two or three recent surveys on the key variables specified.

4.3 Other Research Directions

In this section, we outline briefly ideas that surfaced during the execution of this research. A number of these have been partially researched in this project, but further work is seen as being warranted to complete what has been started and to develop standardized procedures or consistent guidelines.

4.3.1 Cell Phones

Cell phone usage has grown at a phenomenal rate over the past decade and has profound implications for the way in which surveys are conducted. In 2003, cell phones made up about 43% of all U.S. phones, which represented an increase of 37% since the year 2000 (USA Today, 2003). In addition to this, many households are now moving away from landline phones and using cell phones exclusively. In June 2003, the Federal Communications Commission reported that in the period since the year 2000, landline phones decreased by more than 5 million, or around 3% (USA Today, 2003). The majority of cell phones are unlisted, which means that it will become increasingly difficult to contact large sections of the population through RDD.

In our opinion, research on the use of cell phones should be focused on two key areas. First, there is a need to determine the effects that growing cell phone use will have on household travel surveys.
Specifically, more needs to be known about members of the population who are moving toward exclusive use of cell phones. It is likely that such information could be obtained from a federal government agency such as the Federal Communications Commission or a communications industry group such as the Cellular Telecommunications and Internet Association. Second, once it is known which segments of the population will become increasingly difficult to contact, alternative strategies will need to be developed to find new ways of reaching such groups.

Although the increased take up of cell phones may create difficulties in reaching certain sections of the population, it is worth noting they may also create some new opportunities. Once initial contact has been made and a person has agreed to take part in a survey, they may actually be easier to contact (e.g., for recall interviews) than would have previously been the case. If a personalized interview technique were adopted, such as the Brög method, a relationship could even be established whereby the interviewer would deal with their contacts as "clients," who would be free to call their interviewer or "agent" whenever they felt it necessary. It is recommended that these kinds of opportunities be explored as part of any research conducted on the impacts of increasing cell phone use among the population.

One of the problems with using cell phones is that calls received incur the same cost as a call placed and, therefore, the use of cell phones in a survey would impose a cost on survey respondents, which is generally considered a violation of ethical standards for surveys. In the past, it has been relatively easy to exclude cell phones because certain blocks of numbers were reserved by telephone companies for allocation to cell phones. However, this is eroding as number portability allows people to shift their landline number to a cell phone.
One possibility is that this trend will further damage the potential of using the telephone as a means of recruiting respondents and retrieving survey information. It is certainly beyond the scope of this present report to examine the potential for using or not using cell phones in the future and to recommend changes to ethics standards that would permit the use of such phones.

4.3.2 Incentives

Standardized procedures for the type of incentives to be used have been described in Section 2.2.8. However, it is unknown how different types of survey methodologies would affect the reception of the cash incentive. For example, it is unknown how a $10 cash incentive would be received among those who respond to a CATI survey versus those who respond to a face-to-face interview. This issue may become greater for survey practitioners wanting to use multi-modal surveys: what level of incentive is likely to reduce non-response across the different survey modes? This needs to be investigated before any standardized procedures or guidelines can be suggested.

Further, as noted in Section 5.8 of the Technical Appendix, there has been no comprehensive test of the effect of incentives. It is not known how much of an increase in response rate can be obtained with incentives of different sizes, nor what biases may result from their use. Such research is warranted. To determine the effect of incentives, it would be necessary to undertake a survey in at least two locations in which varying incentives were offered (including no incentive) in a random pattern and in such a way that comparisons could be made on the response rates and on who responds with and without an incentive. In addition, a non-response survey could be conducted in which the survey is offered again to respondents who refused or terminated on the first occasion, but with either an incentive where none was offered before, or a larger incentive where a small one was offered before.

It may even be worth exploring incentives from a completely different angle.
Instead of attempting to establish an "invisible" sense of reciprocation through an obligation-free incentive, one could go a step further and enter a formal agreement that establishes an explicit connection between the reward being offered and the tasks required of the respondent. This could possibly develop a greater sense of reciprocation, which would move the role of the respondent away from that of a "donor" to something more closely resembling an "employee." It is recommended that research be done to evaluate the impact of such an approach on the recruitment process, as well as on response and completion rates. Offering a more substantive gift such as a football ticket, manicure, etc., may appear exorbitantly expensive on one hand, but the additional monetary costs may be justifiable if they result in significant improvements in the quality of data or if the survey itself runs more quickly and smoothly. It may not even be necessary to offer a large incentive. There is some evidence to suggest that response rates improve simply through the act of having respondents sign a document to say they will complete the survey.

4.3.3 Personalized Interview Techniques

In this project, it was not possible to explore personalized interview techniques and the impacts they have on response rates, completion rates, and the quality of data. The best-known alternative approach to interviewing respondents is the "Brög technique." This approach differs from conventional interviewing techniques in that it stresses the importance of trust between the interviewer and the respondent. Instead of being contacted by several interviewers through the course of a survey, the respondent is given the name and phone number of a specific member of the interviewing staff who will serve as a "motivator" (Brög, 2000).
Respondents are given the freedom to communicate using their own terms rather than those specified in a questionnaire, and some flexibility is permitted in the interview while coverage of essential topics is maintained. In general, the survey is made to be respondent-friendly even if that means that it is not necessarily interviewer-friendly. Personalized interviewing techniques are also becoming increasingly popular through travel behavior modification programs such as TravelSmart® and Travel Blending®.

As part of NCHRP Project 8-37, Westat undertook a pilot study of a modified version of the Brög interviewing technique on a sub-sample of around 100 households participating in the 2002 wave of the Metropolitan Washington DC Council of Governments Longitudinal Household Travel Survey (COG LHTS) (Freedman and Machado, 2003) (see Section 5.1 of the Technical Appendix). In this CATI survey, a three-person team of interviewers was assigned to each household throughout its participation period. This approach was adopted to establish a high level of rapport between interviewers and participants and to create a situation where respondents would feel comfortable calling interviewers at any time during the daily interview hours (Freedman and Machado, 2003). Although it was found that the procedures adopted in the study showed promise, operational problems made it difficult to draw any firm conclusions regarding the effectiveness of the method (Freedman and Machado, 2003).

It is recommended that more work be done to evaluate the effectiveness of the Brög method. The test undertaken by Westat, while useful, was limited by constraints imposed by the COG LHTS of which it was a part. It is suggested that in future work, the method be tested in a stand-alone survey.

4.3.4 Geocoding Methods

A number of general standards relating to geocoding were recommended in this project (see Sections 8.1 and 8.2 of the Technical Appendix). However, there is further work that can be done. The success of geocoding data depends on three issues: the quality of reference data (address information stored in GIS); the quality of target data (addresses reported by respondents); and the method adopted to match addresses.
The limitations of reference data have been well documented (Greaves, 1998 and 2003), as have the problems that respondents have in accurately reporting addresses (Stopher and Metcalf, 1996). However, very little work has been done to evaluate the effectiveness of different techniques that can be used for dealing with partial matches (e.g., criteria relaxation and scoring-based systems). While Drummond (1995) provided a general overview of geocoding techniques, it is largely unknown which approach produces the best results. Also, decisions about what Soundex score should be accepted or the extent to which matching criteria should be relaxed are generally very subjective. In future research, it is recommended that geocoding be performed on a number of common data sets using a variety of different GIS packages.

In addition, a more thorough evaluation could be conducted of systems capable of geocoding in real time. In this project, it was not possible to do any meaningful analysis of the costs and benefits associated with real-time geocoding. Anecdotal evidence suggests that significant improvements can be made when reported addresses can be instantaneously validated and cross-checked during the interview process through specialized CATI systems that incorporate address gazetteers (for schools, shopping malls, and other commonly visited locations). Although such systems have now been used in a substantial number of surveys, it is hard to quantify the benefits of the technology because of the difficulties in comparing different types of surveys and different CATI systems. However, with a more detailed review of these surveys, it would be possible to at least determine what types of addresses can be included in online gazetteers.

4.3.5 Impacts of the National Do Not Call Registry

The National Do Not Call Registry was set up to protect households from being bombarded with telemarketing calls.
It would be useful to know whether this has had a positive impact on the recruitment rates for household travel surveys. If so, then survey firms would need to draw smaller samples than in the past, and this would represent a cost saving in terms of the number of households that would need to be called and also in relation to the number of pre-notification letters that would need to be mailed out. However, it would also be useful to know the characteristics of households that respond positively to recruitment calls after subscribing to the registry and whether their characteristics differ from the characteristics of households that respond negatively to survey recruitment calls. This will give an understanding of the non-response bias, which is important to account for in household travel survey results. One possibility is to determine whether a list of households subscribed to the registry can be obtained and then to compare response rates, characteristics, etc., among households recruited that are on the registry and those that are not.

4.3.6 Initial Contacts

Initial contacts are discussed in Section 2.2.7 of this report and Section 5.7 of the Technical Appendix. However, due to limited information, standardized procedures and guidelines could not be suggested. Thus, further research is required that investigates the phrasing of recruitment scripts and other contact materials to enable the development of a suggested consistent approach for the wording of such materials. This will also depend on the nature of the survey and client requirements.

Again, the method that would be preferred is to test several different alternatives in a side-by-side comparison in actual surveys in more than one location. The goal would be to compare refusal and termination rates according to the alternative methods of initial contact, including the effects of pre-notification letters, and alternative ways of phrasing the opening of the recruitment script.

4.3.7 Refusal and Non-Contact Conversions

It has been well documented that response rates have been declining and that it is becoming increasingly difficult to get households and individuals to agree to participate in travel surveys.
Among other things, this may be attributed to increasingly lengthy and complex surveys (increased respondent burden) and to more physical barriers inhibiting contact with prospective participants, such as call-screening devices (telephone surveys) and gated communities (face-to-face surveys) (Kalfs and van Evert, 2003; Kam and Morris, 1999; Melevin et al., 1998; Oldendick and Link, 1999; Vogt and Stewart, 2001). Also, increasing numbers of marketing surveys have led people to perceive increased respondent burden; therefore, these individuals no longer even consider participating (Black and Safir, 2000; Kalfs and van Evert, 2003).

There are two broad categories for unit non-response: refusals (hard refusals, soft refusals, and terminations) and non-contacts (busy, no reply, and answering machines). Unit non-response becomes problematic if the responses of refusers and non-contacts differ significantly from the responses of contacts because this will add to non-response bias (Zmud, 2003). For example, it has been found that younger households and households with higher incomes require more calls to complete an interview due to telephone-screening devices. These households also tend to have higher refusal rates (Zmud, 2003). Evidence suggests that non-contacts lead active lifestyles and are highly mobile. In terms of travel surveys, absence of data from these households results in an under-estimation of trip rates. In addition, potential refusers possess different demographic characteristics than non-contacts. Higher refusal rates have been found among the elderly and persons with less education (Kurth et al., 2001).

As part of this project, research was undertaken to gain some insight into demographic and travel characteristics of non-respondents, why they do not respond, and whether there are any particular elements in survey design and execution that would appeal to non-respondents.
Analysis of a call-history file confirmed that households that required fewer call attempts to establish contact and yield a complete response differed, both in terms of mobility and socio-demographics, from households that were more difficult to contact. Although this research was able to confirm characteristics of non-respondents found in other work, it was not possible to draw any definitive conclusions about how many refusals/non-contacts should be converted for every call attempt to reduce the overall incidence of bias in the data set. It is recommended that this issue be examined in greater depth in the future. It is suggested that multiple call-history files be analyzed as part of any future research effort. One of the main difficulties in comparing different call-history files is that disposition codes are inconsistently defined among travel surveys. In light of this, it is suggested that future analysis should use files from contemporary surveys that are able to adopt the definitions proposed in this project.

4.3.8 Effect of Interview Mode on Recruitment and Non-Response Rates

The effect of interview mode on recruitment and non-response rates is related to the section on personalized interview techniques, Section 4.3.3, except that the focus is different. In this case, the issue is whether different modes of survey will have different impacts on recruitment rates and on eventual non-response rates. The same experiment could probably be conducted for this as would be envisaged for Section 4.3.3. However, the difference in this case will be that the focus is on whether different interview modes used in recruitment are associated with significantly different recruitment rates and what effect the different modes have on actual completion rates for the survey.

4.3.9 Unknown Eligibility Rates

In defining standardized procedures for computing response rates, it was recommended that the estimated rate of eligibility for contacts whose eligibility remained unknown be left to the survey firm. However, better guidance would be preferred for this issue because it has a critical impact on the calculation of response rates.
Effectively, this requires the acquisition of a number of additional call-history files from which the eligibility rates at different points in the calling can be analyzed. Ideally, these files should be obtained from surveys that have used 10 or more calls as the limit for trying to recruit households so that it is possible to determine an eligibility rate for a 5-call limit from information obtained from calls made beyond the fifth attempt.

4.3.10 Data Archiving in Transportation

In this report, we have proposed standardized procedures for data archiving for household travel surveys (see Section 2.6.4). However, past transportation surveys have not been archived according to the standards. The research that is needed is to archive data, using the standardized procedures, and then test the usefulness and effectiveness of the archiving. This may then result in modifications to the proposed procedures.
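
As an illustration of the call-history analysis suggested in Section 4.3.9, the following Python sketch estimates an eligibility rate for numbers that would have been left unresolved under a 5-call limit, using outcomes learned on later attempts. It is a minimal sketch under stated assumptions, not a procedure proposed in this report: the record layout (`CallRecord`, `resolved_on_attempt`, `eligible`) and the function name are hypothetical, and the estimate assumes that numbers resolved on later attempts resemble those that are never resolved, which would itself need to be tested.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CallRecord:
    """One sampled number from a hypothetical call-history file."""
    number_id: str
    # Attempt on which eligibility became known; None if it never did.
    resolved_on_attempt: Optional[int]
    # True or False once resolved; None if never resolved.
    eligible: Optional[bool]


def eligibility_rate_beyond_limit(
    records: List[CallRecord], call_limit: int = 5
) -> Tuple[Optional[float], int]:
    """Estimate the eligibility rate among numbers unresolved at call_limit.

    Only numbers whose eligibility was resolved on a later attempt are used.
    Returns (rate, count of later-resolved numbers), or (None, 0) if no
    number was resolved after the limit.
    """
    later = [
        r for r in records
        if r.resolved_on_attempt is not None
        and r.resolved_on_attempt > call_limit
    ]
    if not later:
        return None, 0
    n_eligible = sum(1 for r in later if r.eligible)
    return n_eligible / len(later), len(later)


# Illustrative (made-up) records from a survey with a 10-call limit.
example = [
    CallRecord("A", 2, True),       # resolved within the 5-call limit
    CallRecord("B", 6, True),
    CallRecord("C", 7, False),
    CallRecord("D", 9, True),
    CallRecord("E", None, None),    # never resolved
    CallRecord("F", 8, False),
]
rate, n = eligibility_rate_beyond_limit(example, call_limit=5)
print(rate, n)  # 0.5 4
```

Numbers that are never resolved (record "E" above) are excluded from the estimate, so the result describes only those cases that additional calls were able to resolve.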


