Appendix A. Stated-Preference Survey Design

Stated-preference surveys have been the primary tool used for estimating the value of reliability (VOR) because they allow researchers to isolate the effect of unreliability on respondents' choices. In data collected from actual choices, unreliability is often strongly correlated with other variables of interest, such as travel time, which makes it difficult to isolate how much weight is given to unreliability. This appendix provides guidance and best practices for using stated-preference surveys to estimate the VOR. It does not provide a thorough review of theory and methods, which can be found in Louviere et al. (2000) and Hensher et al. (2005).

Table A-1 describes the stated-preference methods that have been used to study truck VOR. In this table, "heterogeneity" refers to the level of detail at which VOR estimates were reported, "sample size" refers to the number of completed surveys, "alternatives" refers to the type of choice questions asked, "attribute levels" refers to the range of the effect tested, "experiment design" refers to how the set of choice questions was constructed, and "number of choices" refers to the number of choice questions presented per survey. Making different decisions in each of these areas results in surveys that measure different phenomena and have different properties.

Table A-1. Overview of stated-preference surveys used for estimating VOR.

Year | Author(s) | Modes | Heterogeneity | Sample Size | Alternatives | Attribute Levels | Experiment Design | Number of Choices
1999 | Small et al. | Road | 4 commodity groups | 20 carriers | Within-mode experiment (road only) | 3 levels for each attribute | Full factorial removing dominant choices | 10 choice sets
2000 | Wigan et al. | Road | Intercity, urban, urban multiple stops | 43 shippers, 129 responses | Within-mode experiment (road only) | –0.2 | Fractional factorial | na
2000 | Kurri et al.
 | Road and Rail | Commodity type | 236 road shipments, 162 rail shipments | Two separate within-mode experiments (road, rail) | 4 levels for cost and reliability | Fractional factorial | 120 different choice sets; each respondent answers 12–15 questions
2003 | Bolis and Maggi | Road | Shipment weight, geography, just-in-time, distance, intermodality | 22 shippers | "Integrated approach" | Short-run attributes changed first, before long-run attributes were changed | Adaptive | 40 binary choices per firm
2006 | Fowkes and Whiteing | Road and Rail | 9 commodities, shipment type, time of day | 49 shippers | Across-mode experiment (road, rail) | Adaptive | Adaptive | 10 choices, 4 alternatives
Estimating the Value of Truck Travel Time Reliability

A.1 Participation

Identifying decision-makers to survey is significantly easier in passenger transportation than in freight transportation, because freight decisions result from the interactions of different companies. As can be seen in Table A-1, previous studies have surveyed logistics managers, freight forwarders, drivers, receivers, and others. Most surveys have targeted motor carriers, shippers that contract out transportation, or shippers that perform their own transportation. Each of these groups provides a different perspective and considers costs differently (de Jong 2014). Motor carriers will not know how unreliability affects supply chain (cargo-related) costs, and shippers that contract out transportation will not know how unreliability affects vehicles and drivers.

Bergantino and Bolis (2008) suggested that it might be advantageous to survey freight forwarders because they work for all types of shippers, involving different commodities and supply chain types. Jin and Shams (2016) agreed with this assessment and described favorable cooperation with forwarder associations as one of the most productive elements of their study.

Some studies have only considered firms of certain sizes. Witlox and Vandaele (2005) only surveyed firms with more than 20 employees, and Marcucci and Scaccia (2004) only surveyed firms with more than 40 employees. These studies hypothesized that smaller firms might have difficulty understanding the questions and answering them consistently. However, there is little evidence in the literature that this is the case. Regardless, these and other studies concluded that information about the sizes of firms should be collected.
Table A-1. (Continued).

Year | Author(s) | Modes | Heterogeneity | Sample Size | Alternatives | Attribute Levels | Experiment Design | Number of Choices
2008 | Beuthe and Bouffioux | Road, rail, water, and multimodal | Shipping distance, commodity value, type, and weight | 113 shippers | Within-mode alternative | 5 levels of 6 attributes relative to baseline (±10% and ±20%) | Fractional factorial design | 25 unlabeled alternatives
2010 | Halse et al. | Road and rail | Shipper vs. carrier | 640 shippers, 117 carriers | Within-mode experiment (road, rail) | 3 experiments, each with multiple levels of cost, travel time, and reliability | Randomized block design | 20 choice questions
2011 | Miao et al. | Road | Geography, transport provider | 111 truck drivers | Within-mode experiment (road only) | na | na | Two questions
2011 | Zamparini et al. | Road | Transport provider, value density of goods | 24 shippers | Within-mode experiment (road only) | na | na | na
2014 | Kawasaki et al. | Road | None | 48 shippers | Within-mode experiment (road only) | 6 levels of 3 attributes | Orthogonal | 18 questions, 2 alternatives
2014 | de Jong et al. | Road, rail, air, and water | Containerization, truck size, mode | 249 road shippers, 166 motor carriers, 397 other modes | Within-mode experiment (road, rail, air, water) | 3 experiments, each with 5 levels of cost (±15%), travel time (±10–15%), and reliability | Bradley design and orthogonal factorial | 3 experiments with 19 choice questions
2016 | Jin and Shams | Road | Carrier type, 4 commodity groups, perishability | 35 shippers, 108 carriers, seven 3PLs | Within-mode experiment (road only) | 5 levels of 3 attributes | Orthogonal | 25 questions, 3 alternatives

Note: na = not applicable.
Source: Shams et al. (2017) and authors' research.

It is unclear from the literature
whether firm size increases or decreases VOR. Some larger firms will have greater flexibility in responding to unreliability events, while others (such as those in the automobile manufacturing sector) operate very lean supply chains that value reliability highly.

Once firms have been identified, finding the right person within the organization to take the survey can be challenging. Ideally, participants should be decision-makers with close to full knowledge of the company's products, clients, and logistics strategies. Almost all previous studies have communicated this to potential respondents. Few if any studies have examined how VOT or VOR estimates vary depending on who took the survey and their role within their organization (Feo-Valero et al. 2011).

Stated-preference surveys are only useful if a large number of high-quality responses are received. Failing to achieve enough participation will inevitably lead to poor estimates. Most previous studies indicated that this was one of the main challenges they faced. Businesses are typically averse to sharing information about how they operate, especially with regard to costs, which is necessary for estimating VOR. Previous studies have nonetheless employed a variety of approaches to increase participation. Maier et al. (2002) achieved participation rates of 80 percent by contacting companies by phone and clearly explaining the purposes of the research. Johnston et al. (2017) listed the following as the strategies most often employed to increase participation: use of participation incentives, mailing advance letters, making multiple contact attempts, two-phase sampling, working to convert refusals, and lengthening the collection period of the survey. Some of these have been employed successfully in the VOR literature. Jin and Shams (2016) concluded that the following strategies were useful in their study:

• Avoid surveying in multiple stages.
Jin and Shams (2016) intended to have a two-stage survey, with the first stage asking background questions about the firms and their shipments so that questions in the second stage could be tailored (similar to the two-stage surveys commonly used in passenger travel demand studies). However, they found early on that almost none of the firms that had participated in the first stage were participating in the second stage. They ultimately decided to use a single-stage survey instead.
• Work with trade associations and stakeholder groups. The majority of participants in their study learned about the survey through trade associations that had agreed to publicize it. The research group also attended a conference organized by the local trucking association and was able to distribute the survey through its newsletter. The local metropolitan planning organization was also helpful and included a link to the survey in its publications.
• Seek out champions. The team concluded that finding champions within trade associations and public agencies was fundamental to elevating the profile of the study and mobilizing interest.
• Expect lower participation from shippers than from motor carriers. The team found that achieving participation from shippers was much harder than from carriers. This echoes the findings of other studies and indicates that a concerted effort is required to achieve high participation rates from shippers.
• Approach large national shippers. The research group received a warmer reception from national shippers than from local ones. National shippers could also generate a wealth of information by soliciting the participation of local warehouse managers. Additionally, many of these national shippers (e.g., Walmart) also do much of their own transportation and therefore will consider all of the costs associated with unreliability.

A.2 Survey Design Overview

The field of experimental design has received much attention in virtually all scientific disciplines.
Insights from this field are needed to construct the hypothetical choices in stated-preference surveys. However, as in other types of surveys in social science, it is impossible to design a perfect
stated-preference survey, because there are always multiple priorities that need to be balanced. Johnston et al. (2017) described this challenge as "develop[ing] designs that yield efficient and unbiased estimates of preference parameters [by] allowing for interactions (and perhaps other types of nonlinear-in-attributes utility functions), considering both statistical efficiency and respondents' cognitive abilities and attention budgets, employing constraints on implausible attribute levels and combinations, using designs that are robust to alternative model specifications, and considering how the levels chosen for each attribute influence design properties." Experimental survey design is fundamentally about managing these trade-offs to maximize the success of the study. This section does not discuss the nuances of how to design surveys, but instead focuses on the best practices that are most relevant for estimating the VOR.

One of the key objectives of survey design is achieving a high degree of statistical efficiency. Statistical efficiency describes the ability of a survey to provide information that is useful in estimating a particular model. The efficiency of a survey therefore depends on the type of model that will be estimated: a highly efficient survey for estimating willingness-to-pay trade-offs might differ from a highly efficient survey for modeling market shares. Different surveys work best for answering different types of questions. In addition, having a priori knowledge about what the answers might be allows more precise questions to be asked, thereby increasing the efficiency of the survey.

Most previous studies in this area have designed stated-preference surveys based on strict statistical axioms. As shown in Table A-2, there are four basic types of designs.
Table A-2. Statistical experiment designs. Source: Jin and Shams (2016).

Full factorial designs consider all combinations of questions that could be asked, resulting in long surveys that can be used to estimate virtually any relationship (main effects and all interaction effects). Fractional factorial designs reduce survey length by limiting the types of interactions considered. Orthogonal designs reduce the length even further by ignoring all interactions and estimating only main effects. Full factorial, fractional factorial, and orthogonal designs make no assumptions about the underlying relationships of the variables and therefore lead to surveys that search randomly through the space of possible answers. This is appealing for statistical reasons; however, it leads to surveys that include irrelevant questions and are longer than they need to be. The last category in Table A-2 involves statistically efficient designs that use algorithms to generate surveys that maximize the amount of information collected for estimating particular
models (that is, reduce the combined standard errors of the model estimates). These designs can either make no assumptions about underlying relationships (as in the three preceding types) or incorporate a priori knowledge about the preferences of respondents so that the survey can be more focused. The algorithms for specifying these designs are complex and are often only available in specialized statistical software.

The approaches described in Table A-2 have well-defined statistical properties that allow for transparent estimation of models, where the standard errors of estimates can be calculated without difficulty. In the past decade, a new type of stated-preference survey that does not follow strict statistical axioms has become popular, especially with practitioners (Polak and Jones 1997). These are often called "adaptive stated-preference surveys" because they use heuristics that generate questions on the basis of the responses to previous questions. Instead of searching randomly through the space of potential answers, these surveys attempt to narrow in more quickly on participants' preferences by observing their answers and making assumptions about their preferences.

Adaptive surveys have gained popularity because they can potentially zero in on preferences with fewer questions. In the VOR literature, this approach has been used by Fowkes and Whiteing (2006) and Bolis and Maggi (2003). However, the approach has a major drawback: the statistical properties of these surveys are unknown because the questions and answers are conditional on each other. Moreover, correlations between unobserved factors and the questions being asked violate the assumptions of many common models (Bradley and Daly 2000). Therefore, it is difficult, if not impossible, to calculate the standard errors of parameters and conclude that certain relationships are present in the data.
Some researchers, particularly in academia, have concluded that adaptive stated-preference surveys should be avoided for this reason. Another drawback is that adaptive surveys need to be carefully constructed with significant a priori knowledge of the relationships being measured, in order to avoid asking irrelevant questions (focusing questions too narrowly). This is less of an issue in passenger transportation, where a large number of stated-preference surveys have already been conducted, giving analysts a sense of how respondents might answer. However, very few stated-preference surveys have been conducted for freight transportation, particularly in the United States, and therefore there is less conclusive information available on which to base the design of these surveys. A description of how adaptive surveys have been used to estimate the VOR is presented in Section A.4. The rest of this appendix focuses on traditional stated-preference surveys.

A.3 Attributes

To estimate the VOR, it is necessary to describe the stated-preference alternatives by their cost and travel time reliability. This allows the trade-off between cost and reliability to be explored. Table A-3 summarizes the attributes that have been used in previous studies. In addition to reliability, previous studies have considered mode, frequency, and damage. Several studies have also included flexibility when describing transportation choices, attempting to capture the degree to which shipments need to be scheduled ahead of time or whether they can be dispatched on demand. Feo-Valero et al. (2011) provide an expanded table, similar to Table A-3, that summarizes the attributes used in other freight modeling studies that did not include reliability.

While studying many attributes seems appealing, the more attributes considered, the longer the survey needs to be. This can lead to survey fatigue and low response rates. Too many
attributes can also result in multicollinearity, making it hard to estimate individual parameters. Some researchers have argued that no more than three attributes should be used in this type of study; including more makes it cognitively difficult for respondents to answer correctly and consistently.

Stated-preference surveys need to clearly define each attribute at the beginning and instruct participants about what they should and should not consider when answering the questions (de Jong et al. 2014). For example, shippers that contract out transportation should be reminded not to consider the costs of vehicles or drivers, while motor carriers should be reminded not to consider cargo-related costs. Shippers with their own transportation capabilities should be instructed to consider all costs, even though the person taking the survey may be involved in only part of these decisions. Another helpful instruction is to indicate to participants that the transportation attributes affect all motor carriers and do not represent a competitive advantage between firms. Maximum effort should be placed on ensuring that choice attributes and survey context are not ambiguous, as the success of the survey hinges on the accuracy of the answers (Hensher et al. 2005).

A.3.1 Presentation of Reliability

The presentation of reliability information to respondents has been a vexing issue in the literature. One of the most common approaches has been to show the proportion of shipments that are late past a certain delivery window. However, this has two limitations: (1) the severity of the unreliability is not considered, and (2) assumptions are required to translate this measure into a standard deviation for modeling. Other researchers have instead presented five or so equally probable transportation times that convey the uncertainty.
This was the approach taken recently by researchers in the Netherlands after testing various formats (de Jong et al. 2014). Table A-4 describes how reliability was communicated in previous studies. Despite the potentially large impact of this choice on how respondents rate different alternatives, there is no consensus in the literature about which approach to follow.

Table A-3. Shipment attributes considered in relevant stated-preference surveys.

Studies: Winston (1981); Kawamura (1999); Wigan et al. (2000); Kurri et al. (2000); Maier et al. (2002); Bolis and Maggi (2003); Fowkes and Whiteing (2006); Witlox and Vandaele (2005); Beuthe and Bouffioux (2008); Halse et al. (2010); Zamparini et al. (2011); Kruger et al. (2013); Kawasaki et al. (2014); de Jong et al. (2014); Jin and Shams (2016).

Attributes (number of studies including each): Cost (all 15); Average travel time (all 15); Travel time reliability (all 15); Mode (3); Frequency (5); Flexibility (7); Loss and damage (4).
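The translation noted above, from a "share of shipments late past a window" statistic to a standard deviation, requires a distributional assumption. Below is a minimal sketch assuming arrival delays are normally distributed around the scheduled time; this assumption and the numbers are illustrative, not taken from any of the studies cited.

```python
from statistics import NormalDist

def implied_std_dev(share_late, window_minutes):
    """Standard deviation of arrival delay implied by an on-time statistic,
    assuming delays are normally distributed and centered on the schedule
    (an illustrative assumption; real delay distributions are often skewed).

    share_late:     fraction of shipments arriving later than the window
    window_minutes: delivery window past the scheduled arrival time
    """
    # P(delay > window) = share_late  =>  window = sigma * z_(1 - share_late)
    z = NormalDist().inv_cdf(1.0 - share_late)
    return window_minutes / z

# "1 out of 5 shipments more than 30 minutes late" (illustrative numbers)
sigma = implied_std_dev(0.20, 30)
print(round(sigma, 1))  # 35.6 minutes
```

A tighter on-time standard implies a smaller standard deviation for the same window, which is why the same "percent on time" figure can correspond to very different levels of unreliability.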
A.3.2 Attribute Levels

Choosing attribute levels for the stated-preference questions is one of the most important decisions in constructing the survey. Johnston et al. (2017) concluded that attribute levels need to be chosen on the basis of the values needed to support planning decision-making, feasibility of implementation, plausibility to respondents, and statistical efficiency. In choosing attribute levels, there is a clear trade-off between statistical efficiency and respondents' cognitive abilities. The following considerations need to be balanced:

• Survey size. Surveys with more attribute levels need to be longer to preserve orthogonality. This can lead to survey fatigue (and lower-quality responses) or the need for a larger sample (if respondents only take a portion of the survey).
• Coverage of utility space. The number of attribute levels needs to correspond to the range of situations where the results will be applicable. In the case of the present study, the willingness-to-pay (WTP) estimate will only be valid within the range of cost and reliability levels considered. If the survey considers changes in reliability of 0.1 and 0.3 standard deviations, for example, then the WTP should not be used for changes of 0.4 standard deviations. The model estimated, and the WTP parameter calculated, will be valid within the range of the data collected.
• Plausibility. The attribute levels need to be plausible to avoid respondents losing interest in the survey. Illogical or implausible combinations of attributes can reduce how seriously respondents take the survey (Johnston et al. 2017).
• Familiarity and relevance. Surveys produce the best results when the questions are relevant to the respondents; otherwise, their answers might depend on presuppositions rather than experience.
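In practice, plausibility and coverage are often balanced by defining levels as percentage pivots around a baseline shipment, as in the ±10% and ±20% designs listed in Table A-1. A minimal sketch, in which the pivot percentages and baseline value are illustrative:

```python
def pivoted_levels(base, deltas=(-0.20, -0.10, 0.0, 0.10, 0.20)):
    """Attribute levels defined as percentage pivots around a
    respondent-reported baseline (pivot fractions are illustrative,
    in the spirit of the +/-10% and +/-20% designs in Table A-1)."""
    return [round(base * (1 + d), 2) for d in deltas]

# Cost levels around a reported $500 typical shipment
print(pivoted_levels(500))  # [400.0, 450.0, 500.0, 550.0, 600.0]
```

Because every respondent's levels stay near their own reported shipment, the questions remain familiar and plausible while still spanning the range over which the WTP estimate will be applied.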
A.3.2.1 Attribute Pivoting

The most common way that researchers have achieved choice familiarity is by asking a revealed-preference question at the beginning of the survey and pivoting the stated-preference questions on the basis of the initial response (de Jong et al. 2014; Jin and Shams 2016). This can be done prescriptively, where different surveys are administered to respondents with different characteristics (e.g., urban versus intercity shipments, shippers versus carriers), or continuously, where a question is asked to describe the distance, cost, and travel time of a typical shipment (or several typical shipments) and attribute levels are pivoted on the basis of the answers. These types of surveys are often called revealed preference-stated preference (RP-SP) surveys in the literature, and they are used widely in a variety of applications.

Table A-4. Presentation of reliability in freight VOR literature.

Study | Presentation of Reliability
Small et al. (1999) | Five equiprobable arrival delays
Wigan et al. (2000) | Percentage of deliveries on time
Kurri et al. (2000) | 20% of the time late by XX, 10% of the time late by YY%
Bolis and Maggi (2003) | Percentage of deliveries on time
Fowkes and Whiteing (2006) | 90% arrive by, 95% arrive by, 98% arrive by
Beuthe and Bouffioux (2008) | Percentage of deliveries on time
Kawasaki et al. (2014) | Five equiprobable arrival delays
de Jong et al. (2014) | Five equiprobable transportation times
Jin and Shams (2016) | 4 out of 5 times: on time; 1 out of 5 times: 45 minutes delayed from schedule

However, Train and Wilson
(2008) point out that many of these approaches violate the independence-of-errors assumption that is common in many choice models. This assumption postulates that unobserved factors are independent between questions or, more broadly, that the unobserved factors are independent of the observed factors (attribute levels). Because the answer to the revealed-preference question is likely affected by both observed and unobserved factors, setting the stated-preference attribute levels as a function of this response might result in unobserved factors acting across the choice questions. This issue might be of secondary importance relative to the improvements in question familiarity. Nonetheless, to mitigate it, Train and Wilson (2008) suggest using more flexible models and relying on "SP-of-RP" questions that control for some of these confounding issues.

A.3.2.2 Nonlinearity

Unreliability costs are likely nonlinear: being late by 10 hours is likely to have very different impacts than being late by 1 hour. Nonlinearity can be captured and tested in the modeling stage either by including two separate parameters for unreliability, one for high values and one for low values, or by adding a threshold in the estimation of the unreliability parameter. Marcucci and Scaccia (2004) added cutoff points to the attributes studied to test whether decision-makers have thresholds for perceiving shipment characteristics. Their results indeed point to decision-makers using thresholds when analyzing transportation choices, particularly for unreliability.

A.4 Choice Sets

Generating the hypothetical choice questions in stated-preference surveys is a critical step that requires balancing multiple theoretical and practical considerations. Previous studies have used a wide range of approaches, as no approach works well in all cases.
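The threshold specification described in Section A.3.2.2 can be sketched as a piecewise-linear utility term. The coefficients and the 60-minute threshold below are illustrative assumptions, not estimates from the literature:

```python
def utility(cost, late_minutes, *, b_cost=-0.02, b_late_low=-0.03,
            b_late_high=-0.08, threshold=60):
    """Linear-in-parameters utility with a piecewise unreliability term:
    one slope below the threshold and a steeper slope above it, so long
    delays are penalized more than proportionally. All coefficient values
    and the threshold are illustrative."""
    low = min(late_minutes, threshold)
    high = max(late_minutes - threshold, 0)
    return b_cost * cost + b_late_low * low + b_late_high * high

# Below the threshold each late minute costs b_late_low in utility;
# above it, each extra minute costs the steeper b_late_high.
print(utility(500, 30), utility(500, 90))
```

Estimating the two slope parameters separately, and testing whether they differ significantly, is one way to check for the perception thresholds reported by Marcucci and Scaccia (2004).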
Choice sets are often evaluated by their statistical efficiency, commonly defined in terms of the combined standard errors of the parameter estimates. These errors describe how precisely the estimates are known given the data generated by the survey; choice sets with higher efficiency lead to more robust estimates. However, choice sets need to be defined at the beginning of the study, well before one has full certainty about the models that will need to be estimated. There is a trade-off between using general choice sets that have adequate efficiency for a wide range of models and using a specific choice set with high efficiency for only a few models. Choice sets can be made more statistically efficient by considering a priori information about parameter estimates, and even by focusing on the errors of certain parameters (in the present study, the cost and unreliability parameters). This discussion highlights why no single design works well for all needs.

As described in Section A.2, two types of choice sets have been used in the literature: fixed choice sets and adaptive choice sets. Fixed choice sets define the questions that all respondents see and randomly search the space of possible questions and answers, making no a priori assumptions about the respondent's preferences. This approach is used most frequently in academic research because it leads to surveys with well-defined statistical properties. On the other hand, adaptive choice sets use algorithms that tailor questions on the basis of the responses to previous questions, potentially narrowing in on the respondent's trade-offs. These algorithms require strong a priori assumptions and knowledge about the preferences of respondents, which has not often been an issue in passenger transportation because much work has been done in this area.
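These efficiency ideas can be made concrete. The sketch below, with illustrative attribute levels and prior coefficients, enumerates a full factorial of profiles, drops dominated (trivial) binary pairs, and scores the resulting design by its D-error under a multinomial logit model, which is a common operationalization of the "combined standard errors" criterion:

```python
import itertools
import numpy as np

# Candidate attribute levels for a within-mode truck experiment
# (illustrative values): cost in $100s and share of late shipments.
levels = {"cost": [4.0, 4.5, 5.0], "late": [0.05, 0.10, 0.20]}

# Full factorial: every combination of attribute levels.
profiles = [dict(zip(levels, c)) for c in itertools.product(*levels.values())]

def dominates(a, b):
    """a is at least as good as b on every attribute (lower is better
    here) and strictly better on at least one: the pair is trivial."""
    return all(a[k] <= b[k] for k in a) and any(a[k] < b[k] for k in a)

# Binary choice sets, dropping dominated (trivial) pairs.
pairs = [(a, b) for a, b in itertools.combinations(profiles, 2)
         if not (dominates(a, b) or dominates(b, a))]

def d_error(design, beta):
    """D-error of a design under an MNL model with prior betas: the
    determinant of the asymptotic covariance matrix, normalized by the
    number of parameters. Lower means smaller combined standard errors."""
    k = len(beta)
    info = np.zeros((k, k))
    for x in design:                      # x: (n_alts, k) attribute matrix
        u = x @ beta
        p = np.exp(u - u.max())
        p /= p.sum()                      # MNL choice probabilities
        xbar = p @ x                      # probability-weighted attributes
        info += (x * p[:, None]).T @ x - np.outer(xbar, xbar)
    return np.linalg.det(np.linalg.inv(info)) ** (1.0 / k)

def to_row(d):
    return [d["cost"], d["late"]]

design = np.array([[to_row(a), to_row(b)] for a, b in pairs])
beta_prior = np.array([-1.0, -4.0])       # a priori guesses (illustrative)
print(len(profiles), "profiles;", len(pairs), "non-trivial pairs;",
      "D-error:", round(d_error(design, beta_prior), 3))
```

Changing `beta_prior` changes the score, which is exactly how a priori knowledge about respondent preferences sharpens a design; adding informative choice sets lowers the D-error.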
Adaptive surveys are generally preferred by practitioners because they can produce results from relatively short surveys, shorter than randomly searching through the space of potential answers as with fixed choice sets. However, as mentioned in
the previous section, the statistical properties of these surveys are typically unknown, and it is often impossible to calculate the standard errors of estimates.

A.4.1 Fixed Choice Sets

The literature on designing stated-preference surveys was reviewed (Louviere et al. 2000; Hensher et al. 2005; Johnston et al. 2017) to identify best practices for constructing the choice questions. The following are the main findings from the literature that are relevant to the present study.

• Use choice questions instead of ranking and rating questions. In the past, stated-preference surveys have asked respondents to choose, rank, or rate alternatives as a way of revealing their preferences. However, over the past decade, a consensus has developed against asking participants to rank or rate alternatives, because many unreasonable assumptions are required to estimate utility models on such data (Johnston et al. 2017). Binary choice surveys have grown in popularity because they appear to be more dependable, particularly for studies of participants' willingness to pay (Johnston et al. 2017).
• Prefer unlabeled alternatives. Alternatives need labeling when each alternative represents a different mode (or a different product) and the objective is to predict market shares. In the present study, because the interest was in estimating the VOR in trucking only, labels were unnecessary (Hensher et al. 2005). Focusing on trucking is wise because the literature has generally had difficulty estimating effects for other freight modes (Jin and Shams 2016).
• Show only two alternatives per question. Experience shows that choosing among three or more alternatives can be cognitively difficult, resulting in greater survey fatigue and lower-quality responses. This is especially true when the questions involve a complex attribute such as reliability. Choice questions with two alternatives are preferred, as the comparisons are easier to make.
• Randomize question order. Randomizing the order of the choice questions for each respondent is important to reduce biases caused by survey fatigue or attrition.
• Avoid unbalanced designs. Many previous studies have used "unbalanced" surveys, where the number of levels differs between attributes (or at least the numbers are not multiples of each other). Louviere et al. (2000, p. 120) indicate that these designs are undesirable because "statistical power differs within attribute levels and/or between attributes, and artificial correlations with grand means or model intercepts are introduced."
• Consider only main effects. Interaction effects are unlikely to be significant in estimating the VOR and therefore have not been considered by previous studies. This allows only main effects to be considered, greatly reducing the length of the survey.
• Have at least as many choice questions as modeling degrees of freedom. The number of choice questions needs to be greater than or equal to the total number of degrees of freedom of the model estimated (Louviere et al. 2000). This is unlikely to be an issue in the present study, as the types of models estimated for the VOR do not typically have numerous degrees of freedom.

There are many ways to construct the choice questions that respondents see (i.e., which alternatives are compared), and each approach will produce a survey that is different, yet not necessarily better than the rest. Sanko (2001) and Street et al. (2005) described approaches to constructing the questions simultaneously or sequentially. The sequential approach was adopted by Jin and Shams (2016), citing its high statistical efficiency. Several statistical software packages have modules that can be used to generate efficient experimental designs (SAS, Sawtooth, Ngene, Stata, and NLOGIT); however, their cost and level of support vary widely. Burgess (2007) developed a free web tool that generates efficient choice question designs for simple cases.
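Once binary choice responses are collected, the VOR follows as the ratio of the unreliability and cost coefficients of the estimated choice model. Below is a synthetic sketch; the "true" parameter values, the simulated data, and the simple Newton-Raphson estimator are all illustrative, not taken from any study above.

```python
import numpy as np

rng = np.random.default_rng(0)
beta_true = np.array([-0.010, -0.150])   # per $ of cost, per late shipment/100

# Attribute differences (alternative A minus B) for 400 hypothetical
# binary questions: cost difference ($) and late-shipment difference.
X = np.column_stack([rng.uniform(-100, 100, 400),
                     rng.uniform(-20, 20, 400)])
p_true = 1 / (1 + np.exp(-X @ beta_true))   # P(choose A) under true betas
y = rng.random(400) < p_true                # simulated responses

# Newton-Raphson maximum likelihood for the binary logit.
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    grad = X.T @ (y - p)                            # score vector
    hess = -(X * (p * (1 - p))[:, None]).T @ X      # Hessian
    beta -= np.linalg.solve(hess, grad)

# Willingness to pay to avoid one additional late shipment per 100:
vor = beta[1] / beta[0]
print("betas:", beta.round(4), "VOR ($):", round(vor, 2))
```

Because the VOR is a ratio of coefficients, its precision depends on the standard errors of both parameters, which is why the design-efficiency considerations above focus on the cost and unreliability parameters in particular.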
Once the choice questions have been constructed, it is likely that several of them will be trivial. This occurs more prominently in within-product comparisons (as in estimating the
84 Estimating the Value of Truck Travel Time Reliability VOR for a single mode). Jin and Shams (2016) eliminated trivial alternatives by rotating attribute levels within choice sets until no dominant alternative was present. However, this could reduce the statistical efficiency of the design and cancel out some of the benefits of the sequential technique. A.4.2 Adaptive Choice Sets Adaptive stated-preference surveys attempt to extract the most relevant information from participants with the least number of questions. The adaptive survey used by Bolis and Maggi (2003) starts from a typical shipment sent by participants and then asks participants to rate hypothetical alternatives where the attributes are changed by certain amounts relative to the previously preferred alternative. This process was repeated by varying cost and only one other attribute until the respondent rates all alternatives with a score of 95 to 105. At this point, the whole process starts with another attribute. Short-run attributes such as transportation cost, travel time and reliability were explored first, and long-run attributes such as logistic flexibility and frequency were explored afterward. This study used the Leeds Adaptive Stated Preference software (Figure A-1), also used by Fowkes and Whiteing (2006). Both Fowkes and Whiteing (2006) and Bolis and Maggi (2003) reported obtaining useful information to characterize the preferences of respondents. However, the approach used in these studies has not been used again in follow-up studies. The questionable statistical properties of adaptive stated-preference surveys have likely dissuaded contemporary researchers. Moreover, asking participants to rate alternatives has become less common because results appear to be less consistent. Figure A-1. Leeds Adaptive Stated Preference model. Source: Fowkes and Whiteing (2006).
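The dominance screen that identifies trivial questions, as applied by Jin and Shams (2016) before rotating attribute levels, is straightforward to automate. In the sketch below (profile values invented for illustration), each choice pair has three attributes for which lower values are always preferred, and a pair is flagged as trivial when one alternative dominates the other:

```python
import numpy as np

# Hypothetical choice pairs: columns are cost ($), time (h), and late-arrival
# probability; lower is better on every attribute.
pairs = np.array([
    [[400, 8, 0.05], [600, 12, 0.25]],   # first alternative dominates
    [[400, 12, 0.15], [600, 8, 0.05]],   # genuine trade-off
    [[500, 10, 0.25], [500, 10, 0.05]],  # second alternative dominates
])

def is_trivial(a, b):
    """True when one alternative is at least as good on every attribute
    and strictly better on at least one (i.e., it dominates the other)."""
    a_dom = np.all(a <= b) and np.any(a < b)
    b_dom = np.all(b <= a) and np.any(b < a)
    return bool(a_dom or b_dom)

flags = [is_trivial(a, b) for a, b in pairs]
print(flags)  # -> [True, False, True]
```

Whether flagged pairs are then rotated out of the design or retained as attention checks is a separate decision; retaining them avoids disturbing the statistical properties of the design.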
Stated-Preference Survey Design 85

A.5 Heterogeneity

Capturing the variability of the VOR in the population of shippers and motor carriers is important from a planning perspective. Ultimately, the degree of heterogeneity that can be captured depends on the range of respondents taking the survey. A wealth of models for capturing heterogeneity in survey data has been developed in the past couple of decades. Some of the most common modeling approaches include inclusion of respondent characteristics in model specifications, identification of groups that share distinct preferences (latent class models), modeling preferences with distributions (mixed logit models), use of nonparametric models, and estimation of separate subgroup models. The last of these has been the most common approach taken in the freight VOR literature; however, all of these approaches are theoretically feasible. Table A-5 describes the types of heterogeneity considered by previous studies.

Table A-5. Heterogeneity considered for road mode.

Year | Author(s) | Heterogeneity Modeled
1999 | Small et al. | 4 commodity groups
2000 | Wigan et al. | Intercity, urban, urban multiple stops
2000 | Kurri et al. | Commodity type
2003 | Bolis and Maggi | Shipment weight, geography, firm JIT, distance, intermodality, etc.
2006 | Fowkes and Whiteing | 9 commodities, shipment type, time of day
2008 | Beuthe and Bouffioux | Shipping distance: <300 km, 300–700 km, 700–1,300 km, >1,300 km. Value density of goods: less than $6/kg, $6/kg to $35/kg, more than $35/kg. 5 commodity groups: foodstuffs, minerals/building materials/ores and metal waste/agricultural products/fertilizers/petroleum, metal products, chemicals/pharma, miscellaneous. Loading unit: semi-trailer, container, other
2010 | Halse et al. | Type: shipper with transportation, shipper without transportation, carriers. Size: small trucks (up to 3.5 tonnes), large trucks, all trucks
2011 | Miao et al. | Geography: Texas, Wisconsin, overall. Type: owner-operator, for-hire, private carrier
2011 | Zamparini et al. | Type: shipper with transportation, shipper without transportation. 24 commodity groups. Value density of goods: less than $6/kg, $6/kg to $35/kg, more than $35/kg
2014 | Kawasaki et al. | None
2014 | de Jong et al. | Containerized, noncontainerized. Truck size: 0–2 tonnes, 2–15 tonnes, 15–40 tonnes
2016 | Jin and Shams | Type: carriers (108), shippers with transportation (9), shippers without transportation (26), 3PL (7). Commodity groups: agriculture & food, heavy manufacturing, paper/chemicals/nondurable manufacturing, petroleum & minerals (other groups were not found to be statistically significant). Perishable, nonperishable

A.6 Survey Administration

Stated-preference surveys can be administered in many ways. Small et al. (1999) used telephone interviews, which they concluded likely contributed to the study not finding statistically significant estimates for VOR. Miao et al. (2011) interviewed drivers at truck stops in Texas and Wisconsin. Kurri et al. (2000) found that portable computers "proved a successful method of gathering preference information from transport managers." Maier et al. (2002) and Bolis and Maggi (2003) also used portable computers. In these studies, an interviewer was present to help participants complete the survey. This approach seems to have produced high response rates; however, its costs are very high. Beuthe and Bouffioux (2008) administered the survey on paper, showing different alternatives on different cards. Figures A-2 through A-5 show how previous studies presented the choices to respondents. Dillman et al. (2014) discussed the benefits and drawbacks of different ways of administering stated-preference surveys.

Figure A-2. Stated-preference survey: Example A. Source: Kurri et al. (2000).
Figure A-3. Stated-preference survey: Example B. Source: de Jong et al. (2014).
Figure A-4. Stated-preference survey: Example C. Source: Jin and Shams (2016).
Figure A-5. Stated-preference survey: Example D. Source: Kawasaki et al. (2014).
Hensher et al. (2005) stressed that one of the fundamental challenges of stated-preference surveys is having participants provide serious and thoughtful answers. This can be encouraged by including a section at the beginning that describes the context of the study and places respondents in the right mind-set. Information about the characteristics of nonrespondents could be collected to mitigate or quantify nonresponse biases (Johnston et al. 2017). Response rates themselves can be a poor indicator of nonresponse bias. Baker et al. (2013) discuss how to control for self-selection bias in stated-preference surveys administered through websites. A common concern in the passenger travel literature is that people of different socioeconomic backgrounds will have different response rates to web surveys. This concern is likely unimportant when interviewing supply chain logistics managers.

A.7 Revealed-Preference Examples

Some work has been done on estimating the VOR by using revealed-preference data. Carrion and Levinson (2013) tracked the commuting patterns of participants with GPS devices, observing whether they paid tolls to avoid congestion. A discrete choice model was used to estimate the VOR of participants. Prato et al. (2014) used vehicles instrumented with GPS devices to observe the travel patterns of drivers and estimated a discrete choice model on the route decisions of drivers as a function of cost, free-flow time, congested travel time, unreliability of roads, and other factors. Route alternatives were simulated so that each observation had 100 other possible routes that could have been taken. While these two studies used GPS data to estimate the VOR, they are not instructive for the present study because the route is only one of many variables considered by logistics managers. Logistics managers can also change shipment sizes, shipment delivery schedules, buffer times, and more.
The use of these GPS approaches would miss these other behaviors.

A.8 Additional Recommendations

The following recommendations were also found in the literature:

• Have orthogonal and balanced questions. The choice questions should be orthogonal (no correlation between attribute levels) and balanced. These two properties will lead to better VOR estimates.

• Be as short and simple as possible to reduce survey fatigue and obtain higher-quality responses. This can be accomplished by using fractional designs (especially orthogonal designs) and including only as few attributes as necessary. It is likely that most previous studies of truck VOR have been too complex, including too many attributes and asking respondents to compare three or more alternatives. A contemporary understanding of stated-preference surveys favors much simpler designs to maximize response quality.

• Include a section that explains the context of the choice questions, making sure that the choice scenarios are clearly understood and credible. Johnston et al. (2017) recommend discussing the temporal and spatial aspects and the uncertainty of the scenarios constructed. They also recommend including a short description of the types of policies or actions that would lead to the hypothetical scenarios. In the present study, this could involve describing how reliability would improve as a consequence of a roadway project.

• Use focus groups (qualitative) or pilot studies (quantitative) to test whether the questions are understandable and credible. This is particularly important for large surveys that might render controversial results.

• Include auxiliary questions to provide covariates for the model; segment or restrict the sample; determine whether respondents understood the survey questions; and help develop adjustments to improve the validity of the results (Johnston et al. 2017).
• Consider firm characteristics to describe the sectors for which results are representative. The results can be reweighted on the basis of the characteristics of the respondents (raking) to increase their representativeness of a population (Johnston et al. 2017). These context questions should be placed at the end of the survey to reduce their effect on attrition (Hensher et al. 2005).

• Keep trivial choice questions, but use them to determine whether respondents understood the survey. Trivial questions, where one alternative is clearly better than the other, are unavoidable in orthogonal designs. Some previous studies have removed these questions or rotated attribute levels to make them nontrivial; however, doing so can change the statistical properties of the survey and lead to biased estimates. Contemporary guidance is to leave these questions in the survey and use them to identify respondents who are answering incorrectly by consistently selecting alternatives that are worse in every dimension.

• Test the effect of survey fatigue by comparing the results of models estimated on the first few questions versus the last few questions. Choice questions should be randomized to reduce the effect of survey fatigue.

• Include attitude questions to understand how shippers think about reliability and whether the choice questions presented were relevant to their operations. These answers could be useful in interpreting the results of the survey.
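The fatigue test in the recommendations above can be sketched on synthetic data. Everything in the sketch is an illustrative assumption (a ten-question survey, invented attribute ranges and coefficients, and fatigue modeled as noisier answers in the second half): a simple binary conditional logit is estimated separately on the first and last five questions, and the coefficient vectors are compared. Under fatigue the later coefficients attenuate toward zero, while the implied VOR (the ratio of the unreliability and cost coefficients) can remain stable because the noise scale cancels in the ratio.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
n_resp, n_q = 400, 10              # respondents and questions per survey (assumed)
n_obs = n_resp * n_q

# Two alternatives per question: cost ($) and unreliability (std dev of
# arrival time, minutes). All values and coefficients are illustrative.
cost = rng.uniform(400, 600, size=(n_obs, 2))
unrel = rng.uniform(10, 60, size=(n_obs, 2))
q_pos = np.tile(np.arange(n_q), n_resp)   # question position within survey

# Simulate choices; "fatigue" makes answers noisier in the second half by
# shrinking the utility scale there.
b_cost, b_unrel = -0.02, -0.04            # true VOR = 2 $/min
scale = np.where(q_pos < n_q // 2, 1.0, 0.4)
dv = b_cost * (cost[:, 1] - cost[:, 0]) + b_unrel * (unrel[:, 1] - unrel[:, 0])
p_first = 1.0 / (1.0 + np.exp(scale * dv))     # P(choose first alternative)
choice = (rng.random(n_obs) > p_first).astype(int)

def fit_logit(mask):
    """Binary conditional logit by maximum likelihood on a subset."""
    c, u, y = cost[mask], unrel[mask], choice[mask]
    def negll(beta):
        v = beta[0] * c + beta[1] * u
        v = v - v.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(v) / np.exp(v).sum(axis=1, keepdims=True)
        return -np.log(p[np.arange(len(y)), y]).sum()
    return minimize(negll, x0=np.zeros(2), method="BFGS").x

beta_early = fit_logit(q_pos < n_q // 2)
beta_late = fit_logit(q_pos >= n_q // 2)
print("early:", beta_early, "VOR:", beta_early[1] / beta_early[0])
print("late: ", beta_late, "VOR:", beta_late[1] / beta_late[0])
```

With real survey data the same comparison applies: a substantial shift in coefficients (or in the implied VOR) between the early and late questions would suggest that response quality degrades over the course of the survey.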