**Suggested Citation:**"Chapter 7 - Model Formulations." National Academies of Sciences, Engineering, and Medicine. 2019.

*Impacts of Policy-Induced Freight Modal Shifts*. Washington, DC: The National Academies Press. doi: 10.17226/25660.

CHAPTER 7

Model Formulations

This chapter describes the mathematical formulations used to specify the freight mode choice models estimated as part of this research. The discussion starts with the shipment-size models, followed by the market-share and the shipment-level models.

Variables Used

(1) Transit time. This variable measures the time it would take a shipment to travel from the shipper's location to the receiver's. It includes the drayage time at both ends of the trip and the time spent in transfers. In the case of transit time by rail, the drayage includes the truck drayage time at both ends and the transfer times between the railroads for various commodities.

(2) Freight rate. This is the amount of money the payer of the transportation service would have to pay to deliver a shipment from the shipper's to the receiver's location.

(3) Generalized cost. The generalized cost combines the rate and the time into a single metric (using an estimate of the intrinsic value of time of the shipment to convert time into cost). There are two reasons to consider generalized cost: (a) it captures the combined effects of time and cost in a relatively easy-to-compute metric, and (b) it provides an alternative for cases where either time or rates are not found to be statistically significant separately (which is a likely outcome due to data issues).

The generalized cost was defined as

$$GC_{mi} = R_{mi} + IVC_i \, T_{mi} \tag{13}$$

where

$GC_{mi}$ = Generalized cost of transporting shipment $i$ by mode $m$ (truck or rail),
$R_{mi}$ = Rate of transporting shipment $i$ by mode $m$ (truck or rail),
$IVC_i = V_i O / W$ = Intrinsic value of cargo for shipment $i$,
$T_{mi}$ = Transit time for shipment $i$ by mode $m$ (truck or rail),
$V_i$ = Value of shipment $i$ from the CFS data,
$O$ = Opportunity cost, and
$W$ = Number of working hours (per year).

As shown in Equation (13), the computation of generalized costs relies on the estimated IVC. This parameter estimates, in a rather approximate manner, the value of time of a given shipment. The assumption is that the value of time is determined by the opportunity cost to the receiver of the shipment, which is related to profits and number of hours worked. Although the opportunity costs vary from firm to firm, the lack of data at this level of detail forced the research team to use average values. The team estimated freight mode choice models for three opportunity cost scenarios (i.e., 5 percent, 10 percent, and 25 percent). The number of hours of work per year was assumed to be 2,472 hours/year.
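The generalized cost in Equation (13) can be illustrated with a short calculation. The following is a minimal sketch, not the research team's code: it assumes the shipment value is in dollars, the transit time is in hours, and the opportunity cost is expressed as a fraction (0.05, 0.10, or 0.25). The function names and the example shipment are hypothetical.

```python
WORKING_HOURS_PER_YEAR = 2472.0                   # W, as stated in the text
OPPORTUNITY_COST_SCENARIOS = (0.05, 0.10, 0.25)   # O, the three scenarios in the text


def intrinsic_value_of_cargo(shipment_value, opportunity_cost,
                             working_hours=WORKING_HOURS_PER_YEAR):
    """IVC_i = V_i * O / W (dollars per hour if V_i is in dollars)."""
    return shipment_value * opportunity_cost / working_hours


def generalized_cost(rate, transit_time_hours, shipment_value, opportunity_cost,
                     working_hours=WORKING_HOURS_PER_YEAR):
    """Equation (13): GC_mi = R_mi + IVC_i * T_mi."""
    ivc = intrinsic_value_of_cargo(shipment_value, opportunity_cost, working_hours)
    return rate + ivc * transit_time_hours


if __name__ == "__main__":
    # Hypothetical shipment, invented for illustration only (not CFS data).
    value = 24720.0
    for o in OPPORTUNITY_COST_SCENARIOS:
        gc_truck = generalized_cost(1500.0, 20.0, value, o)
        gc_rail = generalized_cost(900.0, 72.0, value, o)
        print(f"O={o:.2f}: GC truck={gc_truck:.2f}, GC rail={gc_rail:.2f}")
```

Because the IVC scales linearly with the opportunity cost, a higher opportunity cost scenario penalizes the slower mode more heavily.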

Overview of Model Results

Three different types of models were estimated as part of this research:

(1) Shipment-size models. These models express shipment size as a function of the great circle distance (GCD) between the shipper and the receiver. Although these models were originally estimated to eliminate the endogeneity between the decisions of shipment size and freight mode (or vehicle) choice, they are very useful in cases where the shipment-size data are not reliable or available. As a result, they can be used in applications of both market-share and shipment-level models.

(2) Market-share models of freight mode choice. These models estimate the percent share of a given market that is expected to select either truck or rail. The models presented in this report use generalized cost as the key independent variable. Although the research team conducted extensive testing of models using transit times and rates as separate independent variables, none was found to be statistically significant and conceptually valid.

(3) Shipment-level models of freight mode choice. These models estimate the probability that a shipment would be sent by rail or truck as a function of (a) the estimates of the actual shipment size provided by the shipment-size models (to remove the correlation between shipment size and freight mode choice) and (b) the variables that characterize the operational performance of the freight modes (i.e., generalized cost, transit time, and freight rates).

The disaggregate (shipment-level) models are further classified into three types: (1) unweighted models, (2) models weighted based on domestic cargo, and (3) models weighted based on total (domestic and international) cargo. The unweighted models estimate the freight mode choice assuming that the CFS actually estimates the modal shares in the United States. The weighted models ensure that the freight mode choice data reflect the actual mode share published by FAF Version 4 (FAF 2018).

Table 19 provides a count of the total number of models estimated for the freight mode choice analysis. The sections that follow explain the various freight mode choice models in detail. As shown, the number of models estimated differs by model type. This is the result of two factors. The first is the effect of disclosure constraints from the Census Bureau aimed at preventing the release of models that contain confidential information. Since the aggregate models have a better chance of passing the disclosure requirements, a larger number of aggregate models could be released. The second factor was the combination of the computation time required by the shipment-level models, which typically took days and sometimes weeks to finish, and a restriction on the maximum number of programs that could be run by a researcher on the Census Bureau servers.

| Model | Sample Weight | Functional Form | Count |
|---|---|---|---|
| Shipment size | NA | Power functions | 159 |
| Shipment size | NA | Exponential functions | 159 |
| Aggregate | No Weight | Commodity-wise | 266 |
| Disaggregate | No Weight | Pooled | 5 |
| Disaggregate | No Weight | Commodity-wise | 249 |
| Disaggregate | Domestic Cargo | Pooled | 3 |
| Disaggregate | Domestic Cargo | Commodity-wise | 247 |
| Disaggregate | Total (Imports+Exports) | Pooled | 7 |
| Disaggregate | Total (Imports+Exports) | Commodity-wise | 251 |
| Total | | | 1,346 |

Table 19. Summary of models estimated.

The freight mode choice models (both aggregate and disaggregate) are grouped into two categories:

(1) Models based on transit time and freight rates. These models explicitly consider the effects of transit times and freight rates by mode. The main benefit of these models is that they enable the estimation of the implicit value of time by commodity type.

(2) Models based on generalized cost. These models use a composite measure of cost that includes freight rates and time, with the latter multiplied by the IVC. These models are useful in cases where no model that considers transit times and freight rates separately is statistically significant.

Table 20 shows the summary counts of the various types of models estimated, together with the numbers of the tables in this report where the results can be found. In general terms, the models that used generalized costs (particularly with a 5 percent opportunity cost) were found to work better than the models that considered freight rates and transit times separately as independent variables.

| Model | Sample Weight | Independent Variables (by commodity type) | Table Number (SCTG-wise) | Table Number (Pooled) | Count (SCTG-wise) | Count (Pooled) |
|---|---|---|---|---|---|---|
| Shipment-Size Models | NA | Great circle distance | 21 | NA | 30 | NA |
| Market-Share Models | NA | Avg. travel time and rates | 22 | NA | 4 | NA |
| Market-Share Models | NA | Avg. generalized costs (Op. cost=5%) | 23 | NA | 32 | NA |
| Market-Share Models | NA | Avg. generalized costs (Op. cost=10%) | 24 | NA | 24 | NA |
| Market-Share Models | NA | Avg. generalized costs (Op. cost=25%) | 25 | NA | 10 | NA |
| Shipment-Level Models | Unweighted | Travel times and rates | 27 | 26 | 3 | 32 |
| Shipment-Level Models | Unweighted | Generalized costs (Op. cost=5%) | 29 | 28 | 16 | 35 |
| Shipment-Level Models | Unweighted | Generalized costs (Op. cost=10%) | 30 | NA | 11 | NA |
| Shipment-Level Models | Unweighted | Generalized costs (Op. cost=25%) | 31 | NA | 6 | NA |
| Shipment-Level Models | Domestic Weighted | Travel times and rates | 33 | 32 | 3 | 34 |
| Shipment-Level Models | Domestic Weighted | Generalized costs (Op. cost=5%) | 35 | 34 | 17 | 35 |
| Shipment-Level Models | Domestic Weighted | Generalized costs (Op. cost=10%) | 36 | NA | 11 | NA |
| Shipment-Level Models | Domestic Weighted | Generalized costs (Op. cost=25%) | 37 | NA | 5 | NA |
| Shipment-Level Models | Total (Imports+Exports) Weighted | Travel times and rates | 39 | 38 | 3 | 35 |
| Shipment-Level Models | Total (Imports+Exports) Weighted | Generalized costs (Op. cost=5%) | 41 | 40 | 16 | 35 |
| Shipment-Level Models | Total (Imports+Exports) Weighted | Generalized costs (Op. cost=10%) | 42 | NA | 11 | NA |
| Shipment-Level Models | Total (Imports+Exports) Weighted | Generalized costs (Op. cost=25%) | 43 | NA | 6 | NA |

Table 20. Summary and locations of freight mode choice models.

Model Formulations

This section describes the mathematical formulations used to specify the freight mode choice models. The discussion starts with the shipment-size models, followed by the market-share and the shipment-level models.

Shipment-Size Models

The shipment-size models were estimated to solve the endogeneity problems caused by the inclusion of shipment size in the estimation of discrete choice models for freight mode choice. Using the instrumental variable approach, the shipment-size models were used to estimate the shipment sizes for specific shipments, and these estimates were used as independent variables in the discrete choice models in place of the real shipment sizes. The models expressed shipment size as a function of the GCD between shipment origin and destination in miles, which is reported in the CFS for each shipment. Different functional forms were tested, and the best models for each type of commodity (or group of commodities) were selected. The functional forms that consistently provided the best results were the power and logarithmic functions, which are described as follows:

Power function:

$$S = e^{\beta_0 + 0.5 s^2} \, G^{\beta_1} \tag{14}$$

Logarithmic function:

$$S = \beta_0 + \beta_1 \ln(G) \tag{15}$$

where

$S$ = Shipment size (in pounds),
$G$ = GCD in miles between origin and destination,
$\beta_0$ = Constant term for the model,
$\beta_1$ = Parameter for the GCD, and
$s^2$ = Mean squared error for the non-linear model.

Market-Share Mode Choice Models

As indicated previously, the market-share models express the share of a mode (of a typical shipment) as a function of the characteristics of the competing modes. This is achieved with the use of logistic functions of the kind shown in Equation (16), which uses a utility function to account for the effects of the various independent variables. The utility functions used are shown for the generalized cost version in Equation (19) and for the version of the model that considers transit times and freight rates as independent variables in Equation (18).

Market share of trucking:

$$MS_{ti} = \frac{e^{U_{ti}}}{e^{U_{ti}} + e^{U_{ri}}} \tag{16}$$

Market share of rail:

$$MS_{ri} = 1 - MS_{ti} \tag{17}$$

For models that use transit time and freight rates separately:

$$U_{ti} = \beta_{0i} + \beta_{Ci} C_{ti} + \beta_{Ti} T_{ti}, \qquad U_{ri} = \beta_{Ci} C_{ri} + \beta_{Ti} T_{ri} \tag{18}$$

where

$T_{ti}$ = Transit time (in hours) by truck for commodity $i$,
$T_{ri}$ = Transit time (in hours) by rail for commodity $i$,
$C_{ti}$ = Cost by truck for commodity $i$,
$C_{ri}$ = Cost by rail for commodity $i$,
$\beta_{0i}$ = Constant term for the model,
$\beta_{Ci}$ = Freight rate coefficient,
$\beta_{Ti}$ = Transit time coefficient, and
$\beta_{GCi}$ = Generalized cost coefficient.
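The functional forms in Equations (14) through (18) can be sketched in a few lines of code. This is an illustrative sketch only: it assumes Equation (14) reads S = exp(β0 + 0.5·s²)·G^β1 and Equation (15) reads S = β0 + β1·ln(G), and the function and parameter names are hypothetical. No estimated coefficients from the report are used.

```python
import math


def shipment_size_power(gcd_miles, beta0, beta1, s2):
    """Equation (14), as read here: S = exp(beta0 + 0.5*s2) * G**beta1."""
    return math.exp(beta0 + 0.5 * s2) * gcd_miles ** beta1


def shipment_size_log(gcd_miles, beta0, beta1):
    """Equation (15), as read here: S = beta0 + beta1 * ln(G)."""
    return beta0 + beta1 * math.log(gcd_miles)


def market_shares(cost_truck, time_truck, cost_rail, time_rail,
                  beta0, beta_c, beta_t):
    """Equations (16)-(18): return (MS_truck, MS_rail) for one commodity."""
    u_truck = beta0 + beta_c * cost_truck + beta_t * time_truck
    u_rail = beta_c * cost_rail + beta_t * time_rail
    # Algebraically equal to exp(u_truck) / (exp(u_truck) + exp(u_rail)).
    ms_truck = 1.0 / (1.0 + math.exp(u_rail - u_truck))
    return ms_truck, 1.0 - ms_truck
```

With negative rate and time coefficients, raising the rail rate in `market_shares` increases the truck share, which matches the sign expectations stated for these models.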

For models that use generalized cost:

$$U_{ti} = \beta_{0i} + \beta_{GCi} GC_{ti}, \qquad U_{ri} = \beta_{GCi} GC_{ri} \tag{19}$$

As shown in Equations (18) and (19), the truck utility $U_{ti}$, and therefore $MS_{ti}$, includes a constant $\beta_{0i}$ intended to capture any bias in favor of (or against) the use of truck. If, all else being equal, shippers prefer trucking, $\beta_{0i}$ is expected to be positive. The other parameters, $\beta_{Ci}$, $\beta_{Ti}$, and $\beta_{GCi}$, must be negative, as increases in the time, rate, or generalized cost of a particular mode are bound to reduce the use of that mode. The market shares $MS_{ti}$ and $MS_{ri}$ are market shares by shipments. In essence, the market-share models estimate the probability of a typical shipment choosing between truck and rail.

Market-share models can be readily estimated with ordinary least squares (OLS) techniques, typically referred to as regression analyses. However, in order to do so, the original function must be linearized. The first step is to compute the ratio of $MS_{ti}$ to $MS_{ri}$, for $MS_{ri} > 0$, which leads to the following:

For models that use transit time and freight rates separately:

$$\frac{MS_{ti}}{MS_{ri}} = e^{\beta_{0i} + \beta_{Ti}(T_{ti} - T_{ri}) + \beta_{Ci}(C_{ti} - C_{ri})} \tag{20}$$

For models that use generalized cost:

$$\frac{MS_{ti}}{MS_{ri}} = e^{\beta_{0i} + \beta_{GCi}(GC_{ti} - GC_{ri})} \tag{21}$$

Taking natural logarithms, the formulation used in the transit time and freight rate model becomes

$$\ln\left(\frac{MS_{ti}}{MS_{ri}}\right) = \beta_{0i} + \beta_{Ti}(T_{ti} - T_{ri}) + \beta_{Ci}(C_{ti} - C_{ri}) \tag{22}$$

In the case of the generalized cost models, the linearization leads to

$$\ln\left(\frac{MS_{ti}}{MS_{ri}}\right) = \beta_{0i} + \beta_{GCi}(GC_{ti} - GC_{ri}) \tag{23}$$

As Equations (22) and (23) show, since $\beta_{Ci}$, $\beta_{Ti}$, and $\beta_{GCi}$ are expected to be negative, the higher the transit time, rate, or generalized cost of the truck mode, the lower its market share and, conversely, the larger the market share of rail. Similarly, increases in the transit time, rate, or generalized cost of rail will increase the market share of truck.

The estimation of the market-share models required post-processing the CFS data to aggregate the individual shipments into distance bins using the GCD reported in the CFS for each shipment. The distance bins were defined starting from 5 miles. Shipments with distances of less than 5 miles were removed from the database, since they likely represent either trucking-captive shipments or data errors. Therefore, the first bin captured those shipments ranging from 5 to 25 miles. The following bins were divided by increments of 25 miles starting from 26 miles and going up to 1,000 miles (26 to 50, 51 to 75, 76 to 100, . . . , 976 to 1,000). From 1,000 miles on, bins were defined in increments of 50 miles up to 1,400 miles (1,001 to 1,050, 1,051 to 1,100, . . . , 1,351 to 1,400). Shipments with trip lengths above 1,400 miles were included in a single bin.

Shipment-Level Mode Choice Models

The estimation of shipment-level freight mode choice models was conducted with discrete choice models. A unique feature of the discrete choice models of freight mode choice is that they must take into account the econometric interactions between the continuous choice of shipment size and the discrete choice of freight (or vehicle) mode. The team used the shipment-size models to obtain estimates of the actual shipment sizes to compute the values of transit times and freight rates by truck and rail. In the case of models based on generalized costs, the freight rates and transit times were combined using the IVC. The basic specifications of the shipment-level freight mode choice models are described in the following.

Pooled Models

These models consider the effect of the commodity type using a set of binary variables and the interaction of these binary variables with modal attributes (i.e., transit times, rates, and generalized costs) in the utility functions. In essence, the pooled models capture the commodity-specific behavior in a single model, instead of in separate (two-digit SCTG) models for each commodity.

For models that use transit time and freight rates separately (the subscripts for commodity type $i$ and shipment $m$ have been dropped for simplicity):

$$
\begin{aligned}
U_t &= \beta_0 + \sum_i \beta_{0i}\delta_i + \beta_C C_t + \sum_i \beta_{Ci}\delta_i C_t + \beta_T T_t + \sum_i \beta_{Ti}\delta_i T_t \\
U_r &= \beta_C C_r + \sum_i \beta_{Ci}\delta_i C_r + \beta_T T_r + \sum_i \beta_{Ti}\delta_i T_r
\end{aligned}
\tag{24}
$$

For models that use generalized cost:

$$
\begin{aligned}
U_t &= \beta_0 + \sum_i \beta_{0i}\delta_i + \beta_{GC} GC_t + \sum_i \beta_{GCi}\delta_i GC_t \\
U_r &= \beta_{GC} GC_r + \sum_i \beta_{GCi}\delta_i GC_r
\end{aligned}
\tag{25}
$$

where

$U_t$ = Utility for truck,
$U_r$ = Utility for rail,
$\beta_0$ = Constant term for the model,
$\beta_{0i}$ = Marginal intercept associated with commodity $i$,
$\delta_i$ = Binary variable that equals one if the shipment belongs to commodity $i$ and is zero otherwise,
$\beta_{Ci}$ = Marginal freight rate effect associated with commodity $i$,
$\beta_{Ti}$ = Marginal transit time effect associated with commodity $i$,
$\beta_{GCi}$ = Marginal generalized cost effect associated with commodity $i$,
$\beta_C$ = Freight rate coefficient for base-case commodity (coal),
$\beta_T$ = Transit time coefficient for base-case commodity (coal),
$\beta_{GC}$ = Generalized cost coefficient,
$T_t$ = Transit time (in hours) by truck,
$T_r$ = Transit time (in hours) by rail,
$C_t$ = Cost by truck,
$C_r$ = Cost by rail,
$GC_{ti} = C_{ti} + IVT_i T_{ti}$ = Generalized cost by truck for commodity $i$, and
$GC_{ri} = C_{ri} + IVT_i T_{ri}$ = Generalized cost by rail for commodity $i$.

Two-Digit SCTG Models

These models consider the effect of commodities by estimating a freight mode choice model for each commodity separately, at the level of two-digit SCTGs. The expressions considered for the two-digit SCTG models, for each commodity, follow.

For models that use transit time and freight rates separately:

$$U_t = \beta_0 + \beta_C C_t + \beta_T T_t, \qquad U_r = \beta_C C_r + \beta_T T_r \tag{26}$$

For models that use generalized cost:

$$U_t = \beta_0 + \beta_{GC} GC_t, \qquad U_r = \beta_{GC} GC_r \tag{27}$$

where

$T_t$ = Transit time (in hours) by truck,
$T_r$ = Transit time (in hours) by rail,
$C_t$ = Cost by truck,
$C_r$ = Cost by rail,
$\beta_0$ = Constant term for the model,
$\beta_C$ = Freight rate coefficient,
$\beta_T$ = Transit time coefficient,
$\beta_{GC}$ = Generalized cost coefficient,
$GC_{ti} = C_{ti} + IVT_i T_{ti}$ = Generalized cost by truck for commodity $i$, and
$GC_{ri} = C_{ri} + IVT_i T_{ri}$ = Generalized cost by rail for commodity $i$.

Limitations

Notwithstanding their importance as the first freight mode choice models estimated with the CFS microdata, the models estimated in this research have limitations that are worth acknowledging, in the hope that their resolution could be the outcome of future work.

(1) Lack of sufficient shipment-level data for modes other than truck, rail, and intermodal. As presently configured, the CFS collects a stratified random sample of shipments in the United States, which provides a very small sample of shipments sent by inland waterway, air, parcel service, ocean, and pipeline. As a result, the number of observations collected is too small to estimate freight mode choice models that consider these modes. Redesigning the CFS to increase the number of shipments that use these modes, or repurposing other surveys so that they could be used in conjunction with the CFS data, could provide the data needed.

(2) Lack of readily available data about the operational characteristics of the freight modes. There is no single repository of data about modal characteristics that could have been seamlessly integrated with the input data used in the modeling process. As a result, the team had to estimate transit times and freight rates for both truck and rail using plausible assumptions and statistical inference techniques. Although this process seems to have worked well, other important variables, most notably reliability, cannot be estimated a posteriori and could not be included in the models. Addressing this important issue requires a complementary effort to characterize the performance of the major freight modes at the time the CFS data are collected.

Addressing these issues would be a major step forward in the development of freight mode choice models that assist policymakers in quantifying the effects of various policies on mode shares.
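Returning to the two-digit SCTG utilities in Equations (26) and (27), the sketch below shows how such utilities could be turned into a rail choice probability and an implied value of time. Two assumptions are made here that the chapter does not state explicitly: the discrete choice model is taken to be a binary logit, and the implied value of time is taken to be the standard ratio of the transit time coefficient to the freight rate coefficient. All names and inputs are hypothetical.

```python
import math


def rail_choice_probability(gc_truck, gc_rail, beta0, beta_gc):
    """Rail probability from Equation (27) utilities.

    Assumption: binary logit form, P(rail) = exp(U_r) / (exp(U_t) + exp(U_r)).
    """
    u_truck = beta0 + beta_gc * gc_truck
    u_rail = beta_gc * gc_rail
    return 1.0 / (1.0 + math.exp(u_truck - u_rail))


def implied_value_of_time(beta_t, beta_c):
    """Assumed standard ratio beta_T / beta_C from Equation (26) coefficients.

    The result is in cost units per hour when time is measured in hours.
    """
    if beta_c == 0:
        raise ValueError("beta_c must be non-zero")
    return beta_t / beta_c
```

For example, with hypothetical coefficients `beta_t = -0.02` per hour and `beta_c = -0.001` per dollar, the implied value of time is about 20 dollars per hour.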