Read "Dynamic, Integrated Model System: Jacksonville-Area Application" at NAP.edu

Suggested Citation: "Chapter 1 Model Implementation." National Academies of Sciences, Engineering, and Medicine. 2013. Dynamic, Integrated Model System: Jacksonville-Area Application. Washington, DC: The National Academies Press. doi: 10.17226/22482.

Chapter 1

Model Implementation

SHRP 2 Project C10A, Partnership to Develop an Integrated, Advanced Travel Demand Model and a Fine-Grained, Time-Sensitive Network: Jacksonville-Area Application was undertaken to develop a dynamic, integrated model and to demonstrate its performance through validation tests and policy analyses. This chapter describes the data requirements and steps necessary to implement the integrated model system. First, all key data inputs and tools are identified and described.

Data Development

The DaySim model requires data that reflect a wide variety of factors that influence travel decisions, including socioeconomic, employment, and school information; transportation network level of service; and urban form attributes. Much of the data is developed and applied at the detailed parcel level, which enhances the model's sensitivity but also increases the data development, maintenance, and update requirements. The DaySim data inputs are discussed in the following sections.

Synthetic Population

DaySim was initially implemented in Sacramento, California. Before applying the DaySim models in Jacksonville and Burlington, the project team first had to develop a synthetic population of these regions' residents. This synthetic population is a list of households and persons that is based on observed or forecast distributions of socioeconomic attributes and created by sampling detailed Census Bureau microdata. This list functions as the basis for all subsequent choice-making simulated in the model system.

The base year 2005 data used to develop the synthetic population with the DaySim population generation component are available from three sources: (1) the Census Bureau's American Community Survey (ACS) Public Use Microdata Sample (PUMS) and Decennial PUMS, (2) Northeast Florida Regional Planning Model (NERPM) inputs, and (3) the National Household Travel Survey (NHTS).

PopGen

PopGen, a synthetic population generator developed at Arizona State University, was chosen for synthesizing the Jacksonville and Burlington populations. Synthetic population generators typically use Census-based marginal distributions of household attributes to generate joint distributions of the variables of interest using standard iterative proportional fitting (IPF) procedures. Households are then randomly drawn from an available sample in accordance with the joint distribution so that household-level attributes closely match the controls. However, these traditional procedures typically operate at the household level and do not control for person-level attributes and joint distributions of personal characteristics. PopGen incorporates a heuristic approach to generate synthetic populations while matching both household-level and person-level characteristics of interest.

PopGen is Python-based software with an easy-to-use and flexible graphical user interface (GUI). Its wizard-based project setup process allows users to choose the region for population synthesis and specify the required inputs. Figure 1.1 shows the PopGen project setup wizard. PopGen accommodates sample and control inputs from Census, ACS, and region-specific sources such as household surveys and land-use model outputs. Populations can be synthesized with controls at various geographic resolutions, such as Census block groups or travel analysis zones (TAZs). For Jacksonville and Burlington, the populations were synthesized at the TAZ level and subsequently allocated to individual parcels.

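The IPF step mentioned above can be illustrated with a minimal two-dimensional sketch. This is not PopGen's code; the function name `ipf_2d`, the seed table, and the target marginals are hypothetical values chosen only to show how a sample cross-tabulation is rescaled until its row and column sums match the control marginals.

```python
def ipf_2d(seed, row_targets, col_targets, max_iter=200, tol=1e-9):
    """Scale a 2-D seed table until its row and column sums match the targets."""
    table = [list(row) for row in seed]  # work on a copy of the seed
    for _ in range(max_iter):
        # Row step: rescale each row to its target marginal.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [cell * target / total for cell in table[i]]
        # Column step: rescale each column to its target marginal.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns match after the column step; stop once the rows match too.
        gap = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if gap < tol:
            break
    return table


# Hypothetical seed: sample households cross-tabulated by size (rows)
# and number of workers (columns). Both target sets sum to 100.
seed = [[10.0, 20.0], [30.0, 40.0]]
fitted = ipf_2d(seed, row_targets=[40.0, 60.0], col_targets=[35.0, 65.0])
```

The fitted table keeps the association pattern of the seed (its odds ratio) while reproducing the control marginals, which is why the sample only needs to supply a reasonable seed rather than an exact picture of each zone.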
Once the required inputs have been specified, PopGen imports them into tables in a MySQL database and works from those tables to draw a synthetic population. After a population synthesis run, the match between the synthetic population and control data can be checked using visualization features in PopGen. Figure 1.2 shows one such feature. If the synthetic population is found to be appropriate, PopGen tools can export it to specific file formats for use in travel demand microsimulation applications, such as DaySim.

Figure 1.1. PopGen project setup wizard.

Figure 1.2. PopGen visualization.

Preparing a synthetic population for microsimulation using DaySim involves four basic steps. Each of the steps is described in detail in the following sections.

1. Prepare the control data.
2. Prepare the sample data.
3. Synthesize the population.
4. Process the synthetic population for use in DaySim.

The Jacksonville synthetic sample population comprises three segments: permanent households and population, seasonal households and population, and the group quarters population. The segments were established to reflect the differences in travel patterns associated with these subpopulations as well as to support seasonal analyses. For example, the seasonal population is generally older than the permanent population, has lower levels of workforce participation, and clusters in certain geographic areas. All of these attributes influence travel patterns and the demand for travel.

The Burlington synthetic sample population comprises two segments: permanent households and population and the group quarters population. Burlington does not have a significant seasonal population; thus a separate seasonal segment was not necessary. However, Burlington does have a significant group quarters population comprising University of Vermont students, so this segment was maintained.

Jacksonville Synthetic Population

Control Data

This section identifies the data sources and steps to prepare the control data for all three of the subpopulations that make up the synthetic population: permanent resident households and population, seasonal households and population, and noninstitutionalized group quarters (GQ) residents.

Estimate the Demographic Distributions

The first step is to identify specific control variables of interest and derive demographic distributions for them.
The control variables are attributes based on demographic distributions that are relevant to travel demand patterns. Control variables are specified for each of the three segments.

Permanent Households. Table 1.1 and Table 1.2 show the control categories and data sources for households and for persons, respectively. The PopGen program uses this information to synthesize the permanent household population. For households, the categories include the following:

• Age of the head of household;
• Household size;
• Number of workers;
• Household income; and
• Presence of children.

For persons, the categories include gender and age.

The attributes are based primarily on Census Transportation Planning Products (CTPP) distributions; the presence-of-children attribute is obtained from the Census Summary File 1 (SF1). Before working with the data, CTPP tables at the Census TAZ level were mapped to corresponding NERPM TAZs using the following steps:

• A NERPM parcel centroid file was created from the parcel boundary shape file. This file also contains the NERPM TAZ for each parcel.
• The parcel centroid shape file was intersected with the Census TAZ shape file in ArcGIS, and centroids of NERPM parcels were matched with Census TAZs. This step creates a many-to-many correspondence between Census and NERPM TAZs.
• Using the total number of housing units in all the parcels in a NERPM TAZ and the total number of housing units in all the parcels in a Census TAZ, the project team calculated the proportion of housing units from a Census TAZ that belong to a particular NERPM TAZ.

The numbers of households in various categories of control variables were aggregated at the Census TAZ level and distributed to NERPM TAZs on the basis of the calculated proportions. The data were aggregated again at the NERPM TAZ level.

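The apportionment described in the steps above can be sketched as follows. This is only an illustration under stated assumptions: the share of a Census TAZ assigned to a NERPM TAZ is taken to be the housing units of the parcels they have in common divided by all housing units in the Census TAZ, and the function names, zone labels, and counts are hypothetical rather than taken from the project's R script.

```python
from collections import defaultdict


def housing_unit_shares(parcels):
    """Share of each Census TAZ's housing units that falls in each NERPM TAZ.

    parcels: iterable of (census_taz, nerpm_taz, housing_units) tuples,
    one per parcel centroid after the GIS intersection.
    """
    by_pair = defaultdict(float)
    by_census = defaultdict(float)
    for census_taz, nerpm_taz, units in parcels:
        by_pair[(census_taz, nerpm_taz)] += units
        by_census[census_taz] += units
    return {
        pair: units / by_census[pair[0]]
        for pair, units in by_pair.items()
        if by_census[pair[0]] > 0
    }


def apportion(census_counts, shares):
    """Distribute Census TAZ control counts to NERPM TAZs and re-aggregate."""
    nerpm_counts = defaultdict(float)
    for (census_taz, nerpm_taz), share in shares.items():
        nerpm_counts[nerpm_taz] += census_counts.get(census_taz, 0.0) * share
    return dict(nerpm_counts)


# Hypothetical parcels: Census TAZ C1 straddles NERPM TAZs N1 and N2.
parcels = [("C1", "N1", 30), ("C1", "N2", 10), ("C2", "N2", 20)]
shares = housing_unit_shares(parcels)
# Hypothetical control count, e.g. one-person households per Census TAZ.
nerpm = apportion({"C1": 100.0, "C2": 50.0}, shares)
```

Because the shares for each Census TAZ sum to one, the regional total of every control category is preserved by the redistribution.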
Similarly, because SF1 data are at the Census-block level, centroids for Census-block polygons were mapped to NERPM TAZs using ArcGIS to obtain a block-TAZ correspondence table. By combining appropriate fields in the data tables, distributions of the various categories among the control variables chosen were obtained at the NERPM TAZ level.

Seasonal Households. The control categories for the seasonal population are the same as for permanent households. However, the base year control values come directly from the seasonal households in the statewide NHTS sample, collected in 2008 and 2009, which includes 530 households that reported living in Florida for 8 months or less per year. Of those households, 463 provided income information. The demographic distribution of this statewide sample is assumed to apply to all TAZs in the model area because reliable seasonal population attributes at detailed geographic levels are not available. However, the seasonal population is clustered in certain areas, such as along the coast.

Table 1.1. Household Control Data for Permanent and Seasonal Households

| Household Attribute | Control Column | Categories | Source Data |
|---|---|---|---|
| Householder age | 1 | 18–44 | CTPP 1-70 |
| | 2 | 45–64 | |
| | 3 | 65+ | |
| Household size, number of workers, and income (specified as a joint distribution using a composite attribute) | 4–7 | Size1 Workers1 Income1 to Income4 | CTPP 1-75 |
| | 8–11 | Size1 Workers2 Income1 to Income4 | |
| | 12–15 | Size2 Workers1 Income1 to Income4 | |
| | 16–19 | Size2 Workers2 Income1 to Income4 | |
| | 20–23 | Size2 Workers3 Income1 to Income4 | |
| | 24–27 | Size3 Workers1 Income1 to Income4 | |
| | 28–31 | Size3 Workers2 Income1 to Income4 | |
| | 32–35 | Size3 Workers3 Income1 to Income4 | |
| | 36–39 | Size4 Workers1 Income1 to Income4 | |
| | 40–43 | Size4 Workers2 Income1 to Income4 | |
| | 44–47 | Size4 Workers3 Income1 to Income4 | |
| Presence of children under 18 | 48 | Yes | SF1-p19 |
| | 49 | No | |

Note: Size categories 1–4: 1, 2, 3, 4+. Workers categories 1–3: 0, 1, 2+. Income categories 1–4: under $30,000; $30,000–$59,999; $60,000–$99,999; $100,000 and over. Within each four-column range, one control column is assigned to each income category in ascending order.

For population synthesis, all dollar values are normalized to 1999 dollars as closely as possible; that is the dollar year used in the 2000 Census, which supplies PUMS and control table data. The NHTS data are recorded in categories of nominal 2007 or 2008 dollars ($5,000 increments to $80,000, then $80,000 to $100,000, then above $100,000); each of these categories must be placed within one of the four 1999 income categories used for population synthesis (under $30,000, $30,000 to under $60,000, $60,000 to under $100,000, and $100,000+). To do this, the gross domestic product (GDP) deflator (0.817) was used to inflate the 1999 synthesis categories to 2007 values (under $36,700; $36,700 to under $73,400; $73,400 to under $122,400; and $122,400+), so that the recorded category of each household could be placed in the best synthesis category. Because the top NHTS category is only $100,000+ (2007 dollars), high-income survey respondents could not be accurately divided between the top two synthesis categories ($73,400 to under $122,400 and $122,400+ in 2007 dollars): the threshold between them, $100,000 in 1999 dollars, falls inside the open-ended survey category. The best option was to assign all such respondents to the top income category of $100,000+ (1999 dollars).

Noninstitutionalized Group Quarters. Table 1.3 shows the proposed control categories for the group quarters (GQ) residents. The distribution is extremely simple because of limited Census data for GQ residents. However, the age distribution helps PopGen properly locate two important GQ subpopulations: college students and retirement center residents. The control information is so simple that an IPF procedure is not necessary. However, if PopGen is used to generate the sample, it can be set up to run only the household-level IPF, which will converge quickly, and to avoid entirely the person-level iterative proportional fitting (IPF) and iterative proportional updating (IPU) procedures.

Table 1.2. Person Control Data for Permanent and Seasonal Households

| Person Attribute | Control Column | Categories | Source Data |
|---|---|---|---|
| Gender and age | 1 | Male age 0–15 | CTPP 1-51 |
| | 2 | Male age 16–20 | |
| | 3 | Male age 21–44 | |
| | 4 | Male age 45–64 | |
| | 5 | Male age 65+ | |
| | 6 | Female age 0–15 | |
| | 7 | Female age 16–20 | |
| | 8 | Female age 21–44 | |
| | 9 | Female age 45–64 | |
| | 10 | Female age 65+ | |

Note: Gender categories 1 and 2: male/female. Age categories 1–5: 0–15, 16–20, 21–44, 45–64, 65+.

Table 1.3. Control Data for Noninstitutionalized Group Quarters Residents

| Household/Person Attribute | Control Column | Category | Source Data |
|---|---|---|---|
| Age | 1 | Under 18 | 2000 SF1-p38 |
| | 2 | 18–64 | |
| | 3 | 65+ | |

Estimate the Number of Households and Persons in Each TAZ

The numbers of households and persons living in each TAZ in 2005 are required as control totals for both the permanent and seasonal populations. The final control total required to synthesize the population is the number of GQ residents.

The total numbers of permanent households and seasonal households for 2005 at the TAZ level were obtained by combining NERPM data on permanent and seasonal housing occupancy with parcel-level estimates of housing units. The development of the parcel-level estimates of housing units is described in the section on DaySim parcel data. The NERPM demographic data include TAZ-level data on the number of housing units in a TAZ, the proportion of those housing units that are seasonally occupied or vacant, and the proportion that are vacant. The following formulas (see Equation 1.1) were used to derive the number of households. They produced a total of 479,250 permanent households and 35,339 seasonal households.

\[
\begin{aligned}
\mathrm{PHHP} &= \frac{\mathrm{SFDU}\left(1 - \dfrac{\mathrm{SFSEAS}}{100}\right) + \mathrm{MFDU}\left(1 - \dfrac{\mathrm{MFSEAS}}{100}\right)}{\mathrm{SFDU} + \mathrm{MFDU}} \\
\mathrm{PHH} &= \mathrm{ParcelHU} \times \mathrm{PHHP} \\
\mathrm{SHHP} &= \frac{\mathrm{SFDU}\left(\dfrac{\mathrm{SFSEAS} - \mathrm{SFVAC}}{100}\right) + \mathrm{MFDU}\left(\dfrac{\mathrm{MFSEAS} - \mathrm{MFVAC}}{100}\right)}{\mathrm{SFDU} + \mathrm{MFDU}} \\
\mathrm{SHH} &= \mathrm{ParcelHU} \times \mathrm{SHHP}
\end{aligned}
\tag{1.1}
\]

where
PHHP = permanent household proportion;
SFDU = single-family dwelling units (NERPM data);
SFSEAS = percentage of seasonal or vacant single-family dwelling units (NERPM data);
MFDU = multifamily dwelling units (NERPM data);
MFSEAS = percentage of seasonal or vacant multifamily dwelling units (NERPM data);
PHH = permanent households;
ParcelHU = housing unit estimates from parcel data;
SHHP = seasonal household proportion;
SFVAC = percentage of vacant single-family dwelling units (NERPM data);
MFVAC = percentage of vacant multifamily dwelling units (NERPM data); and
SHH = seasonal households.

The permanent population controls are based on the total number of persons by county from the Census population estimates. According to these data, the July 1, 2005, population of the four-county Jacksonville model area was 1,223,279. That number includes GQ residents but is assumed not to include seasonal residents. GQ residents were separated out by estimating their 2005 population, interpolating between the number of GQ residents according to the 2000 Census (20,122) and the number according to the 2006–2008 ACS (21,047), which gives 20,783 GQ residents. The total permanent population according to the Census is therefore 1,202,496.

The county-level permanent population totals were used to calculate an average household size for the highest household size category (4+ people) for each county. That number was applied to the TAZ-level household-size distribution (from CTPP Table 1-62) and the number of permanent households to calculate the number of permanent residents in each TAZ.

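As a worked illustration of Equation 1.1, the sketch below reads the formulas as follows: the permanent share of housing units is one minus the seasonal-or-vacant percentage, the seasonal share is the seasonal-or-vacant percentage minus the vacant percentage, and both shares are averaged over single-family and multifamily units before being applied to the parcel-based housing unit estimate. The function name and the input values for the single TAZ are hypothetical.

```python
def household_counts(sfdu, mfdu, sfseas, mfseas, sfvac, mfvac, parcel_hu):
    """Return (permanent households, seasonal households) for one TAZ.

    sfdu, mfdu     : single-family and multifamily dwelling units
    sfseas, mfseas : percent of units that are seasonal or vacant
    sfvac, mfvac   : percent of units that are vacant
    parcel_hu      : housing units estimated from parcel data
    """
    units = sfdu + mfdu
    if units == 0:
        return 0.0, 0.0  # no dwelling units, so no households
    # Permanent household proportion (PHHP).
    phhp = (sfdu * (1 - sfseas / 100) + mfdu * (1 - mfseas / 100)) / units
    # Seasonal household proportion (SHHP): seasonal-or-vacant minus vacant.
    shhp = (sfdu * (sfseas - sfvac) / 100 + mfdu * (mfseas - mfvac) / 100) / units
    return parcel_hu * phhp, parcel_hu * shhp


# Hypothetical TAZ: 800 single-family and 200 multifamily units in NERPM,
# 1,050 housing units according to the parcel data.
phh, shh = household_counts(
    sfdu=800, mfdu=200, sfseas=10, mfseas=20, sfvac=6, mfvac=10, parcel_hu=1050
)
# Here PHHP = 0.88 and SHHP = 0.052, so phh is about 924 and shh about 54.6.
```

Applying the NERPM occupancy shares to the parcel-based unit count, rather than to the NERPM unit count, keeps the household totals consistent with the parcel data that DaySim uses later.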
The average seasonal household size was calculated from the NHTS data as the ratio of the total number of seasonal persons in the sample to the number of seasonal households in the sample. To calculate the total number of persons in the seasonal population, this average household size was multiplied by the total number of seasonal households in each TAZ. This calculation resulted in 63,611 seasonal residents.

The total number of noninstitutional GQ residents for the base year was estimated using the total GQ population and data from Census 2000 SF1 (Table P37), which identified the proportion of GQ residents classed as noninstitutional. (This distinction is important for travel modeling because institutionalized GQ residents, such as prisoners in jails, do not travel outside of their institution.) The number of noninstitutional GQ residents is 10,813. The county-level estimates were assigned to TAZs on the basis of the number of GQ housing units according to parcel data.

The demographic distributions were rescaled to match the estimated number of households and persons living in each TAZ.

Reformat Control Data to PopGen Specifications

For permanent residents, two PopGen "marginals" files are needed, with 49 household controls in one file and 10 person controls in another, as shown in Table 1.1 and Table 1.2. The layout of the marginal input file required by PopGen is shown in Table 1.4. The file begins with four mandatory fields: state, county, tract, and bg, with bg interpreted as TAZ. After that is a column for each control category, with entries representing the number of households (or persons) within the category for each TAZ. Two header rows (the column name in Row 1 and the data type in Row 2) are followed by one row of control data items for each TAZ. The control data items are represented in Table 1.4 as dots.

Household and person marginals files with the same control categories are needed for seasonal residents. For the GQ population, the three controls in Table 1.3 need to be included in a household marginals file, but no person marginals file is required.

All of these steps have been coded in an R script. The R Project (http://www.r-project.org/) provides open-source software for statistical computing and graphics. It is also an efficient tool for data manipulation and processing. The inputs required for the R script are correspondences between Census TAZs, NERPM TAZs (for Jacksonville), and DaySim TAZs (DaySim TAZs are renumbered NERPM TAZs because the DaySim software requires that external zones be listed first); CTPP tables in comma-separated values (CSV) format; SF1 data with NERPM TAZ mapped (also in CSV format); NERPM zonal data (database file, dbf, format); demographic distributions of seasonal households from NHTS; and GQ population totals by county from Census SF1.

On synthesizing the population, the project team found that the total number of workers was considerably overestimated compared with the employment in the region, after accounting for in- and out-commuting. The household distribution by number of workers was found to be driving the overestimate. Thus, the demographic distributions obtained from Census data were adjusted at the county level to match those obtained from ACS 2005–2007, which had different proportions of households by workforce participation.

Sample Data

The household sample provides the household and person records that will be drawn into the synthetic population. It also provides the multidimensional attribute seed distribution for the IPF procedures used in PopGen. Because the fitted distribution depends primarily on the controls rather than on the household sample, the sample need not exactly represent the distribution. Preferably, however, the sample should include many households of each type found in the region being synthesized. Thus, a large sample from which to draw is preferred.

Typically, the Census PUMS of each Public Use Microdata Area (PUMA) serves as the sample for all smaller geographical units included in the PUMA. Now that the ACS PUMS is available, either the 2000 Census PUMS or the ACS PUMS (or both) can be used; the PUMA definitions and the definitions of the PUMS data items used by DaySim are essentially the same for the 2000 Census and for ACS PUMS. Both data sources have been combined to create the sample file. Table 1.5 lists the PUMAs that cover the model area. The PUMS for permanent and seasonal households include all occupied-housing and person records from these PUMAs.

Table 1.4. PopGen Household Marginal File Layout

| State | County | Tract | Bg | <hhvar1cat1> | <hhvar1cat2> |
|---|---|---|---|---|---|
| <variabletype> | <variabletype> | <variabletype> | <variabletype> | <variabletype> | <variabletype> |
| <data> | <data> | <data> | <data> | <data> | <data> |
| . . . | . . . | . . . | . . . | . . . | . . . |

Note: . . . = control data.

Table 1.5. Jacksonville PUMAs

| PUMA ID | Description |
|---|---|
| 1300 | Clay County |
| 1101 | Parts of Duval and Nassau Counties |
| 1102 | Part of Duval County |
| 1103 | Part of Duval County |
| 1104 | Part of Duval County |
| 1105 | Part of Duval County |
| 1106 | Part of Duval County |
| 1107 | Part of Duval County |
| 1200 | St Johns County |

The sample for noninstitutionalized GQ residents includes only the occupied-housing and person records from these PUMAs that represent the noninstitutionalized GQ population. For 2000 PUMS, these have housing record UNITTYPE=2; for 2006–2008 ACS, they have housing record TYPE=3. If the resulting sample is quite small, then all the GQ records may need to be combined into a single sample that is used for all PUMAs, and even GQ records from other PUMAs in the state may need to be added.

Table 1.6 lays out the data elements needed for the PopGen input sample files, including the items required by PopGen, the items corresponding to control variable categories, and the items needed by DaySim. These elements include data items corresponding to the controls required for the generation of all three synthetic subpopulations.

Table 1.6. PopGen Input Sample File Data Items

| Data Item and Description | Values | Control Variable | ACS 2006–2008 Item | Census 2000 5% PUMS Item |
|---|---|---|---|---|
| Household Sample File | | | | |
| State | | | ST | STATE |
| Pumano | | | PUMA | PUMA5 |
| Hhid (same as serialno) | | | | |
| Serialno | | | | |
| Household size | | | NP | PERSONS |
| Number of people in related family | | | NPF | NPF |
| Household income (a) | In dollars | | HINCP (PINCP for GQ) | HINC (INCTOT for GQ) |
| CTPP 1-70 category: age of householder | | HH | | |
| CTPP 1-75 category: household size, income (a), and number of workers | | HH | | |
| SF1-P19 category: presence of children in household | | HH | | |
| SF1-P38 category: age | | GQ | | |
| Person Sample File | | | | |
| State | | | ST | STATE |
| Pumano | | | PUMA | PUMA5 |
| Hhid (same as serialno) | | | | |
| Serialno | | | | |
| Pnum | | | | |
| Gender | 1-male, 2-female | | SEX | SEX |
| Age | Years | | AGEP | AGE |
| Grade in school | 1:Pre-K, 2:K, 3:Grade 1–4, 4:Grade 5–8, 5:Grade 9–12, 6:Undergrad, 7:Grad/Prof school | | SCHG | GRADE |
| Hours worked per week | | | WKHP | HOURS |
| CTPP 1-51 category: gender and age | | HH | | |

(a) Income data from separate years are deflated to 1999 dollars, used by 2000 CTPP 1-75 and by DaySim. Household income is not available in PUMS for GQ residents, but total personal income is available in the person record and was used instead.

Table 1.7 shows the exact format of the input sample file as required by PopGen. An input sample file contains four mandatory fields (state, pumano, hhid, serialno) followed by population attributes. The serial number and household ID are identical IDs for the sample housing unit, indexed at 1. (The duplication is a legacy of an earlier version of PopGen. The current version requires only one unique ID, but the code still requires that two fields be present in the input file.)

Table 1.7. PopGen Household Sample File Layout

| State | Pumano | Hhid | Serialno | <hhvariable1> | <hhvariable2> | . . . |
|---|---|---|---|---|---|---|
| <variabletype> | <variabletype> | <variabletype> | <variabletype> | <variabletype> | <variabletype> | |
| <data> | <data> | <data> | <data> | <data> | <data> | |
| . . . | . . . | . . . | . . . | . . . | . . . | |

Population Synthesis

In this step, PopGen is run separately for each of the three subpopulations, using the specified input control and sample files. In addition to the control and sample files, PopGen requires a geography correspondence file (shown in Table 1.8). This file has one row per TAZ and associates the TAZ with the PUMA (and other larger geographies) to which it belongs. For Jacksonville, the correspondence file has been prepared and is named Geocorr.csv.

After PopGen is run, output population files are exported. Table 1.9 lays out the data elements needed for the PopGen synthetic population output files. Compared with the sample input data items, this step drops the items required for sampling and adds the identification numbers required by PopGen.

Synthetic Population DaySim Integration

DaySim currently generates and reads the synthetic population in the form of person records, with household data repeated in every person record. In addition, DaySim operates at the parcel level, while PopGen creates the synthetic population at the larger TAZ level. Therefore, a DaySim population conversion/parcel allocation utility has been created that reads PopGen population files, associates household attributes with persons, and allocates the households in the synthetic population to parcels. It then outputs a combined synthetic population file (dbf) in the format required by DaySim.

The primary inputs to this utility are six PopGen output files (household and person files for three population groups), a TAZ controls file, and DaySim's regular parcel data input file. The TAZ controls file lists the numbers of permanent households, seasonal households, and noninstitutionalized GQ residents living in each TAZ. The file format is shown in Table 1.10.

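The two jobs of the conversion utility described above, repeating household data on every person record and placing households on parcels, can be sketched as follows. This is not the utility's actual code. The record layout, the function names, and the allocation rule (fill dwelling units parcel by parcel, wrapping around when households outnumber units) are assumptions made for illustration.

```python
def to_daysim_persons(households, persons):
    """Attach household attributes to each person record.

    households and persons are lists of dicts keyed by 'hhid'; the field
    names are hypothetical stand-ins for the PopGen output columns.
    """
    hh_by_id = {hh["hhid"]: hh for hh in households}
    records = []
    for person in persons:
        hh = hh_by_id[person["hhid"]]
        # Household data are repeated in every person record.
        records.append({**person, "hhsize": hh["hhsize"], "hinc": hh["hinc"]})
    return records


def allocate_to_parcels(household_ids, parcels):
    """Assign the households of one TAZ to parcels with dwelling units.

    parcels: list of (parcel_id, dwelling_units) for the TAZ.
    Assumed rule: one household per dwelling unit, in parcel order; if
    households outnumber units, assignment wraps around to the first unit.
    """
    slots = [pid for pid, units in parcels for _ in range(units)]
    if not slots:
        raise ValueError("TAZ has households but no dwelling units")
    return {hid: slots[i % len(slots)] for i, hid in enumerate(household_ids)}


# Hypothetical TAZ with two households, three persons, and two parcels.
households = [
    {"hhid": 1, "hhsize": 2, "hinc": 52000},
    {"hhid": 2, "hhsize": 1, "hinc": 31000},
]
persons = [
    {"hhid": 1, "pnum": 1, "age": 41},
    {"hhid": 1, "pnum": 2, "age": 39},
    {"hhid": 2, "pnum": 1, "age": 67},
]
records = to_daysim_persons(households, persons)
parcel_of = allocate_to_parcels([1, 2], [("P1", 1), ("P2", 3)])
```

A production allocation would more likely draw parcels at random in proportion to their remaining dwelling units; the deterministic fill is used here only to keep the example reproducible.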
Table 1.8. PopGen Geographic Correspondence File Layout

| County | Tract | Bg | State | Pumano | Stateabb | Countyname |
|---|---|---|---|---|---|---|
| <vartype> | <vartype> | <vartype> | <vartype> | <vartype> | <vartype> | <vartype> |
| <data> | <data> | <data> | <data> | <data> | <data> | <data> |
| . . . | . . . | . . . | . . . | . . . | . . . | . . . |

Table 1.9. PopGen Synthetic Population Output File Data Items

| Data Item and Description | Values |
|---|---|
| Household Characteristic | |
| State | |
| County | |
| Tract | |
| Bg (TAZ) | |
| Hhid | |
| Serialno | |
| Frequency | |
| HhuniqueID | |
| Household size | |
| Number of people in related family | |
| Household income | In dollars |
| Person Characteristic | |
| State | |
| County | |
| Tract | |
| Bg (TAZ) | |
| Hhid | |
| Serialno | |
| Pnum | |
| Frequency | |
| PersonuniqueID | |
| Gender | 1-male, 2-female |
| Age | Years |
| Grade in school | 1:Pre-K (age 3+), 2:K, 3:Grade 1–4, 4:Grade 5–8, 5:Grade 9–12, 6:Undergrad, 7:Grad/Prof school |
| Hours worked per week | |

Table 1.10. DaySim Synthetic Population Input Data Items

| Label | Format | Definition |
|---|---|---|
| TAZ | Integer | Zone number |
| HHPerm_ZC | Float | Permanent households living in TAZ |
| HHSeas_ZC | Float | Seasonal households living in TAZ |
| GQUnitsZC | Float | Noninstitutionalized GQ residents in TAZ |

Note: These data must be in CSV format (.csv), with a header row, in the order specified. DaySim reads them as integers.

Table 1.11 shows the format of the PopGen population household files that are needed by the DaySim utility. Whether one, two, or three input files are required depends on the segmentation of the synthetic population and the associated settings in the control file. For Jacksonville, three files are used, one for each population segment.

Table 1.11. Input Files Format of Households Generated from PopGen: Permanent, Seasonal, or GQ

| Label | Definition |
|---|---|
| State | State of residence |
| County | County of residence |
| Tract | Tract of residence |
| Bg | TAZ of residence |
| Hhid | Household ID (generated by PopGen; DaySim reassigns household number) |
| Serialno | Serial number (generated by PopGen) |
| Frequency | Number of households represented (DaySim assigns this number of households) |
| HINC | Household income (dollars) |
| Hhsize | Number of persons in household |
| NPF | Number of persons part of family |

Note: These data must be in CSV format (.csv), without a header row, in the order specified. DaySim reads them as integers.

Table 1.12 shows the formats of PopGen population person files that are needed by the DaySim utility. A PopGen population person file is required for each of the population segments: three in the case of Jacksonville.

Table 1.12. Input Files Format of Persons Generated from PopGen: Permanent, Seasonal, and/or GQ

| Label | Definition |
|---|---|
| Pstate | State of residence |
| Pcounty | County of residence |
| Ptract | Tract of residence |
| Pbg | TAZ of residence |
| Phhid | Household ID (generated by PopGen; DaySim reassigns household number) |
| Pserialno | Serial number (generated by PopGen) |
| Pnum | Person number within household (DaySim reassigns person number) |
| Pfrequency | Number of households represented (DaySim assigns this number of households) |
| Age | Age in years |
| Gender | Gender: 1-male, 2-female |
| GradeCat | 0:Non-student, 1:Pre-K (age 3+), 2:K, 3:Grade 1–4, 4:Grade 5–8, 5:Grade 9–12, 6:Undergrad, 7:Grad/Prof school |
| Hours | Hours worked per week |

Note: These data must be in CSV format (.csv), without a header row, in the order specified. Each person file must be in the same household order as its corresponding household file. DaySim reads the data as integers.

The utility also creates three additional data items (shown in Table 1.13) required by DaySim during microsimulation. It creates binary variables for each person, indicating whether or not that individual is a worker and/or a student. It then assigns each household to a specific parcel within the TAZ to which PopGen assigned the synthetic household. Permanent and seasonal households in a TAZ are combined into one large group and allocated to parcels on the basis of the availability of dwelling units. GQ residents are allocated to parcels containing GQ dwelling units, which are identified in the parcel data input file separately from dwelling units for permanent and seasonal households.

Table 1.13. DaySim Synthetic Population Derived Data Items

| Data Item and Description | Value | ACS 2006–2008 Item | Census 2000 5% PUMS Item |
|---|---|---|---|
| Household Characteristic | | | |
| Parcel | Parcelid | | |
| Person Characteristic | | | |
| Worker indicator | Yes/no | WKHP>0 | HOURS>0 |
| Student indicator | Yes/no | SCHG not blank | Grade>0 |

Synthetic Population Validation

For Jacksonville, the synthetic population generated using PopGen was validated across the different control dimensions (both household and person) to ensure that the population matched the control variables. The validation was done separately for the three population groups. Table 1.14 summarizes the differences in the total number of households and persons in the synthetic population and observed controls. Overall, the synthetic population has about 1.4% fewer persons. That can be considered reasonable given the total population of about 1.28 million persons in the modeling region.

Figure 1.3 shows a comparison of the household size distribution from the control data and synthetic population to illustrate the degree of match achieved. Table 1.15 and Table 1.16 show the comparison of household size distributions further disaggregated to the county level. Since the population was synthesized at the TAZ level, the county-level match is close, as expected. Note that the proportion of seasonal households is quite low.

The distributions of household workers for both permanent and seasonal populations are shown in Figure 1.4. At the county level, household workers distributions are given in Table 1.17 and Table 1.18 for permanent and seasonal populations, respectively. The level of match is similarly close to that of the household size distribution.

The household income distributions for both permanent and seasonal populations are shown in Figure 1.5. Table 1.19 and Table 1.20 illustrate that, for both permanent and seasonal households, the income distribution in the synthetic population is close to that of the observed control data at the county level.

In addition to evaluating the household validation, users should also look at the matches among person attributes. Control of person-level attributes is one of the distinguishing features of the PopGen tool. Figure 1.6 shows that the person-level attribute distribution of age in the synthesized population is a close match to the controls. Table 1.21 and Table 1.22 split the distributions of age further, by county. Note that both the distributions and the county-level totals of number of persons fit well.

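The percentage differences reported in Table 1.14 follow from a simple relative-difference calculation, sketched below with the person totals from that table. The function name and dictionary layout are illustrative only.

```python
def percent_difference(observed, synthesized):
    """Relative difference of the synthesized total from the observed control, in percent."""
    return 100.0 * (synthesized - observed) / observed


# Person totals (observed, synthesized) as reported in Table 1.14.
persons = {
    "Permanent": (1202855, 1184800),
    "Seasonal": (63611, 64185),
    "Group Quarters": (10813, 10823),
}

by_group = {name: percent_difference(obs, syn) for name, (obs, syn) in persons.items()}

# Regional totals are the sums over the three population groups.
total_observed = sum(obs for obs, _ in persons.values())
total_synthesized = sum(syn for _, syn in persons.values())
total_diff = percent_difference(total_observed, total_synthesized)
```

The same function can be applied to any control category, for example a single household size class in one county, to check the fit at finer levels.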
Table 1.23 illustrates that for the GQ population, the age distribution in the synthetic population is representative of the observed marginal numbers in each of the four counties.

Burlington Synthetic Population

Control Data

Estimate the Demographic Distributions

The first step is to identify specific control variables of interest that are relevant to the travel demand forecasting process. The control variables or attributes are identified for each of the two population segments, permanent households and the group quarters population, separately. Because the choice of attributes for Jacksonville was not based on reasons specific to the geographic area, the same attributes were used for Burlington.

Table 1.14. Synthetic Population Validation Summary

Population Group   HH Obs.   HH Syn.   HH Diff.   Per Obs.    Per Syn.    Per Diff.
Permanent          479,250   479,298   0.01%      1,202,855   1,184,800   -1.50%
Seasonal           35,339    35,367    0.08%      63,611      64,185      0.90%
Group Quarters     10,813    10,823    0.10%      10,813      10,823      0.10%
Total              525,402   525,488   0.02%      1,277,279   1,259,808   -1.37%

Note: HH = households; Obs. = observed; Syn. = synthesized; Diff. = difference; Per = persons.

Figure 1.3. Jacksonville regional household size distribution comparison. [Bar chart: household size (1, 2, 3, 4+) for observed and synthesized permanent and seasonal households.]
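As a reading aid for the "Diff." columns of Table 1.14, the short Python sketch below computes the percentage difference between synthesized and observed totals. The formula (synthesized minus observed, divided by observed) is an assumption; it is consistent with the Permanent and Total rows of the table, but the report does not state it explicitly. The function and variable names are illustrative.

```python
def percent_difference(observed, synthesized):
    """Signed difference of synthesized from observed, in percent (assumed formula)."""
    return 100.0 * (synthesized - observed) / observed


# (observed, synthesized) pairs copied from the Permanent and Total rows of Table 1.14.
rows = {
    "Permanent": {"hh": (479250, 479298), "per": (1202855, 1184800)},
    "Total": {"hh": (525402, 525488), "per": (1277279, 1259808)},
}

summary = {
    group: {name: round(percent_difference(*pair), 2) for name, pair in cols.items()}
    for group, cols in rows.items()
}
```

A negative value means the synthetic population has fewer households or persons than the controls.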

Permanent Households. The household and person controls and their categories shown in Table 1.1 and Table 1.2 for Jacksonville were also used for PopGen synthesis of the permanent household population in Burlington. The tables also show the specific data sources used to derive distributions for each of the variables. The household-level control attributes include the following:

• Age of the head of household;
• Household size;
• Number of workers;
• Household income; and
• Presence of children.

The person-level control attributes include gender and age.

Table 1.15. Jacksonville Household Size Distributions for Permanent Households, by County

Household and County   Household Size
                       1        2         3        4+       Total
Observed
  Clay                 10,489   20,296    12,058   18,508   61,352
  Duval                87,640   107,388   57,993   75,330   328,350
  Nassau               4,900    8,940     3,971    5,720    23,531
  St Johns             15,812   24,489    10,212   15,509   66,022
Synthesized
  Clay                 10,583   20,394    11,965   18,427   61,369
  Duval                88,402   107,661   57,412   74,877   328,352
  Nassau               4,974    8,992     3,914    5,665    23,545
  St Johns             15,942   24,548    10,103   15,439   66,032

Table 1.16. Jacksonville Household Size Distributions for Seasonal Households, by County

Household and County   Household Size
                       1       2        3     4+    Total
Observed
  Clay                 583     2,331    66    86    3,066
  Duval                4,840   19,358   550   715   25,463
  Nassau               499     1,996    57    74    2,626
  St Johns             795     3,182    90    118   4,185
Synthesized
  Clay                 561     2,504    2     0     3,067
  Duval                4,934   20,406   120   23    25,483
  Nassau               493     2,104    21    8     2,626
  St Johns             797     3,370    18    6     4,191

Figure 1.4. Jacksonville regional household workers distribution comparison. [Bar chart: number of household workers (0, 1, 2+) for observed and synthesized permanent and seasonal households.]

Table 1.17. Jacksonville Household Workers Distributions for Permanent Households, by County

Household and County   Household Workers
                       0        1         2+        Total
Observed
  Clay                 12,231   24,171    24,949    61,352
  Duval                74,193   141,661   112,496   328,350
  Nassau               6,783    8,053     8,695     23,531
  St Johns             17,824   25,939    22,259    66,022
Synthesized
  Clay                 12,233   24,154    24,982    61,369
  Duval                74,116   141,841   112,395   328,352
  Nassau               6,810    8,053     8,682     23,545
  St Johns             17,871   25,963    22,198    66,032

Table 1.18. Jacksonville Household Workers Distributions for Seasonal Households, by County

Household and County   Household Workers
                       0        1       2+      Total
Observed
  Clay                 2,510    430     126     3,066
  Duval                20,843   3,575   1,045   25,463
  Nassau               2,149    369     108     2,626
  St Johns             3,426    588     172     4,185
Synthesized
  Clay                 2,750    268     49      3,067
  Duval                22,335   2,618   530     25,483
  Nassau               2,315    258     53      2,626
  St Johns             3,703    400     88      4,191

Table 1.19. Jacksonville Household Income Distribution for Permanent Households, by County

Household and County   Household Income
                       <$30K     $30K–$60K   $60K–$100K   >$100K   Total
Observed
  Clay                 16,165    22,164      15,995       7,028    61,352
  Duval                112,892   115,895     66,066       33,497   328,350
  Nassau               7,560     7,917       5,454        2,601    23,531
  St Johns             20,515    19,976      14,448       11,082   66,022
Synthesized
  Clay                 16,146    22,239      16,005       6,979    61,369
  Duval                113,047   116,117     65,979       33,209   328,352
  Nassau               7,570     7,948       5,460        2,567    23,545
  St Johns             20,556    19,967      14,435       11,074   66,032

Figure 1.5. Jacksonville regional household income distribution comparison. [Bar chart: household income for observed and synthesized permanent and seasonal households.]

Table 1.20. Jacksonville Household Income Distribution for Seasonal Households, by County

Household and County   Household Income
                       <$30K   $30K–$60K   $60K–$100K   >$100K   Total
Observed
  Clay                 748     1,033       430          854      3,066
  Duval                6,214   8,579       3,575        7,094    25,463
  Nassau               641     885         369          732      2,626
  St Johns             1,021   1,410       588          1,166    4,185
Synthesized
  Clay                 738     1,061       405          863      3,067
  Duval                6,149   8,718       3,412        7,204    25,483
  Nassau               643     896         353          734      2,626
  St Johns             998     1,450       566          1,177    4,191

Table 1.21. Jacksonville Age Distribution Comparison for Permanent Population, by County

Household and County   Age Group (years)
                       0–15      16–20    21–44     45–64     65+      Total
Observed
  Clay                 43,330    11,951   59,705    39,152    14,317   168,454
  Duval                192,099   53,689   309,132   174,889   80,396   810,204
  Nassau               14,250    3,956    21,236    16,526    7,741    63,709
  St Johns             36,389    9,073    53,278    40,333    21,414   160,487
Synthesized
  Clay                 42,868    11,733   58,806    38,741    14,179   166,327
  Duval                190,461   52,956   303,452   171,421   79,067   797,357
  Nassau               13,426    3,715    19,795    15,583    7,295    59,814
  St Johns             36,493    9,107    53,681    40,325    21,696   161,302

Figure 1.6. Jacksonville regional age distribution comparison. [Bar chart: age group (0-15, 16-20, 21-44, 45-64, 65+) for observed and synthesized permanent and seasonal populations.]

Distributions for all of the control attributes except presence of children were derived from CTPP tables. The distribution for the presence-of-children attribute was obtained from the Census SF1. Because the CTPP distributions were at the Census TAZ level, they needed to be converted to the Chittenden County Metropolitan Planning Organization (CCMPO) model TAZ level. (Chittenden County encompasses Burlington.) The following steps were used to create a correspondence between the Census and CCMPO model TAZs:

• A CCMPO model parcel centroid file was created from the parcel boundary shape file. This file also contains the CCMPO model TAZ for each parcel.
• The parcel centroid shape file was intersected with the Census TAZ shape file in ArcGIS, and centroids of CCMPO parcels were matched with Census TAZs. This creates a many-to-many correspondence between Census and CCMPO TAZs.
• Using the total number of housing units in all the parcels in a CCMPO TAZ and the total number of housing units in all the parcels in a Census TAZ, the project team calculated the proportion of housing units from a Census TAZ that belong to a particular CCMPO TAZ.

The numbers of households in the various categories of control variables were aggregated at the Census TAZ level and distributed to CCMPO TAZs on the basis of the calculated proportions. The data were then aggregated again at the CCMPO TAZ level.

SF1 data are at the Census block group level. A correspondence between block groups and CCMPO TAZs was similarly created using parcels as the go-between. Parcel centroids were mapped to Census block groups using ArcGIS, and the block group-level households in each category of the presence-of-children attribute were distributed in the same proportion to the parcel level. Aggregating the data to the CCMPO TAZ level at that point resulted in the required distributions.
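The proportional allocation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's R script or GIS workflow. The zone identifiers, housing unit counts, and control totals are invented, and the function names are hypothetical.

```python
from collections import defaultdict


def housing_unit_shares(parcels):
    """Share of each Census TAZ's housing units that falls in each CCMPO TAZ.

    parcels: iterable of (census_taz, ccmpo_taz, housing_units), one per parcel,
    as would result from matching parcel centroids to Census TAZs.
    """
    pair_units = defaultdict(float)
    census_units = defaultdict(float)
    for census_taz, ccmpo_taz, units in parcels:
        pair_units[(census_taz, ccmpo_taz)] += units
        census_units[census_taz] += units
    return {
        pair: units / census_units[pair[0]]
        for pair, units in pair_units.items()
        if census_units[pair[0]] > 0
    }


def allocate_controls(census_controls, shares):
    """Distribute Census TAZ control counts to CCMPO TAZs and re-aggregate."""
    ccmpo_totals = defaultdict(float)
    for (census_taz, ccmpo_taz), share in shares.items():
        ccmpo_totals[ccmpo_taz] += census_controls.get(census_taz, 0) * share
    return dict(ccmpo_totals)


# Invented example: Census TAZ C1 straddles CCMPO TAZs M10 and M11.
parcels = [("C1", "M10", 30), ("C1", "M11", 10), ("C2", "M11", 20)]
shares = housing_unit_shares(parcels)
result = allocate_controls({"C1": 200, "C2": 50}, shares)
```

In this example three-quarters of C1's housing units lie in M10, so 150 of its 200 households go to M10 and the remaining 50 join C2's 50 households in M11. The same function would be applied once per control category.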
For population synthesis, all dollars are normalized to represent 1999 dollars as closely as possible; that is the dollar year used in the 2000 Census, which supplies PUMS and control table data.

Noninstitutionalized Group Quarters. The control categories used for GQ population synthesis in Jacksonville were also used for Burlington (see Table 1.3). As in Jacksonville, the distribution is extremely simple because of limited Census table data for GQ residents. However, the age distribution helps PopGen properly locate two important GQ subpopulations: college students and retirement center residents. No IPF is required, and simple scaling suffices to match this one-dimensional control. However, PopGen was run in a simplified mode to synthesize the GQ population.

Estimate the Number of Households and Persons in Each TAZ

The total number of permanent households for 2005 at the TAZ level was obtained from the CCMPO model data. Because a separate population for seasonal households was not being synthesized, the additional step of estimating the total number of seasonal households was not required.

Table 1.22. Jacksonville Age Distribution Comparison for Seasonal Population, by County

Household and County   Age Group (years)
                       0–15   16–20   21–44   45–64    65+      Total
Observed
  Clay                 98     58      202     1,267    3,893    5,518
  Duval                817    480     1,682   10,521   32,333   45,833
  Nassau               84     50      173     1,085    3,334    4,726
  St Johns             134    79      276     1,729    5,315    7,534
Synthesized
  Clay                 0      1       33      1,200    4,341    5,575
  Duval                23     26      447     10,424   35,278   46,198
  Nassau               10     5       59      1,050    3,672    4,796
  St Johns             8      3       71      1,690    5,844    7,616

Table 1.23. Jacksonville Noninstitutional GQ Population Age Distribution

Household and County   Age Group (years)
                       18–44   45–64   65+   Total
Observed
  Clay                 0       121     638   758
  Duval                233     7,841   646   8,720
  Nassau               0       199     33    233
  St Johns             37      943     122   1,102
Synthesized
  Clay                 0       120     639   759
  Duval                235     7,847   648   8,730
  Nassau               0       199     33    232
  St Johns             37      945     120   1,102

Permanent population controls were estimated in a straightforward manner. An approximate average of 4.5 persons was assumed for the highest household size category (households with four or more persons). That average, along with the number of households in each size category, yielded an estimate of the total number of persons by TAZ.

The total number of noninstitutional GQ residents for the base year was estimated using parcel-level data from the CCMPO model. Specific parcels belonging to educational institutions were identified, and the number of GQ units on each was aggregated up to the TAZ level.

These demographic distributions were rescaled to match the estimated number of households and persons residing in each TAZ.

Reformat Control Data to PopGen Specifications

The process for reformatting the control data in Burlington was similar to that performed in Jacksonville. For permanent residents, two PopGen marginals files were needed, with 49 household controls in one file and 10 person controls in another, as shown in Table 1.1 and Table 1.2. The layout of the marginal input file required by PopGen is shown in Table 1.4. The file begins with four mandatory fields: state, county, tract, and bg, with bg interpreted as TAZ. After that is a column for each control category, with entries representing the number of households (or persons) within the category for each TAZ. Two header rows (the column name in row 1 and the data type in row 2) are followed by one row of control data items for each TAZ. For the GQ population, the three controls in Table 1.3 need to be included in a household marginals file, but no person marginals file is required.

All of these steps have been coded in an R script for Burlington similar to that used for Jacksonville.
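The person-total estimate and the rescaling step described above can be illustrated with a small Python sketch. It is not the project's R script. The household counts and the two-category distribution are invented, the helper names are hypothetical, and the rescaling is assumed to be a simple proportional adjustment.

```python
# Persons per household by size category; 4.5 for the open-ended "4+" category
# is the approximate average stated in the text.
AVERAGE_SIZE = {"1": 1.0, "2": 2.0, "3": 3.0, "4+": 4.5}


def estimate_persons(households_by_size):
    """Estimate total persons in a TAZ from household counts by size category."""
    return sum(AVERAGE_SIZE[cat] * count for cat, count in households_by_size.items())


def rescale(distribution, target_total):
    """Proportionally rescale category counts so they sum to target_total."""
    current = sum(distribution.values())
    if current == 0:
        return {cat: 0.0 for cat in distribution}
    return {cat: count * target_total / current for cat, count in distribution.items()}


# Invented TAZ: 100 one-person, 80 two-person, 40 three-person, 20 four-plus households.
taz_households = {"1": 100, "2": 80, "3": 40, "4+": 20}
persons = estimate_persons(taz_households)

# Invented person-level distribution rescaled to the estimated person total.
scaled = rescale({"group_a": 60, "group_b": 40}, persons)
```

Here the TAZ is estimated to hold 470 persons (100 + 160 + 120 + 90), and the illustrative distribution is scaled from a total of 100 up to that figure.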
The inputs required for the R script are correspondences between Census TAZs, CCMPO TAZs, and DaySim TAZs (DaySim TAZs are renumbered CCMPO TAZs because the DaySim software requires that external zones be listed first); CTPP tables in CSV format; SF1 data with the CCMPO TAZ mapped (also in CSV format); CCMPO zonal data (dbf format); and GQ population totals by parcel from CCMPO data.

Sample Data

The sample data represent household samples with detailed demographic attributes and are used to draw individual households to form the synthetic population. The sample data are also used to provide a seed matrix of multidimensional control attributes for the IPF procedure in PopGen. A large and sociodemographically diverse sample from which to draw the synthetic population is desirable. For this reason, as in the case of Jacksonville, PUMS samples from both ACS and Census 2000 were combined to prepare the sample for population synthesis.

Only one PUMA covers the model area: PUMA 100. The PUMS sample for permanent households includes all occupied housing and person records from this PUMA. The sample for noninstitutionalized GQ residents includes only those housing and person records from this PUMA that represent the noninstitutionalized GQ population. For 2000 PUMS, these have housing record UNITTYPE=2; for 2006–2008 ACS, they have housing record TYPE=3.

Table 1.6 lays out the data elements needed for the PopGen input sample files, including the items required by PopGen, the items corresponding to control variable categories, and the items needed by DaySim. The data items are the same for the two population segments developed in Burlington.

Table 1.7 shows the exact format of the input sample file as required by PopGen for both Burlington and Jacksonville. An input sample file contains four mandatory fields (state, pumano, hhid, serialno) followed by population attributes.
The serial number and household ID are identical IDs for the sample housing unit, indexed at 1. (As noted in the Jacksonville case, the duplication is a legacy of an earlier version of PopGen. The current version requires only one unique ID, but the code still requires that two fields be present in the input file.)

Population Synthesis

In addition to the control and sample files, PopGen requires a geography correspondence file (shown in Table 1.8). As in the Jacksonville case, this file has one row per TAZ and associates the TAZ with the PUMA (and other larger geographies) to which it belongs. For Burlington, which has only one PUMA, the TAZs are simply mapped to PUMA 100.

Using all these input files, PopGen was run separately for the two population subgroups: permanent and noninstitutionalized GQ. After PopGen was run, output population files were exported. The data elements needed for the PopGen synthetic population output files are identical in both Burlington and Jacksonville and are shown in Table 1.9.

Synthetic Population DaySim Integration

As described in the Jacksonville population synthesis section, DaySim requires a synthetic population in the form of person records, with the household information attached to each person record. Also, DaySim operates at the parcel level; thus, the population synthesized at the TAZ level by PopGen has to be allocated to individual parcels within a particular TAZ. For this purpose, a utility was created in Delphi to randomly allocate households within a TAZ to individual parcel units. It then outputs a combined synthetic population file (dbf) in the format required by DaySim. The primary inputs to this utility are four PopGen output files (household and person files for two population groups), a TAZ controls file, and DaySim's regular parcel data input file. Because the utility was created

for Jacksonville, which has three population segments, the TAZ controls file is an input of permanent households, seasonal households, and noninstitutionalized GQ residents living in each TAZ. The file format is shown in Table 1.10. The project team was able to use the utility in Burlington by specifying the number of seasonal households in all TAZs as zero.

Table 1.11 and Table 1.12 show the formats of the PopGen population household and person files, respectively, that are needed by the DaySim utility. A pair of PopGen population household and person files is required for each of the two population segments in Burlington.

The utility also creates three additional data items (shown in Table 1.13) required by DaySim during microsimulation. It creates binary variables for each person, indicating whether or not that individual is a worker and/or a student. It then assigns the household to a specific parcel within the TAZ to which PopGen assigned the synthetic household. Permanent households in a TAZ are allocated to parcels on the basis of the availability of dwelling units. GQ residents are allocated to parcels containing GQ dwelling units, which are identified in the parcel data input file separately from dwelling units for permanent households.

Synthetic Population Validation

The Burlington synthetic population generated using PopGen was validated across the different control dimensions (both household and person) to ensure that the population matched the control variable distributions. The validation was done separately for the two synthesized population groups. Table 1.24 summarizes the differences in the total number of households and persons in the synthetic populations and observed controls. Overall, the synthetic population has about 1.5% fewer persons than the observed data.

Figure 1.7 shows a comparison of the household size distributions from the control data and synthetic population to illustrate the degree of match achieved.
Because the model region has only one county (Chittenden County), the match can be interpreted as occurring at the county level. Because the population was synthesized at the TAZ level, the county-level match is close, as expected.

Similarly, Figure 1.8 and Figure 1.9 show comparisons of distributions for number of household workers and household income between observed and synthetic permanent populations. At the household level, the synthetic population attributes seem to match the observed distributions closely.

In addition to evaluating the match among household attributes, comparing the distributions of person attributes in the synthetic populations and observed data is also important. Figure 1.10 shows that the distribution of person age in the synthetic population matches well with that observed from Census data. Finally, Figure 1.11 makes clear that the distribution of age of the GQ population is almost the same as that in the observed data.

Table 1.24. Burlington Synthetic Population Validation Summary

Population Group   HH Obs.   HH Syn.   HH Diff.   Per Obs.   Per Syn.   Per Diff.
Permanent          59,975    59,975    0.00%      150,263    147,909    -1.57%
Group quarters     5,474     5,477     0.05%      5,474      5,477      0.05%
Total              65,449    65,452    0.00%      155,737    153,386    -1.51%

Note: HH = households; Obs. = observed; Syn. = synthesized; Diff. = difference; Per = persons.

Figure 1.7. Burlington household size distribution comparison. [Bar chart: household size (1, 2, 3, 4+) for observed and synthesized permanent households.]

Figure 1.8. Burlington household workers distribution comparison. [Bar chart: number of household workers (0, 1, 2+) for observed and synthesized permanent households.]

Figure 1.9. Burlington household income distribution comparison. [Bar chart: household income for observed and synthesized permanent households.]

Figure 1.10. Burlington person age distribution comparison. [Bar chart: age group (0-15, 16-20, 21-44, 45-64, 65+) for observed and synthesized permanent population.]

DaySim Parcel Data

A distinguishing feature of DaySim is that it uses parcels as one of the fundamental spatial units. The parcel data input file contains input data at the parcel level of detail, the more detailed of the two spatial levels at which DaySim input data are prepared. The less detailed spatial level uses TAZs; the TAZ input file is described in the section on DaySim TAZ data.

Figure 1.12 shows the relationship between TAZs (TAZ boundaries are shown in red) and parcels. The parcel polygons, which show the physical extent of each parcel, are shown in gray, with the parcel boundaries shown as thin white lines. The area of the parcel is included in the parcel data input file in units of square feet in the AREA_SQF field. In this figure, darker gray colors indicate increasing numbers of housing units in the parcel; housing unit numbers are recorded in the HOUSESP variable in the parcel data input file. The parcel centroids are shown as brown dots. The locations of the parcel centroids are described in the parcel file in the X_COORD and Y_COORD fields. Omitted from the parcel file (and shown as thick white lines) are highway and other rights of way. Parcels clearly allow a far more detailed, spatially disaggregate description of the land use in a region than TAZs do; consequently, they also necessitate the development and management of larger quantities of data.

The parcel data input file is a dBase IV format file (.dbf) with one row of data per parcel. Table 1.25 shows the fields contained in the parcel data input file. The file begins with several fields that identify the parcel and describe its physical location and size; the file also contains fields that describe the quantity of housing, school enrollment, and employment on the parcel and within a quarter-mile and a half-mile of the parcel.
In addition, the parcel file contains information about urban form and the transportation system on and close to the parcel, including the proximity to transit stops and the price and supply of parking. The data sources and the development process for these fields are discussed in the following sections.

Figure 1.11. Burlington GQ population age distribution comparison. [Bar chart: age group (<18, 18-64, 65+) for observed and synthesized GQ population.]

Figure 1.12. TAZ boundaries, parcel polygons, and parcel centroids.

Parcel Files

Housing Units: Jacksonville

Parcel-level information on housing units is used to allocate the synthetic population to individual parcels and to influence destination choices. The data are available in the model area from parcel-level databases maintained in each county for tax assessment purposes. This section provides an overview of the steps required to take four separate parcel databases, combine them to create a consistent regional database,

Table 1.25. Parcel Data Input File Format

Label (a)   Definition
PARCELID    Parcel ID number
X_COORD     X coordinate - state plane feet
Y_COORD     Y coordinate - state plane feet
AREA_SQF    Area - square feet
TAZ         TAZ number
HOUSESP     Housing units - parcel (× 100)
HOUSESQ     Housing units - quarter-mile radius (× 100)
HOUSESH     Housing units - half-mile radius (× 100)
STUDK12P    Students K–12 - parcel (× 100)
STUDK12Q    Students K–12 - quarter-mile radius (× 100)
STUDK12H    Students K–12 - half-mile radius (× 100)
STUDUNIP    Students University - parcel (× 100)
STUDUNIQ    Students University - quarter-mile radius (× 100)
STUDUNIH    Students University - half-mile radius (× 100)
NODES1Q     1 link nodes - quarter-mile radius
NODES1H     1 link nodes - half-mile radius
NODES3Q     3 link nodes - quarter-mile radius
NODES3H     3 link nodes - half-mile radius
NODES4Q     4+ link nodes - quarter-mile radius
NODES4H     4+ link nodes - half-mile radius
DIST_LRT    Distance to nearest LRT stop (miles × 100, -1 if none)
DIST_BUS    Distance to nearest bus stop (miles × 100, -1 if none)
PARKDY_P    Daily paid parking spaces - parcel
PARKDY_Q    Daily paid parking spaces - quarter-mile radius
PARKDY_H    Daily paid parking spaces - half-mile radius
PPRICDYP    Avg. price daily parking - parcel (cents)
PPRICDYQ    Avg. price daily parking - quarter mile (cents)
PPRICDYH    Avg. price daily parking - half mile (cents)
PARKHR_P    Hourly paid parking spaces - parcel
PARKHR_Q    Hourly paid parking spaces - quarter-mile radius
PARKHR_H    Hourly paid parking spaces - half-mile radius
PPRICHRP    Avg. price hourly parking - parcel (cents)
PPRICHRQ    Avg. price hourly parking - quarter mile (cents)
PPRICHRH    Avg. price hourly parking - half mile (cents)
EMPEDU_P    Education jobs - parcel (× 100)
EMPFOODP    Food service jobs - parcel (× 100)
EMPGOV_P    Government jobs - parcel (× 100)
EMPOFC_P    Office jobs - parcel (× 100)
EMPOTH_P    Other jobs - parcel (× 100)
EMPRET_P    Retail jobs - parcel (× 100)
EMPSVC_P    Service jobs - parcel (× 100)
EMPMED_P    Medical jobs - parcel (× 100)
EMPIND_P    Industrial jobs - parcel (× 100)
EMPTOT_P    Total jobs - parcel (× 100)
EMPEDU_Q    Education jobs - quarter-mile radius (× 100)
EMPFOODQ    Food service jobs - quarter-mile radius (× 100)
EMPGOV_Q    Government jobs - quarter-mile radius (× 100)
EMPOFC_Q    Office jobs - quarter-mile radius (× 100)
EMPOTH_Q    Other jobs - quarter-mile radius (× 100)
EMPRET_Q    Retail jobs - quarter-mile radius (× 100)
EMPSVC_Q    Service jobs - quarter-mile radius (× 100)
EMPMED_Q    Medical jobs - quarter-mile radius (× 100)
EMPIND_Q    Industrial jobs - quarter-mile radius (× 100)
EMPTOT_Q    Total jobs - quarter-mile radius (× 100)
EMPEDU_H    Education jobs - half-mile radius (× 100)
EMPFOODH    Food service jobs - half-mile radius (× 100)
EMPGOV_H    Government jobs - half-mile radius (× 100)
EMPOFC_H    Office jobs - half-mile radius (× 100)
EMPOTH_H    Other jobs - half-mile radius (× 100)
EMPRET_H    Retail jobs - half-mile radius (× 100)
EMPSVC_H    Service jobs - half-mile radius (× 100)
EMPMED_H    Medical jobs - half-mile radius (× 100)
EMPIND_H    Industrial jobs - half-mile radius (× 100)
EMPTOT_H    Total jobs - half-mile radius (× 100)
USED        (unused)
COUNTY      County (used only for usual work validation)
GQUnitsP    Noninstitutionalized group quarters units - parcel (× 100) (used only with PopGen population)

(a) DaySim can read these variables in any order, but the variable names must remain the same as shown. All values from the file are read as integers, with no decimal.
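Because all values in the parcel file are stored as integers, several fields in Table 1.25 carry an implied scale (× 100, miles × 100, or cents). The Python sketch below shows one way a reader might convert a record back to natural units. The field names come from Table 1.25, but the function, the selection of fields, and the sample record are hypothetical, and this is not DaySim's own reader.

```python
# A few of the fields stored as value x 100 in Table 1.25 (not the full list).
SCALED_BY_100 = {"HOUSESP", "HOUSESQ", "HOUSESH", "EMPTOT_P", "GQUnitsP"}
# Distances are miles x 100, with -1 meaning no stop of that type.
DISTANCE_FIELDS = {"DIST_LRT", "DIST_BUS"}
# Prices are stored in cents.
PRICE_FIELDS = {"PPRICDYP", "PPRICHRP"}


def decode_parcel_record(record):
    """Return a copy of an integer parcel record with scaled fields converted."""
    decoded = {}
    for name, value in record.items():
        if name in SCALED_BY_100:
            decoded[name] = value / 100          # e.g., 150 -> 1.5 housing units
        elif name in DISTANCE_FIELDS:
            decoded[name] = None if value == -1 else value / 100  # miles
        elif name in PRICE_FIELDS:
            decoded[name] = value / 100          # cents -> dollars
        else:
            decoded[name] = value                # IDs, coordinates, counts
    return decoded


# Invented record for illustration only.
raw = {"PARCELID": 1001, "TAZ": 42, "HOUSESP": 150, "DIST_LRT": -1,
       "DIST_BUS": 25, "PPRICDYP": 500}
decoded = decode_parcel_record(raw)
```

Mapping the -1 sentinel to `None` keeps "no stop nearby" from being mistaken for a negative distance in later calculations.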

impute housing units for land-use types such as condominium and multifamily housing developments (because housing unit numbers were not necessarily present in the parcel database), and verify that the resulting housing unit numbers were reasonable.

The county-level databases were each structured differently and contained different data items. The following steps were applied to each database before they were combined:

• Identify and remove non-travel-generating parcels such as highway rights of way, bodies of water, or other extraneous parcels. This step did not remove currently undeveloped land that could be developed in the future.
• Merge into a single parcel any parcels that are the same parcel but are spread across separate GIS shapes with multiple database rows. This step involved summing some fields, such as land areas, if they were divided across multiple polygons.
• Develop a set of common fields across the four databases such as land-use type, effective year, land area, building area, and number of buildings.

Ultimately, a combined database of 618,981 parcels was developed.

The parcel database does not specify the number of housing units on a parcel; that number must be imputed based on the data included in the database. The following steps were taken to impute the number of housing units:

• The databases varied depending on when they were last updated. Buildings with an effective year (the year the structure was built) after 2005 (the model base year) were not included in the analysis.
• Parcels with a single-family or mobile-home land-use type were assumed to contain one unit.
• Parcels with a condominium land-use type typically contain one unit. Generally, the parcel database shows large condominium buildings or developments as a grid of small parcels, with each parcel representing a single unit. Some exceptions to this rule were identified on the basis of a scan of land area, building area, and numbers of buildings. In those cases, the parcels were treated in a way similar to other types of multifamily housing parcels.
• Multifamily housing presented the largest challenge when imputing a number of housing units.
  - The available data identified each development as having either fewer than 10 units or 10 or more units, and typically included a number of buildings and a total building area.
  - An initial imputation was made to assign one unit for every 1,000 square feet of floor area in each building and to assign an extra unit to any remainder in excess of 500 square feet.
  - All parcels in the fewer-than-10-units category were then constrained to a minimum of 2 units and a maximum of 9; those in the 10-or-more-units category were constrained to a minimum of 10 units.
  - At this point, outliers were identified by considering the largest developments in terms of imputed units. This step identified issues such as large developments that were split across parcels but for which the total building area for the whole development was assigned to each parcel. Following manual identification and corrections, the housing unit numbers were recalculated.
  - The number of multifamily units was then compared at a county level with data from the NERPM model to check for aggregate consistency. The 1,000 square feet of floor area per unit was increased to 1,275 square feet to match the regional multifamily unit total contained in the NERPM model for 2005.

Table 1.26 shows a summary of the housing unit numbers developed using the parcel databases, by unit type and by county. The numbers compare reasonably closely to the county-level totals by unit type used in the NERPM model, shown in Table 1.27.

Table 1.26. Summary of Housing Unit Numbers, by Type and County (DaySim parcel file)

Description           Clay     Duval     Nassau   St Johns   Grand Total
SINGLE FAMILY         50,990   236,138   19,170   50,340     356,638
MOBILE HOMES          10,183   11,768    5,941    5,594      33,486
CONDOMINIA            11       18,576    2,493    3          21,083
MULTI-FAMILY (≥10)    6,632    97,143    742      8,490      113,007
MULTI-FAMILY (<10)    851      11,031    1,076    12,676     25,634
Total Single Family   61,173   247,906   25,111   55,934     390,124
Total Multi Family    7,494    126,750   4,311    21,169     159,724
Total Units           68,667   374,656   29,422   77,103     549,848
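The multifamily imputation rule described in the steps above (one unit per 1,000 square feet, an extra unit for a remainder over 500 square feet, then bounds by size class) can be sketched as a small Python function. This is an illustration only: the function name and the "lt10"/"ge10" class labels are hypothetical, the rule is applied here to a single floor-area figure rather than building by building, and keeping the 500-square-foot remainder threshold when the area per unit is raised to 1,275 square feet is an assumption the report does not confirm.

```python
def impute_multifamily_units(floor_area_sqft, size_class,
                             sqft_per_unit=1000, remainder_threshold=500):
    """Impute housing units from floor area, bounded by the reported size class.

    size_class: "lt10" for developments of fewer than 10 units,
                "ge10" for developments of 10 or more units (labels are illustrative).
    """
    whole_units, remainder = divmod(floor_area_sqft, sqft_per_unit)
    units = int(whole_units)
    if remainder > remainder_threshold:
        units += 1  # extra unit for a large leftover area

    if size_class == "lt10":
        return min(max(units, 2), 9)   # constrain to 2..9 units
    if size_class == "ge10":
        return max(units, 10)          # constrain to at least 10 units
    raise ValueError(f"unknown size class: {size_class!r}")


# Illustrative floor areas (not from the parcel database).
small = impute_multifamily_units(4600, "lt10")   # 4 whole units + 600 sq ft remainder
large = impute_multifamily_units(25500, "ge10", sqft_per_unit=1275)
```

With the initial 1,000-square-foot assumption, 4,600 square feet yields five units; with the calibrated 1,275-square-foot value, 25,500 square feet yields 20 units.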

Parcel-level calculations of proximity to total housing units within quarter-mile and half-mile buffers are also important urban form measures used in DaySim. They are calculated using a script, as described in the section on parcel-level buffers.

Housing Units: Burlington

In the Burlington region, most of the parcel-level data on housing, enrollment, and employment were obtained from the CCMPO regional travel demand model. The parcel data geodatabase (year_built_to_share.mdb) contained the table Parcels_yrbuilt; the fields in the table included parcel ID, location data, town, and area. The table was processed to remove non-travel-generating parcels such as highway rights of way, bodies of water, or other extraneous parcels. This resulted in polygon data for 50,052 parcels covering the CCMPO model area.

The data on Burlington housing units were obtained from the CCMPO regional travel demand model. The data for the model were originally collected by the Chittenden County Regional Planning Commission (CCRPC) based on the 2005 municipal Grand List. These data had been compared with data from other sources (such as existing parcel data, 2000 Census data, and building permits) to ensure their accuracy. The data were in the form of a point shape file (e.g., 20080612_rpc_2005_housing_points.shp) that contained points for 42,142 residential locations covering the CCMPO model area. The data fields included parcel ID, town, address, dwelling unit type (e.g., single-family, multifamily, GQ), dwelling unit count, and TAZ. Total dwelling units, including GQ, came to 65,449. Some data cleaning was done, such as removing non-travel-generating parcels, which include highway rights of way, bodies of water, and other extraneous parcels. Table 1.28 summarizes the housing unit numbers developed using the parcel databases, by unit type for Chittenden County.
Table 1.27. Summary of Housing Unit Numbers, by Type and County (NERPM model, 2005)

Unit Type             Clay     Duval     Nassau   St Johns   Grand Total
Single Family Units   57,477   251,373   26,190   54,588     389,628
Multi Family Units    9,662    124,499   6,348    19,568     160,077
Total                 67,139   375,872   32,538   74,156     549,705

Table 1.28. Summary of Chittenden County Housing Unit Numbers, by Type

Description          Chittenden County
SINGLE FAMILY        36,342
MULTI-FAMILY (2–4)   11,510
MULTI-FAMILY (5+)    12,123
GROUP QUARTERS       5,474
Total Units          65,449

Employment by Type: Jacksonville

Parcel-level information on the total number of jobs by employment type on each individual parcel is one of the most essential model inputs. In DaySim, the number of workers attracted to each employment site is calibrated to the number of jobs at that site. Detailed information on employment by type was acquired for the Jacksonville area.

The employment data for Clay, Duval, and Nassau counties (prepared by FDOT, NFTPO, and the consulting firm PBS&J) were obtained from PBS&J. PBS&J provided a GIS file of employment location points with records for individual businesses, including number of employees and six-digit standard industrial classification (SIC) codes. The firm had made extensive efforts to clean and verify the raw employment database, so it required minimal additional cleaning for use in this model.

The employment data for St. Johns County were obtained in the form of an InfoUSA database that had not been cleaned or processed in any way. (InfoUSA is a commercial data provider.) These data were similar in format to the data for the other three counties, with records for individual businesses, including number of employees and SIC codes. Before processing these data into the format required for the parcel data input file, the data were checked to ensure quality. These checks included reviewing the largest individual points to verify numbers of employees: a common problem is that large employers with many locations in a region allocate all jobs to the home office rather than distribute them across the firm's locations.

Once the employment point databases were obtained and deemed satisfactory, they were processed using the following steps:

1. Association of DaySim employment types with each business. DaySim models employment using nine employment types. Table 1.29 shows the correspondence between the aggregated employment categories and the more detailed SIC

39 classification. A two-digit SIC code was derived from the six- digit SIC codes in the employment databases (i.e., the first two digits of the six-digit SIC code). Then the correspon- dence was used to associate the DaySim employment types with each employment location. 2. Association of employments point with parcels. The employ- ment points were associated with parcels using GIS to intersect the employment point file and the parcel poly- gon file. During this process, several checks were carried out to ensure the reasonableness of the association. These included the following: a. Checking that all employment had been assigned to a parcel: Given that the parcel polygon file excludes rights of way and many employment points are geocoded close to the adjacent street, points can fall outside of any parcel. These points were associated with the closest parcel to move them outside of the right of way and into a parcel. b. Checking the land use of the parcels with which employ- ment was associated: The parcel polygon file includes data on the land-use type. Employment associated with residential parcels and vacant parcels was checked. In some cases, residences are legitimate business locations, particularly for individual businesses (i.e., those with one employee who works from home). Association with vacant parcels can indicate poor geocoding (in which case points can be manually repositioned to the appro- priate parcel) or inconsistency between the employ- ment data and the parcel database (for example, recently opened businesses that supersede the latest year built information in the parcel database). 3. Aggregation of employment at the parcel level. DaySim requires that employment data be aggregated from indi- vidual points within a parcel to totals for each of the nine employment types on a parcel. Parcels with large numbers of employees were spot checked to ensure that a reason- able number of jobs were assigned to individual parcels. 4. 
Adjustments to employment in military base parcels. The employment at two military bases was adjusted to reflect data obtained on the number of jobs. For the Naval Air Sta- tion Jacksonville, the number of jobs on the parcel was increased by 25,552 to reflect active duty and civilian per- sonnel not accounted for in the employment data, and 10,565 jobs were added to the Naval Station Mayport parcel. Together a total of 36,117 jobs were added. Table 1.30 summarizes the processed employment data, by county and by employment type. Duval County accounts for the majority of the employment in the region, in excess of 80%, with fewer than 50,000 jobs in the each of the other three counties. The distributions of jobs by employment type are similar in each of the four counties. Table 1.31 shows a comparison of county-level employment totals used by the NERPM model for the 2005 model year with those derived from employment databases for use in DaySim. The data received from PBS&J and used in DaySim are relatively similar to the NERPM model data for Clay and Nassau coun- ties. Duval County has 56,000 (12%) more jobs according to the updated PBS&J data, including 36,000 additional jobs at the military bases. For St. Johns County, the InfoUSA database con- tains only 34,000 jobs, significantly fewer than the 53,000 in the NERPM model data. To address this discrepancy, St. Johns Table 1.29. 
DaySim Employment Sectors and Correspondence with SIC Categories ID DaySim Employment Sector SIC Major Categories/ Generalized 2-Digit Categories SIC 2-Digit Codes 1 Education Educational Services 82 2 Food service Eating and Drinking Places 58 3 Government Public Administration 91â97 4 Office Finance and Real Estate, Services (some major categories) 60â67, 73, 81, 86, 87 5 Other Private Households, Nonclassifiable Establishments 88, 99 6 Retail Retail Trade 52â59 7 Service Transportation, Services (some major categories) 40â49, 70, 72, 75, 76, 78, 79, 83, 84, 89 8 Medical Health Services 80 9 Industrial Agriculture, Mining, Con- struction, Manufacturing, Wholesale Trade 01â39, 50, 51 Table 1.30. Cross Tabulation of Employment, by Type and County, from DaySim Parcel File Employment Type County Clay Duval Nassau St Johns Total Education 3,981 25,632 1,499 2,805 33,917 Food service 5,519 32,233 2,192 4,005 43,949 Government 3,227 65,212 1,980 1,429 71,848 Office 5,210 96,596 2,175 5,118 109,099 Other 95 1,178 9 133 1,415 Retail 10,572 64,647 3,397 5,049 83,665 Service 6,790 78,601 4,614 6,264 96,269 Medical 5,224 57,383 1,681 3,664 67,952 Industrial 6,748 92,779 2,666 5,345 107,538 Total 47,366 514,261 20,213 33,812 615,652

40 County employment was subsequently scaled up to match aggregate NERPM-based totals. As with housing units, parcel-level calculations of proxim- ity to total employment by sector, within quarter-mile and half-mile buffers, are used in DaySim and calculated using a script, as described in the section on parcel-level buffers. Employment by TypeâBurlington The parcel-level employment data for Chittenden County were derived from the CCMPO model inputs. The CCMPO origi- nally collected employment land-use data from two distinct sources: InfoUSA and the Vermont Department of Employ- ment and Training (VT DET). Because the VT DET has a pri- vacy agreement and use restrictions, the CCMPO chose to use the InfoUSA data and supplemented gaps in the InfoUSA data using the VT DET data. InfoUSA data contain information such as the name of the employer, the address of the employer, the general number of employees, and the employerâs SIC code. The CCMPO had geocoded the employers according to address. As in the case of housing unit data, the parcel employ- ment data were received in the form of a point shape file. The CCMPO model area encompasses points for 7,478 employ- ment locations. The employment data include schools, with grades and enrollment ranges (for most schools). Two fields in the data contain employment numbers. One of the fields that was supposed to contain the number of jobsâACTUAL_ EMPâwas found to have double counted the number of jobs at some major employers (such as IBM) that were based at more than one site. Therefore, a second fieldâMPO_EMPâwas included to resolve these issues; this second field was used to derive the number of jobs at each employment location. The total number of jobs in the region is 102,260. Each employment location was associated with a DaySim employment type on the basis of the correspondence with SIC codes shown in Table 1.29. 
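The SIC-to-sector association and the parcel-level aggregation described above can be sketched in a few lines of Python. This is only an illustration and not the project team's script: the function names and the input record layout are hypothetical, the code ranges are transcribed from Table 1.29, and the rule that code 58 maps to Food service even though it also falls in the Retail range 52-59 is an assumption of this sketch.

```python
from collections import defaultdict

# Two-digit SIC ranges per DaySim sector, transcribed from Table 1.29.
# Food service (58) is listed before Retail (52-59) so the more specific
# code wins; that precedence is an assumption of this sketch.
SECTOR_RANGES = [
    ("Education", [(82, 82)]),
    ("Food service", [(58, 58)]),
    ("Government", [(91, 97)]),
    ("Office", [(60, 67), (73, 73), (81, 81), (86, 87)]),
    ("Other", [(88, 88), (99, 99)]),
    ("Retail", [(52, 59)]),
    ("Service", [(40, 49), (70, 70), (72, 72), (75, 76), (78, 79),
                 (83, 84), (89, 89)]),
    ("Medical", [(80, 80)]),
    ("Industrial", [(1, 39), (50, 51)]),
]


def daysim_sector(sic6):
    """Return the DaySim sector for a six-digit SIC code, or None if unmatched."""
    two_digit = int(str(sic6).zfill(6)[:2])  # first two digits of the code
    for sector, ranges in SECTOR_RANGES:
        if any(lo <= two_digit <= hi for lo, hi in ranges):
            return sector
    return None  # codes absent from Table 1.29 are left for manual review


def aggregate_jobs(points):
    """Sum employees by parcel and sector.

    points: iterable of (parcel_id, sic6, employees) tuples (hypothetical layout).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for parcel_id, sic6, employees in points:
        sector = daysim_sector(sic6)
        if sector is not None:
            totals[parcel_id][sector] += employees
    return {pid: dict(by_sector) for pid, by_sector in totals.items()}


sample = [(1, "581208", 12), (1, "541105", 30), (1, "581301", 3), (2, "801101", 40)]
parcel_jobs = aggregate_jobs(sample)
```

In this sketch, unmatched codes return None rather than being forced into a sector, so that they can be reviewed by hand.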
In the next step, each employment location was assigned to a parcel by intersecting with the parcel polygon file in GIS. Employment locations that fell outside all parcels were assigned to the nearest parcel. Finally, the numbers of jobs by employment type were aggregated to the parcel level and appended to the DaySim parcel input data file. Table 1.32 summarizes the processed employment data, by employment type, for Chittenden County.

As with housing units, parcel-level calculations of proximity to total employment by sector, within quarter-mile and half-mile buffers, are used in DaySim and calculated using a GIS script.

School Enrollment: Jacksonville

As with workers, the number of students attracted to each school location is calibrated to the enrollment by grade level at that location. As a result, parcel-level information on school enrollment is necessary. DaySim distinguishes two levels of enrollment: grades K-12 and college or university.

In the Jacksonville region, the Florida Department of Education (FDOE) provides school-level information on enrollment by grade for schools enrolling K-12 students. The schools were geocoded on the basis of their addresses to obtain enrollment information at the parcel level. Overall, enrollment information was identified for 223 private and 287 public schools. Figure 1.13 shows the distribution of enrollment for both public and private schools. As expected, enrollment in public schools is skewed more toward the higher ranges than in the private schools.

Universities, community colleges, and technical schools are identified in the employment database. The project team identified enrollment for some of these institutions by visiting the institution's website; data on the state university system were obtained from the Florida Board of Governors, and data on the community college systems were obtained from FDOE.
In the remainder of cases, enrollment was estimated on the basis of the number of employees at the institution. An average ratio of student enrollment to number of employees, equal to 12.43, was calculated for this purpose. Table 1.33 shows the enrollment and employment information for all of the universities and colleges, with an indication of whether the enrollment was imputed. For brevity, only those institutions with 10 or more employees are shown in this table. The information from the Jacksonville University campus was not used in the calculation of the average enrollment-to-employment ratio because of its unusually high value (170).

Table 1.31. Total Employment Comparison, by County and Model Type

County      NERPM      DaySim
Clay        41,513     47,366
Duval       458,166    514,261
Nassau      20,579     20,213
St Johns    53,359     33,812
Total       573,617    615,652

Table 1.32. Number of Jobs in Chittenden County, by Employment Type

Employment Type   Number of Jobs
Education         9,679
Food service      5,967
Government        4,486
Office            16,123
Other             263
Retail            13,779
Service           19,539
Medical           6,686
Industrial        25,738
Total             102,260

Figure 1.13. K-12 enrollment distribution. [Bar chart comparing private and public schools by enrollment range (0-100, 101-500, 501-1000, 1001 or more), with a percentage axis from 0% to 60%.]

Table 1.33. University-Level Enrollment and Employment

County     School                               Employment   Enrollment   Imputed
Duval      Edward Waters College                20           839
Duval      FCCJ College Administration          20           189          X
Duval      FCCJ Culinary Institute              28           264          X
Duval      Florida Community College            1,970        18,598       X
Nassau     Florida Community College            60           566          X
Duval      Florida Coastal School of Law        95           1,539
Duval      Florida Metropolitan University      80           994          X
Duval      Florida Technical College            15           186          X
Duval      ITT Technical Institute              45           559          X
Duval      Jacksonville University Campus       20           3,400
Duval      Jones College                        160          650          X
Duval      Logos University                     10           124          X
Duval      Remington College                    40           497          X
Duval      St Thomas Christian College          20           249          X
Duval      Troy University                      20           249          X
Duval      University of Florida College        20           249          X
Duval      University of North Florida          1,233        15,420
Duval      University of Phoenix Inc            25           311          X
Duval      Webster University                   10           124          X
Duval      Conservative Theological Seminary    14           174          X
St Johns   University of St Augustine           30           373          X
St Johns   First Coast Technical Institute      165          734
St Johns   Flagler College                      250          2,716
St Johns   Flagler's Legacy                     12           149          X
St Johns   St Johns River Community College     40           1,091
Clay       Florida Metropolitan University      10           124          X
Clay       St Johns River Community College     70           941

Table 1.34 summarizes the final enrollment numbers by type of school and county. Table 1.35 compares derived enrollment with that from the NERPM model at a county level. The enrollment input into DaySim matches reasonably closely with that used in the NERPM model.

Table 1.34. Type of Enrollment, by County

Enrollment Type   Clay      Duval      Nassau    St Johns   Total
K-12              43,251    195,662    12,188    20,417     271,518
University        1,115     45,349     567       4,764      51,795
Total             44,366    241,011    12,755    25,181     323,313

Table 1.35. Total Enrollment Comparison, by County and Model Type

County      NERPM      DaySim
Clay        39,582     44,366
Duval       227,964    241,011
Nassau      13,740     12,755
St Johns    32,712     25,181
Total       313,998    323,313

As with housing units and employment, parcel-level calculations of proximity to school enrollment by school type, within quarter-mile and half-mile buffers, are used in DaySim. They are calculated using a script as described in the section on parcel-level buffers.

School Enrollment: Burlington

As in Jacksonville, DaySim uses two grade levels: K-12 and college or university. For the K-12 grade level, the enrollment numbers were derived using the employment data from CCMPO as described in the following steps:

a. All employment locations falling in the primary category of schools (about 100) were considered separately.

b. Some of these were found to be invalid locations, such as school district administrative offices and dance or martial arts classes, and were filtered out.

c. Each employment location had the number of jobs and a range of student enrollment associated with it. The midpoint of the range of student enrollment at a particular location was used as the estimate of enrollment.

d. Using the total estimated enrollment and the total number of jobs at all school locations, an average value of students per (school) employee was computed, which equals 6.0.

e. This overall average students-per-employee ratio was then used to estimate the final K-12 student enrollment for each school location.

The total number of students in all the parcels in the region was calculated as 23,706. This value was quite close to the value of 22,403 found in the Vermont Public School Enrollment Report for 2008-09.

For university-level enrollment, the numbers were obtained directly from the websites of the respective colleges and universities, since Chittenden County has only a few of them. Table 1.36 shows the colleges included in the parcel-level college-grade enrollment.

Table 1.36. University Student Enrollment in Chittenden County

College/University       Enrollment
St Michael's College     2,700
Champlain College        2,000
Burlington College       200
University of Vermont    11,704
Vermont Hitec Inc        240

As with housing units and employment, parcel-level calculations of proximity to school enrollment by school type within quarter-mile and half-mile buffers are calculated using a separate script.

Parcel-Level Buffers of Housing Units, Employment, and School Enrollment

Parcel-level calculations of proximity to total housing units, employment by employment type, and school enrollment by school type within quarter-mile and half-mile buffers are important urban form measures used in DaySim, and are calculated using a GIS script. Figure 1.14 shows an example of the buffers around a parcel centroid. The figure shows the parcel centroids of adjacent parcels in brown. All the centroids that fall within a buffer are counted when the various buffer variables are summed.
Figure 1.15 through Figure 1.18 show maps of housing, employment, and school (K-12) enrollment per parcel and within quarter-mile and half-mile buffers. Figure 1.15 and Figure 1.16 compare urban and suburban housing variables.
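The buffer variables can be illustrated with a brute-force sketch in Python. It is not the project's GIS script. It assumes that parcel centroids are given in feet (for example, state plane coordinates), that straight-line distance between centroids defines buffer membership, and that a parcel's own value counts toward its buffers. The record layout and function name are hypothetical.

```python
import math

QUARTER_MILE_FT = 1320.0
HALF_MILE_FT = 2640.0


def buffer_totals(parcels):
    """Sum a parcel attribute over all centroids within each buffer radius.

    parcels: list of dicts with keys "id", "x", "y" (feet) and "value"
    (e.g., housing units). Each parcel's own value is included because its
    centroid lies inside its own buffer (an assumption of this sketch).
    """
    results = []
    for p in parcels:
        quarter = 0.0
        half = 0.0
        for q in parcels:
            dist = math.hypot(p["x"] - q["x"], p["y"] - q["y"])
            if dist <= HALF_MILE_FT:
                half += q["value"]
                if dist <= QUARTER_MILE_FT:
                    quarter += q["value"]
        results.append({"id": p["id"], "quarter": quarter, "half": half})
    return results


parcels = [
    {"id": "A", "x": 0.0, "y": 0.0, "value": 10},
    {"id": "B", "x": 1000.0, "y": 0.0, "value": 5},
    {"id": "C", "x": 3000.0, "y": 0.0, "value": 2},
]
totals = buffer_totals(parcels)
```

The pairwise loop is quadratic in the number of parcels, so a regional parcel file would in practice call for a spatial index; the loop is kept here only for clarity.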

Figure 1.14. Example of quarter-mile and half-mile buffer areas around a parcel centroid.

Figure 1.15. Housing units per parcel and quarter-mile and half-mile housing unit buffers, urban area.

Figure 1.16. Housing units per parcel and quarter-mile and half-mile housing unit buffers, suburban area.

Figure 1.17. Total employment per parcel and quarter-mile and half-mile total employment buffers, urban area.

Figure 1.18. K-12 student enrollment per parcel and quarter-mile and half-mile K-12 student enrollment buffers, urban area.

Transportation Access

In addition to using zone-level information on access times to transit, DaySim also incorporates detailed parcel-level information on the distance to transit, by transit submode. In the case of Jacksonville, two transit modes are included: bus and the JTA Skyway. The Jacksonville Transportation Authority provided GIS data on transit stop locations, and a GIS-based script has been developed to calculate distances to the closest transit stop for every parcel in the region (see Figure 1.19).

Figure 1.19. Example of closest bus stops to parcel centroids.

Urban Form

A unique parcel-level measure of urban form that DaySim incorporates is the number of intersections or nodes of different types within quarter-mile and half-mile buffers. These intersection types include dead-ends (1 link), T-intersections (3 links), and traditional intersections (4+ links) and help characterize the pattern of urban development. An automated process has been developed to calculate these urban form measures for Jacksonville on the basis of detailed GIS street centerline files. In Burlington, an updated TIGER/Line road map for Vermont was downloaded and used for this purpose. Both of these networks are more detailed than the modeled network, which does not include all streets.

The GIS process first analyzes the GIS street centerline file to locate nodes and assign an intersection-type code based on the number of links joined to the node. The process then creates buffer areas around each parcel and counts the number of intersections of each type that fall within the buffers (see Figure 1.20).

Parking

DaySim uses information on the number and prices of both daily and hourly parking spaces on the parcel and within quarter-mile and half-mile buffers of each parcel. The project team has inventoried the location and pricing information for paid off-street locations but has not yet obtained or developed accurate information on capacities.

The point-based parking locations are assigned to individual parcels to develop the parcel data (e.g., PARKDY_P = daily paid parking spaces on the parcel). The buffer variables are calculated in a way similar to the urban form network buffers already described, using a GIS process to create buffer areas around each parcel, sum the parking capacity within the buffers, and calculate the weighted average of the parking price at the parking lots in the buffer.
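The intersection-typing step described under Urban Form (counting the links that meet at each node) can be sketched as follows. This is a minimal illustration and not the automated GIS process itself: it assumes the street centerline file has already been reduced to a list of links given as pairs of node identifiers, and the type labels are hypothetical.

```python
from collections import Counter


def classify_nodes(links):
    """Assign an intersection type to each node from its number of links.

    links: iterable of (node_a, node_b) pairs, one per street segment.
    Nodes joined to exactly two links are treated as shape points rather
    than intersections and are omitted (an assumption of this sketch).
    """
    degree = Counter()
    for node_a, node_b in links:
        degree[node_a] += 1
        degree[node_b] += 1

    node_types = {}
    for node, n_links in degree.items():
        if n_links == 1:
            node_types[node] = "dead_end"          # 1 link
        elif n_links == 3:
            node_types[node] = "t_intersection"    # 3 links
        elif n_links >= 4:
            node_types[node] = "four_plus"         # 4+ links
    return node_types


links = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E"),
         ("D", "F"), ("D", "G"), ("G", "H")]
node_types = classify_nodes(links)
```

Once nodes carry a type code, the counts within quarter-mile and half-mile buffers follow the same pattern as the other parcel buffer variables.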

Figure 1.20. Example of quarter-mile and half-mile node buffers.

DaySim TAZ Data

Parcels are the primary spatial units used in DaySim. However, current implementations have used a limited set of TAZ-level data, including PUMA and summary district correspondence codes, and physical attributes such as the land area and coordinate locations. The Jacksonville model uses the TAZ system from the NERPM model, except that the zones are renumbered so that external zones are first (1-23) and internal zones are numbered consecutively (24-1,335). Similarly, the Burlington model uses the TAZ system from the CCMPO model.

Table 1.37 shows the file layout for the Jacksonville implementation of DaySim. The XCORD, YCORD, and SQFT_Z fields were developed in ArcGIS using the TAZ shape file supplied with the NERPM model. PUMAs were assigned to each TAZ by intersecting the TAZ shape file with a PUMA boundary shape file obtained from the Census Bureau.

Table 1.37. TAZ Data Input File Format

Label [a]   Definition
TAZ         Zone number
AUTACC      Auto access time (min × 100) [b]
AUTEGR      Auto egress time (min × 100) [b]
PRKCOST     Parking cost in zone (cents per hour) [b]
DAVIS       Davis dummy (0/1)
PEDENV      Pedestrian environment score [b]
PUMA        PUMA code for zone
RAD         RAD code for zone (aggregation of zones) [b]
XCORD       X coordinate of zone centroid (state plane feet)
YCORD       Y coordinate of zone centroid (state plane feet)
PKNRCOST    Park and ride lot cost in zone (cents)
SQFT_Z      Area of zone (square feet)

[a] DaySim can read these variables in any order, but the variable names must remain the same as shown. All values from the file are read as integers, with no decimal.
[b] Not used in models.

DaySim Pricing Enhancements

Key goals of the SHRP 2 C10A model system development effort include providing enhanced representation of travelers' sensitivities to price and incorporating findings from other SHRP 2 Capacity projects. This section describes how DaySim and TRANSIMS have been refined and configured to provide more robust capabilities with respect to tolls and other types of road user charges that are modeled in the integrated DaySim-TRANSIMS model framework. The section starts with a more theoretical discussion of an ideal model representation, then addresses the more practical aspects of the current implementation. In several aspects of this work, the project team has made use of the findings from the SHRP 2 C04 project, Improving Our Understanding of How Highway Congestion and Pricing Affect Travel Demand. The objective has been to implement the key findings from that study in a manner that retains as much behavioral detail as possible while also remaining practical for model application.

Choice Model Context

DaySim is an activity-based model (ABM) structure that includes several different levels of travel choices, as shown in Figure 1.21. The solid arrows in the figure depict the downward flow of conditionality of the simulated choices: auto ownership is conditioned by the simulated longer-term work and school locations, the day activity pattern (tour generation, plus some aspects of intermediate stop generation) is conditional on all longer-term choices, the tour-level choices are conditional on the longer-term and day-level choices, and so on. The dashed arrows represent the upward flow of accessibility in the models; travel times and costs have the most immediate effects on the trip-level models but ultimately affect all of the DaySim models.

The DaySim structure has been carefully designed to include accessibility effects at all choice levels, as consistently and comprehensively as possible. (In the literature, this is termed vertical integrity, with consistent information flows both upward and downward.) This is done through the use of accessibility logsums, a logsum being a statistical construct used in discrete choice modeling to capture the expected utility across all available choice alternatives. As much as possible, DaySim uses fully disaggregate logsums, which essentially combine two models into a single joint, simultaneous model. An example is the tour-level main destination-choice model, which uses logsums representing the composite attractiveness of traveling to a given destination across all available modes.

For the longer-term and day-level choices, however, the use of fully detailed disaggregate logsums is not practical, because too many possible combinations of tour- and trip-level alternatives correspond to each upper-level choice alternative. Instead, these models use aggregate accessibility logsums, which are logsums from a simplified tour-level model across all possible destinations, modes, and time periods, segmented by (a) residence zone, (b) travel purpose, (c) car availability level, and (d) distance from the nearest transit stop.

Incorporating Auto Route Choice into DaySim

For modeling highway pricing and congestion effects, the relationship between auto route choice (at the bottom of Figure 1.21) and other travel choices is critical. As the available auto routes change in terms of their travel time and/or cost at different times of day, the most direct effects are on the travel mode and departure time chosen for a particular trip. (However, the effects on the other DaySim choices are important to represent as well and will be addressed.)

In previous implementations of DaySim, auto route choice has been handled outside of DaySim itself. For a given auto submode, such as single-occupancy vehicle (SOV), and a given time period, such as a.m. peak, the best route has been predetermined in a network package such as CUBE, TransCAD, or EMME, and DaySim has simply used matrices of travel time, distance, and cost along the best path between each origin-destination (O-D) pair.
The team's previous work on integrating DaySim with TRANSIMS followed this same general approach, but with two key improvements: (a) a more realistic, dynamic representation of traffic congestion than is typically possible using a static equilibrium assignment approach, and (b) incorporation of much more detail regarding how congestion levels vary across the day. While DaySim typically uses highway skims from only four or five different time periods across the day, the current DaySim-TRANSIMS implementation uses skims from 22 time periods, with durations as short as 30 min during the peak periods.

Figure 1.21. DaySim model structure. [Diagram of choice levels. Longer-term choices: work and school location, auto ownership. Day-level choices: day pattern (tour generation). Aggregate accessibility. Tour-level choices: main destination choice, tour main mode choice, main activity scheduling. Trip-level choices: stop generation and location, trip mode choice, trip departure time, auto route (type) choice. Key: conditional choices; accessibility measures.]

If finding the best path through a network simply involves finding the shortest-time path, then all path building and assignment may satisfactorily be done outside of DaySim in a more aggregate environment. That may not be the case, however, when toll cost and/or operating cost become key considerations in choosing a route. Each traveler may have a different trade-off between travel time and cost: the so-called value of time (VOT) or willingness to pay (WTP) for time savings. The same is true if additional variables such as travel time variability or reliability are added to the route-choice model, in which case different travelers may also have different values of reliability (VOR) relative to travel time and cost.

Ideally, the network path choice would be fully integrated into an ABM such as DaySim. When applying a time-of-day choice for a given tour or trip, for example, DaySim would evaluate all available paths through the network at each time of day for that given traveler on that given tour or trip. In other words, network path choice would be done "on the fly" in a fully disaggregate manner depending on each traveler's trade-offs between travel time, toll, distance, and any other important route characteristics that are known in the network.

Although the project team studied possibilities for setting up such an "on the fly" integration of DaySim with TRANSIMS, it is not yet practical from the standpoint of computation and runtimes. Another possible solution is to set up different user classes depending on VOT and to use different network skim files with predetermined best paths for each class. For example, if the two best paths for a given O-D are a travel time of 40 minutes with no toll, or a travel time of 30 minutes with a $2.50 toll (and no difference in distance or operating cost), then any traveler willing to pay $2.50 for a 10-minute savings (VOT higher than $15/h) would choose the tolled path, and any traveler with VOT lower than $15/h would choose the free path.

In the same example, three user classes could be designed as (1) VOT = $0 to $10/h, (2) VOT = $10 to $20/h, and (3) VOT > $20/h. The best path for class (1) would clearly be the free route, and the best path for class (3) would clearly be the tolled route. For class (2), however, which path is best is not clear, because the indifference point of VOT = $15/h falls in the middle of the class (2) range of $10 to $20/h.

To deal with this type of inaccuracy, the project team has adopted an approach that is commonly used in practice:

• Provide network skims of the best paths for two different route types: routes from the full network, including tolled links, and routes from the nontolled network only, excluding tolled links.
• Incorporate a binary route-type-choice (tolled versus nontolled) model into DaySim.
• To further increase the accuracy of the approach, provide different sets of best tolled and nontolled paths for different ranges of VOT. (Even the nontolled links include operating cost, so the best nontolled path may also vary by VOT class.)
• Within each VOT class, use the VOT at the high end of the range to select the best path in TRANSIMS to use as input to DaySim.

Thus, in the example with three VOT classes, six different routes would be selected in TRANSIMS and input to DaySim:

1. VOT = 0-10/nontolled network: the lowest generalized-cost route with VOT set at $10/h, excluding tolled links;
2. VOT = 10-20/nontolled network: the lowest generalized-cost route with VOT set at $20/h, excluding tolled links;
3. VOT = 20+/nontolled network: the lowest generalized-cost route with VOT set at $50/h (an arbitrary, high value), excluding tolled links;
4. VOT = 0-10/full network: the lowest generalized-cost route with VOT set at $10/h, including tolled links;
5. VOT = 10-20/full network: the lowest generalized-cost route with VOT set at $20/h, including tolled links; and
6. VOT = 20+/full network: the lowest generalized-cost route with VOT set at $50/h (an arbitrary, high value), including tolled links.

Using the same example, if the best two paths are 40 min with no toll and 30 min with a toll of $2.50, then the best nontolled route for all VOT classes (options 1-3) would be the 40-min free route, as the tolled route is excluded. The best path from the full network for VOT = 0-10 (option 4) would also be the free route. But the best path from the full network for the other two VOT classes (options 5 and 6) would be the tolled path, because both of those classes use a VOT set higher than $15/h to pick the best path. Inside of DaySim, travelers with VOT in the range 0-10 would face a binary choice between two identical nontolled paths, which is essentially no choice.
Travelers with VOT higher than $10/h would all have a choice between the tolled and nontolled paths. The probability that any traveler will pick the tolled path increases with VOT, so moving the route-type choice inside of DaySim makes it more sensitive to small variations in WTP and ultimately more accurate. The binary route-choice model is a probabilistic model, however, so even for very high VOT there may be a small probability of selecting the free route.

The following sections provide more details about how the binary model is implemented in DaySim. First, however, it is useful to give more perspective on the need both to include a route-type-choice model within DaySim and to provide different skims to DaySim for different ranges of VOT. The feature using different VOT user classes is not always used in practice, and may seem unnecessary, because the binary route-type-choice model already accounts for differences in WTP. In cases with simple pricing scenarios, such as those that include one or two isolated high occupancy/toll (HOT) lanes or express lanes, the best tolled route is usually the facility in question, and that is not likely to vary across VOT classes.

A more detailed pricing scenario, however, such as mileage-based pricing on a regional freeway network, may include a large set of different tolled paths to choose from, and the best tolled path may vary according to VOT. In such a case, providing different best paths for different ranges of VOT helps compensate for the decision to choose a single best tolled path to input to DaySim rather than providing a larger set of possible tolled paths. The implication is that the more complex the regional pricing scenario, the larger the number of VOT-specific user classes that should be used.
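The six-route example above can be checked with a short Python sketch. It uses only the numbers given in the text (a 40-min free path, a 30-min path with a $2.50 toll, and path-selection VOT values of $10, $20, and $50 per hour) and ignores distance and operating cost because the example states that they do not differ between the two paths. The function and variable names are hypothetical, and the generalized cost used here (toll plus time valued at the class VOT) is an assumed simplification of what TRANSIMS would compute.

```python
PATHS = {
    "free": {"time_min": 40.0, "toll": 0.0, "tolled": False},
    "tolled": {"time_min": 30.0, "toll": 2.50, "tolled": True},
}

# VOT ($/h) used for path selection: the high end of each class range,
# with $50/h as the arbitrary high value for the open-ended class.
CLASS_VOT = {"0-10": 10.0, "10-20": 20.0, "20+": 50.0}


def generalized_cost(time_min, toll, vot_per_hour):
    """Generalized cost in dollars: toll plus travel time valued at the VOT."""
    return toll + vot_per_hour * time_min / 60.0


def best_path(vot_per_hour, allow_tolled):
    """Name of the lowest generalized-cost path on the chosen network."""
    candidates = {name: p for name, p in PATHS.items()
                  if allow_tolled or not p["tolled"]}
    return min(candidates,
               key=lambda name: generalized_cost(candidates[name]["time_min"],
                                                 candidates[name]["toll"],
                                                 vot_per_hour))


# One skim per (VOT class, network) pair: six routes in total.
skims = {(vot_class, network): best_path(vot, network == "full")
         for vot_class, vot in CLASS_VOT.items()
         for network in ("nontolled", "full")}
```

Under these assumptions the full-network route is the free path for the 0-10 class and the tolled path for the other two classes, which matches the outcome described in the text.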

Binary Route-Type-Choice Model

The route-type-choice model implemented in DaySim works as shown in Equation 1.2:

\[
\begin{aligned}
V(n,i) &= s \cdot b(i) \cdot \mathrm{Time}(n,i) + s \cdot c(i) \cdot [\mathrm{Distance}(n,i) \cdot \mathrm{opcost}] \\
V(t,i) &= s \cdot a(i) + s \cdot b(i) \cdot \mathrm{Time}(t,i) + s \cdot c(i) \cdot [\mathrm{Toll}(t,i) + \mathrm{Distance}(t,i) \cdot \mathrm{opcost}] \\
P(t,i) &= 1 - P(n,i) = \frac{\exp[V(t,i)]}{\exp[V(t,i)] + \exp[V(n,i)]}
\end{aligned}
\tag{1.2}
\]

where

V(n,i) and V(t,i) are the systematic logit utilities for the nontolled and tolled routes, respectively, for individual traveler i, and P(t,i) and P(n,i) are the corresponding binary logit probabilities;
Time(n,i), Time(t,i), Distance(n,i), and Distance(t,i) are the travel time and distance along the best nontolled and tolled routes, respectively, for traveler i, depending on the traveler/trip's origin, destination, time of day, and VOT class;
Toll(t,i) is the toll along the best tolled route for traveler i, depending on the traveler/trip's origin, destination, time of day, and VOT class;
a(i) is an alternative-specific constant for the tolled route for traveler i;
b(i) is the travel time coefficient for traveler i;
c(i) is the travel cost coefficient for traveler i;
s is a scale factor applied to all coefficients, denoting the scale of this model relative to mode choice; and
opcost is the auto operating cost per mile.

The strategy for providing best path skim values for time, distance, and toll from TRANSIMS to DaySim was explained in the previous section. The assumptions and methods used for setting coefficients a, b, c, and s are given in the next section. Note that if the two paths are identical in terms of time and distance, and the toll is zero, then the nontolled path is selected as the chosen route type without applying the model. Also note that operating cost per mile is treated as a constant in DaySim (which can be varied by the user to represent future fuel cost assumptions).
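A minimal Python sketch of the binary logit calculation may help make the utility terms concrete. It follows the definitions given above (scale factor s applied to the tolled-route constant, the time term, and the cost term, where cost is toll plus distance times operating cost). The function name, the argument layout, the units (minutes, miles, dollars), and the operating cost of $0.12 per mile in the example are assumptions made for illustration only.

```python
import math


def tolled_route_probability(time_n, dist_n, time_t, dist_t, toll_t,
                             a, b, c, s, opcost):
    """Binary logit probability of choosing the tolled route.

    time_*: minutes; dist_*: miles; toll_t and opcost: dollars and dollars
    per mile (assumed units). a, b, c, s are the tolled-route constant,
    time coefficient, cost coefficient, and scale factor.
    """
    # Identical paths with no toll: the nontolled route is chosen outright.
    if time_n == time_t and dist_n == dist_t and toll_t == 0:
        return 0.0

    v_n = s * b * time_n + s * c * (dist_n * opcost)
    v_t = s * a + s * b * time_t + s * c * (toll_t + dist_t * opcost)
    # Equivalent to exp(v_t) / (exp(v_t) + exp(v_n)), written in a
    # numerically stable form.
    return 1.0 / (1.0 + math.exp(v_n - v_t))


# Example: 40-min free path versus 30-min path with a $2.50 toll, equal
# distance (20 miles), using base work-tour values a = -1.0, b = -0.030,
# c = -0.15, s = 1.5 and a hypothetical operating cost of $0.12 per mile.
p_toll = tolled_route_probability(40.0, 20.0, 30.0, 20.0, 2.50,
                                  a=-1.0, b=-0.030, c=-0.15, s=1.5,
                                  opcost=0.12)
```

With these inputs the tolled route has the lower utility (the toll and the tolled-route constant outweigh the 10-min saving), so its probability is below one half.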
If DaySim is enhanced in the future to include a model of vehicle type choice (e.g., economy, sport utility vehicle, hybrid), then operating cost can be treated as traveler-specific. Also, network simulation software such as TRANSIMS could be enhanced to provide an O-D/time-of-day-specific estimate of average fuel usage based on speeds and traffic conditions (e.g., stop and go) along the route. In that case, average fuel usage could be another skim variable used as input to DaySim.

Traveler-Specific Coefficients

In setting traveler-specific coefficients for the model, the project team used the findings from the SHRP 2 C04 study to the greatest extent possible, both for the functional forms and the magnitudes. The values are set as shown in Equation 1.3:

Work tours:
\[
\begin{aligned}
c(i) &= \frac{-0.15/\$}{[\mathrm{income}(i)/30{,}000]^{0.6}\,[\mathrm{occupancy}(i)]^{0.8}} \\
b(i) &= (-0.030/\mathrm{min}) \times \text{draw from a lognormal distribution, with mean 1.0 and standard deviation 0.8} \\
a(i) &= -1.00 \\
s &= 1.5
\end{aligned}
\]

Nonwork tours:
\[
\begin{aligned}
c(i) &= \frac{-0.15/\$}{[\mathrm{income}(i)/30{,}000]^{0.5}\,[\mathrm{occupancy}(i)]^{0.7}} \\
b(i) &= (-0.015/\mathrm{min}) \times \text{draw from a lognormal distribution, with mean 1.0 and standard deviation 1.0} \\
a(i) &= -1.00 \\
s &= 1.5
\end{aligned}
\tag{1.3}
\]

The cost coefficient c is set at -0.15/$ for both work and nonwork tours. It is adjusted according to the household income of the traveler, using a power function with a somewhat higher exponent for work tours (0.6) than for nonwork tours (0.5). When applied to specific car occupancy levels, the cost coefficient is also adjusted downward for cost-sharing, again using a power function with a somewhat higher exponent for work tours (0.8) than for nonwork tours (0.7). The base travel time coefficient is set at -0.030/min for work tours and -0.015/min for nonwork tours. For an SOV trip for a traveler with income = $30,000, this corresponds to a VOT of 60 * -0.030/-0.15, or $12/h, for work tours, and 60 * -0.015/-0.15, or $6/h, for nonwork tours.
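The coefficient rules in Equation 1.3 and the $12/h and $6/h figures can be checked with a short Python sketch. It is an illustration only: the function names are hypothetical, the standard-library `lognormvariate` generator stands in for whatever sampler DaySim uses, and the sketch assumes that "mean 1.0 and standard deviation" describe the lognormal multiplier itself rather than the underlying normal distribution.

```python
import math
import random


def cost_coefficient(income, occupancy, purpose):
    """Cost coefficient c(i) per dollar, scaled by income and occupancy."""
    inc_exp, occ_exp = (0.6, 0.8) if purpose == "work" else (0.5, 0.7)
    return -0.15 / ((income / 30000.0) ** inc_exp * occupancy ** occ_exp)


def time_coefficient(purpose, rng):
    """Time coefficient b(i) per minute: base value times a lognormal draw."""
    base, sd = (-0.030, 0.8) if purpose == "work" else (-0.015, 1.0)
    # Convert the multiplier's mean (1.0) and standard deviation (sd) to the
    # parameters of the underlying normal distribution.
    sigma_sq = math.log(1.0 + sd ** 2)
    mu = -0.5 * sigma_sq
    return base * rng.lognormvariate(mu, math.sqrt(sigma_sq))


def value_of_time(b, c):
    """Implied value of time in dollars per hour."""
    return 60.0 * b / c


# Base case from the text: SOV, household income of $30,000, no random draw.
print(value_of_time(-0.030, cost_coefficient(30000.0, 1, "work")))
print(value_of_time(-0.015, cost_coefficient(30000.0, 1, "nonwork")))
```

Because income and occupancy enter the denominator, a higher income or a higher occupancy shrinks the magnitude of c(i) and therefore raises the implied VOT.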
The C04 study also found significant random taste variation around the base travel time coefficient, with the best results assuming a lognormal shape to the distribution, which is typical for VOT analysis. Although the results are not conclusive with regard to the amount of random variation, the C04 study and past analyses of this type generally support a coefficient of variation (standard deviation/mean) in the range of 0.7 to 1.0. Here the project team assumes a somewhat higher coefficient of variation for nonwork trips because that covers a wider variety of trip types. (Note that the code for performing random draws from approximate normal (Gaussian) and lognormal distributions uses the ratio of uniforms method of A. J. Kinderman and J. F. Monahan augmented with quadratic bounding curves. The original algorithm was published in Transactions on Mathematical Software, Vol. 18, No. 4, 1992, pp. 434–435.)

The alternative-specific constant for the tolled route is set at -1.0 for both work and nonwork tours, as evidence shows some aversion to paying tolls, all else being equal. Note that

simulating normal or lognormal taste variation around this coefficient for each individual would also be possible. However, empirical evidence to go by is lacking, and, statistically, estimating taste variation parameters on both the toll constant and the time or cost coefficient at the same time would be difficult.

The final parameter is s, the model scale relative to mode choice. If we think of the binary toll/nontoll model as a nest of mode alternatives under each of the auto alternatives in a mode-choice model, then the unscaled time and cost coefficients b and c are those used in the mode-choice model, while the scaled coefficients s.b and s.c are those used in the lower-level route-type-choice nest when the logsum parameter for the nest is 1/s. Empirically, the C04 report contains nest logsum parameters on the route-type-choice nest ranging from 0.9 (constrained) for the New York revealed preference (RP) data to 0.5 for various stated preference (SP) data sets. Here, the project team assumed a logsum of roughly 0.67, and the scale s is the inverse of that, at 1.5. Conceptually, the larger the value of s, the more sensitive and deterministic the route-type-choice model probabilities will be, and the less sensitive the logsum from the model will be to the attributes of the unchosen/inferior alternative. The logsum from the model is important, because it is that value which is fed upward from the auto route-type-choice model to all of the other DaySim models, as described in the following section.

Use of the Route-Type-Choice Model Within DaySim

Conceptually, the route-type-choice model can be thought of as a binary nest beneath each of the auto alternatives in the DaySim mode-choice model: single-occupancy vehicle (SOV), high-occupancy vehicle with two people (HOV2), and high-occupancy vehicle with three people or more (HOV3+).
Whenever auto time and cost for one of the auto submodes is referenced in the DaySim models, they need to be replaced by the composite utility from both the tolled and nontolled paths under each of those modes, just as they would be in a fully nested model. In DaySim, this is done by using the route-type-choice model to return a "generalized auto time" logsum whenever it is applied. The generalized time is calculated as shown in Equation 1.4:

\[
\begin{aligned}
V(n,i) &= s\,b(i)\,\mathrm{Time}(n,i) + s\,c(i)\,\mathrm{Distance}(n,i)\,\mathrm{opcost} \\
V(t,i) &= s\,a(i) + s\,b(i)\,\mathrm{Time}(t,i) + s\,c(i)\,[\mathrm{Toll}(t,i) + \mathrm{Distance}(t,i)\,\mathrm{opcost}] \\
GT(i) &= \frac{\ln\{\exp[V(t,i)] + \exp[V(n,i)]\}}{s\,b(i)}
\end{aligned}
\tag{1.4}
\]

The two utility equations are the same as presented earlier and are used to set the logit probability of choosing either route. The generalized time GT(i) is simply the logsum across those two alternatives, divided by the scaled travel time coefficient [s.b(i)] to obtain units of minutes. (Because a, b, and c are always negative and there are only two alternatives, the logsum will virtually always be negative as well, so the generalized time will be positive. However, a check has been placed in the code to avoid cases of negative generalized time.) When no tolled alternative exists, then the V(t,i) term is not used, so the generalized time simplifies to V(n,i)/[s.b(i)] = Time(n,i) + c(i)/b(i) * Distance(n,i) * opcost, which is simply travel time plus operating cost divided by VOT. In upper-level models, this generalized time is typically multiplied once again by a time coefficient, b(i), so it becomes b(i) * Time(n,i) + c(i) * Distance(n,i) * opcost, which is the unscaled version of V(n,i).
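A minimal Python sketch of the generalized-time logsum in Equation 1.4 follows. The function and argument names are hypothetical, and flooring the result at zero is only an assumed form of the negative-time check mentioned above; the source does not say how that check is implemented.

```python
import math


def generalized_auto_time(v_n, s, b, v_t=None):
    """Generalized auto time in minutes, following the form of Equation 1.4.

    v_n, v_t: scaled utilities of the nontolled and tolled routes
              (pass v_t=None when no tolled alternative exists).
    s, b:     scale factor and (negative) travel time coefficient.
    """
    if v_t is None:
        # No tolled path: the logsum reduces to the nontolled utility.
        logsum = v_n
    else:
        # Log-sum-exp of the two utilities, computed in a stable way.
        v_max = max(v_n, v_t)
        logsum = v_max + math.log(math.exp(v_n - v_max) + math.exp(v_t - v_max))

    gt = logsum / (s * b)
    # Assumed guard: never return a negative generalized time.
    return max(gt, 0.0)
```

With no tolled path, a 30-minute, 10-mile trip at an assumed operating cost of $0.12 per mile, b = -0.03, c = -0.15, and s = 1.5 gives V(n,i) = -1.62 and a generalized time of 30 + (0.15/0.03)(1.2) = 36 minutes. Adding any tolled alternative can only raise the logsum, so it lowers the generalized time.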
Note that the generalized time now includes the effects of tolls and operating cost as well as travel time, so all explicit utility terms related to time, tolls, and operating cost were replaced in the code for those models by the single generalized time term (times a relevant time coefficient, when appropriate).

Table 1.38 summarizes how the route-type-choice model is used within the various component models within DaySim. Note that only one of the lowest-level models, trip-mode choice, actually simulates a route-type choice (toll or nontoll) as a prediction, but nearly all of the DaySim models use the route-type-choice model in the form of the generalized auto time logsum. This feature ensures that the effects of pricing at various times of day are represented consistently at all levels of the model system. Also, most of the models take into account the effects of pricing and congestion separately for the SOV, HOV2, and HOV3+ modes; that allows the effects of HOT lanes and other occupancy-specific types of pricing and facilities to be accurately represented. The effects of pricing are also treated consistently for each of the 22 different skim periods in the tour and trip time-of-day choice models, so time-of-day variations in prices and congestion can have nuanced effects on demand. The tour-level and upper-level models also react to pricing for both legs of a tour round trip.

As indicated in the table, the upper-level models use disaggregate logsums from the tour-mode-choice model and/or aggregate mode and destination choice accessibility logsums to "carry up" the effects of pricing and congestion in a way that is as consistent as possible with discrete choice theory.

Feedback of DaySim Results to TRANSIMS

The DaySim model system produces a list of person-trips for a single day for the entire regional population.
With the incorporation of the route-type-choice model for tolling, three new variables have been added to the DaySim trip-level output file:
• The toll paid for the trip;
• The trip-specific (unscaled) time coefficient; and
• The trip-specific (unscaled) cost coefficient.

This extra information can be used by TRANSIMS to (a) know whether to exclude tolled links from possible paths when assigning the trip to the network, and (b) use the ratio of time and cost coefficients to determine the best VOT-specific path of each type. Both of these types of information will help ensure that the choice behavior being predicted by DaySim is consistent with the route choices and traffic flows being predicted by TRANSIMS.

Treatment of Travel Time Variability and Reliability

Travel time variability has not yet been included in the route-type-choice model or other choice models within DaySim. Although the C04 report provides a good deal of useful evidence regarding trade-offs between cost, usual travel time, and travel time variability and reliability, the project team has not yet determined a feasible way to simulate spatial, O-D-specific levels of travel time variability in TRANSIMS as input to the DaySim demand models. As discussed in the C04 report, most proxy-type variables that can be generated from a single run of the network model (e.g., congested time minus free-flow time) tend to be so highly correlated with the main travel time variable that they provide very little new information. Also, given the long TRANSIMS runtimes, any procedures that would require multiple network simulation runs to produce day-to-day distributions of O-D travel times are not practical at this point. The project team reviewed the latest available versions of the SHRP 2 C04 and L04 (Incorporating Reliability Performance Measures in Operations and Planning Modeling Tools) reports and considered the methods they suggested that might be both useful (in terms of adding real network spatial and temporal information) and feasible (in terms of computation and runtimes).

TRANSIMS Network

The supply side models developed for the Jacksonville and Burlington model implementations are based on the TRANSIMS network and travel assignment process.
This process assigns each household person's sequence of trips or tours between specific activity locations to roadways, walkways, and transit modes on a second-by-second basis for a full travel day. The network includes detailed information regarding the operational characteristics of the transportation facilities that may vary by time of day and by vehicle or traveler type. This information includes the number of lanes, lane-use restrictions, traffic controls and signal timing and phasing plans, turning restrictions, tolls, and parking fees.

Most of the detailed network coding can be synthetically generated from traditional transportation modeling networks or GIS files. Traffic engineering warrants and coding rules can be customized for local conditions. The resulting data for a regional network can be edited to more accurately reflect actual conditions in the field. However, because TRANSIMS and the SHRP 2 C10 project are designed to address transportation planning needs and future operational and policy scenarios, the network models should be designed and developed to dynamically adjust to future conditions rather than be fixed or limited to existing traffic controls.

Table 1.38. How Route-Type-Choice Model Is Used in DaySim

| DaySim Model | Predicts route-type choice? | Uses logsum as generalized auto time? | Used for modes . . . | Used for periods . . . | One way or round trip? |
| Work location | No | Yes | SOV, HOV2, HOV3+ (a) | Assumed (a) | Round trip (a) |
| School location | No | Yes | SOV, HOV2, HOV3+ (a) | Assumed (a) | Round trip (a) |
| Auto ownership | No | Yes | SOV, HOV2, HOV3+ (a) | Assumed (a) | Round trip (a) |
| Day-pattern choice | No | Yes | SOV, HOV2 (b) | Assumed (b) | Round trip (b) |
| Tour-destination choice | No | Yes | SOV, HOV2, HOV3+ (c) | Simulated (c) | Round trip (c) |
| Tour-mode choice | No | Yes | SOV, HOV2, HOV3+ | Simulated | Round trip |
| Tour-time-of-day choice | No | Yes | Predicted tour mode | All possible | Round trip |
| Stop-generation and location choice | No | Yes | Predicted tour mode | Predicted tour periods | One way via stop detour |
| Trip-mode choice | Yes | Yes | SOV, HOV2, HOV3+ | All possible | One way |
| Trip-time-of-day choice | No | Yes | Predicted trip mode | All possible | One way |

(a) Via both. (b) Via aggregate accessibility logsums. (c) Via disaggregate tour-mode-choice logsum.

Network Conversion Process

The TRANSIMS suite includes a number of tools to synthesize a TRANSIMS network from traditional MPO networks. These tools provide a quick method of developing a detailed TRANSIMS network without a lot of extra data collection and arduous network coding. This gets the model up and running quickly and uses the trip assignment process to identify locations where the synthetic process requires refinement. The generic process for converting a TP+ network is depicted in Figure 1.22.

The TPPlusNet program reads the one-way link records and the speed-capacity lookup table used in the regional model network (in Jacksonville this is a TP+ network, while in Burlington this is a TransCAD network). The program reformats, regroups, and reconfigures the data into standard TRANSIMS input link and node data files. TransimsNet then reads the modified link and node files to synthesize the additional information needed for a network simulation. This information includes pocket lanes, lane connectivity, parking lots, activity locations, and signal and sign warrants. The signal and sign warrants are typically reviewed and edited before the execution of IntControl. IntControl synthesizes traffic signal timing and phasing plans, detectors, and signs. The ArcNet program then creates ArcGIS shapefiles to display and edit the network. Figure 1.23 shows a typical TRANSIMS network following this conversion.

Figure 1.22. Network conversion process. (Diagram elements: TP+ Network, Speed-Capacity, TPPlusNet, Link Data, Node Data, TransimsNet, Synthetic Network, Signal/Sign Warrants, IntControl, Traffic Controls, ArcNet, Network Shapefiles.)

Figure 1.23. Typical TRANSIMS network.

Jacksonville Network Development

In consideration of the flexibility requirement and the project goal to develop a fine-grained network, the team started the model development by creating three network resolutions from the NERPM regional modeling data sets:
1. PLANNING Network. This network is equivalent to the NERPM regional modeling network.
2. ALLSTREETS Network. This network is equivalent to the NERPM regional modeling network plus all other existing minor streets such as neighborhood streets and alleys.
3. FINEGRAINED Network. This network adds local through streets to the PLANNING network to provide greater distribution of travel to, from, and within traffic analysis zones.

In addition to the information included in the NERPM network files, the TRANSIMS conversion process synthesizes the operational details required for network simulation. These include traffic controls, pocket lanes, lane connectivity, and lane-use or vehicle-use restrictions. The following sections describe the generic as well as network-specific conversion and enhancement processes, including network data inputs and their treatment and application. These networks were prepared using TRANSIMS Version 4 tools and later converted to TRANSIMS Version 5 in order to support the implementation of the C10A model system in TRANSIMS v5.

NERPM Network

Florida Department of Transportation (FDOT) District 2's Northeast Regional Planning Model (NERPM) maintains multiple networks in a single master network file. The master files are those files that are universally applicable to all scenarios in NERPM. These files are not altered from scenario to scenario. Instead, the files contain source data from which scenario-specific information is extracted.
1. MERGED-GIS. This is a set of files that collectively form an ESRI shapefile corresponding to the master network.
2. MERGED.NET. This is the master network file from which all scenarios and alternatives are derived.
3. TCARDS.PEN.
This is the turn penalty file that contains all turning movement penalties and prohibitions for all scenarios. Penalty sets are used to distinguish scenarios.

Thus, various modeling year networks are coded as scenarios with specific attributes. As such, their values change from scenario to scenario. Scenario-specific attributes are identified by the presence of catalog keys designating the year and alternative of the scenario. For example, the attribute for facility type for the 2005 base year scenario is called FTYPE_05A, whereas the facility type attribute for the 2030 network is called FTYPE_30A. A breakdown of the default scenario-specific networks is as follows:
• 00A: 2000 base year;
• 05A: 2005 interim year;
• 10A: 2010 existing-plus-committed (no socioeconomic data);
• 15A: 2015 interim year;
• 25A: 2025 interim year;
• 30A: 2030 cost feasible plan horizon year; and
• 30N: 2030 needs plan (no socioeconomic data).

The C10A study focuses on using the 2005 scenario for the base year network and the financially constrained 2030 scenario for the forecast year network. The MERGED.NET master network is conflated with the MERGED-GIS ESRI shapefile and allows viewing the true shapes of the links in CUBE. The shape information is also exported to the output shapefiles for input to the TransimsNet program. (See Figure 1.24.)

The NERPM model network represents grade separations by using distinct nodes that are superimposed. The distinction is not visible at normal zoom levels. However, grade-separated roadways have unique Anode–Bnode pairs relative to the underpass/overpass roadways. Figure 1.25 highlights nodes where links are seemingly intersecting with their cross streets but in reality need to be represented as grade separations.

The 2005 regional planning network has approximately 9,800 directional links, 6,500 nodes, and 1,642 zones. Shape points or curvature information is available for approximately 6,000 directional links.
The project team converted the 2005 TP+ network with time-of-day variations to a time-dependent TRANSIMS network. The TRANSIMS network was subsequently used to route and simulate the TP+ trip tables. (See Figure 1.26.) The NERPM master network attributes are described in Table 1.39.

NERPM Speed Capacity

The Jacksonville regional model stores speed and capacity information for links in TP+ formats. An example file for the base year 2005 is shown in Figure 1.27. For a given combination of area-type range, facility-type range, and lane range, this file provides a corresponding speed and capacity value. The speed is considered to be free-flow speed and is translated as such during conversion to TRANSIMS formats.
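The range-based lookup described above can be sketched in Python as follows. This is an illustration of the idea only: the row layout, the first-match rule, and every numeric value in `EXAMPLE_ROWS` are assumptions and are not taken from the NERPM speed-capacity file shown in Figure 1.27.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class SpeedCapRow(NamedTuple):
    area_types: Tuple[int, int]      # inclusive range of area-type codes
    facility_types: Tuple[int, int]  # inclusive range of facility-type codes
    lanes: Tuple[int, int]           # inclusive range of directional lanes
    free_flow_speed: float           # units as stored in the file
    capacity: float                  # units as stored in the file


def lookup_speed_capacity(rows: Sequence[SpeedCapRow], area_type: int,
                          facility_type: int,
                          lanes: int) -> Optional[Tuple[float, float]]:
    """Return (free-flow speed, capacity) from the first matching row."""
    for row in rows:
        if (row.area_types[0] <= area_type <= row.area_types[1]
                and row.facility_types[0] <= facility_type <= row.facility_types[1]
                and row.lanes[0] <= lanes <= row.lanes[1]):
            return row.free_flow_speed, row.capacity
    return None  # no row covers this combination


# Hypothetical rows, for illustration only.
EXAMPLE_ROWS = [
    SpeedCapRow((11, 19), (20, 29), (1, 2), 35.0, 800.0),
    SpeedCapRow((11, 19), (20, 29), (3, 9), 40.0, 850.0),
    SpeedCapRow((20, 55), (20, 29), (1, 9), 45.0, 900.0),
]

print(lookup_speed_capacity(EXAMPLE_ROWS, 12, 23, 2))
```

Returning `None` for an uncovered combination makes gaps in the table visible during conversion rather than silently assigning a default speed.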

Figure 1.24. NERPM master network.

Figure 1.25. Locations of NERPM superimposed nodes.

Figure 1.26. NERPM year 2005 network (shown in light blue).

Figure 1.27. NERPM speed capacity file.

Table 1.39. NERPM Network Attributes

| Attribute | Description |
| FTYPE_{year}{alt} | This is the facility type attribute. It distinguishes such facility types as freeway, arterials, and collectors. Values range from 11 to 99 and generally follow Florida Standard Urban Transportation Modeling Structure (FSUTMS) highway network coding practices. A facility type value of zero is used to indicate that a particular link is not present in a given scenario. For any given scenario, the facility type for many of the links in the master network is zero. This is one of three attributes used to calculate link speeds and capacities, the other two being area type and number of lanes. |
| ATYPE_{year}{alt} | This is the area type attribute. It distinguishes such land uses as CBD, Residential, and Rural. Values range from 11 to 55 and generally follow standard FSUTMS highway network coding practices. This is one of three attributes used to calculate link speeds and capacities, the other two being facility type and number of lanes. |
| LANES_{year}{alt} | This is the attribute that designates the number of directional lanes on any given link. Values range from 1 to 9 and follow standard FSUTMS highway network coding practices. This is one of three attributes used to calculate link speeds and capacities, the other two being facility type and area type. |
| IMPROV_{year}{alt} | This attribute indicates whether a particular link is a roadway improvement project that first becomes active in this scenario. Values are Y(es)/N(o). |
| AGENCY_{year}{alt} | This attribute identifies the agency responsible for making the roadway improvement (indicated in the improvement attribute) if that agency is known. Values are the names of the funding agencies. |
| DESC_{year}{alt} | This attribute describes in plain text the nature of the roadway improvement indicated in the improvement attribute. |

TAZ-Area-Type Definition

Area-type information is typically stored as an attribute of the TAZs in most regional models. TransimsNet was therefore designed to read this TAZ-area-type equivalence to allow users to control the generation of synthetic TRANSIMS network elements such as pocket lanes by area type in addition to facility type. Figure 1.28 shows an example of the TransimsNet control parameters. However, the NERPM area types are only available on links and not on zones. The area types are also subdivided to range from 11 to 99. Figure 1.29 shows the distribution of area types by link.

Because the area types were link-based, a representative area type for a given TAZ could not be easily established given the presence of multiple area types within a TAZ. This problem was overcome by weighting link-based area types with their lane-feet to create a TAZ-area-type equivalence.

Network Corrections

As mentioned earlier in this chapter, the superimposed nodes in the NERPM network were collapsed during the TransimsNet application to synthesize TRANSIMS network elements. Only the nodes that connect what would otherwise be a single continuous link are considered during the collapsing process. Attention is paid to link attributes such as the facility type, number of lanes, and speeds. The nodes are not collapsed if any of these attributes differ; in other words, only homogeneous links are merged.

During the network conversion and simulation processes, a number of issues with the network were revealed that required the project team to implement a series of network refinements. These issues can be divided into two basic types. The first type is refinements pertaining to large abrupt changes in roadway attributes such as facility type, through lanes, and speeds. These refinements were included in the TPPlusNet conversion scripts so that the procedures remain automated and applicable to future-year networks and network alternatives. The abrupt changes can be classified as follows:
1. Discontinuities in facility types;
2. Discontinuities in through lanes; and
3. Discontinuities in speeds.

Figure 1.30 shows the locations where such discontinuities were observed.

Figure 1.31 illustrates an example location where the functional class changed to local (FTYPE_05 = 46) for a short distance within an otherwise continuous major arterial (FTYPE_05 = 23) as the link approached an intersection. This resulted in an extremely short link that complicated intersection operations.

The next type of discontinuity was the unrealistic change of two or more through lanes as seen in Figure 1.32. Because

lanes are the primary source of capacity within the simulation and lane changing is one of the primary reasons for congestion and lost vehicles, these types of errors are extremely problematic.

Figure 1.28. TransimsNet controls, by area type.

Figure 1.29. NERPM link-based area types.

Figure 1.30. All locations where TPPlusNet issued warnings.

Figure 1.31. Example location showing short link resulting from facility-type change.

Figure 1.32. Example location where roadway cross section changed abruptly.

The last type of discontinuity was a large change in speeds as seen in Figure 1.33. The speed discontinuity was the result of a change in the coded area type for the link.

These three coding errors result in a significant number of unnecessary nodes and short links, which cause congestion problems in the simulation. Most of the issues were addressed by modifying the TPPlusNet conversion script and collapsing the nodes. Figure 1.34 shows the distribution of locations where these nodes were dropped. Since the edits were performed at the input level, they can easily be applied to all network resolutions and analysis years.

The second group of issues relates to more systematic problems raised by the assignment visualization. The visualization results showed significant backups on freeways at several locations far away from their real access points. Investigation showed several locations where freeways intersect with arterial roadways because of network coding errors in the input NERPM master network. These were locations where links should have been coded using grade-separated nodes, but the link was assigned the wrong node number. An example is shown in Figure 1.35. Again, to address this issue, network edits were performed to the NERPM TP+ network to keep the conversion procedure intact.

Intersection Controls

Traffic Signals

TRANSIMS includes a number of ways to change the configuration of the network by time of day. In addition to the roadway configurations, traffic controls can also vary by time of day. The signal timing and phasing plans can be adjusted to optimize time-of-day flow conditions. Signal progression tools are also available to coordinate fixed time signals along specified corridors or throughout a grid system.
Demand-actuated signals can include multiple detectors and simulate ramp metering behavior. The signal formats also allow changing signal types by time of day.

A rich data set containing the location of all signals in ESRI shapefile format was available from FDOT for the entire state of Florida. As discussed in earlier sections, this signal location information was used to replace rule-based signal warrants from TransimsNet. However, some processing was required by the project team to interface these data with TRANSIMS programs. There was no equivalence between these signals and the NERPM node numbers. The signals had to be spatially or manually tagged to the 2005 scenario of the NERPM master network nodes. As an outcome of this process, one or more nodes were identified as part of a single signal in the Jacksonville region. This occurred because, in a number of cases, a single real-world intersection is represented by two or more nodes in the NERPM network.

Figure 1.33. Example location with sudden drop in speeds.

Figure 1.34. Nodes collapsed as result of removing discontinuity in link attributes.

For example, when a divided arterial is represented as a pair of two one-way links, the intersections are represented by two nodes. The tagging process was useful in two respects. First, it served as a direct resource for the replacement of rule-based synthetic signal warrants in TRANSIMS Version 4 network conversion across all three network resolutions. Also, it provided for leveraging the group control feature of the TRANSIMS Version 5 signal format, in which a single signal could be defined over more than one node.

Figure 1.35. NERPM freeways intersecting with arterials because of coding errors.

The following steps describe how this process of matching FDOT signal locations with the NERPM master network (MERGED.NET) was carried out:
1. The merged_node and merged_link shapefiles were projected onto the coordinate system of the traffic_signal_locations shapefile, NAD_1983_UTM_Zone_17N.
2. Using a spatial join, the project team joined all of the nodes to the nearest traffic signal. A filter was used so that only nodes whose joining distance was less than or equal to 50 ft were considered.
3. When deciding which nodes should be attached to a specific traffic signal, the team used a few general cases that came up regularly to make this decision. These cases are explained in the following examples. The red nodes indicate the nodes that were attached to the traffic signal.
a. If two single-link roadways intersected at a single node intersection, then only that node was attached to that traffic signal. [Example illustration: node 38317.]

b. If a single-link roadway intersected a double-link roadway, then the nodes at the two intersection points were attached to the traffic signal.
c. If two double-link roadways intersected, then all four nodes of the intersection were attached to the traffic signal.
d. Freeway on and off ramps usually had two traffic signals: one on either side of the freeway. Usually two nodes were attached to each traffic signal as shown.
[Example illustrations for cases b through d, showing labeled node numbers.]

Traffic signal intersections that varied from these examples were evaluated on a case-by-case basis. The intersections were examined using Google Earth, and a determination was made as to which nodes belonged to each signal. Only traffic signals with a VALUE_field of 02 were used in this mapping. Others, such as flashing beacons and school signals, were omitted. The output shapefile with mappings was formatted for use with TransimsNet.

Unsignalized Intersections

The sign warrant creation logic in TransimsNet was used to synthesize traffic controls for the unsignalized intersections. This logic considers the facility class levels and their differences in addition to the area-type information to determine whether and what type of sign control is required for each of the approaching roadways at an intersection. The TransimsNet program allows the user to define rules for generating signal warrants and places sign warrants for the remaining intersections. Because signal location information from the FDOT data set was used, no rules were specified in TransimsNet, resulting in the creation of sign warrants for all intersections in the region. The IntControl program was later provided with the FDOT signal location-based signal warrants, and TransimsNet generated signal timing and phasing plans and sign controls for the intersections.
Naturally, many conflicting sign and signal warrants resulted; signal warrants were preserved and sign warrants were discarded.

Network Conversion Process

The network conversion process starts with applying TPPlusNet to convert TP+ network formats to generic TRANSIMS link and node format. During this application, the coordinate system is also converted from Florida StatePlane 0901 in feet to UTM 17N in meters. The resulting TRANSIMS network is maintained in metric units. This process is depicted in Figure 1.36. The TPPlusNet conversion script is critical to the conversion process. It maps the NERPM functional codes to TRANSIMS facility-type strings and populates the hourly capacity, number of lanes, maximum speed, and free-flow speed fields. Table 1.40 shows how the TP+ functional class codes were mapped to TRANSIMS facility types.

The number of lanes coded in the NERPM network represents all-day travel lanes and excludes parking and turn lanes. In TRANSIMS, all lanes are coded along with link names and distances, which are converted from feet into meters. The generic TRANSIMS link, node, and shape files are then provided to TransimsNet along with TAZ area-type equivalence files to generate synthetic TRANSIMS network elements as shown in Figure 1.36.
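The TAZ area-type equivalence file referred to above was built by weighting link-based area types by lane-feet. A minimal Python sketch of one way to do this follows. The source does not state the exact rule, so the sketch assumes that each TAZ takes the area type with the largest total lane-feet (area types being categorical codes), that ties go to the lower code, and that each link has already been assigned to a TAZ. The function name and record layout are hypothetical.

```python
from collections import defaultdict


def taz_area_types(links):
    """Pick one area type per TAZ by total lane-feet.

    links: iterable of (taz, area_type, length_ft, lanes) tuples.
    Returns a dict mapping each TAZ to its dominant area-type code.
    """
    lane_feet = defaultdict(lambda: defaultdict(float))
    for taz, area_type, length_ft, lanes in links:
        # Lane-feet weight: link length times number of lanes.
        lane_feet[taz][area_type] += length_ft * lanes

    result = {}
    for taz, by_type in lane_feet.items():
        # Largest weight wins; on a tie, prefer the lower area-type code.
        best_type, _ = max(by_type.items(), key=lambda kv: (kv[1], -kv[0]))
        result[taz] = best_type
    return result


# Hypothetical link records: (taz, area_type, length_ft, lanes).
example_links = [
    (1, 11, 1000.0, 2),
    (1, 31, 500.0, 2),
    (1, 31, 800.0, 2),
    (2, 51, 100.0, 1),
]
print(taz_area_types(example_links))
```

In the example, TAZ 1 has 2,000 lane-feet of area type 11 and 2,600 lane-feet of area type 31, so it is assigned area type 31.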

Figure 1.36. Network conversion process. [Flowchart elements: NERPM TP+ Nodes.shp and Links.shp; TPPlusNet with the functional class conversion script (FTYPE = "05A"); TransimsNet with zone centroids and area types; ArcNet shapefile outputs for node, link, shape, activity location, parking, process link, lane connectivity, pocket lane, sign warrants, and signal warrants.]

To overcome the problem of visualizing network topology at grade-separated crossings, the collapse-nodes feature in TransimsNet was used to merge homogeneous links and reduce the number of input nodes that are kept in the output network. Following this step, synthetic intersection controls are created using the process shown in Figure 1.37, and the activity location fields are updated and expanded to include zonal attributes.

PLANNING Network

The PLANNING network is a network resolution including only the NERPM regional modeling links. The corresponding scenario in the NERPM model is year 2005, or 05A. Only the records defined in the 05A scenario, that is, records containing nonzero values for field FTYPE_05A, were included. The resulting network, which includes 6,525 nodes and 9,864 links, is shown in Figure 1.38.

Table 1.40. TRANSIMS Functional Class Mapping

FUNCLASS Codes            TRANSIMS Facility Type
11–12, 79–84, 90–91       FREEWAY
16–17, 61, 85, 93         EXPRESSWAY
70–78, 86–89, 97–98       RAMP
20–22, 60, 94–95          PRINCIPAL
23–25, 62–63              MAJOR
30–38, 64                 MINOR
15, 40–43                 COLLECTOR
65–68                     FRONTAGE
44–49                     LOCAL
29                        FERRY
50                        ZONE CONNECTOR
52                        EXTERNAL
92                        OTHER

Figure 1.37. Process to create synthetic intersection controls. [Flowchart elements: TransimsNet sign warrants for all intersections and signal locations from FDOT (signals override signs); node, link, lane connectivity, and pocket lane inputs; IntControl outputs of signalized node, timing plan, phasing plan, detectors, signal coordinator, and unsignalized node.]
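The functional class mapping in Table 1.40 can be illustrated with a short lookup. This is only a minimal sketch: the code ranges are transcribed from the table, but the names FACILITY_RANGES, build_lookup, and facility_type are hypothetical and are not part of TPPlusNet or its conversion script.

```python
# Hypothetical sketch of the Table 1.40 lookup; not the actual TPPlusNet script.
# Code ranges are transcribed from Table 1.40 and are inclusive on both ends.
FACILITY_RANGES = {
    "FREEWAY": [(11, 12), (79, 84), (90, 91)],
    "EXPRESSWAY": [(16, 17), (61, 61), (85, 85), (93, 93)],
    "RAMP": [(70, 78), (86, 89), (97, 98)],
    "PRINCIPAL": [(20, 22), (60, 60), (94, 95)],
    "MAJOR": [(23, 25), (62, 63)],
    "MINOR": [(30, 38), (64, 64)],
    "COLLECTOR": [(15, 15), (40, 43)],
    "FRONTAGE": [(65, 68)],
    "LOCAL": [(44, 49)],
    "FERRY": [(29, 29)],
    "ZONE CONNECTOR": [(50, 50)],
    "EXTERNAL": [(52, 52)],
    "OTHER": [(92, 92)],
}


def build_lookup(ranges):
    """Expand the inclusive code ranges into a code -> facility type dictionary."""
    lookup = {}
    for facility, spans in ranges.items():
        for low, high in spans:
            for code in range(low, high + 1):
                if code in lookup:
                    raise ValueError(f"FUNCLASS {code} is mapped twice")
                lookup[code] = facility
    return lookup


FUNCLASS_TO_FACILITY = build_lookup(FACILITY_RANGES)


def facility_type(funclass):
    """Return the facility type string for a code; KeyError if the code is unmapped."""
    return FUNCLASS_TO_FACILITY[funclass]
```

Raising an error for unmapped or doubly mapped codes, rather than silently defaulting, is a design choice of this sketch; it mirrors the spirit of the warnings the conversion process raises for links with missing attributes.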

ALLSTREETS Network

The PLANNING network and the ALLSTREETS network were envisioned as the two ends of network resolution for this project. While the PLANNING network was limited to the NERPM modeling links for the 2005 scenario, the ALLSTREETS network used links that were not defined in any of the NERPM scenarios. Such links in the NERPM master network are assumed to be existing streets of very low facility class (for instance, neighborhood streets) that do not carry any significant level of traffic. These additional links were categorized as locals in TRANSIMS, equivalent to NERPM facility number 42 and marked with a common name for identification purposes (ADDON_DETAIL). Modification of the TPPlusNet conversion script to include these additional streets was the only difference in the ALLSTREETS network conversion process compared with that of the PLANNING network. The resulting TRANSIMS network, which contains 51,420 nodes and 69,361 links, is shown in Figure 1.39.

Figure 1.38. PLANNING network.

Figure 1.39. ALLSTREETS network.

FINEGRAINED Network

In accordance with one of the primary goals of this project, a fine-grained network was developed as an intermediate resolution network for greater policy sensitivity and increased fidelity without the huge computational overhead associated with the ALLSTREETS network. Since additional local through streets are the fundamental difference between the ALLSTREETS and PLANNING networks, different filtering methodologies can produce several different intermediate resolution networks.

The selection process, however, needs to address the following considerations to have any reasonable or meaningful impact on the TRANSIMS simulation process:

1. While all streets bring more realism to the modeling process, they come with heavy computational costs, potentially resulting in very large and unreasonable processing times.

2. Additionally, not all streets bring value to the TRANSIMS simulation modeling. For example, long dead-end links may provide better estimates for disaggregate travel times, but they do not affect nearby links or path alternatives; thus the simulation results would be more or less unchanged.
3. The chosen network resolution needs to strike a balance between reasonable representation of roadway accessibility and the additional burden on model processing times.

Given these considerations, the project team developed an approach that balances the mobility and accessibility factors within the project scope. The presence of additional network detail provides opportunities to consider a wider range of path options or diversions to, from, and within traffic analysis zones. Links were selected for inclusion in the FINEGRAINED network based on the frequency with which they were used on paths between zone origins and destinations.

Figure 1.40 demonstrates this conceptual approach, showing planning links in red, zone boundaries in purple, additional links in brown, and potential links for selected zones in blue based on distance-based paths built from a single zone. Since the analysis of mobility options was the primary motivation for this approach, the PLANNING network is used as a benchmark for keeping a subset of the additional ALLSTREETS links that provided new path diversions. Multiple levels of network resolution can thus be created from this subset by filtering based on an accessibility score. This score could alternatively be considered a parameter for choosing the intermediate network resolution. The score would compute the percentage of the regional employment, population, and households within a given distance from roadways.

Figure 1.41 shows an approach for further filtering the additional streets by examining volume levels. This process of selecting local through streets requires building numerous paths between targeted activity locations.
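The link selection idea described above can be sketched in a few lines: PLANNING links are always kept, and an additional street is kept only if the paths built between activity locations actually use it. This is a hedged illustration, not the project's procedure; the function name, the dictionary layout, and the min_volume threshold are assumptions made here for clarity. Only the ADDON_DETAIL street name comes from the text.

```python
# Hypothetical sketch of filtering additional streets by path usage.
ADDON_NAME = "ADDON_DETAIL"  # common name given to the additional local streets


def select_finegrained_links(links, path_volume, min_volume=1):
    """Split link IDs into (kept, deleted).

    links: iterable of dicts with keys 'link_id' and 'name' (assumed layout).
    path_volume: dict mapping link_id -> number of built paths using the link.
    min_volume: assumed threshold; additional streets below it are dropped.
    """
    kept, deleted = [], []
    for link in links:
        is_additional = link["name"] == ADDON_NAME
        volume = path_volume.get(link["link_id"], 0)
        if is_additional and volume < min_volume:
            deleted.append(link["link_id"])  # unused local street
        else:
            kept.append(link["link_id"])  # PLANNING link or used local street
    return kept, deleted
```

Raising min_volume would produce a coarser intermediate network, which is one way to read the volume-level filtering shown in Figure 1.41.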
The TRANSIMS Version 5 program PathSkim, with its enhanced capabilities, was used to build these paths.

Additionally, the project team found that deleting undesirable streets from the ALLSTREETS network was an easier way to create the FINEGRAINED network than adding selected streets to the PLANNING network.

Figure 1.40. Example of qualified links.

Figure 1.41. Filtering qualified links by volume levels.

Figure 1.42. FINEGRAINED network conversion overview. [Flowchart elements: synthetic ALLSTREETS network; ESRI ArcGIS selection of activity locations on PLANNING links; PathSkim link volumes; ESRI ArcGIS selection of zero-volume non-PLANNING links; TransimsNet; synthetic FINEGRAINED network; IntControl synthetic intersection controls; LocationData updated Activity_Location.]

Figure 1.42 shows the process of creating a FINEGRAINED network

given the PLANNING and ALLSTREETS networks using PathSkim.

The accessibility scores for the resulting FINEGRAINED network were computed by measuring the Euclidean distance of each parcel in the region from its spatially nearest roadway. The distribution of employment, population, and households in the region is summarized by this distance in Table 1.41. The FINEGRAINED network chosen for this project had the following accessibility score:

• 80% of the regional employment is within 1/10th of a mile of a modeled roadway.
• 50% of the regional population is within 1/10th of a mile of a modeled roadway.
• 50% of the regional households are within 1/10th of a mile of a modeled roadway.

Figures 1.43 through 1.45 show the employment, population, and household distributions for the PLANNING and FINEGRAINED networks.

The resulting FINEGRAINED network, which includes 10,577 nodes and 16,910 links, is shown in Figure 1.46. Figure 1.47 shows the distribution of the additional network detail in the FINEGRAINED network in contrast to the PLANNING network. A comparison of the network attributes included in the three network resolutions is provided in Table 1.42.

Network Conversion Summary

The overall network conversion process is summarized in the following steps:

1. Enable True Shape Display in CUBE using MERGED-GIS ESRI shape file and TP+ network files.
2. Export the TP+ network data files to link and node ESRI shape-file format. Note that the user is not required to perform any attribute-based filtering to create specific scenarios in this step. The entire master network is exported. The filtering process is performed in subsequent steps.
3. Run TPPlusNet to convert the TP+ link and node files to generic TRANSIMS input files. During this process a conversion script is used to translate NERPM facility-type codes, number of lanes, speeds, and capacities to TRANSIMS coding rules. Zone centroids and connectors are excluded.
For the PLANNING network, records containing nonzero values for field FTYPE_05A are selected. For the ALLSTREETS network, links that are not defined in any other scenarios are included as locals and marked with a separate name. Warnings are raised if area type or lanes are not defined for such links, since those are required to obtain speed and capacity information from the speed-capacity lookup table. These warnings are examined, and appropriate values for missing attributes are coded on the basis of neighboring links. This step is repeated until all the warnings are addressed.
4. Run ArcNet to review basic link information and network continuity related to facility types, speeds, capacities, and number of lanes.
5. Run TransimsNet with the shapes file to create the synthetic TRANSIMS network files such as pocket lanes, lane use, lane connectivity, parking lots, activity locations, process links, and signal and sign warrants. Note that rules for creating signal warrants are not provided in TransimsNet; thus the resulting signal warrants file is empty. In the absence of these signal rules, all of the intersections in the region are "filled" with sign warrants, as it were. The FDOT signal location information takes the place of signal warrants and replaces the prefilled sign warrants wherever applicable.
6. Run ArcNet to visualize and review the resulting network. The focus of this review is pocket lanes and intersection connectivity. The locations of signals and signs are typically reviewed and edited as well.
7. Run IntControl using the signal and sign warrants to generate the signal timing and phasing plans, demand-actuated detectors, and sign-controlled intersections.
8. Run ArcNet again to review the traffic control data.
9. Run LocationData to post the zonal attributes to activity locations and update their TAZ numbers on the basis of the supplied TAZ boundary layer.
This method uses a point-in-polygon approach to update the initial TAZ numbers, which TransimsNet assigns to activity locations on the basis of Euclidean distance from zone centroids.
10. Run ArcNet to visualize the resulting network.

The process used to create the FINEGRAINED network requires the output of steps 1 through 5 for both the PLANNING and ALLSTREETS networks. The process then implements the following additional steps:

1. In ArcGIS: From the ALLSTREETS network, flag all links (street name equal to ADDON_DETAIL) not part of the PLANNING network.
2. In ArcGIS: Select all activity locations in the ALLSTREETS network that are located on links included in the PLANNING network and save this list to an output file. This file will be supplied to PathSkim to create paths between activity locations.
3. Run PathSkim to build paths between the selected activity locations. This process may be time consuming (~55,000 × ~55,000 = ~2.8 billion paths), taking over 46 hours on an eight-core workstation. Save the output Link Delay file from PathSkim in text format without any turning movement data. Note that it is not necessary to build paths for

Table 1.41. Accessibility Computation

PLANNING network
No.  Dist (m)  Dist (mi)  Parcels  Emp      Pop        HH       % Emp  % Pop  % HH
1    16.1      0.01       10,385   12,333   15,294     6,264    2%     1%     1%
2    32.2      0.02       47,365   61,108   70,707     28,651   11%    6%     6%
3    48.3      0.03       80,569   151,652  124,839    50,505   26%    11%    11%
4    64.4      0.04       111,272  223,312  175,343    71,028   39%    15%    15%
5    80.5      0.05       138,861  275,838  232,602    94,367   48%    20%    20%
6    96.6      0.06       163,117  327,106  286,118    115,977  56%    24%    24%
7    112.7     0.07       187,394  362,727  344,530    139,612  63%    29%    29%
8    128.7     0.08       210,519  400,430  399,719    161,872  69%    34%    34%
9    144.8     0.09       231,568  426,750  452,307    183,124  74%    38%    38%
10   160.9     0.10       249,946  445,855  499,480    202,156  77%    42%    42%
11   241.4     0.15       327,751  506,998  686,590    278,188  87%    58%    58%
12   321.9     0.20       385,118  530,225  818,972    332,153  91%    69%    69%
13   402.3     0.25       426,539  545,337  902,178    365,841  94%    76%    77%
14   482.8     0.30       458,811  555,324  957,838    388,304  96%    81%    81%
15   563.3     0.35       484,129  562,376  1,005,355  407,632  97%    85%    85%
16   8046.7    5.00       618,981  579,535  1,182,431  478,072  100%   100%   100%

FINEGRAINED network
No.  Dist (m)  Dist (mi)  Parcels  Emp      Pop        HH       % Emp  % Pop  % HH
1    16.1      0.01       19,495   17,370   31,512     12,822   3%     3%     3%
2    32.2      0.02       81,706   82,396   138,684    56,141   14%    12%    12%
3    48.3      0.03       125,866  176,814  216,632    87,690   31%    18%    18%
4    64.4      0.04       161,358  248,365  279,140    113,074  43%    24%    24%
5    80.5      0.05       190,431  299,207  341,108    138,304  52%    29%    29%
6    96.6      0.06       214,491  350,131  394,641    159,915  60%    33%    33%
7    112.7     0.07       237,538  385,291  450,734    182,627  66%    38%    38%
8    128.7     0.08       258,688  419,907  501,369    203,119  72%    42%    42%
9    144.8     0.09       277,403  443,605  550,639    222,992  77%    47%    47%
10   160.9     0.10       293,863  461,575  594,959    240,874  80%    50%    50%
11   241.4     0.15       361,953  515,980  758,228    307,196  89%    64%    64%
12   321.9     0.20       412,523  537,705  870,301    352,815  93%    74%    74%
13   402.3     0.25       450,015  550,012  943,052    382,199  95%    80%    80%
14   482.8     0.30       479,454  559,577  992,539    402,117  97%    84%    84%
15   563.3     0.35       502,681  565,587  1,035,266  419,450  98%    88%    88%
16   8046.7    5.00       618,981  579,535  1,182,431  478,072  100%   100%   100%

Note: Parcels, Emp, Pop, and HH are cumulative totals within the given distance of the nearest modeled roadway.
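The accessibility score summarized in Table 1.41 amounts to a cumulative share calculation. The sketch below assumes each parcel is represented as a tuple of (distance to nearest roadway in meters, employment, population, households); that layout and the function name share_within are illustrative assumptions, not the project's data format. The final lines recompute the FINEGRAINED shares at 0.10 mile from the cumulative totals printed in the table.

```python
METERS_PER_MILE = 1609.344  # 0.10 mile is about 160.9 m, as in Table 1.41


def share_within(parcels, max_miles):
    """Percent of regional employment, population, and households within max_miles.

    parcels: iterable of (distance_m, emp, pop, hh) tuples (assumed layout).
    """
    limit_m = max_miles * METERS_PER_MILE
    totals = [0.0, 0.0, 0.0]
    near = [0.0, 0.0, 0.0]
    for distance_m, emp, pop, hh in parcels:
        for i, value in enumerate((emp, pop, hh)):
            totals[i] += value
            if distance_m <= limit_m:
                near[i] += value
    # Guard against an empty region so the sketch never divides by zero.
    shares = [100.0 * n / t if t else 0.0 for n, t in zip(near, totals)]
    return dict(zip(("emp", "pop", "hh"), shares))


# Check against Table 1.41, FINEGRAINED network, 0.10-mile row versus the 5-mile row.
finegrained_010 = {"emp": 461_575, "pop": 594_959, "hh": 240_874}
regional_total = {"emp": 579_535, "pop": 1_182_431, "hh": 478_072}
table_shares = {
    key: round(100 * finegrained_010[key] / regional_total[key])
    for key in regional_total
}
# table_shares is {'emp': 80, 'pop': 50, 'hh': 50}, matching the reported score.
```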

all activity location combinations. A sufficient number of samples, approximately 10 per zone, requires much less processing time and is quite representative.
4. In ArcGIS: Join the PathSkim output Link Delay file to the ALLSTREETS link shapefile to flag all links that have nonzero volumes. Be sure to include all PLANNING network equivalent links. Flip this selection, and save the list of non-PLANNING zero-volume links to a file.
5. Run TransimsNet in Update mode with the ALLSTREETS network and a list of deleted links as input. This process deletes the requested links and their corresponding parking lots, process links, and activity locations. In addition, all nodes, links, and lane connections are also refreshed, including the evaluation of sign and signal warrants.
6. Continue with steps 6 through 10 of the conversion process.

Figure 1.43. Employment accessibility. [Line chart: percentage of regional employment (0% to 100%) versus distance between parcel centroid and nearest roadway in miles (0.01 to 0.35), for the PLANNING and FINEGRAINED networks.]

Burlington Network Development

The Burlington TRANSIMS network was developed following a process similar to that described for Jacksonville, although only a single network was built. The TransCAD base year highway network maintained by the Chittenden County Metropolitan Planning Organization (CCMPO) for use in its daily regional travel demand model was the starting point for developing a detailed microsimulation TRANSIMS network for the region.

The 2005 base year model highway network has approximately 1,700 links and 1,300 nodes, which represent the major roadway facilities in the county. Interstate I-89 is the only interstate highway in the county; it serves the 18 cities and towns in Chittenden County, most notably Burlington, the largest city in Vermont.
The TransCAD base year highway network is a typical planning-level network, though the links reflect true shapes while zonal access to the street grid is modeled with centroid connectors. Table 1.43 presents the extent of roadway network coverage by facility type. Figure 1.48 presents the 2005 base year CCMPO TransCAD four-step planning model network.

Table 1.44 provides the number of records in each of the TRANSIMS network files generated by the network preparation process. The Burlington TRANSIMS network for the 2005 analysis year has 524 nodes, 779 links, and 2,608 activity

locations. To synthesize TRANSIMS simulation network data, the number of links into and out of a given node was used along with intersection logic to construct turn pockets, lane connectivity, and traffic controls (both signs and signals). Unique logic was applied depending on the facility type of the link. For instance, arterial intersections examine the relative orientation of each movement and the functional class of each link to determine when and where to include turn pockets and signals or signs. In general, if an approach has opposing traffic, a turn pocket was added to accommodate the movement. Signal warrants are determined based on the number of legs and the user-specified functional class, by area type signal warrant parameters.

Activity locations were automatically synthesized using the TransimsNet utility. The program creates activity locations (loading points for TRANSIMS) along every block face separated by a user-specified location spacing variable (e.g., 100 m). Two additional criteria that dictate the placement of activity locations in the simulation network are (1) a minimum block length of 30 m and (2) no more than three activity locations per block face.

Enhancing the Synthesized Network Integrity

The automated procedure works well, but two problems typically needed to be corrected subsequent to running TransimsNet. First, in the more rural areas of Chittenden County, some traffic analysis zones in the four-step model were not associated with at least a single activity location. This typically occurred in places where the traffic analysis zone represents open land with very little road frontage and/or where the roadway network is sparse. An ArcMap overlay of the traffic analysis zones on top of the activity locations was used to manually associate TAZs with activity locations on the nearest appropriate roadway for the cases not automatically allocated correctly by TransimsNet.
Figure 1.44. Population accessibility. [Line chart: percentage of regional population (0% to 100%) versus distance between parcel centroid and nearest roadway in miles (0.01 to 0.35), for the PLANNING and FINEGRAINED networks.]

The automatic synthesis of activity locations using TransimsNet can also produce loading points where, in reality, no loading should occur (for example, in the middle of highway interchanges). Rather than manually remove such activity locations, a polygon layer representing those areas was built. Geographic rules were then applied to systematically remove all locations within the undesired polygon

locations. This provided an automated means of importing new and future-year four-step planning networks and automatically correcting and updating the activity locations synthesized by TransimsNet.

Regional Signal Retiming

The original Burlington TRANSIMS network was developed as part of an earlier Federal Highway Administration (FHWA) TRANSIMS demonstration project. During this C10A project effort, the project team elected to review and update the fixed traffic signal timing and phasing plans developed as part of the original network development. A regional signal retiming and rephasing of the traffic signals in the simulation network was performed using the TRANSIMS utility IntControl. An automated and iterative retiming and rephasing of the traffic signal data was conducted using 90-s cycle lengths and link volumes resulting from the simulation of the increased regional demand. The signal timings and phasing were iteratively updated until link flows reached an acceptable level of calibration against observed ground counts.

Auxiliary Demand

Jacksonville

DaySim provides detailed estimates of the long-term and short-term travel choices of Jacksonville residents when they travel within the region. But this travel demand does not fully represent all trips on the regional transportation networks. Commercial and truck traffic comprise a significant share of all roadway volumes, typically up to 20% or more. In addition, nonresidents enter the region through key external gateways to access jobs, shopping, or other opportunities, or they may simply pass through the region. Similarly, residents may leave the region to satisfy other needs. Special generators may also create demand not explicitly represented by person-travel-demand models.
Figure 1.45. Household accessibility. [Line chart: percentage of regional households (0% to 100%) versus distance between parcel centroid and nearest roadway in miles (0.01 to 0.35), for the PLANNING and FINEGRAINED networks.]

Auxiliary demand refers to the regional demand that is not forecast by the DaySim model system but that must be represented in the Jacksonville DaySim-TRANSIMS-MOVES integrated model system to reasonably assess network performance and the impacts of different policies or improvements. Auxiliary demand is derived from the existing

Figure 1.46. Resulting FINEGRAINED network.

Figure 1.47. Distribution of network detail in FINEGRAINED network compared with PLANNING network.

Table 1.42. Network Attributes Comparison

Attribute              PLANNING  ALLSTREETS  FINEGRAINED
Nodes                  6,519     51,305      10,577
Links                  9,850     69,106      16,910
Links w/Shape Points   5,960     25,250      9,345
Activity Locations     29,272    216,246     58,846
Parking Lots           29,272    216,246     58,846
Process Links          58,544    432,492     109,692
Pocket Lanes           6,461     76,152      16,574
Lane Connections       37,758    303,106     74,302
Unsignalized Nodes     3,820     36,215      8,458
Signalized Nodes       934       1,140       1,058

Table 1.43. CCMPO TransCAD Base Year Network Lane Miles, by Facility

Facility               Links  Lane Miles  % Share
Interstate             66     166         11%
Limited Access Hwy     18     18          1%
Principal Arterial     285    159         10%
Minor Arterial         178    163         11%
Major Collector        163    174         11%
Urban Local            54     165         11%
Rural Major Collector  313    305         20%
Ramps                  80     18          1%
Internal Centroid      530    325         21%
External Centroid      17     32          2%
Total                  1,704  1,523       100%

NERPM model system currently used in Jacksonville, with spatial and temporal detail added to support integration with the detailed demand and supply simulation models. Because this demand is exogenous to the DaySim-TRANSIMS model system, the total demand (and the spatial distribution, mode, and timing of those trips) is fixed within a given forecast or horizon year, though of course it will vary across model run years. Network times and costs influence the routes used, however, so the network assignment of auxiliary demand is not fixed.

The auxiliary demand in Jacksonville can be generally grouped into four main classes: internal-internal commercial vehicle trips, internal-external personal and commercial vehicle trips, external-external personal and commercial vehicle trips, and internal-internal special generators. Table 1.45 lists the demand components of each of these four main classes, the total trips associated with each component, and the relative share of total regional demand that this component represents. The table shows that auxiliary demand accounts for approximately 19% of the total regional demand in the DaySim-TRANSIMS model system, which seems generally consistent with reported practice. However, a closer inspection reveals that commercial vehicles make up about 12% of total regional demand.

Figure 1.48. CCMPO TransCAD base year network.

Note that, regarding special generators, the original NERPM model contains more special generators than were ultimately included in the DaySim-TRANSIMS model system (e.g., state parks, military bases, and malls). In the end, only the airport special generator was maintained in the integrated model system. In some cases, the NERPM locations were not treated as special generators because the employment and population assumptions used in DaySim generate sufficient demand. The primary examples of this type are military bases and university group quarters. In other cases, the data were not used because they contained counterintuitive patterns.

The auxiliary demand was temporally disaggregated from daily numbers using the same household survey information that was used in the original NERPM trip matrix conversion process. Figure 1.49 shows the diurnal distribution of trip start times. Future refinements to this temporal disaggregation process may include using vehicle class-specific or external station-specific diurnal distributions derived from traffic counts, or using scheduled airport takeoffs and landings to impute the temporal distribution of travelers coming from or going to the airport. The auxiliary demand should also be disaggregated spatially from TAZs down to TRANSIMS activity locations, which are the fundamental spatial units used in the TRANSIMS network assignment process. The subzone distribution of the trips is based on simple activity-location weights, though additional refinements, such as the use of size variables reflecting employment and population, can be easily implemented.

Component Integration

A key goal of the SHRP 2 C10A project is implementing the integrated demand-supply model in a dynamic modeling framework in a way that is easily transferable to local jurisdictions for policy analysis.
In support of this goal of transferability, the model system incorporates a system manager that controls the execution of the three primary model system components: DaySim, TRANSIMS, and MOVES.

The model system manager, TRANSIMS Studio, is a Python programming language-based integrated development environment (IDE) built specifically to run TRANSIMS Version 4 applications. It has two basic components: (1) a Python-based library called Run Time Environment (RTE) and (2) a full-featured Python GUI. The RTE is at the core of TRANSIMS Studio and is responsible for executing a series of TRANSIMS programs and external programs such as DaySim and MOVES in an iterative modeling framework. The GUI is fully featured, allowing users to manage and view input and output files, develop program controls and processing scripts, and track model execution status.

The TRANSIMS Studio model manager is configured to do the following:

• Run on Windows or Linux.
• Run on stand-alone or clustered computers (e.g., TRACC).
• Run Jacksonville or Burlington models.
• Run tour-based (DaySim) or trip-based models (converted static demand).
• Start the assignment process from free-flow speeds (cold start) or from the loaded speeds of a previous assignment (warm start).

Figure 1.50 illustrates the configuration of the integrated model system components.

Table 1.44. Burlington TRANSIMS Network Size

Network File         Records
Nodes                524
Links                779
Activity Locations   2,608
Parking Lots         2,608
Process Links        5,216
Pocket Lanes         310
Lane Connectivity    31,000
Unsignalized Nodes   328
Signalized Nodes     114

Table 1.45.
Jacksonville Auxiliary Demand Summary

Auxiliary Demand Segment         Total Trips  Regional Share (%)
II four-wheeled truck            269,695      8
II single-unit truck             75,957       2
II combo truck-trailer           30,190       1
IE SOV                           94,640       3
IE HOV                           48,158       1
IE light duty                    3,318        0
IE heavy duty                    14,188       0
EE SOV                           22,709       1
EE HOV                           17,864       1
EE light duty                    1,686        0
EE heavy duty                    8,867        0
Airport                          41,080       1
Total auxiliary vehicle demand   628,352      19
Total DaySim vehicle demand      2,708,077    81
Total vehicle demand             3,336,429    100

Note: II = internal-internal; IE = internal-external; EE = external-external.
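The regional shares quoted for Table 1.45 can be reproduced with simple arithmetic. In the sketch below the trip totals are copied from the table; treating the truck, light-duty, and heavy-duty segments as the "commercial vehicle" group is an assumption made here, since the table does not label that grouping explicitly.

```python
# Trip totals transcribed from Table 1.45.
AUXILIARY_TRIPS = {
    "II four-wheeled truck": 269_695,
    "II single-unit truck": 75_957,
    "II combo truck-trailer": 30_190,
    "IE SOV": 94_640,
    "IE HOV": 48_158,
    "IE light duty": 3_318,
    "IE heavy duty": 14_188,
    "EE SOV": 22_709,
    "EE HOV": 17_864,
    "EE light duty": 1_686,
    "EE heavy duty": 8_867,
    "Airport": 41_080,
}
DAYSIM_TRIPS = 2_708_077

# Assumed grouping: any segment whose name mentions a truck or a duty class.
COMMERCIAL_KEYWORDS = ("truck", "light duty", "heavy duty")

auxiliary_total = sum(AUXILIARY_TRIPS.values())
regional_total = auxiliary_total + DAYSIM_TRIPS
commercial_total = sum(
    trips
    for segment, trips in AUXILIARY_TRIPS.items()
    if any(keyword in segment for keyword in COMMERCIAL_KEYWORDS)
)

auxiliary_share = 100 * auxiliary_total / regional_total    # about 18.8%
commercial_share = 100 * commercial_total / regional_total  # about 12.1%
```

Under that grouping, the auxiliary share rounds to 19% and the commercial vehicle share to 12%, consistent with the figures cited in the text.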

Figure 1.49. Jacksonville auxiliary demand time-of-day distribution.

Figure 1.50. Integrated model system components. [Diagram elements: TRANSIMS Studio (iteration/convergence, file manager); DaySim and exogenous trips supplying demand (activities and trips); TRANSIMS Router and Microsimulator returning simulation-based network impedances; MOVES; MOEs/indicators.]

Studio Components

The integrated model implemented in TRANSIMS Studio comprises four software components:

1. TRANSIMS Studio user interface and application management software;
2. TRANSIMS and DaySim modeling software;
3. Python scripts that define the modeling process; and
4. A folder structure housing network and other input data.

The modeling scripts and input data sets for Jacksonville, Florida, and Burlington, Vermont, are a deliverable of this project distributed through SHRP 2. All of the software components are available free of charge from the following open-source websites:

• The TRANSIMS Studio software can be obtained from http://sourceforge.net/projects/transimsstudio/.
• The latest TRANSIMS modeling software can be obtained from http://sourceforge.net/projects/transims/.

The modeling package consists of two primary folders: (1) a folder containing the TRANSIMS and DaySim software and (2) a folder containing all the model data. Figure 1.51 shows the folder names created for this project.

The model uses the concept of relative paths, which makes it easy to move the modeling package folders without having to set the full system paths in every control file. File paths also use the Linux directory convention to enable the programs to run on both Windows and Linux operating systems without modification. Thus the user can easily set up the model and also easily repackage or move the model folders.

All of the Python scripts used in the model are placed under the RTE folder. The following is a brief description of the purpose and functionality of each script:

1. SysDef.py. This script is the central location at which configuration keys that affect the entire model are defined. Changes to this script are usually performed only at the beginning of a model run. This script inherits relevant standard Python libraries and is inherited by each of the other scripts.
2. Main.py.
This script defines the various procedures and software steps used to perform global and assignment iterations. This is the only script that is actually executed as part of a model run. It inherits all model definitions and global variables from SysDef.py and functions from other scripts.
3. AssembleSkims.py. This script defines the process for creating free-flow or loaded speed zone-to-zone skims for 22 time periods as input to DaySim.

Figure 1.51. Model package folders.

Figure 1.52. Model folder structure.

Figure 1.53. Expanded model folder structure.

The model data folder is further subdivided into a folder named "RTE" and a folder containing the actual model data. Figure 1.52 and Figure 1.53 show the condensed and expanded subfolders inside the model folder.

4. DaySim.py. This script runs DaySim and prepares the DaySim outputs for input to TRANSIMS.
5. MsimIterations.py. This script defines the steps involved in a TRANSIMS assignment iteration.
6. MiscUtilities.py. As the name indicates, this script contains several miscellaneous procedures. These include the startup procedures and various data processing functions.
7. VisualizerPrep.py. This script prepares the microsimulator outputs for visualization using TRANSIMS-VIS.

In addition to the Python scripts, which are identified by the ".py" extension, a number of other file types are created during the course of model execution. These include the following:

1. ".pyc" or compiled Python script;
2. ".gui" for display in the navigator pane of TRANSIMS Studio;
3. ".log" for display in the execution log window of TRANSIMS Studio;
4. ".pfm" for TRANSIMS Studio management;
5. ".pid" for recording the process ID;
6. ".job" for model execution commands; and
7. ".res" for files that contain the information for resuming a model run.

A batch file is provided in the same folder to help remove temporary files created during model execution. However, the batch file is intended to be run only when a model needs to be started afresh because it clears all model logs and tracking information.

The TRANSIMS Studio settings are saved in a file with a ".prj" extension, which is also saved in the RTE folder. This file is also called the TRANSIMS Studio project file.

Application Options

The model scripts were designed to store key model parameters and application options in a central location (SysDef.py) and to reference those parameters across all other scripts, using the high-level processing steps defined within a single application script (Main.py), which controls the model execution and data flow. In addition, the procedures included in all Python scripts except Main.py are encapsulated within one or more functions to facilitate the "resume" feature of TRANSIMS Studio.
Functions that perform small, limited tasks help break down the complex model flow into simpler substeps, thus simplifying the process of tracking and resuming from a substep after an abnormal execution termination.

A typical application of the modeling process involves decompressing the model package at a certain location on the local or network drive, opening the TRANSIMS Studio software, and loading the TRANSIMS Studio project file by navigating to the RTE folder inside the model package. Then the user opens SysDef.py in the navigator pane to confirm or edit the folder paths and model variables. Next, the user opens Main.py from the navigator pane to check the number of global and assignment iterations. After everything is set, the user presses the play button for the Main.py script to launch a model run.

Information is shared between the scripts by means of variables that are declared within the scope of RTE. The variables are prefixed with the term "var." and are globally accessible across all scripts that import the script where these variables are defined. For instance, the link delay resolution is defined inside the SysDef.py script as "var.LINK_DELAY_RESOLUTION" and is used within MsimIterations.py for generating controls for TRANSIMS programs. This variable is coded inside a master control file by replacing the prefix "var." with the symbol "@" and adding the same symbol as a suffix to the variable name. Thus, "var.LINK_DELAY_RESOLUTION" is referenced in a master control file as "@LINK_DELAY_RESOLUTION@". This functionality helps dynamically adjust control key values, if required, during TRANSIMS applications.

For the assignment iterations, the user can choose between the default simulation-based iterations or skip simulation and perform only Router-based iterations by relying on volume-delay functions (VDFs) in lieu of simulation-based delays.
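
The "@NAME@" substitution described above for master control files can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TRANSIMS Studio implementation: the helper name render_control, the control key EXAMPLE_KEY, and the value 300 are hypothetical.

```python
import re

# Matches tokens of the form @NAME@, where NAME is an upper-case identifier.
TOKEN = re.compile(r"@([A-Z][A-Z0-9_]*)@")


def render_control(template, variables):
    """Replace each @NAME@ token in template with str(variables[NAME]).

    Hypothetical helper for illustration only. A KeyError is raised when a
    token has no matching variable, so typos are not silently ignored.
    """
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError("no value defined for @%s@" % name)
        return str(variables[name])

    return TOKEN.sub(substitute, template)


# Stand-in for a value declared in SysDef.py as var.LINK_DELAY_RESOLUTION.
# The number 300 is an assumed value, not taken from the model.
variables = {"LINK_DELAY_RESOLUTION": 300}

# EXAMPLE_KEY is a placeholder, not a real TRANSIMS control key.
line = render_control("EXAMPLE_KEY  @LINK_DELAY_RESOLUTION@", variables)
print(line)
```

Keeping the values in one dictionary mirrors the idea of a single central definition script: changing the value once changes every control file rendered from it.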
The following two model parameters are used to switch between Router-based iterations and Microsimulator-based iterations.
1. Router-based iterations (planning mode):
a. var.MODEL_ASSIGNMENT_OPTION = 'No_Simulation'; and
b. var.SKIMS_LINK_DELAY_SOURCE = 'Router_Based'.
2. Microsimulator-based iterations (planning + operations mode):
a. var.MODEL_ASSIGNMENT_OPTION = 'Default'; and
b. var.SKIMS_LINK_DELAY_SOURCE = 'Msim_Based'.

Other model parameters defined in SysDef.py that affect the TRANSIMS supply-side model are shown below along with their default values and a description of their impact on the modeling process:

• var.NUM_PARTITIONS = 8
This variable defines the number of partitions (.t*) to be used in the model. Partitioning helps use multiple cores across one or more machines, depending on the resources dedicated for the job. Partitioning primarily helps in the application of the Router and PlanPrep programs by enabling tasks to be performed in parallel. When running the model on a single machine, this value is set equal to the number of cores available on that machine. A default value of 8 is provided, corresponding to a modern workstation with eight cores.

• var.NUM_PATHSKIM_THREADS = 8
This variable applies only to the TRANSIMS Version 5 software PathSkim, which is used to create skims. It specifies the number of threads to use in its application. A default value of 8 is provided, corresponding to a modern workstation with eight cores.

• var.MAXIMUM_PERCENT_SELECTED = 10
This parameter applies to the PlanCompare program and defines the maximum percentage of regional trips to be changed per iteration. Large changes during iterations tend to have a destabilizing effect on the model convergence; thus an upper limit of 10% is provided as a default.

• var.WEIGHTING_FACTOR = 1
This parameter applies to the LinkDelay program and defines the value for the PREVIOUS_WEIGHTING_FACTOR key in that program. It defines the weight for the previous link delay during the link delay averaging process. A value of 1 implies equal weights or simple averaging; a value of 2 implies a weight of 2/3 for the previous link delay and 1/3 for the current link delay.

• var.NUM_DELETE_PREVIOUS_RUN = 3
TRANSIMS assignments produce several gigabytes of data per iteration. A simulation-based Jacksonville model produces in excess of 7 to 8 gigabytes per iteration. Because the intermediate iterations are not normally saved, this variable allows users to retain only the last few iterations at any time to conserve hard disk space. When set to 3, the three most recent iterations are preserved and the model deletes the fourth most recent iteration at the end of each assignment iteration. This deletion feature can be turned off so that all intermediate iterations can be preserved by setting the variable to a number greater than or equal to the number of expected assignment iterations.

DaySim Demand Component

All the inputs, controls, intermediate files, and outputs of DaySim reside under the "daysim" folder of the model as shown in Figure 1.54.
The input skims and output activities are processed and copied into and out of this folder for interfacing with TRANSIMS.

As discussed earlier, the calls to the DaySim process are placed from within the Main.py script, while the actual procedures for running DaySim are defined inside the DaySim.py script. Initial skims for the first DaySim run are prepared using a call to the AssembleSkims.py script. The user can choose to create these initial skims on the basis of free-flow speeds or a link delay file. The details of the skim generation process are discussed in the Time Period Skims section.

The DaySim demand creation process starts by running the DaySim executable using the initial skims in an iterative loop to prepare shadow prices and subsequently executes the entire DaySim model system. DaySim produces an activity file and a vehicle file in TRANSIMS Version 4 format. For the Jacksonville region, the process takes approximately an hour of computer processing time.

The resulting activity file includes the internal travel demand generated by regional households. This demand is combined with the auxiliary trips to represent the complete travel demand for the region. To this end, the script combines the internal and auxiliary vehicle files and places the output in the "vehicles" subfolder of the "demand_daysim_activities_plus_auxiliary_trips" folder. The activity file is copied to its "demand" subfolder. The DaySim activity file does not need to be merged with the auxiliary trip file because TRANSIMS programs, especially the Router, are able to read both the activity and trip files and process them in the same application. Copies of DaySim activity and vehicle files are preserved from each global iteration for convergence analysis purposes.
Figure 1.54. DaySim folders.

TRANSIMS Supply Component

The TRANSIMS supply-side model assigns the DaySim internal demand and auxiliary trips on the TRANSIMS network through assignment iterations designed to achieve dynamic user equilibrium convergence of the individual travel paths. The resulting network performance by time of day is used to generate zone-to-zone travel time, distance, and cost skims for 22 time periods for input into the next global iteration.

TRANSIMS models demand as trips between an origin and a destination activity location (i.e., link offsets) at a specific time of day (i.e., seconds). The Router builds a minimum impedance path between the origin and destination based on time-dependent link travel times and turning movement delays. The paths or travel plans for all of the trips over a 24-hour period are loaded onto the network and simulated by the Microsimulator. The Microsimulator considers traffic signal timing, lane changes, and vehicle interactions in estimating the volume and travel time on the network at any point in time. These data are aggregated by link and time period for feedback to the Router for path adjustments. The process continues until most travelers cannot improve their travel time by changing paths.

The temporal resolution of the link delays has traditionally been 15 min, but this model typically uses 5-min link delays and has been tested using 2-min link delays and interpolated link delays. The appropriate time increment for link delay summaries depends significantly on the simulation methodology. The TRANSIMS Microsimulator uses a cellular automata method that moves vehicles between link-lane cells on a second-by-second basis. A 6-m cell size was used for this model. This has the effect of limiting the instantaneous speed of a vehicle at any second within the simulation to one of seven values (i.e., 0, 6, 12, 18, 24, 30, and 36 m/s). This means that a relatively large number of vehicle-second observations are required for a given link to generate a reasonable average speed. Average speeds computed from 5 min of data appear to produce the best results.

Startup

To get the assignment process started, a startup script is used to separate the steps that either are applied only once during the model execution or require special considerations because of the lack of a previous iteration. The startup procedures are implemented as functions within the MiscUtilities.py script. This step or function is executed at the beginning of every global iteration before commencing assignment iterations and is run only once.
The primary steps executed at this stage are as follows:
1. Partitioning the regional households;
2. Building all-or-nothing (AON) paths for each trip on the basis of free-flow speeds or loaded speeds from a previous model application;
3. Creating link delays by time of day on the basis of volume-delay functions; and
4. If appropriate, averaging the link delays with the results of the previous model run.

This process is depicted in Figure 1.55.

Figure 1.55. TRANSIMS assignment startup. (Flowchart elements: DaySim Activities, Auxiliary Trips, Households, Vehicles, HHList, Partition List, Router, Previous Link Delays, AON Plans, PlanSum, Link Delays, LinkDelay, Average Link Delays [optional].)

One of the most significant steps at this stage is the partitioning of households because it affects the rest of the model execution and the overall processing time. A higher number of partitions typically implies lower processing times for partitioned programs such as Router and PlanPrep. However, the number of partitions can be specified independently of the number of machines/threads or nodes at the user's disposal. When the number of partitions is specified higher than the number of machines/threads, TRANSIMS Studio processes the partitions as sequential sets of applications according to the number of machines/threads available. For example, if 18 partitions (".tAA" through ".tAR") are processed on a machine with eight cores, the first set of eight partitions (".tAA" through ".tAH") is processed first, followed by the second set of eight partitions (".tAI" through ".tAP"); finally, the remaining two partitions (".tAQ" through ".tAR") are processed using only two threads while the other six threads remain idle. Therefore, for maximum efficiency, the number of partitions should be set equal to or an integer multiple of the number of machines/threads available.

In the tour-based model, the HHList program reads the household or traveler lists generated by DaySim and the auxiliary trip model to compile a master list of all travelers in the region; it then randomly distributes each traveler into a specific partition file. All trips made by a household are processed within the same partition. In the trip-based model, all regional trips are stored in a single file, and each trip is assigned a unique household number. The trips are partitioned according to the household trip number.

Equilibrium Convergence

Convergence is necessary to ensure the behavioral integrity of the model system. The impedances or level-of-service measurements used as the basis for accessibility measures and as key inputs to the destination and mode-choice models must be approximately equal to the travel times and costs produced by the final network assignment process. Model system convergence is also necessary to ensure that the model system will be useful as an analysis tool.
The stability of model outputs is essential to support planning and engineering analyses, and changes to demand or supply should lead to reasonable changes in model outputs.

In the context of an integrated demand and network simulation model system, an essential precondition for pursuing overall model system convergence is establishing network assignment convergence. Network convergence is analogous to model system convergence: the inputs to the network assignment process (the current traveler paths that give rise to the current network costs) must be approximately equal to a set of new best paths that are based on these current network costs.

A key focus of the C10A effort has been identifying and testing different strategies for achieving both network assignment convergence and overall model system convergence within the context of the DaySim-TRANSIMS integrated model. This section describes the user equilibrium convergence, or gap, calculation procedures employed in the TRANSIMS supply-side model. These procedures are run at the end of every iteration and do not influence the model assignment procedures.

Two gap measures have been defined and employed in this model: relative-gap or link-based-gap and trip-gap or traveler-based-gap measures. The relative-gap measure is equivalent to the widely used network link-based-gap measure in conventional deterministic travel demand models. The trip gap is a newer measure enabled by the detailed information about individual travelers available within a disaggregate model like TRANSIMS. The trip-gap concept is further divided into three estimation methods identified as reskimmed, event-based, and hybrid. The event-based and hybrid trip-gap methods require the Event file from the Microsimulator and therefore are not applicable in the Router-only assignment methods. For Router-only iterations, the model automatically switches the trip-gap measure to the reskimmed method.
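
The automatic fallback to the reskimmed trip gap in Router-only runs can be expressed as a small rule. The sketch below is illustrative only: the function name and the method labels are hypothetical, and only the 'No_Simulation' and 'Default' option values come from the SysDef.py parameters described earlier.

```python
# Hypothetical labels for the three trip-gap estimation methods.
TRIP_GAP_METHODS = ("reskimmed", "event_based", "hybrid")


def effective_trip_gap_method(assignment_option, requested_method):
    """Return the trip-gap method that can actually be computed.

    Event-based and hybrid gaps need the Microsimulator Event file, so a
    Router-only run ('No_Simulation') always falls back to 'reskimmed'.
    """
    if requested_method not in TRIP_GAP_METHODS:
        raise ValueError("unknown trip-gap method: %r" % requested_method)
    if assignment_option == "No_Simulation":
        return "reskimmed"
    return requested_method


print(effective_trip_gap_method("No_Simulation", "hybrid"))
print(effective_trip_gap_method("Default", "hybrid"))
```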
Microsimulation-Based Equilibrium (Planning + Operations Mode)

The project team has extensively investigated and tested methods for achieving dynamic user equilibrium in the context of the TRANSIMS Router and Microsimulator. This work has been done in coordination with a parallel FHWA-funded TRANSIMS deployment effort researching convergence and other issues associated with advanced integrated travel demand model systems. As part of this parallel effort, a peer review panel organized by the FHWA reviewed the C10A integrated model process and identified refinements that ensured the methods implemented were consistent with current practice.

A key strategy related to convergence that the peer exchange panel deemed acceptable was to average Microsimulator link delays to dampen oscillation effects but simulate the full set of AON paths during each iteration. This application approach is depicted in Figure 1.56. This method involves generating a new set of plans for each traveler at each iteration based on average simulated delays and simulating those plans. A key concern is the ability of the Microsimulator to realistically simulate AON paths. However, empirical tests of this method in both Jacksonville and Burlington confirmed that the Microsimulator can simulate AON paths without creating significant congestion problems. Because each traveler has a unique AON path between activity locations that starts at a specific time of day, the paths are distributed sufficiently to avoid the types of AON assignment problems typically experienced by traditional modeling frameworks. These results appear to hold even with large increases in demand.

Router-Based Equilibrium (Planning Mode)

Microsimulator-based iterations are the preferred method of performing a user equilibrium assignment using TRANSIMS, but they are not the only way.
The Router-only iterative process used in the model system planning mode is similar to the Microsimulator-based iterative process, but it uses traditional VDFs for computing link delays rather than the second-by-second simulation of individual vehicles. Thus, the Router-only process does not consider traffic signal operations, lane changing, or vehicle interactions. It simply takes user-provided link capacity and estimated volumes to calculate the link travel time for each time increment. In this case, volume is estimated by tracing the location of each vehicle at any given time using the trip path and start time stored in the travel plan file.

The primary reason for using Router-only iterations for all or some of the assignment process is the time saved. The Microsimulator is computationally complex and therefore the most time-consuming step in each iteration. The Microsimulator performance is further complicated because it is a single-threaded program in TRANSIMS Version 4 and cannot be partitioned. As a result, the Router-only approach can perform an assignment iteration in approximately one-tenth of the time required for a Microsimulator-based iteration. This makes the Router-only process attractive for initializing estimates of link travel time before performing Microsimulator-based convergence or as a substitute for simulation for applications that can tolerate less rigorous analysis or are not focused on traffic operations.

As with the Microsimulator-based process, the Router-only iterations can be implemented within TRANSIMS in two fundamental ways. The primary method used in this project is shown in Figure 1.57 and is conceptually consistent with a traditional assignment process. An AON path is built in accordance with the previous link delays, and the resulting volumes are converted to travel times using a VDF. The link delays are averaged using a weighted average or method of successive averaging (MSA) technique for input to the next set of AON paths. In this case the averaging technique is critical to managing the stability of the assignment and the convergence process.
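
The two ingredients of the Router-only loop, a VDF and a delay-averaging rule, can be sketched as follows. The text does not specify the functional form of the VDF, so the widely used BPR form with alpha = 0.15 and beta = 4 is assumed here purely for illustration, and all function names are hypothetical. The weighted average follows the earlier description of var.WEIGHTING_FACTOR (a weight of 1 gives simple averaging; a weight of 2 gives 2/3 to the previous delay and 1/3 to the current delay).

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """Assumed BPR-style VDF: time grows with the volume-to-capacity ratio.

    volume and capacity must refer to the same time increment
    (for example, one 5-min or 15-min slice).
    """
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)


def weighted_average(previous, current, previous_weight=1.0):
    """Average link times, giving previous_weight to the previous estimate
    and a weight of 1 to the current estimate."""
    return (previous_weight * previous + current) / (previous_weight + 1.0)


def msa_average(previous, current, iteration):
    """Method of successive averaging: the current estimate gets weight
    1/iteration, so later iterations move the average less."""
    return previous + (current - previous) / iteration


previous_time = 10.0   # averaged link time from earlier iterations (minutes)
current_time = 16.0    # link time from the latest AON loading (minutes)

print(weighted_average(previous_time, current_time, previous_weight=1.0))
print(weighted_average(previous_time, current_time, previous_weight=2.0))
print(msa_average(previous_time, current_time, iteration=3))
```

MSA at iteration n is the same as a weighted average with a previous weight of n - 1, which is why the two techniques are often discussed together.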
Alternative approaches to seeking convergence are also feasible. One is an incremental assignment approach that compares an AON routing of each traveler with a reskimmed version of the previous path and selects a subset of trips with large travel time and impedance differences for inclusion in the composite plan file. This plan file is then aggregated into link volumes by time of day; delays are calculated by applying a facility-type-based VDF to the 5-min or 15-min link volumes to estimate the travel time for each time increment; and a weighted average or MSA procedure is used to combine the travel time estimates with previous travel time estimates for feedback to the next Router iteration.

Figure 1.56. Microsimulator-based equilibrium process.

Note that, in either approach, volume-to-capacity ratios can exceed 1.0 just as they do in traditional models. In fact this problem can be even more significant in TRANSIMS when fine-grained time periods are used. Volume-to-capacity ratios for 5-min or 15-min time periods are significantly more likely to exceed 1.0 than volume-to-capacity ratios calculated using peak period or daily volumes. Thus, extra care needs to be taken in designing the parameters used in the VDFs. If the travel times become excessive, the Router will have difficulty completing trips, which will create scheduling problems for subsequent trips within tours and move link volumes into much later time periods of the day.

Equilibrium Convergence Measures

Relative-Gap Measure

Link-based relative gap is a convergence statistic that quantifies the difference between the simulated performance of the traffic on each link by time of day and the vehicle hours of travel that would result from each traveler taking the minimum impedance path based on the simulated travel times. The mathematical formulation for link-based relative gap is shown in Equation 1.5.

\[ \text{Relative Gap} = \frac{\sum VE_t \times CE_t - \sum VA_t \times CE_t}{\sum VA_t \times CE_t} \qquad (1.5) \]

where
\sum = summation over all network links;
VE_t = the simulated volume on a given link and time increment;
CE_t = the travel cost (time) associated with volume VE_t; and
VA_t = the link volume from an AON assignment based on CE_t.

The data processing steps required to calculate the relative-gap convergence measure are outlined in Figure 1.58.
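
As a worked illustration of Equation 1.5, the sketch below computes the relative gap for a toy set of link records. The function name and the numbers are invented for the example; in the model the calculation is performed by the TRANSIMS programs rather than by code like this.

```python
def relative_gap(links):
    """Link-based relative gap per Equation 1.5.

    links is an iterable of (ve, ce, va) tuples, one per link and time
    increment: simulated volume, simulated travel cost, and AON volume.
    """
    links = list(links)
    simulated_cost = sum(ve * ce for ve, ce, va in links)
    aon_cost = sum(va * ce for ve, ce, va in links)
    return (simulated_cost - aon_cost) / aon_cost


# Two hypothetical link-time records: (VE_t, CE_t in minutes, VA_t).
example_links = [
    (100, 2.0, 120),
    (50, 4.0, 30),
]

# Simulated cost = 100*2 + 50*4 = 400; AON cost = 120*2 + 30*4 = 360.
gap = relative_gap(example_links)
print(round(gap, 4))
```

A gap of zero would mean that rerouting every traveler onto the minimum impedance path, at the simulated link times, saves no vehicle hours.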
The primary inputs to the process are travel plans from the current iteration and the Microsimulator performance or link delay file. These are shown in gray in Figure 1.58. The relative-gap process then builds AON paths using the performance file. The paths are then reskimmed using the same performance file. The reskimming process updates the travel time and generalized-cost values for all travel plans in a consistent way. The AON plans are processed by PlanSum to create an AON link delay file. Similarly, PlanSum is used to create a link delay file using the travel plans and the Microsimulator performance file. This step, although seemingly unnecessary, is needed to create a link delay file that has a complete and consistent set of volumes for all times of day. Furthermore, a common list of travelers is supplied to both PlanSum applications to limit the statistical comparison to trips that were successfully completed in both path building applications. The final two link delay files are compared using LinkSum to compute relative gaps for every hour of the day.

Figure 1.57. Router-only user equilibrium.

Trip-Gap Measure

TRANSIMS builds a unique path or travel plan for each trip on the basis of the origin and destination activity locations, trip start time, and other travel mode attributes (e.g., HOV or truck restrictions). In its simplest form, a trip gap is computed by comparing the generalized costs for every traveler's path in the travel plan file with that traveler's path in the AON plan file. However, a number of variations and complexities arise when calculating this measure, depending on whether only the Router is used in assignment or whether the Microsimulator is used. The nature of traffic microsimulation also contributes to those complexities.

Ultimately, three variations of the trip gap were developed and tested. Two use experienced time for each traveler, derived directly from the Microsimulator outputs. The formula for these gaps is shown in Equation 1.6. The third gap measure is based on reskimmed time and is shown in Equation 1.7.

\[ \text{Trip Gap} = \frac{\sum \left( CE_x - CA_x\{C_{mt}\} \right)}{\sum CA_x\{C_{mt}\}} \qquad (1.6) \]

where
{C_mt} = simulated time-varying link costs;
CA_x = AON cost of trip x based on link costs {C_mt}; and
CE_x = simulated cost of trip x that resulted in link costs {C_mt}.
\[ \text{Trip Gap} = \frac{\sum \left( CR_x\{C_{mt}\} - CA_x\{C_{mt}\} \right)}{\sum CA_x\{C_{mt}\}} \qquad (1.7) \]

where
{C_mt} = simulated time-varying link costs;
CA_x = AON cost of trip x based on link costs {C_mt}; and
CR_x = reskimmed costs for trip x along the path used to generate {C_mt}.

Figure 1.58. Link-based relative-gap calculation process.
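
Equations 1.6 and 1.7 share one form and differ only in which per-trip cost is compared with the AON cost. The sketch below applies that form to invented per-trip costs; the function names, the dictionary layout, and the treatment of trip 3 as a simulation problem are assumptions made for the example, following the descriptions of the reskimmed, event-based, and hybrid variants in this section.

```python
def trip_gap(path_costs, aon_costs):
    """Trip gap: summed cost excess over AON, divided by summed AON cost.

    Both arguments map a trip identifier to a generalized cost. Only trips
    present in both dictionaries are compared.
    """
    common = path_costs.keys() & aon_costs.keys()
    excess = sum(path_costs[x] - aon_costs[x] for x in common)
    return excess / sum(aon_costs[x] for x in common)


def hybrid_costs(experienced, reskimmed):
    """Use the experienced cost where the simulation produced one and the
    reskimmed cost for trips that had simulation problems."""
    merged = dict(reskimmed)
    merged.update(experienced)
    return merged


# Hypothetical costs in minutes. Trip 3 has no experienced cost because it
# is assumed to have had a problem in the Microsimulator.
aon = {1: 10.0, 2: 20.0, 3: 30.0}
experienced = {1: 12.0, 2: 20.0}
reskimmed = {1: 11.0, 2: 21.0, 3: 33.0}

event_based_gap = trip_gap(experienced, aon)                          # Eq. 1.6
reskimmed_gap = trip_gap(reskimmed, aon)                              # Eq. 1.7
hybrid_gap = trip_gap(hybrid_costs(experienced, reskimmed), aon)

print(round(event_based_gap, 4), round(reskimmed_gap, 4), round(hybrid_gap, 4))
```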

The data processing steps required to calculate the three trip-gap convergence measures are outlined in Figure 1.59. The inputs to this process are the same as the relative-gap process. They include the travel plans and the Microsimulator performance or link delay file from the current iteration. These are shown in gray in Figure 1.59. All three measures are calculated in every iteration; however, only one of them is used in the model for measuring convergence.

Figure 1.59. Traveler-based trip-gap(s) calculation process.

The event-based and hybrid trip-gap methods attempt to consider the actual simulation travel time for every traveler. The hybrid trip-gap method additionally attempts to correct for the influence of simulation and routing problems. The reskimmed trip gap is obtained by comparing the reskimmed travel plans with the reskimmed AON plans. The event-based trip gap is computed by comparing reskimmed AON plans and the updated travel plans incorporating the actual start and end times of every traveler in the simulation. In this comparison, problem travelers in the Microsimulator are excluded. The hybrid trip gap goes one step further and includes the reskimmed paths for travelers with simulation problems for comparison against the reskimmed AON plans.

DaySim-TRANSIMS Integration

As shown in earlier figures, DaySim provides trip and vehicle information to the TRANSIMS Router to perform network assignment. In its original implementation, DaySim produced person-trip records with trip-end locations defined as parcels and trip start and end times defined as specific minutes of the day. To integrate DaySim with TRANSIMS, a number of modifications were made to translate DaySim tour and trip records into vehicle-trip records, associate the parcel locations with the activity locations used by TRANSIMS, and output the records in the format required by the Router. DaySim does not simulate vehicle-type choice or allocate specific vehicles to person trips. Therefore, a simple procedure is used to treat every vehicle trip as an independent vehicle. These modifications are described in the following subsections.

Activity File

The primary change required to integrate DaySim and TRANSIMS was adding to DaySim the capability to create TRANSIMS input activity and vehicle files. The necessary modifications were relatively straightforward given the comparability and the list-based nature of both DaySim's trip output and TRANSIMS's activity file input.

Activity Files Versus Trip Files

Version 4 of TRANSIMS can accept two types of list-based inputs: activity files and trip files. A trip file is straightforward: a list of trips to be assigned to the network, including information on the origin activity location, destination activity location, departure time of the trip, and mode of travel used. All of the auxiliary demand used in the integrated model, such as airport ground access trips and commercial vehicle trips, is converted from the SACSIM trip matrices developed for Sacramento into TRANSIMS trip files. When the TRANSIMS Router assigns a trip file to the network, each trip is considered a discrete movement, independent from any other trips in the file.

Activity files are more complex. An activity file is a list of activities undertaken by regional residents and does not explicitly include trips. For each household, person, and activity, the activity file includes a purpose, start time, end time, duration, mode, and location. Using this information, the TRANSIMS Router creates a plan for each movement required for an individual to reach the desired activity locations. These plans are essentially equivalent to trips. The critical distinction between using an activity file and a trip file is that, when routing activities, TRANSIMS treats all the movements as interconnected: the tour structure is preserved within each person.
As currently configured, the activity durations are fixed; if a traveler takes longer to reach one activity location, the remainder of that traveler's trips and activities will be pushed back in time as well. This has distinct implications for the integration of the activity model with the TRANSIMS Router, and for the overall model system calibration and validation, an issue discussed in Chapter 2. This approach was necessary in the initial model development to ensure that all trips were assigned and conserved. Future integrated model development efforts will consider how rescheduling and time pressures can be flexibly accommodated in DaySim, TRANSIMS, or both.

Using a simple tour comprising two trips, with a single trip on each tour leg, Table 1.46 and Table 1.47 illustrate the differences between the original DaySim trip output file and the TRANSIMS activity input file. Both files indicate the household and person traveling. For each of the two given trips in Table 1.46, the DaySim trip output file contains information on the person-tour number on which the trip occurs, the tour half (outbound or return) on which the trip occurs, and the trip number within the tour. Critical trip details are also included, such as the origin and destination parcel and TAZ numbers, the travel mode used to make the trip, the origin and destination purposes, the trip departure time, and the trip arrival time.

While the DaySim trip output contains two records representing two trips, the TRANSIMS activity file contains three records representing three activities. The first activity represents the person's "at home" activity, the starting point for the day. TRANSIMS derives one trip to take the person from home to the first activity location (shown in the second activity file record) and then derives a second trip to take the person to the next activity location, which is back home, as indicated by the common location ID in the first and third activity records. Adding this initial at-home activity was one of the key changes made to the DaySim output.
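
The step from two trip records to three activity records can be sketched with the values from Table 1.46 and Table 1.47. This is a simplified illustration, not the DaySim code: the function names are hypothetical, only the timing and location fields are produced, clock times coded as HHMM are assumed to convert to seconds past midnight, and the parcel-to-location mapping is inferred by pairing the two tables.

```python
def hhmm_to_seconds(hhmm):
    """Convert a clock time coded as HHMM (e.g., 1222 for 12:22) to seconds.

    Hours above 24 are allowed so that activities can run past midnight.
    """
    hours, minutes = divmod(hhmm, 100)
    return hours * 3600 + minutes * 60


def trips_to_activities(trips, home_location, location_of_parcel):
    """Build activity records from one person's trips, in travel order.

    An initial at-home activity is added that ends when the first trip
    departs; each trip then yields one activity at its destination.
    """
    first_departure = hhmm_to_seconds(trips[0]["DEPTIME"])
    activities = [{"LOCATION": home_location, "START": 1,
                   "END": first_departure, "DURATION": first_departure - 1}]
    for trip in trips:
        start = hhmm_to_seconds(trip["ARRTIME"])
        end = hhmm_to_seconds(trip["EACTTIME"])
        activities.append({"LOCATION": location_of_parcel[trip["DCEL"]],
                           "START": start, "END": end,
                           "DURATION": end - start})
    return activities


# The two trips from Table 1.46 (only the fields used here).
trips = [
    {"DCEL": 133524, "DEPTIME": 1222, "ARRTIME": 1238, "EACTTIME": 1556},
    {"DCEL": 429711, "DEPTIME": 1556, "ARRTIME": 1615, "EACTTIME": 2659},
]
# Parcel-to-activity-location pairs as implied by Tables 1.46 and 1.47.
location_of_parcel = {429711: 5937, 133524: 13688}

activities = trips_to_activities(trips, 5937, location_of_parcel)
for record in activities:
    print(record)
```

The printed START, END, DURATION, and LOCATION values match the three records in Table 1.47.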
Table 1.46. DaySim Trip List Output Example

SAMPN  PERSN  TOURNO  TOURHALF  TRIPNO  OTAZ  OCEL    DTAZ  DCEL    MODE
1      1      1       1         1       445   429711  1088  133524  7
1      1      1       2         1       1088  133524  445   429711  7

OPURP  DPURP  DEPTIME  ARRTIME  EACTTIME  TRAVTIME  TRAVDIST  EXPFACT
8      4      1222     1238     1556      16.09     8.56      1.00
4      8      1556     1615     2659      18.65     8.56      1.00

Table 1.47. TRANSIMS Activity File Example

HHOLD  PERSON  ACTIVITY  PURPOSE  PRIORITY  START  END    DURATION  MODE  VEHICLE  LOCATION  PASSENGER
1      1       111110    0        9         1      44520  44519     1     0        5937      0
1      1       11111     4        9         45480  57360  11880     2     1        13688     0
1      1       11121     0        9         58500  97140  38640     2     1        5937      0

Temporal Units

Table 1.47 also shows that the first at-home activity ends at time 44520. Another key change made to the DaySim output involved the conversion of the time units from hours and minutes (for example, 1222 represents 12:22 in the original DaySim trip list output) to seconds. The first activity is shown to end at 44520; when translated from seconds to hours and minutes, this time is again 12:22. Thus, the end time for each activity in the activity file is the same as the start time for the trip that takes the traveler to the next activity, consistent with the DaySim trip file. Note that the start time for the second activity, which is "out of home" as indicated by the new location identifier, is 45480. Subtracting 44520 from 45480 results in 960 seconds, which is consistent with the 16 minutes shown as the travel time for the first trip in the DaySim trip list output.

Spatial Resolution

Table 1.46 and Table 1.47 also illustrate the differences in the geographic resolution between the DaySim trip list output and the TRANSIMS activity file. As previously described, DaySim uses detailed parcels as the fundamental spatial unit; but in the original DaySim implementation, this parcel-level detail was aggregated to the TAZ level before network assignment, using traditional static equilibrium assignment methods. The DaySim trip list output contains both the origin and destination parcel and the TAZ information. When integrated with TRANSIMS for network assignment, DaySim uses activity locations. Activity locations are more fine-grained spatially than TAZs (the Jacksonville region has approximately 25,000 activity locations) but not as detailed as individual parcels (the Jacksonville region has approximately 620,000 parcels).
A correspondence file between parcels and activity locations was developed to translate parcel information to activity locations before assignment in TRANSIMS. This spatial disaggregation in assignment is one of the distinguishing aspects of the integrated model.

Mode

Two changes in the configuration of DaySim to produce a TRANSIMS activity file involved the treatment of mode. The simpler of the two changes involved recoding the travel modes used in DaySim into the preestablished TRANSIMS mode codes. For example, Table 1.46, the DaySim trip list output example, shows the first trip using mode 7, which is "drive alone" in DaySim. Table 1.47, the new TRANSIMS activity file example, shows that the second record, which represents the traveler's first trip to the first out-of-home activity location, contains mode 2, which is "drive alone" in TRANSIMS.

The mode logic is significantly more involved for shared-ride trips. In existing activity-based model implementations that used static network assignment procedures, shared-ride trips are simply aggregated to the zonal level and divided by an assumed occupancy rate to calculate vehicle trips. That approach does not work in a disaggregate assignment simulation such as TRANSIMS because the goal is to preserve the details about each individual trip. Dividing discrete shared-ride trips by an occupancy rate to estimate vehicle trips is neither appropriate nor logical. Instead, the driver and passenger status must be assigned to travelers whose mode is identified as shared ride.

In TRANSIMS, only auto driver tours are of interest. DaySim predicts the occupancy for auto trips, namely drive alone (DA), shared ride 2 (SR2), or shared ride 3+ (SR3), but it does not predict whether the person is the driver or the passenger, and it does not coordinate the driver and passengers within a household. In addition, different trip modes (vehicle occupancies) may apply to different trips within an auto tour.
The project team used a detailed analysis to derive the most realistic and unbiased method for assigning a driver or passenger designation to each auto tour and trip, and thus to determine which tours to send to the Router.

To determine which car trips are part of car driver tours, a set of rules was established to deal with mixed tours that include some car trips and some noncar trips. Car trips that are part of school bus or transit tours are typically car passenger trips in which the person gets a ride in one tour direction and takes a bus in the other direction. For simplicity, the project team assumed that all such trips are passenger trips and need not be routed in TRANSIMS. In addition, some mixed-mode auto tours include one or more walk or bike trips. Because those modes are difficult to handle in TRANSIMS, and their number is quite small, the team assumed that the auto trips in those tours are passenger trips.

The expected number of car driver trips was calculated using assumed occupancy values of 1.0, 2.0, and 3.63 for the three auto modes and the total number of trips in each tour mode. The team also established a method for determining which tours

at each occupancy level to assign as driver tours; the method depends on the other trip modes used on the tour. For example, if a tour includes one or more walk or bike trips as well as shared-ride trips, it is designated as a car passenger tour. In contrast, if a tour includes no walk or bike trips but does include one or more drive-alone trips, it is designated as a car driver tour. Finally, for tours including only shared-ride trips, a certain proportion of the tours are randomly designated as car driver tours and the rest are designated as car passenger tours according to proportions derived from survey and modeled data.

One final note on mode coding pertains to the TRANSIMS activity file. In TRANSIMS, all of the activities that are accessed using the drive-alone and shared-ride driver modes are identified as MODE=2. TRANSIMS then uses information in the PASSENGER field to determine if the trip is truly drive alone or shared ride. If PASSENGER=0, the trip is treated as a drive-alone trip and assigned to the network. If PASSENGER>0, the trip is treated as a shared-ride driver trip and assigned to the HOV network.

Vehicle File

TRANSIMS has the ability to allocate or assign vehicles to individual travelers and to track those vehicles throughout the day. DaySim does not allocate vehicles to individual travelers. Thus, when creating the activity file, a separate vehicle is created for each auto driver tour, unconstrained by the number of vehicles each household is predicted to own or by competition among household members for the household vehicles. The project team anticipates enhancing DaySim so that it can assign household vehicles to each auto driver tour as part of other research efforts. Such a change would enhance the value of the integrated model by enabling it to more realistically model vehicle usage and resulting air-quality impacts in the region.
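The driver and passenger designation rules described in the Mode discussion can be sketched as follows. This is a minimal illustration and not the project team's implementation: the mode labels, the function names, the `shared_ride_driver_share` parameter, and the passenger counts written to the PASSENGER field are all hypothetical. The text says only that the share of shared-ride-only tours designated as driver tours was derived from survey and modeled data, so its value is left to the caller.

```python
import random

# Assumed average occupancies quoted in the text for the three auto modes.
OCCUPANCY = {"DA": 1.0, "SR2": 2.0, "SR3": 3.63}
AUTO_MODES = set(OCCUPANCY)
# Hypothetical labels for the non-auto modes that force a passenger designation.
PASSENGER_FORCING_MODES = {"WALK", "BIKE", "SCHOOL_BUS", "TRANSIT"}


def expected_driver_trips(trip_counts):
    """Expected car driver trips: person trips divided by the assumed occupancy."""
    return sum(count / OCCUPANCY[mode] for mode, count in trip_counts.items())


def designate_tour(trip_modes, shared_ride_driver_share, rng):
    """Label a tour 'driver', 'passenger', or 'not_auto' from its trip modes."""
    modes = set(trip_modes)
    if not modes & AUTO_MODES:
        return "not_auto"
    # Mixed tours with walk, bike, school bus, or transit trips: auto trips are passenger trips.
    if modes & PASSENGER_FORCING_MODES:
        return "passenger"
    # Any drive-alone trip on an otherwise all-auto tour makes it a driver tour.
    if "DA" in modes:
        return "driver"
    # Shared-ride-only tours: random draw against a share supplied by the caller.
    return "driver" if rng.random() < shared_ride_driver_share else "passenger"


def activity_mode_fields(trip_mode):
    """MODE and PASSENGER values for a driver trip; passenger counts are illustrative."""
    passengers = {"DA": 0, "SR2": 1, "SR3": 2}[trip_mode]
    return {"MODE": 2, "PASSENGER": passengers}


rng = random.Random(42)
print(designate_tour(["SR2", "WALK"], 0.5, rng))   # passenger
print(designate_tour(["DA", "SR2"], 0.5, rng))     # driver
print(activity_mode_fields("SR2"))
```

Only tours labeled "driver" would be written to the activity file with MODE=2 and sent to the Router; PASSENGER>0 then distinguishes shared-ride driver trips from drive-alone trips.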
TRANSIMS-DaySim Integration

Network skims, or location-to-location measures of network impedances and costs, are an essential element of any travel demand forecast system. The skims are used directly or indirectly in virtually every component of the DaySim model system, from calculating accessibility measures that influence long-term choices (e.g., auto ownership and overall daily tour and trip generation) to providing direct input into short-term choices (e.g., destination, mode, and time-of-day).

The skims are generated using network assignment or simulation software and reflect network performance by time of day. They are defined along a number of key dimensions (e.g., spatial, temporal, and modal resolution) and may be provided in a number of different file formats. In the initial implementation of the Jacksonville DaySim-TRANSIMS model system, the primary spatial unit used for skimming is the travel analysis zone (TAZ), and the temporal unit used is the detailed time period. In the current implementation, 22 time-period skims are generated and used in the model system; these time periods vary in length from a half-hour during the 3-hour a.m. and p.m. peak periods to 1 hour during the midday, early morning, and early evening, to a single broad overnight time period (Figure 1.60). The DaySim-TRANSIMS model system can be configured to other levels of temporal resolution as well.

One of the significant enhancements to TRANSIMS's capabilities is the new PathSkim program, which is used to build paths and gather travel attributes between selected locations at specific times of day. In addition to significantly improving performance through multithreading and one-to-many path building techniques, PathSkim makes selecting origins and destinations for zone-to-zone skims by time of day significantly more convenient. It automatically selects one or more activity locations near zone centroids as path origins and destinations.
The locations can also be randomly or geographically distributed within the zone or provided by the modeler through a zone-location file. The zone-to-zone or district-to-district skim information is aggregated in memory and written directly as a single skim matrix or a series of matrix files for different time periods.

In PathSkim the output time periods can also vary in length or combine travel from different times of day. This is particularly important for the SHRP 2 C10 model because DaySim requires 30-min skims during peak periods, hourly skims for off-peak periods, and a nighttime skim that combines late evening hours with early morning hours.

DaySim also uses skims to set the trip departure time on the basis of a scheduled arrival time at the activity location. The TRANSIMS Version 4 Router only builds paths from an origin at a specific time of day to a destination. What DaySim would prefer and what PathSkim provides is the ability to generate paths and aggregate skims on the basis of specified arrival times at the destination. In other words, paths are built backward in time.

For the Jacksonville network, the Version 4 skimming process took more than 24 h to run and generated a temporary plan file close to 100 gigabytes in size. Performing the same task with a multithreaded version of PathSkim takes approximately 15 min and produces no temporary files.

The initial C10A system architecture envisioned that the TRANSIMS tools would be used to generate and return to DaySim activity location-level measures of network impedances for a specified set of origin-destination pairs (O-Ds) and a given time period. At present, however, the model system is still employing TAZ-level network impedances.
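The variable-length skim periods and the arrival-based departure time described above can be illustrated with a small sketch. The clock boundaries below are assumptions chosen only so that the counts match the text (12 half-hour peak periods, 9 hourly periods, and 1 overnight period); the actual Jacksonville period definitions are not given here. The function names and the skim values are hypothetical.

```python
from bisect import bisect_right


def build_period_starts():
    """Start minute of each skim period under ASSUMED clock boundaries."""
    starts = [5 * 60]                                # 05:00-06:00 early morning (hourly)
    starts += [6 * 60 + 30 * i for i in range(6)]    # 06:00-09:00 a.m. peak (half-hourly)
    starts += [9 * 60 + 60 * i for i in range(6)]    # 09:00-15:00 midday (hourly)
    starts += [15 * 60 + 30 * i for i in range(6)]   # 15:00-18:00 p.m. peak (half-hourly)
    starts += [18 * 60, 19 * 60]                     # 18:00-20:00 early evening (hourly)
    starts += [20 * 60]                              # 20:00-05:00 overnight (wraps midnight)
    return starts


PERIOD_STARTS = build_period_starts()


def period_index(minute_of_day):
    """Index of the skim period containing a clock time in minutes after midnight."""
    minute = minute_of_day % 1440
    if minute < PERIOD_STARTS[0]:
        return len(PERIOD_STARTS) - 1  # before 05:00 falls in the overnight period
    return bisect_right(PERIOD_STARTS, minute) - 1


def departure_for_arrival(arrival_minute, skim_minutes_by_period):
    """Departure time implied by a scheduled arrival and an arrival-based skim."""
    travel_time = skim_minutes_by_period[period_index(arrival_minute)]
    return arrival_minute - travel_time


# Hypothetical skim for one origin-destination pair: 16 minutes in every period.
skim = [16.0] * len(PERIOD_STARTS)
print(len(PERIOD_STARTS), departure_for_arrival(8 * 60, skim))
```

Because the skim is indexed by the arrival period, the departure time follows from a single subtraction, which mirrors the idea of building paths backward in time.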

Ultimately, time and cost measures may be based on more spatially detailed TRANSIMS activity locations and on the specific times at which a trip or activity may be routed.

As described earlier, the fundamental spatial unit used in DaySim is the individual parcel, which is significantly more fine-grained than the TAZs used for network skimming. The use of TAZs for skimming is primarily driven by the computational burden of creating, storing, and accessing more spatially detailed skim data. For example, a region such as Jacksonville, with approximately 1,500 TAZs, necessitates the development of separate skims that, for each modal attribute and time period, contain 2,250,000 individual values. If those skims were developed at the TRANSIMS activity location level used in network assignment, more than 400,000,000 individual values for each modal attribute and time period would have to be stored. To refine the TAZ-level skims, DaySim does incorporate some parcel-level information, such as the distance from each parcel to the nearest transit stop by transit submode.

In the Jacksonville DaySim-TRANSIMS model, TRANSIMS creates fixed-format ASCII skim files. However, DaySim can be enhanced to read and write other data file formats, such as native CUBE matrix format and binary files.

TRANSIMS-MOVES Integration

One of the objectives of this study is to estimate the air-quality impacts of each of the application alternatives using the Environmental Protection Agency's (EPA's) new motor vehicle emission simulator (MOVES). MOVES replaces MOBILE6 and NONROAD as the mobile source emission tool required for air-quality conformity analysis and emission impact analysis. It is designed to produce county-level emission inventories for the entire nation, zone- and link-level emissions for state implementation plans and regional conformity analyses, and microscale emission rates for hot-spot and project-level analyses.
The goal is to move away from average operating characteristics over broad geographic areas to finer analysis scales based on detailed operating characteristics of a wide variety of vehicle types at specific locations and times of day.

Figure 1.60. 22 time-period skim definition. (Chart of the percentage of regional travel by time period, comparing the 4-period skims [AM, MD, PM, EV] with the 22-period skims: 12 30-min peak period skims, 9 hourly midday and shoulder skims, and 1 evening skim.)

MOVES Architecture

At its lowest level, the MOVES software applies emission rates to activity data by source and operating mode bins. For transportation applications, source use types are basically

equivalent to vehicle types. Figure 1.61 provides an overview of the core model processing. The activity data are allocated to source and operating mode bins by the total activity allocator (TAA). This module processes data from three primary data sources: source bin distributions, total activity, and operating mode distributions. The activity data are passed into the emission calculator along with emissions rates, which are adjusted using local and seasonal fuel and meteorology data. From the MOVES core model perspective,

• Fuel data are the supply of fuel by time, location, and type.
• Meteorology data are the temperature and humidity by time and location.
• Total activity is the quantity of emission-generating activity by time, location, and source use type.
• Operating mode distributions distribute average operating characteristics, such as average speed, to vehicle-specific power (VSP) bins by time, location, and roadway type.
• Source bin distributions convert vehicle data to emission-specific classifications. Distributions can vary by source use type, but not by time or location.

Total Activity Data Generator

To support the core model and facilitate a wide variety of user inputs, the MOVES architecture includes a number of data generators to manipulate and format the required information. The total activity generator (TAG) is designed to produce the activity for the source use types by time and location. This includes the vehicles and the operation of the vehicles.

The MOVES process prepares the total activity through a series of steps that perform calculations at increasing levels of detail and specificity. The steps are controlled through the data input interface. Table 1.48 provides a high-level overview of the calculations performed by the TAG.

The first few steps in the TAG are designed to define the array of vehicle types that are included in the analysis domain or region.
MOVES defines vehicles as a source use type and tends to define vehicle types at a much higher level of specificity than a transportation planner would. From a MOVES perspective, every combination of vehicle weight, fuel, technology, emission standard, and engine size represents a different source bin for a given vehicle age.

Table 1.48. Total Activity Generator Steps

Step    Calculation
TAG-0   Determine the base year
TAG-1   Calculate base year vehicle population
TAG-2   Grow vehicle population to analysis year
TAG-3   Calculate analysis year travel fractions
TAG-4   Calculate analysis year VMT
TAG-5   Allocate analysis year VMT by roadway type, use type, and age
TAG-6   Allocate annual VMT to hour by roadway type, use type, and age
TAG-7   Convert to total activity basis by process
TAG-8   Allocate total activity basis by zone location
TAG-9   Calculate distance traveled

Figure 1.61. MOVES core model.

Operating Mode Distribution Data Generator

The operating mode distribution generator (OMDG) provides a mechanism for defining the distribution of operating modes used to calculate emissions. For exhaust running emissions and energy consumption, this is a distribution of VSP bins by time, location, and roadway type. This generator would typically be used to create the default or user-specific profiles by roadway type to account for link or roadway-specific characteristics. These characteristics might provide specific information for additional roadway types (e.g., ramps), the grade of the link, or the distribution of the drive schedules around a specific average speed value. Adjustments to parameters other than average speed are probably limited to applications involving detailed operational simulations. Table 1.49 lists the calculations performed by the operating mode distribution generator.

Application Modes

MOVES supports three primary levels of analysis: macroscopic, mesoscopic, and microscopic. EPA uses the macroscopic analysis to perform national-level estimates of energy consumption and policy-related studies. The mesoscopic analysis focuses on generating state implementation plans and emissions inventories for the state and regional agencies responsible for air-quality conformity analysis. The microscopic analysis or project-level analysis is designed for hot-spot analysis of local projects that have air-quality implications. This level of analysis is intended to support environmental impact statements.

Given the primary purposes of the C10A model development effort, the mesoscopic or county-level analysis option is most appropriate. This level of analysis focuses on one or more counties within a region or state. Counties can be modeled independently or grouped into custom domains.
If modeled independently, the input tables containing total activity (VMT), fleet mix and age distributions, vehicle inspection programs, fuel types, and meteorology can be different for each county. If a custom domain is used, a single set of inputs is provided for the domain.

The decision to use individual counties or custom domains depends on available data and significant variations within the region. If vehicle populations, fleet, and fuel data are only available at the regional level, a custom domain may be the only option. If the region includes different states or counties with different vehicle inspection programs, the analysis may have to be subdivided. In that case, defining multiple custom domains may be desirable. Custom domains can simplify input data preparation, and they also reduce or avoid the need for output processing. For a region that includes a large number of counties, merging the output databases into a single answer can be time-consuming.

Within the mesoscopic or county-level analysis, MOVES supports two primary application methods. In the "lookup method," MOVES generates a table of emissions rates by vehicle type, facility type, speed bin, and various other classifications. Customized software can then be used to read the table and to calculate and aggregate emissions from individual links. In the "inventory method," the transportation data are processed to generate a series of tables and distribution factors that MOVES importers can read for a MOVES emissions inventory application.

The primary advantage of the inventory method is that the emissions estimates can be more accurate because rounding and interpolation errors within the internal calculations are minimized. The application process can also include greater policy sensitivity by permitting the user to adjust the emissions assumptions and input data for each run.
The primary disadvantages of this approach are the need to customize inputs and outputs for each application; the time required to run the MOVES software for one or more custom domains or counties; and the skills and training in MOVES applications that transportation modelers require.

Table 1.49. Operating Mode Distribution Generator Steps

Step     Calculation
OMDG-1   Define drive schedules
OMDG-2   Define the distribution of drive schedules by average speed
OMDG-3   Calculate the distribution of drive schedules for a given link
OMDG-4   Calculate the second-by-second vehicle specific power
OMDG-5   Determine operating mode bin for each second
OMDG-6   Calculate operating mode fractions for each drive schedule
OMDG-7   Calculate operating mode fractions for each link
OMDG-8   Adjust operating mode fractions based on the grade of the link

The lookup method has the advantage that the MOVES software is run once and the rates are applied to multiple scenarios or alternatives. Transportation modelers can focus on applying the rates and leave the details of setting up and applying the MOVES software to air-quality experts. The process also has the advantage that the rates can be applied to individual links and aggregated in standard ways. It is also

very similar to the way most MPOs applied the MOBILE6 software in the past.

The primary disadvantage of the lookup method is that the resulting rate tables may be too big and bulky for practical use. The attribute details created by MOVES are often not helpful and need to be restructured for efficient application. This restructuring can be challenging because individual rates need to be properly weighted to create aggregate rates that match the available transportation data.

Using the lookup emission rates with operational simulation models can also be problematic. Rates are available in 5-mph increments, but they should not be applied to instantaneous speeds. The rates are based on VMT distributions of average speeds. Since stopped vehicles do not generate VMT, the lowest emission rate cannot be applied to a vehicle that is stopped for 1 s or more. The rate needs to be applied to the average speed of the vehicle over the length of the link or some other measure of distance.

TRANSIMS Interface

The TRANSIMS Emissions program is designed to support both county-level and project-level applications of the MOVES software. For county-level analysis, the program supports both lookup table and inventory application methods. Figure 1.62 shows the TRANSIMS interface using MOVES lookup tables. In this approach, MOVES is applied once with appropriate county-specific data to generate one or more lookup tables. The TRANSIMS Microsimulator is executed to generate speed bin files for each vehicle type. The files contain the number of seconds over each 15-min period that each 30-m segment of roadway has vehicles of the specified type traveling in each of six speed bins. The TRANSIMS Emissions program is then executed with various parameters to aggregate some values and disaggregate other values in the MOVES lookup table and output the resulting composite rates. These rates can then replace the MOVES lookup table for subsequent applications.
In addition, the Emissions program applies the composite rates to each record and aggregates the resulting emissions by facility type, vehicle type, and/or summary district.

Figure 1.62. Emissions rate lookup table method.

For MOVES inventory applications, the process shown in Figure 1.63 is used. In this case the TRANSIMS Microsimulator generates the speed bin files and a link delay file. The link delay file contains the volume and speed on each link in 15-min increments. The LinkSum program aggregates this information to generate the VMT by Highway Performance Monitoring System (HPMS) vehicle types and the distribution of VMT by MOVES facility types. The Emissions program in this case is configured to output VMT distributions and average speed bin distributions by hour of the day. The tables are generated in the format required by MOVES importers that insert the data into the MOVES MySQL database for emissions inventory processing.

Microsimulator Outputs

One of the key inputs to the emissions estimate is a set of speed bin files output from the TRANSIMS Microsimulator. Speed bin files are generated for each vehicle type included in the network demand (i.e., travel plans). The control keys are listed in Table 1.50. At a minimum, speed bin files are

Figure 1.63. MOVES emissions inventory method.

Table 1.50. Microsimulator Control Keys

Control Key                 Description
OUTPUT_SPEED_FILE           File name to be created within the project directory
OUTPUT_SPEED_FORMAT         File format to be created (default = Version3)
OUTPUT_SPEED_VEHICLE_TYPE   A vehicle type code number (default 0 = ALL)
OUTPUT_SPEED_FILTER         Minimum number of vehicles per time increment (default = 1)
OUTPUT_SPEED_TIME_FORMAT    Output time format (default = seconds)
OUTPUT_SPEED_INCREMENT      Time increment duration (default = 24 hours)
OUTPUT_SPEED_TIME_RANGE     Time period range (default = ALL)
OUTPUT_SPEED_LINK_RANGE     Link number range (default = ALL)
OUTPUT_SPEED_SAMPLE_TIME    The time frequency in seconds at which the speed bins will be summarized (default = 1 second)
OUTPUT_SPEED_BOX_LENGTH     The length in meters of the link segments for which speed bins are summarized (default = 0 = full link length)
OUTPUT_SPEED_NUM_BINS       The number of speed bins that are summarized (default = 6)

generated for autos and trucks. If the region includes different vehicle inspection programs for different subregions, the auto vehicle types should include a separate vehicle type or subtype to aggregate the travel separately. For detailed emissions estimates, the sample rate is once per second in 15-min time increments and the links are subdivided into 30-m segments (box length). The output speed bin files include the metadata header record shown in Table 1.51 and the data fields listed in Table 1.52.

Table 1.51. TRANSIMS Speed Bin MetaData

MetaData          Description
TIME_STAMP        The time and date when the file was created
BOX_LENGTH        The segment length in meters
CELL_LENGTH       The cell length used in the simulation
SAMPLE_TIME       The frequency in which data are collected (seconds)
INCREMENT         The summary time increment
VEHICLE_SUBTYPE   The subtype of the vehicle type summarized in the file
VEHICLE_TYPE      The vehicle type code summarized in the file
VELOCITY_BINS     The number of speed bins
VELOCITY_MAX      The maximum speed in meters per second

Table 1.52. TRANSIMS Speed Bin Data Fields

Field Name   Description
LINK         Link number
DIR          Direction of travel (0 = AB, 1 = BA)
OFFSET       Distance from the beginning of the link to the end of the segment
TIME         Ending time of the time increment
SPEED0       Total vehicle seconds at speed zero cells per second
SPEED1       Total vehicle seconds at speed one cell per second
SPEED2       Total vehicle seconds at speed two cells per second
SPEED3       Total vehicle seconds at speed three cells per second
SPEED4       Total vehicle seconds at speed four cells per second
SPEED5       Total vehicle seconds at speed five cells per second
SPEED6       Total vehicle seconds at speed six cells per second

Control keys for the TRANSIMS Emissions program provide the tools necessary to compress MOVES emissions rates into values that correspond to the transportation data and the analysis requirements.
This includes selecting columns and column attributes for selecting rows and providing weighting factors for combining emissions rates into weighted average values. Once the table is collapsed, it can be output as a new emissions rate table for use as input to subsequent applications. An example of a collapsed emissions rate table is shown in Table 1.53.

The TRANSIMS Emissions program applies these rates to the Microsimulator speed bin data. This involves mapping TRANSIMS facility and area types to MOVES road types and TRANSIMS vehicle types to MOVES source types. Summary years, months, and weekend travel factors are specified. As the speed bin data are read for each link segment and time period, the vehicle seconds in each TRANSIMS speed bin are distributed to the 16 speed bins defined by MOVES. This distribution process ensures that the total vehicle miles traveled (VMT) and vehicle hours traveled (VHT) included in the TRANSIMS speed bins equal the total VMT and VHT represented in the MOVES speed bins. The appropriate emissions rates are applied to the VMT in each speed bin and summarized as requested. Emissions summary reports can be generated by area type, facility type, vehicle type, road type, area and facility types, area and vehicle types, facility and vehicle types, road and vehicle types, and total emissions. Emissions summary data can also be written to a file for additional processing.

Emissions Program Applications: Emission Inventory Method

The TRANSIMS Emissions program can also be used to generate input tables in the format required by the MOVES county-level data importers. The TRANSIMS speed bin and link delay data sets provide the information needed for five of the MOVES input tables. County or custom domain attributes such as temperature and relative humidity and vehicle population data related to fuel and age distributions need to be provided from other sources.
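One way to distribute TRANSIMS speed bin data to the 16 MOVES speed bins while keeping VMT and VHT unchanged is to split each block of vehicle seconds between the two neighboring MOVES bin speeds by linear interpolation. The sketch below illustrates that idea only; the text does not state how the TRANSIMS Emissions program performs the distribution. The 7.5-m cell length, the representative bin speeds (2.5 mph, then 5 to 75 mph in 5-mph steps), and all function names are assumptions. Speeds outside the range of the bin speeds are clamped, in which case VHT is kept but VMT cannot be matched exactly.

```python
MPS_TO_MPH = 2.236936

# ASSUMED representative speeds (mph) for the 16 MOVES average speed bins.
MOVES_BIN_MPH = [2.5] + [5.0 * i for i in range(1, 16)]


def cell_speeds_mph(num_fields, cell_length_m=7.5):
    """Speed in mph for SPEED0..SPEEDn, where SPEEDk means k cells per second."""
    return [k * cell_length_m * MPS_TO_MPH for k in range(num_fields)]


def to_moves_bins(vehicle_seconds, cell_length_m=7.5):
    """Spread vehicle seconds by cell speed over the 16 MOVES bins."""
    moves_seconds = [0.0] * len(MOVES_BIN_MPH)
    speeds = cell_speeds_mph(len(vehicle_seconds), cell_length_m)
    for mph, seconds in zip(speeds, vehicle_seconds):
        # Clamp to the range covered by the bin speeds (exactness is lost here).
        mph = min(max(mph, MOVES_BIN_MPH[0]), MOVES_BIN_MPH[-1])
        hi = next(i for i, s in enumerate(MOVES_BIN_MPH) if s >= mph)
        lo = max(hi - 1, 0)
        if hi == lo or MOVES_BIN_MPH[hi] == mph:
            moves_seconds[hi] += seconds
            continue
        # Weights chosen so that time and time-weighted speed are both preserved.
        w_hi = (mph - MOVES_BIN_MPH[lo]) / (MOVES_BIN_MPH[hi] - MOVES_BIN_MPH[lo])
        moves_seconds[lo] += seconds * (1.0 - w_hi)
        moves_seconds[hi] += seconds * w_hi
    return moves_seconds


def vht(seconds_by_bin):
    """Vehicle hours traveled."""
    return sum(seconds_by_bin) / 3600.0


def vmt(seconds_by_bin, speeds_mph):
    """Vehicle miles traveled: hours in each bin times the bin speed."""
    return sum(s / 3600.0 * v for s, v in zip(seconds_by_bin, speeds_mph))


# Hypothetical record for one link segment and time increment (SPEED0..SPEED6).
record = [0.0, 120.0, 300.0, 90.0, 30.0, 0.0, 0.0]
moves = to_moves_bins(record)
print(round(vmt(record, cell_speeds_mph(len(record))), 4), round(vmt(moves, MOVES_BIN_MPH), 4))
```

In this example all vehicle seconds fall between 2.5 and 75 mph, so both totals are reproduced. Stopped time (SPEED0) and very high cell speeds would need a separate treatment that the text does not describe.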
The interface includes many of the same elements as a lookup table application, but the process is reversed. Rather than collapse or convert MOVES emissions rates to TRANSIMS data elements, this process expands or converts TRANSIMS data to MOVES data classifications. For example, TRANSIMS facility and area types are collapsed to MOVES road types; TRANSIMS vehicle types are expanded to MOVES source types; and TRANSIMS speed bins are distributed to MOVES speed bins. As each TRANSIMS speed bin record by link segment and time period is read, the data are converted and aggregated into an appropriate MOVES-related data structure. The MOVES data are processed, formatted, and output as tab-delimited data files.

The first table required by the MOVES emissions inventory process is a distribution of annual VMT by HPMS vehicle

types. A mapping between TRANSIMS vehicle types and HPMS vehicle types is provided along with distribution fractions as necessary. An expansion factor is provided to convert the daily TRANSIMS VMT to annual VMT. This factor may also include some consideration for travel on roadways not included in the TRANSIMS network. Table 1.54 shows an example of the HPMS VMT distribution.

Table 1.53. Collapsed Emissions Rate Table

yearID  monthID  sourceTypeID  roadTypeID  pollutantID  processID  avgSpeedBinID  emissionRate
2008    1        21            2           1            1          1              1.91824
2008    1        21            2           1            1          2              1.02998
2008    1        21            2           1            1          3              0.608886
2008    1        21            2           1            1          4              0.430296
2008    1        21            2           1            1          5              0.37313
2008    1        21            2           1            1          6              0.318093
2008    1        21            2           1            1          7              0.2814
2008    1        21            2           1            1          8              0.258368
2008    1        21            2           1            1          9              0.241404
2008    1        21            2           1            1          10             0.228209
2008    1        21            2           1            1          11             0.217652
2008    1        21            2           1            1          12             0.206746
2008    1        21            2           1            1          13             0.195869
2008    1        21            2           1            1          14             0.208396
2008    1        21            2           1            1          15             0.237551
2008    1        21            2           1            1          16             0.276734

Table 1.54. VMT by HPMS Vehicle Type

HPMSVtypeID  yearID  VMTGrowthFactor  HPMSBaseYearVMT  baseYearOffNetVMT
10           2008    0                0                0
20           2008    0                7528117453       0
30           2008    0                3051203213       0
40           2008    0                361066.94        0
50           2008    0                3266014.62       0
60           2008    0                5935721.71       0

Table 1.55 provides factors for distributing the VMT assigned to each MOVES source type to road types. TRANSIMS vehicle types not only need to be mapped to HPMS vehicle types, they also need to be mapped to MOVES source types. TRANSIMS facility and area types are also mapped to MOVES road types. The VMT by vehicle type is summed by road type, then the road type fractions are calculated for each vehicle type. Since the purpose of this table is to distribute VMT assigned to a given source type to road types, the same fractions can be used for each source type associated with a given vehicle type. Table 1.55 provides an example of road type fractions for two source types.
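The road type fraction calculation and the daily-to-annual expansion can be sketched as below. This is an illustration only: the function names, the input VMT values, the vehicle-type-to-source-type mapping, and the expansion factor of 340 are hypothetical and do not come from the Jacksonville application.

```python
from collections import defaultdict


def road_type_fractions(link_vmt):
    """Road type VMT fractions per vehicle type.

    link_vmt is an iterable of (vehicle_type, road_type_id, vmt) tuples.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for vehicle_type, road_type_id, vmt in link_vmt:
        totals[vehicle_type][road_type_id] += vmt
    fractions = {}
    for vehicle_type, by_road in totals.items():
        vehicle_total = sum(by_road.values())
        fractions[vehicle_type] = {road: v / vehicle_total for road, v in by_road.items()}
    return fractions


def expand_to_source_types(fractions, source_types_by_vehicle):
    """Reuse each vehicle type's fractions for every source type mapped to it."""
    return {
        source_type: fractions[vehicle_type]
        for vehicle_type, source_types in source_types_by_vehicle.items()
        for source_type in source_types
    }


def annual_vmt(daily_vmt, expansion_factor):
    """Convert daily network VMT to annual VMT with a user-supplied factor."""
    return daily_vmt * expansion_factor


# Hypothetical daily VMT by vehicle type and MOVES road type.
sample = [("auto", 2, 100.0), ("auto", 4, 300.0), ("auto", 4, 100.0),
          ("truck", 4, 50.0), ("truck", 5, 50.0)]
fractions = road_type_fractions(sample)
# Illustrative mapping of vehicle types to source type IDs.
by_source = expand_to_source_types(fractions, {"auto": [21, 31], "truck": [61]})
print(fractions["auto"], annual_vmt(1000.0, 340.0))
```

As in Table 1.55, the fractions for each source type sum to 1 across road types.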
Because MOVES road types are limited to restricted-access and unrestricted-access categories, and speed profiles on freeways are considerably different from those on ramps, MOVES splits the VMT assigned to the restricted-access road type into freeways and ramps using a ramp fraction table. This fraction is simply the total VMT on ramps in urban or rural area types divided by the total VMT on ramps plus freeways (and

expressways). An example of the ramp fractions file is shown in Table 1.56.

Table 1.56. Ramp Fractions

roadTypeID  roadDesc                   rampFraction
1           Off-Network                0
2           Rural Restricted Access    0.056354
3           Rural Unrestricted Access  0
4           Urban Restricted Access    0.084319
5           Urban Unrestricted Access  0

Table 1.57 distributes daily VMT associated with a given source type and road type to VMT by hour of the day. The fractions can also vary for weekdays and weekends. The dayID field distinguishes a weekday (5) from a weekend (2). MOVES uses one distribution for Monday through Friday and the other distribution for Saturday and Sunday. Total weekend VMT is modeled as a fraction of total weekday VMT. Table 1.57 shows an example of the hourly distribution of VMT assigned to a given combination of source type, road type, and day type.

Table 1.57. VMT Hour Fractions

sourceTypeID  roadTypeID  dayID  hourID  hourVMTFraction
21            2           2      1       0.004541
21            2           2      2       0.003671
21            2           2      3       0.003132
21            2           2      4       0.003283
21            2           2      5       0.005489
21            2           2      6       0.015789
21            2           2      7       0.039964
21            2           2      8       0.069848
21            2           2      9       0.073569
21            2           2      10      0.057104
21            2           2      11      0.051712
21            2           2      12      0.055294
21            2           2      13      0.061564
21            2           2      14      0.061838
21            2           2      15      0.064897
21            2           2      16      0.073156
21            2           2      17      0.081059
21            2           2      18      0.083631
21            2           2      19      0.066849
21            2           2      20      0.045871
21            2           2      21      0.028413
21            2           2      22      0.023375
21            2           2      23      0.015719
21            2           2      24      0.010232

Table 1.58 is perhaps the most important. It distributes the VMT assigned to each combination of source type, road type, day type, and hour of the day to average speed bins. MOVES includes 16 speed bins in 5-mph increments. The amount of VMT assigned to each speed bin is critical to the emissions calculations. The shape of the distribution defines the operating mode distribution, driving schedules, and VSP bins used to calculate emissions. The data from the TRANSIMS speed bin files are distributed to source types using the vehicle type to source type map and source type factors.
The link facility and area type attributes map the link segment to a MOVES road type. The 15-min time periods are summed to hours of the day. The vehicle seconds in each TRANSIMS speed bin are then distributed to the 16 MOVES speed bins. This distribution process ensures that the total VMT and VHT included in the TRANSIMS speed bins equal the total VMT and VHT represented in the MOVES speed bins. The VMT in each speed bin are divided by the total VMT for the hour to set the average speed fraction. An example of the speed bin distribution for one classification category is shown in Table 1.58.

The tables created by the TRANSIMS Emissions program are then imported into the MOVES database, various MOVES parameters are set, and a MOVES run is executed. If multiple counties or custom domains are required, the MOVES databases have to be combined to create the total emissions inventory. Data can then be selected from the tables to generate summary reports.

Table 1.55. VMT Road Type Fractions

sourceTypeID  roadTypeID  roadTypeVMTFraction
21            1           0
21            2           0.12795
21            3           0.095435
21            4           0.399313
21            5           0.377302
41            1           0
41            2           0.201374
41            3           0.030168
41            4           0.696728
41            5           0.07173
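The ramp fraction and the average speed fraction described above are simple ratios; a minimal sketch follows. The function names and the input values are hypothetical, and the sketch assumes the VMT has already been distributed to the 16 MOVES speed bins for each 15-min period.

```python
def ramp_fraction(ramp_vmt, freeway_vmt):
    """Share of restricted-access VMT that occurs on ramps."""
    total = ramp_vmt + freeway_vmt
    return ramp_vmt / total if total else 0.0


def hourly_speed_fractions(quarter_hour_vmt):
    """Average speed fractions for one hour.

    quarter_hour_vmt holds four lists (one per 15-min period), each with the
    VMT in the 16 MOVES speed bins. The periods are summed to the hour and
    each bin is divided by the hour's total VMT.
    """
    hour_vmt = [sum(bins) for bins in zip(*quarter_hour_vmt)]
    total = sum(hour_vmt)
    if total == 0:
        return hour_vmt
    return [v / total for v in hour_vmt]


# Hypothetical VMT totals for urban restricted-access facilities.
print(round(ramp_fraction(8.0, 92.0), 3))

# Hypothetical VMT by speed bin for the four 15-min periods of one hour.
periods = [[0.0] * 16 for _ in range(4)]
periods[0][9] = 10.0
periods[1][10] = 20.0
periods[2][10] = 30.0
periods[3][11] = 40.0
speed_fractions = hourly_speed_fractions(periods)
print([round(f, 2) for f in speed_fractions[9:12]])
```

The resulting fractions sum to 1 for each combination of source type, road type, and hour, as the values in Table 1.58 do to within rounding.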

Table 1.58. Average Speed Bin Distribution

sourceTypeID  roadTypeID  hourDayID  avgSpeedBinID  avgSpeedFraction
21            2           12         1              0.004948
21            2           12         2              0.004122
21            2           12         3              0.003
21            2           12         4              0.002265
21            2           12         5              0.002105
21            2           12         6              0.003277
21            2           12         7              0.00927
21            2           12         8              0.019876
21            2           12         9              0.04253
21            2           12         10             0.093737
21            2           12         11             0.152748
21            2           12         12             0.169864
21            2           12         13             0.125502
21            2           12         14             0.072482
21            2           12         15             0.065015
21            2           12         16             0.229258
