National Academies Press: OpenBook

Traffic Forecasting Accuracy Assessment Research (2020)

Chapter: Chapter 3 - Archiving Traffic Forecasts and Associated Data

Suggested Citation:"Chapter 3 - Archiving Traffic Forecasts and Associated Data." National Academies of Sciences, Engineering, and Medicine. 2020. Traffic Forecasting Accuracy Assessment Research. Washington, DC: The National Academies Press. doi: 10.17226/25637.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

The project team recommends that agencies responsible for traffic forecasts systematically archive those forecasts and collect observed data on project outcomes both before and after the project opens. A strong archival process ensures that the necessary details about project forecasts are preserved in a readily accessible format once each project has been opened to traffic.

Although some agencies are archiving some details of their project forecasts, the experience of the project team revealed that archiving procedures are not consistently followed. For the analysis in this research, incomplete information caused more than 1,000 projects to be eliminated from the original collection of projects, which reduced the project’s forecast accuracy database by nearly half. Other examples of inconsistent archiving processes were revealed during the study’s deep dive analysis. The deep dive results of one project were severely constrained because relevant documentation was unavailable. Another project had copious information about the project, including voluminous environmental reports, but little of the information was relevant to the project forecast.

Looking forward, the project team made two key observations based on their experience:
1. Knowing where to begin searching for documentation related to project forecasts could dramatically reduce the analysis time, and
2. Much of the available project-related documentation contained sparse information about the details of the forecasts, whereas consistent, strict archiving procedures could have greatly increased the number of projects available for use in the study’s forecast accuracy database and strengthened the findings of the research.

One goal of this guidebook has been to specifically enumerate the information that should be preserved through archiving while allowing for realistic variation based on agency priorities and resources.
In Chapter 1, three levels of archiving were introduced (Bronze, Silver, and Gold). The three levels can be used to preserve differing amounts of information about forecasts. In addition to the three levels, the project team developed a forecast archive and information system that provides a standardized data structure within which to preserve this information. This chapter provides specific recommendations for what each archiving level should include and describes the recommended forecast archive and information system for preserving the information.

3.1 Archiving Levels

Archiving forecasts in a consistent manner reduces the time needed to analyze the forecast accuracy and strengthens any findings. The three recommended archiving levels are:
• Bronze. At this archiving level, the basic information of the forecast is recorded, as is the actual traffic volume and basic details about the type of project and the method of forecasting. A Bronze-level archiving standard is recommended for all project-level traffic forecasts.

• Silver. At the second archiving level, additional details and nuances specific to the project are recorded, as are assumptions about the forecast. This archiving level is not recommended for all projects, but rather for projects that are larger than typical projects or projects that represent new or innovative solutions that do not have a long track record of accurate forecasts. Silver-level archiving also is recommended for a sample of typical projects, which enables the agency to monitor the accuracy of forecasts for the kinds of projects that make up the largest number of projects.
• Gold. The third archiving level builds upon the Silver archiving standard by focusing on details that make the traffic forecast reproducible after the project has opened. In this way, specific sources of forecasting error can be definitively identified. Gold-level archiving is recommended for unique projects, for innovative projects that have not previously been forecast, and for a sample of typical projects.

Essentially, the Bronze archiving standard offers a routine, baseline level of archiving that also demands the least in terms of agency resources (staff time or budgets). Archiving at the Bronze level will allow agencies to consistently preserve the minimum information needed to conduct the most basic analyses of their forecasts. The Silver and Gold levels each require more resources, but detail the additional information that should be preserved to support more detailed retrospective analyses of forecasts (such as deep dives) for specific subsets of projects. The balance of this section provides a more detailed description of each level.

3.1.1 Bronze-Level Archiving

Bronze-level archiving records the basic information of the forecast and the actual traffic volume. This level of archiving is recommended for all project-level traffic forecasts.
At the Bronze level, the project team recommends that the following information be preserved:
• Project information, such as the project name, city/area location, roadway, project ID used by the agency, a short description of the project, the type of project (e.g., widening, resurfacing), the year the project was assumed to be completed, the year the project actually was completed, and the project’s construction cost.
• Forecast information, such as the forecast year and type (e.g., project opening or design-year forecast), the year the forecast was produced, the forecast itself, the methods used to produce the forecast, the version of the model (if a model was used to generate the forecast), and whether post-processing was applied to the raw forecast. Forecasts for individual segments of a project should be recorded individually.
• Actual traffic count information, such as the traffic count, the year of observation, and the units of the traffic count (e.g., ADT, peak period counts).

It is important that the units used for the forecast information and the actual traffic counts be specified. For purposes of comparison, it is ideal if the forecast and actual counts use the same units. When this is not the case, knowing the specific units that were used for each facilitates any conversions that must be made for the analysis. For example, many travel models will output traffic estimates for a “typical weekday,” but a project’s traffic counts may have been recorded as annual average daily traffic (AADT). In such cases, the units and any adjustments made for comparison should be clear.

At this level, recording the necessary information requires little effort beyond organizing the tracking mechanism. Bronze-level archiving systems are currently in practice in several places, including the Ohio DOT and Districts 4 and 5 of the Florida DOT.
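
The point about recording units can be illustrated with a minimal sketch. The record layout, field names, and the `percent_difference` helper below are hypothetical and are not part of the project team's specification; the error measure shown (count minus forecast, divided by forecast) is assumed here only for illustration. The sketch refuses to compare a forecast and a count whose units differ, so that any conversion has to be made and documented explicitly.

```python
from dataclasses import dataclass


@dataclass
class BronzeRecord:
    """Hypothetical minimal record; field names are illustrative only."""
    project_id: str
    segment: str
    forecast_year: int
    forecast_value: float
    forecast_units: str   # e.g. "typical weekday" or "AADT"
    count_year: int
    count_value: float
    count_units: str      # units of the observed traffic count


def percent_difference(record: BronzeRecord) -> float:
    """Return (count - forecast) / forecast * 100, refusing mixed units."""
    if record.forecast_units != record.count_units:
        raise ValueError(
            f"Units differ ({record.forecast_units!r} vs {record.count_units!r}); "
            "convert one value and document the adjustment first."
        )
    return (record.count_value - record.forecast_value) / record.forecast_value * 100.0


# Made-up values: a forecast of 10,000 and an observed count of 9,000, both AADT.
example = BronzeRecord("P-001", "Segment A", 2020, 10000.0, "AADT",
                       2020, 9000.0, "AADT")
print(round(percent_difference(example), 1))  # prints -10.0
```

Storing the units next to each value, rather than assuming them, is what allows a check like this to be automated.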
As part of NCHRP 08-110, the project team developed an online, open-source archiving system that incorporates data archived at the Bronze level. User documentation for the online tool is incorporated directly into the tool, which is described further in this chapter under the heading “Forecast Archive and Information System.”

3.1.2 Silver Archiving Level

Silver-level archiving builds upon the Bronze standard by recording details and nuances that otherwise would not be captured. This level of archiving is not recommended for all projects; however, it should be used for projects that are larger than typical projects or that represent new or innovative solutions that do not have a long track record of accurate forecasts. The project team recommends that Silver-level archiving also be applied to a sample of typical projects because doing so will enable the agency to better monitor the accuracy of forecasts for the kinds of projects that make up the largest number of projects.

The Silver standard preserves more details about the forecast, including the methods and assumptions that were used. The greater detail allows for a more detailed review of the sources of forecasting error, enabling more specific sources of error to be identified.

At the Silver level, the project team recommends that the following information be preserved:
• Project description, which includes all of the Bronze-level project information, plus descriptions of the project boundaries, a project map, and a description of key travel markets that are anticipated to use or benefit from the project. (Travel markets are meaningful quantities of trips that traverse geographic areas; sometimes travel markets are further characterized by trip purpose, time period, line-haul or circulation/distribution movements, or socioeconomic criteria.)
• Description of the traffic forecasts, which includes all of the Bronze-level forecast information, plus the most recent traffic count used as a basis for the forecast or to validate the travel model, and the uncertainty windows for the project.
• Forecasting methods, which includes a description of the methods used to develop the forecasts, and a summary of how well the model grasps the existing and expected travel markets that are most likely to use or benefit from the project.
• Assumptions, which involves documenting typical assumptions (e.g., population, auto fuel prices, auto ownership, and expected changes in technology and/or land use) as well as extraordinary assumptions that are specific to the project. Examples of extraordinary assumptions could be assumptions about a particularly large development near the project, impacts from adjacent construction, specific policies or ordinances, or how travelers will react to innovative solutions.
• Post-opening data collection, which summarizes the data needed to verify the traffic forecast and assumptions, and the associated data collection plan that will be executed upon project opening.

An annotated outline of the Silver-level archiving standard is provided in Part III, Appendix B, in this report. A customizable Word file containing the outline also is provided as a downloadable file from the NCHRP Research Report 934 webpage at www.trb.org.

Silver-level archiving systems are currently in practice at some agencies, including District 5 of the Florida DOT and the Wisconsin DOT.

3.1.3 Gold Archiving Level

Gold-level archiving builds upon the Silver archiving standard by focusing on details that make the traffic forecast reproducible after the project has opened. In this way, the sources of forecasting error can be definitively identified. Gold-level archiving is recommended for unique projects and innovative projects that have not been previously forecast. As with the Silver standard, Gold-level archiving also is recommended for a sample of typical projects. By doing so, the agency can use detailed analyses of a small sample of representative forecasts to improve its forecasts for many typical future projects.

The Gold level includes all of the Silver-level information plus an electronic archive that allows analysts to completely reproduce the forecast after project opening. The study team found that, when the original model runs were available, reproducing the forecasts was easier to achieve than originally assumed.

Gold-level archiving practices are currently being used by some sponsors of FTA Capital Investment Grant (CIG) projects. CIG grant recipients are required to perform a before-and-after study 2 years after project opening. Although they are not required to fully reproduce the model originally used to prepare the forecasts, agencies may find that doing this is useful to more effectively assess the causes of forecasting errors.

3.2 Forecast Archive and Information System

As discussed under the Bronze archiving level, the project team recommends that forecasters archive the basic information of the forecast and the actual traffic volume, making these data available for later analysis. To facilitate the process of compiling and archiving the data, a data specification and accompanying software (the Forecast Cards) are being made freely available online for future use and expansion. The system has been published as an open-source software tool, and more detailed user documentation has been integrated directly within the repository. This allows the user documentation to be updated together with any future revisions to the software tool. The Forecast Cards and accompanying data repository are accessible online using the links provided in Part III, Appendix A, in this report.

This section of the guidance document provides a high-level overview of the Forecast Cards and accompanying data. The overview is broken down into three parts:
• Development and design features. This section describes the development of the system and provides useful context for the design decisions that were made in the final implementation.
Specifically, the project team began by developing a Microsoft Access database to support the Large-N analysis described in Part II of this report. This experience allowed the project team to design a system that would be more scalable for future use.
• Data specification. This section describes the structure of the data in the Forecast Cards system. Technically, the Forecast Cards provide both a data specification, which lays out a set of rules for a standardized structure in which to store the data, and software that will compile/summarize the data stored in that structure. The data specification is available at: https://github.com/uky-transport-data-science/forecastcards.
• Data storage. This section describes options for storing the data once they are in the forecast card format. The current data repository contains all of the data used in this research, but the data repository has been designed so that more data can be added as new projects are planned or opened. The Forecast Cards Data repository can be accessed at: https://github.com/uky-transport-data-science/forecastcarddata.

For readers’ convenience, links to access the Forecast Cards and Forecast Cards Data also appear in Part III, Appendix A. Registration is not required to access the information, but registration (free for a public account, paid for a private account) is required to contribute by uploading new data.

3.2.1 Development and Design Features

The data for this project were compiled from existing data about forecasts and actual outcomes provided by several agencies. The data provided by each agency arrived in various formats, and an early task was to put the data into a common format such that the project team could analyze

the combined databases. In defining this common format, the project team reviewed information provided by agencies in Ohio, Wisconsin, Michigan, Virginia, Florida (District 4 and District 5), Minnesota, and Kentucky. Table I-6 shows the fields the project team used to record the information provided by each of these sources. The information included underlying data, such as traffic forecast reports, that were not included in the summary spreadsheets. The table also notes the number of records and unique projects from each source.

Table I-6. Common data fields in forecast accuracy sources.
Columns: Field; Data Source (Ohio DOT, Wisconsin DOT, Michigan DOT, Virginia DOT, Florida DOT (D4), Florida DOT (D5)*, Minnesota DOT, Kentucky (KYTC)*); Recommended for Combined Database
Number of records 6,229 458 9 1,160 143 50 2,179 n/a
Number of unique projects 2,466 132 7 39 134 31 110
Cumulative spreadsheet or database (flat file) √ √ √ √ √ √
Cumulative database (relational) √ √
ESAL or technical reports √ √ √ √
Project identification number or code √ √ √ √ √ √ √ √ √
Description, including facility name √ √ √ √ √ √ √ √
Location
County √ √ √ √ √ √ √
Milepost √ √
Other type of location √ √
Facility type or functional class √ √ √ √ √ √ √
Segment identification codes √ √ √ √ √ √
Length √ √
Project cost
Area type
Type of improvement √ √ √ √
Forecaster (person, agency, company) √ √ √ √ √
Year forecast made √ √ √ √ √ √ √ √
Forecast year(s) √ √
Opening year √ √ √ √ √
Interim year √ √ √
Design year √ √ √ √ √ √ √
Unlabeled year √ √
Forecast value √
ADT forecast √ √ √ √ √ √ √ √ √
Peak hour or K-values √ √ √ √ √ √
Turning movement forecast √
Segment or link identification √ √ √ √ √ √ √ √
DHV, truck, and ancillary data (sometimes forecasts, sometimes assumptions) √ √ √ √ √ √ √
Actual value √
ADT √ √ √ √ √ √ √ √
VMT √

Table I-6. (Continued).
Columns: Field; Data Source (Ohio DOT, Wisconsin DOT, Michigan DOT, Virginia DOT, Florida DOT (D4), Florida DOT (D5)*, Minnesota DOT, Kentucky (KYTC)*); Recommended for Combined Database
Turning movements
Year of observation √ √ √ √ √ √ √
Single value √ √ √ √ √ √
Multiple values √ √
Methodology examined and/or applied √ √ √
Historical trend or regression √ √ √ √ √ √
Population projections √ √ √ √
Travel model √ √ √ √
Other √ √ √ √ √
Relative error √ √ √ √ √
GEH √ √
Accuracy ratio √ √ √
Other √
Adjustment for differences in FY, AY √
ADT = Average Daily Traffic; AY = Assessment Year; D4 = District 4; D5 = District 5; DHV = Design Hourly Volume; ESAL = Equivalent Single Axle Load; FY = Fiscal Year; GEH = Geoffrey E. Havers statistic; VMT = vehicle-miles traveled.
* The Florida DOT (D5) and the KYTC provided many traffic forecasts and ESAL reports that were not part of a database.

Some sources could only provide a limited amount of information in a spreadsheet or database format, but additional project information remained available in associated ESAL or traffic projection reports:
• The Florida DOT (District 5) provided numerous traffic project reports with reasonably complete information that were later added to the database.
• The Kentucky Transportation Cabinet also provided numerous ESAL and traffic projection reports, but it was difficult to determine when the projects opened and to obtain matching count data, so those records were left out of the later analysis.
• The Virginia DOT provided multiple pieces of information about each project. For example, there is a description of each project, another file with the traffic counts, and a third spreadsheet with the initial accuracy calculations. Including these data would require reviewing the associated reports and information to develop a complete data record, and that task was not undertaken as part of this research.
• Data from international projects and from a deep dive in Massachusetts were later added to the database.

Several observations were made about the data received:
• Assessments of forecast accuracy seemed to focus strictly on ADT values, even though multiple items were produced in the forecast;
• Forecast accuracy also seemed to be evaluated in terms of a single point in time (the project’s opening year), with almost no investigation into the years after the opening year;
• Most sources contained data for most fields, but varying sources did not necessarily contain the same fields; and
• Even when the same fields were used, the definitions of the data the fields contained could differ. For example, consider project data recorded under the field, “Type of improvement.” One agency might have used a binary classification of new road versus existing road, whereas

another agency might have used a more detailed classification that distinguished among operational improvements, adding lanes, repaving, intersection changes, and other types of improvements.

The first two points limited the analysis that could be conducted for this research, and the second two points were important to the design of the database.

The final column in Table I-6 shows the fields that were recommended for inclusion in the combined database. A key goal of the combined database was to ensure a level of data standardization that would allow data to be combined and compared across agencies. The project team accomplished this by developing an initial combined database in Microsoft Access. This implementation was successful for the purposes of this research, but as the project team contemplated deploying it for future use, the limitations of the initial database became clear. Specifically, it would be difficult to share across multiple users in different organizations.

One goal of this research was to establish a process for the continued evaluation of traffic forecast accuracy. To make this happen, it was important to develop a tool that makes it easy to conform to the data standard, and that makes it possible for different users to store their data as new projects are forecast or open without creating conflicts with other users. Microsoft Access is a desktop-based software package, designed to store data on a hard drive or a network drive. Any user that wants to add to the data must have access to that drive, and when the information is added the whole file gets updated. This design makes collaboration difficult.

As the project team considered how best to adapt the archive and information system, the team determined it wanted a tool with the following design features:
1. A standard data specification.
As with the initial (Microsoft Access) version, a core goal of the archive and information system is to establish a standard data specification for forecast accuracy data. Such standardization allows data from different agencies to be stored in a common format and analyzed jointly.
2. Stable, long-term archiving. The purpose of archiving forecasts is to allow them to be evaluated after the project opens. The information system must be able to provide stable storage for a number of years, and not be based on files that could be lost with staff or hardware changes.
3. Multiple users and data sharing. The system should allow multiple users in different agencies to upload their forecasts and actual data, and should allow the data to be shared in a common format accessible to all the agencies. This will allow for combined analysis in the future, which is particularly important if the sample sizes are small for any particular agency.
4. Both a public and a private or local option for data storage. Recognizing that some users may not wish to share their data publicly, the system should provide an option to store the data privately or locally.
5. The ability to add files for the Silver and Gold archiving standards. It is desirable to have quantitative data available in a tabular format for statistical analysis. It also is desirable to keep supporting documentation or forecast reports (which may include qualitative information) for Silver-level archiving, and to keep model files for Gold-level archiving. The information system should facilitate linking supporting documentation, forecast reports, and model files to the tabular data records so that the integrity of the tabular data can be checked against the supporting documentation.
6. Based on mainstream and low-cost software. The solution should be based on software that is mainstream and that does not pose a budgeting obstacle for potential users.
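
The first design feature, a standard data specification, can be sketched in a few lines. The agency names, source column names, and common field names below are all hypothetical and do not come from the project team's database; the sketch only shows the general idea of renaming each agency's columns to one shared set of fields so that records can be analyzed jointly, with missing fields left empty rather than guessed.

```python
# Hypothetical mappings from each agency's column names to common field names.
FIELD_MAPS = {
    "agency_a": {"ProjID": "project_id", "FcstADT": "forecast_adt",
                 "YrMade": "year_forecast_made"},
    "agency_b": {"project_number": "project_id", "design_adt": "forecast_adt",
                 "forecast_date": "year_forecast_made"},
}

# The shared fields every converted record carries (illustrative subset).
COMMON_FIELDS = ("project_id", "forecast_adt", "year_forecast_made")


def to_common(agency, row):
    """Rename one agency-specific row to the common field names."""
    mapping = FIELD_MAPS[agency]
    common = {field: None for field in COMMON_FIELDS}  # missing stays None
    for source_name, value in row.items():
        target = mapping.get(source_name)
        if target is not None:
            common[target] = value
    common["source_agency"] = agency  # keep provenance for later analysis
    return common


rows = [
    to_common("agency_a", {"ProjID": "A-17", "FcstADT": 12000, "YrMade": 2005}),
    to_common("agency_b", {"project_number": "B-3", "design_adt": 8000}),
]
for row in rows:
    print(row)
```

Renaming columns addresses only differing field names; differing definitions within a field (such as the "Type of improvement" classifications discussed above) would still need an agreed set of categories.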
The project team also looked for use cases in related fields with similar needs that could suggest existing systems or tools adaptable to the project team’s purpose. Examining the fields of scientific publishing and software development yielded productive information:

• Increasingly, scientific journals are requiring that the data supporting a research paper be archived and made available to future researchers. The preferred method for doing this is by uploading the data to a public data repository that can store the data over the long term and provide access to upload or download the data through a stable URL. The journal Nature maintains a list of recommended data repositories for different fields at: https://www.nature.com/sdata/policies/repositories. The project team observed that these repositories allow for any type of file to be stored, but they often are set up to correspond to a specific snapshot of data, associated with the publication of a paper.
• Software development contends with similar archiving challenges, and repositories exist that make open-source software available similarly to data repositories. However, software usually involves ongoing evolution, either for the development of new versions or to fix bugs, so it needs a strategy for managing those changes, particularly when they are made by different individuals. A strategy is needed to allow multiple users to contribute to the software without creating conflicts. This task of managing the changes is usually facilitated via a version control system, which tracks the changes made to files, as well as the user who made those changes. A version control system also allows the user to revert to a previous version of the files, making it easy to “undo” a change if the change introduces an error. Strategies also are available for merging changes made by different users, although this task is easier if the changes are made to distinct files that are stored in the same repository, rather than to one large file. Significantly, version controls usually are applied to the software source code files, but they can be applied to any file.
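
To illustrate what it means for a version control system to track changes to a text file, the following sketch uses Python's standard `difflib` module to compare two versions of a small tabular text file. This is only an illustration of line-level change tracking, not how any particular version control system is implemented; the file name, column names, and values are made up.

```python
import difflib

# Two hypothetical versions of the same small text file: one value is revised.
before = "forecast_id,year,adt\nF1,2020,10000\nF2,2020,8000\n"
after = "forecast_id,year,adt\nF1,2020,10500\nF2,2020,8000\n"

# unified_diff reports which lines were removed ("-") and added ("+").
diff = list(difflib.unified_diff(
    before.splitlines(), after.splitlines(),
    fromfile="forecasts.csv (old)", tofile="forecasts.csv (new)",
    lineterm="",
))

# Keep only the changed lines, skipping the "---" / "+++" file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("\n".join(diff))
print(changed)  # prints ['-F1,2020,10000', '+F1,2020,10500']
```

Because the change is reported for a single line, a reviewer can see exactly which value was revised, and reverting amounts to restoring the removed line.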
A version control system can be installed locally on a single machine, or it can be combined with a software repository, which allows the files to be uploaded and archived to remote servers. A software repository can be public (allowing anyone to download the files), or it can be made private and password protected, such that the files are shared only among designated users. Git (https://git-scm.com/) is a commonly used and free version control system that works with either approach. GitHub (https://github.com/) is a commonly used software repository that integrates with Git and is free for most users. Several agencies that generate traffic forecasts use Git and GitHub to manage the code and inputs for their travel demand models. These agencies include the Metropolitan Transportation Commission (https://github.com/BayAreaMetro), the San Francisco County Transportation Authority (https://github.com/sfcta), and the Oregon DOT (https://github.com/ODOT-PTS).

After considering the available options, the project team reasoned that a clearly defined data specification, combined with a version control system and a software specification, would meet the six major design goals outlined above. Moreover, this solution could take advantage of existing software tools (Git and GitHub) where possible. The resulting archive and information system would be composed of two parts: one repository to store the data specification and associated scripts, and a second repository to store the data itself.

3.2.2 Data Specification

The data specification defines what the project team chose to refer to as forecast cards. The Forecast Cards data specification provides a way of storing key information about travel forecasts and associated outcomes in order to evaluate the performance of a forecast, analyze the collective performance of forecasting systems and institutions, and identify contributing factors to accurate or inaccurate forecasts. Each “card” is actually a text-based CSV file.
These files are easy to edit without specialized software and they integrate well with version control because any changes can be clearly viewed. (Version control works on binary files too, but in the latter case it will simply show that the file has changed, rather than highlighting specific lines or characters that have changed in a text file.) Using CSV files means that “decks of cards” from different agencies can either be treated separately or easily combined into a larger deck.
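
The idea of combining "decks of cards" from different agencies can be sketched with Python's standard `csv` module. The column names and values below are hypothetical and are not taken from the Forecast Cards specification, and in-memory text stands in for CSV files on disk; the sketch assumes only that each deck is a CSV file with a header row.

```python
import csv
import io

# Two hypothetical decks; in practice these would be CSV files on disk.
deck_a = io.StringIO(
    "forecast_id,location,year,adt\n"
    "A1,Main St north segment,2020,10000\n"
)
deck_b = io.StringIO(
    "forecast_id,location,year,adt\n"
    "B1,River Rd east segment,2020,8000\n"
)


def combine_decks(decks):
    """Read each agency's deck and return one list of rows tagged by agency."""
    combined = []
    for agency, handle in decks.items():
        for row in csv.DictReader(handle):
            row["agency"] = agency  # remember which deck the card came from
            combined.append(row)
    return combined


cards = combine_decks({"agency_a": deck_a, "agency_b": deck_b})
print(len(cards))  # prints 2
for card in cards:
    print(card["agency"], card["forecast_id"], card["adt"])
```

Because the decks share a header, no conversion step is needed before combining them. Note that `csv.DictReader` returns every value as text, so numeric fields would need to be converted before analysis.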

The project team developed five types of forecast cards:
• Points of Interest, such as roadway segments or transit lines;
• Projects, such as specific roadway expansions;
• Scenarios, including information used to make the forecast and/or about the forecasting system;
• Forecasts, consisting of predictions about the project outcomes at the points of interest at specific points in time (i.e., what the project is expected to do); and
• Observations, which are points of data (usually traffic counts) used to evaluate the forecasts.

Each type of card can be thought of as a table in a relational database, and each card a record in that table. The cards relate to each other as follows:
• A point of interest defines a specific physical or planned location (e.g., Mulberry Road between Busytown and Pleasantville, or Mulberry Road between Pleasantville and Workville).
• A project is associated with a planned, physical project that may or may not ultimately be built (e.g., the Mulberry Road Improvement Project).
• A project may be associated with one or more scenarios (e.g., a project may be associated with an opening-year scenario and/or a design-year scenario; the project also may be associated with a build scenario and/or a no-build scenario, or with a 2-lane scenario and/or a 4-lane scenario). Scenarios are likely to correspond to the various alternatives considered and reported in project planning documents.
• Each scenario can be associated with one or more forecasts. A forecast is a prediction of a specific outcome, for a specific point of interest, at a specific point in time. For example, a forecast might predict 10,000 ADT (outcome) on Mulberry Road between Busytown and Pleasantville (point of interest) in the year 2020 (point in time). A related forecast might predict 8,000 ADT (outcome) on Mulberry Road between Pleasantville and Workville (point of interest) in the year 2020 (point in time).
Both scenarios could be associated with the 4-lane opening-year scenario for the Mulberry Road Improvement Project. • Multiple forecasts can be stored for the same scenario, outcome, point of interest, and point in time. For example, forecasts are commonly revised between the initial planning and final design phases, or analysts might wish to consider a low-growth, medium-growth, and high- growth forecast. • Observations also are associated with a specific outcome, point of interest, and point in time, but observations need not be associated with a specific physical project. For example, it is quite possible to record a traffic count without a specific project being built. However, an option is available to associate an observation with a specific forecast. Each type of card has some required data fields that contain specific types of data. Most cards also have additional, optional fields that may contain other types of data. The cards also contain categorical variables, which are defined to ensure consistency (e.g., the options for the categorical variable project_type are specified as hov, road-capacity-expansion, transit-priority, road-diet, bike-facility, new-roadway, land use, and other. Figure I-7 shows the forecast card data schema. Required fields are shown in red, and optional fields are shown in black. Blue text shows the options available for categorical variables. The forecast cards for a project are stored together in a single directory, or folder. Figure I-8 shows an illustration of the CSV files associated with a single example project. Each rectangle represents a CSV file, with circles representing rows within the CSV file. The structure specifies one folder for each project. A folder may contain only the required CSVs (e.g., for a project that has been archived at the Bronze level), but it is easy to add files beyond the required CSVs to that same folder without having to change the structure of the database. Additional files might

Figure I-7. Forecast card data schema.

logically include a traffic forecast report or other documentation conforming to the Silver-level archiving standard. For select projects, the additional files might also include the files necessary to reproduce the associated model runs, bringing the archived information up to the Gold standard. This design provides an advantage over a traditional database format in that it is easy to add any type of file and have all the relevant information about a project stored in the same place.

In the data specification, a balance is needed between keeping the required fields to a minimum, and thus making it easier to use, and being more extensive in case specific users may wish to expand the scope of their analysis. For example, this research focused exclusively on the accuracy of ADT forecasts, but it is reasonable to also consider the accuracy of peak period traffic volume forecasts or of travel time forecasts. Likewise, future users may wish to consider the accuracy of transit forecasts in addition to road traffic forecasts. The project team aimed to build a degree of flexibility into the data specification, which is why the specification considers a point of interest instead of a roadway segment, and why several options are allowed for the forecast variable. The project team did not test the specification for transit forecasts or other uses beyond the scope of this research, and it is expected that the specification may evolve as other researchers and practitioners seek to extend it.

Instructions for getting started with forecast cards and examples are included in the Forecast Cards repository at GitHub. The Forecast Cards system also includes a validate_project script that checks to ensure that the forecast cards conform to the data specification. Scripts for combining the data for analysis are discussed in Part I, Chapter 4 of this report.
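The relationships among the card types can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's validate_project script or its analysis code: the field names (`project_id`, `scenario_id`, `poi_id`, `year`, `variable`, `value`), the identifiers, and the observed count of 9,500 are all hypothetical. Only the project_type options and the two Mulberry Road forecast values come from the text.

```python
# Options for the categorical variable project_type, spelled as listed in the text.
PROJECT_TYPES = {
    "hov", "road-capacity-expansion", "transit-priority", "road-diet",
    "bike-facility", "new-roadway", "land use", "other",
}

# Hypothetical cards: each dict stands for one row (card) in a CSV file.
projects = [{"project_id": "mulberry", "project_type": "road-capacity-expansion"}]
scenarios = [{"scenario_id": "4lane-opening", "project_id": "mulberry"}]
forecasts = [
    {"forecast_id": "f1", "scenario_id": "4lane-opening",
     "poi_id": "busytown-pleasantville", "year": 2020, "variable": "ADT", "value": 10000},
    {"forecast_id": "f2", "scenario_id": "4lane-opening",
     "poi_id": "pleasantville-workville", "year": 2020, "variable": "ADT", "value": 8000},
]
# Made-up traffic count, included only to show the matching step.
observations = [
    {"poi_id": "busytown-pleasantville", "year": 2020, "variable": "ADT", "value": 9500},
]


def invalid_project_types(project_cards):
    """Return the ids of projects whose project_type is not an allowed option."""
    return [p["project_id"] for p in project_cards
            if p["project_type"] not in PROJECT_TYPES]


def match_observations(forecast_cards, observation_cards):
    """Pair forecasts with observations sharing point of interest, year, and variable."""
    observed = {}
    for obs in observation_cards:
        key = (obs["poi_id"], obs["year"], obs["variable"])
        observed.setdefault(key, []).append(obs["value"])
    pairs = []
    for fc in forecast_cards:
        key = (fc["poi_id"], fc["year"], fc["variable"])
        for value in observed.get(key, []):
            pairs.append((fc["forecast_id"], fc["value"], value))
    return pairs


bad_projects = invalid_project_types(projects)
pairs = match_observations(forecasts, observations)
```

Matching on point of interest, year, and variable rather than on a project identifier reflects the point made above that an observation need not belong to any project. A forecast with no matching count, such as `f2` here, simply produces no pair.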
Figure I-8. Forecast card illustration for an example project.

3.2.3 Data Storage

The section on data specification described the recommended data structure for archiving traffic forecasts and associated observations. The archived forecasts do not require significant amounts of storage. The project database for this study (with over 16,000 segments representing more than 2,000 projects) required less than 50 megabytes. Each project has one folder, along with CSV files that store key information conforming to the Bronze standard. Additional files may be added to the same directory to meet the Silver or Gold archiving standard.

In its most basic form, a project database could consist of a simple directory of project folders on someone’s hard drive or on a local network drive. However, storing the information this way does not necessarily ensure the long-term archiving of the data, nor does it facilitate sharing outside the organization. Therefore, three options are suggested to agencies for storing their own forecast card data: local storage, a private repository, and a public repository. Which option is most appropriate will depend on the agency’s preferences, priorities, and needs, as follows:

• Local storage is appropriate for agencies that do not wish to store their data in the cloud, or agencies that wish to upload their data only at regular time intervals (e.g., annual uploads).

With this option, the data should be stored on a local network drive where it can be edited directly. Even with the local storage option, the project team recommends that agencies use a version control system (such as Git) to create a local repository. The files are still edited directly, but the version control software will track the changes made and store a record of previous versions in a file on the same drive. Creating a local repository provides a degree of protection in the event that a change inadvertently introduces an error.
• A private repository is appropriate for agencies that wish to create an off-site archive of the data to protect it, but still want to restrict access to the data to select users. The files still exist on a local drive and can be edited directly, but with this option the version control system is set up to also copy the files to a remote software repository (such as GitHub) and keep the local and remote versions in sync. Because two copies of the data are stored, it is unlikely that the data will be lost during a transition between local hardware or because of local hardware failure. This structure will also work well if multiple team members are involved in creating traffic forecasts and adding forecast cards. Team members can each add their own projects and upload them to the remote server. These changes are saved, along with the timestamp and name/email of the person who made the change. The version control system helps ensure that each team member will access the same, current version that incorporates any changes made by any other team member. Access to the private repository can be granted to users outside the organization, such as consultants who create or contribute to traffic forecasts, or researchers involved in analyzing the data. Access is available only if the user creates an account and the administrator grants specific read or write permission.
• A public repository (such as the Forecast Cards Data repository created by this project) is appropriate for agencies that wish to be fully transparent and make their data available for use by others who may wish to analyze it. The structure is the same as with a private repository, and write access is restricted to those granted permission, but with a public repository anyone with access to the internet can download a copy of the data.

Each agency that provided data to NCHRP Project 08-110 agreed to have the data included in a public repository. The Forecast Cards Data are available at https://github.com/uky-transport-data-science/forecastcarddata. Each project has been stored in a separate folder, and each folder name starts with the name of the agency that provided the forecast. This structure makes it easy to combine data received from multiple agencies in much the same way that data from multiple users can be combined in the private repository option. The public repository is the preferred option because it maximizes the possibility that other researchers will continue to analyze these data, and that insights can be gained from a larger, combined dataset.

One limitation relates to file sizes. Version control systems store information on the changes from all previous versions of files included in the repository. This is easy for text files and other small files, but if the files are extremely large it can be burdensome. Software repositories like GitHub typically are free for basic users, but the service company may place limits on the total repository size before incurring data storage fees. Agencies wishing to archive at the Gold level may need to account for this, either by paying for the necessary file storage or by setting up rules in the version control system so that it only stores large model files locally.

Instructions for setting up forecast cards with these options are available at https://github.com/uky-transport-data-science/forecastcards.
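One way to prepare such a rule is to list the files in a project database that exceed a chosen size, so that they can be reviewed before deciding what to keep out of the remote repository. The Python below is a sketch under stated assumptions: the `large_files` helper, the 1,024-byte threshold, and the folder and file names are hypothetical, and the repository instructions may recommend a different mechanism.

```python
import tempfile
from pathlib import Path


def large_files(root, max_bytes):
    """Return sorted paths (relative to root, POSIX style) of files larger than max_bytes.

    Anything inside a .git directory is skipped, since that is version control metadata.
    """
    root = Path(root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
        and ".git" not in path.relative_to(root).parts
        and path.stat().st_size > max_bytes
    )


# Demonstration on a throwaway folder with made-up file names and sizes.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "agency_project_1"
    project.mkdir()
    (project / "forecasts.csv").write_text("forecast_id,forecast_value\nf1,10000\n")
    (project / "model_run.bin").write_bytes(b"\0" * 2048)

    found = large_files(tmp, max_bytes=1024)
    # One path per line is the form Git expects in a .gitignore file.
    ignore_lines = "\n".join(found)
```

With Git, paths listed in a `.gitignore` file are left untracked, so the listed model files would stay on the local drive while the small CSV cards continue to be synchronized. The threshold is a judgment call for each agency.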


Accurate traffic forecasts for highway planning and design help ensure that public dollars are spent wisely. Forecasts inform discussions about whether, when, how, and where to invest public resources to manage traffic flow, widen and remodel existing facilities, and where to locate, align, and how to size new ones.

The TRB National Cooperative Highway Research Program's NCHRP Report 934: Traffic Forecasting Accuracy Assessment Research seeks to develop a process and methods by which to analyze and improve the accuracy, reliability, and utility of project-level traffic forecasts.

The report also includes tools for engineers and planners who are involved in generating traffic forecasts, including: Quantile Regression Models, a Traffic Accuracy Assessment, a Forecast Archive Annotated Outline, a Deep Dive Annotated Outline, and Deep Dive Assessment Tables.
