CHAPTER 4
Outline and Requirements for a National Freight Data Architecture

Introduction

This chapter provides a catalog of components, characteristics, and draft specifications for a national freight data architecture that takes into consideration the results and lessons learned from the literature review, surveys and follow-up interviews, and peer exchange described in Chapters 2 and 3.

Special Considerations

A prerequisite for the development of specifications for a national freight data architecture is to define what a national freight data architecture should be. There are several dimensions to this issue, including focus, content, scope, and access to private-sector data sources.

Freight Data Focus

Different stakeholders may have different interpretations of what should be the focus of the national freight data architecture. As such, focus affects content and, therefore, data architecture specifications. For example, a data architecture that focuses on commodity flows has certain data requirements such as O-D data; commodity characteristics, weight, and value; modes of shipment; routing and time of day; and vehicle type and configuration. By comparison, a data architecture that focuses on the physical interaction between commercial vehicles and the transportation network has different data needs, such as vehicle type and configuration, network characteristics and performance, oversize and overweight data, safety data, and inspection data. Likewise, a data architecture that focuses on commodity flows and requires the collection of real-time supply chain data from the private sector has special data confidentiality requirements. There is some overlap between different focus options, and the challenge is to identify which one(s) to pursue.

Freight Data Content

Different stakeholders may have different interpretations of what should be the content of (i.e., what should be included in) a national freight data architecture. As with focus, content affects data architecture specifications. Obviously, the content of a data architecture depends on what is meant by data architecture. Different definitions exist, but, in general, a data architecture is the manner and process used to organize and integrate data components. It is worth noting that a data architecture is not a database (databases may be built based on data architectures); a data model, a data standard, a specification, or a framework (these items could be components of a data architecture); a system architecture (a system architecture could use data architecture components); a simulation or optimization model; or an institutional program. In order to conceptualize data components, data architectures normally use one or more of the following tools:

· Business process model (i.e., a representation of processes);
· Conceptual model (i.e., a representation of concepts and relationships);
· Logical model (i.e., a representation of data characteristics and relationships that is independent of any physical implementation);
· Physical model (i.e., a representation of data characteristics and relationships that depends on the specific physical platform chosen for its implementation); and
· Data dictionary and/or metadata (i.e., listing of definitions, characteristics, and other properties of entities, attributes, and other data elements).

In practice, which tools to use in a data architecture depends on the purpose and needs of the specific application. For example, a data architecture can be generic or specific. An example of a data architecture that is tightly integrated
into a specific application is the EFM data architecture (130). The EFM database schema includes several tables that support shipment tracking across the supply chain. By comparison, a generic data architecture that focuses on data flows rather than how data entities are organized and stored in a database is the National ITS Architecture (87). This architecture describes functions, subsystems where the functions reside, and data flows that connect functions and subsystems in connection with the implementation of transportation operation systems. It may be interesting to note that despite the name, National ITS Architecture implementations are usually carried out at the local level.

Freight Data Scope

Freight data scope (including both coverage and resolution) also has an impact on data architecture specifications. As documented in Chapter 3, different levels of decisionmaking tend to have different data requirements. For example, as the level of analysis migrates from a national level to a local level, the quantity and level of detail associated with the data needed for decisionmaking tends to increase. It follows that a data architecture that has to support several levels of decisionmaking has to accommodate a wider range of data requirements than does a data architecture that only needs to support one level of decisionmaking. By extension, a national data architecture that is to serve the needs of both public and private decisionmakers not just at the national level, but also at the state and local levels, has to be even more encompassing.

Plenty of documents provide information about the limitations of current data collection programs, adding weight to the idea that the coverage and resolution of current freight data sources are not sufficient. The data resolution issue is particularly critical because no statements are currently available that outline (1) the required data disaggregation and accuracy levels to address current and anticipated data collection needs from a technical and statistical perspective and (2) the corresponding impacts of those requirements on data collection costs and privacy requirements. Developing those statements is critical in order to identify data collection requirements (131).

Dimensions to freight data disaggregation include areas such as commodity type disaggregation, geographic disaggregation, temporal disaggregation, financial data disaggregation, operating data disaggregation, and privacy requirements. For example, CFS does not collect shipment data for certain industries and commodities and does not collect shipment data for shipments passing through the United States. In addition, cross-border shipment paths only include U.S. mileage. Further, CFS follows a 5-year cycle, which is inadequate for freight analyses in connection with phenomena such as recessions or droughts. The 2-year lag between data collection and release is also a weakness.

Likewise, the 2002 CFS used the largest 50 metropolitan areas plus remainders of state areas. Several ideas have been suggested to increase the number of CFS regions, including using three-digit zip codes (of which there are 929 throughout the country) and BEA areas (of which there are 172 throughout the country) (37). A recent study of techniques to generate national FAZs for transportation models recommended a system of 400 zones (40). A current NCFRP project (NCFRP Project 20, "Guidebook for Developing Sub-National Commodity Flow Data") is expected to shed light on recommended practices in this area (132). Specifically, the research will produce a guidebook describing standard procedures for compiling local, state, regional, and corridor commodity flow databases, as well as new and effective procedures for conducting sub-national commodity flow surveys and studies.

The need for data quality awareness is increasing, as evidenced by the 2001 OMB mandate that required Executive Branch agencies in charge of gathering, processing, or analyzing data for statistical purposes to issue quality guidelines to maximize the integrity, quality, and usability of the information those agencies disseminate (133). Relevant U.S.DOT documents include the HPMS field manual (67), the BTS compendium of source and accuracy statements (76), the Guide to Good Statistical Practice in the Transportation Field (131), and the BTS Statistical Standards Manual (134).

The privacy provisions in the E-Government Act of 2002 included a requirement for federal agencies to conduct privacy impact assessments (PIAs) to document what information is to be collected, its purpose and intended use, information sharing practices and security measures, opportunities for consent, and whether a system of records is created following Privacy Act provisions (135). The PIAs for the systems managed by the various operating administrations within the U.S.DOT, including relevant freight-related systems discussed in this report, are listed online (136).

Access to Private-Sector Data Sources

Aggregated freight data from commercial data providers have been available for years. For example, TRANSEARCH Insight merges several data sources including data from federal agencies and data from carriers. PIERS relies on data sources such as copies of shipping documents, monthly summaries from CBP, and information gathered through partnerships with companies abroad that specialize in manifest data collection in other countries. In practice, it is not always possible to obtain detailed documentation about the characteristics
and methodology used for the production of commercial databases. These databases can be more expensive compared to public-sector data (at least from the standpoint of regular freight data users who do not need to internalize the cost to collect, process, and publish public-sector data).

The shipper industry collects large amounts of data. Many shippers and logistics service providers transmit data electronically using EDI technologies. These stakeholders use EDI regularly for load tendering, tracking, and freight payment purposes. However, accessing data from shippers and logistics service providers for transportation planning applications (beyond aggregated data from commercial data providers and national survey campaigns such as CFS) is not necessarily straightforward. For example, although a data record might characterize a commodity as well as origin and destination locations, the route data component may be missing unless the carrier movement data are included. In addition, many of the shipper stakeholders interviewed indicated they could not share data without the express consent of senior management and a review by their legal departments (particularly on a load-by-load basis, given its proprietary and confidential nature).

Motor carriers also expressed reservations about sharing proprietary and confidential data. Their reservations were related to the need to develop mechanisms to protect proprietary and confidential information and to maintain the anonymity of carriers and customers. In general, carriers would need to know in advance the specific uses of the data and, in return, would expect information in the form of industry benchmarking metrics. It is worth noting that developing metrics of interest to the private sector is part of NCFRP Project 3, "Performance Measures for Freight Transportation" (129).

In practice, the type and amount of data provided by, or available through, carriers varies considerably, depending on factors such as carrier size, geographic locations, activity focus, and type of cargo transported. Carriers handle large amounts of disaggregated data during the course of their business operations. Increasingly, carriers use EDI standards and applications. However, most of this information is confidential and limited to the direct exchange of data between trading partners. Some federal agencies are implementing EDI-based technologies to capture data from carriers, mainly through customs and homeland security processes.

The amount of shipment information detail in EDI transaction sets varies according to the type of transaction set used. In general, although the transaction sets support the use of commodity codes such as NMFC or STCC, these codes are different from other codes such as SCTG or NAPCS. Although crosswalk tables enable the conversion of commodity codes across coding systems, the current inventory of crosswalk tables is neither comprehensive nor coordinated. Questions also remain regarding the current level of market penetration of standardized commodity codes among shippers and carriers because, in reality, industry operational environments, customer expectations, and freight billing practices affect the collection of shipment-level data by carriers. For example, TL carriers, who tend to bill customers on a per-mile basis or by using a flat rate, rarely collect detailed commodity data, collecting instead generalized, non-standardized, and/or proprietary descriptions. Shipper bills of lading also vary widely in commodity-level descriptions (or contain no description at all). In addition, TL carriers are less likely to collect data on tonnage hauled or tare-level data, also attributable to industry-accepted billing practices.

By comparison, LTL carriers typically bill customers using a rate structure based on shipment weight, origin, destination, and freight classification. As a result, they tend to collect more commodity-level data. The traditional classification of LTL freight is based on NMFC codes. However, there is anecdotal evidence that LTL carriers frequently collect less descriptive or uniform commodity-level detailed data, favoring a freight-all-kinds rating structure that assigns a general freight classification to all shipments from a shipper regardless of freight commodity or type. As opposed to TL carriers, LTL carriers are more likely to track total tonnage.

The implementation of ITS technologies is facilitating the acquisition of operational-level data from carriers. Most of these initiatives focus on the interaction between carriers and the transportation network environment, but not on the collection of detailed commodity data. This is the case of the CVISN program, which involves the deployment of systems to streamline the credential process, automate inspection screening activities, and exchange data in connection with safety checks, credential checks, and fee processing.

Some initiatives are addressing data exchange between stakeholders along the supply chain process, as in the case of the EFM initiative sponsored by FHWA, but it is not yet clear whether, and to what degree, some of the data resulting from this process could be used for freight transportation planning purposes. EFM has undergone several field tests, including field operational tests at O'Hare and JFK International Airports, and the Columbus Electronic Freight Management deployment test. An upcoming deployment test is scheduled to launch at the Kansas City SmartPort project.

Other initiatives also are resulting in the collection of vast amounts of operational-level data at little to no cost to carriers, as in the case of the FHWA-sponsored initiative that has resulted in the collection of several billion anonymized positional data records per year from more than 600,000 trucks that operate in North America. This large database is facilitating the determination of performance measures such as travel times and speeds on freight-significant highways, as well as route choice by truck drivers.
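
As a rough illustration of how travel speeds might be derived from anonymized positional records of this kind, the sketch below computes the average speed between consecutive position reports from the same truck, using great-circle (haversine) distance. The record layout (truck_id, timestamp, lat, lon) and the function names are assumptions made for this example; they do not describe the schema or the methods of the FHWA-sponsored database.

```python
import math
from dataclasses import dataclass
from datetime import datetime

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Ping:
    """One anonymized position report (hypothetical layout)."""
    truck_id: str
    timestamp: datetime
    lat: float  # decimal degrees
    lon: float  # decimal degrees


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def segment_speeds_mph(pings):
    """Return (truck_id, start_time, mph) for each pair of consecutive pings.

    Pings are grouped by truck and sorted by time. Pairs with a zero
    time gap are skipped, since no speed can be computed for them.
    """
    by_truck = {}
    for p in pings:
        by_truck.setdefault(p.truck_id, []).append(p)

    speeds = []
    for truck_id, seq in by_truck.items():
        seq.sort(key=lambda p: p.timestamp)
        for a, b in zip(seq, seq[1:]):
            hours = (b.timestamp - a.timestamp).total_seconds() / 3600
            if hours <= 0:
                continue
            miles = haversine_miles(a.lat, a.lon, b.lat, b.lon)
            speeds.append((truck_id, a.timestamp, miles / hours))
    return speeds
```

Straight-line distance between reports understates the distance actually driven on curved roads, so a production analysis would more likely match positions to a highway network before computing travel times. The sketch only shows the basic arithmetic.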