SUMMARY

Leveraging Big Data to Improve Traffic Incident Management

The term Big Data represents a fundamental change in what data is collected and how it is collected, analyzed, and used to uncover trends and relationships. Big Data is not just about the volume of data; it is also about the velocity, variety, veracity, and value of data. The ability to merge multiple, diverse, and comprehensive datasets and then to mine the data to uncover or derive useful information on heretofore unknown or unanticipated trends and relationships could provide significant opportunities to advance the state of the practice of traffic incident management (TIM) policies, strategies, practices, and resource management.

Research Objectives

NCHRP Project 17-75, “Leveraging Big Data to Improve Traffic Incident Management,” had the following objectives: to conduct research to illuminate Big Data concepts, applications, and analyses; describe current and emerging sources of data that could improve TIM; describe potential opportunities for TIM agencies to leverage Big Data approaches; identify potential challenges associated with the use of Big Data; and develop guidelines to help advance the state of the practice for TIM agencies.

Research Approach

To meet the objectives of the project, the research approach included the following activities:
• Assess research, practices, and innovative approaches through a review of the literature.
• Organize and conduct a responder workshop to inform the development of an incident response and clearance ontology and to identify areas in which improvements to TIM are needed.
• Identify Big Data opportunities for TIM based on the current state of the practice and responder needs.
• Conduct a comprehensive assessment of a wide variety of TIM-relevant data sources to determine the openness, maturity, and readiness for Big Data applications.
• Create an incident response and clearance ontology.
• Develop guidelines that help to advance TIM agencies toward the application of Big Data.

Findings

State of the Practice of TIM and Big Data

The state of the practice of TIM shows significant advancement over the past decade, most notably through the development of regional and statewide TIM committees, the National Traffic Incident Management Responder Training Program, the implementation of TIM legislation, and the collection and analysis of TIM data for performance measurement. Among these advancements, however, the collection and use of TIM data by agencies have lagged. Recent guidance provided by TRB and the FHWA, as well as the ongoing FHWA “On-Ramp to Innovation: Every Day Counts” (EDC) initiative to improve the quantity and quality of TIM data, reflects national efforts to advance the collection and use of TIM data.

The findings from a review of the state of the practice in Big Data reinforce that Big Data is not new and indeed has been applied for nearly two decades by major technology companies. Big Data is characterized by the “five Vs” (volume, velocity, variety, veracity, and value), but a dataset need not possess all five qualities to be considered Big Data. In contrast to the relational database approach, Big Data analytics is not bound to a single set of tools to perform an analysis; rather, it encompasses a wide variety of proprietary and open-source tools that can be customized and modified by users. These tools allow for the rapid transfer, processing, storage, and analysis of extremely large datasets. They have also increased the ability to analyze divergent data, such as decades-old historical records and real-time streaming data, to derive value that previously could not be attained using traditional approaches that typically rely on relational databases. Big Data applications in the field of transportation are more recent (having developed within the past few years) and include applications in areas such as planning, parking, trucking, public transportation, operations, and Intelligent Transportation Systems (ITSs).
A significant gap exists between the current state of the practice in Big Data analytics and the current state of transportation agency applications of data for TIM. The research team identified a few TIM Big Data applications, but for the most part, these applications could be performed using relational databases. Generally, at the local and state levels, data is not collected at the volume needed to effectively use or apply Big Data approaches. These initial approaches to Big Data for TIM can be expanded, but the data must first be prepared, must be of a sufficient size, and must cover a sufficient length of time to enable identification of meaningful patterns that yield value.

Big Data Opportunities for TIM

The application of Big Data technologies and analytics could further advance the state of the practice in TIM. To illustrate possible Big Data opportunities for TIM, the research discussed in this report contrasts traditional TIM data collection and analysis approaches with example Big Data applications for the same problem or research questions designed to:
• Improve scene management practice,
• Improve resource utilization and management,
• Improve safety,
• Enable predictive TIM,
• Support performance measurement and management, or
• Support TIM justification and funding.

Each example application describes the current practice, the potential for a Big Data approach, the differences in data needs and analytical approaches, and the possibilities and benefits afforded by Big Data. These example Big Data applications illustrate that Big Data approaches are not simply an improvement on current practices; rather, Big Data represents a radical change from traditional approaches (a complete paradigm shift), and many of the benefits of Big Data analytics will require aggregating data at least at the state level, if not at the national level.

Data Source Assessment

The research team conducted a comprehensive assessment of 31 TIM-relevant data sources organized across six data domains. The assessment included a description of each data source, its potential application for TIM, the costs of accessing the data, and challenges associated with the data sources. The data sources also were assessed using two different data maturity models, and the assessment included an overall evaluation of data readiness and openness.

The assessment findings confirmed that large gaps exist between the current state of TIM-relevant data and the application of this data for Big Data analytics. Although it may be tenable for agencies to merge a few existing datasets, developing and integrating most of the datasets will require major efforts. Building more detailed and integrated datasets will require the dedication of significant resources and expertise, and the application of non-traditional approaches. Challenges such as the lack of standards for data collection and storage, personally identifiable information (PII), legal restrictions, and agency culture and policies will limit the application of Big Data for TIM. Existing TIM-relevant Big Data datasets (from sources like HERE Technologies, INRIX, and Waze) can provide a start to the use of Big Data, but these datasets lack the detail needed for effectively mining and understanding the nuances of incident response and TIM. Furthermore, even though traffic sensors and probes generate millions of data points every second, the relative infrequency of incidents (e.g., crashes) limits the application of Big Data to TIM unless the data is aggregated across multiple regions and organizations to increase its volume and variety. Finally, agencies must possess the willingness and openness to embrace the paradigm shift that is required to use Big Data.
Continued unwillingness to open and share data or to use cloud infrastructure will limit the growth and application of Big Data within an organization.

Incident Response and Clearance Ontology

Although it may be possible to use implicit or existing relationships within data elements to perform simple Big Data analyses, more complex and insightful Big Data analyses require a more abstract and concise way to express the knowledge represented by the data. This can be done with an ontology. An ontology is a set of concepts and categories in a subject area or domain that shows their properties and the relationships between them. NCHRP Project 17-75 included a first attempt at establishing an incident response and clearance ontology (IRCO), a formal naming and definition of the types, properties, and interrelationships of the entities that exist in the TIM domain.

The development of the IRCO was aided by a workshop attended by a broad range of incident responders who provided insights on the vocabulary, entities, and relationships associated with incident response and clearance. During the workshop, it was established that the TIM ontology should first focus on conceptualizing the response to an incident and how the response relates to the incident itself, as well as the incident environment and the personnel, actions, equipment, and response vehicles involved in the response.

The research team combined information gathered during the workshop with information from existing traffic incident-related ontologies identified in the literature to establish a basis for the IRCO. To capture the distributed and spatiotemporal nature of an incident response, as well as the various tasks performed by responders using various equipment, the Linking Open Descriptions of Events (LODE) ontology was used. The LODE ontology allows an event to be described in time, in space, and in terms of who was involved during the event. The IRCO organizes all these details in terms of defined classes and super classes, various object properties, and various data properties. The IRCO attempts to show how the TIM-relevant datasets are related to each other. It helps analysts understand how to organize a Big Data store (or data lake) and data analysis system for TIM so that users can quickly understand and leverage the information that is available. A complete description of the ontology and a graphical representation of the resulting IRCO are provided in Appendix B of this report.

Big Data Guidelines for TIM Agencies

The Big Data pyramid (Figure S-1) illustrates four tiers associated with reaching the level of applying data science. These tiers include: (1) the foundational activity of defining key performance measures (KPMs) and key performance indicators (KPIs); (2) the development of a Big Data store in which to capture, store, manage, and analyze Big Data datasets; (3) the development and maintenance of analytics and business intelligence tools and processes; and (4) the achievement of a mature Big Data practice.

The research for NCHRP Project 17-75 suggests that the current state of the practice for TIM data collection, storage, and analysis is between the first and second tiers on the Big Data pyramid. At this point, very limited TIM data is being collected and shared among partner agencies, and a solid data lake as a foundation for the development of TIM business intelligence (the third tier of the Big Data pyramid) and TIM data science (the fourth/top tier of the Big Data pyramid) has yet to be built.
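
The foundational tier of the pyramid, defining KPMs and KPIs, can be made concrete with a short sketch. The example below assumes three commonly used TIM measures (roadway clearance time, incident clearance time, and secondary crashes) and computes them from a handful of incident records. The record layout, field names, and sample values are hypothetical and serve only as an illustration; they are not drawn from the project's data sources or from any agency schema.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional


@dataclass
class IncidentRecord:
    """Hypothetical incident record; all field names are illustrative."""
    incident_id: str
    first_awareness: datetime      # first recordable awareness by a responsible agency
    all_lanes_open: datetime       # first confirmation that all lanes are available
    last_responder_left: datetime  # time the last responder left the scene
    primary_incident_id: Optional[str] = None  # set when flagged as a secondary crash


def roadway_clearance_minutes(rec: IncidentRecord) -> float:
    """Minutes from first awareness until all lanes are open again."""
    return (rec.all_lanes_open - rec.first_awareness).total_seconds() / 60


def incident_clearance_minutes(rec: IncidentRecord) -> float:
    """Minutes from first awareness until the last responder leaves."""
    return (rec.last_responder_left - rec.first_awareness).total_seconds() / 60


def summarize(records: list[IncidentRecord]) -> dict:
    """Aggregate the three measures over a set of incident records."""
    return {
        "mean_roadway_clearance_min": mean(roadway_clearance_minutes(r) for r in records),
        "mean_incident_clearance_min": mean(incident_clearance_minutes(r) for r in records),
        "secondary_crashes": sum(1 for r in records if r.primary_incident_id is not None),
    }


# Made-up sample data: one primary incident and one secondary crash linked to it.
records = [
    IncidentRecord("A", datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 8, 30),
                   datetime(2024, 1, 1, 8, 50)),
    IncidentRecord("B", datetime(2024, 1, 1, 8, 10), datetime(2024, 1, 1, 8, 30),
                   datetime(2024, 1, 1, 8, 40), primary_incident_id="A"),
]

print(summarize(records))
```

A calculation of this size fits comfortably in a spreadsheet or relational database. The Big Data tiers become relevant only once such records are pooled across many agencies and joined with other sources, which is the gap the summary describes.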

Figure S-1. The Big Data pyramid (tiers listed from top to bottom).
• Data Science: A scientific approach to statistics, domain expertise, research, and learning.
• Analytics & Business Intelligence: Understanding the model on how systems interact. Determining the ability to take action and measure results using data.
• Data Warehousing: A place to store the data (e.g., data lake).
• Defining KPM/KPI: For TIM, roadway clearance time, incident clearance time, and secondary crashes.
Source: Adapted from “Big Progress in Big Data” (Drow, Lange, and Laufer 2015).

Based on the findings from this research, eight guidelines were developed to lay out the various changes that will be necessary for transportation and TIM agencies to develop a usable Big Data lake, implement agency-wide analytics and business intelligence, and pursue the development of an evolving data science environment beneficial to the entire agency. These guidelines will help position transportation and TIM agencies for Big Data. The eight guidelines can be summarized as follows:
• Adopt a deeper and broader perspective on data use.
• Collect more data.
• Open and share data.
• Use a common data storage environment.
• Adopt cloud technologies for the storage and retrieval of data.
• Manage the data differently.
• Process the data.
• Open and share outcomes and products to foster data user communities.

These guidelines are further illuminated in Chapter 6 of this report.

Next Steps

The guidelines encourage agencies to begin putting research into practice by fully embracing low-cost, traditional good practices in data collection, cleaning, warehousing, and analysis with existing data sources. Agencies also are encouraged to concurrently identify opportunities to ready their organizations for Big Data. Opening and sharing data, both internally and externally, are critical cultural shifts that need to be embraced. An incremental approach is recommended that begins with developing the culture, policies, and expertise to improve the usability and increase the use of current data, as well as capturing opportunities to migrate from in-house servers to the cloud. These steps are the basis for positioning agencies to begin capitalizing on the opportunities afforded by Big Data.

The time is ripe for Big Data implementation. The technology is here, the tools are available, and the expertise exists to assist transportation agencies in both understanding and applying these technologies and tools to everyday questions and problems. Transportation agencies are encouraged to make the leap forward and begin to embrace the changes that will enable them to tackle Big Data.
Even if, largely due to the pressures of organizational culture and a lack of data, transportation agencies have yet to fully accept and adopt the foundational principles of Big Data, the emergence of connected vehicle, traveler, and infrastructure data will soon drive this change. To avoid drowning in the imminent influx of data, and to capitalize on the wealth of information that can be derived from it, transportation agencies must ready themselves to use Big Data.

What is not yet readily available is a set of effective strategies and techniques for breaking down the barriers (e.g., organizational culture, legal restrictions, proprietary software) that impede transportation agencies from adopting Big Data practices. This is one area in which research can help agencies accelerate the adoption of Big Data for TIM. Once transportation and partner agencies have collected, opened, shared, and pooled enough (and sufficiently varied) data in a cloud environment, further research can be conducted to explore the data using Big Data techniques to discover how it can help to improve specific components of TIM programs.