National Academies Press: OpenBook
Suggested Citation:"6 Data Collection, Management, and Analysis." National Research Council. 2013. Engineering Aviation Security Environments—Reduction of False Alarms in Computed Tomography-Based Screening of Checked Baggage. Washington, DC: The National Academies Press. doi: 10.17226/13171.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

6 Data Collection, Management, and Analysis

BACKGROUND

As described in Chapter 2, current explosives detection and screening practices are conservatively applied at U.S. airports in order that there be reasonable assurance that explosives cannot be placed on airplanes. One result of this conservative practice is the large number of false alarms that result from the screening of checked baggage: According to Transportation Security Administration (TSA) estimates, this larger number of false alarms (also called false positives) translates into hundreds of millions of dollars per year of added cost to the government for resolving the false alarms, as well as causing passenger inconvenience. Although new technologies continue to be developed to improve the TSA’s ability to detect and intercept explosives that are intended to damage or destroy commercial airplanes, there is an immediate need to reduce the false alarms associated with checked baggage screening.

One approach to reducing the number of false alarms is to structure a data collection, management, and analysis system to allow studies of screening processes for the explicit purpose of tracking false positives with the intent of obtaining a better understanding of their causes. This clearer understanding of the causes of false positives should facilitate corrective actions for process improvement either with respect to equipment, software, and algorithms or in the area of operator training.

TRANSPORTATION SECURITY ADMINISTRATION DATA

On the basis of the information that it received in meetings and during site visits over the course of this study, the committee understands that the TSA has the ability to collect the following types of data:

• Baggage-processing data. These include counts of the number of bags checked, the number immediately cleared by the airport’s explosive detection system (EDS), the number of shield alarms (which occur when any area of a bag cannot be penetrated by x-rays), the number cleared by way of the on-screen alarm-resolution protocol (OSARP), and the number cleared at the baggage-inspection room (BIR);
• The nature of the identified threat that caused a bag to be sent to the BIR. Examples of such identified threats might be cosmetics; foodstuffs; electronics; books, paper, shoes or other particular materials causing shield alarms; bag parts or a packing style leading to an aggregation error; or non-bag-related causes such as mis-tracking, operator time-out errors, bag jams, or scanner failures;
• Electronic copies of certain images that cause alarms. Some airports are currently collecting copies of such images;
• Results of periodic standard testing on individual EDSs to ensure that they are operating consistently. Such testing would involve, for example, system voltages and currents and the EDS’s ability to detect threats on certain standardized digital inputs;

• Occasional detailed studies on the baggage-inspection process conducted at a particular location or locations. Examples of such studies would be those conducted by the National Safe Skies Alliance and Reveal Imaging; and
• Results from red-team testing. During such testing simulated threats are inserted into the system to check the ability of the screening system to detect them.

In spite of these vast data collection opportunities, during the course of its site visits the committee was unable to verify that there is any uniform system of data collection and management within the TSA. Without such data, it will be very difficult to manage and improve the baggage screening process.

Recommendation: The Transportation Security Administration should work with the Transportation Security Laboratory to collect and analyze field data in order to characterize the overall performance of the system by computing statistically valid estimates of probability of detection and probability of false alarm for today’s CT-based EDSs. These analyses should also be used to better understand the sources of false positives by determining the dependence of these probabilities on material characteristics of potential explosives threats, the variability in the material characteristics, and the characteristics of non-threat materials typically contained in checked bags. These estimates should then be used as baselines for determining the ability of potential improvements to reduce false alarms.

TRANSPORTATION SECURITY ADMINISTRATION DATA MANAGEMENT AND PROCESSING

The collection and organization of baggage screening data will require the development of a special database and data management system that will allow these data to be available for analysis. Procedures for viewing data and extracting relevant parts of the database for special purposes, such as the generation of reports, will need to be coupled with this database system. Procedures could also be developed for extracting the information needed for special studies such as quantitative risk assessments (QRAs), described below, or anomaly detection (i.e., when a sudden change occurs in the “normal” behavior of an EDS in an airport setting). Commercial off-the-shelf software for building such database systems is readily available and reasonably priced, although additional investment in hardware and training will still be necessary.

Finding: Discussion with TSA officials, airport personnel, and vendors indicates some limited-scale data collection and laboratory studies that have enabled the sources of false alarms to be broadly identified. However, system-wide data collection and analysis of the sort necessary to seek out the root causes and guide sustained improvements are not being done.

Conclusion: Without more systematic data on the rates and specific causes of false alarms, the TSA cannot determine what changes are likely to result in reduced false alarm rates and, in fact, does not have the infrastructure in place to determine if an implemented change would result in improved performance.

Recommendation: The TSA should develop and maintain a central database and data management system. The database should contain important historical data, examples of false-positive images, data from previous special studies that have been conducted, and results of periodic standard tests on individual EDSs.

Data from all TSA inspection facilities should be kept in a common format in the form of time-series data that record operating variables in the baggage-screening process. These records should include frequency counts (in units of bags per hour) of the number of bags handled, the number cleared by the EDS, the number having shield alarms, the number cleared by on-screen resolution (OSR), the number sent by OSR to the BIR, and the number of times that the ordnance disposal team (ODT) is called.

The Use of TSA Data in Quantitative Risk Assessment

The interaction between throughput rate, false-positive rate, probability of detecting explosives, human factors, and the probability of an attack suggests the need for continuous, detailed, system-wide modeling and analysis of the TSA baggage-inspection system. Quantitative risk assessment (see Appendix B for a more detailed description and a simple illustration outlining a QRA for quantifying the cost of false positives) provides methods to study and quantify the risk of extremely rare events, and especially events for which there are very limited data. The thrust of the QRA approach is the quantification of uncertainties, providing a framework for communicating how much confidence one has in reported figures of merit. However, QRA methods can also be applied to other problems that need to be studied quantitatively, such as the problem of reducing the cost of false positives, or for analyzing the probability of explosives being in airline baggage and being cleared to an airplane. Thus QRA could be useful for assessing potential trade-offs that keep that probability acceptably low.

Recommendation: The TSA should employ risk assessment methods to obtain a better understanding of the causes of false positives at both the system and the component level. QRA could also be an effective approach to analyzing the probability of explosives in airline baggage and for assessing the effects that changes to the baggage-inspection system will have on both probability of false alarm and probability of detection.
These QRA studies could be performed on an as-needed basis to develop the understanding of and to improve the baggage-screening processes. Data analyses should explore trends and possible changes in the false alarm rate over time and assess the real effects of changes to the system and procedures. For example, as noted in Chapter 2, changes in rules and charges affecting airline travel can have an influence on the mix of items in checked baggage, and such changes could have an important effect on inspection operations.

The Use of TSA Data for Process Monitoring

A fundamental principle of process operation is the need to monitor important process variables over time. This monitoring is important for purposes of detecting changes as quickly as possible, for maintaining control of the overall system, for effecting improvements to the process, and for quantifying the effect of process improvements.[1] Such data analyses should be able to identify unexpected changes in the process and also suggest changes that have the potential to improve the process. The data already collected by the TSA need to be linked to the current criteria for clearing baggage and to other control variables that would have direct impact on the performance of the baggage-inspection system.

[1] Other industries use such methods for process monitoring, and the realm of aviation security may be able to learn from them. For example, the chemical industry uses what is known as statistical process control (SPC) to track critical parameters over time. SPC is most useful when there is a baseline process that is expected to behave in a stable manner over a period of time. For aviation security, SPC may be useful for daily machine calibrations and similar verification activities.
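As an illustration of the statistical process control idea mentioned in the footnote on process monitoring, the following minimal sketch applies a standard Shewhart p-chart (three-sigma limits on a proportion) to false alarm counts. It is not a method described by the committee; the function name `flag_periods`, the choice of three-sigma limits, and the sample counts are assumptions made for illustration only.

```python
import math


def flag_periods(baseline, new_periods):
    """Flag reporting periods whose alarm proportion leaves the control limits.

    baseline, new_periods: lists of (alarms, bags_scanned) tuples, one per
    reporting period (for example, one per hour or per day).
    Returns (p_bar, flagged), where p_bar is the baseline alarm proportion and
    flagged lists the indices of new periods outside the 3-sigma limits.
    """
    total_alarms = sum(alarms for alarms, _ in baseline)
    total_bags = sum(bags for _, bags in baseline)
    p_bar = total_alarms / total_bags  # center line estimated from the baseline

    flagged = []
    for index, (alarms, bags) in enumerate(new_periods):
        # Binomial standard error for a period with this many bags.
        sigma = math.sqrt(p_bar * (1.0 - p_bar) / bags)
        lower = max(0.0, p_bar - 3.0 * sigma)
        upper = min(1.0, p_bar + 3.0 * sigma)
        if not lower <= alarms / bags <= upper:
            flagged.append(index)
    return p_bar, flagged


# Hypothetical counts, for illustration only (not TSA data).
baseline_counts = [(200, 1000), (210, 1000), (190, 1000)]
recent_counts = [(205, 1000), (260, 1000), (198, 1000)]
center, out_of_control = flag_periods(baseline_counts, recent_counts)
```

A flagged period is only a prompt for investigation: the cause could be a change in the baggage mix, a machine fault, or a protocol change, and the chart by itself cannot distinguish among them.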

The TSA baggage-inspection process is both complicated and expensive to operate. Information in the proposed database, if used properly, would be useful in helping the TSA to identify weaknesses in its systems, improve the systems’ processes, assess the real effect of changes in the systems, and keep inspection processes operating properly.

It would be possible to develop software procedures to generate automatically and inexpensively periodic management-level reports that could provide information on the state of the system and flag significant changes or other potentially interesting findings. Reports could be generated for and sent to individual airports. An overall summary, providing system-wide metrics, could also be included. The content of such reports should be highly graphical, showing trends and patterns in important performance metrics (such as screening cost per bag, probability of false alarm [PFA] and probability of detection [PD]).

The Use of TSA Data for Understanding the Root Causes of False Positives

A detailed, quantitative understanding of the root causes of false positives is important if the TSA is to reduce the costs associated with these false positives without increasing other risks. For instance, the overall false alarm rate includes two distinct “populations” of bags, each of which would require a different approach to reducing false alarm rates:

• The first population includes bags for which the EDS cannot make a decision: so-called “exceptions,” such as bags containing solid objects that cannot be penetrated by the EDS x-rays, mis-tracked bags, and bags that are poorly positioned in the EDS in such a way that the EDS cannot interrogate the entire bag (“cut bags”). These exceptions are sent directly to the baggage inspection room without the opportunity for a screener to evaluate the image and clear the bag.
• The second population includes bags whose contents include items that are misidentified by the EDS as potential threat items, for example when the item’s properties fall within the window defined for threat items, or multiple items are mistakenly aggregated into a single object that meets the criteria for a potential threat item.

Without systematic data that can be used to establish how much each population of bags contributes to the overall false alarm rate, or what the specific causes of false alarms are within each population, it is difficult to know what the right course of action is.

Recommendation: The Transportation Security Administration should track broad categories of bags with the goal of understanding how each category contributes to the cost of resolving false alarms. Categories should include the following: the number of bags scanned, the number of bags declared exceptions, the number of bags declared potential threats by the EDS and cleared by the screener using the on-screen alarm-resolution protocol, and the number of bags declared potential threats by the EDS and sent by the screener to the baggage-inspection room for further inspection.

Tracking these data over multiple airports and multiple seasons would give the TSA a better overall understanding of the cost drivers contributing to the false alarm rate. Although some studies, such as Reveal Imaging’s Image Quality Evaluation program, have been conducted and are an excellent start, they have been limited in scope and do not allow for seasonal and regional variation. Ultimately, data collection endeavors must be system-wide.

The wide range of false-positive images that current screening practices detect could be partitioned into a manageable number of categories of baggage items (e.g., cosmetics, foodstuffs, or metal) and non-bag-related causes (e.g., algorithm issues, losing track of bags during the screening process, or hardware faults). Again, better data from the entire screening process are needed to assess the merit of this approach.
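The category tracking recommended above can be sketched in a few lines. The example below counts bags by disposition and estimates each category's share of alarm-resolution cost as count times unit cost. The category labels, the function name `cost_shares`, and the per-bag unit costs are hypothetical; the report does not give unit costs, so the figures are placeholders only.

```python
from collections import Counter


def cost_shares(dispositions, unit_cost):
    """Count bags per disposition and estimate each category's cost share.

    dispositions: iterable of category labels, one per scanned bag.
    unit_cost: mapping from category label to an assumed cost per bag;
               categories missing from the mapping are treated as zero cost.
    Returns (counts, shares), where shares sum to 1.0 when any cost is incurred.
    """
    counts = Counter(dispositions)
    costs = {cat: n * unit_cost.get(cat, 0.0) for cat, n in counts.items()}
    total = sum(costs.values())
    if total == 0:
        return counts, {cat: 0.0 for cat in costs}
    return counts, {cat: cost / total for cat, cost in costs.items()}


# Hypothetical labels and unit costs, for illustration only.
bags = (["cleared_by_eds"] * 900 + ["exception"] * 10
        + ["cleared_by_osarp"] * 60 + ["sent_to_bir"] * 30)
assumed_unit_cost = {"exception": 5.0, "cleared_by_osarp": 0.5, "sent_to_bir": 5.0}
counts, shares = cost_shares(bags, assumed_unit_cost)
```

Running the same tally per airport and per season would show whether exceptions or EDS alarms dominate the cost at a given site, which is the distinction the two bag populations above are meant to capture.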

As a TSA database is established, taking the opportunity to gain as much information as possible is important. It would be prudent at this stage to collect too much data rather than too little.[2] Counts of the occurrence of the different root causes of false positives should be included in any TSA database that is developed, along with the other data (by-hour, by-bag, by-airport, and by-standard operational data). Anecdotal evidence about false alarm causes in some airports has been presented to the committee; however, it would also be useful to quantify how the frequency of the different root causes changes, for example with the season, year, or destination.

Knowledge of the frequency of alarms for each of these categories might suggest a further decomposition into more specific articles (e.g., cosmetic gels or liquids, as opposed to the whole category of cosmetics, which includes gels, liquids, creams, powders, and pastes, among other substances) to provide clearer guidance about where the highest payoff for corrective actions lies. For each of these categories, it would be possible to have a link to a set of example images of false positives that lead to false alarms.

The goal of examining data such as those described above would be to identify a category or categories of past false-positive images that, on closer examination, provide a basis for more sharply defined criteria that result in fewer false positives. The criteria for establishing the image categories should be driven by the level of likeness to an explosive image. It may be necessary to perform tests and studies in order to provide a technical basis for explosive image-standards for images in the individual categories: The concept is to identify images for each category that vary from having no likeness to explosives to having varying degrees of likeness. Thus, this task must involve experts in interpreting images of improvised explosive devices to sort out the categories. Two primary classification methods used for other forms of indexing, (1) K-means[3] and (2) hierarchical ascendant classification, have demonstrated some usefulness in classifying images in the medical and biomedical fields.[4] However, the ultimate choice and means by which this should be accomplished will depend on the nature of the population of images both with and without threats.

Information from such image classification studies could be fed back to automated threat recognition (ATR) algorithm developers and also could be used in focused training for OSR operators. Data-mining communities from other fields such as computer science, medical image analysis, and genomic analysis, among others, might also be able to help inform and guide this process.

Recommendation: The TSA should develop a categorization system to record particular causes of false alarms for baggage sent to the baggage-inspection room. The TSA should develop a database to store this information and use it to monitor performance variation and trends over time.

Other Uses of TSA Data for Process Improvement

Ideally, interactive tools would be coupled with this automatic database system so that researchers could investigate parts of the database not included in the automatically generated reports and could extract potentially interesting slices of data as inputs to other systems (e.g., standard desktop data analysis software). The quantitative risk assessment tools, which are driven by process data and other information, could be used to investigate the “what-if” questions that would be useful for quickly assessing the impact of proposed changes to the baggage-handling system. Then the process-monitoring tools could be used to assess the actual effect on the false alarm rate caused by any changes.
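To make the K-means method named in the image-classification discussion concrete, here is a minimal sketch of the standard Lloyd iteration, which splits a data set into K subsets that are each as compact as possible. It is a generic textbook version, not the procedure used in any study cited by the committee. The two-number feature vectors (imagined here as summary measurements of an alarmed object) and the deterministic choice of the first K points as starting centroids are assumptions made to keep the example small and repeatable.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Cluster points into k groups with Lloyd's algorithm.

    points: list of equal-length numeric tuples (hypothetical image features).
    Returns (labels, centroids); labels[i] is the cluster index of points[i].
    Initial centroids are simply the first k points (an illustrative choice).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_labels == labels and new_centroids == centroids:
            break  # converged: nothing changed in this pass
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Hypothetical feature vectors, for illustration only.
features = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.9, 1.1), (9.1, 8.9)]
labels, centroids = kmeans(features, k=2)
```

In practice the result depends on the features chosen and on the starting centroids, which is one reason the text notes that the final choice of method will depend on the population of images.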
Uniform reporting standards that can be used to generate reports automatically, giving detailed information for each screening facility, would be a necessary part of any data management system that is established. All data should come from the permanent database and should be available for analysis and study.

[2] The size of the sample necessary to be statistically relevant ultimately will be dependent on the level of precision desired, the number of variables considered, and the number of effects being measured.
[3] The data set is split into a given number (K) of subsets so that each subset is maximally compact.
[4] See, for example, J. Frank, Three-Dimensional Electron Microscopy of Macromolecular Assemblies, Academic Press, San Diego, Calif., 1996.

It is expected that this approach to examining the false-positive data and the decision-making processes for clearing baggage, involving both machines and humans, would lead to a technical basis for obtaining more informative on-screen images of items that may or may not involve explosives. The committee believes that such results would provide the TSA and researchers with a technical rationale for changing equipment specifications, algorithms, and detection criteria that should result in the reduction and better management of false positives.

Recommendation: The TSA should develop a system for sharing false-positive data with detection-equipment vendors, including ATR algorithm developers and, when reasonable, with baggage vendors. Vendors should have a clear picture of how well or poorly their own equipment and that of their competitors is operating in an airport setting.

The above approach to data analysis should greatly facilitate identifying the causes of false positives and the forms of corrective action that might be taken to reduce the number of false alarms. Of course, the analysis has to be repeated periodically to account for changes in technology and the tactics of terrorists. If it turns out that airport variability is important to an understanding of overall system state, different locations may have to be sampled for data processing.

The above approach should provide a basis for corrective actions with respect to those false positives that can be manifested directly from experience data. More sophisticated analytical models may be required to link rare but high-impact false positives to their fundamental origin.
That is, such models may be needed to give consideration to the contribution to false positives made by any part of the total checked-baggage-screening system, be it the passenger, baggage design, baggage-handling equipment, individual screening devices, the screener, or the process of physical examination of the baggage. This includes the assessment of changes in processes related to human factors, such as the use of threat image projection (inserting a pre-set image of a potential threat among the real-time scans to verify the ability of the TSO to recognize threats). An example of a more sophisticated approach is to perform a quantitative assessment of the risk of false positives and the consequences thereof (e.g., they may lead to a false sense of security and cause delays), as well as the potential consequences of a missed detection. An extension of the approach to such an analysis is illustrated in Appendix B.

It is important to use process-monitoring data to gain insight into how the baggage-inspection process works and how it might be improved. It is also important to conduct special studies in order to assess conditions that will develop in the future in the TSA baggage-inspection system.

The Use of TSA Data from Red-Team Testing

It is essential that there be strong assurance that the probability of detection is being maintained in the complete baggage-inspection process. There is concern that changes in the TSA protocol (specifically, changes made in an effort to reduce the false alarm rate), traveler behavior, local facility conditions, and various uncontrolled factors could have an adverse effect on PD. Red-team testing, based on a standard bag set containing simulated threats that the inspection process would be expected to catch, can be used to study the actual operating characteristics of the complete system in its actual operating environment.

Recommendation: In addition to collecting performance data on a routine basis, the TSA should, from time to time, conduct special studies and experiments for the purpose of obtaining additional information that would be useful for improving the baggage-inspection processes.
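Because red-team tests and standard bag sets yield a limited number of trials, a point estimate of PD (or PFA) is more useful when reported with an interval. The sketch below computes the Wilson score interval for a binomial proportion, a standard statistical formula; the report does not prescribe a particular interval, so the choice of method, the function name `wilson_interval`, and the example counts are assumptions for illustration.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    successes: number of detections (or alarms) observed.
    trials: number of test bags presented.
    z: standard normal quantile; 1.96 gives an approximate 95% interval.
    Returns (lower, upper) bounds clipped to the range [0, 1].
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p_hat = successes / trials
    z2 = z * z
    denominator = 1.0 + z2 / trials
    center = (p_hat + z2 / (2.0 * trials)) / denominator
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / trials + z2 / (4.0 * trials * trials))
        / denominator
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical test outcome, for illustration only: 95 detections in 100 trials.
lower, upper = wilson_interval(95, 100)
```

The width of the interval shrinks roughly with the square root of the number of trials, which gives a simple way to judge how many test bags a study needs before a change in PD could be distinguished from sampling noise.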

It may be important to conduct studies such as those recommended above at multiple locations, as there could be interaction effects between the factors that are being studied and the inspection equipment being used at different locations or the mix of the baggage at different locations.

Recommendation: The TSA should develop procedures for periodic verification to ensure that fielded EDSs meet detection-performance-level standards that correspond to the requirements for EDS certification. In addition to monitoring detection capability directly (e.g., using standard bag sets and red-team testing), these procedures should include the frequent monitoring of critical system parameters (e.g., voltages and currents) and imaging parameters (e.g., image resolution and image noise) to detect system problems as soon as they arise.

For purposes of monitoring EDS performance, the TSA and EDS vendors should develop specification limits for all critical system parameters (and their tolerances) that could be monitored frequently and recorded to track changes in performance during normal operations or to verify performance after maintenance or upgrading.

DISCUSSION

The TSA has the potential to collect large amounts of data, and these data contain important information. However, the committee found no evidence that the data are being collected or used effectively. Establishing a database and a data management system would allow the TSA to extract important information from its data, facilitating process control and process improvement. Having a deep quantitative understanding of the root causes of false positives would help with finding ways to reduce the probability of false alarms without lowering the probability of detection. Methods of quantitative risk assessment, driven by information in the recommended TSA database, would be useful as an assessment and decision-making tool and would help uncover relationships among the many system inputs and controls and operational costs, as well as help quantify the risk of a harmful attack. To keep the baggage-inspection process running correctly and to have the tools needed for process improvement, it will be necessary to employ process-monitoring methods that make use of the stream of data being generated by the process and to have detailed knowledge of the root causes of false positives.

On November 19, 2001, the Transportation Security Administration (TSA) was created as a separate entity within the U.S. Department of Transportation through the Aviation and Transportation Security Act. The act also mandated that all checked baggage on U.S. flights be scanned by explosive detection systems (EDSs) for the presence of threats. These systems needed to be deployed quickly and universally, but could not be made available everywhere. As a result the TSA emphasized the procurement and installation of certified systems where EDSs were not yet available. Computed tomography (CT)-based systems became the certified method or place-holder for EDSs. CT systems cannot detect explosives but instead create images of potential threats that can be compared to criteria to determine if they are real threats. The TSA has placed a great emphasis on high levels of detection in order to reduce false negatives, or missed detections. As a result there is an abundance of false positives, or false alarms.

In order to get a better handle on these false positives the National Research Council (NRC) was asked to examine the technology of current aviation-security EDSs and false positives produced by this equipment. The ad hoc committee assigned to this task examined and evaluated the causes of false positives in the EDSs, assessed the impact of false positive resolution on personnel and resource allocation, and made recommendations on reducing false positives without increasing false negatives. To complete their task the committee held four meetings in which they observed security measures at the San Francisco International Airport and heard from employees of DHS and the TSA.
Engineering Aviation Security Environments--Reduction of False Alarms in Computed Tomography-Based Screening of Checked Baggage is the result of the committee's investigation. The report includes key conclusions and findings, an overview of EDSs, and recommendations made by the committee.
