National Academies Press: OpenBook
Suggested Citation:"Summary and Recommendations." National Research Council. 2013. Engineering Aviation Security Environments—Reduction of False Alarms in Computed Tomography-Based Screening of Checked Baggage. Washington, DC: The National Academies Press. doi: 10.17226/13171.


Summary and Recommendations

The Aviation and Transportation Security Act (Public Law 107-71) of November 19, 2001, mandated that as of December 31, 2002, all checked baggage on U.S. flights be scanned by explosive detection systems (EDSs) for the presence of potential explosives threats. In response, the Department of Homeland Security’s (DHS’s) Transportation Security Administration (TSA) embarked on a program to quickly procure and deploy certified EDS equipment at all U.S. airports. Although any TSA-certified method of detecting explosives will meet the requirements of the Aviation and Transportation Security Act, the requirement is met in most airports through x-ray computed tomography (CT)-based systems. Now that CT-based detection systems have been in use for more than 10 years, TSA seeks to improve the performance of its baggage screening systems through such measures as better detection algorithms and more effective EDS equipment, especially to reduce the number of false alarms and thereby reduce the costs of screening checked baggage.

This report, from the National Research Council’s Committee on Engineering Aviation Security Environments—False Positives from Explosive Detection Systems, examines potential technical enhancements, opportunities to foster innovation, and data requirements for reducing the false alarm rate (the committee’s full statement of task appears in Appendix E).

This summary provides a brief overview of the report along with the committee’s key conclusions, findings, and recommendations. The supporting discussion appears in the chapters that follow, along with additional conclusions, findings, and recommendations as appropriate to the topic of discussion.

EXPLOSIVES DETECTION USING COMPUTED TOMOGRAPHY

CT does not directly detect explosives. Rather, CT is used in combination with automated threat recognition (ATR) algorithms to identify objects whose properties fall within specified ranges of the properties of explosives.
The process starts with the CT scanner emitting x-rays that pass through a bag. X-ray detectors convert the received x-ray flux to electrical signals, which are then processed to reconstruct a series of cross-sectional images of the bag, as well as estimates of atomic density (atoms per unit volume), mass, and size and shape of items within the bag.1 These cross-sectional images are then analyzed by an ATR algorithm that uses information in the images to determine whether the images satisfy a set of criteria consistent with the bag containing an object that is an explosives threat. The ATR algorithms have been developed and refined over many years to alert on threat amounts of materials that fall within a specified density and mass range (“detection window”). If these criteria are met, then the object is declared a “potential threat” and an alarm is raised.

There are four possible results of such a screening, as shown in Figure S-1:

• A threat is present and the system alarms, resulting in a true detection.
• A threat is present and no alarm is raised, resulting in a missed detection.
• No threat is present and the system alarms, resulting in a false positive.
• No threat is present and no alarm is raised, resulting in a true negative.

1 Some vendors may also use the filtered back-projection data and the images from the digital radiography line scanner in their ATR algorithm.
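The four screening outcomes listed above can be sketched in a few lines of Python. This is only an illustration: the function name `classify`, the class `ScreeningCounts`, and the counts in `example` are hypothetical and do not come from the report or from any EDS software. The sketch also shows how the probability of detection and the probability of false alarm follow from the four cells of the contingency table.

```python
from dataclasses import dataclass


def classify(threat_present: bool, alarm_raised: bool) -> str:
    """Map one screening result onto the four outcomes listed above."""
    if threat_present:
        return "true detection" if alarm_raised else "missed detection"
    return "false positive" if alarm_raised else "true negative"


@dataclass
class ScreeningCounts:
    """Cell counts of the contingency table (hypothetical container)."""
    true_detections: int    # threat present, alarm raised
    missed_detections: int  # threat present, no alarm
    false_positives: int    # no threat, alarm raised
    true_negatives: int     # no threat, no alarm

    def probability_of_detection(self) -> float:
        # Fraction of bags containing a threat that raised an alarm.
        threats = self.true_detections + self.missed_detections
        return self.true_detections / threats if threats else float("nan")

    def probability_of_false_alarm(self) -> float:
        # Fraction of threat-free bags that raised an alarm.
        non_threats = self.false_positives + self.true_negatives
        return self.false_positives / non_threats if non_threats else float("nan")


# Made-up counts, chosen only to keep the arithmetic easy to follow.
example = ScreeningCounts(true_detections=9, missed_detections=1,
                          false_positives=20, true_negatives=80)

print(classify(threat_present=False, alarm_raised=True))
print(example.probability_of_detection())
print(example.probability_of_false_alarm())
```

Keeping the two rates separate matters because they have different denominators: the detection rate is measured over bags that contain a threat, while the false alarm rate is measured over bags that do not.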

FIGURE S-1 A contingency table of the potential results of an interrogation of checked baggage by a computed tomography-based explosive detection system.

Because other, non-threat items can also fall within these specified ranges, there will always be a greater-than-zero chance of false alarm as long as there is also a greater-than-zero probability of detection. The difficulty in isolating threat materials from non-threat materials can be seen in Figure S-2, which shows typical density ranges for some threat and non-threat materials and the overlapping densities of some materials. It is inevitable then that relying solely on density and mass to identify threat materials will lead to some misidentification and false alarms.

Human screeners must resolve alarms raised by the EDSs. First, following a TSA-established “on-screen alarm-resolution protocol,” a screener at the baggage viewing station is presented with information from the scan (such as cross-sectional images of the bag and specifics of any suspicious objects generated by the ATR algorithm), which is used to either clear the alarm or to send the bag for further inspection. Bags that cannot be cleared are handled according to local regulations for potential explosive threats.

IMPLICATIONS OF A FALSE POSITIVE RATE

The TSA estimates that each percentage point of the current false alarm rate costs the government tens of millions of dollars per year. The main element of these costs for resolving false alarms is the total number of personnel required to screen baggage in U.S. airports, because every bag that causes an alarm must be sent for further inspection. However, there are other elements that contribute to the cost of resolving these alarms, including those associated with the infrastructure for segregating bags for manual inspection, with maintaining controlled areas for opening bags, and with tracking bags.
There is also a cost in time and convenience to travelers who must arrive earlier at airports to ensure baggage can be screened in time for their flight’s departure.

In addition to the added expense that it generates, the process of resolving false alarms may increase the net risk to air transport because the time and personnel allocated to resolving false alarms may take away from other security efforts. Moreover, some studies suggest that the current high false alarm rate may in fact reduce the likelihood of identifying an actual threat, as screeners have come to expect that the cause of an alarm is a non-threat.2

2 See, for example, Mathias S. Fleck and Stephen R. Mitroff, Rare targets are rarely missed in correctable search, Psychological Science 18:943-947, 2007; and Anina N. Rich, Melina A. Kunar, Michael J. Van Wert, Barbara Hidalgo-Sotelo, Todd S. Horowitz, and Jeremy M. Wolfe, Why do we miss rare targets? Exploring the boundaries of the low prevalence effect, Journal of Vision 8:1-17, 2008.

FIGURE S-2 Notional distribution of threats and non-threats in computed tomography (CT) density space. Clothes are clearly distinguishable as a single-valued function of density, but other non-threat materials commonly found in passenger bags show some overlap in material density with some threat materials.

Adding to the risks described above associated with personnel being diverted from detecting other threats and screeners expecting non-threats is the fact that, as currently deployed, CT-based EDSs are tuned only to detect a certain set of explosives. Reducing the false alarm rate without increasing the rate of false negatives would free up resources to develop and deploy capabilities for identifying explosive materials.3

TECHNICAL APPROACHES TO REDUCING THE FALSE POSITIVE RATE

Although there is no way to completely eliminate all false alarms, there are a number of potential technical approaches that could lower the false positive rate. Each offers some potential improvements in system performance while imposing additional development or operational costs, and some also carry the risk of decreasing the detection rate.

• Adjust the operating point on the receiver operating characteristic curve. One way to characterize the performance of a detector system is a plot of threat identification (probability of detection, or PD) versus misidentification of an innocuous item as a threat (probability of false alarm, or PFA), known as the receiver operating characteristic (ROC) curve, as shown in Figure S-3. Moving along the curve by narrowing the detection windows, in order to eliminate the misidentification of non-threat materials, brings a corresponding risk of decreasing the detection rate and missing a true threat. Moving along the curve in the other direction, by expanding the detection window to ensure the capture of all threat materials, will result in capturing non-threat materials and increasing the false alarm rate.
• Improve image processing.
The correction step in CT image reconstruction uses software to compensate for imperfections in the projection data acquired during scanning, but these corrections are not perfect, and image artifacts will always remain. These artifacts increase uncertainty in the measurement and evaluation of the objects that are being scanned, and can result in such things as mis-estimation of object mass, a widening of density windows, and inaccurate region building (which leads to over-aggregation of different objects). The net effect is to lower confidence in the estimated characteristics of an object within a bag, forcing the threat-defining windows to be widened, which results in a concurrent increase in false alarms. Thus, improvements in the image reconstruction and correction process could lead to a lower false alarm rate.

3 Studies addressing the detection of novel and liquid explosives are discussed in the subsection entitled “Dual-Energy Scanning” in Chapter 2 of this report.

FIGURE S-3 An example of a receiver operating characteristic curve.

• Slow the bag-processing speed. More scanning time allows for more detailed scanning. Indeed, with unlimited time to screen a bag, the estimations of mass and density would be substantially improved. The tradeoff that must be considered is whether increases in automated bag-screening time can be justified by the resulting decrease in the number of false alarms.
• Perform additional scans. One way to reduce the probability of false alarms and to improve the probability of detection of CT-based EDSs is to increase the number of cross-sections that the machine takes of an object. More cross-sections usually lead to a better probability of correct discrimination in recognizing whether an object is a threat or a non-threat. The number of cross-sections that a machine takes can be increased either by changing the current hardware or by passing a bag through existing CT scanners multiple times in such a manner that the bag is positioned somewhat differently for each scan. Of course, the costs associated with implementing rescanning, such as increased screening time, additional routing hardware, and modifications to scanners, must be justified by a sufficient decrease in false alarms.
• Investigate ways to better distinguish between materials of similar density. Adding atomic number to the screening criteria via dual-energy CT technology has the potential to improve an ATR algorithm’s ability to distinguish between threat and non-threat materials of the same density, and thus lower the probability that the EDS would give a false alarm. Some researchers have found that extensive calculations would be required to clear a whole bag;4 more than a minute may be needed even if a graphics processing unit is used to accelerate the computation.5 Further exploration is needed to obtain a full understanding of the advantages and limitations of dual-energy CT.
• Supplement computed tomography screening with additional technologies. Supplementing the decision from the ATR algorithm with information from another technology is another way to reduce the need for manual inspection of baggage. Such information might come from a different form of imaging technology, such as x-ray diffraction; a chemical analysis, such as mass spectrometry; or data from other sources, such as carry-on-baggage and passenger-screening checkpoints, perimeter-surveillance data, or even information about passengers’ behavior or travel habits. Coupled with CT, this information might be used to reduce the overall false alarm rate without increasing the risk for false negatives.

Finding: Based on the information available at this time about the performance characteristics of these approaches and available data on the actual sources of false alarms raised by today’s explosives detection systems, it is not possible to establish which are most promising or merit significant investment.

For one of these approaches, adjusting the operating point on the receiver operating characteristic curve, the committee has a specific recommendation concerning avenues for further investigation.

Recommendation: The Transportation Security Administration, through the Transportation Security Laboratory, should support human-factor studies to assess the impact on overall system performance, that is, the EDS plus the screener resolution, when the operating point on the explosive detection system’s receiver operating characteristic curve is adjusted so that both the probability of detection and probability of false alarm are lowered.
If the results of such studies determine that screener attention is degraded by the expectation that every alarm is a false alarm, the TSA should consider implementing adjustments to the operating point on the receiver operating characteristic curve and allowing vendors to reduce probability of detection in an airport setting to the minimum rate required for certification.

INCENTIVIZING AND ENABLING INNOVATION AND IMPROVEMENT

EDS vendors told the committee that the TSA provides them with few incentives to improve the performance of their equipment. Additionally, although TSA aims to improve the false alarm performance of EDSs for baggage screening, the committee was made aware of no clear plan from the TSA to implement improvements in the performance of fielded systems. Without changes to current TSA policy, there will be insufficient incentives for vendors to spend money to develop improvements beyond the necessary fixes for known problems.

Creating incentives for vendors and the technical community to develop improvements will require an organizational framework that includes a known path for the deployment of candidate improved technologies, a realistic strategy for fielding proven improvements, and specific incentives for vendors to provide equipment that performs better than would be necessary to meet baseline requirements.

The committee believes that the DHS and the TSA, in cooperation with the equipment vendors, should develop a realistic, long-term strategy for the performance improvement of EDS equipment in an airport setting.

4 Wenyuan Bi, Zhiqiang Chen, Li Zhang, and Yuxiang Xing, A volumetric object detection framework with dual-energy CT, pp. 1289-1291 in IEEE Nuclear Science Symposium Conference Record, IEEE, Piscataway, N.J., 2008.
5 Guorui Yan, Jie Tian, Shouping Zhu, et al., Fast cone-beam CT image reconstruction using GPU hardware, Journal of X-Ray Science and Technology 16(4):225-234, 2008.
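The operating-point trade-off behind the recommendation above (lowering both the probability of detection and the probability of false alarm by narrowing the detection window) can be illustrated with a toy model. The sketch below assumes, purely for illustration, that measured densities of threat and non-threat items each follow a normal distribution; the means, standard deviations, window centre, and the function names are all hypothetical and bear no relation to actual detection windows or certification values.

```python
import math


def normal_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def window_probability(low: float, high: float, mean: float, std: float) -> float:
    """Probability that a normally distributed density falls in [low, high]."""
    return normal_cdf(high, mean, std) - normal_cdf(low, mean, std)


# Made-up (mean, standard deviation) pairs in arbitrary density units.
THREAT = (1.6, 0.10)
NON_THREAT = (1.3, 0.20)


def operating_point(half_width: float, centre: float = 1.6) -> tuple[float, float]:
    """Return (PD, PFA) for a detection window of the given half-width."""
    low, high = centre - half_width, centre + half_width
    pd = window_probability(low, high, *THREAT)       # threats inside the window
    pfa = window_probability(low, high, *NON_THREAT)  # non-threats inside the window
    return pd, pfa


# Narrowing the window moves the operating point down the ROC curve:
# both PD and PFA fall together.
for half_width in (0.30, 0.20, 0.10):
    pd, pfa = operating_point(half_width)
    print(f"half-width {half_width:.2f}: PD = {pd:.3f}, PFA = {pfa:.3f}")
```

Because each narrower window is contained in the wider one, both probabilities can only decrease as the window shrinks. How much detection is given up for a given drop in false alarms depends entirely on how strongly the two distributions overlap, which is why the toy numbers here should not be read as estimates.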

Although priorities in a long-term plan involving EDS equipment would necessarily change on the basis of changing threat environments and other outside influences, a long-term plan developed cooperatively would allow companies to evaluate their risk-and-reward strategy in a more stable investment environment.

One possibility is adopting a different contracting mechanism. Performance-based logistics (PBL) has been successfully used by the Department of Defense in somewhat analogous circumstances. Under a PBL-based contract, the government and the equipment vendor work together to determine key performance indicators for the equipment, and the government provides incentives for the vendors to invest in improvements with a reasonable expectation that these improvements will be evaluated and implemented if successful.

Recommendation: In order to better capitalize on improvements and provide vendors with the necessary incentives to invest in research that will lead to better performance metrics, the TSA should consider adoption of a different contract structure for the procurement and maintenance of the computed tomography-based explosive detection systems used for checked baggage, as well as for other screening technologies. One approach worth considering is performance-based logistics contracting, which is currently used by the Department of Defense.

Evaluating Proposed Vendor Enhancements

The committee also heard frustration from vendors regarding the prospects that improvements they develop will be purchased and fielded by the TSA. Each vendor that the committee heard from6 described improvements that could be fielded now but that were being hindered by TSA testing requirements (such as those not permitting candidate improved algorithms to be tested without putting the entire system through the certification process) or by a lack of guidance on how changes were to be evaluated or implemented.
Companies will invest in technology improvements that can reasonably be expected to generate a return on the investment. Procurement cycles need to be structured such that vendors will be willing to make appropriate investments in better performance.

Of course, not every suggested change will merit fielding. Rather, TSA will need to create a framework to evaluate suggested enhancements. A first step in that process could be the establishment of a “technology board” composed of individuals knowledgeable about the technology and with broad experience in the technology, testing, and field requirements. The group’s charter would be to evaluate proposed technology improvements, to identify evaluation methods, to assess the outcomes of tests, and to identify necessary process changes.

Conclusion: The TSA lacks a structured plan for implementing improved EDSs that would give vendors an opportunity to plan research funding and priorities in accordance with the TSA plan.

Cooperation with University Researchers and Other Outside Industries

Researchers at universities, government laboratories, and other industrial companies, funded from both private and government sources, have long been working in image reconstruction and processing and ATR algorithms. However, such research is not currently being conducted in coordination with TSA or any of the EDS vendors, and the committee saw no structure in place for such researchers to partner with either the government or EDS vendors for the development, evaluation, or fielding of improvements.

6 Representatives of General Electric (GE) Security and of L-3 Communications, presentations to the committee, February 12, 2009, San Francisco, California.

Conclusion: The engagement of more members of the academic and industrial communities, as well as of those in the medical diagnostics and military communities having theoretical and applied expertise in image reconstruction and target recognition, could lead to increases in the effectiveness (and, in particular, decreases in false alarms) of CT-based explosives detection.

Recommendation: The TSA should develop a plan to provide appropriate incentives not only for EDS vendors but also for third parties and researchers in academia in order to improve the overall performance of computed tomography-based EDSs, including their rates of false alarms. Incentives should be provided for both short- and longer-term improvements.

Decoupling Image Acquisition from Post-Processing to Foster Innovation

Many comparisons can be drawn between CT-based EDS and medical CT. One of the biggest differences is that whereas CT-based EDSs had to be deployed almost universally in a very short timeframe, the development of medical CT scanners has occurred over a period of many years, driven by a broad range of requirements. Further, unlike CT-based EDS, where there are only a few vendors and a single customer, the vendors of medical CT scanners compete in a broad market on image quality, device flexibility, and cost, and maintain extensive research and development efforts to remain competitive.

Medical CT also enjoys an open environment and standardized image format (DICOM), which has opened the door to academic participation in post-processing innovations in three- and four-dimensional visualization and computer-aided diagnosis programs7 because details of the scanner process were separated from those details related to the processing of the images for specific applications. This separation allowed the scanner vendors to retain control over proprietary details of the acquisition of images, but it provided easy access to academic and industrial researchers.
Based on the positive experience with DICOM, there is now a move toward standardizing the image format of CT used for EDSs.

Finding: The introduction of an industry-standard medical image format (DICOM) in 1993 fostered the development of a diverse and innovative array of diagnostic and therapeutic image visualization, processing, and automated detection/diagnostic products, fueled by the panoply of academic and private-sector research laboratories with extensive experience in the field.

Recommendation: The Department of Homeland Security should promote the rapid acceptance of a standardized format for EDS images for all TSA-certified machines.

Separating the acquisition of CT images from the post-processing programs will help enable greater competition for the development of the post-processing programs. Broader participation by these highly experienced groups with diverse backgrounds in image processing would make it likely, the committee believes, that new methods would be developed that may improve the detection and classification efficiency of baggage scanners. It should be noted, however, that although opening up post-processing to a wider community may lead to useful advances, it does not address the quality and completeness of the images provided by the image acquisition and reconstruction stages.

7 See, for example, “SecurView Diagnostic Workstations,” available at http://www.hologic.com/en/breast-imaging/diagnostic-workstations/, accessed September 12, 2010; and Fang-Fang Yin, Maryellen L. Giger, Kunio Doi, Charles E. Metz, Carl J. Vyborny, and Robert A. Schmidt, Computerized detection of masses in digital mammograms: Analysis of bilateral subtraction images, Medical Physics 18(5, September):955-963, 1991.

Overarching Advice

The committee’s overarching recommendation concerning innovation and improvement is as follows.

Recommendation: The TSA should develop a long-term strategy for the continuous improvement of performance. Involving all interested parties including EDS vendors and users would increase the probability that all stakeholders work toward the same goals.

In addition to specific technology improvements, there are a number of areas where TSA might better incentivize or foster innovation: providing incentives for contractors, better defining mechanisms for implementing contractors’ improvements, fostering cooperation with universities and outside researchers, and promoting the decoupling of the image acquisition process from the post-processing algorithm.

CERTIFICATION TESTING AT THE TRANSPORTATION SECURITY LABORATORY IS USEFUL BUT DOES NOT REFLECT REAL-WORLD CONDITIONS

Certification testing of EDSs and subsequent performance testing in an airport setting is one source of information on EDS performance and causes of false alarms. To be certified, a machine must demonstrate the ability to detect a number of categories of explosives, with each category having a specific detection threshold (i.e., level of detection that must be met). The machine must also meet an average detection threshold across all categories of explosives and not exceed a maximum false alarm rate (which is tested separately from the detection).

TSA’s Transportation Security Laboratory (TSL) uses two sets of bags for certification tests. One set contains one threat per bag and the other has no threats. The “threat” test set is, of course, not fully representative of the bags likely to be found in an airport setting, which may contain multiple potential threats. As a result, the probability of detection established for an EDS at the TSL may not be maintained in an airport setting.
However, the use of bags more representative of those seen in an airport setting would be quite complicated, given the potential variations in bags and contents. Additionally, the use of a simple, well-defined set of test bags has the advantage of allowing all manufacturers to be tested against a common standard.

The certification testing also does not account for the humans in the screening process loop. In an airport setting, there can, for example, be pressure to clear bags in order to make delivery deadlines and variability in the way in which the on-screen alarm-resolution protocol is carried out.8

Beyond these limitations, current testing does not address the quality of the CT scanner image data output and analysis output once scanners have been deployed. Nor is information about false alarms analyzed and fed back into future iterations of the ATR software.

Conclusion: Certification testing at the Transportation Security Laboratory fills a specific and useful role. However, it should not be used as the sole basis for predictions of performance in an airport setting.

Recommendation: The TSA should develop procedures for periodic verification to ensure that fielded EDSs meet detection-performance-level standards that correspond to the requirements for EDS certification. In addition to monitoring detection capability directly (e.g., using standard bag sets and red-team testing), these procedures should include the frequent monitoring of critical system parameters (e.g., voltages and currents) and imaging parameters (e.g., image resolution and image noise) to detect system problems as soon as they arise. For purposes of monitoring EDS performance, the TSA and EDS vendors should develop specification limits for all critical system parameters (and their tolerances) that could be monitored frequently and recorded to track changes in performance during normal operations or to verify performance after maintenance or upgrading.

8 See, for example, Sara Kraemer, Pascale Carayon, and Thomas F. Sanquist, Human and organizational factors in security screening and inspection systems: Conceptual framework and key research needs, Cognition, Technology, and Work 11:29-41, 2009.

USE A DATA-DRIVEN APPROACH TO REDUCE FALSE POSITIVES

A detailed, quantitative understanding of the root causes of false positives is important if the TSA is to reduce the costs associated with these false positives without increasing other risks. For instance, the overall false alarm rate includes two distinct “populations” of bags, each of which would require a different approach to reducing false alarm rates:

• The first population includes bags for which the EDS cannot make a decision (so-called “exceptions”), such as bags containing solid objects that cannot be penetrated by the EDS x-rays, mis-tracked bags, and bags that are poorly positioned in the EDS in such a way that the EDS cannot interrogate the entire bag (“cut bags”). These exceptions are sent directly to the baggage inspection room without the opportunity for a screener to evaluate the image and clear the bag.
• The second population includes bags whose contents include items that are misidentified by the EDS as potential threat items, for example, when the item’s properties fall within the window defined for threat items, or multiple items are mistakenly aggregated into a single object that meets the criteria for a potential threat item.
Without systematic data that can be used to establish how much each population of bags contributes to the overall false alarm rate, or what the specific causes of false alarms are within each population, it is difficult to know what the right course of action is.

Recommendation: The Transportation Security Administration should track broad categories of bags with the goal of understanding how each category contributes to the cost of resolving false alarms. Categories should include the following: the number of bags scanned, the number of bags declared exceptions, the number of bags declared potential threats by the EDS and cleared by the screener using the on-screen alarm-resolution protocol, and the number of bags declared potential threats by the EDS and sent by the screener to the baggage-inspection room for further inspection.

Tracking these data over multiple airports and multiple seasons would give the TSA a better overall understanding of the cost drivers contributing to the false alarm rate.

Also, it is possible that the wide range of false-positive images that current screening practices detect could be usefully partitioned into a manageable number of classes of baggage items (e.g., cosmetics, foodstuffs, or metal) and non-bag-related causes (e.g., algorithm issues, losing track of bags during the screening process, or hardware faults). Again, better data from the entire screening process are needed to assess the merit of this approach.

In short, better data would go a long way toward improving TSA’s understanding of the causes of false alarms and allow for a more structured approach toward reducing them. Put another way, without a better understanding of how well (or poorly) the systems are working, it is difficult to make improvements.
The relevant data required to answer such questions include information about false positives (including data captured by the EDS machines themselves, data about on-screen alarm resolution, and data about resolution through manual screening) and information about false negatives (including results of red-team testing and actual adverse events).
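As an illustration only, the four bag categories named in the recommendation above could be tallied and turned into rates and cost shares along the following lines. This is a minimal sketch, not a description of any TSA system: the names `BagCounts` and `summarize`, the two per-bag cost parameters, and the assumption that a bag sent to the inspection room incurs both an on-screen and a manual cost are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BagCounts:
    """Tallies for the four categories named in the recommendation."""
    scanned: int             # all bags scanned by the EDS
    exceptions: int          # bags the EDS could not decide on
    cleared_on_screen: int   # EDS alarms cleared by the screener on screen
    sent_to_inspection: int  # EDS alarms sent on to the inspection room

    def __post_init__(self):
        parts = (self.exceptions, self.cleared_on_screen, self.sent_to_inspection)
        if self.scanned <= 0 or min(parts) < 0:
            raise ValueError("scanned must be positive and counts non-negative")
        if sum(parts) > self.scanned:
            raise ValueError("flagged bags cannot exceed bags scanned")


def summarize(counts, cost_on_screen, cost_manual):
    """Return per-category rates and each category's share of resolution cost.

    cost_on_screen and cost_manual are hypothetical per-bag costs.
    Assumption: exceptions skip on-screen review and go straight to manual
    inspection, as the text describes; bags sent on by the screener incur
    both costs.
    """
    manual = counts.exceptions + counts.sent_to_inspection
    costs = {
        "exceptions": counts.exceptions * cost_manual,
        "cleared_on_screen": counts.cleared_on_screen * cost_on_screen,
        "sent_to_inspection": counts.sent_to_inspection * (cost_on_screen + cost_manual),
    }
    total = sum(costs.values())
    shares = {name: (value / total if total else 0.0) for name, value in costs.items()}
    return {
        "exception_rate": counts.exceptions / counts.scanned,
        "eds_alarm_rate": (counts.cleared_on_screen + counts.sent_to_inspection) / counts.scanned,
        "manual_inspection_rate": manual / counts.scanned,
        "cost_share": shares,
    }
```

Calling `summarize(BagCounts(10000, 200, 1500, 300), cost_on_screen=1.0, cost_manual=10.0)` with these made-up numbers would show which category dominates the cost of alarm resolution; computing the same summary per airport and per season is one way to compare cost drivers across sites.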

The EDS machines are already equipped to generate useful data for better understanding false positives. To be certified as an EDS, a machine is required to have the capability of recording data about the bags being scanned, including the potential threats identified by the ATR algorithm. However, there is no requirement for either the TSA or the EDS vendors to store or analyze these data. Although vendors have collected some data in the context of particular studies for TSA,⁹ officials in the TSA with whom the committee spoke indicated that larger-scale data collection and analysis are not being done.

Also, the ATR and On Screen Alarm Resolution Protocol (OSARP) data alone do not provide any information about alarms that must be resolved through manual inspection. However, in its tour of the baggage inspection room at San Francisco International Airport, the committee saw no mechanism for collecting data on the results of these inspections, nor any systematic framework for such information, such as a categorization of causes of alarms. Indeed, the committee believes that San Francisco International Airport is representative of airports throughout the United States in this respect, an observation confirmed by a TSA official who briefed the committee subsequent to its visit.

Finding: The low prevalence of true positives in an airport setting may make it nearly impossible to measure probability of detection with humans-in-the-loop without forcing true positives via red-team testing.

Finding: Discussions with TSA officials, airport personnel, and vendors indicate that some limited-scale data collection and laboratory studies have enabled the sources of false alarms to be broadly identified. However, system-wide data collection and analysis of the sort necessary to seek out the root causes and guide sustained improvements are not being done.
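The first finding above can be illustrated numerically. The sketch below uses the Wilson score interval, a standard confidence interval for a binomial proportion; the report does not prescribe this or any other particular method, and the function name `wilson_interval` and all counts in the usage note are hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95 percent confidence.
    Returns (lower, upper), clipped to the range [0, 1].
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts, for illustration only.
pd_low, pd_high = wilson_interval(18, 20)            # 18 of 20 red-team items detected
pfa_low, pfa_high = wilson_interval(2000, 10000)     # 2,000 alarms on 10,000 benign bags
```

With only a handful of true positives, the interval on probability of detection is many times wider than the interval on probability of false alarm, which is estimated from thousands of ordinary bags. This is one way to see why red-team testing is needed to obtain enough true positives for a usable estimate.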
Conclusion: Without more systematic data on the rates and specific causes of false alarms, the TSA cannot determine what changes are likely to result in reduced false alarm rates and, in fact, does not have the infrastructure in place to determine if an implemented change would result in improved performance.

Recommendation: The TSA should develop a system for sharing false-positive data with detection-equipment vendors, including ATR algorithm developers and, when reasonable, with baggage vendors. Vendors should have a clear picture of how well or poorly their own equipment and that of their competitors is operating in an airport setting.

Recommendation: The TSA should develop a categorization system to record particular causes of false alarms for baggage sent to the baggage-inspection room. TSA should develop a database to store this information and use it to monitor performance variation and trends over time.

A system for collecting, managing, and providing access to this data should be put in place, along with capabilities for viewing, reporting, and analysis, as well as export for special studies such as quantitative risk assessments (QRAs), or anomaly detection (i.e., when a sudden change occurs in the “normal” behavior of an EDS in an airport setting). Commercial off-the-shelf software for building such systems is readily available and reasonably priced, although additional investment in hardware and training will still be necessary.

Recommendation: The TSA should employ risk assessment methods to obtain a better understanding of the causes of false positives at both the system and the component level.

⁹ Representative from Reveal Imaging Technologies, Inc., presentation to the committee, April 29, 2009, Washington, D.C.

QRA could also be an effective approach to analyzing the probability of explosives in airline baggage and for assessing the effects that changes to the baggage-inspection system will have on both probability of false alarm and probability of detection.

Recommendation: The Transportation Security Administration should work with the Transportation Security Laboratory to collect and analyze field data in order to characterize the overall performance of the system by computing statistically valid estimates of probability of detection and probability of false alarm for today’s CT-based EDSs. These analyses should also be used to better understand the sources of false positives by determining the dependence of these probabilities on material characteristics of potential explosives threats, the variability in the material characteristics, and the characteristics of non-threat materials typically contained in checked bags. These estimates should then be used as baselines for determining the ability of potential improvements to reduce false alarms.

Recommendation: In addition to collecting performance data on a routine basis, the TSA should, from time to time, conduct special studies and experiments for the purpose of obtaining additional information that would be useful for improving the baggage-inspection processes.

In light of the limited information available today, the committee recommends that the TSA limit its own spending on replacement equipment to allow for learning to inform future expenses.

Recommendation: The TSA should not fund an overall replacement of fielded explosive detection systems, because replacing all the units in service with currently available technology would not allow for learning in an airport setting to inform future performance improvements.
Instead, the TSA should plan its capital spending for explosives detection improvements over a period of time sufficient to allow several generations of technology to be fielded on a limited basis, evaluated, and iteratively improved, thus leading to a gradual improvement in the overall field performance of CT-based explosives detection systems.

However, once the sources of false alarms are better understood, the investment made in the data collection and analysis has the potential to result in a high rate of return based on a targeted approach to false alarm reduction. Table S-1 outlines some of these options.

TABLE S-1 Potential Solutions to the False Positive Rate Based on Cause

Cause | Possible Solutions
Poor image quality (streaking, agglomeration, etc.) | Improvements to the post-processing algorithm; Image standardization; Slowing scan speed; Additional scans
Algorithm sensitivity | Additional scans; Adjusting operating point on the receiver operating characteristic curve
Overlap between threat and non-threat materials | Dual-energy scans; Supplement with orthogonal technologies such as mass spectrometry or x-ray diffraction
Exceptions (cut bag, time-out) | Additional scans; Shield alarm; Supplement with orthogonal technologies such as mass spectrometry
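One entry in Table S-1, adjusting the operating point on the receiver operating characteristic (ROC) curve, can be sketched in code. The example below is a toy illustration under stated assumptions: it supposes the detection algorithm emits a single numeric score per bag and alarms when the score meets a threshold, which is a simplification and not a description of any fielded ATR algorithm. The function names and all scores are hypothetical.

```python
def roc_points(threat_scores, benign_scores):
    """Sweep every observed score as a threshold (alarm when score >= threshold).

    Returns a list of (threshold, probability_of_detection,
    probability_of_false_alarm) tuples, estimated from the samples given.
    """
    if not threat_scores or not benign_scores:
        raise ValueError("both score lists must be non-empty")
    thresholds = sorted(set(threat_scores) | set(benign_scores))
    points = []
    for t in thresholds:
        pd = sum(s >= t for s in threat_scores) / len(threat_scores)
        pfa = sum(s >= t for s in benign_scores) / len(benign_scores)
        points.append((t, pd, pfa))
    return points


def pick_operating_point(points, min_pd):
    """Choose the point with the lowest false alarm rate whose detection
    probability is at least min_pd; ties go to the higher detection rate."""
    feasible = [p for p in points if p[1] >= min_pd]
    if not feasible:
        raise ValueError("no threshold meets the required detection probability")
    return min(feasible, key=lambda p: (p[2], -p[1]))


# Hypothetical scores, for illustration only.
threat = [0.9, 0.8, 0.7, 0.4]
benign = [0.1, 0.2, 0.3, 0.5, 0.6]
threshold, pd, pfa = pick_operating_point(roc_points(threat, benign), min_pd=0.75)
```

Holding `min_pd` fixed while minimizing the false alarm rate reflects the report's constraint that false positives be reduced without increasing false negatives. Raising `min_pd` in this toy data forces a lower threshold and a higher false alarm rate, which is the trade-off the ROC curve makes visible.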

On November 19, 2001, the Transportation Security Administration (TSA) was created as a separate entity within the U.S. Department of Transportation through the Aviation and Transportation Security Act. The act also mandated that all checked baggage on U.S. flights be scanned by explosive detection systems (EDSs) for the presence of threats. These systems needed to be deployed quickly and universally, but could not be made available everywhere. As a result, the TSA emphasized the procurement and installation of certified systems where EDSs were not yet available. Computed tomography (CT)-based systems became the certified method or placeholder for EDSs. CT systems cannot detect explosives but instead create images of potential threats that can be compared to criteria to determine if they are real threats. The TSA has placed great emphasis on a high level of detection in order to limit false negatives, or missed detections. As a result, there is an abundance of false positives, or false alarms.

In order to get a better handle on these false positives, the National Research Council (NRC) was asked to examine the technology of current aviation-security EDSs and the false positives produced by this equipment. The ad hoc committee assigned to this task examined and evaluated the cases of false positives in the EDSs, assessed the impact of false positive resolution on personnel and resource allocation, and made recommendations on investigating false positives without increasing false negatives. To complete its task, the committee held four meetings, in which it observed security measures at the San Francisco International Airport and heard from employees of DHS and the TSA.
Engineering Aviation Security Environments--Reduction of False Alarms in Computed Tomography-Based Screening of Checked Baggage is the result of the committee's investigation. The report includes key conclusions and findings, an overview of EDSs, and recommendations made by the committee.
