put most bluntly, in a discussion of the MD-IBIS hit that yielded a criminal conviction, by a critic of the current implementation, “If you don’t use the system … it isn’t going to work” (quoted in Butler, 2005). The utility of state-level RBIDs will depend on how often the database is actually queried in the conduct of investigations and how investigative leads are followed up. The design of the current databases, and the need to ensure a firewall from NIBIN data due to the legal restrictions on NIBIN content, have made the databases inconvenient to search: exhibits must be transported to specific facilities for acquisition and comparison. To that end, mechanisms for encouraging searches of state RBIDs by law enforcement agencies in the same state or region should be developed and the results evaluated. To the extent that law permits and arrangements can be made, broader research involving the merging and comparison of state-level RBID images with NIBIN-type evidence would also be valuable.
Throughout this appendix, we restrict the discussion to cartridge casings; however, the same problem formulation would apply to bullets.
Suppose one has a database that consists of N images of casings, where N is a large number. These images may correspond to D different types of (new) guns. For each gun type d, there are n_d different images, from different guns of the same type, various gun and ammunition combinations, etc. So the database has a total of N = n_1 + n_2 + ... + n_D images. Consider now a newly acquired casing from a crime scene. One wants to compare the image of the new casing with the N images in the database and find the best K matches. The top K matches will then be scrutinized by a firearms examiner, and a direct physical comparison will be made to verify any hits.
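The retrieval step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `top_k_matches` and `toy_score`, the representation of each image as a short tuple of numbers, and the distance-based score are all hypothetical and are not the comparison algorithm of any actual imaging system.

```python
import heapq


def toy_score(query, features):
    """Hypothetical similarity score: negative squared distance (higher is more similar)."""
    return -sum((q - f) ** 2 for q, f in zip(query, features))


def top_k_matches(query, database, k, score=toy_score):
    """Return the k database entries (id, features) with the highest score against query."""
    return heapq.nlargest(k, database, key=lambda entry: score(query, entry[1]))


if __name__ == "__main__":
    # Hypothetical database of N = 4 casing "images", each reduced to two numbers.
    database = [
        ("c1", (0.0, 0.0)),
        ("c2", (1.0, 0.9)),
        ("c3", (5.0, 5.0)),
        ("c4", (1.2, 1.5)),
    ]
    new_casing = (1.0, 1.0)
    for casing_id, _ in top_k_matches(new_casing, database, k=2):
        print(casing_id)
```

Only the K entries returned by such a ranking would be passed to the firearms examiner for direct physical comparison.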
Assume that the database does in fact contain a casing fired from the particular crime gun. Then, the statistical feasibility of the problem depends on whether the correct image will be among the top K matches, when K is a reasonably small number (top 10, top 50, or even top 100) even though N, the size of the database, is very large—on the order of millions.
Specifically, some of the statistical questions of interest are:
What is the probability that the correct image from the database (the one that corresponds to the crime gun) will be in the top K? How does this probability decrease with N? What are the critical factors that affect it?
How large should K = K(α) be to ensure that the correct image is in the top K with probability at least (1 − α)? How does this depend on the size of the database and other factors?
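Both questions can be explored numerically with a small Monte Carlo sketch. The model below is an assumption made purely for illustration and is not taken from this appendix: scores of non-matching images are drawn independently from a standard normal distribution, and the score of the true match is drawn from a normal distribution shifted upward by a hypothetical separation parameter `delta`. The function names are likewise hypothetical, and the database sizes are kept far smaller than millions so that the simulation runs quickly.

```python
import math
import random


def simulate_ranks(n, delta, trials=500, seed=0):
    """Simulate the rank of the true match among n database images.

    Assumed model: non-match scores ~ Normal(0, 1), true-match score ~ Normal(delta, 1).
    """
    rng = random.Random(seed)
    ranks = []
    for _ in range(trials):
        match_score = rng.gauss(delta, 1.0)
        # Count the non-matching images that outscore the true match.
        outscored_by = sum(
            1 for _ in range(n - 1) if rng.gauss(0.0, 1.0) > match_score
        )
        ranks.append(1 + outscored_by)
    return ranks


def prob_in_top_k(ranks, k):
    """Empirical probability that the true match is among the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


def smallest_k(ranks, alpha):
    """Smallest k whose empirical top-k probability is at least 1 - alpha."""
    ordered = sorted(ranks)
    needed = math.ceil((1 - alpha) * len(ordered))
    return ordered[max(needed, 1) - 1]


if __name__ == "__main__":
    for n in (100, 1000, 5000):
        ranks = simulate_ranks(n, delta=2.0)
        print(n, prob_in_top_k(ranks, k=10), smallest_k(ranks, alpha=0.10))
```

Under this assumed model, the top-K probability for fixed K falls as N grows, and the K required for a given α grows with N. How fast depends on the separation between the matching and non-matching score distributions, which is one of the critical factors the first question asks about.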