Appendix C

Committee Evaluation of Statistical Analysis Report

The Statistical Analysis Report (B2M10) was submitted in response to a contract with the FBI, to analyze the results of the assays on the 1,070 FBI Repository (FBIR) samples and determine whether the results of the assays could be related to those obtained on the evidentiary material. If such a relationship could be identified, a secondary issue was the development of a measure of its “statistical strength.” Two of the attack letters assayed positive for the four mutations A1, A3, D, and E. Results on 1,059 of the 1,070 samples were tabulated in the Statistical Analysis Report. Eight samples tested positive for all four mutations; seven of these eight samples came from one institution (USAMRIID) and the remaining sample came from a different institution (Battelle Memorial Institute [BMI]). A table of documented transfers of samples from one institution to another showed a transfer of sample material from the first institution (USAMRIID) to the second institution (BMI). This Appendix discusses the validity of the inferences and calculations in the Statistical Analysis Report submitted to the FBI.

As noted in Chapter 6, the statistical analyses used in the report (e.g., 95 percent confidence interval for the proportion of samples with four mutations, chi-squared tests of independence) require two key assumptions to be valid:

1.    Representativeness: The 1,059 samples are assumed to be a representative and random collection of samples from some well-defined population of samples.

2.    Independence: The 1,059 samples are assumed to be independent of one another (i.e., have no connection with each other, beyond that they all come from the same population).

The Statistical Analysis Report acknowledges that neither assumption can be validated from these data. The committee agrees with this assessment. As a consequence, many of the statistical methods applied to these data cannot be



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

validated. The consequences of the violation of these assumptions and their impacts are listed below.

1.    FBIR is not a representative and random collection of samples from a well-defined population of B. anthracis samples.

The 1,059 samples do not appear to satisfy assumption 1. They were obtained in response to a request from the FBI. No information is available on samples in the population that were not submitted. In fact, the “target population” seems not to have been defined. It could be the population of all unique preparations of B. anthracis Ames in the United States, or in the world, or from selected institutions. The absence of a definition of “well-defined population” makes it difficult to assess representativeness of the collection. The elimination of samples that had “inconclusive” results on assays also appears to be nonrandom, as some institutions had many more “inconclusive” assays than others.

2.    The 1,059 samples in the FBIR are not independent.

The FBI submitted to the committee a table of known transfers of samples between institutions. Hence, the second assumption is violated. Thus, the results of the chi-squared tests for independence of the mutations that are calculated in the report are not meaningful. Further, the confidence interval for the proportion 8/947 is not appropriate. The correct denominator for this proportion is likely not 947. A more accurate numerator and denominator might refer to the number of known independent preparations rather than the number of samples, but such information may not be possible to obtain.

3.    Violation of assumptions renders invalid the inferences from the statistical analyses.

Because the FBIR is not a representative and random collection of independent samples, the results on the assays from the repository may be biased. Virtually all statistical procedures assume that the units on which measurements are made comprise a random, representative collection from the target population. (The effects of biased sampling on inferences have been well documented; see, e.g., Freedman et al., 2007). Without an appropriate model that characterizes the nonrepresentativeness and the degree of dependence among the samples, it is not possible to calculate a meaningful measure of “statistical significance” in the results.
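For reference, the arithmetic behind the 8/947 interval mentioned in item 2 can be sketched as follows. This is a minimal illustration, not the report's code: it assumes the interval was built with the Clopper-Pearson ("exact") binomial construction, and the function names are hypothetical. It reproduces the calculation only; it inherits the representativeness and independence assumptions that the committee considers violated, so the output carries no more inferential weight than the original interval.

```python
from math import exp, lgamma, log, log1p


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), 0 < p < 1, summed in log space."""
    log_p, log_q = log(p), log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        total += exp(log_coeff + i * log_p + (n - i) * log_q)
    return total


def bisect_root(f, lo=0.0, hi=1.0, iters=100):
    """Root of f on (lo, hi), assuming f > 0 near lo and f < 0 near hi."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k, n, alpha=0.05):
    """Two-sided Clopper-Pearson interval for a binomial proportion k/n.

    ASSUMPTION: this is the method behind the report's interval; the
    appendix does not spell out the construction.
    """
    # Lower limit: the p at which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: alpha / 2 - (1.0 - binom_cdf(k - 1, n, p)))
    # Upper limit: the p at which P(X <= k) equals alpha/2.
    upper = 1.0 if k == n else bisect_root(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


# 8 four-positive samples among the 947 with definitive assay results.
low, high = clopper_pearson(8, 947)
print(f"8/947 = {8 / 947:.4f}, 95% interval ({low:.4f}, {high:.4f})")
```

Changing the denominator in the last call (for example, to a count of independent preparations, if one were ever available) shows how sensitive the interval is to the choice that item 2 questions.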

4.    Results on 112 samples beyond the 947 samples

The Statistical Analysis Report eliminated from most of its tables the results of the assays on 112 samples that showed “inconclusive” for A1, A3, MRI-D, or E. Twenty-one of these 112 samples that were eliminated from the statistical analysis assayed positive for 1, 2, or 3 mutations. Table C-1 lists these samples. (Five samples are listed twice because they were reported as “inconclusive” or “variant” on two assays: 005-022, 049-014, 053-014, 053-068, and 054-008.)

TABLE C-1 Samples with Positive and “Inconclusive” or “Variant” Assays

FBIR Number  A1   A3   MRI-D  IITRI-D  E  +Mutations
039-010      inc  +    -      -        -  A3
044-034      var  -    -      +        -  IITRI-D
049-014      inc  var  +      +        -  MRI-D, IITRI-D
053-004      var  +    -      -        +  A3, E
053-010      var  +    +      +        +  A3, MRI-D, IITRI-D, E
053-014      var  inc  -      -        +  E
053-068      inc  inc  +      -        -  MRI-D
054-008      inc  +    +      inc      +  A3, MRI-D, E
061-030      inc  -    +      +        -  MRI-D, IITRI-D
066-015      inc  inc  +      +        -  MRI-D, IITRI-D
005-022      +    var  -      inc      +  A1, E
017-006      -    var  +      +        -  MRI-D, IITRI-D
049-014      inc  var  +      +        -  MRI-D, IITRI-D
049-018      -    var  +      +        -  MRI-D, IITRI-D
053-014      var  inc  -      -        +  E
053-068      inc  inc  +      -        -  MRI-D
054-066      +    inc  +      +        +  A1, MRI-D, IITRI-D, E
054-068      -    var  +      -        +  MRI-D, E
005-020      -    -    +      inc      -  MRI-D
005-022      +    var  -      inc      +  A1, E
043-016      -    -    +      inc      -  MRI-D
044-020      -    +    -      inc      +  A3, E
052-026      +    +    +      inc      -  A1, A3, MRI-D
054-008      inc  +    +      inc      +  A3, MRI-D, E
054-022      -    -    +      inc      -  MRI-D
057-036      -    -    +      inc      -  MRI-D

inc = inconclusive
IITRI = Illinois Institute for Technology Research Institute
MRI = Midwest Research Institute
var = variant

In addition to the two 3-positive samples (+++) among the 947 samples, the four samples below also tested positive for 3 mutations (ordered by FBIR number):

FBIR Number  A1   A3   MRI-D  IITRI-D  E  +Mutations
052-026      +    +    +      inc      -  A1, A3, MRI-D
053-010      var  +    +      +        +  A3, MRI-D, IITRI-D, E
054-008      inc  +    +      inc      +  A3, MRI-D, E
054-066      +    inc  +      +        +  A1, MRI-D, IITRI-D, E

The following four samples revealed positive assays for 2 of the 4 mutations, in addition to the 11 samples noted among the 947 samples (ordered by FBIR number):

FBIR Number  A1   A3   MRI-D  IITRI-D  E  +Mutations
005-022      +    var  -      inc      +  A1, E
044-020      -    +    -      inc      +  A3, E
053-004      var  +    -      -        +  A3, E
054-068      -    var  +      -        +  MRI-D, E

DILUTION EXPERIMENTS

Dilution experiments were conducted to assess the sensitivity of the assays to various concentrations. Thirty samples were prepared from RMR-1029 at dilution 10.0. As with the other samples, some of the assays were “inconclusive.” Genotype E tested positive in all 30 samples; all 4 mutations tested positive for 16 samples. But in the remaining 14 samples, assays for one or more of the genotypes were negative. In fact, one sample tested negative for A1, A3, and D; it was positive for only E. Five samples were positive for two mutations only (A3 and E), and eight samples were positive for only three of the four mutations (7 for A3, D, E; 1 for A1, A3, E). Thus, 6 of the 30 replicate samples (20 percent) tested positive for only 1 or 2 of the mutations. Given that 50 of the 947 FBIR samples showed only 1 positive, and 11 of the 947 showed only 2 positives, this variation indicates that some of the samples may have harbored mutations that went undetected. Absent any repeat testing of these samples, however, it is difficult to know how such false negatives might have affected the inferences.

Additional experiments were conducted on RMR-1029 and another sample, “SPS.266 Tube#5,” at 10 dilution levels (10.1, . . ., 10.10). The results of the three replicates at each dilution level, for each of the five genotypes, for samples from both RMR-1029 and SPS.266 Tube#5 were reported in Chapter 6. Variability in the results on replicates, even from the same sample at the same dilution level, demonstrates the value of, and need for, replicate testing. For example, the results on the three replicates from RMR-1029 at dilution 10.1, ordered as A1, A3, MRI-D, IITRI-D, E, were: (- + + + +), (- + + + -), (+ + + + -). Clearly, dilution affects the assay result: the greater the dilution, the more likely the assay is negative. Moreover, it is perhaps unexpected that greater dilutions sometimes give positive results when not all replicates at lesser dilutions did so.

CONCORDANCE OF TESTS FROM IITRI-D AND MRI-D

The FBI retained both the Illinois Institute for Technology Research Institute (IITRI) and Midwest Research Institute (MRI) to conduct the D assays. Because the assays on the 1,059 samples can be considered to be independent between IITRI and MRI, the Statistical Analysis Report (Table 3, p. 7, as presented below) tabulates the results of the D assays from the two facilities (rows give the MRI-D result, columns the IITRI-D result):

                             IITRI-D
MRI-D         Inconclusive  Negative  No growth  Pending  Positive  Total
Inconclusive  0             22        12         0        0         34
Negative      17            909       1          1        12        940
Negative-u    1             20        0          0        0         21
Positive      6             12        0          0        46        64
TOTAL         24            963       13         1        58        1,059

The Statistical Analysis Report combined the “negative-u” results with the “negative” results, and eliminated the 12 samples that showed “no-growth” by IITRI-D and “inconclusive” by MRI-D as well as the one “pending” sample, to yield the following table:

                       IITRI-D
MRI-D         Inconclusive  Negative  Positive  Total
Inconclusive  0             22        0         22
Negative      18            929       12        959
Positive      6             12        46        64
TOTAL         24            963       58        1,045

Eliminating the 14 “no-growth” and “pending” samples, the concordance rate (the diagonal of the table: 0 + 929 + 46 = 975 samples) is 975/1045 = 0.933, with a 95 percent confidence interval (0.916, 0.947). Thus, the agreement between the facilities is unlikely to be lower than 91.6 percent and likely does not exceed 94.7 percent. Of greater interest, however, are the 12 samples that were positive by IITRI-D but negative for MRI-D, the 12 samples that were negative by IITRI-D but positive by MRI-D, and the six samples that were positive by MRI-D but inconclusive by IITRI-D. While concordance is informative, these 30 samples with discordant results might provide increased information about the samples and the assay process. On the other hand, we also know from the repeated assays of the dilution series that some discordance also arises owing to variation even when using the same assay procedure. In any case, because genotype D is the only one of the four genotypes that was subjected to independent testing by a second organization, one cannot say whether the results on the other genotypes might have been different if they also had been subjected to independent testing.

“SIGNIFICANCE” OF SEVEN (++++) SAMPLES FROM INSTITUTION F

The Statistical Analysis Report notes in its conclusions:

“In summation, though the random chance of occurrence of the sample type (++++) is 8 out of 947 (i.e., 0.84%) with exact 95% confidence interval of 0.0037 to 0.0166 (i.e., from 1 in 270 to 1 in 60), this sample type has been found in only two institutions thus far sampled (USAMRIID and BMI), and its occurrence in BMI is explained by a recent sample transfer from USAM to BMI, since there is no documented record of sample transfers in the other direction.” (p. 2)

As noted in Chapter 6, 598 of the 947 samples (63 percent) came from Institution F. (Twelve of the institutions submitted 6 or fewer samples; 4 institutions submitted 15-31 samples, and 4 institutions submitted 49-74 samples.) Therefore, one would not be surprised to find more “mutation-positive” samples from Institution F than, say, from Institution B (which contributed only one sample). One might naturally ask: How unusual is the occurrence of seven “4-mutation” samples, or even all eight, from Institution F? Given that Institution F contributed almost 2/3 of the 947 samples, how many of the 4-positive (++++) samples would Institution F receive if the 4-mutation samples were distributed completely at random? The answer to this question is given by the probabilities of observing 0 or 1 or 2 or ... or 8 of the eight (++++) samples from Institution F, given that Institution F submitted 598 of the 947 samples that yielded definitive results on the A1, A3, MRI-D, and E assays. These probabilities (from the hypergeometric probability distribution; Johnson et al., 2005) are shown in Table C-2.

TABLE C-2 Probabilities of k 4-Mutation Samples in Institution F

k            0       1       2       3       4       5       6       7       8
Probability  0.0003  0.0045  0.0276  0.0955  0.2058  0.2826  0.2415  0.1174  0.0248

This table shows that the chance of Institution F having ended up with seven or eight of the eight (++++) samples is (0.1174 + 0.0248) = 0.1422, or about 1/7. Therefore, while the observed data showing that seven of the eight (++++) samples appeared in Institution F is not completely typical, it also could hardly be considered extreme.
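The entries of Table C-2 can be recomputed directly from the counts given in the text. The sketch below is illustrative only; the function and variable names are hypothetical, and it assumes the setup described above: 947 samples with definitive results, 598 of them from Institution F, and eight (++++) samples allocated completely at random among the 947.

```python
from math import comb


def hypergeom_pmf(k, population, marked, draws):
    """P(exactly k of `draws` items are marked) when drawing without
    replacement from `population` items of which `marked` are marked."""
    return (comb(marked, k) * comb(population - marked, draws - k)
            / comb(population, draws))


N_SAMPLES = 947        # samples with definitive A1, A3, MRI-D, and E results
N_FROM_F = 598         # of those, samples submitted by Institution F
N_FOUR_POSITIVE = 8    # samples positive for all four mutations

# Probability that k of the eight (++++) samples fall in Institution F.
probs = [hypergeom_pmf(k, N_SAMPLES, N_FROM_F, N_FOUR_POSITIVE)
         for k in range(N_FOUR_POSITIVE + 1)]

for k, p in enumerate(probs):
    print(f"k = {k}: {p:.4f}")

# Chance of seven or eight of the eight landing in Institution F.
tail = probs[7] + probs[8]
print(f"P(k >= 7) = {tail:.4f}")
```

The printed values can be compared against the row of probabilities in Table C-2 and the 0.1422 figure quoted in the text. As with the rest of the analysis, the calculation treats the samples as exchangeable, which the committee notes may not hold.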
