4 The Principles of Science and Interpreting Scientific Data

"Scientific method refers to the body of techniques for investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge. It is based on gathering observable, empirical and measurable evidence subject to specific principles of reasoning."

Isaac Newton (1687, 1713, 1726), "Rules for the study of natural philosophy," Philosophiae Naturalis Principia Mathematica

Forensic science actually is a broad array of disciplines, as will be seen in the next chapter. Each has its own methods and practices, as well as its strengths and weaknesses. In particular, each varies in its level of scientific development and in the degree to which it follows the principles of scientific investigation. Adherence to scientific principles is important for concrete reasons: they enable the reliable inference of knowledge from uncertain information, which is exactly the challenge faced by forensic scientists. Thus, the reliability of forensic science methods is greatly enhanced when those principles are followed. As Chapter 3 observes, the law's admission of and reliance on forensic evidence in criminal trials depends critically on (1) the extent to which a forensic science discipline is founded on a reliable scientific methodology, leading to accurate analyses of evidence and proper reports of findings, and (2) the extent to which practitioners in those forensic science disciplines that rely on human interpretation adopt procedures and performance standards that guard against bias and error. This chapter discusses the ways in which science more generally addresses those goals.
STRENGTHENING FORENSIC SCIENCE IN THE UNITED STATES

FUNDAMENTAL PRINCIPLES OF THE SCIENTIFIC METHOD

The scientific method presumes that events occur in consistent patterns that can be understood through careful comparison and systematic study. Knowledge is produced through a series of steps during which data are accumulated methodically, strengths and weaknesses of information are assessed, and knowledge about causal relationships is inferred. In the process, scientists also develop an understanding of the limits of that knowledge (such as the precision of the observations), the inferred nature of relationships, and key assumptions behind the inferences. Hypotheses are developed, are measured against the data, and are either supported or refuted. Scientists continually observe, test, and modify the body of knowledge. Rather than claiming absolute truth, science approaches truth either through breakthrough discoveries or incrementally, by testing theories repeatedly. Evidence is obtained through observations and measurements conducted in the natural setting or in the laboratory. In the laboratory, scientists can control and vary the conditions in order to isolate exclusive effects and thus better understand the factors that influence certain outcomes. Typically, experiments or observations must be conducted over a broad range of conditions before the roles of specific factors, patterns, or variables can be understood. Methods to reduce errors are part of the study design, so that, for example, the size of the study is chosen to provide sufficient statistical power to draw conclusions with a high level of confidence or to understand factors that might confound results. Throughout scientific investigations, the investigator must be as free from bias as possible, and practices are put in place to detect biases (such as those from measurements or human interpretation) and to minimize their effects on conclusions.
Ultimately, the goal is to construct explanations ("theories") of phenomena that are consistent with broad scientific principles, such as the laws of thermodynamics or of natural selection. These theories, and investigations of them through experiments and observed data, are shared through conferences, publications, and collegial interactions, which push the scientist to explain his or her work clearly and which raise questions that might not have been considered. The process of sharing data and results requires careful recordkeeping, reviewed by others. In addition, the need for credibility among peers drives investigators to avoid conflicts of interest. Acceptance of the work comes as results and theories continue to hold, even under the scrutiny of peers, in an environment that encourages healthy skepticism. That scrutiny might extend to independent reproduction of the results or experiments designed to test the theory under different conditions. As credibility accrues to data and theories, they become accepted as established fact and become the "scaffolding" upon which other investigations are constructed.
This description of how science creates new theories illustrates key elements of good scientific practice: precision when defining terms, processes, context, results, and limitations; openness to new ideas, including criticism and refutation; and protections against bias and overstatement (going beyond the facts). Although these elements have been discussed here in the context of creating new methods and knowledge, the same principles hold when applying known processes or knowledge. In day-to-day forensic science work, the process of formulating and testing hypotheses is replaced with the careful preparation and analysis of samples and the interpretation of results. But that applied work, if done well, still exhibits the same hallmarks of basic science: the use of validated methods and care in following their protocols; the development of careful and adequate documentation; the avoidance of biases; and interpretation conducted within the constraints of what the science will allow.

Validation of New Methods

One particular task of science is the validation of new methods to determine their reliability under different conditions and their limitations. Such studies begin with a clear hypothesis (e.g., "new method X can reliably associate biological evidence with its source"). An unbiased experiment is designed to provide useful data about the hypothesis. Those data (measurements collected through methodical, prescribed observations under well-specified and controlled conditions) are then analyzed to support or refute the hypothesis. The thresholds for supporting or refuting the hypothesis are clearly articulated before the experiment is run. The most important outcomes from such a validation study are (1) information about whether or not the method can discriminate the hypothesis from an alternative, and (2) assessments of the sources of errors and their consequences for the decisions returned by the method.
These two outcomes combine to provide precision and clarity about what is meant by "reliably associate." For a method that has not been subjected to previous extensive study, a researcher might design a broad experiment to assist in gaining knowledge about its performance under a range of conditions. Those data are then analyzed for any underlying patterns that may be useful in planning or interpreting tests that use the new method. In other situations, a process already has been formulated from existing experimental data, knowledge, and theory (e.g., "biological markers A, B, and C can be used in DNA forensic investigations to pair evidence with a suspect"). To confirm the validity of a method or process for a particular purpose (e.g., for a forensic investigation), validation studies must be performed.

The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) developed a joint document,
"General requirements for the competence of testing and calibration laboratories" (commonly referred to as "ISO 17025"), which includes a well-established list of techniques that can be used, alone or in combination, to validate a method:

• calibration using reference standards or reference materials;
• comparison of results achieved with other methods;
• interlaboratory comparisons;
• systematic assessment of the factors influencing the result; and
• assessment of the uncertainty of the results based on scientific understanding of the theoretical principles of the method and practical experience.

A critical step in such validation studies is their publication in peer-reviewed journals, so that experts in the field can review, question, and check the repeatability of the results. These publications must include clear statements of the hypotheses under study, as well as sufficient details about the experiments, the resulting data, and the data analysis so that the studies can be replicated. Replication will expose not only additional sources of variability but also further aspects of the process, leading to greater understanding and scientific knowledge that can be used to improve the method. Methods that are specified in more detail (such as DNA analysis, where particular genetic loci are to be compared) will have greater credibility and also are more amenable to systematic improvement than those that rely more heavily on the judgments of the investigator. The validation of results over time increases confidence. Moreover, the scientific culture encourages continued questioning and improvement. Thus, the relevant scientific community continues to check that established results still hold under new conditions and that they continue to hold in the face of new knowledge.
[Footnote: Quoted from Section 5.4.5.2 (Note 2) of ISO/IEC 17025, "General requirements for the competence of testing and calibration laboratories" (2nd ed., May 15, 2005).]

The involvement of graduate student researchers in scientific research contributes greatly to this diligence, because part of their education is to read carefully and to question so-called established methods. This culture leads to continued reexamination of past research and hence increased knowledge.

In the case of DNA analysis, studies have evaluated the precision, reliability, and uncertainties of the methods. This knowledge has been used to define standard procedures that, when followed, lead to reliable evidence. For example, below is a brief sample of the specifications required by the Federal Bureau of Investigation's (FBI's) Quality Assurance Standards for
Forensic DNA Testing Laboratories in order to ensure reliable DNA forensic analysis:

• Testing laboratories must have a standard operating protocol for each analytical technique used, specifying reagents, sample preparation, extraction, equipment, and controls that are standard for DNA analysis and data interpretation.
• The laboratory shall monitor the analytical procedures using appropriate controls and standards, including quantitation standards that estimate the amount of human nuclear DNA recovered by extraction, positive and negative amplification controls, and reagent blanks.
• The laboratory shall check its DNA procedures annually or whenever substantial changes are made to the protocol(s) against an appropriate and available NIST standard reference material or standard traceable to a NIST standard.
• The laboratory shall have and follow written general guidelines for the interpretation of data.
• The laboratory shall verify that all control results are within established tolerance limits.
• Where appropriate, visual matches shall be supported by a numerical match criterion.
• For a given population(s) and/or hypothesis of relatedness, the statistical interpretation shall be made following the recommendations 4.1, 4.2, or 4.3 as deemed applicable of the National Research Council report entitled The Evaluation of Forensic DNA Evidence (1996) and/or a court-directed method. These calculations shall be derived from a documented population database appropriate for the calculation.

This level of specificity is consistent with the spirit of the guidelines presented in ISO 17025. The second edition (May 15, 2005) of those guidelines includes the following minimum set of information for properly specifying the process of any new analytical method:

(a) appropriate identification;
(b) scope;
(c) description of the type of item to be tested or calibrated;

[Footnote: DNA Advisory Board. 2000. Forensic Science Communications 2(3). Available at www.bioforensics.com/conference04/TWGDAM/Quality_Assurance_Standards_2.pdf.]
[Footnote: Paraphrased from Section 9 of the FBI's Quality Assurance Standards for Forensic DNA Testing Laboratories.]
(d) parameters or quantities and ranges to be determined;
(e) apparatus and equipment, including technical performance requirements;
(f) reference standards and reference materials required;
(g) environmental conditions required and any stabilization period needed;
(h) description of the procedure, including
    - affixing of identification marks, handling, transporting, storing and preparation of items;
    - checks to be made before the work is started;
    - checks that the equipment is working properly and, where required, calibration and adjustment of the equipment before each use;
    - the method of recording the observations and results;
    - any safety measures to be observed;
(i) criteria and/or requirements for approval/rejection;
(j) data to be recorded and method of analysis and presentation;
(k) the uncertainty or the procedure for estimating uncertainty.

Uncertainty and Error

Scientific data and processes are subject to a variety of sources of error. For example, laboratory results and data from questionnaires are subject to measurement error, and interpretations of evidence by human observers are subject to potential biases. A key task for the scientific investigator designing and conducting a scientific study, as well as for the analyst applying a scientific method to conduct a particular analysis, is to identify as many sources of error as possible, to control or to eliminate as many as possible, and to estimate the magnitude of remaining errors so that the conclusions drawn from the study are valid. Numerical data reported in a scientific paper include not just a single value (point estimate) but also a range of plausible values (e.g., a confidence interval, or interval of uncertainty).

Measurement Error

As with all other scientific investigations, laboratory analyses conducted by forensic scientists are subject to measurement error.
[Footnote: Quoted from Section 5.4.4 of ISO/IEC 17025, "General requirements for the competence of testing and calibration laboratories" (2nd ed., May 15, 2005).]

Such error reflects the intrinsic strengths and limitations of the particular scientific technique. For example, methods for measuring the level of blood alcohol in an individual or methods for measuring the heroin content of a sample
can do so only within a confidence interval of possible values. In addition to the inherent limitations of the measurement technique, a range of other factors may also be present and can affect the accuracy of laboratory analyses. Such factors may include deficiencies in the reference materials used in the analysis, equipment errors, environmental conditions that lie outside the range within which the method was validated, sample mix-ups and contamination, transcriptional errors, and more.

Consider, for example, a case in which an instrument (e.g., a breathalyzer such as Intoxilyzer) is used to measure the blood-alcohol level of an individual three times, and the three measurements are 0.08 percent, 0.09 percent, and 0.10 percent. The variability in the three measurements may arise from the internal components of the instrument, the different times and ways in which the measurements were taken, or a variety of other factors. These measured results need to be reported, along with a confidence interval that has a high probability of containing the true blood-alcohol level (e.g., the mean plus or minus two standard deviations). For this illustration, the average is 0.09 percent and the standard deviation is 0.01 percent; therefore, a two-standard-deviation confidence interval (0.07 percent, 0.11 percent) has a high probability of containing the person's true blood-alcohol level. (Statistical models dictate the methods for generating such intervals in other circumstances so that they have a high probability of containing the true result.)

The situation for assessing heroin content from a sample of white powder is similar, although the quantification and limits are not as broadly standardized. The combination of gas chromatography and mass spectrometry (GC/MS) is used extensively in identifying controlled substances.
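Returning to the breathalyzer illustration, the two-standard-deviation interval can be reproduced with a few lines of Python. This is a sketch of the arithmetic only, using the hypothetical readings above, not a validated forensic procedure:

```python
import statistics

# Three hypothetical breathalyzer readings (percent blood alcohol)
readings = [0.08, 0.09, 0.10]

mean = statistics.mean(readings)           # 0.09
sd = statistics.stdev(readings)            # sample standard deviation: 0.01
interval = (mean - 2 * sd, mean + 2 * sd)  # (0.07, 0.11)

print(f"mean = {mean:.2f}%, sd = {sd:.2f}%")
print(f"two-SD interval: ({interval[0]:.2f}%, {interval[1]:.2f}%)")
```

As the text notes, the plus-or-minus-two-standard-deviations recipe is only one way to form such an interval; in other circumstances the appropriate statistical model dictates the method.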
Those analyses tend to be more qualitative (e.g., identifying peaks on a spectrum that appear at frequencies consistent with the controlled substance and which stand out above the background "noise"), although quantification is possible.

Error Rates

Analyses in the forensic science disciplines are conducted to provide information for a variety of purposes in the criminal justice process. However, most of these analyses aim to address two broad types of questions: (1) Can a particular piece of evidence be associated with a particular class of sources? and (2) Can a particular piece of evidence be associated with one particular source? The first type of question leads to "classification" conclusions. An example of such a question would be whether a particular hair specimen shares physical characteristics common to a particular ethnic group. An affirmative answer to a classification question indicates only that the item belongs to a particular class of similar items. Another example might be whether a paint mark left at a crime scene is consistent (according
to some collection of relevant measurements) with a particular paint sample in a database, from which one can infer the class of vehicle (e.g., model(s) and production year(s)) that could have left the mark. The second type of question leads to "individualization" conclusions; for example, does a particular DNA sample belong to individual X?

Although the questions addressed by forensic analyses are not always binary (yes/no) or as crisply stated as in the previous paragraph, the paradigm of yes/no conclusions is useful for describing and quantifying the accuracy with which forensic science disciplines can provide answers. In such situations, results from analyses for which the truth is known can be classified in a two-way table as follows:

               Analysis Result
  Truth        yes                    no
  yes          a (true positives)     b (false negatives)
  no           c (false positives)    d (true negatives)

The conceptual framework and terminology for evaluating the accuracy of forensic analyses is illustrated using a hypothetical example from microscopic analysis of head hair. In this situation, multiple features, both qualitative and quantitative, on each sample of hair are assessed. Qualitative features include color (e.g., blonde, brown, red), coloring (natural or treated), form (straight, wavy, curved, kinked), and texture (smooth, medium, coarse). Quantitative features include length and diameter. Undoubtedly, these features will vary from hair to hair, even from the same individual, but features that vary less for the same individual (i.e., within-individual variability) and more for different individuals (i.e., between-individual variability) are needed for purposes of class identification and discrimination. These features may also be combined in some fashion to result in some overall score, or set of scores, for each sample, and these scores are then compared with those from the target sample.
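The within-individual versus between-individual variability criterion can be made concrete with a small Python sketch. The diameter measurements and the variance-ratio screen below are invented for illustration and are not drawn from any validated hair-analysis protocol:

```python
import statistics

# Hypothetical hair-diameter measurements (micrometers) for three people.
# A useful discriminating feature varies little within a person and a lot
# between people; one simple screen is the ratio of the variance of the
# per-person means to the average within-person variance.
samples = {
    "person_A": [72, 74, 73, 71],
    "person_B": [58, 60, 59, 61],
    "person_C": [85, 83, 86, 84],
}

means = [statistics.mean(v) for v in samples.values()]
within = statistics.mean(statistics.variance(v) for v in samples.values())
between = statistics.variance(means)

print(f"within-person variance:  {within:.2f}")
print(f"between-person variance: {between:.2f}")
print(f"ratio (higher = more discriminating): {between / within:.1f}")
```

A feature with a high ratio separates individuals (or classes) well relative to its measurement noise; a ratio near 1 means the feature carries little discriminating value.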
[Footnote: More complete discussion of the questions addressed by forensic science may be found in references such as K. Inman and N. Rudin. 2002. The origin of evidence. Forensic Science International 126:11-16; and R. Cook, I.W. Evett, G. Jackson, P.J. Jones, and J.A. Lambert. 1998. A hierarchy of propositions: Deciding which level to address in casework. Science and Justice 38:231-239.]

In the final analysis, however, a binary conclusion is often required. For example, "Did this hair come from the head of a Caucasian person?" As in the case of all analyses leading to classification conclusions (e.g., diagnostic tests in medicine), the microscopic hair analysis process must be subjected to performance and validation studies in which appropriate error rates can be defined and estimated. Consider a hypothetical study in
which 100 samples (each with multiple hairs) are taken from the heads of 100 individuals from class C, and another 100 samples are taken from the heads of individuals not in class C. The analyst is asked to determine, for each of the 200 samples, whether it does or does not come from a person in class C, and the true answer is known. The validation study returns the following results:

Hypothetical Hair Analysis Validation Study

                                    Analysis of Hair Samples Indicates:
  Truth                             Class C                Not Class C            Row Total
  Sample is from Class C persons    95 (true positives)    5 (false negatives)    100
  Sample is not from Class C        2 (false positives)    98 (true negatives)    100
    persons
  Column Total                      97                     103                    200 (overall)

The accuracy of a test (here, microscopic hair analysis) can be assessed in different ways. Borrowing terminology from the evaluation of medical diagnostic tests, four characterizations and their associated measures are given below. Each one is useful in its own way: the first two emphasize the ability to detect an association; the last two emphasize the ability to predict an association:

• Among samples from persons in Class C, the fraction that is correctly identified by the test is called the "sensitivity" or the "true positive rate" (TPR) of the test. In this table, the sensitivity would be estimated as [95/(95+5)] × 100 = 95 percent.
• Among samples from persons not in Class C, the fraction that is correctly identified by the test is called the "specificity" or the "true negative rate" (TNR) of the test. In this table, the specificity would be estimated as [98/(2+98)] × 100 = 98 percent.
• Among samples classified by the test as coming from persons in Class C, the fraction that actually turns out to be from Class C is called the "positive predictive value" (PPV) of the test. In this table, the PPV would be estimated as [95/(95+2)] × 100 = 98 percent.
• Among samples classified by the test as coming from persons not in Class C, the fraction that actually turns out not to be from persons in Class C is called the "negative predictive value" (NPV) of the test. In this table, the NPV would be estimated as [98/(5+98)] × 100 = 95 percent.

The above four measures emphasize the ability of the analysis to make correct determinations. "Error rates" are defined as proportions of cases in which the analysis led to a false conclusion. For example, the complement of sensitivity (100 percent minus the sensitivity) is the percent of false negative cases in which the sample was from class C but the analysis reached the opposite conclusion. In the above table, this would be estimated as 5 percent. Similarly, the complement of specificity (100 percent minus the specificity) is the percent of false positive cases in which the sample was not from class C but the analysis concluded that it was. In the above table this would be estimated as 2 percent. A global error rate could be defined as the percent of incorrectly identified cases among all those analyzed. In the above table this would be estimated as [(5+2)/200] × 100 = 3.5 percent.

Importantly, whether the test answer is correct or not depends on which question is being addressed by the test. In this hair comparison example, the purpose is to determine whether the hair came from the head of an individual from class C. Thus, the analysis should be evaluated on the accuracy of the classification.

[Footnote: See, e.g., X-H. Zhou, N. Obuchowski, and D. McClish. 2002. Statistical Methods in Diagnostic Medicine. Hoboken, NJ: Wiley & Sons, for a general account of methods for diagnostic tests. A series of NAS/NRC reports have applied such methods to the examination of forensic disciplines. See, e.g., NRC, Committee to Review the Scientific Evidence on the Polygraph. 2003. The Polygraph and Lie Detection. Washington, DC: The National Academies Press; NRC. 2004. Forensic Analysis: Weighing Bullet Lead Evidence. Washington, DC: The National Academies Press; NAS. 2005. The Sackler Colloquium on Forensic Science: The Nexus of Science and the Law, November 16-18, 2005.]
In this example, if the analysis indicated "Class C" but the hair actually came from a "non-Class C" individual, then the analysis returned an incorrect classification. This accuracy evaluation does not apply to other tasks that are beyond the goal of the particular analysis, such as pinpointing the individual from whom the specimen was obtained. In the example about paint marks left by a vehicle, if the question is whether a vehicle under investigation was a model A made by manufacturer B in 2000, then a correct answer is limited to only the model, manufacturer, and year.

[Footnote: Each estimate (of sensitivity, specificity, PPV, NPV) is associated with an interval that has a high probability of containing the true sensitivity, specificity, PPV, or NPV. The larger the study, the more precise the estimate (i.e., the narrower the interval of uncertainty about the estimate).]
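The four accuracy measures and the associated error rates can be computed directly from the counts in the hypothetical validation table; a short Python sketch:

```python
# Counts from the hypothetical hair-analysis validation study
tp, fn = 95, 5   # samples truly from Class C
fp, tn = 2, 98   # samples truly not from Class C

sensitivity = tp / (tp + fn)   # true positive rate: 0.95
specificity = tn / (fp + tn)   # true negative rate: 0.98
ppv = tp / (tp + fp)           # positive predictive value: ~0.979
npv = tn / (fn + tn)           # negative predictive value: ~0.951

false_negative_rate = 1 - sensitivity                # 0.05
false_positive_rate = 1 - specificity                # 0.02
global_error_rate = (fn + fp) / (tp + fn + fp + tn)  # 7/200 = 0.035

print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
print(f"global error rate = {global_error_rate:.1%}")
```

Note that PPV and NPV depend on how common Class C samples are in the study (here 50 percent), so they do not transfer automatically to settings with a different base rate; sensitivity and specificity do not share that dependence.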
Although only illustrations, these examples serve to demonstrate the importance of:

• the careful and precise characterization of the scientific procedure, so that others can replicate and validate it;
• the identification of as many sources of error as possible that can affect both the accuracy and precision of a measurement;
• the quantification of measurements (e.g., in the example of GC/MS analysis of possible heroin, reporting peak area, as well as appropriate calibration data, including the response area for a known amount of analyte standard, rather than merely "peak is present/absent");
• the reporting of a measurement with an interval that has a high probability of containing the true value;
• the precise definition of the question addressed by the method (e.g., classification versus individualization), and the recognition of its limitations; and
• the conducting of validation studies of the performance of a forensic procedure to assess the percentages of false positives and false negatives.

Clearly, better understanding of the measuring equipment and the measurement process leads to improvements in every process and ultimately to fewer false positive and false negative results. Most importantly, as stated above, whether the test answer is correct or not depends on the question the test is being used to address. In the case of microscopic hair analysis, the validation study may confirm its value in identifying class characteristics of an individual, but not in identifying the specific person.

It is also important to note that errors and corresponding error rates can have more complex sources than can be accommodated within the simple framework presented above.
For example, in the case of DNA analysis, a declaration that two samples match can be erroneous in at least two ways: the two samples might actually come from different individuals whose DNA appears to be the same within the discriminatory capability of the tests, or two different DNA profiles could be mistakenly determined to be matching. The probability of the former error is typically very low, while the probability of a false positive (different profiles wrongly determined to be matching) may be considerably higher. Both sources of error need to be explored and quantified in order to arrive at reliable error rate estimates for DNA analysis.

[Footnote: C. Aitken and F. Taroni. 2004. Statistics and the Evaluation of Evidence for Forensic Scientists. Chichester, UK: John Wiley & Sons.]
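The practical weight of the two error sources can be illustrated numerically. The probabilities below are invented for illustration, and treating the overall false-match probability as approximately the sum assumes the two errors are rare and roughly independent; this is a sketch, not a validated model of any laboratory:

```python
# Hypothetical, made-up probabilities for illustration only
p_coincidental = 1e-9   # two different people share the observed profile
p_lab_error    = 1e-3   # different profiles mistakenly declared to match

# For small, roughly independent error probabilities, the chance that a
# reported match is erroneous is approximately their sum, and it is
# dominated by the larger term.
p_false_match = p_coincidental + p_lab_error

print(f"approximate false-match probability: {p_false_match:.2e}")
print(f"share attributable to lab error: {p_lab_error / p_false_match:.4%}")
```

With these invented numbers, an extremely small coincidental-match probability is almost irrelevant to the overall error rate: the laboratory false-positive probability accounts for virtually all of it, which is why both sources must be quantified rather than reporting the coincidental-match probability alone.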
The existence of several types of potential error rates makes it absolutely critical for all involved in the analysis to be explicit and precise about the particular rate or rates referenced in a specific setting. The estimation of such error rates requires rigorously developed and conducted scientific studies. Additional factors may play a role in analyses involving human interpretation, such as the experience, training, and inherent ability of the interpreter, the protocol for conducting the interpretation, and biases from a variety of sources, as discussed in the next section. The assessment of the accuracy of the conclusions from forensic analyses and the estimation of relevant error rates are key components of the mission of forensic science.

Sources of Bias

Human judgment is subject to many different types of bias, because we unconsciously pick up cues from our environment and factor them in an unstated way into our mental analyses. Those mental analyses might also be affected by unwarranted assumptions and a degree of overconfidence that we do not even recognize in ourselves. Such cognitive biases are not the result of character flaws; instead, they are common features of decisionmaking, and they cannot be willed away. A familiar example is how the common desire to please others (or avoid conflict) can skew one's judgment if co-workers or supervisors suggest that they are hoping for, or have reached, a particular outcome. Science takes great pains to avoid biases by using strict protocols to minimize their effects. The 1996 National Academies DNA report, for example, notes, "[l]aboratory procedures should be designed with safeguards to detect bias and to identify cases of true ambiguity.
Potential ambiguities should be documented."10

A somewhat obvious cognitive bias that may arise in forensic science is a willingness to ignore base rate information in assessing the probative value of information. For example, suppose carpet fibers from a crime scene are found to match carpet fibers found in a suspect's home. The probative value of this information depends on the rate at which such fibers are found in homes in addition to that of the suspect. If the carpet fibers are extremely common, the presence of matching fibers in the suspect's home will be of little probative value.11

[Footnote: See, e.g., M.J. Saks, D.M. Risinger, R. Rosenthal, and W.C. Thompson. 2003. Context effects in forensic science: A review and application of the science of science to crime laboratory practice in the United States. Science and Justice 43(2):77-90.]
[Footnote 10: NRC. 1996. The Evaluation of Forensic DNA Evidence. Washington, DC: National Academy Press.]
[Footnote 11: C. Guthrie, J.J. Rachlinski, and A.J. Wistrich. 2001. Inside the judicial mind. Cornell Law Review 86:777-830.]

A common cognitive bias is the tendency for conclusions to be affected by how a question is framed or how data are presented. In a police line-up,
for instance, an eyewitness who is presented with a pool of faces in one batch might assume that the suspect is among them, which may not be correct. If the mug shots are presented together at one time and the witness is asked to identify the suspect, the witness may choose the photograph that is most similar to the perpetrator, even if the perpetrator's picture is not among those presented. Similarly, if the photographs are presented sequentially and the witness knows that only a limited number will be presented, the eyewitness might tend to "identify" one of the last photographs under the assumption that the suspect must be in that batch. (This is also driven by the common bias toward reaching closure.) A series of studies has shown that judges can be subject to errors in judgment resulting from similar cognitive biases.12 Forensic scientists also can be affected by this cognitive bias if, for example, they are asked to compare two particular hairs, shoeprints, or fingerprints (one from the crime scene and one from a suspect) rather than comparing the crime scene exemplar with a pool of counterparts.

Another potential bias is illustrated by the erroneous fingerprint identification of Brandon Mayfield as someone involved with the Madrid train bombing in 2004. The FBI investigation determined that once the fingerprint examiner had declared a match, both he and other examiners who were aware of this finding were influenced by the urgency of the investigation to affirm repeatedly this erroneous decision.13 Recent research provided additional evidence of this sort of bias through an experiment in which experienced fingerprint examiners were asked to analyze fingerprints that, unknown to them, they had analyzed previously in their careers. For half the examinations, contextual biasing was introduced.
For example, the instructions accompanying the latent prints included information such as the "suspect confessed to the crime" or the "suspect was in police custody at the time of the crime." In 6 of the 24 examinations that included contextual manipulation, the examiners reached conclusions that were consistent with the biasing information and different from the results they had reached when examining the same prints in their daily work.14

Other cognitive biases may be traced to common imperfections in our reasoning ability. One commonly recognized bias is the tendency to avoid cognitive dissonance, such as persuading oneself through rational argument that a purchase was a good value once the transaction is complete. A scientist encounters this unconscious bias if he/she becomes too wedded to a preliminary conclusion, so that it becomes difficult to accept new information fairly and unduly difficult to conclude that the initial hypotheses were wrong.

12. Ibid.
13. R.B. Stacey. 2004. A report on the erroneous fingerprint individualization in the Madrid train bombing case. Journal of Forensic Identification 54:707.
14. I.E. Dror and D. Charlton. 2006. Why experts make errors. Journal of Forensic Identification 56(4):600-616.

124 STRENGTHENING FORENSIC SCIENCE IN THE UNITED STATES

This is often manifested by what is known as "anchoring," the well-known tendency to rely too heavily on one piece of information when making decisions. Often, the piece of information that is weighted disproportionately is one of the very first ones encountered. One tends to seek closure and to view the initial part of an investigation as a "sunk cost" that would be wasted if overturned.

Another common cognitive bias is the tendency to see patterns that do not actually exist. This bias is related to our tendency to underestimate the amount of complexity that can really exist in nature. Both tendencies can lead one to formulate overly simple models of reality and thus to read too much significance into coincidences and surprises. More generally, human intuition is not a good substitute for careful reasoning when probabilities are concerned. As an example, consider a problem commonly posed in beginning statistics classes: How many people must be in a room before there is a 50 percent probability that at least two will share a common birthday? Intuition might suggest a large number, perhaps over 100, but the actual answer is 23. This is not difficult to prove through careful logic, but intuition is likely to be misleading.

All of these sources of bias are well known in science, and a large amount of effort has been devoted to understanding and mitigating them. The goal is to make scientific investigations as objective as possible so the results do not depend on the investigator. Certain fields of science (most notably, biopharmaceutical clinical trials of treatment protocols and drugs) have developed practices such as double-blind tests and independent (blind) verification to minimize the impact of biases.
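The birthday problem posed above can be checked numerically rather than by intuition. A minimal sketch, assuming 365 equally likely birthdays and ignoring leap years:

```python
def prob_shared_birthday(n: int) -> float:
    """Probability that at least two of n people share a birthday,
    computed as 1 minus the probability that all n birthdays differ."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (365 - k) / 365
    return 1.0 - p_all_distinct

# Find the smallest group size with at least a 50 percent chance.
n = 1
while prob_shared_birthday(n) < 0.5:
    n += 1
print(n)  # 23, with a shared-birthday probability of about 0.507
```

The calculation confirms the answer of 23 quoted in the text: with 22 people the probability is still just under one-half, and with 23 it crosses above it.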
Additionally, science seeks to publish its discoveries, findings, and conclusions so that they are subjected to independent peer review; this enables others to study biases that may exist in the investigative method or attempt to replicate unexpected results.

Avoiding, or compensating for, a bias is an important task. Even fields with well-established protocols to minimize the effects of bias can still bear improvement. For example, a recent working paper15 has raised questions about the way cognitive dissonance has been studied since 1956. Although these results must be considered preliminary because the paper has yet to be published, they do demonstrate that continual vigilance is needed. Research has been sparse on the important topic of cognitive bias in forensic science, regarding both its effects and methods for minimizing it.16

15. M.K. Chen. 2008. Rationalization and Cognitive Dissonance: Do Choices Affect or Reflect Preferences? Available at www.som.yale.edu/Faculty/keith.chen/papers/CogDisPaper.pdf.
16. See, e.g., I.E. Dror, D. Charlton, and A.E. Peron. 2006. Contextual information renders experts vulnerable to making erroneous identifications. Forensic Science International 156:74-78; I.E. Dror, A. Peron, S. Hind, and D. Charlton. 2005. When emotions get the better of us: The effects of contextual top-down processing on matching fingerprints. Journal of Applied Cognitive Psychology 19:799-809; and B. Schiffer and C. Champod. 2007. The potential (negative) influence of observational biases at the analysis stage of fingerprint individualization. Forensic Science International 167:116-120.

The Self-Correcting Nature of Science

The methods and culture of scientific research enable it to be a self-correcting enterprise. Because researchers are, by definition, creating new understanding, they must be as cautious as possible before asserting a new "truth." Also, because researchers are working at a frontier, few others may have the knowledge to catch and correct any errors they make. Thus, science has had to develop means of revisiting provisional results and revealing errors before they are widely used. The processes of peer review, publication, collegial interactions (e.g., sharing at conferences), and the involvement of graduate students (who are expected to question as they learn) all support this need. Science is characterized also by a culture that encourages and rewards critical questioning of past results and of colleagues.

Most technologies benefit from a solid research foundation in academia and ample opportunity for peer-to-peer stimulation and critical assessment, review, and critique through conferences, seminars, publishing, and more. These elements provide a rich set of paths through which new ideas and skepticism can travel, and opportunities for scientists to step away from their day-to-day work and take a longer-term view. The scientific culture encourages cautious, precise statements and discourages statements that go beyond established facts; it is acceptable for colleagues to challenge one another, even if the challenger is more junior. The forensic science disciplines will profit enormously by full adoption of this scientific culture.

CONCLUSION

The way in which science is conducted is distinct from, and complementary to, other modes by which humans investigate and create. The methods of science have a long history of successfully building useful and trustworthy knowledge and filling gaps while also correcting past errors. The premium that science places on precision, objectivity, critical thinking, careful observation and practice, repeatability, uncertainty management, and peer review enables the reliable collection, measurement, and interpretation of clues in order to produce knowledge.