16

Mendelian Randomization: Genetic Variants as Instruments for Strengthening Causal Inference in Observational Studies

George Davey Smith and Shah Ebrahim

The incorporation of biomarkers into population-based health surveys is generally intended to improve categorization of exposures or health outcome measures (National Research Council, 2001). An unintended consequence of the growing use of biomarkers (for example, in assessing nutritional status) is that investigators are less aware of the continued threats to validity of their findings caused by measurement error, confounding, and reverse causality, which affect biomarkers in the same way as exposures and outcomes measured using less precise methods. This chapter briefly outlines when and why conventional observational approaches have been misleading and then introduces the Mendelian randomization approach, a form of the use of genes as instrumental variables as briefly discussed by Douglas Ewbank in the earlier Cells and Surveys volume (Ewbank, 2001). The variety of inferences that can be drawn from this approach is illustrated, and then potential limitations and ways to address these limitations are outlined. The chapter concludes by summarizing the ways in which Mendelian randomization approaches differ from other methodologies that depend on the use of genetic markers in population-based research.

Limits of observational epidemiology

To investigators interested in the health consequences of a modifiable environmental exposure (say, a particular aspect of diet), the obvious approach would be to directly study dietary intake and how this
relates to the risk of disease. Why, then, should an alternative approach be advanced? The impetus for thinking of new approaches is that conventional observational study designs have yielded findings that have failed to be confirmed by randomized controlled trials (Davey Smith and Ebrahim, 2002). Observational studies demonstrated that beta carotene intake was associated with a lower risk of lung cancer mortality, and this stimulated an already active market for vitamin supplements that was based on the notion that they substantially influence chronic disease risk (Figure 16-1). The scientists involved in conducting the observational studies advocated taking supplements in material intended for the public (Willett, 2001) and also, relying on observational data, concluded "Available data thus strongly support the hypothesis that dietary carotenoids reduce the risk of lung cancer" (Willett, 1990). However, large-scale randomized controlled trials reported disappointing findings: beta carotene supplementation produced no reduction in risk of lung cancer (Alpha-Tocopherol and Beta-Carotene Cancer Prevention Study Group, 1994).

With respect to cardiovascular disease, observational studies suggesting that beta carotene (Manson et al., 1991), vitamin E supplements (Rimm et al., 1993; Stampfer et al., 1993), vitamin C supplements (Osganian et al., 2003), and hormone replacement therapy (Stampfer and Colditz, 1991) were protective were followed by large trials showing no such protection (Omenn et al., 1996; Alpha-Tocopherol and Beta-Carotene Cancer Prevention Study Group, 1994; Lancet, 1999; Heart Protection Study Collaborative Group, 2002; Beral, Banks, and Reeves, 2002). In each case special pleading was advanced to explain the discrepancy: Were the doses of vitamins given in the trials too high or too low to be comparable to the observational studies? Did hormone replacement therapy use start too late in the trials?
Were differences explained by the duration of follow-up or other design aspects? Were interactions with other factors, such as smoking or alcohol consumption, key? Rather than such particular explanations being true (with the happy consequence that both the observational studies and the trials had got the right answers, but to different questions), it is likely that a general problem of confounding (by lifestyle and socioeconomic factors, or by baseline health status and prescription policies) is responsible.

BIOSOCIAL SURVEYS

FIGURE 16-1 Advertisement from the Boston Globe.

Indeed, in the vitamin E supplements example, the observational studies and the trials tested precisely the same thing. Figures 16-2a and 16-2b show the findings from observational studies of taking vitamin E supplements (Rimm et al., 1993; Stampfer et al., 1993) and a meta-analysis of trials of supplements (Eidelman, Hollar, Hebert, Lamas, and Hennekens, 2004). The point here is that the observational studies specifically investigated the effect of taking supplements for a short period (2-5 years) and found an apparent, robust, and large protective effect, even after adjustment for confounders. The trials tested randomization to essentially the same supplements for the same period and found no protective effect. Importantly, the trial findings cannot be attributed to confounding or self-selection of healthier people into a vitamin-taking group, as taking or not taking vitamin E was determined randomly, which (providing it is done properly) avoids these sources of bias.
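The contrast between self-selected and randomly assigned supplement use can be illustrated with a small simulation. This is only a sketch under invented assumptions: the "healthy lifestyle" confounder, the risks (5 and 10 percent), and the uptake probabilities are hypothetical numbers chosen for illustration, and the supplement is given no effect at all.

```python
import random

def risk_ratio(exposed, events):
    """Risk of the event among the exposed divided by risk among the unexposed."""
    e_total = sum(1 for x in exposed if x)
    u_total = len(exposed) - e_total
    e_events = sum(1 for x, y in zip(exposed, events) if x and y)
    u_events = sum(1 for x, y in zip(exposed, events) if not x and y)
    return (e_events / e_total) / (u_events / u_total)

def simulate(n=100_000, seed=1):
    """Return (observational risk ratio, randomized risk ratio). All inputs are hypothetical."""
    rng = random.Random(seed)
    # Hypothetical confounder: half the population has a healthy lifestyle.
    healthy = [rng.random() < 0.5 for _ in range(n)]
    # Disease risk depends on lifestyle only; the supplement has no effect.
    event = [rng.random() < (0.05 if h else 0.10) for h in healthy]
    # Observational setting: healthy people choose supplements more often.
    chosen = [rng.random() < (0.6 if h else 0.2) for h in healthy]
    # Trial setting: a coin flip decides, independent of lifestyle.
    assigned = [rng.random() < 0.5 for _ in range(n)]
    return risk_ratio(chosen, event), risk_ratio(assigned, event)

if __name__ == "__main__":
    observational, randomized = simulate()
    print(f"self-selected use: RR = {observational:.2f}")
    print(f"randomized use:    RR = {randomized:.2f}")
```

Under these assumptions, self-selected users appear protected even though the supplement does nothing, whereas the randomized comparison sits close to a risk ratio of 1.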
FIGURE 16-2a Vitamin E supplement use and risk of coronary heart disease in two observational studies (Rimm et al., 1993; Stampfer et al., 1993) and in a meta-analysis of randomized controlled trials (RCTs) (Eidelman et al., 2004). [Figure not reproduced. Categories shown: Stampfer 1993, Rimm 1993, RCTs.]

FIGURE 16-2b Health Professional Follow-up Study (Rimm et al., 1993). [Figure not reproduced. Vertical axis: RR; horizontal axis: duration of use (0-1 year, 2-4 years, 5-9 years, >10 years).]
NOTE: Observed effect of duration of vitamin E use compared with no use on coronary heart disease events in the Health Professional Follow-up Study.
Other processes in addition to confounding can generate robust, but noncausal, associations in observational studies. Reverse causation, in which the disease influences the apparent exposure rather than vice versa, may generate strong and replicable associations. For example, many studies have found that people with low circulating cholesterol levels are at increased risk of several cancers, including colon cancer. If causal, this is an important association, as it might mean that efforts to lower cholesterol levels would increase the risk of cancer. However, it is possible that the early stages of cancer may, many years before diagnosis or death, lead to a lowering in cholesterol levels, rather than low cholesterol levels increasing the risk of cancer. Similarly, in studies of inflammatory markers such as C-reactive protein and cardiovascular disease risk, it is possible that early stages of atherosclerosis (which is an inflammatory process) elevate circulating inflammatory markers, and since people with atherosclerosis are more likely to experience cardiovascular events, a robust but noncausal association between levels of inflammatory markers and incident cardiovascular disease is generated.

A form of reverse causation can also occur through reporting bias, with the presence of disease influencing reporting disposition. In case-control studies, people with the disease under investigation may report on their prior exposure history in a different way than do controls, perhaps because the former will think harder about potential reasons that account for why they have developed the disease.

These problems of confounding and bias produce associations in observational studies that are not reliable indicators of the true causation. Furthermore, the strength of associations between truly causal risk factors and disease in observational studies is underestimated due to random measurement imprecision in indexing the exposure.
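The underestimation produced by random measurement imprecision can be sketched numerically. In the classical error model (an assumption made here for illustration, not a description of any particular study), the regression slope on a noisily measured exposure shrinks by the factor var(true exposure) / (var(true exposure) + var(error)). The function names, the effect size, and the error variance below are all hypothetical.

```python
import random

def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_dilution(n=50_000, beta=1.0, error_sd=1.0, seed=7):
    """Return (slope on the true exposure, slope on the error-prone measurement)."""
    rng = random.Random(seed)
    true_x = [rng.gauss(0, 1) for _ in range(n)]            # true long-term exposure
    y = [beta * x + rng.gauss(0, 1) for x in true_x]        # outcome depends on the true exposure
    measured_x = [x + rng.gauss(0, error_sd) for x in true_x]  # single noisy measurement
    return slope(true_x, y), slope(measured_x, y)

if __name__ == "__main__":
    true_slope, diluted_slope = simulate_dilution()
    # With unit exposure variance and unit error variance the expected
    # attenuation factor is 1 / (1 + 1) = 0.5.
    print(f"slope on true exposure:     {true_slope:.2f}")
    print(f"slope on measured exposure: {diluted_slope:.2f}")
```

With equal exposure and error variances, the slope estimated from the noisy measurement is about half the true slope, which is the kind of attenuation the text describes.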
A century ago, Charles Spearman demonstrated mathematically how such measurement imprecision would lead to what he termed the "attenuation by errors" of associations (Spearman, 1904; Davey Smith and Phillips, 1996), later renamed "regression dilution bias." Observational studies can and do produce findings that either spuriously enhance or downgrade estimates of causal associations between modifiable exposures and disease. For these reasons, alternative approaches, including those within the Mendelian randomization framework, need to be applied.

Mendelian randomization

The basic principle utilized in Mendelian randomization is that genetic variants that either alter the level of, or mirror the biological effects of, a modifiable environmental exposure that itself alters disease
risk should be related to disease risk to the extent predicted by their influence on exposure to the environmental risk factor. Common genetic polymorphisms that have a well-characterized biological function (or are markers for such variants) can therefore be utilized to study the effect of a suspected environmental exposure on disease risk (Davey Smith and Ebrahim, 2003, 2004, 2005; Davey Smith, 2006). The exploitation of situations in which genotypic differences produce effects similar to environmental factors (and vice versa) clearly resonates with the concepts of phenocopy (Goldschmidt, 1938) and genocopy (Schmalhausen, 1938, cited by Gause, 1942) in developmental genetics. Phenocopy refers to the situation in which an environmental effect produces the same effect as that produced by a genetic mutation. Genocopy, the reverse of phenocopy, is when genetic variation generates an outcome that could be produced by an environmental stimulus (Jablonka-Tavory, 1982).

Why use genetic variants as proxies for environmental exposures rather than measure the exposures themselves? First, unlike environmental exposures, genetic variants are not generally associated with the wide range of behavioral, social, or physiological factors that, for example, confound the association between vitamin C and coronary heart disease. This means that if a genetic variant is used to proxy for an environmentally modifiable exposure, it is unlikely to be confounded in the way that direct measures of the exposure will be. Furthermore, aside from the effects of population structure (see Palmer and Cardon, 2005, for a discussion of the likely impact of this), such variants will not be associated with other genetic variants, except those with which they are in linkage disequilibrium.
This latter assumption follows from the law of independent assortment (sometimes referred to as Mendel's second law), hence the term "Mendelian randomization." We illustrate this powerful aspect of Mendelian randomization in Tables 16-1a and 16-1b, showing the strong associations between a wide range of variables and blood C-reactive protein (CRP) levels, but no association of the same factors with genetic variants in the CRP gene. The only factor related to genotype is the expected, biological influence of the genetic variant on CRP levels.

Second, we have seen how inferences drawn from observational studies may be subject to bias due to reverse causation. Disease processes may influence exposure levels, such as alcohol intake, or measures of intermediate phenotypes, such as cholesterol levels and C-reactive protein. However, germ line genetic variants associated with average alcohol intake or circulating levels of intermediate phenotypes will not be influenced by the onset of disease. This will be equally true with respect to reporting bias generated by knowledge of disease status in case-control studies or of differential reporting bias in any study design.

Third, associative selection bias in which participants' entry to a study
TABLE 16-1a Means or Proportions of Blood Pressure, Pulse Pressure, Hypertension, and Potential Confounders by Quarters of C-Reactive Protein (CRP), N = 3,529

Means or Proportions by Quarters of C-Reactive Protein (Range mg/L)

                                                1             2             3             4              P Trend Across
                                                (0.16-0.85)   (0.86-1.71)   (1.72-3.88)   (3.89-112.0)   Categories
Hypertension (%)                                45.8          49.7          57.5          60.0           < 0.001
Body mass index (kg/m2)                         25.2          27.0          28.5          29.7           < 0.001
High-density lipoprotein cholesterol (mmol/l)   1.80          1.69          1.0           1.53           < 0.001
Lifecourse socioeconomic position score         4.08          4.37          4.46          4.75           < 0.001
Doctor diagnosis of diabetes (%)                3.5           2.8           4.1           8.4            < 0.001
Current smoker (%)                              7.9           9.6           10.9          15.4           < 0.001
Physically inactive (%)                         11.3          14.9          20.1          29.6           < 0.001
Moderate alcohol consumption (%)                22.2          19.6          18.8          14.0           < 0.001

SOURCE: Davey Smith et al. (2005).
TABLE 16-1b Means or Proportions of CRP, Systolic Blood Pressure, Hypertension, and Potential Confounders by 1059G/C Genotype

Means or Proportions by Genotype

                                                GG            GC or CC      P
CRP (mg/L log scale)(a)                         1.81          1.39          < 0.001
Hypertension (%)                                53.3          53.1          0.95
Body mass index (kg/m2)                         27.5          27.8          0.29
High-density lipoprotein cholesterol (mmol/l)   1.67          1.65          0.38
Lifecourse socioeconomic position score         4.35          4.42          0.53
Doctor diagnosed diabetes (%)                   4.7           4.5           0.80
Current smoker (%)                              11.2          9.3           0.24
Physically inactive (%)                         18.9          18.9          1.0
Moderate alcohol consumption (%)                18.6          19.8          0.56

(a) Geometric means and proportionate (%) change for a doubling of CRP. CRP = C-reactive protein.
SOURCE: Davey Smith et al. (2005).
is related to both their exposure level and disease risk can generate spurious associations. This is unlikely to occur with respect to genetic variants. There is empirical evidence that a wide range of genetic variants and participation rates in etiological studies are not associated. Odds ratios for differences in the prevalence of genetic variants between those willing and less willing to participate in studies are generally null, showing no strong evidence to support associative selection bias in genetic studies (Bhatti et al., 2005). As these investigators noted, it is important that researchers test this assumption in their own data, as it is possible that other genotypes, particularly those associated with health-relevant behaviors (e.g., alcohol consumption), may show associations.

Finally, a genetic variant will indicate long-term levels of exposure, and if the variant is taken as a proxy for such exposure, it will not suffer from the measurement error inherent in phenotypes that have high levels of variability and are poorly estimated by a single measure. For example, groups defined by cholesterol level-related genotype will, over a long period, experience the cholesterol difference seen between the groups. Indeed, use of the Mendelian randomization approach predicts a strength of association that is in line with randomized controlled trial findings of effects of cholesterol lowering when the increasing benefits seen over the relatively short trial period are projected to the expectation for differences over a lifetime (Davey Smith and Ebrahim, 2004).

Categories of Mendelian Randomization

Several categories of inference can be drawn from studies utilizing the Mendelian randomization paradigm. In the most direct forms, genetic variants can be related to the probability or level of exposure ("exposure propensity") or to intermediate phenotypes believed to influence disease risk.
Less direct evidence can come from genetic variant-disease associations that indicate that a particular biological pathway may be of importance, perhaps because the variants modify the effects of environmental exposures. Several examples of these categories have been given elsewhere (Davey Smith and Ebrahim, 2003, 2004; Davey Smith, 2006); here illustrative cases of the first two categories are briefly outlined.

Exposure Propensity: Alcohol Intake and Health

The possible protective effect of moderate alcohol consumption on risk of coronary heart disease (CHD) remains controversial (Marmot, 2001; Bovet and Paccaud, 2001; Klatsky, 2001). Nondrinkers may be at a higher risk of coronary heart disease because health problems (perhaps induced by previous alcohol abuse) dissuade them from drinking (Shaper,
1993). As well as this form of reverse causation, confounding could play a role, with nondrinkers being more likely to display an adverse profile of socioeconomic or other behavioral risk factors for coronary heart disease (Hart, Davey Smith, Hole, and Hawthorne, 1999). Alternatively, alcohol may have a direct biological effect that lessens the risk of coronary heart disease, for example, by increasing the levels of protective high-density lipoprotein (HDL) cholesterol (Rimm, 2001). It is, however, unlikely that a randomized controlled trial of alcohol intake, able to test whether there is a protective effect of alcohol on CHD events, will be carried out.

Alcohol is oxidized to acetaldehyde, which in turn is oxidized by aldehyde dehydrogenases (ALDHs) to acetate. Half of Japanese people are heterozygotes or homozygotes for a null variant of ALDH2, and peak blood acetaldehyde concentrations post alcohol challenge are 18 times and 5 times higher, respectively, among homozygous null variant and heterozygous individuals compared with homozygous wild type individuals (Enomoto, Takase, Yasuhara, and Takada, 1991). This renders the consumption of alcohol unpleasant through inducing facial flushing, palpitations, drowsiness, and other symptoms.

FIGURE 16-3a Relationship between characteristics and alcohol consumption. [Figure not reproduced. Vertical axis: alcohol intake (ml/day); horizontal axis: ALDH2 genotype (2*2/2*2, 2*2/2*1, 1*1/1*1).]
SOURCE: Takagi et al. (2002).

As Figure 16-3a shows,
there are very considerable differences in alcohol consumption according to genotype (Takagi et al., 2002). The principles of Mendelian randomization are seen to apply: two factors that would be expected to be associated with alcohol consumption, age and cigarette smoking, which would confound conventional observational associations between alcohol and disease, are not related to genotype despite the strong association of genotype with alcohol consumption (Figure 16-3b).

It would be expected that the ALDH2 genotype influences diseases known to be related to alcohol consumption, and as proof of principle it has been shown that ALDH2 null variant homozygosity, which is associated with low alcohol consumption, is indeed related to a lower risk of liver cirrhosis (Chao et al., 1994). Considerable evidence, including data from randomized controlled trials, suggests that alcohol increases HDL cholesterol levels (Haskell et al., 1984; Burr, Fehily, Butland, Bolton, and Eastham, 1986), which should protect against coronary heart disease. In line with this, ALDH2 genotype is strongly associated with HDL cholesterol in the expected direction (Figure 16-3c). Given the apparent protective effect of alcohol against CHD risk seen in observational studies, possession of the null ALDH2 allele, which is associated with lower alcohol consumption, should be associated with a greater risk of myocardial infarction, and this is what was seen in a case-control study (Takagi et al., 2002). Men either homozygous or heterozygous for null ALDH2 were at twice the risk of myocardial infarction. Statistical adjustment for HDL cholesterol greatly attenuated the association between ALDH2 genotype and coronary heart disease, indicating that the cardio-protective effect of alcohol is mediated by increased levels of HDL cholesterol.
FIGURE 16-3b Relationship between characteristics and ALDH2 genotype. [Figure not reproduced. Panels: age (years) and smokers (percentage), each by ALDH2 genotype (2*2/2*2, 2*2/2*1, 1*1/1*1).]
SOURCE: Takagi et al. (2002).
FIGURE 16-3c Relationship between HDL cholesterol and ALDH2 genotype. [Figure not reproduced. Vertical axis: HDL (mg/dl); horizontal axis: ALDH2 genotype (2*2/2*2, 2*2/2*1, 1*1/1*1).]
SOURCE: Takagi et al. (2002).

Intermediate Phenotypes

Genetic variants can influence such circulating biochemical factors as cholesterol, homocysteine, or fibrinogen levels. This provides a method for assessing causality in associations observed between these measures (intermediate phenotypes) and disease, and thus whether interventions to modify the intermediate phenotype could be expected to influence disease risk. Proof of principle for this approach is provided by familial hypercholesterolemia genetic variants that are associated with higher circulating cholesterol levels, which increase risk of coronary heart disease. These observational data are in line with the randomized controlled trial evidence confirming that lowering cholesterol reduces the risk of coronary heart disease (Davey Smith and Ebrahim, 2004).

C-Reactive Protein and Coronary Heart Disease

Strong associations of CRP, an acute phase inflammatory marker, with hypertension, insulin resistance, and coronary heart disease have been repeatedly observed (Danesh et al., 2004; Wu, Dorn, Donahue, Sempos,
and Trevisan, 2002; Pradhan, Manson, Rifai, Buring, and Ridker, 2001; Han et al., 2002; Sesso et al., 2003; Hirschfield and Pepys, 2003; Hu, Meigs, Li, Rifai, and Manson, 2004), with the obvious inference that CRP is a cause of these conditions (Ridker et al., 2005; Sjöholm and Nyström, 2005; Verma, Szmitko, and Ridker, 2005). A Mendelian randomization study has examined the association between polymorphisms of the CRP gene and demonstrated that, although serum CRP differences were highly predictive of blood pressure and hypertension, the CRP variants (which are related to sizeable serum CRP differences) were not associated with these same outcomes (Davey Smith et al., 2005b). It is likely that these divergent findings are explained by the extensive confounding between serum CRP and outcomes (as shown in Table 16-1). Current evidence on this issue, although statistically underpowered, also suggests that CRP levels do not lead to elevated risk of insulin resistance (Timpson et al., 2005) or coronary heart disease (Casas et al., 2006). Again, confounding and reverse causation, in which existing coronary disease or insulin resistance may influence CRP levels, could account for this discrepancy. Similar findings have been reported for serum fibrinogen, variants in the beta fibrinogen gene, and coronary heart disease (Davey Smith et al., 2005a; Keavney et al., 2006). The CRP and fibrinogen examples demonstrate that Mendelian randomization can increase evidence for a causal effect of an environmentally modifiable factor (as in the cases of alcohol and cholesterol levels discussed earlier) as well as provide evidence against causal effects, which can help direct efforts away from targets of no preventative or therapeutic relevance.
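The pattern described for CRP, a strong observational association alongside a null genotype-outcome association, can be reproduced in a toy simulation. The sketch below is not the analysis used in the cited studies: it uses a simple ratio (Wald-type) instrumental variable estimate, cov(genotype, outcome) / cov(genotype, phenotype), and every quantity in it (allele frequency, effect sizes, the unmeasured confounder `u`) is a hypothetical choice made for illustration.

```python
import random

def cov(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

def simulate_mr(n=50_000, causal_effect=0.0, seed=3):
    """Return (observational slope, genotype-based ratio estimate)."""
    rng = random.Random(seed)
    # Hypothetical variant carried by 30% of people, allocated independently of confounders.
    g = [1.0 if rng.random() < 0.3 else 0.0 for _ in range(n)]
    # Unmeasured confounder that raises both the intermediate phenotype and the outcome.
    u = [rng.gauss(0, 1) for _ in range(n)]
    # Intermediate phenotype (e.g., a circulating marker): raised by genotype and confounder.
    x = [0.5 * gi + ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]
    # Outcome: depends on the phenotype only through causal_effect.
    y = [causal_effect * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    observational = cov(x, y) / cov(x, x)   # confounded regression slope
    ratio_estimate = cov(g, y) / cov(g, x)  # instrumental variable (Wald-type) estimate
    return observational, ratio_estimate

if __name__ == "__main__":
    obs, iv = simulate_mr(causal_effect=0.0)
    print(f"observational slope: {obs:.2f}")
    print(f"genotype-based estimate: {iv:.2f}")
```

With `causal_effect=0.0` the observational slope is clearly positive while the genotype-based estimate is close to zero; setting `causal_effect=0.5` moves the genotype-based estimate to roughly 0.5, whereas the observational slope remains inflated by the confounder.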
Implications of Mendelian Randomization Study Findings

Establishing the causal influence of environmentally modifiable risk factors from Mendelian randomization designs informs policies for improving population health through population-level interventions. Such findings do not imply that the appropriate strategy is genetic screening to identify those at high risk and application of selective exposure reduction policies. For example, establishing the association between genetic variants (such as familial defective ApoB) associated with elevated cholesterol level and CHD risk strengthens causal evidence that elevated cholesterol is a modifiable risk factor for coronary heart disease for the whole population. Thus even though the population attributable risk for coronary heart disease of this variant is small, it usefully informs public health approaches to improving population health. It is this aspect of Mendelian randomization that illustrates its distinction from conventional risk identification and genetic screening purposes of genetic epidemiology.
Mendelian Randomization and Randomized Controlled Trials

Randomized controlled trials are clearly the definitive means of obtaining evidence on the effects of modifying disease risk processes. There are similarities in the logical structure of randomized controlled trials and Mendelian randomization, however. Figure 16-4 illustrates this, drawing attention to the unconfounded nature of exposures proxied for by genetic variants (analogous to the unconfounded nature of a randomized intervention), the lack of possibility of reverse causation as an influence on exposure-outcome associations in both Mendelian randomization and randomized controlled trial settings, and the importance of intention to treat analyses, that is, comparisons of groups defined by genetic variant, irrespective of associations between the genetic variant and the proxied for exposure within any particular individual.

FIGURE 16-4 Mendelian randomization and randomized controlled trial designs compared.

Mendelian Randomization                       Randomized Controlled Trial
Random segregation of alleles                 Randomization method
Exposed: one allele                           Exposed: intervention
Control: other allele                         Control: no intervention
Confounders equal between groups              Confounders equal between groups
Outcomes compared between groups              Outcomes compared between groups

The analogy with randomized controlled trials is also useful in understanding why an objection to Mendelian randomization, that the environmentally modifiable exposures proxied for by the genetic variants (such as alcohol intake or circulating CRP levels) are influenced by many other factors in addition to the genetic variants (Jousilahti and Salomaa, 2004), while true, is of no consequence. Consider a randomized controlled trial of blood pressure-lowering medication. Blood pressure is influenced mainly by factors other than taking blood pressure-lowering medication. Obesity, alcohol intake, salt consumption and other dietary factors, smoking, exercise, physical fitness, genetic factors, and early life developmental influences are all of importance. However, the randomization that occurs in trials ensures that these factors are balanced between the group that receives the blood pressure-lowering medication and the control group that does not. Thus the fact that many other factors are related to the modifiable exposure does not vitiate the power of randomized controlled trials; neither does it vitiate the strength of Mendelian randomization designs.

A related objection is that genetic variants often explain only a trivial proportion of the variance in the environmentally modifiable risk factor that is being proxied for (Glynn, 2006). Again, consider a randomized controlled trial of blood pressure-lowering medication, in which 50 percent of participants receive the medication and 50 percent receive a placebo. If the antihypertensive therapy reduced blood pressure by a quarter of a standard deviation, which is approximately the situation for such pharmacotherapy, then within the whole study group the treatment assignment (i.e., antihypertensive use versus placebo) will explain 1.5 percent of the variance in blood pressure. In the example of CRP haplotypes used as markers for CRP levels, these haplotypes explain 1.7 percent of the variance in CRP levels in the population (Lawlor, Harbord, Sterne, and Davey Smith, 2007). As can be seen, the quantitative association of genetic variants as proxies can be similar to that of randomized treatments with respect to biological processes that such treatments modify.
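The 1.5 percent figure for the blood pressure trial can be checked with a short calculation. The sketch below assumes that the quarter-standard-deviation effect is expressed in units of the within-group standard deviation; the function name and parameters are illustrative rather than taken from the chapter. For a binary assignment given to a fraction p of participants, the between-group variance is p(1 - p)d^2, and the share of total variance explained is that quantity divided by itself plus the within-group variance of 1.

```python
def variance_explained(effect_in_sd, fraction_treated=0.5):
    """Share of outcome variance explained by a binary assignment.

    effect_in_sd: difference in group means, in within-group SD units (assumed).
    fraction_treated: proportion of participants assigned to treatment.
    """
    between = fraction_treated * (1 - fraction_treated) * effect_in_sd ** 2
    within = 1.0  # within-group variance, by construction of the SD units
    return between / (between + within)

if __name__ == "__main__":
    share = variance_explained(0.25)
    print(f"{100 * share:.1f} percent of variance explained")  # prints 1.5 percent
```

For a 50:50 trial with a quarter-standard-deviation effect this gives 1/65, or about 1.5 percent, matching the figure in the text and sitting close to the 1.7 percent reported for the CRP haplotypes.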
Both logic and quantification fail to support criticisms of the Mendelian randomization approach on the basis of either the obvious fact that many factors influence most phenotypes of interest or that particular genetic variants account for only a small proportion of variance in the phenotype.

Mendelian Randomization and Instrumental Variable Approaches

As well as the analogy with randomized controlled trials, Mendelian randomization can also be likened to instrumental variable approaches that have been heavily utilized in econometrics and social science. In this approach, the instrument is a variable that is related to the outcome only through its association with the modifiable exposure of interest. The instrument is not related to confounding factors, nor is its assessment biased in a manner that would generate a spurious association with the outcome. Furthermore, the instrument will not be influenced by the development of the outcome (i.e., there will be no reverse causation). Figure 16-5 presents this basic schema, where the dotted line between genotype
and the outcome provides an unconfounded and unbiased estimate of the causal association between the exposure that the genotype is proxying for and the outcome. The development of instrumental variable methods in econometrics, in particular, has led to a sophisticated range of statistical methods for estimating causal effects, and these have now been applied in Mendelian randomization studies (e.g., Davey Smith et al., 2005a, 2005b; Timpson et al., 2005). The parallels between Mendelian randomization and instrumental variable approaches are discussed in more detail elsewhere (Thomas and Conti, 2004; Didelez and Sheehan, 2007; Lawlor et al., 2007).

FIGURE 16-5 Mendelian randomization as an instrumental variables approach. [Diagram not reproduced. Labels: Genotype; Exposure; Outcome; Confounders, reverse causation, bias.]

Mendelian Randomization and Gene-Environment Interaction

Mendelian randomization is one way in which genetic epidemiology can inform understanding about environmental determinants of disease. A more conventional approach has been to study interactions between environmental exposures and genotype (Perera, 1997; Mucci, Wedren, Tamimi, Trichopoulos, and Adami, 2001). From epidemiological and Mendelian randomization perspectives, several issues arise with gene-environment interactions.

The most reliable findings in genetic association studies relate to the
main effects of polymorphisms on disease risk (Clayton and McKeigue, 2001). The power to detect meaningful gene-environment interaction is low (Wright, Carothers, and Campbell, 2002), resulting in a large number of reports of spurious gene-environment interactions in the medical literature (Colhoun, McKeigue, and Davey Smith, 2003). The presence or absence of statistical interactions depends on the scale used (i.e., linear or logarithmic association between the exposure and disease outcome), and it is difficult to define whether deviation from either an additive or multiplicative model exists, given the imprecision of estimation. Measurement error, particularly if differential with respect to other factors influencing disease risk, makes interactions both difficult to detect and often misleading when they are apparently found (Clayton and McKeigue, 2001). Furthermore, the biological implications of interactions (however defined) are generally uncertain (Thompson, 1991).

The situation may be different with exposures that differ qualitatively rather than quantitatively between individuals. Consider the possible influence of smoking tobacco on bladder cancer risk. Observational studies suggest an association, but clearly confounding and a variety of biases could generate such an association. The potential carcinogens in tobacco smoke of relevance to bladder cancer include aromatic and heterocyclic amines, which are detoxified by N-acetyl transferase 2 (NAT2). Genetic variation in NAT2 enzyme levels leads to slower or faster acetylation states.
If the carcinogens in tobacco smoke do increase the risk of bladder cancer, then it would be expected that slow acetylators, who have a reduced rate of detoxification of these carcinogens, would be at an increased risk of bladder cancer if they were smokers, whereas if they were not exposed to these carcinogens (and the major exposure route for those outside of particular industries is through tobacco smoke) then an association of genotype with bladder cancer risk would not be anticipated. Table 16-2 tabulates findings from the largest study to date reported in a way that allows analysis of this simple hypothesis (Garcia-Closas et al., 2005). As can be seen, the influence of the NAT2 slow acetylation genotype is appreciable only among those exposed to heavy smoking. Since the genotype will be unrelated to confounders, it is difficult to reason why this situation should arise, unless smoking is a causal factor with respect to bladder cancer. Thus the presence of a sizable effect of genotype in heavy smokers but not nonsmokers provides evidence of the causal nature of an environmentally modifiable risk factor, in this example, smoking. However, gene-environment interactions interpreted within the Mendelian randomization framework are not protected from confounding in the same way as the main genetic effects are.

TABLE 16-2 Association of NAT2 Slow Acetylation Genotype with Bladder Cancer in Never and Ever Smokers and Overall

Odds Ratio (95% confidence intervals)

Overall          Never Smokers    Ever Smokers
1.4 (1.2-1.7)    0.9 (0.6-1.3)    1.6 (1.3-1.9)

SOURCE: Garcia-Closas et al. (2005).

Problems and Limitations of Mendelian Randomization

We consider Mendelian randomization to be one of the brightest current prospects for improving causal understanding in population-based studies. There are, however, several potential limitations to the application of this methodology (Davey Smith and Ebrahim, 2003; Little and Khoury, 2003), which we discuss below.

Failure to Establish Reliable Genotype-Intermediate Phenotype or Genotype-Disease Associations

If the associations between genotype and a potential intermediate phenotype, or between genotype and disease outcome, are not reliably estimated, then interpreting these associations in terms of their implications for potential environmental causes of disease will clearly be inappropriate. This is not an issue peculiar to Mendelian randomization; instead, the nonreplicable nature of perhaps most apparent findings in genetic association studies is a serious limitation to the whole enterprise. In Box 16-1 we summarize possible reasons for the nonreplication of findings (Cardon and Bell, 2001; Colhoun et al., 2003). Population stratification (that is, the confounding of genotype-disease associations by factors related to subpopulation group membership in the overall population in a study) is unlikely to be a major problem in most situations (Wacholder, Rothman, and Caporaso, 2000; Wacholder et al., 2002; Palmer and Cardon, 2005).
Genotyping errors can, of course, lead to failures of replication of genotype-disease associations. When intermediate phenotypes can be measured, as in the case of CRP, a demonstration of the expected relationship between genotype and intermediate phenotype in such studies indicates that genotyping errors are not to blame.

Regarding failure to replicate results in genetic epidemiology, true variation between studies is clearly possible; for example, people heterozygous for familial hypercholesterolemia seem to experience increased mortality only in populations with substantial dietary fat intake and the presence of other CHD risk factors (Sijbrands et al., 2001; Pimstone et al., 1998). Nevertheless, the major factor for nonreplication is probably inadequate statistical power (generally reflecting limited sample size), coupled with publication bias (Colhoun et al., 2003).

BOX 16-1 Reasons for Inconsistent Genotype-Phenotype Associations

True variation
  Variation of allelic association between subpopulations: (1) disease-causing allele in linkage disequilibrium with different marker alleles in different populations; or (2) different variants within the same gene contribute to disease risk in different populations
  Effect modification by other genetic or environmental factors that vary between populations

Spurious variation
  Genotyping errors
  Misclassification of phenotype
  Confounding by population structure
  Lack of power
  Chance
  Publication bias

SOURCES: Cardon and Bell (2001); Colhoun et al. (2003).

Confounding of Genotype: Environmentally Modifiable Risk Factor-Disease Associations

The power of Mendelian randomization lies in its ability to avoid the often substantial confounding seen in conventional observational epidemiology. However, confounding can be reintroduced into Mendelian randomization studies and needs to be considered when interpreting the results.

Linkage Disequilibrium

It is possible that the locus under study is in linkage disequilibrium with another polymorphic locus. Confounding will result if the study locus and that with which it is in linkage disequilibrium are both associated with the outcome of interest. It may seem unlikely, given the
relatively short distances over which linkage disequilibrium is seen in the human genome, that a polymorphism influencing, say, CHD risk would be associated with another polymorphism influencing CHD risk (and thus producing confounding). There are, nevertheless, cases of different genes influencing the same metabolic pathway being in physical proximity. For example, different polymorphisms influencing alcohol metabolism appear to be in linkage disequilibrium (Osier et al., 2002).

Pleiotropy and the Multifunction of Genes

Mendelian randomization is most useful when it can be used to relate a single intermediate phenotype to a disease outcome. However, polymorphisms may (and probably often will) influence more than one intermediate phenotype, and this may mean that they proxy for more than one environmentally modifiable risk factor. This can be the case through multiple effects mediated by their RNA expression or immediate protein coding, through alternative splicing, in which one polymorphic region contributes to alternative forms of more than one protein (Glebart, 1998), or through other mechanisms. The most robust interpretations will be possible when the functional polymorphism appears to directly influence the level of the intermediate phenotype of interest (as in the CRP example), but such examples are probably going to be less common in Mendelian randomization than cases in which the polymorphism can influence several systems, with different potential interpretations of how the effect on outcome is generated. The association of possession of the ApoE-2 allele with cholesterol levels and coronary heart disease might be an example of pleiotropic effects, since carriers of this allele have lower cholesterol levels but do not have the degree of protection against coronary heart disease that would be anticipated from this (Keavney et al., 2004; Song, Stampfer, and Liu, 2004).
In addition to lower cholesterol levels, the ApoE-2 allele is associated with less efficient transfer of very low density lipoproteins and chylomicrons from the blood to the liver, greater postprandial lipemia, and an increased risk of type III hyperlipidemia (Smith, 2002; Eichner et al., 2002). These differences will accompany the lower cholesterol levels and may counterbalance the predicted benefits.

Multiple Instruments as an Approach to Confounding in Mendelian Randomization

Linkage disequilibrium and pleiotropy can reintroduce confounding and vitiate the power of the Mendelian randomization approach. Genomic knowledge may help in estimating the degree to which these
are likely to be problems in any particular Mendelian randomization study, through, for instance, explication of genetic variants that may be in linkage disequilibrium with the variant under study, or the function of a particular variant and its known pleiotropic effects. Furthermore, genetic variation can be related to measures of potential confounding factors in each study, and the magnitude of such confounding estimated. Empirical studies to date suggest that common genetic variants are largely unrelated to the behavioral and socioeconomic factors considered to be important confounders in conventional observational studies (Smits et al., 2004; Bhatti et al., 2005; Davey Smith et al., 2005b, 2007; Chatterjee, Kalaylioglu, and Carroll, 2005; Umbach and Weinberg, 1997). However, relying on measurement of confounders does, of course, remove the central purpose of Mendelian randomization, which is to balance unmeasured as well as measured confounders (as randomization does in randomized controlled trials).

It may be possible to identify two separate genetic variants, which are not in linkage disequilibrium with each other, but which both serve as proxies for the environmentally modifiable risk factor of interest. If both variants are related to the outcome of interest and point to the same underlying association, then it becomes much less plausible that reintroduced confounding explains the association, since it would have to be acting in the same way for these two unlinked variants. This can be likened to randomized controlled trials of different blood pressure-lowering agents, which work through different mechanisms and have different potential side effects but lower blood pressure to the same degree. If the different agents produce the same reductions in cardiovascular disease risk, then it is unlikely that this is through agent-specific effects of the drugs; instead, it points to blood pressure lowering as being key.
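The reasoning about two unlinked variants can be sketched numerically. The example below assumes the simple instrumental-variable ratio estimate (the genotype-outcome association divided by the genotype-exposure association), which is a standard estimator but is not specified in this chapter. All numbers and function names are hypothetical, and the standard error ignores uncertainty in the genotype-exposure association. If both variants proxy for the same exposure, their ratio estimates should agree within sampling error.

```python
import math


def ratio_estimate(beta_gy, se_gy, beta_gx):
    """Ratio estimate of the exposure-outcome effect from a single variant.

    beta_gy: genotype-outcome association (e.g., a log odds ratio per allele)
    se_gy:   standard error of beta_gy
    beta_gx: genotype-exposure association (exposure units per allele)
    Simplifying assumption: beta_gx is treated as known without error.
    """
    estimate = beta_gy / beta_gx
    std_error = se_gy / abs(beta_gx)
    return estimate, std_error


def concordance_z(est1, se1, est2, se2):
    """z statistic for the difference between two independent estimates."""
    return (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)


# Hypothetical summary statistics for two unlinked variants (illustration only).
variant_a = ratio_estimate(beta_gy=0.10, se_gy=0.03, beta_gx=0.50)
variant_b = ratio_estimate(beta_gy=0.04, se_gy=0.02, beta_gx=0.25)

z = concordance_z(*variant_a, *variant_b)
print("variant A estimate and SE:", variant_a)
print("variant B estimate and SE:", variant_b)
print("z for difference:", z)
```

A z statistic near zero is consistent with both variants pointing to the same underlying association; a large value would suggest that at least one variant is affected by linkage disequilibrium or pleiotropy. Concordance does not rule out reintroduced confounding; it only makes it less plausible.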
The use of multiple genetic variants working through different pathways has not been applied in Mendelian randomization to date, but it represents an important potential development in the methodology.

Canalization and Developmental Stability

Perhaps a greater potential problem for Mendelian randomization than reintroduced confounding arises from the developmental compensation that may occur through a polymorphic genotype being expressed during fetal or early postnatal development, and thus influencing development in such a way as to buffer against the effect of the polymorphism. Such compensatory processes have been discussed since C.H. Waddington introduced the notion of canalization in the 1940s (Waddington, 1942). Canalization refers to the buffering of the effects of either environmental or genetic forces attempting to perturb development, and Waddington's ideas have been well developed both empirically and theoretically (Wilkins, 1997; Rutherford, 2000; Gibson and Wagner, 2000; Hartman, Garvik, and Hartwell, 2001; Debat and David, 2001; Kitami and Nadeau, 2002; Gu et al., 2003; Hornstein and Shomron, 2006). Such buffering can be achieved either through genetic redundancy (more than one gene having the same or similar function) or through alternative metabolic routes, in which the complexity of metabolic pathways allows recruitment of different pathways to reach the same phenotypic end point. In effect, a functional polymorphism expressed during fetal development or postnatal growth may influence the expression of a wide range of other genes, leading to changes that may compensate for the influence of the polymorphism.

In the field of animal genetic engineering studies, such as knockout preparations or transgenic animals manipulated so as to overexpress foreign DNA, the interpretive problem created by developmental compensation is well recognized (Morange, 2001; Shastry, 1998; Gerlai, 2001; Williams and Wagner, 2000). Conditional preparations, in which the level of transgene expression can be induced or suppressed through the application of external agents, are now being utilized to investigate the influence of such altered gene expression after the developmental stages during which compensation can occur (Bolon and Galbreath, 2002). Thus further evidence on the issue of genetic buffering should emerge to inform interpretations of both animal and human studies.

Most examples of developmental compensation relate to dramatic genetic or environmental insults, so it is unclear whether the generally small phenotypic differences induced by common functional polymorphisms will be sufficient to induce compensatory responses.
The fact that the large gene-environment interactions that have been observed often relate to novel exposures (e.g., drug interactions) that have not been present during the evolution of a species (Wright et al., 2002) may indicate that homogenization of response to exposures that are widely experienced (as would be the case with the products of functional polymorphisms or common mutations) has occurred; canalizing mechanisms could be particularly relevant in these cases. Further work on the basic mechanisms of developmental stability and how this relates to relatively small exposure differences during development will allow these considerations to be understood. Knowledge of the stage of development at which a genetic variant has functional effects will also allow the potential of developmental compensation to buffer the response to the variant to be assessed.

In some Mendelian randomization designs, developmental compensation is not an issue. For example, when maternal genotype is utilized as an indicator of the intrauterine environment, then the response of the fetus will not differ whether the effect is induced by maternal genotype
or by environmental perturbation, and the effect on the fetus can be taken to indicate the effect of environmental influences during the intrauterine period. Also, in cases in which a variant influences an adulthood environmental exposure (for example, ALDH2 variation and alcohol intake), developmental compensation to genotype will not be an issue. In many cases of gene-environment interaction interpreted with respect to causality of the environmental factor, the same applies. However, in some situations there remains the somewhat unsatisfactory position of Mendelian randomization facing a potential problem that cannot currently be adequately assessed.

Lack of Suitable Genetic Variants to Proxy for the Exposure of Interest

An obvious limitation of Mendelian randomization is that it can only examine areas for which there are functional polymorphisms (or genetic markers linked to such functional polymorphisms) that are relevant to the modifiable exposure of interest. In the context of genetic association studies more generally, it has been pointed out that, in many cases, even if a locus is involved in a disease-related metabolic process, there may be no suitable marker or functional polymorphism to allow study of this process (Weiss and Terwilliger, 2000). One of our examples of how observational epidemiology appeared to have got the wrong answer, used in an earlier paper (Davey Smith and Ebrahim, 2003), related to vitamin C, so we considered whether the association between vitamin C and coronary heart disease could have been studied utilizing the principles of Mendelian randomization.
We stated that polymorphisms exist that are related to lower circulating vitamin C levels, for example, the haptoglobin polymorphism (Langlois, Delanghe, De Buyzere, Bernard, and Ouyang, 1997; Delanghe, Langlois, Duprez, De Buyzere, and Clement, 1999), but in this case the effect on vitamin C is at some distance from the polymorphic protein and, as in the apolipoprotein E example, other phenotypic differences could have an influence on CHD risk that would distort examination of the influence of vitamin C levels through relating genotype to disease. SLC23A1, a gene encoding the vitamin C transporter SVCT1, which is involved in vitamin C transport by intestinal cells, would be an attractive candidate for Mendelian randomization studies (Erichsen, Eck, Levine, and Chanock, 2001). However, by 2003 (the date of our earlier paper) a search for variants had failed to find any common single-nucleotide polymorphism that could be used in such a way. Since then, functional variation in SLC23A1 that is related to circulating vitamin C levels has been identified (Timpson et al., personal communication). Rapidly developing knowledge of human genomics will identify more variants that can serve as instruments for Mendelian randomization studies.
Conclusions: Mendelian randomization, what it is, and what it is not

Mendelian randomization is not predicated on the presumption that genetic variants are major determinants of health and disease in populations. There are many cogent critiques of genetic reductionism and the overselling of "discoveries" in genetics that reiterate obvious truths so clearly (albeit somewhat repetitively) that there is no need to repeat them here (e.g., Berkowitz, 1996; Baird, 2000; Holtzman, 2001; Strohman, 1993; Rose, 1995). Mendelian randomization does not depend on there being genes "for" particular traits, and certainly not in the strict sense of a gene "for" a trait being one that is maintained by selection because of its causal association with that trait (Kaplan and Pigliucci, 2001). The association of genotype and the environmentally modifiable factor that it proxies for will be, like most genotype-phenotype associations, one that is contingent and cannot be reduced to individual-level prediction, but within environmental limits will pertain at a group level (Wolf, 1995). This is analogous to a randomized controlled trial of antihypertensive agents, in which at a collective level the group randomized to active medication will have lower mean blood pressure than the group randomized to placebo, but at an individual level many participants randomized to active treatment will have higher blood pressure than many individuals randomized to placebo. These group-level differences are what create the analogy between Mendelian randomization and randomized controlled trials, outlined in Figure 16-4. Finally, the associations that Mendelian randomization depends on do need to pertain to a definable group at a particular time, but they do not need to be immutable.
Thus ALDH2 variation will not be related to alcohol consumption in a society in which alcohol is not consumed, and the association will vary by gender and by cultural group and may change over time (Higuchi et al., 1994; Hasin et al., 2002). Within the setting of a study of a well-defined group, however, the genotype will be associated with group-level differences in alcohol consumption, and group assignment will not be associated with confounding variables.

Mendelian Randomization and Genetic Epidemiology

Critiques of contemporary genetic epidemiology often focus on two features of findings from genetic association studies: that the population-attributable risk of the genetic variants is low, and that in any case the influence of genetic factors is not reversible (Terwilliger and Weiss, 2003). These evaluations of the role of genetic epidemiology are not relevant when considering the potential contributions of Mendelian randomization. This approach is not concerned with the population attributable risk of any particular genetic variant, but with the degree to which associations between the genetic variant and disease outcomes can demonstrate the importance of environmentally modifiable factors as causes of disease. Consider, for example, the case of familial hypercholesterolemia or familial defective ApoB. The genetic mutations associated with these conditions will account for only a trivial percentage of cases of coronary heart disease in the population; that is, the population attributable risk will be low, despite a high relative risk of coronary heart disease (Tybjaerg H et al., 1998). However, by identifying blood cholesterol levels as a causal factor for coronary heart disease, the triangulation between genotype, blood cholesterol, and CHD risk identifies an environmentally modifiable factor with a very high population attributable risk: assuming that 50 percent of the population have raised blood cholesterol above 6.0 mmol/l, and that this is associated with a twofold relative risk, a population attributable risk of 33 percent is obtained (using the standard formula p(RR - 1)/[1 + p(RR - 1)], with prevalence p = 0.5 and relative risk RR = 2, this is 0.5/1.5). The same reasoning applies to the nonmodifiable nature of genotype-disease associations. The point of Mendelian randomization approaches is to utilize these associations to strengthen inferences regarding modifiable environmental risks for disease and then reduce disease risk in the population through applying this knowledge.

Mendelian randomization differs from other contemporary approaches to genetic epidemiology in that its central concern is not with the magnitude of genetic variant influences on disease, but rather with what the genetic associations tell us about environmentally modifiable causes of disease. As David B. Abrams, director of the Office of Behavioral and Social Sciences Research at the U.S.
National Institutes of Health, has said, "The more we learn about genes, the more we see how important environment and lifestyle really are." Many years earlier, the pioneering geneticist Thomas Hunt Morgan articulated a similar sentiment in his Nobel Prize acceptance speech, when he contrasted his views with the then popular genetic approach to disease, eugenics. He thought that "through public hygiene and protective measures of various kinds we can more successfully cope with some of the evils that human flesh is heir to. Medical science will here take the lead, but I hope that genetics can at times offer a helping hand" (Morgan, 1935). More than seven decades later, it may now be time for genetic research to directly strengthen the knowledge base of public health.

REFERENCES

Alpha-Tocopherol, Beta Carotene Cancer Prevention Study Group. (1994). The effect of vitamin E and beta carotene on the incidence of lung cancer and other cancers in male smokers. New England Journal of Medicine, 330, 1029-1035.
Baird, P. (2000). Genetic technologies and achieving health for populations. International Journal of Health Services, 30, 407-424.
Beral, V., Banks, E., and Reeves, G. (2002). Evidence from randomized trials of the long-term effects of hormone replacement therapy. Lancet, 360, 942-944.
Berkowitz, A. (1996). Our genes, ourselves? Bioscience, 46, 42-51.
Bhatti, P., Sigurdson, A.J., Wang, S.S., Chen, J., Rothman, N., Hartge, P., Bergen, A.W., and Landi, M.T. (2005). Genetic variation and willingness to participate in epidemiological research: Data from three studies. Cancer Epidemiology Biomarkers and Prevention, 14, 2449-2453.
Bolon, B., and Galbreath, E. (2002). Use of genetically engineered mice in drug discovery and development: Wielding Occam's razor to prune the product portfolio. International Journal of Toxicology, 21, 55-64.
Bovet, P., and Paccaud, F. (2001). Alcohol, coronary heart disease, and public health: Which evidence-based policy? International Journal of Epidemiology, 30, 734-737.
Burr, M.L., Fehily, A.M., Butland, B.K., Bolton, C.H., and Eastham, R.D. (1986). Alcohol and high-density-lipoprotein cholesterol: A randomised controlled trial. British Journal of Nutrition, 56, 81-86.
Cardon, L.R., and Bell, J.I. (2001). Association study designs for complex diseases. Nature Reviews: Genetics, 2, 91-99.
Casas, J.P., et al. (2006). Insight into the nature of the CRP-coronary event association using Mendelian randomization. International Journal of Epidemiology, 35, 922-931.
Chao, Y.-C., Liou, S.-R., Chung, Y.-Y., Tang, H.-S., Hsu, C.-T., Li, T.-K., and Yin, S.-J. (1994). Polymorphism of alcohol and aldehyde dehydrogenase genes and alcoholic cirrhosis in Chinese patients. Hepatology, 19, 360-366.
Chatterjee, N., Kalaylioglu, Z., and Carroll, R.J. (2005). Exploiting gene-environment independence in family-based case-control studies: Increased power for detecting associations, interactions, and joint effects. Genetic Epidemiology, 28, 138-156.
Clayton, D., and McKeigue, P.M. (2001). Epidemiological methods for studying genes and environmental factors in complex diseases. Lancet, 358, 1356-1360.
Colhoun, H., McKeigue, P.M., and Davey Smith, G. (2003). Problems of reporting genetic associations with complex outcomes. Lancet, 361, 865-872.
Danesh, J., Wheeler, J.G., Hirschfield, G.M., Eda, S., Eiriksdottir, G., Rumley, A., Lowe, G.D.O., Pepys, M.B., and Gudnason, V. (2004). C-reactive protein and other circulating markers of inflammation in the prediction of coronary heart disease. New England Journal of Medicine, 350, 1387-1397.
Davey Smith, G. (2006). Cochrane lecture: Randomised by (your) god: Robust inference from an observational study design. Journal of Epidemiology and Community Health, 60, 382-388.
Davey Smith, G., and Ebrahim, S. (2002). Data dredging, bias, or confounding (editorial). British Medical Journal, 325, 1437-1438.
Davey Smith, G., and Ebrahim, S. (2003). Mendelian randomization: Can genetic epidemiology contribute to understanding environmental determinants of disease? International Journal of Epidemiology, 32, 1-22.
Davey Smith, G., and Ebrahim, S. (2004). Mendelian randomization: Prospects, potentials, and limitations. International Journal of Epidemiology, 33, 30-42.
Davey Smith, G., and Ebrahim, S. (2005). What can Mendelian randomization tell us about modifiable behavioural and environmental exposures? British Medical Journal, 330, 1076-1079.
Davey Smith, G., Harbord, R., Milton, J., Ebrahim, S., and Sterne, J.A.C. (2005a). Does elevated plasma fibrinogen increase the risk of coronary heart disease? Evidence from a meta-analysis of genetic association studies. Arteriosclerosis, Thrombosis, and Vascular Biology, 25, 2228-2233.
Davey Smith, G., Lawlor, D., Harbord, R., Timpson, N., Rumley, A., Lowe, G., Day, I., and Ebrahim, S. (2005b). Association of C-reactive protein with blood pressure and hypertension: Lifecourse confounding and Mendelian randomization tests of causality. Arteriosclerosis, Thrombosis, and Vascular Biology, 25, 1051-1056.
Davey Smith, G., Lawlor, D., Harbord, R., Timpson, N., Day, I., and Ebrahim, S. (2007). Clustered environments and randomized genes: A fundamental distinction between conventional and genetic epidemiology. Public Library of Science Medicine.
Davey Smith, G., and Phillips, A.N. (1996). Inflation in epidemiology: The proof and measurement of association between two things revisited. British Medical Journal, 312, 1659-1661.
Debat, V., and David, P. (2001). Mapping phenotypes: Canalization, plasticity, and developmental stability. Trends in Ecology and Evolution, 16, 555-561.
Delanghe, J., Langlois, M., Duprez, D., De Buyzere, M., and Clement, D. (1999). Haptoglobin polymorphism and peripheral arterial occlusive disease. Atherosclerosis, 145, 287-292.
Didelez, V., and Sheehan, N.A. (2007). Mendelian randomization: Why epidemiology needs a formal language for causality. In F. Russo and J. Williamson (Eds.), Causality and Probability in the Sciences (pp. 1-30). London: College Publications.
Eichner, J.E., Dunn, S.T., Perveen, G., Thompson, D.M., Stewart, K.E., and Stroehla, B.C. (2002). Apolipoprotein E polymorphism and cardiovascular disease: A HuGE Review. American Journal of Epidemiology, 155, 487-495.
Eidelman, R.S., Hollar, D., Hebert, P.R., Lamas, G.A., and Hennekens, C.H. (2004). Randomized trials of vitamin E in the treatment and prevention of cardiovascular disease. Archives of Internal Medicine, 164, 1552-1556.
Enomoto, N., Takase, S., Yasuhara, M., and Takada, A. (1991). Acetaldehyde metabolism in different aldehyde dehydrogenase-2 genotypes. Alcoholism: Clinical and Experimental Research, 15, 141-144.
Erichsen, H.C., Eck, P., Levine, M., and Chanock, S. (2001). Characterization of the genomic structure of the human vitamin C transporter SVCT1 (SLC23A2). Journal of Nutrition, 131, 2623-2627.
Ewbank, D. (2001). Demography in the age of genomics: A first look at the prospects. In National Research Council, Cells and surveys: Should biological measures be included in social science research? (pp. 64-109), Committee on Population, C.E. Finch, J.E. Vaupel, and K. Kinsella (Eds.). Washington, DC: National Academy Press.
Garcia-Closas, M., et al. (2005). NAT2 slow acetylation, GSTM1 null genotype, and risk of bladder cancer: Results from the Spanish Bladder Cancer Study and meta-analyses. Lancet, 366, 649-659.
Gause, G.F. (1942). The relation of adaptability to adaptation. Quarterly Review of Biology, 17, 99-114.
Gerlai, R. (2001). Gene targeting: Technical confounds and potential solutions in behavioural and brain research. Behavioural Brain Research, 125, 13-21.
Gibson, G., and Wagner, G. (2000). Canalization in evolutionary genetics: A stabilizing theory? BioEssays, 22, 372-380.
Glebart, W.M. (1998). Databases in genomic research. Science, 282, 659-661.
Glynn, R.K. (2006). Commentary: Genes as instruments for evaluation of markers and causes. International Journal of Epidemiology, 35, 932-934.
Goldschmidt, R.B. (1938). Physiological genetics. New York: McGraw Hill.
Gu, Z., Steinmetz, L.M., Gu, X., Scharfe, C., Davis, R.W., and Li, W.-H. (2003). Role of duplicate genes in genetic robustness against null mutations. Nature, 421, 63-66.
Han, T.S., Sattar, N., Williams, K., Gonzalez-Villalpando, C., Lean, M.E., and Haffner, S.M. (2002). Prospective study of C-reactive protein in relation to the development of diabetes and metabolic syndrome in the Mexico City diabetes study. Diabetes Care, 25, 2016-2021.
Hart, C., Davey Smith, G., Hole, D., and Hawthorne, V. (1999). Alcohol consumption and mortality from all causes, coronary heart disease, and stroke: Results from a prospective cohort study of Scottish men with 21 years of follow up. British Medical Journal, 318, 1725-1729.
Hartman, J.L., Garvik, B., and Hartwell, L. (2001). Principles for the buffering of genetic variation. Science, 291, 1001-1004.
Hasin, D., Aharonovich, E., Liu, X., Mamman, Z., Matseoane, K., Carr, L., and Li, T.K. (2002). Alcohol and ADH2 in Israel: Ashkenazis, Sephardics, and recent Russian immigrants. American Journal of Psychiatry, 159(8), 1432-1434.
Haskell, W.L., Camargo, C., Williams, P.T., Vranizan, K.M., Krauss, R.M., Lindgren, F.T., and Wood, P.D. (1984). The effect of cessation and resumption of moderate alcohol intake on serum high-density-lipoprotein subfractions. New England Journal of Medicine, 310, 805-810.
Heart Protection Study Collaborative Group. (2002). MRC/BHF Heart Protection Study of antioxidant vitamin supplementation in 20536 high-risk individuals: A randomised placebo-controlled trial. Lancet, 360, 23-33.
Higuchi, S., Matsushita, S., Imazeki, H., Kinoshita, T., Takagi, S., and Kono, H. (1994). Aldehyde dehydrogenase genotypes in Japanese alcoholics. Lancet, 343, 741-742.
Hirschfield, G.M., and Pepys, M.B. (2003). C-reactive protein and cardiovascular disease: New insights from an old molecule. Quarterly Journal of Medicine, 9, 793-807.
Holtzman, N.A. (2001). Putting the search for genes in perspective. International Journal of Health Services, 31, 445.
Hornstein, E., and Shomron, N. (2006). Canalization of development by microRNAs. Nature Genetics, 38, S20-S24.
Hu, F.B., Meigs, J.B., Li, T.Y., Rifai, N., and Manson, J.E. (2004). Inflammatory markers and risk of developing type 2 diabetes in women. Diabetes, 53, 693-700.
Jablonka-Tavory, E. (1982). Genocopies and the evolution of interdependence. Evolutionary Theory, 6, 167-170.
Jousilahti, P., and Salomaa, V. (2004). Fibrinogen, social position, and Mendelian randomisation. Journal of Epidemiology and Community Health, 58, 883.
Kaplan, J.M., and Pigliucci, M. (2001). Genes "for" phenotypes: A modern history view. Biology and Philosophy, 16, 189-213.
Keavney, B., Danesh, J., Parish, S., Palmer, A., Clark, S., Youngman, L., Delépine, M., Lathrop, M., Peto, R., and Collins, R. (2006). The International Studies of Infarct Survival (ISIS) collaborators. Fibrinogen and coronary heart disease: Test of causality by Mendelian randomization. International Journal of Epidemiology, 35, 935-943.
Keavney, B., Palmer, A., Parish, S., Clark, S., Youngman, L., Danesh, J., McKenzie, C., Delépine, M., Lathrop, M., Peto, R., and Collins, R. (2004). Lipid-related genes and myocardial infarction in 4685 cases and 3460 controls: Discrepancies between genotype, blood lipid concentrations, and coronary disease risk. International Journal of Epidemiology, 33, 1002-1013.
Kitami, T., and Nadeau, J.H. (2002). Biochemical networking contributes more to genetic buffering in human and mouse metabolic pathways than does gene duplication. Nature Genetics, 32, 191-194.
Klatsky, A.L. (2001). Commentary: Could abstinence from alcohol be hazardous to your health? International Journal of Epidemiology, 30, 739-742.
Lancet. (1999). Dietary supplementation with n-3 polyunsaturated fatty acids and vitamin E after myocardial infarction: Results of the GISSI-Prevenzione trial. Gruppo Italiano per lo Studio della Sopravvivenza nell'Infarto miocardico. Lancet, 354(9177), 447-455.
Langlois, M.R., Delanghe, J.R., De Buyzere, M.L., Bernard, D.R., and Ouyang, J. (1997). Effect of haptoglobin on the metabolism of vitamin C. American Journal of Clinical Nutrition, 66, 606-610.
Lawlor, D.A., Harbord, R.M., Sterne, J.A.C., and Davey Smith, G. (2007). Mendelian randomization: Using genes as instruments for making causal inferences in epidemiology. Statistics in Medicine.
Little, J., and Khoury, M.J. (2003). Mendelian randomization: A new spin or real progress? Lancet, 362, 930-931.
Manson, J., Stampfer, M.J., Willett, W.C., Colditz, G., Rosner, B., Speizer, F.E., and Hennekens, C.H. (1991). A prospective study of antioxidant vitamins and incidence of coronary heart disease in women. Circulation, 84(Suppl. II), 546.
Marmot, M. (2001). Reflections on alcohol and coronary heart disease. International Journal of Epidemiology, 30, 729-734.
Morange, M. (2001). The misunderstood gene. Cambridge, MA: Harvard University Press.
Morgan, T.H. (1935). The relation of genetics to physiology and medicine. Scientific Monthly, 41, 5-18.
Mucci, L.A., Wedren, S., Tamimi, R.M., Trichopoulos, D., and Adami, H.O. (2001). The role of gene-environment interaction in the aetiology of human cancer: Examples from cancers of the large bowel, lung, and breast. Journal of Internal Medicine, 249, 477-493.
National Research Council. (2001). Cells and surveys: Should biological measures be included in social science research? Committee on Population, C.E. Finch, J.E. Vaupel, and K. Kinsella (Eds.). Washington, DC: National Academy Press.
Omenn, G.S., Goodman, G.E., Thornquist, M.D., Balmes, J., Cullen, M.R., Glass, A., Keogh, J.P., Meyskens, F.L., Valanis, B., Williams, J.H., Barnhart, S., and Hammar, S. (1996). Effects of a combination of beta carotene and vitamin A on lung cancer and cardiovascular disease. New England Journal of Medicine, 334, 1150-1155.
Osganian, S.K., Stampfer, M.J., Rimm, E., Spiegelman, D., Hu, F.B., Manson, J.E., and Willett, W.C. (2003). Vitamin C and risk of coronary heart disease in women. Journal of the American College of Cardiology, 42, 246-252.
Osier, M.V., Pakstis, A.J., Soodyall, H., Comas, D., Goldman, D., Odunsi, A., Okonofua, F., Parnas, J., Schulz, L.O., Bertranpetit, J., Bonne-Tamir, B., Lu, R.-B., Kidd, J.R., and Kidd, K.K. (2002). A global perspective on genetic variation at the ADH genes reveals unusual patterns of linkage disequilibrium and diversity. American Journal of Human Genetics, 71, 84-99.
Palmer, L., and Cardon, L. (2005). Shaking the tree: Mapping complex disease genes with linkage disequilibrium. Lancet, 366, 1223-1234.
Perera, F.P. (1997). Environment and cancer: Who are susceptible? Science, 278, 1068-1073.
Pimstone, S.N., Sun, X.-M., du Souich, C., Frohlich, J.J., Hayden, M.R., and Soutar, A.K. (1998). Phenotypic variation in heterozygous familial hypercholesterolaemia. Arteriosclerosis, Thrombosis, and Vascular Biology, 18, 309-315.
Pradhan, A.D., Manson, J.E., Rifai, N., Buring, J.E., and Ridker, P.M. (2001). C-reactive protein, interleukin 6, and risk of developing type 2 diabetes mellitus. Journal of the American Medical Association, 286, 327-334.
Ridker, P.M., Cannon, C.P., Morrow, D., Rifai, N., Rose, L.M., McCabe, C.H., Pfeffer, M.A., and Braunwald, E. (2005). C-reactive protein levels and outcomes after statin therapy. New England Journal of Medicine, 352, 20-28.
Rimm, E. (2001). Commentary: Alcohol and coronary heart disease - Laying the foundation for future work. International Journal of Epidemiology, 30, 738-739.
Rimm, E.B., Stampfer, M.J., Ascherio, A., Giovannucci, E., Colditz, G.A., and Willett, W.C. (1993). Vitamin E consumption and the risk of coronary heart disease in men. New England Journal of Medicine, 328, 1450-1456.
Rose, S. (1995). The rise of neurogenetic determinism. Nature, 373, 380-382.
Rutherford, S.L. (2000). From genotype to phenotype: Buffering mechanisms and the storage of genetic information. BioEssays, 22, 1095-1105.
Sesso, D., Buring, J.E., Rifai, N., Blake, G.J., Gaziano, J.M., and Ridker, P.M. (2003). C-reactive protein and the risk of developing hypertension. Journal of the American Medical Association, 290, 2945-2951.
Shaper, A.G. (1993). Editorial: Alcohol, the heart, and health. American Journal of Public Health, 83, 799-801.
Shastry, B.S. (1998). Gene disruption in mice: Models of development and disease. Molecular and Cellular Biochemistry, 181, 163-179.
Sijbrands, E.J.G., Westendorp, R.G.J., Defesche, J.C., De Meier, P.H.E.M., Smelt, A.H.M., and Kastelein, J.J.P. (2001). Mortality over two centuries in large pedigree with familial hypercholesterolaemia: Family tree mortality study. British Medical Journal, 322, 1019-1023.
Sjöholm, A., and Nyström, T. (2005). Endothelial inflammation in insulin resistance. Lancet, 365, 610-612.
Smith, J. (2002). Apolipoproteins and aging: Emerging mechanisms. Ageing Research Reviews, 1, 345-365.
Smits, K.M., et al. (2004). Association of metabolic gene polymorphisms with tobacco consumption in healthy controls. International Journal of Cancer, 110(2), 266-270.
Song, Y., Stampfer, M.J., and Liu, S. (2004). Meta-analysis: Apolipoprotein E genotypes and risk for coronary heart disease. Annals of Internal Medicine, 137-147.
Spearman, C. (1904). The proof and measurement of association between two things. American Journal of Psychology, 15, 72-101.
Stampfer, M.J., and Colditz, G.A. (1991). Estrogen replacement therapy and coronary heart disease: A quantitative assessment of the epidemiologic evidence. Preventive Medicine, 20, 47-63.
Stampfer, M.J., Hennekens, C.H., Manson, J.E., Colditz, G.A., Rosner, B., and Willett, W.C. (1993). Vitamin E consumption and the risk of coronary disease in women. New England Journal of Medicine, 328, 1444-1449.
Strohman, R.C. (1993). Ancient genomes, wise bodies, unhealthy people: The limits of a genetic paradigm in biology and medicine. Perspectives in Biology and Medicine, 37, 112-145.
Takagi, S., Iwai, N., Yamauchi, R., Kojima, S., Yasuno, S., Baba, T., Terashima, M., Tsutsumi, Y., Suzuki, S., Morii, I., Hanai, S., Ono, K., Baba, S., Tomoike, H., Kawamura, A., Miyazaki, S., Nonogi, H., and Goto, Y. (2002). Aldehyde dehydrogenase 2 gene is a risk factor for myocardial infarction in Japanese men. Hypertension Research, 25, 677-681.
Terwilliger, J.D., and Weiss, K.M. (2003). Confounding, ascertainment bias, and the blind quest for a genetic fountain of youth. Annals of Medicine, 35, 532-544.
Thomas, D.C., and Conti, D.V. (2004). Commentary on the concept of Mendelian randomization. International Journal of Epidemiology, 33, 17-21.
Thompson, W.D. (1991). Effect modification and the limits of biological inference from epidemiological data. Journal of Clinical Epidemiology, 44, 221-232.
Timpson, N.J., Lawlor, D.A., Harbord, R.M., Gaunt, T.R., Day, I.N.M., Palmer, L.J., Hattersley, A.T., Ebrahim, S., Lowe, G.D.O., Rumley, A., and Davey Smith, G. (2005). C-reactive protein and its role in metabolic syndrome: Mendelian randomization study. Lancet, 366, 1954-1959.
Tybjaerg-Hansen, A., Steffensen, R., Meinertz, H., Schnohr, P., and Nordestgaard, B.G. (1998). Association of mutations in the apolipoprotein B gene with hypercholesterolemia and the risk of ischemic heart disease. New England Journal of Medicine, 338, 1577-1584.
Umbach, D.M., and Weinberg, C.R. (1997). Designing and analysing case-control studies to exploit independence of genotype and exposure. Statistics in Medicine, 16, 1731-1743.
Verma, S., Szmitko, P.E., and Ridker, P.M. (2005). C-reactive protein comes of age. Nature Clinical Practice, 2, 29-36.
Wacholder, S., Rothman, N., and Caporaso, N. (2000). Population stratification in epidemiologic studies of common genetic variants and cancer: Quantification of bias. Journal of the National Cancer Institute, 92, 1151-1158.
Wacholder, S., Rothman, N., and Caporaso, N. (2002). Counterpoint: Bias from population stratification is not a major threat to the validity of conclusions from epidemiological studies of common polymorphisms and cancer. Cancer Epidemiology, Biomarkers and Prevention, 11, 513-520.
Waddington, C.H. (1942). Canalization of development and the inheritance of acquired characteristics. Nature, 150, 563-565.
Weiss, K., and Terwilliger, J. (2000). How many diseases does it take to map a gene with SNPs? Nature Genetics, 26, 151-157.
Wilkins, A.S. (1997). Canalization: A molecular genetic perspective. BioEssays, 19, 257-262.
Willett, W.C. (1990). Vitamin A and lung cancer. Nutrition Reviews, 48, 201-211.
Willett, W.C. (2001). Eat, drink, and be healthy: The Harvard Medical School guide to healthy eating. New York: Free Press.
Williams, R.S., and Wagner, P.D. (2000). Transgenic animals in integrative biology: Approaches and interpretations of outcome. Journal of Applied Physiology, 88, 1119-1126.
Wright, A.F., Carothers, A.D., and Campbell, H. (2002). Gene-environment interactions: The BioBank UK study. Pharmacogenomics Journal, 2, 75-82.
Wolf, U. (1995). The genetic contribution to the phenotype. Human Genetics, 95, 127-148.
Wu, T., Dorn, J.P., Donahue, R.P., Sempos, C.T., and Trevisan, M. (2002). Associations of serum C-reactive protein with fasting insulin, glucose, and glycosylated hemoglobin: The Third National Health and Nutrition Examination Survey, 1988-1994. American Journal of Epidemiology, 155, 65-71.