Read "Biologic Markers in Reproductive Toxicology" at NAP.edu


Suggested Citation:"Appendix: Assessing the Validity of Biologic Markers: Alpha-Fetoprotein." National Research Council. 1989. Biologic Markers in Reproductive Toxicology. Washington, DC: The National Academies Press. doi: 10.17226/774.



Appendix

Assessing the Validity of Biologic Markers: Alpha-Fetoprotein

The previous chapters associated with pregnancy issues have discussed potential biologic markers for use in toxicity evaluations during pregnancy; however, only alpha-fetoprotein has been evaluated in sufficient depth to allow for a rigorous evaluation of fetal and embryonic abnormalities. This detailed analysis is included to review and establish criteria for evaluating any proposed biologic marker of toxicity.

The identification of biologic markers that indicate exposure, effect, or susceptibility is a complicated process involving studies in animals, refinements in laboratory assays, and studies in special human populations. Moreover, even when a marker has been validated in such studies, its use in larger populations is not straightforward. This chapter discusses experience with the use, in large populations, of alpha-fetoprotein (AFP) in maternal serum and amniotic fluid to screen for neural tube defects (NTDs) in fetuses.

The NTDs anencephaly and spina bifida are among the most common and serious of congenital malformations (Warkany, 1971). Both arise in the first month of pregnancy. They are sometimes fatal and sometimes result in lifelong handicap. During the past decade, it has become possible to detect the presence of neural tube defects during fetal life by using concentrations of AFP as a biologic marker (Gastel et al., 1980; Mizejewski and Porter, 1985). Reviewing the use of AFP as a marker of fetal maldevelopment provides a useful framework for considering some practical aspects of the use of biologic markers of effect. Biologic markers sometimes are used primarily to assess the connection between a particular exposure and a particular disease, and they are sometimes used primarily to help in making personal or public health decisions. AFP is discussed here in the latter context, but the remarks are equally applicable to the former context.
Anencephaly is absence of the cranial vault with a degenerated or rudimentary brain. The defect is incompatible with extrauterine life; many affected infants are spontaneously aborted, and the remainder are stillborn or die soon after birth. Spina bifida is characterized by defective closure of the spinal cord. If the spinal cord tissue and meninges protrude through the vertebrae, the condition is called spina bifida cystica. The defect can occur anywhere on the spinal column, but is most common in the lumbar region. The protruding nervous tissue is sometimes covered with skin. Many affected fetuses die, but many are born alive and, depending on the location and seriousness of the lesion, can lead relatively normal lives, in spite of physical handicap. Recent advances in medical and surgical therapy have made it possible for many severely affected children to survive (Bamforth and Baird, 1989).

AFP was discovered about 30 years ago. It can be detected in the maternal serum by 29 days after conception (Bergstrand and Czar, 1957). It is synthesized early by the yolk sac and liver and later almost exclusively by the liver. In fetal serum, the concentration reaches a peak at about 13 weeks of gestation and decreases thereafter. AFP is excreted in the urine and therefore is found in the amniotic fluid (Crandall, 1981). AFP concentration is increased in maternal serum during pregnancy. Some fetal conditions, including anencephaly and spina bifida cystica, raise the amniotic fluid and maternal serum concentrations above normal (Bergstrand, 1986). The finding of increased AFP in maternal serum, amniotic fluid, or both thus constitutes an indication of open NTDs. (High concentrations are also found when the fetus has a ventral wall defect or when there are twins.)

FIGURE A-1 Distribution of concentration of maternal serum alpha-fetoprotein (MSAFP) in a population of unaffected pregnancies (curve on left) and in a population of pregnancies in which the fetus has spina bifida cystica (curve on right). The abscissa represents the gestational age-corrected concentration of MSAFP (using the date of the last menstrual period to date the specimen) plotted in terms of multiples of the (unaffected) median (MOMs). The ordinate represents the frequency of MOM values within each of the two populations. Source: Reprinted with permission from Adams et al., 1984. Copyright 1984 by C.V. Mosby Co.
Figure A-1 shows the distribution of maternal serum AFP concentrations in normal pregnancies (curve on left) and pregnancies with spina bifida cystica in the fetus (curve on right). The screening process for NTD begins with the measurement of maternal serum AFP during the second trimester of pregnancy, preferably in week 16 of gestation. That measurement is the start of a multistage process, which is outlined in Figure A-2. The following discussion covers the various aspects of marker validity and concentrates on this first stage, but the principles apply to all stages in the screening; indeed, they apply to most medical decision-making processes and in particular to the use of any marker of disease or exposure (Galen and Gambino, 1975). Experience with the use of AFP demonstrates the importance of determining the predictive value of a test in the general population and developing a multistep, multimarker screening protocol.

FIGURE A-2 Multistage prenatal detection of neural tube defects. [Flowchart: initial MSAFP (not elevated or elevated); ultrasonography (diagnosis of anencephaly, multiple fetuses, mistaken gestational age, or fetal demise, leading to altered prenatal care, or no diagnosis possible); amniocentesis decision (does the opportunity to detect an affected fetus warrant the cost and risk of amniocentesis?); amniocentesis with AFAFP and karyotyping (if indicated); diagnosis of affected fetus or no indication of fetal abnormality; patient decision (induced abortion or plan for special care at delivery).] MSAFP = maternal serum alpha-fetoprotein; AFAFP = amniotic fluid alpha-fetoprotein. Source: Reprinted with permission from Adams et al., 1984. Copyright 1984 by C.V. Mosby Co.

ASSESSING THE VALIDITY OF BIOLOGIC MARKERS

As noted earlier, an ideal biologic marker should be a sensitive and specific indicator of disease or exposure and (in the context of a specific use) should be suitably predictive of the disease or exposure. The terms "sensitive," "specific," and "predictive" are defined below, followed by a discussion of how maternal serum AFP qualifies as a marker.

"Sensitivity" refers to the ability of a marker to indicate the presence of disease or exposure when disease or exposure is present. Sensitivity is therefore usually measured as a conditional probability, that is, the probability that the marker will indicate disease, given the presence of disease. In the notation of Table A-1, the sensitivity of a marker is P(M+|D+). Sensitivity is therefore the complement of the false-negative probability of the marker, P(M-|D+). The sensitivity of markers is rarely, if ever, perfect (1.0); there are always false-negatives, but a good marker will have a sensitivity close to 1.0.

"Specificity" refers to the ability of a marker to indicate the absence of disease or exposure when disease or exposure is absent. In the notation of Table A-1, the specificity of a marker is P(M-|D-). Specificity is the complement of the false-positive probability, P(M+|D-). A good marker will have a specificity close to 1.0.

TABLE A-1 Probabilities of Marker Presence and Absence, Conditional on Disease Presence and Absence

              Disease^a
Marker        Present       Absent
Positive      P(M+|D+)      P(M+|D-)
Negative      P(M-|D+)      P(M-|D-)
Total         P(D+)         P(D-)

^a P, probability; M, marker; D, disease.

Suppose that an investigator is working with a new marker of exposure to a toxic chemical and determines its sensitivity and specificity from test data as presented in Table A-2. The marker has been measured in 1,000 persons with the disease (exposure) and 1,000 persons without the disease. The results are encouraging: the sensitivity is 0.95 (950/1,000), and the specificity is 0.99 (990/1,000).

There are only 5% false-negatives and 1% false-positives. Because the numbers of exposed and unexposed persons in the evaluation are equal, the a priori probability of exposure in the total sample is 0.5.

TABLE A-2 Cross-Classification of Marker and Disease: Hypothetical Data from a Case-Control Study

              Disease
Marker        Present       Absent
Positive      950           10
Negative      50            990
Total         1,000         1,000

Another quantity important in evaluating the validity of a marker is its predictive value. Table A-3 shows the probabilities of disease presence and absence, conditional on the presence or absence of the marker.

TABLE A-3 Probabilities of Disease Presence and Absence, Conditional on Marker Presence and Absence

              Disease^a
Marker        Present       Absent        Total
Positive      P(D+|M+)      P(D-|M+)      P(M+)
Negative      P(D+|M-)      P(D-|M-)      P(M-)

^a P, probability; M, marker; D, disease.

The focus here is on the rows, rather than on the columns as in Table A-1. The predictive value of a positive (PVP) test for the marker is P(D+|M+), that is, the probability of the disease given a positive test. That probability is the complement of the false-positive probability, P(D-|M+), the probability that there is no disease when there is a positive test for the marker. The predictive value of a negative (PVN) test is P(D-|M-), which is the complement of the false-negative probability, P(D+|M-). In applying a test for a disease or exposure, it is important to consider the data both as they are presented in Table A-3 and as they are presented in Table A-1.

Consider the data in Table A-2 focusing on the marker (rows), rather than on the exposure (columns). Here the false-positive rate, the probability that there is no exposure when the marker indicates exposure, is about 0.01 (10/960). The PVP is about 0.99 (950/960), and the PVN is about 0.95 (990/1,040). Note that the data in Table A-2 have been gathered in such a way that the number exposed is equal to the number not exposed.
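The four quantities discussed so far can be reproduced from the cell counts of Table A-2. The following is a minimal sketch in Python; the function name `marker_validity` and its argument names are illustrative assumptions, not part of the original text.

```python
def marker_validity(tp, fp, fn, tn):
    """Return sensitivity, specificity, PVP and PVN from a 2x2 table.

    tp: marker positive, disease present
    fp: marker positive, disease absent
    fn: marker negative, disease present
    tn: marker negative, disease absent
    """
    return {
        "sensitivity": tp / (tp + fn),  # P(M+|D+), read down the "present" column
        "specificity": tn / (tn + fp),  # P(M-|D-), read down the "absent" column
        "pvp": tp / (tp + fp),          # P(D+|M+), read across the "positive" row
        "pvn": tn / (tn + fn),          # P(D-|M-), read across the "negative" row
    }


# Cell counts from Table A-2 (1,000 persons with and 1,000 without the disease).
table_a2 = marker_validity(tp=950, fp=10, fn=50, tn=990)

for name, value in table_a2.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity use the column totals (1,000 each), whereas the predictive values use the row totals (960 and 1,040), which is the distinction the text draws between Tables A-1 and A-3.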
That is, the a priori probability of disease in the sample is 0.5, a typical case-control study design. Suppose that the a priori probability of the disease in question in the population at large is low, about 1%. Table A-4 shows how the marker would work in practice in this population. The data in Table A-4 exhibit the same sensitivity and specificity (0.95 and 0.99, respectively) as the data in Table A-2, but a very different PVP. Here the PVP is about 0.49, that is, only half the people with positive test results will have the disease. The sole cause of the differing PVPs is the difference in the a priori probability of the disease in the two settings. One setting is that of a study of the marker where the disease is common (50%) by design, and the other setting is that of the use of the marker in practice where the disease is relatively infrequent. In the example, because the disease is infrequent, the PVN is high (more than 0.99).

TABLE A-4 Cross-Classification of Marker and Disease: Hypothetical Data from a Population Study

              Disease
Marker        Present       Absent        Total
Positive      95            100           195
Negative      5             9,900         9,905
Total         100           10,000        10,100

The example reflects a common outcome of the transition from laboratory bench to community or clinical practice: very good tests can perform poorly. In the clinical or community setting, it is most important to know how likely it is that a positive test truly indicates disease and how likely it is that a negative test truly indicates the absence of disease. The predictive values of tests can be increased in two ways: by increasing the sensitivity and/or the specificity, or by choosing the individuals or populations for testing so that the a priori probability of the disease or exposure is high.

In many situations, it is not possible to change the sensitivity without changing the specificity, and vice versa. The situation with maternal serum AFP is a case in point. Figure A-1 shows that the AFP distributions in normal and affected pregnancies overlap. If a particular concentration of AFP, the "cut-point," is chosen as an indication of the presence of an abnormal fetus, there will be false-positives and false-negatives, because some normal pregnancies are associated with higher AFP concentrations than are some affected pregnancies. If the cut-point is chosen so that the test is very sensitive, so that nearly all affected pregnancies fall above the cut-point, there will be more false-positives, and the specificity will be low. But, if the cut-point is chosen so that nearly all normal pregnancies fall below, there will be more false-negatives, and the sensitivity will be low.

Regarding the possibilities for increasing the PVP of a test by testing only people with a relatively high a priori probability of exposure or disease, recall the marked contrast in the PVP values calculated from the data in Tables A-2 and A-4. In those two examples, there was no difference in the sensitivity or specificity of the test, but only differences in the a priori probabilities of exposure. A major reason for disappointment in the practical application of a test is that it is indiscriminately applied in populations where the a priori probability of exposure or disease is low, so the false-positives greatly outnumber the true-positives.

In many situations, a test with a low PVP is applied as a screening test. Persons with a positive result can be followed by more definitive tests.
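The dependence of the predictive values on the a priori probability can be sketched with Bayes' rule. The example below is a minimal Python illustration, assuming the sensitivity (0.95) and specificity (0.99) of the hypothetical marker and the two a priori probabilities implied by Tables A-2 and A-4; the function names `pvp` and `pvn` are hypothetical.

```python
def pvp(sensitivity, specificity, prior):
    """Predictive value of a positive test, P(D+|M+), via Bayes' rule."""
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


def pvn(sensitivity, specificity, prior):
    """Predictive value of a negative test, P(D-|M-), via Bayes' rule."""
    true_neg = specificity * (1.0 - prior)
    false_neg = (1.0 - sensitivity) * prior
    return true_neg / (true_neg + false_neg)


SENS, SPEC = 0.95, 0.99  # same marker in both settings

settings = [
    ("case-control sample (Table A-2)", 0.5),
    ("population at large (Table A-4)", 100 / 10_100),
]

for label, prior in settings:
    print(f"{label}: prior={prior:.4f}, "
          f"PVP={pvp(SENS, SPEC, prior):.3f}, "
          f"PVN={pvn(SENS, SPEC, prior):.4f}")
```

Only the `prior` argument changes between the two calls, which mirrors the point made in the text: the marker itself is unchanged, and the drop in PVP comes entirely from the lower a priori probability of disease.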
The definitive tests are usually not used as a first step, because they are expensive and invasive. The use of a screening test also allows the use of the definitive test on persons who have a relatively high a priori probability of having the disease or exposure of interest, which increases the operational PVP of the definitive test (even definitive tests are rarely perfect). The currently recommended process for screening for NTDs through the use of maternal serum AFP testing is an example of this approach.

VALIDITY OF MATERNAL SERUM AFP MEASUREMENT AS A BIOLOGIC MARKER OF NTDs

As indicated earlier, AFP is normally present in amniotic fluid and maternal serum, and it is present in increased concentrations in the presence of anencephaly and spina bifida cystica. The second-trimester concentration of maternal serum AFP can be used to estimate the likelihood that a pregnant woman is carrying a fetus with anencephaly or spina bifida cystica, a ventral wall defect, or twins; other defects can also be predicted. These likelihoods can be used by the woman and her physician to decide whether to bear the risks and costs of further diagnostic procedures. Ultrasonography can identify multiple fetuses and fetuses with anencephaly. The most common cause of increased maternal serum AFP is mistaken gestational age, which can be checked by ultrasonography. When an increase in maternal serum AFP cannot be explained by one of the factors assessed with ultrasonography, the likelihood of a fetus with spina bifida cystica or ventral wall defect can be weighed against the cost and risks of amniocentesis to obtain fluid for assay for amniotic fluid AFP (see Fig. A-2). That assay can be considered as the definitive test for spina bifida cystica, but it is also invasive and is associated with an increased risk of abortion.

The results of maternal serum AFP tests are usually presented in terms of multiples of the median (MOMs) for normal pregnancies.
From the specific MOM and some characteristics of the mother (such as race, geographic area, and weight), one can estimate the odds of having an affected fetus. Rather than make a presentation in terms of odds, for the purpose of this discussion we consider a maternal serum AFP concentration of at least 2.5 MOM as "abnormal" and give an illustration of sensitivity, specificity, and PVP. (As noted above, the choice of a different MOM as the "abnormal" cut-point will change the sensitivity, specificity, and predictive values of the procedure.) The data are derived from a synthesis of the UK Collaborative Study results (Wald and Cuckle, 1980) and are adjusted so that the base is a hypothetical cohort of 100,000 screened pregnant women (data on ventral wall defects and other defects are not included). Table A-5 puts the data on maternal serum AFP in the format of the other tables presented here; the 1,000 multiple pregnancies that would be expected among 100,000 pregnancies have been omitted from the table. The sensitivity of this test is 0.84, meaning that 16% of babies with anencephaly or spina bifida cystica would be missed.

TABLE A-5 AFP by Presence or Absence of Neural Tube Defects^a

              NTD
Marker        Present       Absent        Total
Positive      334           3,254         3,588
Negative      66            95,346        95,412
Total         400           98,600        99,000

^a Data derived from UK Collaborative Study data. Source: Wald and Cuckle, 1980.

The specificity is 0.97, indicating that there are 3% false-positives. NTDs are rare complications of pregnancy; the false-positives greatly outnumber the true-positives, and the PVP is about 0.09. The PVN is quite good at this stage of the screening process, more than 0.99, and a woman with a negative test has only a 7/10,000 risk of having a fetus with anencephaly or spina bifida cystica, only about one-sixth of her a priori probability, about 40/10,000 in the UK at the time these data were obtained.

After staging through ultrasound and amniocentesis with evaluation of amniotic fluid AFP (see Fig. A-2), the UK data look like those presented in Table A-6. The sensitivity of the total screening package for open NTD is 0.81, the specificity is very nearly 1.0, and the PVP is almost 0.99.
In all, the process is able to detect 81% of the fetuses affected by open NTDs at the risk of a fairly small number of false-positives. If one includes the risk of fetal death due to amniocentesis (estimated to be about 0.5-1.5% of the 3,588 amniocenteses performed), the net benefit of the program is 324 fetuses with open NTDs detected balanced against the 4 false-positives and 10-30 fetal deaths due to complications of amniocentesis.

TABLE A-6 AFP by Presence or Absence of Neural Tube Defect^a

Total Screen  Present       Absent        Total
Positive      324           4             328
Negative      76            98,596        98,672
Total         400           98,600        99,000

^a Data derived from UK Collaborative Study data. Source: Wald and Cuckle, 1980.

It is obvious that AFP screening could be a monumental failure if stopped at the maternal serum stage. There would be enormous errors, the normal fetuses with positive test results greatly outnumbering the affected fetuses. Therapeutic action taken on the basis of maternal serum AFP tests would be wrong most of the time. However, when properly used as the first stage of a screening process, maternal serum AFP evaluation is useful.
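The contrast between the maternal serum stage alone and the complete screening process can be checked against the counts in Tables A-5 and A-6. This is a minimal Python sketch; the helper `rates` and the variable names are illustrative assumptions, and the count of unaffected pregnancies with a negative serum test is taken as the column total minus the false-positives (98,600 - 3,254).

```python
def rates(tp, fp, fn, tn):
    """Sensitivity, specificity, PVP and PVN for one screening stage."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "pvp": tp / (tp + fp),
        "pvn": tn / (tn + fn),
    }


# Table A-5: maternal serum AFP alone, cut-point of 2.5 MOM.
serum_stage = rates(tp=334, fn=66, fp=3_254, tn=98_600 - 3_254)

# Table A-6: total screen after ultrasonography and amniotic fluid AFP.
total_screen = rates(tp=324, fn=76, fp=4, tn=98_596)

for label, result in [("serum stage", serum_stage), ("total screen", total_screen)]:
    summary = ", ".join(f"{name}={value:.4f}" for name, value in result.items())
    print(f"{label}: {summary}")

# Risk of an affected fetus after a negative serum test, per 10,000 women.
risk_after_negative = (1.0 - serum_stage["pvn"]) * 10_000
print(f"risk after negative serum test: {risk_after_negative:.1f} per 10,000")
```

Under these counts the sensitivity falls only slightly between the two stages (334 versus 324 of 400 affected fetuses detected), while the number of false-positives falls from 3,254 to 4, which is what moves the PVP from roughly 0.09 to nearly 0.99.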


Biologic Markers in Reproductive Toxicology (1989)

Chapter: Appendix: Assessing the Validity of Biologic Markers: Alpha-Fetoprotein
