
Suggested Citation:"4: Common Issues." National Research Council. 1997. An Evaluation of the U.S. Navy's Extremely Low Frequency Submarine Communications Ecological Monitoring Program. Washington, DC: The National Academies Press. doi: 10.17226/5410.



4 Common Issues

The committee's evaluation of the 11 ecological studies that were included in the Navy's ELF ecological monitoring program revealed several issues that were common to many or all of the studies. Those common issues are discussed here. This chapter was not intended to discuss each study in the context of each common issue. Specific studies are discussed below for illustrative purposes. Overall conclusions and recommendations concerning the common issues are presented in Chapter 5.

USE OF EXPOSURE DATA BY ECOLOGICAL MONITORING TEAMS

All ecological monitoring teams made use of the division between treatment and control sites. All pairs of sites satisfied the criteria for 76-Hz exposure except the aquatic study sites, for which the ratio of electric fields in the earth at treatment sites to those at the control sites did not quite satisfy the criteria (see Chapter 2). That was rectified by inclusion of supplementary sites closer to the antenna in 1990. About 10% of the study sites did not satisfy the criterion for 60-Hz exposure, but in all cases the 60-Hz exposures were very low. From an ELF-EMF exposure point of view, the pairing of sites into treatment and control was satisfactory. IITRI was successful in characterizing the ELF EMFs for this purpose.

In some studies, it was important to know whether the transmitter was on or off during critical "exposure" periods. Those were the studies that used response variables potentially sensitive to ELF-EMF exposure for only short periods. For example, three of the four major completed studies of small vertebrates considered short-term phenomena: embryonic development (4 days or less), homing (4-5 hours), and maximal metabolic rate (minutes to perhaps 24 hours). The fourth study, assessing fecundity, took place over a period of weeks to 2 months, depending on the year; the investigators' division into treatment and control sites might not have corresponded to the actual exposures, which were most likely time averages of antenna activity over that period. Without verifying that the transmitter was on during the experiments, it is not possible to know whether a treatment site was exposed to ELF EMFs from the antenna. And without that knowledge, there could be misclassification of subjects from the control category into the treatment category. However, there is little evidence that most of the investigators were aware of or considered this factor in evaluating the results of their experiments. Chapter 5 discusses possible reanalysis of the data collected through some studies.

According to the monitoring-program reports reviewed by the committee, only the upland-flora team tried to use any of the other data provided by IITRI, although some teams requested additional ELF-EMF data. Investigators from the earthworm and soil arthropods study requested and received information on electric field vs. soil depth but never used it. Extensive measurements of the electric field in the earth were made at the Martel's Lake site (overhead antenna treatment site for upland-flora and litter decomposition and microflora studies) but never used.
The authors of the pollinating insects study indicated that each hutch received a different exposure, but there is no indication of an attempt to use these data in the analysis. Some of the teams might have decided that there was no difference in ecological aspects between treatment and control sites and hence that there was no need for further analysis with specific ELF-EMF exposure data. Sometimes, however, only the consideration of additional exposure hypotheses will identify an effect. Such extensive exposure data were available, and they should have been used. The use of ELF-EMF exposure data in two specific studies is discussed here.

ELF-EMF CHARACTERIZATIONS AT WETLANDS SITE

In 1986 and 1987, the wetlands researchers conducted a series of stomatal-resistance measurements on wetlands plants. The measurements were related by multiple-regression analysis to the independent variables: two environmental variables and the magnetic field and electric field in the earth resulting from the antenna system under full-power conditions as measured by IITRI. The purpose of this analysis was to determine whether there was a statistically significant relationship between stomatal resistance and any or all of the independent variables.

The researchers noted that the antenna on or off condition was neither predictable nor observable. They correctly noted that this was a potentially confounding factor in the analysis of short-term responses because measurements might have been taken at a treatment site when ELF-EMF exposure was not occurring. They did not pursue the analysis, because they found inconsistent results under a variety of both similar and substantially different exposure conditions.

USE OF MAGNETIC-FIELD INTENSITY AS A "DOSE" AT UPLAND-FLORA SITE

In the upland-flora study, the researchers designated a 76-Hz magnetic field as an indicator of dose to each tree. According to one of their papers (Reed et al. 1993), the indicator was based on "average exposure to magnetic flux density during that particular growing season." It does not appear that the actual measure used in the study is consistent with this definition.

The measured fields across the treatment area (during transmitter operation at full power) varied from about 5 to 10 mG (see Haradem et al. 1994, p. D-7, for location of points 4T2-6, 7, 8, 12, 13, 26, 34 on the hardwood stand and p. D-30 for the historical measurements at these points). However, the upland-flora report shows magnetic-field measurements of about 1-9 mG and a specific "effect" on tree growth at about 2 mG. The researchers would have been expected to define the growing period and use antenna on and off time to derive an "average" field during this time.
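
One way to read the "average" field described above is as a time-weighted mean over the growing season, with hours when the antenna was off counted as zero exposure. The sketch below is an illustration only, not the committee's or the researchers' calculation. The function names are hypothetical, and the inputs are figures quoted elsewhere in this chapter: the 4,392-hour growing season and the 1991 operating hours from Table 4-2, and the 1989 on-time percentage from Table 4-1.

```python
# Illustrative sketch only: function names are hypothetical, and treating
# off-time as 0 mG is an assumption about what "average exposure" means.

GROWING_SEASON_HOURS = 4392  # total possible hours, as stated in Table 4-2


def time_averaged_field(periods, total_hours=GROWING_SEASON_HOURS):
    """Return the time-weighted mean field in mG.

    periods: list of (hours_at_level, field_mG) pairs for times the antenna
    was on; all remaining hours in the season count as 0 mG.
    """
    on_hours = sum(hours for hours, _ in periods)
    if on_hours > total_hours:
        raise ValueError("operating hours exceed the length of the season")
    return sum(hours * field for hours, field in periods) / total_hours


def scale_by_on_time(full_power_range_mG, percent_of_full_power):
    """Scale a (low, high) full-power range by an effective on-time percentage."""
    low, high = full_power_range_mG
    fraction = percent_of_full_power / 100.0
    return (low * fraction, high * fraction)


# 1991 at point 4T2-6: two operating levels reported in the notes to Table 4-2.
avg_1991_near = time_averaged_field([(1462, 3.0), (2336, 10.3)])

# 1989: 59% effective field applied to the roughly 5-10 mG full-power range.
range_1989 = scale_by_on_time((5.0, 10.0), 59.0)

print(f"1991 time-averaged field at 4T2-6: {avg_1991_near:.2f} mG")
print(f"1989 scaled range: {range_1989[0]:.2f}-{range_1989[1]:.2f} mG")
```

Under these assumptions, the 1991 season average at the point nearest the antenna comes out well above the level of about 2 mG at which the effect on tree growth was reported.
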
According to the researchers, the growing period for hardwood trees was about April through September. The antenna on-time statistics for each antenna configuration allow calculation of the average magnetic field over the growing period. These are shown in Table 4-1; the statistics behind the calculations can be found in an appendix of the final engineering report (Haradem et al. 1994). It is clear from Table 4-1 that no trees were exposed to average magnetic fields between 0.3 and 3.0 mG. Thus, the method described here is not the method actually used by the authors to determine the exposure.

TABLE 4-1 Range of Average Magnetic Fields During the Growing Period Over the Martels Lake (Overhead Antenna) Treatment Hardwood Stand by Year

Year   Effective Magnetic Field as a Percentage   Range of Magnetic Field for Any Tree
       of Full-Power Magnetic Field               within the Treatment Site, mG
1986    0.05                                      0.0025-0.005
1987    0.3                                       0.015-0.03
1988    2.1                                       0.1-0.21
1989   59.0                                       3.0-5.9
1990   92.0                                       4.6-9.2
1991   63.0                                       3.2-6.3
1992   84.0                                       4.2-8.4
1993   93.0                                       4.7-9.3

Instead of such an average, the authors used spot measurements made during operation of the transmitter, which were provided by IITRI. Consider the points 4T2-6 (closest to the antenna) and 4T2-7 (farthest from the antenna) on the hardwood-stand treatment site. The measurements provided by IITRI and the number of hours that each point was exposed at the measured levels during the growing season are shown in Table 4-2, derived from tables in the final engineering report (Haradem et al. 1994).

TABLE 4-2 Spot Measurements of Magnetic Field at Hardwood-Stand Treatment Site and Number of Growing-Season Hours at These Levels

Year   Measured Magnetic Field   Measured Magnetic Field   Total Time Exposed During Growing
       at Point 4T2-6, mG        at Point 4T2-7, mG        Season (4,392 Total Possible Hours), h
1986    0.73                     0.37                         24.8
1987    1.16                     0.59                        142.8
1988    5.0                      2.6                         162.2
1989   10.3                      5.4                       2,390
1990   11.0                      5.8                       3,795
1991    3.0                      1.6                       1,462
1992   10.3                      5.4                       3,701
1993   10.3                      5.5                       4,073

Notes: In 1986, there were 24.8 h of operation at levels indicated and 17.0 h of operation in which measured levels were 0.44 and 0.22 mG. It is not clear why values in the table were used, rather than those reported here or zero, which was by far the most-common exposure level during growing season. In 1991, there were 1,462 h of operation at levels indicated and 2,336 h of operation in which measured levels were 10.3 and 5.4 mG. It is not clear why values in the table were used, inasmuch as those reported here were more common during growing season.

Note that exposures that correspond to field measurements between about 1.2 and 2.6 mG are within the range shown in Table 4-2 only for 1991. That is important because the claimed effect occurs at about 2.0 mG. Furthermore, during that year, the exposure was more often in the range of 5.4-10.3 mG than it was in the range of 1.6-3.0 mG.

The following observations can be made about this study. The indicator of dose reported in publication of the upland-flora work was not that actually used by the authors. They reported that the indicator of dose was defined as "average exposure to magnetic flux density during that particular growing season." The indicator actually used was based on spot measurements while at least one antenna was on. The authors did not provide a clear rationale for the indicator of dose that they used. Specifically, they were not clear about why they used particular values of magnetic field rather than others when more than one was reported during a given year. The measurement values between about 1.2 and 2.6 mG all come from a single year (1991). The authors' conclusion that there is an effect on tree growth at about 2.0 mG is not warranted unless they define their field measurement more carefully, provide a clear rationale for it, and find consistent results from more than one growing season.

CONCLUSIONS REGARDING USE OF EXPOSURE DATA

Exposure data were inadequately or inappropriately used in a number of studies. The studies that used short-term response measures should have used, but did not use, transmitter on and off times to determine exposure, in addition to the IITRI division of sites into treatment and control. With a few exceptions (the upland-flora study and the wetlands study to a lesser degree), the ELF-EMF data provided by IITRI were used in site selection but not in any way to attempt to establish an exposure-response relationship. Those data should have been used more than they were. In the one case of a search for an exposure-response relationship, the data were used in a way that was not clearly related to actual exposures over the duration of the growing period. Thus, without further justification, any conclusions about low-level magnetic-field effects on tree growth are not warranted.

IITRI was successful in generating data sets on exposure to ELF EMFs for each study at the two antenna facilities. As far as the committee knows, the data sets were made available to researchers, and additional data were generated at particular study locations on request. It is not clear, however, whether IITRI followed up the generation of the ELF-EMF exposure data with assistance to each monitoring study in using the information. And when the researchers began to state their findings in annual reports or at the annual meeting, it is not apparent whether IITRI understood the extent of use of the exposure information by each study. The committee's evaluation of the separate studies indicates that use of the exposure information was often inadequate, inaccurate, or inappropriate. That should have become obvious to IITRI and external reviewers when they read or listened to reports from each study each year. The committee wonders why more guidance was not given to each study investigator in using the exposure information to ensure that the response results reported in the final reports of each study were based on appropriate use of the exposure data.
IITRI should at least have required that an EMF-exposure expert work closely with each study until the study leader understood the types of data available, their variability (considering the vagaries in antenna operation), and how they might be applied to gauge responses of the selected ecological or biologic variables.

STUDY-SITE SELECTION

The original request for proposals (RFP) for research regarding the effects of ELF EMFs on biologic systems emphasized selection of study sites so that appropriate levels of ELF EMFs existed at control sites versus treatment sites. Considerable attention was also paid to the importance of matching sites so that any major differences uncovered would indeed reflect responses to ELF-EMF exposure, as opposed to uncontrolled effects due to contrasts in soil composition, chemistry, or vegetation. It would have been useful at the outset to have an integrated research plan in mind to guide site selection. The absence of such a plan meant that less than full value was obtained from the research as a result of segregated selection of possible ecosystem effects. In addition, the lack of common exposure levels at sites limited the possibility of integrating results. (See the section later in this chapter on lack of integration.)

THE PRACTICAL PROBLEM OF SITE SELECTION

Given the unavoidable heterogeneity of soils and vegetation typical of the Michigan Upper Peninsula, most researchers did as good a job as possible in matching sites. For instance, the earthworm and soil-arthropod sites were well matched with respect to soils and arthropods, but not with respect to earthworms. A better match with respect to earthworms could probably not have been attained. In the aquatic study, the researchers sought an upstream control site but could not find one, because none that was of the same stream order and physical attributes as the treatment site existed. In general, so many conditions had to be met for site selection that perfect matches were impossible.

ADJUSTING THE RESEARCH PLAN TO PROBLEMS WITH SITE SELECTION

Once the site-selection process was undertaken, it quickly became obvious that ideal matches were going to be difficult to find. Numerous researchers attested to that in their response to written questions from the committee. Two modifications of research design should have been considered upon encountering the practical obstacles to perfect site matching. First, for a small subset of critical variables, it would have been valuable to pursue spatially extensive comparisons of multiple sites (control versus treatment), thereby gaining the inferential power of multiple independent samples.
The more-extensive design would require less effort at each site, although there would have to be a tradeoff in reduced number of variables measured. Second, for the intensive paired comparisons of sites, more critical thought might have been given to what processes, organisms, and observations would be most valuable, given some site mismatches. For instance, in the soil arthropods and earthworms study, the work on the dominant earthworm at the treatment site alone could never yield results that could clearly be attributed to ELF-EMF exposure, because there was no way of separating ELF-EMF effects from other factors without a control site at which the same earthworm was studied.

CONCLUSIONS REGARDING SITE SELECTION

Researchers faced difficult problems in selecting sites because of heterogeneity in the environment. Additional money and time for the sampling of more sites and some rethinking of experimental design and statistical inference would have helped to address some of these problems. For example, the program would have been improved by studying fewer variables at more and larger sites and by eliminating studies with poorly matched treatment and control sites. However, as with any environmental impact assessment, it would be unrealistic to expect the Navy's monitoring program to conform fully to an ideal experimental design.

PSEUDOREPLICATION

In many of the studies, the main effect of interest, namely the effect due to the presence of ELF EMFs generated by the antennas, was pseudoreplicated (that is, not truly replicated, as described by Hurlbert 1984) in that there was only one site for each level of exposure. Even when more than one site was available, treatments were not randomly interspersed so that background effects of soils, climate, etc., were equally (or at least randomly) distributed among all treatments and controls. Therefore, the experimental data provide an estimate of variance of responses studied within each site, but not the variance due to treatments across sites. The effects of the antenna on response variables are therefore confounded with the background effects of the different soils, climate, etc., on each site, and the two cannot easily be separated.
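
The confounding described above can be illustrated with a small simulation. This is a hedged sketch with invented numbers, not data from any ELF study: the true treatment effect is set to zero, the two sites simply differ in their baseline, and plots within each site are treated (incorrectly) as replicates of the treatment.

```python
# Illustrative simulation only: all numbers are hypothetical, and the true
# antenna effect is fixed at zero by construction.
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two samples of plot-level measurements."""
    var_a = statistics.variance(a) / len(a)
    var_b = statistics.variance(b) / len(b)
    return (statistics.mean(a) - statistics.mean(b)) / (var_a + var_b) ** 0.5


rng = random.Random(1)  # fixed seed so the sketch is repeatable

TRUE_TREATMENT_EFFECT = 0.0  # assumption: the antenna does nothing
CONTROL_SITE_MEAN = 10.0     # hypothetical baseline at the single control site
TREATMENT_SITE_MEAN = 12.0   # hypothetical baseline at the single treatment site
PLOT_SD = 0.5                # plot-to-plot variation within a site
N_PLOTS = 10

control = [rng.gauss(CONTROL_SITE_MEAN, PLOT_SD) for _ in range(N_PLOTS)]
treatment = [
    rng.gauss(TREATMENT_SITE_MEAN + TRUE_TREATMENT_EFFECT, PLOT_SD)
    for _ in range(N_PLOTS)
]

# A large t statistic appears even though the treatment effect is zero:
# the test is detecting the site difference, not the antenna.
t_stat = welch_t(treatment, control)
print(f"t statistic from plots within one site per treatment: {t_stat:.1f}")
```

With only one site per treatment, nothing in the data distinguishes this scenario from one in which the sites are identical and the antenna causes the whole difference.
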
This problem arises in the litter decomposition and microflora studies, the aquatic ecosystem studies, the upland-flora studies, and others. The wetland studies and the bird community studies avoided this problem by having replicate sites within different treatments and replicate plots or transects within each site. In the latter studies, the variance of the response variable of interest could be separated into two components: variance within each site and variance due to treatment or its absence. In the pseudoreplicated studies, that was not possible, because there was only one site per treatment, so the variance recorded can, strictly speaking, be attributed to only within-site, not between-treatment, effects.

Sometimes pseudoreplication is necessary for logistical reasons. When pseudoreplication is unavoidable, the generalization of treatment effects to other sites might be justified (with caution) if it can be demonstrated that the sites chosen for each single application of the treatment are not substantially different from each other at the outset and are at or very near the modal values of other environmental factors thought to affect the response variable of interest. That type of pretreatment survey was performed only for the upland-flora studies. In contrast, the litter decomposition studies are seriously compromised because there were large differences in decay rates between sites before the antennas were in operation. The antecedent site effects obscured the potential detection of treatment effects.

The acceptance of all further conclusions must proceed with those caveats in mind. The caveats are generally not clearly stated anywhere in the monitoring-program reports. The danger with pseudoreplication is in committing a type II error: accepting the null hypothesis (no effect) when it is, in fact, false. Such an error could arise, for example, if the variability within sites or between matched sites is within the range of responses imposed by the treatment. That should not be taken to imply that the effect of the antenna is small, as implied in some reports. Rather, it is not possible to separate the effect of the antenna from effects of other environmental factors without replication both within sites and across sites. In many cases, it is impossible to calculate the probability of a type II error because it depends on an independent estimate of differences between treatment and control, which requires replication of sites, not simply of plots within sites.
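
To make the dependence on site replication concrete, the following sketch approximates the type II error rate for a comparison of treatment and control means when sites, not plots, are the replicates. It is a simplified normal-approximation calculation under stated assumptions: the between-site standard deviation is taken as known, the effect size and other inputs are hypothetical, and the function name is invented for this illustration.

```python
# Illustrative sketch only: a two-sided z-test approximation with a known
# between-site standard deviation. All input values are hypothetical.
from statistics import NormalDist


def approx_power(effect, between_site_sd, sites_per_group, alpha=0.05):
    """Approximate power to detect `effect` with replicated sites.

    between_site_sd must come from replicated sites; a design with one
    site per treatment provides no estimate of it.
    """
    if sites_per_group < 1:
        raise ValueError("need at least one site per group")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two group means.
    se = between_site_sd * (2 / sites_per_group) ** 0.5
    shift = effect / se
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


for n in (2, 4, 8, 16):
    power = approx_power(effect=1.0, between_site_sd=1.0, sites_per_group=n)
    print(f"{n:2d} sites per group: power ~ {power:.2f}, "
          f"type II error ~ {1 - power:.2f}")
```

The calculation needs the between-site standard deviation as an input, which is the quantity that cannot be estimated when each treatment is applied at a single site.
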
The putative treatment effect is confounded with the site effect, so the differences between treatment and control cannot be attributed solely to the antenna, inasmuch as they are not independent of pre-existing effects and confounding site effects that continued during the experiment.

The extensive use of ANOVA and other linear models (including regression techniques) in the ELF study requires some strong assumptions about the distribution of the data, namely, that effects are linear in the scales chosen, that variances are constant, that error terms are independent, and that residuals are normally distributed. Those assumptions were not usually tested by the ELF researchers, with few exceptions such as the study of the effects of the antenna on bird populations. Furthermore, when the assumptions were considered, the investigators seem to have misunderstood them and to have applied other statistical analyses that might not have been the most appropriate. An example of the latter problem is the rejection of exponential-decay models in the litter-decomposition experiments in favor of covariance models that are more difficult to interpret. A further, troublesome aspect of the analyses is the nearly complete absence of any quantitative discussion of the effects of statistical bias, which could well dominate the role of random variation. The lack of quantification of statistical bias is exacerbated by the pseudoreplication or even lack of replication in many of the experiments.

SPECIES SELECTION

The Navy's original plan for an ecological monitoring program recommended that species (or related species) be studied that are reported to be sensitive to EMFs; are important ecologically, aesthetically, or economically; and can be reasonably monitored. These recommendations were largely met in these studies, as discussed below.

STUDY SPECIES

The diversity of species studied was considerable. Studies included species in most major taxonomic groups, including vascular plants (trees and shrubs), algae, slime molds, amebas, fungi, small mammals, birds, arthropods, and various decomposers. However, studies on small-mammal populations and on development of vertebrates (birds and mammals) had design problems, so information on this group is unreliable. Major groups of organisms that were not included were some nonvascular plants (e.g., moss), reptiles, and amphibians. In the wetlands study, a moss population was found to increase significantly at the treatment site, but this apparent response was not pursued, because moss was not a target species; this is unfortunate because the finding might be an indicator that moss is especially sensitive to ELF-EMF effects.

The species studied included types of organisms that had been reported to be sensitive to EMFs in previous laboratory or field studies, including slime molds, vascular plants, earthworms, birds, and bees.
This coverage was very good. Little information exists on the EMF sensitivity of most of the particular species in the site, so species related to those exhibiting an effect were usually studied. For example, native bees, rather than honeybees, were studied because honeybees cannot survive the winter in this area.

Other species were usually well justified on the basis of their potential ecological importance. No study used economic or aesthetic importance. The most common criterion of ecological importance was abundance, which is reasonable because abundant species often exert a large effect on ecosystem processes and on other species that depend on them for food and for performing valuable ecological functions. For example, the most common tree and shrub species were studied. The ecological importance of other species was based on their potential functional importance to the ecosystem. For example, periphyton species are valuable as food for many other species and as bioassays of water quality. Decomposers are crucial to nutrient cycling. Streptomycete populations are associated with decomposition and nutrient cycling.

Most study species could be adequately monitored because they were abundant, although some studies were limited by small sample sizes. No studies focused on rare species, although some might inadvertently have been included in aggregate variables such as bird censuses. In the wetlands study, rare sedge species were dropped when it was recognized that sampling might harm them.

The omission of rare species is problematic. Some rare species, such as predators and keystone species, can exert major effects on communities. Potentially endangered species were not studied. It is not known whether rare populations at the edge of their range are more or less sensitive to bioassays of additional stresses than abundant species. It is important to note that studying rare species would have been as useful as studying more-common species in an attempt to find any type of effect.

The ability to measure response variables determined the choice of some species. For example, in the wetlands study, several species were dropped when it became clear that stomatal resistance could not be easily monitored.

In a few studies, species were chosen to provide interesting comparisons and generalizations.
For example, in the upland litter-decomposition studies, the leaves of fast- and slow-growing tree species were compared. In the wetlands study, trees, shrubs, and herbs were compared. However, the limited number of species from each group precluded generalizations about life forms. More such comparisons would have been valuable in these studies.

CONCLUSIONS REGARDING SPECIES SELECTION

Overall, species selection was commendable. The species studied included a broad range of organisms with potentially different responses. That is important for detecting potential ELF-EMF effects on ecological systems that contain a wide diversity of organisms. Representatives of most taxonomic groups that have been reported to respond to EMFs were included. Most other species were well justified on the basis of their abundance and potential ecological importance. Aspects of concern include the lack of studies on reptiles, amphibians, and nonvascular plants; the lack of focus on any rare species; the decision not to pursue investigations regarding the population increase in the moss species; and the lack of reliable studies on the development of birds and mammals.

RESPONSE-VARIABLE SELECTION

The first priority of the original RFP was a study of bird migration and nesting success because birds have been reported to use magnetite for orientation. However, all the studies addressing that priority had unreliable or weak tests of the response variables. Bird migration was not examined, because critical outside reviews caused the proposed study to be canceled. Instead, local homing and navigation by resident birds were studied. The negative results are questionable because data on antenna operation during the specific periods of study were not used in the analyses. A similar problem is raised by the study of nestling development in tree swallows. Actual field strengths during critical developmental periods were not considered, so the negative results might reflect the lack of exposure rather than the lack of an ELF-EMF effect. The study of bird populations also relied on a weak test: the treatment site was far from the antenna and had very low exposure levels.

The second priority of the original RFP included studies of soil microbiology and ecology, plant ecology, and insect populations and behavior. These were all examined. Soil microbiology was examined as litter decomposition in three studies (litter decomposition and microflora, wetland studies, and earthworms and soil arthropods) and as streptomycete populations, which are good indicators of microbial community activity (upland flora).
Plant ecology was examined in the studies on upland trees and wetlands. Insects were examined in the studies on soil arthropods, aquatic insects, and pollinating bees.

The third priority included water quality, fish ecology, reproduction, fertility, and biorhythms. In the aquatic project, water quality was monitored through periphyton responses, and fish abundance, size, growth, and movement were measured. Reproduction was studied in bees, earthworms, birds, and small mammals. No studies addressed biorhythms or fertility. No studies examined fluctuating asymmetries in development, which have been shown recently to be sensitive indicators of environmental stress.

The omission of mammalian responses in the list of original priorities is of concern. Although the actual studies included mice, the study was flawed, and results are inconclusive.

CRITERIA FOR RESPONSE SELECTION

The original RFP stated two major criteria for response selection: the ability to measure responses accurately (consistently) and the ability to detect differential responses amid variation caused by other factors.

The first criterion was usually adhered to for practical reasons. If accuracy could not be achieved, species or response variables were generally dropped. For example, several wetland species were dropped when stomatal conductance proved difficult to measure. Nitrogen fixation was also dropped in the wetlands study. The second criterion was often not met. In addition, many studies suffered from poor experimental or statistical execution and a lack of power that precluded finding significant effects.

To address the second criterion of detection, three major approaches that varied in effectiveness were used. First, some researchers chose response variables that they argued were unlikely to be affected by factors other than ELF EMFs. For example, leaf foraging by bees, rather than flower foraging, was studied because flowers varied among sites. However, other factors that could affect leaf foraging were not examined. In the wetlands-plant study, a potential membrane effect on stomatal conductance proved to be affected so much by local conditions (light and temperature) that a species unresponsive to sun was chosen; whether this species might also be unresponsive to ELF-EMF effects is open to question. In the slime-mold study, field ELF-EMF conditions were mimicked in the laboratory to the greatest extent possible to avoid confounding factors.

The second approach for increasing detection was to examine changes before and after antenna operation. That was effectively done in the soil-arthropod study by using BACI analysis and in the bird population study by using repeated-measures analysis.
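
The idea behind a before-and-after comparison of this kind (often called a before-after-control-impact, or BACI, design) can be sketched as a simple contrast: the change at the treatment site minus the change at the control site. The numbers and the function name below are hypothetical, and the sketch computes only the point estimate; a full analysis would test the interaction between site and period using replicated samples over time.

```python
# Illustrative sketch only: data are invented, and this computes just the
# point estimate of the contrast, not a significance test.
from statistics import mean


def baci_contrast(control_before, control_after, impact_before, impact_after):
    """Return (change at impact site) minus (change at control site)."""
    impact_change = mean(impact_after) - mean(impact_before)
    control_change = mean(control_after) - mean(control_before)
    return impact_change - control_change


# Hypothetical counts per sample. Both sites rise by the same amount, so a
# year-to-year change common to both sites is not mistaken for an effect.
no_effect = baci_contrast(
    control_before=[20, 22, 21], control_after=[24, 26, 25],
    impact_before=[18, 19, 20], impact_after=[22, 23, 24],
)

# Same control data, but the treatment site rises less once operation begins.
with_effect = baci_contrast(
    control_before=[20, 22, 21], control_after=[24, 26, 25],
    impact_before=[18, 19, 20], impact_after=[19, 20, 21],
)

print("contrast with parallel changes:", no_effect)
print("contrast with a divergent change:", with_effect)
```

Because the contrast subtracts the control-site change, a pre-existing difference between the two sites does not by itself produce a nonzero value.
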
In some other studies, the response variable (mass loss in wetland plant litter, overwintering mortality in bees, and bird populations) was known to vary before antenna operation because of other factors. In those studies, researchers tested for a significant interaction between treatment and year; however, the weak correlation between year and ELF-EMF exposure intensity and the lack of power (replication) usually precluded finding any significant effects.

The third approach for increasing detection power was to incorporate other variables into predictive models. This is an effective method that was used in the upland-flora study. Individual tree growth was predicted on the basis of measurements of site, climate, and competitors. Residuals (discrepancies of the observed data from the predicted values) were then tested for significant effects of ELF EMFs. However, the choice of exposure measurements makes the conclusion of positive effects questionable.

Detection of possible ELF-EMF effects on short-term biologic responses depends crucially on the timing of the antenna operation relative to the experiments. However, this timing was apparently ignored in the majority of the studies although data on antenna operation were available. Because antenna power was variable (including zero during shutdowns), knowledge of the timing of exposure is critical to interpreting the results. For example, the orientation of birds flying over ELF EMFs was analyzed without regard to the actual operation of the antenna during each experiment. Embryo development in birds had a 4-day window when exposures could be effective, but antenna operation times were not considered. In the wetlands study, the individual measurements of stomatal conductance and water potentials of plants were apparently not related to antenna operation times. If ELF-EMF effects are immediate and reversible, no responses would be seen during temporary shutdowns. Negative results could reflect absent or highly variable exposures rather than no effect of ELF-EMF exposure.

Longer-term responses, such as growth of individuals and populations, will reflect the accumulation of ELF-EMF effects over time. Therefore, the variable timing of antenna operation is of less concern than for short-term responses. However, spatial variation in ELF-EMF intensities could be important for sedentary species. The upland-flora project was the only study in which variation in individual exposure rates was measured and used in the analyses.
In an extreme case, slime molds were removed from the ELF-EMF exposure, and effects on the next generation were studied. For more-mobile species, exposures would be impossible to determine, but their movements probably average out any spatial variation in exposure.

Population responses were examined for periphyton, soil microfauna, insects, birds, and small mammals. These species were good choices because most have relatively short generation times and populations could show responses over the duration of the studies. However, three systems that showed potentially strong population responses were not adequately studied. In the wetland study, moss increased but was not studied, because it was not a target species. Slime-mold populations appeared to be highly sensitive in initial studies, but population growth was not studied in the field site; instead, later generations taken into the laboratory were examined for responses. A statistically significant increase in chlorophyll-a of periphyton was compelling, but no followup laboratory studies were done, apparently because of researchers' resource limitations.

In general, population responses were statistically nonsignificant, but this does not necessarily mean that biologic effects were nonsignificant. For most of the population studies, statistical power of the tests was low, so the ability to detect population response was small. For example, in the bird-population studies, survival of nestlings could be reduced by 10-30% without detection of the reduction in population changes, because most adults probably live 2-6 years. Also, populations could be maintained by dispersal from control sites.

Effects on reproduction, growth, or mortality can be more sensitive measures of population growth rates than population densities. More effort to measure these variables accurately could have been informative. Many of the studies that were done were flawed. The study of nestling development was flawed by uncertainty about antenna power levels during critical periods of development. Similar studies on mice were too limited to yield strong conclusions. Growth rates of trees were studied, but the predictive model was flawed. One of the strongest results, increased overwintering mortality in bees exposed to ELF EMFs, was dismissed by the researchers, perhaps too readily. In the slime-mold study, the mitotic cycle had been reported to be a sensitive variable but was dropped from the study.

Responses of communities, measured by such indexes as diversity, were examined in the studies on soil arthropods, aquatic systems, and bird populations. Ecosystem responses were examined as nutrient uptake and decomposition in the plant studies. None showed significant effects. However, such aggregate variables can mask impacts on underlying processes, especially if there is any compensation; for example, one species replaces another, and increased reproduction balances increased mortality.

WEAKNESS IN THE GENERAL RESEARCH DESIGN FOR RESPONSE VARIABLES

One weakness of the research design for examining responses was the lack of emphasis on understanding possible mechanisms and using mechanistic models. Some researchers did justify their response variables on the basis of known possible mechanisms. For example, the claim that EMFs can affect membrane potentials was used to select stomatal conductance and foliar composition as responses to measure in wetland plants. More consideration of mechanisms could generate more-specific experimental tests or predictive models.

Another weakness of the general research design was the lack of establishment of relationships between exposure or dose and biologic responses: exposures and dosage were poorly understood or not used in the analyses; and if responses were nonlinear with respect to exposure or dose, some effects might not have been detected. Most studies did not address either problem. An exception was the upland-flora study, in which exposures were estimated for individual trees; results indicated a slight stimulation of tree growth at moderate levels of ELF-EMF exposures, but this effect might have been spurious and caused by misuse of exposure measurements.

CONCLUSIONS REGARDING RESPONSE-VARIABLE SELECTION

Many responses, including short-term and long-term responses, were measured, and that is commendable for detecting potential effects at different levels. However, the ability of the studies to detect possible ELF-EMF effects was generally weak. Many studies, especially those on birds and mammals, were flawed in ways that reduced the likelihood of detecting statistically significant responses. Timing of exposure was not related to measurements of short-term responses. Possible small effects would have been difficult to detect, given the lack of power and the occurrence of confounding variables. The term "small effects" is used in this report to refer to ecological effects whose magnitudes are not likely to exceed those expected from normal perturbations over the short term. The lack of information on functional response relationships to ELF-EMF exposures might have reduced the researchers' ability to decide which organisms and response variables are most likely to exhibit effects of the ELF antenna. The lack of reliable information on vertebrate responses is of special concern.
The statistically significant population responses of periphyton, moss, and pollinating insects might be worth following up in more-controlled laboratory studies.

STATISTICAL POWER

The power of a statistical design reflects the likelihood that an experiment will be able to detect the presence or absence of a treatment effect. The importance of adequate power is simple: a study with low statistical power will not be able to accept or reject the null hypothesis with sufficient confidence. In the absence of adequate statistical power, the studies in the ecological monitoring program are uninformative at best. At worst, a potentially important effect might be missed and replaced with an unwarranted sense of confidence that the null hypothesis of no effect has been substantiated.

The committee noted lack of adequate statistical power as a problem that arose in more than one study. In the small-vertebrates study, an early decrease in sample size led to a decrease in power to 70%, and a large number of variables were eventually analyzed with statistical power of less than 30%. The power of statistical tests in the soil arthropods and earthworms study was low because of the sampling scheme (few pitfall traps), large differences between sites, and large variability in the data. It was impossible to estimate accurately the statistical power in several studies, including Michigan and Wisconsin birds and litter decomposition and microflora, because of pseudoreplication and unclear exposure relationships. Reanalysis might improve a number of these studies (see Chapter 5); others are unsalvageable because of flaws in their design or execution. Some studies, such as the pollinating-insects study, did a good job of calculating and discussing the minimal detectable differences and the power of statistical tests.

To quantify the likelihood of detection, or power, one needs to describe three aspects of a study: the experimental design and sample size, the rules by which an effect will be declared statistically significant (as distinct from biologically significant), and the magnitude of effect that will be assumed to result from the experimental intervention. For example, consider the following hypothetical scenario chosen for its relevance to the ELF ecological monitoring program:

· The goal is to compare 20 pairs of nesting birds in a control plot with 20 pairs of birds at one of the treatment sites. The variable to be examined is the number of surviving hatchlings per nesting pair at some specified time. The variable (number of surviving hatchlings) will be assumed to follow a Poisson distribution.
· The statistical rules are (1) the null hypothesis of no difference in the means of the numbers of surviving hatchlings and (2) rejection of the null hypothesis according to a two-tailed test at the 5% significance level (95% confidence).
· The statistical goal is to be able to detect a 20% or greater change in mean number of survivors, under the assumption (justified from prior data) that the mean control number is 3 surviving hatchlings per pair. (Typically, the magnitude of the change to be detected is dictated by considerations of biologic significance.)

For ease of exposition, we will also assume that the hatchlings are independent of each other. Although this assumption is not correct, it eases the presentation without compromising the illustration with respect to statistical power.

That scenario provides sufficient information to compute the statistical power of such an experiment. Statistical power corresponds to the probability that a significant effect will be observed if the mean number of survivors at the treatment site differs from that at the control site by 20% or more, provided that all other assumptions have been satisfied.

It is vital to consider confidence intervals, that is, to consider the results of such experiments not as providing only two mean values with an associated p value for their difference, but as providing the differences to be expected if the experiment were repeated many times. For example, assume that the results of the above experiment are that the mean number of hatchling survivors in the control group differs from the mean number in the treatment group by 22%, with 95% confidence limits of 8% and 150%. If the experiment were repeated many times, one would expect the mean difference between treatment and control groups to be in the range of 8-150% in 95% of the repetitions. Because the confidence interval does not include zero, the null hypothesis is rejected at the 0.05 level of significance. The width of the confidence interval implies that the effect could be uncomfortably close to the null (8%) or could be quite large (150%); that is, the experimental design and sample sizes have led to very imprecise estimates.

What does the outcome of the experiment tell us if the power of the design is 0.9? It indicates that if there is indeed an effect of 20% or greater, 90% of the time the confidence interval will not include the null hypothesis. It provides us with some reasonable bounds on the uncertainty.
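
As a rough illustration of how power could be computed for the hypothetical nesting-bird scenario (20 pairs per site, Poisson counts, control mean of 3 hatchlings, a 20% change, two-tailed test at the 0.05 level), the sketch below uses a normal approximation to the difference between two Poisson means. The choice of a z-test, the function names, and the variable names are assumptions made here for illustration; the report does not specify how the test would be carried out.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def power_two_sample_poisson(mean_control, mean_treatment, pairs_per_site, alpha=0.05):
    """Approximate power of a two-tailed z-test comparing two Poisson means.

    Assumption (not from the report): counts per nesting pair are independent
    and the difference in sample means is treated as approximately normal.
    """
    # A Poisson variable has variance equal to its mean, so the variance of
    # the difference between the two site means is (m0 + m1) / n.
    se = sqrt((mean_control + mean_treatment) / pairs_per_site)
    shift = abs(mean_treatment - mean_control) / se
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic lands in either rejection region.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def pairs_needed(mean_control, mean_treatment, target_power=0.9, alpha=0.05):
    """Approximate number of pairs per site needed to reach the target power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(target_power)
    effect = mean_treatment - mean_control
    return ceil((z_crit + z_power) ** 2 * (mean_control + mean_treatment) / effect ** 2)


control_mean = 3.0                    # mean surviving hatchlings per pair (scenario)
treatment_mean = 0.8 * control_mean   # a 20% reduction at the treatment site
power_20_pairs = power_two_sample_poisson(control_mean, treatment_mean, 20)
pairs_for_90 = pairs_needed(control_mean, treatment_mean, 0.9)

print(f"approximate power with 20 pairs per site: {power_20_pairs:.2f}")
print(f"approximate pairs per site for 90% power: {pairs_for_90}")
```

Under these assumptions, 20 pairs per site give a power of only roughly 0.2 for a 20% reduction, and on the order of 150 or more pairs per site would be needed to approach a power of 0.9. An exact Poisson test or a simulation would give somewhat different numbers, but the qualitative point about sample size is the same.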
An alternative experimental design with a power of 0.3 or less (like those reported in a number of experiments in the ELF ecological monitoring program) tells us that 70% or more of the time the confidence interval could include the null hypothesis. Such an experiment offers no bounds on the uncertainty, and it can reasonably be questioned whether such an experiment should have been performed at all. In this context, it is useful to note that it is common to require that results be expressed in terms of confidence intervals.

RESPONSE TO REVIEWS AND CRITIQUES

Research teams associated with the ecological monitoring program received annual comments from reviewers beginning in 1982. In its evaluation of individual studies, the committee discovered that there was a great deal of variation in how the researchers responded to the reviews of their annual reports and presentations. Some review comments appeared to have been taken seriously and to have led to modification of research designs or report presentations; other comments appear to have been taken lightly or ignored. The committee did not attempt to determine, in a systematic manner, the extent to which the researchers considered the review comments to be appropriate. However, the committee found several instances in which the peer-reviewer comments raised valid and important questions. Some of these issues have been addressed in this report.

In a number of cases, reviewer comments were responded to in a satisfactory manner. The initial study on Michigan and Wisconsin birds was canceled in response to negative reviewer comments, and a new proposal was accepted. The wetlands researchers modified their study design to accommodate reviewer comments and provided explanations if changes were not made. Upland-flora researchers took peer review seriously and addressed the concerns of their reviewers.

Several research teams responded only partially to reviewer comments. The research team for the litter-decomposition study made minor changes in experimental technique that were suggested by reviewers. However, more difficult problems involving theoretical issues were only partially considered, if at all. The researchers appeared not to understand some issues fully. In the pollinating-insects study, most reviewer suggestions were heeded and contributed greatly to the quality of the research. However, suggestions on using a BACI analysis with covariates and on increasing replication were not followed. Small-vertebrates researchers were generally responsive to peer reviewers and made several improvements in the study and in the clarity of the final report.
However, reviewer concerns regarding low statistical power and lack of data archiving were not addressed.

Every year, IITRI received annual reports on all the studies and held a meeting at which each study was presented or discussed. IITRI organized outside reviewers to review the annual reports and to comment on the progress of each study. Some of the external reviewers' comments were very critical. Such criticism should have been a clarion call to IITRI that something was seriously wrong with the design or progress of the criticized study. It is not apparent to the committee that IITRI followed up on reviewer critiques of studies; followup seems to have been left to individual investigators. Consequently, research design, analytic techniques, or interpretation that needed improvement or correction according to external reviewers were often left unattended. That lack of attention to reviewers' comments should have concerned IITRI management. IITRI should have established a regular internal review process to ensure that each study adequately addressed external criticism, even to the point of having external reviewers comment on responses to their criticism. There are also instances (e.g., data archiving) where critiques and suggestions of the peer reviewers were addressed to IITRI managers directly. These critiques appeared to have been ignored by IITRI.

APPROPRIATENESS OF INTERPRETATION

Scientists are well acquainted with the potential for bias in conclusions based on a given set of data. Preference for a particular outcome could result in an interpretation favoring that outcome, and this could occur even in cases where the persons making the judgment believe that they have done so without bias. (Such a tendency toward bias could also be present in the subjects of experiments involving humans. Those considerations have given rise to so-called double-blind experimental protocols, in which neither the subject nor the experimenter has knowledge that will allow such potential bias.)

In several studies of the Navy's ecological monitoring program, modest but significant differences were observed between data collected at treatment sites and data from control sites. Researchers conducting the studies concluded that five of these potential effects were due to factors other than the ELF antenna. Without attempting to judge whether any of those interpretations suggested a predisposition to a particular outcome, it is important to consider whether the conclusions were established with a credible scientific basis. In the course of the committee's review and discussion of the researchers' final reports, concerns arose about the scientific credibility of some of the conclusions.
Differences between treatment sites and control sites that were dismissed by researchers and by IITRI as not being clearly related to ELF exposure were the increase in bee overwintering mortality, the reduction in leaves per bee nest cell, accelerated litter decomposition, earlier eye-opening in mice, and depressed earthworm reproductive rates. The committee believes that some of these observed differences were dismissed too readily as alleged artifacts of environmental variations or experiment design. In the pollinating-insects study, the final conclusion that ELF-EMF effects are absent or minimal is not fully justified by the data presented. Similar concerns were expressed by earlier reviewers, who concluded that ELF-EMF effects were demonstrated. In the litter-decomposition study, the provisional conclusion of negligible effects of the antenna appears to be based on a combination of small differences in mean values and the large variation in the data, not on the validity of the results. Even a 5-10% change in leaf-litter decomposition rates can have a major impact on soil characteristics (McClaugherty et al. 1985). Although dismissal of the five possible small effects described above might be correct, the committee suggests in Chapter 5 that the results of some studies be reanalyzed.

Another concern has to do with the issue of multiple comparisons and the observation of statistically significant results due to chance alone. For example, when researchers perform 100 tests at the 5% level of significance, one expects to find, on the average, five positive values (that is, to reject the null hypothesis) because of chance alone. In the very large number of statistical tests performed in the monitoring program, it would be surprising if no statistically significant findings were reported. Therefore, the issue is not whether significant results emerged from time to time, but whether the number of such events was larger than expected. That issue was not examined systematically as part of the ecological monitoring program.

DIFFERENT METHODS FOR SIMILAR ORGANISMS

The broad range of studies in the program often resulted in examining the possible effects of the antennas on similar organisms or processes, but with somewhat different protocols. For example, litterbag studies of decomposition were performed in uplands, wetlands, and streams, but these studies were performed by different teams, often from different universities or institutions, and were initiated at different times. Under those circumstances, it is understandable that different methods might be used. The upland litterbag studies used leaf litter from dominant tree species, and bags were sampled at about monthly intervals. In the wetland studies, decomposition experiments began with cellulose strips and switched to leaf litter from Ledum groenlandicum, a common shrub.
However, the samples in the wetland decomposition studies were taken only annually. Annual sampling precluded fitting different decay models to the wetland decomposition data, as was possible in the upland studies. It is not possible, therefore, to analyze all the decomposition data with a common statistical or mathematical model.

The use of different methods does not necessarily negate conclusions from any one study. Indeed, there might be sound scientific or logistical reasons for using different methods in different situations. For example, achieving greater precision to meet the objectives of one study might entail different methods from those in another study. But the use of different methods prevented researchers from making valid cross-site comparisons and thereby impeded the realization of the full potential of integration across sites and organisms. For example, in the upland-forest decomposition studies, litter from various species was placed at each site as an "index" material. In the wetland-decomposition studies, cotton strips were used as index materials. Litter in the upland studies was sampled monthly for mass-loss estimates, but in the wetland studies it was sampled annually. There does not appear to have been much discussion of coordinating methods before the experiments began. Therefore, the inability to integrate properly across sites and organisms because of the use of different methods is an unfortunate after-effect of the execution of the work, rather than a tradeoff that was intentionally made for the sake of greater precision within any one study. The collection of studies as a whole would have benefitted from the specification of a small number of hypotheses at the earlier stages of study design.

LACK OF INTEGRATION AMONG STUDIES AND SYNTHESIS OF INFORMATION

As noted in Chapter 1, the original RFP for studies of the effects of ELF EMFs on biologic systems in the region of the ELF communications system antennas was developed by IITRI on the basis of a monitoring program outline from the Navy, previous research, information from state agencies and the U.S. Forest Service, and comments on the Navy's draft environmental-impact statement. Recommendations from the National Research Council report (NRC 1977) on potential effects of ELF EMFs also influenced the RFP. The RFP requested proposals addressing the responsiveness of select groups of organisms to the environment created by the EMFs produced by the ELF antennas.
The list included such groups as mammals, birds, invertebrates, plants, slime molds, and amebas. There was only a slight suggestion in the RFP that studies of effects were to be in an ecosystem context; rather, the emphasis seemed to be on a population approach.

SELECTION OF PROJECTS

In the process of selecting the ecological studies, there does not appear to have been an attempt to fund research teams that were sufficiently close together to encourage interaction. That could have been the result of selection

adjacent to streams with the responses of aquatic attributes or compared responses of invertebrates within different but integrated study areas. Many other examples could be given.

CONCLUSIONS REGARDING INTEGRATION OF STUDIES AND SYNTHESIS OF INFORMATION

In the processes used for selection of studies and study sites, there appears to have been little recognition of the possibility that a response of one attribute of an ecosystem (such as an organism or process) could influence other ecosystem components and that information about related responses might reinforce or undermine conclusions about responses taken one at a time. Instead, possible responses were considered as isolated events, that is, outside an ecosystem or integrated context. Such a perspective might have caused the managers of the ecological monitoring program to establish study site selection requirements based only on exposure levels and research impacts on limited populations of sensitive species. The committee notes that it would be infeasible to synthesize the data on response variables because they represent diverse aspects of ecosystems and because in most cases the measured responses were insignificant. In addition, looking for these interactions and developing a synthesis document might take too long and thus be outside the purview of the ecological monitoring program. The committee also notes that there might be some value in use of varied approaches, given that so little is known about the effects of ELF EMFs on ecosystem components and processes. Nevertheless, recognition of interactions among ecosystem components and encouragement of integration among studies with full development and application of appropriate statistical approaches should have been guiding principles in the early research design.
That would have given the research community the opportunity to synthesize the extensive information generated over many years of study, even outside the funding of IITRI. Both the advisory committee and the monitoring program's management team were remiss in not including recommendations, and perhaps requirements, for integration and synthesis in the RFP.

An additional problem with the lack of integration and synthesis is that there was virtually no opportunity to follow up on the most-positive research findings in lieu of continuing with less-promising research. For instance, the finding of a statistically significant increase in chlorophyll-a in response to ELF EMFs in streams was never followed up with the obvious laboratory studies, because of lack of funds, even though the researchers recognized it as a potentially important finding that needed laboratory confirmation. Conversely, earthworm comparisons between sites that differed vastly in their initial earthworm fauna were pursued, even though findings were doomed to ambiguity (because of poor comparability of sites with respect to the earthworms). The funding and contract basis of these studies locked the program into an inflexible pattern of support for research without matching the funding to a continuous review of the results. If there had been consistent integration and comparison of findings for the different studies, the overall research effort could have been improved for the same amount of money. With better integration, there could have been more pursuit of promising results, the hallmark of good research.

By failing to integrate the studies of different species and ecosystem processes, this large-scale effort largely surrendered the possibility of detecting small changes in interactions of components and gave up the major advantage of such large-scale research. Given the absence of synthetic overview, this might just as well have been many isolated studies of one variable at a time. The project thus missed an excellent opportunity to conduct pioneering ecosystem-level research that spanned physiologic, population, and ecosystem responses. All ingredients for such research were in place, but no one put the ingredients together.

DATA ARCHIVING

The final and annual reports do not contain information on archiving of data generated by the Navy's ecological monitoring program. To understand the current state of data archiving, the committee spoke with several researchers directly. The results of these telephone conversations indicate that each group of researchers used its own protocol for archiving data and that the quality, durability, and accessibility of these protocols differed dramatically.
For the wetlands study, data do not exist in electronic form but are available in the original notebooks and in the annual and final reports published (F. Stearns, formerly of University of Wisconsin-Milwaukee, personal commun., 1996). For the soil-ameba study, biologic data are recorded in notebooks, and environmental data are available on computer printouts. Although many of these data were analyzed in electronic format, the software used was old and the data are probably not retrievable at this point (R.N. Band, Michigan State University, personal commun., 1996). All data from the pollinating-insects study are contained in a relational database, RBase. However, the data would not be usable with the limited documentation available, and anyone who wanted to use them would need to speak to the researchers to have the variables and peculiarities of the database explained (K. Strickler, University of Idaho, personal commun., 1996). Data from the study of bird populations in Michigan were archived with a relational database, Paradox. With some documentation, others would be able to use the files. These data were archived in a standardized format used at the researchers' institute (J. Hanowski, University of Minnesota, personal commun., 1996).

The small-vertebrate study is worth noting in this context because the large, ambitious study yielded voluminous data. It is important to separate the value of the study as a purely scientific investigation from its value as a targeted effort to ascertain whether the Navy's ELF communications system had deleterious effects on the neighboring ecosystems. As a purely scientific investigation, it has yielded mostly high-quality data with incomplete statistical analyses. Some of the flaws identified in the evaluation of the study in Chapter 3 could, in theory, be remedied by analyses that used a framework both physiologically appropriate and statistically sound. Such analyses are not likely, in part because of decisions on the part of the IITRI management team regarding data archiving. The management team failed to observe elementary practices of data management that would have yielded a documented archive of data suitable for re-examination. The latter is an egregious failure, inasmuch as a peer reviewer of the entire monitoring program made the following pointed comments before the 1988 contract-renewal process (letter dated May 5, 1987):

"All participants should be sending YOU hard copies and floppy disks (IBM compatible) right now.
All data sets should have excellent documentation and you should have copies of it. I was a little disturbed that some participants take a lackadaisical view of their data sets.... Each investigator should consider data management as an obligation. For [IITRI] it is an essential." [italics added]

Apparently, the recommendation was not followed, and documented archiving of data was not undertaken.

The Navy ELF ecological monitoring program supported 11 long-term studies. Those studies generated extensive data sets on biologic response variables and environmental conditions. Most of the studies continued for 5 years or more, and each produced annual reports of progress and information gathered. From a review of the overall program, it appears as though all that was expected from each study was an annual report, presentation or attendance at an annual meeting of monitoring-program investigators, and a final report that discussed responses of selected variables and drew conclusions. To meet those expectations, the investigators in each study must have developed extensive sets of information and organized them in a fashion that allowed analysis. Obviously, the data sets include working field notes, tables in electronic and possibly paper form, and results of analyses. As far as the committee is aware, those data sets were kept by individual researchers and not transferred to IITRI. No protocol seems to have been established for the request for proposal (RFP) or later for formatting, documenting, or reporting the data.

After millions of dollars had been spent on a monitoring program that could be used for further understanding of the monitored ecosystems, the resulting information appears not to be readily available; if available, it is not in a uniform, user-friendly format; and there is no common location to which an outsider can address requests for information. Responsibility for the lack of archiving and of planning for long-term availability of the monitoring information appears to rest with IITRI's management of the program. All aspects of archiving of monitoring data should have been designed as part of the RFP, and researchers in all the monitoring studies should have been required to submit their data sets, in the appropriate format, at regular intervals or at least at the completion of each study.
