Scientific Issues in Ecological Monitoring
In her introductory remarks, Schaal laid out the scientific issues that the workshop participants had been asked to address: What do we monitor? How do we do ecological monitoring? How frequently do we monitor? Those are, she said, “rather difficult questions.”
They are difficult because ecological monitoring is still a relatively new field, and although researchers have examined some issues in great detail, others remain mostly unexplored. During the workshop, speakers looked at what needs to be done in various areas for ecological monitoring to become an effective safeguard.
“The first question is what to monitor,” said Power. “What are the organisms? What are the traits of organisms that we want to monitor? How do we choose those organisms?” In some cases, she said, the choice will be straightforward. If, for example, the transgenic crop in question is Bt corn and the concern is that the crop's major pest, the European corn borer, will develop resistance to the Bt toxin, it is necessary to monitor the corn borers for the appearance of genes that confer resistance. “But for many of the other kinds of ecological risks,” she said, “it may be less obvious than that.” When one is concerned about effects on nontarget organisms, for example, there are likely to be many different organisms in both the crop fields and the ecosystems adjacent to the fields. “How do we select among those?”
A number of factors should play a role in such a decision. For instance, Power said, researchers should consider the tolerance, or resistance, of various nontarget organisms to the pesticide or other technology when
deciding which ones to watch, but tolerance shapes the choice in various ways. “We might logically assume that we are interested in the organisms that we think are going to be most susceptible to the particular technology that we are planning to use. In some cases, however, people doing environmental-impact studies have deliberately chosen tolerant nontarget organisms because they can be sure that they can find them when they go out for monitoring.”
Researchers should also consider the abundance of various organisms when deciding which to monitor, but the same sort of conundrum exists. “Do you choose common species that are logistically much easier to deal with in a monitoring scheme,” Power asked, “or do you choose rare species that you predict are going to be more subject to risk?”
The distribution of the nontarget organisms is another important factor: “Do you choose cosmopolitan species so that you can make some general rules that hold across the United States or across the world, or do you choose organisms that are highly localized in their distribution and yet again may be the ones that are most sensitive to the environmental perturbation that you are putting out there?”
Finally, Power said, population stability can make a major difference in the success of monitoring a particular nontarget species: “Choosing organisms that have relatively stable populations seems to give you the opportunity to detect impacts more easily because if you see wild fluctuations you may be able to correlate them in some way with the impact you are interested in. But it may be the organisms with unstable populations, which fluctuate wildly under normal conditions, that are most sensitive to the risk you are interested in. I am not really giving any answers here; I am giving you a sense of the dilemma surrounding how we choose what to monitor.”
In addition to deciding which species to monitor, one must decide how long and how thoroughly to monitor. The difficulty here, Power explained, is that a single ecosystem can vary greatly in space and time. If ecosystems were uniform, it would be possible to take one or a few measurements and spot any effects caused by a transgenic crop. But the natural variation of ecosystems makes it necessary to take data from a number of sampling sites over several years to have a reasonable chance of telling the difference between a real effect and chance variations.
Power described a study performed by a British researcher, Mick Crawley, that looked at the potential for transgenic herbicide-resistant canola to be invasive in the UK. “They set up a huge experiment. They planted in 12 sites, and they followed it over the course of 3 years and measured invasiveness in an appropriate way.” The final answer was that transgenic canola was not likely to prove invasive in the UK. Then, Power said, after the study was published, a second scientist reanalyzed its data
to answer a different question: What if only some of the 12 sites had been sampled, or what if the data had been taken over a period shorter than 3 years? Would the answer have been the same? The reanalysis found that the final answer would have been little changed by taking data at fewer sites but would have been quite different if the study had gone on for just 1 or 2 years instead of 3. Such volatility suggests, Power said, that the year-to-year variation in weather or other factors may make monitoring sensitive to the period over which it takes place. In contrast, the site-to-site variability was apparently not so great in Crawley's study that it would have made much difference to look at fewer than the 12 original sites. “Of course,” Power said, “that really depends on the variability of the sites that were chosen for this particular study, and I am not convinced that would be true for all available laboratory sites.”
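The kind of subsampling reanalysis Power describes can be sketched with invented numbers. Everything below is hypothetical: the scores, the sizes of the year and site effects, and the subsets chosen. The point is only that when year-to-year variation dominates site-to-site variation, dropping sites barely moves the estimate while dropping years moves it substantially.

```python
# Hypothetical invasiveness scores: sites differ only slightly,
# but one year is very different from the others (all numbers invented).
sites = range(12)
years = [1, 2, 3]
year_effect = {1: 4.0, 2: 0.5, 3: 0.3}      # year 1 was unusually favorable
site_effect = {s: 0.05 * s for s in sites}  # small site-to-site differences

score = {(s, y): year_effect[y] + site_effect[s] for s in sites for y in years}

def mean_score(chosen_sites, chosen_years):
    """Average score over a chosen subset of sites and years."""
    vals = [score[(s, y)] for s in chosen_sites for y in chosen_years]
    return sum(vals) / len(vals)

full = mean_score(sites, years)            # all 12 sites, all 3 years
half_sites = mean_score(range(6), years)   # only 6 sites, all 3 years
one_year = mean_score(sites, [1])          # all 12 sites, 1 year only
```

With these invented numbers, `full` is about 1.88 and `half_sites` about 1.73, but `one_year` is about 4.28: halving the sites barely shifts the estimate, while a single year of data is badly misleading.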
If there were no constraints on time and resources, it clearly would make sense to maximize both the length of the monitoring and the number of sites, but such constraints always exist. “Obviously there is a conflict here between the desire to monitor until we are absolutely sure that there isn't going to be an impact and the desire to get the technology out to users,” Power said. “That is a real conflict, and it incorporates both scientific-ecological decision-making and socioeconomic decision-making. I cannot answer the question, but what I can say is that it is important to think carefully about what should be required for field experiments, about what is a realistic but effective design for making sure that year-to-year variability has been accounted for.”
A related issue is that if ecological monitoring is to discover changes in ecosystems caused by the cultivation of transgenic crops, it will be vital to know what those ecosystems were like before the introduction of the transgenic plants. “If you monitor something,” Schaal commented, “you need to collect a series of different data points to tell whether anything is changing. The collection of these data is critical because you cannot tell whether something has changed if you don't have a baseline.”
And because of the natural variability in ecosystems, such a baseline must be more than just a snapshot—that is, more than just data on the ecosystem at one moment. Unless a researcher understands, for example, how much the population of a particular insect normally varies from year to year, it would be impossible to know how to interpret a 30% drop in the insect's numbers the year after a crop of Bt corn was planted. “One of the main difficulties and challenges in impact assessment,” Power said, “is going to be in separating these impacts from natural spatial and temporal variability.” And the problem will only get worse, she predicted, as the global warming trend continues to alter weather and temperature patterns, making year-to-year variability in ecosystems even greater.
Besides providing a basis of comparison for what happens when
transgenic crops are introduced, baseline monitoring helps researchers to understand the systems into which genetically modified plants can be injected. For example, Power said, “it has been suggested for years that viruses are not likely to have any effect on natural plant populations, because most natural plants have evolved resistance to viruses. But there had been little quantitative information to address that question, and studies over the last couple of years have come up with more and more examples of naturally occurring viruses that have had substantial effects on naturally occurring plant populations. So we can no longer assume that there isn't going to be any effect of releasing these virus-resistant plants.” Without a baseline—that is, without knowing in some detail what is going on or what can go on in nature—it is difficult to make an informed judgment about the effects of genetically modified crops.
Getting the detailed baseline data that researchers need, Power said, will require extensive monitoring programs to watch for disease outbreaks or pest infestations in agricultural systems. “But we don't have such a program for natural ecological systems,” she said. “We have some long-term ecological research sites that are meant to begin this process, but these sites are only a decade or two old in most cases. So it is difficult to argue that the baseline monitoring data we have right now are sufficient for many of the kinds of ecological risks that we are interested in.”
One way to correct for having so few baseline data is to use control sites—areas where nontransgenic crops continue to be planted—and compare outcomes there with outcomes at sites where genetically modified crops are introduced. If that is done, the selection of appropriate control sites will be critical, noted Anne Kapuscinski, of the University of Minnesota. The control sites must be carefully matched to the release sites on the basis of key ecological variables, and they must be chosen so that inadvertent contamination from genetically modified crops is unlikely. “It is also going to be important to choose carefully which release sites to monitor,” she said, “because it will not be feasible to monitor each commercial application of a genetically modified organism.”
Ultimately, the purpose of monitoring is to help one to understand the risks and benefits associated with transgenic crops and to be able to respond to or manage the risks effectively. So the monitoring should take into consideration the needs at two interconnected stages of risk decision-making: risk assessment and risk management.
Risk assessment has been defined in a variety of ways over the years, said Bob Frederick of the Environmental Protection Agency, but in essence it is an analytical tool that helps one organize and analyze large amounts of data to estimate the potential risk posed by a process or event of interest. Risk assessors attempt to calculate a numerical value for risk, a
value that can then be used in making decisions about whether to go ahead with a particular action.
The basic formula for risk has two factors, Frederick explained: the probability that a particular undesirable event will occur, and the hazard or damages that would accompany that event. Risk is calculated by multiplying the probability of an event by the magnitude of its hazard. Thus, to estimate the risk posed by transgenic crops, one must have values for both numbers— probability and hazard.
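The two-factor formula Frederick describes can be written down directly. The numbers below are purely illustrative, not values from any actual risk assessment:

```python
def risk(probability, hazard):
    """Risk as defined in the two-factor formula: the probability that an
    undesirable event occurs, multiplied by the damage it would cause."""
    return probability * hazard

# Illustrative numbers only: a common event with modest damage can carry
# the same risk as a rare event with severe damage.
common_small = risk(0.10, 5.0)     # likely event, small hazard
rare_large = risk(0.001, 500.0)    # rare event, large hazard
assert abs(common_small - rare_large) < 1e-12
```

This is also why, as Power notes next, an estimate of probability alone is not enough: without a value for the hazard, the product cannot be computed.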
Power said, “in much of the work done so far in risk assessment of genetically modified crops, we have focused on the probability of the event, trying to get a handle on the number. But the extent of the hazard— what kind of hazards these traits actually confer—is probably more important. An example is the gene-flow literature, in which we now have pretty good estimates of the probability of gene flow for a lot of different crops into their wild relatives, but we still don't have many studies on the extent of the hazard: What does it mean if the gene flow occurs? Does it actually present a hazard?”
Because of that uncertainty, Power said, researchers should focus more on understanding the potential hazards posed by genetically modified crops—the “So what?” question. “Laboratory experiments have shown a variety of examples of hazards from viral recombination that do occur in the laboratory, such as increased virulence, increased host range so that the virus can infect hosts that it would not normally have infected, and changes in transmission, such as viruses that can now be transmitted by an aphid although formerly they were not transmissible by an aphid. Those have been shown under laboratory conditions. The challenge is to figure out how to monitor for them under field conditions.”
As an example, Power described studying the effects of putting viral genes into oats to make the oats resistant to a virus. Research showed that the viral genes did indeed make their way into a companion weed, wild oats, and that the genes made the wild oats resistant to the virus as well. “The question is, Once it becomes resistant, is it likely to become more of a weed? Wild oats are a weed both in agroecosystems and in natural habitats in the sense that they outcompete a lot of native perennial grasses in many parts of the West. The existence of risk depends on such factors as the co-occurrence of domesticated and wild oats, gene flow between them, and the occurrence of viable hybrids of oats and wild oats; and all these have been shown quite extensively. We have been working on whether viral resistance gives wild oats a selective advantage, and the answer is yes. We can see substantial effects on growth, reproduction, and all those things that you would associate with fitness traits. The next step is to ask about it in the field, and that is essentially where we are now.”
BOX 2: Solving the Monarch Mystery
Little is simple in ecological monitoring. It might seem easy, for example, to answer the question, “Is Bt corn killing monarch caterpillars, or isn't it?” But as John Pleasants, of Iowa State University, demonstrated, even such seemingly easy questions can demand tremendous time, resources, and patience to answer.
Monarch caterpillars feed only on the leaves of the milkweed plant, which often grows close to corn, either in or along the edges of fields. If the corn has been genetically modified to produce the Bt toxin, the toxic pollen from the corn can make its way to the leaves of milkweed plants and be eaten inadvertently by the caterpillars.
The ideal way to determine whether the Bt corn actually harms the caterpillars, Pleasants noted, would be to perform a field study: watch a group of caterpillars in a natural setting, determine which are exposed to Bt toxin and which are not, and see whether the exposed caterpillars are more likely to die than the unexposed. Unfortunately, he decided, such a field study would be impractical.
“To do this on a sufficiently large scale,” Pleasants explained, “you need a lot of replication, and this requires lots of resources. We can't just go out and find lots of Bt fields and lots of non-Bt fields and look at naturally occurring larvae on naturally occurring milkweed plants—we can't get the milkweed where we want it, and the number of larvae is too small.
“So you end up having to contrive some sort of experimental field situation. You bring potted plants out there, and put larvae on plants, and so on. But that requires a lot of effort. You need plenty of potted plants, and you need a colony of monarchs so that larvae are available. Some recent studies have done this, but almost always on a very limited scale—maybe one or two replicates.”
To get enough replication—that is, to do the experiment a number of times, varying the conditions from time to time—Pleasants and colleague Richard Hellmich decided on a combination field-laboratory study in which they would perform some measurements in the laboratory and others in the field and then combine them to reach a conclusion. They would go to corn fields to measure the pollen density in and near the fields. Then in the laboratory, they would feed milkweed leaves to monarch caterpillars with different pollen densities—some with the pollen density found right at the edge of the field, some with the density found 1 meter from the field, some 2 meters, some 4 meters, and so on. By seeing how many of the caterpillars survived at each of these levels, they could arrive at a measure of how dangerous the Bt corn was to caterpillars at various distances from the field.
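The logic of the combined design can be sketched as two lookup tables joined together: field measurements supply pollen density at each distance from the cornfield edge, and laboratory feeding trials supply caterpillar survival at each density. All the numbers below are invented for illustration, not Pleasants's data:

```python
# Field component (hypothetical): pollen density, in grains per cm^2,
# measured at various distances (in meters) from the cornfield edge.
pollen_at_distance = {0: 170, 1: 60, 2: 30, 4: 10}

# Laboratory component (hypothetical): caterpillar survival observed
# when fed milkweed leaves dusted with each pollen density.
survival_at_density = {170: 0.60, 60: 0.85, 30: 0.95, 10: 0.99}

# Combining the two gives survival as a function of distance from the field.
survival_at_distance = {
    d: survival_at_density[pollen_at_distance[d]] for d in pollen_at_distance
}
```

In a real analysis the laboratory results would be fitted as a dose-response curve rather than looked up at a few fixed densities, but the combination step works the same way.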
“One of the advantages of having the laboratory component,” Pleasants explained, “is that you are not so time-constrained. When you do it in the field, you have a window of opportunity when corn plants are pollinating.” They still had to take their pollen-density measurements when the corn was pollinating, but that was much simpler than preparing enough milkweed plants with monarch caterpillars on them at just the right time to do the experiment.
“We took seven different fields. At each sample point, we put a little microscope slide coated with glycerine to capture the ambient pollen flow at that point.” The team constructed assemblages of milkweed stems, called “boutonnieres,” that
mimicked how the milkweed leaves would pick up pollen from the air. From that, Pleasants created tables showing how much pollen would probably be found on milkweed leaves growing at various distances from a cornfield. It was only an approximate measure; many things can happen in the field to cause variations in the pollen levels. He found, for instance, that rain will wash about 90% of the pollen off the milkweed leaves. But it was a reasonable approximation.
The next step was to create experimental setups where they raised monarch larvae on milkweed leaves that had been sprayed with carefully calibrated amounts of pollen. They recorded the effects of eating the leaves, including deaths and effects on the caterpillars' weight gain. Finally, they calculated what the effects would be in a natural setting at various distances from a cornfield. Their analysis also looked at several types of Bt corn, including one, Event 176, that expressed the Bt toxin in the pollen and others that expressed the toxin only in green leaves.
The research team found that only Event 176 had an appreciable effect on the caterpillars. For the other varieties of corn, it took such high doses of the pollen to have an effect that only a few caterpillars—those living right at the edge of a cornfield and unlucky enough to live on leaves with a particularly dense dusting of pollen—would be harmed through reduced weight gain, and mortality would be negligible. For Event 176, however, a significant number of the caterpillars living at the edge of a cornfield died, and it was only at distances greater than 4 meters that effects disappeared completely.
The experiment, as carefully as it was done, can only approximate what happens in the field, Pleasants noted, and a number of variables might make the reality very different from the calculated version. “It's possible, for example, that in a benign laboratory environment it takes a high dose to have an effect, but in a field environment where larvae are stressed out by a lot of things, a much lower dose might push them over the edge.
“But it is possible that you might overestimate the toxicity in the laboratory, because there is so much variability in a field situation. In the laboratory, we force them to eat leaves with a particular pollen density. In the field, they could have choices. There might be variability in pollen densities on a leaf itself or on different leaves of one plant.”
Many other factors must be considered to make the laboratory results completely relevant to the field, Pleasants added. For instance, the timing of the monarch life cycle should be compared with the growth cycle of the corn to see exactly what the pollen densities are when the monarch larvae are feeding. Someone should also determine, he said, just how important milkweed in or near cornfields is to monarch production. “In other words, if you imagine a landscape with different kinds of habitat—corn, beans, some Bt corn, natural areas, roadsides— and imagine the distribution of milkweed across that landscape, the question is where the monarch production is coming from.”
Despite its shortcomings, Schaal noted, the sort of real data that this experiment generated is invaluable to those debating the ecological consequences of transgenic crops. “It's interesting to see what the levels of pollen deposition are,” she said. “That allows you to begin to evaluate various risks.”
BOX 3: Type I Versus Type II Errors
On the surface it might seem to be nothing more than an esoteric disagreement about proper statistical technique. In reality, however, the growing debate about type I versus type II errors has the potential to shape policy on genetically engineered organisms in profound ways, so it pays to delve deeply enough into the issue to understand where the disagreement arises and what the stakes are.
Because ecosystems naturally have a great deal of random variation, both in space and in time, monitoring will seldom offer unequivocal evidence to support one conclusion or another. To test whether Bt corn plants growing near cornfields are killing monarch caterpillars, for instance, one might monitor monarch deaths in two spots, one of them next to or in a field with genetically engineered corn and the other next to or in a field with conventional corn. But if more die in the field next to the Bt corn, that does not necessarily mean that the Bt toxin is to blame; the result could simply be a chance difference between the fields. That is what statistics is designed to measure: How likely is it that the effect one sees is nothing more than chance variation? Or, conversely, how likely is it that the effect can be accepted as real? In theory, one could set the standard of proof for such a statistical test at any desired level, demanding that the effect be somewhat likely, moderately likely, or very likely before stating that the data offered evidence to support one's conclusion.
In practice, however, researchers generally demand the same standard of proof from every statistical test. “As scientists, we are pretty much indoctrinated with the notion that we should be looking for significance levels at the 0.05 level,” noted Power. In other words, scientists generally do not accept an effect as proved unless the statistical evidence is so strong that there is less than a 5% probability that the effect—in this case, the death of monarch caterpillars near Bt corn—was the result of chance. In statistical terms, this is known as minimizing type I errors—errors in which one says that there was an effect when there actually was not—in other words, a false positive. “If you think about it,” Power said, “this is clearly a bias in favor of the technology. It is a bias in favor of releasing the technology because we are saying that you have to be sure at the 95% confidence level that there really is an impact or else we are going to assume that there is no impact.”
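The bias Power describes can be made concrete with a small calculation. The sketch below uses a one-sided z-test on a standardized effect; the effect size and sample size are hypothetical, chosen only to show that fixing the type I rate at 5% can still leave a type II error near 50%:

```python
from statistics import NormalDist

def type2_error(effect_size, n, alpha=0.05):
    """Probability of a false negative (type II error) when a one-sided
    z-test at significance level alpha is used to detect a true
    standardized effect of the given size with n samples."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha)                  # cutoff set by the type I rate
    return nd.cdf(z_crit - effect_size * n ** 0.5)  # chance of missing the effect

beta = type2_error(effect_size=0.3, n=30)  # roughly 0.5: a coin flip
# Tightening alpha (fewer false positives) makes false negatives worse:
stricter = type2_error(effect_size=0.3, n=30, alpha=0.01)
```

With these numbers, a real effect is missed about half the time at a significance level of 0.05, and about three-quarters of the time at 0.01: holding down one kind of error inflates the other unless the amount of monitoring increases.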
That is not necessarily the best approach, Power said. “What ecologists have increasingly been arguing is that we ought to be thinking more and more about type II errors,” that is, the error of saying that there is no effect when such an effect is actually there—in other words, a false negative. Minimizing type II errors would turn the traditional approach on its head and demand that a preponderance of evidence show that there was no effect before concluding that nothing was happening. In the case of the monarchs, for instance, even if they were unaffected by living next to Bt corn, it would take much monitoring at many sites before a researcher would declare Bt corn safe. Until then, the results would be worded to say that the data failed to rule out the possibility that Bt corn was harming the caterpillars.
An emphasis on avoiding type II errors is a much more conservative course than the current practice of minimizing type I errors. It would raise the bar much higher for tests of genetically modified crops, in essence minimizing the chance of concluding that they do not increase harm to the environment when, in fact, they do. And that might be what society wants, said Kapuscinski. The types of harm that matter most to society—that people want most to avoid—are precisely the sort that arise from type II errors, such as assuming that a transgenic crop is safe and planting it, only to find later that it causes some ecological damage. People seem less worried about type I errors, whose practical effect would be to keep a safe transgenic crop off the market.
In practice, though, researchers need not necessarily choose between avoiding type I and avoiding type II error in performing their analyses, Kapuscinski said. “To some extent there is a tradeoff between the two,” she said, but researchers can create monitoring experiments that take into account both kinds of error. “In designing monitoring plans, one key criterion should be to try ahead of time to figure out what level of type II versus type I error you can accept and how you will design your experiments to achieve that.”
Power concurred: “The intermediate strategy is to at least consider both kinds of error rather than simply considering type I error, which is what we have been doing pretty much across the board in our risk assessments.”
In short, one of the most important tasks for ecological monitoring will be to help researchers establish what hazards putting transgenic crops into the field can pose in the environment, such as making wild oats a more successful weed.
On a related note, Frederick said, monitoring programs can be designed to provide various details that risk assessors have identified as important but that are unknown or poorly known. Before a complete risk assessment is done, for example, risk assessors can decide which areas are more or less likely to involve risk and then assign monitoring intensity on that basis to make it more relevant and useful.
Once the decision has been made to release a transgenic crop, its effects on the surrounding ecosystem can depend heavily on how the crop is managed, and that too has implications for monitoring. “If we want to have adaptive management strategies in which we alter the management of a particular crop, we need to have data on which to base the alterations of management,” Schaal said.
Gould offered an example of how monitoring might be used in preventing pests in a field of Bt corn from evolving resistance to the Bt toxin. The trick to preventing the resistance from developing is to maintain refugia—areas of corn where the pest is not exposed to the Bt toxin or the
selective pressures that promote resistance development. This cannot be done blindly, however. Gould showed how one can create models of how resistance might develop in a population of insects, monitor the pests to detect signs of such incipient resistance, and use the model and the monitoring to plan refugia accordingly. Without such careful management, the fields could end up inhabited by insects that are not susceptible to Bt toxin.
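The kind of model Gould describes can be sketched in miniature. This is not his model: it is a one-locus toy with invented parameters, in which resistance is recessive (only rr insects survive the toxin) and a refuge lets all genotypes survive:

```python
def next_allele_freq(q, refuge_fraction):
    """One generation of selection on a recessive resistance allele r with
    frequency q: in Bt corn only rr insects survive; in the refuge all do.
    Random mating and Hardy-Weinberg genotype frequencies are assumed."""
    p = 1.0 - q
    rr, rs, ss = q * q, 2 * p * q, p * p  # genotype frequencies before selection
    w_rr = 1.0                            # resistant homozygotes survive everywhere
    w_rs = w_ss = refuge_fraction         # others survive only in the refuge
    mean_w = rr * w_rr + rs * w_rs + ss * w_ss
    return (rr * w_rr + 0.5 * rs * w_rs) / mean_w

q = 0.01                                  # resistance allele starts rare
for _ in range(20):
    q = next_allele_freq(q, refuge_fraction=0.2)
# With a 20% refuge, resistance stays rare for many generations;
# with no refuge, the recessive allele fixes in a single generation.
assert next_allele_freq(0.01, refuge_fraction=0.0) == 1.0
```

Monitoring enters where the model leaves off: field estimates of the resistance-allele frequency tell managers whether the real population is tracking the model's trajectory, and whether the refuge needs to be adjusted.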
Ideally, monitoring should be designed to detect both expected events and unexpected, unpredicted ones. To illustrate her point, Kapuscinski described how attempts to rebuild self-reproducing salmon populations in the Pacific Northwest backfired when carefully planned spawning interventions resulted in a decrease, rather than an increase, in salmon abundance. She summed up this way: “If I were to state in one sentence the primary implications of all these carefully examined cases of failures in living-resources management, I would say that the responsible institutions and users were blind-sided by surprising feedback from the system or, to use the terminology of Senge, ‘fixes that backfire.’ So, if you remember nothing else from what I spoke about today, it should be that we should be prepared for ecological surprise. We should expect ecological surprise.”
Another reference to unexpected results came from Guenther Stotsky, of New York University, who reported that Bt corn decomposes more slowly than non-Bt corn (probably because it has a higher lignin content). His research showed that Bt toxins can remain active in the soil for several months—not an expected result. Despite our lack of knowledge about the ecological implications of these findings, they underscore the importance of being prepared for the unexpected.