Many regulations issued by the U.S. Environmental Protection Agency (EPA) are based on results from computer models. EPA is a global leader in advancing and using models in the environmental regulatory decision process. Yet the agency has not sufficiently leveraged opportunities to improve its regulatory decisions by adopting a comprehensive strategy for periodically evaluating and refining its models. This report recommends a series of guidelines and principles that, if adopted, will improve environmental regulatory models and decisions made by the agency. Moreover, adoption of these principles will enhance the agency’s ability to respond to recent information-quality requirements by allowing EPA to provide more informed responses to outside challenges and reduce the likelihood of erroneous data releases that can prompt challenges.
Models have a long history of helping to explain scientific phenomena and of predicting outcomes and behavior in settings where empirical observations are limited or not available. The use of models has resulted in great advances in scientific understanding and in improvements in a wide array of endeavors. However, by their very nature, all models are simplifications and approximations of the real world. Complex relationships are often simplified, and relationships viewed as unimportant are sometimes eliminated from consideration to reduce computational difficulties and increase transparency.
This report looks specifically at the use of computational models in environmental regulatory activities, particularly at EPA. The use of computational models is central to the regulatory decision-making process because the agency must do prospective analyses of its policies, including estimating possible future effects on the environment, human health, and the economy. Obtaining a comprehensive set of measurement data is not feasible in many cases because of time and resource constraints. The agency uses models to generate estimates (or predictions) when data are not available. EPA also uses models to analyze measurement data for trends and effects. The results of models can become the basis for such decisions as initiating environmental cleanup or regulation. In sum, models are critical tools that help to inform and set priorities in environmental policy development, implementation, and evaluation at EPA.
Because of the critical role played by models, EPA has developed a variety of policies and programs to improve models and their use at the agency. One laudable step has been the establishment of the Council for Regulatory Environmental Modeling (CREM) in 2000 to support modeling activities across the agency and to provide an important resource for interested parties outside of EPA.
The National Research Council (NRC) convened the Committee on Models in the Regulatory Decision Process in response to a request from CREM to independently assess evolving scientific and technical issues related to the selection and use of computational and statistical models in decision-making processes at EPA. The full charge is provided in Box S-1 at the end of the Summary.
MODEL USE IN THE REGULATORY PROCESS AT EPA
Models will always be constrained by computational limitations, assumptions, and knowledge gaps. They can best be viewed as tools to help inform decisions rather than as machines to generate truth or make decisions. Scientific advances will never make it possible to build a perfect model that accounts for every aspect of reality or to prove that a given model is correct in all respects for a particular regulatory application. These characteristics make evaluation of a regulatory model more complex than solely a comparison of measurement data with model results. They suggest that model evaluation be viewed as an integral and ongoing part of the life cycle of a model, from problem formulation and model conceptualization to the development and application of a computational tool. Evaluation of regulatory models also must address a more complex set of trade-offs than evaluation of research models for the same class of models. Regulatory model evaluation must consider how accurately a particular model application represents the system of interest while being reproducible, transparent, and useful for the regulatory decision at hand. Meeting these needs may require different forms of peer review, uncertainty analysis, and extrapolation methods. It also implies that regulatory models should be managed in a way that enhances them in a timely manner and helps users and others understand a model’s conceptual basis, assumptions, input data requirements, and life history.
EPA has played a major role in advancing the science of environmental modeling. However, as with virtually any component of regulatory decision making, improvements to EPA’s efforts are possible. Many of the recommendations in this report are derived from a review of current modeling practices within individual EPA research and program offices. This report aims to provide an across-agency vision for the use of models in the future. In keeping with the study charge, the report provides a set of guidelines for improving the use of models to support regulation. The committee offers recommendations in three areas of the modeling process: (1) model evaluation; (2) principles for model development, selection, and application; and (3) model management.
Life-Cycle Model Evaluation
Models begin their life cycle with the identification of a need and the development of a conceptual approach, and proceed through building of a computational model and subsequent applications. Models also can evolve through multiple versions that reflect new scientific findings, acquisition of data, and improved algorithms. Model evaluation is the process of deciding whether and when a model is suitable for its intended purpose. This process is not a strict validation or verification procedure but is one that builds confidence in model applications and increases the understanding of model strengths and limitations. Model evaluation is a multifaceted activity involving peer review, corroboration of results with data and other information, quality assurance and quality control checks, uncertainty and sensitivity analyses, and other activities. Even when a model has been thoroughly evaluated, new scientific findings may raise unanticipated questions, or new applications may not be scientifically consistent with the model’s intended purpose.
Evaluation of a regulatory model should continue throughout the life of a model. In particular, model evaluation should not stop with the evaluation activities that often occur before the public release of a model but should continue throughout regulatory applications and revisions to the model. For all models used in the regulatory process, the agency should begin by developing a life-cycle model evaluation plan commensurate with the regulatory application of the model (for example, the scientific complexity, the precedent-setting potential of the modeling approach or application, the extent to which previous evaluations are still applicable, and the projected impacts of the associated regulatory decision). Some plans may be brief, whereas others would be extensive. At a minimum, each plan should:
Describe the model and its intended uses.
Describe the relationship of the model to data, including the data for both inputs and corroboration.
Describe how such data and other sources of information will be used to assess the ability of the model to meet its intended task.
Describe all the elements of the evaluation plan by using an outline or diagram showing how the elements relate to the model’s life cycle.
Describe the factors or events that might trigger the need for major model revisions or the circumstances that might prompt users to seek an alternative model. These could be fairly broad and qualitative.
Identify responsibilities, accountabilities, and resources needed to ensure implementation of the evaluation plan.
It is essential that the agency be committed to the concept that model evaluation continues throughout a model’s life. Model evaluation should not be an end unto itself but a means to an end, namely, a model fitted to its purpose. EPA should develop a mechanism that oversees the evaluation process to ensure that an evaluation plan is developed, resources are committed to carry it out, and modelers respond to what is learned. Although the committee does not make organizational recommendations or recommendations on the level of effort that should be expended on any particular type of evaluation, it recognizes that the resource implications for implementing life-cycle model evaluation are potentially substantial. However, given the importance of modeling activities in the regulatory process, such investments are critical to enable environmental regulatory modeling to meet challenges now and in the future.
Peer review is an important tool for improving the quality of scientific products and is basic to all stages of model evaluation. One-time reviews, of the kind used for research articles published in the literature, are insufficient for many of the models used in the environmental regulatory process. More time, effort, and variety of expertise are required to conduct and respond to peer review at different stages of the life cycle, especially for complex models.
Peer review should be considered, but not necessarily performed, at each stage in a model’s life cycle. Some simple, uncontroversial models might not require any peer review, whereas others might merit peer review at several stages. Appropriate peer review requires an effort commensurate with the complexity and significance of the model application. When a model peer review is undertaken, EPA should allow sufficient time, resources, and structure to assure an adequate review. Reviewers should receive not only copies of the model and its documentation but also documentation of its origin and history. Peer review for some regulatory models should involve comparing the model results with known test cases, reviewing the model code and documentation, and running the model for several types of problems for which the model might be used. Reviewing model documentation and results is not sufficient peer review for many regulatory models.
Because many stakeholders and others interested in the regulatory process do not have the capability or resources for a scientific peer review, they need to be able to have confidence in the evaluation process. This need requires a transparent peer review process and continued adherence to criteria provided in EPA’s guidance on peer review. Documentation of all peer reviews, as well as evidence of the agency’s consideration of comments in developing revisions, should be part of the model origin and history.
Quantifying and Communicating Uncertainty
There are two critical but distinct issues in uncertainty analysis for regulatory environmental modeling: what kinds of analyses should be done to quantify uncertainty, and how these uncertainties should be communicated to policy makers.
A wide range of possibilities is available for performing model uncertainty analysis. At one extreme, all model uncertainties could be represented probabilistically, and the probability distribution of any model outcome of interest could be calculated. However, in assessing environmental regulatory issues, these analyses generally would be quite complicated to carry out convincingly, especially when some of the uncertainties in critical parameters have broad ranges or when the parameter uncertainties are difficult to quantify. Thus, although probabilistic uncertainty analysis is an important tool, requiring EPA to do complete probabilistic regulatory analyses on a routine basis would probably result in superficial treatments of many sources of uncertainty. The practical problems of performing a complete probabilistic analysis stem from models that have large numbers of parameters whose uncertainties must be estimated in a cursory fashion. Such problems are compounded when models are linked into a highly complex system, for example, when emissions and meteorological model results are used as inputs into an air quality model.
At the other extreme, scenario assessment and/or sensitivity analysis could be used. Neither one in its simplest form makes explicit use of probability. For example, a scenario assessment might consider model results for a relatively small number of plausible cases (for example, “pessimistic,” “neutral,” and “optimistic” scenarios). Such a deterministic approach is easy to implement and understand. However, a scenario assessment typically conveys no information about conditions outside the scenarios assessed, nor does it convey whatever is known about each scenario’s likelihood.
It is not necessary to choose between purely probabilistic approaches and deterministic approaches. Hybrid analyses combining aspects of probabilistic and deterministic approaches might provide the best solution for quantifying uncertainties, given the finite resources available for any analysis. For example, a sensitivity analysis might be used to determine which model parameters are most likely to have the largest impacts on the conclusions, and then a probabilistic analysis could be used to quantify bounds on the conclusions due to uncertainties in those parameters. In another example, probabilistic methods might be chosen to quantify uncertainties in environmental characteristics and expected human health impacts, and several plausible scenarios might be used to describe the monetization of the health benefits.
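The first hybrid example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model (`toy_model`), its three parameters, their nominal values, and their plausible ranges are all hypothetical and do not correspond to any EPA model. A one-at-a-time sensitivity screen first identifies the parameter whose plausible range moves the output most, and a simple Monte Carlo analysis (assuming, for illustration, a uniform distribution over that range) then bounds the output with respect to that parameter alone.

```python
import random
import statistics


def toy_model(emission_rate, decay_rate, mixing_volume):
    """Hypothetical steady-state concentration (illustrative only)."""
    return emission_rate / (decay_rate * mixing_volume)


# Hypothetical nominal values and plausible (low, high) ranges.
NOMINAL = {"emission_rate": 100.0, "decay_rate": 0.5, "mixing_volume": 1000.0}
RANGES = {
    "emission_rate": (90.0, 110.0),
    "decay_rate": (0.2, 0.8),
    "mixing_volume": (950.0, 1050.0),
}


def one_at_a_time_sensitivity(model, nominal, ranges):
    """Deterministic step: swing in output as each input spans its range."""
    swings = {}
    for name, (low, high) in ranges.items():
        out_low = model(**{**nominal, name: low})
        out_high = model(**{**nominal, name: high})
        swings[name] = abs(out_high - out_low)
    return swings


def monte_carlo_interval(model, nominal, ranges, varied, n=5000, seed=0):
    """Probabilistic step: sample only the selected inputs (uniform assumed)."""
    rng = random.Random(seed)
    results = []
    for _ in range(n):
        params = dict(nominal)
        for name in varied:
            low, high = ranges[name]
            params[name] = rng.uniform(low, high)
        results.append(model(**params))
    results.sort()
    # Empirical 5th percentile, median, and 95th percentile.
    return results[int(0.05 * n)], statistics.median(results), results[int(0.95 * n)]


if __name__ == "__main__":
    swings = one_at_a_time_sensitivity(toy_model, NOMINAL, RANGES)
    dominant = max(swings, key=swings.get)
    print("output swing by parameter:", swings)
    print("most influential parameter:", dominant)
    print("5%/median/95%:", monte_carlo_interval(toy_model, NOMINAL, RANGES, [dominant]))
```

The screening step is cheap and transparent, while the probabilistic effort is spent only where it matters most. In practice, the choice of distribution for the influential parameter would itself need justification.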
Questions about which of several plausible models to use can sometimes be the dominant source of uncertainty and, in principle, can be handled probabilistically. However, a scenario assessment approach is particularly appropriate for showing how different models yield differing results.
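A scenario-style comparison across models might look like the following sketch. The three dose-response forms and all of their parameter values are hypothetical and were chosen only to show how plausible model structures can diverge at low doses; none is taken from an EPA assessment.

```python
def linear_no_threshold(dose, slope=0.01):
    """Hypothetical model: risk proportional to dose, capped at 1."""
    return min(1.0, slope * dose)


def threshold_model(dose, slope=0.01, threshold_dose=10.0):
    """Hypothetical model: no risk below a threshold dose."""
    return min(1.0, max(0.0, slope * (dose - threshold_dose)))


def saturating_model(dose, max_risk=0.5, half_dose=50.0):
    """Hypothetical model: risk levels off at high doses."""
    return max_risk * dose / (half_dose + dose)


CANDIDATE_MODELS = {
    "linear_no_threshold": linear_no_threshold,
    "threshold": threshold_model,
    "saturating": saturating_model,
}


def compare_models(models, dose):
    """Run every candidate model on the same input and keep results by name."""
    return {name: model(dose) for name, model in models.items()}


def spread(results):
    """Smallest and largest result across the candidate models."""
    values = list(results.values())
    return min(values), max(values)


if __name__ == "__main__":
    for dose in (5.0, 50.0):
        results = compare_models(CANDIDATE_MODELS, dose)
        print(dose, results, spread(results))
```

Reporting each model's result side by side, rather than a single blended number, keeps the disagreement among models visible to the decision maker.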
Effective decision making will require providing policy makers with more than a single probability distribution for a model result (and certainly more than just a single number, such as the expected net benefit, with no indication of uncertainty). Such summaries obscure the sensitivities of the outcome to individual sources of uncertainty, thus undermining the ability of policy makers to make informed decisions and constraining the efforts of stakeholders to understand the basis for the decisions.
In some cases, presenting results from a small number of model scenarios will provide an adequate uncertainty analysis (for example, cases in which the stakes are low, modeling resources are limited, or insufficient information is available). In many instances, however, probabilistic methods will be necessary to characterize properly at least some uncertainties and to communicate clearly the overall uncertainties. Although a full Bayesian analysis that incorporates all sources of information is desirable in principle, in practice, it will be necessary to make strategic choices about which sources of uncertainty justify such treatment and which sources are better handled through less formal means, such as consideration of how model outputs change as an input varies through a range of plausible values. In some applications, the main sources of uncertainty will be among models rather than within models, and it will often be critical to address these sources of uncertainty.
Probabilistic uncertainty analysis should not be viewed as a means to turn uncertain model outputs into policy recommendations that can be made with certitude. Whether or not a complete probabilistic uncertainty analysis has been done, the committee recommends that various approaches be used to communicate the results of the analysis. These include hybrid approaches in which some unknown quantities are treated probabilistically and others are explored in scenario-assessment mode by decision makers through a range of plausible values. Effective uncertainty communication requires a high level of interaction with the relevant decision makers to ensure that they have the necessary information about the nature and sources of uncertainty and their consequences. Thus, performing uncertainty analysis for environmental regulatory activities requires extensive discussion between analysts and decision makers.
The Interdependence of Models and Measurements
The interdependence of models and measurements is complex and iterative for several reasons. Measurements help to provide the conceptual basis of a model and inform model development, including parameter estimation. Measurements are also a critical tool for corroborating model results. Once developed, models can drive priorities for measurements that ultimately get used in modifying existing models or in developing new ones.
Measurement and model activities are often conducted in isolation. For example, modelers often add details to models without sufficient measurements to justify or confirm the importance of these changes. Likewise, field and laboratory scientists might expand their compilation of samples without understanding the utility of such information for modeling. Although environmental data systems serve a range of purposes, including compliance assessment, monitoring of trends in indicators, and basic research performance, the importance of models in the regulatory process requires measurements and models to be better integrated. Adaptive strategies that rely on iterations of measurements and modeling, such as those discussed in the 2003 NRC report titled Adaptive Monitoring and Assessment for the Comprehensive Everglades Restoration Plan, provide examples of how improved coordination might be achieved.
Using adaptive strategies to coordinate data collection and modeling should be a priority of decision makers and those responsible for regulatory model development and application. The interdependence of measurements and modeling needs to be fully considered as early as the conceptual model development phase. Developing adaptive strategies will benefit from the contributions of modelers, measurement experts, decision makers, and resource managers.
Retrospective Analysis of Models
EPA has been involved in the development and application of computational models for environmental regulatory purposes for as long as the agency has been in existence. Its reliance on models has only increased over time. However, attempts to learn from prior experiences with models and to apply these lessons have been insufficient.
The committee recommends that EPA conduct and document the results of retrospective reviews of regulatory models not only on single models but also at the scale of model classes, such as models of groundwater flow and models of health risks. The goal of such retrospective evaluations should be the identification of priorities for improving regulatory models. One objective of this analysis would be to investigate systematic strengths and weaknesses that are characteristic of various types of models. A second important objective would be to study the processes (for example, approaches to model development and evaluation) that led to successful models and model applications.
In carrying out a retrospective analysis, it might be helpful to use models or categories of models that are old by current modeling standards, because the older models could present the best opportunities to assess actual model performance quantitatively by using subsequent advances in modeling and in new observations.
PRINCIPLES FOR MODEL DEVELOPMENT, SELECTION, AND APPLICATION
Models are always incomplete, and efforts to make them more complete can be problematic. As features and capabilities are added to a model, the cumulative effect on model performance needs to be evaluated carefully. Increasing the complexity of models without adequate consideration can introduce more model parameters with uncertain values, and decrease the potential for a model to be transparent and accessible to users and reviewers. It is often preferable to omit capabilities that do not improve model performance substantially. Even more problematic are models that accrue substantial uncertainties because they contain more parameters than can be estimated or calibrated with available observations.
Models used in the regulatory process should be no more complicated than is necessary to inform regulatory decisions. In the process of evaluating whether a model is suitable for its given application, there should be a critical evaluation of whether the model has been made unreasonably complicated. This evaluation should include how model developers and those who select a model for a particular application have addressed the trade-offs between the need for a given model application to be an accurate representation of the system of interest and the need for it to be reproducible, transparent, and useful for the regulatory decision at hand.
Model use in the environmental regulatory process may involve using the model to extrapolate beyond conditions for which the model was constructed or calibrated or conditions for which the model outputs cannot be verified. For example, it might be necessary to extrapolate laboratory animal data to assessments of possible human effects or to extrapolate the recent history of global environmental conditions to future conditions. In these circumstances, uncertainties about the form of a model and the parameters in the model might yield large uncertainties in model outputs. This problem can be compounded by making a model more complex if the additional processes in the more complex model are unimportant; any extra parameters that need to be estimated could degrade the confidence in the estimates of all parameters.
Extrapolating far beyond the available data for the model draws particular attention in the evaluation process to the theoretical basis of the model, the processes represented in the model, and the parameter values. When critical model parameters are estimated largely on the basis of matching model output to historical data, care must be taken to provide uncertainty estimates for the extrapolations, especially for models with many uncertain parameters.
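The following sketch shows one way uncertainty grows with extrapolation distance, under strong simplifying assumptions: a straight-line model fitted by ordinary least squares to a small, made-up historical record (`HISTORICAL_X`, `HISTORICAL_Y`), with independent errors of constant variance. It uses the standard formula for the standard error of a new prediction from simple linear regression. It captures only parameter and residual uncertainty. It says nothing about whether the linear form itself remains valid outside the data, which is often the larger concern.

```python
import math

# Made-up historical record, for illustration only.
HISTORICAL_X = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
HISTORICAL_Y = [1.1, 2.9, 5.2, 7.1, 8.8, 11.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    resid_var = sum(r * r for r in residuals) / (n - 2)
    return {
        "slope": slope,
        "intercept": intercept,
        "n": n,
        "x_bar": x_bar,
        "sxx": sxx,
        "resid_var": resid_var,
    }


def predict_with_standard_error(fit, x0):
    """Prediction at x0 and the standard error of a new observation there."""
    prediction = fit["intercept"] + fit["slope"] * x0
    distance_term = (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    std_error = math.sqrt(fit["resid_var"] * (1.0 + 1.0 / fit["n"] + distance_term))
    return prediction, std_error


if __name__ == "__main__":
    fit = fit_line(HISTORICAL_X, HISTORICAL_Y)
    for x0 in (2.5, 10.0, 20.0):
        print(x0, predict_with_standard_error(fit, x0))
```

The distance term grows with the square of how far `x0` lies from the center of the historical data, so the standard error is smallest inside the observed range and increases steadily as the prediction moves away from it.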
A model is proprietary if any component that is a fundamental part of the model’s structure or functionality is not available for free to the general public. The use of proprietary models in the regulatory process can produce distrust among regulated parties and other interested individuals and groups because their use might prevent those affected by a regulatory decision from having access to a model that may have affected the decision. There are many ways in which a model can be proprietary, and some are more prone to engender distrust than others. For example, a model that uses proprietary algorithms may cause more concern than a model that uses publicly available algorithms but has a proprietary user interface.
The committee recommends that EPA adopt a preference for nonproprietary software for environmental modeling. When developing a model, EPA should establish and pursue a goal of not using proprietary elements. It should adopt proprietary models only when a clear and well-documented case has been made that the advantages of using such models outweigh the costs in lower credibility and transparency that accompany reliance on proprietary models. Furthermore, proprietary models should be subject to rigorous quality requirements and to peer review that is equivalent to peer review for public models. If necessary, nondisclosure agreements could be used for experts to perform a thorough review of the proprietary portions of the model. The review process and results could then be made public without compromising proprietary features. General-purpose proprietary software (for example, Excel, SAS, and MATLAB) usually will not require such scrutiny, although EPA should be cognizant of the costs that obtaining and using such software may impose on interested parties.
Models and Rule-makings
The sometimes contentious setting in which regulatory models are used may impede EPA’s ability to implement some of the recommendations in this report, including the life-cycle evaluation process. Even high-quality models are filled with components that are incomplete and must be updated as new knowledge arises. Yet, those attributes may provide stakeholders with opportunities to mount formal challenges against models that produce outputs that they find undesirable. Requirements such as those in the Information Quality Act may increase the susceptibility of models to challenges because outside parties may file a correction request for information disseminated by agencies.
When a model that informs a regulatory decision has undergone the multilayered review and comment processes, the model tends to remain in place for some time. This inertia is not always ideal: the cumbersome regulatory procedures and the finality of the rules that survive them may be at odds with the dynamic nature of modeling and the goal of improving models in response to experience and scientific advances.
In such an adversarial environment, EPA might perceive that a rigorous life-cycle model evaluation is ill-advised from a legal standpoint. Engaging in this type of rigorous review may expose the model to a greater risk of challenges, at least insofar as the agency’s review is made public, because the agency is documenting features of its models that need to be improved. Moreover, revising a model can trigger lengthy administrative notice and comment processes. However, an improved model is less likely to generate erroneous results that could lead to additional challenges, and it better serves the public interest.
It is important that EPA institute best practice standards for the evaluation of regulatory models. Best evaluation practices may be much easier for EPA to implement if its resulting rigorous life-cycle evaluation process is perceived as satisfying regulatory requirements, such as those of the Information Quality Act. However, for an evaluation process to meet the spirit and intent of the Information Quality Act, EPA’s evaluation process must include a mechanism for any person to submit information or corrections to a model. Rather than requiring a response within 60 days, as the Information Quality Act does, the evaluation process would involve consideration of that information and response at the appropriate time in the model evaluation process.
To further encourage life-cycle evaluation of models that support federal rule-makings, alternative means of soliciting public comment on model revisions need to be devised. For example, EPA could promulgate a separate rule-making that establishes an agency-wide process for the evaluation and adjustment of models used in its rules. Such a programmatic process would allow the agency to provide adequate opportunities for meaningful public comment at important stages of the evaluation and revision of an individual model, without triggering the need for a separate rule-making for each revision. A more rigorous and formalized evaluation process for models may result in greater deference to agency models by interested parties and by reviewing courts. Such a response could decrease the extent of model challenges through adversarial processes.
Model Origin and History
Models are developed and applied over many years by participants who enter and exit the process over time. The model origin and history can be lost when individual experiences with a model are not documented and archived. Without an adequate record, a model might be incorrectly applied, or developers might be unable to adapt the model for a new application. Poor historical documentation could also frustrate stakeholders who are interested in understanding a model. Finally, without adequate documentation, EPA might be limited in its ability to justify decisions that were critical to model design, development, or model selection.
As part of the evaluation plan, a documented history of important events regarding the model should be maintained, especially after public release. Each model’s documentation should record the model’s origin, including such key elements as the identity of the model developer and institution, the decisions on critical model design and development, and the records of software version releases. The model documentation also should have elements in “plain English” to communicate with nontechnical evaluators. An understandable description of the model itself, justifications, limitations, and key peer reviews are especially important for building trust.
The committee recognizes that information relevant to model origins and histories is already being collected by CREM and stored in its model database, which is available on the CREM web site. CREM’s database includes over 100 models, although updating of this site has declined in recent years. It provides information on obtaining and running the models and on the models’ conceptual bases, scientific details, and results of evaluation studies. One possible way to implement the recommendation for developing and maintaining the model history may be to expand CREM’s efforts in this direction. The EPA Science Advisory Board review of CREM contains additional recommendations with regard to specific improvements in CREM’s database.
Improving Model Accessibility
Stakeholders and others necessarily play a vital role in EPA’s use and evaluation of regulatory models. Differing interpretations of data on risk, environmental trends, and a range of social values mean that a broad array of participants will have a stake in the modeling exercise. As a result, various constituencies and individuals must be able to participate in the modeling process through a variety of activities, such as producing their own model results and commenting on and possibly challenging the legitimacy or accuracy of a model.
EPA faces a number of challenges in making its regulatory models, particularly its complex models, accessible to these diverse interests. Nevertheless, EPA has taken some steps to address accessibility to models, including the CREM database of models. This information enhances the transparency and understandability of models to a wide array of interested participants. Despite these efforts, however, stakeholders and others with limited resources or insufficient technical expertise still face substantial barriers to being able to evaluate EPA’s models, comment on important model assumptions, or use the models in their own work.
EPA should place a high priority on ensuring that stakeholders and others have access to models for regulatory decision making. To ensure that its models database contains all actively used models, EPA should continue its support for the intra-agency efforts of CREM. A more formal process may be needed to ensure that CREM’s models database is complete and updated with information that is at least equivalent to information provided for models currently contained in the database.
Yet, even with a high-quality models database, EPA should continue to develop initiatives to ensure that its regulatory models are as accessible as possible to the broader public and stakeholder community.
The level of effort should be commensurate with the impact of the model use. It is most important to highlight the critical model assumptions, particularly the conceptual basis for a model and the sources of significant uncertainty. Meaningful stakeholder involvement should be solicited at the model development and model application stages of regulatory activity, when appropriate. EPA could improve model accessibility through a variety of activities, such as requiring an additional interface for each model to help to identify the assumptions and sources of parameters and other uncertainties and providing additional user and stakeholder training.
However, even if full information on a model is available, technical expertise will still be required to judge independently its quality and suitability for regulatory application. Each of these recommendations requires staff time and resources, which may be considerable. Thus, EPA’s efforts to enhance opportunities for public participation in any particular case must be balanced against other agency priorities.
The committee anticipates that its recommendations will be met with some resistance because of the potentially substantial resources needed for implementing life-cycle model evaluation. However, given the critical importance of having high-quality models for decision making, such investments are essential if environmental regulatory modeling is to meet challenges now and in the future.
BOX S-1
A National Research Council committee will assess evolving scientific and technical issues related to the selection and use of computational and statistical models in decision-making processes at the Environmental Protection Agency (EPA). The committee will provide advice concerning the development of guidelines and a vision for the selection and use of models at the agency. Through public workshops and other means, the committee will consider cross-discipline issues related to model use, performance evaluation, peer review, uncertainty, and quality assurance/quality control. The committee will assess scientific and technical criteria that should be considered in deciding whether a model and its results could serve as a reasonable basis for environmental regulatory activities. It will also examine case studies of model development, evaluation, and application to further elucidate guiding principles. The objective of the committee will be to provide a report that will serve as a fundamental guide for the selection and use of models in the regulatory process at EPA. The goal is to produce a report on models similar to the NRC’s 1983 “Red Book” on risk assessment (Risk Assessment in the Federal Government: Managing the Process). As part of its scientific assessment, the committee will need to carefully consider the realities of EPA’s regulatory mission so as to provide practical advice on model development and use. The report will avoid an overly prescriptive and stringent set of guidelines and will recognize the need for regulatory and policy decisions in the face of incomplete information and uncertainty. In particular, the committee will not attempt to define a numerical standard for accuracy that all models must attain before they can be used in the decision-making process.
The committee will address the following specific issues: