CMIP3 experiments. Two studies, one based on the GFDL model (Held et al., 2005) and the other on the Community Climate System Model (CCSM; Hurrell et al., 2004), show roughly similar skill in reproducing the late-20th-century Sahel drought and propose roughly equivalent explanations for it, based on Atlantic meridional temperature gradients. Yet their future scenario runs differ sharply, producing projections of opposite sign for Sahel drought in the 21st century. Hoerling et al. (2006), in their comparative study of the CMIP3 models’ Sahel simulations, acknowledged these differences but could not attribute them to any single difference in physics or process between the models; nor can the community easily tell which, if any, of the projections is the more credible. Tebaldi and Knutti (2007) make a related point: intermodel spread cannot be explained, or even analyzed, beyond a certain point.
This weakness in methodology requires the climate modeling community to address the issue of scientific reproducibility. The ability to independently replicate a scientific result is a cornerstone of the scientific method, yet climate modelers do not currently have a reliable method for reproducing a result from one model using another. The computational science community has begun to take a serious look at this issue and has produced a considerable literature on it, including a dedicated special issue of Computing in Science and Engineering (CISE, 2009). Peng (2011) summarized the issue as follows:
Computational science has led to exciting new developments, but the nature of the work has exposed limitations in our ability to evaluate published findings. Reproducibility has the potential to serve as a minimum standard for judging scientific claims when full independent replication of a study is not possible.
Having all of the nation’s models buy into a common framework would allow this research to be systematized. Maintaining the ability to run experiments across a hierarchy of models under systematic component-by-component changes could hasten scientific progress significantly.
Research in the science of coupling is needed to make the vision a reality. Existing framework software does not specify how models are coupled. Software standards are one part of the story, but there will remain work to be done to define choices for coupling algorithms, fields to be exchanged, and so on.
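As a rough illustration of what a shared coupling interface might look like, the sketch below defines a minimal contract that any component would satisfy, so that one component can be swapped for another without changing the coupler. Everything here is hypothetical and chosen only for illustration: the `Component` protocol, the toy `SlabOcean` and `ToyAtmosphere` classes, the exchanged field names (`sst`, `net_heat_flux`), the parameter values, and the simple sequential coupling order. None of it is drawn from an existing framework.

```python
from dataclasses import dataclass
from typing import Dict, Protocol

Fields = Dict[str, float]  # hypothetical: named scalar fields exchanged at the interface


class Component(Protocol):
    """Hypothetical common interface that every model component implements."""
    name: str

    def import_fields(self, fields: Fields) -> None: ...
    def step(self, dt: float) -> None: ...
    def export_fields(self) -> Fields: ...


@dataclass
class SlabOcean:
    """Toy ocean: a single well-mixed layer warmed by the surface heat flux."""
    name: str = "slab_ocean"
    sst: float = 288.0            # sea surface temperature, K (illustrative value)
    heat_capacity: float = 4.0e8  # J m^-2 K^-1 (illustrative value)
    net_heat_flux: float = 0.0    # W m^-2, received from the atmosphere

    def import_fields(self, fields: Fields) -> None:
        self.net_heat_flux = fields["net_heat_flux"]

    def step(self, dt: float) -> None:
        # Forward-Euler update of the mixed-layer temperature.
        self.sst += self.net_heat_flux * dt / self.heat_capacity

    def export_fields(self) -> Fields:
        return {"sst": self.sst}


@dataclass
class ToyAtmosphere:
    """Toy atmosphere: fixed air temperature, bulk heat exchange with the ocean."""
    name: str = "toy_atmosphere"
    t_air: float = 290.0          # K (illustrative value)
    exchange_coeff: float = 20.0  # W m^-2 K^-1 (illustrative value)
    sst: float = 288.0            # K, received from the ocean

    def import_fields(self, fields: Fields) -> None:
        self.sst = fields["sst"]

    def step(self, dt: float) -> None:
        pass  # air temperature is held fixed in this sketch

    def export_fields(self) -> Fields:
        return {"net_heat_flux": self.exchange_coeff * (self.t_air - self.sst)}


def run_coupled(atmos: Component, ocean: Component, dt: float, n_steps: int) -> Fields:
    """Sequential coupling: the atmosphere forces the ocean, then the ocean feeds back."""
    for _ in range(n_steps):
        ocean.import_fields(atmos.export_fields())
        ocean.step(dt)
        atmos.import_fields(ocean.export_fields())
        atmos.step(dt)
    return ocean.export_fields()


if __name__ == "__main__":
    # Same coupler and atmosphere, two interchangeable ocean configurations.
    shallow = run_coupled(ToyAtmosphere(), SlabOcean(), dt=86400.0, n_steps=30)
    deep = run_coupled(ToyAtmosphere(), SlabOcean(heat_capacity=8.0e8), dt=86400.0, n_steps=30)
    print(shallow["sst"], deep["sst"])
```

Because the coupler depends only on the interface, a component-by-component change amounts to passing a different object to `run_coupled`. The choices the sketch fixes arbitrarily (which fields are exchanged, in what order, and at what interval) are exactly the ones that would need to be settled by agreed standards.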
An effective common modeling infrastructure would include
• common software standards and interfaces for technical infrastructure (e.g., I/O and parallelism);
• common coupling interfaces across a suite of model components of varying complexity;