Evaluation of model performance often requires comparison of model simulations with observed outcomes. Simulations from LCMs usually produce maps of land use, land cover, or some other land-related variable. A standard approach to evaluating a land change model is to develop the model through calibration with historical data, for example using two or more maps of land cover from the calibration time interval. The calibrated model then simulates change during a validation interval that extends to another time point for which reference data are available. The map of simulated change is then compared with the map of reference change during the validation interval, and the differences are evaluated with some set of metrics. This comparison requires three maps: the reference map at the start time of the simulation, the reference map at the end time of the simulation, and the simulation map at the end time of the simulation. This three-map analysis shows how the simulated change compares to the reference change by revealing five components: (1) reference change simulated correctly as change (i.e., hits), (2) reference change simulated incorrectly as persistence (i.e., misses), (3) reference persistence simulated incorrectly as change (i.e., false alarms), (4) reference persistence simulated correctly as persistence (i.e., correct rejections), and (5) reference change simulated incorrectly as change to the wrong gaining category (i.e., wrong hits) (Pontius et al., 2011). The relative sizes of these five components can be used to compute quantity disagreement and allocation disagreement (Pontius and Millones, 2011).
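The five components above can be computed pixel by pixel from the three maps. The sketch below assumes the maps are integer-coded category rasters of equal shape held in NumPy arrays; the function name and dictionary keys are illustrative choices, not terminology fixed by the cited papers.

```python
import numpy as np

def five_components(ref_start, ref_end, sim_end):
    """Count the five components of the three-map comparison.

    ref_start : reference category map at the start of the simulation
    ref_end   : reference category map at the end of the simulation
    sim_end   : simulated category map at the end of the simulation
    All three are integer arrays of identical shape.
    """
    ref_start = np.asarray(ref_start)
    ref_end = np.asarray(ref_end)
    sim_end = np.asarray(sim_end)

    ref_change = ref_start != ref_end   # where the reference actually changed
    sim_change = ref_start != sim_end   # where the model simulated change

    # Reference change simulated as change, to the correct category.
    hits = ref_change & sim_change & (ref_end == sim_end)
    # Reference change simulated as change, but to the wrong gaining category.
    wrong_hits = ref_change & sim_change & (ref_end != sim_end)
    # Reference change simulated as persistence.
    misses = ref_change & ~sim_change
    # Reference persistence simulated as change.
    false_alarms = ~ref_change & sim_change
    # Reference persistence simulated as persistence.
    correct_rejections = ~ref_change & ~sim_change

    return {
        'hits': int(hits.sum()),
        'misses': int(misses.sum()),
        'false_alarms': int(false_alarms.sum()),
        'wrong_hits': int(wrong_hits.sum()),
        'correct_rejections': int(correct_rejections.sum()),
    }
```

The five counts partition the study area, so they sum to the total number of pixels, which provides a useful sanity check in practice.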
The three-map comparison and its five components reveal the accuracy of the land change model relative to a null model that predicts complete persistence. Where the land change model generates a miss, the null model would also produce a miss. Where the land change model generates a false alarm, the null model would produce a correct rejection. Where the land change model obtains a hit or a wrong hit, the null model would produce a miss. Thus, a modeler who computes the five components of the three-map comparison has also produced a comparison with the null model. A frequent blunder is to compute a two-map comparison between the reference map at the end time of the simulation and the simulation map at the end time of the simulation. This two-map comparison cannot distinguish between correctly simulated change (i.e., hits) and correctly simulated persistence (i.e., correct rejections). Because persistence typically dominates the landscape, the two-map comparison can report high overall agreement even when the model simulates change poorly.
After the modeler sees the map of the five components, there are a variety of more detailed ways to compare the pattern of simulated change with the pattern of reference change. A plethora of pattern metrics consider the spatial distribution of the patches in the map, including the patches' numbers, sizes, and shapes. The particular research question should dictate whether details concerning the configuration of the patches in the map are important. For example, if the application concerns biodiversity protection, then it is likely to be important to consider whether forest is in one large contiguous patch or in many small fragmented patches.
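The simplest such pattern metrics, patch count and patch sizes, can be computed with a connected-component labeling of the category of interest. The pure-Python flood-fill sketch below (function name mine) uses 4-connectivity on a boolean forest mask; a real study would likely use a dedicated landscape-metrics package offering richer shape and fragmentation indices.

```python
import numpy as np

def forest_patch_sizes(forest_mask):
    """Return the sizes of 4-connected patches of True cells.

    forest_mask : 2-D boolean array, True where the category
                  of interest (e.g., forest) is present.
    """
    mask = np.asarray(forest_mask, dtype=bool)
    visited = np.zeros_like(mask)
    rows, cols = mask.shape
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r, c] and not visited[r, c]:
                # Flood fill from this unvisited seed cell.
                stack, size = [(r, c)], 0
                visited[r, c] = True
                while stack:
                    i, j = stack.pop()
                    size += 1
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < rows and 0 <= nj < cols
                                and mask[ni, nj] and not visited[ni, nj]):
                            visited[ni, nj] = True
                            stack.append((ni, nj))
                sizes.append(size)
    return sizes
```

Applying this to a simulated and a reference forest map would show, for instance, whether the model produces one large patch where the reference shows many small ones, which is precisely the configuration question raised above.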