each approach, we compare the key assumptions, data requirements, and recommended uses of each modeling approach.
Theoretical and Empirical Basis
Machine learning and statistical methods in LCMs use algorithms to represent relationships between inputs (i.e., driving variables) and outputs (i.e., land use or cover changes). These relationships, estimated from data, are used to generate maps of transition potentials that give an empirically based measure of the likelihood of particular land transitions. In addition to traditional parametric approaches, usually in the form of logistic regression (Millington et al., 2007), generalized linear modeling, or generalized additive modeling (Brown et al., 2002), several kinds of Bayesian and machine learning algorithms have been used in influential LCMs. For example, the Dinamica model offers the option of logistic regression or a weights-of-evidence approach, which estimates a statistical model similar to logistic regression, but does so within a Bayesian framework (Carlson et al., 2012).
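To make the weights-of-evidence idea concrete, the following minimal Python sketch computes the positive and negative weights for one binary explanatory layer and combines them with the prior odds to obtain a transition potential. It is an illustration under stated assumptions, not the Dinamica implementation: the function names, the ten-cell toy landscape, and the single binary "near road" layer are all hypothetical, and the sketch assumes that no count is zero.

```python
import math

def weights_of_evidence(changed, evidence):
    """Return (W_plus, W_minus) for one binary evidence layer.

    changed[i]  -- True if cell i underwent the transition in the calibration interval
    evidence[i] -- True if the binary explanatory pattern is present in cell i
    Assumes every class combination is non-empty (no zero counts).
    """
    n_changed = sum(1 for c in changed if c)
    n_unchanged = len(changed) - n_changed
    present_changed = sum(1 for c, e in zip(changed, evidence) if c and e)
    present_unchanged = sum(1 for c, e in zip(changed, evidence) if not c and e)

    p_present_given_changed = present_changed / n_changed
    p_present_given_unchanged = present_unchanged / n_unchanged

    # W+ applies where the pattern is present, W- where it is absent.
    w_plus = math.log(p_present_given_changed / p_present_given_unchanged)
    w_minus = math.log((1 - p_present_given_changed) / (1 - p_present_given_unchanged))
    return w_plus, w_minus

def transition_potential(prior, weights):
    """Add the weights to the prior log-odds and convert back to a probability."""
    logit = math.log(prior / (1 - prior)) + sum(weights)
    return 1 / (1 + math.exp(-logit))

# Hypothetical toy landscape of ten cells.
changed   = [True, True, True, True, False, False, False, False, False, False]
near_road = [True, True, True, False, True, False, False, False, False, False]

w_plus, w_minus = weights_of_evidence(changed, near_road)
prior = sum(changed) / len(changed)  # overall share of cells that changed

near = transition_potential(prior, [w_plus])   # cells where the pattern is present
far = transition_potential(prior, [w_minus])   # cells where the pattern is absent
print(f"potential near road: {near:.3f}, away from road: {far:.3f}")
```

With a single layer, the resulting potentials simply reproduce the observed share of changed cells in each class. The weights become useful when several layers are combined by summing their weights, which requires treating the layers as conditionally independent.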
Neural networks play a central role in both the Land Transformation Model (Pijanowski et al., 2002; Ray and Pijanowski, 2010; Tayyebi et al., in press) and Idrisi’s Land Change Modeler (Eastman, 2007; see Box 2.1). Neural networks represent relationships between land transitions and their explanatory variables through a network of weighted relationships that the algorithm adjusts iteratively. Genetic algorithms (GAs) have been used to optimize the rule set for cellular automaton models by iteratively adjusting the parameter string that defines weights on variables (Jenerette and Wu, 2001). The SLEUTH model (Clarke, 2008) uses an input-assisted incremental approach to calibrate a cellular automaton model, but attempts have been made to use genetic algorithms for this purpose (Goldstein, 2004). Classification and regression trees are data mining tools that use a sequential partitioning process and have been used to model the probabilities of landscape change (McDonald and Urban, 2006). In a comparison of logistic regression, Bayesian analysis, weights of evidence, and a neural network at one case-study site, the neural network produced the most accurate prediction during a validation interval, as measured by the area under the relative operating characteristic (ROC) curve and the Peirce skill score (Eastman et al., 2005). Although we do not review all of the individual methods in detail, we describe the strengths and weaknesses of this overall approach relative to the other modeling approaches covered in this study.
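The two validation measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by Eastman et al. (2005): the function names, the eight-cell toy data, and the 0.5 threshold used to turn transition potentials into a predicted change map are all assumptions. The area under the ROC curve is computed in its pairwise form (the share of changed/unchanged cell pairs in which the changed cell received the higher potential), and the Peirce skill score is the hit rate minus the false alarm rate.

```python
def roc_auc(scores, observed):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    changed = [s for s, o in zip(scores, observed) if o]
    unchanged = [s for s, o in zip(scores, observed) if not o]
    wins = 0.0
    for c in changed:
        for u in unchanged:
            if c > u:
                wins += 1.0
            elif c == u:
                wins += 0.5
    return wins / (len(changed) * len(unchanged))

def peirce_skill_score(predicted, observed):
    """Hit rate minus false alarm rate for a binary change prediction."""
    pairs = list(zip(predicted, observed))
    hits = sum(1 for p, o in pairs if p and o)
    misses = sum(1 for p, o in pairs if not p and o)
    false_alarms = sum(1 for p, o in pairs if p and not o)
    correct_negatives = sum(1 for p, o in pairs if not p and not o)
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return hit_rate - false_alarm_rate

# Hypothetical transition potentials and observed change over a validation interval.
potential = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
observed = [True, True, False, True, False, True, False, False]

auc = roc_auc(potential, observed)

# Assumed threshold of 0.5 to convert potentials into a predicted change map.
predicted = [p >= 0.5 for p in potential]
pss = peirce_skill_score(predicted, observed)

print(f"AUC = {auc:.4f}, Peirce skill score = {pss:.2f}")
```

The area under the ROC curve summarizes performance across all possible thresholds, whereas the Peirce skill score depends on the single threshold chosen to produce the predicted map.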
Models that employ machine learning or statistical approaches typically receive input in the form of two types of maps: (1) maps of land cover at time points that bound the calibration interval, and (2) maps of explanatory variables, such as topographic slope and distance to roads. After the algorithm