2

Land Change Modeling Approaches

To move land change modeling forward, it is critical that a common language is established to differentiate modeling approaches according to their theoretical and empirical bases. The diversity of approaches to land change modeling, along with differences in definitions between practitioners from different disciplines, does not lend itself to a discrete classification system. Agarwal et al. (2002) described models in terms of how they handle spatial, temporal and decision making complexity. While this approach provided useful distinctions at the time that review was completed, significant progress in developing all modeling approaches has blurred even some of those distinctions.

The committee has identified six generally recognized groups of approaches to land change models (LCMs), the first five of which are arrayed roughly in order from least to most structurally oriented (i.e., focused on process): (1) machine learning and statistical, (2) cellular, (3) sector-based economic, (4) spatially disaggregated economic, (5) agent-based, and (6) hybrid approaches. While we mention statistical approaches in the first category explicitly, statistical methods are used in some way within most of the approaches. There are overlaps in the degree and type of process orientation among the approaches that depend on the details of the specific model representing these approaches. We include the sixth type to acknowledge the importance of studies and applications that combine the different approaches into a single model or modeling framework. The following sections outline the theoretical and empirical bases as well as technical, research, and data challenges for each approach. Examples of each approach are also provided. Because of similarities in the approaches, we address both forms of economic models in a combined section. Following the discussion of



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

Following the discussion of each approach, we compare the key assumptions, data requirements, and recommended uses of each modeling approach.

MACHINE LEARNING AND STATISTICAL

Theoretical and Empirical Basis

Machine learning and statistical methods in LCMs involve approaches to represent relationships between inputs (i.e., driving variables) and outputs (i.e., land use or cover changes). The data are used to generate maps of transition potentials that give an empirically based measure of the possibility of particular land transitions. Together with traditional parametric approaches, usually in the form of logistic regression (Millington et al., 2007), generalized linear modeling, or generalized additive modeling (Brown et al., 2002), several different kinds of Bayesian and machine learning algorithms have been used in influential LCMs. For example, the Dinamica model offers the option of logistic regression or a weights-of-evidence approach, which estimates a statistical model similar to logistic regression, but does so within a Bayesian framework (Carlson et al., 2012). Neural networks play a central role in both the Land Transformation Model (Pijanowski et al., 2002; Ray and Pijanowski, 2010; Tayyebi et al., in press) and Idrisi’s Land Change Modeler (Eastman, 2007; see Box 2.1). Neural networks represent relationships between land transitions and their explanatory variables through a network of weighted relationships that the algorithm adjusts iteratively. Genetic algorithms (GAs) have been used to optimize the rule set for cellular automaton models, by iteratively adjusting the parameter string that defines weights on variables (Jenerette and Wu, 2001). The SLEUTH model (Clarke, 2008) uses an input-assisted incremental approach to calibrate a cellular automaton model, but attempts have been made to use genetic algorithms for this purpose (Goldstein, 2004).
Classification and regression trees are data mining tools that use a sequential partitioning process and have been used to model the probabilities of landscape change (McDonald and Urban, 2006). A comparison across approaches that included logistic regression, Bayesian analysis, weights of evidence, and a neural network showed a case-study site where the neural network produced a more accurate prediction during a validation interval as measured by the area under the relative operating characteristic (ROC) curve and a Peirce skill score (Eastman et al., 2005). Although we do not review all of the individual methods in detail, we describe the strengths and weaknesses of this overall approach relative to the other modeling approaches covered in this study.

Models that employ machine learning or statistical approaches typically receive input in the form of two types of maps: (1) maps of land cover at time points that bound the calibration interval, and (2) maps of explanatory variables, such as topographic slope and distance to roads. After the algorithm
finds this relationship for the calibration interval, the relationship is then typically extrapolated into a subsequent validation interval during which the predictive power can be tested.

Machine learning algorithms can be appropriate for situations where data concerning pattern are available and theory concerning process is scant. There are many cases where it is possible to obtain land cover maps from more than one time point along with explanatory variables for a study site where the investigator is partially ignorant concerning the detailed processes of land transformation. A machine learning algorithm attempts to learn the mathematical or logical relationships among the patterns of land cover and the patterns of the explanatory variables. The machine learning algorithm focuses exclusively on encoding and extrapolating the pattern of the land change, as opposed to the process of change. If the approach is used for prediction, then the prediction assumes stationarity in the land change pattern from the calibration interval to the subsequent time interval. Machine learning algorithms are used to predict by extrapolating historic patterns and can perform the extrapolation in a manner that does not require theory concerning detailed processes of change. Machine learning algorithms are not designed to simulate feedbacks and nonstationary processes in coupled natural and human systems, nor are they designed to evaluate the effects of policies that attempt to modify processes so that future patterns will be different from past patterns. Machine learning algorithms are not designed to simulate the mechanisms of human decision making, because machine learning algorithms lack theory concerning the behavior of decision making.
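
The calibrate-then-extrapolate workflow described above can be illustrated with a minimal sketch. It is not taken from any of the cited models: the synthetic pixels, the single explanatory variable (slope), the coefficient values, and all function names are assumptions made for illustration, and a one-variable logistic regression stands in for whatever algorithm a real LCM would use. The validation pixels are generated by the same process as the calibration pixels, which is exactly the stationarity assumption.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def simulate_interval(n_pixels, rng, intercept=1.0, slope_coef=-0.15):
    """Synthetic pixels (hypothetical): (slope in degrees, 1 if changed else 0)."""
    pixels = []
    for _ in range(n_pixels):
        slope = rng.uniform(0.0, 30.0)
        p_change = sigmoid(intercept + slope_coef * slope)
        pixels.append((slope, 1 if rng.random() < p_change else 0))
    return pixels

def fit_logistic(pixels, steps=1500, rate=1.0):
    """Approximate maximum-likelihood fit by gradient ascent.

    Slope is rescaled to [0, 1] so that a fixed step size behaves well.
    """
    b0, b1 = 0.0, 0.0
    n = len(pixels)
    for _ in range(steps):
        g0 = g1 = 0.0
        for slope, changed in pixels:
            x = slope / 30.0
            err = changed - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += rate * g0 / n
        b1 += rate * g1 / n
    return b0, b1

rng = random.Random(42)
calibration = simulate_interval(400, rng)  # change observed in interval 1
validation = simulate_interval(400, rng)   # change observed in interval 2

b0, b1 = fit_logistic(calibration)

def transition_potential(slope):
    """Fitted relationship, applied unchanged to the later interval."""
    return sigmoid(b0 + b1 * slope / 30.0)

changed = [transition_potential(s) for s, c in validation if c == 1]
persisted = [transition_potential(s) for s, c in validation if c == 0]
mean_changed = sum(changed) / len(changed)
mean_persisted = sum(persisted) / len(persisted)
print("fitted coefficient on rescaled slope:", round(b1, 2))
print("mean potential, changed vs. persisted pixels:",
      round(mean_changed, 2), round(mean_persisted, 2))
```

If the process in the second interval differed from the first (for example, a different `slope_coef`), the same fitted relationship would be extrapolated anyway, which is the limitation the text attributes to pattern-based methods.
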

Statistical regression methods assume a fixed mathematical form with coefficients that an algorithm estimates to produce an optimal fit, where optimal is defined by a mathematical criterion, for example a maximum-likelihood criterion. The maximum-likelihood criterion leads to a mathematical formula to estimate the regression’s coefficients. For example, the regression equation could assume a monotonic sigmoidal relationship between land cover change and topographic slope. Then the maximum-likelihood algorithm estimates the equation’s coefficients so the regression curve fits as closely as possible to the data, given the form of the monotonic sigmoidal relationship. The coefficients indicate whether the assumed monotonic relationships are increasing or decreasing, and at what rate. The logistic regression might also include interactions among the explanatory variables. Diagnostic measurements help to interpret the fitted coefficients of the regression equation.

In comparison with logistic regression, machine learning algorithms do not require strong assumptions concerning a particular form of a mathematical equation to express a relationship between the land cover map(s) and the map(s) of explanatory variable(s). Machine learning algorithms attempt to mimic biological learning systems through predictive artificial intelligence tools. They fit a relationship between the land change variable and the explanatory variable(s) in a manner that is more flexible than regression concerning the mathematical structure of the fitted relationship and can be designed to be more robust to errors in the data (Bishop, 1995). The algorithm uses an iterative process to fit a relationship between the patterns in the land cover maps and the explanatory variables. Over repeated iterations, the algorithm adjusts the model parameters until the algorithm satisfies a stopping criterion. The stopping criterion signals that the algorithm either has generated a particular degree of fit for the relationship between the land cover maps and the explanatory variables, or a particular amount of stability in the fit from iteration to iteration.

BOX 2.1 The Multi-Layer Perceptron

The Multi-Layer Perceptron (MLP) is a machine learning algorithm that is available as an option in the Land Change Modeler within the Idrisi GIS software. MLP is a neural network that receives maps of explanatory variables and land transitions for a calibration time interval and then produces a map of transition potential for temporal extrapolation beyond the calibration time interval. The transition potential is an index on a scale from 0 to 1, where higher numbers indicate pixels that have a combination of explanatory values that are more similar to places where the particular transition occurred during the calibration interval compared to places where the transition did not occur.

The maps in this box illustrate validation information using Idrisi’s tutorial data concerning the gain of disturbed land in Chiquitania, Bolivia. The MLP produced the transition potential map based on the gain of disturbance during a calibration interval (1986-1994) and explanatory variables including slope, elevation, and distance from streams, roads, urban areas, and previous disturbances. The validation interval is 1994-2000; thus, the disturbed pixels of 1994 are masked from the analysis because they are not candidates for post-1994 gain of disturbance. Map A shows the validation data, where black patches show a gain of disturbance. Map B is the output from the MLP, where darker shades indicate relatively higher transition potentials. Map C is a transition potential map that is based exclusively on proximity to disturbed pixels of 1994, where relatively higher transition potentials are assigned to pixels that are closer to disturbance of 1994. The proximity model is included because it is best practice to compare the map from a relatively naïve model to the output from a more complex model.

FIGURE (A) Gain of disturbance during validation interval, (B) transition potential from Idrisi’s Multi-Layer Perceptron, and (C) transition potential from a naïve proximity model.

Machine learning algorithms do not necessarily assume sigmoidal or monotonic relationships between the likelihood of land change and explanatory variable(s). The machine learning algorithms can also fit interactions among variables, just as regressions can. Machine learning algorithms are similar to common statistical approaches, in the respect that theory concerning processes of land change is expressed through the selection of explanatory variables to include and their expected relationships and functional forms, though the theory does not necessarily need to be rich. Many models base selection of variables on the von Thünen idea of land rents, which relates land use, cover, and change to location relative to markets and transportation as well as land suitability through variables like soil quality and slope. This theoretical basis is shared with many cellular models, and is described in more detail in that section.

Machine learning and statistical approaches can be appropriate for situations where data concerning pattern are available and theory concerning process is scant. In terms of short-term PP uses of models, machine learning approaches can be used to make useful predictions. There are many cases where it is possible to obtain land cover maps from more than one time point along with explanatory variables for a study site where the investigator is partially ignorant concerning the detailed processes of land transformation. A machine learning algorithm can
be used to first identify and represent patterns in data, relating inputs (predictor variables) and outputs (a land or land change variable), then generalize those relationships to other data sets. As more data become available for LCM applications, the ability of machine learning algorithms, in particular, to represent and generalize relationships in those data offers significant potential for dealing efficiently with large data volumes.

Technical, Research and Data Challenges

Because the methods and resulting modeled relationships involved in both machine learning and statistical approaches are developed inductively on the basis of the input data, the models are particularly sensitive to the inputs. For example, statistical or machine learning models can be applied to model either land use or land cover, and the categories used in the classification will determine the resulting model form. The meaning of the model is determined by the definitions of these categories. Therefore, model suitability for a given purpose will be dependent on the categories in the input map. For these reasons, and because land cover data are more plentiful than land use data, statistical and machine learning approaches are suitable for modeling land cover changes directly, even though these changes may come about through human choices about land use. While machine learning methods have been developed in ways that make them less sensitive to random errors in the input data than statistical methods, systematic biases in data will always affect the resulting models.

Used in a predictive mode, both machine learning and statistical approaches generally assume stationarity in the relationship between predictor and land change variables, i.e., that the model fitted during the calibration interval can be applied to the subsequent time interval without modification.
The advantage is that these approaches can be used to predict by extrapolating historic patterns, and can perform the extrapolation in a manner that does not require theory concerning detailed processes of change. Additionally, any variables included as predictors in the model can be modified to generate scenarios of future change. For example, if distance to roads is a predictor variable, it is a relatively simple task to simulate the effect of introducing a new road by recalculating distance to the nearest road. The disadvantage is that variables that are not included in the model, but that might change over time, cannot be accounted for in the projections or scenarios. What this often means is that scenarios involving changes to the economic penalties or incentives or other behaviorally related variables or constraints cannot be simulated. However, we address reduced-form econometric approaches in the section on economic models, in which statistical methods can be used to estimate and evaluate behaviorally oriented scenarios.

Statistical and machine learning approaches differ in the degree of a priori structure imposed by the modeler, such that machine learning algorithms can more easily represent a variety of complex relationships but there exists a
greater risk of overfitting. Overfitting can occur when an algorithm produces a mathematical or logical relationship between observed land change and a set of explanatory variables that fits the details of a particular calibration data set but does not apply to a broader set of applications. This can happen when the relationship fits the details of the calibration data in such a way that the model fails to represent the general principles that extend to other times or places. Spatial or temporal nonstationarity in the land change process can mean that a good fit at one time or place will not generalize well to other times or places. For example, a machine learning algorithm might be able to fit a tight relationship between land cover maps and explanatory variables for a given time interval but do a relatively poor job of matching observations when the relationship is extrapolated to time points beyond the calibration interval, for example because the market or policy conditions differ between the two time periods. If the model is overfit during the calibration stage, then the investigator can be lured into a false sense of trust that the model can predict accurately the patterns in data for which the model was not calibrated.

Overfitting can occur in nearly any modeling approach, especially approaches that calibrate a model based on a single case study. Thus, an important research topic concerns methods to measure and to address overfitting. Because this is a well-known problem in machine learning algorithms, a variety of techniques have been developed to reduce the risks of overfitting, and generalization outside the calibration data has been demonstrated for LCMs (Pijanowski et al., 2005).
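
The gap between calibration fit and validation fit that signals overfitting can be shown with a small sketch. Everything here is hypothetical: the synthetic pixels, the noise level, and the two stand-in models (a nearest-neighbour "memorizer" and a single distance cutoff) are assumptions chosen only to make the effect visible, not methods drawn from the cited studies.

```python
import random

def make_pixels(n, rng, noise=0.25):
    """Synthetic pixels: change is truly more likely within 4 km of a road,
    and a fraction of labels is flipped at random to mimic unexplained variation."""
    pixels = []
    for _ in range(n):
        dist = rng.uniform(0.0, 10.0)    # distance to road (km)
        slope = rng.uniform(0.0, 30.0)   # slope (degrees); unrelated to change here
        changed = 1 if dist < 4.0 else 0
        if rng.random() < noise:
            changed = 1 - changed
        pixels.append(((dist, slope), changed))
    return pixels

def nearest_neighbour_predict(train, x):
    """Flexible model: copy the label of the most similar calibration pixel."""
    best = min(train, key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2)
    return best[1]

def fit_cutoff(train):
    """Rigid model: choose the single distance cutoff that best fits calibration data."""
    candidates = [c / 2.0 for c in range(21)]
    def agreement(cut):
        return sum((1 if x[0] < cut else 0) == y for x, y in train)
    return max(candidates, key=agreement)

def accuracy(predict, pixels):
    return sum(predict(x) == y for x, y in pixels) / len(pixels)

rng = random.Random(1)
calibration = make_pixels(300, rng)
validation = make_pixels(1000, rng)
cutoff = fit_cutoff(calibration)

def memorizer(x):
    return nearest_neighbour_predict(calibration, x)

def simple_rule(x):
    return 1 if x[0] < cutoff else 0

nn_cal, nn_val = accuracy(memorizer, calibration), accuracy(memorizer, validation)
rule_cal, rule_val = accuracy(simple_rule, calibration), accuracy(simple_rule, validation)
print("memorizer   calibration / validation:", nn_cal, nn_val)
print("simple rule calibration / validation:", rule_cal, rule_val)
```

The memorizer reproduces the calibration data exactly because each calibration pixel is its own nearest neighbour, including its noise; its agreement with the independent validation pixels is lower. Comparing the two columns, rather than reporting calibration fit alone, is the basic check.
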
One approach that has not yet been tried with LCMs is to generate hundreds or thousands of models based on the stochastic elements of machine learning that can serve as a model ensemble and characterize a range of possible models for a given data set.

Interpretation of output can be challenging because many algorithms produce a map of “transition potential” for each land transition, where the transition potential indicates whether the apparent conditions are such that the chances of land change are relatively high versus low. The transition potentials are typically real values on the interval from 0 to 1, and have meaning in terms of their relative ranking, but are not necessarily probabilities of change because they depend on the time interval of change in the data from which they were derived. A separate algorithm typically selects pixels with the highest-ranking transition potentials to make a hard classification of future change, where the number of selected pixels is based on an anticipated quantity of land change over some specified time interval. This situation makes it challenging to compare two or more maps of transition potential, since a map that has a higher average transition potential does not necessarily imply a higher anticipated quantity of change compared to a map that has a lower average transition potential. Even if two maps have the same average transition potential, it is not clear how to compare maps when they have differences in the distribution of the transition potentials, for example, when one distribution has a single mode but the other distribution does not. A transition potential can be interpreted as a probability when it indicates the chance
that a particular categorical transition will occur in the pixel during a specific time interval. It is possible to convert some types of transition potentials to probabilities by scaling the transition potentials using a projected quantity of each categorical transition during a specific time interval (Hsieh, 2009). If transition potentials are probabilities, then they have an implied quantity of change for the specified time interval.

Compared to statistical methods, for which a large body of theory exists to facilitate diagnosing and interpreting the structure of a given model, machine learning approaches are often criticized as a ‘black box’ for which interpretation of the model structure and performance is a challenge. This is a well-known challenge for machine learning approaches, and a variety of methods have been developed to understand how the predictor variables relate to the outcome (i.e., to open the black box). For example, a simple approach to understanding the relative contribution of different variables to a machine learning model is to leave out each of the variables one at a time and re-calibrate the model (Pijanowski et al., 2002).

Additionally, in terms of measuring how well the relative ranks from the model represent the spatial allocation of the observed transitions, the Relative Operating Characteristic (ROC) is frequently used because it is designed to measure the degree to which higher ranks are concentrated on the feature of interest (i.e., change). ROC has been criticized because many modelers use only a single summary statistic of the area under the ROC curve (AUC) to indicate association (Lobo et al., 2008). The AUC fails to expose the rich information that proper interpretation of the full ROC curve can reveal. Another criticism is that, like other metrics of predictive ability, the AUC can be large due to correctly predicted persistence, not correctly predicted change.
This problem can be mitigated through careful selection of the study extent and by proper use and interpretation of the full ROC curve. Modifications and alternatives to the ROC type of analysis can better express the manner in which a transition potential map fits the empirically observed land change.

Proper selection of the measurement is important because the apparent performance of the algorithm can be sensitive to the selection of the measurement. For example, if the machine learning algorithm attempts to maximize the percentage of pixels that agree between the simulated map of land cover types and the reference map for the same time point, then the algorithm might generate output that systematically underestimates the quantity of change. This can occur because a simulation of land change is likely to generate allocation errors when it simulates change; thus, it can reduce the number of those allocation errors by simply predicting very little change. If instead the algorithm seeks to maximize a different measurement, such as the figure of merit (Pontius et al., 2008, 2011), then the algorithm has an incentive to simulate a more accurate quantity of change, because one must simulate an accurate quantity of change in order to have the possibility to generate a high figure of merit.
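
The contrast between the two measurements can be made concrete with a small worked sketch. The pixel counts are invented for illustration, and the figure of merit is computed here under the assumption of a single change category, as hits divided by the sum of hits, misses, and false alarms; the function names are hypothetical.

```python
def tally(observed_change, predicted_change):
    """Count hits, misses, false alarms, and correct rejections for 0/1 maps."""
    hits = misses = false_alarms = correct_rejections = 0
    for obs, pred in zip(observed_change, predicted_change):
        if obs and pred:
            hits += 1                 # observed change predicted as change
        elif obs:
            misses += 1               # observed change predicted as persistence
        elif pred:
            false_alarms += 1         # observed persistence predicted as change
        else:
            correct_rejections += 1   # observed persistence predicted as persistence
    return hits, misses, false_alarms, correct_rejections

def percent_agreement(observed_change, predicted_change):
    hits, _, _, correct_rejections = tally(observed_change, predicted_change)
    return (hits + correct_rejections) / len(observed_change)

def figure_of_merit(observed_change, predicted_change):
    hits, misses, false_alarms, _ = tally(observed_change, predicted_change)
    denominator = hits + misses + false_alarms
    return hits / denominator if denominator else 0.0

# Hypothetical landscape of 100 pixels, 10 of which changed.
observed = [1] * 10 + [0] * 90

# Model A predicts no change anywhere.
no_change = [0] * 100

# Model B predicts the right quantity (10 pixels) but misplaces 6 of them.
right_quantity = [1] * 4 + [0] * 6 + [1] * 6 + [0] * 84

for name, prediction in [("no change", no_change), ("right quantity", right_quantity)]:
    print(name,
          "agreement:", percent_agreement(observed, prediction),
          "figure of merit:", figure_of_merit(observed, prediction))
```

In this constructed case the no-change prediction agrees with 90 of 100 pixels and the prediction with the correct quantity agrees with 88, yet the figure of merit is 0 for the former and 0.25 for the latter, which is the incentive difference described above.
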

CELLULAR

Cellular land change models use discrete spatial units as the basic units of simulation. Such spatial units can be regularly shaped pixels, parcels, or other land units and are usually arrayed in a tessellation. Cellular models use a variety of input information to simulate the conversions of land cover or land use in these land units based on a rule set or algorithm that is applied synchronously to all spatial units and that represents the modeler’s understanding of the land change process. The algorithm represents decision making that is, implicitly, assumed to take place at the level of the spatial units of simulation, with a one-to-one correspondence assumed between the spatial units and decision makers. Often, the same decision algorithm is applied to all spatial units in a study area, or to large regions within the study area. Variation in decision making can therefore arise solely from the attributes of the spatial unit, rather than from the differences in decision making of the actors managing the spatial units.

Theoretical and Empirical Basis

A wide variety of cellular models has emerged over the past two decades, differing in their specification and underlying theoretical and empirical basis. Differences between cellular model types relate to differences in the algorithms and the underlying assumptions that govern the decision rules at the level of spatial units. Another difference between groups of cellular models relates to the way the quantity of change is determined, as distinct from the locations of change. Modelers generally choose between either constraining the total areas of land change at the regional level or determining the regional level of land change simply as the aggregate of the changes at the level of individual spatial units (Figure 2.1). In this section we first look at the theoretical and empirical basis of the decision models.
This is followed by a discussion of the top-down versus bottom-up guidance in determining the regional quantities of allocated land change.

FIGURE 2.1 Different modeling approaches: spatial allocation is constrained by a top-down demand or fully determined by local conversion rules. (Diagram labels: top-down constraint on regional land use area; regional land use area emerging from bottom-up conversions; local conversion rules.)

An assortment of conversion rules has been applied within cellular models. However, the underlying assumptions can be categorized in three different groups based on the underlying theoretical basis of the models (Schrojenstein Lantman et al., 2011): (1) a continuation of historical trends and patterns, (2) allocation based on suitability of the land, and (3) allocation based on neighborhood interactions.

Continuation of historical trends and patterns

The premise behind the use of historical trends to project future trends is that future land use is assumed to follow patterns of change corresponding to recent or historical changes. The use of this assumption may vary from simple application of transition probabilities from observed historic changes to the use of observed land changes over a past period to empirically estimate relationships between land change and location characteristics. This latter case is similar to the application of machine learning methods outlined in the previous section and is elaborated below in the discussion of land suitability. As an example, this may mean that if agricultural land use was found close to cities in the past, it is assumed that future predictions of agricultural land will be allocated close to cities. Implicitly, this assumes stationarity in the underlying decision making of the actors of land change. Often this assumption is appropriate, but a better understanding of the specific model elements for which and conditions under which stationarity is a reasonable assumption would help with structural evaluation of models, a topic we revisit in Chapter 3.

The most well-known approach to constructing models based on continuation of historical trends is the use of Markov chains. In its basic application to land change, spatial data are used to calculate a transition matrix over a historic time period, which is then used to derive transition probabilities for the different types of conversions. These probabilities are used to calculate land areas of different land types in the future in a nonspatial manner. Burnham (1973) was one of the first to propose using Markov chain analyses for modeling land use change, and they were later applied by others (Muller and Middleton, 1994; Fearnside; Turner, 1987). Because of its simplicity, Markovian analysis was very popular during the early phase of development of land change models. However, the approach has a number of limitations. The primary limitations of Markov transition probability-based models for land use and land cover change analyses are (1) the assumption of stationarity in the transition matrix, that is, that it is constant in both time and space; (2) the assumption of spatial independence of transitions; and (3) the difficulty of ascribing causality within the model, that is, that transition probabilities are often derived empirically from multitemporal maps with no description of the process (Baker, 1989; Brown et al., 2000).

Several authors have tried to overcome some of these limitations by merging the Markovian concepts with other simulation rules and concepts (Goigel and Turner, 1988; Guan et al., 2011). These hybrid models often use Markovian models to determine future quantities of change while the spatial patterns are simulated by another type of cellular model. Though many models have developed approaches other than Markov chains to describe future changes, the assumption of stationarity is common in other model types. Many statistical and econometric models of relations between land use and location factors assume that such relations remain valid for the period of simulation.

Suitability of land

Many cellular models use, in one way or another, an assessment of the suitability of the spatial units for alternative land uses as a determinant of the conversion rules. The land suitability in cellular models is underpinned by the theoretical work of von Thünen (1966) and Alonso (1964), which explained land use allocation patterns based on the spatial variations in land rent for different land uses. Following on the premise that land users aim to maximize profit, each parcel is converted to the use with the highest land rent at that location. While land suitability is often represented only in relative terms, these suitability models provide a basis for understanding where different land uses or covers are most likely to be found.
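
The hybrid strategy described above, in which a Markov chain sets the quantity of change and a suitability ranking sets its location, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited model: the two-category maps ("U" for undeveloped, "D" for developed), the 20-cell landscape, the suitability values, and all function names are hypothetical.

```python
from collections import Counter

def transition_probabilities(map_t0, map_t1):
    """Row-normalized transition counts between two co-registered maps."""
    counts = Counter(zip(map_t0, map_t1))
    totals = Counter(map_t0)
    categories = sorted(set(map_t0) | set(map_t1))
    return {a: {b: counts[(a, b)] / totals[a] for b in categories}
            for a in categories if totals[a]}

def project_areas(areas, probabilities):
    """One Markov step: expected area per category after one more interval.
    Assumes the transition matrix is stationary."""
    projected = {category: 0.0 for category in areas}
    for a, area in areas.items():
        for b, p in probabilities[a].items():
            projected[b] += area * p
    return projected

def allocate_gain(current_map, suitability, gaining, losing, n_cells):
    """Convert the n_cells most suitable cells of the losing category."""
    candidates = [i for i, c in enumerate(current_map) if c == losing]
    ranked = sorted(candidates, key=lambda i: suitability[i], reverse=True)
    new_map = list(current_map)
    for i in ranked[:n_cells]:
        new_map[i] = gaining
    return new_map

# Hypothetical maps, flattened to lists of 20 cells.
map_t0 = ["U"] * 16 + ["D"] * 4
map_t1 = ["U"] * 12 + ["D"] * 8
# Hypothetical relative suitability for development, one value per cell.
suitability = [i / 20 for i in range(20)]

probabilities = transition_probabilities(map_t0, map_t1)
areas_t1 = dict(Counter(map_t1))
areas_t2 = project_areas(areas_t1, probabilities)

# Quantity comes from the Markov projection; location comes from suitability.
gain = round(areas_t2["D"] - areas_t1["D"])
map_t2 = allocate_gain(map_t1, suitability, gaining="D", losing="U", n_cells=gain)

print("P(U -> D):", probabilities["U"]["D"])
print("projected areas:", areas_t2)
print("projected map:", "".join(map_t2))
```

The projection step is nonspatial and inherits the stationarity assumption noted above, while the allocation step uses suitability only as a relative ranking, so the two parts can be replaced independently.
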
Whether land rents are calculated in absolute or relative terms often depends on data availability, and relative land rents are commonly used for models of land cover, where there may not be a good theoretical link between economic rent and cover type. In the original specification of the von Thünen model, land rent differs by location and land use due to differences in transportation cost and distance to the market. Elaborations of these premises accounted for differences in soil quality and infrastructure (Alonso, 1964), while Walker and others (Walker, 2004; Walker and Solecki, 2004) extended the underlying bid-rent model to account for development and agency.

The suitability of the land is determined in different ways. In some models land suitability is directly derived from the physical suitability for alternative uses based on agroecological zoning assessments; in other instances this is represented by the potential crop yield that may be obtained (Schaldach et al., 2011). Other approaches also include infrastructural and socioeconomic location characteristics in the determination of the suitability for a particular use. The importance of different location factors as determinants of the suitability can be based on expert knowledge captured in multicriteria evaluation procedures (Schaldach et al.,

• The increasing call for use of ABMs in predictive settings will require better methods for assimilating data and updating models on the fly. For example, operational models will need to be able to be updated when an exogenous shock occurs.

• Agent-based modeling efforts benefit from the availability of a wide range of data types. These include survey data that are spatially referenced to parameterize decision functions; data on land management, use, value, and ownership to complement land cover data; and longitudinal versions of all of these data types. Efforts to collect, make accessible, and integrate these data will enhance these modeling activities.

• Additional work is needed on methods to integrate data across disparate sources. For instance, data developed using different functional unit definitions, spatial extents, and levels of aggregation, and by different agencies, might need to be integrated within a single ABM and could be harmonized through a variety of interpolation, down-scaling, or up-scaling approaches. Currently researchers often recreate data because they are not able to access and/or integrate existing data.

• The structural validity of the rules and algorithms used to represent agent actions and their interactions in these models has been challenging to demonstrate. For ABMs, the validation of agent dynamics is often more important than the validation of the model outcomes that are the most common validation targets for LCMs.

• Because agent-based models can frequently generate multiple outputs, due to stochasticity of parameters or inputs, it is important to evaluate the diversity of model outcomes that can result from them. This is especially important when combined with evaluation of multiple scenarios, defined by alterations in model settings to reflect alternative possible futures or policy interventions. Stronger norms for full exploration of the space of model outcomes are needed in land change modeling. This goal might benefit from cross-fertilization of methods from other modeling communities to learn how to synthesize many runs (Monte Carlo–type approaches).

• Development of new, dynamic, and multidimensional methods of analysis and visualization is needed to help better understand relationships between model parameters and model outputs. These methods can further be used to convey to stakeholders what a model is actually doing and can be used to display behaviors of individual model components (like agents or locations).

• There is a general need for standards and norms for documenting and sharing agent-based models. A National Science Foundation–funded effort to coordinate research, dissemination, and documentation of agent-based models has produced a web-based portal for model sharing (i.e., openabm.org), and the ongoing research on the standards of evidence and communication regarding use of models in science is crucial to further uptake and credibility of these modeling activities.
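The point above about synthesizing many stochastic runs across scenarios can be made concrete with a small sketch. The model here is a deliberately trivial stand-in, not an agent-based model from the literature: each undeveloped cell converts with a fixed probability per time step. The function names, grid size, probabilities, and scenario labels are all hypothetical choices for illustration.

```python
import numpy as np

def run_once(rng, shape=(20, 20), p_convert=0.05, steps=10):
    """One stochastic run of a toy model: cells convert independently at random."""
    developed = np.zeros(shape, dtype=bool)
    for _ in range(steps):
        developed |= rng.random(shape) < p_convert
    return developed

def monte_carlo(n_runs=200, seed=0, **model_kwargs):
    """Repeat the model and summarize the spread of outcomes, not a single run."""
    rng = np.random.default_rng(seed)
    runs = np.stack([run_once(rng, **model_kwargs) for _ in range(n_runs)])
    totals = runs.sum(axis=(1, 2))  # developed cells per run
    return {
        "mean_area": totals.mean(),
        "p05_area": np.percentile(totals, 5),
        "p95_area": np.percentile(totals, 95),
        # Share of runs in which each cell ends up developed.
        "cell_frequency": runs.mean(axis=0),
    }

# Two hypothetical scenarios that differ only in the conversion probability.
scenarios = {"baseline": 0.05, "high_growth": 0.08}
summaries = {name: monte_carlo(p_convert=p) for name, p in scenarios.items()}

for name, s in summaries.items():
    print(f"{name}: mean {s['mean_area']:.1f} cells, "
          f"5th-95th percentile {s['p05_area']:.0f}-{s['p95_area']:.0f}")
```

Reporting an interval and a per-cell frequency map for each scenario, rather than one realization, is one simple way to show stakeholders how much of a projected difference between scenarios exceeds run-to-run variation.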

HYBRID APPROACHES

Many LCMs are not easily classified into one of the categories discussed in the preceding sections. Here we address the fact that the conceptual and methodological approaches described above are quite often used in combination to represent various aspects of land change patterns and processes. For example, machine learning and statistical approaches are often used to develop suitability maps that then serve as one of the inputs to a cellular model that incorporates land suitability with neighborhood effects to project future land use (Almeida et al. 2008; Li and Yeh 2002) or land cover (Hilbert and Ostendorf 2001) patterns. Similarly, sector-based economic models have been integrated with spatial allocation models to downscale land areas determined in large-scale general equilibrium and integrated assessment models for large world regions to individual pixels. The Global Land Model (Hurtt et al., 2011; Hurtt et al., 2006) uses a relatively simple expert-based ranking of relative suitabilities in combination with an assumed hierarchical ordering of allocation. Other allocation mechanisms are possible, building on the statistical, machine learning, and cellular approaches described above. As a final example of hybrid models, coupled representations of land use and land cover dynamics, as a means of representing the dynamics of both the natural and the human processes involved in land change, have been developed by combining the statistical, cellular, and agent-based approaches described above. For example, An et al. (2005) were able to successfully represent interactions between human demography and fuel use, and availability and quality of panda habitat by representing dynamics in the human communities with agents and the forest dynamics and habitat characteristics with cellular models that incorporated algorithms for forest growth and habitat suitability, some of which had been developed using statistical models for determination of suitabilities.

Theoretical and Empirical Basis

Land change is the result of multiple human-environment interactions operating across different scales ranging from global trade of food and energy to local management of land resources at the farm and landscape level. It is represented in data ranging from satellite observations of land cover to surveys of human attitudes, perceptions, and behaviors, with many other types of data in between. So far, researchers have not succeeded in defining an all-encompassing theory of land change, and the feasibility of formulating such a theory is not evident. Additionally, we have not yet reached the point where we have all the data we need to characterize the various land use and land cover changes that are occurring in various systems throughout the world. Therefore, it is reasonable to expect that some hybridization of the above approaches, accounting for the heterogeneous theories and data environments confronting models that incorporate land use and land cover change dynamics, is necessary to serve contemporary scientific and management purposes.

Theories from multiple disciplines, such as economics, geography, demography, ecology, and anthropology, contribute to the explanation of land change. Often, these theories are related to specific land conversion processes or sectors. Notable examples are ecological succession (Cushman et al. 2010), Boserupian theory concerning the effects of population on land use sustainability (Boserup, 1965; Turner and Fischer-Kowalski, 2010), the induced-intensification thesis (Turner and Ali, 1996), neo-Thünen theory about moving frontiers and urban markets (Walker, 2004; Walker and Solecki, 2004), and the theories of Fujita and Krugman about urban development (Fujita et al., 1999a,b). Most theories cannot adequately explain the complexity of land use decision making, nor address the processes driving both land use and land cover change. It is well understood that decision making processes about land change, as well as many of the ecological and disturbance processes affecting land cover change, are context dependent, and one or multiple theories may provide a proper representation for a specific case study or land change process. Therefore, the choice of theory and model concept may depend on the specific scale of analysis, the processes studied, the availability of data, and the case-study characteristics. As an example, in a mature land market, models based on economic theory may best be able to capture the dominant processes. In a deforestation frontier the land market may not be functioning at all, and models driven by geographic or institutional factors may be more useful.
At the same time, land change may be influenced by several types of processes synchronously that require different modeling concepts: for example, although land decision makers may be oriented towards optimizing benefits of land use, these benefits are often influenced by land change in the neighborhood, creating scaling effects. Because land change modeling often involves representation of cross-scale interactions, interactions among different land types or sectors, and determination of both the amount and spatial pattern of land cover types, there are multiple procedural opportunities for including different modeling approaches. While some opportunities for hybridization (e.g., for representing different land sectors) are driven by differences in the theoretical basis for such models, others (e.g., for crossing scales) are driven by the relative efficiency of different algorithmic approaches to linking across scales and processes, and still others (e.g., for linking models of land use and land cover) are driven by multiple concerns that also include data availability. Hybrid modeling approaches therefore can combine different underlying conceptual frameworks, theories, and empirical observations into a system representation and allow the modeler to choose an appropriate procedure for modeling depending on the practical needs of modeling across the range of representation in land systems. Hybrid approaches can involve:

1. a combination of approaches to more fully represent decision making, for example, agent-based decision making that includes a cellular neighborhood model to account for neighborhood interactions in the decision making, or a machine learning model to represent human cognition (e.g., Manson 2005);

2. the use of different approaches for different scales to capture the dominant processes at the scale addressed, for example, economic models for aggregate land shares that constrain cellular spatial-allocation models (e.g., Sohl et al., 2012);

3. the use of different modeling concepts for different land change types considered, for example, using a cellular automata (neighborhood-based) model for urban land use with an econometric approach for other land cover types (Verburg and Overmars, 2009);

4. the use of one approach to parameterize a model using a different approach, for example, machine learning to parameterize a cellular model (Pijanowski et al., 2005; Sangermano et al., 2010); or

5. the conceptual integration of modeling frameworks, for example, the development of a common language to refer to automata with fixed (i.e., cellular) versus flexible (i.e., agent-based) topologies (Torrens and Benenson, 2005).

Strengths and Weaknesses

Hybrid modeling approaches take advantage of the strengths of the individual approaches and reduce some of their inherent limitations. The lack of overarching theory or systems description in some cases, and of data or both in others, makes it necessary to carefully match existing theories and modeling concepts to the conditions and empirical contexts under which they are valid. Hybrid approaches allow such flexibility. At the same time, hybridization of modeling concepts allows the development of novel approaches that can better represent the complexity of reality.

Hybridization also involves risks. Often the combination of multiple concepts leads to an increased complexity, reducing the ease of interpretation of simulated changes and hampering causal tracing of emergent land changes. For this reason, model calibration and validation across the multiple hybridized components can be challenging. Separate components of a hybrid model might be calibrated in different ways, according to the empirical demands of each approach, but there is often little theoretical guidance on how the combination of components should be parameterized. As with any modeling approach, the ways the combination of model types represents reality and the extent to which the model is able to answer the questions of the stakeholders of the modeling effort will determine its success.
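One of the hybrid patterns discussed in this section, a suitability map feeding a cellular model that also accounts for neighborhood effects, can be sketched in a few lines. This is only an illustration under stated assumptions: the suitability array would in practice come from a statistical or machine learning model but is a constant placeholder here, and the function names, the equal weighting, and the fixed per-step demand are hypothetical rather than taken from any cited model.

```python
import numpy as np

def neighbor_fraction(urban):
    """Fraction of the 8 surrounding cells that are urban (map edge counts as non-urban)."""
    padded = np.pad(urban.astype(float), 1)
    n_rows, n_cols = urban.shape
    total = np.zeros(urban.shape)
    for dr in (0, 1, 2):
        for dc in (0, 1, 2):
            if dr == 1 and dc == 1:
                continue  # skip the cell itself
            total += padded[dr:dr + n_rows, dc:dc + n_cols]
    return total / 8.0

def step(urban, suitability, demand, weight=0.5):
    """Convert `demand` non-urban cells with the highest combined score."""
    score = (1 - weight) * suitability + weight * neighbor_fraction(urban)
    score = np.where(urban, -np.inf, score)  # existing urban cells cannot be chosen
    chosen = np.argsort(score, axis=None)[::-1][:demand]
    new_urban = urban.copy()
    new_urban.flat[chosen] = True
    return new_urban

# Toy setup: one urban seed cell and uniform (hypothetical) suitability.
urban0 = np.zeros((5, 5), dtype=bool)
urban0[2, 2] = True
suitability = np.full((5, 5), 0.5)

urban1 = step(urban0, suitability, demand=3)
```

With uniform suitability, the three new cells all fall next to the seed cell because only the neighborhood term differs between locations. The `weight` parameter is exactly the kind of coupling between components for which, as noted above, there is often little theoretical guidance.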

A COMPARISON OF LAND CHANGE MODELING APPROACHES

Ultimately, the aims of land change modeling are to advance the science of land change, to improve our understanding of interactions between land change and various environmental processes, and to provide capacity to support decision making around problems involving land change. For these purposes, the various modeling approaches reviewed here provide capabilities to explain and learn (EL) about land system dynamics, as well as to project and predict (PP) future states of land systems. Evaluating the impacts of environmental and social changes on the land system, especially through the use of scenarios, and providing input to other models are also important uses to which models have been put.

We have arranged the five main modeling approaches in our assessment roughly in order from those most focused on modeling pattern (beginning with Machine Learning and Statistical Approaches) to more structural models that focus more on the processes of land change (including Economic and Agent-Based Approaches) (Table 2.1). While an evaluation of the validity of any given model or approach for any given purpose is beyond the scope of this assessment, we were able to identify some of the implications of these broad differences for how models based on these approaches can be used.

In general terms, models that have a more explicit representation of a given process, like those that represent land use decision making with structural Economic or Agent-Based approaches, are more flexible to different types of changes in context that can be evaluated through model scenarios, including for example changes in credit availability, the level of enforcement for illegal activities, or the amount of information available about alternative choices. Paradoxically, perhaps, models with a greater degree of explicitness in representing process, while useful for predicting the consequences of alternative scenarios qualitatively, often perform less well in making quantitative projections or predictions about specific outcomes at specific places or times. This can result from their inclusion of processes for which parameter values are unavailable empirically or are highly uncertain, feedback processes that can create path dependent outcomes with multiple equilibria, thereby raising the level of uncertainty in predictions, or processes that produce outcomes for which semantically compatible observations are unavailable. The key example of this latter point is the challenge, especially over large extents, of obtaining spatially explicit land use information. The need for land use data, which are relatively incompatible with satellite measurements, creates a mismatch with which the LCM community consistently struggles.

For these reasons, approaches that are focused on fitting observed patterns (like Statistical and Machine Learning approaches) and extrapolating them into the future can both satisfy users for whom making near-term predictions is an important goal and make efficient use of the extensive record of spatially explicit land cover and other remotely sensed observations. As we have discussed in Chapter 1, these models are not limited by data so much as they are by a lack of representation of the theory behind our understanding of land change processes. Machine Learning approaches can represent well the relationships between, for example, land cover changes that are observable in multiple Landsat images over time and a variety of biophysical, location, and other variables, and use these relationships for extrapolation to estimate where future changes might be expected to occur. As long as the structural elements of the system remain unchanged (i.e., are stationary), projections can provide useful information about near-term changes. Because of their thinner theoretical and process grounding, however, models that focus on observed patterns are limited in their ability to support evaluations of scenarios involving structural change. While no single modeling approach can serve all purposes equally well, each of the modeling approaches we describe has been adapted within a wide variety of settings, often to move them along this pattern-process continuum, and the hybrid modeling approaches that combine specific approaches provide particular flexibility in developing models that address particular challenges.

The relative strengths of the approaches with respect to representing patterns and processes, then, further affect their appropriateness within policy- and decision-making contexts. The committee considered the roles that LCMs can play within the context of the cycle of policy and decision making presented in Chapter 1 (adapted from Verdung, 1997), and developed an approximate mapping of modeling approaches to stages in that cycle (Figure 2.2). Within this mapping, we identify the suitability of machine learning and cellular models in the problem identification step, because of their assumptions of stationarity and their lack of the richer structural detail about process needed to evaluate the effects of changes in policy structure.
Projections of future trends can be useful to identify situations in which significant problems may arise if action is not taken, for example in managing total maximum daily loads in an area experiencing significant urban growth. These modeling approaches can also be useful at later stages, for example, where policies or decisions involve changing or constraining land changes spatially (e.g., through creation of protected areas) or where baselines based on extrapolating past trends are needed for ex ante assessment, but their comparative advantage is in problem identification.

To consider interventions that affect agent behaviors or might generate market feedbacks that have spillover effects on other components or locations in the land system, the richer behavioral specificity of agent-based and structural economic models provides a basis for exploring the structure of the land system and the interactions inherent in it, and exploring dynamics that might benefit from intervention. For example, links between household inequality and environmental outcomes can be explored to identify the reasons for and opportunities to improve both. The process specificity of these modeling approaches is usually needed to weigh the effects of alternative interventions.

In moving to a decision about some policy or other action, structural economic models, including both sector-based and spatially disaggregated as well as

Table 2.1 Generalized characteristics of modeling approaches.

| Modeling Approach | Pattern-Process | Land cover, Land use | Key Assumptions | Typical Data Requirements | Recommended Uses |
|---|---|---|---|---|---|
| Machine Learning and Statistical | Pattern | Land cover | Strong stationarity | Land cover maps from at least two time points; some number of maps of predictor variable(s) | Make forecasts of land cover patterns under stationarity; extrapolating past patterns |
| Cellular | | Land cover, land use | Stationarity; strong spatial control and/or interaction; no market interactions | A land cover map at some point in time; some number of maps of predictor variable(s) | Forecast land cover patterns; evaluate changes in spatial controls without market feedbacks |
| Spatially Disaggregated Economic Models | | Land use | Utility or profit maximization; price and/or spatial equilibrium; heterogeneous agents sometimes specified | Data on land use or land cover at one or more point(s) in time; economic and biophysical variables that influence land demand and supply; any other required instrumental variables | Reduced-form models: identify the causal effect of key variables on land change outcomes. Structural models: simulate effects of policy changes on land market outcomes, including changes in prices and land use patterns |
| Sector-Based Economic Models | | Land use | Utility or profit maximization; price equilibrium; representative agents | Economic variables that influence aggregate demand and supply, including prices of commodities and values of trade at a regional or country scale | Forecast aggregate land changes under a variety of market-based changes that can affect demand and supply |
| Agent-Based Models | Process | Land cover, land use | Usually heterogeneous agents; variable interactions among agents | Data describing characteristics of agents; qualitative or quantitative data on decision processes; data on land use or land cover at some point(s) in time | Explore land change processes, often under stylized conditions; explore effects of exogenous change on a system, where it has not happened; explore future scenarios where past patterns may be poor indicators of future outcomes |

NOTE: Column 2 indicates the arrangement of approaches along a continuum of focus on pattern vs. process. Column 3 addresses the types of outcomes (land use, land cover, or both) for which each approach has tended to be applied. Column 4 lists key assumptions that characterize how the approach handles data. Column 5 describes the input data requirements and Column 6 lists key recommended uses.

[Figure 2.2. Inner-circle stages include Problem Identification, Intervention Design, Ex-ante Assessment, Decision and Implementation, Ex-post Assessment, and Evaluation.]

Figure 2.2 Land change modeling approaches (outer circle) placed within the context of the policy- and decision-making cycle (inner circle). SOURCE: modified from Verdung, 1997.

agent-based models, and hybrid models provide capabilities that can be exploited for assessing the possible effects of the policy, ex ante. For example, the GTAP model (Hertel, 1997), which is a static multi-region, multi-sector CGE model, was used to evaluate the implications of biofuel mandates for land use demand both within the United States and internationally through the possible effects on the prices of food commodities (Keeney and Hertel, 2009).

Once policies or decisions have been implemented, the need for evaluating the effects of these implementations, ex post, is often quite effectively met through use of reduced-form economic models that estimate the magnitude of the effect of the intervention, usually by comparing observable outcomes either before and after the intervention or in an intervention area and some comparable location. For example, Andam et al. (2008) used reduced-form econometric methods to evaluate the effectiveness of protected areas in reducing deforestation globally by estimating the effects of the protected areas on deforestation rates in comparison with those found in areas in close proximity to protected areas.

Understanding the underlying structures, assumptions, and data requirements of different modeling approaches is critical to understanding their applicability for various scientific and decision-making purposes. This review provides a framework for comparison of multiple modeling approaches in relation to specific objectives. The next section of this report outlines opportunities and needs for advances that will improve modeling capabilities into the future.
