Suggested Citation:"8 Common Challenges in IOS Modeling." National Research Council. 2008. Behavioral Modeling and Simulation: From Individuals to Societies. Washington, DC: The National Academies Press. doi: 10.17226/12169.

8 Common Challenges in IOS Modeling

This chapter discusses broad issues and challenges that are encountered across the range of individual, organizational, and societal (IOS) modeling approaches and methods, highlighting problems that need to be solved for these modeling approaches to be most useful for the military's needs. We first describe issues of integration and interoperability, the challenges that confront modelers and simulation developers when they attempt to integrate multiple models and simulations, with the goal of making them interoperable, that is, able to use output from one model as input for another. Next we describe some of the challenges (and potential benefits) of developing and using modeling frameworks and tools that facilitate the development of IOS models. We then describe issues of model verification, validation, and accreditation (VV&A), issues that are especially challenging for the modeling of human behavior. Finally, we discuss some of the challenges posed by the data requirements of IOS models in light of the realities of the data and information available to model developers and users. In each section we note some potential solutions to the challenges.

Integration and Interoperability

In this section, we discuss the issues that confront modelers attempting to integrate models developed with different internal structures, at different levels of granularity, or with inconsistent inputs and outputs. The nature of the challenges requires that the discussion be quite technically sophisticated and use terminology and concepts that may be unfamiliar to many readers.

We have tried to define some of the terms in footnotes, but a simplified discussion would not do justice to the subject matter.

Model Interoperability: Incompatibilities and Functionality Gaps

There are several fundamental issues (and associated hard problems) that need to be addressed in undertaking the development of an interoperable framework of IOS models. First and foremost is the problem of making existing or even new models interoperable, as these are developed independently (i.e., with no coordination) by different software design and development teams, in consultation with domain experts having various levels of skills and expertise. A very common approach is to build a wrapper around an existing model, thus converting it to an input-output (I-O) black box, or to provide an intelligent agent operating autonomously, which communicates with other models in the network. But this approach is likely to introduce other types of gaps and incompatibilities between models, some of which are identified in Table 8-1 and illustrated in Figure 8-1. We discuss here the need to identify an overall methodology to fill these gaps, including various intelligent automated techniques, processes, and guidelines, as well as aid from human subject matter experts and analysts whenever needed.

Interface Incompatibility

The first problem shown in Figure 8-1 (in the top row) concerns interface incompatibility between two models that either already exist or are being developed independently. If we intend to feed output from model A about a certain object X as input to model B, then some mismatch between the output and input may occur in terms of the assumptions about the numbers and types of X's attributes. This is usually straightforward but tedious to deal with, often merely involving translation from one descriptive framework to another (e.g., from numerical values such as 1, 2, 3, . . . to "fuzzy" values such as low, medium, high, . . .).
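The numeric-to-qualitative translation just mentioned can be sketched in a few lines. This is only an illustration: the cut points, the label set, and the field names (`threat`, `readiness`) are hypothetical and are not taken from any model discussed in this chapter.

```python
from bisect import bisect_right

# Hypothetical cut points on a 0-10 scale; real boundaries would come from
# the receiving model's documentation or from a subject matter expert.
CUT_POINTS = [3.0, 7.0]
LABELS = ["low", "medium", "high"]


def to_label(value, cut_points=CUT_POINTS, labels=LABELS):
    """Translate a numeric attribute into a qualitative label."""
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    # bisect_right counts the cut points that are <= value, which is the
    # index of the bin the value falls into.
    return labels[bisect_right(cut_points, value)]


def translate_output(record, numeric_fields):
    """Return a copy of model A's output with the selected fields relabeled."""
    return {
        name: to_label(val) if name in numeric_fields else val
        for name, val in record.items()
    }


# Hypothetical output of model A about object X.
model_a_output = {"unit": "X", "threat": 8.2, "readiness": 2.5}
model_b_input = translate_output(model_a_output, {"threat", "readiness"})
print(model_b_input)
```

The tedium the text refers to shows up in practice as the need to agree on cut points for every attribute that crosses the interface.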
[Footnote: Much of the work described in this section on model integration and interoperability was performed by John Langton and Subrata Das at Charles River Analytics with support from the Air Force Research Laboratory, Information Directorate (AFRL/IF) under contract FA8750-06-C-0076, and adapted from Langton and Das (2007).]

TABLE 8-1 Gaps and Incompatibilities Between IOS Models

Type: Definition
Interface: Mismatch between the data types of different models or outputs of one model and inputs of another, e.g., real number vs. Boolean
Ontological: Different relationship structures, naming schemes, etc., in ontologies for different models
Formalism: Different logic and inferencing mechanisms and procedures for different models
Subdomain gaps: Differing domains and dynamics between PMESII model dimensions, e.g., economic vs. social

SOURCE: Langton and Das (2007).

[FIGURE 8-1 Illustration of gaps and incompatibilities between IOS models. Rows, top to bottom: interface incompatibility (output of Model A about an object of type X feeding the input of Model B); ontological incompatibility; formalism incompatibility (Bel(X) vs. p(X)); subdomain gaps (economy vs. social). SOURCE: Langton and Das (2007).]

A bigger problem ensues when different levels of resolution are used to represent the same object in two different models. If model A provides a high-resolution object representation of X (e.g., a map, enemy force estimates) for model B, and model B needs a low-resolution representation (e.g., latitude/longitude of enemy center of gravity), then some aggregation process must be conducted, usually based on one approximation method or another. The reverse process is much more difficult, going from a low-resolution output to a high-resolution input, since, in effect, missing input attributes have to be inferred or approximated and filled in. A number of approaches can be used to resolve the interface incompatibility. These are described in the section on interoperability recommendations below.
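As a rough sketch of the high-to-low resolution case, the following collapses per-unit force estimates into a single strength-weighted point. The record layout `(lat, lon, strength)` and the sample values are assumptions made for illustration, and averaging degrees directly is itself an approximation that is only reasonable over a small area.

```python
def center_of_gravity(units):
    """Collapse (lat, lon, strength) records into one strength-weighted point.

    Assumes a small area of operations, so that averaging degrees directly
    is acceptable; this is not valid across the antimeridian or near the
    poles.
    """
    total = sum(strength for _, _, strength in units)
    if total <= 0:
        raise ValueError("total strength must be positive")
    lat = sum(la * s for la, _, s in units) / total
    lon = sum(lo * s for _, lo, s in units) / total
    return lat, lon


# Hypothetical high-resolution output of model A: one record per enemy unit.
enemy_units = [
    (34.00, 45.00, 100),
    (34.50, 45.50, 300),
]
print(center_of_gravity(enemy_units))
```

The asymmetry described in the text is visible here: many different unit layouts share the same weighted centroid, so the function has no inverse, and going from the low-resolution point back to unit positions would require additional assumptions.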

Ontological Incompatibility

The second problem illustrated in the figure is ontological incompatibility between models, which arises from differing vocabularies and expressive power in their respective ontologies. Different teams of engineers and subject matter experts, with a diverse range of expertise, knowledge, and cognitive capabilities, independently creating models will inevitably develop and use different underlying ontologies, which in turn will give rise to incompatibilities across models. Initially, one might suggest the development of a common ontology for the set of all possible models; however, many failed efforts in this direction make it clear that developing a universal ontological standard for model creation is impractical, if not theoretically impossible. Moreover, if models are to be built rapidly, analysts should ideally be free to use a model-building environment of their own choosing without assistance from knowledge engineers. The analysts should not be constrained to express their knowledge in a predefined ontology, which usually inhibits their expressive flow.

Hence, rather than proposing to develop a common ontology for the model space, one approach is to focus on facilitating better mapping capabilities between differing ontologies. For example, there are tools that can map ontological terms from one domain to another by solving the problems of synonymy and polysemy; these clearly offer hope for translating differing ontologies used in the models. In some cases of incompatibility between the underlying ontological structures of the models (e.g., semantic networks versus logical expressions), one domain can be mapped to another by providing a more expressive ontological structure for one of the models (e.g., semantic networks can be mapped to first-order logical sentences). Therefore, some parts of the ontological incompatibility problem can be addressed via automated techniques.
A number of approaches to resolving ontological incompatibility are described below.

[Footnote: An ontology, for the purposes discussed here, is "a systematic arrangement of all of the important categories of objects or concepts which exist in some field of discourse, showing the relations between them. When complete, an ontology is a categorization of all of the concepts in some field of knowledge, including the objects and all of the properties, relations, and functions needed to define the objects and specify their actions" (http://www.answers.com/ [accessed July 2007]).]

[Footnote: Synonymy refers to one referent (concept) with several words that can denote it (plain English examples: big, large); polysemy refers to one word denoting multiple referents (plain English examples: break; park).]

Formalism Incompatibility

While ontological incompatibility creates problems due to multiple ways of designating an entity, the formalism incompatibility shown in Figure 8-1 is concerned with multiple ways of instantiating the object entity computationally represented in the model. For example, uncertainty can be expressed not only in terms of probability values, but also via various other formalisms, such as certainty factors, the Dempster-Shafer measure of beliefs (Shafer, 1976), and numerous other qualitative dictionaries. These are fundamentally incompatible with each other, both in terms of their underlying conceptual representation of uncertainty and probabilistic reasoning, and in the sense of having different types of scales. Conversion between two such formalisms often requires deep understanding of the models and their formalisms, thus breaking the simple I-O black box idea of encapsulation. Specialization of formalism is often appropriate to map one approach to another. For example, probability theory is a special case of Dempster-Shafer theory that allows beliefs to be expressed only on singleton sets, facilitating development of a mapping from probability models into Dempster-Shafer models.

Subdomain Gaps

If one wants to feed the output from a model in one domain to a model in another, it will require an analyst or domain expert with knowledge of both domains to bridge the subdomain gaps. This is due not only to the ontological gaps between the domains being considered, but also to differing dynamics between the domains. Addressing this problem requires the skills of experts from the respective domains or, ideally, experts in both domains. A number of approaches that bridge such gaps by highlighting possible correspondences between concepts and variables across domains are described below. Recommendations are also made for more comprehensive approaches that could be part of a long-term development effort.
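Returning briefly to the specialization argument made under Formalism Incompatibility, the sketch below embeds a probability distribution in a Dempster-Shafer mass function by placing all mass on singleton sets. The outcome names and numbers are hypothetical; the definitions of belief and plausibility are the standard ones, and nothing here is drawn from a specific tool.

```python
def bayesian_mass(prob):
    """Embed a probability distribution as a Dempster-Shafer mass function.

    All mass is placed on singleton focal sets, which is the special case
    the text describes.
    """
    if abs(sum(prob.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return {frozenset([outcome]): p for outcome, p in prob.items()}


def belief(mass, hypothesis):
    """Bel(A): total mass of the focal sets contained in A."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)


def plausibility(mass, hypothesis):
    """Pl(A): total mass of the focal sets that intersect A."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)


# Hypothetical distribution over three courses of action.
p = {"attack": 0.2, "negotiate": 0.5, "withdraw": 0.3}
m = bayesian_mass(p)
a = frozenset({"attack", "withdraw"})
# With singleton-only mass, belief and plausibility coincide with p(A).
print(belief(m, a), plausibility(m, a))
```

The mapping is easy in this direction because no information has to be invented. Going the other way, from a mass function with non-singleton focal sets to a single probability distribution, requires choosing an approximation, which is why the conversion cannot be treated as a black-box step.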
Figure 8-2 provides an illustration of model interoperability, focusing on political, military, economic, social, information, and infrastructure (PMESII)-related issues, with interactions among three layered models: from top to bottom, one focusing on the social structure, one on the community infrastructure, and a third on the underlying information models. The infrastructure model in the middle is a stabilization and reconstruction operations (SRO) model, developed by the AFRL/IF (Robbins, Deckro, and Wiley, 2005) using a system dynamics modeling approach (see Chapter 4), and captures a sequence of influences among variables, starting from the power supply at an electrical power substation. The generated power is fed into an industrial water plant, which produces water consumed by oil field work. An oil field produces crude oil to be refined by a refinery. Refined fuel is used to generate power, which in turn is supplied to various power substations, thus forming a loop. It is especially difficult to reason with these types of graphs, containing such loops spanning many variables, because the loops create an additional burden of discounting the variables' self-influence.

[FIGURE 8-2 Interoperability of three different PMESII models. Top: fragment of a social model (influence diagram) in a town aligned with the USA, in which Power, Drinking Water, Refined Fuel, and Sufficient Food Supply influence the Level of Anger Among Population (low, medium, high). Middle: fragment of the infrastructure (SRO) model linking Power Substation, Industrial Water Plant, Oil Field, Oil Refinery, and Power Generators. Bottom: fragment of a behavioral information model (concept graph) in a town with a terrorist stronghold, relating Terrorist Group A, Leader X, the attributes Aggressive, Diplomatic, and Quick to Anger, Imminent Attack, and observable intelligence (use of threatening phrases, calling for jihad, inviting suicide bombers). SOURCE: Langton and Das (2007).]

The social model at the top of the figure captures the impact of these infrastructure-related variables on the society, using influence modeling technology (see Chapter 6). The model specifically captures the influence of the four variables of power, drinking water, refined fuel, and sufficient food supply on a variable representing the level of anger of the population in a town aligned with coalition forces. The dynamics of the social model are that short supply in any one of these consumables will increase the level of anger among the local population. In fact, if a terrorist organization became aware of the mid-layer SRO model sequence in the infrastructure, then the power substation would assume heightened importance in the eyes of the terrorist strategists: an attack on a substation would not only cripple other services in the loop, but would also drive the sentiment of the local population against the coalition. Note that the diamond box represents the expected mission utility in line with the level of anger. The utility (although difficult to quantify here) should go up when the anger level is down and vice versa.

The behavioral information model at the bottom of the figure illustrates how a model of a terrorist leader can be built using a concept graph approach (Sowa, 1984) in which concepts are represented by rectangles (e.g., [Person: Leader X] and [Behavior: Aggressive]), and conceptual relations are represented by circles (e.g., has Attributes) and soft-cornered rectangles (e.g., Leads, Causes). An analyst can query such a model to determine who the terrorist leader is and the nature of the leader based on various observable intelligence.
Such a leader X, who leads the terrorist group A, can possess different types of behavior attributes, including aggressive, diplomatic, quick to anger, etc. If the leader is quick to anger and there are some stimuli to make the leader angry, then an attack on friendly targets may be imminent. One such stimulus would be coalition forces stopping the supply of oil to the region, as indicated by the link to the SRO model above.

The key issue here is the interoperability among the models. Note that although an I-O connection has been made between the two variables Oil Refinery and Refined Fuel of the top two models, they are ontologically incompatible as defined earlier. However, they can be made compatible by recognizing that the term "Oil" is synonymous with "Fuel," and "Refined" and "Refinery" have a common base word. Another difficult compatibility problem is illustrated by the fact that there is no input for the variable Sufficient Food Supply in the social model, illustrating the interface incompatibility described earlier. One can envision, however, that this "sufficiency" concept could be automatically computed from the supply of food previously recorded in available databases to bridge this last gap. A number of recommendations for resolving specific model incompatibilities and functionality gaps are provided below. More general approaches to resolving more than one of these gaps simultaneously are a current area of study (Langton and Das, 2007).

Recommendations for Resolving Gaps in Model Interoperability

A number of approaches can be taken to maintain, adapt, and integrate diverse models in the context of the interoperability gaps just defined.

Dealing with Interface Incompatibility

Interface incompatibility generally refers to two or more models having different types of data for their inputs and outputs and thus not being able to interoperate without some form of data conversion. There are at least three types of interface incompatibilities:

1. I-O format incompatibilities: string versus binary, real versus integer, fixed versus floating point, numeric versus Boolean, incompatible scale, incompatible zero point, date-time format, color format.
2. Logical incompatibilities: number of I-O points (e.g., three outputs versus four inputs; RGB to CMYK is a trivial example), I-O timing (e.g., fast output versus slow input).
3. Model persistence format incompatibilities: XML versus YAML, OWL versus RDF, etc.

One way to deal with these issues is via a development interface that provides a basic set of translation functions that can learn from user interaction over time. A graphic user interface (GUI) would allow users to explicitly modify, add, and remove interface translation functions, as illustrated in Figure 8-3. Users could also specify these translation functions within an ontology or the XML schema of a model, based on specifications derived, for example, from an evolved, global ontology.
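A minimal sketch of such a modifiable set of translation functions follows. The class name `TranslationRegistry`, its methods, and the three registered converters are hypothetical illustrations of the incompatibility types listed above (scale and zero point, numeric versus Boolean, three outputs versus four inputs); the sketch does not attempt the learning-from-interaction aspect.

```python
class TranslationRegistry:
    """Holds named functions that convert model A outputs to model B inputs."""

    def __init__(self):
        self._functions = {}

    def add(self, name, func):
        self._functions[name] = func

    def remove(self, name):
        del self._functions[name]

    def apply(self, name, *values):
        return self._functions[name](*values)


def affine(scale, offset):
    """Build a converter for incompatible scales and zero points."""
    return lambda x: scale * x + offset


def rgb_to_cmyk(r, g, b):
    """Naive three-output to four-input mapping; channels are in [0, 1]."""
    k = 1.0 - max(r, g, b)
    if k == 1.0:  # pure black: avoid dividing by zero
        return 0.0, 0.0, 0.0, 1.0
    return tuple((1.0 - c - k) / (1.0 - k) for c in (r, g, b)) + (k,)


registry = TranslationRegistry()
# Illustrative entries only; a real interface would be populated by users.
registry.add("celsius_to_fahrenheit", affine(1.8, 32.0))
registry.add("numeric_to_boolean", lambda x: x > 0.5)
registry.add("rgb_to_cmyk", rgb_to_cmyk)

print(registry.apply("celsius_to_fahrenheit", 100.0))
print(registry.apply("numeric_to_boolean", 0.7))
print(registry.apply("rgb_to_cmyk", 1.0, 0.0, 0.0))
```

Keeping the converters behind names rather than hard-wiring them into either model is what lets a user add, replace, or remove one without touching the models themselves.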
A number of potential translation functions are described below in the context of the type of incompatibilities each addresses.

Dealing with I-O Format Incompatibilities

Many interface incompatibilities fall within this category, and most can be resolved by some combination of the following:

• Normalization: mapping any value to lie between 0 and 1 relative to its minimum and maximum possible values.
• Weighting: scaling a value, typically in relation to other values.
• Fuzzification: randomly generating a number to lie within some constraining interval (e.g., some random number between 0.3 and 0.6).
• Discretization: "binning" values according to their range and a range they must fall within (somewhat like rounding), sometimes taking their distribution into account (e.g., 0.5 within a range between 0 and 1 can be discretized to 1 for a range of only 0 or 1).

[FIGURE 8-3 Resolving interface incompatibility. For an object of type X, the output vector $(X_1, X_2, \ldots, X_m)$ of Model A is mapped to the input vector $(Y_1, Y_2, \ldots, Y_n)$ of Model B, with each $Y_j = f(X_{i_1}, \ldots, X_{i_k}, C)$, where $C$ is contextual information. Develop an interface for encoding commonly used transformation functions, e.g., PROJECT(X3), X1*X2+X3, min(X1, X2), gen(X3), fuzz(X3).]

XML schemas often exist to support model file persistence. These schemas define the elements of a model along with the possible values they can take on. XSLT can then be used along with a number of standard translation functions for integrating inputs and outputs of two models on relevant nodes or links. These functions can also be adapted according to user interaction over time.

Dealing with Logical Incompatibilities

In some cases, one model may have more outputs than another's inputs or vice versa. When integrating models, we therefore need methods for addressing these situations. For an overabundance of values, we can simply use some form of aggregation. Again, a model schema or ontology can specify how this aggregation should be performed, or the user could specify this through the above-mentioned GUI. In the case in which we have only one value but must map to more than one, we can simply duplicate the value or partition it according to any context provided in the model schema or ontology. In some cases, the sample rate of inputs and outputs may differ. One way of dealing with this is through smoothing and resampling.

Dealing with Model Persistence Format Incompatibilities

In essence, this issue really mirrors the greater task of integrating models. The existence of a standard schema or ontology for different models would immediately resolve this issue. However, we cannot now depend on such a standard or on adherence to it. A partial solution may be to evolve or derive a standard schema or ontology. In either case, most effective solutions will entail the use of XML and XSLT for the translation of one model format to another.

Dealing with Ontological Incompatibility

Ontological incompatibility refers to two models having different structures, including the entities they specify and the relationships between them. For instance, a rules system model may have several pairs of nodes connected by one link (precedent and consequent), whereas a Bayesian net typically has more of a tree structure.
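Before going further into ontological matching, the I-O format and logical translation steps described above can be made concrete with a short sketch. The function names and signatures are assumptions made for illustration; "weighting" in particular is interpreted here as scaling values by caller-supplied relative weights, and smoothing and resampling are not covered.

```python
import math
import random


def normalize(value, lo, hi):
    """Map value into [0, 1] relative to its minimum and maximum."""
    return (value - lo) / (hi - lo)


def weight(values, weights):
    """Scale each value by its share of the total weight."""
    total = sum(weights)
    return [v * w / total for v, w in zip(values, weights)]


def fuzzify(lo, hi, rng=random):
    """Draw a number from a constraining interval, as the text defines it."""
    return rng.uniform(lo, hi)


def discretize(value, lo, hi, bins):
    """Bin a value from [lo, hi] into one of `bins` integer levels.

    Halves round up (floor(x + 0.5)), so 0.5 on [0, 1] with two bins maps
    to 1, matching the example in the text.
    """
    return math.floor(normalize(value, lo, hi) * (bins - 1) + 0.5)


def aggregate(values):
    """Many outputs to one input: here, a plain mean."""
    return sum(values) / len(values)


def partition(value, shares):
    """One output to many inputs, split in proportion to `shares`."""
    total = sum(shares)
    return [value * s / total for s in shares]


print(normalize(15, 10, 20))
print(discretize(0.5, 0, 1, 2))
print(partition(10, [1, 1, 2]))
```

Python's built-in `round` was avoided in `discretize` because it rounds halves to the nearest even integer, which would send 0.5 to 0 rather than 1.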
Nodes can have different names, graphs can be directed or undirected, and two models representing the same system can be at different resolutions and thus include a different number of nodes and links. The principal issue of this incompatibility is determining which entities, nodes, or links in different models should map to one another for interoperation.

Syntactic heuristics: The labels and descriptions of nodes and links in differing models can be compared on the basis of their raw string content. If these string components match, then the nodes or links may be a match as well. For instance, "runway16" may map to "runway." A threshold for how many characters must match to infer a string match must be specified. This type of matching can also include matching nodes/links based on the range, cardinality, and other attributes of their possible values.

Semantic heuristics: Nodes and links from different models can be compared on the basis of the semantics of their labels, descriptions, and any other textual metadata specified in an XML file, XML schema, or ontology. Elements from different models that have a semantic similarity can then be mapped to one another for model integration. For instance, a node with the name "airport" in one model may be mapped to a node with the name "runway" in another model on the basis of the semantic similarity of their labels. Semantic similarity is determined by the relations between two words as derived from statistical usage, ontologies, thesauruses, dictionaries, etc. There are both service-oriented architectures and application program interface specifications for this purpose, including WordNet (Al-Halimi and Kazman, 1998) and Lexical Freenet (Beeferman, 1998).

Relation mapping: Relation mapping can be used to address ontological incompatibility by mapping nodes from one model to nodes of another based on their relations (how they are connected) within their individual models. With this information, we can then suggest potential mappings between nodes of different models based on the similarity of their relations within their respective models. Consider the nodes α of model A and β of model B. Although these nodes may have very different names, they may have very similar relations. For example, both could influence five other nodes and be influenced by four other nodes. Based on their similarity, we may be able to deduce that these nodes can be mapped together for model integration.
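The syntactic heuristic and a deliberately crude form of relation mapping can be sketched as follows. The 0.6 threshold, the use of the standard library's `difflib.SequenceMatcher` as the string similarity measure, and the reduction of a node's "relations" to counts of outgoing and incoming links are all assumptions made for illustration; the edge lists are hypothetical.

```python
from difflib import SequenceMatcher


def syntactic_match(label_a, label_b, threshold=0.6):
    """Compare raw label strings; the 0.6 threshold is an arbitrary choice."""
    ratio = SequenceMatcher(None, label_a.lower(), label_b.lower()).ratio()
    return ratio >= threshold


def degree_signature(edges, node):
    """Return (number of nodes influenced, number of influencing nodes)."""
    out_deg = sum(1 for src, _ in edges if src == node)
    in_deg = sum(1 for _, dst in edges if dst == node)
    return out_deg, in_deg


def relation_candidates(edges_a, edges_b):
    """Suggest cross-model node pairs whose degree signatures coincide."""
    nodes_a = sorted({n for e in edges_a for n in e})
    nodes_b = sorted({n for e in edges_b for n in e})
    return [
        (a, b)
        for a in nodes_a
        for b in nodes_b
        if degree_signature(edges_a, a) == degree_signature(edges_b, b)
    ]


# Hypothetical directed edge lists (source, target) for two models.
edges_a = [("alpha", "x1"), ("alpha", "x2"), ("y1", "alpha")]
edges_b = [("beta", "u1"), ("beta", "u2"), ("v1", "beta")]

print(syntactic_match("runway16", "runway"))
print(syntactic_match("airport", "runway"))
print(relation_candidates(edges_a, edges_b))
```

The string comparison accepts "runway16" against "runway" but rejects "airport" against "runway", which is the gap the semantic heuristics are meant to fill. The degree-based candidates also pair up every leaf node with every other leaf node, so output of this kind is suitable only as a list of suggestions for a human to review.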
It is important to note that relations encompassing a node are not merely all of its incoming and outgoing links; they also include features identifying how the node affects any other nodes in the model. While this approach should rarely be used to draw links automatically, it could be used to make effective recommendations.

Model node aggregation: Model aggregation can be used to address ontological incompatibility by identifying how sets of nodes in different models with differing cardinalities may be mapped to one another. It may be the case that a node α in model A maps to a subset of nodes N in model B, resulting in incompatible ontologies. For example, consider α to be the node airport and N to be the subset of nodes runway, plane, radar, and air traffic control. The question is, which nodes should airport be mapped to for model integration? We can use the semantic similarity of the labels on the nodes of N (e.g., interfacing with WordNet for ontological inference) to aggregate N into one meta-node: airport′. Using semantic similarity between airport and the constituent nodes of airport′, we can then infer that these two entities should be mapped for model integration. More specifically, the inputs to airport could potentially be mapped as inputs to all nodes of airport′.

[Footnote: In mathematics, the cardinality of a set is a measure of the "number of elements of the set" (Wikipedia, see http://en.wikipedia.org/wiki/Cardinality [accessed Feb. 2008]).]

We can also infer the pairing of airport and airport′ using relation mapping. To continue the example, consider relations between airport and the other nodes of model A, and airport′ and any remaining nodes of model B. If their relations are similar, then airport could be a candidate for integrating with all of the nodes of airport′. For instance, both airport and airport′ could be connected to the nodes passenger, ticket, and pilot. With ontological inference, we can see that the relations of airport and airport′ are similar within their respective models (even though the relations of airport are not similar to the relations of any of the constituent nodes of airport′). We can then deduce that these nodes should be mapped for model integration.

Dealing with Formalism Incompatibility

Established formalism mappings. There has been a great deal of research on mapping between different algorithm formalisms, and there are a number of established standards. Table 8-2 shows a matrix in which the following illustrative formalisms appear in the outer cells of both the X and Y axis: Bayesian probability, Dempster-Shafer, fuzzy logic, possibilistic theory, certainty factor, and symbolic dictionary. Each internal cell denotes the mechanism used for mapping between associated formalisms on the outer cells of the X (to) and Y (from) axes. Note that the mechanism for mapping from X to Y may not necessarily be the same mechanism used for mapping from Y to X. Shaded cells represent established mechanisms for mapping between different formalisms, while nonshaded cells represent potential mapping approaches.
In other words, there are known and established algorithms for mapping between the formalisms that are joined by a shaded cell.

TABLE 8-2 Mappings Between Modeling Formalisms
(Each block lists one "from" formalism and the mechanism for mapping to each other formalism. The cell shading that marks established mechanisms in the original is not reproduced here.)

From Bayesian probability
  to Dempster-Shafer: Generalization
  to Fuzzy logic: Membership degree interpretation of probabilities
  to Possibilistic theory: Transformations based on consistency principles
  to Certainty factor: Mapping from probabilities to certainty factors
  to Symbolic dictionary: Probability-to-symbolic mapping

From Dempster-Shafer
  to Bayesian probability: Bayesian approximation (transferable belief model)
  to Fuzzy logic: Via Bayesian approximation
  to Possibilistic theory: Via Bayesian approximation
  to Certainty factor: Via Bayesian approximation
  to Symbolic dictionary: Belief value-to-symbolic mapping

From Fuzzy logic
  to Bayesian probability: Normalization of membership degrees
  to Dempster-Shafer: Via normalization
  to Possibilistic theory: Possibility measure interpretation of membership degrees
  to Certainty factor: Mapping from membership degree to certainty factors
  to Symbolic dictionary: Membership degree-to-symbolic mapping

From Possibilistic theory
  to Bayesian probability: Transformations based on consistency principles
  to Dempster-Shafer: Belief interpretation of possibility measures
  to Fuzzy logic: Membership degree interpretation of possibility measures
  to Certainty factor: Mapping from possibility measures to certainty factors
  to Symbolic dictionary: Possibility measure-to-symbolic mapping

From Certainty factor
  to Bayesian probability: Normalization
  to Dempster-Shafer: Via normalization
  to Fuzzy logic: Via normalization
  to Possibilistic theory: Via normalization
  to Symbolic dictionary: Certainty factor-to-symbolic mapping

From Symbolic dictionary
  to Bayesian probability: Symbolic-to-probability mapping
  to Dempster-Shafer: Symbolic-to-belief value mapping
  to Fuzzy logic: Symbolic-to-membership degree mapping
  to Possibilistic theory: Symbolic-to-possibility measure mapping
  to Certainty factor: Symbolic-to-certainty factor mapping

SOURCE: Langton and Das (2007).

Using XML schemas and ontologies for formalism mapping. Ontologies can explicitly identify mappings between formalisms in the attributes of links within a model. For such formalisms as Bayesian networks and argumentation networks, relations correlate with the links between nodes. We can add an attribute to each link in a Bayesian network XML schema, declaring a "causes" relationship between a parent and child node it connects. Furthermore, we can specify that types of links declaring a "causes" relationship should be mapped to the "entails" relationship of a rule-based model. When mapping from a Bayesian network model to a rule-based model, we can then infer from the ontology that a "cloudy" node in a Bayesian network that "causes" a "rain" node should be mapped to a rule that has the variable "cloudy" as a precedent and "rain" as a consequent.

Subdomain Gaps

Subdomain gaps can be addressed by learning ontologies and ontological evolution in which the relationships between models are implicitly specified as models are built. Mixed initiative approaches can also be used to address all of the interoperability gaps, including differences in subdomains. For example, the system might offer suggestions about what nodes or links should be related between two models. The user could then accept, edit, or ignore these suggestions. One simple mixed initiative approach would be reinforcement learning (Kaelbling, Littman, and Moore, 1996) guided by these user selections. If a user accepts a suggestion, the system could increase the number of related suggestions. If a user rejects a suggestion, then the system should learn not to make similar suggestions in the future. Even devoid of other heuristics, this approach would allow the storage of historical information as to what input and output types the user typically maps together and offer these mappings as suggestions for subsequent model integration efforts.

In summary, there are no currently agreed-upon and widely used standards for model integration and interoperability. The field of IOS modeling is fragmented, with models being developed from different perspectives, at different levels of detail, and using different theoretical frameworks and architectures. To address these issues, we suggest improvements in "translation" interfaces, schemas, or ontologies that could guide integration, as well as mixed initiative efforts in which model developers and users from different perspectives work together to create models. Architectures and standards that would support the development of integrated, interoperable, federated models are identified as a key area for future research in Chapter 11.
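The history-keeping idea described under Subdomain Gaps can be sketched without any learning machinery at all. The following is a plain tally of accepted and rejected mappings, far simpler than reinforcement learning proper; the class name `MappingSuggester`, its methods, and the type names in the usage example are hypothetical.

```python
from collections import defaultdict


class MappingSuggester:
    """Ranks candidate input types for an output type from past user choices."""

    def __init__(self):
        # scores[output_type][input_type] -> accumulated feedback
        self._scores = defaultdict(lambda: defaultdict(int))

    def record(self, output_type, input_type, accepted):
        """Reward an accepted suggestion and penalize a rejected one."""
        self._scores[output_type][input_type] += 1 if accepted else -1

    def suggest(self, output_type, limit=3):
        """Return input types with positive scores, best first."""
        ranked = sorted(
            self._scores[output_type].items(),
            key=lambda item: (-item[1], item[0]),
        )
        return [name for name, score in ranked if score > 0][:limit]


suggester = MappingSuggester()
suggester.record("refined_fuel_output", "fuel_supply_input", accepted=True)
suggester.record("refined_fuel_output", "fuel_supply_input", accepted=True)
suggester.record("refined_fuel_output", "power_input", accepted=True)
suggester.record("refined_fuel_output", "anger_level_input", accepted=False)
print(suggester.suggest("refined_fuel_output"))
```

Rejected pairings drop out of the suggestions once their score is no longer positive, which is the minimal form of "learning not to make similar suggestions." Generalizing from one rejected pairing to similar ones would need a notion of similarity between types that this sketch does not have.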
Frameworks and Toolkits

Note: Much of the work described in this section was performed by Karen A. Harper, Jonathan D. Pfautz, Chen Ling, Sofya Tenenbaum, David Koelle, and Marc Sageman, with support from the Air Force Research Laboratory, Human Effectiveness Directorate (AFRL/HE) under contract FA8650-06-C-6731, and by Karen A. Harper and John Bachman with support from the AFRL/IF under contract FA8750-06-C-0078, to Charles River Analytics.

General Issues and Requirements

Earlier chapters have described many IOS modeling and analysis techniques, but it is generally accepted that no single approach or modeling formalism can or should be applied to capture all of the complex dynamics of modern military missions and activities. The previous section has described some of the fundamental modeling issues that arise when we consider linking up or "federating" different models, and it is clear that considerable progress will have to be made at the conceptual level before such activities become commonplace in the IOS modeling and simulation (M&S) community. In the meantime, there has been and continues to be progress in the development of more specialized (and therefore less globally encompassing) frameworks and toolkits that attempt to address some of the more practical issues of model development, verification and validation (V&V; see following section for more complete discussion), and integration across modeling concepts and instantiated simulations. In general, these efforts attempt to provide an integrated development environment (IDE) that enables
• the development of simpler, more focused submodels to represent specific features of the behavior of interest to the analyst, using the most appropriate tools for modeling those features;
• the straightforward integration of those submodels into a cohesive and sophisticated representation of the overall operational environment; and
• the effective accounting of the complex interdependencies between modeled variables within the integrated system.

Most of the work in developing frameworks and tools has occurred in the individual "stovepipes" (some refer to these as "cylinders of excellence") that characterize each modeling community. Perhaps the best funded over the longest development history is the OneSAF System (One Semi-Automated Forces; see Chapter 2) of the Department of Defense (DoD).
OneSAF is an M&S environment with a strong Army legacy that models combined arms tactical operations up to the battalion and brigade level, at variable levels of resolution ("entity") from the individual soldier on up. A key driver in its development was to ensure "composability," which is another way to say that the associated development environment provides for user-specifiable systems, entities, units, and associated behaviors (with variable "dial-in" levels of fidelity). This is accomplished via a Product Line Architectural Framework, illustrated in Figure 8-4, a layered architectural approach that allows for "plug-and-play" modules at many different levels, and via a model-developer suite of GUIs that provide the following functionalities to the M&S developer:
• System composer: High-level control and testing of the overall simulation.
• Entity composer: Specification of hardware components (weapons, sensors, etc.).
• Unit composer: Specification of the organizational structure.
• Behavior composer: Specification of the entity behaviors in terms of a task-network branching structure composed of conditional branch points and behavior primitives (this is the heart of the OneSAF behavior model).
• Management and control tool: High-level control of the mission objectives, order of battle, route plans, etc.

FIGURE 8-4 OneSAF product line architecture framework. SOURCE: See http://www.peostri.army.mil/CTO/FILES/SmithR_GeneralFrameworkInterop.pdf, p. 12 [accessed April 2008].

OneSAF's capabilities are impressive, but its focus on the ground war, its limited repertoire due to prescripted behavior primitives, its inability to model deep cognitive or social interactions, and its narrow focus on military missions (in contrast to more encompassing PMESII considerations) all point to the need for further model development in this area. Its focus on bringing the modeling out of the hands of the programmers and into the hands of the analysts and users, via a focused effort on IDE development, is commendable and should serve as a model for parallel efforts now ongoing in other M&S communities.

Another modeling community working on IDEs is the group of researchers and model developers focusing on the behavior of the individual human, often based on the framework of the particular cognitive architecture underlying the model (see Chapter 5 for additional discussion). Table 8-3 provides a sampling of some of the individual behavior models discussed earlier, along with their associated IDEs.
TABLE 8-3 Selected Cognitive Architectures and Their Development Environments

Model: ACT-R 6.0
Development environment: No formalized development environment, since model developers work directly with the different ACT-R frameworks: LISP ACT-R, Python ACT-R, and jACT-R
Comment:
• LISP ACT-R is the "baseline" version and requires a knowledge of LISP programming
• Python ACT-R makes ACT-R available to a wider audience (i.e., non-LISP programmers)
• jACT-R is a Java version
• All have associated programming IDEs but require programming skills, and theoretical knowledge of the underlying cognitive architecture constructs, significantly beyond drag-and-drop model-building activity

Model: COGNET
Development environment: iGEN
Comment:
• Workbench-based development environment with a collection of high-level agent-building tools
• GUI for defining program logic and knowledge, without programming
• Application program interface for integration of iGEN cognitive agents within existing applications using standard languages/protocols

Model: D-OMAR
Development environment: OmarL, OmarJ
Comment:
• OmarL is a LISP-based environment for knowledge representation and the definition of agents and their behaviors. The languages are extensions of the Common LISP Object System (CLOS)
• OmarJ is a Java-based agent development environment that provides tools for creating and managing systems of agents operating in a distributed computing environment. OmarJ provides most of the features of OmarL with an improved external communication layer that uses Jini for internode communication and the ability to break out of simulation mode and run agents in a non-time-controlled environment

Model: EPIC
Development environment: (none listed)
Comment:
• IDEs associated with original LISP version of EPIC and with current C++ version

Model: SAMPLE
Development environment: AgentWorks™
Comment: AgentWorks™ consists of:
• Perceptual, cognitive, and communications modules including neural networks, fuzzy logic, Bayesian belief networks, expert systems, and argumentation engines
• Advanced processing capabilities supporting planning, learning, and distributed applications
• Enhanced usability components for construction, validation, and visualization of agent processes

Model: Soar
Development environment: SDB
Comment:
• Soar Debugger (SDB) is an XDB-like debugger for the Soar programming language, including functionality such as deep structure inspection, watches, and breakpoints, and a graphical interface to common Soar commands

As can be seen, the IDEs are very specific to each cognitive modeling paradigm; can range from highly generic programming language development environments (e.g., CLOS) to very specific model development environments (e.g., iGEN); assume varying levels of expertise on the part of the model developer, from general programming expertise to "drag-and-drop" graphic construction skills; and provide a range of developer support, from little beyond the basic programming IDE to extensive debugging, logging, and visualization. Again, this is not meant to be a survey of such IDEs, but rather an illustration of the variety of IDEs in use by the development community.

Yet another modeling community engaged in developing frameworks and toolkits is the widespread and diverse group of researchers, model developers, and applications specialists focusing on group and organizational models. One of the best clearinghouses for gaining an overview of available models and tools is maintained by the Computational Analysis of Social and Organizational Systems Center at Carnegie Mellon University. Although there exists some conflation of models and the associated IDEs for their development, the site provides useful pointers to a number of model development tools and frameworks at Carnegie Mellon and elsewhere, as illustrated by:
• Construct (http://www.casos.cs.cmu.edu/projects/construct/info.html), a multiagent model development environment
• OrgAhead (http://www.casos.cs.cmu.edu/projects/OrgAhead), an organizational structure analysis tool
• DyNet (http://www.casos.cs.cmu.edu/projects/DyNet)
• BRAHMS Composer (http://www.agentisolutions.com/products/composer.htm), the IDE for BRAHMS, an agent-based organizational modeling framework
• SimVision (http://www.epm.cc/solutions/simvision.htm), a bundled software environment and methodology for organizational design
• CONNECT (http://www.cra.com), a social network analysis tool for organizational modeling and simulation
• DDD (Distributed Dynamic Decision-making; http://www.aptima.com/a-sim.php), a simulation building and execution environment for predicting and assessing team performance

A quick perusal of these tools (and others) makes it clear that, like the cognitive modeling frameworks described earlier, the associated IDEs run the gamut in sophistication, from those demanding high levels of user expertise in the underlying theory and the associated modeling language, to those stressing ease of use, but imposing limited applicability for selected domains. Considerable work is still needed to bring these highly specialized models out to the general user community, via IDEs that provide wide applicability as well as usability.

In the general area of developing representations for "soft" problems in IOS behavior, such as modeling the evolution of a terrorist organization or understanding the multiple possible paths in nation-state rebuilding, together with the interplay of critical diplomatic, information, military, and economic (DIME) and PMESII variables, little has been accomplished in the way of developing associated IDEs to support the DIME/PMESII M&S community.
This is primarily because such nascent modeling efforts are still grappling with the conceptual underpinnings of representation; considerations of model development infrastructure and user (developer) friendliness are still considered a secondary objective. However, the lack of such environments may actually be hampering conceptual development, because unwieldy development environments slow the "test-and-evaluate" spiral cycles that must inevitably occur in this field.

Note: "Soft" in the sense of heavily driven by human and social rules of behavior, as opposed to more readily modeled problems that are well constrained by generally accepted physical, economic, or doctrinal factors.

As noted earlier, modeling of military activities in which IOS behaviors dominate outcomes (e.g., asymmetric threats embedded in urban environments) demands a clear understanding of the complex sociopolitical context. This translates to the analysis of the potential effects that a given set of DIME actions will have across the full range of the PMESII context. Within the context of the Integrated Battle Command program of the Defense Advanced Research Projects Agency (http://www.darpa.mil/sto/solicitations/IBC/), these analyses are viewed in two ways, as shown in Figure 8-5. From left to right, the figure shows a causal analysis in which, given a set of possible DIME actions to be taken, a system of complex and integrated behavior models is used to predict the potential effects those DIME actions may have across the PMESII dimensions. From right to left, the figure shows a diagnostic analysis in which, given a set of desired PMESII effects in the operational domain, the same system of integrated behavior models is used to identify the candidate sets of DIME actions that might be applied to achieve those desired effects. By conducting both types of analyses (ones that move well beyond the limits of conventional military "metal-on-metal" modeling embodied by OneSAF, for example), commanders will be able to develop significantly deeper insight into the dynamics of the big picture operational context (see additional discussion later in this section).

The key to successfully executing such encompassing analyses lies in the development of the embedded behavior models representing the full range of PMESII variables and how they can be individually and collectively affected by specific DIME actions.
For example, as described earlier in this chapter, the SRO model (Robbins et al., 2005) analyzes the organizational hierarchy, dependencies, interdependencies, exogenous drivers, strengths, and weaknesses of a country's PMESII systems using a complex set of interdependent system dynamics representations. While approaches like this have demonstrated some success in modeling subcomponents of the PMESII environment, it is generally accepted that no single approach or modeling formalism can or should be applied to capture all of the complex dynamics of modern asymmetric warfare; in other words, it is not necessary to stick with a single modeling formalism (e.g., system dynamics modeling) to model something as complex as a nation-state undergoing political upheaval, foreign intervention, or civil war.

A better approach is to provide an IDE that enables the interconnection of disparate modeling methods representing DIME/PMESII features, using the most appropriate method to model each feature. A key issue is providing compatibility across models and their underlying modeling formalisms, such as the models described in Chapters 4 through 6 and the generic formalisms presented in Table 8-2. As noted above, this is a conceptually difficult problem to solve theoretically, but some progress can be made with the development of sufficiently flexible IDEs.

Note: A brief overview of potentially useful modeling paradigms for DIME/PMESII modeling and analysis issues is given in Appendix C.

FIGURE 8-5 Predicting and analyzing PMESII effects of DIME actions. [Figure labels: "What if?" (potential DIME actions, Blue or Red; potential PMESII effects, Blue or Red); "How to Achieve?" (possible DIME actions; desired or actual PMESII effects); Social/Culture; Economic/Infrastructure; Political/Religious; Information; Regular Military; Irregular Military.] SOURCE: Adapted from Allen (2004).

IDE Development Goals and Examples

An ideal IOS IDE, especially one targeted for the complex task of developing DIME/PMESII models, would include
• an intuitive graphical model development environment supporting the specification of heterogeneous submodels using a variety of modeling formalisms (Bayesian reasoning, fuzzy logic, system dynamics models, rule-based expert systems, etc.).
• a suite of model integration tools enabling user-driven sharing of data and information among constituent DIME/PMESII models.
• a suite of model V&V tools enabling user-driven verification of individual and integrated DIME/PMESII model behavior as well as the large-scale data collection required to support validation of model behavior against empirical data.
• a model analysis infrastructure that enables user-driven causal and diagnostic reasoning within the integrated modeling framework using sampling techniques and sensitivity analysis, respectively.
• a suite of multiresolution modeling tools and supporting infrastructure to support the user-driven specification of DIME/PMESII submodels at multiple levels of modeling fidelity.
• a model management infrastructure that enables the capture, distribution, and maintenance of large libraries of DIME/PMESII submodels.

We describe here an exemplar effort in developing such an IDE, the Human and System Modeling and Analysis Toolkit (HASMAT) developed for AFRL/HE (Bachman and Harper, 2007; Harper et al., 2007), and describe two specific modeling efforts conducted with this framework, to illustrate how nonconventional modeling problems (specifically counterterrorism and military recruiting) can be addressed within such frameworks. HASMAT is intended to be representative of efforts under way to develop frameworks in this area. We close with a description of generic DIME/PMESII analysis capabilities that also need to be part of such frameworks.

Human and System Modeling and Analysis Toolkit

HASMAT is designed to support predictive analysis of behavioral and organizational dynamics by integrating existing and mature technologies. The HASMAT functional system architecture is shown in Figure 8-6. HASMAT is used by a modeler to create a model representing human behavior at multiple levels, from societal behavior down to individual cognitive decision-making behavior. To create these models, the modeler can use a variety of modeling methods (e.g., social network modeling, Bayesian belief networks, rule-based systems, fuzzy logic, case-based reasoning).
The modeler can also use a number of different methods for model integration, defining ontologies, data schemas, and mappings between individual modeling components or between a model and an external environment (e.g., a decision aid, a simulation, a real-world data source). A modeler also has access to tools for model management, including version control methods for existing or newly created models, and libraries of models and model templates that can be adapted to a particular domain or situation. All of these capabilities are accessed via the modeler interfaces, which provide GUIs to specific toolkit features. All of these components are integrated into a software system architecture designed to support reconfigurability, integration, and incorporation of new capabilities.
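To make the idea of mappings between modeling components concrete, the following is a minimal sketch of a translation registry between formalisms. It is not HASMAT's interface, which the text does not describe at this level. The registry, function names, confidence thresholds, and symbolic labels are all assumptions made for illustration, and the two mappings shown (normalizing fuzzy membership degrees into probabilities, then reducing probabilities to a symbol) are simple stand-ins for two cells of Table 8-2.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry of translation functions,
# keyed by (source formalism, target formalism).
MAPPINGS: Dict[Tuple[str, str], Callable] = {}


def register(source: str, target: str):
    """Decorator that records a translation function in the registry."""
    def decorator(fn):
        MAPPINGS[(source, target)] = fn
        return fn
    return decorator


@register("fuzzy", "probability")
def normalize_memberships(memberships):
    """Rescale membership degrees so that they sum to 1."""
    total = sum(memberships.values())
    if total <= 0:
        raise ValueError("at least one membership degree must be positive")
    return {label: degree / total for label, degree in memberships.items()}


@register("probability", "symbolic")
def probability_to_symbol(distribution):
    """Pick the most probable label and attach a coarse confidence word.

    The 0.7 and 0.4 cut-offs are arbitrary illustrative choices.
    """
    label, p = max(distribution.items(), key=lambda item: item[1])
    if p >= 0.7:
        word = "likely"
    elif p >= 0.4:
        word = "possible"
    else:
        word = "uncertain"
    return f"{word}:{label}"


def translate(value, path):
    """Apply the registered mappings along a chain of formalisms."""
    for source, target in zip(path, path[1:]):
        value = MAPPINGS[(source, target)](value)
    return value


# Output of a hypothetical fuzzy submodel, passed to downstream submodels.
fuzzy_output = {"low": 0.1, "medium": 0.8, "high": 0.7}
probs = translate(fuzzy_output, ["fuzzy", "probability"])
symbol = translate(fuzzy_output, ["fuzzy", "probability", "symbolic"])
```

Keying the registry on formalism pairs means a new submodel type needs only its own translations added. The sketch does not address whether a given translation is semantically justified, which is the harder problem the chapter raises.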

FIGURE 8-6 HASMAT system architecture. SOURCE: Harper et al. (2007).

Figure 8-7 provides an overview of the design vision for the PMESII model analysis tools in the HASMAT environment. In the upper left of the graphic is a simple selection tool with which the analyst chooses, from among the range of available models defining the PMESII environment, the model to execute and analyze. Selecting a specific model causes its input fields to be captured (via its XML schema-based I-O specification) and used to populate the tabular input sets shown on the left. On the right side of the figure are displayed the outputs of the model's execution, in this case the national and regional SRO model outputs. This display could be generalized to other representative structures, including a map-based overview of the model-generated results. As the user selects specific outputs in the callout datasets, the overview map shows the comparison of that value set across the modeled regions through fill color. As the user drags the timeline back and forth, the output datasets will display the values as described for that point in time. Finally, in the lower right, the analyst is also provided with graphical displays of selected model outputs in time-series data plots, a feature that allows for easier access to trend data throughout a model run for more detailed analysis. As the user manipulates the timeline, these data plots shift a marker to identify the value at the selected time.

FIGURE 8-7 An overview of the analyst's interface in the HASMAT environment. SOURCE: Harper et al. (2007).

Modeling Terrorist Network Evolution

Figure 8-8 illustrates the software integration strategy that was used to generate a HASMAT-based modeling framework of terrorist organization activity (Harper et al., 2007). The model constructed within the integrated HASMAT framework consists of a social network representation of an organization or loosely connected set of groups or individuals of interest to the counterterrorism analyst. Each node within the social network can represent an individual (e.g., a key leader in the community of interest that has been the target of specific intelligence-gathering activities), a group (e.g., a set of individuals representing a cohesive entity in the community of interest), or an event. The links within the social network represent relationships between nodes in the modeled community, in which these relationships are defined at the outset by known intelligence (e.g., individual X is a known leader of group Y). This social network representation enables the analyst to build up the network over time based on intelligence products.

In typical social network analysis applications, this static representation would be used and analyzed to infer structural elements or features of the organization that, for example, might be exploited by counterterrorism specialists to capture further intelligence or to infiltrate a known group of interest.
In HASMAT, however, this social network topology provides only the first step of the modeling capability. Each node of the social network is then populated by a "behavioral agent" representing the dynamic behavior of the modeled individual or group. These agents can be configured by the analyst based on gathered intelligence information. Thus, these agents are not static representations of individual or group "profiles," although they do contain representations of such information. Instead, they provide dynamic simulations of behavioral responses of the modeled individuals or groups within the social network to events and actions that are "injected" into these models based on evolving simulations of the social network dynamics. For example, an agent representing a given individual can "react" to incoming information (e.g., the invasion of Iraq by U.S. forces, the introduction of a new leader into a group of interest) and generate new events that propagate out to the social network (e.g., the establishment of new or strengthening/weakening of existing relationships within the network). The result is an emergent, evolving representation of the organizational dynamics of the modeled community, driven by modeled reactive behaviors of individuals and groups.

FIGURE 8-8 Overview of HASMAT software integration strategy. [Figure annotations: Agent behaviors are captured as change events in social network (e.g., generate new link, change attribute of node/link). Each node of the social network is represented by an agent that reasons about the network data. Agent components generate responses to network events captured as agent behaviors (e.g., seeking new relationships). Each agent applies a collection of computational reasoning algorithms to process network events.] SOURCE: Harper et al. (2007).

Finally, at the bottom of Figure 8-8, we show the supporting modeling technologies that are assembled by the model developer (the analyst, a third-party social scientist, etc.) to provide the detailed representations of modeled individual and group behaviors. These detailed components capture and generate the simulated responses of a modeled individual or group based on injected or simulated stimuli. These simulated responses are then pushed back up to the social network representation as "change events" within the social network itself. Such change events include the generation of new links (i.e., relationships), the deletion of existing links, the adjustment of profile characteristics of the modeled node (e.g., an increase or decrease in an individual's radicalism), and the adjustment of a link attribute (e.g., strength or nature of a relationship).

This modeling framework was then used to model the well-documented terrorist activity leading up to the 2004 Madrid train bombings (Sageman, 2004, 2006; Harper et al., 2007), including organizational relationships among individuals associated with the attacks and their evolution over time (Telvick, 2007). Many interesting dynamics were seen in the data and modeled in the HASMAT environment. One goal of the effort was to model the courses a group can take: it can talk or boast about operations, or it can actually take action. Many factors contribute to the final shift to action, including how much the group has boasted of action so far, easy access to weapons, the required skills, an external missive or deadline, and past criminal history (a predisposition to act).
This and several other model outcomes were compared with the available data to assess model fidelity to the real world, to support rapid spirals of hypothesizing, developing, and validating, in an effort to understand the underlying dynamics of the terrorist network's behavior in terms of fundamental behavioral "primitives" (Harper et al., 2007). Without the rapid development environment afforded by HASMAT (and similar IDEs now beginning to be used in the community), these rapid spirals and exploration of possibilities would not be possible.

Modeling Iraqi Recruiting Activity

A similar software integration strategy was applied to generate a HASMAT-based modeling framework of Iraqi recruiting and training activity (Bachman and Harper, 2007). The SRO model (Robbins et al., 2005) was constructed within the integrated HASMAT framework, consisting of a system dynamics model representation of key PMESII components of Iraq, including demographics, coalition and insurgent activities, critical infrastructure, etc. This allowed for full communication between two

heterogeneous modeling environments and the development of specialized models that were best served by the system dynamics paradigm (e.g., SRO model), or by a suite of computational intelligence components (e.g., agents that incorporate fuzzy logic, belief networks, expert systems).

A major objective of the development effort, besides developing a framework for integrating heterogeneous modeling paradigms, was to provide the analyst with an aggregated, accessible view of model results that could be used to support decision making by a commander or other decision maker. Such a tool would allow the commander's staff to easily generate model inputs (representing DIME actions that could be taken) and monitor model responses over time in a presentation framework that would be more intuitive than the PMESII IDE itself. For example, in the context of the recruiting and training SRO model, the analyst might specify a particular allocation of troops to support recruiting and training of specific capabilities in a given region of Iraq. The models would then be executed against this input set, and the analyst could monitor the overall effects on high-level PMESII variables (unemployment, economic stability, crime rates, etc.) in an intuitive graphical interface. This would insulate the analyst from the detailed outputs and implementation details that would be of interest to the model developer, while providing the analyst with intuitive and targeted real-time decision support leveraging the models constructed using the HASMAT framework.

Advanced Analysis Capabilities

Making predictions using DIME/PMESII models requires two types of what-if analysis, depicted in Figure 8-5 earlier in this chapter. The first type, causal reasoning, enables analysis from causes to effects. This allows the user to consider the effects of potential DIME actions on the PMESII models under consideration.
The second type, diagnostic reasoning, enables reasoning from effects to causes. This allows the user to specify the desired (or actual) PMESII effects and determine the DIME actions that are most likely to achieve this result while minimizing undesirable second- or third-order effects. Supporting these two types of reasoning using PMESII models requires specific statistical sampling and analysis techniques.

Causal reasoning. In this type of analysis, the user specifies a set of DIME actions and the analysis indicates how these actions would influence the given PMESII models. Due to the nonlinearity of the systems being modeled and the incompleteness of information about system state, it is unreasonable to expect that PMESII models will provide high-fidelity predictive capabilities. Instead, the predictive value of the models lies in their ability to generate the distribution of plausible outcomes across multiple courses of

action. One approach to this is to use Monte Carlo sampling, which refers to a family of algorithms that approximate a function f by calculating f(x), for a randomly chosen x, over many iterations. Sampling is a useful approximation technique in cases in which the function to be computed is difficult or impossible to calculate exactly. For complex nonlinear models such as PMESII models, randomized sampling provides an effective approach to approximating model outputs because it is independent of the underlying formalisms being used by the model.

Sampling can be used to analyze any model that incorporates both (1) a representation of the cause-effect relationships between model elements and (2) a specification of the relative likelihoods of inputs or initial states of model elements for which such conditions are not explicitly specified by the user. The first condition requires only that the model being sampled have some predictive capability. For example, hidden Markov models, belief networks, neural networks, and rule bases all meet this condition; a purely analytical tool such as a topic tree or a concept map does not. The second condition requires that the model specify a distribution of initial conditions for model elements, including the likelihood that various actions (either blue or red) will be observed. This allows the sampling algorithm to select random inputs according to a plausible distribution.

Given that the DIME/PMESII models in the system meet these two criteria, a user would perform a causal analysis using the analysis sampling tool in the following manner:
1. Specify conditions. The user first specifies the set of assumptions to be evaluated by the analysis; this includes not only the DIME actions of interest but also assumptions about the state of hidden variables in the models. The user also specifies the number of iterations to be performed by the sampling algorithm.
2. Select data collection parameters. The user then selects the elements in the models for which state data will be collected.
3. Begin the simulation. The IDE samples the PMESII models repeatedly. At each iteration, the states of variables not explicitly set in step 1 are randomized to permissible states given information about the relative likelihood of the initial states of the variable. The effects of the model inputs are propagated through the model, and the framework collects system state data for the variables selected by the user in step 2.
4. View collected data. The user then views the data collected in step 3, viewing the relative frequencies of various outcomes.

The envisioned IDE would allow the model analyst to specify initial conditions for input variables and view the resulting simulation data in a graphical format for analysis. By performing this type of simulation-based analysis for multiple DIME actions, the user would be able to determine which actions result in a greater likelihood of achieving the desired effects as well as which actions result in a greater likelihood of causing undesired effects.

Means-ends and sensitivity analysis. The second type of reasoning of interest to a DIME/PMESII modeler is means-ends analysis: for a given effect or system state, what are the actions that can be taken to achieve the desired state? This type of analysis is very difficult to do using heterogeneous models and is an area in need of further work. Outlined here are some potential approaches to supporting this type of analysis.

One approach would be to perform a forward-chaining analysis for each set of actions under consideration; the set of actions most likely to achieve the desired result could be selected empirically based on the results of each analysis. Such an approach is clumsy and inefficient, however, since the forward-chaining reasoning process is itself computationally expensive. Also, performing a brute-force means-ends analysis in this fashion, with the large number of action sets that are likely to be possible, would quickly become prohibitively complex and computationally expensive.

One solution is to reduce the search space of possible actions or input states using a technique known as sensitivity analysis. Sensitivity analysis computes, typically using black box sampling techniques, how variability in the output of a model depends on variation in its inputs. Because it uses sampling, sensitivity analysis can also be applied to any type of model formalism: only the inputs and outputs are observed. In the case of reasoning using DIME/PMESII models, we can use sensitivity analysis to determine which actions or input variables are most relevant in determining the outcome or effect in which we are interested.
Once we have identified a subset of relevant actions, we can then perform brute-force means-ends analysis in the manner described above to determine the optimal combination of those actions.

To illustrate this process further, consider the following example. Suppose a group of modelers have developed a network of DIME/PMESII models specifying the interrelationships between the economic and political elements of a particular country. A user of the envisioned framework wishes to use the aggregated model to gain insight into the types of actions that can be taken to boost public confidence in the existing government. Because of the complexity of the model and the number of possible inputs and actions, the user performs a sensitivity analysis and determines that the factors most critical in determining public confidence are the supply of electricity, the visibility of police in the community, and the price of gasoline. Having identified this subset of factors, the user performs a brute-force means-ends analysis and determines that public confidence can be maximized by increasing electricity supply by 20 percent, maintaining the current high level of police forces, and reducing taxes on gasoline by 3 percent.
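The two-stage procedure in this example can be sketched in code. The following is a minimal illustration only, not the envisioned framework: the toy model `public_confidence`, its coefficients, the input names (including the deliberately irrelevant `radio_ads`), and the three discrete levels per input are all assumptions made for this sketch. The sensitivity measure is a crude one (the range of the mean output as one input is fixed at each level while the others are randomized), chosen because it treats the model as a black box.

```python
import itertools
import random
import statistics

# Hypothetical input names and levels (0.0 = low, 1.0 = high); not from the report.
LEVELS = {name: [0.0, 0.5, 1.0]
          for name in ("electricity", "police", "gas_price", "radio_ads")}


def public_confidence(x, rng):
    """Toy black-box model: callers observe only the inputs and the output."""
    signal = (0.5 * x["electricity"] + 0.3 * x["police"]
              - 0.4 * x["gas_price"] + 0.01 * x["radio_ads"])
    return signal + rng.gauss(0.0, 0.05)  # residual outcome uncertainty


def sensitivity(model, levels, n_samples=2000, seed=0):
    """Range of the mean output when one input is fixed at each of its
    levels and all other inputs are randomized (sampling-based)."""
    rng = random.Random(seed)
    spread = {}
    for name, values in levels.items():
        means = []
        for value in values:
            outputs = []
            for _ in range(n_samples):
                x = {k: rng.choice(v) for k, v in levels.items()}
                x[name] = value
                outputs.append(model(x, rng))
            means.append(statistics.mean(outputs))
        spread[name] = max(means) - min(means)
    return spread


def best_action_set(model, levels, relevant, baseline, n_samples=500, seed=1):
    """Brute-force means-ends search restricted to the relevant inputs."""
    rng = random.Random(seed)
    best, best_mean = None, float("-inf")
    for combo in itertools.product(*(levels[k] for k in relevant)):
        x = {**baseline, **dict(zip(relevant, combo))}
        mean = statistics.mean(model(x, rng) for _ in range(n_samples))
        if mean > best_mean:
            best, best_mean = dict(zip(relevant, combo)), mean
    return best, best_mean


# Stage 1: rank inputs by sensitivity and keep the three most relevant.
spread = sensitivity(public_confidence, LEVELS)
relevant = sorted(spread, key=spread.get, reverse=True)[:3]

# Stage 2: search only over combinations of the relevant inputs.
baseline = {name: 0.0 for name in LEVELS}
best, best_mean = best_action_set(public_confidence, LEVELS, relevant, baseline)
print(relevant, best, round(best_mean, 2))
```

Dropping one of four inputs reduces the brute-force search from 81 to 27 combinations here; with realistic numbers of candidate actions the reduction is what makes the second stage tractable.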

Because sensitivity analysis determines the variability of model output according to its inputs, it can provide results of interest other than just the relevance of an input. For example, the rate of change of model output as a result of input may be of even greater significance for a model user in selecting an optimal course of action. If the model indicates a strong nonlinearity or "tipping point" in the output variable under consideration, this would indicate the importance of gathering additional information to determine how close to this tipping point the system being modeled actually is. Or the model may indicate that the results of an action on an output variable may be highly variable, with a large standard deviation; this would indicate a higher risk associated with the action, especially in cases in which the impact of the actions being taken is difficult to control.

In summary, a variety of frameworks and toolkits are in development, although the choices for IOS models are much more limited than for cognitive models of individuals, for which there are a number of well-known, tested, alternative architectures in widespread use. It is a recommendation of the report (see Chapter 11) that diverse frameworks for IOS models be supported and further developed; it is too early to tell which approaches will be most useful for different purposes.

Verification, Validation, and Accreditation

In this section we describe some of the significant issues involved in the VV&A of IOS models: the ways in which they differ from physics-based models, the special challenges of forecasting human behavior, given the huge number of variables that can combine to determine it, and other thorny issues.
We introduce the term "action model" and argue that military requirements for IOS models often include models for action as well as for understanding and exploration, and that the validation of such models cannot be done without a clear specification of the purpose for which the model is being developed. We also discuss the ways in which the military approaches VV&A and provide some examples of VV&A issues specific to various model types discussed in previous chapters. Finally, we make recommendations for dealing with IOS VV&A challenges.

General Issues: Validation for Use

All models are wrong, but some are useful.
G.E.P. Box (1979)

V&V are challenging issues for social science M&S. As generally understood, verification is the "process of determining that a model implementation accurately represents the developer's conceptual description and specifications." Validation is the "process of determining the degree to

which a model is an accurate representation of the real world from the perspective of the intended uses of the model" (ITT Research Institute, 2001, p. 10). Stated more intuitively (ITT Research Institute, 2001, p. 10), verification asks "Did I build it right?" Validation asks "Did I build the right thing?"

In building the right thing, there are two elements: the degree of real-world representation and the intended uses of the model. They are related but are not the same thing. A realistic representation may not meet the intended use. It is a frequent error to put primary emphasis on a realistic representation, assuming it will meet the purpose. The result is an unending quest for realism without considering the intended use or purpose of the model. When one begins with the intended use or purpose, then the degree of real-world representation follows. Depending on the type of understanding that is needed or the action that might be taken, we can determine the degree of realism required. Here we develop an action approach to validation that begins with the model purpose: to take action.

V&V are necessary to support the goal of building and applying purposeful models and model simulations for understanding and exploration as well as for real-world actions. Research program managers frequently see V&V as a drain on resources. In contrast, practitioners or model users typically view the V&V process as a worthy investment of time and effort, since it can prevent the costly consequences of using incorrect models and simulations. If the intended use is not fully considered, then the model is not as useful as it might be. When the intended purpose is to take action (to do something), not just to understand or describe the world, the degree of realism needed is determined by the actions that can be taken in the situation. This section stresses validation issues.
Validation can be approached in two different ways within the larger V&V process. The first way is to begin with verification, proceed to validation, and then to the intended purpose. This ordering of concerns may result in a model that is verified and validated yet fails to be useful for its intended purpose. The second and recommended way is to begin with the intended purpose, proceed with verification, and then to validation in relation to intended purpose (Burton and Obel, 1995; U.S. Department of Defense, 1995).

First and foremost, without a prior specification of intended purpose, there are no clear-cut a priori criteria for deciding which features of a phenomenon to stress in its modeled representation. Indeed, multiple models that represent different aspects of a given phenomenon might be desirable and even necessary to achieve different purposes. For example, given a potentially unstable situation, a model constructed to describe the situation will in general differ substantially from a model constructed to guide the

selection of an intervention action to stabilize the situation. That is, a model for understanding may not be a good model for action.

Moreover, each model purpose entails its own unique model validation requirements. In particular, the model purpose determines the appropriate trade-off between predictive accuracy, the appropriate formulation of dynamic processes, and the appropriate treatment of idiosyncratic and stochastic elements of real-world processes.

For example, models are frequently criticized for lack of realism (that is, not describing the world as it is observed or leaving out some aspect), but realism to what purpose? The continued addition of realistic features makes the implications of a model more difficult to understand, requiring increasingly sophisticated statistical and analytical techniques. Eventually, the continued addition of realism will result in a model's exhibiting such complexity that it has all of the interpretation problems of the real world itself, problems that presumably motivated the modeling effort in the first place.

Extreme realism might also require an impractical amount of data to build the model or to specify parameter values and run the model. Consequently, if a simple model serves the intended purpose, then it should be preferred. Action models require some degree of realism for action, but realism is not a good test for action models. For action models, the purpose is to support decisions to take action, particularly when there is considerable uncertainty about the world.

As stressed by Marks (2006), the assertion that a model is validated when it is determined to be useful for its intended purpose is vacuous until "purpose" is defined. The purpose of a model could be to explain an observed phenomenon, to forecast a range of future phenomena that might occur without an intervention, or to guide the taking of actions in some specific problem context.
For example, purposes might include behavioral description, behavioral explanation, behavioral prediction, exploration, normative advice and implications, training, and decision making (Burton and Obel, 1995). A different model would typically be required to meet each of these different purposes.

The first type of model purpose, explanation, requires what might be termed an understanding approach to validation. The latter type of model purpose, guidance for action, requires what might be termed an action approach to validation. Both purposes typically involve forecasting. The next section briefly reviews the understanding and exploration approach, commonly adopted in academic research. The section after that elaborates the action approach, in which the purpose is to take action or intervene.

Validation for Understanding and Exploration

Consider first the case in which validation is undertaken for the purpose of explanation or of understanding the system that is modeled in order to gain new insights. Intervention is not the purpose here. Ideally, an explanation of a phenomenon would entail a complete understanding of both the necessary and sufficient conditions for its occurrence.

In practice, compromise is essential. Any model will fall short of a complete understanding. There is no limit to further refinement for a more complete understanding. As stressed by Haefner (2005), one possibility is that a model of a given phenomenon is incomplete in the sense that it is not capable of explaining all aspects of the phenomenon deemed to be important for an intended purpose. At the other end of the spectrum, multiple distinct models could offer different competing explanations for a given phenomenon, none of which could reliably be eliminated on the basis of currently available empirical evidence (observational equivalence).

An intermediate possibility stressed by Epstein (2006) is that a model has been constructed that is capable of reliably generating a particular phenomenon of interest (generative sufficiency). Such a model offers one candidate explanation for the phenomenon. Intensive experimentation could then be used to judge the robustness of the generative explanation to perturbations in the model specifications (Judd, 2006). If this process could somehow identify the entire class of models capable of generating the phenomenon, then the ideal but elusive goal of necessary and sufficient explanation would be achieved.

Consider next the case in which the purpose of validation is forecasting to identify a range of possible outcomes and estimate the likelihood of each.
As Marks (2006) notes, this is a simpler purpose than explanation, in that only sufficient conditions for the occurrence of a phenomenon are sought. That is, one wants a model capable of generating reliable forecasts of outcomes (or outcome distributions) under various possible circumstances in some specified problem domain of interest. Whether this model is capable of elucidating all possible circumstances under which these outcomes would occur is not an issue of concern.

However, what is of concern for forecasting is whether a model is inaccurate. Does the model predict outcomes with misleading likelihoods? In particular, does the model predict outcomes that could never actually be observed? Prediction is a very important element of the action approach, as we explain below.

An important use of models is for exploration and the generation of nonobvious insights into complex phenomena that could not have been obtained without the model. A classic example is Schelling's (1971) tipping point model, which showed that neighborhood segregation could occur

even if most people are racially tolerant. In these cases, the focus is not so much on the "accuracy" of the model, but on "unexpected" results. However, these models are also driven by their purpose (to provide new and important insights, where "new" and "important" are in the eye of the beholder), and their validity cannot be assessed without a deep understanding of that purpose.

Validation for Action

There are many aspects of an action model. An action model needs to relate actions of interest to outcomes of interest. The model does not necessarily need to reveal deep understanding. However, an action model must be timely and accurate relative to its purpose. For example, a model that predicts a hurricane's landfall is useful only if it provides predictions that are timely enough to allow for evacuation in advance of landfall and accurate enough to be taken seriously by those who need to evacuate. An action model is context specific in terms of available resources that help define what is feasible at this time and this place. In the illustration to follow, these issues are fundamental. Validation for action begins with the purpose of the model. Prediction without and with intervention is an important element of an action model.

Consider, now, an action model whose intended use is to provide guidance for the taking of actions in an uncertain environment. The validation process for an action model is necessarily different, but it does incorporate aspects from the validation processes described in the previous section for explanatory and exploratory models. In particular, prediction is important to action.

Specifically, the validation process for an action model must include a careful consideration of the modeled action choices, including no intervention. For example, have these action choices been specified in a suitably realistic or feasible way?
And have enough action choices been included in the action domains of decision makers to permit them to display a realistic degree of flexibility in the face of changing and possibly unanticipated conditions? Appropriate modeling of action choices will not eliminate the uncertainty inherent in a situation, but it should help to clarify the possible action alternatives and hence provide useful guidance regarding the best action to take.

We start by considering the validation of a simple forecasting model with no action domain. This model is then generalized to an action model, and the implications for validation are considered.

A simple forecasting model with no action domain. The simplest situation is one of pure prediction in which there is no action to be taken. As

an illustration, consider a corn farmer who lives in an area where it might rain or not and who wishes to predict the weather. The weather prediction problem in this corn farmer prediction model can be parsed into a number of distinct modeling issues.

First, what exact form could a weather prediction take? For example, the farmer could focus solely on rain, or he could also take sunlight into account. If the focus is solely on rain, the farmer could consider a simple probability distribution consisting of probability assessments for rain or no rain, or he could consider a more sophisticated probability distribution consisting of probabilities spanning the range of possibilities from no rain to a great deal of rain. The farmer might also choose to collapse this probability distribution into a simple prediction (forecast) concerning whether it will rain or not.

Alternatively, it could be that the farmer ultimately cares only about his corn yield, and he cares about the weather only to the extent that he believes the weather affects his corn yield. The farmer might express this belief by postulating an if-then relationship between weather and corn yield of the form "if A, then B." The contingency condition A might be either "rain" or "no rain," and the result B might then be a specific conditional probability distribution Prob(b|A) for the corn yield b conditional on the realization of A. For example, it could be that the contingency condition "rain" is postulated to result in a two-thirds chance of a high yield and a one-third chance of a low yield, whereas the contingency condition "no rain" is postulated to result in a 50-50 chance of a high or low yield.

If-then relationships permit the formation of compound predictions.
For example, continuing with the above illustration, constructing a corn yield prediction requires the farmer to assess and compound the uncertainty arising from two distinct types of events: the weather A, and the corn yield b conditional on the weather A. By the product rule of probability (the definition of conditional probability, from which Bayes' rule also follows), the joint probability Prob(A∩b) that a specific weather event A and a corn yield b both occur is given by Prob(A)Prob(b|A). Each probability assessment, Prob(A) and Prob(b|A), requires its own form of validation.

Second, what exact form should a weather prediction take, given the purpose that drives the corn prediction model? If the farmer wants only a rain forecast, then a simple assessment of the probability of rain versus no rain might suffice. If the farmer wants a more sophisticated understanding of the weather, he might assess a finer range of probabilities spanning a range of rainfall amounts. If he is interested in constructing a compound prediction of corn yield, then the fineness of his weather probability assessments will presumably depend on the postulated impact of different weather events A on corn yield b; there is no need to separately assess the probability of no rain and very light rain if both events are postulated to have the same effect on yield. Moreover, in addition to forming probability

assessments Prob(A) for weather events A, he will need to assess the conditional probability Prob(b|A) of each possible corn yield b conditional on each possible weather event A.

Third, if an outcome is inherently uncertain, then point predictions regarding this outcome cannot be made with certainty. In the simple prediction model at hand, there is no way to eliminate the inherent uncertainty about the weather, nor should there be. A model cannot eliminate the inherent unknown of whether it will rain or not. A model describes what a modeler thinks he knows, but it also highlights what he does not know. A farmer might be able to say with confidence that the probability of rain is 0.3. Or, more generally, he might be able to provide a complete description of all possible rain events A in terms of probability assessments Prob(A) or by means of a nested sequence of confidence intervals. But he cannot say for sure whether it will rain or not.

To be valid for situations with inherent uncertainties, a model should reflect honestly what is knowable and capture well what is known. It is misleading at best, and quite possibly damaging, to use point estimates as if it is known with certainty what will happen. It is inappropriate to build more into a model than is knowable for the situation.

The corn farmer prediction model at hand is a pure prediction model; the farmer is not faced with the need to choose an action. The following section considers the implications for validation when this simple model is generalized to include action choices for the farmer.

A simple illustrative action model. Consider, now, a corn farmer action model that represents a simple extension of the previous corn farmer prediction model. The corn farmer now has the option of adding fertilizer to his field or not. Consequently, the farmer's action domain consists of two possible action choices: add fertilizer to the field or do not add fertilizer to the field.
(This action validation model is based on a decision theory model, similar to those discussed in Chapter 5.) Some structural aspects of the farmer's problem are assumed to remain the same: the probability that it will rain or not (which assumes the weather is independent of the farmer's action choice) and the possible corn yields realized under rain or no rain in the absence of fertilization. However, the impacts of fertilization on corn yield (for example, in bushels per acre) when it rains and when it does not rain must now also be considered. Specifically, as depicted in Table 8-4, the farmer needs to specify what the corn yield would be under rain and no rain should he choose to add fertilizer to his field or not. This results in four distinct "compound" contingency conditions (combined weather and action states) that could impact corn yield.

As Table 8-4 shows, the corn farmer action model has the same general framing as the corn farmer prediction model, except that the if-then relationships must now be generalized to indicate what will happen for various actions or interventions inserted into the natural order of things.

TABLE 8-4 Contingency Table for the Corn Farmer Action Model

Action | Outcome: Rain | Outcome: No rain
Add fertilizer | What will be the yield with rain and fertilizer? | What will be the yield with no rain and with fertilizer?
Do not add fertilizer | What will be the yield with rain and no fertilizer? | What will be the yield with no rain and no fertilizer?

An important point to stress is that the validation of an action model does not necessarily require the model to generate highly accurate predictions. For example, for the problem at hand, the corn farmer might be able to deduce that the addition of fertilizer to his field will profitably increase his corn yield whether or not it rains, because of a government support program that reimburses farmers for all of their fertilizer costs; that is, fertilizer is free. In this case the farmer's best (most profitable) action choice is clear; he should choose to fertilize his field. He does not need to predict with high accuracy the probability of rain, the probable effects of rain on his resulting corn yield, or the price of corn in order to take the best action.

In short, the validation process for action models is sharply distinct from the validation processes for explanatory, purely predictive, and exploratory models. The primary focus is on taking the best action rather than on the realism of the model or the ability of the model to generate accurate predictions. On one hand, a purely predictive model might not say much about which action to take; a rain forecast by itself does not tell us whether to fertilize or not. On the other hand, good understanding and predictive power could be essential requirements for achieving a useful action model.
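The corn farmer action model can be worked through numerically. The sketch below is illustrative only. It takes the probability of rain (0.3) and the no-fertilizer yield probabilities (two-thirds high under rain, 50-50 under no rain) from the examples in the text, but the yield probabilities under fertilization are hypothetical values invented here, and the function names are assumptions of this sketch. It compounds the weather and yield assessments as Prob(A)Prob(b|A) and then checks whether one action is at least as good under every weather event, in which case the best action does not depend on the rain forecast (given that fertilizer is free).

```python
from fractions import Fraction as F

P_RAIN = F(3, 10)  # illustrative probability of rain used in the text

# Prob(high yield | action, weather). The "no fertilizer" rows follow the
# text's example; the "fertilizer" rows are hypothetical.
P_HIGH = {
    ("no fertilizer", "rain"): F(2, 3),
    ("no fertilizer", "no rain"): F(1, 2),
    ("fertilizer", "rain"): F(5, 6),      # hypothetical
    ("fertilizer", "no rain"): F(3, 5),   # hypothetical
}
WEATHER = ("rain", "no rain")


def joint(action, p_rain=P_RAIN):
    """Joint distribution Prob(A and b) = Prob(A) * Prob(b | A) for one action."""
    p_weather = {"rain": p_rain, "no rain": 1 - p_rain}
    table = {}
    for w in WEATHER:
        table[(w, "high")] = p_weather[w] * P_HIGH[(action, w)]
        table[(w, "low")] = p_weather[w] * (1 - P_HIGH[(action, w)])
    return table


def prob_high(action, p_rain=P_RAIN):
    """Marginal probability of a high yield under the given action."""
    return sum(p for (_, y), p in joint(action, p_rain).items() if y == "high")


def dominates(a, b):
    """True if action a is at least as good as b under every weather event
    and strictly better under at least one."""
    at_least = all(P_HIGH[(a, w)] >= P_HIGH[(b, w)] for w in WEATHER)
    strictly = any(P_HIGH[(a, w)] > P_HIGH[(b, w)] for w in WEATHER)
    return at_least and strictly


print(prob_high("no fertilizer"))               # 11/20
print(prob_high("fertilizer"))                  # 67/100
print(dominates("fertilizer", "no fertilizer")) # True
```

Because fertilizing dominates in this hypothetical table, it remains the better choice for any probability of rain between 0 and 1; an accurate rain forecast would matter only if the two actions ranked differently under different weather events.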
A good understanding of weather and how weather affects crop yield under different fertilization conditions could be essential for deciding whether to fertilize or not.

We now illustrate a more involved situation, in which the validity of an action model depends critically on the model's descriptive and predictive accuracy.

A more complicated action model. Consider a more complicated action model involving the following hypothetical decision: Should a military force enter a potentially hostile village in order to establish a relationship with the local militia leaders, and if so, which entry mode should be chosen?

The outcome resulting from each possible choice of a village entry mode depends on the degree of hostility of the mayor and the presence (or not) of a local resistance group. The purpose or goal is simply to occupy important terrain and minimize casualties, those of both the military force and the villagers.

This village deployment action model could be entirely framed in terms of if-then relationships connecting compound contingency conditions to ultimate outcomes, in which each compound contingency condition involves a village entry choice together with a mayor hostility level and the presence or absence of a local resistance group. To focus attention on action choice, however, it is useful to separate out the entry choice from the latter two nonaction contingency condition aspects. In particular, it is useful to think of these nonaction contingency condition aspects as constituting a scenario conditioning the choice of an action. The contingency table for this model is depicted in Table 8-5. Note that only three of the four possible scenarios are depicted for ease of exposition.

Validation of this village deployment action model involves three critical considerations. First, how appropriate is the action domain listed down the left-hand side of the table for the problem at hand? The action domain is the set of possible actions that can reasonably be taken in this situation. There are two possible kinds of errors here. The action domain might be poorly specified. For example, are the village entry choices in the table truly relevant and feasible given the available resources, the situation constraints, and the time available to take action? Alternatively, the action domain might be incompletely specified. For example, the village entry choices in the table are classified only by degree of armament, ignoring possibly critical timing issues (e.g., enter at dawn versus enter at night).
In addition, other ways to enter the village (e.g., with an initial leaflet drop or bombardment) might be feasible.

Second, are the scenarios listed along the top in Table 8-5 appropriately specified for the situation at hand? The scenarios are the set of possible conditions that might arise that we do not control or determine. In particular, are the contingency condition aspects that form the basis for these scenarios both reasonably accurate and reasonably complete? For example, it might be the case that the initial attitude of the mayor and the existence (or not) of a resistance cell are not the only important aspects to consider for the characterization of the initial conditions. Another attribute of equal or greater importance might be whether the civilians (i.e., the inhabitants of the village) are religious or not.

Even assuming the initial attitude of the mayor and the existence (or not) of the resistance cell are correctly identified as the two most important aspects to consider in conjunction with village entry mode, only three of the four possible combinations of these two aspects are analyzed for the village

in the table. A fourth possible combination (a nonhostile mayor together with the existence of a resistance cell) is not considered. By excluding this fourth combination, the modeler is effectively concluding either that it is impossible or that it is so improbable that it is not worthwhile to consider.

TABLE 8-5 Contingency Table for Village Deployment Action Model

Action | Outcome When the Mayor Is Hostile and There Is a Resistance Cell | Outcome When the Mayor Is Hostile and There Is No Resistance Cell | Outcome When the Mayor Is Not Hostile and There Is No Resistance Cell
Do not enter | [None] | [None] | [None]
Enter with firepower evident; return defensive fire only after receiving sporadic gunfire from snipers | The mayor does nothing to stop active resistance; fire is returned; it is likely that a few civilians are killed; one or two soldiers are wounded | The mayor will organize a demonstration; a few civilians are roughed up; no casualties | The mayor attempts to negotiate while the citizens resist passively; a few civilians are detained; no casualties
Enter with firepower evident | The above with a lower probability of killing villagers | Do not know | Same as above
Enter with small group to negotiate | The mayor negotiates and there is a high probability that the small group will be held captive; no casualties; no terrain occupied | The mayor negotiates in the town square and finally lets the small group go; no casualties; no terrain occupied | The mayor negotiates for food and medical supplies; no casualties; no terrain occupied
Enter with food and medical supplies | The mayor forbids the distribution of the food and medicine and the cell initiates an exchange of fire; no casualties; terrain occupied | The mayor forbids the distribution of the food and medicine and demands that the troops leave the area; no casualties | The mayor is welcoming and negotiates for more food and medicine; no casualties; terrain occupied
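The completeness checks on scenarios can be made mechanical once a contingency table is held as data. The following sketch is one possible encoding of Table 8-5, not a representation used in the report: the tuple encoding of scenarios, the abbreviated outcome strings, and the helper names are assumptions introduced here. It enumerates every combination of the two scenario aspects to find combinations the table omits, and it lists cells whose outcome is recorded as unknown.

```python
from itertools import product

# Scenario aspects: mayor attitude and whether a resistance cell exists.
MAYOR = ("hostile", "not hostile")
CELL_PRESENT = (True, False)

# The three scenario columns that appear in Table 8-5, in column order.
SCENARIOS_IN_TABLE = [("hostile", True), ("hostile", False), ("not hostile", False)]

# One row per action; outcomes are abbreviated from the table cells.
TABLE = {
    "Do not enter": ["[None]", "[None]", "[None]"],
    "Enter with firepower evident; defensive fire only": [
        "civilians likely killed; soldiers wounded",
        "demonstration; no casualties",
        "passive resistance; no casualties",
    ],
    "Enter with firepower evident": [
        "as above, lower probability of killing villagers",
        "Do not know",
        "same as above",
    ],
    "Enter with small group to negotiate": [
        "group may be held captive; no terrain occupied",
        "group released; no terrain occupied",
        "negotiation; no terrain occupied",
    ],
    "Enter with food and medical supplies": [
        "exchange of fire; terrain occupied",
        "troops told to leave; no casualties",
        "welcomed; terrain occupied",
    ],
}


def missing_scenarios():
    """Combinations of scenario aspects that have no column in the table."""
    return set(product(MAYOR, CELL_PRESENT)) - set(SCENARIOS_IN_TABLE)


def unknown_cells():
    """(action, scenario) pairs whose outcome the modeler could not state."""
    return [(action, scenario)
            for action, outcomes in TABLE.items()
            for scenario, outcome in zip(SCENARIOS_IN_TABLE, outcomes)
            if outcome == "Do not know"]


print(missing_scenarios())  # {('not hostile', True)}
print(unknown_cells())
```

A check of this kind finds only structural gaps. Whether the omitted scenario is truly negligible, and whether the stated outcomes are credible, remain judgments for the modeler and the subject-matter experts.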
Third, are the if-then relationships mapping the contingency conditions (scenario-action pairs) into possible outcomes appropriately specified and explained? The cells in Tables 8-4 and 8-5 contain the outcomes for the scenario-action pairs. In the corn farmer action model, it is assumed that

fertilizer (or not) and rain (or not) are the only two contingency conditions of importance and that corn yield is the only important outcome variable. For the village deployment action model, however, the if-then relationships mapping the contingency conditions into possible outcomes are inherently much more complicated.

Specifically, as seen in Table 8-5, the village deployment action model is assumed to have three important aspects making up each contingency condition: the village entry mode, the village mayor's initial hostility level, and the existence (or not) of a local resistance cell. Given any particular combination of the latter two aspects, the military chooses a village entry mode. Given any particular combination of these three aspects, there is then a response by the mayor and a response by the resistance cell (if present). The overall combination of all of these events then determines an ultimate effect on civilians and the mission of entering the village.

Also, the if-then relationships connecting contingency conditions to possible outcomes in Table 8-5 might not be appropriately specified. For example, it is assumed that a definite outcome results under each possible contingency condition when, in fact, a great deal of residual outcome uncertainty might remain (e.g., the level of civilian losses might still be uncertain). Moreover, these outcomes might be incompletely specified (e.g., the casualty rate among soldiers is currently ignored in some cells).

A template for validating action models. As developed above, the validation of an action model should begin with the purpose of action, not the model itself. One cannot assess the validity of an action model without first knowing its purpose. Next, the validation of an action model will typically be demanding, involving specification of an action domain, scenarios, and if-then relationships.
The best approach to follow for carrying out this validation will depend on the purpose of the model. In brief, validation should establish the purpose of the model, list the possible actions or interventions for the purpose, specify the scenarios that depict the uncertainties or unknowns that are inherent and cannot be eliminated, and develop the if-then relations between the possible actions and the possible uncertainties; that is, predict the outcomes, which might be multidimensional and uncertain as well.

The validation of an action model is a challenge that does not lend itself to a well-specified set of detailed procedures; the action model gives a template for which the details must be filled in. Nevertheless, the examples presented in the previous sections suggest a reasonable order of concerns, which is summarized below.

Before an action model can be validated, the purpose must be specified. One cannot proceed with the validation without this specification. In the corn farmer example, the purpose of the farmer is to make a high profit. In

the military force example, the purpose of the military force is to occupy the village with a minimum of casualties.

First, is the action domain appropriately specified for the situation with regard to an operational specification of each action and the completeness of the possible actions? For the farmer example, is to fertilize or not an appropriately complete specification of the farmer's possible actions? For the military force example, are the four ways to enter the village an appropriately complete specification of the force's possible actions?

Second, are the considered scenarios appropriately specified? Are rain and no rain the appropriate scenarios for the farmer? Are the mayor's and resistance cell's actions appropriate? Does the range of considered scenarios cover the range of situations in which actions might actually have to be taken?

Third, is each if-then relationship connecting a contingency condition to a possible range of outcomes specified with an appropriate level of realism and prediction? For the farmer, the corn yield is important, as is the price. For the military force, the mayor's reaction and then the resulting effect on the occupation and casualties are important.

The specifications of these three key model features (action domain, scenarios, and if-then relationships) are interdependent, and all three aspects need to be carefully considered for the overall validation of the model. Each presents a different problem for the modeler. The specification of an appropriate action domain requires a deep understanding of what actions are feasible and reasonable for a situation. The specification of appropriate scenarios requires a deep understanding of what is likely to be known and not known (and what is knowable) about a situation at hand.
The specification of appropriate if-then relationships requires a deep understanding of the causal structure connecting contingent conditions (scenario-action pairs) to potential outcomes.

The specification and validation of if-then relationships is particularly difficult for systems involving multiple interacting human beings with capabilities for learning and social communication. The largest source of uncertainty in social systems is behavioral uncertainty, that is, uncertainty regarding what other people will do. It is for this reason that the "then" parts of the if-then relationships postulated for social systems will generally have to be in the form of multidimensional subjective probability assessments giving likelihoods for a range of possible outcomes. These probability assessments will inherently be subjective judgments based on an understanding of individual and group behavior gleaned from observations, surveys, human subject experiments, and biological and physical considerations.

There are a number of common errors that are to be avoided. It is a frequent error to develop a simple predictive model that assumes no action

or intervention and that does not explicitly specify the scenarios. The possible mistakes are, first, that the model might be used for action or intervention without consideration of the critical outcomes and, second, that because the model does not specify the scenarios, it is easy to assume inappropriately that the model applies to all scenarios. Without a specification of the actions and the scenarios, a model is quite limited.

The uncertainty inherent in most action models involving individuals and organizations makes validation difficult. Inability to conduct repeated experiments is a key issue. We can observe rainfall and corn yield over many years. Consequently, statistical averages might be both available and useful for a corn farmer contemplating these types of events. However, the general lack of repeated experience for those contemplating the entry of a village might make it impossible to use simple descriptive statistics based on averaging. At best, there will be a range of possible outcomes with subjective estimates about what will occur. Because the largest source of uncertainty in social systems is behavioral uncertainty, the if-then relationships postulated for social systems will typically have to be stochastic, giving a range of possible outcomes for each contingency condition together with a probability assessment for each outcome.

Military Approaches to Verification, Validation, and Accreditation

The Defense Modeling and Simulation Office (DMSO) has devoted considerable effort to the development of definitions, processes, and tools for V&V of models and simulations; formal definitions of terms and concepts are given in DoD Directive 5000.59-M, Glossary of Modeling and Simulation Terms. Additional information and a larger glossary can be found at the website devoted to VV&A, the DoD VV&A Recommended Practices Guide (http://vva.dmso.mil/).
A simplified sketch of how VV&A relates to the overall process of M&S development is given in Figure 8-9. Ideally, the M&S process begins with the development of a conceptual model, proceeds to the design and implementation of a simulation of that model, and ends with testing and evaluation, allowing for "spirals" of iterative design-development-testing over time. In parallel, the VV&A process begins with validation of the conceptual design, verification that the design and its implementation properly instantiate the conceptual model, and validation of the test results.

Footnotes:
For a more detailed discussion of model validation for social systems, see Carley and Svoboda (1996), Fagiolo, Windrum, and Moneta (2006), and the extensive resources available at http://www.econ.iastate.edu/tesfatsi/empvalid.htm.
The figure shows the process for new M&S development; the process for VV&A of existing M&S systems is considerably more complicated (http://vva.dmso.mil/Role/VVAgentLegacy/default.htm).

[Figure 8-9 not reproduced; it pairs a row of development steps labeled "Develop New M&S" with a parallel row of steps labeled "V&V Process".]

FIGURE 8-9 Interaction between V&V and new development activities. SOURCE: Adapted from http://vva.dmso.mil [accessed Feb. 2008].

In effect, the validation tasks focus on ensuring that the model adequately represents that portion of the real world being modeled, and the verification tasks focus on ensuring that the simulation adequately implements the model. Accreditation is shown as the last step in the process, the point at which the accrediting agency (the owner of the simulation) places its stamp of approval on the validation results.

The traditional focus for DMSO M&S development has been the physical battlespace environment (terrain, ocean, atmosphere, etc.), with an emphasis on conventional warfare. Simulation and VV&A of the entities that populate the environment has typically been left to the separate services or "accrediting" agencies: the Air Force has been responsible for development and VV&A of F-16E aircraft, the Army for Bradley Fighting Vehicles, etc.

It is also fair to note that the great preponderance of M&S entity development has been devoted to "platform" entities, not the human decision makers who "drive" those entities, either at a one-on-one level (e.g., the pilot of an F-16E) or at a higher command and control level (e.g., the Joint Force Air Component Commander).
As a result, corresponding VV&A efforts have been equally unbalanced, with the majority of the effort devoted to the VV&A of systems for which there is a strong conceptual model (e.g., a behavioral law for platform kinematics, such as "distance equals speed multiplied by time") and a well-understood protocol for validating that model against the real-world behavior of the entity being modeled (e.g., measuring time, speed, and distance of the moving platform in the field). VV&A of IOS models is particularly problematic, because of both the lack of clear

and generally accepted conceptual models and the difficulty of conducting "clean" human-in-the-loop experiments, unconfounded by large differences across individuals, inadequate experimental controls over variables that subtly influence human behaviors, learning and adaptation over repeated experimental trials by individual experimental subjects, etc.

These shortcomings were identified in an earlier National Research Council (NRC) study, which noted as its primary conclusion that the M&S community should adopt a general framework for developing and accrediting models over three time horizons (National Research Council, 1998):

• Short term
  o Collect and disseminate human performance data.
  o Support incremental improvement for selected models.
  o Create accreditation procedures for human behavior modeling.
• Mid-term
  o Extend task analysis efforts, for example, STRICOM's Common Model of the Mission Space.
  o Support sustained human behavior modeling development in focused domains (e.g., AFRL's Agent-Based Modeling and Behavior Representation (AMBR) Air Traffic Control Testbed).
• Long term
  o Support theory development and basic research.

At the time that the earlier report was written, the focus of most DoD M&S was on conventional platform-dominated nonurban warfare. The report was similarly focused. It is now clear that considerably more emphasis has to be given to the development of models that span the space from the individual decision maker, to small groups, to urban populations, and even to entire national and transnational populations. As noted in Chapter 5, we need to account not only for "nominal" human behaviors, but also for those colored by individual differences (e.g., personality traits) and ethnic/religious/cultural influences. And, since no individual operates in a vacuum, we also need to account for the influences of the organizational structures and social networks mediating human intercourse.
This is a tall order for the M&S community, but efforts have started. Since the 1998 NRC study, the Air Force has initiated a number of programs:

• AFRL/HE workshops on Adversarial Modeling (2002), Cognitive Engineering (2002), Cognitive Modeling, Science, and Engineering (2003), and Representing Personality and Culture (2003).
• The Aeronautical Systems Center Engineering Directorate's SAMPLE program to develop agent-based pilot models to populate the SIMAF engagement simulation.

• The AFRL/HE AMBR Program, directed at developing intelligent agents to mimic human behaviors.
• A nascent effort in modeling human behavior by the Behavioral Research Branch of the National Air and Space Information Center.

The other services have likewise started comparable efforts in this area, most notably the Office of Naval Research's Affordable Human Behavior Modeling Program, having the following major goals (see http://www.onr.navy.mil/sci_tech/34/342/training_afford.asp):

• Reducing the time-consuming knowledge engineering needed to define the fundamental behaviors to be modeled for a particular operator in a given military role.
• Reducing the M&S construction effort (model concept design and simulation development) required to develop simulations of the desired human behaviors.
• Identifying processes to ensure reusability of models and model components.
• Developing improved V&V techniques for the developed models.

This is one of the few programs that directly addresses V&V issues.

Another major finding of the NRC report on human behavior modeling (National Research Council, 1998) was that substantial effort needs to be invested in the development and VV&A of larger scale models that go beyond the representation of individual humans, to begin to address collections of individuals, from small teams, to groups, crowds, urban populations, and even nation-states. This was echoed by the DMSO-sponsored conference on organizational simulation held in 2003 (Rouse and Boff, 2005), which attempted to address qualitative and quantitative changes in the fundamental M&S issues associated with modeling larger groups of individuals.
Although a number of novel and disparate approaches were proposed and described, only a very small fraction of the proceedings directly addressed critical VV&A issues for this class of behavioral models, such as:

• What constitutes behavior prediction/forecasting for multiple interacting simulated human entities? Should we relax our need for prediction accuracy and instead be satisfied with robustness in anticipating the range of future possibilities?
• How does one go about validating a conceptual model when the model is still being formulated? Are there different levels of validation that apply?

• How does one verify the resulting model simulation, when so many of the interesting qualities of behavior reflect idiosyncratic and stochastic activities that change over time due to learning by the individual and socialization by the group?
• Who should be the accrediting agency?

There are, however, a number of ongoing efforts aimed at advancing the application of large-scale organizational modeling to DoD questions of interest, most notably the Joint Forces Command's Urban Resolve Program, which focuses on urban operations and how M&S can be used to explore and define urban operations war-fighting capabilities for the future joint force commander. The first phase focuses on intelligence, surveillance, and reconnaissance operations in the urban environment, using a high-entity count simulation of Jakarta built on top of JWARS, which, in turn, is built on top of the existing OneSAF simulation framework (see Chapter 2). Although the development plan for JWARS originally included a detailed VV&A plan for conventional warfare scenarios,[10] it is unclear whether V&V efforts for Urban Resolve went any further than the "looks ok" test. This is not atypical of large-scale simulations in general.

Finally, we note that AFRL/HE has begun the process of attempting to formalize the VV&A process for individual human performance models, cognitive models, and group representations developed or "owned" by the directorate, via the publication of the AFRL/HE Instruction 16-03 (Brinkley, 2003). We believe this is a good start in this particularly difficult area.

Validation Issues Specific to Individual Modeling Approaches

In this section we review the validation challenges and approaches that are specific to various modeling approaches used for IOS models.
Validation of Conceptual Models

Verbal conceptual models are sometimes specific enough that they can be tested and plausibly falsified, using empirical field studies or controlled experiments. For example, Fiske and colleagues have used social cognition experiments to demonstrate that people organize acquaintances in memory according to the dominant model that organizes the relationship, in studies of subjects from Bengali, Chinese, Korean, Vai (Liberia and Sierra Leone), and U.S. cultures (Fiske, 1992), and that for many subjects this classification accounts for more variance in recall and substitution errors than personal attributes, such as gender, race, and age.

[10] This information is based on the program overview at http://www.msiac.dmso.mil/spug_documents/JWARS_Overview_Brief.ppt.

In contrast to such well-developed conceptual frameworks, broad metaphors (brains as information-processing devices, organizations as cultures) are not really subject to verification or falsification. Whether or not they are used in a particular domain is likely to depend largely on face validity and established precedent. In evaluating the usefulness of a broad conceptual model, the yardstick is often not how well supported the model is, but how much interesting research it inspires. Even when a verbal model seems, in principle, to be subject to falsification, the underspecification of relations and processes often means that a rather broad array of different outcomes can be presented as consistent with the theory. As Harris (1976) noted in his paper entitled "The Uncertain Connection Between Verbal Theories and Research Hypotheses in Social Psychology," theoretical terms often are not defined, boundary conditions are unspecified and, under various plausible interpretations of assumptions or conditions, several well-known theories include internal contradictions and inconsistencies (cited in Davis, 2000).

Validation of Cultural Models

Cultural inventory models rely on ethnographic observation and are therefore both time-consuming to develop and highly subjective. Having multiple independent observers helps ameliorate the subjectivity problem, but it is expensive.

Dominant trait models, such as the Hofstede dimensional models, can involve two sets of data. The first set is used to derive the dimensions. These can be validated by a number of different statistical methods, such as factor analysis. Once these are fixed, another set of data is obtained to score each new culture on the dimensions.
These data have to be obtained from willing natives of the culture, and the data have to be updated over time because cultures change.

Validation of Cognitive Models

While there is increasing emphasis on validation of cognitive architectures, validation remains one of the most challenging aspects of cognitive architecture research and development. "[Human behavioral representation] validation is a difficult and costly process [and] most in the community would probably agree that validation is rarely, if ever done" (Campbell and Bolton, 2005, p. 365). Campbell goes on to point out that there is no general agreement on exactly what constitutes an appropriate validation of a cognitive architecture. Since cognitive architectures are developed for

a wide variety of reasons, there is a correspondingly wide set of validation (and evaluation) objectives and metrics and associated methods. Lack of established benchmark problems and criteria exacerbates this problem.

Validation of Cognitive-Affective Architectures

In spite of the challenges associated with validation of emotion models and cognitive-affective architectures, progress is being made in the area. A promising trend in emotion modeling is the increasing emphasis on including evaluation and validation studies in publications. As is the case with cognitive architectures, no existing emotion models or cognitive-affective architectures have been validated across multiple contexts and a broad range of metrics. However, some important evaluation and validation approaches and studies exist and are discussed in detail in Chapter 5. Validation of cognitive-affective architectures has not yet reached the stage of systematic comparison that is beginning to be applied to their cognitive counterparts. However, given the recent emphasis on validation in the computational emotion research community, such studies are likely to take place in the near future.

Validation of Agent-Based Models

Agent-based models (ABMs) are computational frameworks that permit the theoretical exploration of complex processes through controlled replicable experiments (see Chapter 6). In principle, these experiments could be run entirely with artificially generated initial conditions, parameter values, and functional forms. Nevertheless, their ultimate usefulness depends on the extent to which they prove capable of shedding light on real-world systems, that is, their ability to enhance understanding and guide decisions and actions.

When validation of ABM frameworks is attempted, the validation is generally restricted to small areas of performance.
A typical approach to validation is to run an experiment using an ABM framework, collect data from this experiment, statistically analyze the results to generate the response surface, and then contrast the response surface with real data. It is easy, even with only a few variables, to generate such a quantity of data from an ABM framework that there are no existing data with which to compare them, no existing statistical package can handle them, and most desktops cannot store them. Therefore, typically only small portions of the overall response surface can be estimated at once. The size of the analyzed response surface is thus often dictated by the user's interests and the critical policy or decision-making questions at issue (i.e., the action domain and the scenarios relevant to that domain, as discussed above).
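The typical approach just described can be sketched in a few lines. The toy simulation below is only a stand-in for an ABM (a hypothetical run_abm in which each agent adopts a behavior with a fixed probability), the parameter grid is deliberately tiny, and the "observed" values are invented placeholders, not field data. The point is the workflow: sweep a small region of the parameter space, average replications to estimate that portion of the response surface, and contrast it with real data at the grid points where such data exist.

```python
import random
from statistics import mean


def run_abm(adopt_prob, n_agents, seed):
    """Toy stand-in for one ABM run: fraction of agents that adopt."""
    rng = random.Random(seed)
    adopters = sum(rng.random() < adopt_prob for _ in range(n_agents))
    return adopters / n_agents


def response_surface(adopt_probs, agent_counts, replications=30):
    """Mean outcome at each grid point, averaged over seeded replications.

    The same seeds are reused at every grid point, so the experiment is
    replicable and grid points are compared under common random numbers.
    """
    surface = {}
    for p in adopt_probs:
        for n in agent_counts:
            runs = [run_abm(p, n, seed) for seed in range(replications)]
            surface[(p, n)] = mean(runs)
    return surface


def rmse(surface, observed):
    """Root-mean-square gap over grid points that have observed data."""
    shared = [k for k in observed if k in surface]
    if not shared:
        raise ValueError("no grid point has matching observed data")
    squared = sum((surface[k] - observed[k]) ** 2 for k in shared)
    return (squared / len(shared)) ** 0.5


if __name__ == "__main__":
    surface = response_surface([0.2, 0.5], [50, 200])
    # Hypothetical observations, available at only two of four grid points.
    observed = {(0.2, 50): 0.25, (0.5, 200): 0.48}
    print(round(rmse(surface, observed), 3))
```

Even in this toy setting the comparison covers only the grid points for which observations exist, which mirrors the practical restriction to small portions of the response surface.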

ABM researchers have recently begun to explore promising new approaches to validation. For example, a number of them are now advocating iterative participatory modeling (IPM) as an effective way to incrementally achieve validation of the structural, institutional, and behavioral aspects of the complex systems they study. For an introductory exposition of IPM, see Barreteau (2003). The essential idea is to have multidisciplinary researchers join with stakeholders in a repeated looping through a four-stage modeling process: (1) field study and data analysis, (2) scenario discussion and role-playing games, (3) ABM development and implementation, and (4) intensive computational experiments.

The new aspect of IPM relative to more traditional participatory modeling approaches is the emphasis on modeling as an open-ended collaborative learning process. The modeling objective is to help stakeholders manage complex problems over time through a continuous learning process rather than to attempt the delivery of a definitive problem solution.

In addition, ABM researchers are also beginning to explore the potential benefits of conducting parallel experiments with real and computational agents for achieving improved validation of their behavioral assumptions.[11] A critical concern is how to attain sufficiently parallel experimental designs so that information drawn from one design can usefully inform the other.

Recommendations for Developing and Validating IOS Models

We have argued that IOS models should be validated beginning with the purpose and then considering the action set, scenarios, and if-then relations in the specific situation. The committee makes a number of suggestions for modeling and simulations that will facilitate the validation of a specific model.
Check with Multiple Experts

Four different experts should examine an IOS model: the users of the model, the scenario experts, the if-then or domain experts, and the modelers themselves. Modelers cannot examine a model by themselves; they tend to focus on the verification with less emphasis on the purpose of the model. For an action model, the user is very important to check the relevance and feasibility of the action set. The scenario expert should examine the uncertainties and unknowns. Domain experts are particularly knowledgeable about the if-then relationships. However, their knowledge is not necessarily framed in this manner, so some adjustment may be required. For example, domain experts know about "what is" and "what has been" but may be less certain about "what might be" outcomes. However, they are likely to point out errors in the models for what might be and limits of what is known.

[11] See http://www.econ.iastate.edu/tesfatsi/aexper.htm for annotated pointers to ABM research on parallel experiments with real and computational agents; see also the survey by Duffy (2006).

Each expert can contribute to the validation of an action model. It is unlikely that any single expert can ensure a valid action model alone. The structure and content of the model provide a template for a procedure by which multiple experts can validate different aspects of an integrated action model.

Keep the Model as Simple as Possible for Its Purpose

An IOS model does not have to be complex. Parsimonious models are preferred. The corn farmer action model is simple and does not capture the complexity of weather forecasting or the chemistry of fertilizers. But it is understandable and permits the farmer to make a decision and take action. Action models that are intuitively understandable to decision makers (transparent) are preferred. An action model that is disconnected from a decision maker's intuition and from concepts he or she is familiar with does not permit interplay between the decision maker and the model. In short, complicated, nonintuitive action models require decision makers to accept the implications of the models on blind faith. Action models should aid decision makers, not replace them.

Examine "What Might Be" as Well as "What Is"

"What is" should mimic the real world within limits. "What is" models are a basis for "what might be." A model that has little or no correspondence with the real world is not likely to be relevant for what might happen. What might be is very important for action models, particularly in new situations (Burton, 2003). Many of the relevant action-scenario combinations have not been observed in the past. So the model must be relevant for action beyond what is or what has been to new situations.
For example, it would be desirable if the illustrative village deployment action model could be used reliably in other similar situations, say for the withdrawal from a village as well as entry. But it is not likely that the model could be used to help plan an action to disarm a resistance cell. Presumably this would require a more detailed model of the functioning of the cell. Whether it would be desirable to develop one model to handle both entry and cell disarmament or two separate models would presumably depend on economies of scope (is there anything to be gained by considering both issues jointly?) and on computational implementation costs.

IOS models should be developed and examined beyond what is to what might be. At

the same time, it is important to examine the limits of the model and not use it in situations in which it might be inappropriate. As suggested above, simplicity is desirable, but it must be balanced so that the action model is useful for its purpose.

IOS models are likely to forecast a range of possible outcomes, some more likely than others, and to incorporate many factors that are highly uncertain and, indeed, unknowable at the time the model is developed. How then can such models be validated? Popper, Lempert, and Bankes (2005) argue that models used to explore policy alternatives for an uncertain future should not be expected to yield predictions that can be tested but rather should be used to explore and compare possible outcomes under a variety of possibilities in order to select strategies that are robust, yielding the best overall results across a variety of possible futures.

Postevent outcomes can also be used to evaluate models, although models are not necessarily incorrect if the actual outcome that occurred was not the one forecast to be most likely. Unlikely events do occur, and many IOS applications do not permit the replication that would generate a distribution of actual outcomes. A very useful approach would be to develop multiple models that take different perspectives and use different theories and data, merge their predictions to create zones of likelihood, and compare their forecasts with the actual outcomes (see Docking below). As with other validation approaches, the value of the model's results depends on its intended use, so the degree to which forecasts need to correspond to reality will depend on the model's purpose.

Use Model Touching for Validation

Model touching is comparison or juxtaposition of models. There are many ways to bring models together. Here is a list:

• Bring experts (as described above) together to develop and examine the model.
• Compare the action model with qualitative studies for the situation or domain.
• Check with other studies that might be empirically based on data from the field or from experiments.
• Compare with computational models that are based on field data.

Docking. Docking is the bringing together of two models, a metaphor borrowed from space exploration. More precisely, docking is an evaluation of the extent to which two or more different models of the same action situation can be cross-calibrated so that they yield the same outcome (or outcome probability distribution) given the same contingency condition

(Axtell, Axelrod, Epstein, and Cohen, 1996). Docking goes beyond model touching to compare in more detail. It can provide a better understanding of the true connections relating the three key elements of an action model: the actions, the scenarios, and the possible outcomes resulting under each contingency condition (scenario-action pair). Docking gives confirmation that we have a reasonable understanding of an action situation, and that our conclusions are being driven by the intrinsic nature of the action situation and not by idiosyncratic aspects of the model implementation.

One possible approach is to compare how different models perform under the same benchmark action-scenario combination, which can provide insight into how different models define actions and how they structure if-then relationships. That is, for an action model, take the same action possibilities and the same unknown scenarios, then develop two separate if-then relationship models. Develop and compare the outcome tables for the two models. Are the outcomes the same? If not, why? One must go behind the model outcomes and examine the details of the models to understand their differences. Individuals who are expert in the subject are critical in judging the models and their value. Docking should involve experts throughout the process, as discussed above. Docking of multiple modeling approaches against common benchmark problems using a panel of expert judges has recently been used to provide considerable insight into individual cognitive performance models (Gluck and Pew, 2005).

At this time, there is a need to develop benchmark scenario-action situations that can be used to dock two or more models. This effort will involve action, scenario, and if-then experts. With these benchmarks, docking studies can add greatly to the development of action models.
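One way to make the outcome-table comparison concrete is sketched below, under stated assumptions. Each model is reduced to a table that maps a contingency condition (scenario-action pair) to a probability distribution over outcome labels, and agreement is scored with total variation distance. The distance measure, the 0.1 tolerance, the two toy tables, the condition and outcome labels, and the function names are all illustrative choices, not part of the docking procedure described by Axtell and colleagues.

```python
def total_variation(dist_a, dist_b):
    """Total variation distance between two outcome distributions."""
    outcomes = set(dist_a) | set(dist_b)
    gaps = (abs(dist_a.get(o, 0.0) - dist_b.get(o, 0.0)) for o in outcomes)
    return 0.5 * sum(gaps)


def dock(table_a, table_b, tolerance=0.1):
    """Compare two outcome tables condition by condition.

    Returns {condition: distance} for every condition where the models
    disagree by more than `tolerance`; a condition covered by only one
    model is reported with a distance of None.
    """
    disagreements = {}
    for condition in sorted(set(table_a) | set(table_b)):
        if condition not in table_a or condition not in table_b:
            disagreements[condition] = None
            continue
        distance = total_variation(table_a[condition], table_b[condition])
        if distance > tolerance:
            disagreements[condition] = round(distance, 3)
    return disagreements


# Two hypothetical if-then models of the same benchmark conditions.
MODEL_A = {
    ("hostile_mayor", "negotiated_entry"): {"cooperation": 0.6, "resistance": 0.4},
    ("hostile_mayor", "forced_entry"): {"cooperation": 0.1, "resistance": 0.9},
}
MODEL_B = {
    ("hostile_mayor", "negotiated_entry"): {"cooperation": 0.55, "resistance": 0.45},
    ("hostile_mayor", "forced_entry"): {"cooperation": 0.4, "resistance": 0.6},
}

if __name__ == "__main__":
    print(dock(MODEL_A, MODEL_B))
```

A comparison like this only flags where two models disagree. Explaining why they disagree still requires going behind the outcome tables into the details of each model, which is where expert judgment comes in.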
Given the current state of the art, the participation of experts in the docking process is essential. The next best step in validation is to support docking studies among experts who develop computation-based models. Automated machine docking of two or more models is a very high-risk endeavor at present. At a later stage of understanding, we may be able to develop a computationally based approach to the docking of models. But for now, experts and their judgment are mandatory.

Triangulation. Triangulation goes beyond docking and involves examining the same action domain using an action model, an expert group using a qualitative approach, and reference to quantitative studies in the domain. An action model validated using multiple approaches is more likely to help the decision maker take actions that meet the purpose. However, a large number of triangulations are often possible. We do not know a priori what the best triangulation is for a given situation, but it is quite likely that a good triangulation will be situation dependent.

Exploratory testing of robustness. Miller (1998) proposes active nonlinear tests for complex models to validate the model's structure and robustness. In this approach, automatic nonlinear search algorithms probe for extreme outcomes that could occur within the set of reasonable model perturbations. This multivariate sensitivity technique can find places where a complex model "breaks," that is, produces results that are outside a range of reasonable predictions.

In summary, universal rules about what is the appropriate procedure for validating IOS models are not possible. However, we recommend the validation of models through a three-part triangulation process, based on the purpose of the model. Validation should involve (1) participation by multiple experts who can provide different perspectives on the action domain, the scenarios, and the if-then rules incorporated in the model; (2) docking of similar computational models against one another; and (3) comparison to qualitative and theoretical studies and previous quantitative results and exploratory testing for a range of outcomes. A good heuristic would be to begin with the experts as discussed above and move as quickly as possible to docking studies and exploratory testing.

Data Issues and Challenges

Data can be used in two different ways in modeling. When models are developed inductively from data, the quality of the data is extremely important. In that case the data are broader in scope and limited only in a very general manner. For example, an anthropologist sees different things than an engineer in the same situation. For existing models, the data are prescribed by the model, and the quality of data is extremely important. Here again, the data yield values for the model parameters and make the model specific to a given situation and problem. The data requirements are driven by different modeling needs.
For each situation, quality data are needed and are important to the usefulness of the model. This means that even the most promising, sophisticated, and elegant models may be severely limited or hampered by specific data needs and requirements. Thus, data issues are an essential component for assessing the ultimate success for model development, validation, and applications. A number of potential data factors need to be considered in the course of conceptualizing and developing models. These include but are not limited to the following.

• Primary/secondary: Data may already exist (secondary) or may need to be collected (primary). Obviously, models using secondary source data have some advantages because they require little or no data collection. However, models using such forms of data may be limited by the nature and quality of the data that exist. This might mean the model will be constrained by the type of data available, and such constraints may limit the model's ability to address important issues and problems. Models using primary sources of data have more flexibility, given that they can determine exactly what type of data needs to be collected. However, primary data collection involves its own set of limitations that are reflected in the factors described below.
• Observable/nonobservable: Some data are directly observable, and this may facilitate ease of collection. Phenomena that are not directly observable may require more extensive efforts to uncover the necessary information (e.g., face-to-face interviews).
• Distant/close: Some forms of data can be collected at a distance. This may involve the use of technology, such as cell phones or video links. However, other types of data require actually being there on the ground, as for face-to-face contact or interviews with subjects, respondents, or informants.
• Representative/nonrepresentative: Often model assumptions require data to be collected or compiled in some specific manner. The best example of this is the explicit assumptions underlying classical parametric statistical models that require random samples from a population. There are other models that simply require units of analysis to be representative of a given theoretically important category of some type, and it may be the case that any unit of analysis fitting the categorical criteria will suffice. An important consideration is the extent to which units of analysis used in the model need to be derived by either probabilistic or nonprobabilistic methods (see Johnson, 1990).
• Passive/active: This is related to some of the factors above in that some data can be collected casually or on the fly. Such data may still require being there but may require only documenting or recording naturally occurring events, conversations, or interactions. In contrast, more direct and active methods of data collection may be necessary and will involve, for example, actually interviewing individuals at events or interviewing them about given conversations or interactions.
• Tacit/explicit: Some forms of data require little interpretation or reading between the lines. Other types of data are implicit, and there is a need to make them more explicit. This is particularly true for some forms of human knowledge that are often tacit and may require specific types of elicitation interviewing techniques to extract the requisite information to be used in the model (Johnson and Weller, 2002).
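The six factors above can be recorded as a simple profile of a model's data requirements. The sketch below is only illustrative: the class name, the field names, and the unweighted count are hypothetical, and it assumes that the primary, nonobservable, close, representative, active, and tacit poles are the ones that make data harder to obtain.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DataProfile:
    """One flag per factor; True marks the pole assumed to be costlier."""

    primary: bool         # must be collected rather than already existing
    nonobservable: bool   # not directly observable
    close: bool           # requires being there on the ground
    representative: bool  # requires a designed (e.g., random) sample
    active: bool          # requires interviewing or other direct methods
    tacit: bool           # requires elicitation of implicit knowledge

    def burden(self) -> int:
        """Number of factors at the costlier pole (unweighted assumption)."""
        return sum(getattr(self, f.name) for f in fields(self))


# Two hypothetical extremes: existing archival data versus intensive fieldwork.
archival = DataProfile(False, False, False, False, False, False)
field_study = DataProfile(True, True, True, True, True, True)

if __name__ == "__main__":
    print(archival.burden(), field_study.burden())
```

The count reflects collection effort only; it says nothing about the reliability or validity of the resulting data.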

There are certainly other important factors to be considered in terms of relating models to various data requirements. However, the factors described above potentially reflect impediments to the utility and validity of any proposed model. If, for example, models require data involving forms that are tacit, active, representative, close, nonobservable, and, of course, primary, then the data may be costly to obtain and may limit the models' potential effectiveness given the data constraints.

But this does not address in any way issues of data quality concerning reliability and validity. We can consider the factors above to reflect elements of how hard data might be to collect or obtain. Although some of these factors are related to issues of reliability and validity, they are not necessarily one and the same. Often the data that are the most difficult to collect (e.g., on-the-ground face-to-face interviews) are the data that have the most reliability and validity, whereas data that are the easiest to obtain (e.g., secondary source data) may be the most problematic. The extent to which one trusts the data will ultimately determine the extent to which one trusts model outcomes or predictions.

In summary, even though quality data are extremely important, the operationalization of quality is different for the different demands of the model. One implication is that we need better quality data. Another implication is that we need a better understanding of how we can model, describe, predict, and explain with less than quality data. This further suggests that a better notion is needed of what is meant by quality data for the various models and needs.

References

AFRL/HE. (2002). Proceedings of the Eleventh Conference on Computer Generated Forces and Behavioral Representation, May, Orlando, FL.
Al-Halimi, R., and Kazman, R. (1998). Temporal indexing through lexical chaining. In C. Fellbaum (Ed.), WordNet: An electronic lexical database (pp. 333–351). Cambridge, MA: MIT Press.
Allen, J.G. (2004). Commander's automated decision support tools. Briefing to Proposers' Symposium for DARPA's Integrated Battle Command Program, Dec. 15, Washington, DC.
Axtell, R., Axelrod, R., Epstein, J.M., and Cohen, M.D. (1996). Aligning simulation models: A case study and results. Computational and Mathematical Organization Theory, 1(2), 123–141.
Bachman, J.A., and Harper, K.A. (2007). Toolkit for building hybrid, multi-resolution PMESII models. (Final report #RI-RS-TR-2007-238.) Cambridge, MA: Charles River Analytics. Available: http://stinet.dtic.mil/cgi-bin/GetTRDoc?AD=ADA475418&Location=U2&doc=GetTRDoc.pdf [accessed Feb. 2008].
Barreteau, O. (2003). Our companion modelling approach. Journal of Artificial Societies and Social Simulation, 6(2). Available: http://jasss.soc.surrey.ac.uk/6/2/1.html [accessed April 2008].
Beeferman, D. (1998). Lexical discovery with an enriched semantic network. In Proceedings of the Workshop on Applications of WordNet in Natural Language Processing Systems, Association for Computational Linguistics/Conference on Computational Linguistics (ACL/COLING 1998), Montreal, Canada. Available: http://www.lexfn.com/doc/lexfn.pdf [accessed Feb. 2008].
Box, G.E.P. (1979). Robustness in the strategy of scientific model building. In R.L. Launer and G.N. Wilkinson (Eds.), Robustness in statistics: Proceedings of a workshop. New York: Academic Press.
Brinkley, J. (2003, May 14). AFRL/HE instruction #16-03. Wright Patterson Air Force Base, OH: Air Force Research Laboratory/Human Effectiveness Directorate.
Burton, R.M. (2003). Computational laboratories for organization science: Questions, validity and docking. Computational and Mathematical Organization Theory, 9(2), 91–108.
Burton, R.M., and Obel, B. (1995). The validity of computational models in organization science: From model realism to purpose of the model. Computational and Mathematical Organization Theory, 1(1), 57–71.
Campbell, G.E., and Bolton, A.E. (2005). HBR validation: Integrating lessons learned from multiple academic disciplines, applied communities, and the AMBR project. In K.A. Gluck and R.W. Pew (Eds.), Modeling human behavior with integrated cognitive architectures: Comparison, evaluation, and validation (pp. 365–395). Mahwah, NJ: Lawrence Erlbaum Associates.
Carley, K.M., and Svoboda, D. (1996). Modeling organizational adaptation as a simulated annealing process. Sociological Methods and Research, 25(1), 138–168.
Davis, J.H. (2000). Simulations on the cheap: The Latané approach. In D.R. Ilgen and C.L. Hulin (Eds.), Computational modeling of behavior in organizations: The third scientific discipline (pp. 217–220). Washington, DC: American Psychological Association.
Duffy, J. (2006). Agent-based models and human subject experiments. In L. Tesfatsion and K.L. Judd (Eds.), Handbook of computational economics, volume 2: Agent-based computational economics. Amsterdam, The Netherlands: Holland/Elsevier.
Epstein, J.M. (2006). Remarks on the foundations of agent-based generative social science. In L. Tesfatsion and K.L. Judd (Eds.), Handbook of computational economics, volume 2: Agent-based computational economics. Amsterdam, The Netherlands: Holland/Elsevier.
Fagiolo, G., Windrum, P., and Moneta, A. (2006). Empirical validation of agent-based models: A critical survey. Journal of Artificial Societies and Social Simulation, 10(2). Available: http://jasss.soc.surrey.ac.uk/10/2/8.html [accessed April 2008].
Fiske, A.P. (1992). The four elementary forms of sociality: Framework for a unified theory of social relations. Psychological Review, 99(4), 689–723.
Gluck, K.A., and Pew, R.W. (Eds.). (2005). Modeling human behavior with integrated cognitive architectures: Comparison, evaluation, and validation. Mahwah, NJ: Lawrence Erlbaum Associates.
Haefner, J.W. (2005). Modeling biological systems: Principles and applications. New York: Springer.
Harper, K.A., Pfautz, J.D., Ling, C., Tenenbaum, S., and Koelle, D. (2007). Human and system modeling and analysis toolkit (HASMAT). (Final report #R06651.) Cambridge, MA: Charles River Analytics.
Harris, R.J. (1976). The uncertain connection between verbal theories and research hypotheses in social psychology. Journal of Experimental Social Psychology, 12(2), 210–219.
ITT Research Institute. (2001). Modeling and Simulation Information Analysis Center (MSIAC). Verification, validation, and accreditation (VV&A) automated support tools: A state of the art report, Part 1-Overview. Chicago, IL: Author.
Johnson, J.C. (1990). Selecting ethnographic informants. Thousand Oaks, CA: Sage.
Johnson, J.C., and Weller, S. (2002). Elicitation techniques in interviewing. In J. Gubrium and J. Holstein (Eds.), Handbook of interview research (pp. 491–514). Thousand Oaks, CA: Sage.
Judd, K.L. (2006). Computationally intensive analyses in economics. In L. Tesfatsion and K.L. Judd (Eds.), Handbook of computational economics, volume 2: Agent-based computational economics. Amsterdam, The Netherlands: Holland/Elsevier.
Kaelbling, L.P., Littman, M.L., and Moore, A.W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285.
Langton, J.T., and Das, S.K. (2007). Final technical report for contract #FA8750-06-C-0076. Air Force Research Laboratory, May 16, Rome Research Site, Rome, NY.
Marks, R.E. (2006). Validation and complexity. Working paper, Australian Graduate School of Management, University of New South Wales, Sydney.
Miller, J.H. (1998). Active nonlinear tests (ANTs) of complex simulation models. Management Science, 44(6), 820–830.
National Research Council. (1998). Modeling human and organizational behavior: Application to military simulations. R.W. Pew and A.S. Mavor (Eds.). Panel on Modeling Human Behavior and Command Decision Making: Representations for Military Simulations. Commission on Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.
Popper, S.W., Lempert, R.J., and Bankes, S.C. (2005). Shaping the future. Scientific American Digital, April. Available: http://www.sciamdigital.com/index.cfm?fa=Products.ViewIssue&ISSUEID_CHAR=8E3EADFC-2B35-221B-6A8455DA45AE8B50 [accessed Feb. 2008].
Robbins, M., Deckro, R.F., and Wiley, V.D. (2005). Stabilization and reconstruction operations model (SROM). Presented at the Center for Multisource Information Fusion Fourth Workshop on Critical Issues in Information Fusion: The Role of Higher Level Information Fusion Systems Across the Services, University of Buffalo. Available: http://www.infofusion.buffalo.edu/ [accessed Feb. 2008].
Rouse, W.B., and Boff, K.R. (2005). Organizational simulation. Hoboken, NJ: John Wiley & Sons.
Sageman, M. (2004). Understanding terror networks [web page, Foreign Policy Research Institute]. Available: http://www.fpri.org/enotes/20041101.middleeast.sageman.understandingterrornetworks.html [accessed August 2007].
Sageman, M. (2006). The psychology of Al Qaeda terrorists: The evolution of the Global Salafi Jihad. In C.H. Kennedy and E.A. Zillmer (Eds.), Military psychology: Clinical and operational applications (Chapter 13, pp. 281–294). New York: Guilford Press.
Schelling, T.C. (1971). Dynamic models of segregation. Journal of Mathematical Sociology, 1, 143–186.
Shafer, G. (1976). A mathematical theory of evidence. Princeton, NJ: Princeton University Press.
Sowa, J.F. (1984). Conceptual structures: Information processing in mind and machine. Reading, MA: Addison-Wesley.
Telvick, M. (Producer). (2005, January). Al Qaeda today: The new face of the global Jihad. Arlington, VA: PBS Frontline.
Telvick, M. (2007). Al Qaeda today: The new face of the global jihad [web page, PBS Frontline]. Available: http://www.pbs.org/wgbh/pages/frontline/shows/front/etc/today.html [accessed August 2007].
U.S. Department of Defense. (1995). Modeling and simulation (M&S) master plan. (DoD #5000.59-P.) Washington, DC: Office of the Undersecretary of Defense for Acquisition and Technology.


