PART I Workshop Summary
Workshop Summary

INTRODUCTION

In the summer of 2002, the Office of Naval Research asked the Committee on Human Factors to hold a workshop on dynamic social network modeling and analysis. The primary purpose of the workshop was to bring together scientists who represent a diversity of views and approaches to share their insights, commentary, and critiques on the developing body of social network analysis research and application. The secondary purpose of the workshop was to assist government and private-sector agencies in assessing the capabilities of social network analysis to provide sound models and applications for current problems of national importance, with a particular focus on national security. Some of the presenters focused on social network theory and method; others attempted to relate their research or expertise to applied issues of interest to various government agencies. This workshop is one of several activities undertaken by the National Research Council that bear on the contributions of various scientific disciplines to understanding and defending against terrorism, a topic raised by Bruce M. Alberts, president of the National Academy of Sciences, in his annual address to the membership in April 2002.

The workshop was held in Washington, D.C., on November 7-9, 2002. Twenty-two researchers were asked to prepare papers and give presentations. The presentations were grouped into four sessions, each of which concluded with a discussant-led roundtable discussion among presenters and workshop attendees on the themes and issues raised in the session. The sessions were: (1) Social Network Theory Perspectives, (2) Dynamic Social Networks, (3) Metrics and Models, and (4) Networked Worlds. The opening address was presented by workshop chair Ronald Breiger, of the University of Arizona; Kathleen Carley, of Carnegie Mellon University, offered closing remarks summarizing the sessions and linking the work to applications in national security.
Part II of this report contains the opening address, the closing remarks, and the papers as provided by the authors. The agenda and biographical sketches of the presenters are found in the appendixes. This summary presents the major themes developed by the presenters and discussants in each session and concludes with research issues and prospects for both the research and applications communities.

WORKSHOP SESSIONS AND THEMES

Overall, the workshop provided presentations on the state of the art in social network analysis and its potential contribution to policy makers. The papers run the gamut from the technical to the theoretical, and examine such
application areas as public health, culture, markets, and politics. Throughout the workshop a number of themes emerged, all based on the following understandings:

· Both network theory and methodology have expanded rapidly in the past decade.
· New approaches are combining social network analysis with techniques and theories from other research areas.
· Many of the existing tools, metrics, and theories need to be revisited in the context of very large scale and/or dynamic networks.
· The body of applied work in social networking is growing.

Several common analytical issues underlie much of the research reported. First, traditional social network analysis is "data greedy": very detailed data are required on all participants. Questions to be addressed in the analysis of these data concern how to estimate the data from high-level indicators, how sensitive the measures are to missing data, and how network data can be collected rapidly and/or automatically. Furthermore, advances require the development of additional shareable data sets that capture heretofore understudied aspects such as large-scale networks, sampling errors, linkages to other types of data, and over-time data. Second, traditional social network analysis (especially prior to the last couple of decades)[1] has focused on static networks, whereas much of the work discussed here focuses on the processes by which networks change or emerge. While ongoing data collection and analysis are providing key new insights, researchers need new statistical methods, simulation models, and visualization techniques to handle such dynamic data and to use these data to reason about change. Third, social network theories are beginning to outstrip the measures and data. For example, theories often posit ties as being flexible, probabilistic, or scaled; most data and metrics, however, are still based on binary data.
[1] Early work on dynamic modeling was done by P.S. Holland and S. Leinhardt (A dynamic model for social networks, Journal of Mathematical Sociology 5:5-20, 1977).

Session I: Social Network Theory Perspectives

Presenters and Papers

Discussant: Ronald Breiger

1. Linton C. Freeman, Finding Social Groups: A Meta-Analysis of the Southern Women Data
2. Harrison C. White, Autonomy vs. Equivalence Within Market Network Structure?
3. Noah E. Friedkin, Social Influence Network Theory: Toward a Science of Strategic Modification of Interpersonal Influence Systems
4. David Lazer, Information and Innovation in a Networked World

Themes

The papers in this session illustrate the breadth of areas that can be addressed by social network analysis. On one hand, the work can be used to explain, predict, and understand the behavior of small groups and the influence of group members on one another, as seen in the work of Freeman and Friedkin. On the other hand, social network analysis can be "writ large" and applied at the market or institutional level, as described in the papers by White and Lazer. Regardless of network size, all four papers demonstrate that a structural analysis that focuses on connections can provide insight into how one person, group, or event can and does influence another. These people or groups or events cannot, and do not, act in an autonomous fashion; rather, their actions are constrained by their position in the overall network, which is in turn constrained by the other networks and institutions in which they are embedded (the overall ecology). Further, the papers presented in this session review the range of methodological approaches and styles of analysis that are compatible with a social network approach. Unlike many other scientific methods, the social network approach can be used with ethnographic and field data (Freeman), experimental laboratory data (Friedkin), historical examples (White), and policy/technology evaluation (Lazer).

Freeman provided a framework for assessing comparative analyses of a long-standing object of sociological theory: the small social collectivity characterized by interpersonal ties. White integrated theoretical perspectives on a network-based sociology of markets and firms. Friedkin advocated applications of social influence network theory to problems of network modification. Lazer considered governance questions arising from the "informational efficiency" of different network architectures (spatial, organizational, emergent) and the prospects of "free riding" (governments becoming complacent about innovating in the hope that another government will bear the cost of a successful innovation). Three major themes crosscut these papers and the resultant discussion.

Scaling up and uncertainty. How well do the different analytical techniques and algorithms "scale up" to large networks with hundreds or thousands of actors and multiple types of relations? Perhaps a more useful phrasing of this question is: Under what conditions and for which analytical purposes do models of social networks scale up, and how well do existing techniques deal with uncertainty in information? Spirited discussion arose in response to the question of whether the same social network model may be posed at "micro" and "macro" levels of social organization, or whether scaling up must involve the addition of substantially more complex representation of social structure within the network model. White's model requires the analyst to account explicitly for the varied circumstances of particular industries.
In his work, markets constructed among firms in networks are mapped into a space of settings with interpretable parameters that govern the ratio of demand to producers' costs; key parameters pertain to growth in volume and to variation in product quality. White's model is also distinctive in treating uncertainty not as a technical problem implicated in parameter estimation (statistical models being a main concern of the second and third sessions of this workshop), but as a substantive force that drives the evolution of ranking and manipulation as organizing features of a space of markets. During discussion, White expressed the view that scaling up is indeed a formidable challenge. Friedkin, on the other hand, felt that the only practical constraint on social influence models is the problem of data-gathering ability. Friedkin's social influence network theory describes an influence process in which members' attitudes and opinions on an issue change recursively as the members revise their positions by taking weighted averages of the positions of influential fellow members. One example of an application would be producers who eye, and orient to, their competitors while figuring out the cost of their product, as in White's market model. The mathematics is general; Friedkin suggested applications to the modification of group structure such that, for example, outcomes are rendered less sensitive to minor changes in influence structure or to initial opinions of group members. However, Friedkin's model focuses on convergent interpersonal dynamics rather than on the structuring of qualitatively distinct network outcomes, as in White's market ecology.

Network outcomes. Both Lazer and Friedkin argued that the partial structuring of interdependence, a definitive aspect of social networks, must be taken into account in theorizing the production of outcomes.
Taking innovative information, such as knowledge of policies or innovations that work, as a desired network outcome, Lazer's paper asks how interdependence can be governed in large and complex systems. Lazer develops the argument that, where the production of information requires costly investment, there is a paradoxical possibility that the more efficient a system is at spreading information, the less new information the system might generate. This suggests that in networked (as distinct from hierarchical) worlds, incentives should be provided to continue experimenting with innovations. Breiger suggested the benefits of a comparative reading of Lazer's paper and Friedkin's: In Friedkin's model the dependent variable is actor opinions at equilibrium; in Lazer's, it is knowledge created by or held by actors. Can the free riding that motivates Lazer's rational actors be usefully applied to suggest processes that structure the interior of Friedkin's influence networks? Conversely, can Friedkin's model provide a concrete format for specifying Lazer's theories of information interdependency?
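
The influence process attributed to Friedkin in this session, in which members repeatedly revise their positions by taking weighted averages of fellow members' positions until opinions settle, can be illustrated with a small sketch. This is not Friedkin's own formulation: the weight matrix, the starting opinions, and the per-member susceptibility term (how much weight a member gives to others as opposed to his or her own initial position) are all assumptions chosen for illustration.

```python
def influence_step(weights, opinions, initial, susceptibility):
    """One round of revision: each member mixes a weighted average of
    current positions with his or her own initial position."""
    revised = []
    for i, row in enumerate(weights):
        averaged = sum(w * y for w, y in zip(row, opinions))
        a = susceptibility[i]  # assumed parameter: 1.0 means fully open to influence
        revised.append(a * averaged + (1.0 - a) * initial[i])
    return revised


def run_influence(weights, initial, susceptibility, rounds=200):
    """Repeat the revision step so that opinions approach their equilibrium."""
    opinions = list(initial)
    for _ in range(rounds):
        opinions = influence_step(weights, opinions, initial, susceptibility)
    return opinions


# Hypothetical three-member group; each row of weights sums to 1.
W = [[0.6, 0.2, 0.2],
     [0.3, 0.4, 0.3],
     [0.1, 0.1, 0.8]]
y0 = [0.0, 5.0, 10.0]

consensus = run_influence(W, y0, susceptibility=[1.0, 1.0, 1.0])
anchored = run_influence(W, y0, susceptibility=[0.5, 0.5, 0.5])
```

With full susceptibility the three hypothetical members converge on a shared position; when each member stays partly anchored to an initial opinion, equilibrium opinions remain distinct. In this sketch the equilibrium opinions play the role of the dependent variable that Breiger identified in Friedkin's model.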
Meta-analysis of network analysis methods. Freeman was able to locate 21 analyses of the same data set (concerning the participation of women in social events in a southern city in the 1930s). He examined these studies by use of a form of meta-analysis, but it is an unusual form in that multiple analytic techniques are applied to a single data set. There is an underlying dimension on which the analytic methods converge, and the several most effective techniques allow identification of a single (and, in this convergent sense, most informed) description of the data. Generally, the "best" of these 21 analytic procedures agree more with one another than with the anecdotal description supplied by the original data gatherers. The possibility is raised that in certain circumstances (perhaps in cases where the various methods applied yield results that are not too far from the true network structure) the intensive application of multiple analytic methods may compensate for problems in data quality.

Session II: Dynamic Social Networks

Presenters and Papers

Discussant: Stanley Wasserman

1. Jeffrey C. Johnson, Informal Social Roles and the Evolution and Stability of Social Networks (coauthors Lawrence A. Palinkas and James S. Boster)
2. Kathleen M. Carley, Dynamic Network Analysis
3. Tom A.B. Snijders, Accounting for Degree Distributions in Empirical Analysis of Network Dynamics
4. Michael W. Macy, Polarization in Dynamic Networks: A Hopfield Model of Emergent Structure (coauthors James A. Kitts and Andreas Flache)
5. Martina Morris, Local Rules and Global Properties: Modeling the Emergence of Network Structure
6. H. Eugene Stanley, Threat Networks and Threatened Networks: Interdisciplinary Approaches to Stabilization and Immunization (coauthor Shlomo Havlin)

Themes

The papers in this session address the evolution, emergence, and dynamics of network structure.
Methods range across ethnography and participant observation, statistical modeling, simulation studies, and models employing computational agents. The methods are not mutually exclusive and were often used together to create a more complete understanding of the dynamics of social networks. Four themes emerged from these papers and the roundtable discussion.

What makes networks effective or ineffective? This theme added an explicitly dynamic focus to the "network outcomes" theme of the previous session. Various factors were considered, including the presence or absence of certain roles, structural characteristics (patterns of ties), and connections to other networks. The cross-cultural ethnographies of network evolution conducted by Jeffrey Johnson and his colleagues demonstrate that group dynamics can vary dramatically from one group to another even within the same physical and cultural setting and in the presence of similar organizational goals and formal structure. Johnson et al., studying Antarctic research station teams, found five features of emergent social roles that are associated with the evolution of effective networks: heterogeneity (such that members' roles fit in with one another), consensus (agreement on individuals' status and function), redundancy (such that removal of a single actor still ensures proper functioning, avoiding vulnerability), latency (promoting adaptive responses to unforeseen events), and isomorphism of formal and informal social roles (promoting agreement on group goals and objectives). The studies make use of quantitative modeling of network structures over time as well as direct observation over extended periods.

Multiagent network models are featured in the dynamic network analysis of Kathleen Carley and in the computational modeling of Michael Macy and his colleagues. Carley has formulated a highly distinctive approach to the dynamic modeling of social networks.
Her simulation models and her formulation of ties among actors as probabilistic (such that connections can go away, get stronger, or change in strength dynamically over time) rather than deterministic allow her to investigate how networks may endeavor to regain past effectiveness (or to "regrow") to make up for destabilizing losses. "What if" exercises allow theoretical exploration of assumptions that an analyst makes about network vulnerabilities and about alternative distributions of resources, with reference to the system the analyst has constructed. For example, Carley found that removal of the most central node might leave a network less vulnerable than removal of an emergent leader.

Macy et al. studied conditions in which a group might be expected to evolve into increasingly antagonistic camps. Their computational model allows the manipulation of qualities attributed to agents, for example, their tendency to focus on a single issue, the degree of their conviction, and their rigidity or openness to influence from others. A surprising finding of this simulation study is that global alignment along a single polarizing definition of opposing ideologies is facilitated by ideological flexibility and open-mindedness among local constituents, as seen in the elegantly simple system of actors and relations postulated by the type of neural network employed by Macy et al.

Researchers in the statistical physics community have recently focused attention on "scale-free" social networks, characterized, as Eugene Stanley pointed out in his presentation, by a power-law distribution of ties emanating from the nodes, loosely analogous to an airline route map showing a very small number of well-connected "hubs" and many less well-connected nodes. It has been proven that scale-free networks are optimally resilient to random failure of individuals; Stanley and Havlin point out at the same time, however, that such networks are highly susceptible to deliberate attack. Under the assumption that social networks of people exposed to disease have the scale-free property, the authors review possible strategies for immunization.

Dynamics of local structure.
The papers presented by Macy and by Carley also speak effectively to another theme that crosscut many of the papers in this session: the advancement of modeling techniques that focus on behavior among small sets of actors and on the implications of such local behavior for the evolution of a network macrostructure. This theme provides a dynamic cast to the ideas of "scaling up" presented in the first session. Computational models of social networks are precisely about exploring the evolution of whole systems on the basis of rules (such as when an actor should form a tie with another) and endowments postulated at the level of individual actors.

Coming from quite a different direction, that of the formulation of models that allow the fit of models to data to be assessed within a statistical context, Snijders' paper reports an investigation of network evolution on the basis of a stochastic, actor-oriented model that embeds discrete-time observations in an unobserved continuous-time network evolution process. Just one arc is added or deleted at any given moment, and actors are assumed to try to obtain favorable network configurations for themselves.

The paper by Martina Morris continues the focus on relating local rules to global structure, relying on empirical data (from sexual partner networks) and statistical modeling as well as simulations. Her question was whether the overall network structure can be explained by recourse to a small number of partnership formation rules that operate on the local, individual level. Two such rules were evinced in selective mixing patterns, such as the degree of matching on race or age, and the timing/sequencing of partnership formation (e.g., serial monogamy versus concurrency). Morris linked network data to network simulation by means of a statistical modeling framework: statistical models for random graphs as implemented via the Markov chain Monte Carlo (MCMC) estimation method.
Putting together local rules and simulations based on random graph (statistical) methods allows empirical modeling of global networks on the basis of local properties.

Key features of networks in modeling evolution. Physicists including Stanley and Havlin, who have made many recent contributions to the modeling of scale-free networks, are interested in the dynamic evolution of degree distributions (number of ties emanating from each node). Sociologists tend to emphasize that other features of the network are also of great importance in network evolution, including the degree of transitivity (which is one way of measuring hierarchy), cyclicity (a hierarchy-defeating principle), segmentation into subgroups, and so on. In his paper, Snijders demonstrates that it is possible to formulate a model for network evolution in which the evolutionary process of the degree distribution is decoupled from other, arguably important, features of the network's evolution such as those mentioned above. Snijders' paper elaborates a statistical context within which the contribution of each of several features of network evolution might be comparatively assessed. Carley's paper
elaborates mechanisms for evolution that draw on nonstructural properties such as individual learning and resource depletion, as well as exogenous changes such as the removal of specific nodes.

Simulation and computational-actor models versus validation and statistical models. A lively roundtable discussion of the six papers in this session focused in particular on the preceding theme and on the relative merits of conceptual modeling versus validation techniques. It was argued, on one hand, that an emphasis on validation is healthy because the analyst can be in a position to distinguish real patterns from mere noise or from wishful thinking. On the other hand, it was argued that simulations can help an analyst to explore the logic of postulated mechanisms that may be driving an empirical result, and to explore realms of the possible rather than predicting the future of a specific event or action. Such predictions are generally not possible outcomes of simulation studies. It was argued that each broad project approach (statistical rigor and simulations that explore various insights) is valuable, as are efforts to integrate them more closely. As more data become publicly available to the social networking community, they can be used to improve both simulation and statistical techniques.

Session III: Metrics and Models

Presenters and Papers

Discussant: Philippa Pattison

1. Stanley Wasserman, Sensitivity Analysis of Social Network Data and Methods: Some Preliminary Results (coauthor Douglas Steinley)
2. Andrew J. Seary and William D. Richards, Spectral Methods for Analyzing and Visualizing Networks: An Introduction
3. Mark S. Handcock, Assessing Degeneracy in Statistical Models for Social Networks
4. Stephen P. Borgatti, The Key Player Problem
5. Elisa Jayne Bienenstock and Phillip Bonacich, Balancing Efficiency and Vulnerability in Social Networks
6. Christos Faloutsos, ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs (coauthors Christopher R. Palmer and Phillip B. Gibbons)

Themes

The papers in this session present methodological developments at the forefront of efforts to construct statistical models and metrics for understanding social networks. Although a number of papers in the other sessions also contributed significantly to this effort, a key distinction between these papers and those in the other sessions is that they focus on what can be learned when only network data are available. At least four important themes guiding the development of new models and metrics can be identified.

Exploratory data analysis (EDA) for networks. The first is the development of techniques that fall under the broad class of methods for exploratory data analysis (EDA) for networks. Such methods include descriptive measures and analyses that assist in summarizing and visualizing properties of networks and in investigating the dependence of such measures on other network characteristics. Under this general heading, Seary and Richards provided a comprehensive review of what can be learned from spectral analyses of matrices related to the adjacency matrix of a network. They also illustrated the application of these analyses to empirical networks using the computer program NEGOPY (a key concept in this program is that of liaisons). Faloutsos summarized a number of relationships between node measures (such as degree of connectivity and the "hop exponent") and their frequency in power-law terms and presented a fast algorithm for computing the approximate neighborhood of each node.
Wasserman and Steinley presented the first stages of a study designed to explore the "sensitivity" of network measures by assessing the variation in some important network statistics (e.g., degree centralization, "betweenness" centralization, and proportion of transitive triads) as a function of specified random graph distributions.
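
The kind of sensitivity question described above can be made concrete with a small simulation. The sketch below is not Wasserman and Steinley's procedure; it assumes the simplest random graph distribution (each pair of nodes tied independently with probability p) and Freeman's degree centralization for an undirected graph, and it reports how much that statistic varies across repeated draws. The function names and parameter values are hypothetical.

```python
import random
import statistics


def random_graph(n, p, rng):
    """Undirected graph in which each pair is tied independently with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree_centralization(adj):
    """Freeman degree centralization: 1.0 for a star, 0.0 when all degrees are equal."""
    n = len(adj)
    degrees = [len(neighbors) for neighbors in adj.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


def centralization_spread(n, p, draws, seed=0):
    """Mean and standard deviation of centralization over repeated random draws."""
    rng = random.Random(seed)
    values = [degree_centralization(random_graph(n, p, rng)) for _ in range(draws)]
    return statistics.mean(values), statistics.stdev(values)


mean_c, sd_c = centralization_spread(n=20, p=0.2, draws=30)
```

Comparing the spread across different values of n and p gives a rough sense of how much of an observed centralization score could arise from chance alone under the assumed distribution.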
Model development and estimation. The second guiding theme is the value of developing plausible models for social networks whose parameters can be estimated from network data. Mark Handcock outlined the general class of exponential random graph models and presented a compelling analysis of difficulties associated with estimating certain models within the class. He showed how model degeneracy (the tendency for probability mass to be concentrated on just a few graphs) interferes with attempts to apply standard simulation-based estimation approaches, and he described an important alternative model parameterization in which such problems can be handled.

Impact of network change on network properties. A third theme underlying the work presented in this session is the importance of understanding how different measures of network structure change following "node removal" or "node failure." For example, Borgatti considered two versions of the "key player" problem: Given a network, find a set of k nodes that either, if removed, maximally disrupts communication among the remaining nodes, or is maximally connected to all other nodes. He proposed distance-based measures of fragmentation and reach as relevant to these two versions of the key-player problem and presented an algorithm for optimizing the measures as well as several applications. Bienenstock and Bonacich contrasted the notions of efficiency and vulnerability in networks, and of random and strategic attack, and examined the efficiency and resilience of four network forms (random, scale-free, lattice, and bipartite) under both forms of attack. A distance-based efficiency measure similar to Borgatti's fragmentation measure was proposed, and vulnerability was measured as the average decrease in efficiency of the network after a sequence of successive attacks.
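
As a rough illustration of a distance-based efficiency measure and of the difference between strategic and incidental node loss, the sketch below scores a network by the mean reciprocal shortest-path distance over all ordered pairs of nodes, counting unreachable pairs as zero. That definition is an assumption made for illustration; it is not claimed to be the exact measure used by Bienenstock and Bonacich or by Borgatti. The example network is a hypothetical hub-and-spoke graph.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def efficiency(adj):
    """Mean of 1/distance over ordered node pairs; unreachable pairs add 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for s in adj:
        dist = bfs_distances(adj, s)
        total += sum(1.0 / d for t, d in dist.items() if t != s)
    return total / (n * (n - 1))


def remove_node(adj, node):
    """Copy of the network with one node and its ties deleted."""
    return {u: nbrs - {node} for u, nbrs in adj.items() if u != node}


def highest_degree_node(adj):
    """Target for a simple strategic attack: the node with the most ties."""
    return max(adj, key=lambda u: len(adj[u]))


# Hypothetical hub-and-spoke network: node 0 is tied to five peripheral nodes.
star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
baseline = efficiency(star)
after_hub_attack = efficiency(remove_node(star, highest_degree_node(star)))
after_leaf_loss = efficiency(remove_node(star, 1))
```

In this toy case, losing a peripheral node leaves the remaining network fully connected, whereas removing the hub disconnects every remaining pair. Averaging such decreases over a sequence of removals gives a vulnerability score in the spirit of the one described above.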
Bienenstock and Bonacich found that, of the class of networks assessed, scale-free networks were most susceptible to attack, and lattice and bipartite models with a small proportion of random ties offered the best balance of efficiency and resilience. Finally, Faloutsos examined three types of network node failure: random, in order of degree, and in order of approximate neighborhood size. He argued that the Internet at the router level was robust in the case of random node failure but sensitive to the other two forms. In the roundtable discussion, it was noted that the research topic of network vulnerability appears to be an emerging area in which there are many useful, and usefully interrelated, results, with reference in particular to the papers by Borgatti and by Bienenstock and Bonacich in this session, as well as to those by Carley and by Stanley and Havlin in the previous session. Fast algorithms such as those developed by Faloutsos and colleagues are necessary to extend these investigations to very large contexts such as the Internet.

Processes or flows on networks. A fourth theme is the importance of distinguishing the structure of a network from the different types of dynamic processes or flows that the network might support. Borgatti described a framework for distinguishing different interpersonal processes (e.g., disease transmission, dissemination of knowledge) that might involve network partners and considered the implications of such distinctions for analyses of network structure.

Session IV: Networked Worlds

Presenters and Papers

Discussant: David Lazer

1. Alden S. Klovdahl, Social Networks in Contemporary Societies
2. David Jensen, Data Mining in Social Networks (coauthor Jennifer Neville)
3. Peter D. Hoff, Random Effects Models for Network Data
4. Carter T. Butts, Predictability of Large-Scale Spatially Embedded Networks
5. Noshir S. Contractor, Using Multi-Theoretical Multi-Level (MTML) Models to Study Adversarial Networks (coauthor Peter R. Monge)
6. Michael D. Ward, Identifying International Networks: Latent Spaces and Imputation (coauthors Peter D. Hoff and Corey Lowell Lofdahl)
Themes

The papers in this session focus on the modeling of large-scale social networks. It is important to stress that tools and data sets for addressing large-scale networks are still in their infancy. An overarching concern is the extent to which standard social network metrics provide information in large-scale networks. A related need is publicly available, large-scale network data sets that can serve as examples for systematic comparative methodological analysis. Four themes emerged from the papers and the roundtable discussion in this session.

Understanding the structure of large-scale networks. Klovdahl's paper demonstrates how social structure can be exploited to obtain a sample of a large-scale network. In random-walk sampling, a small set of persons is randomly sampled from a large population and interviewed; from the contacts provided by each interviewee, one is randomly selected for an interview, and this process is repeated until chains of (say) length 2 are constructed. In essence, this procedure allows observation of random sets of connected nodes, which provide the basis for statistical inferences of the structural properties of large networks.

The feature of structure addressed in the paper by Butts is geographical distance, which he discusses as a robust correlate of interaction (e.g., most participants in the 9/11 airplane hijackings were from a particular, relatively small region of Saudi Arabia). His paper models the predictive power of geographical distance in large-scale, spatially embedded networks, and argues that in many realistic situations distance explains a very high proportion of variability in tie density.

Understanding processes in large-scale networks. What are the processes that sustain large networks? Why do people maintain, dissolve, and reconstitute communication links as well as links to information?
Contractor and Monge systematically reviewed large bodies of empirical literature and distilled many propositions concerning the maintenance and dissolution of links. To date, much research on social networks has looked at just one of these mechanisms at a time. Ironically, however, many of the mechanisms contradict one another. For example, creating a tie with someone because many others do so is consistent with social contagion theory but contradicts self-interest theories that suggest the marginal return from an additional tie would be slight. In their larger project, Contractor and Monge develop a framework that tests multiple theories such as these at multiple levels, allowing many theories to be brought to bear on the same data set.

Understanding data on large-scale networks. Papers by several participants present models, methods, and illustrative analyses oriented toward the study of large-scale networks. Jensen and Neville joined social network analysis with data mining and related techniques of machine learning and knowledge discovery in order to investigate large networks. At the intersection of statistics, databases, artificial intelligence, and visualization, data mining techniques have been extended to relational data. One example, useful in detecting cell phone fraud, is that fraudulent telephone numbers are likely to be not one but two degrees away from one another (because various phone numbers are stolen but they tend to be used to call the same parties). Jensen and Neville's paper reports their effort to predict an outcome (the success of a film) on the basis of a data set that interlinks features of a large network (such as movies, studios, actors, and previous awards).
Whereas recent work in machine learning and data mining has made impressive strides toward learning highly accurate models of relational data, Jensen and Neville suggest that cross-disciplinary efforts that make good use of social network analysis and statistics should lead to even greater progress.

Hoff presented random effects modeling for social networks, which provides one way to model the statistical dependence among the network connections. The models assume that each node has a vector of latent characteristics and that nodes relate preferentially to others with similar characteristics. Hoff employs a Markov chain Monte Carlo (MCMC) simulation procedure to estimate the model's parameters. Ward, Hoff, and Lofdahl report in their paper an application of Hoff's latent spaces model to data on interactions among primary actors in Central Asian politics over an 11-year period ending in 1999, based on 1 million iterations of the MCMC estimation procedure using geographic distance as the only covariate. Countries closer together in the dimensional space resulting from model estimation were predicted to have a higher probability of connection. Imputation techniques were investigated and found to predict ties that were not sampled. The paper provides a favorable initial application of the latent spaces model to a data context of interest.

A final observation from the resultant discussion concerned the similarities (which are very great) as well as the differences (which are nonetheless consequential) among the various statistical models, notably those of Handcock's and Hoff's papers as well as the random graph models presented by Wasserman, Pattison, and colleagues. In brief, the similarities concern the increased focus on formulating parametric models for random graphs within the exponential family. The differences pertain to different paths taken in the estimation of model parameters.

Understanding the adversary versus understanding ourselves. In the roundtable discussion, Lazer began his remarks by considering, in the post-9/11 context, the contributions that network analysis might make to decision makers who confront security challenges and suggested that the first problem to be considered is that of understanding the adversary. Severe information overload coupled with a great deal of missing or nonexistent data and the need for quick, real-time decisions are factors that hamper efforts to understand an adversary's vulnerabilities. Social network models cannot always lead an analyst to make a prediction that has perfect accuracy, but they can certainly improve the process of making such predictions by identifying relevant linkages and related sources of uncertainty that may be easily overlooked. For example, the paper by Ward and his colleagues provides a major extension to the usual international relations models that paradoxically ignore relational structures. Further, Butts was able in his paper to project the likely structure of a network based on a small amount of publicly available information.
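To make the latent space idea discussed above concrete, here is a minimal sketch of one common form of a latent distance model, in which the log-odds of a tie fall with the distance between two actors' latent positions. It is an assumption that this form matches the specification used in the papers by Hoff and by Ward, Hoff, and Lofdahl; the function name, parameter values, and positions below are hypothetical, and no MCMC estimation is attempted.

```python
import math


def tie_probability(z_i, z_j, alpha, beta=0.0, x_ij=0.0):
    """Tie probability under logit(p) = alpha + beta * x_ij - ||z_i - z_j||.

    z_i, z_j : latent positions (sequences of floats of equal length)
    alpha    : baseline log-odds of a tie (assumed value, not estimated)
    beta     : coefficient on a dyadic covariate x_ij (assumed value)
    """
    distance = math.dist(z_i, z_j)                 # Euclidean distance in latent space
    log_odds = alpha + beta * x_ij - distance
    return 1.0 / (1.0 + math.exp(-log_odds))       # inverse logit


# Hypothetical two-dimensional latent positions for three actors.
positions = {"A": (0.0, 0.0), "B": (0.5, 0.0), "C": (3.0, 4.0)}

p_near = tie_probability(positions["A"], positions["B"], alpha=1.0)
p_far = tie_probability(positions["A"], positions["C"], alpha=1.0)
```

With the baseline held fixed, the pair that is closer in the latent space receives the higher tie probability, which is the qualitative pattern the summary reports for countries in the estimated space.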
Lazer then turned to the question of understanding ourselves, arguing that a different set of challenges is implicated in this second concern. Here the needs center around questions of who needs to coordinate, communicate, and cooperate. Challenges concern critical self-evaluation (turf issues, organizational cultures, entrenched constituencies), design challenges (including needs for security as well as coordination), and the ability to be up and running in real-time, emergent situations. An opportunity in this area is that there are many more chances to gather data. Therefore, in contrast to a project designed to understand the adversary's vulnerabilities, there are perhaps more obvious openings for data-analytic approaches and for rigorous research.

RESEARCH ISSUES AND PROSPECTS

In this final section we identify near-term prospects for improving social networks research that emerged from the workshop papers and discussion. In doing so we focus on three areas: formulation of models; data and measurement; and research relevant to national security needs. The concerns and questions that we identify, while voiced by various workshop participants speaking as individuals, do not represent conclusions or recommendations of the workshop itself.

Formulation of Models

Networks "Plus": Generalized Relational Structures

The social network community has often found it useful to view social networks as "skeletal" abstractions of a much richer social reality. An important question, though, is the extent to which attempts to model and quantify network properties can rely on the network observations alone, or whether they would instead be enhanced by additional information about actors and ties and their embedding in other social forms (the constellation of which might be termed "generalized relational data structures").
In other words, to what extent do we need to develop a more systematic (and quantitative) understanding of such generalized relational data structures, such as the meta-matrix approach? Such an understanding could lead to the development of models and analytic approaches that reflect the social context in which networks reside and the interaction of network processes with other aspects of this reality, including the background, intentions, and beliefs of the actors involved and the
cultural and geographical settings in which they find themselves. Breiger raised similar points in his opening address.

Processes on Networks

An important reason to examine network structure is that such examination provides an understanding of the constraints and opportunities for social and cognitive processes enacted through network ties. Yet there is a limited understanding of the extent to which we can predict the course of social and cognitive processes from network topology alone. Should we be engaging in empirical and methodological programs of study that enable us to articulate more clearly the relationship between network structure and various types of social processes in which we are interested? Research presented at this workshop demonstrates that models in which the network ties and other network-based diffusion or contagion processes coevolve significantly expand our ability to understand, interpret, and predict social and cognitive behavior. Further work involving empirical analysis, simulation, and statistical modeling in this area seems to hold particular promise.

Scaling Up

An evident theme emerging from Sessions I (Social Network Theory Perspectives) and II (Dynamic Social Networks) is the potential value of juxtaposing methods used to describe large-scale networks, primarily in physics and computer science, with methods for evaluating models for social networks at a smaller scale, primarily in the social sciences. A number of statistical models that have been developed for social networks of medium size have attempted to express network structure as the outcome of regularities in interactive interpersonal processes at a "local" level. Can we extend the focus of such statistical modeling approaches to develop theoretically principled and testable models for social networks at a larger scale and in the process evaluate some of the claims emerging from the more descriptive analyses?
A number of network theories are based on cognitive principles and small-group social theory. As we move to large-scale networks does this microlevel behavior still appear, or does the variance inherent in human action cause such microbehavior to be lost as macrolevel bases for relations become visible?

Model Development and Evaluation

In addition to extending models to include richer sources of relational data, how can we best incorporate potentially more complex dependence structures and longitudinal observations? Can we develop measurement models for ties and structural models for networks that take account of measurement and sampling issues? Can we also develop more rigorous and diagnostic approaches to model evaluation? How can simulation studies be best utilized to contribute to the resolution of questions of model specification and evaluation for evolving networks? Can we evaluate the fit of power laws more carefully; indeed, can we construct and evaluate models predicting the emergence of scale-free networks? Furthermore, can we extend and possibly integrate the very different and distinctive programs of fruitful work currently being done (and presented at the workshop in papers by Carley, Faloutsos et al., Friedkin, Macy et al., Morris, Snijders, and Stanley and Havlin) in order to build models for the coevolution of network ties, actor orientations, and actor affiliations?

Modeling: Estimation and Evaluation

Significant issues that emerged from the workshop session on dynamic social networks are the complementarity of simulation from complex models for dynamic and interactive network-based processes (in order to understand model behavior) and the task of formulating models in such a way that model parameters can be estimated from observed data and model fit can be carefully evaluated.
The potential value of developing, estimating, and evaluating models in conjunction with empirical data is evident, and a major research domain is the development of models for network observations that will allow us to close the gap between what can be
hypothesized from simulation-based explorations of theoretical positions and what can be verified empirically from well-designed network studies.

Data and Measurement

The Design of Network Studies: Sampling Issues and Data Quality

A significant set of issues surrounds the question of network sampling. Although only a few presentations in this workshop directly addressed questions of sampling, such issues were always close to the surface. Networks rarely have boundaries, and almost all empirical networks have been based on sampling decisions or sampling outcomes of some form. A principled means for handling sampling issues would be very valuable and indeed is a natural extension of model-based formulations. Several factors that might be considered here include (1) the biases inherent in the collection of nodes and ties obtained by a given sampling procedure, (2) the tendency to over- or undersample certain types of relations, and (3) the extent to which such errors are uniformly distributed over the network or focused in some portion of the overall network. Related issues concern missing data, unreliable data, and data arising from actors (ranging from school children to corporate executives) who may strategically misreport their ties. Methods for analyzing network data exhibiting these properties include intensive application of multiple analytic procedures to compensate for problems of data quality (see Freeman's paper and related discussion in Session I above); addition of actors by means of random-walk sampling (see discussion of Klovdahl's presentation in Session IV); and the possibility that missing links can be implied by the existence of other linkages, as reviewed in the section entitled "Data Quality and Network Sampling" in Breiger's opening address. Important questions to address include: What methodological steps can be taken to minimize the consequences of missing nodes and tie measurement errors?
Can we deal with missing data in model construction and develop model-based approaches to estimate missing data? Would it be prudent to develop more effective measurement strategies for each tie of interest, as well as models for the measurement of ties? How can we go about providing evidence for the validity of network measurement? Further, should we focus more effort on characterizing the multifaceted nature of network ties?

Network Estimation

It is important to note that most network studies are done in a "data greedy" fashion, with the result that the underlying network is mapped out or sampled at a fairly high level of accuracy. This is practical in some contexts, such as situations in which archival or observational data are available and reliable; however, it is likely to be highly impractical in many other contexts. Thus, it would be useful to specify systematically the classes or dimensions of networks that exhibit fundamentally different behavior. Which high-level indicators can be used to determine the location of an unobserved network of interest in this space of possibilities? In other words, what can be done to provide a first-order estimate of the shape of the unobserved network? What kinds of questions can be answered by having even this high-level estimate? Basic research both on characterizing the impact of networks and on their fundamental form would be useful in this regard.

Exploratory Data Analysis

In light of the issues summarized above, it will be important to consider how we can augment descriptive analyses of networks so as to incorporate information from other sources (e.g., node attributes, orientations and locations, tie properties, group and organizational affiliations). More generally, can we extend these approaches to more complex and longitudinal relational data structures, and can they be developed so as to assist in the evaluation and development of model-based approaches?
Many visualization tools provide valuable means for the simultaneous presentation of relational and other data forms, but can their capacities be further enhanced? In relation to the notion of resistance of network statistics, is special treatment required for certain network concepts?
For example, many network analyses have been based on the role of cut points and bridge ties, observations that lead to statistics that may be inherently nonresistant (e.g., number of components, reachability, etc.). Should researchers identify observations associated with lack of resistance (e.g., investigate their measurement quality)?

Research Relevant to National Security

Impact of Network Change on Network Properties

What is learned by integrating an understanding of network "interventions" with a model-based approach? One of the core features of social networks is arguably their potential to self-organize, which is especially likely in response to an intervention. Research presented at this workshop illustrates the potential for network models, both simulation and mathematical, to be used to foreshadow the probable network response to various types of interventions such as the removal of a node that is high in centrality or cognitive load, a "key player." This work suggests that, to be effective, strategies for altering networks need to be tailored to the processes by which the networks change, recruit new members, and diffuse goods, messages, or services.

Stabilization and Destabilization Strategies

Basic research is needed to determine the set of factors that influence network stabilization and destabilization strategies. Papers in each of the workshop's four sessions address these concerns. A problem analogous to the "key player" problem that was not addressed at the workshop but is equally critical is the problem of the "key tie." Can we develop metrics to identify key ties and the impact of their removal or addition on the overall behavior of the network? What are the basic properties that make a group, an organization, or a community resilient, efficient, and adaptive? Can we identify network structures or roles in networks that optimize these properties?
While the research on dynamic networks, both empirical and simulation, suggests that this is possible, there is still much work to be done. Can we combine our analysis of the consequences of network change with a model-based understanding of measurement error? Can we identify a program of empirical research that would evaluate predictions about the impact of diverse types of intervention under different levels and types of error?

Closing the Gap

As Carley emphasized in her closing address, the ideas, measures, and tools being developed by network analysts hold promise with respect to the needs of the defense and intelligence community. However, there is still a large gap. Fundamental new science on dynamic networks under varying levels of uncertainty is needed. To fill this gap, Carley suggested that new research is needed, featuring empirical studies, metrics, statistical models, computer simulations, and theory building. How can the gap between scientific research on networks and national needs be narrowed? Carley put forward four proposals. First, universities need to produce more master's and Ph.D. students who are trained in social network analysis and who enter government work. Second, illustrative data sets that are suitable for dissemination within the community of networks researchers and that suggest the types of problems faced by the defense and intelligence community need to be made publicly available. Third, effort needs to be made to establish a dialogue with social networks researchers in which the needs of the defense and intelligence community can be articulated without compromising national security. Finally, academicians in this research area need to continue to strive for clarity in the articulation of the practical implications of their theoretical results.