National Academies Press: OpenBook

Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers (2003)

Chapter: Local Rules and Global Properties: Modeling the Emergence of Network Structure

Suggested Citation:"Local Rules and Global Properties: Modeling the Emergence of Network Structure." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.


Local Rules and Global Properties: Modeling the Emergence of Network Structure

Martina Morris
University of Washington
Departments of Sociology and Statistics

Abstract

This paper reviews the interaction between theory and methods in the field of network analysis. Network theory has traditionally sought to use social relations to bridge what social scientists refer to as the micro-macro gap: understanding how social structures are formed by the accumulation of simple rules operating on local relations. While early network methods reflected this goal, the bulk of the methods developed later and popularized in computer packages were more descriptive and static. That is beginning to change again with recent developments in statistical methods for network analysis. One particularly promising approach is based on exponential random graph models (ERGM). ERGM were first applied in the context of spatial statistics, and they provide a general framework for modeling dependent data where the dependence can be thought of as a neighborhood effect. The models can be used to decompose overall network properties into the effects of localized interaction rules, the traditional concern of the field. An example is given using an HIV transmission network.

Where do networks come from? There are really two questions implied in this simple query. The first is about the underlying process that gives rise to the patterns of links among nodes in the population; this requires us to think about a dynamic model that links local processes to a global outcome. The second is the inverse question: how to infer the underlying process from the patterns we observe; this requires statistical methods for sampling, estimation and inference. Network theory was explicitly focused on the first question. The current developments in the field are making progress on the second.
The most well developed and commonly used methods in the field, while not particularly well suited to answering either question, have provided some important intermediate tools.

Note 1: This work was supported in part by NIH grants R01 HD41877-01 and R01 DA12831-02.

Modern (post 1940) social network theory explicitly focused on the link between local dynamics and global structures. One root can be traced back to social anthropology and exchange theory. In this research the focus is on how rules governing permissible exchange partners (e.g., for gift giving or marriage) cumulate up to determine the overall structure and stability of an exchange system (White 1963; Levi-Strauss 1969; Yamagishi and Cook 1993; Bearman 1996). Another root can be traced back to social psychology and balance theory. In this research, the focus is on how the requirement of balance for positive and negative affect among three actors (e.g., my enemy's friend is my enemy) cumulates up (Heider 1946; Granovetter 1979; Chase 1982). In both cases, the key findings have been that some simple local rules may lead to overall network structures that display striking regularities, like linear hierarchies, stable cliques, or stable cycles of exchange. Other rules do not have these systemically stable impacts, and this distinction leads to an evolutionary hypothesis about the differential survival of rules over time, i.e., to links with game theory.

Given this early focus, one might have expected the field to develop a methodology for systematically exploring such dynamics, but this did not happen. Nor did the inverse statistical problem, of inferring the rules from the patterns, receive much attention. It is tempting to attribute this to the inadequate computing and statistical tools available at the time (from the 1940s to the 1980s), and to the impact of the progress made in the (less demanding) linear model framework. Whatever the reason, however, the methodology that did develop in the field of network analysis focused on static descriptive measures, rather than the dynamic model. The descriptive approach drew inspiration from mathematical graph theory, using tools from linear algebra to manipulate the adjacency matrix, and focusing on issues of clustering and connectivity.
These tools have become the heart of the field, providing a rich framework for thinking about networks and a wide range of summary measures to represent both the network position occupied by specific nodes and the overall network structure. Almost all of the classic network measures (paths, cycles, density, centrality, structural equivalence, cliques, blockmodels, role algebras and bicomponents) owe their development to researchers working with these descriptive tools. Textbooks and computer packages for network analysis typically have these measures at their core. They have become the common language for network analysis, defining the basic features of networks and helping to develop our intuitions about the complex relational structures we seek to understand.

The statistical issues (developing principled methods for estimation from samples, and quantifying the uncertainty in the estimates) have been addressed in a limited way in this descriptive context by eschewing traditional model-based statistical methods. Model-based methods rely on the assumption of independent observations to obtain tractable likelihood-based estimation and inference. In network analysis, independence is not something one would want to assume, since understanding the dependence among observations is actually the primary task. The growth in computing power has given rise to statistical methods that rely on resampling rather than likelihood inference, e.g., the bootstrap, jackknife, and permutation tests. These have been readily adopted by network analysts, finding their way into tests like quadratic assignment procedures (QAP) for sociomatrix regression. In recent years, Markov chain Monte Carlo (MCMC) algorithms have been developed for complex estimation problems, and these are now providing the tools needed to return to a model-based framework.
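The resampling idea behind QAP-style tests can be illustrated with a short sketch. It is a simplified correlation version rather than the full sociomatrix regression, and the function name `qap_correlation_test`, its arguments, and the use of NumPy are assumptions made for illustration. The key step is that rows and columns of one matrix are permuted together, so node labels are shuffled while the internal dependence structure of each network is preserved.

```python
import numpy as np

def qap_correlation_test(x, y, n_perm=2000, seed=0):
    """Permutation (QAP-style) test of association between two sociomatrices.

    x, y: square arrays of the same size; diagonal entries are ignored.
    Returns the observed off-diagonal correlation and a two-sided p-value.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    off_diag = ~np.eye(n, dtype=bool)  # mask that drops self-ties

    def corr(a, b):
        return np.corrcoef(a[off_diag], b[off_diag])[0, 1]

    observed = corr(x, y)
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Relabel the nodes of y: the same permutation is applied to rows
        # and columns, so the pattern of ties within y is unchanged.
        y_perm = y[np.ix_(p, p)]
        if abs(corr(x, y_perm)) >= abs(observed):
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value
```

Because only labels are permuted, the reference distribution respects row and column dependence in the data, which is what distinguishes this from an ordinary test that treats dyads as independent.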

Statistical Models for Networks

Progress in model-based statistical methods for network analysis has picked up momentum in recent years, due to a combination of theoretical developments, advances in statistical computing, and innovations in data collection. Model-based statistical estimation and inference would seem to offer exactly the right tools to answer the original inverse question raised above, because it is based on an underlying stochastic model of the population process. To be useful, however, two things are needed: (1) an appropriate model for the process and the dependencies in it, and (2) a method of estimation that works with dependent observations (note 2). This would seem to be fairly straightforward, but the history of the field makes it clear it is not.

There is general agreement that the exponential family of distributions provides a good framework for modeling the probability that a random graph (or sociomatrix) X takes the observed value x. Holland and Leinhardt (1981) were the first to propose using this model for networks, noting that it was a natural form because the sufficient statistics were explicitly tied to parameters of interest, like indegree, outdegree, and mutuality. The general form of the model is given by:

$$P(X = x) = \frac{\exp\left\{\sum \theta_{ij} f(x_{ij})\right\}}{c(\theta)} \tag{1}$$

where the sufficient statistics, $f(x_{ij})$, are typically counts of links and of products of links, the $\theta_{ij}$ are the parameters of interest, and $c(\theta)$ is a normalizing constant. This model had been proposed 20 years earlier by Bahadur (1961) as a general representation for multivariate binomial distributions. It is a completely general model, and can fit any graph perfectly by using all $2^{\binom{n}{2}} - 1$ possible parameters (where $n$ is the number of nodes). One can think of the sufficient statistics as defining the neighborhood of dependence. Conditional on the rest of the graph, the dependence between observations is restricted to the other points in this neighborhood.
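To make the role of the normalizing constant concrete, the following sketch evaluates an exponential random graph distribution exactly by enumerating every undirected graph on a handful of nodes. The choice of two homogeneous statistics (edge count and triangle count) and the names `ergm_probabilities`, `theta_edges` and `theta_triangles` are illustrative assumptions, not part of the original text.

```python
import itertools
import math

def ergm_probabilities(n, theta_edges, theta_triangles):
    """Exact P(X = x) for every undirected graph on n nodes (tiny n only).

    Each graph is keyed by a tuple of 0/1 tie indicators, one per dyad,
    in the order produced by itertools.combinations(range(n), 2).
    """
    dyads = list(itertools.combinations(range(n), 2))
    triples = list(itertools.combinations(range(n), 3))
    weights = {}
    for ties in itertools.product([0, 1], repeat=len(dyads)):
        x = dict(zip(dyads, ties))
        edges = sum(ties)
        # A triangle is present when all three links among i < j < k exist.
        triangles = sum(x[(i, j)] * x[(j, k)] * x[(i, k)]
                        for i, j, k in triples)
        weights[ties] = math.exp(theta_edges * edges
                                 + theta_triangles * triangles)
    c = sum(weights.values())  # normalizing constant: sum over all graphs
    return {ties: w / c for ties, w in weights.items()}

if __name__ == "__main__":
    probs = ergm_probabilities(4, theta_edges=-1.0, theta_triangles=0.5)
    print(len(probs), sum(probs.values()))
```

With 4 nodes there are 6 dyads and therefore 64 graphs. The number of graphs doubles with every added dyad, which is why direct evaluation of the constant quickly becomes infeasible as the network grows.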
The heart of the modeling effort is to specify (and test) parsimonious representations of the neighborhood. To focus on this, we will strip equation (1) down to its modeling core:

$$\sum \theta_{ij} f(x_{ij}) \tag{2}$$

and work through the models that have been proposed in various contexts.

Much of the statistical theory for ERGM was developed and applied in the context of spatial statistics by Besag (1974). The simplest spatial models represent observations as points on a lattice, and assume that only the nearest neighbors have an influence on the status of a site. For example, imagine an agricultural plot, divided into a grid along two orthogonal axes (see Figure 1).

Note 2: It is important to keep in mind that the unit of analysis here is the link, not the node, and the dependence is between links, not between nodes.

Figure 1. A lattice model for a "nearest neighbor" spatial process. The dark site in the center is the site whose status is of interest; the grey sites indicate the neighborhood of dependence. Each square in the grid has 4 adjacent sites in its neighborhood.

Nearest neighbor ERGM models mimic the classic Ising model from statistical mechanics, representing the status of a site (say, infected with a fungus) as a function of:

$$\alpha \sum_{i,j} x_{i,j} + \rho \sum_{i,j} \sum_{k \in \{-1,1\}} x_{i,j} \left( x_{i+k,j} + x_{i,j+k} \right) \tag{3}$$

The spatial dependence among sites is captured by $\rho$, which is considered a nuisance parameter, and the parameter of interest is $\alpha$, which represents the spatially adjusted propensity for infection.

When applying these spatial models to networks, both similarities and differences in the nature of the data should be kept in mind. In both cases, the data are often represented in matrix form: the lattice and the adjacency (or socio-) matrix. But the meanings of the indices that order the matrix are quite different. The "site" in the spatial context is a dyad (not a person) in the network context. The Cartesian coordinates that define the grid in the spatial context become persons in the network context.

This latter difference has important implications. In the spatial context, the spatial metric is exogenous, and the lattice is fixed. In the network context, the rows and columns of the adjacency matrix have no intrinsic ordering, and may be permuted at will without changing the pattern of ties among persons, or the neighborhoods that define dependence. While physical space may influence the pattern of ties, it is not the only factor. As a result, the notion of "space" in network models is largely endogenous and neighborhoods are typically not defined in terms of proximity in the matrix. Because the nature of the "spatial" dependence is the primary focus of interest in network analysis, the parameters that define this dependence are not nuisance parameters, but the parameters of most interest.
This also changes the kinds of models that are appropriate for networks. The models can be based on exogenous non-spatial attributes of the nodes (e.g., age, race, sex, etc.), or on the propensity for certain kinds of configurations in the network (e.g., triad effects, cycles and paths). Alternatively, the models may seek to represent neighborhoods in terms of a completely endogenously defined latent space (with methods similar in spirit to earlier blockmodeling (White, Boorman et al. 1976)). Each of these approaches can be found in the literature.

The first application of ERGM-like models to networks was Holland and Leinhardt's $p_1$ model for directed graphs. The core of this model takes the form:

$$\sum_{i=1}^{n} \alpha_i x_{i+} + \sum_{j=1}^{n} \beta_j x_{+j} + \rho \sum_{i<j} x_{ij} x_{ji} \tag{4}$$

The model represents the process that gives rise to the graph as a function of the node-specific outdegree and indegree propensities, the $\alpha_i$ and $\beta_j$ parameters, and a uniform propensity, $\rho$, for ties to be reciprocated. A number of key features of the ERGM framework are visible in this simple model.

First, the network statistics that drive the model are simple functions of links, and the neighborhood represented by the statistics is easy to read from the functional forms. In much the same way that a standard cross-tabulation can be decomposed into marginal and interaction effects, this model uses marginal and interaction-type terms to distinguish between dyad-independent and dyad-dependent effects. Terms that are based on marginal sums represent effects that operate on each dyad independently. In this model, the basic indegree and outdegree terms are of this sort. Marginal effects can also be used to represent groups of nodes defined by exogenously given attributes, and the patterns of selective mixing among the groups (this will be shown below). Used in this way, the attributes define a social distance metric that establishes a generalized kind of neighborhood.
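The marginal and product terms of the directed-graph model above can be read straight off an adjacency matrix. The sketch below is a minimal illustration under assumed names (`p1_statistics`, `p1_exponent`) and assumes a binary matrix with a zero diagonal, where entry (i, j) equal to 1 means a tie from i to j; it computes only the exponent, not the normalized probability.

```python
import numpy as np

def p1_statistics(x):
    """Sufficient statistics: outdegrees, indegrees and mutual-dyad count."""
    x = np.asarray(x)
    out_degree = x.sum(axis=1)  # row sums, one per sender
    in_degree = x.sum(axis=0)   # column sums, one per receiver
    # x * x.T is 1 exactly where both i->j and j->i exist; the strict
    # upper triangle counts each mutual dyad once.
    mutual = int(np.triu(x * x.T, k=1).sum())
    return out_degree, in_degree, mutual

def p1_exponent(x, alpha, beta, rho):
    """Weighted sum of the statistics for given (hypothetical) parameters."""
    out_degree, in_degree, mutual = p1_statistics(x)
    return float(np.dot(alpha, out_degree)
                 + np.dot(beta, in_degree)
                 + rho * mutual)

if __name__ == "__main__":
    x = [[0, 1, 1],
         [1, 0, 0],
         [0, 0, 0]]
    print(p1_statistics(x))
    print(p1_exponent(x, alpha=[1.0, 0.0, 0.0], beta=[0.0, 0.0, 0.0], rho=2.0))
```

The degree terms are sums over single links, so they act on each dyad separately; only the mutual-dyad count is built from a product of two links.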
Terms that are based on products of links, on the other hand, represent dependence between the links. In this model, the only dependence between links is within-dyad (the tendency for mutual ties), so this model is still referred to as a dyadic independence model. But products of non-reciprocal links are a fairly straightforward generalization, and models with these terms are referred to as dyadic dependence models. Examples include products of triads of various kinds (a direct formalization of the balance theory hypotheses), and all manner of larger component sizes and structures. In each case, the sufficient statistic will be the sum of the products of sets of links that are eligible for the specific configuration of interest, and the parameter will represent whether this configuration is more or less likely than one would expect by chance. The size and structure of the configuration indicates the nature of the neighborhood.

Second, one can also catch a glimpse of the two key mechanisms for reducing the number of parameters in these models: one is to limit the number of configurations represented as having an effect on the probability of the graph, the other is to impose "homogeneity constraints" on isomorphic configurations. In this model the number of configurations is quite small: instars, outstars, and mutual dyads. But the homogeneity constraints are only imposed on the mutuality parameter; indegree and outdegree parameters remain node-specific. As the population grows larger, the number of parameters will also grow. This is somewhat like a fixed-effect model in economics, where a term is fit to every respondent in a setting with repeated measures. Here, as there, these are essentially nuisance parameters.

Homogeneity constraints are often imposed for parsimony, but they can also be used to represent exogenous covariates. For example, one can specify group-specific, rather than individual-specific, parameters for indegree and for outdegree (Fienberg, Meyer et al. 1985; Wang and Wong 1987):

$$\sum_{k} \alpha_k \sum_{i \in k} x_{i+} + \sum_{l} \beta_l \sum_{j \in l} x_{+j} + \sum_{k,l} \phi_{kl} \sum_{i \in k,\, j \in l} x_{ij} + \rho \sum_{i<j} x_{ij} x_{ji}, \qquad k = 1, \dots, G;\ l = 1, \dots, G \tag{5}$$

There are now $G$ indegree and outdegree parameters, rather than $n$. This can lead to a substantially more parsimonious model if the number of groups is much smaller than the number of persons. The $\phi$ parameters are used to specify the level of mixing within and between groups. As in other log-linear modeling settings, the mixing can also be parsimoniously modeled to represent patterns of interest (Morris 1991).

The next extension of ERGM in networks took on the task of modeling dyadic dependence. The ability to model dyadic dependence is the single most important feature of the ERGM framework, both because it is theoretically appealing, and because it is statistically innovative. The range of possibilities it opens up, however, is daunting. Without the physical space embedding, the notion of "neighborhood" is pretty much unconstrained. Frank and Strauss (1986) were the first to exploit the more general form of the ERGM to represent dyadic dependence in the context of networks. The approach they took was a natural, if somewhat mechanical, first step: links are dependent if they share a node. The neighborhood is Markovian in the sense that links must be directly adjacent to be dependent, so Frank and Strauss referred to the model as the Markov Graph. Using the Hammersley-Clifford theorem, the sufficient statistics for this model
can be shown to be products of links that represent the stars of various sizes in the graph, and triangles. A "star" is a cluster of links with a single central node. For an undirected graph the general Markov model is:

$$\sum_{k=1}^{n-1} \sigma_k S_k + \sum_{i<j<h} \tau_{ijh}\, x_{ij} x_{jh} x_{ih} \tag{6}$$

where $S_k$ represents a star of size $k$, so a product of $k$ links. Frank and Strauss then investigate a simpler version of the Markov model, restricting the configurations to $S_1$ and $S_2$ (edges and 2-stars), and imposing a homogeneity constraint on the triangle parameter $\tau$. Wasserman and Pattison (Wasserman and Pattison 1996; Pattison and Wasserman 1999) returned to the more general form, which they christened p*, in honor of Holland and Leinhardt's pioneering work.

The models above are based on explicit measured covariates, though in some cases the covariates are themselves part of the process being modeled. Another approach, with roots in classic multivariate analysis, seeks instead to represent the latent, unmeasured space implied by the pattern of network ties. In this purely descriptive model, the aim is to find a latent space in which the probability of a tie varies with the distance between the two nodes (Hoff, Raftery et al. 2002). Conditional on this metric, the dyads are independent.

$$\alpha - |z_i - z_j| \tag{7}$$

The $z$ parameters represent the latent spatial locations, and need to be estimated. Without further dimensional reduction, the number of parameters required to map this latent space becomes quite large. More parsimonious representations can be tested. Explicit covariates can also be added.

Despite the enormous flexibility the ERGM framework has brought to the field, virtually all of the published applications of these models have used either a variant of the original $p_1$ model or the simplified Markov graph. This is somewhat curious, as both are pretty mechanical renderings of what one might think of as the "neighborhood" of influence, and neither is particularly well suited to representing the kinds of processes traditionally of interest to network modelers. The lack of model development is probably due to the extremely challenging technical problems that accompany the estimation of these models (note 3). Another problem is the lack of data, or at least of the type of data currently required. None of the models has been specified in a way that missing data can be handled, so it is still necessary to have the equivalent of census data on a network: data on every node and every link. Such data do exist, but they are not common, and this has been a major obstacle to all forms of network analysis. Finally, it may also be that our ability to think, empirically, in terms of positions and network structure has atrophied as we have waited for the right tools to become available. Whatever the reason, there has been remarkably little application of these modeling tools to data.

Models are the bridge between theory and data.
And while there is a certain attraction to simple abstract forms, like Markov graphs, small world graphs, or scale free networks, the simplification we seek will be embedded in each substantive context. In the rest of this paper I will show how these models can be used to formalize the investigation of how a global network structure might cumulate up from simple local rules in a specific case.

Note 3: Other papers in this volume address these problems in detail, so they are not covered here.
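For readers who want to see what the simplified Markov graph discussed in this section actually summarizes, the sketch below counts its three homogeneous configurations (edges, 2-stars and triangles) for an undirected graph. The function name `markov_graph_statistics` and the NumPy adjacency-matrix representation are assumptions for illustration; the matrix is taken to be symmetric and binary with a zero diagonal.

```python
import numpy as np
from math import comb

def markov_graph_statistics(x):
    """Counts of edges, 2-stars and triangles in an undirected graph."""
    x = np.asarray(x)
    degree = x.sum(axis=1)
    # Each edge contributes to the degree of both of its endpoints.
    edges = int(degree.sum() // 2)
    # A 2-star is a pair of links sharing a central node, so a node with
    # degree d is the center of C(d, 2) of them.
    two_stars = sum(comb(int(d), 2) for d in degree)
    # trace(x^3) counts closed walks of length 3; each triangle is counted
    # once per starting node and direction, i.e. 6 times.
    triangles = int(np.trace(x @ x @ x) // 6)
    return edges, two_stars, triangles

if __name__ == "__main__":
    # A triangle on nodes 0, 1, 2 with a pendant node 3 attached to node 2.
    x = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]])
    print(markov_graph_statistics(x))
```

All three counts involve only links that share a node, which is the sense in which the dependence in this model is local.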

From local rules to global structures: Partnership networks and the spread of HIV

Over the past two decades, the epidemic of HIV has challenged the epidemiological community to rethink its paradigms for understanding the risk of infectious disease transmission, both at the individual level and at the level of population transmission dynamics. Opinion has rapidly converged on the central importance of partnership networks. Because people acquire infections from their partners, it is not only a person's own behavior that puts them at risk, but the behavior of their partners and, more generally, of the persons to whom they are indirectly connected by virtue of being connected to their partner.

At the individual level the risk of infection is determined by position in the overall transmission network. While individual behavior plays a primary role in establishing this position, it is not the exclusive determinant. To some extent, even the notion of "individual behavior" itself is open to question here, since all behaviors can only be adopted with a willing partner, which has important implications for both modeling and behavioral intervention strategies.

At the aggregate level, too, transmission dynamics are strongly affected by networks. The network of partnerships channels the spread of an infection through a population, amplifying or impeding the spread relative to a hypothetical randomly mixing population. At both the individual and the population level, then, our ability to quantify the risk of transmission for HIV or other sexually transmitted or blood-borne infections depends on our ability to measure and summarize the transmission network.

One of the greatest obstacles to using network analysis in this context is the complete data requirement. In the context of the sexual and needle-sharing networks that matter for HIV, such data collection is impossible. The question, then,
is whether we can explain the overall network structure with a small number of partnership formation rules that operate at the local individual level. If this is true, it will radically simplify data collection needs, and give network analysis a central place in the tools of sexually transmitted infection (STI) research and intervention.

There is some reason to think that such local rules may govern the network structures of interest. At the population level, the findings from recent mathematical modeling suggest that two general types of structures are important: selective mixing patterns among different groups (Hethcote and Van Ark 1987; Sattenspiel 1987; Garnett and Anderson 1993; Morris 1996), and the timing and sequence of partnership formation (Watts and May 1992; Morris and Kretzschmar 1997). These general insights come from different types of mathematical simulations, including both simple deterministic differential equations (sometimes called compartmental models) and stochastic microsimulations of a population of interacting individuals (sometimes called "agent based models").

Mixing refers to assortative and disassortative biases in the joint distribution of partners' attributes. Examples include the degree of matching on race, age, sex, social status, sexual orientation or level of sexual activity. Assortative biases can create patchy, clustered networks, which tends to increase the speed of spread within groups and slow the spread between them. If the groups also vary in activity level, the resulting distribution of infection can be very uneven. In the extreme, prevalence may rise to high endemic levels in some groups, while other groups remain infection free. Disassortative biases typically have the opposite effect, ensuring rapid spread to all groups. For example, preferential linking between highly active persons and less active persons ensures that the latter are reached more quickly in the epidemic. An example of this is the traditional double standard, where men can have multiple partners, but women only one.

The timing and sequence of partnership formation refers to the pattern of start and end dates for partnerships over an individual's lifetime. If the partnership intervals defined by these dates are strictly sequential, the pattern is called serial monogamy. If the intervals overlap, the pattern is called concurrency. Serial monogamy retards the spread of infection in two primary ways: it locks the pathogen in a partnership for some time after transmission, and it ensures that only later partners are at indirect risk from earlier ones. With concurrency, by contrast, one's partner can have other partners who in turn have other partners, and so on. Instead of a monogamous population of dyads and isolates, concurrency creates a potentially large connected component for rapid pervasive spread. In addition, because earlier partners need not be dropped as later partners are added, the infection can be spread in both directions: from earlier to later partners, and from later to earlier ones.

Almost all of our understanding of the effects of these network patterns on HIV spread has been based on simulation. And with few exceptions, the simulations have created network effects indirectly, by varying parameters of some convenient mathematical function to produce a change in the simulated networks. Network patterns are thus outcomes of the model, rather than inputs.
This strategy has been enormously valuable for orienting research, and it laid the groundwork for future progress. But it has also limited our ability to place this work on a firm empirical footing, and to quantify the risk in any specific network.

Linking network data to network simulation requires a statistical bridge: a modeling framework that enables the key structural parameters to be estimated from network data, so that these can be used to directly drive a simulation. ERGMs have the potential to do this. The simplest network statistic is the total number of links, represented as $\sum_{i=1}^{n} \sum_{j<i} x_{ij}$, which provides information on the density of the network: in this context, the level of partnership activity in the population. The level of concurrency is represented by statistics for the "nodal degree distribution", the number of nodes with 0, 1, 2, or more partners (and one could test either parametric or non-parametric forms of these distributions). A simple selective mixing on a discrete characteristic (e.g., race) would be represented by the count of dyads in which both partners have the same attribute. One can also parameterize cycles of various sizes (e.g., triangles, 4-cycles, etc.), and include parameters that make the temporal dependence explicit. This enables one to model how the network evolves over time. These parameters and others can be fit simultaneously, which provides a uniform metric for establishing their relative strength, and examining their correlation. What this statistical model provides, then, is a systematic method for summarizing the key structural features of an empirical network, with a framework for comparison and testing.

While estimation techniques are often of little interest to non-statisticians, this case is an exception. The constant in equation (1) requires a calculation that makes simple maximum likelihood estimation impossible for graphs larger than about 20 nodes. To avoid this, early applications of these models used an approximation called maximum pseudolikelihood estimation (MPLE). The problem was that the estimates produced by MPLE were of unknown quality. In the last few years, researchers have turned instead to computationally intensive Markov chain Monte Carlo (MCMC) estimation methods, which allow the true likelihood function to be maximized. In addition to providing more accurate estimates, this approach is particularly interesting for our purposes because the MCMC method effectively simulates the network in order to maximize the likelihood. We can, however, just as easily use the MCMC algorithm to simulate the network given the parameter estimates, and this provides the ideal solution to the problem of linking network data to the network simulation. One can estimate the network parameters from data, and then use the same model, with the empirically based parameter estimates, to drive a simulation of the network with an infection spreading through it. The MCMC algorithm provides the engine for both tasks. This makes it possible for the first time to directly control the network structures in a simulation so that they "look like" the networks we observe in different data sets.

A Model-Based Hypothesis

We are now in a position to test a different kind of hypothesis about the role of networks in disease spread. The research of the previous decade has suggested that attribute-based mixing and levels of concurrency have a large impact on network structure and transmission dynamics.
Now we can ask whether these are the only features that matter. On the one hand, there is a good theoretical basis for this hypothesis. For the type of partnerships that spread STI it seems reasonable to presume that people make decisions about which partners to choose based on preferences and norms that operate at the local level. That is, we choose partners because they are the right sex, age, race and status, and we often care if they have other partners. It is unlikely that people form partnerships thinking "If I choose this partner I can shorten the number of steps between me and a randomly chosen person on the west coast" or "I'd like to complete as many 5-cycles as possible with a single partnership". This may seem obvious at one level, but the implications may not be as obvious, and they are quite striking. If simple local rules govern partner selection, then these also determine the aggregate structure in the network: what looks like an unfathomably complicated system is, in fact, produced by a few key local organizing principles. By extension, these simple local rules are also, therefore, the key behavioral determinants of disease transmission dynamics on the network.

There are also important practical implications of this hypothesis. Both mixing and concurrency are network properties that can be measured with local network sampling strategies. If it turns out that these two local rules explain most of the variation in network structure that is relevant to disease spread, then we have a simple, inexpensive way to measure network vulnerability routinely in public health surveillance. And they describe simple behavioral rules that people can be taught to recognize and change.

So the last piece of the puzzle is to determine how to test the hypothesis that mixing and concurrency contain all of the information we need to know to establish the spread potential in a network. We seek a "goodness of fit" test, but one that is tailored to what we want to fit. Transmission potential in a network is determined by the network connectivity. Connectivity can be measured in a number of ways, but two simple measures are the properties of reachability (is person i connected to person j by a path of some length?) and distance (what is the length of that path?). Reachability and distance represent epidemiologically relevant "higher order" network properties that simple models should be able to reproduce. We can develop fit statistics based on these higher order network statistics to test the models that include parameters only for mixing and concurrency. This makes it possible to test whether the local organizing features represented in the model reproduce the larger structural features of the network. We can therefore develop formal tests for the hypothesis that mixing and concurrency capture the epidemiologically relevant variation in network structure. We can also use the MCMC simulation engine to verify whether these features are sufficient for establishing the epidemic potential in a network.

Conclusion

The new tools for network modeling provide us with the ability to empirically test a question that has both important practical implications and deep theoretical roots. The generalizable findings from this work will not necessarily be monolithic.
There is no reason to think that attribute mixing and concurrency are the local rules that drive all network structure. What is generalizable, though, is that the models for networks should be rooted in the scientific context that they seek to explain. For this, one does not need a one-size-fits-all approach; one needs a flexible class of models that can be tested against data in principled ways. ERGMs provide the basis for this kind of empirically based network analysis, and should become widely used in the years to come.
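The reachability and distance statistics proposed earlier as the basis for a goodness-of-fit test can be computed with a breadth-first search. The sketch below is illustrative only: the function names are hypothetical, and the simple tail-fraction comparison between an observed value and the values from simulated networks is one possible way to summarize fit, not a procedure specified in this paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path length from source to every person reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def connectivity_summary(adj):
    """Return (fraction of pairs that are reachable, mean distance among them).

    The mean distance is None when no pair is connected.
    """
    n = len(adj)
    reachable_pairs = 0
    total_distance = 0
    for i in range(n):
        for j, d in bfs_distances(adj, i).items():
            if j > i:                       # count each unordered pair once
                reachable_pairs += 1
                total_distance += d
    all_pairs = n * (n - 1) // 2
    fraction = reachable_pairs / all_pairs if all_pairs else 0.0
    mean = total_distance / reachable_pairs if reachable_pairs else None
    return fraction, mean


def tail_fraction(observed, simulated):
    """Share of simulated values that are at least as large as the observed one."""
    return sum(1 for s in simulated if s >= observed) / len(simulated)
```

Under these assumptions, one would compute `connectivity_summary` for the observed network and for many networks simulated from a model containing only mixing and concurrency terms; an observed value far in the tail of the simulated values would suggest that the local rules alone do not reproduce the network's connectivity.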



In the summer of 2002, the Office of Naval Research asked the Committee on Human Factors to hold a workshop on dynamic social network modeling and analysis. The primary purpose of the workshop was to bring together scientists who represent a diversity of views and approaches to share their insights, commentary, and critiques on the developing body of social network analysis research and application. The secondary purpose was to provide sound models and applications for current problems of national importance, with a particular focus on national security. This workshop is one of several activities undertaken by the National Research Council that bear on the contributions of various scientific disciplines to understanding and defending against terrorism. The presentations were grouped in four sessions – Social Network Theory Perspectives, Dynamic Social Networks, Metrics and Models, and Networked Worlds – each of which concluded with a discussant-led roundtable discussion among the presenters and workshop attendees on the themes and issues raised in the session.
