National Academies Press: OpenBook
« Previous: Statistical Models for Social Networks: Inference and Degeneracy
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 241
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 242
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 243
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 244
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 245
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 246
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 247
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 248
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 249
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 250
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 251
Suggested Citation:"The Key Player Problem." National Research Council. 2003. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers. Washington, DC: The National Academies Press. doi: 10.17226/10735.
×
Page 252

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

The Key Player Problem' Stephen P. Borgatti2 Introduction The key player problem (KPP) consists of two separate sub-problems, which can be described at a general level as follows: (KPP-1 ) Given a social network, find a set of k nodes (called a l~p-set of order k) which, if removed, would maximally disrupt communication among the remaining nodes. 2. (KPP-2) Given a social network, find a kp-set of order k that is maximally connected to all other nodes. Of course, these introductory definitions leave out what is meant precisely by "maximally disrupt communication" and "maximally connected?'. Part of the process of solving these problems is providing definitions of these concepts that lead to feasible solutions and useful outcomes. However, it would seen clear that KPP-1 involves fragmenting, a network into components, or, failing that, making distances between nodes so large as to be practically disconnected. In contrast, KPP-2 involves finding, nodes that can reach as many remaining nodes as possible via direct links or perhaps short paths. The first problem, KPP-1, arises in a number of contexts. A prime example in the public health context is the immunization/quarantine problem. Given an infectious disease that is transmitted from person to person, and given that it is not feasible to immunize and/or quarantine an entire population, which subset of members should be immunized/quarantined so as to maximally hinder the spread of the infection? An example in the military context is target selection. Given a network of terrorists who must coordinate in order to mount effective attaches, and given that only a small number can be intervened with (e.~., by arresting or discrediting), which ones should be chosen in order to maximally disrupt the network? The second problem, KPP-2, arises in the public health context when a health agency needs to select a small set of population members to use as seeds for the diffusion of practices or attitudes that promote health, such as using bleach to clean needles. In the organizational management context, the problem occurs when management wants to implement a change initiative and needs to get a small set of informal leaders on-board first, perhaps by running a weekend intervention with them. In the military context, KPP-2 translates to locating an efficient set of enemies to surveil, turn (into double-agents), or feed . . . mlslntormatlon to. At first glance, both KPP-1 and KPP-2 would appear to be easily solved by either employin;, some graph- theoretic concepts such as cutpoints and cutlets, or via the methods of social network analysis, such as measuring, node centrality. It turns out, however, that none of the existing, methods are adequate This paper explains why and presents a new approach specifically designed for the key player problem. ~ Acknowledgements: This research is supported by Office of Naval Research grant number N0001402 1 1032. Thanks to Scott Clair for leading note to this problem, Mark Newman for suggesting reciprocal distances, Kathleen Carley for useful discussions. and Valdis Krebs for providing illustrative data. ~ Dept. of Organization Studies. Carroll School of Management. Boston College, Chestnut Hill, MA 02467 DYNAMIC SOCIAL NETWORK MODEM ED ISIS 241

Centrality Approach The centrality approach consists of measuring the centrality of each node in the network. then selecting the k most central nodes to comprise the kp-set. Since many measures of cen ;rality exist, one question that arises is which measure to use. For KPP-1, we can expect the best measures to be those based on betweenness. For example, Freeman's betweenness measure sums the proportion of shortest paths from one node to another that pass through a given node. Thus, a node with high betweenness is responsible for connecting many pairs of nodes via the best path, and deleting that node should cause many pairs of nodes to be more distantly (if not completely disconnected). For KPP-2, we can expect measures based on degree centrality and closeness centrality to be useful. Degree centrality is simply the number of nodes that a given node is adjacent to. Hence, depending on what the social relation represented by the graph is and assuming that adjacency implies potential for influence, a node with high degree can potentially directly influence many other nodes. Closeness centrality is defined as the sum of geodesic distances from a given node to all others, where geodesic distance refers to the length of the shortest path between two points Thus, a node with a low closeness score (very central) should be able to influence, directly and indirectly, many others. The centrality measures are plausible solutions for KPP. However, they are not optimal. There are two basic problems, which I refer to as the design issue and the group selection issue. Of the two, the group selection issue is the more senous. The Design Issue The design issue arises ultimately from the fact that centrality measures were not desi;,ned with KPP-1 and KPP-2 specifically in mind. Starting with KPP-1, consider the graph in Figure 1. ^10 7~~9 43/- :_6 \/\ ~ ~4_ y5 Figure 1. Node 1 has the highest centrality on all considered measures, including betweenness centrality. Yet deleting node 1 has relatively little effect on the network. Distances between certain pairs of nodes do increase, but it is clear that communication amen;, all points remains possible as there is no fragmentation. In contrast, deleting node 8. which does not have the highest betweenness is more effective. Removing 8 splits the graph into five unconnected fra ,ments (components). 242 DYNAMIC SOCL4L fJETWORKMODE~G ED ISIS

For KPP-2, the picture is a little brighter. If we formulate KPP-2 tin terms of reaching the most nodes directly, degree centrality is optimal. If we formulate it in terms of reaching the most nodes in up to in steps, men we can reamly define a new measure of centrality ("m-reach centralitY'.) that counts the , . ~ . . . ~ number of nodes within distance m of a given node. The Group Selection Issue The group selection issue, discussed as the group centrality problem in Everett and Borgatti (1999), refers to the fact that selecting a set of nodes which, as an ensemble, solves KPP-1 or KPP-2, is quite different from selecting, an equal number of nodes that individually are optimal solutions for KPP. To start with, consider KPP-1. Figure 2 shows a graph in which nodes h and i are, individually, the best nodes to delete in order to fragment the network. Yet deleting i in addition to h yields no more fragmentation than deleting z alone. ~ contrast, deleting m with h does produce increased fragmentation, even though individually m is not nearly as effective as i. The reason i and h are not as good together as i and in is that i and h are redundant with respect to their liaising role- they connect the same third parties to each other. In a sense, the centrality of one is due to the centrality of the other, with the result being that the centrality of the ensemble "control" is quite a bit less than the sum of the centralities of each. me ·r mom it Figure 2 ~0 ~q ,,Es The redundancy principle also applies to KPP-2. Consider the Graph in Figure 3. Nodes a and h arc . , . . ~ .. . - ~ ~ - ~r- ~~~~~~ ~~~~ I ~~ _ I—~1—L, me, Individually the best connected. Each reaches five other nodes, more than any other by far. But together they reach no more than either does alone. In contrast. if b is Haired with c (which inclivirl,'~liv rP.~rh`~c ./ ~ only three nodes), the set reaches every node in the network. ~c' _~ Figure 3. DYNAMIC SOCIAL NETWORK MODEL~G~D ISIS 243

Graph Theoretic Approaches In addition to basic concepts such as components and distance, a number of graph-theoretic concepts are relevant to KPP. For KPP-1, the most obvious are the notions of cutpoint and bridge, which are nodes and lines respectively whose deletion would increase the number of components in the graph. These concepts do not address the group-selection issue. However, both have set-level counterparts in the form of cutlets. A outset is a set of nodes (or lines) whose removal would increase the number of components in the graph. Most work has focused on minimum weight cutlets, which are smallest sets that have the outset property. There are three difficulties with outsets in the KPP context. First, we cannot specify the number of nodes in the set arid then seek the set of that size that does the best job (rather, the measure of success is fixed and we are able to find a smallest set that achieves that level of success). In this sense, the graph theoretic approaches solve the inverse of the problem we seek to solve. Second, no account is taken of distances among nodes. Third, all solutions that increase the number of components are equal, even If one solution creates just two components while another creates ten. For KPP-2, applicable concepts include vertex covers and dominating sets. A vertex cover is a set of nodes whose members are incident upon every edge in the graph. A dominating set is a set of nodes whose members are adjacent to all other nodes in the graph.3 For our purposes these are equivalent and fail for exactly the same reasons as cutlets. Measuring Success In order to optimally solve KPP, we must have a clear definition of success. I propose that for KPP-1 there are two network properties we want to maximize by removing the kp-set: fragmentation and distance. For KPP-2, we want to measure the distance-based reach of the kp-set into the network around it. Therefore, we need measures for each of these concepts. Fragmentation - Perhaps the most obvious measure of network fragmentation is a count of the number of components. If the count is 1, there is no fragmentation. The maximum fragmentation occurs when every node is an isolate, creatin;, as many components as nodes. For convenience, we normalize the count by dividing by the number of nodes. K Eq. 1 The problem with this measure is that it doesn't take into account the sizes of the components. For example, in Figure 2, deleting node m would break the network into two components, but the vast majority of nodes remain connected. In contrast, deleting node i would also result in just two components, but more pairs of nodes would be disconnected from each other. This suggests another measure fragmentation that simply counts the number of pairs of nodes that are disconnected from each other. Given a matrix R in which r = 1 if; con reach i ~n'1 a.. - n r`th`~r`xricr~ ``rm can define the new measure as follows: - ~~ ~ ~~ ~ ~~ — ~ ~~ ~_~ or ~~~ rev ~ 3 Graph theorists differ on whether these sets are understood to be niinimal or not. 244 DYNAMIC SOCIAL?JETWORKHODEL~G~D~YSIS

F = 1 - - 2~, ~ rat . j<i n(n - 1) Eq. 2. Since, by definition, nodes within a component are mutually reachable, the F measure can be rewritten more economically in terms of the sizes (sit) of each component (indexed by k): ~ Sk Use —I) F=~_ k non -by Eq. 3. The F measure is remarkably similar to a diversity measure known variously as heterogeneity, the concentration ratio. the Hirschman-Herfindahl index, or the Herf~ndahl index. Applied to the current context it is defined as follows: H=1—~:—~ k (nJ Eq. 4. One difference between F and H is that while both achieve minimum values of 0 when the network consists of a single component, when the network is maximally fragment (all isolates) the H measure can only achieve 1-1/n. If we normalize Hby dividing by 1-1/n, we obtain the F measure (and seeing F as a normalization of ~ points us to a slightly faster computing formula). An alternative approach is information entropy. Applied to this context, the measure is defined as E=-~, k in(Si.) k n n Eq. 5. c~ _ _ The measure is bounded from below at zero, but is unbounded from above. We can bound it by dividing it by its value when all nodes are isolates: z-k' In(Sk ) ,~ n n E= Ion sit k M , Eq. 6. Distance While the fragmentation measure F is very satisfactory for what it does, it does not take into account the shape - the internal structure - of components. A network that is divided into two components of size 5 in which each component is a clique (Figure 4a) is seen as equally fragmented as a network divided into two DYNAMIC SOCIAL N~TWORKHOD~:~G ED ^¢YSIS 245

components of size 5 in which each component is a line (Figure 4b). Yet distances and therefore transmission times are much higher in the latter network. Further, if we can delete only so many nodes, it may be that there is no possible selection of nodes whose removal would disconnect the graph. In such cases, we would still like some way of evaluating which sets are better than others. An obvious solution would be to measure the total distance between all pairs of nodes in the network. However, this only works in the case where the graph remains connected. Otherwise, we must sum infinite distances. A practical alternative is to base the measure on sum the reciprocals of distances, observing the convention that the reciprocal of inanity is zero, In that case we can create a version of F. based on Equation 2, that weights by reciprocal distance. ~ 1 2, Day = 1 _ i>j dij n(n - 1) Eq. 7. The OF measure is identical to F when all components are complete (i.e. each component is also a clique). However, when distances within components are greater than 1, the measure captures the relative cohesion of the components. For example, the graph in Figure 4a has two components of size 5 and the OF measure is 0.556. The graph in Figure 4b, which is less cohesive, also has two components of size S. but: the OF measure is 0.715, indicating much less cohesion . Like the F measure, vF achieves its maximum value of 1.0 when the graph consists entirely of isolates. ·5 ~24 ~9 ~10 7 Figure 4a. OF = 0.556 246 ,^4~ —'~2 i/ \ ~9 it, ~ ,~~ ~ 1 /r - '''. `6 ;10 Figure 4b. Do = 0.715 DYNAMIC SOCKS N~TWO~MODFL~G ED ISIS

Distance-Based Reach The measures discussed for KPP-1 are graph-level measures that we apply to a kp-set by removing the set and measuring the fragmentation in the remaining graph. For KPP-2, we seek a set of nodes that, as a group' is maximally connected to all other nodes. Hence, we need a direct measure of the centrality of the kp-set as a group. The concept of group centrality has already been elaborated by Everett and Boraatti (1999), but only degree, closeness, betweenness and eigenvector croup centrality measures have been discussed. As noted earlier, these measures are not optimal for the KPP-2 problem. Hence, we must develop new ones based on the concept of reach. The simplest group reach measure, termed m-reach, is a count of the number of unique nodes reached by any member of the kp-set in m links or less. The advantage of this measure is its face validity. The disadvantage is that it assumes that all paths of length m or less are equally important (when in fact a path of length 1 is likely to be more important than a Dath of length 21 and that ~11 rhythm lons,~r then `= Arc wholly irrelevant. c , ~~ ^ ~ r ~ ^ _ ^ ,~_A ~~ ~ ~ _ A more sensitive measure, called distance-weighted reach, can be defined as the sum of the reciprocals of distances from the kp-set S to all nodes (see Equation 8~. As described by Everett and Bor~atti (1999), the distance from a set to a node outside the set can be usefully defined in a number of ways, such as taking the maximum distance from any member of the set to the outside node, taking, the average distance, or taking the minimum distance. For KPP-2, the minimum distance is appropriate since the fact that the distance to an outside node might be large for a given member of the set will usually be irrelevant. ~ 1 D j do ~ = - n Eq. 8. In the equation, the distance from kp-set S to node j is indicated by do In addition, it should be noted that the summation is across all nodes and the distance from the set to a node within the set is defined to be 1. As before the reciprocal of an infinite distance is defined to be 0. Taking some interpretive license, we can view DR as the proportion of all nodes reached by the set, where nodes are weighted by their distance from the set and only nodes at distance 1 are given full weight. Hence, DR achieves a maximum value of 1 when every outside node is adjacent to at least one member of the kp-set (i.e., the kp-set is a dominating set). The minimum value of O is achieved when every node is an isolate. Selecting a KP-Set For KP-sets of size 1, the measures presented above can used straightforwardly to select key players by simply choosing, the one with the highest score on any given measure. Thus, they can be regarded n~ new measures of node centrality that are optimized for the key player problem. - D~ For kp-sets of size k ~ 1, however, there is no simple procedure for choosing an optimal set. Some heuristic procedures may be of value. For example, for KPP-2, we start by selecting the node with the highest OR score. Then, for each of the remaining k-1 selections, we choose the node with the highest score that is not adjacent to any of the nodes already selected. This algorithm, a variant of the first fit decreasing, algorithm for the bin-packing problem' is fast and easy, but often yields solutions that are considerably less than optimal. DYNAMIC SOCIAL N~TWORKMODE~G ED ISIS 247

Other heuristic approaches specific to the KPP can be constructed. but the fact that we have clear measures of success that are easily computed recommends a generic combinatorial optimization algorithm, such as tabu-search (Grover, 1986), K-L (Kernighan and Lin, 1970), simulated annealing (Metropolis et al, 1953) or genetic algorithms (Holland, 19751. Initial experiments suggest that all of these do an excellent job on KPP. and so I present only a simple greedy algorithm. Figure 5 outlines the method, which is normally repeated using dozens of random starting sets. 1. Select k nodes at random to populate set S 9. Set F = fit using appropriate key player metric 3. For each node u in S and each node v not in S a. DELTAF = improvement in fit if u and v were swapped 4. Select pair with largest DELTAF a. If DELTAF <= then terminate b. Else, swap pair with greatest improvement in fit and set F = F + DELTAF 5. Go tostep3 Figure 5. Greedy optimization algorithm. Empirical Trials The operation of the algorithm is illustrated using two datasets drawn from the public health (AIDS) and military (terrorism) contexts. Both cases are approached from both KPP-1 and KPP-2 points of view. '277 \1~ bt ]37_~i 136 ~ ~j152 `477~1~87 , 10~32 ~ 235 ~61 &~` ~1341~14 102 ~ ¢16 \4 &~.1~1~ \\\ 149 ~~,~ ~.~^ a - 273 W ,9 71~ 226 \ 2001~178 \52 i1~ ~1 ~ ~ ~~ ~~ ·~ i1~ `= ¢4 )1 93 '~ 120 `140 r:157 +1636~2 ¢= A.' Figure 6. Acquaintance network. Upward cnangles indicate African-Americans, downward triangles indicate Puerto-Ricans, and squares identify all others. 248 DYNAMIC SOCIAL N~TWO~ MOD~G ED ISIS

AIDS Example The AIDS dataset consists of an acquaintance network among 293 drug injectors on the streets of Hartford, CT. The data are described in Weeks et al (2002~. The network consists of one large main component (193 nodes), and many very small components. As shown in Figure 6, the main component of the network has a very clear structure. It consists of two groups, one african-american (with higher HIVE proportion), and the other largely puerto-rican (with lower HIV+ proportion). Connection between the two groups is limited by just a few acquaintances and this bottleneck helps maintain the lower HIVE rate in the Puerto-Rican part of the network. Whether through immunization (should that become possible) or quarantining, it is clear that one thing we would want to do early on is to isolate the two groups from one another, both because we want to maintain the low HIVE rates in Puerto-Rican cluster, and because risk is a function of the size of the component a node is part of. Thus, we have a case of KPP-1. The network provides a useful first test of the key player optimization algorithm for two reasons. First, the structure of the network makes it quite vulnerable to disconnection, and easy to check the results visually. If the algorithm fails this test, it is clearly not adequate. Second, the network is already fragmented. providing noise that could confuse some algorithms. Based on visual inspection, it is clear that immunizing or quarantining just two nodes would separate the main component into two nearly equal halves. So for the first run of the algorithm we seek a kp-set of size 2. The starting fragmentation index for the graph is 0.567. The algorithm selected two nodes, identified in black in Figure 6, which, if deleted, would increase fragmentation to 0.658 (a proportional increase in fra ,mentation of 16%~. The selected nodes match our intuition and divide the main component in two big chunks. Turning to KPP-2, we are also interested in selecting a small group of nodes to be subjects of an intervention - specifically, to be trained as peer educators (known as Peer Health Advocates or PHAs) to disseminate and demonstrate [IIV prevention practices. Weeks et al (2002) did this by hand, laboriously selecting the smallest group that could reach more than 50% of the main component of the network. The winning set contained 14 nodes. Running Key Player on the same data, yields the same result, as shown in Table 1. Group Number Percent Size Reached Reached 16 27 2 3 4 5 6 8 9 10 11 12 87 13 92 14 97 Table 1. 36 43 49 55 61 67 72 77 82 8.3 14.0 18.7 22.3 25.4 28.5 31.6 34.7 37.3 39.9 42.5 45.1 47.7 50.3 DYNAMIC SOCIAL NETWORK MODEII?JG AND ANALYSIS 249

250 As might be expected, the number of people reached increases as a fractional power of the group size, but fuller consideration of this phenomenon is outside the scope of this paper. Terrorist Example The terrorist dataset, compiled by Krebs (20011. consists of a presumed acquaintance network among 74 suspected terrorists. For the purposes of this analysis, only the main component is used, consisting of 63 individuals. t d~- , r:/l i_ \ Figure 7. Terrorist network compiled by Krebs (20021. The first question we ask is which persons should be isolated in order to maximally disrupt the network. Let us assume that we can only isolate three people. A run of the KeyPlayer program selects the three red nodes identified in red in Figure 7 (nodes A, B and C). Removing these nodes yields a fragmentation measure of 0.59, and breaks the graph into 7 components (including two large ones comprising the left and right halves of the graph). The second question we ask is, Even that we would like to diffuse certain information, which nodes would we want to be exposed to the information so as to potentially reach all other nodes quickly and with certainty? Let us assume that information that travels more than two links tends to degrade or be viewed with suspicion. Hence we want the smallest set of nodes that can reach all others within two links or less. The KeyPlayer algorithm finds that a set of three nodes (the square nodes in Figure 7, labeled A, C and D) reaches 100% of the network (including themselves). DYNAMIC SOCIAL NETWO~MODELING AND ANALYSIS

Discussion In this paper we have defined the KeyP]ayer problem and demonstrated why the naive centrality-based approach and more sophisticated graph-theoretic approaches fail to solve the problem. We have introduced new metrics for measuring success. and implemented a combinatorial optimization algorithm to maximize these quantities. Applications in both health and military areas were demonstrated. The metrics for measuring success in the KPP-1 problem are essentially measures of graph cohesion that may be useful descriptively in a number of contexts besides the key player problem. Typical applications might be the comparison of similar organizations, or using cohesion as a predictor of group performance. The fact that KPP-1 and KPP-2 have different solutions is interesting. For many, centrality is either a unitary concept with many highly correlated measures, or a fully multidimensional concept in which each measure indicates a different kind of centrality. I would suggest an intermediate view that divides notions of node] importance into two basic types, corresponding to the KPP-1 and KPP-2 problems. KPP-1 measures quantify the extent to which a graph's cohesion is reduced by the removal of that node, while KPP-2 measures quantify the extent to which a node's ties reach into the network. It is the second type that is most directly about centrality: a node that is well-connected. The first type is not as much about the node's well-connectedness as about the connectedness of the rest of the graph in the absence of the node. From the ,raph's point of view, there is a loose analogy to the distinction in operant conditionin, between positive reinforcement (KPP-2) and negative reinforcement (KPP-1~: positive reinforcement provides benefit (connectivity) by directly providing, a boon (the node s connections), while negative reinforcement provides benefit by relieving a stressor (the node holds together the otherwise Garmented graph). From the node's point of view, a node achieves its highest values on KPP-2 measures when the graph is highly cohesive. In contrast, high values on KPP-1 measures will nonnally occur only when the graph is not very cohesive. Actors high on KPP-2 measures lend themselves to maximizing utilization of resources flowing through the network, while actors high on KPP-1 measures lend themselves to maximizing the benefits of brokerage, gatekeepin~ and playing actors off each other. Limitations and Next Steps There are si ,nificant dimensions to the problem that I have ignored. Perhaps the most important is the issue of data quality. If this approach is to yield a practical tool, we cannot simply assume perfect data. Rawer, the method should be robust in the face of errors in the data. Two approaches will be explored. First, there is the notion of not optimizing too closely to the observed dataset. If the data are known to vary from the crush by a given magnitude (e.g., 10% of observed ties don't actually exist and 10% of observed non-adjacent pairs are in fact adjacent), then we can randomly vary the data by this magnitude and optimize across a set of "adjacent.' datasets obtained in this way. The result is a kp-set that is not necessarily optimal for the observed dataset, but will represent a high-quality solution for the neighborhood of the graph as a whole. An alternative approach is to treat knowledge of ties as probabilistic, modifying the KeyPIayer metrics accordingly. For example, suppose we knew, for each pair of nodes, the independent likelihood that a tie exists between them. Then, in principle, we could work out the expected distance (including infinity) between the nodes across all possible networks.4 KPP measures based on distance and reachability could then be computed substituting expected distance for observed distance. The practical challenge here Is to find shortcut formulas for expected distance and connectedness that enable fast computation. Note that the problem being addressed is certainty of observed data values, not the existence of ties. It is assured in this approach that ties are fixed and not probabilistically emerging as a function of node attributes or other ties. The dynamic nature of ides is a different phenomenon that wants its own models. DYNAMIC SOCIAL NETWORK AJODF~G ED ISIS 251

Another issue concerns the use of geodesic distance at all. As discussed by Borgatti (2002), different kinds of flows processes have different kinds of characteristic trajectories. For example, infections that immunize (or kill) the host don't return to nodes they have previously infected. Hence, they travel aloe;, graph theoretic paths. In contrast, gossip transmitted via email can easily reach the same node multiple times, but in general not from the same sources (i.e., a person doesn't usually tell the same confidential story to the same person more than once). Hence. stories Ravel aloe, trails. Neither infections nor stories necessarily travel via the shortest paths to each node. Consequently, for those processes, the expected distance travelled by somethin;, flowing through the network is not equal to geodesic distance, and it would make sense to substitute this other expected distance in the equations. For simplicity of exposition, this paper has assumed undirected simple graphs with non-valued edges. The extension to directed graphs is straightforward, but the extension to valued edges will require some development. An exception is the class of fragmentation measures, which generalize nicely alone, the lines of minimum weight cutlets, losing only the computational shortcuts available with non-valued data. The distance-based metrics will require different generalizations depending on the kinds of values, which could range from distances to capacities to probabilities of transmission. Multiple relations, recorded as separate graphs with a common node set. might be handled hv summing the ~llr.~.c~ an ten an Ferric o11 relations. 7 _-;_ i_ ^~ A__- ~} Vow ~~ ~~—~~ call—JIVll ~~ all Finally, it may be of interest in the future to incorporate actor attributes. In the military context, com- mun~cation among, actors with redundant skills may sometimes be less important than communication between actors with complementary skills. In the public health context, it is helpful in slowing epidemics to minimize mixing of different populations (such as when married women are linked to commercial sex workers via their husbands). Hence, an additional criteria we would want to consider in fr~c~m~ntincr network is maximizing separation of actors with certain attributes. References _,c, ~ . _ . . - , id,= ~ Bor~atti, S.P. 2002. Stopping terrorist networks: Can social network analysis really contribute? Sunbelt Inten~ationaI Social Networks Conference. ] 3-17 February. New Orleans. Everett, M. G., & Borgatti, S. P. 1999. The centrality of groups and classes. Journal of Mathematical Sociology. 23~31: 181-201 Glover F., 1986, Future paths for integer programming and links to artificial intelligence, Computers and Operations Research, 5: 533-549. Holland, J. 1975. Adaptation in Natural and Artificial Systems. University of Michigan Press. Kernighan, B.W., and Lin, S. 1970. Efficient heuristic procedure for partitioning graphs. Bell Systems Technical Journal. 49~21: 291-297. Krebs, V. 2002. Uncloaking teITonst networks. First Monday 7~41: April. http://www.Elrstmonday.dk/issues/issue7 4/krebs/index.html Metropolis,N., A. Rosenbluth, M. Rosenbluth, A. Teller, E. Teller. 1953. Equation of state calculations by fast computing machines". J. Chem. Phys.,21, 6, 1087-1092. Weeks, M.R., Clair, S., Borgatti, S.P., Radda, K.. and Schensul, J.J. 2002. Social networks of drug users In high risk sites: Finding the connections. AIDS arid Behavior 6~2~: 193-206. 252 DYNAMIC SOCIAL N~TWO~ MODELING ED ANALYSIS

Next: Balancing Efficiency and Vulnerability in Social Networks »
Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers Get This Book
×
Buy Paperback | $73.00
MyNAP members save 10% online.
Login or Register to save!
Download Free PDF

In the summer of 2002, the Office of Naval Research asked the Committee on Human Factors to hold a workshop on dynamic social network and analysis. The primary purpose of the workshop was to bring together scientists who represent a diversity of views and approaches to share their insights, commentary, and critiques on the developing body of social network analysis research and application. The secondary purpose was to provide sound models and applications for current problems of national importance, with a particular focus on national security. This workshop is one of several activities undertaken by the National Research Council that bears on the contributions of various scientific disciplines to understanding and defending against terrorism. The presentations were grouped in four sessions – Social Network Theory Perspectives, Dynamic Social Networks, Metrics and Models, and Networked Worlds – each of which concluded with a discussant-led roundtable discussion among the presenters and workshop attendees on the themes and issues raised in the session.

  1. ×

    Welcome to OpenBook!

    You're looking at OpenBook, NAP.edu's online reading room since 1999. Based on feedback from you, our users, we've made some improvements that make it easier than ever to read thousands of publications on our website.

    Do you want to take a quick tour of the OpenBook's features?

    No Thanks Take a Tour »
  2. ×

    Show this book's table of contents, where you can jump to any chapter by name.

    « Back Next »
  3. ×

    ...or use these buttons to go back to the previous chapter or skip to the next one.

    « Back Next »
  4. ×

    Jump up to the previous page or down to the next one. Also, you can type in a page number and press Enter to go directly to that page in the book.

    « Back Next »
  5. ×

    To search the entire text of this book, type in your search term here and press Enter.

    « Back Next »
  6. ×

    Share a link to this book page on your preferred social network or via email.

    « Back Next »
  7. ×

    View our suggested citation for this chapter.

    « Back Next »
  8. ×

    Ready to take your reading offline? Click here to buy this book in print or download it as a free PDF, if available.

    « Back Next »
Stay Connected!