Stability and Degeneracy of Network Models--Mark S. Handcock, University of Washington
Pages 343-374

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 343...
... For instance, the social networks community has done a massive amount of work as you saw last night in the after-dinner talk. Two key references would be Frank (1972)
From page 344...
... What we are probably interested in first would be the nature of the relationships themselves, questions such as how the behavior of individuals depends on their location in a social network, or how the qualities of the individuals influence the social structure. Then we might be interested in how network structure influences processes that develop over a network.
From page 345...
... FIGURE 1 Hence, we can think of Y, the g-by-g matrix of relationships among the actors, as the sociomatrix. In a very simple sense, we want to write down a stochastic model for the joint distribution of Y, which is this large, multivariate distribution, often discrete, but it could be continuous also.
From page 346...
... How a particular dyad probably would tie in between a particular pair of actors is influenced by the surrounding ties. By specifying the local properties, as long as they are done in this way, you get the global joint distribution, going from the local specification to give you a global one.
From page 347...
... The other thing you can do here is further parameterize the degree distribution, which essentially places nonlinear parametric constraints on the α parameters here. As most statisticians know, this moves it technically from just a straight linear exponential family to a curved exponential family.
From page 348...
... FIGURE 4 We could then add in -- Martina did this yesterday, and I'll show this again briefly -- covariates that occur in this linear fashion, attributes of the nodes, attributes of the dyads. We can add some additional clustering component to a degree form in this way.
From page 349...
... FIGURE 6 The network on the left side of Figure 6 is one with zero clustering, and on the right we see what happens when the mean clustering coefficient is pushed up to 15 percent with the same degree distribution. The basic notion is that these models give you a way of incorporating known clustering coefficients in the model, while holding a degree distribution fixed.
From page 350...
... The network in the upper left is a heterosexual Yule with no correlation and the one in the upper right is a heterosexual Yule with a strong correlation: a triangle percent of 60 percent versus 3, which is the default one for random mixing given degree. There is a modest one here as well as one with negative correlation.
From page 351...
... But the natural question is, because these statistics will tend to be highly correlated with each other, highly dependent, it's not really clear for any given model exactly what the node qualities of that model would actually be. As Martina Morris showed yesterday, the natural idea of starting from something very simple can sometimes lead to models that aren't very good.
From page 352...
... It is a property of a random graph model; it has nothing to do with data per se, but with the model itself. We call it near degenerate if the model places all its probability mass, i.e., the likelihood of certain graphs, on a small number of actual graphs.
From page 354...
... It also has an analog as a Strauss model for spatial point processes in the very least. Basically, what it has is an overall density parameter, and a single parameter dependence measure.
From page 355...
... You essentially have one area of the parameter space which is producing essentially all of the mass in a very small subset of models. It's only in this small area of the parameter space, around 0, which corresponds to just purely random graphs with equally likely ties, that you are able to get nondegenerate forms.
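The concentration of mass described above can be checked by brute force on a very small network. The following is a minimal sketch, not the speaker's computation: it assumes a two-star model whose probability is proportional to exp(theta_edges * edges + theta_two_stars * two_stars), uses a hypothetical size of 5 nodes so that all 2^10 graphs can be enumerated, and picks illustrative parameter values that do not come from the talk.

```python
from itertools import combinations, product
from math import comb, exp


def two_star_distribution(n, theta_edges, theta_two_stars):
    """Exact probability of every graph on n nodes under a two-star model."""
    dyads = list(combinations(range(n), 2))
    weights = {}
    for ties in product((0, 1), repeat=len(dyads)):
        degree = [0] * n
        for (i, j), t in zip(dyads, ties):
            degree[i] += t
            degree[j] += t
        edges = sum(ties)
        # Each node of degree d is the center of C(d, 2) two-stars.
        two_stars = sum(comb(d, 2) for d in degree)
        weights[ties] = exp(theta_edges * edges + theta_two_stars * two_stars)
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


def mass_on_extremes(dist):
    """Probability of the empty graph plus the complete graph."""
    m = len(next(iter(dist)))
    return dist[(0,) * m] + dist[(1,) * m]


N = 5  # hypothetical size, small enough to enumerate every graph
uniform = two_star_distribution(N, 0.0, 0.0)   # purely random graphs
skewed = two_star_distribution(N, -2.0, 1.5)   # illustrative values only

print(mass_on_extremes(uniform))
print(mass_on_extremes(skewed))
```

At the origin every graph is equally likely, so the two extreme graphs together carry only 2/1024 of the mass. With the illustrative nonzero parameters, nearly all of the mass sits on those two graphs, which is the near-degenerate behavior the passage describes.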
From page 356...
... The central idea is this. Rather than thinking about the natural parameterization of the exponential family, we can think about the classical mean value parameterization of that family defined by the following mapping.
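The mapping itself is not reproduced in this excerpt. As a reminder of the standard exponential-family definitions only, and with notation that is assumed here rather than taken from the slides (Z(y) for the vector of sufficient statistics, θ for the natural parameter, κ for the normalizing function), the mean value parameterization can be written as:

```latex
\[
  P_\theta(Y = y) = \exp\{\theta^{\top} Z(y) - \kappa(\theta)\},
  \qquad
  \kappa(\theta) = \log \sum_{y} \exp\{\theta^{\top} Z(y)\},
\]
\[
  \mu(\theta) = \mathbb{E}_\theta[Z(Y)] = \sum_{y} Z(y)\, P_\theta(Y = y)
  = \nabla \kappa(\theta).
\]
```

For a two-star model, Z(y) would be the pair (number of edges, number of two-stars), so μ(θ) is the expected number of edges and the expected number of two-stars under the model.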
From page 357...
... If we choose two randomly chosen other actors, what is the probability they are tied to a third? In Figure 15, we have a new parameterization in which the mean values are actually on scales that people usually think about in terms of social network forms, and I think they have a lot of value for that reason.
From page 358...
... The natural parameter space is all of R2. The mean value space runs over the number of edges, so 0-21 in my simple example with 7 nodes, and over the number of two-stars, from 0-105.
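The two upper limits quoted for the 7-node example can be reproduced by counting on the complete graph. This is a small sketch under the assumption that the second statistic is the number of two-stars; the function names are made up for illustration.

```python
from itertools import combinations
from math import comb


def edge_count(adj):
    """Number of ties in an undirected graph given as a 0/1 adjacency matrix."""
    n = len(adj)
    return sum(adj[i][j] for i, j in combinations(range(n), 2))


def two_star_count(adj):
    """Number of two-stars: each node of degree d is the center of C(d, 2)."""
    return sum(comb(sum(row), 2) for row in adj)


def complete_graph(n):
    """Adjacency matrix of the graph in which every pair of actors is tied."""
    return [[1 if i != j else 0 for j in range(n)] for i in range(n)]


g = 7
full = complete_graph(g)
max_edges = edge_count(full)          # C(7, 2)
max_two_stars = two_star_count(full)  # 7 * C(6, 2)
print(max_edges, max_two_stars)       # prints "21 105"
```

The empty graph gives 0 for both statistics, so the mean values must lie between 0 and 21 edges and between 0 and 105 two-stars.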
From page 359...
... ∈ bdC} be the set of graphs on the boundary of the convex hull. Based on the geometry of the mean value parametrization, if the expected sufficient statistics are close to the boundary of the hull, the model will place much probability mass on graphs in degree Y
From page 361...
... In essence, what this is saying is that if your parameter is in the region outside the green region, it's essentially going to give you mean values on the edge of the parameter space, and hence very odd looking forms. The only reasonable values are the values in this small green area.
From page 362...
... This statement can be quantified in a number of ways as shown in Figure 20: FIGURE 20 I'll briefly speak about inference for social network models, although we can do inference based on the likelihood as before. We have a probability model.
From page 363...
... FIGURE 22 Figures 22 and 23 tell us exactly what to do, and how, in terms of maximum likelihood. On the other hand, if it's on the exterior of the convex hull, the MLE doesn't exist.
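The convex hull condition can be illustrated on a toy case. The sketch below relies on the standard exponential-family result that the MLE exists when the observed sufficient statistics lie strictly inside the convex hull of the attainable statistics. It assumes a two-star model with statistics (edges, two-stars), a hypothetical 5-node network, and a plain monotone-chain hull routine; none of these details come from the figures.

```python
from itertools import combinations, product
from math import comb


def graph_stats(n):
    """Set of (edges, two_stars) pairs over all graphs on n nodes."""
    dyads = list(combinations(range(n), 2))
    points = set()
    for ties in product((0, 1), repeat=len(dyads)):
        degree = [0] * n
        for (i, j), t in zip(dyads, ties):
            degree[i] += t
            degree[j] += t
        points.add((sum(ties), sum(comb(d, 2) for d in degree)))
    return points


def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(points)
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def strictly_inside(point, hull):
    """True if point is in the interior (not the boundary) of the hull."""
    k = len(hull)
    return all(cross(hull[i], hull[(i + 1) % k], point) > 0 for i in range(k))


hull = convex_hull(graph_stats(5))
print(strictly_inside((5, 6), hull))    # an interior observation
print(strictly_inside((1, 0), hull))    # a single tie, on the boundary
print(strictly_inside((10, 30), hull))  # the complete graph, a hull vertex
```

Under these assumptions, an observed graph with 5 edges and 6 two-stars has an MLE, while a graph with a single tie or the complete graph does not, because those observations sit on the boundary of the hull.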
From page 364...
... There is a result that any MCMC likelihood used in that way actually converges with sufficient iterations, which I won't belabor here. I always find the relationship between a near degenerate model and MCMC estimation interesting.
From page 365...
... FIGURE 25 FIGURE 26 For example, suppose for the two-star models you want to choose a mean value with 9 edges and about 40 two stars, and you run your MCMC sampler. Figure 27 shows what you get.
From page 366...
... So, what we are actually seeing, if you look at the marginal distribution of these draws, is a profoundly polarized distribution with most of the draws from very low values and some of the draws from quite high. And of course, in such a way that the mean value is 9, which is exactly what we designed the model to actually do.
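A sampler of the kind referred to above can be sketched as a simple Metropolis chain that proposes toggling one dyad at a time. This is an illustrative sketch rather than the sampler used for the figures: the function name is hypothetical, and the natural parameter values in the usage lines are placeholders, not the values that correspond to a mean of 9 edges and about 40 two-stars.

```python
import random
from itertools import combinations
from math import exp


def sample_two_star(n, theta_edges, theta_two_stars, steps, seed=0):
    """Metropolis sampler toggling one dyad per step; returns edge counts."""
    rng = random.Random(seed)
    dyads = list(combinations(range(n), 2))
    tie = {d: 0 for d in dyads}
    degree = [0] * n
    edges = 0
    trace = []
    for _ in range(steps):
        i, j = rng.choice(dyads)  # symmetric proposal: pick a dyad uniformly
        if tie[(i, j)]:
            # Removing the tie destroys the two-stars it forms with other ties.
            d_edges = -1
            d_stars = -((degree[i] - 1) + (degree[j] - 1))
        else:
            # Adding the tie creates one two-star per existing tie at i or j.
            d_edges = 1
            d_stars = degree[i] + degree[j]
        log_ratio = theta_edges * d_edges + theta_two_stars * d_stars
        if log_ratio >= 0 or rng.random() < exp(log_ratio):
            tie[(i, j)] ^= 1
            degree[i] += d_edges
            degree[j] += d_edges
            edges += d_edges
        trace.append(edges)
    return trace


# Placeholder parameters for a 7-node network; not taken from the talk.
trace = sample_two_star(7, -1.5, 0.25, steps=20000, seed=1)
print(min(trace), max(trace), sum(trace) / len(trace))
```

Plotting a histogram of `trace` is one way to look at the marginal distribution of the number of edges across draws and to see whether it is polarized in the way the passage describes for a given choice of parameters.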
From page 367...
... It is sometimes convenient to cluster design mechanisms into conventional, adaptive, and convenience designs. So-called conventional designs do not use the information collected during a survey to direct the subsequent sampling of individuals.
From page 368...
... I examined likelihood-based inference for adaptive designs -- basically how to fit the models I described earlier, and Martina Morris described, when you have a conventional design and, probably more importantly, when you have adaptive data. You actually have massively sampled data due to link tracing.
From page 369...
... I won't say too much about this particular model, but you can get the natural parameter estimates. I also measure the standard errors induced by the Markov chain Monte Carlo sampling.
From page 370...
... In my two remaining seconds I will say we can use those model parameters to generate processes that look a lot like Colorado Springs, and Figure 31 shows two of them. They don't have the larger component of the graph we saw in Figure 29, because this model doesn't naturally have that form, but if you go through other realizations you can get very similar looking forms.
From page 371...
... FIGURE 31 In conclusion, I'll reiterate that large and deep literatures exist that are often ignored; simple models are being used to capture clustering and other structural properties, and the inclusion of attributes (e.g., actor attributes, dyad attributes) is very important.
From page 372...
... The same underlying geometry is important, but it's not related to those results about existence of an MLE. The basic idea is that if we change the view from the natural parameter space, where interpretation of model properties is actually quite complex, to the mean value parameter space, which is expressed in terms of the statistics -- which we chose to be the foundation of our model, and hence for which we should have a good interpretation -- then it gives us a much better lens to see how the model is working.
From page 373...
... So, this is more a property of dependent models rather than of social network models or this particular model class. Coming back to Peter's last point, model specification is extremely difficult here, because you are using the model specification to also find misspecification in that model.
From page 374...
... 1997. Evolution of Social Networks.

