Network Data and Models

Martina Morris, University of Washington


DR. MORRIS: As they say, I realize I am the only thing that stands between you and dinner. It is the end of the day. It is not an enviable spot, and all of these have become longer and longer talks, so I am going to see if I can work through this a little more quickly.

I am closer to the biologists since I put the people up front. We have a large group of people working on network sampling and network modeling—Steve Goodreau, Mark Handcock and myself at the University of Washington; Dave Hunter and Jim Moody, who are both here; Rich Rothenberg, who is an M.D., Ph.D.; Tom Snijders, who some of you know, is a network statistician; Phillipa Pattison and Garry Robbins from Melbourne, who have also done a lot of work on networks over the years; and then a gaggle of grad students who come from lots of different disciplines as well.

We are funded from NIH and we are primarily interested in how networks channel the spread of disease, so Figure 1 shows an example of a real network that comes from Colorado Springs, a visualization that Jim did. You can see that the network has a giant component, which is not surprising. It has this dendritic effect, which is what you tend to get with disassortative mixing. I think that is something that Mark Newman pointed out. After a while you get to look at these things and you can immediately pick that up. This is kind of a boy-girl disassortative mixing, and it generates these long loosely connected webs. This is also a fairly high-risk group of individuals, and it was sampled to be exactly that. John Potter thinks that he got about 85 percent of the high-risk folks in Colorado Springs. Every now and then you see a configuration like a ring of nodes connected to a central node, which represents a prostitute and a bunch of clients. Of course not all networks look like that.




FIGURE 1

Just a little bit on motivation. Our application is infectious disease transmission, and in particular HIV, recognizing that the real mechanism of transmission for HIV is partnership networks. What we are interested in, in a general sense, is what creates the network connectivity, particularly in sparse networks. As most of you know, HIV has enormously different prevalence around the world. As low as it is here, which is certainly less than 1 percent, compared to close to 40 percent in places like Botswana. So, there is a really interesting question there about what kind of network connectivity would generate that difference and variation in prevalence.

There are clearly properties of the transmission system that you need to think about. These include biological properties; heterogeneity, which is the distribution of attributes of the nodes, the persons, but also infectivity and susceptibility in the host population. There are multiple time scales to consider, and time scales are something that we haven't talked about much here, but I think they are very important in this context. You have the natural history and evolutionary dynamics of the pathogens, but you also have some interesting stuff going on with partnerships there. In addition, there is the real challenge of data collection. That is in contrast to Traceroutes—it would be great if we could collect Traceroutes of sexual behavior. Maybe we could do that with some of those little nodes that we stuck on the monkey heads, but at this point we can't really do that.

So, for us, this means that the data are, in fact, very difficult to collect, and that is a real challenge. One of the things we are aiming for—basically, this is our lottery ticket—is to find methods that will help us define and estimate models for networks using minimally sampled data. By minimally, I mean you can't even contact trace in this case, because contact tracing is, itself, a very problematic thing to do in the case of sexual behavior.

FIGURE 2

So, how do you generate connectivity in sparse networks? Lots of people have focused on one route to this, high-degree hubs: all you need is one person out there with millions of partners and your population is connected. In fact, you can generate connectivity in very-low-degree networks as well, as shown in Figure 2. Your simple square dance circle is a completely connected population where everybody has only two partners. So it is important to recognize that there are lots of other ways to generate connectivity, and that even if one person did act as a hub, and you figured that out somehow and removed them, you would still have a connected population. I think that has some pretty strong implications for prevention.

Also, there is obviously clustering and heterogeneity in networks. Figure 3 shows data from a school. You can see in this case that there is a combination of clustering by grade, which generates the different colors, and also a very strong racial divide.

FIGURE 3
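One way to make the grade and race clustering in Figure 3 concrete is a mixing matrix, which cross-tabulates ties by a nodal attribute. A rough sketch, not from the talk, assuming the statnet suite's R packages and using the faux.mesa.high friendship network that ships with the ergm package as a stand-in for the school shown here (the attribute names are part of that assumption):

```r
## Cross-tabulate ties by nodal attributes to see homophily directly.
## faux.mesa.high is a simulated school friendship network bundled with the
## ergm package; its attribute names ("Grade", "Race") are assumptions here.
library(ergm)                 # loads the network package as well
data(faux.mesa.high)

mixingmatrix(faux.mesa.high, "Grade")  # counts of ties within and between grades
mixingmatrix(faux.mesa.high, "Race")   # heavy diagonal = strong homophily
```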

You might ask yourself the question, why is this? What is the model behind this? What generates this? A number of people have hinted at this kind of thing earlier. Are these exogenous attributes at work—that is, birds of a feather stick together? If you are in the same grade as me, if you are the same race as me, that is why we are likely to be friends. Or is it some kind of endogenous process where, if two people share a friend, they are more likely to be friends themselves? That is a friend-of-a-friend process. It is interesting that in popular language we have both of those ideas already ensconced in a little aphorism.

So, thinking about partnership dynamics and the timing and sequence and why that would matter: one of the things that we have started looking at in the field of networks and HIV is the role that concurrent partnerships play.

FIGURE 4

Concurrent partnerships are partnerships that don't obey the rule of serial monogamy; that is, each partnership does not have to end before the next one begins. You can see in Figure 4, in the long yellow horizontal line labeled "1", we have got somebody who has a partnership all the way across the time interval, and then a series of partnerships concurrent with it, including a little one-night stand labeled "5" that makes three of them concurrent at some point in time. Now, this is the same number of partners as in the upper graph, so it is not about multiple partners, although you do need to have multiple partners to have concurrent partners. This is really a function of timing and sequence.
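The definition above comes down to interval overlap: two partnerships held by the same person are concurrent if their active periods intersect. A toy sketch, with a made-up helper name and dates, just to pin down the logic:

```r
## Two partnerships are concurrent if their date intervals overlap.
## Illustrative only; the function name and dates are invented.
overlaps <- function(start1, end1, start2, end2) {
  start1 <= end2 && start2 <= end1
}

## A long-running partnership and a one-night stand that falls inside it.
overlaps(as.Date("2001-01-01"), as.Date("2003-06-30"),
         as.Date("2002-02-14"), as.Date("2002-02-14"))   # TRUE: concurrent
```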

What we find in Uganda and Thailand is very interesting. Uganda, at the time that we were doing this study, had a prevalence of about 18 percent and Thailand had about 2 percent, which is 10 times less. In Uganda men reported about 8 partners in their lifetime on average, and in Thailand it was 80. Now, 10 years later, Thailand doesn't have an epidemic that comes from having 80 partners in your lifetime, so something else is obviously going on: it is not just the number of partners. We looked into this concurrency a little bit, and we found that in both cases you get concurrent partnerships. So, it is not the simple prevalence of concurrency that is the difference. The difference comes in the kinds of partnerships that are concurrent. In Uganda you tend to get two long-term partnerships that overlap, whereas in Thailand you have the typical sex-partner pattern, which is one long-term partnership and then a short-term one on the side. The net result is that in Uganda the typical concurrent partnership is actually active on the day of the interview—41 percent of them were, and they had been going for three years. The ones that are completed are also reasonably long, about 10 months. In Thailand you don't get anywhere near as many that are active on the day of the interview, and they are about two months long; ninety-five percent of them are a day long, so these concurrencies happen and then they are over. That kind of thing can actually generate a very different pattern in an epidemic. The simulations we have done suggest that if you take this into account you do, in fact, generate an epidemic in Uganda that has about 18 percent prevalence, whereas Thailand will typically have just about 2 percent.

The approach that we take is to think about generative models here: local rules that people use to select partners, which cumulate up to global structures and a network. What we want is a generative stochastic model for this process, and that model is not going to look like, for example, "I want to create clustering." A clustering coefficient, although it can be a great descriptive summary for a network, is not necessarily going to function well as a generative model for a network. It is also probably the case that when I select a partner I am not thinking, "I want to create the shortest path, the shortest geodesic to the east coast." That is probably also not going on. Again, it's a nice summary of a network but probably not a generative property. We want to be able to fit both exogenous and endogenous effects in these models, and that turns out to be an interesting and difficult problem. We also want this to work well for sampled data: we want to be able to estimate based on observed data, and then make an inference to a complete network.

Figure 5 shows what this generative model is going to look like. We have adapted this from earlier work in the field. It is an exponential random graph model. It basically takes the probability of observing a network or a graph, a set of relationships, as a function of something that looks a little bit like an exponentiated version of a linear model, with a normalizing constant down below that sums over all possible graphs of that size. That is, the probability of the graph is a function of a model prediction, divided by the normalizing constant. The origins go back in the statistical literature at least to Bahadur in 1961, who talked about a multivariate binomial. Besag adapted this for spatial statistics, and Frank and Strauss first proposed it for networks in 1986. The benefits of this approach are that it considers all possible relations jointly, and that is important because the relations here are going to be dependent if there are any endogenous processes going on, like these friend-of-a-friend effects. It is an exponential family, which means it has very well understood statistical properties, and it also turns out to be very flexible: it can represent a wide range of configurations and processes.
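Written out, the model just described has the standard exponential random graph form. This is a reconstruction of the expression shown in Figure 5, not a transcription of the slide:

```latex
P_\theta(X = x) \;=\; \frac{\exp\{\theta^{\top} z(x)\}}{c(\theta)},
\qquad
c(\theta) \;=\; \sum_{y \in \mathcal{X}} \exp\{\theta^{\top} z(y)\}
```

Here x is the observed graph, z(x) is the vector of network statistics, θ is the corresponding parameter vector, and the normalizing constant c(θ) sums over the set of all possible graphs on the same node set.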

FIGURE 5

What does model specification mean in a setting like this? There are two things that we must choose: the set of network statistics z(x), and whether or not to impose homogeneity constraints on the parameter θ. We are going to choose a set of network statistics that we think are part of the self-organizing properties of this graph. A network statistic is just a count of a configuration of dyads. An edge is a single dyad; that is the simplest network statistic, and it is used to describe the density of the graph. Others include k-stars, which are nested in the sense that a 4-star contains four 3-stars, and a 3-star contains three 2-stars. So, that is a nested kind of parameterization that is common in the literature. We tend to prefer something like degree distributions instead, in part because I think they are easier to understand. Degrees are not nested: a node simply has a degree, it has one degree only, and you count up the number of nodes that have that degree. Triads, or closed triangles, are typically the basis for most of the clustering parameters that people are interested in. Almost anything you can think of in terms of a configuration of dyads can be represented as a network statistic. Then you have the parameter θ, and your choice there is whether you want to impose homogeneity constraints; I believe Peter Hoff talked about this a little bit in his talk this morning.
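For concreteness, here is how a few of the statistics just listed would be counted on an example network. This is a sketch that assumes the statnet/ergm term names, which postdate this talk, and the bundled faux.mesa.high network rather than the data discussed here:

```r
## Count several candidate network statistics z(x) on an example graph.
library(ergm)
data(faux.mesa.high)   # simulated school friendship network shipped with ergm

summary(faux.mesa.high ~ edges        # number of ties (the density term)
                       + kstar(2:3)   # nested 2-star and 3-star counts
                       + degree(0:5)  # non-nested degree-distribution terms
                       + triangle)    # closed triads, the basis for clustering terms
```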

FIGURE 6

There is clearly a lot of heterogeneity in networks. Heterogeneity can be based on either nodal or dyadic attributes: people can be different because of their age, their race, their sex, those kinds of exogenous attributes, or different types of partnerships may behave differently. Let's think a little bit about how we create these network statistics. Referring to Figure 6, the vector of statistics can range from something really minimal, a single term such as the number of edges, which gives the simple Bernoulli model for a random graph, all the way to a saturated model with one term for every dyad, which is a very large number of terms. Obviously, you don't move immediately to a saturated model; it wouldn't give you any understanding of what was happening in the network anyway. So what we are trying to do is work somewhere in between these two, with some parsimonious summary of the structural regularities in the network.

Figure 6 gives some examples from the literature that have been used in the past. The initial model was the p1 model of Holland and Leinhardt in 1981. Peter Hoff's talk this morning fit the framework of a p1 model: it had an actor-specific in-degree and an actor-specific out-degree—so, two parameters for every actor in the network—plus a single parameter for the number of mutual dyads, that is, when i nominates j and j nominates i in return. That is a dyad-independent model, which is to say that, within a dyad, between two nodes, the only dependence is through this mutuality term; all other dyads are independent. That is a minimal model of what is going on in a network. The first attempt to really begin to model dependent processes in networks—that is, where the edges are dependent—was the Markov graph proposal of Frank and Strauss, which is also shown in Figure 6.

This model includes terms for the number of edges, the number of k-stars—again, those are the nested terms—and the number of triangles.

In each of those cases you can impose homogeneity constraints on θ or not. So, for any network statistic that you have in your model, the θ parameters can range from minimal—that is, a single homogeneous θ for all actors—to maximal, where there is a family of θ_i's, each being actor- or configuration-specific. That was the case in Peter Hoff's model: every actor would have two parameters there. You can say that every configuration has its own parameter, which is a function of the two actors, or of the multiple actors, that are involved in it. In the Bernoulli graph, the edges are the statistic, and there is a single θ that says every edge is as likely as every other edge; that is a homogeneity constraint. When you go with maximal θ parameters, you quickly max out and lose insight, but you can sure explain a lot of variance that way. In between are parameters that are specific to groups or classes of nodes, so you might have different levels of activity or different mixing propensities by age or by sex. In addition, there are parametric versus non-parametric representations of ordered distributions. So, for a degree distribution, you can have a parameter for every single degree, or you could impose a parametric form of some kind, a Poisson, a negative binomial, something like that. The group parameterizations are typically used to represent attribute mixing in networks. We have heard a lot about that, and this is the statistical way to handle it; there is a lot of work that has been done on that over the last 20 years. The parametric forms are usually used to parsimoniously represent configuration distributions, so degree distributions, shared partner distributions, and things like that.
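As a concrete illustration of those intermediate choices, the sketch below uses group-specific activity, race-specific (differential) homophily, and a parametric degree-distribution term. The term names follow the statnet/ergm package and are my assumption, as is the bundled example network:

```r
## Between one homogeneous theta and one parameter per configuration:
## group-specific and parametric terms.
library(ergm)
data(faux.mesa.high)

fit <- ergm(faux.mesa.high ~ edges
            + nodefactor("Grade")             # activity level differs by grade
            + nodematch("Race", diff = TRUE)  # race-specific mixing propensities
            + gwdegree(0.5, fixed = TRUE))    # parametric degree-distribution term
summary(fit)
```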

FIGURE 7

Estimating these models has turned out to be a little harder than we had hoped; otherwise, we would all be talking about statistical models for networks today, and I don't think there would be anybody in here talking about anything else. The reason is that this thing is a little bit hard to estimate. Figure 7 shows the likelihood function P(X = x) that we are going to be trying to maximize. The normalizing constant c makes direct maximum likelihood estimation of that θ vector impossible because, even with only 50 nodes, there is an astronomical number of possible graphs (2^1225 for an undirected graph on 50 nodes), so you are not going to be able to compute it directly. Typically, there have been two approaches to this. The first, which dominated the literature for the last 25 years, is pseudolikelihood, which is essentially based on a logistic regression approximation. We now know, and we knew even then, that this isn't very good when the dependence among the ties is strong, because it makes assumptions about independence. MCMC is the alternative; in theory it guarantees maximum likelihood estimation.
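As a sketch of those two estimation routes, using the statnet/ergm interface that later grew out of this work (the syntax and the bundled example data are assumptions of mine, not part of the talk):

```r
## The same model fit two ways: maximum pseudolikelihood (a logistic
## regression approximation) and the default MCMC-based maximum likelihood.
## With strong dependence terms such as gwesp, the two can diverge badly.
library(ergm)
data(faux.mesa.high)

form <- faux.mesa.high ~ edges + nodematch("Grade") + gwesp(0.25, fixed = TRUE)

fit.mple <- ergm(form, estimate = "MPLE")   # pseudolikelihood
fit.mle  <- ergm(form)                      # MCMC MLE (the default for this model)

cbind(MPLE = coef(fit.mple), MCMC.MLE = coef(fit.mle))
```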

FIGURE 8

But that is only "theoretically." Implementation has turned out to be very challenging, and the reason is something called model degeneracy. Mark Handcock is going to talk a lot about this tomorrow, but I am just going to make a couple of notes about it today. Figure 8 shows a really simple model for a network, just density and clustering; those are the only two things we think are going on. So, there is the density term, which is just the sum of the edges, and c(x) is the clustering coefficient that people like to use, representing the fraction of 2-stars that are completed triangles.

What are the properties of this model? Let's examine it through simulation. Start with a network of 50 nodes. We are going to set the density to be about 4 percent, which is about 50 edges for a Bernoulli graph. The expected clustering, if this were a simple random graph, would then be just 3.8 percent, but let's give it some more clustering: let's bump it up to 10 times higher than expected and see how well this model does. By construction, the networks produced by this simple model will have an average density of 4 percent and an average clustering of 38 percent. Figure 9 shows what the distribution of those networks looks like.

FIGURE 9

The target density and clustering would be where the dotted lines intersect, but virtually none of the networks look like that. Most of these networks, instead, are either fairly dense but without enough triangles, or not so dense but almost entirely clustered into triangles. What does that mean? It means that this is a bad model. For a graph with these properties, if you saw this kind of clustering and density, this is a bad model for it. The graph didn't come from that model; that is what it says. In the context of estimation, we call the model degenerate when the graph used to estimate the model is extremely unlikely under that model. Instead, the model… [Several pages of the original chapter are omitted from this excerpt.]
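Before the excerpt resumes, here is a rough sketch of the kind of degeneracy experiment just described, using the statnet/ergm simulation tools. The edges-plus-triangle parameterization and the coefficient values are my stand-ins for the density-plus-clustering model on the slide:

```r
## Simulate from a simple dyad-dependent model on 50 nodes and see where the
## draws land. With an edges + triangle model they tend to pile up near the
## empty or the near-complete graph rather than around a moderate target.
library(ergm)
set.seed(1)

nw <- network.initialize(50, directed = FALSE)
sims <- simulate(nw ~ edges + triangle,
                 coef = c(-3.5, 0.5),    # illustrative values only
                 nsim = 100,
                 control = control.simulate.formula(MCMC.burnin = 1e5,
                                                    MCMC.interval = 1e4))

dens <- sapply(sims, network.density)
tri  <- sapply(sims, function(g) summary(g ~ triangle))
summary(dens)   # typically bimodal rather than centered on the target
summary(tri)
```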

FIGURE 17

Finally, Figure 18 shows the minimum geodesic distance between all pairs, with a certain fraction of them being unreachable.

FIGURE 18

Figure 19 shows how the Bernoulli model does, and it doesn't do very well.

FIGURE 19

Putting these all together, Figure 20 shows the goodness-of-fit measures.

FIGURE 20

Figure 21 shows what it looks like for all the models. The first column shows the degree distribution; even a Bernoulli model is pretty good there. Adding attributes doesn't get you much, but once you add the shared partners you get it exactly right, and the same is true when you add both the shared-partner term and the attributes. For the local clustering, which is this edgewise shared-partner term, the Bernoulli model does terribly, and I can't say the attribute model does a whole lot better. Of course, once you put the two-parameter weighted shared-partner distribution in, you capture that pretty well.

FIGURE 21

You don't capture the geodesics well with edges alone, but it is amazing how well you get the geodesics just from attribute mixing alone, just from that homophily. In fact, the local clustering term, the edgewise shared-partner term, doesn't capture the geodesics anywhere near as well, and you actually don't do as good a job when you put both of them in.

So, Figure 22 is the eyeball test. That is a different approach to goodness of fit. One thing you want to make sure of with these models is that they aren't just crazy. Obviously, those degenerate models were crazy, and you could see that very quickly: they would either be empty or they would be complete. For this figure, it actually would be hard to tell which network was simulated and which was real. From the eyeball perspective, the models are capturing the structure pretty well.

FIGURE 22
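The degree, shared-partner, and geodesic comparisons just walked through are what the goodness-of-fit machinery automates. A minimal sketch, assuming the statnet/ergm gof() interface (a later packaging of this workflow) and the bundled example network:

```r
## Fit a model with homophily and a weighted shared-partner term, then compare
## simulated degree, edgewise shared-partner, and geodesic distributions with
## the observed ones.
library(ergm)
data(faux.mesa.high)

fit <- ergm(faux.mesa.high ~ edges
            + nodematch("Grade") + nodematch("Race")
            + gwesp(0.25, fixed = TRUE))

fit.gof <- gof(fit)   # default diagnostics: degree, shared partners, geodesics
plot(fit.gof)         # one panel per diagnostic, observed vs. simulated
```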

There are 50 schools from which we can draw information. There are actually more, but 50 have good information. They range in size from fairly small—school 10 here is one of the smaller mid-sized schools, with 71 students in the data set—up to beyond 1,000 students, and we can use these models across that whole range. We have now used them on 3,000-node networks, and they are very good and very stable. Figure 23 compares some results for different network sizes, using the model that has both the friend-of-a-friend and the birds-of-a-feather processes in it. It does very well for the smaller networks, but as you start getting out into the bigger networks, the geodesics are longer than you would expect. Basically, I think that is telling you there is more clustering, more pulling apart in these networks, and less homogeneity than these models assume. So, there is something else generating heterogeneity here.

FIGURE 23

The other thing you can do is compare the parameter estimates across the fitted models, which is really nice. In Figure 24, we look at 59 schools, using the attribute-only model. You can see the differential effects for grade: the older students tend to have more friends.

FIGURE 24

Figure 25 shows the homophily effect, the birds-of-a-feather effect, which is interesting. You see that it is strongest for the youngest and the oldest, and those are probably floor and ceiling effects. Mean effects for race don't show up as being particularly important, but blacks are significantly more likely than any other group to form assortative ties. Interestingly, you can see that Hispanics really bridge the racial divide. So, there are all sorts of nice things you can do by looking at those parameters as well.

Finally, the other thing you can do is examine the effect of adding a transitivity term to a homophily model. I mean, how much of the effect that we attributed to homophily actually turns out to be this transitivity effect instead? It turns out the grade-based homophily estimates fall by about 14 percent once you control for this friend-of-a-friend effect. The race homophily usually falls, but sometimes it actually rises; in those cases, once you account for transitivity, the race effects are even stronger than you would have expected with just the homophily model. This is shown in Figure 26.

FIGURE 25

FIGURE 26
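The homophily-versus-transitivity comparison described above is just two nested fits side by side. A rough sketch, again assuming the statnet/ergm syntax, the bundled example network, and gwesp as a stand-in for the transitivity term; the coefficient names are also assumptions:

```r
## How much do the homophily coefficients move once a transitivity-style
## term is added to the model?
library(ergm)
data(faux.mesa.high)

fit.homophily  <- ergm(faux.mesa.high ~ edges
                       + nodematch("Grade") + nodematch("Race"))
fit.transitive <- ergm(faux.mesa.high ~ edges
                       + nodematch("Grade") + nodematch("Race")
                       + gwesp(0.25, fixed = TRUE))

cbind(homophily.only    = coef(fit.homophily)[c("nodematch.Grade", "nodematch.Race")],
      with.transitivity = coef(fit.transitive)[c("nodematch.Grade", "nodematch.Race")])
```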

The transitivity estimate, this friend-of-a-friend effect, falls by nearly 25 percent once you control for the homophily term, as shown in Figure 27. This doesn't seem like much, but it is amazing to be able to do this kind of thing on networks, because we have not been able to do it before. We have not been able to test these kinds of things before. What we have now is a program and a theoretical background that allow us to do this.

FIGURE 27

What this approach offers is a principled method for theory-based network analysis, where you have a framework for model specification, estimation, comparison, and inference. These are generative models and they have tests for fit, so it is not just a matter of seeing that there is clustering; you can ask whether it is homogeneous clustering and how well the model fits, and these models give you answers to those questions. We have methods for simulating networks that fall out of these things automatically, because we can reproduce the known, observed, or theoretically specified structure just by using the MCMC algorithm. For cross-sectional snapshots, this is a direct result of the fitting procedure. For dynamic stationary networks, it is based on a modified MCMC algorithm, and dynamic evolving networks mean you have to have model terms for how that evolution proceeds. You can then simulate diffusion across these networks and, in particular for us, disease transmission dynamics. It turns out the methods also lend themselves very easily to sampled networks and other missing network data.
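The simulation step for cross-sectional snapshots is the piece that falls straight out of the fit. A minimal sketch, once more assuming the statnet/ergm interface and the bundled example network rather than the data in the talk:

```r
## Draw networks from a fitted model; a diffusion or transmission process
## would then be run across these simulated structures.
library(ergm)
data(faux.mesa.high)

fit  <- ergm(faux.mesa.high ~ edges + nodematch("Grade")
             + gwesp(0.25, fixed = TRUE))
sims <- simulate(fit, nsim = 10)   # 10 networks drawn from the fitted model

sapply(sims, network.density)      # quick sanity check on the draws
```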

With that I will say, if you are interested in learning more about the package that we have, it is an R-based package. We are going to make it available as soon as we get all the bugs out. If you want to be a guinea pig, we welcome guinea pigs, and all you need to do is take a look at http://csde.washington.edu/statnet. Thank you very much.

QUESTIONS AND ANSWERS

DR. BANKS: Martina, that was a lovely talk, and you are exactly right: you can do things that have never been done before, and I am deeply impressed by all of it. I do wonder if perhaps we have not tied ourselves into a straitjacket by continuing to use the analytic mathematical formulation of these models.

DR. MORRIS: The analytic what formulation?

DR. BANKS: An analytical mathematical formulation growing out of a p1 model and just evolving. An alternative might be to use agent-based simulation, to try to construct things from simple rules for each of the agents. For example, it is very unlikely that anybody could have more than 5 best friends, or things like that.

DR. MORRIS: Actually, you would be surprised how many people report that. I am kidding. I agree, except that I think that, depending on how you want to define agent-based simulation, these are agent-based simulations. What I am doing is proposing certain rules about how people choose friends. So, I choose friends because I tend to like people the same age as me, the same race as me, the same socioeconomic status. Those are agent-based rules and, when other people are using those rules, then we are generating the system that results from those rules being in operation. I don't see a distinction between these two in quite the same way, but I do agree. One thing that we did do was focus on the Markov model for far too long. It was edges, stars, and triangles. I think there is an intuitive baby in that bath water, and that is that edges are only dependent if they share a node. That is a very natural way to think about trying to model dependency, but I think it did narrow our focus probably more than it should have.

DR. BANKS: You are exactly right. Your models do instantiate something that could be an agent-based system, but there are other types of rules that would be natural for friendship formation that would be very hard to capture in this framework, I think.

DR. MORRIS: I would be interested in talking to you about what those are.

DR. BANKS: For example, the rule that you can't have more than 5 friends might be hard to build in.

DR. MORRIS: No, in fact, that is very easy to build in. That is one of the first things we had to do to handle the missing data here, because nobody could have more than five male or five female friends.

DR. HOFF: There was one middle part of your talk which I must have missed, because you started talking about the degeneracy problem with the exponentially parameterized graph models, and then at the end we saw how great they were. So, at some point there was a transition. Is it including certain statistics that makes them less degenerate, or is it the heterogeneity you talked about? I could see how adding heterogeneity to your models, or to the parameters, is going to drastically increase the amount of the sample space that you are going to cover. Could you give a little discussion of that?

DR. MORRIS: These models have very little heterogeneity in them relative to your models. Every actor does not have both an in-degree and an out-degree; there is basically just a degree term for classes. So, grades are allowed to have different degrees, race groups are allowed to have different degrees. But that wasn't what made this work. What made this work was the edgewise shared-partner term. When we originally tried using a local clustering term, either the clustering coefficient or the number of triangles with just a straight theta on it, those were degenerate models. The edgewise shared-partner term doesn't solve all problems either, but at least it provides a bound, and that is essentially the effect it has: it bounds the tail, and it says that people can't have that many partners. So, that is what changed everything.

DR. JENSEN: I think the description of model degeneracy is wonderful. Fortunately, it wasn't 10 years of work in my case. It was more like 6 months of work that went down the drain and I didn't know why, and I think you have now explained it. Is that written up anywhere? Are there tests that we can run for degeneracy? What more can we do?

DR. MORRIS: That is a great question. Mark Handcock is really the wizard of model degeneracy, and I think he is going to give a talk tomorrow that can answer some of those questions. I don't think we have a test for it yet, although you can see whether your MCMC chain is mixing properly; if it is always down here and then all of a sudden it goes up here, then you know you have got a problem. It is still a bit of an art. The statnet package will have a lot of this stuff built into it.

DR. SZEWCZYK: My question is, is it model degeneracy, or is it model inadequacy? I look at a lot of these things and my question is, can we take some of these models and, rather than just fitting one universal model, go in there and, say, fit a mixture of these p-star models, or these Markov models or p1 models, rather than assuming that everyone acts the same within these groups?

DR. MORRIS: There are lots of ways to try to address the heterogeneity, I agree with you, and I think they need to be more focused on the substantive problem at hand.

So, just throwing in a mixture or throwing in a latent dimension, to me, kind of misses the point of why people form partnerships with others. When I go into this, I go in saying, I want to add attributes to this. A lot of people who have worked in the network field don't think attributes matter, because somehow that is still at the respondent level. We all know that, as good network analysts, we don't care about individuals; we only care about network properties and dyads.

DR. SZEWCZYK: We care about individuals.

DR. MORRIS: Attributes do a lot of work. They do a lot of heavy lifting in these models, and they actually explain, I think, a fair amount. I would call it model degeneracy in this case only because you get an estimate and you might not even realize it was wrong. In fact, when people used the pseudolikelihood estimates, they had no idea that the cute little estimate they were getting, with a confidence interval, made no sense at all. It is degenerate because it performs the function: it actually gets the average right, but then it gets all the details wrong. So, you can call that inadequate, and it is. It is a failure. It is a model failure. That is very clear.

DR. WIGGINS: One thing I was wondering about, to follow up on that: since each of these models defines a class, I wonder if you have thought about treating this using classifiers, large-margin classifiers like support vector machines. Some anecdotal evidence is that sometimes you can tell that none of your network models is really good for a network you are interested in. So, some of these techniques that measure everything at once, rather than measuring a couple of features you want to reproduce, will show you how one network, if you look at it in terms of one attribute, looks like model F, but if you look at it in terms of a different attribute, it turns out to be model G, and that might be one way of seeing whether you have heterogeneity or just none of your models is a good model. If you have a classifier, then all the different classifiers might choose "not me" as the class, in which case you can kind of see that none of your models is the right model.

DR. MORRIS: Yes, that is a nice idea.