**Network Data and Models**

**Martina Morris, University of Washington**

**DR. MORRIS:** As they say, I realize I am the only thing that stands between you and dinner. It is the end of the day. It is not an enviable spot, and all of these have become longer and longer talks, so I am going to see if I can work through this a little more quickly.

I am closer to the biologists since I put the people up front. We have a large group of people working on network sampling and network modeling—Steve Goodreau, Mark Handcock and myself at the University of Washington; Dave Hunter and Jim Moody, who are both here; Rich Rothenberg, who is an M.D., Ph.D.; Tom Snijders, who, as some of you know, is a network statistician; Philippa Pattison and Garry Robins from Melbourne, who have also done a lot of work on networks over the years; and then a gaggle of grad students who come from lots of different disciplines as well.

We are funded by NIH and we are primarily interested in how networks channel the spread of disease, so Figure 1 shows an example of a real network that comes from Colorado Springs, a visualization that Jim did. You can see that the network has a giant component, which is not surprising. It has this dendritic effect, which is what you tend to get with disassortative mixing. I think that is something that Mark Newman pointed out. After a while you get to look at these things and you can immediately pick that up. This is kind of a boy-girl disassortative mixing, and it generates these long loosely connected webs. This is also a fairly high-risk group of individuals, and it was sampled to be exactly that. John Potterat thinks that he got about 85 percent of the high-risk folks in Colorado Springs. Every now and then you see a configuration like a ring of nodes connected to a central node, which represents a prostitute and a bunch of clients. Of course not all networks look like that.

FIGURE 1
Just a little bit on motivation. Our application is infectious disease transmission, and in
particular HIV, recognizing that the real mechanism of transmission for HIV is partnership
networks. What we are interested in, in a general sense, is what creates the network connectivity,
particularly in sparse networks. As most of you know, HIV has enormously different prevalence
around the world—from as low as it is here, which is certainly less than 1 percent, to close to 40 percent in places like Botswana. So, there is a really interesting question there about what
kind of network connectivity would generate that difference and variation in prevalence.
There are clearly properties of the transmission system that you need to think about,
which include biological properties; heterogeneity, which is the distribution of attributes of the
nodes, the persons, but also infectivity and susceptibility in the host population. There are
multiple time scales to consider, and time scales are something that we haven’t talked about much
here, but I think are very important in this context. You get the natural history and evolutionary
dynamics of the pathogens, but you also have some interesting stuff going on with partnerships
there. In addition, there is this real challenge of data collection. That is in contrast to
Traceroutes—it would be great if we could collect Traceroutes of sexual behavior.
Maybe we could do that with some of those little nodes that we stuck on the monkey heads, but at
this point we can’t really do that. So, for us, this means that the data are, in fact, very difficult to
collect, and that is a real challenge. One of the things we are aiming for—basically, this is our
lottery ticket—is to find methods that will help us define and estimate models for networks that
use minimally sampled data. By minimally, I mean you can’t even contact trace in this case,
because contact tracing is, itself, a very problematic thing to do in the case of sexual behavior.
FIGURE 2
So, how do you generate connectivity in sparse networks? Lots of people have focused on scale-free networks: high-degree hubs—all you need is one person out there with millions of partners and your population is connected. In fact, you can generate connectivity in very-low-degree networks as well, as shown in Figure 2. Your simple square dance circle is a completely
connected population where everybody only has two partners, so it is important to recognize that
there are lots of other ways that you can generate connectivity, and that even if one person, for
example, did act as a hub, and you figured that out somehow and you removed them, you would
still have a connected population. I think it has some pretty strong implications for prevention.
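The square dance circle is easy to verify computationally. Here is a minimal sketch in Python (the 50-node ring is a hypothetical stand-in for the population in Figure 2): everyone has exactly two partners, the graph is connected, and it stays connected if any one node is removed.

```python
from collections import deque

def cycle_graph(n):
    """Adjacency sets for a ring: node i linked to i-1 and i+1 (mod n)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def is_connected(adj):
    """Breadth-first search from one node; connected iff all are reached."""
    if not adj:
        return True
    start = next(iter(adj))
    seen, queue = {start}, deque([start])
    while queue:
        for v in adj[queue.popleft()]:
            if v in seen or v not in adj:
                continue
            seen.add(v)
            queue.append(v)
    return len(seen) == len(adj)

def remove_node(adj, x):
    """Return a copy of the graph with node x (and its edges) deleted."""
    return {u: nbrs - {x} for u, nbrs in adj.items() if u != x}

ring = cycle_graph(50)
assert max(len(nbrs) for nbrs in ring.values()) == 2  # everyone has 2 partners
assert is_connected(ring)                             # yet fully connected
assert is_connected(remove_node(ring, 0))             # and robust to removal
```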
Also, there is obviously clustering and heterogeneity in networks. Figure 3 shows data
from a school. You can see in this case that there is a combination of clustering by grade that
generates the different colors and also the very strong racial divide.
FIGURE 3
You might ask yourself the question, why is this? What is the model behind this? What
generates this? A number of people have hinted at this kind of stuff earlier.
Are these exogenous attributes at work—that is, birds of a feather stick together? If you are in the same grade as me, or the same race as me, that is why we are likely to be friends. Or is it
some kind of endogenous process where, if two people share a friend, they are more likely to be
friends themselves? That is a friend-of-a-friend process. It is interesting that in popular language
we have both of those ideas already ensconced in a little aphorism. So, thinking about
partnership dynamics and the timing and sequence and why that would matter, one of the things
that we have started looking at in the field of networks and HIV is the role that concurrent
partnerships play.
FIGURE 4
Concurrent partners are partnerships that don’t obey the rule of serial monogamy. That
is, each partnership does not have to end before the next one begins. You can see in Figure 4, in
the long yellow horizontal line labeled “1”, we have got somebody who has a partnership all the
way across the time interval, and then a series of concurrent partnerships with that, including a
little one night stand labeled “5” that makes three of them concurrent at some point in time. Now,
this is the same number of partners as in the upper graph, so it is not about multiple partners,
although you do need to have multiple partners to have concurrent partners. This is really a
function of timing and sequence.
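The timing-and-sequence point can be made concrete with a sweep over partnership intervals. A minimal sketch in Python, with hypothetical dates echoing the pattern in Figure 4:

```python
def max_concurrent(partnerships):
    """Sweep over start/end dates; return the peak number of partnerships
    active at one time. Intervals are (start, end) with end exclusive."""
    events = []
    for start, end in partnerships:
        events.append((start, 1))   # partnership begins
        events.append((end, -1))    # partnership ends
    events.sort()                   # ends sort before same-day starts
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak

# Serial monogamy: same number of partners, no overlap.
serial = [(0, 10), (10, 20), (20, 30)]
# One long partnership spanning the window plus overlapping shorter ones,
# including a one-night stand (hypothetical dates).
concurrent = [(0, 30), (5, 15), (12, 13)]

assert max_concurrent(serial) == 1
assert max_concurrent(concurrent) == 3
```

Both histories involve three partners; only the timing differs, which is exactly the distinction between serial monogamy and concurrency.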
What we find in Uganda and Thailand is very interesting. Uganda at the time that we
were doing this study had a prevalence of about 18 percent and Thailand had about 2 percent,
which is 10 times less. In Uganda men reported about 8 partners in their lifetime on average and in
Thailand it was 80. Now, 10 years later Thailand doesn’t have an epidemic that comes from
having 80 partners in your lifetime, so something else is obviously going on: it is not just the
number of partners. We looked into this concurrency a little bit, and we found that in both cases
you get concurrent partnerships. So, it is not the simple prevalence of concurrency that is the
difference. The difference comes in the kinds of partnerships that are concurrent. In Uganda you
tend to get two long-term partnerships that overlap, whereas in Thailand you have the typical sex
partner pattern, which is one long term partnership, and then a short term on the side. The net
result is that in Uganda the typical concurrent partnership is actually active on the day of the
interview—41 percent of them were, and they had been going for three years. The ones that are
completed are also reasonably long, about 10 months. In Thailand you don’t get anywhere near
as many active on the day of the interview. They are about two months long. Ninety-five percent
of them are a day long, so these concurrences happen and they are over.
That kind of thing can actually generate a very different pattern on an epidemic. The
simulations that we have done of that suggest that if you take this into account you do, in fact,
generate an epidemic in Uganda that has about 18 percent prevalence, whereas Thailand will
typically have just about 2 percent. The approach that we take is thinking about generative models here: local rules that people use to select partners, rules that cumulate up to global structures and a network. What we want is a generative stochastic model for this process, and that model is not going to look like, for example, "I want to create clustering." A clustering coefficient,
although it can be a great descriptive summary for a network, is not necessarily going to function
well as a generative model for a network. It is also probably the case that when I select a partner
I am not thinking I want to create the shortest path, the shortest geodesic to the east coast. That is
probably also not going on. Again, it’s a nice summary of a network but probably not a
generative property. We want to be able to fit both exogenous and endogenous effects in these
models, so that turns out to be an interesting and difficult problem. We also want this to work
well for sample data. We want to be able to estimate based on observed data, and then make an
inference to a complete network.
Figure 5 shows what this generative model is going to look like. We have adapted this
from earlier work in the field. It is an exponential random graph model. It basically takes the
probability of observing a network or a graph, a set of relationships, as a function of something
that looks a little bit like an exponentiated version of a linear model, and then a normalizing
constant down below that is all possible graphs of that size. This is the probability of this graph
as a function of a model prediction, with this as the normalizing constant. The origins go back at least to Bahadur in 1961 in the statistical literature, where he talked a little bit about a multivariate binomial. Besag adapted this for spatial statistics, and Frank and Strauss first proposed it for
networks in 1986. The benefits of this approach are that it considers all possible relations jointly,
and that is important because the relations here are going to be dependent, if there are any
endogenous processes going on, like these friends of a friend. It is an exponential family, which
means it has very well understood statistical properties, and it also turns out to be very flexible. It
can represent a wide range of configurations and processes.
FIGURE 5
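To make the model in Figure 5 concrete: the probability of a graph x is P(X = x) = exp(θ·z(x)) / c(θ), where c(θ) sums exp(θ·z) over all possible graphs of that size. For a tiny graph the normalizing constant can be computed by brute force. A sketch in Python for 4 nodes (the statistics and θ values here are illustrative choices, not from the talk):

```python
from itertools import combinations, product
from math import exp

N = 4
DYADS = list(combinations(range(N), 2))   # 6 dyads, so 2**6 = 64 possible graphs

def z(x):
    """Network statistics z(x) = (number of edges, number of triangles)."""
    edges = set(x)
    triangles = sum((a, b) in edges and (a, c) in edges and (b, c) in edges
                    for a, b, c in combinations(range(N), 3))
    return (len(edges), triangles)

def ergm_prob(x, theta):
    """P(X = x) = exp(theta . z(x)) / c(theta); c(theta) sums over every
    graph of this size, which is feasible only for tiny N."""
    weight = lambda g: exp(sum(t * s for t, s in zip(theta, z(g))))
    c = sum(weight([d for d, keep in zip(DYADS, bits) if keep])
            for bits in product([0, 1], repeat=len(DYADS)))
    return weight(x) / c

theta = (-1.0, 0.5)                        # illustrative: sparse, pro-triangle
triangle = [(0, 1), (0, 2), (1, 2)]
p = ergm_prob(triangle, theta)
assert 0 < p < 1
```

With θ = (0, 0) every graph is equally likely (the uniform random graph); the brute-force sum is exactly why estimation becomes hard at realistic sizes.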
What does model specification mean in a setting like this? There are two things that we
must choose, the set of network statistics z(x) and whether or not to impose homogeneity
constraints on the parameter θ. We are going to choose a set of network statistics that we think
are part of the self-organizing property of this graph. A network statistic is just a configuration of
dyads. Edges are a single dyad, that is the simplest network statistic, and that’s used to describe
the density of the graph. Others include k-stars, which are nested in the sense that a 4-star
contains quite a few 3-stars in it, and the 3-star has 3 2-stars in it. So, that is a nested kind of
parameterization that is common in the literature. We tend to prefer something like degree
distributions instead in part because I think they are easier to understand. Degrees are not nested.
A node simply has a degree, it has one degree only, and you count up the number of nodes that
have that degree.
Triads or closed triangles are typically the basis for most of the clustering parameters that
people are interested in. Almost anything you can think of in terms of a configuration of dyads
can be represented as a network statistic. Then you have the parameter θ, and your choice there is
whether you want to impose homogeneity constraints, and I believe Peter Hoff talked about this a
little bit in his talk this morning.
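The nesting of k-stars, and the contrast with degree counts, can be sketched directly: a node of degree d sits at the center of C(d, k) distinct k-stars, whereas it contributes to exactly one degree count. A small Python illustration with a hypothetical edge list:

```python
from math import comb
from collections import Counter

def degrees(n, edge_list):
    """Degree of each of the n nodes."""
    deg = Counter()
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    return [deg[i] for i in range(n)]

def k_stars(n, edge_list, k):
    """Number of k-stars: a node of degree d centers C(d, k) of them,
    which is why these statistics are nested."""
    return sum(comb(d, k) for d in degrees(n, edge_list))

def degree_counts(n, edge_list):
    """Non-nested alternative: how many nodes have each exact degree."""
    return Counter(degrees(n, edge_list))

# A 4-star: node 0 tied to nodes 1..4.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
assert k_stars(5, star, 4) == 1   # one 4-star...
assert k_stars(5, star, 3) == 4   # ...containing C(4,3) = 4 distinct 3-stars
assert degree_counts(5, star) == Counter({1: 4, 4: 1})
```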
FIGURE 6
There is clearly a lot of heterogeneity in networks. Heterogeneity can be based on either
nodal or dyadic attributes. People can be different because of their age, their race, their sex, those
kinds of exogenous attributes, or different types of partnerships may be different. Let’s think a
little bit about how we create these network statistics. Referring to Figure 6, the vector here can
range from a really minimal one, such as just the number of edges, which is the simple Bernoulli model for a random graph. But the vector can also be a saturated model, with one term
for every dyad, which is a large number of terms. Obviously, you don’t move immediately to a
saturated model. It wouldn’t give you any understanding of what was happening in the network
anyway, so what we are trying to do is work somewhere in between these two, some
parsimonious summary of the structural regularities in the network.
Figure 6 gives some examples from the literature that have been used in the past; the
initial model was the p1 model by Holland and Leinhardt in 1981. Peter Hoff’s talk this morning
fit the framework of a p1 model: it had an actor-specific in-degree and an actor-specific out-
degree—so, two parameters for every actor in the network—plus a single parameter for the
number of mutual dyads, that is, when i nominates j and j nominates i in return. That is a dyad-independent model, which is to say, the only dependence is within a dyad, between two nodes, through this mutual term; all other dyads are independent. That is a minimal model of
what is going on in a network.
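The sufficient statistics of that p1 model are easy to compute from nomination data. A minimal sketch in Python (the arcs are hypothetical):

```python
from collections import Counter

def p1_statistics(arcs):
    """Sufficient statistics of the Holland-Leinhardt p1 model: per-actor
    out-degrees and in-degrees, plus the number of mutual dyads
    (i nominates j AND j nominates i)."""
    arcs = set(arcs)
    out_deg, in_deg = Counter(), Counter()
    for i, j in arcs:
        out_deg[i] += 1
        in_deg[j] += 1
    mutual = sum((j, i) in arcs for i, j in arcs) // 2  # each pair counted twice
    return out_deg, in_deg, mutual

arcs = [(0, 1), (1, 0), (1, 2), (2, 0)]
out_deg, in_deg, mutual = p1_statistics(arcs)
assert mutual == 1                 # only the 0<->1 dyad is reciprocated
assert out_deg[1] == 2 and in_deg[0] == 2
```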
The first attempt to really begin to model dependent processes in networks—and that is,
the edges are dependent—was the Markov graph proposal by Frank and Strauss, which is also
shown in Figure 6. This model includes terms for the number of edges, the number of k-stars—
again, those are those nested terms—and the number of triangles. In each of those cases you can
impose homogeneity constraints on θ or not. So, for any network statistic that you have in your
model, the θ parameters can range from minimal—that is, there is a single homogeneous θ for all actors—to maximal, where there is a family of θi's, each being actor- or configuration-specific.
That was the case in Peter Hoff’s model. Every actor would have two parameters there. You can
say that every configuration has its own parameter, which is a function of the two actors, or
multiple actors that are involved in it.
In the Bernoulli graph, the edges are the statistic, and there is a single θ that says every
edge is as likely as every other edge. So, that is a homogeneity constraint. When we go with
maximal θ parameters, you quickly max out and lose insight, but you can sure explain a lot of
variance that way. In between are parameters that are specific to groups or classes of nodes—so
you might have different levels of activity or different mixing propensities by age or by sex. In
addition, there are parametric versus non-parametric representations of ordinal distribution. So,
for a degree distribution, you can have a parameter for every single degree, or you could impose a
parametric form of some kind, a Poisson, negative binomial, something like that.
The group parameterizations are typically used to represent attribute mixing in networks.
We have heard a lot about that, but this is the statistical way to handle that. There is a lot of work
that has been done on that over the last 20 years. The parametric forms are usually used to
parsimoniously represent configuration distributions—so, degree distributions, shared partner
distributions, and things like that.
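As an illustration of the parametric option: for a Poisson, the maximum-likelihood parameter is just the mean degree, so a single number stands in for one parameter per distinct degree. A sketch in Python with hypothetical observed degrees:

```python
from math import exp, factorial

def poisson_fit(degrees):
    """Parametric summary of a degree distribution: the Poisson MLE is
    the mean degree, replacing one parameter per distinct degree."""
    lam = sum(degrees) / len(degrees)
    pmf = lambda k: exp(-lam) * lam ** k / factorial(k)
    return lam, pmf

degrees = [1, 2, 2, 3, 2, 1, 3, 2]          # hypothetical observed degrees
lam, pmf = poisson_fit(degrees)
assert lam == 2.0
# Expected count of degree-2 nodes under the fitted Poisson:
expected_deg2 = len(degrees) * pmf(2)
assert 2.0 < expected_deg2 < 2.5
```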
FIGURE 7
Estimating these models has turned out to be a little harder than we had hoped; otherwise,
we would all be talking about statistical models for networks today. I don’t think there would be
anybody in here who would be talking about anything else.
The reason is that this thing is a little bit hard to estimate. Figure 7 shows the likelihood
function P(X = x) that we are going to be trying to maximize. The normalizing constant c makes
direct maximum likelihood estimation of that θ vector impossible because, even with 50 nodes,
there are an almost uncountable number of graphs. So, you are not going to be able to compute
that directly. Typically, there have been two approaches to this. The first that dominated in the
literature for the last 25 years is pseudolikelihood, which is essentially based on the logistic
regression approximation. We now know, and we knew even then, that this isn’t very good when
the dependence among the ties is strong, because it makes some assumptions there about
independence. MCMC is the alternative; it theoretically guarantees estimation.
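The logistic-regression approximation works on change statistics: for each dyad, how much each network statistic would move if that one tie were toggled on, holding the rest of the graph fixed. A sketch in Python for the edges-plus-triangles model (the graph is hypothetical):

```python
from itertools import combinations

def change_stats(n, edges, dyad):
    """Change statistics for the (edges, triangles) model: the shift in each
    statistic when the given dyad is toggled on with the rest of the graph
    held fixed. These are the regressors in the pseudolikelihood's
    logistic-regression approximation."""
    i, j = dyad
    others = set(edges) - {dyad, (dyad[1], dyad[0])}
    has = lambda a, b: (a, b) in others or (b, a) in others
    # Toggling {i,j} on adds 1 edge and one triangle per shared partner.
    d_edges = 1
    d_triangles = sum(has(i, k) and has(j, k) for k in range(n) if k not in dyad)
    return d_edges, d_triangles

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
# Dyad (1, 3): nodes 1 and 3 share partner 2, so one triangle would close.
assert change_stats(4, edges, (1, 3)) == (1, 1)
# Dyad (0, 3): shared partner 2 again.
assert change_stats(4, edges, (0, 3)) == (1, 1)
```

The approximation treats these conditional log-odds as if the dyads were independent, which is exactly what breaks down when tie dependence is strong.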
FIGURE 8
But that’s only “theoretically.” Implementation has turned out to be very challenging.
The reason has been something called model degeneracy. Mark Handcock is going to talk a lot
about this tomorrow, but I am just going to make a couple of notes about it today. Figure 8 shows
a really simple model for a network, just density and clustering. Those are the only two things
that we think are going on. So, there is the density term, which is just the sum of the edges, and
c(x) is the clustering coefficient that people like to use to represent the fraction of 2-stars that are
completed triangles. What are the properties of this model? Let’s examine it through simulation.
Start with a network of 50 nodes. We are going to set the density to be about 4 percent, which is
about 50 edges for a Bernoulli graph. The expected clustering, if this were a simple random
graph, would then just be 3.8 percent, but let’s give it some more clustering. Let’s bump it up to
10 times higher than expected and see how well this model does. By construction, the networks
produced by this simple model will have an average density of 4 percent, and an average
clustering of 38 percent. Figure 9 shows what the distribution of those networks looks like.
FIGURE 9
The target density and clustering would be where the dotted lines intersect, but virtually
none of the networks look like that. Most of these networks, instead, are either fairly dense, but
don’t have enough triangles, or not so dense, but are almost entirely clustered in triangles. What
does that mean? It means that this is a bad model. For a graph with these properties, if you
saw this kind of clustering and density, this is a bad model for it. This graph didn’t come from
that model, that is what it says. In the context of estimation, we call the model degenerate when
the graph used to estimate the model is extremely unlikely under that model.
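The simulation experiment just described can be reproduced in miniature with a Metropolis sampler over graphs. This is an illustrative sketch, not the code behind Figures 8 and 9; the acceptance rule uses the same change statistics as the model:

```python
import math
import random
from itertools import combinations

def sample_ergm(n, theta, steps, seed=0):
    """Metropolis sampler for the (edges, triangles) model: propose toggling
    one dyad, accept with probability min(1, exp(theta . delta_z)). With a
    strongly positive triangle weight the chain tends to collapse toward the
    near-empty or near-complete graph -- the degeneracy described above."""
    rng = random.Random(seed)
    dyads = list(combinations(range(n), 2))
    key = lambda a, b: (min(a, b), max(a, b))
    edges = set()
    for _ in range(steps):
        i, j = rng.choice(dyads)
        shared = sum(key(i, k) in edges and key(j, k) in edges
                     for k in range(n) if k != i and k != j)
        delta = theta[0] + theta[1] * shared   # change in theta . z if added
        if (i, j) in edges:                    # removing reverses the sign
            delta = -delta
        if math.log(rng.random() + 1e-300) < delta:
            edges ^= {(i, j)}                  # toggle the dyad
    return edges

# Density-only model (triangle weight 0) reduces to a Bernoulli graph.
g = sample_ergm(20, theta=(-3.0, 0.0), steps=5000, seed=1)
assert all(u < v for u, v in g)
assert len(g) <= 20 * 19 // 2
```

Tracking density and clustering along the chain for a positive triangle weight reproduces the bimodal scatter of Figure 9.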
FIGURE 17
Finally, Figure 18 shows the minimum geodesic distance between all pairs, with a certain
fraction of them here being unreachable.
FIGURE 18
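The geodesic distribution in Figure 18, including the unreachable fraction, comes from breadth-first search from every node. A minimal sketch in Python (the two-component graph is hypothetical):

```python
from collections import deque, Counter

def geodesic_distribution(adj):
    """BFS from every node: counts of minimum geodesic distances over all
    ordered pairs, with unreachable pairs tallied under the key 'inf'."""
    dist_counts = Counter()
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for t in adj:
            if t != s:
                dist_counts[dist.get(t, "inf")] += 1
    return dist_counts

# Two components: a path 0-1-2 and an isolated dyad 3-4.
adj = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}}
d = geodesic_distribution(adj)
assert d[1] == 6            # each edge gives two ordered pairs at distance 1
assert d[2] == 2            # 0 <-> 2 through node 1
assert d["inf"] == 12       # cross-component pairs are unreachable
```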
Figure 19 shows how the Bernoulli model does, and it doesn’t do very well.
FIGURE 19
Putting these all together, Figure 20 shows your goodness of fit measures.
FIGURE 20
Figure 21 shows what it looks like for all the models. The first column shows the degree
distribution; even a Bernoulli model is pretty good. Adding attributes doesn’t get you much, but
adding the shared partners, you get it exactly right on, and the same is true when you add both the
shared partner and the attributes. For the local clustering, which is this edgewise shared-partner
term, the Bernoulli model does terribly. I can't say the attribute model does a whole lot better. Of
course, once you put the two-parameter weighted shared-partner distribution in, you capture that
pretty well.
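The goodness-of-fit logic is: simulate many networks from the fitted model and compare the distribution of each statistic to its observed value. A sketch in Python for the simplest case, the Bernoulli model and the degree distribution (the parameters are illustrative):

```python
import random
from collections import Counter

def bernoulli_degree_dist(n, p, reps, seed=0):
    """Simulate Bernoulli (Erdos-Renyi) graphs and accumulate the degree
    distribution -- the simplest version of the simulation-based
    goodness-of-fit check: simulate from the fitted model, then compare
    each statistic's simulated distribution to the observed one."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(reps):
        deg = [0] * n
        for i in range(n):
            for j in range(i + 1, n):
                if rng.random() < p:   # each dyad is an independent coin flip
                    deg[i] += 1
                    deg[j] += 1
        counts.update(deg)
    return counts

counts = bernoulli_degree_dist(n=30, p=0.1, reps=50, seed=1)
total = sum(counts.values())
assert total == 30 * 50
# Mean simulated degree should sit near the Bernoulli expectation (n-1)p = 2.9.
mean_deg = sum(k * c for k, c in counts.items()) / total
assert 2.4 < mean_deg < 3.4
```

The same recipe applies to the shared-partner and geodesic distributions in Figures 17-21, with the Bernoulli simulator swapped for the fitted model's sampler.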
FIGURE 21
You don’t capture the geodesic well with edges alone, but it is amazing how well you get
the geodesic just from attribute mixing alone, just from that homophily. In fact, the local clustering term, this edgewise shared-partner term, doesn't capture it anywhere near as well, and you actually don't do as good a job when you put both of them in.
So, Figure 22 is the eyeball test. That is a different approach to goodness of fit. One
thing you want to make sure of with these models is that they aren’t just crazy. Obviously, those
degenerate models were crazy, and you could see that very quickly. They would either be empty
or they would be complete. For this figure, it actually would be hard to tell which one was
simulated and which one was real. They are actually getting the structure from the eyeball
perspective pretty well.
FIGURE 22
There are 50 schools from which we can draw information. There are actually more, but
50 that have good information. They range in size from fairly small—and this school 10 is one of
the smaller mid-sized schools, with 71 students in the data set—but we can use these models all
the way up to beyond 1,000. We have now used them on 3,000-node networks, and they are very
good and very stable. Figure 23 compares some results for different network sizes, using the
model that has both the friends-of-a-friend and the birds-of-a-feather processes in it. It does very
well for the smaller networks, but as you start getting out into the bigger networks, the geodesics
are longer than you would expect. Basically, I think that is telling you there is more clustering,
there is more pulling apart in these networks, and less homogeneity than these models assume.
So, there is something else that is generating some heterogeneity here.
FIGURE 23
The other thing you can do is to compare the parameter estimates across the models,
which is really nice. In Figure 24, we look at 59 schools, using the attribute-only model. You
can see the differential effects for grades: the older students tend to have more friends.
FIGURE 24
Figure 25 shows the homophily effect, the birds-of-a-feather effect, which is interesting.
You see that it is strongest for the younger and the older, and those are probably floor and ceiling
effects. Mean effects for race don’t show up as being particularly important, but blacks are
significantly more likely than any other group to form assortative ties. Interestingly, you can see
that Hispanics really bridge the racial divide. So, there are all sorts of nice things you can do by
looking at those parameters as well.
Finally, the other thing you can do is examine what is the effect of adding a transitivity
term to a homophily model. I mean, how much of the effect that we attributed to homophily
actually turns out to be this transitivity effect instead. It turns out the grade-based homophily
estimates fall by about 14 percent once you control for this friend-of-a-friend effect. The race
homophily usually falls, but actually sometimes rises, so once you account for transitivity you
find that the race effects are actually even stronger than you would have expected with just the
homophily model. This is shown in Figure 26.
FIGURE 25
FIGURE 26
The transitivity estimate—this friend-of-a-friend effect—falls by nearly 25 percent once you
control for the homophily term, as shown in Figure 27. This doesn’t seem like much, but it is
amazing to be able to do this kind of stuff on networks, because we have not been able to do this
before. We have not been able to test these kinds of things before. What we have now is a
program and a theoretical background that allows us to do this.
FIGURE 27
What this approach offers is a principled method for theory-based network analysis
where you have a framework for model specification, estimation, comparison and inference.
These are generative models and they have tests for fit, so it is not just that you see there is clustering—you can ask whether it is homogeneous clustering, and how well it fits. These give you the answers to those
questions. We have methods for simulating networks that fall out of these things automatically,
because we can reproduce the known, observed, or theoretically-specified structure just by using
the MCMC algorithm. For the cross-sectional snapshots, this is a direct result of the fitting
procedure. For dynamic stationary networks, it is based on a modified MCMC algorithm, and
dynamic evolving networks means you have to have model terms for how that evolution
proceeds. You can then simulate diffusion across these networks and, in particular, for us,
disease transmission dynamics. It turns out the methods also lend themselves very easily to
sampled networks and other missing network data.
With that I will say if you are interested in learning more about the package that we have,
it is an R-based package. We are going to make it available as soon as we get all the bugs out. If
you want to be a guinea pig, we welcome guinea pigs, and all you need to do is take a look at
http://csde.washington.edu/statnet. Thank you very much.
QUESTIONS AND ANSWERS
DR. BANKS: Martina, that was a lovely talk and you are exactly right, you can do things
that have never been done before, and I am deeply impressed by all of it. I do wonder if perhaps
we have not tied ourselves into a straightjacket by continuing to use the analytic mathematical
formulation of these models.
DR. MORRIS: The analytic what formulation?
DR. BANKS: An analytical mathematical formulation growing out of a p1 model and
just evolving. An alternative might be to use agent-based simulation, to try and construct things
from simple rules for each of the agents. For example, it is very unlikely that anybody could have
more than 5 best friends or things like that.
DR. MORRIS: Actually, you would be surprised how many people report that. I am
kidding. I agree, except that I think that, depending on how you want to define agent-based
simulation, these are agent-based simulations. What I am doing is proposing certain rules about
how people choose friends. So, I choose friends because I tend to like people the same age as me,
the same race as me, the same socioeconomic status. Those are agent-based rules and, when
other people are using those rules, then we are generating the system that results from those rules
being in operation. I don’t see a distinction between these two in quite the same way, but I do
agree.
One thing that we did do was focus on the Markov model for far too long. It was edges,
stars, and triangles. I think there is an intuitive baby in the bath water, and that is, edges are only
dependent if they share a node. That is a very natural way to think about trying to model
dependency, but I think it did kind of narrow our focus probably more than it should have.
DR. BANKS: You are exactly right. Your models do instantiate something that could be
an agent-based system, but there are other types of rules that would be natural for friendship
formation that would be very hard to capture in this framework, I think.
DR. MORRIS: I would be interested in talking to you about what those are.
DR. BANKS: For example, the rule that you can’t have more than 5 friends might be
hard to build in.
DR. MORRIS: No, in fact, that is very easy to build in. That is one of the first things we
had to do to handle the missing data here, because nobody could have more than five male or five
female friends.
DR. HOFF: There was one middle part of your talk which I must have missed, because
you started talking about the degeneracy problem with the exponentially parameterized graph
models, and then at the end we saw how great they were. So, at some point there was the
transition, by including certain statistics, or is it including certain statistics that makes them less
degenerate, or is it the heterogeneity you talked about? I could see how adding heterogeneity to
your models or to the parameters is going to drastically increase the amount of the sample space
that you are going to cover. Could you give a little discussion about that?
DR. MORRIS: These models have very little heterogeneity in them relative to your
models. So, every actor does not have both an in-degree and an out-degree. There is basically
just a degree term for classes. So, grades are allowed to have different degrees, race is allowed to
have different degrees. But that wasn't really what made this work. What made this work
was the edgewise shared partner. When we had originally tried using the local clustering term as
either the clustering coefficient or the number of triangles with just a straight theta on it, those are
degenerate models. The edgewise shared partner doesn't solve all problems either, but at least it provides a bound—that is essentially the effect that it has: it bounds the tail and says that people can't have that many partners. So, that is what changed everything.
DR. JENSEN: I think the description of model degeneracy is wonderful. Fortunately, it
wasn’t 10 years of work in my case. It was more like 6 months of work that went down the drain
and I didn’t know why, and I think you have now explained it. Is that written up anywhere? Are
there tests that we can run for degeneracy? What more can we do?
DR. MORRIS: That is a great question. Mark Handcock is really the wizard of model
degeneracy, and I think he is going to give a talk tomorrow that can answer some of those
questions. I don’t think we have a test for it yet, although you can see whether your MCMC
chain is mixing properly and, if it is always down here and then all of a sudden it goes up here,
then you know you have got a problem. It is still a bit of an art. statnet, this package, will
have a lot of this stuff built into it.
DR. SZEWCZYK: My question is, is it model degeneracy, or is it model inadequacy? I
look at a lot of these things and my question is, can we take some of these models and, rather than
just fitting one universal model, can we go in there and, say, fit a mixture of these p-star models,
or these Markov models or p1 models, rather than assuming that everyone acts the same within
these groups?
DR. MORRIS: There are lots of ways to try to address the heterogeneity, I agree with
you, and I think they need to be more focused on the substantive problem at hand.
So, just throwing in a mixture or throwing in a latent dimension, to me, kind of misses the point
of why do people form partnerships with others. So, when I go into this, I go in saying, I want to
add attributes to this. A lot of people who have worked in the network field don’t think attributes
matter because somehow it is still at the respondent level. We all know that, as good network
analysts, we don't care about individuals. We only care about network properties and dyads.
DR. SZEWCZYK: We care about individuals.
DR. MORRIS: Attributes do a lot of work. They do a lot of heavy lifting in these
models and they actually explain, I think, a fair amount. I would call it model degeneracy only in
this case because you get an estimate and you might not even realize it was wrong. In fact, when
people used the pseudolikelihood estimates, they had no idea that the cute little estimate they
were getting with a confidence interval made no sense at all. It is degenerate because of what it does: it actually gets the average right, but then it gets all the details
wrong. So, you can call that inadequate, and it is. It is a failure. It is a model failure. That is very
clear.
DR. WIGGINS: So, one thing to follow up on that I was wondering about, since each of
these models defined a class, I wonder if you thought about treating this using classifiers,
large-margin classifiers, like support vector machines. Some anecdotal evidence is that
sometimes you can tell if none of your network models is really good for a network that you are
interested in. So, some of these techniques that measure everything at once, rather than
measuring a couple of features you want to reproduce will show you how one network, if you
look at it in terms of one attribute, it looks like model F, but if you look at it in terms of a
different attribute, it turns out to be model G, and that might be one way of seeing whether or not
you have heterogeneity or just none of your models is a good model. If you have a classifier,
then all the different classifiers might choose, not me, as the class, in which case you can kind of
see if none of your models is the right model.
DR. MORRIS: Yes, that is a nice idea.