network and be looking for functional clusters of nodes. There are a variety of situations in which you might be interested in this.
Finding groups in networks is an old problem. It is something that computer scientists, in particular, have looked at for many decades. But there is a difference between what we want to do with social networks, for example, and what computer scientists have traditionally done. First of all, in the computer science problems, traditionally when you are looking for the groups in networks, there are some additional constraints that we don’t have, such as you know how many groups you want beforehand. A standard computer science problem might be that you have a bunch of tasks and you want to divide them up over many processors on a computer, and you know how many processors you have. You know how many groups you want to divide them into. Very often you might want the groups to be of roughly equal sizes. That is a typical constraint in the kinds of algorithms that computer scientists look at because, for example, you want to load balance between many processors in a computer. In the problems we are looking at that is not often true.
What is more, there is a fundamental philosophical difference compared to the traditional approach of partitioning a computer network, in that we are assuming here that the social networks we are looking at naturally divide into these groups somehow. In the kinds of problems you want to solve in computer science, often you are just given the network and you want to find whatever the best division of it is into groups, such that you have a bunch of these tightly knit groups, and there are lots of connections within the groups, and only a few connections between the groups, and you do the best you can with whatever network you are given. We are taking a slightly different point of view here, and assuming from the outset that there is some good division for some good sociological reason, for example, that your social network naturally divides up into these groups, and we would like to find the natural divisions of the network. And those natural divisions might involve dividing it into some small groups and some large groups, and you don’t know how many groups there will be beforehand, and so forth.
There are old algorithms for doing this from computer science, like the one shown in Figure 2, which you might be familiar with, the so-called Kernighan-Lin algorithm from the 1970s. In this algorithm, if you want to divide a network into two parts, you can start by dividing it any way you like, essentially at random, and then you repeatedly swap pairs of vertices between the two parts in an effort to reduce the number of edges that run between the left-hand group and the right-hand group in Figure 2. You continue doing that until you can’t improve things any more, and then you stop. This is typical of the kinds of things that traditional algorithms do. It works very well but it has some constraints. First of all, you need to know the sizes of the groups