C
Content of Network Science Courses
Appendix C offers a detailed account of the results of the survey the committee conducted on the core materials taught in network science courses. To achieve its goal, the committee followed a twostep procedure. First, it searched for currently taught courses on networks, looking in all possible departments, from computer science and physics to the social and biological sciences. A representative list of such courses is provided in Table C1. The committee’s work was facilitated by the fact that many courses post a detailed syllabus on the Web, as well as links to other courses on the same topic. If a syllabus was not available on the Web site, a copy was requested from the instructor. After inspecting the collected syllabi, it was possible to discern a set of core concepts that are shared by a wide range of courses and application areas that are often common only within a specific field. The set of concepts was then shared with a number of researchers and educators who are involved in research related to network science or who teach related courses. Based on the input provided by these individuals, the committee created the survey of core material in this appendix.
The committee faced a significant challenge as it sought to synthesize the diverse body of material taught in a wide range of disciplines. Indeed, as discussed in Chapter 2, network science is called on to address problems that not only cut across disciplines but also represent a vast body of knowledge, from infrastructure networks, such as the power grid and the Internet, to pervasive applications running over these infrastructure networks, such as the World Wide Web (www); from networks of gene and protein interactions to social and economic networks. The board range of these networks is illustrated in Table C2.
In what sense can the study of such diverse subjects examples fall under a single unifying domain? The realization that despite their diversity, most real networks are characterized by a small set of organizing principles helped answer this question. For example, the widespread emergence of scalefree networks is rooted in the role of growth and preferential attachment, mechanisms present in many real systems, from cell biology to computer science. This implies that a unified set of tools can be applied to characterize the properties and behavior of a wide range of real networks. For example, tools developed by mathematicians to understand random networks or measures introduced by sociologists to explore social networks can be applied by biologists to design new drugs for disrupting the metabolic network or by computer scientists to explore the properties of the World Wide Web or the Internet. These generic or universal features of real networks and tools are reflected in the courses that are currently taught. Despite their diversity, most courses cover basic concepts that appear to be common across disciplines. The role of this appendix is to survey these common concepts, tools, and methods, identifying the material at the core of network science.
OVERALL ORGANIZATION
Even in established fields of science there is significant room for diversity of focus and interpretation. For example, a survey of basic courses on, say, quantum mechanics or economics would show that while there is broad agreement across instructors, textbooks, and universities on a small set of topics that must be covered, there are significant variations in the examples used, the application areas covered, and special topics. The committee shows that this is eminently the case for networksciencerelated courses as well. Broadly speaking, one can identify a set of core concepts that emerges in a wide range of courses in network science, largely independent of their discipline. These core concepts are typically embedded into applications with different focus. Applications can be grouped into three main areas. The first and the most dynamically evolving area, biological networks, applies network theory to subcellular (metabolic, regulatory, genetic) networks, neural networks, and ecological networks like food webs and species interactions. The second area, social and economic networks, encompasses a wide range of topics such as social interactions, collabora
TABLE C1 Representative Courses on Computer Science
Courses 
Institution 
Name of Course 
Web Address 
Core courses 
Pennsylvania State University 
Graphs and Networks in Systems Biology 

University of Michigan, Ann Arbor 
Network Theory 
http://wwwpersonal.umich.edu/~mejn/courses/2004/cscs535/index.html 

Columbia University 
Networks and Complexity in Social Systems 

Columbia University 
Scaling in Networks 
http://comet.columbia.edu/courses/elen_e9701/2001/outline.html 

University of California at Irvine 
Networks and Complexity 
http://eclectic.ss.uci.edu/~drwhite/Anthro179a/SocialDynamics02.html 

University of Pennsylvania 
Networked Life 

University of Indiana at Bloomington 
Structural Data Mining 

University of Patras (Greece) 
Networks 
http://nicomedia.math.upatras.gr/courses/mnets/index_en.html 

Applications and related courses 
University of Michigan, Ann Arbor 
Information Retrieval 

Cornell University 
Structure of Information Networks 

Massachusetts Institute of Technology 
Complex Human Networks Reading Group 

Virginia Tech 
Recommender Systems 
http://people.cs.vt.edu/~ramakris/Courses/CS6604RS/outline.html 

Boston College 
Social Network Analysis 

University of Toronto 
Social Network Analysis 
tions, social filtering, and economic alliances. The third application area, infrastructure and communication networks, ranges from the Internet and the World Wide Web to power grids, phone networks, and sensor networks.
In the following sections the committee describes in some detail the core concepts, followed by a short discussion of the three application areas. Table C3 lists some core material that could be expected in network science courses.
NETWORK STRUCTURE
The elementary attributes of all networks are the nodes, which are the basic units of a system, and the links, which are the connections between the nodes. Both the nodes and the links can widely differ in different fields. For example, the nodes might be humans or scientists in social networks; molecules, genes, or neurons in biology; routers or transformers in infrastructural networks; and Web pages or research publications in information networks. Similarly, the links might be friendships, alliances, reactions, synapses, optical and copper cables, URLs, or citations. The totality of the nodes and the links defines a network, often represented in graphic form as a connectivity matrix, telling us which nodes are directly connected to each other. Given that the study of networks was traditionally part of graph theory, a branch of mathematics with long and distinguished history, most of the language that network theory uses today has its roots in graph theory. A network map (or connectivity matrix) is typically the starting point for characterizing the structure or topology of any network.
Once a network has been mapped, the first priority is to characterize its topological and structural features. Degree of connectivity represents the most elementary measure of a node, specifying the number of links a node has to other nodes. Much can be learned about a network by inspecting the degree distribution, which in its simplest manifestation is a histogram of the number of nodes with a given degree. Other important measures include the shortest path between two nodes, which plays a key role in identifying smallworld effects; the diameter, which is the distance between the two most distant nodes; the subgraphs and communities that characterize the relationship between small subsets of nodes within a network; the spectral properties, which help us capture a series of local and global network characteristics; and,
TABLE C2 RealWorld Networks Appearing in Courses
Discipline/Course 
Type of Network 
Infrastructure and communications networks 
Power grid Internet Public switched telephone network 
Information and content distribution networks 
World Wide Web Broadcast Sensors Search 
Social networks 
Collaboration Communities Social filtering and recommendations Economic Linguistic 
Computing networks 
Neural nets Petri nets Cellular automata Interacting intelligent agents 
Engineering systems 
Control networks Integrated circuits Queuing networks Process networks Transportation networks Supply chains and manufacturing 
Research networks 
Scientific grid Collaborations Blogs and online journals 
Military networks 
Terrorist networks Intelligence networks Logistics networks 
Biological networks 
Metabolism Gene and protein interactions Biomanufacturing Regulatory and control networks Ecological networks and food webs Viruses and epidemics 
finally, the link strength or weight, which characterizes the nature of the interactions between different nodes.
Based on these measures, real networks may be classified in perhaps two or three major classes. First, there is a class known as regular networks, or graphs, in which the degree of all the nodes assumes the same value or only a few discrete values and the underlying network has a regular, repetitive structure. Such regular graphs approximate the structure of most crystals, as well as a number of other objects, engineered and natural, from the retina of the eye to the roads of some large cities (like New York). Much attention, however, has focused on random networks, systems in which the nodes are randomly connected to each other. In such networks the degrees follows a Poisson distribution. Despite their important role in network theory, we do not know of major real networks that would be fully random. Finally, the availability of largescale network maps has led to the discovery that many real networks are neither regular nor fully random but, rather, scalefree. They have a heavily tailed degree distribution—that is, there are significant (order of magnitude) differences in the degree of different nodes. Scalefree networks describe the cell, the Web, the Internet, and many collaboration and social and economic networks. While many real networks are intermediate between these three classes, this classification captures some of the basic primitives used in many courses on networks and most networks are characterized in terms of the three classes.
An important question surfacing in many networksciencerelated courses is the following: What processes and mechanisms give rise to the network characteristics discussed in the preceding section? A closely related question is this: How do we generate networks with structural characteristics that mimic the properties of selected real networks? Network models, introduced to answer these two questions, are an important part of most network science courses (see Table C4). These models have two main functions. First, some models aim to mimic, in a simplified form, the emergence and evolution of real networks, helping us to understand the mechanism responsible for the formation of real networks. Second, to test the impact of selected network characteristics on the network’s behavior, we need to gener
TABLE C3 Content of a Typical Network Science Course
Subject 
Content 
Core concepts 
Realworld networks Characterization and classifying networks and their components Network modeling 
Network interpretation and processes 
Flow and routing Aggregation and growth Communication and coordination 
Behavior: networks as dynamic entities 
Performance and scaling Robustness Routing and congestion Disruptability 
Engineering methods in network science 
Network design Network analysis 
Applications of network science 
Information and communication network Biological networks Social networks Control and mechanical systems Industrial applications Military applications 
TABLE C4 Network Models Commonly Used to Generate Network Topologies and Analytical Tools Used to Characterize and Study the Properties of Models
Network Models 
Analytical Tools 
Random networks ErdosRenyi model Percolation based Scalefree models Growth and preferential attachment Static models Optimization Static models Smallworld model Optimized topologies 
Exact methods Discrete math Combinatorics Graph theory Dynamical systems Master and rate equations Mean field theory Generating functions Stochastic networks Statistical mechanics Agentbased models Clustering tools 
ate synthetic networks with preset properties. Given these needs, network modeling is one of the most highly studied components of network science.
Historically the most studied model is the random network model explored by Erdos and Renyi, which generates networks by placing the links randomly between the nodes. Random networks represent an important reference frame in network modeling. Despite the fact that we do not know of real networks that are captured by this model, given that most of its characteristics can be calculated exactly, it represents an important theoretical tool against which various real networks and hypotheses can be tested. While scalefree network models represent a recent addition to the network literature, given that many real networks of interest, from biology to computer science, have scalefree characteristics, in the past few years they have become the most investigated class of model. Several models are available to generate scalefree networks. The most studied one was introduced by Barabási and Albert (1999) to capture the formation of a real network. It involves two main ingredients, growth and preferential attachment. Given the diversity of networks, a whole class of models has been introduced that incorporate other mechanisms affecting a network’s evolution, from fitness to node and link removal and aging. In addition, a number of models do not capture the network formation process but result in scalefree topologies through either a fitnessdriven or optimization procedure. Despite the important theoretical role these models play, there is no clear evidence that real networks would be shaped by these processes. The small world model, introduced by Watts and Strogatz (1998), interpolates between regular and random networks and has also generated significant theoretical interest.
Understanding the properties of these networks requires a number of analytical tools that have been developed by a number of fields, from discrete mathematics to statistical mechanics. The study of random networks has a long history, using exact methods developed in graph theory, combinatorics, probability theory, and stochastic processes. Scalefree network models, which represent graphs that change in time, are typically studied using methods based on rate and master equations capable of precisely predicting the degree distribution and other characteristics of scalefree networks. In some cases mathematicians have employed exact methods and the tools of dynamical systems and percolation theory to obtain exact results for these networks. Optimization and genetic algorithms have been used to generate optimized networks. Finally, in order to identify communities and groups in networks there has been a crossdisciplinary interest in developing network clustering methods.
NETWORK DYNAMICS
The specification of the dynamics that characterize a network is less straightforward because these dynamics tend to be rather different in the various applications areas. As mentioned in Chapter 3, the examples range from phase transitions in physical systems, to chemical reactions governed by rate equations, to sociological interactions between people. Here we develop just one example from the computer science discipline.
Computers in a network must often be able to broadcast messages to all other computers in the network. The simplistic protocol would require that each computer maintain a view of the addresses of all computers in the entire network. If computers are constantly leaving or entering the network, the significant problem arises of updating and maintaining a consistent view of the network at each computer. This updating of the views prevented early systems from scaling in size. The solution was for each computer to maintain only a small partial view of the network. The broadcast protocol is for a computer to broadcast to each computer in its view and have computers receiving a broadcast retransmit the message to computers in their views. These more sophisticated protocols that are necessary to overcome scaling problems have lead to sophisticated mathematical techniques and algorithms (Demers et al., 1987). This is just one example of many directions in which basic theory in network science is emerging in the computer science area and appearing in advanced courses.
NETWORK FUNCTION
The purpose of most networks is to transport information, people, or material. These functions often determine both the structure and the dynamics of real networks. Therefore, the understanding of network function has been a very active area in a number of research fields, and it is reflected in most
network courses. The high degree of universality and commonality that is present in network structure largely disappears when it comes to network function, thanks to the diversity of the functions that networks assume in different domains. For example, the purpose of the Internet is to transfer information in an efficient manner, in contrast to the purpose of the metabolic network, which is to process the chemicals consumed by the cell and to turn them into the cell’s building blocks. Therefore, on the top of similar topologies one can define a wide range of dynamical rules, from flow to diffusion and contagion, that lead to different functions and network behavior. Despite this diversity, a number of common and highly studied themes have emerged in the past few years.
One of the most studied themes focuses on diffusion on networks. Indeed, networks support a wide range of diffusive processes that can cause serious problems in a number of application areas. Social and informational networks are responsible for the diffusion of ideas. Sexual and contact networks are responsible for the spread of infectious diseases and viruses, ranging from acquired immunodeficiency syndrome (AIDS) to influenza to severe acute respiratory syndrome (SARS). Computer and email networks are responsible for the spread of electronic viruses and worms, generating billions of dollars in damages. Finally, social and professional networks are responsible for the spread of innovations, ideas, and rumors, playing a key economic and social role. It turns out that the numerical and analytical tools used by different fields to approach these problems are highly similar. Yet, in the past few years, with the increasing understanding of the topology of sexual contact networks and computer networks, we have witnessed significant paradigm shifts. For example, the discovery that in scalefree networks diseases do not experience an epidemic threshold has had a significant impact on the strategies used for epidemic modeling and on the design of efficient interventions. While these processes are discussed from the perspective of different fields, they are widely covered in a wide range of courses, from biology to business to computer science.
Network flows, describing mostly the situation when something tangible flows from source to destination, are another class of widely studied multidisciplinary problems. They emerge in the Internet as the flow of bits along the physical infrastructure, in the study of metabolic networks as the flux of matter across reactions, or in transportation networks as, for example, the flow of cars on the highway. In general, the network structure canalizes these flow processes and to a high degree determines the flow rates and the necessary capacities on each link and node.
Another muchstudied interdisciplinary problem is a network’s ability to carry on its functions in the face of errors and failures. Centered on the question of robustness and resilience, many of these studies explore the network’s dynamical and topological integrity under node and link loss (Dodds et al., 2003). A number of studies have shown that there is a strong interplay between a network’s robustness and its structure. For example, scalefree networks are very robust under random node removal but highly vulnerable to the systematic removal of their hubs. The situation becomes even more complex if network flows are considered, which could lead to cascading failures, such as the 2003 Northeast electricity blackout, affecting millions of consumers (Watts, 2002). But robustness studies play a key role in designing new drugs or in developing systems and network topologies that are highly error resistant.
Another class of muchstudied problems focuses on search in networks, leading to algorithms and methods to efficiently locate information in complex networked structures. These studies play a key role in a wide range of problems, from Web search algorithms to the identification of information and expertise in an organization.
REFERENCES
Barabási, A.L., and R. Albert. 1999. Emergence of scaling in random networks. Science 286(5439): 509–512.
Demers, A., D. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. Sturgis, D. Swinehart, and D. Terry. 1987. Epidemic algorithms for replicated database maintenance . In Proceedings of the 6th ACM Symposium on Principles of Distributed Computing. Vancouver, B.C., Canada.
Dodds, P.S., D.J. Watts, and C.F. Sabel. 2003. Information exchange and the robustness of organizational networks. Proceedings of the National Academy of Sciences of the United States of America 100(21): 12516–12521.
Watts, D.J. 2002. A simple model of global cascades on random networks. Proceedings of the National Academy of Sciences of the United States of America 99: 12516–12521.
Watts, D.J., and S. Strogatz. 1998. Collective dynamics of “smallworld” networks. Nature 393(6684): 440–442.