Emergence of Scaling in Random Networks

Networks,Reading,Science — Zac Townsend @ October 31, 2012 10:12 am

This article by Barabasi and Albert provide an explanation for why the distribution of links over nodes in many real-world networks follow a power-law distribution. Power-law distributions are defined by a concentration of the density in a small number of units and then a long-tail.

The distribution function of connectivities for various large networks. (A) Actor collaboration graph with N=212,250 vertices (B) WWW, N=325,729 (C) Power grid data, N =4941, . The dashed lines have slopes (A)=2.3, (B)=2.1 and (C)=4

After going through some example of networks with complex topologies--everything from neurons to the World Wide Web--they note that networks have traditionally been described by the random network model of Erdos and Renyi (the "ER" model). But, the ER model (and the WS model) is not up to the task of explaining the scale-free power law distributions that they find by "exploring several large databases describing the topology of large networks that span fields as diverse as the WWW or citation patterns in science."

More precisely, the authors find that independent of the system by which the networks are created, that the probability P(k) that a node has k links follos $P(k) \sim k^{- \gamma}$. They explain these distributions by developing a model that includes growth and preferential attachment--in short, 1) that networks start small and grow larger and 2) that no matter if a) because early nodes (those that exist when the network is young) have more random chances to be linked to new nodes or because b) there is "preferential attachment" made to those nodes that are already popular, that there will be some winners with many more links than most other nodes.

Why do the ER and WS models not accurately explain the power-law phenomena? Both models start with a fixed number of nodes and then randomly connect them (ER) or reconnected (WS), whereas this model takes explicit account of the growing number of nodes that occurs in most real world networks. Additionally, the two random network models assume that the probability that any two nodes are connected is random and uniform, whereas this model includes preferential connectivity:

For example, a new actor is most likely to be cast in a supporting role with more established and better-known actors. Consequently, the probability that a new actor will be cast with an established one is much higher than that the new actor will be cast with other less-known actors.

So, how does their model work? It begins with some small number of nodes and then keeps adding new nodes with the probability that the new nodes will be connected to all the previous nodes weighted by the previous nodes popularity. In their language,

1. To incorporate the growing character of the network, starting with a small number ($m_0$) of nodes,
at every time step they add a new node with $m(m_0)$ links, so the new node connects to $m$ different vertices already present in the system.
2. To incorporate preferential attachment, they assume that the probability $\Pi$ that a new node will be connected to node $i$ depends on the connectivity $k_i$ of that node, so that $\Pi (k_i)=\frac{k_i}{\sum_jk_j}$.

They continue by developing slight variants of this model--one that assumes growth but not preferential treatment and one that assumes preferential treatment but not growth--and show that neither is sufficient to explain the emergent scale-free power law distributions seen in real-world networks.

Because of preferential treatment, early gains are quite important and older nodes increase their connectivity at the expense of the younger ones. This leads to some nodes "that are highly connected, a “rich-get-richer” phenomenon that can be easily detected in real networks."