Large Scale Bayesian Modelling of Complex Networks |
Andreas Leon Aagaard Moth, Kristoffer Jon Albers
|
Abstract | Bayesian stochastic blockmodelling has proven a valuable tool for discovering community structure in complex networks. The Gibbs sampler is one of the most commonly used strategies for solving the inference problem of identifying this structure. Although it is widely used, the performance of the sampler has not been examined sufficiently for large-scale modelling of real-world complex networks. The Infinite Relational Model is a prominent non-parametric extension of the Bayesian stochastic blockmodel, which has previously been scaled to model large bipartite networks.
In this thesis we examine the performance of the Gibbs sampler and the more sophisticated Split-Merge sampler in the Infinite Relational Model. We push the limit for network modelling by implementing a high-performance sampler capable of large-scale modelling of complex unipartite networks with millions of nodes and billions of links. We find that it is computationally possible to scale the sampling procedures to handle these huge networks.
By evaluating the performance of the samplers on networks of different sizes, we find that the mixing ability of both samplers decreases rapidly with network size. Though we find that Split-Merge can increase the performance of the Gibbs sampler, these procedures are unable to mix properly over the posterior distribution even for networks with only about 1000 nodes. These findings clearly indicate the need for better sampling strategies in order to expedite the study of real-world complex networks. |
Type | Master's thesis [Academic thesis] |
Year | 2013 |
Publisher | Technical University of Denmark, Department of Applied Mathematics and Computer Science |
Address | Matematiktorvet, Building 303B, DK-2800 Kgs. Lyngby, Denmark, compute@compute.dtu.dk |
Series | M.Sc.-2013-92 |
Note | |
Electronic version(s) | [pdf] |
Publication link | http://www.compute.dtu.dk/English.aspx |
BibTeX data | [bibtex] |
IMM Group(s) | Intelligent Signal Processing |