Fast Community Detection with Graph Sparsification

Jesse Laeuchli

doi:10.1007/978-3-030-47426-3_23

. 2020 Apr 17;12084:291–304. doi: 10.1007/978-3-030-47426-3_23

Fast Community Detection with Graph Sparsification

Jesse Laeuchli ^7,^✉

Editors: Hady W Lauw⁸, Raymond Chi-Wing Wong⁹, Alexandros Ntoulas¹⁰, Ee-Peng Lim¹¹, See-Kiong Ng¹², Sinno Jialin Pan¹³

PMCID: PMC7206315

Abstract

A popular model for detecting community structure in large graphs is the Stochastic Block Model (SBM). The exact parameters to recover the community structure of a SBM has been well studied, and many methods have been proposed to recover a nodes’ community membership. A popular approach is to use spectral methods where the Graph Laplacian L of the given graph is created, and the Fiedler vector of the graph is found. This vector is then used to cluster nodes in the same community. While a robust method, it can be expensive to compute the Fiedler vector exactly. In this paper we examine the types of errors that can be tolerated using spectral methods while still recovering the communities. The two sources of error considered are: (i) dropping edges using different sparsification strategies; and (ii) inaccurately computing the eigenvectors. In this way, spectral clustering algorithms can be tuned to be far more efficient at detecting community structure for these community models.

Keywords: Clustering, Graph sparsification, Stopping criteria

Background and Motivation

Stochastic Block Models. Detecting communities through clustering is an important problem in a wide variety of network applications characterized by graphs [1, 3]. However, it can be difficult to study the accuracy of clustering on arbitrary graphs. To aid network analysis, generative models are frequently introduced. One popular model is the Stochastic Block Model (SBM) [2]. In this model a number of nodes n with community memberships are given, and the connectivity of vertices p and q within (and between) the communities are also specified. For a given graph with parameters G(n, p, q), we define Inline graphic , . It has been shown that the community structure can only be recovered when [14]. While models of more than two communities are sometimes studied, in this paper we restrict our attention to the case where the number of communities is fixed at two. There are two reasons for this. The first is that there is more theory available to work with. The second is that in practice it is often the custom when seeking communities in a graph to recursively cluster the nodes into two groups and then continue recursively, since this approach lends itself to high performance computing [4]. A two community model is therefore relevant to real-world approaches and worthy of study. Our goal in this paper is to discuss the question of how we can recover the communities of a SBM faster, by applying graph sparsification and inaccurate eigenvector computation, without harming the accuracy of our recovery methods. Additionally, we show how we can leverage recent research on nearly linear-time solvers to capitalize on the sparser graphs we obtain.

Spectral Sparsification. A popular approach to clustering is to find the Fiedler vector of the graph Laplacian L [4]. This is faster the sparser L is. Given a matrix L, we say that matrix Inline graphic is similar iff for all x,

Matrices that are similar to each other by the criteria of Eq. (1) share similar eigenvectors and eigenvalues [5]. While fast methods exist for computing such similar matrices through sparsification [5], it is unclear how errors in eigenvector approximation translate into errors in the communities recovered. Answering this question is a key contribution of this paper.

Spectrum of SBM Matrices. Since we will make extensive use of the spectrum of the different matrix representations of a Stochastic Block Model (SBM), we review the known results here, and provide some new ones.

First we consider the spectrum of a two community adjacency matrix. Define Inline graphic , , where p is the probability of a connection between nodes inside the community, and q is the probability of a connection between nodes in different communities. Then the average instance of a SBM model with these parameters can be represented in the form,

where Inline graphic and . Any particular instance of a SBM drawn from this distribution of matrices can be represented as , where is a random Wigner matrix. Because the eigenvalues of follow the famous semicircle law, the spectrum of A also follows such a distribution [14], with the exception of the two largest eigenvalues. The distribution of the bulk of the eigenvalues follows the equation,

The radius of the bulk of the spectrum of Inline graphic is given as below, with the center of the semi-circle being at 0.

Finally we also have the two largest eigenvalues of Inline graphic given as below.

We note that the eigenvectors of Inline graphic are randomly distributed vectors on the unit sphere except for the top two eigenvectors. The top two eigenvectors are perturbed versions of the vectors of [14].

We will also be interested in the spectrum and the eigenvectors of the scaled Laplacian Inline graphic of instances of our SBM. The bulk of the spectrum of is also known to follow a semi-circle distribution [11]. If we denote the average degree of the SBM as , then the distribution of the bulk of the eigenvalues follows the equation,

The radius of the bulk of the spectrum are as below, with the center of the semi-circle being at 1.

Avrachenkov et. al [11] states that the other non-trivial eigenvalue of Inline graphic remains to be characterized, so we briefly show that the eigenvalues outside the semi-circle are as below and bound their deviation from this mean, since our algorithms will make use of this information.

We have Inline graphic , since is a Laplacian. If is a regular graph, then the value for is directly computable from the eigenvalues of , as above. Since SMBs are close to regular, with each node having the same average degree of , we need to show that the deviation from the mean is small and with high probability will not change the result. From Lutzeyer and Walden [16] we have that the error of applying this linear transform of the eigenvalues of A, in order to obtain the eigenvalues of Inline graphic , is . We can then use the Chernoff concentration bounds to show that this error goes to zero with high probability.

The elements of the rows of Inline graphic are drawn from a binomial distribution, with n/2 of them with probability p, and n/2 of them with probability q. For each diagonal element of , we then have,

Since we have n diagonal elements, the probability that none exceed this bound can be computed as,

The error in our approximation for Inline graphic is then

Taking the limit as n increases we then have,

Finally, we state two properties of Inline graphic that we will make use of later. We can write . Recall that the eigenvectors of are randomly distributed on the unit sphere. Then for any

Alternatively, we have

Overall Approach. We now outline our overall problem. We would like to accelerate spectral algorithms for Stochastic Block Models (SBMs) while still recovering the communities accurately. Our main approach is to analyze the impact of two different types of error on SBM algorithms. The first is ‘edge dropping’. We investigate two strategies for dropping edges which allow us to recover the communities despite, in some cases, having significantly fewer edges than the original problem. While the idea of sparsifying graphs in order to more efficiently recover communities is not new, our contribution is to determine the level of sparsification that can take place while still recovering communities.

Our second approach is to stop convergence of the eigensolver early. We analyze ‘power iteration’, and show that for many SBM instances the solver does not need to be run to convergence. We choose power iteration both because the analysis is simple, and because in conjunction with nearly linear-time solvers, and the dropping strategy previously mentioned, we can design extremely efficient algorithms. This is because power iteration based on these solvers are O(m) complexity. In some cases we can reduce the number of edges by orders of magnitude, making these solvers very attractive.

The foundation of both these methods is a careful use of the model parameters and the known results for the spectra of SBM models.

Methods and Technical Solutions

Sampling with Effective Resistance. The main idea is that for a given Stochastic Block Model (SBM) we know when we can recover the communities based on the parameters a, b of the model. While it is sometimes assumed that these parameters are known, Mossel et al. [7] gives Eq. (15) for recovering the parameters of an unknown SBM, where |E| is the number of edges in the graph, Inline graphic , and is the number of cycles of length in the graph. While is difficult to compute, Mossel et al. shows that this can be well approximated by counting the number of non-backtracking walks in the graph that can be made in O(n) time. They then obtain a linear-time algorithm for estimating a, b by showing that Inline graphic , and where,

Once we obtain an estimate for a, b then we can estimate how much we should sparsify the graph to ensure that Inline graphic , while still dropping edges to obtain a much sparser matrix, for which we can obtain the Fiedler Vector much faster. We can also estimate the percentage of the nodes we will recover using the equation erf [14] .

In order to understand the percentage of the edges of the graph that we should sample we need to consider what the odds are they we will sample an edge connecting two nodes inside a community, against the odds that they will sample an inter-community edge. Ideally we would only sample edges inside the communities, since this would make the communities trivial to detect. Unfortunately, it has been shown by Luxburg et al. [9, 10] that for SBM as Inline graphic , the effective resistance of a given edge (i, j) in the graph tends toward . Since the degrees of the nodes in this model are O(n), the variation between effective resistances will be small, and will in any case not reflect the community structure of the graph. At this point our spectral sparsifier will be selecting edges essentially at random. While Luxburg et al. state that theoretical results suggest that the effective resistances could degenerate only for very large graphs, their experimental results show that this behaviour arises even for small communities of 1, 000 vertices.

While in some sense this is a drawback, since this result is telling us we may as well sample randomly, our algorithm can still function, and we can save the cost of computing the effective resistances. For Inline graphic to hold true, there must be significantly more intra-community edges than inter-community ones. If we are sampling randomly with spectral sparsification, we should still sample more of the desired edge type, since more of this type exist and we are sampling each edge with roughly the same probability. If we have probabilities p, q, and number of nodes n, then the expected value for the number of edges is shown in Eq. (16). We can then compute the probability of sampling an intra-community or inter-community edge as in Eq. (17). If we take S samples, Eq. (18) shows the estimated Inline graphic for our sparsified graph. We then have , , which can be used to decide if the communities can be recovered.

Correcting Effective Resistance. While Luxburg et al. [10] show that as Inline graphic the effective resistance for a SBM degenerates to for two nodes i, j, there are various methods known for correcting this. One of these is to multiply by the sum of the degrees. While this does not correct the issue in and of itself, since the effective resistance between every pair of nodes converges to two, the variance around two may be meaningful. Using these “scaled” effective resistances captures the community structure of a SBM, and sparsifying by these resistances can cause us to find the community structure of an SBM very quickly. These scaled effective resistances can be obtained by taking the scaled Laplacian Inline graphic of our SBM, and applying the same algorithm that is used to estimate the effective resistance of L.

Given the constants a, b, we can calculate the average difference in scaled effective resistance between edges both inside the communities and outside. This is useful because it allows us to predict on average how much we should sample to ensure Inline graphic , given the increased chance of sampling inter-community edges.

Recall that Inline graphic . Using our knowledge of the spectrum of , we can compute the average values of these terms.

We see that the effective resistance inside the group on average is Inline graphic and otherwise. This allows us to amend our estimates for and . If we let be the ratio between the effective resistance of the two links, then Eq. (21) gives the scaled , .

We note that our method above does have a potential drawback for very small graphs. This is because we need to sample Inline graphic (n log(n)) edges to avoid the graph being disconnected [12]. As graphs become large this should be a non-issue because we have , which indicates that our sampling criteria will require more edges than needed to ensure connectivity.

Computing the Eigenvector Using Inverse Power Iteration. One of the challenges of spectral methods is computing the eigenvectors needed for clustering, since this can be expensive. Given a nearly linear-time solver, one can compute the eigenvectors of a scaled graph Lapalcian in nearly time in the order of the number of elements of Inline graphic [6], by using Inverse Power Iteration. This is attractive given our edge dropping strategy, where we may reduce the number of edges by several orders of magnitude for favourable graphs.

An additional feature is that it is possible to calculate a stopping criteria for the eigensolver that will allow us to recover the communities, even though the eigenvector has not fully converged. This is a desirable property, since full convergence can be slow. While the bound for our stopping criteria is not tight, it nevertheless is significantly faster than would otherwise be the case for full convergence.

Recall that for the power iteration we have an initial state Inline graphic . On average the terms will be of approximately the same size. We are attempting to compute the eigenvector . After each iteration of the power method we have a resultant vector which consists of the desired eigenvector , and some sum of the other eigenvectors. We need to compute the likely contribution from the other eigenvectors. Once these contributions are smaller than Inline graphic with high probability, we can stop the iteration, because the signal from the desired eigenvector will dominate the calculation, and allow for the correct community assignment.

We need to compute the average contribution from the remaining eigenvectors at each iteration. We begin by computing the average size for each component of the other eigenvectors. Assuming all the Inline graphic are equal, we have . The eigenvectors are randomly distributed around the unit sphere, as in Wigner matrices. We know from O’Rourke and Wang [17], that the elements of these eigenvectors are normally distributed variables, .

Multiplying Inline graphic by we have after k iterations. We then have that each component of the sum of the eigenvectors are normal variables .

We can now use Chebyshev’s inequality to compute the probability that a component of the sum of the eigenvectors Inline graphic is greater than , the size of the components of the dominant eigenvector as follows,

Using our knowledge of the density of the spectrum of Inline graphic , we can compute the probability in Eq. (22), for as follows,

Once we know the probability of a single component of the eigenvector being greater than the Inline graphic , we can use this in a binomial distribution to calculate how many elements we are likely to incorrectly classify. We can, either stop when k is large enough to imply this is close to zero, or when the number is the same order as the error introduced by the perturbation of the main eigenvector from the addition of the random eigenvectors to Inline graphic , as given in [13].

Regularized Spectral Clustering. While spectral clustering is robust for matrices with high average degrees (cf. Saade et al. [8]), for very sparse matrices that have low degree entries the technique may struggle to recover the communities when the graph approaches the theoretical limits of community detection. This issue is exacerbated by the fact that we are dropping edges, and thus may create such problematic cases. To combat this we use the method of regularized spectral clustering method given in Saade et al. [8]. Given a regularization parameter Inline graphic , and the matrix with constant entries , the authors first define the regularized adjacency matrix as,

Similarly they define the regularized diagonal as Inline graphic as,

Then the regularized scaled Laplacian is given as,

We note that the Fiedler vector of this matrix can be computed using the power method using nearly linear-time sparse solvers. Inline graphic is symmetric and diagonally dominate so we can make use of nearly linear-time solvers to compute . Then, since is a rank one matrix , we can compute using the Sherman-Morrison formula which allows us to solve in terms of .

Using Eq. (27) we can then proceed to compute the Fielder vector using power iteration. In order to determine when to stop the power iteration we proceed in the same way as in Computing the Eigenvector Using Inverse Power Iteration by determining the spectrum Inline graphic . We begin by noting that has the same spectrum as , except that the top eigenvalue is increased by the rank one update , as discussed in Ding and Zhou [15]. Since we will project out this eigenvector when performing power iteration, we then only need to consider the density function of Inline graphic . By the same argument that we used Eq. (10), we can show that as n increases, whp this density function is given by,

Our Algorithm. We now present our algorithm. We first obtain (or the user provides) an estimate for the Stochastic Block Model (SBM) parameters. We then obtain the scaled effective resistances Inline graphic of the elements of the scaled Laplacian , which we then use to create a probability density function. We note that we modify the probability density function to sample the edges that have a low effective resistance over those that have a high resistance, since these are the edges that make up our community. This approach is slightly different from the standard algorithm of Spielman and Srivastava [5], which seeks to sample the highest resistance edges.

Next we compute the estimated p, q we will obtain after sparsification, using either Eq. (18) or (21), depending on our sampling strategy, and based on this we decide how much sparsification we can safely apply. After creating our new matrix, we then obtain the relevant eigenvector, depending on whether we are using the Laplacian or the Regularized Laplacian from Eq. (26).

A Comment on Complexity. While the best performance we obtained was by using the scaled effective resistance, depending on what solver is available, this may not always be the most effective strategy. This is because obtaining the scaled effective resistances using the method of Spielman and Srivastava [5], requires us to solve a number of linear systems. If a nearly linear time solver is available, this will take O(m) time, where m is the number of edges before our dropping strategy. This will dominate the cost of the computation, and we will not get significant speed-up from using power iteration, which is of order Inline graphic , where again is the number of sparsified edges. In the case of our examples this is clearly sub-optimal, since we can reduce m several orders of magnitude and still recover the communities, even when we are dropping edges randomly. In this case it makes sense not to use the scaled effective resistance. On the other hand, in practice, we may wish to use a different eigensolver, since the code for these may be more mature. In this case, the cost of the eigensolver may dominate, especially since the cost of applicable solvers (such as Lanczos-based solvers), does not entirely depend on m. In this case the use of scaled effective resistance sampling may be more effective.

Empirical Evaluation

We now present some experimental results. We first examine the difference between effective-resistance and scaled-effective resistance, and how closely they follow the predicted percentage of recovery. Additionally, we investigate the time needed to compute the eigenvectors of the sparsified versus the unsparsifed matrix, and our convergence criteria for the Fiedler Vector.

Recovery of Communities. We now examine the success of our method in recovering the communities with the given sparsification. In Figs. 1a we see that using the Regularized Laplacian we can quickly recover almost all the nodes correctly, at around the sparsification level, predicted by Eqs. (18) and (21). This also highlights the impact of using the scaled effective resistance for sampling, with the method converging faster, and following the prediction of Eq. (21) more closely. We note that for both sampling methods the percent of edges we preserve is very small, of the order of Inline graphic of the original graph for the Scaled Effective Resistance method.

In Fig. 1b, we try the real-world example of Saade et al. [8], where the authors attempt to partition two blogging communities by their political alignment. This is an interesting example because the communities are difficult to recover, requiring the use of regularization techniques, and because the graph structure is not exactly captured by the SBM model. Further this graph is quite small with only 1, 222 nodes, meaning that the graph may be disconnected, as discussed earlier in Sect. 2. Despite these difficulties, we are still able to recover the communities even after a significant amount of sparsification is applied, at the point that our criteria indicate we should be successful.

Time Saved in Eigenvector Calculation. One of the main motivations of this work is to obtain the correct community labels while spending less time computing the require eigenvectors. Since we are able to recover the communities, despite applying large amounts of sparsification, we would expect our eigensolver to converge faster. Exactly how fast depends on the solver. For the eigensolver shown in Spielman and Teng [6], built on top of their nearly linear-time solver and constructed solely to find the Fiedler vector of the Laplacian, our time to compute the eigenvector would depend on the number of elements of our graph. Since we have reduced the number of elements by multiple orders of magnitude when sampling with scaled effective-resistance we would get a multiple order of magnitude speed-up. Unfortunately, these solvers are not available for use in production code, so we do not benchmark them here.

When using the off-the-shelf solver available in Matlab to find the desired eigenvector, with our best method we achieve essentially an order of magnitude speed-up. This is because the solvers used by this method are not optimized for graphs in the way that the solver of Spielman and Teng. The observed speed-up can be seen in Table 1b.

Table 1.

(a) Speed-up in eigensolver from sparsification for the Regularized Laplacian for SBMs (10000, 0.5, 0.3) and (10000, 0.5, 0.2) respectively, using an off the shelf solver. (b) Number of iterations of the inverse power method required to reach the stopping criteria vs number of iterations to reach Inline graphic accuracy using the Scaled Laplacian for the SMB (10000, 0.5, 0.3).

Open in a new tab

While we do not have a nearly linear-time solver to fairly benchmark our Inverse Power method, we are able to test the number of iterations required to obtain Inline graphic accuracy vs the number of iterations recommended by our stopping criteria, seen in Table 1a. In all four cases all the community nodes were recovered, even though the sparsification was of the order of .

Conclusion and Future Work

In this paper we explored the use of sparsifying by effective resistance and scaled effective resistances in order to recover sparsify SBMs, as well as effective stopping criteria for eigensolvers used for community detection. The main goal is to obtain faster solutions while still being confident in our ability to recover the communities. We have provided a method that determines the number of samples needed, depending on the type of sampling used. We found that the community structure can be recovered even when the matrix becomes very sparse. Since SBMs are a commonly studied model for clustering, this method is widely applicable. We leave several areas open for future work. While SBMs are widely studied, the model has certain intrinsic limits which prevent it from modeling certain real-world networks well. We would like to provide a similar analysis for more complex community models, in particular models which have a non-constant average degree. We could then apply our model to a larger variety of real-world graphs.

Acknowledgements

We would like to acknowledge the efforts of Professor Peter W. Eklund, who helped make this paper possible.

Contributor Information

Hady W. Lauw, Email: hadywlauw@smu.edu.sg

Raymond Chi-Wing Wong, Email: raywong@cse.ust.hk.

Alexandros Ntoulas, Email: antoulas@di.uoa.gr.

Ee-Peng Lim, Email: eplim@smu.edu.sg.

See-Kiong Ng, Email: seekiong@nus.edu.sg.

Sinno Jialin Pan, Email: sinnopan@ntu.edu.sg.

Jesse Laeuchli, Email: j.laeuchli@deakin.edu.au.

References

1.Girvan M, Newman MEJ. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA. 2002;99:7821–7826. doi: 10.1073/pnas.122653799. [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Condon A, Karp RM. Algorithms for graph partitioning on the planted partition model. Random Struct. Algor. 2001;18:116–140. doi: 10.1002/1098-2418(200103)18:2<116::AID-RSA1001>3.0.CO;2-2. [DOI] [Google Scholar]
3.Abbe E. Community detection and stochastic block models: recent developments. J. Mach. Learn. Res. 2017;18(1):6446–6531. [Google Scholar]
4.Pothen A, Simon H, Liou K. Partitioning sparse matrices with eigenvectors of graphs. SIAM. J. Matrix Anal. Appl. 1990;11(3):430–452. doi: 10.1137/0611030. [DOI] [Google Scholar]
5.Spielman, D., Srivastava, N.: Graph Sparsification by effective resistances. In: Proceedings of the 40th Annual ACM symposium on Theory of computing, STOC 2008, pp. 563–568 (2008)
6.Spielman D, Teng S. Nearly linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems. SIAM J. Matrix Anal. Appl. 2014;35(3):835–885. doi: 10.1137/090771430. [DOI] [Google Scholar]
7.Mossel E, Neeman J, Sly A. Reconstruction and estimation in the planted partition model. Probab. Theory Relat. Fields. 2014;162(3):431–461. doi: 10.1007/s00440-014-0576-6. [DOI] [Google Scholar]
8.Saade A, Krzakala F, Zdeborová L. Impact of regularization on spectral clustering. Ann. Stat. 2016;44:1765–1791. doi: 10.1214/16-AOS1447. [DOI] [Google Scholar]
9.Luxburg, U., Radl, A., Hein, M.: Getting lost in space: Large sample analysis of the resistance distance. In: Advances in Neural Information Processing Systems, vol. 23 (2010)
10.Luxburg U, Radl A, Hein M. Hitting and commute times in large random neighborhood graphs. J. Mach. Learn. Res. 2014;15:1751–1798. [Google Scholar]
11.Avrachenkov, K., Cottatellucci, L., Kadavankandy, A.: Spectral properties of random matrices for stochastic block model. In: WiOpt, pp. 25–29, May 2015
12.Fung, W.S., Hariharan, R., Harvey, N.J., Panigrahi, D.: A general framework for graph sparsification. In: STOC 2011, pp. 71–80, 06–08 June 2011
13.McSherry, F.: Spectral partitioning of random graphs. In: Proceedings 42nd IEEE Symposium on Foundations of Computer Science (2001)
14.Nadakuditi, R.R., Newman, M.E.: Graph spectra and the detectability of community structure in networks. Phys. Rev. Lett. 108(18), 188701 (2012) [DOI] [PubMed]
15.Ding J, Zhou A. Eigenvalues of rank-one updated matrices with some applications. Appl. Math. Lett. 2007;20(12):1223–1226. doi: 10.1016/j.aml.2006.11.016. [DOI] [Google Scholar]
16.Lutzeyer, J., Walden, A.: Comparing graph spectra of adjacency and Laplacian matrices. arXiv:171203769
17.O’Rourke, S., Vu, V., Wang, K.: Eigenvectors of random matrices: a survey. J. Comb. Theory Ser. A 144, 361–442 (2016)

[CR1] 1.Girvan M, Newman MEJ. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA. 2002;99:7821–7826. doi: 10.1073/pnas.122653799. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR2] 2.Condon A, Karp RM. Algorithms for graph partitioning on the planted partition model. Random Struct. Algor. 2001;18:116–140. doi: 10.1002/1098-2418(200103)18:2<116::AID-RSA1001>3.0.CO;2-2. [DOI] [Google Scholar]

[CR3] 3.Abbe E. Community detection and stochastic block models: recent developments. J. Mach. Learn. Res. 2017;18(1):6446–6531. [Google Scholar]

[CR4] 4.Pothen A, Simon H, Liou K. Partitioning sparse matrices with eigenvectors of graphs. SIAM. J. Matrix Anal. Appl. 1990;11(3):430–452. doi: 10.1137/0611030. [DOI] [Google Scholar]

[CR5] 5.Spielman, D., Srivastava, N.: Graph Sparsification by effective resistances. In: Proceedings of the 40th Annual ACM symposium on Theory of computing, STOC 2008, pp. 563–568 (2008)

[CR6] 6.Spielman D, Teng S. Nearly linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems. SIAM J. Matrix Anal. Appl. 2014;35(3):835–885. doi: 10.1137/090771430. [DOI] [Google Scholar]

[CR7] 7.Mossel E, Neeman J, Sly A. Reconstruction and estimation in the planted partition model. Probab. Theory Relat. Fields. 2014;162(3):431–461. doi: 10.1007/s00440-014-0576-6. [DOI] [Google Scholar]

[CR8] 8.Saade A, Krzakala F, Zdeborová L. Impact of regularization on spectral clustering. Ann. Stat. 2016;44:1765–1791. doi: 10.1214/16-AOS1447. [DOI] [Google Scholar]

[CR9] 9.Luxburg, U., Radl, A., Hein, M.: Getting lost in space: Large sample analysis of the resistance distance. In: Advances in Neural Information Processing Systems, vol. 23 (2010)

[CR10] 10.Luxburg U, Radl A, Hein M. Hitting and commute times in large random neighborhood graphs. J. Mach. Learn. Res. 2014;15:1751–1798. [Google Scholar]

[CR11] 11.Avrachenkov, K., Cottatellucci, L., Kadavankandy, A.: Spectral properties of random matrices for stochastic block model. In: WiOpt, pp. 25–29, May 2015

[CR12] 12.Fung, W.S., Hariharan, R., Harvey, N.J., Panigrahi, D.: A general framework for graph sparsification. In: STOC 2011, pp. 71–80, 06–08 June 2011

[CR13] 13.McSherry, F.: Spectral partitioning of random graphs. In: Proceedings 42nd IEEE Symposium on Foundations of Computer Science (2001)

[CR14] 14.Nadakuditi, R.R., Newman, M.E.: Graph spectra and the detectability of community structure in networks. Phys. Rev. Lett. 108(18), 188701 (2012) [DOI] [PubMed]

[CR15] 15.Ding J, Zhou A. Eigenvalues of rank-one updated matrices with some applications. Appl. Math. Lett. 2007;20(12):1223–1226. doi: 10.1016/j.aml.2006.11.016. [DOI] [Google Scholar]

[CR16] 16.Lutzeyer, J., Walden, A.: Comparing graph spectra of adjacency and Laplacian matrices. arXiv:171203769

[CR17] 17.O’Rourke, S., Vu, V., Wang, K.: Eigenvectors of random matrices: a survey. J. Comb. Theory Ser. A 144, 361–442 (2016)

PERMALINK

Fast Community Detection with Graph Sparsification

Jesse Laeuchli

Abstract

Background and Motivation

Methods and Technical Solutions

Empirical Evaluation

Fig. 1.

Table 1.

Conclusion and Future Work

Acknowledgements

Contributor Information

References

ACTIONS

PERMALINK

RESOURCES

Cite

Add to Collections

PERMALINK

Fast Community Detection with Graph Sparsification

Jesse Laeuchli

Abstract

Background and Motivation

Methods and Technical Solutions

Empirical Evaluation

Fig. 1.

Table 1.

Conclusion and Future Work

Acknowledgements

Contributor Information

References

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases