iScience. 2020 Sep 28;23(10):101626. doi: 10.1016/j.isci.2020.101626

Link Prediction through Deep Generative Model

Xu-Wen Wang 1, Yize Chen 2, Yang-Yu Liu 1,3,
PMCID: PMC7575873  PMID: 33103070

Summary

Inferring missing links based on the currently observed network is known as link prediction, which has tremendous real-world applications in biomedicine, e-commerce, social media, and criminal intelligence. Numerous methods have been proposed to solve the link prediction problem. Yet, many of these methods are designed only for undirected networks and are based on domain-specific heuristics. Here we developed a new link prediction method based on deep generative models, which does not rely on any domain-specific heuristic and works for general undirected or directed complex networks. Our key idea is to represent the adjacency matrix of a network as an image and then learn hierarchical feature representations of the image by training a deep generative model. Those features correspond to structural patterns in the network at different scales, from small subgraphs to mesoscopic communities. When applied to various real-world networks from different domains, our method shows overall superior performance against existing methods.

Subject Areas: Complex Systems, Network Modeling, Network Topology

Graphical Abstract


Highlights

  • A novel link prediction method based on deep generative models is developed

  • This method works for general undirected or directed complex networks

  • Leveraging structural patterns at different scales, this method outperforms others



Introduction

Networks have become an invaluable tool for describing the architecture of various complex systems, be they technological, biological, or social in nature (Albert and Barabási, 2002; Newman, 2003; Boccaletti et al., 2006; Scholtes et al., 2014). Mathematically, any real-world network can be represented by a graph, G(V,E), where V={1, 2, ⋯, N} is the node set and E⊆V×V is the link set. A link, denoted as a node pair (i,j) with i,j∈V, represents a certain interaction, association, or physical connection between nodes i and j, which could be either directed or undirected, weighted or unweighted. For many systems (especially biological systems), the discovery and validation of links require significant experimental efforts. Consequently, many real-world networks mapped so far are substantially incomplete (Von Mering et al., 2002; Han et al., 2005). For example, a recent estimate indicates that in human cells the explored protein-protein interactions cover less than 20% of all potential protein-protein interactions (Sahni et al., 2015). How can we tease out the missing interactions based on the discovered ones? In network science and machine learning, this is commonly known as the link prediction problem (Liben-Nowell and Kleinberg, 2007; Clauset et al., 2008).

An accurate link prediction method will greatly reduce the experimental efforts required to establish the network's topology and/or accelerate mutually beneficial interactions that would have taken much longer to form serendipitously. Consequently, link prediction has many real-world applications (Hulovatyy et al., 2014; Martínez, Berzal and Cubero, 2016a). In biomedicine, link prediction can be used to infer protein-protein interactions or drug-target interactions (Zhang et al., 2005; Campillos et al., 2008; Luo et al., 2017). In e-commerce, it can help build better recommender systems, e.g., Amazon's “people who bought this also bought” feature (Linden et al., 2003). On social media, it can help suggest potential connections, such as the “people you may know” feature on Facebook and LinkedIn (Blagus et al., 2012). In criminal intelligence analysis, link prediction can assist in identifying hidden co-participation in illicit activities (Berlusconi et al., 2016).

Numerous methods, such as similarity-based algorithms (Katz, 1953; Barabási and Albert, 1999; Friedman et al., 1999; Sarukkai, 2000; Guimerà and Sales-Pardo, 2009; Lü and Zhou, 2011; Perozzi et al., 2014; Chen et al., 2017; Kovács et al., 2019), maximum likelihood algorithms (Clauset et al., 2008; Guimerà and Sales-Pardo, 2009; Pan et al., 2016), probabilistic models (Heckerman et al., 2007; Chaney et al., 2015), and deep learning-based methods (Tavakoli et al., 2017; Chiu and Zhan, 2018), especially graph representation learning-based methods (Niepert et al., 2016; Ahmed and Kutzkov; van den Berg et al., 2017; Hamilton et al., 2017; Schlichtkrull et al., 2017; Murphy et al., 2019; Srinivasan and Ribeiro, 2019), have been developed to solve the link prediction problem (see Supplementary Information Section 1 for brief descriptions of those existing methods). Yet, many of these existing methods (such as similarity-based algorithms) are designed for undirected networks. Moreover, most of these methods are based on domain-specific heuristics (Sarukkai, 2000), and hence their performances differ greatly for networks from different domains.

A powerful link prediction method that does not rely on any domain-specific heuristic and works for general complex networks has been lacking (Martínez, Berzal and Cubero, 2016b). Here, we fill the gap by developing a link prediction method based on deep generative models (DGMs) (see Figure 1 for a schematic demonstration).

Figure 1.

Demonstration of Our Link Prediction Method on a Directed Network

The adjacency matrix of this directed network (with 28 nodes and 118 links) looks like the binary image of letter E with 12 missing pixels. Note that 5 isolated nodes are not shown in the network representation. We perturb the original network (image) by removing 5 links at random in M different ways to obtain a pool of perturbed networks (images) Ii (i = 1, …, M) (M = 5,000 for this example). This input dataset is fed into the generative adversarial networks (GANs), which consist of two deep artificial neural networks: generator and discriminator. The generator takes noise drawn from a uniform distribution (with 100 dimensions for this example) as input and produces fake images. The discriminator is a binary classifier that tells whether a given image is a real one from the input dataset or a fake one produced by the generator. Over the course of training, the generator learns to produce convincing fake images P using feedback from the discriminator. The pixel value Pij in the fake grayscale image P can be used to calculate the existence probability of a link between a node pair: αij = 1−Pij. The final existence probability is calculated by averaging αij over S (S = 500) generated fake networks.

Results

Key Idea

Our key idea is to treat the adjacency matrix of a network as the pixel matrix of a binary image. In other words, present links are treated as pixels of value 0 and absent links as pixels of value 1. By perturbing the original input network (image) in many different ways through randomly removing a small fraction of present links, we obtain a pool of perturbed networks (images). Those perturbed images are fed into a DGM to create fake images that look similar to the input ones (see Supplementary Information Section 3 for details of DGMs). Those fake images (networks) are then used to perform link prediction in the original image (network). For the DGM, here we leverage one of the most popular ones, Generative Adversarial Networks (GANs), which consist of two deep artificial neural networks (called generator and discriminator) contesting with each other in a game-theoretic framework (Goodfellow et al., 2014; Arjovsky et al., 2017). The generator takes random noise from a known distribution as input and transforms it into fake images through a deconvolutional neural network. The discriminator is a binary classifier (based on a convolutional neural network), which determines whether a given image looks like a real one from the input dataset or like a fake one artificially created by the generator. Over the course of training iterations, the discriminator learns to tell real images from fake ones. At the same time, the generator uses feedback from the discriminator to learn how to produce convincing fake images, so that the discriminator cannot distinguish them from real ones (see Supplementary Information Section 3 for details). To better train the generator, one can use the Wasserstein distance to quantify the dissimilarity between fake and real images.
During the training process, through minimizing the Wasserstein distance, the generator learns to assign link probabilities to each node pair (including both observed and unobserved links) to fool the discriminator so that it cannot distinguish real from fake images. Note that the link probabilities assigned to observed links will be quite close to one, whereas the link probabilities assigned to unobserved links will be close to zero but not exactly zero. (This process is also known as the smooth process [Yeh et al., 2017].) Hence the generated fake images are grayscale, even though all the input images fed to the GAN are binary. If the probabilities assigned to missing links are much higher than those of nonexistent links, then the link prediction is much better than random guessing (see Figure S1 for an intuitive explanation).
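As a minimal illustration of the perturbation step described above, here is a numpy sketch; the function name `perturbation_pool` and the 4-node toy network are our own illustrative choices, not from the paper:

```python
import numpy as np

def perturbation_pool(A, q=0.1, M=100, seed=0):
    """Build M perturbed copies of a binary adjacency matrix A by
    randomly removing a fraction q of the present links, and return
    them in the image convention used in the text: present link ->
    pixel value 0, absent link -> pixel value 1."""
    rng = np.random.default_rng(seed)
    links = np.argwhere(A == 1)                       # present links
    n_remove = max(1, int(round(q * len(links))))
    pool = []
    for _ in range(M):
        picked = links[rng.choice(len(links), n_remove, replace=False)]
        B = A.copy()
        B[picked[:, 0], picked[:, 1]] = 0             # remove chosen links
        pool.append(1 - B)                            # adjacency -> image
    return np.stack(pool)

# Hypothetical toy directed network: 4 nodes, 5 links
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 0, 0]])
images = perturbation_pool(A, q=0.2, M=10)
```

Each perturbed image would then be fed to the GAN as one training example.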

Demonstration Using Synthetic Networks

To demonstrate our DGM-based link prediction, let us consider a toy example: a small directed network of 28 nodes and 118 links (Figure 1): 106 solid links form the training set, whereas 12 dashed links form the probe set. The nodes are labeled appropriately so that the adjacency matrix of this network looks like a binary image of letter E with 12 missing pixels, corresponding to the 12 removed or “missing” links. Note that those probe links never enter the learning process. First, we create M perturbed binary images by randomly removing a fraction q of pixels of value 0 (i.e., present links) from the original image (network). Second, we use the M perturbed binary images as input to train GANs, which will eventually generate S fake grayscale images that look similar to the input ones. In this example, we choose M = 5,000, q = 0.1, and S = 500. The existence likelihood of the link between nodes i and j in the corresponding fake network, denoted as αij, is simply given by αij = 1−Pij, where Pij is the rescaled pixel value (ranging from 0 to 1) in each fake grayscale image. Finally, we average αij over all S fake images to obtain the overall existence likelihood of the link (i,j). Note that in this toy example all 12 missing links display higher αij than the nonexistent links, so they are all successfully recovered. Figure 1 may remind us of the classical image inpainting problem, where we need to reconstitute or retouch the missing or damaged regions of an image to make it more legible and to restore its unity (Bertalmio et al., 2000). We emphasize that the link prediction problem addressed here is fundamentally different from the image inpainting problem. For image inpainting, we generally know the locations of the damaged regions of an image, whereas for link prediction, we do not know which links are missing in a network. In fact, teasing them out is exactly the task of link prediction.
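The scoring step (averaging αij = 1−Pij over the S fake images) can be sketched as follows; the two 3×3 fake grayscale images here are made-up stand-ins for GAN outputs, not actual model results:

```python
import numpy as np

def existence_scores(fake_images):
    """Given S generated grayscale images P (pixel values in [0, 1]),
    score each node pair by alpha_ij = 1 - P_ij averaged over the S
    images. A darker pixel (lower P_ij) means a more likely link."""
    return 1.0 - np.mean(fake_images, axis=0)

# Hypothetical mini-example: two fake 3x3 grayscale images
fakes = np.array([
    [[0.1, 0.9, 0.2],
     [0.8, 0.1, 0.9],
     [0.2, 0.9, 0.1]],
    [[0.3, 0.7, 0.0],
     [1.0, 0.3, 0.7],
     [0.0, 0.7, 0.3]],
])
alpha = existence_scores(fakes)
```

Node pairs with the highest αij among unobserved pairs are then predicted as missing links.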

At first glance, our DGM-based link prediction method seems to rely heavily on the existing patterns in the adjacency matrix of the original network. After all, we are treating a network as an image. But do we have to label the nodes in a sophisticated way to ensure the success of our method? To address this concern, we perform the following numerical experiment. We start from an original network with an appropriate node labeling such that the adjacency matrix looks exactly like the binary image of letter E without any missing pixels. (See Supplementary Information Figure S2 for a more complicated synthetic network generated by the stochastic block model.) Then we relabel a fraction η of the nodes in the network so that the binary image associated with its adjacency matrix looks much more random than the letter E (see insets of Figure 2A). Note that the network structure is fixed; we merely label the nodes differently so that the resulting adjacency matrices (or binary images) look quite different. We then randomly remove 10% of the links of those five networks as the probe set to evaluate the performance of our method at different η values, as well as the performance of two classical link prediction methods for directed networks that do not depend on the node labeling at all. Hereafter, to quantify the performance of any link prediction method, we employ the standard AUC statistic, i.e., the area under the receiver operating characteristic curve (Clauset et al., 2008; Guimerà and Sales-Pardo, 2009). To calculate the AUC, we first randomly split the link set ε into two parts (see Figures S3 and S4 for details): (1) a fraction f of links as the test or probe set εP, which will be removed from the network; and (2) the remaining fraction (1−f) of links as the training set εT, which will be used to recover the removed links.
The AUC statistic is defined as the probability that a randomly chosen link in the probe set εP is given a higher score by the link prediction method than a randomly chosen nonexistent link (see Supplementary Information Section 2 and Figure S5 for details). For each network, we performed 20 independent random splittings unless otherwise stated. We find that, for this small directed network, the performance of our method degrades only slightly even after we relabel 25% of the nodes (Figure 2A). When we relabel more nodes, the performance is actually quite stable. Even if we relabel all the nodes, the AUC of our method is still about 0.9, which is higher than that of other link prediction methods for directed networks, such as the preferential attachment (PA) (Barabási and Albert, 1999)-based method (with AUC∼0.85) and the low-rank matrix completion (LRMC) (Pech et al., 2017) method (with AUC∼0.7). In Supplementary Information Figure S6, we further show that the AUC of our method is generally above 0.9 under different completely random node labelings of this network. This is simply because even after completely random node labeling, small-scale patterns (e.g., many short line segments in the relabeled image of E) can still be readily leveraged by our method.
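The AUC definition above can be estimated directly by sampling; this is a minimal numpy sketch of that definition (ties counted as 0.5), with a made-up 2×2 score matrix as the example:

```python
import numpy as np

def auc_statistic(scores, probe_links, nonexistent_links,
                  n_samples=10000, seed=0):
    """Sampling estimate of the AUC as defined in the text: the
    probability that a randomly chosen probe link receives a higher
    score than a randomly chosen nonexistent link (ties count 0.5)."""
    rng = np.random.default_rng(seed)
    p = np.array([scores[i, j] for i, j in probe_links])
    n = np.array([scores[i, j] for i, j in nonexistent_links])
    ps = p[rng.integers(len(p), size=n_samples)]
    ns = n[rng.integers(len(n), size=n_samples)]
    return (np.sum(ps > ns) + 0.5 * np.sum(ps == ns)) / n_samples

# Hypothetical score matrix: probe links always score higher,
# so the estimated AUC should be exactly 1
scores = np.array([[0.9, 0.1],
                   [0.2, 0.8]])
probe = [(0, 0), (1, 1)]
nonexistent = [(0, 1), (1, 0)]
auc = auc_statistic(scores, probe, nonexistent)
```

An AUC of 0.5 would correspond to random guessing; values near 1 indicate near-perfect ranking of probe links above nonexistent ones.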

Figure 2.

Impact of Node Labeling on the Performance of Our DGM-Based Link Prediction Method

(A) A randomly selected fraction η of nodes is relabeled in a directed network whose original adjacency matrix looks exactly like the binary image of letter E. We randomly divide the links into two parts: 10% of the links chosen as the probe set and the remaining 90% as the training set. We perform link prediction using three different methods: DGM (deep generative model based), PA (preferential attachment based), and LRMC (low-rank matrix completion). In this example, we choose M = 1,000 for our DGM-based method. Even after we relabel all the nodes so that the adjacency matrix does not display prominent features, the median AUC of our DGM-based method is still around 0.9, whereas it is 0.85 for the PA method and 0.7 for the LRMC method. Insets: The adjacency matrices corresponding to different relabeling fractions, where black pixels represent existing links.

(B–D) AUCs of DGM-based and other traditional methods in the link prediction of a directed modular network (N = 28) generated by the stochastic block model (Girvan and Newman, 2002) with within-module connection probability pin = 0.5 and between-module connection probability pout = 0.05. The adjacency matrices before and after node relabeling are shown in (B) and (C), respectively. Asterisks in (D) show whether the AUC of our DGM-based link prediction method is significantly higher than that of the other three traditional algorithms (paired-sample t test).

Significance levels: p value <0.01(∗∗), <0.001(∗∗∗).

The results presented in Figure 2A indicate that, for networks with strong structural patterns, our DGM-based link prediction does not rely heavily on the detailed node labeling. However, to optimize the performance of our method, one should still label the nodes appropriately. This can be achieved by extracting the community structure of the network (Newman and Girvan, 2004; Radicchi et al., 2004), for example, using the classical Louvain method (Blondel et al., 2008). (Note that nodes within a community can be labeled randomly, and the test set should not be involved in the node labeling.) To test this simple idea, we consider a directed modular network generated from the stochastic block model (Girvan and Newman, 2002), where any two nodes within the same module are randomly connected with probability pin and any two nodes in different modules are randomly connected with probability pout (see Figure S7 for the performance of DGM as a function of the connection probabilities). We apply our method as well as various traditional methods to this modular network (N = 28) with random node labeling (Figure 2B). We find that no link prediction method performs significantly better than random guessing for this directed network (Figure 2D). However, applying the Louvain method first (here we treat the directed network as an undirected one) will capture some structural patterns in the network (i.e., the community structure in the adjacency matrix; see Figure 2C), which significantly improves the performance of our method (Figure 2D; paired-sample t test). This result suggests that any structural patterns should be exploited for our DGM-based link prediction.
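The two ingredients of this experiment, sampling a directed stochastic block model and reordering nodes by community, can be sketched in numpy. As a simplification, we reorder by the ground-truth block assignment as a stand-in for a community detection method such as Louvain; function names and seeds are our own:

```python
import numpy as np

def sbm_directed(block_sizes, p_in, p_out, seed=0):
    """Sample a directed stochastic block model: two nodes in the same
    module connect with probability p_in, nodes in different modules
    with probability p_out (no self-loops)."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    same = labels[:, None] == labels[None, :]
    P = np.where(same, p_in, p_out)
    A = (rng.random(P.shape) < P).astype(int)
    np.fill_diagonal(A, 0)
    return A, labels

def relabel_by_community(A, communities):
    """Reorder rows and columns so that nodes of the same community are
    adjacent, making community blocks visible in the image. Any per-node
    assignment works (e.g. output of the Louvain method)."""
    order = np.argsort(communities, kind="stable")
    return A[np.ix_(order, order)]

A, labels = sbm_directed([14, 14], p_in=0.5, p_out=0.05, seed=1)
perm = np.random.default_rng(2).permutation(28)      # random node labeling
A_shuffled = A[np.ix_(perm, perm)]                   # blocks destroyed
A_restored = relabel_by_community(A_shuffled, labels[perm])
```

After reordering, the dense within-module blocks reappear in the image, which is exactly the structure the DGM can exploit.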

Demonstration Using a Real Network

Real-world complex networks certainly display more prominent structural patterns than Erdős–Rényi (ER) random graphs. Thanks to the deep neural networks in the DGM, our method can leverage structural patterns in a real network at different scales all together, from small subgraphs to community structure (Girvan and Newman, 2002). To demonstrate this, we consider the character co-occurrence network of Victor Hugo's Les Misérables. As shown in Figure 3A, this network displays many interesting structural patterns, e.g., stars, cliques, and communities. After node labeling using the Louvain method (Blondel et al., 2008), those structural patterns naturally emerge in the matrix (image) representation. In particular, stars show up as line segments, while cliques and communities appear as blocks in the image (Figure 3B). After training, deep neural networks with many layers are able to extract the most important structural patterns of the network as the key features of the corresponding image. Note that, at the same layer of the deep neural network, different filters can learn different feature representations: some focus more on lower-level features such as line segments, whereas others focus more on higher-level features such as blocks (Figure 3C). Deeper layers typically capture higher-level features or more global structural patterns (Figure 3D). Leveraging those features at different levels, or structural patterns learned at different scales, our DGM-based link prediction performs very well (see Supplementary Information Sections 4.2.2 and 4.2.3 for more details). Indeed, for this particular network, with a fraction f = 0.1 of links as the test set, we have AUC∼0.95, much higher than that of other link prediction methods, e.g., CN (with AUC∼0.7) and SEAL (with AUC∼0.85).

Figure 3.

Deep Neural Networks in the DGM Are Able to Learn Different Structural Patterns of a Network at Different Scales

(A) The character co-occurrence network of Victor Hugo's Les Misérables (with 77 nodes) contains several interesting structural patterns such as stars, cliques, and community structure.

(B) The matrix (image) representation of the network, with node labeling based on the Louvain method. Those structural patterns are highlighted in different colors.

(C) Learned feature maps from the first convolutional layer with trained filters. There are in total 64 feature maps. Each of them is of size 40 × 40.

(D) Learned feature maps from the second convolutional layer with trained filters. There are in total 128 feature maps. Each of them is of size 20 × 20. For each feature map, higher (or lower) values are shown in redder (or yellower) color.

Systematic Benchmarking Using Real Networks

To systematically demonstrate the advantage of our DGM-based link prediction in real-world applications, we compare the performance of our method with that of both classical and state-of-the-art link prediction methods for a wide range of real-world networks (LeBlanc et al., 1975; Baird et al., 1998; Christian and Luczkovich, 1999; Krebs, 2002; Zhang et al., 2005), ranging from social and economic to technological and biological networks (see Supplementary Information Section 6 and Table S1 for brief descriptions of the real networks analyzed in this work). For undirected networks (Figure 4A), we find that methods based on global similarity indices (e.g., Katz, ACT) and the SBM generally perform better than methods based on local similarity indices (e.g., CN, PA, RA). But the performances of those heuristics-based methods vary a lot across network domains. Some of them actually perform even worse than random guessing, especially when the training set is small (corresponding to large f). By contrast, our DGM-based method displays very robust and high performance for various undirected networks (see Figures S8 and S9 for comparisons with three additional methods). It also outperforms several state-of-the-art link prediction methods based on non-negative matrix factorization (Chen et al., 2017), network embedding (Perozzi et al., 2014), and graph neural networks (Zhang and Chen, 2018). For directed networks (Figure 4B), most of the existing methods (especially the state-of-the-art ones) are actually not applicable, except for two classical methods: PA and LRMC (see Figure S10 for a comparison with the modified RA method). We compare the performance of our method with those of PA and LRMC. Again, we find that our method displays more robust and better performance than PA and LRMC for various directed networks (see Figure S11 for detailed statistical tests). We also use the AUPRC (area under the precision-recall curve) as an additional performance evaluation metric (see Table S2, Figures S12 and S13).

Figure 4.

Our DGM-Based Link Prediction Displays Very Robust and High Performance for Both Undirected and Directed Real-World Networks

DGM, deep generative model based link prediction; CN, common neighbors; PA, preferential attachment; RA, resource allocation; JC, Jaccard index; KATZ, Katz index; ACT, average commute time; SBM, stochastic block model; LRMC, low rank matrix completion; DW, deep-walk embedding method; NMF, non-negative matrix factorization; SEAL, learning from Subgraphs, Embeddings, and Attributes for Link prediction; SPM, structural perturbation method (see Supplementary Information Section 1 for details of each algorithm).

(A) Undirected networks. Top: Terrorist association network, Zachary karate club, Protein-protein interaction (PPI) network (a subnetwork of PPIs in S. cerevisiae). Bottom: Medieval river trade network in Russia, Internet topology (at the PoP level), Contiguous states in the United States.

(B) Directed networks. Top: Consulting (a social network of a consulting company), cat cortex (the connection network of cat cortical areas), cattle (a network of dominance behaviors of cattle). Bottom: Seagrass food web, St. Martin food web, Sioux Falls traffic network. The AUC of our DGM-based method is the average AUC over the last 20 epochs of the total 150 epochs for all networks. Here an epoch is one full training cycle on the training set. For all the undirected real networks, we apply the Louvain method first to label the nodes appropriately. Directed networks are labeled by the method proposed in Arenas et al. (2008). Error bars represent the standard error of the mean.

We emphasize that, before applying our DGM-based link prediction to each of the real-world networks tested in Figure 4, we performed node labeling (also known as matrix reordering in the literature) to get the matrix (or image) representation of the network. For the sake of simplicity, we simply labeled nodes in a network based on its community structure. In particular, for undirected networks, we apply the Louvain method; for directed networks, we apply the method proposed in Arenas et al. (2008). We emphasize that our approach does not rely on the presence of communities in a network. Any structural patterns at different scales (from small subgraphs to communities, see Figure 3) can be and should be leveraged all together. A reasonable node labeling can actually be achieved in many different ways other than community detection. We found that for real-world networks the performance of our DGM-based link prediction does not depend heavily on the specific node labeling algorithm (see Supplementary Information Section 5). This is consistent with the results presented in Figure 2A, where we show that as long as the network has strong structural patterns, any reasonable node labeling will offer a plausible matrix (or image) representation of the network, which can be used for our DGM-based link prediction.

Parallelization Based on Image Splitting

Since our method essentially treats a network as an image, it can be easily parallelized by splitting a large network (image) into several small subnetworks (subimages) and then performing link prediction for each subnetwork (subimage) in parallel (Figure 5A). Note that node labeling is the first step of our approach. In this step, we always treat the whole network as an image (labeled by any reasonable node labeling algorithm), regardless of the network size. Training the DGM is the second step of our approach. In this step, if the network/image is small, we train the DGM and perform link prediction for the whole image. Only when the network/image is too large for the DGM to be trained easily do we split the image into subimages, train the DGM, and perform link prediction for the different subimages in parallel. This splitting typically does not decrease the overall link prediction performance, compared with the result of treating the large network as a whole (Figure 5B). For each subnetwork, when only the information of the subnetwork is provided, our method outperforms other methods (Figure 5B). In fact, even if other methods (e.g., PA and LRMC) use the information of the whole network to perform link prediction for a subnetwork, our method, which relies only on the information of the subnetwork, still displays better performance (Figure 5B).
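The split-and-merge bookkeeping described above can be sketched in a few lines of numpy; the block size, the toy 5×5 matrix, and the function names are illustrative choices of ours (a 5×5 image with block size 2 yields 9 subimages, mirroring the 9 subnetworks in Figure 5A):

```python
import numpy as np

def split_image(A, block):
    """Split an N x N adjacency matrix into block x block subimages
    (edge blocks may be smaller). Returns a dict mapping
    (row_block, col_block) -> subimage."""
    N = A.shape[0]
    nb = -(-N // block)                               # ceil(N / block)
    return {(r, c): A[r*block:(r+1)*block, c*block:(c+1)*block]
            for r in range(nb) for c in range(nb)}

def merge_images(parts, N, block):
    """Reassemble per-subimage outputs (e.g. link-probability maps
    predicted in parallel) into one N x N matrix."""
    out = np.zeros((N, N))
    for (r, c), sub in parts.items():
        out[r*block:r*block + sub.shape[0],
            c*block:c*block + sub.shape[1]] = sub
    return out

A = np.arange(25).reshape(5, 5) % 2                   # toy 5x5 "image"
parts = split_image(A, 2)                             # 9 subimages
B = merge_images(parts, 5, 2)                         # round-trips exactly
```

Each subimage can then be handed to an independent DGM training job, and the predicted probability maps merged back with `merge_images`.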

Figure 5.

The DGM-Based Link Prediction Method Can Infer Missing Links of Large Networks and Arbitrarily Selected Subnetworks within Large Networks

(A) The adjacency matrix of a real network: The Little Rock food web. The network (image) is split into 9 subnetworks (subimages).

(B) AUC of DGM, PA, and LRMC on the original network and AUC of DGM, PA, and LRMC on each subnetwork (DGM-S, PA-S, LRMC-S). Error bars represent the standard error of the mean.

(C and D) We perform link prediction for 200 randomly selected subnetworks (of size 60) chosen from two large-scale real networks: (C) Facebook wall posts (with 46,952 nodes and 876,993 links) and (D) Google+ (with 23,628 nodes and 39,242 links), respectively. We randomly divide the links of the relabeled networks into two parts: 10% of the links are chosen as the probe set and the remaining 90% as the training set (each subnetwork contains at least 15 links). Asterisks at the top of each panel show whether the AUC of our DGM-based link prediction method is significantly higher than that of the other two traditional algorithms (paired-sample t test). Significance levels: p value <0.001(∗∗∗).

(E and F) AUC of DGM and other scalable link prediction methods in two large-scale networks: (E) Facebook-NIPS (with N = 2,888 nodes and 2,981 links), and (F) US airports (with N = 1,574 nodes and 28,236 links).

The image representation of complex networks also allows us to focus on any specific subnetwork of interest and predict the missing links in just that subnetwork. For example, we perform link prediction for 200 subnetworks of size 60 randomly selected from two large real networks: Facebook wall posts (Viswanath et al., 2009) (with 46,952 nodes and 876,993 links) and Google+ (Leskovec and Mcauley, 2012) (with 23,628 nodes and 39,242 links). To get each subnetwork, we randomly select a subimage of size 60 × 60 from the whole image (i.e., the adjacency matrix A = (a_ij) of the network) by choosing rows (i+1) to (i+60) and columns (j+1) to (j+60). Then we convert this subimage into a small network with adjacency matrix A^s = (a^s_pq), where a^s_pq = 1 if a_(i+p),(j+q) = 1. We find that our method shows much higher AUC than other methods (Figures 5C and 5D). All these results suggest that our method holds great promise in link prediction for large-scale real-world networks. To directly demonstrate the performance of our method in analyzing large-scale networks, we consider an undirected network, Facebook-NIPS (with N = 2,888 nodes), and a directed network, US airports (with N = 1,574 nodes). We randomly remove a fraction (f = 0.1) of links as the test set. To facilitate the training of the DGM and speed up the link prediction, we still split each large network (image) into several small subnetworks (subimages) and then perform link prediction for each subnetwork. But, in the end, to have a fair comparison with other methods (which always treat the large network as a whole), we calculate the AUC of our method on the whole network (constructed by merging the subnetworks/subimages generated by the DGM). We find that, for both large-scale real networks (Facebook-NIPS and US airports), our method clearly outperforms other existing methods (Figures 5E and 5F). See Supplementary Information Figure S14 for results on large-scale model networks.
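The subnetwork extraction step can be sketched as follows; the sparse 200-node toy network is a made-up stand-in for the large real networks, and the retry-until-enough-links loop implements the "at least 15 links" condition from the figure caption:

```python
import numpy as np

def random_subnetwork(A, size, min_links=15, rng=None):
    """Cut a random size x size subimage out of the adjacency matrix A
    and return it as the adjacency matrix of a small network, retrying
    until it contains at least min_links links."""
    rng = rng if rng is not None else np.random.default_rng(0)
    N = A.shape[0]
    while True:
        i = rng.integers(N - size + 1)
        j = rng.integers(N - size + 1)
        sub = A[i:i + size, j:j + size]
        if sub.sum() >= min_links:
            return sub

# Hypothetical sparse toy network standing in for a large real network
rng = np.random.default_rng(42)
A = (rng.random((200, 200)) < 0.05).astype(int)
sub = random_subnetwork(A, 60, min_links=15, rng=rng)
```

Link prediction is then run on `sub` alone, without ever loading the full network into the DGM.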

Discussion

In summary, our DGM-based link prediction shows superior performance against existing methods for various types of networks, be they technological, biological, or social in nature. Since our method treats the adjacency matrix of a network as an image, it can be naturally extended to solve the link prediction problem for bipartite graphs, multi-layer networks, and multiplex networks, where the adjacency matrices have certain inherent structure. With a small modification, it can also be used to perform link prediction in weighted graphs (see Supplementary Information Figure S15). To achieve that, we need to normalize the link weights so that they can be treated as existence probabilities of the corresponding links. In principle, any DGM can be utilized in our method. But we find that, for link prediction purposes, GANs perform much better than other DGMs, e.g., the variational autoencoder (Sohn et al., 2015) (see Supplementary Information Figure S16). There are several hyperparameters in training the GANs (see Supplementary Information Section 4 for details). In this work, we use the same set of hyperparameters for all the networks to give a conservative AUC estimate of our method. The performance of our method can certainly be further improved by carefully tuning those hyperparameters for a specific network of interest (see Figure S17). We feel this is beyond the scope of the current work and hence leave it for future work.
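The weight normalization mentioned above is a one-liner; this minimal sketch assumes nonnegative weights and a simple max-rescaling (the paper does not specify the exact normalization, so this is one plausible choice):

```python
import numpy as np

def normalize_weights(W):
    """Rescale nonnegative link weights into [0, 1] so that each weight
    can be read as the existence probability of the corresponding link.
    Max-rescaling is an assumption; the paper only states that weights
    are normalized into probabilities."""
    wmax = W.max()
    return W / wmax if wmax > 0 else W

# Hypothetical weighted directed network on 3 nodes
W = np.array([[0.0, 2.0, 0.0],
              [0.0, 0.0, 4.0],
              [1.0, 0.0, 0.0]])
P = normalize_weights(W)
```

The resulting grayscale matrix P can then be fed to the same image-based pipeline used for binary networks.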

Limitations of the Study

We should admit that, although our DGM-based link prediction displays superior performance on various real-world networks, it has several limitations. First, its time complexity is higher than that of traditional heuristic-based methods (e.g., Common Neighbors [Zhou et al., 2010], Preferential Attachment [Barabási and Albert, 1999]) and embedding-based methods (e.g., DeepWalk [Perozzi et al., 2014], node2vec [Grover and Leskovec, 2016]). (See Supplementary Information Section 4.2.4 for a detailed analysis of its time complexity.) Such a speed-accuracy trade-off deserves very careful consideration in real-world applications. For certain link prediction applications, such as recommender systems in e-commerce or online social media with daunting network sizes, speed is the major concern, so traditional link prediction methods still have big advantages. For applications in biomedicine (e.g., inferring protein-protein interactions or drug-target interactions) or criminal intelligence analysis (e.g., identifying hidden accomplices in criminal activities), the networks are much smaller than social media networks, and accuracy is far more important than speed. In those cases, we anticipate that our DGM-based link prediction should have an unparalleled advantage. Furthermore, we suggest that one should exploit graphics processing unit (GPU) parallelism to train the GANs (Im et al., 2016), which will certainly speed up our method. Finally, we emphasize that, in real-world applications of link prediction, any additional side information, such as node attributes, can be incorporated into our method to further improve the link prediction. Second, since we treat the adjacency matrix of a network as a binary image, our method is by definition not permutation invariant.
Recently, the notion of permutation invariance has received considerable attention in the deep learning literature (Wood and Shawe-Taylor, 1996; Kondor and Trivedi, 2018; Maron et al., 2018, 2019; Bloem-Reddy and Teh, 2019; Behrisch et al., 2016). A network method is called permutation invariant if it produces the same output regardless of the node labeling used to encode the adjacency matrix of the network. Although our method is not permutation invariant in theory, it is approximately permutation invariant in practice: as shown in Figure S18, node labeling does not significantly affect the performance of our link prediction method, as long as the labeling scheme leverages the existing structural features of the network.
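The lack of permutation invariance can be seen directly: relabeling the nodes of a graph permutes the rows and columns of its adjacency matrix, so the very same graph yields a different binary image. A minimal numpy illustration (the particular graph and permutation here are arbitrary choices of ours):

```python
import numpy as np

# Adjacency matrix of a 4-node directed cycle: 0 -> 1 -> 2 -> 3 -> 0
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]])

perm = np.array([2, 0, 3, 1])     # an arbitrary relabeling of the nodes
P = np.eye(4, dtype=int)[perm]    # the corresponding permutation matrix
A_relabeled = P @ A @ P.T         # same graph, different node labels

# Graph invariants are preserved...
assert A.sum() == A_relabeled.sum()   # same number of links
# ...but the binary image differs, so a pixel-based model sees a new input
assert not np.array_equal(A, A_relabeled)
```

An image-based model with no built-in symmetry would, in general, produce different outputs for `A` and `A_relabeled`; the empirical observation in Figure S18 is that, with a structure-aware labeling, this difference has little effect on the prediction accuracy.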

Resource Availability

Lead Contact

Yang-Yu Liu (yyl@channing.harvard.edu).

Materials Availability

This study did not generate new unique reagents.

Data and Code Availability

The code is available on the cloud-based reproducibility platform: Code Ocean (https://codeocean.com/), with the compute capsule entitled "Link Prediction through Deep Generative Model" (https://codeocean.com/capsule/6854770/tree/v1).

Methods

All methods can be found in the accompanying Transparent Methods supplemental file.

Acknowledgments

We thank Yizhou Sun, Tina Eliassi-Rad, Pan Zhang, Changjun Fan, Nima Dehmamy, Santo Fortunato, Yong-Yeol Ahn, Huawei Shen, and Marco Tulio Angulo for valuable discussions. Research reported in this publication was supported by grants R01AI141529, R01HD093761, UH3OD023268, U19AI095219, and U01HL089856 from the National Institutes of Health.

Author Contributions

Y.-Y.L. conceived and designed the project. X.-W.W. and Y.C. did the analytical and numerical calculations. X.-W.W. analyzed all the real networks. All authors analyzed the results. Y.-Y.L. and X.-W.W. wrote the manuscript. Y.C. edited the manuscript.

Declaration of Interests

The authors declare no competing financial interests. Correspondence and requests for materials should be addressed to Y.-Y.L. (yyl@channing.harvard.edu).

Published: October 23, 2020

Footnotes

Supplemental Information can be found online at https://doi.org/10.1016/j.isci.2020.101626.

Supplemental Information

Document S1. Transparent Methods, Figures S1–S18, and Tables S1 and S2
mmc1.pdf (4MB, pdf)

References

  1. Albert R., Barabási A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 2002;74:47. [Google Scholar]
  2. Arenas A., Fernández A., Gómez S. Analysis of the structure of complex networks at different resolution levels. New J. Phys. 2008;10:053039. [Google Scholar]
  3. Arjovsky M., Chintala S., Bottou L. Wasserstein GAN. arXiv. 2017 http://arxiv.org/abs/1701.07875 [Google Scholar]
  4. Baird D., Luczkovich J., Christian R.R. Assessment of spatial and temporal variability in ecosystem attributes of the St Marks National Wildlife Refuge, Apalachee Bay, Florida. Estuar. Coast. Shelf Sci. 1998;47:329–349. [Google Scholar]
  5. Barabási A.-L., Albert R. Emergence of scaling in random networks. Science. 1999;286:509–512. doi: 10.1126/science.286.5439.509. [DOI] [PubMed] [Google Scholar]
  6. Behrisch M., Bach B., Henry Riche N., Schreck T., Fekete J.-D. Matrix reordering methods for table and network visualization. Comput. Graph. Forum. 2016;35:693–716. [Google Scholar]
  7. van den Berg R., Kipf T.N., Welling M. Graph convolutional matrix completion. arXiv. 2017 arXiv:1706.02263 [Google Scholar]
  8. Berlusconi G., Calderoni F., Parolini N., Verani M., Piccardi C. Link prediction in criminal networks: a tool for criminal intelligence analysis. PLoS One. 2016;11:e0154244. doi: 10.1371/journal.pone.0154244. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Bertalmio M., Sapiro G., Caselles V., Ballester C. Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. ACM Press/Addison-Wesley Publishing Co.; 2000. Image inpainting; pp. 417–424. [Google Scholar]
  10. Blagus N., Šubelj L., Bajec M. Self-similar scaling of density in complex real-world networks. Physica A Stat. Mech. Appl. 2012;391:2794–2802. [Google Scholar]
  11. Bloem-Reddy B., Teh Y.W. Probabilistic symmetry and invariant neural networks. arXiv. 2019 arXiv:1901.06082 [Google Scholar]
  12. Blondel V.D., Guillaume J.-L., Lambiotte R., Lefebvre E. Fast unfolding of communities in large networks. J. Stat. Mech. 2008;2008:P10008. [Google Scholar]
  13. Boccaletti S., Latora V., Moreno Y., Chavez M., Hwang D. Complex networks: structure and dynamics. Phys. Rep. 2006;424:175–308. [Google Scholar]
  14. Campillos M., Kuhn M., Gavin A.C., Jensen L.J., Bork P. Drug target identification using side-effect similarity. Science. 2008;321:263–266. doi: 10.1126/science.1158140. [DOI] [PubMed] [Google Scholar]
  15. Chaney A.J., Blei D.M., Eliassi-Rad T. Proceedings of the 9th ACM Conference on Recommender Systems. ACM; 2015. A probabilistic model for using social networks in personalized item recommendation; pp. 43–50. [Google Scholar]
  16. Chen B., Li F., Chen S., Hu R., Chen L. Link prediction based on non-negative matrix factorization. PLoS One. 2017;12:e0182968. doi: 10.1371/journal.pone.0182968. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Chiu C., Zhan J. Deep learning for link prediction in dynamic networks using weak estimators. IEEE Access. 2018;6:35937–35945. [Google Scholar]
  18. Christian R.R., Luczkovich J.J. Organizing and understanding a winter's seagrass foodweb network through effective trophic levels. Ecol. Model. 1999;117:99–124. [Google Scholar]
  19. Clauset A., Moore C., Newman M.E. Hierarchical structure and the prediction of missing links in networks. Nature. 2008;453:98–101. doi: 10.1038/nature06830. [DOI] [PubMed] [Google Scholar]
  20. Friedman N., Getoor L., Koller D., Pfeffer A. International Joint Conferences on Artificial Intelligence. Stockholm; Sweden: 1999. Learning probabilistic relational models; pp. 1300–1309. [Google Scholar]
  21. Girvan M., Newman M.E. Community structure in social and biological networks. Proc. Natl. Acad. Sci. U S A. 2002;99:7821–7826. doi: 10.1073/pnas.122653799. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Goodfellow I., Pouget-Abadie J., Mirza M., Xu B., Warde-Farley D., Ozair S., Courville A., Bengio Y. Generative adversarial nets. In: Ghahramani Z., Welling M., Cortes C., editors. Advances in Neural Information Processing Systems. 2014. pp. 2672–2680. [Google Scholar]
  23. Grover A., Leskovec J. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’16. The 22nd ACM SIGKDD International Conference, San Francisco, California, USA. ACM Press; 2016. node2vec: scalable feature learning for networks; pp. 855–864. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Guimerà R., Sales-Pardo M. Missing and spurious interactions and the reconstruction of complex networks. Proc. Natl. Acad. Sci. U S A. 2009;106:22073–22078. doi: 10.1073/pnas.0908366106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Hamilton W., Ying Z., Leskovec J. Inductive representation learning on large graphs. In: Guyon I., Luxburg U.V., Bengio S., Wallach H., Fergus R., Vishwanathan S., Garnett R., editors. Vol. 30. Curran Associates, Inc.; 2017. pp. 1024–1034. (Advances in Neural Information Processing Systems). [Google Scholar]
  26. Han J.D., Dupuy D., Bertin N., Cusick M.E., Vidal M. Effect of sampling on topology predictions of protein-protein interaction networks. Nat. Biotechnol. 2005;23:839–844. doi: 10.1038/nbt1116. [DOI] [PubMed] [Google Scholar]
  27. Heckerman D., Meek C., Koller D. Probabilistic entity-relationship models, PRMs, and plate models. In: Introduction to Statistical Relational Learning. MIT Press; 2007. pp. 201–238. [Google Scholar]
  28. Hulovatyy Y., Solava R.W., Milenković T. Revealing missing parts of the interactome via link prediction. PLoS One. 2014;9:e90073. doi: 10.1371/journal.pone.0090073. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Im D.J., Ma H., Kim C.D., Taylor G. Generative adversarial parallelization. arXiv. 2016 arXiv:1612.04021 [Google Scholar]
  30. Katz L. A new status index derived from sociometric analysis. Psychometrika. 1953;18:39–43. [Google Scholar]
  31. Kondor R., Trivedi S. On the generalization of equivariance and convolution in neural networks to the action of compact groups. arXiv. 2018 arXiv:1802.03690 [Google Scholar]
  32. Kovács I.A., Luck K., Spirohn K., Wang Y., Pollis C., Schlabach S., Bian W., Kim D.K., Kishore N., Hao T. Network-based prediction of protein interactions. Nat. Commun. 2019;10:1–8. doi: 10.1038/s41467-019-09177-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Krebs V.E. Mapping networks of terrorist cells. Connections. 2002;24:43–52. [Google Scholar]
  34. LeBlanc L.J., Morlok E.K., Pierskalla W.P. An efficient approach to solving the road network equilibrium traffic assignment problem. Transport. Res. 1975;9:309–318. [Google Scholar]
  35. Leskovec J., Mcauley J.J. Advances in Neural Information Processing Systems. 2012. Learning to discover social circles in ego networks; pp. 539–547. [Google Scholar]
  36. Liben-Nowell D., Kleinberg J. The link-prediction problem for social networks. J. Am. Soc. Inf. Sci. Technol. 2007;58:1019–1031. [Google Scholar]
  37. Linden G., Smith B., York J. Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput. 2003;7:76–80. [Google Scholar]
  38. Lü L., Zhou T. Link prediction in complex networks: a survey. Physica A Stat. Mech. Appl. 2011;390:1150–1170. [Google Scholar]
  39. Luo Y., Zhao X., Zhou J., Yang J., Zhang Y., Kuang W., Peng J., Chen L., Zeng J. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat. Commun. 2017;8:573. doi: 10.1038/s41467-017-00680-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Maron H., Ben-Hamu H., Shamir N., Lipman Y. Invariant and equivariant graph networks. arXiv. 2018 arXiv:1812.09902 [Google Scholar]
  41. Maron H., Fetaya E., Segol N., Lipman Y. On the universality of invariant networks. arXiv. 2019 arXiv:1901.09342 [Google Scholar]
  42. Martínez V., Berzal F., Cubero J.-C. A survey of link prediction in complex networks. ACM Comput. Surv. 2016;49:1–33. [Google Scholar]
  44. Murphy R.L., Srinivasan B., Rao V., Ribeiro B. Relational pooling for graph representations. arXiv. 2019 arXiv:1903.02541 [Google Scholar]
  45. Newman M.E.J. The structure and function of complex networks. SIAM Rev. 2003;45:167–256. [Google Scholar]
  46. Newman M.E., Girvan M. Finding and evaluating community structure in networks. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2004;69:026113. doi: 10.1103/PhysRevE.69.026113. [DOI] [PubMed] [Google Scholar]
  47. Niepert M., Ahmed M., Kutzkov K. Learning convolutional neural networks for graphs. International Conference on Machine Learning. 2016:2014–2023. [Google Scholar]
  48. Pan L., Zhou T., Lü L., Hu C.K. Predicting missing links and identifying spurious links via likelihood analysis. Sci. Rep. 2016;6:22955–23010. doi: 10.1038/srep22955. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Pech R., Hao D., Pan L., Cheng H., Zhou T. Link prediction via matrix completion. EPL. 2017;117:38002. [Google Scholar]
  50. Perozzi B., Al-Rfou R., Skiena S. DeepWalk: online learning of social representations. arXiv. 2014:701–710. doi: 10.1145/2623330.2623732. [DOI] [Google Scholar]
  51. Radicchi F., Castellano C., Cecconi F., Loreto V., Parisi D. Defining and identifying communities in networks. Proc. Natl. Acad. Sci. U S A. 2004;101:2658–2663. doi: 10.1073/pnas.0400054101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Sahni N., Yi S., Taipale M., Fuxman Bass J.I., Coulombe-Huntington J., Yang F., Peng J., Weile J., Karras G.I., Wang Y. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161:647–660. doi: 10.1016/j.cell.2015.04.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Sarukkai R.R. Link prediction and path analysis using Markov chains1. Comput. Netw. 2000;33:377–386. [Google Scholar]
  54. Schlichtkrull M., Kipf T.N., Bloem P., Van Den Berg R., Titov I., Welling M. Modeling relational data with graph convolutional networks. arXiv. 2017 http://arxiv.org/abs/1703.06103 [Google Scholar]
  55. Scholtes I., Wider N., Pfitzner R., Garas A., Tessone C.J., Schweitzer F. Causality-driven slow-down and speed-up of diffusion in non-Markovian temporal networks. Nat. Commun. 2014;5:5024. doi: 10.1038/ncomms6024. [DOI] [PubMed] [Google Scholar]
  56. Sohn K., Lee H., Yan X. Learning structured output representation using deep conditional generative models. In: Cortes C., Lawrence N.D., Lee D.D., Sugiyama M., Garnett R., editors. Advances in Neural Information Processing Systems. 2015. pp. 3483–3491. [Google Scholar]
  57. Srinivasan B., Ribeiro B. On the equivalence between node embeddings and structural graph representations. arXiv. 2019 arXiv:1910.00452 [Google Scholar]
  58. Tavakoli S., Hajibagheri A., Sukthankar G. International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction. 2017. Learning social graph topologies using generative adversarial neural networks. [Google Scholar]
  59. Viswanath B., Mislove A., Cha M., Gummadi K.P. Proceedings of the 2nd ACM Workshop on Online Social Networks. ACM; 2009. On the evolution of user interaction in Facebook; pp. 37–42. [Google Scholar]
  60. Von Mering C., Krause R., Snel B., Cornell M., Oliver S.G., Fields S., Bork P. Comparative assessment of large-scale data sets of protein–protein interactions. Nature. 2002;417:399–403. doi: 10.1038/nature750. [DOI] [PubMed] [Google Scholar]
  61. Wood J., Shawe-Taylor J. A unifying framework for invariant pattern recognition. Pattern Recognit. Lett. 1996;17:1415–1422. [Google Scholar]
  62. Yeh R.A., Chen C., Yian Lim T., Schwing A.G., Hasegawa-Johnson M., Do M.N. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017. Semantic image inpainting with deep generative models; pp. 5485–5493. [Google Scholar]
  63. Zhang B., Liu R., Massey D., Zhang L. Collecting the Internet AS-level topology. ACM SIGCOMM Comput. Commun. Rev. 2005;35:53–61. [Google Scholar]
  64. Zhang M., Chen Y. Link prediction based on graph neural networks. arXiv. 2018 arXiv:1802.09691 [Google Scholar]
  65. Zhou T., Kuscsik Z., Liu J.G., Medo M., Wakeling J.R., Zhang Y.C. Solving the apparent diversity-accuracy dilemma of recommender systems. Proc. Natl. Acad. Sci.U S A. 2010;107:4511–4515. doi: 10.1073/pnas.1000488107. [DOI] [PMC free article] [PubMed] [Google Scholar]
