. 2020 Apr 17;12084:448–461. doi: 10.1007/978-3-030-47426-3_35

Quality-Aware Streaming Network Embedding with Memory Refreshing

Hsi-Wen Chen 14, Hong-Han Shuai 15, Sheng-De Wang 14, De-Nian Yang 16,17,
Editors: Hady W Lauw8, Raymond Chi-Wing Wong9, Alexandros Ntoulas10, Ee-Peng Lim11, See-Kiong Ng12, Sinno Jialin Pan13
PMCID: PMC7206285

Abstract

Static network embedding has been widely studied to convert sparse structural information into a dense latent space. However, most real networks continuously evolve, and recomputing the whole embedding for every snapshot is computationally intensive. To avoid recomputing the embedding over time, we explore streaming network embedding for two reasons: 1) to efficiently identify the nodes whose embeddings must be updated under multi-type network changes, and 2) to carefully revise the embeddings to maintain transduction over different parts of the network. Specifically, we propose a new representation learning framework, named Graph Memory Refreshing (GMR), to efficiently preserve both global and local structural information. We prove that GMR maintains the consistency of embeddings (crucial for network analysis) for isomorphic structures better than existing approaches. Experimental results demonstrate that GMR outperforms the baselines in much less time.

Keywords: Network embedding, Streaming data mining

Introduction

Low-dimensional vector representations of nodes in large-scale networks have been widely applied to a variety of domains, such as social media [13], molecular structure [7], and transportation [9]. Previous approaches, e.g., DeepWalk [13], LINE [16], and SDNE [20], are designed to reduce sparse structural information to a dense latent space for node classification [13], link prediction [16], and network visualization [21]. However, these embedding schemes were not designed for evolving networks. Popular networks today tend to evolve over time, e.g., the average number of friends increased from 155 in 2016 to 338 in 2018 [8]. Ephemeral social networks, like Snapchat for short-term conversations, may disappear within weeks. Retraining the whole embedding for each snapshot is computationally intensive for a massive network. Therefore, streaming network embedding is a desirable option to quickly update and generate new embeddings in a minimum amount of time.

Different from dynamic network embeddings [12, 21] that analyze a sequence of networks to capture temporal patterns, streaming network embedding1 aims to update the network embedding from the changed part of the network to find the new embedding. Efficient streaming network embedding has the following four main challenges. 1) Multi-type changes. Dynamic changes to networks, with insertions and deletions of nodes and edges, are usually frequent and complex. It is thus important to derive the new embedding in minimum time to promptly reflect the new network status. 2) Evaluation of affected nodes. Updating the embeddings of only the nodes neighboring the changed part ignores the ripple effect on the remaining nodes. It is crucial to identify the nodes required to update the embeddings and ensure that the nodes with similar structures share similar embeddings. 3) Transduction. When a network significantly changes, it is difficult to keep the local proximity between the changed part and the remaining part of the network. It is also important to reflect the change in the global structure. 4) Quality guarantee. For streaming embeddings based on neural networks (usually regarded as a black box), it is challenging to provide theoretical guarantees about the embedding quality.

To effectively address the above challenges, this paper proposes a new representation learning approach, named Graph Memory Refreshing (GMR). GMR first derives the new embedding of the changed part by decomposing the loss function of Skip-Gram to support multi-type changes. It carefully evaluates the ripple-effect area and ensures correctness with a globally structure-aware selection strategy, named hierarchical addressing, to efficiently identify and update the affected nodes, using beam search to avoid overfitting. To effectively support streaming data, our idea is to interpret the update of embeddings as a memory network with two controllers, a refreshing gate and a percolation gate, to tailor the embeddings from the structural aspect and maintain the transduction. GMR then updates the embeddings according to the streaming information of the new network and the stored features (i.e., memory) of the current network to avoid recomputing the embedding of the whole network. Moreover, GMR aims to both preserve the global structural information and maintain the embeddings of isomorphic structures, i.e., ensuring that the nodes with similar structures share similar embeddings. This property is essential to ensure the correctness of network analysis based on network embeddings [18]. We theoretically prove that GMR preserves the consistency of embeddings for isomorphic structures better than existing approaches. The contributions of this paper are summarized as follows.

  • GMR explores streaming network embedding with quality guarantees. The hierarchical addressing, refreshing gate, and percolation gate efficiently find and update the affected nodes under multi-type changes.

  • We prove that GMR embedding preserves isomorphic structures better than the existing approaches. According to our literature review, this is the first theoretical analysis for streaming network embedding.

  • Experimental results show that GMR outperforms the baselines by at least Inline graphic for link prediction and node classification in a much shorter time.

Related Work

Static network embedding has attracted a wide range of attention. Laplacian Eigenmaps [1] and IsoMap [17] first constructed the adjacency matrix and then solved a matrix factorization, but the adjacency matrix was not scalable to massive networks. After Skip-Gram [11] was demonstrated to be powerful for representation learning, DeepWalk [13] and node2vec [5] employed random walks to learn network embeddings, while LINE [16] and SDNE [20] preserved the first-order and second-order proximity. GraphSAGE [6] and GAT [19] generated node representations in an inductive manner by mapping and aggregating node features from the neighborhood.

In addition, a recent line of research proposed to learn the embeddings from a sequence of networks over time to find temporal behaviors [12, 21]. However, these approaches focused on capturing the temporal changes rather than efficiency, since they recomputed the embeddings of the whole network instead of updating only the changed part. Another line of recent research studied dynamic embedding without retraining. However, the SVD-based approach [22] is difficult to scale to large networks according to [5]. Besides, [10] supported only edge insertion and ignored edge deletion, and the consistency of the embeddings for globally isomorphic structures was not ensured. Compared with the above research and [3], the proposed GMR is the only one that provides a theoretical guarantee on the embedding quality (detailed later). It also more accurately preserves both the global structural information and the consistency of the embeddings.

Problem Formulation

In this section, we present the definitions for streaming network embeddings.

Definition 1

( Streaming Networks ). A dynamic network Inline graphic is a sequence of networks Inline graphic over time, where Inline graphic is the network snapshot at timestamp t. Inline graphic represents the streaming network with the changed part Inline graphic and Inline graphic as the sets of vertices and edges inserted or deleted between t and Inline graphic.
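For concreteness, one way to materialize a snapshot update from the changed part can be sketched in a few lines of Python; the adjacency-dict layout and the function name are illustrative assumptions, not the paper's implementation.

```python
from collections import defaultdict

def apply_delta(adj, inserted_edges=(), deleted_edges=()):
    """Produce the next snapshot from the current adjacency dict and the
    changed part (inserted/deleted edges); node deletion is modeled by
    deleting all incident edges, node insertion by inserting new edges."""
    new_adj = defaultdict(set, {u: set(vs) for u, vs in adj.items()})
    for u, v in inserted_edges:
        new_adj[u].add(v)
        new_adj[v].add(u)
    for u, v in deleted_edges:
        new_adj[u].discard(v)
        new_adj[v].discard(u)
    return {u: vs for u, vs in new_adj.items() if vs}  # drop isolated nodes

g0 = {1: {2}, 2: {1, 3}, 3: {2}}
g1 = apply_delta(g0, inserted_edges=[(3, 4)], deleted_edges=[(1, 2)])
```

Only the changed part is touched; the untouched adjacency lists carry over unchanged, which is the premise of streaming updates.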

Definition 2

( Streaming Network Embeddings). Let Inline graphic denote the streaming network embedding that preserves the structural property of Inline graphic at timestamp t. The streaming network embeddings are derived by Inline graphic, where Inline graphic updates the node embedding Inline graphic at timestamp Inline graphic according to Inline graphic and Inline graphic, i.e., Inline graphic, where Inline graphic.

In other words, the inputs of the streaming network embedding function are the embedding at the current time and the changed part of the network. In contrast, for [12, 21], given a dynamic network Inline graphic, the embedding is derived by a sequence of functions Inline graphic, where Inline graphic maps the node Inline graphic to the d-dimensional embedding Inline graphic at timestamp Inline graphic, i.e., Inline graphic. Therefore, the inputs are the whole networks at the current and next timestamps. In the following, we present the problem studied in this paper.

Definition 3

( Quality-aware Multi-type Streaming Network Embeddings ). Given a streaming network with Inline graphic and Inline graphic as the sets of the vertices and edges inserted or deleted between t and Inline graphic, the goal is to find the streaming network embedding and derive the corresponding embedding quality to ensure that the nodes with similar structures share similar embeddings.

Later in Sect. 5, we formally present and theoretically analyze the quality of the embedding with a new metric, named the isomorphic retaining score. Moreover, we prove that the proposed GMR preserves the structures better than other state-of-the-art methods in Theorem 1.

Graph Memory Refreshing

In this section, we propose Graph Memory Refreshing (GMR) to support multi-type embedding updates, to identify the affected nodes required to update the embeddings by hierarchical addressing, and to ensure that the nodes with similar structures share similar embeddings. To effectively support streaming data, we leverage the controllers (refreshing and percolation gates) of memory networks [4] to refresh the memory (update the embedding) according to the current state (the current embedding) and new input (streaming network).

Multi-type Embedding Updating

For each node Inline graphic, the Skip-Gram model predicts the context nodes Inline graphic and maximizes the log probability,

$\max_{f}\ \sum_{v_i \in V} \sum_{v_j \in C(v_i)} \log p(v_j \mid v_i) \qquad (4.1)$

However, it is computationally intensive to derive the above probabilities for all nodes. Therefore, the probabilities are approximated by negative sampling [11],

$\log p(v_j \mid v_i) \approx \log \sigma(\vec{v}'_j \cdot \vec{v}_i) + \sum_{n=1}^{K} \mathbb{E}_{v_n \sim P_n(v)}\big[\log \sigma(-\vec{v}'_n \cdot \vec{v}_i)\big] \qquad (4.2)$

where Inline graphic is the sigmoid function, Inline graphic and Inline graphic are respectively the embedding vectors of Inline graphic and Inline graphic, and Inline graphic is the noise distribution for negative sampling. The two terms respectively model the observed neighborhoods and the negative samples (i.e., node pairs without an edge) drawn from distribution Inline graphic. However, Eq. (4.2) focuses on only the edge insertion. To support the edge deletion, the second part in Eq. (4.2) is revised to consider unpaired negative samples and the deletion as follows,

$\sum_{n=1}^{K} \mathbb{E}_{v_n \sim P_n(v)}\big[\log \sigma(-\vec{v}'_n \cdot \vec{v}_i)\big] + \beta \sum_{(v_i, v_j) \in D} \log \sigma(-\vec{v}'_j \cdot \vec{v}_i) \qquad (4.3)$

where D is the set of deleted edges, and the weight on the deleted-edge term is required to be greater than 1 because the samples from D usually provide more information than the unpaired negative samples Inline graphic.2 Note that node deletion is handled by removing all incident edges of a node, while adding a node with new edges is regarded as edge insertion.3
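As a concrete illustration, the revised objective can be evaluated as a loss in a few lines of Python; the function name, the dictionary layout of the embeddings, and the weight `beta` standing in for the paper's deletion weight are all illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def skipgram_stream_loss(emb, ctx, pos_pairs, neg_pairs, del_pairs, beta=2.0):
    """Negative log of an Eq. (4.3)-style objective: observed pairs are
    pulled together, while negative samples and deleted edges (the latter
    weighted by beta > 1) are pushed apart."""
    loss = 0.0
    for i, j in pos_pairs:          # observed neighborhoods
        loss -= np.log(sigmoid(ctx[j] @ emb[i]))
    for i, n in neg_pairs:          # unpaired negative samples from P_n
        loss -= np.log(sigmoid(-ctx[n] @ emb[i]))
    for i, j in del_pairs:          # deleted edges, weighted more heavily
        loss -= beta * np.log(sigmoid(-ctx[j] @ emb[i]))
    return float(loss)
```

Minimizing this loss is equivalent to maximizing the objective; moving an edge from the observed set to the deleted set increases the penalty on keeping its endpoints close.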

Hierarchical Addressing

For streaming network embedding, previous computationally intensive approaches [4] find the embeddings of all nodes by global addressing. A more efficient way is updating only the neighboring nodes of the changed part with local addressing [10]. However, the ripple-effect area usually has an arbitrary shape (i.e., including not only the neighboring nodes). Therefore, instead of extracting the neighboring nodes with heuristics, hierarchical addressing systematically transforms the original network into a search tree that is aware of the global structure for the efficient identification of the affected nodes to update their embeddings.

Hierarchical addressing has the following advantages: 1) Efficient. It can be regarded as a series of binary classifications (on a tree), whereas global addressing and local addressing belong to multi-class classification (on the candidate list). Therefore, the time complexity to consider each node in Inline graphic is reduced from Inline graphic (i.e., pairwise comparison) to Inline graphic, where k is the number of search beams (explained later). 2) Topology-aware. It carefully examines the graph structure to evaluate the proximity and maintain the isomorphic structure, i.e., ensuring that the nodes with similar structures share similar embeddings. This property is essential for the correctness of network analysis with network embeddings [18].

Specifically, hierarchical addressing first exploits graph coarsening to build an addressing tree for the efficient search of the affected nodes. Graph coarsening includes both first-hop and second-hop collapsing: first-hop collapsing preserves the first-order proximity by merging two adjacent nodes into a supernode; second-hop collapsing aggregates the nodes with a common neighbor into a supernode, where the embedding of the supernode is averaged from its child nodes [2]. Second-hop collapsing is prioritized because it can effectively compress the network into a smaller tree.
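The collapsing step above can be sketched as follows; the averaged supernode embedding follows [2], while the function and node names are illustrative assumptions.

```python
import numpy as np

def collapse(emb, a, b):
    """Merge nodes a and b into one supernode whose embedding is the
    average of its child nodes, as in [2]."""
    return (np.asarray(emb[a], float) + np.asarray(emb[b], float)) / 2.0

# second-hop collapsing: u and v share a common neighbor, so they are
# merged first; first-hop collapsing would merge two adjacent nodes
emb = {"u": [0.4, 0.2], "v": [0.2, 0.4]}
super_uv = collapse(emb, "u", "v")
```

Applying the step repeatedly yields a binary addressing tree whose leaves are the original nodes and whose inner nodes are supernodes.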

The network is accordingly transformed into an addressing tree with each node Inline graphic as a leaf node. Afterward, for each node Inline graphic, we search for the node Inline graphic sharing the highest similarity with Inline graphic as the first affected node for Inline graphic by comparing their cosine similarity [4] along the addressing tree. For each node in the tree, if the left child node shares a greater similarity to Inline graphic, the search continues on the left subtree; otherwise, it searches the right subtree. The similarity search ends when it reaches the leaf node with the highest similarity to Inline graphic, and any node in Inline graphic (not only the neighbors of Inline graphic) can thereby be extracted. In other words, hierarchical addressing enables GMR to extract affected nodes located in different parts of the network (not necessarily close to Inline graphic), whereas previous approaches [3, 10, 21] update only the neighboring nodes of Inline graphic. Afterward, hierarchical addressing extracts the top-1 result for all nodes in Inline graphic as the initially affected nodes (more will be included later), where the nodes with similarity below a threshold h are filtered out. To prevent over-fitting to a local minimum, hierarchical addressing can also extract the top-k results at each iteration with beam search.4

Figure 1 presents an example of hierarchical addressing with embedding dimension 2. At timestamp Inline graphic (Fig. 1(a)), we construct the addressing tree by first merging nodes Inline graphic and Inline graphic into supernode Inline graphic through second-hop collapsing. The embedding of Inline graphic is Inline graphic. Afterward, Inline graphic merges Inline graphic into Inline graphic through first-hop collapsing, and Inline graphic is the root of the tree. At Inline graphic (Fig. 1(b)), if a new node Inline graphic is linked to Inline graphic with the embedding (0.3, 0.2), we identify the affected nodes with beam search (Inline graphic) and start from the root Inline graphic. First, we insert Inline graphic and Inline graphic into the search queue of size 2 since Inline graphic, to compare the similarity of Inline graphic with that of Inline graphic and Inline graphic. Both Inline graphic and Inline graphic are then popped out from the queue because Inline graphic and Inline graphic have higher similarity, i.e., the top-2 results (0.78 and 0.98), compared with 0.73 for Inline graphic.
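The search procedure above can be sketched as a beam search over the addressing tree; the `TreeNode` layout and the function name are assumptions for illustration, with beam width k and similarity threshold h as in the text.

```python
import heapq
import numpy as np

def cos(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

class TreeNode:
    """Addressing-tree node; leaves are original nodes, inner nodes are
    supernodes with averaged embeddings (names are illustrative)."""
    def __init__(self, emb, left=None, right=None, name=None):
        self.emb, self.left, self.right, self.name = emb, left, right, name

def hierarchical_address(root, query, k=2, h=0.0):
    """Descend the tree keeping the k children most similar to `query`
    (beam search); return reached leaves above similarity threshold h."""
    beam, leaves = [root], []
    while beam:
        children = []
        for node in beam:
            if node.left is None and node.right is None:
                leaves.append(node)        # a beam reached a leaf
            else:
                children += [c for c in (node.left, node.right) if c]
        beam = heapq.nlargest(k, children, key=lambda n: cos(n.emb, query))
    ranked = heapq.nlargest(k, leaves, key=lambda n: cos(n.emb, query))
    return [n.name for n in ranked if cos(n.emb, query) >= h]

# toy tree: a supernode covers leaves B and C; the root covers A and it
A = TreeNode([1.0, 0.0], name="A")
B = TreeNode([0.0, 1.0], name="B")
C = TreeNode([0.9, 0.1], name="C")
root = TreeNode([0.6, 0.4], left=A,
                right=TreeNode([0.45, 0.55], left=B, right=C))
affected = hierarchical_address(root, [1.0, 0.0], k=2, h=0.5)
```

Each level needs only a binary choice per beam, which is where the logarithmic per-node search cost mentioned earlier comes from.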

Fig. 1. Example of hierarchical addressing.

Refresh and Percolate

After identifying the nodes required to update the embeddings by hierarchical addressing, a simple approach is to update the embeddings of those affected nodes with a constant shift [6, 20]. However, a streaming network with a topology change on only a subset of nodes usually leads to different shifts for the nodes in distinct locations. Moreover, updating only the nodes extracted from hierarchical addressing is insufficient to ensure consistency of embeddings for the nodes with similar structures when the embeddings are tailored independently.

To effectively support streaming data, inspired by the gating mechanism in GRU [4], we parameterize the update of the embedding according to the current embedding and incoming streaming network. Specifically, GMR decomposes the update procedure into two controller gates: a refreshing gate Inline graphic and percolation gate Inline graphic. For each node Inline graphic selected in hierarchical addressing for each Inline graphic, the refreshing gate first updates the embedding of Inline graphic according the new embedding of Inline graphic, and the percolation gate then updates the embedding for every neighbor Inline graphic of Inline graphic from the new embedding of Inline graphic. The refreshing gate quantifies the embedding update for Inline graphic from an incoming stream (i.e., one-to-one update), while the percolation gate transduces the embedding of Inline graphic to its neighborhoods (i.e., one-to-many update) to preserve better local structure. The two gates are the cornerstones to maintain isomorphic structure, as proved later in the Theorem 1.

To update the embeddings of Inline graphic, i.e., updating Inline graphic from Inline graphic, we first define a shared function Inline graphic to find the refreshing coefficient Inline graphic, which represents the correlation between the embedding of Inline graphic and the new embedding of Inline graphic, i.e., Inline graphic. The refreshing gate selects the correlation function [19] as the shared function Inline graphic to extract the residual relation [19] between the two embeddings, instead of directly adopting a constant shift as was done in previous work. Here Inline graphic is a shift projection, and Inline graphic is derived by Inline graphic, where || is the vector concatenation operation. After this, we regulate the refreshing coefficient Inline graphic into [0, 1] by a sigmoid function Inline graphic to provide a non-linear transformation. Therefore, Inline graphic quantifies the extent to which Inline graphic affects Inline graphic,

$\vec{m}_i \leftarrow \big(1 - \sigma(r_{ij})\big)\,\vec{m}_i + \sigma(r_{ij})\,\vec{m}_j, \qquad r_{ij} = \vec{a}^{\top}\big[\mathbf{W}\vec{m}_i \,\|\, \mathbf{W}\vec{m}_j\big] \qquad (4.4)$

Thereafter, the percolation gate revises the embedding of the neighbor nodes of Inline graphic to ensure the consistency of the embeddings for the nodes with similar structures. The percolation gate learns another sharable vector Inline graphic and finds the percolation coefficient Inline graphic, to quantify the extent that Inline graphic affects Inline graphic. Similarly, we regulate Inline graphic by Inline graphic to update Inline graphic as follows,

$\vec{m}_n \leftarrow \big(1 - \sigma(p_{in})\big)\,\vec{m}_n + \sigma(p_{in})\,\vec{m}_i, \qquad p_{in} = \vec{a}_p^{\top}\big[\vec{m}_n \,\|\, \vec{m}_i\big] \qquad (4.5)$

Therefore, when the refreshing and percolation gates are 0, the streaming network is ignored. In contrast, when both gates become 1, the previous snapshot embedding is dropped accordingly. In summary, the refreshing and percolation gates act as decision makers to learn the impact of the streaming network on different nodes. For the percolation gate, when node Inline graphic is updated, the percolation gate tailors the embedding of each Inline graphic,5 by evaluating the similarity of Inline graphic and Inline graphic according to the embeddings Inline graphic and Inline graphic. If Inline graphic and Inline graphic share many common neighbors, the percolation value of Inline graphic will increase to draw Inline graphic and Inline graphic closer to each other. The idea is similar for the refreshing gate. Note that Inline graphic and Inline graphic are both differentiable and can be trained in an unsupervised setting by maximizing the objective in Eq. (4.3). The unsupervised loss can also be replaced or augmented by a task-oriented objective (e.g., cross-entropy loss) when labels are provided. We alternately update the embeddings (i.e., Inline graphic and Inline graphic) and the correlation parameters (i.e., Inline graphic and Inline graphic) to achieve better convergence.
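Under the convex-combination reading of the two gates (a gate of 0 ignores the stream; a gate of 1 drops the previous embedding), a single update step can be sketched as follows. The exact parameterization (projection `W` and correlation vectors) is an assumption modeled on the GRU-style gating of [4], not the paper's trained weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def refresh(m_i, m_j_new, W, a):
    """Refreshing gate (one-to-one): move the embedding m_i of the selected
    node toward the new embedding m_j_new by a coefficient in [0, 1]."""
    r = sigmoid(a @ np.concatenate([W @ m_i, W @ m_j_new]))
    return (1 - r) * m_i + r * m_j_new, r

def percolate(m_n, m_i_new, a_p):
    """Percolation gate (one-to-many): transduce the refreshed embedding
    m_i_new to a neighbor embedding m_n by a coefficient in [0, 1]."""
    p = sigmoid(a_p @ np.concatenate([m_n, m_i_new]))
    return (1 - p) * m_n + p * m_i_new, p
```

With zero-initialized correlation vectors both gates output 0.5, i.e., each update lands at the midpoint of the old and new embeddings; training moves the coefficients toward 0 or 1 per node pair.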

Figure 2 illustrates an example of updating the node Inline graphic. After the embedding of Inline graphic is updated from (0.8, 0.1) to (0.9, 0.1), GMR uses the percolation gate to transduce the embedding to the neighboring nodes (i.e., Inline graphic, Inline graphic, and Inline graphic) to preserve the local structure. Since Inline graphic shares more common neighbors (Inline graphic) with Inline graphic than Inline graphic does (none), the values of the percolation gate for Inline graphic and Inline graphic are 0.8 and 0.5, respectively. The embeddings of nodes Inline graphic and Inline graphic become Inline graphic and Inline graphic through the percolation gate from Inline graphic, respectively. Therefore, the relative distance between Inline graphic and Inline graphic is maintained.

Fig. 2. Example of percolation gate.

Theoretical Analysis

The quality of network embedding can be empirically evaluated with network analysis tasks, e.g., link prediction [16] and node classification [13], since network embedding is unsupervised learning without known ground truth. In contrast, when the network analysis task is unknown a priori, it is important to theoretically analyze the quality of the embedding. To achieve this goal, we first define isomorphic pairs and prove that the embeddings of isomorphic pairs are the same in GMR. This property has been regarded as an important criterion for evaluating the quality of network embedding [18], because nodes with similar structures need to share similar embeddings. Moreover, the experimental results in Sect. 6 show that a higher quality leads to better performance on task-oriented metrics.

Definition 4

( Isomorphic Pair). Any two different nodes Inline graphic and Inline graphic form an isomorphic pair if the sets of their first-hop neighbors Inline graphic are the same.
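Definition 4 can be checked directly; the adjacency-dict representation is an illustrative assumption.

```python
from itertools import combinations

def isomorphic_pairs(adj):
    """Return all pairs of distinct nodes whose first-hop neighbor sets
    coincide (Definition 4); adjacency is a dict of node -> set of nodes."""
    return {(u, v) for u, v in combinations(sorted(adj), 2)
            if adj[u] == adj[v]}

# b and c both have neighbors {a, d}; a and d both have neighbors {b, c}
adj = {"a": {"b", "c"}, "b": {"a", "d"}, "c": {"a", "d"}, "d": {"b", "c"}}
pairs = isomorphic_pairs(adj)
```

Note that two adjacent nodes (without self-loops) can never form an isomorphic pair, since each appears in the other's neighbor set but not in its own.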

Lemma 1

If Inline graphic and Inline graphic are both isomorphic pairs, Inline graphic is also an isomorphic pair.

Proof:

According to Definition 4, Inline graphic and Inline graphic are both isomorphic pairs, indicating that Inline graphic and Inline graphic. Therefore, Inline graphic is equal to Inline graphic, and thus Inline graphic is also an isomorphic pair. □

Lemma 2

The embeddings Inline graphic and Inline graphic are the same after GMR converges if and only if (Inline graphic, Inline graphic) is an isomorphic pair.

Proof:

We first prove the sufficient condition. If Inline graphic is an isomorphic pair with Inline graphic, the probability of Inline graphic predicting the context nodes is not equal to that of Inline graphic (Eq. (4.1)). Therefore, there exists a better solution that makes Inline graphic and Inline graphic equal, contradicting the condition that the algorithm has converged. For the necessary condition, if Inline graphic but Inline graphic is not an isomorphic pair, since the probabilities are equal and the algorithm has converged, Inline graphic should be identical to Inline graphic in Eq. (4.1), contradicting that Inline graphic is not an isomorphic pair. The lemma follows. □

As proved in [14], the network embedding algorithms can be unified into the factorization of the affinity matrix. Therefore, nodes with the same first-hop neighborhood have the same embedding when the decomposition ends.

Based on Lemma 2, we define the isomorphic retaining score as follows.

Definition 5

( Isomorphic Retaining Score). The isomorphic retaining score, denoted as Inline graphic, is the summation of the cosine similarity over every isomorphic pair in Inline graphic, Inline graphic. Specifically,

$S(f^t) = \frac{1}{|P^t|} \sum_{(v_i, v_j) \in P^t} \cos\!\big(\vec{f}^t_i, \vec{f}^t_j\big) \qquad (5.1)$

where Inline graphic is the cosine similarity between Inline graphic and Inline graphic, and Inline graphic is the set of isomorphic pairs in Inline graphic. In other words, the embeddings of any two nodes Inline graphic and Inline graphic with the same structure are more consistent with each other if Inline graphic is close to 1 [18]. Experimental results in the next section show that higher isomorphic retaining scores lead to better performance on 1) the AUC score for link prediction and 2) the Macro-F1 score for node classification.
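A small sketch of computing the score follows; normalizing by the number of pairs (so that a perfect score is 1) is an assumption consistent with the interpretation above.

```python
import numpy as np

def retaining_score(emb, pairs):
    """Average cosine similarity over the given isomorphic pairs; a value
    near 1 means structurally identical nodes get consistent embeddings."""
    if not pairs:
        return 1.0  # convention for an empty pair set
    sims = [float(np.dot(emb[u], emb[v]) /
                  (np.linalg.norm(emb[u]) * np.linalg.norm(emb[v])))
            for u, v in pairs]
    return float(np.mean(sims))

emb = {"a": np.array([0.3, 0.4]), "d": np.array([0.3, 0.4]),
       "b": np.array([0.9, 0.1]), "c": np.array([0.8, 0.2])}
score = retaining_score(emb, [("a", "d"), ("b", "c")])
```

Here the pair ("a", "d") has identical embeddings (similarity 1), while ("b", "c") is slightly misaligned, so the score falls just below 1.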

The following theorem proves that GMR retains the isomorphic structure better than other Skip-Gram-based approaches, e.g., [5, 13, 16], under edge insertion. Afterward, the time complexity analysis is presented.

Theorem 1

GMR outperforms other Skip-Gram-based models regarding the isomorphic retaining score under edge insertion after each update by gradient descent.

Proof:

Due to the space constraint, Theorem 1 is proved in the online version.6

□

Time Complexity. In GMR, the initialization of the addressing tree takes Inline graphic time. For each t, GMR first updates the embeddings of Inline graphic in Inline graphic time. After this, hierarchical addressing takes Inline graphic time to identify the affected nodes. Notice that it requires Inline graphic time to update the addressing tree. To update the affected nodes, the refreshing and percolation gates respectively involve O(1) and Inline graphic time for one affected node, where Inline graphic is the maximum node degree of the network, so updating all the affected nodes requires Inline graphic. The overall time complexity of GMR is therefore Inline graphic, while retraining the whole network requires Inline graphic time at each timestamp. Since k is a small constant, Inline graphic, and Inline graphic, GMR is faster than retraining.

Experiments

To evaluate the effectiveness and efficiency of GMR, we compare GMR with the state-of-the-art methods on two tasks, i.e., link prediction and node classification. For the baselines, we compare GMR with 1) Full, which updates the whole network with DeepWalk [13]; 2) Change [3], which takes only the changed part as the samples with DeepWalk;7 3) GraphSAGE [6], which derives the embeddings from graph inductive learning; 4) SDNE [20], which extends the auto-encoder model to generate the embeddings of new nodes from the embeddings of neighbors; 5) CTDNE [12], which performs the biased random walk on the dynamic network;8 6) DNE [3], which updates only one affected node; 7) SLA [10], which handles only node/edge insertion; and 8) DHPE [22], which is an SVD-based method built on matrix perturbation theory. The default Inline graphic, h, k, d, batch size, and learning rate are 1, 0.8, 3, 64, 16, and 0.001, respectively. Stochastic gradient descent (SGD) with Adagrad is adopted to optimize the loss function.

Link Prediction

For link prediction, three real datasets [15] for streaming networks are evaluated: Facebook (63,731 nodes, 1,269,502 edges, and 736,675 timestamps), Yahoo (100,001 nodes, 3,179,718 edges, and 1,498,868 timestamps), and Epinions (131,828 nodes, 841,372 edges, and 939 timestamps).9 The concatenated embedding Inline graphic of pair Inline graphic is employed as the feature to predict the link by logistic regression.10

Table 1 reports the AUC [5], the isomorphic retaining score S in Eq. (5.1), and the running time of different methods.11 The results show that the proposed GMR achieves the best AUC among all streaming network embedding algorithms. Compared with the other state-of-the-art baselines, GMR improves the AUC by at least Inline graphic, Inline graphic and Inline graphic on Facebook, Yahoo and Epinions, respectively. Besides, the AUC of GMR is close to that of Full (Inline graphic less on Facebook, Inline graphic more on Yahoo, and Inline graphic less on Epinions), but the running time is only Inline graphic. Moreover, GraphSAGE performs relatively poorly since it cannot preserve the structural information without node features. The running time of SDNE is Inline graphic greater than that of GMR due to the processing of the deep structure, while the AUC of SDNE is at least Inline graphic less than that of GMR on all datasets.

Table 1.

Experiment results of link prediction.

Facebook Yahoo Epinions
AUC S sec AUC S sec AUC S sec
GMR 0.7943 0.94 3325 0.7674 0.93 3456 0.9294 0.92 3507
Full 0.8004 0.95 66412 0.7641 0.95 72197 0.9512 0.96 61133
Change 0.6926 0.79 2488 0.6326 0.82 2721 0.8233 0.84 2429
GraphSAGE 0.6569 0.77 4094 0.6441 0.79 5117 0.8158 0.85 4588
SDNE 0.6712 0.81 7078 0.6585 0.83 7622 0.8456 0.88 6799
CTDNE 0.7091 0.85 4322 0.6799 0.84 5136 0.8398 0.90 5097
DNE 0.7294 0.87 2699 0.6892 0.86 2843 0.8648 0.92 2613
SLA 0.7148 0.86 2398 0.6910 0.86 2438 0.8598 0.91 2569
DHPE 0.7350 0.88 3571 0.7102 0.88 3543 0.8458 0.90 3913

Compared to other streaming network embedding methods (e.g., DNE, SLA, and DHPE), GMR achieves at least Inline graphic improvement because their embeddings are updated without considering the global topology. In contrast, GMR selects the affected nodes by globally structure-aware hierarchical addressing, and the selected nodes are not restricted to nearby nodes. Furthermore, GMR outperforms the baselines regarding the isomorphic retaining score since it percolates the embeddings to preserve the structural information. Note that the isomorphic retaining score S is highly correlated with the AUC (correlation coefficient 0.92), demonstrating that it is indeed crucial to ensure the embedding consistency for the nodes with similar structures.

Node Classification

For node classification, we compare different approaches on BlogCatalog [16] (10,132 nodes, 333,983 edges, and 39 classes), Wiki [5] (2,405 nodes, 17,981 edges, and 19 classes), and DBLP [22] (101,253 nodes, 223,810 edges, 48 timestamps, and 4 classes). DBLP is a real streaming network constructed by extracting the paper citation network of four research areas from 1970 to 2017. BlogCatalog and Wiki are adopted in previous research [3] to generate the streaming networks.12 The learned embeddings are employed to classify the nodes according to the labels. Cross-entropy is adopted in the loss function for classification with logistic regression. We randomly sample Inline graphic of labels for training and Inline graphic of labels for testing, and the average results from 50 runs are reported.13 Table 2 demonstrates that GMR outperforms Change by Inline graphic regarding Macro-F1 [13], and it is close to Full but with an Inline graphic speed-up. The Macro-F1 scores of GraphSAGE and SDNE are at least Inline graphic worse than that of GMR, indicating that they cannot adequately handle multi-type changes in dynamic networks. Moreover, GMR achieves a greater improvement on BlogCatalog than on DBLP because the density (i.e., the average degree) of BlogCatalog is larger, enabling the hierarchical addressing of GMR to exploit more structural information for updating multiple nodes. For DBLP, GMR also achieves performance close to Full.

Table 2.

Experiment results of node classification.

(F1: Macro-F1; S: isomorphic retaining score; sec: running time in seconds)

Method      BlogCatalog            Wiki                  DBLP
            F1     S    sec        F1     S    sec       F1     S    sec
GMR         0.2059 0.90 1998       0.4945 0.92 199       0.7619 0.93 7638
Full        0.2214 0.91 37214      0.5288 0.93 3811      0.7727 0.94 149451
Change      0.1651 0.71 1237       0.3597 0.79 122       0.6841 0.86 5976
GraphSAGE   0.1558 0.81 2494       0.3419 0.82 173       0.6766 0.86 11410
SDNE        0.1723 0.83 2795       0.3438 0.84 266       0.6914 0.87 16847
CTDNE       0.1808 0.84 2923       0.4013 0.85 301       0.7171 0.88 9115
DNE         0.1848 0.86 1547       0.4187 0.86 141       0.7302 0.90 6521
SLA         0.1899 0.87 1399       0.3998 0.85 149       0.7110 0.88 6193
DHPE        0.1877 0.87 2047       0.4204 0.86 215       0.7311 0.90 8159

It is worth noting that the isomorphic retaining score S also correlates positively with Macro-F1. We further investigate the percentage of isomorphic pairs that share the same label in each dataset: Inline graphic, Inline graphic, and Inline graphic of isomorphic pairs share the same labels on BlogCatalog, Wiki, and DBLP, respectively. It is therefore crucial to maintain the consistency between isomorphic pairs, since similar embeddings for an isomorphic pair tend to be classified with the same label.
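The label-agreement statistic above can be illustrated with a simple proxy. This sketch is not the paper's exact notion of isomorphic pairs: it treats nodes with identical neighbor sets as structurally equivalent, and all names (`label_agreement`, the toy graph) are illustrative.

```python
from itertools import combinations

def label_agreement(adj, labels):
    """Fraction of structurally equivalent node pairs (identical neighbor
    sets, a simple proxy for local isomorphism) sharing the same label."""
    groups = {}
    for v, nbrs in adj.items():
        groups.setdefault(frozenset(nbrs), []).append(v)
    same = total = 0
    for members in groups.values():
        for u, v in combinations(members, 2):
            total += 1
            same += labels[u] == labels[v]
    return same / total if total else 0.0

# toy graph: nodes 0 and 1 both attach only to node 2 and share label "a"
adj = {0: {2}, 1: {2}, 2: {0, 1}}
labels = {0: "a", 1: "a", 2: "b"}
print(label_agreement(adj, labels))  # → 1.0
```

A high value of this statistic is exactly why consistent embeddings for isomorphic pairs translate into better classification: the classifier maps similar vectors to the same label.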

Conclusion

In this paper, we propose GMR for streaming network embedding, featuring hierarchical addressing, a refreshing gate, and a percolation gate to preserve structural information and consistency. We also prove that the embeddings generated by GMR are more consistent under insertion than those of existing network embedding schemes. The experimental results demonstrate that GMR outperforms the state-of-the-art methods in both link prediction and node classification. Moreover, multi-type updates with beam search improve GMR in both the task-oriented scores and the isomorphic retaining score. Future work will extend GMR to support multiple relation types in knowledge graphs.

Footnotes

1

In streaming data mining [10], the incoming data stream, instead of the whole dataset, is employed to update the previous mining results efficiently.

2

Equation (4.3) is introduced as the general form of the Skip-Gram model under multi-type changes; GMR only samples the insertions/deletions from the streaming network Inline graphic at timestamp t to update the embeddings.

3

The new node embedding is initialized by the average of its neighborhood [10] and then updated by maximizing Eq.(4.3).
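Footnote 3's initialization can be written in one line of numpy. A minimal sketch with hypothetical variable names:

```python
import numpy as np

def init_new_node(neighbor_embs):
    """Initialize a newly arrived node as the mean of its neighbors'
    embeddings (footnote 3); the vector is then refined by maximizing
    the objective in Eq. (4.3)."""
    return np.mean(neighbor_embs, axis=0)

# a new node attached to two existing nodes
nbrs = np.array([[1.0, 0.0], [0.0, 1.0]])
print(init_new_node(nbrs))  # → [0.5 0.5]
```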

4

For each node, the k search beams iteratively examine their child nodes (e.g., 2k nodes in total) and keep only the top-k children with the highest similarity in a queue. Any leaf node reached by a beam is included in the top-k results.
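The beam search in footnote 4 can be sketched as follows. The tree layout (dicts with `"vec"` and optional `"children"`), the dot-product similarity, and all node names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def beam_search(root, query, k, sim):
    """Top-k beam search over a tree: at each level the beams expand
    their children (up to 2k candidates for two children per beam),
    keep the k most similar internal nodes, and collect every leaf a
    beam reaches; the k most similar leaves are returned."""
    beams, results = [root], []
    while beams:
        children = [c for node in beams for c in node.get("children", [])]
        results += [c for c in children if not c.get("children")]
        inner = [c for c in children if c.get("children")]
        inner.sort(key=lambda c: sim(c["vec"], query), reverse=True)
        beams = inner[:k]
    results.sort(key=lambda c: sim(c["vec"], query), reverse=True)
    return [r["id"] for r in results[:k]]

# toy two-level tree with four leaves
leafA = {"id": "A", "vec": np.array([1.0, 0.0])}
leafB = {"id": "B", "vec": np.array([0.8, 0.2])}
leafC = {"id": "C", "vec": np.array([0.0, 1.0])}
leafD = {"id": "D", "vec": np.array([0.1, 0.9])}
root = {"id": "root", "vec": np.array([0.5, 0.5]), "children": [
    {"id": "L", "vec": np.array([0.9, 0.1]), "children": [leafA, leafB]},
    {"id": "R", "vec": np.array([0.1, 0.9]), "children": [leafC, leafD]},
]}
dot = lambda a, b: float(a @ b)
print(beam_search(root, np.array([1.0, 0.0]), 2, dot))  # → ['A', 'B']
```

Maintaining only k beams per level keeps the search cost logarithmic in the number of leaves rather than linear, which is the point of hierarchical addressing.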

5

Inline graphic represents the set of first-hop neighborhoods.

6

The online version is presented in https://bit.ly/2UUeO7B.

7

The setting follows OpenNE: https://github.com/thunlp/OpenNE.

8

For a fair comparison, SDNE takes only the adjacency matrix of the current stream as the input feature, and CTDNE samples only from the latest 50 streams instead of the whole network.

9

Facebook and Epinions contain both edge insertions and deletions, where "i j -1 t" denotes removing edge (i, j) at timestamp t. Yahoo contains no deletions since it is a message network.

10

For link prediction, at time t, we predict the new edges for time Inline graphic (excluding the edges incident to the nodes arriving at time Inline graphic).

11

For Full, due to the high computational complexity of retraining the network at every timestamp, we partition all timestamps into 50 parts [23] and aggregate the network changes within each part.

12

The streaming network Inline graphic is generated from the original network by first sampling half of the original network as Inline graphic. For each timestamp t, Inline graphic is constructed by sampling 200 edges (not in Inline graphic) from the original network and adding them (with their terminal nodes) to Inline graphic, while 100 edges of Inline graphic are deleted.
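The stream-generation procedure of footnote 12 can be sketched as below. The function name and the toy counts are placeholders; the paper uses 200 insertions and 100 deletions per timestamp.

```python
import random

def make_stream(edges, steps, n_add=200, n_del=100, seed=0):
    """Turn a static edge list into a stream: start from half of the
    edges, then at each timestamp add n_add unseen edges from the
    original network and delete n_del edges of the current snapshot."""
    rng = random.Random(seed)
    edges = list(edges)
    rng.shuffle(edges)
    half = len(edges) // 2
    current, pool = set(edges[:half]), edges[half:]
    snapshots = [set(current)]
    for _ in range(steps):
        add, pool = pool[:n_add], pool[n_add:]   # unseen edges to insert
        current |= set(add)
        dels = rng.sample(sorted(current), min(n_del, len(current)))
        current -= set(dels)
        snapshots.append(set(current))
    return snapshots

# toy usage on a complete graph with 30 nodes (435 edges)
edges = [(i, j) for i in range(30) for j in range(i + 1, 30)]
snaps = make_stream(edges, steps=1, n_add=20, n_del=10)
print(len(snaps[1]) - len(snaps[0]))  # → 10
```

Each snapshot thus grows by the net difference between insertions and deletions, mimicking a network that evolves while old edges expire.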

13

For a new node, only its embedding derived after the arrival is employed in the testing.

Contributor Information

Hady W. Lauw, Email: hadywlauw@smu.edu.sg

Raymond Chi-Wing Wong, Email: raywong@cse.ust.hk.

Alexandros Ntoulas, Email: antoulas@di.uoa.gr.

Ee-Peng Lim, Email: eplim@smu.edu.sg.

See-Kiong Ng, Email: seekiong@nus.edu.sg.

Sinno Jialin Pan, Email: sinnopan@ntu.edu.sg.

Hsi-Wen Chen, Email: r06921045@ntu.edu.tw.

Hong-Han Shuai, Email: hhshuai@nctu.edu.tw.

Sheng-De Wang, Email: sdwang@ntu.edu.tw.

De-Nian Yang, Email: dnyang@iis.sinica.edu.tw.

References

  • 1.Belkin, M., Niyogi, P.: Laplacian eigenmaps and spectral techniques for embedding and clustering. In: Advances in NIPS, pp. 585–591 (2002)
  • 2.Chen, H., Perozzi, B., Hu, Y., Skiena, S.: HARP: hierarchical representation learning for networks. In: Thirty-Second AAAI (2018)
  • 3.Du, L., Wang, Y., Song, G., Lu, Z., Wang, J.: Dynamic network embedding: an extended approach for skip-gram based network embedding. In: IJCAI, pp. 2086–2092 (2018)
  • 4.Graves, A., Wayne, G., Danihelka, I.: Neural turing machines. arXiv preprint arXiv:1410.5401 (2014)
  • 5.Grover, A., Leskovec, J.: node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD, pp. 855–864 (2016)
  • 6.Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. In: Advances in NIPS, pp. 1024–1034 (2017)
  • 7.Jaeger, S., Fulle, S., Turk, S.: Mol2vec: unsupervised machine learning approach with chemical intuition. J. Chem. Inf. Model. 58(1), 27–35 (2018)
  • 8.Leskovec, J., Krevl, A.: SNAP datasets: Stanford large network dataset collection, June 2014
  • 9.Li, J., Chen, C., Tong, H., Liu, H.: Multi-layered network embedding. In: Proceedings of the 2018 SIAM ICDM, pp. 684–692 (2018)
  • 10.Liu, X., Hsieh, P.C., Duffield, N., Chen, R., Xie, M., Wen, X.: Real-time streaming graph embedding through local actions. In: Proceedings of the WWW, pp. 285–293 (2019)
  • 11.Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in NIPS, pp. 3111–3119 (2013)
  • 12.Nguyen, G.H., Lee, J.B., Rossi, R.A., Ahmed, N.K., Koh, E., Kim, S.: Continuous-time dynamic network embeddings. In: Proceedings of the WWW, pp. 969–976 (2018)
  • 13.Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD, pp. 701–710 (2014)
  • 14.Qiu, J., Dong, Y., Ma, H., Li, J., Wang, K., Tang, J.: Network embedding as matrix factorization: unifying DeepWalk, LINE, PTE, and node2vec. In: Proceedings of the WSDM, pp. 459–467 (2018)
  • 15.Rossi, R., Ahmed, N.: The network data repository with interactive graph analytics and visualization. In: Twenty-Ninth AAAI (2015)
  • 16.Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: LINE: large-scale information network embedding. In: Proceedings of the WWW, pp. 1067–1077 (2015)
  • 17.Tenenbaum, J.B., De Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)
  • 18.Tsitsulin, A., Mottin, D., Karras, P., Müller, E.: VERSE: versatile graph embeddings from similarity measures. In: Proceedings of the WWW, pp. 539–548 (2018)
  • 19.Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., Bengio, Y.: Graph attention networks. arXiv preprint arXiv:1710.10903 (2017)
  • 20.Wang, D., Cui, P., Zhu, W.: Structural deep network embedding. In: Proceedings of the 22nd ACM SIGKDD, pp. 1225–1234 (2016)
  • 21.Zhou, L., Yang, Y., Ren, X., Wu, F., Zhuang, Y.: Dynamic network embedding by modeling triadic closure process. In: Thirty-Second AAAI (2018)
  • 22.Zhu, D., Cui, P., Zhang, Z., Pei, J., Zhu, W.: High-order proximity preserved embedding for dynamic networks. Trans. Knowl. Data Eng. 30, 2134–2144 (2018)
  • 23.Zoghi, M., Tunys, T., Ghavamzadeh, M., Kveton, B., Szepesvari, C., Wen, Z.: Online learning to rank in stochastic click models. In: Proceedings of the ICML, pp. 4199–4208 (2017)

Articles from Advances in Knowledge Discovery and Data Mining are provided here courtesy of Nature Publishing Group
