Abstract
Static network embedding has been widely studied to convert sparse structural information into a dense latent space. However, the majority of real networks are continuously evolving, and deriving the whole embedding for every snapshot is computationally intensive. To avoid recomputing the embedding over time, we explore streaming network embedding with two goals: 1) to efficiently identify the nodes required to update the embeddings under multi-type network changes, and 2) to carefully revise the embeddings to maintain transduction over different parts of the network. Specifically, we propose a new representation learning framework, named Graph Memory Refreshing (GMR), to efficiently preserve both the global structural information and the consistency of embeddings. We prove that GMR maintains the consistency of embeddings (crucial for network analysis) for isomorphic structures better than existing approaches. Experimental results demonstrate that GMR outperforms the baselines while requiring much less time.
Keywords: Network embedding, Streaming data mining
Introduction
Low-dimensional vector representations of nodes in large-scale networks have been widely applied to a variety of domains, such as social media [13], molecular structures [7], and transportation [9]. Previous approaches, e.g., DeepWalk [13], LINE [16], and SDNE [20], are designed to reduce the sparse structural information to a dense latent space for node classification [13], link prediction [16], and network visualization [21]. However, the above embedding schemes were not designed for evolving networks. Current popular networks tend to evolve over time, e.g., the average number of friends increased from 155 in 2016 to 338 in 2018 [8]. Ephemeral social networks, like Snapchat for short-term conversations, may disappear within weeks. Moreover, retraining the whole embedding for each snapshot is computationally intensive for a massive network. Therefore, streaming network embedding is a desirable option to quickly update and generate new embeddings in a minimum amount of time.
Different from dynamic network embeddings [12, 21] that analyze a sequence of networks to capture temporal patterns, streaming network embedding aims to update the embedding from the changed part of the network to find the new embedding. Efficient streaming network embedding faces the following four main challenges. 1) Multi-type change. Dynamic changes of networks with insertions and deletions of nodes and edges are usually frequent and complex. It is thus important to derive the new embedding in minimal time to reflect the new network status in a timely manner. 2) Evaluation of affected nodes. Updating the embeddings of only the nodes neighboring the changed part ignores the ripple effect on the remaining nodes. It is crucial to identify the nodes required to update the embeddings and ensure that the nodes with similar structures share similar embeddings. 3) Transduction. When a network significantly changes, it is difficult to keep the local proximity between the changed part and the remaining part of the network. It is also important to reflect the change in the global structure. 4) Quality guarantee. For streaming embeddings based on neural networks (usually regarded as black boxes), it is challenging to provide theoretical guarantees on the embedding quality.
To effectively address the above challenges, this paper proposes a new representation learning approach, named Graph Memory Refreshing (GMR). GMR first derives the new embedding of the changed part by decomposing the loss function of Skip-Gram to support multi-type changes. It carefully evaluates the ripple-effect area and ensures correctness by proposing a globally structure-aware selection strategy, named hierarchical addressing, to efficiently identify and update the affected nodes with beam search to avoid the overfitting problem. To effectively support streaming data, our idea is to interpret the update of embeddings as a memory network with two controllers, a refreshing gate and a percolation gate, to tailor the embeddings from the structural aspect and maintain the transduction. GMR then updates the embeddings according to the streaming information of the new network and the stored features (i.e., memory) of the current network to avoid recomputing the embedding of the whole network. Moreover, GMR aims to both preserve the global structural information and maintain the embeddings of isomorphic structures, i.e., ensuring that the nodes with similar local structures share similar embeddings. This property is essential to ensure the correctness of network analysis based on network embeddings [18]. We theoretically prove that GMR preserves the consistency of embeddings for isomorphic structures better than the existing approaches. The contributions of this paper are summarized as follows.
GMR explores streaming network embedding with quality guarantees. The hierarchical addressing, refreshing gate, and percolation gate efficiently find and update the affected nodes under multi-type changes.
We prove that GMR embedding preserves isomorphic structures better than the existing approaches. According to our literature review, this is the first theoretical analysis for streaming network embedding.
Experimental results show that GMR outperforms the baselines for link prediction and node classification with a much shorter running time.
Related Work
Static network embedding has attracted a wide range of attention. Laplacian Eigenmaps [1] and IsoMap [17] first constructed the adjacency matrix and then solved a matrix factorization, but the adjacency matrix was not scalable to massive networks. After Skip-Gram [11] was demonstrated to be powerful for representation learning, DeepWalk [13] and node2vec [5] employed random walks to learn network embeddings, while LINE [16] and SDNE [20] preserved the first-order and second-order proximity. GraphSAGE [6] and GAT [19] generated node representations in an inductive manner by mapping and aggregating node features from the neighborhood.
In addition, a recent line of research proposed to learn the embeddings from a sequence of networks over time to find temporal behaviors [12, 21]. However, these approaches focused on capturing the temporal changes rather than efficiency, since they recomputed the embeddings of the whole network instead of updating only the changed part. Another line of recent research studied dynamic embedding without retraining. However, the SVD-based approach [22] is difficult to scale to large networks according to [5]. Besides, [10] supports only edge insertion and ignores edge deletion, and the consistency of the embeddings for globally isomorphic structures is not ensured. Compared with the above research and [3], the proposed GMR is the only one that provides a theoretical guarantee on the embedding quality (detailed later). It also more accurately preserves both the global structural information and the consistency of the embeddings.
Problem Formulation
In this section, we present the definitions for streaming network embeddings.
Definition 1
(Streaming Networks). A dynamic network $\mathcal{G}$ is a sequence of networks $\{G^1, G^2, \ldots\}$ over time, where $G^t = (V^t, E^t)$ is the network snapshot at timestamp $t$. $\Delta G^t = (\Delta V^t, \Delta E^t)$ represents the streaming network with the changed part, where $\Delta V^t$ and $\Delta E^t$ are the sets of vertices and edges inserted or deleted between $t$ and $t+1$.
Definition 2
(Streaming Network Embeddings). Let $\Phi^t$ denote the streaming network embedding that preserves the structural property of $G^t$ at timestamp $t$. The streaming network embeddings are derived by a function $f$, where $f$ updates the node embedding $\Phi^{t+1}$ at timestamp $t+1$ according to $\Phi^t$ and $\Delta G^t$, i.e., $\Phi^{t+1} = f(\Phi^t, \Delta G^t)$, where $\Phi^t: V^t \to \mathbb{R}^d$.
In other words, the inputs of the streaming network function are the embedding at the current time and the changed part of the network. In contrast, for [12, 21], given a dynamic network $\mathcal{G} = \{G^1, G^2, \ldots\}$, the embedding is derived by a sequence of functions $\{g^1, g^2, \ldots\}$, where $g^t$ maps each node $v \in V^t$ to its $d$-dimensional embedding at timestamp $t$, i.e., $\Phi^t = g^t(G^t)$. Therefore, the inputs are the whole networks at the current and next time. In the following, we present the problem studied in this paper.
Definition 3
(Quality-aware Multi-type Streaming Network Embeddings). Given a streaming network with $\Delta V^t$ and $\Delta E^t$ as the sets of vertices and edges inserted or deleted between $t$ and $t+1$, the goal is to find the streaming network embedding and derive the corresponding embedding quality to ensure that the nodes with similar structures share similar embeddings.
Later in Sect. 5, we formally present and theoretically analyze the quality of the embedding with a new metric, named the isomorphic retaining score. Moreover, we prove that the proposed GMR better preserves the structures than other state-of-the-art methods in Theorem 1.
Graph Memory Refreshing
In this section, we propose Graph Memory Refreshing (GMR) to support multi-type embedding updates, to identify the affected nodes required to update the embeddings by hierarchical addressing, and to ensure that the nodes with similar structures share similar embeddings. To effectively support streaming data, we leverage the controllers (refreshing and percolation gates) of memory networks [4] to refresh the memory (update the embedding) according to the current state (the current embedding) and new input (streaming network).
Multi-type Embedding Updating
For each node $v_i \in V$, the Skip-Gram model predicts the context nodes $C(v_i)$ and maximizes the log probability,

$$\max \sum_{v_i \in V} \sum_{v_j \in C(v_i)} \log p(v_j \mid v_i). \tag{4.1}$$
However, it is computationally intensive to derive the above probabilities for all nodes. Therefore, the probabilities are approximated by negative sampling [11],

$$\log p(v_j \mid v_i) \approx \log \sigma\big(\Phi(v_j)^{\top}\Phi(v_i)\big) + \sum_{n=1}^{N} \mathbb{E}_{v_n \sim P_n(v)}\Big[\log \sigma\big(-\Phi(v_n)^{\top}\Phi(v_i)\big)\Big], \tag{4.2}$$
where $\sigma(\cdot)$ is the sigmoid function, $\Phi(v_i)$ and $\Phi(v_j)$ are respectively the embedding vectors of $v_i$ and $v_j$, and $P_n(v)$ is the noise distribution for negative sampling. The two terms respectively model the observed neighborhoods and the negative samples (i.e., node pairs without an edge) drawn from distribution $P_n(v)$. However, Eq. (4.2) focuses on only the edge insertion. To support the edge deletion, the second part in Eq. (4.2) is revised to consider both the unpaired negative samples and the deletions as follows,

$$\lambda \sum_{(v_i, v_j) \in D} \log \sigma\big(-\Phi(v_j)^{\top}\Phi(v_i)\big) + \sum_{n=1}^{N} \mathbb{E}_{v_n \sim P_n(v)}\Big[\log \sigma\big(-\Phi(v_n)^{\top}\Phi(v_i)\big)\Big], \tag{4.3}$$
where $D$ is the set of deleted edges, and $\lambda$ is required to be set greater than 1 because the samples from $D$ usually provide more information than the unpaired negative samples drawn from $P_n(v)$. Note that node deletion is handled by removing all incident edges of the node, while adding a node with new edges is regarded as edge insertion.
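A minimal sketch of the revised objective in Eq. (4.3), assuming dot-product similarity and a hypothetical noise sampler; the function and parameter names are illustrative, not the authors' implementation.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def skipgram_stream_loss(phi, center, contexts, deleted, noise_nodes,
                         n_neg=2, lam=2.0, rng=random):
    """Sketch of the Eq. (4.3)-style objective for one center node.

    phi:         dict node -> embedding vector
    contexts:    observed neighbors of `center` (positive pairs)
    deleted:     nodes whose edge to `center` was removed, weighted by lam > 1
    noise_nodes: candidate pool for unpaired negative samples
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    loss = 0.0
    for ctx in contexts:                  # observed neighborhoods (Eq. 4.2 term)
        loss += math.log(sigmoid(dot(phi[ctx], phi[center])))
    for d in deleted:                     # deleted edges, weighted by lam > 1
        loss += lam * math.log(sigmoid(-dot(phi[d], phi[center])))
    for _ in range(n_neg):                # unpaired negative samples from P_n(v)
        v = rng.choice(noise_nodes)
        loss += math.log(sigmoid(-dot(phi[v], phi[center])))
    return loss                           # maximized during training
```

Setting `lam` above 1 mirrors the text: a deleted edge is a stronger negative signal than a randomly drawn unpaired node.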
Hierarchical Addressing
For streaming network embedding, previous computationally intensive approaches [4] find the embeddings of all nodes by global addressing. A more efficient way is updating only the neighboring nodes of the changed part with local addressing [10]. However, the ripple-effect area usually has an arbitrary shape (i.e., including not only the neighboring nodes). Therefore, instead of extracting the neighboring nodes with heuristics, hierarchical addressing systematically transforms the original network into a search tree that is aware of the global structure for the efficient identification of the affected nodes to update their embeddings.
Hierarchical addressing has the following advantages. 1) Efficiency. It can be regarded as a series of binary classifications (on a tree), whereas global addressing and local addressing belong to multi-class classification (on the candidate list). Therefore, the time complexity to consider each node in $\Delta V^t$ is reduced from $O(|V|)$ (i.e., pairwise comparison) to $O(k \log |V|)$, where $k$ is the number of search beams (explained later). 2) Topology-awareness. It carefully examines the graph structure to evaluate the proximity and maintain the isomorphic structure, i.e., ensuring that the nodes with similar structures share similar embeddings. This property is essential for the correctness of network analysis with network embeddings [18].
Specifically, hierarchical addressing first exploits graph coarsening to build an addressing tree for the efficient search of the affected nodes. Graph coarsening includes both first-hop and second-hop collapsing: first-hop collapsing preserves the first-order proximity by merging two adjacent nodes into a supernode; second-hop collapsing aggregates the nodes with a common neighbor into a supernode, where the embedding of the supernode is averaged from its child nodes [2]. Second-hop collapsing is prioritized because it can effectively compress the network into a smaller tree.
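One coarsening pass can be sketched as follows, with second-hop collapsing tried before first-hop collapsing as the text prioritizes, and the supernode embedding averaged from its children; the pair-picking heuristics here are simplified assumptions, not the paper's exact procedure.

```python
def average(vecs):
    """Element-wise average of a list of equal-length vectors."""
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def coarsen_once(adj, emb):
    """Merge one node pair into a supernode and return it.

    Second-hop collapsing (two nodes sharing a common neighbor) is tried
    first; first-hop collapsing (two adjacent nodes) is the fallback.
    Returns ((u, w), supernode_embedding) or None if nothing can merge.
    """
    nodes = sorted(adj)
    # Second-hop: u and w both neighbor v, so they share a common neighbor.
    for v in nodes:
        nbrs = sorted(adj[v])
        if len(nbrs) >= 2:
            u, w = nbrs[0], nbrs[1]
            return (u, w), average([emb[u], emb[w]])
    # First-hop: any adjacent pair.
    for u in nodes:
        for w in sorted(adj[u]):
            return (u, w), average([emb[u], emb[w]])
    return None
```

Repeating this pass until a single supernode remains yields the addressing tree, with the original nodes as leaves.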
The network is accordingly transformed into an addressing tree with each node $v \in V$ as a leaf node. Afterward, for each node $v_i \in \Delta V^t$, we search for the node sharing the highest similarity with $v_i$ as the first affected node for $v_i$ by comparing the cosine similarity [4] along the addressing tree. For each node in the tree, if the left child node shares a greater similarity to $v_i$, the search continues on the left subtree; otherwise, it searches the right subtree. The similarity search ends when it reaches the leaf node with the highest similarity to $v_i$, and any node in $V$ (not only the neighbors of $v_i$) is thereby allowed to be extracted. In other words, hierarchical addressing enables GMR to extract affected nodes located in different parts of the network (not necessarily close to $v_i$), whereas previous approaches [3, 10, 21] update only the neighboring nodes of $v_i$. Afterward, hierarchical addressing extracts the top-1 result for all nodes in $\Delta V^t$ as the initially affected nodes (more will be included later), where the nodes with a similarity smaller than a threshold $h$ are filtered out. To prevent over-fitting in a local minimum, hierarchical addressing can also extract the top-$k$ results at each iteration with the beam search.
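The top-$k$ descent can be sketched as a beam search over the addressing tree; the tuple-based tree encoding, function name, and leaf convention (both children `None`) are assumptions for illustration.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

def hierarchical_address(root, query, k=2, h=0.0):
    """Beam search down the addressing tree for the leaves most similar to query.

    Tree nodes are tuples (name, embedding, left, right); left/right are None
    at leaves. At each level only the k most similar children survive, and
    leaves below the similarity threshold h are filtered out, mirroring the
    cutoff in the text.
    """
    beams, hits = [root], []
    while beams:
        children = []
        for name, emb, left, right in beams:
            if left is None and right is None:          # reached a leaf
                sim = cos_sim(emb, query)
                if sim >= h:
                    hits.append((sim, name))
            else:
                children.extend([left, right])
        # keep only the k most similar children as the next frontier
        children.sort(key=lambda n: cos_sim(n[1], query), reverse=True)
        beams = children[:k]
    return [name for _, name in sorted(hits, reverse=True)]
```

With $k = 1$ this degenerates to the greedy left/right descent described above; larger $k$ trades a little extra work for robustness against locally misleading supernode embeddings.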
Figure 1 presents an example of hierarchical addressing with the dimension of the embeddings as 2. At timestamp $t$ (Fig. 1(a)), the addressing tree is constructed by first merging two nodes with a common neighbor into a supernode through second-hop collapsing, where the embedding of the supernode is averaged from its children. First-hop collapsing then merges the adjacent supernodes upward until the root of the tree is formed. At $t+1$ (Fig. 1(b)), when a new node with embedding (0.3, 0.2) is linked to the network, we identify the affected nodes with beam search ($k = 2$) and start from the root. First, the two subtrees below the root are inserted into the search queue with size 2 (since $k = 2$), to compare their similarity with the new node. Both are then popped from the queue because two of their child nodes have higher similarity, i.e., the top-2 results (0.78 and 0.98), compared with 0.73.
Fig. 1.
Example of hierarchical addressing.
Refresh and Percolate
After identifying the nodes required to update the embeddings by hierarchical addressing, a simple approach is to update the embeddings of those affected nodes with a constant shift [6, 20]. However, a streaming network with a topology change on only a subset of nodes usually leads to different shifts for the nodes in distinct locations. Moreover, updating only the nodes extracted from hierarchical addressing is insufficient to ensure consistency of embeddings for the nodes with similar structures when the embeddings are tailored independently.
To effectively support streaming data, inspired by the gating mechanism in GRU [4], we parameterize the update of the embedding according to the current embedding and the incoming streaming network. Specifically, GMR decomposes the update procedure into two controller gates: a refreshing gate and a percolation gate. For each node $v_j$ selected in hierarchical addressing for each $v_i \in \Delta V^t$, the refreshing gate first updates the embedding of $v_j$ according to the new embedding of $v_i$, and the percolation gate then updates the embedding of every neighbor $v_l$ of $v_j$ from the new embedding of $v_j$. The refreshing gate quantifies the embedding update for $v_j$ from an incoming stream (i.e., a one-to-one update), while the percolation gate transduces the embedding of $v_j$ to its neighborhoods (i.e., a one-to-many update) to better preserve the local structure. The two gates are the cornerstones to maintain isomorphic structure, as proved later in Theorem 1.
To update the embedding of $v_j$, i.e., updating $\Phi(v_j)$ from $\Phi(v_i)$, we first define a shared function $a(\cdot, \cdot)$ to find the refreshing coefficient $r_{ij}$, which represents the correlation between the embedding of $v_j$ and the new embedding of $v_i$, i.e., $r_{ij} = a(\Phi(v_i), \Phi(v_j))$. The refreshing gate selects the correlation function [19] as the shared function $a$ to extract the residual relation [19] between the two embeddings, instead of directly adopting a constant shift as was done in previous work. Here $W_r$ is a shift projection, and $r_{ij}$ is derived by $r_{ij} = W_r[\Phi(v_i) \,\|\, \Phi(v_j)]$, where $\|$ is the vector concatenation operation. After this, we regulate the refreshing coefficient $r_{ij}$ into $[0, 1]$ by a sigmoid function $\sigma$ to provide a non-linear transformation. Therefore, $\sigma(r_{ij})$ quantifies the extent to which $v_i$ affects $v_j$,

$$\Phi(v_j) \leftarrow \big(1 - \sigma(r_{ij})\big)\,\Phi(v_j) + \sigma(r_{ij})\,\Phi(v_i). \tag{4.4}$$
Thereafter, the percolation gate revises the embeddings of the neighbor nodes of $v_j$ to ensure the consistency of the embeddings for the nodes with similar structures. The percolation gate learns another sharable vector $W_p$ and finds the percolation coefficient $p_{jl} = W_p[\Phi(v_j) \,\|\, \Phi(v_l)]$ to quantify the extent to which $v_j$ affects its neighbor $v_l$. Similarly, we regulate $p_{jl}$ by $\sigma$ to update $\Phi(v_l)$ as follows,

$$\Phi(v_l) \leftarrow \big(1 - \sigma(p_{jl})\big)\,\Phi(v_l) + \sigma(p_{jl})\,\Phi(v_j). \tag{4.5}$$
Therefore, when the refreshing and percolation gates are 0, the streaming network is ignored. In contrast, when both gates become 1, the previous snapshot embedding is dropped accordingly. In summary, the refreshing and percolation gates act as decision makers that learn the impact of the streaming network on different nodes. For the percolation gate, when node $v_j$ is updated, the percolation gate tailors the embedding of each neighbor $v_l$ of $v_j$ by evaluating the similarity of $v_j$ and $v_l$ according to the embeddings $\Phi(v_j)$ and $\Phi(v_l)$. If $v_j$ and $v_l$ share many common neighbors, the percolation value of $v_l$ will increase to draw $v_j$ and $v_l$ closer to each other. The idea is similar for the refreshing gate. Note that the two gates are both differentiable and can be trained in an unsupervised setting by maximizing the objective in Eq. (4.3). The unsupervised loss can also be replaced or augmented by a task-oriented objective (e.g., cross-entropy loss) when labels are provided. We alternately update the embeddings and the correlation parameters (i.e., $W_r$ and $W_p$) to achieve better convergence.
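The two gated updates can be sketched as follows, under the reading that each gate interpolates between the old embedding and the incoming one (gate 0 ignores the stream, gate 1 drops the old embedding); the weight vectors `w_r` and `w_p` are placeholders standing in for the learned projections.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gate(w, u, v):
    """Gate coefficient: project the concatenation [u || v] with weights w,
    then squash into [0, 1] with the sigmoid."""
    concat = list(u) + list(v)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, concat)))

def refresh(phi, i, j, w_r):
    """Refreshing gate (Eq. (4.4) reading): one-to-one update of the
    affected node j from the stream node i."""
    g = gate(w_r, phi[i], phi[j])
    phi[j] = [(1 - g) * old + g * new for new, old in zip(phi[i], phi[j])]
    return g

def percolate(phi, j, neighbors, w_p):
    """Percolation gate (Eq. (4.5) reading): one-to-many update of
    j's neighbors from the new embedding of j."""
    coeffs = {}
    for l in neighbors:
        g = gate(w_p, phi[j], phi[l])
        phi[l] = [(1 - g) * old + g * new for new, old in zip(phi[j], phi[l])]
        coeffs[l] = g
    return coeffs
```

In GMR the weights would be trained against Eq. (4.3); here they are fixed only to show the data flow from stream node to affected node to neighborhood.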
Figure 2 illustrates an example of updating a node through the percolation gate. After the embedding of the node is updated from (0.8, 0.1) to (0.9, 0.1), GMR uses the percolation gate to transduce the embedding to its neighborhood nodes to preserve the local structure. Since one neighbor shares more common neighbors with the updated node than another neighbor (which shares none), the values of the percolation gate for the two neighbors are 0.8 and 0.5, respectively, and their embeddings are revised through the percolation gate accordingly. Therefore, the relative distance between nodes with similar structures can be maintained.
Fig. 2.
Example of percolation gate.
Theoretical Analysis
The quality of network embedding can be empirically evaluated through network analysis experiments, e.g., link prediction [16] and node classification [13], since network embedding is unsupervised learning without access to the ground truth. In contrast, when the network analysis task is unknown a priori, it is important to theoretically analyze the quality of network embedding. To achieve this goal, we first define isomorphic pairs and prove that the embeddings of isomorphic pairs are the same in GMR. This property has been regarded as a very important criterion to evaluate the quality of network embedding [18], because nodes with similar structures need to share similar embeddings. Moreover, the experimental results in Sect. 6 show that a higher quality leads to better performance on task-oriented metrics.
Definition 4
(Isomorphic Pair). Any two different nodes $v_i$ and $v_j$ form an isomorphic pair if the sets of their first-hop neighbors $N(v_i)$ and $N(v_j)$ are the same.
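Definition 4 reduces to a set comparison of first-hop neighborhoods; a minimal sketch over an adjacency-set graph representation (the representation is an assumption):

```python
def is_isomorphic_pair(adj, u, v):
    """Definition 4: two distinct nodes form an isomorphic pair iff their
    first-hop neighbor sets are identical."""
    return u != v and adj[u] == adj[v]
```

In the 4-cycle u-x-v-y, for instance, both (u, v) and (x, y) are isomorphic pairs, since each pair sees exactly the same neighbor set.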
Lemma 1
If $(v_i, v_j)$ and $(v_j, v_k)$ are both isomorphic pairs, $(v_i, v_k)$ is also an isomorphic pair.
Proof:
According to Definition 4, $(v_i, v_j)$ and $(v_j, v_k)$ are both isomorphic pairs, indicating that $N(v_i) = N(v_j)$ and $N(v_j) = N(v_k)$. Therefore, $N(v_i)$ is equal to $N(v_k)$, and thus $(v_i, v_k)$ is also an isomorphic pair. □
Lemma 2
The embeddings $\Phi(v_i)$ and $\Phi(v_j)$ are the same after GMR converges if and only if $(v_i, v_j)$ is an isomorphic pair.
Proof:
We first prove the sufficient condition. If $(v_i, v_j)$ is an isomorphic pair with $\Phi(v_i) \neq \Phi(v_j)$, the probability of $v_i$ to predict the context nodes is not equal to that of $v_j$ (Eq. (4.1)). Therefore, there exists a better solution that makes $\Phi(v_i)$ and $\Phi(v_j)$ equal, contradicting the condition that the algorithm has converged. For the necessary condition, if $\Phi(v_i) = \Phi(v_j)$ but $(v_i, v_j)$ is not an isomorphic pair, then since the probabilities are equal and the algorithm has converged, $N(v_i)$ should be identical to $N(v_j)$ by Eq. (4.1), contradicting that $(v_i, v_j)$ is not an isomorphic pair. The lemma follows. □
As proved in [14], the network embedding algorithms can be unified into the factorization of the affinity matrix. Therefore, nodes with the same first-hop neighborhood have the same embedding when the decomposition ends.
Based on Lemma 2, we define the isomorphic retaining score as follows.
Definition 5
(Isomorphic Retaining Score). The isomorphic retaining score, denoted as $S$, is the summation of the cosine similarity over every isomorphic pair in $G$, normalized by the number of isomorphic pairs. Specifically,

$$S = \frac{1}{|I|}\sum_{(v_i, v_j) \in I} \cos\big(\Phi(v_i), \Phi(v_j)\big), \tag{5.1}$$

where $\cos(\Phi(v_i), \Phi(v_j))$ is the cosine similarity between $\Phi(v_i)$ and $\Phi(v_j)$, and $I$ is the set of isomorphic pairs in $G$. In other words, the embeddings of any two nodes $v_i$ and $v_j$ with the same structure are more consistent with each other when $S$ is close to 1 [18]. Experiment results in the next section show that higher isomorphic retaining scores lead to better performance on 1) the AUC score for link prediction and 2) the Macro-F1 score for node classification.
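A sketch of Eq. (5.1), computing the (here averaged, so that a perfect score is 1) cosine similarity over all isomorphic pairs; the averaging convention and adjacency-set input format are assumptions.

```python
import math
from itertools import combinations

def isomorphic_retaining_score(adj, emb):
    """Average cosine similarity over all isomorphic pairs (Eq. (5.1) sketch)."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(a * a for a in v))
        return dot / (nu * nv)

    # isomorphic pairs: distinct nodes with identical first-hop neighbor sets
    pairs = [(u, v) for u, v in combinations(sorted(adj), 2) if adj[u] == adj[v]]
    if not pairs:
        return 1.0  # no isomorphic pairs: vacuously consistent
    return sum(cosine(emb[u], emb[v]) for u, v in pairs) / len(pairs)
```

Enumerating all pairs is quadratic and only suitable for illustration; grouping nodes by (hashable) neighbor set would find the pairs in near-linear time.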
The following theorem proves that GMR retains the isomorphic structure better than other Skip-Gram-based approaches, e.g., [5, 13, 16], under edge insertion. Afterward, the time complexity analysis is presented.
Theorem 1
GMR outperforms other Skip-Gram-based models regarding the isomorphic retaining score under edge insertion after each update by gradient descent.
Proof:
Due to the space constraint, Theorem 1 is proved in the online version.

Time Complexity. In GMR, the initialization of the addressing tree involves $O(|V|)$ time. For each $t$, GMR first updates the embeddings of $\Delta V^t$ in $O(|\Delta V^t|)$ time. After this, hierarchical addressing takes $O(k\,|\Delta V^t| \log |V|)$ time to identify the affected nodes. Notice that it requires $O(\log |V|)$ time to update the addressing tree. To update the affected nodes, the refreshing and percolation gates respectively involve $O(1)$ and $O(d_{\max})$ time for one affected node, where $d_{\max}$ is the maximum node degree of the network. Therefore, updating all the affected nodes requires $O(k\,|\Delta V^t|\,d_{\max})$ time. Therefore, the overall time complexity of GMR is $O\big(k\,|\Delta V^t|(\log |V| + d_{\max})\big)$, while retraining the whole network requires time proportional to $|V|$ at each timestamp. Since $k$ is a small constant, $|\Delta V^t| \ll |V|$, and $d_{\max} \ll |V|$, GMR is faster than retraining.
Experiments
To evaluate the effectiveness and efficiency of GMR, we compare GMR with the state-of-the-art methods on two tasks, i.e., link prediction and node classification. The baselines are: 1) Full, which retrains the whole network with DeepWalk [13]; 2) Change [3], which takes only the changed part as the samples for DeepWalk; 3) GraphSAGE [6], which derives the embeddings by graph inductive learning; 4) SDNE [20], which extends the auto-encoder model to generate the embeddings of new nodes from the embeddings of their neighbors; 5) CTDNE [12], which performs a biased random walk on the dynamic network; 6) DNE [3], which updates only one affected node; 7) SLA [10], which handles only node/edge insertion; and 8) DHPE [22], which is an SVD-based method built on matrix perturbation theory. The default values of the remaining hyperparameter, $h$, $k$, $d$, batch size, and learning rate are 1, 0.8, 3, 64, 16, and 0.001, respectively. Stochastic gradient descent (SGD) with Adagrad is adopted to optimize the loss function.
Link Prediction
For link prediction, three real datasets [15] for streaming networks are evaluated: Facebook (63,731 nodes, 1,269,502 edges, and 736,675 timestamps), Yahoo (100,001 nodes, 3,179,718 edges, and 1,498,868 timestamps), and Epinions (131,828 nodes, 841,372 edges, and 939 timestamps). The concatenated embedding of each node pair is employed as the feature to predict the link by logistic regression.
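The edge feature used here can be sketched as a plain concatenation of the two endpoint embeddings fed to a logistic scorer; the hand-rolled scorer and the zeroed weights below are placeholders for a trained logistic-regression model, not the experimental setup itself.

```python
import math

def edge_feature(phi, u, v):
    """Concatenate the two endpoint embeddings as the link-prediction feature."""
    return list(phi[u]) + list(phi[v])

def predict_link(weights, bias, feature):
    """Logistic-regression probability that the edge (u, v) exists."""
    z = bias + sum(w * x for w, x in zip(weights, feature))
    return 1.0 / (1.0 + math.exp(-z))
```

Note that the concatenation is order-sensitive, so for undirected edges a fixed node ordering (or a symmetric operator such as the Hadamard product) would typically be chosen when building the training set.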
Table 1 reports the AUC [5], the isomorphic retaining score $S$ in Eq. (5.1), and the running time of the different methods. The results show that the proposed GMR achieves the best AUC among all streaming network embedding algorithms on Facebook, Yahoo, and Epinions. Besides, the AUC of GMR is close to that of Full (slightly less on Facebook, slightly more on Yahoo, and slightly less on Epinions), while requiring only a small fraction of the running time. Moreover, GraphSAGE has relatively weak performance since it cannot preserve the structural information without node features. The running time of SDNE is considerably greater than that of GMR due to the processing of the deep structure, while the AUC of SDNE is consistently lower than that of GMR on all datasets.
Table 1.
Experiment results of link prediction.
| | Facebook | | | Yahoo | | | Epinions | | |
|---|---|---|---|---|---|---|---|---|---|
| AUC | S | sec | AUC | S | sec | AUC | S | sec | |
| GMR | 0.7943 | 0.94 | 3325 | 0.7674 | 0.93 | 3456 | 0.9294 | 0.92 | 3507 |
| Full | 0.8004 | 0.95 | 66412 | 0.7641 | 0.95 | 72197 | 0.9512 | 0.96 | 61133 |
| Change | 0.6926 | 0.79 | 2488 | 0.6326 | 0.82 | 2721 | 0.8233 | 0.84 | 2429 |
| GraphSAGE | 0.6569 | 0.77 | 4094 | 0.6441 | 0.79 | 5117 | 0.8158 | 0.85 | 4588 |
| SDNE | 0.6712 | 0.81 | 7078 | 0.6585 | 0.83 | 7622 | 0.8456 | 0.88 | 6799 |
| CTDNE | 0.7091 | 0.85 | 4322 | 0.6799 | 0.84 | 5136 | 0.8398 | 0.90 | 5097 |
| DNE | 0.7294 | 0.87 | 2699 | 0.6892 | 0.86 | 2843 | 0.8648 | 0.92 | 2613 |
| SLA | 0.7148 | 0.86 | 2398 | 0.6910 | 0.86 | 2438 | 0.8598 | 0.91 | 2569 |
| DHPE | 0.7350 | 0.88 | 3571 | 0.7102 | 0.88 | 3543 | 0.8458 | 0.90 | 3913 |
Compared to other streaming network embedding methods (e.g., DNE, SLA, and DHPE), GMR achieves a clear improvement because the embeddings of the other methods are updated without considering the global topology. In contrast, GMR selects the affected nodes by the globally structure-aware hierarchical addressing, and the selected nodes are not restricted to nearby nodes. Furthermore, GMR outperforms the baselines regarding the isomorphic retaining score since it percolates the embeddings to preserve the structural information. Note that the isomorphic retaining score $S$ is highly correlated with the AUC, with a correlation coefficient of 0.92, demonstrating that it is indeed crucial to ensure the embedding consistency for the nodes with similar structures.
Node Classification
For node classification, we compare the different approaches on BlogCatalog [16] (10,132 nodes, 333,983 edges, and 39 classes), Wiki [5] (2,405 nodes, 17,981 edges, and 19 classes), and DBLP [22] (101,253 nodes, 223,810 edges, 48 timestamps, and 4 classes). DBLP is a real streaming network extracted from the paper citation network of four research areas from 1970 to 2017. BlogCatalog and Wiki are adopted in previous research [3] to generate the streaming networks. The learned embeddings are employed to classify the nodes according to the labels, with cross-entropy adopted in the loss function for classification with logistic regression. We randomly sample the labels for training and testing, and the average results from 50 runs are reported. Table 2 demonstrates that GMR clearly outperforms Change regarding Macro-F1 [13], and it is close to Full with a substantial speed-up. The Macro-F1 scores of GraphSAGE and SDNE are notably worse than that of GMR, indicating that GraphSAGE and SDNE cannot adequately handle multi-type changes in dynamic networks. Moreover, GMR achieves a larger improvement on BlogCatalog than on DBLP, because the density (i.e., the average degree) of BlogCatalog is larger, enabling the hierarchical addressing of GMR to exploit more structural information for updating multiple nodes. For DBLP, GMR also achieves performance close to Full.
Table 2.
Experiment results of node classification.
| | BlogCatalog | | | Wiki | | | DBLP | | |
|---|---|---|---|---|---|---|---|---|---|
| F1 | S | sec | F1 | S | sec | F1 | S | sec | |
| GMR | 0.2059 | 0.90 | 1998 | 0.4945 | 0.92 | 199 | 0.7619 | 0.93 | 7638 |
| Full | 0.2214 | 0.91 | 37214 | 0.5288 | 0.93 | 3811 | 0.7727 | 0.94 | 149451 |
| Change | 0.1651 | 0.71 | 1237 | 0.3597 | 0.79 | 122 | 0.6841 | 0.86 | 5976 |
| GraphSAGE | 0.1558 | 0.81 | 2494 | 0.3419 | 0.82 | 173 | 0.6766 | 0.86 | 11410 |
| SDNE | 0.1723 | 0.83 | 2795 | 0.3438 | 0.84 | 266 | 0.6914 | 0.87 | 16847 |
| CTDNE | 0.1808 | 0.84 | 2923 | 0.4013 | 0.85 | 301 | 0.7171 | 0.88 | 9115 |
| DNE | 0.1848 | 0.86 | 1547 | 0.4187 | 0.86 | 141 | 0.7302 | 0.90 | 6521 |
| SLA | 0.1899 | 0.87 | 1399 | 0.3998 | 0.85 | 149 | 0.7110 | 0.88 | 6193 |
| DHPE | 0.1877 | 0.87 | 2047 | 0.4204 | 0.86 | 215 | 0.7311 | 0.90 | 8159 |
It is worth noting that the isomorphic retaining score $S$ is also positively related to Macro-F1. We further investigate the percentages of isomorphic pairs with the same label on the different datasets. The results show that a large majority of isomorphic pairs share the same labels on BlogCatalog, Wiki, and DBLP. Therefore, it is crucial to maintain the consistency between isomorphic pairs, since similar embeddings of isomorphic pairs are inclined to be classified with the same labels.
Conclusion
In this paper, we propose GMR for streaming network embeddings featuring the hierarchical addressing, refreshing gate, and percolation gate to preserve the structural information and consistency. We also prove that the embeddings generated by GMR are more consistent than the current network embedding schemes under insertion. The experiment results demonstrate that GMR outperforms the state-of-the-art methods in link prediction and node classification. Moreover, multi-type updates with the beam search improve GMR in both task-oriented scores and the isomorphic retaining score. Our future work will extend GMR to support multi-relations in knowledge graphs.
Footnotes
In streaming data mining [10], the incoming data stream, instead of the whole dataset, is employed to update the previous mining results efficiently.
Equation (4.3) is introduced as the general form for the Skip-Gram model under multi-type changes, and GMR only samples the insertions/deletions from the streaming network $\Delta G^t$ at timestamp $t$ for updating the embeddings.
The new node embedding is initialized by the average of its neighborhood [10] and then updated by maximizing Eq.(4.3).
For each node, the k search beams iteratively examine their child nodes (e.g., total 2k nodes) and maintain only the top-k child nodes with the highest similarity in a queue. Any leaf node reached by a beam will be included in the top-k results.
$N(v)$ represents the set of first-hop neighbors.
The online version is presented in https://bit.ly/2UUeO7B.
The setting follows OpenNE: https://github.com/thunlp/OpenNE.
For fair comparison, SDNE only takes the adjacency matrix of current stream as the input feature. CTDNE only samples from the latest 50 streams instead of the whole network.
Facebook and Epinions contain both the edge insertion and deletion, represented by “i j -1 t” for removing edge (i, j) at timestamp t. Yahoo lacks deletion since it is a message network.
For link prediction, at time $t$, we predict the new edges for time $t+1$ (excluding the edges incident to the nodes arriving at time $t+1$).
For Full, due to high computational complexity in retraining the networks for all timestamps, we partition all timestamps into 50 parts [23] with the network changes aggregated in each part.
The streaming network $\Delta G^t$ is generated from the original network by first sampling half of the original network as $G^0$. For each timestamp $t$, $\Delta G^t$ is constructed by sampling 200 edges (not in $G^t$) from the original network and adding them (and the corresponding terminal nodes) to $G^t$, whereas 100 edges of $G^t$ are deleted.
For a new node, only its embedding derived after the arrival is employed in the testing.
Contributor Information
Hady W. Lauw, Email: hadywlauw@smu.edu.sg
Raymond Chi-Wing Wong, Email: raywong@cse.ust.hk.
Alexandros Ntoulas, Email: antoulas@di.uoa.gr.
Ee-Peng Lim, Email: eplim@smu.edu.sg.
See-Kiong Ng, Email: seekiong@nus.edu.sg.
Sinno Jialin Pan, Email: sinnopan@ntu.edu.sg.
Hsi-Wen Chen, Email: r06921045@ntu.edu.tw.
Hong-Han Shuai, Email: hhshuai@nctu.edu.tw.
Sheng-De Wang, Email: sdwang@ntu.edu.tw.
De-Nian Yang, Email: dnyang@iis.sinica.edu.tw.
References
- 1.Belkin, M., Niyogi, P.: Laplacian eigenmaps and spectral techniques for embedding and clustering. In: Advances in NIPS, pp. 585–591 (2002)
- 2.Chen, H., Perozzi, B., Hu, Y., Skiena, S.: HARP: hierarchical representation learning for networks. In: Thirty-Second AAAI (2018)
- 3.Du, L., Wang, Y., Song, G., Lu, Z., Wang, J.: Dynamic network embedding: an extended approach for skip-gram based network embedding. In: IJCAI, pp. 2086–2092 (2018)
- 4.Graves, A., Wayne, G., Danihelka, I.: Neural turing machines. arXiv preprint arXiv:1410.5401 (2014)
- 5.Grover, A., Leskovec, J.: node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD, pp. 855–864 (2016)
- 6.Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. In: Advances in NIPS, pp. 1024–1034 (2017)
- 7.Jaeger, S., Fulle, S., Turk, S.: Mol2vec: unsupervised machine learning approach with chemical intuition. J. Chem. Inf. Model. 58(1), 27–35 (2018)
- 8.Leskovec, J., Krevl, A.: SNAP datasets: Stanford large network dataset collection, June 2014
- 9.Li, J., Chen, C., Tong, H., Liu, H.: Multi-layered network embedding. In: Proceedings of the 2018 SIAM ICDM, pp. 684–692 (2018)
- 10.Liu, X., Hsieh, P.C., Duffield, N., Chen, R., Xie, M., Wen, X.: Real-time streaming graph embedding through local actions. In: Proceedings of the WWW, pp. 285–293 (2019)
- 11.Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in NIPS, pp. 3111–3119 (2013)
- 12.Nguyen, G.H., Lee, J.B., Rossi, R.A., Ahmed, N.K., Koh, E., Kim, S.: Continuous-time dynamic network embeddings. In: Proceedings of the WWW, pp. 969–976 (2018)
- 13.Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD, pp. 701–710 (2014)
- 14.Qiu, J., Dong, Y., Ma, H., Li, J., Wang, K., Tang, J.: Network embedding as matrix factorization: unifying DeepWalk, LINE, PTE, and node2vec. In: Proceedings of the WSDM, pp. 459–467 (2018)
- 15.Rossi, R., Ahmed, N.: The network data repository with interactive graph analytics and visualization. In: Twenty-Ninth AAAI (2015)
- 16.Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: LINE: large-scale information network embedding. In: Proceedings of the WWW, pp. 1067–1077 (2015)
- 17.Tenenbaum, J.B., De Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)
- 18.Tsitsulin, A., Mottin, D., Karras, P., Müller, E.: VERSE: versatile graph embeddings from similarity measures. In: Proceedings of the WWW, pp. 539–548 (2018)
- 19.Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., Bengio, Y.: Graph attention networks. arXiv preprint arXiv:1710.10903 (2017)
- 20.Wang, D., Cui, P., Zhu, W.: Structural deep network embedding. In: Proceedings of the 22nd ACM SIGKDD, pp. 1225–1234 (2016)
- 21.Zhou, L., Yang, Y., Ren, X., Wu, F., Zhuang, Y.: Dynamic network embedding by modeling triadic closure process. In: Thirty-Second AAAI (2018)
- 22.Zhu, D., Cui, P., Zhang, Z., Pei, J., Zhu, W.: High-order proximity preserved embedding for dynamic networks. IEEE Trans. Knowl. Data Eng. 30, 2134–2144 (2018)
- 23.Zoghi, M., Tunys, T., Ghavamzadeh, M., Kveton, B., Szepesvari, C., Wen, Z.: Online learning to rank in stochastic click models. In: Proceedings of the ICML, pp. 4199–4208 (2017)