Skip to main content
. 2020 Sep 28;117(41):25195–25197. doi: 10.1073/pnas.2017807117

Fig. 1.

Fig. 1.

In the problem of link prediction, we are asked to identify which unobserved links in a network are more likely to exist. Nodes could represent individuals or drugs, and links could represent, respectively, friendship relationships in a social network or harmful drug–drug interactions. In this example, links AB and CD exist but have not been observed, so we aim to predict them. Model 1 pays attention only to the connectivity of nodes, and it captures that many nodes are connected to A, so it correctly predicts the AB link. However, since there is nothing especial about the connectivity of C and D, it misses the CD link. Conversely, model 2 pays attention only to group structure, so it realizes that all nodes in the group at Right are connected to each other, and it predicts the CD link. However, since in the group at Left many pairs of nodes are not connected to each other, it misses link AB. Model stacking, in which models 1 and 2 are combined, may be able to predict both links.