2022 Sep 21;12:15746. doi: 10.1038/s41598-022-20025-w

Figure 3.

Predicting the total amount of money exchanged among agents of the criminal financial network. (A) Visualization of the criminal financial network. Nodes represent agents (people or companies) and edges indicate financial transactions. The thicker the edge and the lighter its color, the larger the amount exchanged between a pair of nodes. (B) Coefficient of determination (R² score) of the association between the logarithm of the predicted and observed amounts of money exchanged between pairs of nodes in the test sets. These predictions are obtained using k-nearest neighbor regressors (kNN with k=6) trained with node2vec representations of edges and different binary operators. The bars stand for the average accuracy and the error bars represent one standard deviation over ten realizations of the embedding and training processes. The gray continuous line represents the accuracy of a baseline regressor that always predicts the average value of the training set, and the black dashed line represents the accuracy of another dummy regressor that always predicts the median of the training set. (C) A typical example of the relationship between the base-10 logarithm of the predicted and observed amounts of money exchanged between pairs of nodes in the test sets, obtained with a kNN regressor (k=6) trained with node2vec representations of edges and the Hadamard operator. The dashed line represents the 1:1 relationship. (D) Average R² score as a function of the number of neighbors (k) in the kNN regressors, estimated from the test sets. The vertical dashed line indicates the optimal number of neighbors (k=6). (E) Average R² score on the test sets as a function of the fraction of nodes in the training sets. In the last two panels, the solid lines indicate the average R² score, and the shaded regions stand for the one-standard-deviation band estimated over ten realizations of the embedding and training processes with the Hadamard operator.
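
The pipeline described in panels B and C can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the node vectors are random stand-ins for node2vec embeddings, the target values are synthetic (not the paper's data), the train/test split is taken over edges, and the names `hadamard`, `knn_predict`, `r2_score`, and `demo` are hypothetical helpers introduced here. Only the general recipe follows the caption: combine two node vectors with the Hadamard (element-wise) product, fit a kNN regressor with k=6 on the logarithm of the amount, and compare its R² with constant predictors that return the training mean or median.

```python
import math
import random
import statistics


def hadamard(u, v):
    """Edge representation: element-wise product of the two node vectors."""
    return [a * b for a, b in zip(u, v)]


def knn_predict(train_X, train_y, x, k=6):
    """Average the targets of the k training points closest to x (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return statistics.fmean([train_y[i] for i in order[:k]])


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def demo(seed=0, n_nodes=40, n_edges=300, dim=8, k=6, train_fraction=0.8):
    rng = random.Random(seed)
    # ASSUMPTION: random vectors stand in for node2vec embeddings.
    emb = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_nodes)]
    edges = [tuple(rng.sample(range(n_nodes), 2)) for _ in range(n_edges)]
    X = [hadamard(emb[i], emb[j]) for i, j in edges]
    # ASSUMPTION: synthetic log10 amounts, a made-up function plus noise.
    y = [3.0 + sum(x[:3]) + rng.gauss(0, 0.3) for x in X]

    # ASSUMPTION: the split is over edges (already in random order).
    n_train = int(train_fraction * n_edges)
    X_tr, y_tr = X[:n_train], y[:n_train]
    X_te, y_te = X[n_train:], y[n_train:]

    knn_pred = [knn_predict(X_tr, y_tr, x, k) for x in X_te]
    mean_pred = [statistics.fmean(y_tr)] * len(y_te)    # mean baseline
    median_pred = [statistics.median(y_tr)] * len(y_te)  # median baseline
    return {
        "knn": r2_score(y_te, knn_pred),
        "mean_baseline": r2_score(y_te, mean_pred),
        "median_baseline": r2_score(y_te, median_pred),
    }


if __name__ == "__main__":
    for name, score in demo().items():
        print(f"{name}: R2 = {score:.3f}")
```

A constant predictor can never score above zero on the test set, because the test-set mean is the constant that minimizes the residual sum of squares. This is why the two dummy regressors serve as reference lines in panel B: any useful edge representation should produce a positive R² score.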