2018 Jun 29;9:2544. doi: 10.1038/s41467-018-04948-5

Fig. 5.


Combining community prioritization metrics without an external gold standard. (a) The rank aggregation algorithm starts with four ranked lists of communities, $R_r$, each arising from the values of a different community prioritization metric $r$, where $r$ is one of “l” (likelihood), “d” (density), “b” (boundary), or “a” (allegiance). Communities are ordered by decreasing value of the metric. We use $C$ to indicate the rank of an illustrative community under each community prioritization metric and at different stages of the algorithm. (b) Each ranked list is partitioned into equally sized groups, called bags. Each bag $i$ in ranked list $R_r$ has an attached importance weight $K_r^i$, whose initial values are all equal (represented by black bars all of the same width). CRank uses the importance weights $K_r^i$ to initialize the aggregate prioritization $R$ as a weighted average of the community ranks $R_l$, $R_d$, $R_b$, $R_a$. (c) The top-ranked communities (denoted as dotted cells) in the aggregated prioritization $R$ serve as a temporary gold standard, which is then used to iteratively update the importance weights $K_r^i$. (d) In each iteration, CRank updates the importance weights using the Bayes factor calculation (ref. 36; Supplementary Note 4). Given bag $i$ and ranked list $R_r$, CRank updates the importance weight $K_r^i$ based on how many communities from the temporary gold standard appear in bag $i$. The updated importance weights then revise the aggregated prioritization, in which the new rank $R(C)$ of community $C$ is expressed as $R(C) = \sum_r \log K_r^{i_r(C)} \, R_r(C)$, where $K_r^{i_r(C)}$ is the importance weight of bag $i_r(C)$ of community $C$ for metric $r$, and $R_r(C)$ is the rank of $C$ according to $r$. By using an iterative approach, CRank allows the importance of a metric to vary across communities rather than be predetermined.
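
The loop described in panels a to d can be illustrated with a minimal Python sketch. It is not the CRank implementation, and it makes several assumptions: rank scores run from 1 to n with a larger score meaning higher priority, initial weights are set to e so that every log-weight starts at 1, and the weight update is a smoothed likelihood ratio used as a stand-in for the Bayes factor calculation of Supplementary Note 4, which the caption does not spell out. The function name `aggregate` and its parameters `n_bags`, `top_k`, and `n_iter` are hypothetical.

```python
import math


def aggregate(rank_scores, n_bags=2, top_k=2, n_iter=10):
    """Iteratively aggregate per-metric rank scores into one prioritization.

    rank_scores maps metric name -> {community: score}, with scores 1..n and
    a larger score meaning higher priority (an assumption of this sketch).
    """
    metrics = list(rank_scores)
    communities = list(rank_scores[metrics[0]])
    n = len(communities)
    bag_size = math.ceil(n / n_bags)

    def bag(metric, c):
        # Bag 0 holds the best-ranked communities of this metric.
        return (n - rank_scores[metric][c]) // bag_size

    # Equal initial weights; K = e gives log K = 1 (assumed starting value).
    K = {r: [math.e] * n_bags for r in metrics}

    def combined(c):
        # R(C) = sum over metrics r of log K_r^{i_r(C)} * R_r(C)
        return sum(math.log(K[r][bag(r, c)]) * rank_scores[r][c] for r in metrics)

    order = sorted(communities, key=combined, reverse=True)
    for _ in range(n_iter):
        gold = set(order[:top_k])  # temporary gold standard
        for r in metrics:
            for i in range(n_bags):
                members = [c for c in communities if bag(r, c) == i]
                hits = sum(c in gold for c in members)
                misses = len(members) - hits
                # Stand-in update (assumption): smoothed ratio
                # P(bag | gold) / P(bag | not gold); "1 +" keeps log K >= 0.
                p_gold = (hits + 1) / (len(gold) + n_bags)
                p_rest = (misses + 1) / (n - len(gold) + n_bags)
                K[r][i] = 1 + p_gold / p_rest
        new_order = sorted(communities, key=combined, reverse=True)
        if new_order == order:  # prioritization has stabilized
            break
        order = new_order
    return order, K


# Toy input: four communities scored by the four metrics (4 = best).
ranks = {
    "l": {"A": 4, "B": 3, "C": 2, "D": 1},
    "d": {"A": 4, "B": 2, "C": 3, "D": 1},
    "b": {"A": 3, "B": 4, "C": 2, "D": 1},
    "a": {"A": 4, "B": 3, "C": 1, "D": 2},
}
order, weights = aggregate(ranks)
print(order)
```

In this toy input, metric "d" splits the two top communities across its bags, so both of its bags end up with the same weight, while the other three metrics give their top bag a larger weight than their bottom bag. This is the sense in which the importance of a metric is learned per bag rather than fixed in advance.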