. 2021 Jul 19;12:4385. doi: 10.1038/s41467-021-24584-w

Fig. 1. The module repertoire construction process.


a A collection of 16 blood transcriptome datasets spanning a wide range of immunological and physiological states was used as the starting point for identifying gene co-expression patterns (RSV: Respiratory Syncytial Virus, HIV: Human Immunodeficiency Virus, COPD: Chronic Obstructive Pulmonary Disease). b Each dataset was clustered independently using k-means clustering. c Gene co-clustering events were recorded in a table in which each entry indicates the number of datasets, out of the 16, in which a given gene pair co-clustered. d The co-clustering table served as the input for building a weighted co-clustering graph (see also Supplementary Fig. 1), in which nodes represent genes and edges represent co-clustering events. e The largest, most highly weighted sub-networks within the full network (15,132 nodes in this case) were identified mathematically, and each was assigned a module ID. The genes constituting a module were removed from the selection pool, and the process was repeated to select the next largest set of genes. Once all the gene sets for a given round of selection had been identified, the criterion was relaxed for the next round (e.g. M1 modules correspond to the first round, with the highest co-clustering weight [16 out of 16 datasets], and M2 modules correspond to the second round [co-clustering observed in 15 out of 16 datasets]). Overall, this process resulted in the selection of 382 modules comprising 14,168 transcripts.
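
Steps c to e can be illustrated with a minimal Python sketch on toy data. It is not the authors' implementation. The caption does not specify how the sub-networks were identified, so the sketch substitutes connected components of the thresholded graph as a simple stand-in. The function names, the `min_size` parameter, the `M<round>.<index>` ID format, and the three toy clusterings (standing in for per-dataset k-means labels) are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_clustering_counts(clusterings):
    """Step c: count in how many datasets each gene pair shares a cluster.

    `clusterings` holds one dict per dataset, mapping gene -> cluster label.
    """
    counts = Counter()
    for labels in clusterings:
        by_cluster = {}
        for gene, label in labels.items():
            by_cluster.setdefault(label, []).append(gene)
        for members in by_cluster.values():
            for a, b in combinations(sorted(members), 2):
                counts[(a, b)] += 1
    return counts


def connected_components(genes, edges):
    """Connected components of the graph restricted to `genes`.

    Assumption: a stand-in for the sub-network search, which the caption
    does not describe in detail.
    """
    adjacency = {g: set() for g in genes}
    for a, b in edges:
        if a in adjacency and b in adjacency:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def select_modules(counts, n_datasets, min_size=2):
    """Steps d and e: select modules round by round, relaxing the threshold."""
    pool = {gene for pair in counts for gene in pair}
    modules = {}
    # Round 1 requires co-clustering in all datasets, round 2 in all but one, ...
    for round_no, threshold in enumerate(range(n_datasets, 0, -1), start=1):
        edges = [pair for pair, weight in counts.items() if weight >= threshold]
        found = [c for c in connected_components(pool, edges) if len(c) >= min_size]
        found.sort(key=lambda c: (-len(c), sorted(c)))  # largest set first
        for index, component in enumerate(found, start=1):
            modules[f"M{round_no}.{index}"] = sorted(component)  # hypothetical ID format
            pool -= component  # remove selected genes from the pool
    return modules


# Toy input: three "datasets", each already clustered (gene -> cluster label).
clusterings = [
    {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1},
    {"A": 0, "B": 0, "C": 1, "D": 1, "E": 2},
    {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1},
]
counts = co_clustering_counts(clusterings)
modules = select_modules(counts, n_datasets=len(clusterings))
print(modules)
```

In this toy case, genes A and B co-cluster in all three datasets and are selected in the first round. D and E co-cluster in two of three and are selected in the second round. C is never assigned, which mirrors how fewer transcripts end up in modules (14,168) than there are nodes in the network (15,132).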