Figure 1:
Schematic diagram of the full computational pipeline proposed in this work. Each panel shows one step of the pipeline: (A) collecting heterogeneous experimental data from mass spectrometry (MS) experiments, gene expression measurements, and sub-cellular localization annotations; (B) integrating the heterogeneous data into a fused feature matrix; (C) training a random forest classifier to predict pairwise interactions; (D) identifying protein complexes as strongly connected protein clusters among high-scoring pairwise interactions; and (E) refining the protein complexes by adding weaker pairwise interactions and merging highly overlapping complexes.
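As a rough illustration of steps (D) and (E), the sketch below groups proteins from scored pairs, extends each group with weaker links, and merges groups that overlap heavily. It is a minimal sketch under stated assumptions, not the method used in this work: connected components stand in for "strongly connected protein clusters", Jaccard overlap stands in for the merging criterion, and the function names, thresholds, and scores are all hypothetical.

```python
from itertools import combinations


def connected_clusters(scores, threshold):
    """Step (D), simplified: connected components over pairs scoring >= threshold."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for protein in list(parent):
        clusters.setdefault(find(protein), set()).add(protein)
    return [c for c in clusters.values() if len(c) > 1]


def extend_with_weak(complexes, scores, weak_threshold):
    """Step (E), part 1: add proteins linked to a member by a weaker-scoring pair."""
    extended = []
    for members in complexes:
        grown = set(members)
        for (a, b), score in scores.items():
            if score >= weak_threshold:
                if a in members and b not in members:
                    grown.add(b)
                elif b in members and a not in members:
                    grown.add(a)
        extended.append(grown)
    return extended


def merge_overlapping(complexes, min_jaccard):
    """Step (E), part 2: repeatedly merge pairs whose Jaccard overlap >= min_jaccard."""
    merged = [set(c) for c in complexes]
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(len(merged)), 2):
            a, b = merged[i], merged[j]
            if len(a & b) / len(a | b) >= min_jaccard:
                merged[i] = a | b
                del merged[j]  # j > i, so index i is unaffected
                changed = True
                break
    return merged


# Hypothetical classifier scores for protein pairs (illustrative values only).
scores = {
    ("A", "B"): 0.90,
    ("B", "C"): 0.85,
    ("D", "E"): 0.95,
    ("C", "D"): 0.50,
    ("E", "F"): 0.20,
}

cores = connected_clusters(scores, threshold=0.8)
grown = extend_with_weak(cores, scores, weak_threshold=0.4)
final = merge_overlapping(grown, min_jaccard=0.3)
print(sorted(sorted(c) for c in final))
```

In this toy input, the weak C-D link pulls the two high-scoring cores toward each other, so the extended groups share two proteins and are merged; protein F stays out because its only link falls below the weak threshold.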