The Integrative Method Based on the Module-Network for Identifying Driver Genes in Cancer Subtypes

2018 Jan 24;23(2):183. doi: 10.3390/molecules23020183

Algorithm 1 Integrative model based on module-network for cancer subtypes

Input: CNV data and gene expression data of two subtypes

Output: A short list of gene sets

The 1st step: Difference analysis, EMDSort(P, Q, f_{ij}, d_{ij})

(a) compute EMD(P, Q) using f_{ij} and d_{ij} according to Formula (4), where f_{ij} is the flow and d_{ij} is the Euclidean distance.

(b) compute FDR_{ji} according to the EMD values.
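A minimal sketch of the difference-analysis step, under stated assumptions. Formula (4) is not reproduced in this section, so the sketch relies on the fact that for one-dimensional samples with uniform weights the earth mover's distance equals the area between the two empirical distribution functions. The FDR estimate uses a simple per-gene label permutation, which is an assumption and may differ from the procedure the authors used. The names `emd_1d`, `emd_fdr`, `expr_a`, `expr_b`, `n_perm`, and `seed` are hypothetical.

```python
import random
from bisect import bisect_right


def emd_1d(p, q):
    """1-D earth mover's distance between two samples with uniform weights."""
    p, q = sorted(p), sorted(q)
    grid = sorted(set(p) | set(q))
    total = 0.0
    # Integrate |CDF_P - CDF_Q| over each interval between observed values.
    for left, right in zip(grid, grid[1:]):
        cdf_p = bisect_right(p, left) / len(p)
        cdf_q = bisect_right(q, left) / len(q)
        total += abs(cdf_p - cdf_q) * (right - left)
    return total


def emd_fdr(expr_a, expr_b, n_perm=100, seed=0):
    """Return (EMD per gene, estimated FDR per gene).

    expr_a, expr_b: dict mapping gene -> list of values in each subtype.
    Assumption: the null distribution comes from shuffling subtype labels
    independently for each gene.
    """
    rng = random.Random(seed)
    genes = sorted(expr_a)
    observed = {g: emd_1d(expr_a[g], expr_b[g]) for g in genes}

    null_scores = []
    for _ in range(n_perm):
        for g in genes:
            pooled = expr_a[g] + expr_b[g]  # new list, inputs stay untouched
            rng.shuffle(pooled)
            n_a = len(expr_a[g])
            null_scores.append(emd_1d(pooled[:n_a], pooled[n_a:]))

    fdr = {}
    for g in genes:
        t = observed[g]
        # Expected number of null genes scoring at least t, per permutation.
        expected_false = sum(s >= t for s in null_scores) / n_perm
        # Number of genes actually called at threshold t (always >= 1).
        called = sum(v >= t for v in observed.values())
        fdr[g] = min(1.0, expected_false / called)
    return observed, fdr
```

Genes would then be sorted by EMD value, with the FDR used as a cutoff for which genes pass to the next step.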

The 2nd step: Initial module construction

(a) fit two normal distributions by k-means clustering and select the threshold T for each modulator.

(b) split the expression of the target gene into two sets (A, B) according to the threshold T.

(c) … leaf, the parameters α and λ, the size of Leaf

(d) compute Score(target_gene, modulator) of the split using Formula (9).

(e) assign the target gene to the single highest-scoring candidate modulator.
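The initial module construction can be sketched as follows, under stated assumptions. The threshold T is taken as the midpoint between the two centers found by one-dimensional 2-means on the modulator's values. Formula (9) is not reproduced in this section, so `split_score` is only a stand-in (the fraction of the target gene's variance explained by the split), not the paper's score, and it ignores the parameters α and λ. All function names are hypothetical.

```python
from statistics import mean


def two_means_threshold(values, n_iter=100):
    """1-D k-means with k=2; returns the midpoint between the two centers."""
    lo, hi = min(values), max(values)
    for _ in range(n_iter):
        t = (lo + hi) / 2
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        if not low or not high:
            break  # all values identical, nothing to separate
        new_lo, new_hi = mean(low), mean(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


def split_target(target_expr, modulator_values, threshold):
    """Split target expression into sets A and B by the modulator's state."""
    a = [x for x, m in zip(target_expr, modulator_values) if m <= threshold]
    b = [x for x, m in zip(target_expr, modulator_values) if m > threshold]
    return a, b


def split_score(a, b):
    """Stand-in for Formula (9): variance of the target explained by the split."""
    pooled = a + b
    if not a or not b:
        return 0.0
    mu = mean(pooled)
    total = sum((x - mu) ** 2 for x in pooled)
    if total == 0:
        return 0.0
    mu_a, mu_b = mean(a), mean(b)
    within = sum((x - mu_a) ** 2 for x in a) + sum((x - mu_b) ** 2 for x in b)
    return 1 - within / total


def assign_targets(target_exprs, modulator_exprs):
    """Assign each target gene to its single highest-scoring modulator."""
    thresholds = {m: two_means_threshold(v) for m, v in modulator_exprs.items()}
    modules = {m: [] for m in modulator_exprs}
    for gene, expr in target_exprs.items():
        best = max(
            modulator_exprs,
            key=lambda m: split_score(
                *split_target(expr, modulator_exprs[m], thresholds[m])
            ),
        )
        modules[best].append(gene)
    return modules
```

Each modulator then seeds one initial module containing the target genes assigned to it.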

The 3rd step: Module network learning

repeat

(a) search for a regulation program for each module.

(b) reassign each gene to the module whose program best predicts its behavior.

until (pro < 0.1)
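The repeat/until loop can be sketched as below, with two assumptions that the section does not confirm. First, `pro` is read as the proportion of genes that changed module in the current iteration. Second, the regulation program is replaced by a much simpler stand-in, the mean expression profile of the module, and "best predicts" is taken to mean the smallest squared distance to that profile. The names `learn_module_network`, `expr`, `assignment`, `tol`, and `max_iter` are hypothetical.

```python
def learn_module_network(expr, assignment, tol=0.1, max_iter=50):
    """Alternate program learning and gene reassignment until pro < tol.

    expr: dict gene -> expression profile (list of floats, equal lengths).
    assignment: dict gene -> initial module label.
    """
    assignment = dict(assignment)  # work on a copy
    for _ in range(max_iter):
        # (a) learn a program per module; here, the module's mean profile.
        programs = {}
        for module in set(assignment.values()):
            members = [expr[g] for g, m in assignment.items() if m == module]
            programs[module] = [sum(col) / len(col) for col in zip(*members)]

        # (b) move each gene to the module whose program fits it best.
        moved = 0
        for gene, profile in expr.items():
            best = min(
                sorted(programs),
                key=lambda m: sum(
                    (x - y) ** 2 for x, y in zip(profile, programs[m])
                ),
            )
            if best != assignment[gene]:
                assignment[gene] = best
                moved += 1

        # Stop once fewer than tol of the genes changed module.
        pro = moved / len(expr)
        if pro < tol:
            break
    return assignment
```

The `max_iter` cap is a safeguard added for the sketch so that the loop terminates even if reassignments keep oscillating.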

The 4th step: The identification of candidate driver genes.