Algorithm 1 Integrative model based on module network for cancer subtypes
Input: CNV data and gene expression data of two subtypes
Output: A short list of gene sets
The 1st step: Difference analysis (EMDSort)
(a) compute the EMD using the flow and the ground distance according to Formula (4), where the ground distance is the Euclidean distance.
(b) compute the according to the emd-values.
(c) compute the q-value according to the .
The 2nd step: Initial module construction
(a) fit two normal distributions by k-means clustering and select the threshold T for each modulator.
(b) split the expression of the target gene into two sets according to the threshold T.
(c) Given a leaf vector , the parameters and , the size of N.
(d) compute the score of the split using Formula (9).
(e) assign the target gene to the single highest-scoring candidate modulator.
The 3rd step: Module network learning
repeat
(a) search for a regulation program for each module.
(b) reassign each gene to the module whose program best predicts its behavior.
(c) compute the proportion of reassigned genes.
until ()
The 4th step: Identification of candidate driver genes.
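
As an illustration of the difference analysis in the 1st step, the following is a minimal sketch of an EMD computation between two subtypes, followed by a ranking of genes by their EMD values. It is not the implementation used for Algorithm 1. It assumes that each gene's expression is one-dimensional and that every sample carries equal weight; under these assumptions the optimal-flow cost equals the area between the two empirical cumulative distribution functions, so the closed form is used instead of solving the flow problem of Formula (4). The names emd_1d and rank_genes_by_emd and the toy expression values are hypothetical.

```python
from bisect import bisect_right


def emd_1d(a, b):
    """EMD between two 1-D samples, each sample carrying equal weight.

    Assumption: in one dimension with ground distance |x - y|, the minimal
    flow cost equals the area between the two empirical CDFs.
    """
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(a), sorted(b)
    points = sorted(set(xs) | set(ys))
    total = 0.0
    # Both empirical CDFs are constant on each interval [left, right).
    for left, right in zip(points, points[1:]):
        cdf_a = bisect_right(xs, left) / len(xs)
        cdf_b = bisect_right(ys, left) / len(ys)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def rank_genes_by_emd(expr_a, expr_b):
    """Return (gene, emd) pairs sorted by decreasing EMD.

    expr_a and expr_b map a gene name to its expression values in
    subtype A and subtype B (hypothetical input layout).
    """
    scores = {g: emd_1d(expr_a[g], expr_b[g]) for g in expr_a if g in expr_b}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (made up): G1 is identical in both subtypes, G2 is shifted.
subtype_a = {"G1": [1.0, 2.0, 3.0], "G2": [5.0, 5.5, 6.0]}
subtype_b = {"G1": [1.0, 2.0, 3.0], "G2": [8.0, 8.5, 9.0]}
ranking = rank_genes_by_emd(subtype_a, subtype_b)
```

In this toy input the shifted gene G2 is ranked ahead of the unchanged gene G1. Significance assessment, i.e. steps (b) and (c), is not covered by the sketch.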