Author manuscript; available in PMC: 2016 Mar 1.

Published in final edited form as: Mol Biosyst. 2015 Jan 22;11(3):714–722. doi: 10.1039/c4mb00677a

Fig. 1. Overview of the compound signature discovery framework. The method takes as input raw L1000 data measured after two types of treatment: compound treatment and gene knockdown. In Phase I, the raw data from both treatment types are preprocessed to yield gene expression data. In Phase II, the EGEM matrix is constructed from these gene expression data to measure the relationships between compounds and knockdown genes. This matrix is then decomposed into a weight matrix and a coefficient matrix by the csNMF method, with protein-protein interaction data incorporated to account for biological connections. Signatures are identified from the strongly associated genes (i.e., those with larger values in the coefficient matrix).
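
The decomposition step in Phase II can be illustrated with a minimal sketch. It uses plain non-negative matrix factorization with Lee-Seung multiplicative updates as a stand-in; it is not the csNMF method named in the caption and it omits the protein-protein interaction information entirely. The matrix orientation (rows as compounds, columns as knockdown genes), the random toy data, and the names `nmf_multiplicative` and `top_genes_per_component` are assumptions made for illustration only.

```python
import numpy as np


def nmf_multiplicative(X, rank, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF (not csNMF): approximate X by W @ H with non-negative factors.

    W is the weight matrix (rows of X by components) and H is the
    coefficient matrix (components by columns of X).
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        # Lee-Seung multiplicative updates for the Frobenius-norm objective.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def top_genes_per_component(H, gene_names, k=3):
    """For each component, list the k genes with the largest coefficients."""
    order = np.argsort(-H, axis=1)[:, :k]
    return [[gene_names[j] for j in row] for row in order]


# Toy stand-in for the compound-by-knockdown-gene matrix (assumed orientation).
toy_rng = np.random.default_rng(1)
X = toy_rng.random((6, 8))                  # 6 compounds x 8 knockdown genes
genes = [f"gene_{j}" for j in range(8)]     # hypothetical gene labels

W, H = nmf_multiplicative(X, rank=2)
signatures = top_genes_per_component(H, genes, k=3)
```

Here each row of `H` plays the role of one component of the coefficient matrix, and picking its largest entries mirrors the caption's idea of selecting strongly associated genes. A faithful implementation would add the constraints that distinguish csNMF, which this sketch does not attempt.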