Cell preparations for the Stem Cell Matrix are cultured in the authors’ laboratory or collected from other sources worldwide. Samples are assigned source codes that capture their biological origin and a relatively unbiased description of the cell type (such as BNLin for brain-derived neural lineage). Samples are collected and processed at a central lab for microarray analysis on a single Illumina BeadStation instrument.
The genomics data are processed by unsupervised algorithms capable of grouping the samples based on non-obvious expression patterns encoded in their transcriptional phenotypes. For pathway discovery, existing high-content databases of experimental data (e.g., protein-protein interaction data or gene sets) are combined with our transcriptional database, the a priori assumed identity of cell types, and bootstrapped sparse non-negative matrix factorization (sample clustering) to produce metadata that can be mined with Gene Set Analysis software and topology-based gene set discovery methods (systems-wide network analysis). Researchers can use web-based, computer-aided visualization methodologies to formulate testable hypotheses and generate results and insights in stem cell biology.
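To make the sample-clustering step concrete, the following is a minimal sketch of bootstrapped sparse non-negative matrix factorization, not the implementation used for the Stem Cell Matrix. It assumes that "bootstrapped" means resampling genes with replacement, that sparsity is imposed by an L1 penalty on the coefficient matrix, and that each sample is assigned to its highest-loading factor. The function names (`sparse_nmf`, `consensus_matrix`), the parameter values, and the toy data are all hypothetical.

```python
import numpy as np


def sparse_nmf(V, k, l1=0.1, n_iter=200, seed=0):
    """Approximate V (genes x samples) by W @ H with an L1 penalty on H.

    Uses multiplicative updates; the penalty term in the H update
    shrinks small coefficients toward zero (assumed form of sparsity).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, k)) + 1e-3
    H = rng.random((k, n_samples)) + 1e-3
    eps = 1e-9
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + l1 + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Rescale so columns of W have unit norm; W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H


def consensus_matrix(V, k, n_boot=30, l1=0.1, seed=0):
    """Fraction of bootstrap runs in which each pair of samples co-clusters."""
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    consensus = np.zeros((n_samples, n_samples))
    for _ in range(n_boot):
        # Resample genes (rows) with replacement: an assumed bootstrap scheme.
        rows = rng.integers(0, n_genes, size=n_genes)
        run_seed = int(rng.integers(2**31))
        _, H = sparse_nmf(V[rows], k, l1=l1, seed=run_seed)
        labels = H.argmax(axis=0)  # assign each sample to its top factor
        consensus += labels[:, None] == labels[None, :]
    return consensus / n_boot


# Toy expression matrix: 40 genes x 6 samples, two artificial sample groups.
toy_rng = np.random.default_rng(1)
V = toy_rng.random((40, 6)) * 0.1
V[:20, :3] += 5.0   # genes 0-19 are high in samples 0-2
V[20:, 3:] += 5.0   # genes 20-39 are high in samples 3-5

C = consensus_matrix(V, k=2)
print(np.round(C, 2))
```

Entries of the consensus matrix close to 1 mark sample pairs that cluster together regardless of which genes are resampled, which is one way to judge how stable a grouping is before it is passed to downstream gene set or network analysis.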
Two exemplary results we report in this paper are the classification of novel stem cell types in the context of other, better-understood stem cell preparations, and a molecular map of interacting proteins that appear to function in concert in pluripotent stem cells.