Table 2.
Common useful terms
Term | Definition |
---|---|
Alignment | The process of matching reads to a particular region of the genome. |
Counts | Number of reads that map to a particular gene. |
Coverage | The extent to which a genomic region of interest was sequenced. |
Dimensionality reduction | The process by which the most important features of a high-dimensional data set are extracted, resulting in a data set with reduced dimensions. |
Dropout events | Missing data values due to low transcript expression and the stochastic nature of gene expression. |
Embedding | Mapping of high-dimensional features onto a low-dimensional space. |
Feature extraction | Construction of a transformed data set from the most predictive features of a high-dimensional data set. |
Feature selection | Selection of the most predictive features in a high-dimensional data set. |
Features | Variables of a particular data set (e.g. fluorescence intensity or gene counts). |
Machine learning | Process of building mathematical models that allow computers to be trained to make predictions. Requires training with a “training data set”. |
Partitioning | Division of data. |
Quantitation | Generation of a count table that integrates the number of reads per gene that were aligned successfully. |
Reads | Nucleotide sequences produced by sequencing. |
Sequencing depth | The number of reads per genomic region. |
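
To make the relationship between reads, alignment, counts and quantitation concrete, the following is a minimal Python sketch. It assumes a hypothetical alignment result given as (read, gene) pairs, where `None` marks a read that did not align; the read names, gene names and the `count_table` helper are illustrative only and do not come from any particular tool.

```python
from collections import Counter

# Hypothetical alignment output: (read_id, gene) pairs.
# None marks a read that could not be aligned to any gene.
alignments = [
    ("read1", "GeneA"),
    ("read2", "GeneA"),
    ("read3", "GeneB"),
    ("read4", None),
    ("read5", "GeneA"),
]


def count_table(alignments):
    """Quantitation: number of successfully aligned reads per gene."""
    return Counter(gene for _, gene in alignments if gene is not None)


counts = count_table(alignments)
for gene, n in sorted(counts.items()):
    print(gene, n)
```

Here each entry of `counts` is the "Counts" value for one gene, and the unaligned read is excluded from the table, as the definition of quantitation requires.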