Author manuscript; available in PMC: 2020 Sep 1.
Published in final edited form as: Am J Reprod Immunol. 2019 Jun 26;82(3):e13157. doi: 10.1111/aji.13157

Table 2.

Common useful terms

| Term | Definition |
| --- | --- |
| Alignment | The process of matching reads to a particular region of the genome. |
| Counts | Number of reads that map to a particular gene. |
| Coverage | The extent to which a genomic region of interest was sequenced. |
| Dimensionality reduction | The process by which the most important features of a high-dimensional data set are extracted, resulting in a data set with reduced dimensions. |
| Dropout events | Missing data values due to low transcript expression and the stochastic nature of gene expression. |
| Embedding | Mapping of high-dimensional features onto a low-dimensional space. |
| Feature extraction | Construction of a transformed data set from the most predictive features of a high-dimensional data set. |
| Feature selection | Selection of the most predictive features in a high-dimensional data set. |
| Features | Variables of a particular data set (e.g. fluorescence intensity or gene counts). |
| Machine learning | Process of building mathematical models that allow computers to be trained to make predictions. Requires training with a “training data set”. |
| Partitioning | Division of data into subsets. |
| Quantitation | Generation of a count table that integrates the number of reads per gene that were aligned successfully. |
| Reads | Nucleotide sequences produced by sequencing. |
| Sequencing depth | The number of reads per genomic region. |
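
To make the relationship between counts, quantitation, coverage, and sequencing depth concrete, the following is a minimal Python sketch rather than a real pipeline. The read intervals, gene names, and gene lengths are invented for illustration, and the function names are hypothetical. It also assumes one common way of putting numbers on the terms: coverage as the fraction of positions covered by at least one read, and depth as the mean number of reads overlapping each position.

```python
from collections import Counter

# Hypothetical aligned reads: (gene, start, end), half-open [start, end),
# with coordinates relative to the gene. Invented values, not real data.
aligned_reads = [
    ("geneA", 0, 50),
    ("geneA", 25, 75),
    ("geneA", 60, 100),
    ("geneB", 0, 50),
]
# Hypothetical gene lengths in bases.
gene_lengths = {"geneA": 100, "geneB": 200}


def count_table(reads):
    """Quantitation: tally the number of aligned reads per gene (the counts)."""
    return Counter(gene for gene, _, _ in reads)


def per_base_depth(reads, gene, length):
    """Number of reads overlapping each position of one gene."""
    depth = [0] * length
    for read_gene, start, end in reads:
        if read_gene == gene:
            for pos in range(max(start, 0), min(end, length)):
                depth[pos] += 1
    return depth


def coverage_fraction(depth):
    """Assumed coverage measure: fraction of positions with at least one read."""
    return sum(1 for d in depth if d > 0) / len(depth)


def mean_depth(depth):
    """Assumed depth measure: average number of reads per position."""
    return sum(depth) / len(depth)


counts = count_table(aligned_reads)
for gene, length in gene_lengths.items():
    depth = per_base_depth(aligned_reads, gene, length)
    print(gene, counts[gene], coverage_fraction(depth), mean_depth(depth))
```

In this toy input, geneA has more reads, full coverage, and higher depth than geneB, which illustrates why the three quantities are reported separately: a gene can have a nonzero count while most of its length remains unsequenced.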