. 2020 Dec 10;11:610798. doi: 10.3389/fgene.2020.610798

BOX 1.

Terms, concepts, expressions, and definitions for clarity of readers foraying into multi-omics.

Terms, concepts, expressions	Definitions
Multi-omics/panomics/integromics/integrated omics/polyomics/transomics/cross-omics	An approach that aims to improve the understanding of systems regulatory biology, the central dogma of molecular biology, and genotype-phenotype relationships by combining two or more different omics data types.
Multi-table, Multi-block	Terms focusing on the format of the data rather than its nature, popular in chemoinformatics (among other fields); can (but does not have to) imply a larger number of features than observations in the integrated tables/blocks.
Multi-view	A term often used in the field of ML for learning heterogeneity in the data and identifying patterns. The analogy is to multiple cameras viewing an object from different angles; in the omics context, the object can vary: it may be a “cell,” an “organism,” or just a “genome” viewed via different seq* techniques.
Multi-source	A term for datasets derived from multiple sources of molecular assays. It is used, for example, by the joint and individual variation explained (JIVE) tool (O’Connell and Lock, 2016) during EDA.
Multi-modal	A term often used in omics for multiple measurement methods applied at the molecular level to gain holistic insight into the cellular machinery (e.g., one cell at a time). It is also popular in drug repositioning, which involves the integration of more nuanced electronic health record (EHR) data.
Central dogma of molecular biology	An explanation of the flow of genetic information within a biological system: from DNA to RNA (transcription), to protein (translation), to metabolites (enzyme catalysis).
Machine learning (ML) method	An algorithm (a sequence of instructions) aimed at learning from data, with applications including exploration/dimensionality reduction (unsupervised methods, e.g., PCA, matrix factorization) and classification/prediction (supervised or semi-supervised methods).
Deep learning (DL) method	A subtype of ML using deep neural networks, composed of artificial neurons (signal aggregating or transforming units) arranged in layers; the depth of a DL model refers to the number of layers counted from the “input” layer (exclusive) to the “output” layer (inclusive), i.e., the “hidden” layers plus the output layer.
Fusion (Baldwin et al., 2020)	A specific type of integration that applies a uniform method in a scalable manner, to solve biological problems which the multi-omics measurements target.
Exploratory data analysis (EDA)	An approach heavily used in statistics and data science during the early steps of data analysis, often coupled with visualization.
Matrix factorization	A class of ML algorithms based on matrix decomposition, i.e., representation of a data matrix by two or more matrices (factors) that can be multiplied together to obtain the original matrix (or its approximation). It can be used for classification, prediction, or exploration.
Data heterogeneity	Structural variation in the data that can be explained by the composition of the analyzed dataset; it encompasses both clinical heterogeneity (e.g., presence of two groups with different genetic make-up due to ancestral differences, or different underlying etiologies of a disease) and technical heterogeneity (i.e., batch effects).
Meta-data	A table of organized information and instructions that summarizes the properties of the data so that it is findable and usable for analysis within the same project or across multiple projects.
Git	A version-control system for tracking changes in source code and other documents during software development. Platforms such as GitHub and GitLab are built on top of it.
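
To make the matrix factorization entry in Box 1 more concrete, the following is a minimal sketch (not a method taken from the article) that approximates a data matrix by the product of two smaller factor matrices using a truncated singular value decomposition in NumPy. The function name `truncated_svd_factorize`, the toy matrix dimensions, and the choice of two latent factors are assumptions made purely for illustration; real multi-omics tools typically use more specialized factorizations.

```python
import numpy as np


def truncated_svd_factorize(X, rank):
    """Approximate X (observations x features) as scores @ loadings.

    Hypothetical helper for illustration: keeps only the `rank` largest
    singular components of X.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :rank] * s[:rank]  # observations x rank
    loadings = Vt[:rank, :]          # rank x features
    return scores, loadings


# Toy matrix with more features (20) than observations (6), generated
# from 2 latent factors plus a small amount of noise (assumed values).
rng = np.random.default_rng(0)
latent = rng.normal(size=(6, 2))
weights = rng.normal(size=(2, 20))
X = latent @ weights + 0.01 * rng.normal(size=(6, 20))

scores, loadings = truncated_svd_factorize(X, rank=2)
X_hat = scores @ loadings  # low-rank approximation of X

# Relative reconstruction error (Frobenius norm)
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In this sketch the rows of `scores` give a two-dimensional representation of each observation that could be plotted during exploration, while `loadings` indicates how much each feature contributes to each factor.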