2022 Apr 23;23(3):bbac133. doi: 10.1093/bib/bbac133

Table 1.

Transfer learning, tools and techniques

Name/acronym, reference	Source domain	Target domain	Input of the predictors	Output of the predictors	Transfer method; regression or classification task?	Availability, advantages and disadvantages (results/accomplishments)
Semisupervised transfer learning [9]	Application-area-specific mouse phenotype-outcome-labeled gene expression data	Human gene expression data	Human gene expression data	Human phenotype data (and subsequently DEGs and enriched pathways inferred from these)	Transductive: supervised modeling (mouse) amended iteratively by semisupervised retraining (adding unlabeled human data); classification task	MATLAB code available from www.mathworks.com/matlabcentral/fileexchange/69718-semisupervised-learning-functions. Compared favorably in various metrics to other machine learning methods such as kNN, SVM and RF
XGSEA [18]	GO (or similar) gene sets and enrichment scores, e.g. from mouse or zebrafish	GO (or similar) gene sets and enrichment scores, e.g. from human	Gene expression data from source species used to calculate enrichment scores	Gene sets significantly associated in target species	Transductive: domain adaptation followed by prediction of significantly associated gene sets; regression task: logistic on P-values, linear on enrichment scores or linear on positive and negative enrichment scores separately	Code available at https://github.com/LiminLi-xjtu/XGSEA. Compared favorably in various metrics to three naïve methods also proposed in the paper. XGSEA produced a smaller but more focused list of significant GO terms in the reported case study than the best-performing naïve method. Depending on the needs of a study, this could be an advantage or a disadvantage for further interpretation
FIT [19]	Precompiled datasets of mouse gene expression	Precompiled datasets of human gene expression	Mouse gene expression	Human gene expression for matching condition, genes with high effect size	Unsupervised (dimensionality reduction): gene-level lasso regression; follow-up classification task to identify high-effect genes	Available at http://www.mouse2man.org; includes a pre-test for transferability; compared favorably to predictions based only on mouse data
Translatable components regression (TransComp-R) [20]	Human gene expression data (pretreatment), human drug response data	Mouse proteomics data	Human gene expression (pretreatment) and drug response data (the latter are given, not to be predicted)	Mouse proteins (and corresponding pathway enrichments) with association to human drug response	Unsupervised (feature representation): PCA-based regression	MATLAB code available from https://de.mathworks.com/matlabcentral/fileexchange/77987-transcompr. Experimental verification of a gene predicted to be involved in resistance to treatment; apparently no other benchmarking
Pathway RespOnsive GENes (PROGENy) [21] and Discriminant Regulon Expression Analysis (DoRothEA) [22]	Two curated resources from human data, one of pathway perturbation footprints (PROGENy) and one of footprint regulons (transcription factor–target interactions; DoRothEA), plus human–mouse orthologs	The mouse equivalent of the source	Mouse gene expression data	Mouse pathway activity (PROGENy) or transcription factor activity and enrichment (DoRothEA)	Transductive: supervised prediction of mouse pathways (PROGENy) and regulons (DoRothEA); regression task	Both tools are available as R (Bioconductor) and Python packages; for usage examples see https://github.com/saezlab/transcriptutorial; no benchmarking is described by the authors
Adversarial Inductive Transfer Learning (AITL) [12]	In vitro (cell line) gene expression and quantitative outcome (IC50) data	In vivo (patient) gene expression and qualitative outcome (yes/no) data	In vitro gene expression data (GDSC)	In vivo outcomes (TCGA)	Inductive: adversarial domain adaptation and multi-task learning (predicting outcomes for both source and target) using deep neural nets; classification task in the target domain	Code available at https://github.com/hosseinshn/AITL; performance benchmarked against six other methods (see main text) and found to perform best
Patient Response Estimation Corrected by Interpolation of Subspace Embeddings (PRECISE) [24]	Gene expression data from preclinical models (cell lines, patient-derived xenografts) and drug response	Human gene expression data	Human gene expression data	Human drug response	Transductive: similarity-based identification of shared mechanisms between large datasets from preclinical models and a small number of human samples, focused on cancer; regression task	Available as a Python package; example protocols provided as Jupyter notebooks; see https://github.com/NKI-CCB/PRECISE; outperformed two state-of-the-art approaches (ridge regression on either the raw or ComBat-corrected gene expression data) in retrieving associations between known biomarkers and drug responses
Transfer variational autoencoder, trVAE [30]	Gene expression data (cell line) or image data (or similar) under a specific (first) condition	Gene expression data or image data (or similar) under a different (second) condition	Data under the first condition and a label specifying the second condition	Data transformed to the second condition	Transductive: based on an autoencoder neural net; regression-like task when applied to expression data	Available from https://github.com/theislab/trvae_reproducibility; benchmarked against six other tools (see main text) and found to perform best
MultiPlier [15]	Preprocessed disease-related datasets of human gene expression, highlighting LVs (characteristic patterns of correlated genes)	Human (rare disease) gene expression data	Human (rare disease) gene expression data	Characteristic expression patterns of correlated genes	Unsupervised (feature representation): constrained matrix factorization highlighting LVs, then projection of input into latent space; neither regression nor classification	PLIER is available at https://github.com/wgmao/PLIER; MultiPlier is available from https://github.com/greenelab/multi-plier with a summary of additional dependencies also described in the accompanying paper. A docker image is provided to reproduce the analyses; no benchmarking is described by the authors
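
The PROGENy/DoRothEA row of Table 1 describes transferring human-derived footprint resources to mouse via orthologs and then scoring mouse expression data. The following Python sketch illustrates that general idea only: gene weights learned on human data are carried over to mouse gene symbols, and an activity score is computed as a weighted sum of expression values. It is not the PROGENy or DoRothEA implementation and does not use their package APIs; the weights, the ortholog table, the expression values and the function names (transfer_weights, pathway_activity) are all hypothetical and chosen for illustration.

```python
# Minimal sketch of ortholog-based transfer of footprint weights (human -> mouse).
# All numbers, gene selections and function names below are illustrative
# assumptions, not values or APIs taken from PROGENy or DoRothEA.

# Hypothetical human footprint weights for a single pathway (gene -> weight).
HUMAN_WEIGHTS = {"CDKN1A": 2.0, "MDM2": 1.5, "BAX": 1.0}

# Hypothetical human-to-mouse ortholog table; BAX is left unmapped on purpose
# to show that genes without a known ortholog are dropped.
ORTHOLOGS = {"CDKN1A": "Cdkn1a", "MDM2": "Mdm2"}


def transfer_weights(human_weights, orthologs):
    """Re-key human gene weights by mouse symbol, dropping unmapped genes."""
    return {orthologs[gene]: weight
            for gene, weight in human_weights.items()
            if gene in orthologs}


def pathway_activity(expression, weights):
    """Weighted sum of expression over genes present in both inputs."""
    return sum(weight * expression[gene]
               for gene, weight in weights.items()
               if gene in expression)


mouse_weights = transfer_weights(HUMAN_WEIGHTS, ORTHOLOGS)

# Hypothetical normalized expression values for one mouse sample.
mouse_sample = {"Cdkn1a": 1.0, "Mdm2": -2.0, "Actb": 0.3}

score = pathway_activity(mouse_sample, mouse_weights)
print(mouse_weights)
print(score)
```

In this toy setting, genes lacking an ortholog or absent from the sample simply do not contribute to the score; real tools additionally normalize and scale scores across samples, which is omitted here.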