2019 Aug 26;47:607–615. doi: 10.1016/j.ebiom.2019.08.027

Table 1.

Recent ML tools and applications in various aspects of translational medicine with the key results and challenges faced by each application of ML.

| Category | Application | ML technique(s) | Key result(s) or advantage(s) | Challenge(s) |
|---|---|---|---|---|
| Drug discovery | Designing chemical compounds (retrosynthetic process) | Deep neural networks (DNNs) and Monte Carlo tree search [9] | 30× quicker than traditional computer-aided methods [70] | (1) Scarcity of training data; (2) Stronger, but slower-reasoning, algorithms should be developed for this application |
| | Designing chemical compounds (de novo drug design) | Deep recurrent neural network (RNN) [6] | Generates isofunctional, new chemical entities | Requires predicted bioactivity that has been appropriately validated |
| | | Generative deep learning (based on RNNs) [7] | Does not require similarity searching or external scoring; new molecular structures are generated immediately | User has to decide when training should be stopped |
| | | Reinforcement Learning for Structural Evolution (ReLeaSE; 2 DNNs, generative and predictive) [8] | Simpler to use than traditional methods | Only available for a single-task regime; needs to be extended to optimise several target properties together |
| | Drug screening | Random Forest and ChemVec [11] | Highest accuracy when compared to 3 other algorithms | (1) Improve feature representation using deep learning; (2) Experimental validation required |
| Imaging | Cell microscopy and histopathology | Bayesian matrix factorisation method, Macau [25] | Predictive performance comparable with that of DNNs | (1) Current results for this method are based on a single HTI screen; (2) Requires an adequately sized library of compounds for training the model |
| | | Gradient Boosting [27] | Reduces disturbances to the cells, making sample preparation quicker and cheaper | Deep learning techniques should be tried to improve the model |
| | Defining relationships between morphology and genomic features | Inception v3 (based on convolutional neural networks) [22] | Capable of distinguishing between 3 types of histopathological images and predicting the mutational status of 6 genes | Current data may not fully represent the heterogeneity of tissues |
| Genomic medicine | Biomarker discovery | Elastic net regression [33] | Identified BRAF and NRAS mutations in cell lines as being among the top predictors of sensitivity to a MEK inhibitor | Technique does not allow for comparison between drugs |
| | | Unsupervised hierarchical clustering (part of ACME analysis) [30] | Identified an association between BRAF-mutant cell lines of the skin lineage and sensitivity to the MEK inhibitor | (1) Distance metric and linkage criteria must be specified; (2) Does not scale well |
| | | Spectral clustering by Similarity Network Fusion (SNF) [34] | Identification of new tumour subtypes by utilising mRNA and methylation signatures | Prospective studies required to determine accuracy |
| | Integrating different modalities of data | iCluster [44] | Identified, in an automated way, potentially novel subtypes of breast and lung cancers on top of subgroups characterised by concordant DNA copy number alterations and gene expression | Only focuses on array data |
| | | Kernel Learning Integrative Clustering (KLIC) [46] | Compared to Cluster-Of-Cluster Analysis (COCA), KLIC carries more detailed information from each dataset into the final clustering step and can merge datasets with different levels of noise, giving more weight to the more significant ones | Only tested on simulated datasets |
| | | Spectral clustering by SNF [43] | Identification of new medulloblastoma subtypes | (1) Larger cohort size needed for validation; (2) Current analysis of samples is bulk analysis |
| | | Affinity Network Fusion (ANF) and semi-supervised learning [47] | Performs similarly to or better than SNF, is less computationally demanding and generalises better | Results on four cancer types only (known disease types) and not yet validated on additional experimental data |
| | | Clusternomics [48] | Outperforms existing methods and derives clusters with clinical meaning and significant differences in survival outcomes when tested on real-world data [44,49–51] | Comparison of performance to other methods on real-world data |
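
The two challenges listed for unsupervised hierarchical clustering in Table 1 (the distance metric and linkage criterion must be specified, and the method does not scale well) can be illustrated with a minimal sketch. This is a naive, standard-library-only agglomerative clustering written for illustration; it is not the implementation used in the ACME analysis [30]. The function names, the `linkage` options and the one-dimensional toy values are all hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Straight-line distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences (an alternative metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def agglomerative(points, n_clusters, metric=euclidean, linkage="average"):
    """Naive agglomerative clustering; returns clusters as lists of point indices.

    Both `metric` and `linkage` must be chosen by the user, which is the
    first challenge noted in the table.
    """
    # Linkage turns point-to-point distances into a cluster-to-cluster distance.
    reducers = {
        "single": min,                           # closest pair of members
        "complete": max,                         # farthest pair of members
        "average": lambda d: sum(d) / len(d),    # mean over all member pairs
    }
    reduce_fn = reducers[linkage]
    clusters = [[i] for i in range(len(points))]

    # Every merge rescans all cluster pairs and all member pairs, so the cost
    # grows steeply with the number of samples (the second challenge).
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            dists = [metric(points[p], points[q])
                     for p in clusters[i] for q in clusters[j]]
            d = reduce_fn(dists)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return [sorted(c) for c in clusters]


# Hypothetical 1-D values standing in for per-sample measurements.
points = [(0.0,), (1.0,), (2.1,), (3.3,), (5.0,)]

print(agglomerative(points, 2, linkage="single"))    # [[0, 1, 2, 3], [4]]
print(agglomerative(points, 2, linkage="complete"))  # [[0, 1], [2, 3, 4]]
```

On the same five values, single linkage chains the first four samples together, while complete linkage splits them into a pair and a triple, so the reported grouping depends on a choice the analyst has to make up front. Passing `metric=manhattan` swaps the distance in the same way.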