2021 Mar 12;22(6):2903. doi: 10.3390/ijms22062903

Table 2.

Challenges posed for ML and DL in biomedicine, existing strategies to overcome these challenges and proposed solutions by integrating ML techniques with established bioinformatics approaches.

| Problem | Bottleneck | Example Solutions | Potential Integrated ML/DL and Bioinformatics Solutions |
| --- | --- | --- | --- |
| Small and dependent datasets | Data availability | Restricting the number of parameters [27,190] | Neural network architectures for small and sparse datasets |
| | | Separating training and test sets by phylogenetic similarity [27] | Methods to evaluate data dependency by protein and sequence similarities |
| Biological sequence representation | Methodological | NLP with neural networks-based modeling [191,192,193,194] | Incorporating amino acid substitution and codon usage matrices to representation frameworks |
| | | | Incorporating conserved domain databases to the training framework |
| Incorporation of different data types | Methodological | | Integration of multi-omics datasets through existing network topologies |
| Reproducibility | Acceptance | Documentation and deposition of the processed data [195] | - |
| | | Benchmarking of the processing pipeline and optimized parameters [196] | - |
| Interpretability | Acceptance | Incorporation of established bioinformatic methods and databases with ML and DL frameworks [128,196] | |
| | | Generation of interpretable DL models [197,198,199] | |
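
The strategy listed in Table 2 of separating training and test sets by similarity can be illustrated with a minimal sketch. It is not the procedure used in the cited work: the `similarity` function below uses Python's `difflib.SequenceMatcher` as a crude stand-in for alignment-based or phylogenetic similarity, and the function names, the 0.7 threshold, and the smallest-clusters-first assignment are all assumptions made for illustration. The idea is only that related sequences are clustered first, and whole clusters are then assigned to either the training or the test set.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Crude stand-in for alignment-based sequence identity (an assumption)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cluster_sequences(seqs: list[str], threshold: float = 0.7) -> list[int]:
    """Single-linkage clustering: pairs at or above the threshold share a cluster."""
    parent = list(range(len(seqs)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if similarity(seqs[i], seqs[j]) >= threshold:
            parent[find(i)] = find(j)
    # Each sequence is labeled by the root index of its cluster.
    return [find(i) for i in range(len(seqs))]


def grouped_split(seqs: list[str], test_fraction: float = 0.3,
                  threshold: float = 0.7) -> tuple[list[int], list[int]]:
    """Assign whole clusters to train or test so similar sequences never straddle the split."""
    labels = cluster_sequences(seqs, threshold)
    clusters: dict[int, list[int]] = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    # Fill the test set with the smallest clusters first, up to the requested size.
    target = round(test_fraction * len(seqs))
    train: list[int] = []
    test: list[int] = []
    for members in sorted(clusters.values(), key=len):
        if len(test) + len(members) <= target:
            test.extend(members)
        else:
            train.extend(members)
    return sorted(train), sorted(test)


if __name__ == "__main__":
    # Hypothetical toy peptides: two families plus one unrelated sequence.
    toy = ["MKTAYIAKQR", "MKTAYIAKQK", "MKTAYLAKQR",
           "GSHMLEDPVV", "GSHMLEDPVA", "WWPPCCNNDD"]
    train_idx, test_idx = grouped_split(toy, test_fraction=0.5)
    print("train:", train_idx, "test:", test_idx)
```

In practice the similarity measure would come from an alignment or clustering tool, and the resulting cluster labels could instead be passed as the `groups` argument to a group-aware splitter such as scikit-learn's `GroupShuffleSplit`.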