This figure gives an overview of the machine learning techniques that have been used to answer the drug discovery questions covered in this Review. A range of supervised learning techniques (classification and regression methods) are used to answer questions that require prediction of data categories or continuous variables, respectively, whereas unsupervised techniques are used to develop models that enable clustering of the data. ADME, absorption, distribution, metabolism and excretion; CNN, convolutional neural network; CT, computed tomography; DAEN, deep autoencoder neural network; DNN, deep neural network; GAN, generative adversarial network; MRI, magnetic resonance imaging; NLP, natural language processing; PK, pharmacokinetic; RNAi, RNA interference; RNN, recurrent neural network; SVM, support vector machine; SVR, support vector regression.