Abstract
Confidence predictors can deliver predictions with the associated confidence required for decision making and can play an important role in drug discovery and toxicity predictions. In this work we investigate a recently introduced version of conformal prediction, synergy conformal prediction, focusing on the predictive performance when applied to bioactivity data. We compare the performance to other variants of conformal predictors for multiple partitioned datasets and demonstrate the utility of synergy conformal predictors for federated learning where data cannot be pooled in one location. Our results show that synergy conformal predictors based on training data randomly sampled with replacement can compete with other conformal setups, while using completely separate training sets often results in worse performance. However, in a federated setup where no method has access to all the data, synergy conformal prediction is shown to give promising results. Based on our study, we conclude that synergy conformal predictors are a valuable addition to the conformal prediction toolbox.
Supplementary Information
The online version contains supplementary material available at 10.1186/s13321-021-00555-7.
Keywords: Conformal prediction, Federated learning, Confidence, Machine learning
Introduction
Confidence predictors [1], such as conformal predictors, have been demonstrated to have several properties that make them useful for predictive tasks in drug discovery and other biomedical research [2]. Well-calibrated models with defined uncertainties can facilitate decision making and have been identified as an important area of development [3, 4].
Conformal predictors allow predictions to be made at a pre-set confidence level, with errors guaranteed not to exceed that level. This is achieved under only mild conditions. Both transductive [5] and inductive conformal predictors [6] (ICP) have been described, but we will focus on ICP in this study. The basis of an ICP is that a calibration set is used to relate new predictions to calibration instances with known labels. The conformal predictor then outputs a prediction region based on the calibration results and the selected confidence level. For example, a prediction set for a binary classification has four possible outcomes: no prediction, either of the two labels, or both labels. For details on how this is achieved we direct the reader to Norinder et al. [7] and Alvarsson et al. [8]. Reviews on the application of conformal prediction in the field of cheminformatics are also available [2, 3]. Conformal predictors can also be calibrated for each class separately, in which case they are called Mondrian conformal predictors. Mondrian conformal predictors have been shown not only to give the expected error rate for each class independently, but also to give excellent performance for imbalanced data [9, 10].
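To make the ICP mechanics concrete, the following is a minimal sketch (not the authors' code; the function names and score values are hypothetical) of how a Mondrian ICP turns class-specific calibration nonconformity scores into a prediction set at a given significance level:

```python
# Illustrative Mondrian ICP sketch. Nonconformity scores are assumed to be
# higher for examples that conform less to a class; the smoothed p-value
# variant is omitted for brevity.

def mondrian_p_value(cal_scores, test_score):
    """p-value of a test nonconformity score against the calibration
    scores of one class."""
    n = len(cal_scores)
    return (sum(s >= test_score for s in cal_scores) + 1) / (n + 1)

def prediction_set(cal_scores_by_class, test_scores_by_class, significance):
    """Include every label whose p-value exceeds the significance level."""
    return {
        label for label in cal_scores_by_class
        if mondrian_p_value(cal_scores_by_class[label],
                            test_scores_by_class[label]) > significance
    }
```

At significance 0.2, a label is included whenever the test compound's nonconformity score does not rank among the most extreme roughly 20% of that class's calibration scores, which is what yields the single-label, both-label, or empty prediction sets described above.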
When evaluating conformal predictors, two key metrics are validity and efficiency. Validity measures the fraction of predictions containing the correct label, while efficiency measures the fraction of predictions containing only one label (or, in the case of regression, the width of the prediction region). The properties of conformal prediction guarantee that validity is always achieved as long as the conditions are met. It is generally desirable to have as high efficiency as possible to maximise the utility of the predictions.
Several different approaches have been described for conformal prediction. The baseline ICP method uses fixed, predefined training and calibration sets. Commonly, this process is repeated multiple times with different splits between training and calibration data, and the p-values averaged, in what is called an aggregated conformal predictor (ACP) [11, 12]. This has the advantage that the prediction becomes less sensitive to the split between training and calibration data. However, while ACPs have empirically been shown in many applications to generate valid conformal predictors (an error rate not exceeding the set confidence level) [13, 14], they have not been theoretically proven to be valid.
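The ACP aggregation step can be sketched as follows, assuming the per-split p-values have already been computed by an ICP for each train/calibration split (an illustrative helper, not the published implementation):

```python
from statistics import mean

def aggregate_p_values(p_values_per_split):
    """ACP-style aggregation: average the per-split p-values for each label.
    p_values_per_split: list of {label: p} dicts, one per train/calibration
    split."""
    labels = p_values_per_split[0].keys()
    return {label: mean(ps[label] for ps in p_values_per_split)
            for label in labels}
```

The averaged p-values are then thresholded against the significance level exactly as in a single ICP.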
Recently, a new type of conformal predictor, called a synergy conformal predictor (SCP), has been introduced for classification [15] and regression problems [16]. In this approach, the nonconformity scores from several different predictors are aggregated to construct a conformal predictor using a shared calibration set. This approach has been shown to satisfy the requirements for theoretical validity. SCP has previously been applied to toxicity predictions [17], but applications to other cheminformatics problems have to our knowledge not been reported, and a systematic evaluation of SCP in cheminformatics is not available.
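A minimal sketch of the SCP aggregation idea, under the assumption that each model scores both the shared calibration set and the test example, and that the per-model nonconformity scores are combined by simple averaging (the function name and weighting are illustrative, not the authors' implementation):

```python
from statistics import mean

def synergy_p_value(cal_scores_per_model, test_scores_per_model):
    """SCP-style aggregation sketch: average each calibration example's
    nonconformity score across models, average the test scores likewise,
    then compute an ordinary conformal p-value against the shared
    calibration set.
    cal_scores_per_model: list over models of equal-length score lists."""
    cal_agg = [mean(scores) for scores in zip(*cal_scores_per_model)]
    test_agg = mean(test_scores_per_model)
    n = len(cal_agg)
    return (sum(s >= test_agg for s in cal_agg) + 1) / (n + 1)
```

Because only one shared calibration set is ranked against, the usual conformal validity argument carries over to the aggregated score, which is what distinguishes SCP from averaging p-values as in ACP.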
Key aspects of the different conformal predictors are shown schematically in Fig. 1. While the basic principle remains the same, the key difference between the conformal predictors is the strategy used to split the data. Splitting the training data into smaller individual sets for SCP risks decreasing the predictive performance of the model compared to approaches trained on the full training set. However, the disjoint training sets allow for applications such as federated learning [18] or distributed training, which are not possible with other conformal methods that require access to all the available training data.
Fig. 1.
Outline of the different conformal prediction algorithms used in this study. Split percentages and number of repeats reflect the methods used in this study. The difference between the algorithms lies in the way the data is split. Note that an ICP is equivalent to an ACP with just one split
Federated learning is the process where several parties jointly train a machine learning model while keeping their respective data local and private [19]. Federated learning can therefore help overcome issues related to confidentiality or privacy of data while still generating models based on a large amount of data.
Previous work has shown that prediction intervals from multiple non-disclosed datasets can be integrated by aggregating conformal p-values, but without a guarantee of validity [20]. Applying SCP for federated learning is also convenient as it is a rigorously defined framework for aggregating the results from multiple sources. However, the aggregation still requires access to a shared calibration set.
SCP can also be used to construct predictor ensembles with overlapping training data, as long as the calibration set remains the same. This allows each split to contain sufficient training data to generate well-performing models regardless of the number of splits used, and might allow for more efficient models than a single ICP predictor while still maintaining the guaranteed error rate, as SCP methods have been shown to be theoretically valid.
In this study, we compare the performance of SCP with that of ICP and ACP on large-scale bioactivity datasets. We also explore potential applications of SCP in federated learning.
Results and discussion
To evaluate SCP for bioactivity data, two sets of PubChem data, described by two sets of descriptors, were used. These datasets have previously been used for machine learning evaluations [21, 22]. We compared the performance of SCP with five or ten splits (SCP 5 and SCP 10), SCP with ten random overlapping splits (RSCP 10), ACP with ten aggregations (ACP 10), and ICP. The results were evaluated mainly using model efficiency, defined as the fraction of single label predictions, because we expect all conformal predictors to give valid models, that is, models with an error rate corresponding to the set significance level. See the methods section for more detail on these metrics. Efficiency for all methods is shown in Figs. 2, 3, 4 and 5, along with pairwise comparisons for statistically significant differences (Wilcoxon signed-rank test). All methods produced valid models (see Additional files 1 and 2).
Fig. 2.
Top panels: efficiency for the active class for Set 1 using the different conformal predictors at a range of significance levels (0.1–0.3). Results for RDKit descriptors left and fingerprints right. Bottom panels: pairwise comparison (Wilcoxon signed-rank test with Bonferroni correction for multiple testing, 0.05 significance level, across all significance levels and datasets) of methods on rows with methods on columns, significantly better result is indicated in blue, significantly worse result in red. p-values are indicated in the figure. For example, in the bottom left panel we can see that SCP 10 is significantly worse than all other methods it is compared with
Fig. 3.
Top panels: efficiency for the inactive class for Set 1 using the different conformal predictors at a range of significance levels (0.1–0.3). Results for RDKit descriptors left and fingerprints right. Bottom panels: pairwise comparison (Wilcoxon signed-rank test with Bonferroni correction for multiple testing, 0.05 significance level, across all significance levels and datasets) of methods on rows with methods on columns, significantly better result is indicated in blue, significantly worse result in red. p-values are indicated in the figure
Fig. 4.
Top panels: efficiency for the active class for Set 2 using the different conformal predictors at a range of significance levels (0.1–0.3). Results for RDKit descriptors left and fingerprints right. Bottom panels: pairwise comparison (Wilcoxon signed-rank test with Bonferroni correction for multiple testing, 0.05 significance level, across all significance levels and datasets) of methods on rows with methods on columns, significantly better result is indicated in blue, significantly worse result in red. p-values are indicated in the figure
Fig. 5.
Top panels: efficiency for the inactive class for Set 2 using the different conformal predictors at a range of significance levels (0.1–0.3). Results for RDKit descriptors left and fingerprints right. Bottom panels: pairwise comparison (Wilcoxon signed-rank test with Bonferroni correction for multiple testing, 0.05 significance level, across all significance levels and datasets) of methods on rows with methods on columns, significantly better result is indicated in blue, significantly worse result in red. p-values are indicated in the figure
Overall, all the methods follow a similar pattern for the efficiencies and there are no dramatic differences, which is also evident from the fact that most of the comparisons did not produce a statistically significant difference in performance. However, ICP and RSCP tend to deliver slightly more efficient models at the higher confidence levels. This can be rationalized by ACP's tendency to produce slightly overconservative models (error rates below the set significance level) with a resulting loss in efficiency. For SCP 5 and SCP 10, the division of the training data is likely the cause of the lower efficiency; this is also supported by the overall lower efficiency of SCP 10.
Despite the somewhat lower efficiency of the SCP models, our results indicate that they can still generate well-performing models, especially when the training data is not divided into too many partitions, as seen from the generally better performance of SCP 5 compared to SCP 10. In situations where a single joint training set is not available, either for technical reasons (aggregating a large amount of, for example, image data might be challenging) or because data cannot be shared between collaborators for reasons of confidentiality, SCP offers an option where models can be trained in a distributed fashion and the results joined together through a common calibration set.
The RSCP method overall produced more efficient models than SCP 5 and SCP 10 and can be a good alternative to ACP when the theoretical validity of the models is an important consideration or when ACP's tendency to generate overconservative models is undesirable. However, the need to draw random samples from the entire available training data means that the opportunities for distributed learning are lost for RSCP.
To investigate the potential utility of SCP for federated or distributed learning, we compared the results from modelling the individual parts of the training sets and using the average prediction (INDICP 5 and INDSCP 5) with the aggregated results for SCP 5. We elected to use the SCP 5 models as these had consistently better performance than SCP 10. This reflects a scenario where data cannot be pooled to train one model, and without federation the models would only have access to parts of the data, one fifth in this case. The average performance of the individual models compared to the federated model is shown in Figs. 6 and 7. Clearly, having access to more data in total improves the federated model compared to the individual models trained on only parts of the data. These results show promise for SCP in federated learning. However, additional studies are required to benchmark SCP against other federated learning approaches.
Fig. 6.
Distribution of efficiency for the individual models compared to the federated for Set 1. RDKit descriptors on top row and FP bottom, active left and inactive right. Statistically significant differences are indicated (Wilcoxon signed-rank test with Bonferroni correction for multiple testing, 0.05 significance level)
Fig. 7.
Distribution of efficiency for the individual models compared to the federated for Set 2. RDKit descriptors on top row and FP bottom, active left and inactive right. Statistically significant differences are indicated (Wilcoxon signed-rank test with Bonferroni correction for multiple testing, 0.05 significance level)
Overall, our study supports the previously published results on SCP and expands them to bioactivity prediction [15, 16]. In this study we employed Random Forest as the underlying model, coupled with either molecular descriptors from RDKit or Morgan fingerprints. However, due to the flexible framework of conformal prediction, any underlying method and descriptor can be used, allowing for easy conversion of already validated prediction setups. This is especially useful for federated learning, since each participant can use their preferred model and descriptor type independently of what the other participants use.
Conclusions
We have demonstrated that synergy conformal predictors can achieve predictive performance on par with ICP and ACP methods. The same type of benefit that has been observed for other Mondrian conformal predictors on heavily imbalanced data also holds for SCP, and the minority class is well predicted.
Since disjoint training sets can be joined through a shared calibration set, SCP has the potential to unlock conformal prediction, and thus predictions with a defined error rate, in situations where data is difficult to aggregate for one model and for applications in federated learning. Our results indicate that good performance can be obtained from such models.
In summary, SCP is a useful addition to the conformal prediction toolbox and can complement other methods in situations where theoretical validity is paramount or where distributed training is desired.
Methods
Datasets
Two different sets of data, both originating from PubChem [23], were used in this analysis; they were previously employed and reported on in references [21] (Set 1) and [22] (Set 2). The AID and number of compounds for each dataset are shown in Table 1. The compiled datasets both include data from AID 2314; however, differences in how these datasets were curated mean that the number of compounds included differs.
Table 1.
Datasets used in this study. Note that some of the assays use complex readouts that might not uniquely query the assigned target; see the full PubChem descriptions for details
| AID | PubChem assay description | Inactive | Active |
|---|---|---|---|
| Set 1 | |||
| 411 | qHTS Assay for Inhibitors of Firefly Luciferase | 68,948 | 1555 |
| 868 | Screen for Chemicals that Inhibit the RAM Network | 190,834 | 3545 |
| 1030 | qHTS Assay for Inhibitors of Aldehyde Dehydrogenase 1 (ALDH1A1) | 145,732 | 15,914 |
| 1460 | qHTS for Inhibitors of Tau Fibril Formation, Thioflavin T Binding | 45,834 | 1189 |
| 1721 | qHTS Assay for Inhibitors of Leishmania Mexicana Pyruvate Kinase (LmPK) | 289,529 | 1087 |
| 2314 | Cycloheximide Counterscreen for Small Molecule Inhibitors of Shiga Toxin | 258,344 | 36,955 |
| 2326 | qHTS Assay for Inhibitors of Influenza NS1 Protein Function | 259,823 | 1067 |
| 2451 | qHTS Assay for Inhibitors of Fructose-1,6-bisphosphate Aldolase from Giardia Lamblia | 272,893 | 2016 |
| 2551 | qHTS for inhibitors of ROR gamma transcriptional activity | 253,192 | 16,632 |
| 485290 | qHTS Assay for Inhibitors of Tyrosyl-DNA Phosphodiesterase (TDP1) | 337,970 | 953 |
| 485314 | qHTS Assay for Inhibitors of DNA Polymerase Beta | 312,599 | 4491 |
| 504444 | Nrf2 qHTS screen for inhibitors | 283,351 | 7406 |
| Set 2 | |||
| 1814 | MLPCN Alpha-Synuclein 5'UTR—5'-UTR binding—activators | 40,780 | 16,112 |
| 2314 | Cycloheximide Counterscreen for Small Molecule Inhibitors of Shiga Toxin | 30,586 | 26,306 |
| 2796 | Luminescence-based primary cell-based high throughput screening assay to identify activators of the Aryl Hydrocarbon Receptor (AHR) | 51,322 | 5570 |
| 463190 | uHTS identification of small molecule inhibitors of tim10-1 yeast via a luminescent assay | 52,443 | 4449 |
| 485346 | uHTS for identification of Inhibitors of Mdm2/MdmX interaction in luminescent format | 51,461 | 5431 |
| 504652 | Antagonist of Human D 1 Dopamine Receptor: qHTS | 50,420 | 6472 |
| 588726 | Fluorescence-based biochemical primary high throughput screening assay to identify inhibitors of the fructose-bisphosphate aldolase (FBA) of M. tuberculosis | 51,858 | 5034 |
| 652054 | qHTS of D3 Dopamine Receptor Antagonist: qHTS | 51,857 | 5035 |
| 687014 | Luminescence-based cell-based primary high throughput screening assay to identify agonists of the DAF-12 from the parasite H. glycines (hgDAF-12) | 52,572 | 4320 |
| 743279 | qHTS for Inhibitors of Inflammasome Signaling: IL-1-beta AlphaLISA Primary Screen | 47,459 | 9433 |
The chemical structures were standardized using the IMI eTOX project standardizer [24] in order to generate consistent compound representations and then further subjected to tautomer standardization using the MolVS standardizer [25]. Activity was assigned according to the PubChem annotation, and compounds with ambiguous activity were discarded.
A set of 97 physicochemical/structural feature descriptors, previously used in studies with good results [13, 26], was calculated using RDKit version 2018.09.1.0 [27]. A second descriptor set, comprising Morgan fingerprints [28] with radius 4 hashed onto a binary feature vector of length 1024, was also calculated using RDKit.
The data sets were randomly divided into a training set (80%) and a test set (20%).
Study design
Four different Mondrian conformal prediction protocols (outlined in Fig. 1) were used to derive in silico models for the data sets:
ICP.
Aggregated Conformal Prediction (ACP) using 10 randomly selected pairs of proper training and calibration sets (ACP 10).
Synergy Conformal Prediction (SCP) using a randomly selected calibration set and a random five- or ten-fold division of the proper training set into mutually exclusive subsets (SCP 5 and SCP 10).
Synergy Conformal Prediction using a randomly selected calibration set and 10 randomly selected subsets (70%) of the proper training set (RSCP 10). This selection allows duplication of instances between proper training sets.
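The SCP and RSCP split strategies above can be sketched as follows (illustrative index bookkeeping only, with hypothetical function names; the actual study used its own data handling):

```python
import random

def scp_splits(indices, n_parts, seed=0):
    """Disjoint proper-training subsets for SCP: shuffle the indices and
    deal them out into n_parts mutually exclusive parts."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]

def rscp_splits(indices, n_models=10, fraction=0.7, seed=0):
    """Overlapping subsets for RSCP: each model gets a random 70% sample
    of the proper training set, so instances may recur across subsets."""
    rng = random.Random(seed)
    k = int(len(indices) * fraction)
    return [rng.sample(list(indices), k) for _ in range(n_models)]
```

In both cases the calibration set is drawn once, before splitting, and shared by all sub-models.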
Additionally, for comparison with federated models, we also applied ICP and SCP to each training set part separately and merged the results from the 5 parts (INDICP 5 and INDSCP 5) into one file of predicted p-values each. Since the comparison, as noted above, was made to SCP 5, each training set was split into 5 parts.
All underlying models were built using the RandomForestClassifier in Scikit-learn [29] version 0.20.4 with default parameters (100 estimators), which has previously been shown to be a robust and accurate methodology for bioactivity prediction [30, 31].
Method evaluation
As introduced above, conformal predictions are typically evaluated by calculating the validity and efficiency of the predictors. In this study we define validity as the fraction of predictions that include the correct label and efficiency as the fraction of single label predictions. Since conformal predictors should be valid, focus is generally on efficiency, as a more efficient predictor will produce more useful output. For a more in-depth explanation of conformal prediction and its validation, see Norinder et al. [7].
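The two metrics as defined here can be sketched directly (hypothetical helper names), with each prediction represented as a Python set of labels:

```python
def validity(prediction_sets, true_labels):
    """Fraction of prediction sets containing the correct label;
    empty sets always count as errors."""
    hits = sum(t in p for p, t in zip(prediction_sets, true_labels))
    return hits / len(true_labels)

def efficiency(prediction_sets):
    """Fraction of single-label predictions."""
    return sum(len(p) == 1 for p in prediction_sets) / len(prediction_sets)
```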
Statistical test
A Wilcoxon signed-rank test (significance level 0.05) with Bonferroni correction for multiple testing was used to determine statistical significance between the conformal prediction methods. Methods were compared across all datasets and significance levels.
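The Bonferroni step can be sketched as follows (a minimal illustration of the correction only; the Wilcoxon test itself would come from a statistics package):

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Bonferroni correction: a comparison is significant when its raw
    p-value falls below alpha divided by the number of comparisons."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]
```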
Supplementary Information
Additional file 1. Plots of model efficiency.
Additional file 2. Tabulated model results.
Authors' contributions
UN, OS, and FS jointly designed the study. UN completed the computations. FS drafted the initial manuscript which was edited by all authors. All authors read and approved the final manuscript.
Funding
Open access funding provided by Uppsala University. The ARUL UCL DDI is funded by Alzheimer’s Research UK (ARUK) (560832). OS acknowledges funding from the Swedish Foundation for Strategic Research (Grants BD15-0008 and SB16-0046).
Availability of data and materials
The datasets supporting the conclusions of this article are available in the PubChem repository, see Table 1 for identifiers. Code for the conformal predictors is available from GitHub https://github.com/FredrikSvenssonUK/SCP.
Declarations
Competing interests
OS is co-founder of Scaleout Systems AB, a Swedish company involved in federated learning.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Ola Spjuth, Email: Ola.Spjuth@farmbio.uu.se.
Fredrik Svensson, Email: f.svensson@ucl.ac.uk.
References
- 1.Vovk V, Gammerman A, Shafer G. Algorithmic learning in a random world. New York: Springer; 2005. pp. 1–324. [Google Scholar]
- 2.Cortés-Ciriano I, Bender A (2021) Concepts and applications of conformal prediction in computational drug discovery. In: Artificial intelligence in drug discovery, the royal society of chemistry, pp 63–101
- 3.Mervin LH, Johansson S, Semenova E, et al. Uncertainty quantification in drug design. Drug Discov Today. 2020 doi: 10.1016/j.drudis.2020.11.027. [DOI] [PubMed] [Google Scholar]
- 4.Lombardo F, Desai PV, Arimoto R, et al. In Silico absorption, distribution, metabolism, excretion, and pharmacokinetics (ADME–PK): utility and best practices. an industry perspective from the International Consortium for innovation through quality in pharmaceutical development. J Med Chem. 2017;60:9097–9113. doi: 10.1021/acs.jmedchem.7b00487. [DOI] [PubMed] [Google Scholar]
- 5.Vovk V. Transductive conformal predictors. In: Papadopoulos H, Andreou AS, Iliadis L, Maglogiannis I, editors. BT–artificial intelligence applications and innovations. Berlin: Springer; 2013. pp. 348–360. [Google Scholar]
- 6.Papadopoulos H. Inductive conformal prediction: theory and application to Neural networks. In: Fritzsche P, editor. Tools in artificial intelligence. London: InTech; 2008. [Google Scholar]
- 7.Norinder U, Carlsson L, Boyer S, Eklund M. Introducing conformal prediction in predictive modeling. a transparent and flexible alternative to applicability domain determination. J Chem Inf Model. 2014;54:1596–1603. doi: 10.1021/ci5001168. [DOI] [PubMed] [Google Scholar]
- 8.Alvarsson J, Arvidsson McShane S, Norinder U, Spjuth O. Predicting with confidence: using conformal prediction in drug discovery. J Pharm Sci. 2021;110:42–49. doi: 10.1016/j.xphs.2020.09.055. [DOI] [PubMed] [Google Scholar]
- 9.Löfström T, Boström H, Linusson H, Johansson U. Bias reduction through conditional conformal prediction. Intell Data Anal. 2015;19:1355–1375. doi: 10.3233/IDA-150786. [DOI] [Google Scholar]
- 10.Norinder U, Boyer S. Binary classification of imbalanced datasets using conformal prediction. J Mol Graph Model. 2017;72:256–265. doi: 10.1016/j.jmgm.2017.01.008. [DOI] [PubMed] [Google Scholar]
- 11.Carlsson L, Eklund M, Norinder U, et al. Aggregated conformal prediction. In: Iliadis L, Maglogiannis I, Papadopoulos H, et al., editors. Artificial intelligence applications and innovations: AIAI 2014 workshops: CoPA, MHDW, IIVC, and MT4BD, Rhodes, Greece, September 19–21, 2014. Proceedings. Berlin: Springer International Publishing; 2014. [Google Scholar]
- 12.Vovk V. Cross-conformal predictors. Ann Math Artif Intell. 2015;74:9–28. doi: 10.1007/s10472-013-9368-4. [DOI] [Google Scholar]
- 13.Svensson F, Norinder U, Bender A. Modelling compound cytotoxicity using conformal prediction and PubChem HTS data. Toxicol Res (Camb) 2017;6:73–80. doi: 10.1039/C6TX00252H. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Bosc N, Atkinson F, Felix E, et al. Large scale comparison of QSAR and conformal prediction methods and their applications in drug discovery. J Cheminform. 2019;11:4. doi: 10.1186/s13321-018-0325-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Gauraha N, Spjuth O. Synergy conformal prediction. Department of Pharmaceutical Biosciences, Faculty of Pharmacy, Disciplinary Domain of Medicine and Pharmacy, Uppsala University
- 16.Gauraha N, Spjuth O (2021) Synergy conformal prediction for regression. In: Proceedings of the 10th International conference on pattern recognition applications and methods - Volume 1, ICPRAM, SciTePress, pp 212–221
- 17.Morger A, Svensson F, Arvidsson McShane S, et al. Assessing the calibration in toxicological in vitro models with conformal prediction. J Cheminform. 2021;13:35. doi: 10.1186/s13321-021-00511-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Sheller MJ, Edwards B, Reina GA, et al. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data. Sci Rep. 2020;10:12598. doi: 10.1038/s41598-020-69250-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Kairouz P, McMahan HB, Avent B, et al (2021) Advances and Open Problems in Federated Learning. ArXiv. https://arxiv.org/abs/1912.04977
- 20.Spjuth O, Brännström RC, Carlsson L, Gauraha N. Combining prediction intervals on multi-source non-disclosed regression datasets. In: Gammerman A, Vovk V, Luo Z, Smirnov E, editors. Proceedings of the eighth symposium on conformal and probabilistic prediction and applications. Bulgaria: Proc Mach Learn Res; 2019. [Google Scholar]
- 21.Svensson F, Afzal AM, Norinder U, Bender A. Maximizing gain in high-throughput screening using conformal prediction. J Cheminform. 2018 doi: 10.1186/s13321-018-0260-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Norinder U, Svensson F. Multitask modeling with confidence using matrix factorization and conformal prediction. J Chem Inf Model. 2019;59:1598–1604. doi: 10.1021/acs.jcim.9b00027. [DOI] [PubMed] [Google Scholar]
- 23.Wang Y, Xiao J, Suzek TO, et al. PubChem’s bioassay database. Nucleic Acids Res. 2012;40:D400–D412. doi: 10.1093/nar/gkr1132. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.IMI eTOX project standardizer, version 017. https://pypi.python.org/pypi/standardiser
- 25.MolVS standardizer, version 009. https://pypi.python.org/pypi/MolVS
- 26.Svensson F, Norinder U, Bender A. Improving screening efficiency through iterative screening using docking and conformal prediction. J Chem Inf Model. 2017;57:439–444. doi: 10.1021/acs.jcim.6b00532. [DOI] [PubMed] [Google Scholar]
- 27.RDKit: Open-source cheminformatics, version 2018.09.1.0. http://www.rdkit.org
- 28.Rogers D, Hahn M. Extended-connectivity fingerprints. J Chem Inf Model. 2010;50:742–754. doi: 10.1021/ci100050t. [DOI] [PubMed] [Google Scholar]
- 29.Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825–2830. doi: 10.1007/s13398-014-0173-7.2. [DOI] [Google Scholar]
- 30.Kensert A, Alvarsson J, Norinder U, Spjuth O. Evaluating parameters for ligand-based modeling with random forest on sparse data sets. J Cheminform. 2018;10:49. doi: 10.1186/s13321-018-0304-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Svensson F, Aniceto N, Norinder U, et al. Conformal regression for quantitative structure-activity relationship modelling–quantifying prediction uncertainty. J Chem Inf Model. 2018;58:1132–1140. doi: 10.1021/acs.jcim.8b00054. [DOI] [PubMed] [Google Scholar]