Figure - PMC

2021 Jun 24;22(6):bbab225. doi: 10.1093/bib/bbab225

© The Author(s) 2021. Published by Oxford University Press.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Number of training complexes (red curve) plotted against the protein structure similarity cutoff (left column), ligand fingerprint similarity cutoff (center column) and pocket topology dissimilarity cutoff (right column) relative to the CASF-2016 test set. Two directions are shown: starting from a small training set comprising the complexes most dissimilar to the test set (top row; the ds direction), or starting from a small training set comprising the complexes most similar to the test set (bottom row; the sd direction). In the top row, the histograms show the number of additional complexes added to the training set when the protein structure similarity cutoff is incremented by a step of 0.01 (left), when the ligand fingerprint similarity cutoff is incremented by 0.01 (center), or when the pocket topology dissimilarity cutoff is decremented by 0.2 (right). In the bottom row, the histograms show the number of additional complexes added when the protein structure similarity cutoff is decremented by a step of 0.01 (left), when the ligand fingerprint similarity cutoff is decremented by 0.01 (center), or when the pocket topology dissimilarity cutoff is incremented by 0.2 (right). Hence the number of training complexes at any point on the red curve equals the cumulative sum of the heights of all bars at and before the corresponding cutoff. By definition, the histograms in the three bottom-row subfigures are identical to those in the top row mirrored about the median cutoff, but the cumulative curves differ. The raw values of this figure are available at https://github.com/cusdulab/MLSF.
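
The relationship between the bars and the red curve can be illustrated with a minimal Python sketch. It is not taken from the MLSF repository: the function name `cumulative_training_sizes` and the bar heights are hypothetical, and the bins are assumed to be ordered from most dissimilar to most similar to the test set. Under those assumptions, the sd histogram is simply the ds histogram reversed, while the two running totals follow different paths before reaching the same final size.

```python
from itertools import accumulate


def cumulative_training_sizes(bar_heights):
    """Return the cumulative training-set sizes in both directions.

    bar_heights[i] is the number of complexes in the i-th cutoff bin,
    assumed to be ordered from most dissimilar to most similar to the
    test set (an assumption made for this illustration).
    """
    # ds direction: start with the most dissimilar complexes and add bins
    # in order, keeping a running total.
    ds_curve = list(accumulate(bar_heights))
    # sd direction: same bars, traversed in mirrored order.
    sd_curve = list(accumulate(reversed(bar_heights)))
    return ds_curve, sd_curve


# Hypothetical bar heights, not values read from the figure.
bars = [5, 0, 2, 10, 3]
ds_curve, sd_curve = cumulative_training_sizes(bars)
print(ds_curve)  # [5, 5, 7, 17, 20]
print(sd_curve)  # [3, 13, 15, 15, 20]
```

Both curves end at the total number of complexes, but their intermediate values differ, which is why mirrored histograms do not produce mirrored cumulative curves.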
