Journal of Chemical Information and Modeling. 2011 Aug 19;51(9):2146. doi: 10.1021/ci200363q

Correction to CSAR Benchmark Exercise of 2010: Selection of the Protein–Ligand Complexes

James B Dunbar Jr , Richard D Smith, Chao-Yie Yang, Peter Man-Un Ung, Katrina W Lexa, Nickolay A Khazanov, Jeanne A Stuckey, Shaomeng Wang, Heather A Carlson
PMCID: PMC3180240

This Erratum is to declare that the values reported for R² in the paper are actually Pearson R values. The wrong column of data in a spreadsheet was used inadvertently. All correlation values in the paper are correct, just mislabeled with the squared superscript. One of the major conclusions noted in the abstract and discussed in the “Strengths and Weaknesses” Section should read:

“Inherent experimental error limits the possible correlation between scores and measured affinity; Pearson R is limited to ∼0.91 (Pearson R² ∼0.83) when fitting to the data set without over parameterizing. Pearson R is limited to ∼0.83 (Pearson R² ∼0.70) when scoring the data set with a method trained on outside data.”

For clarity, the Pearson R and R² are given in Table 1 below for all the theoretical cases posed. The table corrects the correlation coefficients in Figure 3 and in the discussion of signal over noise in the “Strengths and Weaknesses” section.

Table 1. Correlation Metrics when Random Error is Added to the 343 Affinity Data of the CSAR-NRC Data Set (a)

| Metric | error with σ = 0.5 log K | error with σ = 1.0 log K | error with σ = 2.0 log K | error with σ = 3.0 log K |
|---|---|---|---|---|
| Random Error in One Coordinate (Ideal vs Lab Case) | | | | |
| Pearson R | 0.976 | 0.913 | 0.744 | 0.590 |
| (Pearson R)² | 0.952 | 0.834 | 0.554 | 0.348 |
| Random Error in Both Coordinates (Lab vs Scoring Case) | | | | |
| Pearson R | 0.952 | 0.835 | 0.553 | 0.355 |
| (Pearson R)² | 0.907 | 0.696 | 0.305 | 0.130 |

(a) Values are the medians of 100 generations of random error.
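The kind of calculation summarized in Table 1 can be illustrated with a minimal sketch. It is not the authors' procedure: the 343 CSAR-NRC affinities are not reproduced here, so the code uses a hypothetical stand-in set drawn from a normal distribution (mean 6.0, spread 2.2 log units, both chosen only for illustration), and it assumes the random error is Gaussian. The function names `pearson_r` and `median_r` are likewise invented for this example. For independent errors of equal size, the Pearson R with error in both coordinates is expected to be close to the square of the Pearson R with error in one coordinate, which matches the pattern in the table.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def median_r(true_values, sigma, both_coordinates, n_trials=100, rng=None):
    """Median Pearson R over n_trials generations of Gaussian error.

    both_coordinates=False: error-free values vs. noisy values.
    both_coordinates=True:  two independently noisy copies of the values.
    """
    rng = rng or random.Random(0)
    results = []
    for _ in range(n_trials):
        y = [v + rng.gauss(0.0, sigma) for v in true_values]
        if both_coordinates:
            x = [v + rng.gauss(0.0, sigma) for v in true_values]
        else:
            x = true_values
        results.append(pearson_r(x, y))
    return statistics.median(results)


# Hypothetical stand-in for the 343 affinities (assumption, not CSAR data).
rng = random.Random(42)
affinities = [rng.gauss(6.0, 2.2) for _ in range(343)]

for sigma in (0.5, 1.0, 2.0, 3.0):
    one = median_r(affinities, sigma, both_coordinates=False, rng=rng)
    both = median_r(affinities, sigma, both_coordinates=True, rng=rng)
    print(f"sigma={sigma}: one coordinate R={one:.3f}, both coordinates R={both:.3f}")
```

Because the stand-in data are synthetic, the printed numbers will only resemble Table 1 in trend, not reproduce it.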

It should be noted that our use of R² is based on squaring the Pearson value, not based on a calculation of the coefficient of determination (also called R²). The coefficient of determination measures the one-to-one correspondence between two values, requiring a slope of 1 and an intercept at 0 rather than least-squares-fit values.
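The distinction can be made concrete with a small sketch. The data below are hypothetical and the function names are invented for illustration; the coefficient of determination is computed here in its common form, 1 − SS_res/SS_tot, with the predicted values compared directly to the observed ones (slope 1, intercept 0). Predictions that are perfectly linear in the observations but have the wrong slope and offset give a squared Pearson R of 1, while the coefficient of determination is poor and can even be negative.

```python
import math
import statistics


def pearson_r_squared(x, y):
    """Square of the Pearson correlation coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy / math.sqrt(sxx * syy)) ** 2


def coefficient_of_determination(predicted, observed):
    """1 - SS_res/SS_tot, using the predictions as-is (no refitting)."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for p, o in zip(predicted, observed))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical affinities and scores: perfectly correlated, but slope 2, offset -3.
observed = [4.0, 5.0, 6.0, 7.0, 8.0]
predicted = [2.0 * v - 3.0 for v in observed]

print(f"(Pearson R)^2:                {pearson_r_squared(predicted, observed):.3f}")
print(f"coefficient of determination: {coefficient_of_determination(predicted, observed):.3f}")
```

The two quantities coincide only when the predictions already lie on the least-squares line with slope 1 and intercept 0 relative to the observations.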

Acknowledgments

We thank Christian Kramer of Novartis Pharma AG for pointing out that the R² values in the paper were likely R and for very stimulating discussions regarding Pearson R² versus the coefficient of determination.

Funding Statement

National Institutes of Health, United States


Articles from Journal of Chemical Information and Modeling are provided here courtesy of American Chemical Society
