Table 1.
A: Population level
The predictive ability of polygenic risk scores (PRS) can be measured in research studies that assess differences between cases and controls (Fig. 1) or variation in a continuous trait in a population. Here, the disease status or trait is already known, and the studies measure the extent to which it is explained by the PRS. Outcome measures from such studies include:
(1) R² from linear regression, which quantifies the proportion of variance in a continuous trait captured by the PRS, or the analogous Nagelkerke’s R² from logistic regression for case-control disease status.
(2) R² on a liability scale, which transforms Nagelkerke’s R² to reflect disease prevalence, instead of the case-control ratio of the research study [15].
(3) The area under the receiver operating characteristic curve (AUC) [16], which in practice takes a value from 0.5 (no discrimination) to 1 (perfect discrimination). This gives an overall summary of the predictive ability of the model. It is most easily interpreted as the probability that a randomly selected case will have a higher polygenic risk score than a randomly selected control. Prediction models can also include other risk factors such as age and sex, which typically increase the AUC above that based on PRS alone.
(4) The proportion of the population with a k-fold increase in odds of disease (k = 2, 3, …), relative to the population disease risk.
(5) Odds ratio for disease per 1-standard-deviation increase in PRS.
(6) Odds ratio of disease for individuals in the top PRS decile (or another quantile) compared to individuals in a different part of the PRS distribution. The high-risk group may be compared to the lowest decile, the middle quintile (e.g. 40–60%), or everyone outside the high-risk group (0–90%). Comparing the upper and lower tails maximises the odds ratio, and so its apparent impact, but raises concerns about the arbitrariness of the quantile used.
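As an illustration of measures (3) and (6), the following minimal Python sketch computes the AUC directly from its probabilistic interpretation and an odds ratio between two PRS groups. The function names and all numbers are hypothetical toy values chosen for illustration; they are not results from any study, and real analyses would normally use an established statistics package.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the probability that a randomly chosen case has a higher
    PRS than a randomly chosen control (ties count as one half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def odds_ratio(cases_high, controls_high, cases_ref, controls_ref):
    """Odds of being a case in the high-risk PRS group divided by the
    odds of being a case in the reference group."""
    return (cases_high / controls_high) / (cases_ref / controls_ref)


# Hypothetical standardised PRS values for four cases and four controls.
toy_cases = [0.9, 1.4, 0.2, 2.1]
toy_controls = [-0.5, 0.3, 1.0, -1.2]
toy_auc = auc_from_scores(toy_cases, toy_controls)  # 13 of 16 pairs -> 0.8125

# Hypothetical counts: top decile (40 cases, 60 controls) versus
# a reference group (160 cases, 840 controls).
toy_or = odds_ratio(40, 60, 160, 840)  # (40/60) / (160/840) = 3.5

print(f"AUC = {toy_auc:.4f}, odds ratio = {toy_or:.2f}")
```

Changing the reference group in `odds_ratio` (lowest decile, middle quintile, or the remaining 90%) changes the result for the same data, which is the arbitrariness noted in (6).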
B: Individual level
In a clinical setting, the focus is on a single person: what information does their PRS give about their risk of disease? Possible outcome measures that are relevant at an individual level include:
(a) At what percentile of the PRS distribution does this individual lie? The percentile is between 0 and 100%, with the scores themselves approximately normally distributed.
(b) What is this person’s relative risk of disease compared to the average risk in the population?
(c) What is this person’s absolute risk of disease, and by what age [17]?
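The individual-level questions can be illustrated with a short Python sketch. It rests on strong simplifying assumptions that are not taken from the table: the PRS is treated as exactly normally distributed, the odds of disease are assumed to scale log-linearly with the PRS (a fixed odds ratio per standard deviation), and the baseline risk of a person with an average PRS is assumed to be known. The comparison is made against a person with an average PRS, which is close to but not the same as the population-average risk in (b), and age is ignored entirely. All function names and numbers are hypothetical.

```python
from statistics import NormalDist


def prs_percentile(score, mean=0.0, sd=1.0):
    """Percentile (0 to 100) of a PRS value, assuming a normal distribution."""
    return 100.0 * NormalDist(mean, sd).cdf(score)


def odds_vs_average_prs(z_score, or_per_sd):
    """Odds multiplier relative to a person with an average PRS,
    assuming a constant odds ratio per standard deviation."""
    return or_per_sd ** z_score


def absolute_risk(baseline_risk, odds_multiplier):
    """Convert a baseline risk and an odds multiplier into an absolute risk."""
    odds = baseline_risk / (1.0 - baseline_risk) * odds_multiplier
    return odds / (1.0 + odds)


# Hypothetical person: PRS two standard deviations above the mean,
# assumed odds ratio of 1.5 per SD, assumed baseline risk of 10%.
percentile = prs_percentile(2.0)            # about 97.7
multiplier = odds_vs_average_prs(2.0, 1.5)  # 1.5 ** 2 = 2.25
risk = absolute_risk(0.10, multiplier)      # about 0.20

print(f"percentile = {percentile:.1f}%, odds x{multiplier:.2f}, risk = {risk:.2f}")
```

The example shows why the three measures can give different impressions of the same person: a PRS near the 98th percentile corresponds here to a little more than a doubling of odds, and to an absolute risk of about 20% under the assumed baseline.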