2019 Jun 18;8:e46179. doi: 10.7554/eLife.46179

Figure 1. Demographics of participants.

(A) World map displaying a red dot (i.e., one dot = one IP address) at the location of a participant completing the paired-associates learning (PAL) task. (B) Line plot showing the percent of males and females from 18 to 85 years old. (C) Line plot displaying the percent of participants with 12, 14, 16, or 20 years of education for each year of age from 18 to 85 years old. (D) Line plot showing the percent of participants reporting a first-degree family history of Alzheimer’s disease (FH) for each year of age from 18 to 85 years old.

Figure 1—figure supplement 1. Regression diagnostic plots of the general linear model (GLM) including all participants (N=59,571).

(A) A plot of the residuals versus fitted values. This plot suggests that the assumptions of linearity, equality of variances, and no outliers are met. (B) A plot of quantiles of the data versus quantiles of a normal distribution. This plot suggests a departure from normality in the residuals, and thus in the error terms; however, this violation should not cause major problems because of the large number of participants in this study. (C) A plot of the square root of the standardized residuals versus the fitted values. Like plot A, this plot suggests that the variability of the residuals does not change much over the range of the dependent variable. (D) A plot of the residuals versus leverage. This plot suggests that there are no influential cases, as all cases fall within Cook’s distance (i.e., the red dashed line is not visible on the plot).
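
The quantities behind panels A, C, and D can be sketched for a one-predictor least-squares fit. This is only an illustration under stated assumptions: the function name and the toy data are hypothetical, the study's GLM has several predictors rather than one, and the authors' own analysis code is not shown in this caption.

```python
import math

def simple_ols_diagnostics(x, y):
    """Fit y = intercept + slope * x by least squares and return per-point diagnostics."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Panel A plots residuals against fitted values.
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]

    p = 2  # number of estimated coefficients (intercept and slope)
    mse = sum(e ** 2 for e in residuals) / (n - p)

    # Leverage: diagonal of the hat matrix for a one-predictor model.
    leverage = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]

    # Internally standardized residuals; panel C plots sqrt(|r|) against fitted values.
    std_resid = [e / math.sqrt(mse * (1.0 - h)) for e, h in zip(residuals, leverage)]
    sqrt_abs_std_resid = [math.sqrt(abs(r)) for r in std_resid]

    # Cook's distance combines residual size and leverage (contours in panel D).
    cooks_d = [r ** 2 * h / (p * (1.0 - h)) for r, h in zip(std_resid, leverage)]

    return {
        "fitted": fitted,
        "residuals": residuals,
        "leverage": leverage,
        "std_resid": std_resid,
        "sqrt_abs_std_resid": sqrt_abs_std_resid,
        "cooks_d": cooks_d,
    }

# Hypothetical toy data: the last point is far from the others in x and off the trend.
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 5.0]
diag = simple_ols_diagnostics(x, y)

# Index of the point with the largest Cook's distance.
most_influential = max(range(len(x)), key=lambda i: diag["cooks_d"][i])
print(most_influential, diag["cooks_d"][most_influential])
```

In this toy example the isolated point combines high leverage with a sizeable residual, which is the pattern panel D is designed to reveal; the caption reports that no such cases appear in the study data.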

Figure 1—figure supplement 2. Simulated additional self-report error and the impact on the significance of the FH effect in MindCrowd.

The effect of error in self-reported FH-AD status was simulated by adding error to the MindCrowd cohort: between 1% and 30% of individual FH responses were re-assigned, and the full statistical model was re-run. This was repeated 10,000 times for each error rate. Boxplots show the distribution of p-values from the re-analysis under each new error model. Even with 8% additional error added to the self-report FH question, we would still have identified a significant association with FH in all 10,000 cases. Even with 24% additional FH self-report error, we would still have reported a significant association between FH and PAL over 50% of the time (better than by chance alone). This simulation therefore suggests that, even with substantial additional error in FH self-report, the association between FH and PAL performance would still have been detected by our study.
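
The shape of such a simulation can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that "re-assigning" means flipping a binary FH response, it replaces the full statistical model with a simple two-group comparison of means (normal approximation), and all function names, the synthetic cohort, and the effect size are hypothetical.

```python
import math
import random

def two_group_p_value(scores, labels):
    """Two-sided p-value for a difference in mean score between label 1 and label 0.

    Stand-in for the full model: Welch-type z statistic with a normal approximation.
    """
    g1 = [s for s, l in zip(scores, labels) if l == 1]
    g0 = [s for s, l in zip(scores, labels) if l == 0]
    m1, m0 = sum(g1) / len(g1), sum(g0) / len(g0)
    v1 = sum((s - m1) ** 2 for s in g1) / (len(g1) - 1)
    v0 = sum((s - m0) ** 2 for s in g0) / (len(g0) - 1)
    z = (m1 - m0) / math.sqrt(v1 / len(g1) + v0 / len(g0))
    return math.erfc(abs(z) / math.sqrt(2))

def flip_labels(labels, error_rate, rng):
    """Flip a randomly chosen fraction of binary labels (assumed meaning of re-assignment)."""
    n_flip = round(error_rate * len(labels))
    flipped = list(labels)
    for i in rng.sample(range(len(labels)), n_flip):
        flipped[i] = 1 - flipped[i]
    return flipped

def simulate(scores, labels, error_rates, n_reps, seed=0):
    """Return {error_rate: [p-values]} from re-testing after adding label error."""
    rng = random.Random(seed)
    return {
        rate: [two_group_p_value(scores, flip_labels(labels, rate, rng))
               for _ in range(n_reps)]
        for rate in error_rates
    }

# Hypothetical synthetic cohort: 30% report FH, and FH lowers the mean score by 2 points.
data_rng = random.Random(1)
n = 1000
labels = [1 if data_rng.random() < 0.3 else 0 for _ in range(n)]
scores = [data_rng.gauss(20.0, 5.0) - 2.0 * l for l in labels]

# The study used 10,000 repetitions per error rate; 100 keeps this sketch fast.
results = simulate(scores, labels, error_rates=[0.0, 0.1, 0.5], n_reps=100)

# Fraction of repetitions with p < 0.05 at each added error rate.
power = {rate: sum(p < 0.05 for p in ps) / len(ps) for rate, ps in results.items()}
print(power)
```

Collecting the p-values per error rate, rather than only the fraction below 0.05, mirrors the boxplot presentation described in the caption.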