Frontiers in Psychology. 2025 Nov 4;16:1701166. doi: 10.3389/fpsyg.2025.1701166

From big associations to big practices—Why normative modeling should be the default in personality neuroscience

Peiqian Wu 1,*, Yudie Chang 1
PMCID: PMC12623340  PMID: 41262398

Introduction

Early studies reported enticing correlations between Big Five personality traits and brain structure (DeYoung et al., 2010; Kanai and Rees, 2011). Such findings supported the view that stable traits have identifiable neural substrates. However, more than a decade of research has revealed a more complicated picture: many reported brain–trait associations failed to replicate (e.g., Boekel et al., 2015), and the field struggled with so-called “voodoo correlations,” implausibly high correlations that were likely inflated or spurious (Vul et al., 2009). Subsequent critiques underscored that typical studies were underpowered: their samples were too small to detect the small effects that are realistic for personality–brain links (Button et al., 2013; Dubois and Adolphs, 2016).

Small effects and the need for big data

Comprehensive investigations have found little to no evidence of robust brain–personality associations in large samples. For instance, Avinun et al. (2020) examined 1,107 individuals and reported no significant relationships between Big Five traits and multiple measures of brain morphometry. A systematic review and meta-analysis reached a similar conclusion, finding no replicable structural brain differences as a function of Big Five traits (Chen and Canli, 2022). These results suggest that true associations, if present, are extremely small in magnitude. Power analyses and empirical work converge on the same message: detecting such tiny effects requires very large samples. Marek et al. (2022) showed that brain-wide association studies need thousands of participants to yield reproducible results. Figure 1 illustrates this relationship: as effect size decreases, the sample size needed for 80% statistical power rises steeply. In practice, most historical studies in personality neuroscience were far too small, yielding underpowered analyses and false positives (Yarkoni, 2009; Button et al., 2013).

Figure 1.

Sample size required to detect small brain–trait effects as a function of the true correlation (x-axis: true correlation, 0 to 0.10; y-axis: required sample size for 80% power; two-sided α = 0.05). Markers indicate r = 0.02 (N = 19,620), r = 0.03 (N = 8,719), and r = 0.05 (N = 3,138). The required sample size falls steeply as the true correlation increases. Computed via Fisher-z power analysis.
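
The marked values in Figure 1 can be checked with a short calculation. The sketch below is not the authors' code; it assumes the standard Fisher-z approximation, in which atanh(r) has standard error 1/sqrt(N - 3), so that N = ((z_{1-α/2} + z_{power}) / atanh(r))^2 + 3, rounded up. The function name `required_n` is a hypothetical helper introduced for illustration.

```python
from math import atanh, ceil
from statistics import NormalDist


def required_n(r: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Approximate N needed to detect a true correlation r (two-sided test).

    Fisher-z approximation: atanh(r) has standard error 1 / sqrt(N - 3),
    so N = ((z_{1 - alpha/2} + z_{power}) / atanh(r))**2 + 3, rounded up.
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be nonzero and strictly between -1 and 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return ceil(((z_alpha + z_power) / atanh(abs(r))) ** 2 + 3)


# The three correlations marked in Figure 1.
for r in (0.02, 0.03, 0.05):
    print(f"r = {r:.2f}: N = {required_n(r):,}")
```

Under these assumptions the calculation gives N = 19,620, 8,719, and 3,138 for r = 0.02, 0.03, and 0.05, matching the values marked in the figure.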

Heterogeneity and the case for normative models

Heterogeneity is a key reason why averaging can obscure meaningful effects: individuals with the same trait score can show different neural patterns, and similar brain measures can accompany different trait expressions. Sex-dependent or subgroup-specific associations have been observed (Nostro et al., 2017), and idiosyncratic variation is the rule rather than the exception. Normative modeling provides a principled solution by estimating the expected distribution of brain measures given covariates (e.g., age, sex) and then characterizing each person by their deviation from that norm (Marquand et al., 2016, 2019). Instead of asking whether trait X correlates with region Y on average, we ask whether individuals with extreme trait values show atypical deviations relative to peers. This person-centered approach has already improved sensitivity in clinical domains, revealing heterogeneous deviation patterns in disorders such as schizophrenia and bipolar disorder (Wolfers et al., 2018). Resources like the Brain Charts for the human lifespan demonstrate how large multi-cohort datasets can be used to build robust normative references (Bethlehem et al., 2022).

From big associations to big practices

Adopting normative modeling as a default approach represents a shift from chasing elusive average effects to embracing big practices—robust analytical habits that match problem complexity. Practically, this entails: assembling adequately powered samples via collaboration and data sharing; deriving individual deviation scores for relevant brain measures; testing preregistered, out-of-sample hypotheses about how deviations relate to personality; and reporting full model performance and failures. This agenda pairs naturally with prediction-oriented analysis (Yarkoni and Westfall, 2017), multivariate methods, and open science. It aligns with the idea that personality is about what makes individuals unique—and our methods should quantify that uniqueness instead of averaging it away.

Conclusion

The era of small-N brain–personality correlation hunting is ending. By defaulting to normative modeling, personality neuroscience can better accommodate small effects and heterogeneity, leverage large datasets to establish meaningful baselines, and identify how and why certain individuals or subgroups deviate. Coupled with adequately powered designs and transparent, predictive workflows, this shift from big associations to big practices promises findings that are more robust, generalizable, and practically meaningful.

Funding Statement

The author(s) declare that financial support was received for the research and/or publication of this article. The Outstanding Talent Cultivation Project under the High-Level and Subsidized Discipline Construction Program of Anhui Normal University (2025GFXK040) supported this study.

Author contributions

PW: Funding acquisition, Investigation, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing. YC: Writing – review & editing.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that Gen AI was used in the creation of this manuscript. During manuscript preparation we used ChatGPT (OpenAI; model: GPT 5; accessed in September 2025 via chat.openai.com) only for language polishing.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

  1. Avinun R., Israel S., Knodt A. R., Hariri A. R. (2020). Little evidence for associations between the Big Five personality traits and variability in brain gray or white matter. NeuroImage 220:117092. doi: 10.1016/j.neuroimage.2020.117092
  2. Bethlehem R. A. I., Seidlitz J., White S. R., Vogel J. W., Anderson K. M., Adamson C., et al. (2022). Brain charts for the human lifespan. Nature 604, 525–533. doi: 10.1038/s41586-022-04554-y
  3. Boekel W., Wagenmakers E. J., Belay L., Verhagen J., Brown S., Forstmann B. U. (2015). A purely confirmatory replication study of structural brain–behavior correlations. Cortex 66, 115–133. doi: 10.1016/j.cortex.2014.11.019
  4. Button K. S., Ioannidis J. P. A., Mokrysz C., Nosek B. A., Flint J., Robinson E. S. J., et al. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nat. Rev. Neurosci. 14, 365–376. doi: 10.1038/nrn3475
  5. Chen Y.-W., Canli T. (2022). “Nothing to see here”: No structural brain differences as a function of the Big Five personality traits from a systematic review and meta-analysis. Personal. Neurosci. 5:e8. doi: 10.1017/pen.2021.5
  6. DeYoung C. G., Hirsh J. B., Shane M. S., Papademetris X., Rajeevan N., Gray J. R. (2010). Testing predictions from personality neuroscience: brain structure and the Big Five. Psychol. Sci. 21, 820–828. doi: 10.1177/0956797610370159
  7. Dubois J., Adolphs R. (2016). Building a science of individual differences from fMRI. Trends Cognit. Sci. 20, 425–443. doi: 10.1016/j.tics.2016.03.014
  8. Kanai R., Rees G. (2011). The structural basis of inter-individual differences in human behaviour and cognition. Nat. Rev. Neurosci. 12, 231–242. doi: 10.1038/nrn3000
  9. Marek S., Tervo-Clemmens B., Calabro F. J., Montez D. F., Kay B. P., Hatoum A. S., et al. (2022). Reproducible brain-wide association studies require thousands of individuals. Nature 603, 654–660. doi: 10.1038/s41586-022-04492-9
  10. Marquand A. F., Kia S. M., Zabihi M., Wolfers T., Buitelaar J. K., Beckmann C. F. (2019). Conceptualizing mental disorders as deviations from normative functioning. Mol. Psychiatry 24, 1415–1424. doi: 10.1038/s41380-019-0441-1
  11. Marquand A. F., Rezek I., Buitelaar J., Beckmann C. F. (2016). Understanding heterogeneity in clinical cohorts using normative models: Beyond case–control studies. Biol. Psychiatry 80, 552–561. doi: 10.1016/j.biopsych.2015.12.023
  12. Nostro A. D., Müller V. I., Reid A. T., Eickhoff S. B. (2017). Correlations between personality and brain structure: a crucial role of gender. Cereb. Cortex 28, 3698–3712. doi: 10.1093/cercor/bhw191
  13. Vul E., Harris C., Winkielman P., Pashler H. (2009). Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition. Perspect. Psychol. Sci. 4, 274–290. doi: 10.1111/j.1745-6924.2009.01125.x
  14. Wolfers T., Doan N. T., Kaufmann T., Alnæs D., Moberget T., Agartz I., et al. (2018). Mapping the heterogeneous phenotype of schizophrenia and bipolar disorder using normative models. JAMA Psychiatry 75, 1146–1155. doi: 10.1001/jamapsychiatry.2018.2467
  15. Yarkoni T. (2009). Big correlations in little studies: inflated fMRI correlations reflect low statistical power—Commentary on Vul et al. (2009). Perspect. Psychol. Sci. 4, 294–298. doi: 10.1111/j.1745-6924.2009.01127.x
  16. Yarkoni T., Westfall J. (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspect. Psychol. Sci. 12, 1100–1122. doi: 10.1177/1745691617693393

Articles from Frontiers in Psychology are provided here courtesy of Frontiers Media SA
