Abstract
Polygenic scores (PGSs) have emerged as promising tools for predicting complex traits from genetic data; however, their predictive performance for psychiatric disorders remains limited, and the added value of deep learning (DL) over linear models is underexplored. In this study, we compared our DL model, Genome-Local-Net (GLN), with the linear model bigstatsr for predicting five psychiatric disorders (ADHD, ASD, BIP, MDD, and SCZ) using individual-level genotype data. We further assessed whether combining these internal (individual-based) PGSs with external (GWAS-derived) PGSs and family genetic risk scores (FGRSs) could improve prediction additively or synergistically. While GLN and bigstatsr performed similarly in-sample, GLN showed better generalization on an out-of-sample replication set for ADHD, ASD, and MDD, with an average AUROC gain of 0.026. Integrating internal, external, and family-based scores significantly improved ADHD prediction, though DL-based integration provided no consistent advantage over logistic regression. These findings suggest that while DL may enhance generalizability for specific psychiatric traits, linear models remain competitive and effective for genetic risk prediction.
INTRODUCTION
The ability to predict disorders from genetic data is a focal point of genomic research, as it has the potential to advance early diagnosis and personalized medicine while providing deeper insights into the genetic basis of medical conditions (Kullo et al., 2022; Torkamani et al., 2018). Research into genetic-based prediction has primarily focused on polygenic scores (PGSs), which can be divided into two types based on how SNP effect sizes are derived: external PGSs use SNP effect sizes estimated from external GWAS summary statistics, while internal PGSs estimate the SNP effects directly from individual-level data. PGSs have shown significant predictive accuracy for various medical conditions, including diabetes (Luckett et al., 2023), cardiovascular diseases (Mars et al., 2020), inflammatory bowel disease, and breast cancer (Khera et al., 2018). However, even though psychiatry pioneered the development of PGSs (Purcell et al., 2009), the predictive performance of PGSs for mental disorders falls short of the accuracy achieved for some non-psychiatric conditions (Andreassen et al., 2023). For instance, current PGSs explain ~11% of liability for schizophrenia and ~4% for major depression, corresponding to AUCs of 0.74 and 0.60, respectively, values below thresholds typically considered clinically actionable (Murray et al., 2021). This likely reflects the complex and heterogeneous nature of psychiatric disorders, along with sample sizes that remain underpowered, highlighting the need for methods better suited to these challenges.
Many software packages have been developed to compute external PGSs, such as PRSice (Choi & O’Reilly, 2019), PRS-CS (Ge et al., 2019), SBayesR (Lloyd-Jones et al., 2019), LDpred2 (Privé et al., 2021), and SDPR (Zhou & Zhao, 2021), as well as internal PGSs, such as BOLT-LMM (Loh et al., 2015), bigstatsr (Privé et al., 2018, 2020), and snpnet (Li et al., 2021; Qian et al., 2020), but they generally assume additive contributions from independent variants (Schwarzerova et al., 2024). Whether non-linear and interaction effects between genetic variants contribute substantially to complex traits remains an intensely debated topic (Hemani et al., 2021; Hill et al., 2005; Sheppard et al., 2021; Tang et al., 2022). Nonetheless, exploring these effects remains a crucial avenue for understanding the genetic architecture of complex traits and could potentially lead to models with higher predictive performance. Recently developed non-linear Machine Learning (ML) and Deep Learning (DL) based PGSs were found to achieve at least similar performance to linear models, and to outperform them in a few specific cases such as autoimmune diseases (Ohta et al., 2024; Sigurdsson et al., 2023), Alzheimer’s disease (Hermes et al., 2023), inflammatory bowel disease (Klau et al., 2023; Peng et al., 2024), and breast cancer (Badré et al., 2021). Several studies have also developed DL-based predictors of psychiatric disorders from individual-level data, but these generally had small sample sizes and used at most a few thousand SNPs as predictors (Bracher-Smith et al., 2021). Moreover, it is unclear how DL-based PGSs perform for psychiatric disorders compared to linear models, as many published studies were found to suffer from methodological issues such as overfitting, unfair comparisons, and data leakage, which undermine their validity and reliability (Bracher-Smith et al., 2021).
For complex disorders, a single internal or external PGS will never provide a complete picture of risk. Biomarkers, health traits, health states, constitutional factors, and family history, among other factors, also contribute and should be considered in a combined and comprehensive way (Schork et al., 2018). Among these, family history of disease is a common predictor that has been shown, for many disorders, to be complementary to PGSs (Dybdahl Krebs et al., 2023; Hujoel et al., 2022; Mars et al., 2022). Recent work has emphasized how Family Genetic Risk Scores (FGRSs), which combine information from multiple relatives into a single score (Kendler et al., 2021), can significantly enhance prediction beyond PGSs alone. In major depressive disorder (MDD), for instance, both PGS and FGRS individually achieved modest classification performance (AUC ≈ 0.57–0.60), but their combination yielded a notable improvement (AUROC = 0.630 in iPSYCH1 and 0.608 in iPSYCH2), highlighting their complementary value for risk stratification (Dybdahl Krebs et al., 2024). Other studies have explored predicting psychiatric disorders by combining multiple external PGSs (Albiñana et al., 2023; Krapohl et al., 2018) or internal and external PGSs (Albiñana et al., 2021), and all showed improved predictive performance compared to single-feature models. However, the integration of internal and external PGSs with FGRSs, and the exploration of potential interactions among scores, is currently understudied and could provide a more powerful and comprehensive approach to genetic risk prediction in psychiatric disorders. In particular, DL approaches may offer an effective strategy for achieving this goal. In this context, we recently developed EIR, a DL framework designed to enhance individual-level PGS prediction by introducing a novel neural network architecture called Genome-Local-Net (GLN) (Sigurdsson et al., 2023). When we applied the GLN model from EIR to the UKBB dataset for predicting diagnoses and biomarker levels, we found it to outperform linear models in some cases (Sigurdsson et al., 2022, 2023, 2024). However, this model has never been applied to large-scale psychiatric cohorts.
Here, we explored the use of the DL architecture GLN from the EIR framework (Sigurdsson et al., 2023) for predicting five psychiatric disorders: Attention Deficit Hyperactivity Disorder (ADHD), Autism Spectrum Disorder (ASD), Bipolar Disorder (BIP), Major Depressive Disorder (MDD), and Schizophrenia (SCZ), using large-scale genotype data from the Integrative Psychiatric Research Consortium (iPSYCH) (Pedersen et al., 2018; Bybjerg-Grauholm et al., 2020), and compared it to a state-of-the-art linear model, bigstatsr (Privé et al., 2018). We also explored the potential of combining GLN with scores from bigstatsr, external PGSs, and FGRSs using both logistic regression- and DL-based integration approaches. We found that GLN and bigstatsr had similar performance in predicting the five psychiatric diagnoses on the test set, but GLN significantly outperformed bigstatsr and generalized better on the out-of-sample replication set for ADHD, ASD, and MDD. The integration of GLN and bigstatsr did not provide additional predictive performance, but combining GLN or bigstatsr with the external PGS and FGRS did significantly improve prediction performance in terms of Area Under the Receiver Operating Curve (AUROC) for ADHD, suggesting that combining various types of genetic risk scores, both internal and external, could enhance the ability to predict some complex psychiatric phenotypes.
RESULTS
Study overview
In this study we leveraged the Danish population-based Integrative Psychiatric Research Consortium (iPSYCH) case-cohort dataset, which contains genotype data from individuals born in Denmark between 1981 and 2008 and followed up until 2016. The iPSYCH cohort consists of two waves of data: iPSYCH1 and iPSYCH2, which were genotyped on two different SNP arrays (see Methods). Taken together, the two waves contain data on 137,251 individuals, of which 88,831 were diagnosed with at least one of 5 psychiatric disorders: Attention Deficit Hyperactivity Disorder (ADHD), Autism Spectrum Disorder (ASD), Bipolar Disorder (BIP), Major Depressive Disorder (MDD), and Schizophrenia (SCZ). For controls, 48,420 individuals were randomly drawn from the entire population. The specific numbers of cases and controls in the two waves that we analyzed in this study are presented in Figure 1C.
Figure 1. Overview of the study.
A) Overview of the workflow followed in this study. After splitting the data into training and test sets, we used the individuals from the iPSYCH1 cohort to train bigstatsr (a linear model) and GLN (a DL model) to compute internal PGSs for five different psychiatric disorders: ADHD, ASD, BIP, MDD, and SCZ. For these individuals we also computed external PGSs and PA-FGRSs. We compared the performance of the single scores and explored their integration with a regression- and a DL-based approach, both on the held-out test set of iPSYCH1 and on the entire iPSYCH2 cohort as an out-of-sample replication set. B) Workflow of the strategy used to select the final set of SNPs on which the PGSs were computed. After filtering for MAF > 0.01 and imputation score > 0.8, we extracted the intersection of the SNPs across the two cohorts iPSYCH1 and iPSYCH2. Using external GWAS summary statistics, we ran Clumping and Thresholding (C+T) with PRSice at different p-value thresholds, restricting the analysis to the intersected SNP set. This resulted in 90 SNP sets. After taking the union of these sets across disorders at each p-value threshold, we performed clumping on MAF across a range of r2 values. Finally, we arbitrarily chose as the final set of SNPs the one created with a p-value threshold of 0.05 and r2 of 0.5, which consisted of 230,588 external GWAS-informed SNPs. See Methods for more details. C) Distribution of the cases and controls across the five psychiatric disorders in the two iPSYCH cohorts 1 and 2.
A schematic overview of the study is presented in Figure 1A. For each of the five psychiatric disorders (ADHD, ASD, BIP, MDD, and SCZ), we used the genotype data of the individuals from the training sets of the iPSYCH1 cohort to compute two internal Polygenic Scores (PGSs): a linear one using bigstatsr, a state-of-the-art linear model (Privé et al., 2018, 2020), and a DL-based one using our GLN model from the EIR framework (Sigurdsson et al., 2023). We further computed external PGSs with SBayesR and Pearson-Aitken Family Genetic Risk Scores (Dybdahl Krebs et al., 2024) (PA-FGRSs, hereinafter referred to as FGRSs for simplicity). We first compared the performances of the two internal PGSs separately, and then assessed their contribution within integrated models that also included the external PGSs and FGRSs, using both a logistic regression model and a DL-based approach. We evaluated the performances on a held-out test set from the iPSYCH1 cohort and used the entire iPSYCH2 cohort as an out-of-sample replication set. All PGSs were computed using the same genotype data, a set of ~230k GWAS-informed SNPs in common between the two iPSYCH cohorts (see Methods for more details and Figure 1B for a schematic overview).
Hyperparameter tuning improves DL model performance for predicting psychiatric disorders
To optimize the DL model GLN from the EIR framework on genotype data, we tuned 30 different combinations of 4 hyperparameters separately for each psychiatric diagnosis (see Methods). The hyperparameter tuning results revealed diagnosis-specific patterns in the selection of top-performing combinations. For ADHD, the best-performing models consistently avoided kernel widths ≤ 32 in the locally connected layers (the kernel width determines how many input features each neuron processes), residual block dropout values between 0.25 and 0.5 (this dropout controls the probability of dropping connections in residual blocks to prevent overfitting), and a data augmentation setting in which 60% of SNPs were set to ‘missing’ in the one-hot encoding (Figure S1A). In contrast, models predicting ASD and MDD were less sensitive to specific hyperparameter choices, achieving optimal performance across a broad range of combinations (Figure S1B-C). For BIP, the top-performing models never selected kernel widths ≥ 64, a channel expansion base of 3 (which determines the number of trainable kernels applied to each local SNP region), a residual block dropout rate of 0.25, or a data augmentation setting where ≤ 20% of SNPs were set to ‘missing’ (Figure S1D). Similarly, for SCZ, the top-performing configurations consistently excluded a kernel width of 128 (Figure S1D). When testing the impact that each hyperparameter had independently on prediction performance, we found that the kernel width was significantly associated with higher AUROCs only for BIP (p-value 0.019), while the percentage of SNPs set to ‘missing’ was associated with lower AUROCs for ADHD and MDD, with p-values of 0.017 and 0.039, respectively (Figure S2, Table S1). After 5-fold Cross-Validation (5-CV) training in the iPSYCH1 cohort, the best-performing combinations for each diagnosis resulted in average 5-CV AUROCs of 0.58, 0.56, 0.54, 0.55, and 0.55 for ADHD, ASD, BIP, MDD, and SCZ, respectively. These findings highlight that the GLN architecture had varying degrees of sensitivity to hyperparameter selection across disorders, with some diagnoses exhibiting stricter constraints for optimal performance. Such differences may reflect differences in the underlying genetic architecture of each disorder.
GLN outperformed bigstatsr and generalized better on the replication set for ADHD, ASD, and MDD
To explore the potential advantages of using a DL model for computing internal, individual-based PGSs for psychiatric diagnoses, we compared the performance of the DL model GLN from the EIR framework to bigstatsr, a state-of-the-art linear model tailored to high-dimensional individual-level genotype data (Privé et al., 2018, 2019). After training each model for each diagnosis using the same 5-fold CV splits, we found that bigstatsr consistently achieved higher performance than GLN on the iPSYCH1 held-out test set across all five diagnoses. However, these differences were not statistically significant, with average AUROCs ranging from 0.54 [0.49–0.58] for BIP to 0.61 [0.60–0.63] for ADHD (Figure 2A, Table S4), and R2s on the liability scale from 0.004 [−0.003–0.011] for BIP to 0.035 [0.024–0.046] for ADHD (Figure S4A, Table S4). Nevertheless, when we evaluated the models on the iPSYCH2 replication set, GLN significantly outperformed bigstatsr in predicting ADHD, ASD, and MDD, with average increases in AUROC of 0.036 [0.025–0.046], 0.024 [0.012–0.036], and 0.02 [0.01–0.031], respectively (q-values, i.e. FDR-corrected p-values, < 0.001; Figure 2A, Table S4). We found the same pattern when comparing performance in terms of R2, with average increases of 0.015 [0.01–0.02] for ADHD, 0.005 [0.003–0.007] for ASD, and 0.005 [0.002–0.008] for MDD (q-values < 0.001, Figure S4A, Table S4). To further assess the generalization performance of the models, we tested whether their performance on iPSYCH2 was lower than on the iPSYCH1 test set for each diagnosis. Here, bigstatsr exhibited a significant drop in AUROC for ADHD and MDD, with average decreases of 0.05 [0.035–0.07] and 0.04 [0.02–0.06], respectively (q-values < 0.001, Figure 2B, Table S5). In contrast, GLN showed no significant drop in performance for any diagnosis; in other words, it showed no signs of overfitting to the training data. As before, we found the same pattern in terms of R2, with average drops of 0.027 [0.015–0.038] for ADHD and 0.015 [0.006–0.024] for MDD (q-values < 0.001, Figure S4B, Table S5). These results suggest that GLN produced more generalizable effects and that the comparable in-sample performance of the two models may be due to overfitting by bigstatsr.
Figure 2. Performance and generalization of internal PGS models GLN and bigstatsr.
A) Predictive performance in terms of AUROC of the two internal PGS models, GLN (DL-based, blue) and bigstatsr (linear, green), for five psychiatric disorders (ADHD, ASD, BIP, MDD, SCZ) evaluated on the iPSYCH1 test set (left) and the out-of-sample iPSYCH2 replication set (right). Bars represent AUROC scores with error bars indicating 95% empirical confidence intervals based on 10,000 bootstraps. While the models showed comparable performance in-sample, GLN significantly outperformed bigstatsr for ADHD, ASD, and MDD on iPSYCH2 (FDR-corrected p-values annotated). B) Difference in AUROC (ΔAUROC) between iPSYCH1 and iPSYCH2 for each model, capturing the drop in generalization performance. GLN exhibited more stable performance across cohorts, while bigstatsr showed significant performance declines for ADHD and MDD. Statistical significance was assessed using one-sided empirical tests with FDR correction; “ns” denotes non-significant comparisons.
Integrating GLN and bigstatsr did not increase performance
Next, we explored whether a logistic regression-based integration of the prediction scores from GLN and bigstatsr could increase classification performance compared to each model taken individually. For each psychiatric disorder, we implemented a logistic regression model using the GLN and bigstatsr scores as explanatory variables. We trained each model on the same 5-CV folds used to train GLN and bigstatsr and evaluated the performance on the iPSYCH1 test set (see Methods). The performance of the logistic regression integration models (GLN+bigstatsr) reached values comparable to those of GLN and bigstatsr alone, with AUROCs ranging from 0.54 [0.49–0.58] for BIP to 0.62 [0.60–0.64] for ADHD (Figure 3A, Table S6), and R2s from 0.003 [0–0.011] for BIP to 0.039 [0.028–0.051] for ADHD (Figure S5A, Table S6). Thus, the integrated GLN+bigstatsr model did not provide a significant advantage in terms of either AUROC or R2. This suggests that GLN and bigstatsr likely captured overlapping information and that their combination did not provide additional complementary signals to improve classification performance.
Figure 3. Integration of internal, external, and family-based scores improves ADHD prediction but shows limited benefit from DL-based combination.
A) Predictive performance (AUROC) of different combinations of internal PGSs (GLN, bigstatsr), external PGSs, and family genetic risk scores (FGRSs) across five psychiatric disorders. For ADHD, the multivariate regression-based integration of GLN or bigstatsr with PGS and FGRS significantly improved prediction (GLN+PGS+FGRS AUROC = 0.68, q = 0.011), while other combinations showed no significant gains. For other disorders, individual and combined models performed comparably. B) Performance difference (ΔAUROC) between DL- and regression-based integration models across disorders. The DL-based integration did not outperform the regression-based one, with overlapping confidence intervals and no significant differences detected. C) Generalization performance (ΔAUROC between iPSYCH1 and iPSYCH2) for each integrated model. Regression-based integration methods (line pattern) were more robust to cohort differences, particularly for ADHD and ASD, while DL-based integration models showed larger generalization drops. Error bars represent 95% confidence intervals based on 10,000 bootstraps. Statistical significance was assessed using one-sided empirical tests with FDR correction; “ns” denotes non-significant comparisons.
The external PGSs and PA-FGRSs exhibited similar performances for most disorders
We further investigated whether combining the internal PGSs (GLN and bigstatsr) with external PGSs and FGRSs could improve prediction performance. Since external PGSs leverage large-scale GWAS summary statistics and FGRSs capture family-based genetic risk, integrating them with the internal PGSs through logistic regression and/or DL-based approaches could provide complementary predictive power. We first found that, when considered individually, PGSs and FGRSs exhibited similar predictive performance for ASD, BIP, MDD, and SCZ, with average AUROC values ranging from 0.55 [0.53–0.57] for ASD to 0.61 [0.56–0.66] for BIP (Figure 3A). The external PGS for ADHD performed notably worse than both GLN and bigstatsr, as well as FGRS, with AUROCs of 0.52 [0.49–0.54] compared to 0.61 [0.60–0.63] and 0.63 [0.61–0.65], respectively (Figure 3A). This likely reflects the limited power of the external ADHD summary statistics since, unlike the other disorders in this study, most of the available genotype data of ADHD cases originate from iPSYCH, which was excluded from the discovery GWAS to avoid sample overlap. The logistic regression combination of external PGSs and FGRSs (PGS+FGRS model) significantly outperformed the individual-score models for MDD but not for the other disorders, reaching an AUROC of 0.64 [0.62–0.65] (empirical q-value 0.042, see Methods, Figure 3A). This is consistent with Dybdahl Krebs et al. (2024), who observed that integrating PGS and FGRS scores increased MDD classification performance.
Combining internal PGSs with PA-FGRS and external PGSs in a regression model improved classification performance in ADHD
We found that the multivariate model integrating an internal PGS (GLN or bigstatsr) with the external PGS and FGRS (GLN/bigstatsr+PGS+FGRS model) had significantly higher AUROC than the PGS+FGRS model for ADHD. Specifically, the AUROC reached 0.68 [0.66–0.70] for the multivariate model, compared to 0.64 [0.62–0.65] for the PGS+FGRS model (q-value 0.011, Figure 3A). This suggests that the internal PGSs provided additional information complementary to the external PGSs and FGRSs. When comparing the two models in terms of R2, however, the difference did not reach statistical significance, although the direction of effect was the same, with the multivariate model reaching 0.081 [0.066–0.097] compared to 0.064 [0.050–0.078] for the PGS+FGRS model (Figure S4, Table S7). The internal PGSs GLN and bigstatsr likely enhanced classification by capturing information that improved case/control separation. However, these signals may explain only a small incremental fraction of variance in ADHD liability, leading to a smaller change in R². This suggests that while internal PGSs improved case prediction in our setting, their contribution when combined with external PGSs trained on larger samples is limited. Finally, we found that a full multivariate model combining GLN+bigstatsr+PGS+FGRS did not outperform the models that included only one internal PGS at a time, further suggesting that the two internal PGS methods GLN and bigstatsr assigned similar weights to the same SNPs and thus did not provide complementary predictive value when combined (Figure 3A, Table S7).
A DL-based integration of internal PGSs with PA-FGRS and external PGSs did not provide advantage over a logistic regression-based one
We further explored whether a DL-based approach to integrating various combinations of risk scores would improve predictive performance compared to the regression-based integration methods tested above. In this DL-based strategy, the GLN neural network underwent further training with the other scores as additional inputs, in a way that allowed the detection of potential non-linear interactions between these scores and individual SNPs within the neural network architecture (see Methods). The DL-based integration models reached performances comparable to those of the regression-based integration models, with average AUROC differences between the DL and the logistic models ranging from −0.0025 [−0.0302–0.0251] for GLN+bigstatsr in ASD to 0.0321 [−0.0319–0.0961] for GLN+bigstatsr+PGS+FGRS in BIP (Figure 3B, Table S8). We observed the same pattern in terms of R2 (Figure S5B, Table S8). Thus, this analysis revealed no significant advantage of integrating internal PGSs, external PGSs, and PA-FGRSs using the DL approach over the regression-based one for any of the model combinations. These findings suggest that in this setting both the DL- and regression-based integrations might capture overlapping genetic contributions to psychiatric disorders, implying either a limited or absent contribution from complex and non-linear interactions among the tested predictors, or that in our scenario the DL models were unable to capture them.
Generalization of all the models on iPSYCH2
Finally, we evaluated the replication performance of the logistic regression- and DL-based integration models on the iPSYCH2 out-of-sample set. We found that when predicting BIP and SCZ, none of the models had a significant drop in performance on iPSYCH2, with average differences in AUROC between the iPSYCH1 test set and iPSYCH2 ranging from −0.0239 [−0.0711–0.0233] for the DL model GLN+bigstatsr+PGS+FGRS in BIP to 0.0074 [−0.0301–0.0448] for the logistic model GLN+bigstatsr in SCZ (Figure 3C, Table S9). This reflects the pattern we had already observed for the replication performance of the single internal PGS models GLN and bigstatsr (Figure 2B). However, the confidence intervals for BIP and SCZ were the widest among the disorders studied here, since their sample sizes were the smallest (Figure 1C). For MDD, all models had significant drops in performance, with average AUROC differences ranging from 0.022 [0.004–0.04] for the DL model GLN+PGS+FGRS to 0.0425 [0.0237–0.0613] for the DL model GLN+bigstatsr (empirical q-values ≤ 0.015, Figure 3C, Table S9). This is consistent with the results observed for the single internal PGSs, where MDD showed the largest performance drop for GLN (average AUROC decrease of 0.0165 [−0.0022–0.0353]; Figure 2B, Table S5). As the GLN-based PGS for MDD had poorer generalization performance compared to the other disorders, models that included it as part of the integration were similarly affected, leading to reduced overall performance. For ADHD and ASD, however, we found a clear pattern in which the regression-based integration had a consistently smaller drop in performance than the DL-based integration, and this drop was not significant for any model except GLN+bigstatsr+PGS+FGRS for ADHD (q-value 0.024, Figure 3C, Table S9). This suggests that the regression-based integrations were robust and better able to balance the good replication performance of GLN with the poorer performance of bigstatsr.
DISCUSSION
While efforts to refine polygenic scores have yielded substantial progress, predicting psychiatric disorders from genetics remains a challenge due to the complex genetic architecture of these conditions. Moreover, the extent to which DL models can offer improvements over simpler linear ones currently seems limited. Here, using genotype data from the iPSYCH1 and iPSYCH2 cohorts, we systematically evaluated two internal PGS methods for predicting five psychiatric disorders (ADHD, ASD, BIP, MDD, and SCZ): bigstatsr, a state-of-the-art linear model, and Genome-Local-Net (GLN), a deep learning (DL) model from the EIR framework (Privé et al., 2018, 2020; Sigurdsson et al., 2023). Additionally, we investigated whether integrating these internal PGSs with Pearson-Aitken Family Genetic Risk Scores (PA-FGRSs) and external PGSs, which use summary statistics for the estimation of effect sizes, could improve predictive performance.
We found that the DL model GLN had performance comparable to the linear model bigstatsr for predicting the five diagnoses on the held-out test set of the iPSYCH1 cohort. This is in agreement with previous work showing that DL and linear models can achieve similar performance for complex trait prediction from genetic data (Christodoulou et al., 2019; Ohta et al., 2024; Sigurdsson et al., 2023), especially when the genetic architecture of these traits is highly polygenic and thought to be mostly additive (Hill et al., 2005; Hivert et al., 2021). Bigstatsr is a highly tuned elastic net model that makes strong prior assumptions about the linear and additive effects of the SNPs, which in turn makes it perform well for traits thought to be mainly additive (Momen et al., 2018). Conversely, DL models like GLN inherently make fewer assumptions about the distribution and nature of the effects of genetic variants than traditional linear models. This weaker set of assumptions complicates the training process and necessitates larger sample sizes for effective model training (Grealey et al., 2024).
Interestingly, GLN had higher generalization performance than bigstatsr when applied to the out-of-sample replication set of iPSYCH2 for predicting ADHD, ASD, and MDD, both in terms of AUROC and R2. This finding suggests that the DL model GLN was able to capture more robust, potentially non-linear and/or interaction-based genetic patterns underlying the psychiatric disorders, while the linear model bigstatsr might have been more prone to overfitting to the specific characteristics of the initial training cohort. We previously showed that the GLN model exhibited robust generalization performance across two independent cohorts (Sigurdsson et al., 2023). However, that study did not focus on psychiatric disorders nor included a comparison with bigstatsr, preventing a direct comparison with our results. While GLN’s superior performance in the replication set suggests better generalization, we cannot exclude that this advantage arose from the model’s ability to capture subtle patterns specific to the training data that coincidentally aligned with the replication set, rather than from capturing real and more generalizable genetic effects. Further research on this and potentially other cohorts would be necessary to disentangle this effect and validate GLN’s increased generalizability for psychiatric disorder prediction. If this finding holds, it would represent an interesting avenue of research, since classic PGSs generally suffer from poor transferability across cohorts (Ding et al., 2023; Wang et al., 2022).
Combining the scores from GLN and bigstatsr did not provide additional predictive performance compared to the single models, suggesting that these two internal PGS methods might have captured largely redundant genetic effects in the training set of the iPSYCH1 cohort. In order to confirm this hypothesis, future analyses could examine the specific genetic variants to which the two models assigned their weights and assess the extent of their overlap.
We further showed that integrating either GLN or bigstatsr with external PGS and FGRS in a logistic model significantly increased the prediction performance for ADHD but not for the other disorders. This could be an indication that the ADHD internal PGSs - GLN or bigstatsr - were able to capture information from the genetic data beyond what was obtained through the external PGS and FGRS. While this improvement was significant in terms of AUROC, it did not reach significance when evaluated using R² on the liability scale. However, the trend in R² remained in the same positive direction, indicating a consistent, albeit more modest, gain. AUROC reflects a model’s ability to rank cases above controls and is thus sensitive to improvements in classification, whereas R² measures the proportion of variance explained in the underlying liability. Since PGSs tend to perform best in identifying individuals at the extremes of the risk distribution (Choi et al., 2020), it is possible that the internal PGSs improved classification by better identifying these individuals, which enhanced AUROC but had a smaller impact on overall explained variance.
Previous studies have shown that combining multiple risk scores can increase the performance for predicting psychiatric disorders and other complex traits (Albiñana et al., 2023; Allesøe et al., 2022; Dybdahl Krebs et al., 2024; Elgart et al., 2022; Truong et al., 2024). However, to the best of our knowledge, this is the first study to explore combining scores using DL and to combine both internal and external PGSs, together with FGRSs, to predict psychiatric disorders. Even though the performance of the full logistic model combining all these risk scores was significantly higher than that of the PGS+FGRS model for ADHD, the increase was modest. In addition, the DL-based integration did not provide significant advantages over the logistic one. The role of non-linear interactions in complex traits remains a subject of ongoing research. Here, we did not observe substantial performance differences between deep learning models and linear approaches for psychiatric disorders. This could reflect a genuine lack of non-linear signal, or it may be due to limitations in our current sample size and experimental setup, which may not have been sufficient to detect such effects (Grealey et al., 2024). Nonetheless, we observed indications of improved generalization of the DL models, though these findings require further validation. The difficulty in detecting evidence of non-linear interactions in complex human traits may not be due to their absence but rather to the multitude and subtlety of the effects. Consequently, a linear approximation may capture nearly all of the population-level variation.
LIMITATIONS
Our study has several limitations. A first and inherent limitation is the sample size which, while being the largest for traits like ADHD and ASD, may still be insufficient for the DL model to fully capture the complex genetic effects of these disorders. As sample sizes increase, the relative performance difference between DL and linear models is expected to change (Grealey et al., 2024). Another limitation that may have impacted the performance of the DL model is the computational constraint, which restricted us to testing only a limited number of hyperparameter combinations. A more exhaustive hyperparameter optimization or model search could potentially enhance the performance of the DL model. However, the DL model’s optimization process is more challenging due to the higher number of potential hyperparameters compared to the bigstatsr model. Additionally, for computational feasibility and fair comparisons we restricted the genotype data to a relatively small number of SNPs, which may mask some of the potential of the DL models. In contrast, bigstatsr can handle a larger number of SNPs more efficiently, allowing for faster processing and analysis (Privé et al., 2018, 2020). Related to this, another limitation is the specific architecture of the GLN model, which is not representative of all DL models; in contrast, linear models have fewer variations in their design and implementation, making them more straightforward to compare. As a result, our findings regarding the performance of the GLN model should not be generalized to all DL approaches. Finally, since we focused on a cohort of homogeneous European ancestry, we did not account for residual biases due to population structure, and we cannot exclude that such structure influenced the differences between the models presented in this study.
CONCLUSIONS AND FUTURE PERSPECTIVES
In this study we compared a DL model, Genome-Local-Net (GLN), with the linear model bigstatsr for predicting five psychiatric disorders using genotype data from the iPSYCH cohort. While both models showed similar performance within the training cohort, GLN generalized better to an independent replication set for ADHD, ASD, and MDD. This suggests that deep learning may offer advantages in terms of generalizability, although its benefit was not consistent across all disorders. Integrating these internal polygenic scores with external GWAS-based PGSs and FGRS improved prediction for ADHD, but not for the other disorders. Notably, logistic regression-based integration performed as well as or better than deep learning-based integration, indicating that complex interactions among scores were either absent or not effectively captured by the deep models.
In conclusion, our findings suggest that DL approaches can be a valuable tool for complex trait prediction from genetic data, with the potential for improved generalizability compared to linear models. However, the performance gains may be limited, at least at current sample sizes, for highly polygenic traits with a mostly additive genetic architecture, such as the psychiatric disorders studied here. Further research is necessary to explore the advantages of deep learning for other complex traits, as well as to investigate more effective ways of integrating various genetic risk factors using machine learning. Overall, the clinical utility of these models is currently limited, but they hold great promise for future clinical deployment as sample sizes increase and our understanding of the genetic underpinnings of these disorders continues to advance. Multi-modal models, which integrate diverse data sources such as genetic, environmental, and clinical information, are already being used in research to provide a more comprehensive study of complex traits (Allesøe et al., 2022; Ma et al., 2025; Porcu et al., 2021; Sigurdsson et al., 2024). In the future, we anticipate an increasing adoption of these models, as they will not only enhance the study of genetic factors but also incorporate environmental influences, allowing for a more holistic perspective on their interactions and overall impact.
MATERIALS AND METHODS
Dataset
The dataset used in this study originates from the Lundbeck Foundation Integrative Psychiatric Research (iPSYCH) consortium, which created iPSYCH2015, a large case-cohort sample encompassing the entire population of Denmark born between 1981 and 2008 (Bybjerg-Grauholm et al., 2020; Pedersen et al., 2018). From this population of 1,657,499 citizens, 88,831 individuals were selected and genotyped based on being diagnosed with at least one major mental disorder, and 48,420 individuals were randomly selected as controls from the general population, for a total of 137,251 individuals. The iPSYCH2015 case-cohort sample comprises two separate enrollment periods based on the initial population, iPSYCH2012 and iPSYCH2015i, hereafter referred to as iPSYCH1 and iPSYCH2 for simplicity. The first includes 57,377 psychiatric cases and 30,000 random population controls, totaling 86,189 individuals. The second enrollment period expanded the sample by adding 36,741 psychiatric cases and 19,982 random population controls, for a total of 56,233 individuals. All individuals in the cohort had a blood sample taken at birth and stored in the Danish Neonatal Screening Biobank (Nørgaard-Pedersen & Hougaard, 2007). From these samples, DNA was extracted and genotyping was performed on the Infinium PsychChip v1.0 array for iPSYCH1 and on the Global Screening Array v2 for iPSYCH2. The steps of genotyping, quality control, and imputation have been described in detail in Schork et al., 2019. Only unrelated individuals of European ancestry were included in this study. To ascertain psychiatric diagnoses, the iPSYCH study design relied on data retrieved from the Danish Psychiatric Central Research Register (PCR) (Mors et al., 2011), in conjunction with information sourced from the Danish National Patient Register (DNPR) (Lynge et al., 2011). Diagnoses were assigned by psychiatrists during in- and outpatient encounters up until 2016. Diagnoses from primary care were not registered in the national registry and were therefore not available for this study. The mental disorders included in the iPSYCH cohort and analyzed in this study were defined according to the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10) as follows: attention-deficit/hyperactivity disorder (ADHD: F90.0), autism spectrum disorders (ASD: F84.0, F84.1, F84.5, F84.8, F84.9), bipolar disorder (BIP: F30–31), major depressive disorder (MDD: F32, F33), and schizophrenia (SCZ: F20). The personal information of the individuals enrolled in the study was anonymized to ensure their privacy. The project received approval from the Danish Data Protection Agency and, therefore, under Danish law, obtaining informed consent from the participants was not necessary.
Genotype data preprocessing
To ensure a computationally feasible data size, we selected SNPs in an informed manner, leveraging GWAS summary statistics for each disorder to define the final list of input variants. An overview of the filtering steps is presented in Figure 1B. We filtered the genotype data from the iPSYCH1 and iPSYCH2 cohorts independently to remove multi-allelic variants and to retain only variants with a minor allele frequency (MAF) > 0.01 and an imputation quality score (DR2) > 0.8. The preprocessing steps were carried out with custom scripts using bcftools (v.1.20) and PLINK (v.2.00) (Chang et al., 2015; Danecek et al., 2021). We then took the intersection of the variants from the two cohorts, resulting in 6,844,196 SNPs. To construct a GWAS-informed SNP set, we ran clumping and thresholding (C+T) with PRSice (v.2.3.5) (Choi & O’Reilly, 2019) for each of the 5 psychiatric disorders (ADHD, ASD, BIP, MDD, SCZ) across a range of p-value thresholds (5e-8, 1e-7, 1e-6, 1e-5, 1e-4, 0.001, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1) with the options --clump-kb 250 --clump-p 1 --clump-r2 1. This was performed on the training and validation set of iPSYCH1 and not on the entire cohort to prevent data leakage. This resulted in 18 different SNP sets for each psychiatric disorder, one for each p-value threshold from the C+T procedure. At each p-value threshold, we then took the union of the SNPs across the 5 disorders and further clumped these candidate SNP sets using the function snp_clumping from the package bigsnpr (v.1.11.6) (Privé et al., 2018) across a range of r2 values (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9), using MAF as the ranking statistic to ensure that no genomic region was left unrepresented (Privé et al., 2018). From these, we selected as the final SNP set the one containing 230,588 SNPs, obtained using a p-value threshold of 0.05 and a clumping r2 of 0.5. This was an arbitrary choice, based on a tradeoff between selecting a feasible number of variants with enough information while reducing LD. Only genotype data from the autosomal chromosomes was used in this study. An overview of the number of variants in each SNP set created during the selection of the final set is shown in Table S10.
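To make the MAF-ranked clumping step more concrete, the sketch below shows how it could be run with bigsnpr; the file paths, the candidate-SNP index object, and the core count are hypothetical assumptions, and the call relies on snp_clumping ranking SNPs by MAF when no statistic is supplied, mirroring the procedure described above rather than reproducing the exact study code.

```r
# Minimal sketch of the MAF-ranked clumping step with bigsnpr (assumed inputs).
library(bigsnpr)

# Attach a genotype backing file previously created with snp_readBed(); path is hypothetical
obj <- snp_attach("ipsych1_train.rds")
G   <- obj$genotypes
chr <- obj$map$chromosome
pos <- obj$map$physical.pos

# Union of the PRSice C+T SNP sets across the 5 disorders at one p-value threshold
# (column indices into G; hypothetical object)
ind_candidates <- readRDS("ct_union_p0.05.rds")

# Restrict clumping to the candidate SNPs by excluding all other columns.
# With no statistic supplied, snp_clumping ranks SNPs by MAF, so that no
# genomic region is left unrepresented, as described above.
ind_keep <- snp_clumping(
  G,
  infos.chr = chr,
  infos.pos = pos,
  thr.r2    = 0.5,   # the final set used r2 = 0.5
  exclude   = setdiff(seq_len(ncol(G)), ind_candidates),
  ncores    = 4
)
length(ind_keep)     # ~230k SNPs for p-value threshold 0.05 and r2 = 0.5
```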
Implementation of training
For each mental disorder, we split the iPSYCH1 dataset into training and test sets with ratios of 0.9 and 0.1, respectively, stratifying for case/control status. The test sets were held out and used exclusively for the evaluation of the models. The training sets were further split into 5 folds so that we could train the models in a 5-fold cross-validation fashion. The same folds were used for all models to ensure reproducibility and consistency in their comparisons. The final split sizes (train / validation per fold / test) were as follows: ADHD: 34,681/6,936/3,854; ASD: 32,692/6,539/3,633; BIP: 23,412/3,682/2,602; MDD: 38,106/7,622/4,234; SCZ: 24,329/4,886/2,704. The final models were ensembles of the five models trained on the five different training folds, evaluated on the test set and on the entire iPSYCH2 dataset as an out-of-sample replication set.
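As an illustration of the data partitioning, the sketch below reproduces the stratified 90/10 split and the 5-fold assignment using caret; the phenotype file, the seed, and the object names are hypothetical and not taken from the study code.

```r
# Minimal sketch of the stratified 90/10 split and 5-fold CV assignment (assumed inputs).
library(caret)

set.seed(42)                                  # assumed seed
pheno <- readRDS("adhd_status_ipsych1.rds")   # hypothetical 0/1 case-control vector

# Stratified 90/10 split into training and held-out test individuals
train_idx <- createDataPartition(factor(pheno), p = 0.9, list = FALSE)[, 1]
test_idx  <- setdiff(seq_along(pheno), train_idx)

# Split the training individuals into 5 stratified folds; the same folds are
# then reused for GLN, bigstatsr, and the integration models.
folds <- createFolds(factor(pheno[train_idx]), k = 5, list = TRUE)
str(folds)   # folds$Fold1 ... folds$Fold5 give validation indices within train_idx
```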
GLN implementation and hyperparameter tuning
The neural network models used in this study employ the genome-local-net (GLN) architecture developed in (Sigurdsson et al., 2023). In this architecture, locally connected layers (LCLs) and residual blocks were employed to extract information from the genotype data. This allowed for the combination of both local and global genomic information while keeping the number of parameters manageable compared to other architectures such as fully connected (FC) layers. The models were implemented using EIR (v.0.1.39). The genotype data was one-hot encoded, allowing each allele to have a distinct weight. For each of the five psychiatric diagnoses, we performed hyperparameter tuning with a random search approach on the training and validation set of iPSYCH1. We considered four hyperparameters for tuning. The kernel width, which determines the receptive field size in the locally connected layers, was tested at values of 16, 32, 64, and 128 (covering 4, 8, 16, and 32 one-hot encoded SNPs, respectively). The residual block dropout, which controls the probability of dropping connections in residual blocks to prevent overfitting, was evaluated at rates of 0, 0.10, 0.25, 0.5, and 0.9. The channel expansion base, which defines the power of 2 used to determine the number of channels (i.e., the number of trainable kernels applied to each SNP region) in the network, was tested with values of 2, 3, and 4. Finally, the percentage of SNPs set to ‘missing’ for data augmentation, a method that simulates missingness in genotype data by randomly setting a proportion of SNPs to ‘missing’ in the one-hot encoding, was explored at levels of 0%, 20%, 40%, and 60%. Each hyperparameter tuning run was performed in a 1-fold fashion, using folds 2–5 for training and fold 1 for evaluation and selection of the best hyperparameter combination based on the AUROC metric.
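The random search itself amounts to drawing 30 combinations from the full grid of the four hyperparameters listed above; the sketch below illustrates this in R, with column names chosen for readability rather than matching the exact EIR configuration keys, and a seed that is purely illustrative.

```r
# Minimal sketch of the random search over the four GLN hyperparameters.
# Grid values are those reported in the text; column names are illustrative,
# not the exact EIR configuration keys.
set.seed(1)                                         # assumed seed
grid <- expand.grid(
  kernel_width     = c(16, 32, 64, 128),            # receptive field of the LCLs
  rb_dropout       = c(0, 0.10, 0.25, 0.5, 0.9),    # residual block dropout
  channel_exp_base = c(2, 3, 4),                    # power of 2 for number of channels
  na_augment_frac  = c(0, 0.2, 0.4, 0.6)            # fraction of SNPs set to 'missing'
)
combos <- grid[sample(nrow(grid), 30), ]            # 30 random combinations per diagnosis
# Each row is passed to a separate EIR training run; the combination with the
# highest validation AUROC is then used for the full 5-fold CV training.
head(combos)
```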
Linear model bigstatsr
The bigstatsr model was trained on the same 5-CV folds used for training the GLN model, with the function big_spLogReg from the bigstatsr and bigsnpr packages (v.1.5.12 and v.1.11.6, respectively) (Privé et al., 2018, 2020). This function natively implements a Cross-Model Selection and Averaging (CMSA) procedure, which enables fast cross-validation coupled with extensive and automatic tuning of the lambda and alpha hyperparameters of the elastic net regularization. We performed a grid search for alpha over the values [0.0001, 0.001, 0.01, 0.1, 1] and used the default options to select the best penalization parameter (λ). The final model was an ensemble formed by averaging the five models trained on the respective folds.
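A minimal sketch of this step with big_spLogReg is shown below; the backing file, phenotype object, and index vectors are hypothetical, and the call relies on the function's built-in CMSA to tune lambda while alpha is searched over the grid reported above.

```r
# Minimal sketch of the penalized logistic regression with bigstatsr's CMSA (assumed inputs).
library(bigstatsr)

G         <- big_attach("ipsych1_230k.rds")    # FBM with the ~230k selected SNPs (hypothetical path)
y01       <- readRDS("adhd_status.rds")        # 0/1 case-control status, aligned to rows of G
ind_train <- readRDS("train_indices.rds")      # training individuals for one disorder

# big_spLogReg implements Cross-Model Selection and Averaging (CMSA):
# lambda is tuned internally along its regularization path, while alpha is grid-searched.
mod <- big_spLogReg(
  X         = G,
  y01.train = y01[ind_train],
  ind.train = ind_train,
  alphas    = c(0.0001, 0.001, 0.01, 0.1, 1),  # grid used in this study
  ncores    = 4
)

# Predicted scores for held-out individuals, used as the internal linear PGS
ind_test <- readRDS("test_indices.rds")
pred <- predict(mod, X = G, ind.row = ind_test)
```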
External PGSs computation
We computed external polygenic scores (PGSs) for the individuals of the two cohorts iPSYCH1 and iPSYCH2 separately, using summary statistics from recent genome-wide association studies (GWAS) conducted by the Psychiatric Genomics Consortium (PGC) on ADHD (Demontis et al., 2023), ASD (Grove et al., 2019), BIP (Mullins et al., 2021), MDD (Howard et al., 2019), and SCZ (Trubetskoy et al., 2022). To avoid sample overlap, we ensured that iPSYCH participants were excluded from the discovery datasets. We re-estimated SNP effect sizes using SBayesR from GCTB (v.2.05beta) (Lloyd-Jones et al., 2019), a Bayesian multiple regression method, applying a p-value threshold of 0.9. For linkage disequilibrium (LD) modeling, we used the HapMap3-based banded LD matrix downloaded from the GCTB resource (https://cnsgenomics.com/software/gctb/#Download). We used the default SBayesR settings for the scaling factor (γ = 0, 0.01, 0.1, 1) and mixture proportions (π = 0.95, 0.02, 0.02, 0.01). After rescaling the SNP effects, we generated a PGS for each individual using the scoring module in PLINK (v.2.00) (Chang et al., 2015).
PA-FGRSs
The PA-FGRSs were computed as previously described (Dybdahl Krebs et al., 2024) using the PA-FGRS R package (https://github.com/BioPsyk/PAFGRS). In summary, we used the Danish Civil Registration System to link each member of the iPSYCH study to all available relatives (21 on average). For each relative, diagnoses were extracted from the PCR and DNPR, and the end-of-follow-up date was set to the date of death, emigration, or December 31, 2016, whichever came first. Assuming an additive genetic liability threshold model, the PA-FGRS method estimates the expected genetic liability of an individual conditional on the disease status of all available relatives, while modeling their relatedness, the sex-stratified disease prevalence, and the proportion of their life that has been observed, which is integrated using sex- and birth-year-specific cumulative incidence curves.
Logistic regression-based integration
For each mental disorder, we carried out an integration of the scores from GLN, bigstatsr, the external PGS, and the FGRS using a logistic regression implemented with the tidymodels framework in R. For the ADHD, ASD, and MDD models, we allowed interaction terms between the scores, while we did not for the BIP and SCZ models, since interactions negatively impacted their performance in terms of AUROC (Fig S6, upper plots). As mentioned above, training was performed on the same 5-CV folds as for the other models, and performance was evaluated on the test sets of iPSYCH1 and the out-of-sample replication sets of iPSYCH2. To avoid data leakage, the scores were centered and scaled using the means and standard deviations calculated on the training set.
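The sketch below illustrates this integration for one disorder with tidymodels, using a recipe so that the centering/scaling parameters are learned on the training data only; the data frames and their column names (dx, gln, pgs_ext, fgrs) are hypothetical, and the combination shown (GLN+PGS+FGRS with pairwise interactions) is one of several tested in the study.

```r
# Minimal sketch of the regression-based score integration with tidymodels (assumed inputs).
# Data frames and column names (dx, gln, pgs_ext, fgrs) are hypothetical.
library(tidymodels)

train_df <- readRDS("adhd_scores_train.rds")   # one row per individual: dx (factor) + scores
test_df  <- readRDS("adhd_scores_test.rds")

# Centering/scaling parameters are estimated on the training data only (no leakage);
# pairwise interaction terms are added, as for the ADHD, ASD, and MDD models.
rec <- recipe(dx ~ gln + pgs_ext + fgrs, data = train_df) |>
  step_normalize(all_numeric_predictors()) |>
  step_interact(terms = ~ gln:pgs_ext + gln:fgrs + pgs_ext:fgrs)

wf <- workflow() |>
  add_recipe(rec) |>
  add_model(logistic_reg() |> set_engine("glm"))

fit_wf <- fit(wf, data = train_df)
pred   <- predict(fit_wf, new_data = test_df, type = "prob")   # class probabilities for AUROC
```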
DL-based integration
We performed the DL-based integration of the GLN model with the scores from bigstatsr, the external PGSs, and the FGRSs by allowing the pre-trained GLN model to undergo further training and adding the scores as tabular input to EIR. Initially, the GLN model was trained solely on the genotype data. The scores from bigstatsr, the external PGSs, and the FGRSs were then fed as additional inputs to this model via a dedicated tabular feature extractor. This extractor used embeddings for categorical inputs, and continuous inputs were normalized with the means and standard deviations computed from the training set to avoid data leakage. The components of the pre-trained GLN model (genotype feature extractor, fusion module, and output) were kept the same. The new tabular inputs were injected into the existing fusion module through the tabular extractor, enabling a flexible architecture that could potentially learn interactions between individual SNPs and tabular scores (e.g., a specific SNP modulating the effect of a PGS). A schematic overview of the architecture is available in Figure 2A of the original EIR paper (Sigurdsson et al., 2023). All models allowed for interactions between the scores, since in the DL case the performance of these models was not significantly lower than that of the non-interaction models for BIP and SCZ (Fig S6, bottom plots).
Model performances and statistical tests
For each psychiatric disorder, we trained all models (GLN, bigstatsr, and the regression- and DL-based combined models of all scores) as five separate models using five-fold cross-validation. We then formed the final models by ensembling these five trained models through simple averaging of their predictions. We used these ensemble models to predict the mental disorders on the iPSYCH1 test sets, as well as on the full iPSYCH2 sample, which served as an out-of-sample replication set. To derive confidence intervals for the performance metrics and estimate their sampling distributions, we bootstrapped the test and replication sets, i.e., for each set we drew 10,000 samples with replacement of the same size as the original set and computed the average metric, together with its standard deviation and non-parametric 2.5th and 97.5th percentiles (Raschka, 2020). To compute the Area Under the Receiver Operating Curve (AUROC), we used the implementation of the AUCBoot function from the bigstatsr package (Privé et al., 2020). For the R2, we used the same approach as (Albiñana et al., 2023): we computed the R2 using the caret function R2 on the predicted probabilities and true outcomes, and then converted it to the liability scale using the function coef_to_liab from the bigsnpr package (Privé et al., 2018), using the following prevalences for the psychiatric disorders: ADHD: 0.05, ASD: 0.02, BIP: 0.01, MDD: 0.15, SCZ: 0.01. For the K_gwas option of the function, analogously to (Albiñana et al., 2023), we used the proportion of cases in the specific sets (shown in Figure 1C). We compared the performance of two models by deriving empirical p-values from the bootstrapped AUC and R2 distributions of each model, aligning with the classical definition of a p-value as the probability, under the null hypothesis, of observing data at least as extreme as the observed difference. We performed a one-sided empirical test to assess whether, e.g., Model 2 achieved higher performance than Model 1, by counting the proportion of bootstrap samples in which the metric from Model 2 was lower than or equal to the metric from Model 1. Under the null hypothesis that Model 2’s metric is not higher than Model 1’s, this proportion served as the p-value for the one-sided test. To correct for multiple comparisons, we applied the false discovery rate (FDR) method using the Benjamini–Hochberg procedure (Benjamini & Hochberg, 1995). This approach provides adjusted p-values, often referred to as “q-values,” which control the expected proportion of false positives (Type I errors) across all comparisons performed, and was applied with the function p.adjust from the stats package in R.
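To make the evaluation procedure concrete, the sketch below shows one possible implementation of the bootstrapped AUROC, the one-sided empirical comparison (here formulated as a paired bootstrap of the AUROC difference), the Benjamini-Hochberg correction, and the liability-scale conversion; the synthetic input vectors and the paired-bootstrap formulation are illustrative assumptions rather than the exact study code.

```r
# Minimal sketch of the bootstrap evaluation and one-sided empirical comparison.
# Synthetic inputs are used for illustration only; they are not study data.
library(bigstatsr)   # AUC() and AUCBoot()
library(bigsnpr)     # coef_to_liab()

set.seed(7)
n <- 2000
y       <- rbinom(n, 1, 0.3)                   # synthetic 0/1 outcome
pred_m1 <- 0.3 * y + rnorm(n)                  # synthetic scores from "Model 1"
pred_m2 <- 0.5 * y + rnorm(n)                  # synthetic scores from "Model 2"

# Bootstrapped AUROC (mean, SD, 2.5% and 97.5% percentiles), as reported in the paper
AUCBoot(pred_m2, y, nboot = 10000)

# One-sided empirical test, here as a paired bootstrap of the AUROC difference:
# under H0 "Model 2 is not better than Model 1", the p-value is the proportion of
# bootstrap samples in which Model 2 does not outperform Model 1.
diffs <- replicate(10000, {
  idx <- sample(n, replace = TRUE)
  AUC(pred_m2[idx], y[idx]) - AUC(pred_m1[idx], y[idx])
})
p_val <- mean(diffs <= 0)

# FDR correction (Benjamini-Hochberg); in practice all comparison p-values
# are adjusted together.
q_val <- p.adjust(p_val, method = "BH")

# Liability-scale R2: observed-scale R2 rescaled with coef_to_liab(),
# using the population prevalence (0.05 for ADHD) and the case proportion in the set.
r2_obs  <- caret::R2(pred = pred_m2, obs = y)
r2_liab <- r2_obs * coef_to_liab(K.pop = 0.05, K.gwas = mean(y))
```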
ACKNOWLEDGEMENTS
We would like to thank Professor Ditte Demontis for providing us with the latest GWAS summary statistics for ADHD trained on PGC excluding the iPSYCH individuals.
FUNDING
L.C., A.I.S., and S.R. were supported by the Novo Nordisk Foundation (NNF14CC0001, NNF23SA0084103, and NNF21SA0072102). K.L.G.H. and A.J.S. were supported by Lundbeckfonden (R335-2019-2318) and the National Institute of Mental Health (R01MH130581). M.D.K. was supported by Lundbeckfonden (R450-2023-1447). M.E.B. was supported by Lundbeckfonden (R278-2018-1411).
SUPPLEMENTARY INFORMATION
Supplementary figures and tables are available in the attached files.
CONFLICT OF INTEREST
S.R. is the founder and owner of the Danish company BioAI and has performed consulting for Sidera Bio ApS. The other authors declare no competing interests.
DATA AND CODE AVAILABILITY
The iPSYCH data is securely stored and was analyzed in the national HPC facility in Denmark (https://genome.au.dk/). The personal data was anonymized prior to use for research purposes. The project received approval from the Danish Data Protection Agency, and in accordance with Danish legislation, informed consent was not required. The iPSYCH initiative is committed to providing access to these data to the scientific community, in accordance with Danish law. EIR is available at https://eir.readthedocs.io/en/stable/. The code used in this project is available at https://github.com/leocob/eirpsy. The code to compute PA-FGRSs is available at https://github.com/BioPsyk/PAFGRS.
REFERENCES
- Albiñana C., Grove J., McGrath J. J., Agerbo E., Wray N. R., Bulik C. M., Nordentoft M., Hougaard D. M., Werge T., Børglum A. D., Mortensen P. B., Privé F., & Vilhjálmsson B. J. (2021). Leveraging both individual-level genetic data and GWAS summary statistics increases polygenic prediction. The American Journal of Human Genetics, 108(6), 1001–1011. 10.1016/j.ajhg.2021.04.014
- Albiñana C., Zhu Z., Schork A. J., Ingason A., Aschard H., Brikell I., Bulik C. M., Petersen L. V., Agerbo E., Grove J., Nordentoft M., Hougaard D. M., Werge T., Børglum A. D., Mortensen P. B., McGrath J. J., Neale B. M., Privé F., & Vilhjálmsson B. J. (2023). Multi-PGS enhances polygenic prediction by combining 937 polygenic scores. Nature Communications, 14(1), Article 1. 10.1038/s41467-023-40330-w
- Allesøe R. L., Thompson W. K., Bybjerg-Grauholm J., Hougaard D. M., Nordentoft M., Werge T., Rasmussen S., & Benros M. E. (2022). Deep Learning for Cross-Diagnostic Prediction of Mental Disorder Diagnosis and Prognosis Using Danish Nationwide Register and Genetic Data. JAMA Psychiatry. 10.1001/jamapsychiatry.2022.4076
- Andreassen O. A., Hindley G. F. L., Frei O., & Smeland O. B. (2023). New insights from the last decade of research in psychiatric genetics: Discoveries, challenges and clinical implications. World Psychiatry: Official Journal of the World Psychiatric Association (WPA), 22(1), 4–24. 10.1002/wps.21034
- Badré A., Zhang L., Muchero W., Reynolds J. C., & Pan C. (2021). Deep neural network improves the estimation of polygenic risk scores for breast cancer. Journal of Human Genetics, 66(4), Article 4. 10.1038/s10038-020-00832-7
- Benjamini Y., & Hochberg Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society Series B: Statistical Methodology, 57(1), 289–300. 10.1111/j.2517-6161.1995.tb02031.x
- Bracher-Smith M., Crawford K., & Escott-Price V. (2021). Machine learning for genetic prediction of psychiatric disorders: A systematic review. Molecular Psychiatry, 26(1), 70–79. 10.1038/s41380-020-0825-2
- Bybjerg-Grauholm J., Pedersen C. B., Bækvad-Hansen M., Pedersen M. G., Adamsen D., Hansen C. S., Agerbo E., Grove J., Als T. D., Schork A. J., Buil A., Mors O., Nordentoft M., Werge T., Børglum A. D., Hougaard D. M., & Mortensen P. B. (2020). The iPSYCH2015 Case-Cohort sample: Updated directions for unravelling genetic and environmental architectures of severe mental disorders (p. 2020.11.30.20237768). 10.1101/2020.11.30.20237768
- Chang C. C., Chow C. C., Tellier L. C., Vattikuti S., Purcell S. M., & Lee J. J. (2015). Second-generation PLINK: Rising to the challenge of larger and richer datasets. GigaScience, 4(1), s13742-015-0047-8. 10.1186/s13742-015-0047-8
- Choi S. W., Mak T. S.-H., & O’Reilly P. F. (2020). Tutorial: A guide to performing polygenic risk score analyses. Nature Protocols, 15(9), 2759–2772. 10.1038/s41596-020-0353-1
- Choi S. W., & O’Reilly P. F. (2019). PRSice-2: Polygenic Risk Score software for biobank-scale data. GigaScience, 8(7), giz082. 10.1093/gigascience/giz082
- Christodoulou E., Ma J., Collins G. S., Steyerberg E. W., Verbakel J. Y., & Van Calster B. (2019). A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. Journal of Clinical Epidemiology, 110, 12–22. 10.1016/j.jclinepi.2019.02.004
- Danecek P., Bonfield J. K., Liddle J., Marshall J., Ohan V., Pollard M. O., Whitwham A., Keane T., McCarthy S. A., Davies R. M., & Li H. (2021). Twelve years of SAMtools and BCFtools. GigaScience, 10(2), giab008. 10.1093/gigascience/giab008
- Demontis D., Walters G. B., Athanasiadis G., Walters R., Therrien K., Nielsen T. T., Farajzadeh L., Voloudakis G., Bendl J., Zeng B., Zhang W., Grove J., Als T. D., Duan J., Satterstrom F. K., Bybjerg-Grauholm J., Bækved-Hansen M., Gudmundsson O. O., Magnusson S. H., … Børglum A. D. (2023). Genome-wide analyses of ADHD identify 27 risk loci, refine the genetic architecture and implicate several cognitive domains. Nature Genetics, 1–11. 10.1038/s41588-022-01285-8
- Ding Y., Hou K., Xu Z., Pimplaskar A., Petter E., Boulier K., Privé F., Vilhjálmsson B. J., Olde Loohuis L. M., & Pasaniuc B. (2023). Polygenic scoring accuracy varies across the genetic ancestry continuum. Nature, 618(7966), 774–781. 10.1038/s41586-023-06079-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dybdahl Krebs M., Appadurai V., Georgii Hellberg K.-L., Ohlsson H., Steinbach J., Pedersen E., iPSYCH Study Consortium, Werge T., Sundquist J., Sundquist K., Cai N., Zaitlen N., Dahl A., Vilhjalmsson B., Flint J., Bacanu S.-A., Schork A. J., & Kendler K. S. (2023). The relationship between genotype- and phenotype-based estimates of genetic liability to psychiatric disorders, in practice and in theory. Genetic and Genomic Medicine. 10.1101/2023.06.19.23291606 [DOI] [Google Scholar]
- Dybdahl Krebs M., Georgii Hellberg K.-L., Lundberg M., Appadurai V., Ohlsson H., Pedersen E., Steinbach J., Matthews J., Border R., LaBianca S., Calle X., Meijsen J. J., Ingason A., Buil A., Vilhjálmsson B. J., Flint J., Bacanu S.-A., Cai N., Dahl A., … Schork A. J. (2024). Genetic liability estimated from large-scale family data improves genetic prediction, risk score profiling, and gene mapping for major depression. The American Journal of Human Genetics, 111(11), 2494–2509. 10.1016/j.ajhg.2024.09.009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Elgart M., Lyons G., Romero-Brufau S., Kurniansyah N., Brody J. A., Guo X., Lin H. J., Raffield L., Gao Y., Chen H., de Vries P., Lloyd-Jones D. M., Lange L. A., Peloso G. M., Fornage M., Rotter J. I., Rich S. S., Morrison A. C., Psaty B. M., … Sofer T. (2022). Non-linear machine learning models incorporating SNPs and PRS improve polygenic prediction in diverse human populations. Communications Biology, 5(1), 1–12. 10.1038/s42003-022-03812-z [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ge T., Chen C.-Y., Ni Y., Feng Y.-C. A., & Smoller J. W. (2019). Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nature Communications, 10(1), 1776. 10.1038/s41467-019-09718-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Grealey J., Abraham G., Méric G., Cánovas R., Kelemen M., Teo S. M., Salim A., Inouye M., & Xu Y. (2024). Deep learning for polygenic prediction: The role of heritability, interaction type and sample size. Genetic and Genomic Medicine. 10.1101/2024.10.25.24316156 [DOI] [Google Scholar]
- Grove J., Ripke S., Als T. D., Mattheisen M., Walters R. K., Won H., Pallesen J., Agerbo E., Andreassen O. A., Anney R., Awashti S., Belliveau R., Bettella F., Buxbaum J. D., Bybjerg-Grauholm J., Bækvad-Hansen M., Cerrato F., Chambert K., Christensen J. H., … Børglum A. D. (2019). Identification of common genetic risk variants for autism spectrum disorder. Nature Genetics, 51(3), Article 3. 10.1038/s41588-019-0344-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hemani G., Powell J. E., Wang H., Shakhbazov K., Westra H.-J., Esko T., Henders A. K., McRae A. F., Martin N. G., Metspalu A., Franke L., Montgomery G. W., Goddard M. E., Gibson G., Yang J., & Visscher P. M. (2021). Phantom epistasis between unlinked loci. Nature, 596(7871), Article 7871. 10.1038/s41586-021-03765-z [DOI] [PubMed] [Google Scholar]
- Hermes S., Cady J., Armentrout S., O’Connor J., Carlson S., Cruchaga C., Wingo T., Greytak E. M., & Initiative T. A. D. N. (2023). Epistatic Features and Machine Learning Improve Alzheimer’s Risk Prediction Over Polygenic Risk Scores. medRxiv. 10.1101/2023.02.10.23285766 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hill W. G., Goddard M. E., & Visscher P. M. (2005). Data and theory point to mainly additive genetic variance for complex traits. PLOS Genetics, 4(2). 10.1371/journal.pgen.1000008 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hivert V., Sidorenko J., Rohart F., Goddard M. E., Yang J., Wray N. R., Yengo L., & Visscher P. M. (2021). Estimation of non-additive genetic variance in human complex traits from a large sample of unrelated individuals. American Journal of Human Genetics, 108(5), 786–798. 10.1016/j.ajhg.2021.02.014 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Howard D. M., Adams M. J., Clarke T.-K., Hafferty J. D., Gibson J., Shirali M., Coleman J. R. I., Hagenaars S. P., Ward J., Wigmore E. M., Alloza C., Shen X., Barbu M. C., Xu E. Y., Whalley H. C., Marioni R. E., Porteous D. J., Davies G., Deary I. J., … McIntosh A. M. (2019). Genome-wide meta-analysis of depression identifies 102 independent variants and highlights the importance of the prefrontal brain regions. Nature Neuroscience, 22(3), 343–352. 10.1038/s41593-018-0326-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hujoel M. L. A., Loh P.-R., Neale B. M., & Price A. L. (2022). Incorporating family history of disease improves polygenic risk scores in diverse populations. Cell Genomics, 2(7), 100152. 10.1016/j.xgen.2022.100152 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kendler K. S., Ohlsson H., Sundquist J., & Sundquist K. (2021). Family Genetic Risk Scores and the Genetic Architecture of Major Affective and Psychotic Disorders in a Swedish National Sample. JAMA Psychiatry, 78(7), 735–743. 10.1001/jamapsychiatry.2021.0336 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Khera A. V., Chaffin M., Aragam K. G., Haas M. E., Roselli C., Choi S. H., Natarajan P., Lander E. S., Lubitz S. A., Ellinor P. T., & Kathiresan S. (2018). Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nature Genetics, 50(9), 1219–1224. 10.1038/s41588-018-0183-z [DOI] [PMC free article] [PubMed] [Google Scholar]
- Klau J. H., Maj C., Klinkhammer H., Krawitz P. M., Mayr A., Hillmer A. M., Schumacher J., & Heider D. (2023). AI-based multi-PRS models outperform classical single-PRS models. Frontiers in Genetics, 14. https://www.frontiersin.org/articles/10.3389/fgene.2023.1217860 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Krapohl E., Patel H., Newhouse S., Curtis C. J., von Stumm S., Dale P. S., Zabaneh D., Breen G., O’Reilly P. F., & Plomin R. (2018). Multi-polygenic score approach to trait prediction. Molecular Psychiatry, 23(5), 1368–1374. 10.1038/mp.2017.163 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kullo I. J., Lewis C. M., Inouye M., Martin A. R., Ripatti S., & Chatterjee N. (2022). Polygenic scores in biomedical research. Nature Reviews Genetics, 23(9), 524–532. 10.1038/s41576-022-00470-z [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li R., Chang C., Tanigawa Y., Narasimhan B., Hastie T., Tibshirani R., & Rivas M. A. (2021). Fast numerical optimization for genome sequencing data in population biobanks. Bioinformatics, 37(22), 4148–4155. 10.1093/bioinformatics/btab452 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lloyd-Jones L. R., Zeng J., Sidorenko J., Yengo L., Moser G., Kemper K. E., Wang H., Zheng Z., Magi R., Esko T., Metspalu A., Wray N. R., Goddard M. E., Yang J., & Visscher P. M. (2019). Improved polygenic prediction by Bayesian multiple regression on summary statistics. Nature Communications, 10(1), 5086. 10.1038/s41467-019-12653-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Loh P.-R., Tucker G., Bulik-Sullivan B. K., Vilhjálmsson B. J., Finucane H. K., Salem R. M., Chasman D. I., Ridker P. M., Neale B. M., Berger B., Patterson N., & Price A. L. (2015). Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nature Genetics, 47(3), 284–290. 10.1038/ng.3190 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Luckett A. M., Weedon M. N., Hawkes G., Leslie R. D., Oram R. A., & Grant S. F. A. (2023). Utility of genetic risk scores in type 1 diabetes. Diabetologia, 66(9), 1589–1600. 10.1007/s00125-023-05955-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lynge E., Sandegaard J. L., & Rebolj M. (2011). The Danish National Patient Register. Scandinavian Journal of Public Health, 39(7_suppl), 30–33. 10.1177/1403494811401482 [DOI] [PubMed] [Google Scholar]
- Ma S., Zeng A. G. X., Haibe-Kains B., Goldenberg A., Dick J. E., & Wang B. (2025). Moving towards genome-wide data integration for patient stratification with Integrate Any Omics. Nature Machine Intelligence, 7(1), 29–42. 10.1038/s42256-024-00942-3 [DOI] [Google Scholar]
- Mars N., Koskela J. T., Ripatti P., Kiiskinen T. T. J., Havulinna A. S., Lindbohm J. V., Ahola-Olli A., Kurki M., Karjalainen J., Palta P., Neale B. M., Daly M., Salomaa V., Palotie A., Widén E., & Ripatti S. (2020). Polygenic and clinical risk scores and their impact on age at onset and prediction of cardiometabolic diseases and common cancers. Nature Medicine, 26(4), 549–557. 10.1038/s41591-020-0800-0 [DOI] [PubMed] [Google Scholar]
- Mars N., Lindbohm J. V., Della Briotta Parolo P., Widén E., Kaprio J., Palotie A., & Ripatti S. (2022). Systematic comparison of family history and polygenic risk across 24 common diseases. The American Journal of Human Genetics, 109(12), 2152–2162. 10.1016/j.ajhg.2022.10.009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Momen M., Mehrgardi A. A., Sheikhi A., Kranis A., Tusell L., Morota G., Rosa G. J. M., & Gianola D. (2018). Predictive ability of genome-assisted statistical models under various forms of gene action. Scientific Reports, 8(1), 12309. 10.1038/s41598-018-30089-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mors O., Perto G. P., & Mortensen P. B. (2011). The Danish Psychiatric Central Research Register. Scandinavian Journal of Public Health, 39(7_suppl), 54–57. 10.1177/1403494810395825 [DOI] [PubMed] [Google Scholar]
- Mullins N., Forstner A. J., O’Connell K. S., Coombes B., Coleman J. R. I., Qiao Z., Als T. D., Bigdeli T. B., Børte S., Bryois J., Charney A. W., Drange O. K., Gandal M. J., Hagenaars S. P., Ikeda M., Kamitaki N., Kim M., Krebs K., Panagiotaropoulou G., … Andreassen O. A. (2021). Genome-wide association study of more than 40,000 bipolar disorder cases provides new insights into the underlying biology. Nature Genetics, 53(6), 817–829. 10.1038/s41588-021-00857-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Murray G. K., Lin T., Austin J., McGrath J. J., Hickie I. B., & Wray N. R. (2021). Could Polygenic Risk Scores Be Useful in Psychiatry?: A Review. JAMA Psychiatry, 78(2), 210–219. 10.1001/jamapsychiatry.2020.3042 [DOI] [PubMed] [Google Scholar]
- Nørgaard-Pedersen B., & Hougaard D. M. (2007). Storage policies and use of the Danish Newborn Screening Biobank. Journal of Inherited Metabolic Disease, 30(4), 530–536. 10.1007/s10545-007-0631-x [DOI] [PubMed] [Google Scholar]
- Ohta R., Tanigawa Y., Suzuki Y., Kellis M., & Morishita S. (2024). A polygenic score method boosted by non-additive models. Nature Communications, 15(1), 4433. 10.1038/s41467-024-48654-x [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pedersen C. B., Bybjerg-Grauholm J., Pedersen M. G., Grove J., Agerbo E., Bækvad-Hansen M., Poulsen J. B., Hansen C. S., McGrath J. J., Als T. D., Goldstein J. I., Neale B. M., Daly M. J., Hougaard D. M., Mors O., Nordentoft M., Børglum A. D., Werge T., & Mortensen P. B. (2018). The iPSYCH2012 case–cohort sample: New directions for unravelling genetic and environmental architectures of severe mental disorders. Molecular Psychiatry, 23(1), 6–14. 10.1038/mp.2017.196 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Peng J., Bao Z., Li J., Han R., Wang Y., Han L., Peng J., Wang T., Hao J., Wei Z., & Shang X. (2024). DeepRisk: A deep learning approach for genome-wide assessment of common disease risk. Fundamental Research, S2667325824001171. 10.1016/j.fmre.2024.02.015 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Porcu E., Sjaarda J., Lepik K., Carmeli C., Darrous L., Sulc J., Mounier N., & Kutalik Z. (2021). Causal Inference Methods to Integrate Omics and Complex Traits. Cold Spring Harbor Perspectives in Medicine, 11(5), a040493. 10.1101/cshperspect.a040493 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Privé F., Arbel J., & Vilhjálmsson B. J. (2021). LDpred2: Better, faster, stronger. Bioinformatics, 36(22–23), 5424–5431. 10.1093/bioinformatics/btaa1029 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Privé F., Aschard H., & Blum M. G. B. (2019). Efficient Implementation of Penalized Regression for Genetic Risk Prediction. Genetics, 212(1), 65–74. 10.1534/genetics.119.302019 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Privé F., Aschard H., Ziyatdinov A., & Blum M. G. B. (2018). Efficient analysis of large-scale genome-wide data with two R packages: Bigstatsr and bigsnpr. Bioinformatics, 34(16), 2781–2787. 10.1093/bioinformatics/bty185 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Privé F., Vilhjálmsson B. J., & Aschard H. (2020). Fitting penalized regressions on very large genetic data using snpnet and bigstatsr [Preprint]. Genetics. 10.1101/2020.10.30.362079 [DOI] [Google Scholar]
- Purcell S. M., Wray N. R., Stone J. L., Visscher P. M., O’Donovan M. C., Sullivan P. F., Sklar P., Purcell (Leader) S. M., Stone J. L., Sullivan P. F., Ruderfer D. M., McQuillin A., Morris D. W., O’Dushlaine C. T., Corvin A., Holmans P. A., O’Donovan M. C., Sklar P., Wray N. R., … Stanley Center for Psychiatric Research and Broad Institute of MIT and Harvard. (2009). Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature, 460(7256), 748–752. 10.1038/nature08185 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Qian J., Tanigawa Y., Du W., Aguirre M., Chang C., Tibshirani R., Rivas M. A., & Hastie T. (2020). A fast and scalable framework for large-scale and ultrahigh-dimensional sparse regression with application to the UK Biobank. PLOS Genetics, 16(10), e1009141. 10.1371/journal.pgen.1009141 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Raschka S. (2020). Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning (No. arXiv:1811.12808). arXiv. 10.48550/arXiv.1811.12808 [DOI] [Google Scholar]
- Schork A. J., Schork M. A., & Schork N. J. (2018). Genetic risks and clinical rewards. Nature Genetics, 50(9), 1210–1211. 10.1038/s41588-018-0213-x [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schork A. J., Won H., Appadurai V., Nudel R., Gandal M., Delaneau O., Revsbech Christiansen M., Hougaard D. M., Bækved-Hansen M., Bybjerg-Grauholm J., Giørtz Pedersen M., Agerbo E., Bøcker Pedersen C., Neale B. M., Daly M. J., Wray N. R., Nordentoft M., Mors O., Børglum A. D., … Werge T. (2019). A genome-wide association study of shared risk across psychiatric disorders implicates gene regulation during fetal neurodevelopment. Nature Neuroscience, 22(3), Article 3. 10.1038/s41593-018-0320-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schwarzerova J., Hurta M., Barton V., Lexa M., Walther D., Provaznik V., & Weckwerth W. (2024). A perspective on genetic and polygenic risk scores—Advances and limitations and overview of associated tools. Briefings in Bioinformatics, 25(3), bbae240. 10.1093/bib/bbae240 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sheppard B., Rappoport N., Loh P.-R., Sanders S. J., Zaitlen N., & Dahl A. (2021). A model and test for coordinated polygenic epistasis in complex traits. Proceedings of the National Academy of Sciences, 118(15), e1922305118. 10.1073/pnas.1922305118 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sigurdsson A. I., Gräf J. F., Yang Z., Ravn K., Meisner J., Thielemann R., Webel H., Smit R. A. J., Niu L., Mann M., FinnGen, Vilhjalmsson B., Neale B. M., Ganna A., Hansen T., Loos R. J. F., & Rasmussen S. (2024). Non-linear genetic regulation of the blood plasma proteome (p. 2024.07.04.24309942). medRxiv. 10.1101/2024.07.04.24309942 [DOI] [Google Scholar]
- Sigurdsson A. I., Louloudis I., Banasik K., Westergaard D., Winther O., Lund O., Ostrowski S. R., Erikstrup C., Pedersen O. B. V., Nyegaard M., DBDS Genomic Consortium, Brunak S., Vilhjálmsson B. J., & Rasmussen S. (2023). Deep integrative models for large-scale human genomics. Nucleic Acids Research, gkad373. 10.1093/nar/gkad373 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sigurdsson A. I., Ravn K., Winther O., Lund O., Brunak S., Vilhjálmsson B. J., & Rasmussen S. (2022). Improved prediction of blood biomarkers using deep learning [Preprint]. Genetic and Genomic Medicine. 10.1101/2022.10.27.22281549 [DOI] [Google Scholar]
- Tang D., Freudenberg J., & Dahl A. (2022). Factorizing polygenic epistasis improves prediction and uncovers biological pathways in complex traits (p. 2022.11.29.518075). bioRxiv. 10.1101/2022.11.29.518075 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Torkamani A., Wineinger N. E., & Topol E. J. (2018). The personal and clinical utility of polygenic risk scores. Nature Reviews Genetics, 19(9), 581–590. 10.1038/s41576-018-0018-x [DOI] [PubMed] [Google Scholar]
- Trubetskoy V., Pardiñas A. F., Qi T., Panagiotaropoulou G., Awasthi S., Bigdeli T. B., Bryois J., Chen C.-Y., Dennison C. A., Hall L. S., Lam M., Watanabe K., Frei O., Ge T., Harwood J. C., Koopmans F., Magnusson S., Richards A. L., Sidorenko J., … Schizophrenia Working Group of the Psychiatric Genomics Consortium. (2022). Mapping genomic loci implicates genes and synaptic biology in schizophrenia. Nature, 604(7906), 502–508. 10.1038/s41586-022-04434-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Truong B., Hull L. E., Ruan Y., Huang Q. Q., Hornsby W., Martin H., van Heel D. A., Wang Y., Martin A. R., Lee S. H., & Natarajan P. (2024). Integrative polygenic risk score improves the prediction accuracy of complex traits and diseases. Cell Genomics, 4(4), 100523. 10.1016/j.xgen.2024.100523 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang Y., Tsuo K., Kanai M., Neale B. M., & Martin A. R. (2022). Challenges and Opportunities for Developing More Generalizable Polygenic Risk Scores. Annual Review of Biomedical Data Science, 5(1), null. 10.1146/annurev-biodatasci-111721-074830 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhou G., & Zhao H. (2021). A fast and robust Bayesian nonparametric method for prediction of complex traits using summary statistics. PLOS Genetics, 17(7), e1009697. 10.1371/journal.pgen.1009697 [DOI] [PMC free article] [PubMed] [Google Scholar]