Schizophrenia Bulletin Open. 2023 Mar 10;4(1):sgad008. doi: 10.1093/schizbullopen/sgad008

Development and Validation of Predictive Model for a Diagnosis of First Episode Psychosis Using the Multinational EU-GEI Case–control Study and Modern Statistical Learning Methods

Olesya Ajnakina 1,2,✉,#, Ihsan Fadilah 3, Diego Quattrone 4, Celso Arango 5, Domenico Berardi 6, Miguel Bernardo 7, Julio Bobes 8, Lieuwe de Haan 9, Cristina Marta Del-Ben 10, Charlotte Gayer-Anderson 11, Simona Stilo 12,13, Hannah E Jongsma 14,15, Antonio Lasalvia 16, Sarah Tosato 17, Pierre-Michel Llorca 18, Paulo Rossi Menezes 19, Bart P Rutten 20, Jose Luis Santos 21, Julio Sanjuán 22, Jean-Paul Selten 23, Andrei Szöke 24, Ilaria Tarricone 25, Giuseppe D’Andrea 26, Andrea Tortelli 27, Eva Velthorst 28,29, Peter B Jones 30,31, Manuel Arrojo Romero 32, Caterina La Cascia 33, James B Kirkbride 34, Jim van Os 35,36,37, Michael O’Donovan 38, Craig Morgan 39, Marta di Forti 40, Robin M Murray 41,42; EU-GEI WP2 Group 43,3, Daniel Stahl 44
PMCID: PMC11207766  PMID: 39145333

Abstract

Background and Hypothesis

It is argued that availability of diagnostic models will facilitate a more rapid identification of individuals who are at a higher risk of first episode psychosis (FEP). Therefore, we developed, evaluated, and validated a diagnostic risk estimation model to classify individuals with FEP and controls across six countries.

Study Design

We used data from a large multi-center study encompassing 2627 phenotypically well-defined participants (aged 18–64 years) recruited from 17 research sites across six countries, as part of the European Network of National Schizophrenia Networks Studying Gene-Environment Interactions study. To build the diagnostic model and identify which factors were important for estimating an individual's risk of FEP, we applied a binary logistic model with regularization by the least absolute shrinkage and selection operator. The model was validated employing the internal-external cross-validation approach. The model performance was assessed with the area under the receiver operating characteristic curve (AUROC), calibration, sensitivity, and specificity.

Study Results

With 22 selected predictor variables, the model was able to discriminate adults with FEP from controls with high accuracy across all six countries (AUROC range = 0.84–0.86). Sensitivity (range = 73.9–78.0%) and specificity (range = 75.6–79.3%) were equally good, cumulatively indicating an excellent model accuracy; however, the calibration slope for the diagnostic model showed the presence of some overfitting when applied specifically to participants from France, the UK, and The Netherlands.

Conclusions

The new FEP model achieved a good discrimination and good calibration across six countries with different ethnic contributions supporting its robustness and good generalizability.

Keywords: psychosis/diagnostic factors, diagnostic prediction modeling/risk prediction, cannabis use

Introduction

First episode psychosis (FEP), which affects approximately 3% of the adult population, is an umbrella term used to refer to schizophrenia spectrum disorders or related psychotic disorders.1 Although schizophrenia was initially conceptualized as a chronic, progressive deteriorating condition,2 accumulating evidence suggests that people with a diagnosis of schizophrenia can experience symptomatic improvements and regain a degree of social and occupational functioning,3–5 especially when early intervention services intervene at the onset of the very first psychosis episode.6,7 This ignited an increased focus on specialist early intervention services for FEP,8,9 the aim of which is to reduce treatment delay, increase the chance of recovery, and improve the overall prognosis of psychosis10; however, the detection of those individuals who are at risk for developing FEP is currently limited.11 The reasons for the difficulties in detecting people who are at a greater risk for FEP are diverse, including lack of financial resources, high workload, and reliance on help-seeking behaviors. Indeed, most psychiatric services cannot offer a prompt assessment of the person at risk after the referral was made.12

This recognition ignited development of individualized diagnostic prediction modeling for disease diagnosis that considers individual variability in characteristics and lifestyle of each person.13 Thus, diagnostic models, which are built on the combined effects of thoroughly selected predictors, can be used to forecast the probability of a certain condition being present at the individual level.14 This is particularly potent considering that detection of people who are at risk for FEP using a validated diagnostic model does not rely on help-seeking and can be implemented at different services. For example, it has been shown that 45% of the people who were ultimately diagnosed with FEP were referred to early intervention services by the emergency medical services, while another 17.9% of the FEP patients came in contact with mental health services via criminal justice agencies.15,16 These findings demonstrate that having a reliable tool to identify persons at risk of having FEP across these services would make the identification of people who are at risk for FEP much easier and referrals to early intervention services more prompt. Therefore, it is hoped that availability of diagnostic models will facilitate a more rapid identification of individuals who are at a higher risk of FEP.17 This in turn would reduce time to treatment initiation, which is currently delayed for up to 3 years,18 subsequently minimizing the social and functional disability that results from prolonged untreated psychoses.3–5

To date, several studies, having applied either neuroimaging methods19,20 or proteomic data21 in relatively small samples, aimed to develop a diagnostic model to classify individuals with schizophrenia compared to healthy controls. However, the implementation of the models built on such complex data is likely to be constrained by logistical and financial challenges. Currently, there is no study that has developed, evaluated, and validated a diagnostic model for FEP using data reflecting the real-life clinical information available to a physician and a patient at the time of the assessment. Importantly, it remains unknown if a diagnostic model for classifying individuals with FEP trained on data acquired at one site will perform similarly well on data acquired from a new site that was not included in the model development.19

Of course, this lack of progress in individualized diagnostics for FEP could be due to the complex etiology of FEP disorders, which may be intrinsically difficult to predict at an individual level. It is, nonetheless, equally feasible that the lack of progress can, at least in part, be attributed to significant methodological shortcomings that have engulfed the field of prediction modeling in psychiatry.14 These include relying on small samples, utilizing substantially lower numbers of cases relative to the number of considered predictor variables, employing unreliable methods to select predictor variables for inclusion into the model, not properly assessing the accuracy of the model, and not efficiently dealing with missing data.22–24 In fact, a recent systematic review showed that all current diagnostic models in FEP were at high risk of bias,14 making it highly unlikely for these models to be of any use.

In the era of precision medicine, computationally demanding modern statistical learning algorithms, particularly regularized regression methods (RRMs),25 promise to provide a useful tool for diagnostic modeling. Through the introduction of a penalty for overfitting, which occurs when the developed model provides an over-optimistic assessment of the predictive performance,21 RRMs produce a model with good interpretability,26 which is especially important for clinical application. Therefore, in the present study using a large multi-center phenotypically well-defined sample of FEP,27 we employed RRMs to develop, evaluate, and validate a diagnostic risk estimation model to classify individuals with FEP based on an individual profile of sociodemographic characteristics and environmental circumstances.28 The model was developed following the current guidelines.29,30 To ensure our model is appropriate for routine use in clinical practice,31 we used internal-external validation in multi-site settings, highlighting the extent to which the developed model can be generalized to data from plausibly related settings.32

Methods

Study Design and Participants

Participants were recruited and assessed as part of the incidence and first episode case-control study, conducted as part of the EUropean network of national schizophrenia networks investigating Gene-Environment Interactions (EU-GEI) study,27 which comprises the largest multi-site study of psychotic disorders ever conducted. The EU-GEI study was conducted between May 2010 and April 2015 in tightly defined catchment areas in 17 sites across 6 countries: the UK, The Netherlands, France, Spain, Italy, and Brazil.33 The research sites within each country were purposefully selected to include a mix of urban and rural areas.27,33 It is noteworthy that the combined incidence and case-control methodology allowed us to account for any potential selection biases amongst the recruited and assessed cases.

Ethical Approval.

All participants who agreed to take part in the case–control study provided informed, written consent following full explanation of the study. Ethical approval for the study was provided by relevant local research ethics committees in each of the study sites.27,33

Ascertainment of Cases.

The inclusion criteria for FEP cases were: (1) presentation with a clinical diagnosis for an untreated FEP as defined by International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10) criteria27 (codes F20-F33) within the timeframe of the study; (2) aged between 18 and 64 years (inclusive); and (3) resident within one of the 17 defined catchment areas at the time of their first presentation to psychiatric services for psychosis. Exclusion criteria were: (1) a previous contact with specialist mental health services for psychotic symptoms outside of the study period at each site; (2) evidence of psychotic symptoms precipitated by an organic cause (ICD-10: F09); (3) transient psychotic symptoms resulting from acute intoxication (F1x.5); (4) severe learning disabilities, defined by an IQ less than 50 or diagnosis of intellectual disability (F70–F79); and (5) insufficient fluency in the primary language at each site to complete assessments.27

Ascertainment of Controls.

To better reflect the source population from which the cases arose, controls were recruited based on random and quota sampling that considered the distribution of age, sex, and ethnicity in each region. Inclusion criteria for controls were (1) age 18–64 years; (2) resident in the distinctly defined catchment region; (3) adequate fluency of the primary language used in each site; (4) no history of current or past psychiatric disorders.27,33 The individuals who were recruited as controls for the study were broadly representative of local populations in relation to age, gender, and ethnicity.27 Individuals who agreed to take part were screened for a history of psychosis. Those who reported previous or current treatment for psychosis were excluded.27

Predictors.

Following a previous research protocol,34 we excluded variables which had a high collinearity with other variables, and/or had > 50% missing values. Overall, 96 predictors related to participants’ sociodemographic circumstances, childhood adversity, life events experienced in adulthood and substance use were included in the model development (Supplementary table 1). Information on these predictors was collected using previously validated tools with a good inter-rater reliability and structured, standardized format across sites.27,33,35

Statistical Analysis

The process of model development, evaluation and validation was carried out according to methodological standards outlined by Steyerberg et al25; results were reported according to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines36; the completed checklist is provided in Supplementary table 2. All analyses were performed in RStudio release 4.02.37

Sample Size Calculations.

To calculate the sample size in the present study, we utilized the guidelines for sample size calculations when developing prediction models for binary outcomes.38 These guidelines require accounting not only for the number of events relative to the number of candidate predictors, which is a well-known rule of thumb for the required sample size,39 but also for the total number of participants, the outcome proportion (incidence) in the study population, and the expected predictive performance of the model. This information is used to tailor sample size requirements to the specific setting of interest, with the aim of minimizing the potential for model overfitting while targeting precise estimates of key parameters.38 Accordingly, to assess if our sample size was large enough to develop a robust diagnostic prediction model, we calculated the required sample size according to these guidelines38 considering several parameters (Supplementary Material). Assuming a Nagelkerke R² of 0.15, that is, a Cox–Snell R² of R²CS = 0.15 × max(R²CS),38 the sample size required for the model development was n = 2816, corresponding to 14.67 events per predictor. Consequently, our sample size of n = 2627 was slightly below the requirement; however, this sample size calculation did not consider regularization, which reduces the risk of overfitting.

Imputation of Missing Values.

In the present study, some variables had missing values (Supplementary table 4), with average missingness in the entire sample of approximately 12.9%. To avoid using an unrepresentative sample of complete cases that may result in incorrect risk predictions,40,41 we imputed missing values employing missForest, which is an imputation method based on random forest that handles continuous and categorical variables equally well and accommodates non-linear relation structures.42,43 As recommended for prediction models, the outcome was included in the imputation process.44 The distributions of the variables included in the analyses before and after the imputation are presented in Supplementary table 3, showing that the imputed values were very closely aligned with the observed values across all variables used in the analyses.

Variable Selection and Model Fitting.

To build the diagnostic model and identify which of the included predictors were important for classifying individuals with FEP and controls, we applied a binary logistic model with regularization by the least absolute shrinkage and selection operator (LASSO).45 LASSO entails fitting a model that imposes a penalty (λ) on the size of the regression parameter estimates, shrinking them towards zero46,47; it thereby simultaneously selects predictors, estimates their effects, and introduces parsimony. Therefore, if a suitable λ is chosen, LASSO intrinsically performs predictor selection and deals with collinearity. Selection of the tuning parameter λ that optimizes the model performance is described below.
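The shrinkage-and-selection behavior described above can be illustrated with a minimal sketch. This is not the authors' implementation (the analysis was run in R); it is a pure-Python proximal-gradient fit of an L1-penalized logistic model on synthetic data, and the function names, step size, iteration count, and data are all assumptions made for illustration.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=300):
    """Minimise mean log-loss + lam * sum(|beta_j|) by proximal gradient descent.

    X is a list of rows, y a list of 0/1 labels. The intercept is not penalised.
    """
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * p
        for row, label in zip(X, y):
            linear = intercept + sum(b * x for b, x in zip(beta, row))
            residual = 1.0 / (1.0 + math.exp(-linear)) - label
            grad0 += residual / n
            for j in range(p):
                grad[j] += residual * row[j] / n
        intercept -= step * grad0
        # Gradient step, then soft-thresholding: this is where selection happens.
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return intercept, beta


# Synthetic data (hypothetical): only the first of three predictors carries signal.
random.seed(1)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [1 if random.random() < 1 / (1 + math.exp(-1.5 * row[0])) else 0 for row in X]

_, beta_mild = fit_lasso_logistic(X, y, lam=0.02)   # light penalty
_, beta_strong = fit_lasso_logistic(X, y, lam=1.0)  # heavy penalty
print("lambda = 0.02:", [round(b, 3) for b in beta_mild])
print("lambda = 1.00:", [round(b, 3) for b in beta_strong])
```

With a heavy penalty every coefficient is held at exactly zero, while a light penalty lets the informative predictor enter the model; intermediate values of λ trade sparsity against fit.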

Model Estimation.

The tuning parameter λ that optimized the area under the receiver operating characteristic curve (AUROC) was chosen from a grid of 100 λ values through repeated 10-fold cross-validation (CV).45 10-fold CV divided the data randomly into 10 non-overlapping partitions; participants in 9 of the partitions were used as the training sample, and the remaining individuals as the test sample. To reduce the variance of CV, 10-fold CV was repeated 10 times, computing the AUROC for each λ value. A parsimonious model is desirable for practice48 and may generalize better to different populations,49 though often at the expense of a lower predictive performance. Therefore, the model within a 3% tolerance of the maximum AUROC, which yields more parsimony with fewer irrelevant variables than the standard minimum lambda,46 was chosen as the final model.
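The selection rule above can be sketched in a few lines. The sketch assumes that "3% tolerance" means a relative tolerance, so the chosen λ is the largest one whose mean cross-validated AUROC is at least 97% of the best value; the function names and the AUROC values in the usage example are hypothetical.

```python
import random


def repeated_kfold(n, k=10, repeats=10, seed=0):
    """Yield (train_indices, test_indices) for k-fold CV repeated `repeats` times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]  # k non-overlapping partitions
        for held_out in range(k):
            test = folds[held_out]
            train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
            yield train, test


def select_lambda_within_tolerance(mean_auroc_by_lambda, tolerance=0.03):
    """Return the largest lambda whose mean CV AUROC is within a relative
    `tolerance` of the best mean AUROC (larger lambda means a sparser model)."""
    best = max(mean_auroc_by_lambda.values())
    cutoff = best * (1.0 - tolerance)
    eligible = [lam for lam, auc in mean_auroc_by_lambda.items() if auc >= cutoff]
    return max(eligible)


# Hypothetical mean AUROCs over the repeated folds for four candidate lambdas.
cv_results = {0.001: 0.870, 0.01: 0.868, 0.05: 0.850, 0.1: 0.800}
chosen_lambda = select_lambda_within_tolerance(cv_results)
print("chosen lambda:", chosen_lambda)
```

In a full analysis the dictionary would hold one mean AUROC per grid value, averaged over all 100 train/test splits produced by `repeated_kfold`.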

Model Performance.

The model’s accuracy was measured with discrimination and calibration. Discrimination indicates how well a model separates individuals who experienced an event from those who did not; we assessed discrimination using AUROC, where a value of 0.5 indicates that a model does not discriminate better than chance, while 1 indicates that a model discriminates perfectly.50 Calibration, assessed via the calibration slope β, which should ideally be 1, and the calibration-in-the-large α, which should ideally be zero, describes how well the predicted risk corresponds to the risk from the observed data51,52 and can be described as a measure of bias in a model.53 We present calibration graphically by placing the estimated and actual outcome risk on the horizontal and vertical axes, respectively. We further measured the prediction accuracy of our model with sensitivity and specificity. To classify an individual as high or low risk based on a prediction model, a cut-off for the predicted probability (ie, “decision threshold”)25 is needed. Rather than using the traditional 50% cut-off, which follows the often incorrect assumption that false positives and false negatives are equally important,25 we selected the decision threshold by maximizing the sum of the model’s sensitivity and specificity to limit the false positives, which are unavoidable.54 This entailed selecting the decision threshold that maximized the combined correct classification rates for cases and controls, that is, the point on the receiver operating characteristic curve farthest from the chance line.55
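The discrimination and threshold measures described above can be sketched as follows. This is an illustrative pure-Python version, not the authors' code: AUROC is computed as the proportion of case-control pairs ranked correctly, and the decision threshold is the cut-off that maximizes sensitivity plus specificity. The function names and the toy data are assumptions.

```python
def auroc(y_true, scores):
    """Probability that a random case scores higher than a random control
    (ties count as half)."""
    cases = [s for s, t in zip(scores, y_true) if t == 1]
    controls = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def sensitivity_specificity(y_true, scores, threshold):
    """Classify as high risk when score >= threshold; return (sensitivity, specificity)."""
    tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
    fn = sum(1 for t, s in zip(y_true, scores) if t == 1 and s < threshold)
    tn = sum(1 for t, s in zip(y_true, scores) if t == 0 and s < threshold)
    fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(y_true, scores):
    """Cut-off maximising sensitivity + specificity (lowest cut-off wins ties)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: sum(sensitivity_specificity(y_true, scores, c)))


# Toy data (hypothetical): 3 cases and 5 controls with predicted risks.
y_demo = [0, 0, 0, 0, 1, 0, 1, 1]
p_demo = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.7]

cut = best_threshold(y_demo, p_demo)
print("AUROC:", auroc(y_demo, p_demo))
print("threshold:", cut, "sens/spec:", sensitivity_specificity(y_demo, p_demo, cut))
```

Scanning every observed score as a candidate cut-off is quadratic in the sample size, which is acceptable for a sketch; production code would sort once and sweep.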

Model Validation.

We validated the developed model using internal-external cross-country validation, which allows quantifying the generalizability of a prediction model across different settings and populations.56 Specifically, having six countries, we grouped participants by country and redeveloped the model, repeating every step of estimating and selecting candidate predictors, in five of the six countries. We then evaluated the resulting model using the data from the remaining country measuring discrimination and calibration as described above. We repeated this validation algorithm six times until each country was used as a validation sample and reported the mean as general performance estimates. The full model equation is presented in Supplementary materials to enable the essential independent external validation.57
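The leave-one-country-out loop described above can be sketched generically. The `fit` and `evaluate` callables stand in for the full modelling pipeline (imputation, tuning, LASSO fitting) and the performance measures; they, the record layout, and the toy data are hypothetical placeholders rather than the study's procedure.

```python
def internal_external_cv(records, fit, evaluate, group_key="country"):
    """Hold out each group in turn: refit on the others, evaluate on the held-out one."""
    results = {}
    for held_out in sorted({r[group_key] for r in records}):
        train = [r for r in records if r[group_key] != held_out]
        test = [r for r in records if r[group_key] == held_out]
        model = fit(train)  # every modelling step is repeated inside the loop
        results[held_out] = evaluate(model, test)
    return results


def fit_prevalence(train):
    """Toy 'model' (assumption): predict the training-set outcome proportion for everyone."""
    return {"risk": sum(r["y"] for r in train) / len(train), "n_train": len(train)}


def evaluate_prevalence(model, test):
    """Compare the predicted risk with the observed proportion in the held-out group."""
    observed = sum(r["y"] for r in test) / len(test)
    return {"n_train": model["n_train"], "n_test": len(test),
            "predicted": model["risk"], "observed": observed}


# Tiny synthetic data set (hypothetical), three countries only.
records = [
    {"country": "UK", "y": 1}, {"country": "UK", "y": 0}, {"country": "UK", "y": 0},
    {"country": "Spain", "y": 1}, {"country": "Spain", "y": 1}, {"country": "Spain", "y": 0},
    {"country": "Brazil", "y": 0}, {"country": "Brazil", "y": 1},
]

cv = internal_external_cv(records, fit_prevalence, evaluate_prevalence)
for country, summary in cv.items():
    print(country, summary)
```

The key design point is that the held-out country never influences model development, so each evaluation mimics applying the model in a new setting.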

Results

Study Participants

The sample characteristics are provided in table 1. The sample comprised 2627 participants; of these, 43.0% (n = 1130) had a diagnosis of FEP and 57.0% (n = 1497) were participants without FEP disorders. The average age of participants with a diagnosis of FEP at the time of the assessment was 31.3 years (SD = 10.6); 61.7% (n = 697) were men and 63.3% (n = 715) were of white ethnicity. Compared to the FEP group, participants without FEP were older (mean = 36.1 years, SD = 12.9); 47.2% (n = 706) of them were men and 78.7% (n = 1178) were of white ethnicity. Of the entire sample, 22.2% (n = 582) were recruited from the UK, 15.5% (n = 406) from The Netherlands, 16.2% (n = 426) from Spain, 9.6% (n = 252) from France, 17.8% (n = 467) from Italy, and 18.8% (n = 494) from Brazil. The participants recruited across all countries were comparable in terms of age, gender, and relationship status at the time of the assessment (table 2).

Table 1.

Characteristics of Participants in the Study Population

| Characteristic | Overall (n = 2627): n/mean | %/SD | FEP (n = 1130, 43.0%): n/mean | %/SD | Control (n = 1497, 57.0%): n/mean | %/SD |
|---|---|---|---|---|---|---|
| Age at assessment (years) | 34.0 | 12.2 | 31.3 | 10.6 | 36.1 | 12.9 |
| Gender | | | | | | |
|  Male | 1403 | 53.4 | 697 | 61.7 | 706 | 47.2 |
|  Female | 1224 | 46.6 | 433 | 38.3 | 791 | 52.8 |
| Education (years) | 14.0 | 4.3 | 12.9 | 4.2 | 14.7 | 4.2 |
| Ever been employed | 2402 | 91.6 | 999 | 88.6 | 1403 | 93.8 |
| Never been in a long-term relationship | 507 | 19.3 | 345 | 30.5 | 162 | 10.8 |
| Ethnicity | | | | | | |
|  White | 1893 | 72.1 | 715 | 63.3 | 1178 | 78.7 |
|  Black | 380 | 14.5 | 235 | 20.8 | 145 | 9.7 |
|  Other | 353 | 13.4 | 180 | 15.9 | 173 | 11.6 |
| Country | | | | | | |
|  The UK | 582 | 22.2 | 246 | 21.8 | 336 | 22.4 |
|  The Netherlands | 406 | 15.5 | 196 | 17.3 | 210 | 14.0 |
|  Spain | 426 | 16.2 | 204 | 18.1 | 222 | 14.8 |
|  France | 252 | 9.6 | 105 | 9.3 | 147 | 9.8 |
|  Italy | 467 | 17.8 | 187 | 16.5 | 280 | 18.7 |
|  Brazil | 494 | 18.8 | 192 | 17.0 | 302 | 20.2 |

Note: FEP, first-episode psychosis; SD, standard deviation.

Table 2.

Characteristics of Participants in the Study Population, by Country

| Characteristic | The UK (n = 582, 22.2%): n/mean | %/SD | The Netherlands (n = 406, 15.5%): n/mean | %/SD | Spain (n = 426, 16.2%): n/mean | %/SD | France (n = 252, 9.6%): n/mean | %/SD | Italy (n = 467, 17.8%): n/mean | %/SD | Brazil (n = 494, 18.8%): n/mean | %/SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Case status | | | | | | | | | | | | |
|  FEP | 246 | 42.3 | 196 | 48.3 | 204 | 47.9 | 105 | 41.7 | 187 | 40.0 | 192 | 38.9 |
|  Healthy control | 336 | 57.7 | 210 | 51.7 | 222 | 52.1 | 147 | 58.3 | 280 | 60.0 | 302 | 61.1 |
| Age at assessment (years) | 33.5 | 11.9 | 34.4 | 13.3 | 34.9 | 11.5 | 35.9 | 13.5 | 33.1 | 11.4 | 33.3 | 12.1 |
|  Median (IQR) | 31 | 17 | 31 | 22 | 34 | 18 | 33 | 22 | 31 | 19 | 30 | 14 |
| Gender | | | | | | | | | | | | |
|  Male | 315 | 54.1 | 235 | 57.9 | 188 | 44.1 | 134 | 53.2 | 228 | 48.8 | 253 | 51.2 |
|  Female | 267 | 45.9 | 171 | 42.1 | 238 | 55.9 | 118 | 46.8 | 239 | 51.2 | 241 | 48.8 |
| Education (years) | 15.3 | 3.4 | 16.3 | 3.8 | 13.4 | 4.5 | 13.3 | 3.7 | 14.0 | 3.9 | 11.2 | 4.3 |
| Ever been employed | 539 | 92.6 | 398 | 98.0 | 385 | 90.8 | 229 | 91.2 | 396 | 85.2 | 455 | 92.1 |
| Never been in a long-term relationship | 110 | 18.9 | 94 | 23.2 | 91 | 21.4 | 35 | 13.9 | 89 | 19.1 | 88 | 17.8 |
| Ethnicity | | | | | | | | | | | | |
|  White | 348 | 59.8 | 292 | 72.1 | 382 | 89.7 | 143 | 56.7 | 434 | 92.9 | 294 | 59.5 |
|  Black | 170 | 29.2 | 57 | 14.1 | 9 | 2.1 | 81 | 32.1 | 12 | 2.6 | 51 | 10.3 |
|  Other | 64 | 11.0 | 56 | 13.8 | 35 | 8.2 | 28 | 11.1 | 21 | 4.5 | 149 | 30.2 |

Note: FEP, first-episode psychosis; N, number of participants; IQR, interquartile range; df, degrees of freedom.

Diagnostic Model

Internally-externally validated performance of our model is presented in table 3. The model included 22 (22.9% of n = 96) predictor variables (Supplementary table 4). The model’s apparent performance is presented in Supplementary table 5. Following the model’s validation, very good discrimination was observed across all countries (AUROC range = 0.84–0.86). The calibration intercept (α) was close to 0 for all countries (range = −0.24 to 0.23) except The Netherlands (calibration intercept [α] = 0.56). Calibration slope (β) was closest to the ideal value of 1 for the samples obtained in Brazil (calibration slope [β] = 1.11), Italy (calibration slope [β] = 0.85), and Spain (calibration slope [β] = 1.22), indicating an excellent model prediction accuracy; though, calibration slopes (β) for France, the UK, and The Netherlands were relatively high (1.77, 1.71, and 1.61, respectively), suggesting that the model for these sites slightly overestimates risks. Calibration plots showed good agreement between observed and expected risk at predicted probabilities across all countries (figure 1). Using a cut-off point of 39.8%, our model was able to discriminate adults with FEP from controls with good sensitivity (range = 73.9–78.0%) and specificity (range = 75.6–79.3%).

Table 3.

Internally–Externally Validated Performance of the Prediction Model of FEP

| Internally-externally validated performance | The UK | The Netherlands | Spain | France | Italy | Brazil |
|---|---|---|---|---|---|---|
| Sample size, n | 582 | 406 | 426 | 252 | 467 | 494 |
| Number of outcome events, n | 246 | 196 | 204 | 105 | 187 | 192 |
| Proportion of outcome events | 0.42 | 0.48 | 0.48 | 0.42 | 0.40 | 0.39 |
| AUROC (95% CI) | 0.84 (0.82–0.85) | 0.84 (0.82–0.85) | 0.84 (0.83–0.86) | 0.84 (0.82–0.85) | 0.86 (0.84–0.87) | 0.85 (0.83–0.87) |
| Calibration intercept (α) | 0.23 | 0.56 | 0.14 | 0.13 | −0.24 | −0.19 |
| Calibration slope (β) | 1.71 | 1.61 | 1.22 | 1.77 | 0.85 | 1.11 |
| Sensitivity | 75.0% | 73.9% | 74.0% | 74.8% | 77.6% | 78.0% |
| Specificity | 75.6% | 79.3% | 78.1% | 77.0% | 76.7% | 76.8% |

Note: AUROC, area under the receiver operating characteristic curve; CI, confidence intervals.
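Calibration intercepts and slopes such as those in table 3 are conventionally obtained by regressing the observed outcome on the logit of the predicted risk. The sketch below shows one common way to do this in pure Python with Newton-Raphson updates; it is an assumption about the general technique, not a reproduction of the authors' R code, and the simulated data are hypothetical.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def calibration_in_the_large(y, p_hat, n_iter=50):
    """Intercept alpha in logit(P(y=1)) = alpha + logit(p_hat), slope fixed at 1."""
    lp = [logit(p) for p in p_hat]
    alpha = 0.0
    for _ in range(n_iter):
        mu = [expit(alpha + l) for l in lp]
        gradient = sum(t - m for t, m in zip(y, mu))
        hessian = sum(m * (1.0 - m) for m in mu)
        alpha += gradient / hessian
    return alpha


def calibration_slope(y, p_hat, n_iter=50):
    """Slope b in logit(P(y=1)) = a + b * logit(p_hat), fitted by Newton-Raphson."""
    lp = [logit(p) for p in p_hat]
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        mu = [expit(a + b * l) for l in lp]
        w = [m * (1.0 - m) for m in mu]
        g0 = sum(t - m for t, m in zip(y, mu))
        g1 = sum((t - m) * l for t, m, l in zip(y, mu, lp))
        h00 = sum(w)
        h01 = sum(wi * l for wi, l in zip(w, lp))
        h11 = sum(wi * l * l for wi, l in zip(w, lp))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det  # solve the 2x2 Newton system
        b += (h00 * g1 - h01 * g0) / det
    return b


# Simulated validation set (hypothetical): outcomes follow the 'true' risks.
random.seed(7)
true_lp = [random.gauss(0.0, 1.5) for _ in range(2000)]
y_sim = [1 if random.random() < expit(l) else 0 for l in true_lp]
p_exact = [expit(l) for l in true_lp]        # perfectly calibrated predictions
p_halved = [expit(0.5 * l) for l in true_lp]  # same ranking, log-odds halved

slope_exact = calibration_slope(y_sim, p_exact)
slope_halved = calibration_slope(y_sim, p_halved)
alpha_exact = calibration_in_the_large(y_sim, p_exact)
print("slope (exact):", round(slope_exact, 2), "slope (halved):", round(slope_halved, 2))
print("calibration-in-the-large (exact):", round(alpha_exact, 2))
```

Because halving the predicted log-odds is only a rescaling of the regressor, the fitted slope for those predictions is twice the slope for the exact ones, while discrimination is unchanged.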

Fig. 1.

Internally–externally validated calibration plots for the prediction model of FEP in the United Kingdom, the Netherlands, France, Spain, Italy, and Brazil. Squares illustrate risk groups by fourths of equally spaced model-estimated risks, through which a straight line was fitted (red). The smoothed loess curve (orange) was fitted based on individual data points. The 45° line (gray) represents perfect calibration, where model-estimated risk equals actual risk. The histogram on the upper margin represents the distribution of model-estimated risks.
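The grouped points in such a calibration plot can be computed as sketched below. The sketch assumes that "fourths of equally spaced model-estimated risks" means four equal-width risk bands between 0 and 1; the function name and the toy data are hypothetical.

```python
def grouped_calibration(y, p_hat, n_groups=4):
    """Split [0, 1] into equal-width risk bands and return, for each non-empty band,
    (mean predicted risk, observed outcome proportion, band size)."""
    bands = [[] for _ in range(n_groups)]
    for outcome, p in zip(y, p_hat):
        index = min(int(p * n_groups), n_groups - 1)  # p == 1.0 goes in the top band
        bands[index].append((p, outcome))
    points = []
    for band in bands:
        if band:
            mean_predicted = sum(p for p, _ in band) / len(band)
            observed = sum(o for _, o in band) / len(band)
            points.append((mean_predicted, observed, len(band)))
    return points


# Toy data (hypothetical): eight participants with predicted risks and outcomes.
y_toy = [0, 0, 1, 0, 1, 1, 1, 1]
p_toy = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

calibration_points = grouped_calibration(y_toy, p_toy)
for mean_predicted, observed, size in calibration_points:
    print(f"predicted {mean_predicted:.2f}  observed {observed:.2f}  n = {size}")
```

Points lying on the 45° line have matching predicted and observed risk; points above it indicate that the observed risk in that band exceeds the predicted risk.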

Predictor Variables.

Several predictor variables selected in the model (Supplementary table 4), such as educational attainment (unstandardized β = 0.035), being foreign-born (unstandardized β = 0.113), and childhood experiences including not having peers to go to (unstandardized β = 0.264), experiences of prolonged loneliness (unstandardized β = 0.152) and running away from home (unstandardized β = 0.052) existed before FEP onset. Thus, these variables may be seen as potentially causative predictors for developing FEP. Other important contributing factors for diagnostic risk for FEP were being unemployed (unstandardized β = 0.671), being single (unstandardized β = 0.577), having problems with the police (unstandardized β = 0.484) or having difficulties at work (unstandardized β = 0.528), using more cannabis than intended (unstandardized β = 0.523), daily cigarette smoking (unstandardized β = 0.590), and using other substances, such as cocaine (unstandardized β = 0.227). A worked example of calculating an individualized risk for FEP is provided in Supplementary Material.
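To show how coefficients of this kind combine, the sketch below uses only the unstandardized β values quoted in the paragraph above. The model intercept and the remaining predictors are given in the Supplementary Material and are not reproduced here, so the sketch does not compute an actual individual risk; it compares two hypothetical profiles through their odds ratio, in which the intercept cancels. The dictionary keys and function names are illustrative assumptions.

```python
import math

# Unstandardized coefficients quoted in the text (log-odds scale); key names are assumed.
REPORTED_BETAS = {
    "unemployed": 0.671,
    "single": 0.577,
    "problems_with_police": 0.484,
    "difficulties_at_work": 0.528,
    "more_cannabis_than_intended": 0.523,
    "daily_cigarette_smoking": 0.590,
    "cocaine_use": 0.227,
}


def predicted_risk(intercept, betas, profile):
    """Logistic model: risk = 1 / (1 + exp(-(intercept + sum(beta_j * x_j)))).
    The intercept must come from the full published model equation."""
    linear = intercept + sum(betas[name] * value for name, value in profile.items())
    return 1.0 / (1.0 + math.exp(-linear))


def odds_ratio_between(betas, profile_a, profile_b):
    """Odds of profile_a relative to profile_b; the intercept cancels out."""
    difference = sum(
        betas[name] * (profile_a.get(name, 0) - profile_b.get(name, 0)) for name in betas
    )
    return math.exp(difference)


# Hypothetical comparison: unemployed and single versus neither, all else equal.
profile_a = {"unemployed": 1, "single": 1}
profile_b = {}
relative_odds = odds_ratio_between(REPORTED_BETAS, profile_a, profile_b)
print("odds ratio:", round(relative_odds, 2))
```

An absolute risk for an individual would additionally require the intercept and all 22 coefficients from the full model equation, passed to `predicted_risk`.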

Discussion

To our knowledge, this is the first study to develop, fully evaluate, and validate a diagnostic model for classifying FEP based on a profile of 22 personal characteristics and lifestyle factors, using data from 17 research sites across 6 countries. We followed the current guidelines for model development, evaluation, and validation,29,30 and our results indicate that classification of FEP and controls is possible with high predictive accuracy across the UK, The Netherlands, Spain, France, Italy, and even Brazil. To maximize the predictive accuracy,58 we catered for incomplete data, which is a common but serious limitation in psychiatric research that is generally not addressed sufficiently.40,41 The fact that the data ascertainment for this study was carried out in major urban and rural sites with heterogeneous populations27 suggests that the validity of our model may extend to other centers with similar population profiles. Sensitivity and specificity are measures of the accuracy of a model and are among the fundamental measures for understanding the utility of clinical tests. Sensitivity refers to how good a test is at correctly identifying people who have the disease, whereas the specificity of a clinical test refers to the ability of the test to correctly identify those without the disease. Our results demonstrate that our model has a high sensitivity and high specificity, implying that it will accurately classify many adults who are disease free as low risk without requiring further investigation. Because the model does not require any laboratory testing or clinical measurements, it could be easily integrated into electronic case-registers to facilitate the automatic and individualized diagnostic identification of FEP based on electronic or clinical records.

The predictor variables that were selected by our model, such as lower level of educational attainment, childhood adversity and stressful life events in adulthood,16,59–61 and cannabis use62–65 have previously been linked to FEP risk and probably occurred before the onset of psychoses. Accordingly, our results reiterate the important role these experiences play in increasing risk for FEP, providing avenues for prevention strategies.66 Importantly, some of these factors, such as cannabis use, childhood adversities, and educational attainment, are potentially preventable with the right interventions.66 For example, using the same data as in the present study, it was shown that 24% of FEP cases would have been prevented if no one in the population consumed cannabis of high potency.63 Our results further reiterate that better educational attainment may protect from FEP risk, perhaps via more effective coping strategies, healthier behaviors, and social relationships.67–69 For many, however, FEP develops during a period critical to the consolidation of life skills,70 which may result in an individual being unable to obtain qualifications after illness onset.71 It is, therefore, imperative to provide people with FEP access to supported education programs to (re)-engage them in the workforce.72 The confirmation of these factors as pivotal in the development of FEP further supports the long-term benefits of reducing exposure to these risk factors for psychosis; though this is likely to be challenging considering that the pathogenic mechanism underlying the link between some of these risk factors and psychosis is not fully understood.15 Furthermore, it may be very difficult to diminish exposure to some risk factors, for example, child abuse or migration; though, an obvious place to start is by attempting to reduce society’s consumption of high-potency cannabis through public education.62

Nonetheless, because all individuals with a diagnosis of FEP in the present study were already under the care of mental health services upon recruitment, it may be argued that there is a window of missed opportunity for detection of FEP before the illness onset using this model. An alternative approach is to develop a prognostic model that will aim to estimate an individual risk for FEP onset among young help-seeking people who have been identified as at clinical high risk, which is a state characterized by either “attenuated” psychotic symptoms, or full-blown psychotic symptoms that are brief and self-limiting. While these prognostic risk estimation studies offer a promise for detecting young people who are at high risk for converting to FEP from experiencing suboptimal symptoms, those young adults who have been classified as “at clinical high risk” are not representative of those who develop FEP in terms of socio-economic status, life experiences, and ethnic composition.11,15,66 In contrast, our model was developed specifically for true FEP cases, which ensures its generalizability to the wider FEP communities.

Calibration slope for our prediction diagnostic model, however, showed a presence of some overfitting when applied specifically to participants from France, the UK, and The Netherlands suggesting that for the patients recruited in these countries estimated risks may be too high for those who are at high risk and too low for those who are at low risk; though, the observed calibration slope estimates were still within a range of previously reported models.30,73 It may be argued that calibration slope would have been better if more complex machine learning methods, such as support vector machines, had been used,26 or by employing more complex predictors, such as neuroanatomical biomarkers.14 Nonetheless, there is no evidence to suggest that more complex models or encompassing biomarkers lead to significant improvements in prediction accuracy even at the expense of reduced interpretability compared with simpler statistical models.14,74,75

The present study is not without limitations. Because we developed the model on a case-control sample, the proportion of cases in our data differs considerably from the true prevalence of FEP in the general population. Although it may be argued that the percentage of missing values across variables might have affected the imputations and introduced some bias in the estimated effects in the model, the proportion of missingness in the present study was comparable to that in many longitudinal datasets3–5,60 and within the range that missForest handles efficiently.76 As with many risk models, we only accounted for baseline variables, although for many time-varying factors exposure status may change over time.77 However, using baseline variables reflects the real-life clinical information available to a physician and a participant when they need to make decisions on the likely risk of developing FEP disorders. Even though participants without a diagnosis of FEP were, on average, older than participants with FEP, the age of onset of FEP in our sample was consistent with many studies of FEP conducted across Europe and other continents.78–85 A higher proportion of our participants with FEP were from ethnic minorities when compared to healthy controls. However, psychiatric epidemiology has consistently demonstrated that the incidence rates of psychotic disorders are considerably elevated among those of Black ethnicity residing in the UK compared to the host population.80,86–88 Therefore, it is expected that individuals with FEP will differ greatly in ethnicity from adults without an FEP diagnosis.

It may be argued that many diagnostic categories assigned to patients on first contact with mental health services may be either provisional or likely to change over the illness course.89 Nevertheless, in the present study we focused on the baseline diagnosis to emulate the naturalistic setting for all patients with FEP when predicting their onset. There is further robust meta-analytical evidence for high prospective diagnostic stability in schizophrenia spectrum and affective spectrum psychoses over the course of the illness.90,91 Nonetheless, it is plausible that there are different predictors for affective versus non-affective psychosis. Thus, further modeling approaches may be necessary to investigate this possibility in the future. Finally, in the present study we developed the model for FEP rather than for individual diagnoses, such as bipolar disorder or depression with psychotic features, because many diagnostic categories assigned to patients on first contact with mental health services may be either provisional or likely to change over the illness course.89
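
The first limitation above, that the case fraction in a case-control sample differs from the population prevalence, can be illustrated with a standard intercept (prior) correction for logistic models. The sketch below is an assumption-laden illustration and not a step reported in this study: the function name is hypothetical, and the case fraction (0.40) and target prevalence (0.01) are made-up values. It assumes sampling depended on case status only, so that the predictor effects are unaffected and only the intercept is shifted.

```python
import math


def adjust_risk_for_prevalence(p_sample, case_fraction, prevalence):
    """Rescale a risk estimated in a case-control sample to a target prevalence.

    The log-odds are shifted by the difference between the log-odds of being a
    case in the sample and the log-odds of being a case in the population.
    """
    offset = (math.log(case_fraction / (1.0 - case_fraction))
              - math.log(prevalence / (1.0 - prevalence)))
    lp = math.log(p_sample / (1.0 - p_sample)) - offset
    return 1.0 / (1.0 + math.exp(-lp))


# Hypothetical values, for illustration only.
case_fraction = 0.40   # share of cases in the development sample
prevalence = 0.01      # assumed prevalence in the target population

for p in (0.20, 0.40, 0.80):
    adjusted = adjust_risk_for_prevalence(p, case_fraction, prevalence)
    print(f"sample-scale risk {p:.2f} -> population-scale risk {adjusted:.4f}")
```

The ranking of individuals, and hence discrimination, is unchanged by this shift; only the absolute risks, and therefore calibration in a population with a different prevalence, are affected.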

Conclusions

Having employed modern statistical learning algorithms, we developed, evaluated, and validated a diagnostic model for classifying FEP that achieved good discrimination and calibration across five European countries and Brazil, supporting its robustness and good generalizability across FEP programs in different countries. This study, therefore, has important implications for the development of affordable and easy-to-administer standardized assessment batteries that can evaluate individuals’ risk of FEP in clinical settings across countries with similar characteristics of adults with FEP.

Supplementary Material

Supplementary data are available at Schizophrenia Bulletin Open online.

§EU-GEI collaborators

Kathryn Hubbard1, Stephanie Beards1, Doriana Cristofalo2, Mara Parellada3, Pedro Cuadrado4, José Juan Rodríguez Solano5, David Fraguas5, Álvaro Andreu-Bernabeu5, Angel Carracedo6, Enrique García Bernardo7, Laura Roldán3, Gonzalo López3, Silvia Amoretti8, Juan Nacher9, Paz Garcia-Portilla10, Javier Costas6, Estela Jiménez-López11, Mario Matteis3, Marta Rapado Castro3, Emiliano González3, Covadonga Martínez3, Emilio Sánchez7, Manuel Durán-Cutilla7, Nathalie Franke12, Fabian Termorshuizen13,14, Daniella van Dam12, Elsje van der Ven13,14, Elles Messchaart14, Marion Leboyer15,16,17,18, Franck Schürhoff15,16,17,18, Stéphane Jamain16,17,18, Grégoire Baudin15,16, Aziz Ferchiou15,16, Baptiste Pignon15,16,18, Jean-Romain Richard16,18, Thomas Charpeaud18,19,21, Anne-Marie Tronche18,19,21, Flora Frijda22, Daniele La Barbera22,23, Giovanna Marrazzo23, Lucia Sideli22, Crocettarachele Sartorio22,23, Laura Ferraro22, Fabio Seminerio22, Camila Marcelino Loureiro24,25, Rosana Shuhama24,25, Mirella Ruggeri2, Antonio LaSalvia2, Chiara Bonetto2

1Department of Health Service and Population Research, Institute of Psychiatry, King’s College London, De Crespigny Park, Denmark Hill, London, SE5 8AF, United Kingdom;
2Section of Psychiatry, Department of Neuroscience, Biomedicine and Movement, University of Verona, Piazzale L.A. Scuro 10, 37134 Verona, Italy;
3Department of Child and Adolescent Psychiatry, Hospital General Universitario Gregorio Marañón, School of Medicine, Universidad Complutense, IiSGM (CIBERSAM), C/Doctor Esquerdo 46, 28007 Madrid, Spain;
4Villa de Vallecas Mental Health Department, Villa de Vallecas Mental Health Centre, Hospital Universitario Infanta Leonor/Hospital Virgen de la Torre, C/San Claudio 154, 28038 Madrid, Spain;
5Puente de Vallecas Mental Health Department, Hospital Universitario Infanta Leonor/Hospital Virgen de la Torre, Centro de Salud Mental Puente de Vallecas, C/Peña Gorbea 4, 28018 Madrid, Spain;
6Fundación Pública Galega de Medicina Xenómica, Hospital Clínico Universitario, Choupana s/n, 15782 Santiago de Compostela, Spain;
7Department of Psychiatry, Hospital General Universitario Gregorio Marañón, School of Medicine, Universidad Complutense, IiSGM (CIBERSAM), C/Doctor Esquerdo 46, 28007 Madrid, Spain;
8Department of Psychiatry, Hospital Clinic, IDIBAPS, Universidad de Barcelona, C/Villarroel 170, 08036 Barcelona, Spain; Department of Psychiatry, Hospital Universitari Vall d’Hebron; Group of Psychiatry, Mental Health and Addictions, Psychiatric Genetics Unit, Vall d’Hebron Research Institute (VHIR), Barcelona, Spain; Biomedical Network Research Centre on Mental Health (CIBERSAM), Barcelona, Spain;
9Neurobiology Unit, Program in Neurosciences and Interdisciplinary Research Structure for Biotechnology and Biomedicine (BIOTECMED), Universitat de València, Burjassot, Spain. Biomedical Research Networking Centre in Mental Health (CIBERSAM), Madrid, Spain. Biomedical Research Institute INCLIVA, Valencia, Spain;
10Department of Medicine, Psychiatry Area, School of Medicine, Universidad de Oviedo, Centro de Investigación Biomédica en Red de Salud Mental (CIBERSAM), C/Julián Clavería s/n, 33006 Oviedo, Spain;
11Department of Psychiatry, Servicio de Psiquiatría Hospital “Virgen de la Luz,” C/Hermandad de Donantes de Sangre, 16002 Cuenca, Spain;
12Department of Psychiatry, Early Psychosis Section, Academic Medical Centre, University of Amsterdam, Meibergdreef 5, 1105 AZ Amsterdam, The Netherlands;
13Rivierduinen Centre for Mental Health, Leiden, Sandifortdreef 19, 2333 ZZ Leiden, The Netherlands;
14Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, South Limburg Mental Health Research and Teaching Network, Maastricht University Medical Centre, P.O. Box 616, 6200 MD Maastricht, The Netherlands;
15AP-HP, Groupe Hospitalier “Mondor,” Pôle de Psychiatrie, 51 Avenue de Maréchal de Lattre de Tassigny, 94010 Créteil, France;
16INSERM, U955, Equipe 15, 51 Avenue de Maréchal de Lattre de Tassigny, 94010 Créteil, France;
17Faculté de Médecine, Université Paris-Est, 51 Avenue de Maréchal de Lattre de Tassigny, 94010 Créteil, France;
18Fondation Fondamental, 40 Rue de Mesly, 94000 Créteil, France;
19CMP B CHU, BP 69, 63003 Clermont Ferrand, Cedex 1, France;
20EPS Maison Blanche, Paris 75020 France;
21Université Clermont Auvergne, EA 7280, Clermont-Ferrand, 63000, France;
22Department of Experimental Biomedicine and Clinical Neuroscience, Section of Psychiatry, University of Palermo, Via G. La Loggia n.1, 90129 Palermo, Italy;
23Unit of Psychiatry, “P. Giaccone” General Hospital, Via G. La Loggia n.1, 90129 Palermo, Italy;
24Departamento de Neurociências e Ciências do Comportamento, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Av. Bandeirantes, 3900 - Monte Alegre - CEP 14049-900, Ribeirão Preto, SP, Brasil;
25Núcleo de Pesquisa em Saúde Mental Populacional, Universidade de São Paulo, Avenida Doutor Arnaldo 455, CEP 01246-903, SP, Brasil

Contributor Information

Olesya Ajnakina, Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, University of London, London, UK; Department of Behavioural Science and Health, Institute of Epidemiology and Health Care, University College London, London, UK.

Ihsan Fadilah, Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, University of London, London, UK.

Diego Quattrone, Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK.

Celso Arango, Child and Adolescent Psychiatry Department, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, School of Medicine, Universidad Complutense, IiSGM, CIBERSAM, C/Doctor Esquerdo 46, 28007 Madrid, Spain.

Domenico Berardi, Department of Biomedical and Neuromotor Sciences, Psychiatry Unit, Alma Mater Studiorum Università di Bologna, Viale Pepoli 5, 40126 Bologna, Italy.

Miguel Bernardo, Department of Psychiatry, Barcelona Clinic Schizophrenia Unit, Neuroscience Institute, Hospital Clinic of Barcelona, University of Barcelona, IDIBAPS, CIBERSAM, Barcelona, Spain.

Julio Bobes, Faculty of Medicine and Health Sciences, Psychiatry, Universidad de Oviedo, ISPA, INEUROPA. CIBERSAM, Oviedo, Spain.

Lieuwe de Haan, Department of Psychiatry, Early Psychosis Section, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands.

Cristina Marta Del-Ben, Neuroscience and Behavior Department, Ribeirão Preto Medical School, University of São Paulo, São Paulo, Brazil.

Charlotte Gayer-Anderson, Department of Health Service and Population Research, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK.

Simona Stilo, Department of Mental Health and Addiction Services, ASP Crotone, Crotone, Italy; Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK.

Hannah E Jongsma, Centre for Transcultural Psychiatry Veldzicht, Balkbrug, The Netherlands; University Centre for Psychiatry, University Medical Centre Groningen, Groningen, The Netherlands.

Antonio Lasalvia, Section of Psychiatry, Department of Neuroscience, Biomedicine and Movement Sciences, University of Verona, Piazzale L.A. Scuro 10, 37134 Verona, Italy.

Sarah Tosato, Section of Psychiatry, Department of Neuroscience, Biomedicine and Movement Sciences, University of Verona, Piazzale L.A. Scuro 10, 37134 Verona, Italy.

Pierre-Michel Llorca, Université Clermont Auvergne, CMP-B CHU, CNRS, Clermont Auvergne INP, Institut Pascal, F-63000 Clermont-Ferrand, France.

Paulo Rossi Menezes, Department of Preventative Medicine, Faculdade de Medicina FMUSP, University of São Paulo, São Paulo, Brazil.

Bart P Rutten, Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, South Limburg Mental Health Research and Teaching Network, Maastricht University Medical Centre, P.O. Box 616, 6200 MD Maastricht, The Netherlands.

Jose Luis Santos, Department of Psychiatry, Servicio de Psiquiatría Hospital “Virgen de la Luz”, Cuenca, Spain.

Julio Sanjuán, Department of Psychiatry, Hospital Clínico Universitario de Valencia, INCLIVA, CIBERSAM, School of Medicine, Universidad de Valencia, Valencia, Spain.

Jean-Paul Selten, Rivierduinen Institute for Mental Health Care, Sandifortdreef 19, 2333 ZZ Leiden, The Netherlands.

Andrei Szöke, University of Paris Est Creteil, INSERM, IMRB, AP-HP, Hôpitaux Universitaires « H. Mondor », DMU IMPACT, Fondation FondaMental, F-94010 Creteil, France.

Ilaria Tarricone, Department of Medical and Surgical Sciences, Bologna University, Bologna, Italy.

Giuseppe D’Andrea, Department of Biomedical and Neuromotor Sciences, Psychiatry Unit, Alma Mater Studiorum Università di Bologna, Viale Pepoli 5, 40126 Bologna, Italy.

Andrea Tortelli, Etablissement Public de Santé Maison Blanche, Paris, France.

Eva Velthorst, Department of Psychiatry, Early Psychosis Section, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Peter B Jones, Department of Psychiatry, University of Cambridge, Herchel Smith Building for Brain and Mind Sciences, Forvie Site, Robinson Way, Cambridge, CB2 0SZ, UK; CAMEO Early Intervention Service, Cambridgeshire and Peterborough NHS Foundation Trust, Cambridge, CB21 5EF, UK.

Manuel Arrojo Romero, Department of Psychiatry, Psychiatric Genetic Group, Instituto de Investigación Sanitaria de Santiago de Compostela, Complejo Hospitalario s, Santiago de Compostela, Spain.

Caterina La Cascia, Department of Experimental Biomedicine and Clinical Neuroscience, University of Palermo, Via G. La Loggia 1, 90129 Palermo, Italy.

James B Kirkbride, Psylife Group, Division of Psychiatry, University College London, 6th Floor, Maple House, 149 Tottenham Court Road, London, W1T 7NF, UK.

Jim van Os, Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK; Department of Psychiatry, Brain Centre Rudolf Magnus, Utrecht University Medical centre, Utrecht, The Netherlands; Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, South Limburg Mental Health Research and Teaching Network, Maastricht University Medical Centre, P.O. Box 616, 6200 MD Maastricht, The Netherlands.

Michael O’Donovan, Division of Psychological Medicine and Clinical Neurosciences, MRC Centre for Neuropsychiatric Genetics and Genomics, Cardiff University, Cardiff CF24 4HQ, UK.

Craig Morgan, Department of Health Service and Population Research, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK.

Marta di Forti, Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK.

Robin M Murray, Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK; Department of Psychiatry, Experimental Biomedicine and Clinical Neuroscience, University of Palermo, Palermo, Italy.

EU-GEI WP2 Group, Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, University of London, London, UK.

Daniel Stahl, Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, University of London, London, UK.

EU-GEI WP2 Group:

Kathryn Hubbard, Stephanie Beards, Doriana Cristofalo, Mara Parellada, Pedro Cuadrado, José Juan Rodríguez Solano, David Fraguas, Álvaro Andreu-Bernabeu, Angel Carracedo, Enrique García Bernardo, Laura Roldán, Gonzalo López, Silvia Amoretti, Juan Nacher, Paz Garcia-Portilla, Javier Costas, Estela Jiménez-López, Mario Matteis, Marta Rapado Castro, Emiliano González, Covadonga Martínez, Emilio Sánchez, Manuel Durán-Cutilla, Nathalie Franke, Fabian Termorshuizen, Daniella van Dam, Elsje van der Ven, Elles Messchaart, Marion Leboyer, Franck Schürhoff, Stéphane Jamain, Grégoire Baudin, Aziz Ferchiou, Baptiste Pignon, Jean-Romain Richard, Thomas Charpeaud, Anne-Marie Tronche, Flora Frijda, Daniele La Barbera, Giovanna Marrazzo, Lucia Sideli, Crocettarachele Sartorio, Laura Ferraro, Fabio Seminerio, Camila Marcelino Loureiro, Rosana Shuhama, Mirella Ruggeri, Antonio LaSalvia, and Chiara Bonetto

Funding

OA is funded by the National Institute for Health Research (NIHR) (NIHR Post-Doctoral Fellowship—PDF-2018-11-ST2-020). IF is funded by an NIHR Predoctoral Fellowship (NIHR300493). MDF is funded by a Clinician Scientist Medical Research Council fellowship (project reference MR/M008436/1). DQ is funded by a Post-Doctoral Guarantors of Brain Clinical Fellowship. DS is part funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. JBK is supported by the National Institute for Health Research, University College London Hospital, Biomedical Research Centre. The EU-GEI Project is funded by the European Community’s Seventh Framework Programme under grant agreement No. HEALTH-F2-2010-241909 (Project EU-GEI). The Brazilian study was funded by the São Paulo Research Foundation under grant number 2012/0417-0. The views expressed in this publication are those of the authors and not necessarily those of the NHS, the National Institute for Health Research or the Department of Health and Social Care.

Conflict of Interest

R.M.M. has received honoraria from Janssen, Sunovion, Lundbeck and Otsuka. M.B. has been a consultant for, received grant/research support and honoraria from, and been on the speakers/advisory board of ABBiotics, Adamed, Angelini, Casen Recordati, Janssen-Cilag, Menarini, Rovi and Takeda. All other authors declare no conflict of interest.

References

  • 1. Tandon R, Nasrallah HA, Keshavan MS. Schizophrenia, “just the facts” 4. Clinical features and conceptualization. Schizophr Res. 2009;110(1–3):1–23.
  • 2. Zipursky RB, Reilly TJ, Murray RM. The myth of schizophrenia as a progressive brain disease. Schizophr Bull. 2013;39(6):1363–1372.
  • 3. Lally J, Ajnakina O, Stubbs B, et al. Remission and recovery from first-episode psychosis in adults: systematic review and meta-analysis of long-term outcome studies. Br J Psychiatry. 2017;211(6):350–358.
  • 4. Ajnakina O, Stubbs B, Francis E, et al. Hospitalisation and length of hospital stay following first-episode psychosis: systematic review and meta-analysis of longitudinal studies. Psychol Med. 2020;50(6):991–1001.
  • 5. Ajnakina O, Stubbs B, Francis E, et al. Employment and relationship outcomes in first-episode psychosis: a systematic review and meta-analysis of longitudinal studies. Schizophr Res. 2021;231:122–133.
  • 6. Perkins DO, Gu H, Boteva K, Lieberman JA. Relationship between duration of untreated psychosis and outcome in first-episode schizophrenia: a critical review and meta-analysis. Am J Psychiatry. 2005;162(10):1785–1804.
  • 7. Lieberman JA, Perkins D, Belger A, et al. The early stages of schizophrenia: speculations on pathogenesis, pathophysiology, and therapeutic approaches. Biol Psychiatry. 2001;50(11):884–897.
  • 8. McGorry P, Johanessen JO, Lewis S, et al. Early intervention in psychosis: keeping faith with evidence-based health care. Psychol Med. 2010;40(3):399–404.
  • 9. Craig TKJ, Garety P, Power P, et al. The Lambeth Early Onset (LEO) Team: randomised controlled trial of the effectiveness of specialised care for early psychosis. BMJ. 2004;329(7474):1067.
  • 10. Drake RJ, Haley CJ, Akhtar S, Lewis SW. Causes and consequences of duration of untreated psychosis in schizophrenia. Br J Psychiatry. 2000;177:511–515.
  • 11. Ajnakina O, Morgan C, Gayer-Anderson C, et al. Only a small proportion of patients with first episode psychosis come via prodromal services: a retrospective survey of a large UK mental health programme. BMC Psychiatry. 2017;17(1):308.
  • 12. Green CE, McGuire PK, Ashworth M, Valmaggia LR. Outreach and Support in South London (OASIS). Outcomes of non-attenders to a service for people at high risk of psychosis: the case for a more assertive approach to assessment. Psychol Med. 2011;41(2):243–250.
  • 13. Terry SF. Obama’s precision medicine initiative. Genet Test Mol Biomarkers. 2015;19(3):113–114.
  • 14. Salazar de Pablo G, Studerus E, Vaquerizo-Serrano J, et al. Implementing precision psychiatry: a systematic review of individualized prediction models for clinical practice. Schizophr Bull. 2021;47(2):284–297.
  • 15. Ajnakina O, David AS, Murray RM. “At risk mental state” clinics for psychosis—an idea whose time has come—and gone! Psychol Med. 2019;49(4):529–534.
  • 16. Ajnakina O, Lally J, Di Forti M, et al. Patterns of illness and care over the 5 years following onset of psychosis in different ethnic groups; the GAP-5 study. Soc Psychiatry Psychiatr Epidemiol. 2017;52(9):1101–1111.
  • 17. Fusar-Poli P, Salazar de Pablo G, Rajkumar RP, et al. Diagnosis, prognosis, and treatment of brief psychotic episodes: a review and research agenda. Lancet Psychiatry. 2022;9(1):72–83.
  • 18. Lieberman JA, Fenton WS. Delayed detection of psychosis: causes, consequences, and effect on public health. Am J Psychiatry. 2000;157(11):1727–1730.
  • 19. Rozycki M, Satterthwaite TD, Koutsouleris N, et al. Multisite machine learning analysis provides a robust structural imaging signature of schizophrenia detectable across diverse patient populations and within individuals. Schizophr Bull. 2018;44(5):1035–1044.
  • 20. Kalmady SV, Greiner R, Agrawal R, et al. Towards artificial intelligence in mental health by improving schizophrenia prediction with multiple brain parcellation ensemble-learning. Npj Schizophr. 2019;5(1):2.
  • 21. Cooper JD, Han SYS, Tomasik J, et al. Multimodel inference for biomarker development: an application to schizophrenia. Transl Psychiatry. 2019;9(1):83.
  • 22. Studerus E, Ramyead A, Riecher-Rössler A. Prediction of transition to psychosis in patients with a clinical high risk for psychosis: a systematic review of methodology and reporting. Psychol Med. 2017;47(7):1163–1178.
  • 23. D’Amico G, Malizia G, D’Amico M. Prognosis research and risk of bias. Intern Emerg Med. 2016;11(2):251–260.
  • 24. Wynants L, Collins GS, Van Calster B. Key steps and common pitfalls in developing and validating risk models. BJOG. 2017;124(3):423–432.
  • 25. Steyerberg E. Clinical Prediction Models. A Practical Approach to Development, Validation, and Updating. 2nd ed., Switzerland: Springer Nature; 2019.
  • 26. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed., New York: Springer; 2009.
  • 27. Gayer-Anderson C, Jongsma HE, Di Forti M, et al.; EU-GEI WP2 Group. The EUropean Network of National Schizophrenia Networks Studying Gene-Environment Interactions (EU-GEI): Incidence and First-Episode Case-Control Programme. Soc Psychiatry Psychiatr Epidemiol. 2020;55(5):645–657.
  • 28. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 2010;21(1):128–138.
  • 29. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J. 2014;35(29):1925–1931.
  • 30. Steyerberg E. Clinical Prediction Models. A Practical Approach to Development, Validation, and Updating. New York: Springer; 2009.
  • 31. Reilly BM, Evans AT. Translating clinical research into clinical practice: impact of using prediction rules to make decisions. Ann Intern Med. 2006;144(3):201–209.
  • 32. Steyerberg EW, Harrell FE Jr. Prediction models need appropriate internal, internal-external, and external validation. J Clin Epidemiol. 2016;69:245–247.
  • 33. Jongsma HE, Gayer-Anderson C, Lasalvia A, et al.; European Network of National Schizophrenia Networks Studying Gene-Environment Interactions Work Package 2 (EU-GEI WP2) Group. Treated incidence of psychotic disorders in the multinational EU-GEI Study. JAMA Psychiatry. 2018;75(1):36–46.
  • 34. Harmala S, O’Brien A, Parisinos CA, Direk K, Shallcross L, Hayward A. Development and validation of a prediction model to estimate the risk of liver cirrhosis in primary care patients with abnormal liver blood test results: protocol for an electronic health record study in Clinical Practice Research Datalink. Diagn Progn Res. 2019;3:10.
  • 35. Quattrone D, Ferraro L, Tripoli G, et al. Daily use of high-potency cannabis is associated with more positive symptoms in first-episode psychosis patients: the EU-GEI case-control study. Psychol Med. 2020;51(8):1329–1337.
  • 36. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BJOG. 2015;122(3):434–443.
  • 37. R Core Team. R: Language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2022. https://www.R-project.org/.
  • 38. Riley RD, Ensor J, Snell KIE, et al. Calculating the sample size required for developing a clinical prediction model. BMJ. 2020;368:m441.
  • 39. Austin PC, Steyerberg EW. Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models. Stat Methods Med Res. 2017;26(2):796–808.
  • 40. Moons KG, Donders RA, Stijnen T, Harrell FE Jr. Using the outcome for imputation of missing predictor values was preferred. J Clin Epidemiol. 2006;59(10):1092–1101.
  • 41. Zhao Y, Long Q. Multiple imputation in the presence of high-dimensional data. Stat Methods Med Res. 2016;25(5):2021–2035.
  • 42. Stekhoven DJ, Bühlmann P. MissForest—non-parametric missing value imputation for mixed-type data. Bioinformatics. 2012;28(1):112–118.
  • 43. Oba S, Sato MA, Takemasa I, Monden M, Matsubara K, Ishii S. A Bayesian missing value estimation method for gene expression profile data. Bioinformatics. 2003;19(16):2088–2096.
  • 44. Kontopantelis E, White IR, Sperrin M, Buchan I. Outcome-sensitive multiple imputation: a simulation study. BMC Med Res Methodol. 2017;17(1):2.
  • 45. Tibshirani R. The lasso method for variable selection in the Cox model. Stat Med. 1997;16(4):385–395.
  • 46. Musoro JZ, Zwinderman AH, Puhan MA, ter Riet G, Geskus RB. Validation of prediction models based on lasso regression with multiply imputed data. BMC Med Res Methodol. 2014;14:116.
  • 47. Fan J, Lv J. A Selective overview of variable selection in high dimensional feature space. Stat Sin. 2010;20(1):101–148.
  • 48. Laupacis A, Sekar N, Stiell IG. Clinical prediction rules. A review and suggested modifications of methodological standards. JAMA. 1997;277(6):488–494.
  • 49. Hand DJ. Classifier technology and the illusion of progress. Stat Sci. 2006;21(1):1–5.
  • 50. Bernardini F, Attademo L, Cleary SD, et al. Risk prediction models in psychiatry: toward a new frontier for the prevention of mental illnesses. J Clin Psychiatry. 2017;78(5):572–583.
  • 51. Altman DG, Vergouwe Y, Royston P, Moons KG. Prognosis and prognostic research: validating a prognostic model. BMJ. 2009;338:b605.
  • 52. Moons KG, Altman DG, Vergouwe Y, Royston P. Prognosis and prognostic research: application and impact of prognostic models in clinical practice. BMJ. 2009;338:b606.
  • 53. Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15(4):361–387.
  • 54. Wynants L, van Smeden M, McLernon DJ, Timmerman D, Steyerberg EW, Van Calster B; Topic Group ‘Evaluating diagnostic tests and prediction models’ of the STRATOS initiative. Three myths about risk thresholds for prediction models. BMC Med. 2019;17(1):192.
  • 55. Perkins NJ, Schisterman EF. The inconsistency of “optimal” cutpoints obtained using two criteria based on the receiver operating characteristic curve. Am J Epidemiol. 2006;163(7):670–675.
  • 56. Chekroud AM, Zotti RJ, Shehzad Z, et al. Cross-trial prediction of treatment outcome in depression: a machine learning approach. Lancet Psychiatry. 2016;3(3):243–250.
  • 57. Bonnett LJ, Snell KIE, Collins GS, Riley RD. Guide to presenting clinical prediction models for use in clinical settings. BMJ. 2019;365:l737.
  • 58. Cowley LE, Farewell DM, Maguire S, Kemp AM. Methodological standards for the development and evaluation of clinical prediction rules: a review of the literature. Diagn Progn Res. 2019;3:16.
  • 59. Fisher HL, Jones PB, Fearon P, et al. The varying impact of type, timing and frequency of exposure to childhood adversity on its association with adult psychotic disorder. Psychol Med. 2010;40(12):1967–1978.
  • 60. Morgan C, Lappin J, Heslin M, et al. Reappraising the long-term course and outcome of psychotic disorders: the AESOP-10 study. Psychol Med. 2014;44(13):2713–2726.
  • 61.Murray RM, Mondelli V, Stilo SA, et al. The influence of risk factors on the onset and outcome of psychosis: what we learned from the GAP study. Schizophr Res. 2020;225:63–68. [DOI] [PubMed] [Google Scholar]
  • 62.Di Forti M, Marconi A, Carra E, et al. Proportion of patients in south London with first-episode psychosis attributable to use of high potency cannabis: a case–control study. Lancet Psychiatry. 2015;2(3):233–238. [DOI] [PubMed] [Google Scholar]
  • 63.Di Forti M, Sallis H, Allegri F, et al. Daily use, especially of high-potency cannabis, drives the earlier onset of psychosis in cannabis users. Schizophr Bull. 2014;40(6):1509–1517. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Di Forti M, Quattrone D, Freeman TP, et al. ; EU-GEI WP2 Group. The contribution of cannabis use to variation in the incidence of psychotic disorder across Europe (EU-GEI): a multicentre case–control study. Lancet Psychiatry. 2019;6(5):427–436. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Murray RM, Quigley H, Quattrone D, Englund A, Di Forti M.. Traditional marijuana, high-potency cannabis and synthetic cannabinoids: increasing risk for psychosis. World Psychiatry. 2016;15(3):195–204. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Murray RM, David AS, Ajnakina O.. Prevention of psychosis: moving on from the at-risk mental state to universal primary prevention. Psychol Med. 2021;51(2):223–227. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Nandi A, Glymour MM, Subramanian SV.. Association among socioeconomic status, health behaviors, and all-cause mortality in the United States. Epidemiology. 2014;25(2):170–177. [DOI] [PubMed] [Google Scholar]
  • 68.Stringhini S, Berkman L, Dugravot A, et al. Socioeconomic status, structural and functional measures of social support, and mortality: The British Whitehall II Cohort Study, 1985–2009. Am J Epidemiol. 2012;175(12):1275–1283. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Stringhini S, Sabia S, Shipley M, et al. Association of socioeconomic position with health behaviors and mortality. JAMA. 2010;303(12):1159–1166. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Killackey EJ, Jackson HJ, Gleeson J, Hickie IB, McGorry PD.. Exciting career opportunity beckons! Early intervention and vocational rehabilitation in first-episode psychosis: employing cautious optimism. Aust N Z J Psychiatry. 2006;40(11–12):951–62. [DOI] [PubMed] [Google Scholar]
  • 71. Marwaha S, Johnson S. Schizophrenia and employment—a review. Soc Psychiatry Psychiatr Epidemiol. 2004;39(5):337–349.
  • 72. Morgan VA. Strategies for improving employment outcomes for people with psychosis. Commentary on: severe mental illness and work—what can we do to maximise employment opportunities for individuals with psychosis? Aust N Z J Psychiatry. 2013;47(5):486–487.
  • 73. Ajnakina O, Agbedjro D, McCammon R, et al. Development and validation of prediction model to estimate 10-year risk of all-cause mortality using modern statistical learning methods: a large population-based cohort study and external validation. BMC Med Res Methodol. 2021;21(1):8.
  • 74. Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, Van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22.
  • 75. van der Ploeg T, Austin PC, Steyerberg EW. Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints. BMC Med Res Methodol. 2014;14:137.
  • 76. Stekhoven DJ, Bühlmann P. MissForest—non-parametric missing value imputation for mixed-type data. Bioinformatics. 2012;28(1):112–118.
  • 77. Osborn DP, Hardoon S, Omar RZ, et al. Cardiovascular risk prediction models for people with severe mental illness: results from the prediction and management of cardiovascular risk in people with severe mental illnesses (PRIMROSE) research program. JAMA Psychiatry. 2015;72(2):143–151.
  • 78. Verdoux H, Liraud F, Assens F, Abalan F, van Os J. Social and clinical consequences of cognitive deficits in early psychosis: a two-year follow-up study of first-admitted patients. Schizophr Res. 2002;56(1–2):149–159.
  • 79. Svedberg B, Mesterton A, Cullberg J. First-episode non-affective psychosis in a total urban population: a 5-year follow-up. Soc Psychiatry Psychiatr Epidemiol. 2001;36(7):332–337.
  • 80. Morgan C, Lappin J, Heslin M, et al. Reappraising the long-term course and outcome of psychotic disorders: The AESOP-10 study. Psychol Med. 2014;44(13):2713–2726.
  • 81. Aadamsoo K, Saluveer E, Knarpuu H, Vasar V, Maron E. Diagnostic stability over 2 years in patients with acute and transient psychotic disorders. Nord J Psychiatry. 2011;65(6):381–388.
  • 82. Bland RC, Orn H. 14-year outcome in early schizophrenia. Acta Psychiatr Scand. 1978;58(4):327–338.
  • 83. Johnson S, Sathyaseelan M, Charles H, Jacob KS. Predictors of disability: a 5-year cohort study of first-episode schizophrenia. Asian J Psychiatry. 2014;9:45–50.
  • 84. Lehtinen V, Aaltonen J, Koffert T, Rakkolainen V, Syvalahti E. Two-year outcome in first-episode psychosis treated according to an integrated model. Is immediate neuroleptisation always needed? Eur Psychiatry. 2000;15(5):312–320.
  • 85. Zandi T, Havenaar JM, Laan W, Kahn RS, van den Brink W. Predictive validity of a culturally informed diagnosis of schizophrenia: a 30 month follow-up study with first episode psychosis. Schizophr Res. 2011;133(1–3):29–35.
  • 86. Harrison G, Owens D, Holton A, Neilson D, Boot D. A prospective study of severe mental disorder in Afro-Caribbean patients. Psychol Med. 1988;18(3):643–657.
  • 87. Boydell J, van Os J, McKenzie K, et al. Incidence of schizophrenia in ethnic minorities in London: ecological study into interactions with environment. BMJ. 2001;323(7325):1336–1338.
  • 88. Morgan C, Dazzan P, Morgan K, et al.; AESOP study group. First episode psychosis and ethnicity: initial findings from the AESOP study. World Psychiatry. 2006;5(1):40–46.
  • 89. Schwartz JE, Fennig S, Tanenberg-Karant M, et al. Congruence of diagnoses 2 years after a first-admission diagnosis of psychosis. Arch Gen Psychiatry. 2000;57(6):593–600.
  • 90. Koepsell TD, Gill DP, Chen B. Stability of clinical etiologic diagnosis in dementia and mild cognitive impairment: results from a multicenter longitudinal database. Am J Alzheimers Dis Other Demen. 2013;28(8):750–758.
  • 91. Albers L, Straube A, Landgraf MN, Heinen F, von Kries R. High diagnostic stability of confirmed migraine and confirmed tension-type headache according to the ICHD-3 beta in adolescents. J Headache Pain. 2014;15(1):36.

Supplementary Materials

sgad008_suppl_Supplementary_Materials

Articles from Schizophrenia Bulletin Open are provided here courtesy of Oxford University Press on behalf of the University of Maryland's School of Medicine, Maryland Psychiatric Research Center