Abstract
Background
This study applies novel data mining methods such as natural language processing (NLP) to electronic health records (EHRs) to screen for and detect individuals at risk for psychosis.
Method
The study included all patients receiving a first index diagnosis of nonorganic and nonpsychotic mental disorder within the South London and Maudsley (SLaM) NHS Foundation Trust between January 1, 2008, and July 28, 2018. Least Absolute Shrinkage and Selection Operator (LASSO)-regularized Cox regression was used to refine and externally validate a five-item individualized, transdiagnostic, clinically based risk calculator that had previously been developed (Harrell’s C = 0.79) and piloted for implementation. The refined version included 14 additional NLP predictors: tearfulness, poor appetite, weight loss, insomnia, cannabis, cocaine, guilt, irritability, delusions, hopelessness, disturbed sleep, poor insight, agitation, and paranoia.
Results
A total of 92 151 patients with a first index diagnosis of nonorganic and nonpsychotic mental disorder within the SLaM Trust were included in the derivation (n = 28 297) or external validation (n = 63 854) data sets. Mean age was 33.6 years, 50.7% were women, and 67.0% were of white race/ethnicity. Mean follow-up was 1590 days. The overall 6-year risk of psychosis in secondary mental health care was 3.4% (95% CI, 3.3%–3.6%). External validation indicated strong performance on unseen data (Harrell’s C 0.85, 95% CI 0.84–0.86), an increase of 0.06 from the original model.
Conclusions
Using NLP on EHRs can considerably enhance the prognostic accuracy of psychosis risk calculators. This can help identify patients at risk of psychosis who require assessment and specialized care, facilitating earlier detection and potentially improving patient outcomes.
Keywords: natural language processing, electronic health records, prevention, psychosis, machine learning, prediction
Introduction
The burden of psychotic disorders is substantial: for example, schizophrenia accounted for 13 million years lived with disability in 2017,1 with the most recent estimate reporting a European economic burden of €93.9 billion.2 Existing treatments have little impact on illness course in established psychosis.3,4 Primary indicated prevention in individuals at clinical high risk for psychosis (CHR-P), however, has the potential to reduce the duration of untreated psychosis and alter its course.5,6 Effective preventive intervention is reliant on the successful identification of individuals at risk for psychosis for referral to specialized CHR-P clinical services;7–9 these individuals tend to present with attenuated psychotic symptoms and overall functional impairment.10,11 Detection of at-risk individuals currently relies on help-seeking behaviours12 and idiosyncratic referral pathways initiated on suspicion of psychosis risk.13 Emerging evidence suggests that current detection strategies are highly inefficient,13 with only 5% to 12% of first episode cases intercepted by CHR-P clinical services.14,15 To tackle these challenges, we previously developed an individualized, clinically based transdiagnostic risk calculator, using clinical and demographic predictors widely available in electronic health records (EHRs): age, sex, age by sex, ethnicity, and index ICD-10 diagnosis or CHR-P designation.16 The transdiagnostic risk calculator has been externally validated in two separate large EHR datasets, demonstrating adequate prognostic performance (n = 54 716, Harrell’s C = 0.79, and n = 13 702, Harrell’s C = 0.73, respectively).14,17 This transdiagnostic risk calculator has already undergone pilot testing for clinical implementation.18 The calculator’s potential for implementation, combined with its unique position to enhance large-scale detection of at-risk individuals, underscores the importance of improving its prognostic accuracy.
Given the replication crisis in psychiatry and science,19,20 improving existing, previously validated risk prediction models represents a more efficient approach than redeveloping new models.21
While EHRs offer some information (notably on sociodemographic characteristics) in structured fields, the majority of information is recorded in free text such as event notes and uploaded attachments, representing an enormous reservoir of untapped information.22 For example, information on symptomatology and substance use is not routinely recorded in a structured way.22 Natural language processing (NLP) techniques have recently been developed to mine structured data from free text. These offer an unprecedented opportunity to incorporate more granular predictors closer to the pathophysiology of psychosis onset into a model. By applying NLP to EHRs, this study aims to improve on the prognostic accuracy achieved by the previously published transdiagnostic risk calculator,16 further supporting the efficient detection of individuals at risk for psychosis.
Methods and Materials
Setting
The South London and Maudsley (SLaM) NHS Trust is one of Europe’s largest secondary mental health-care providers.23 Its main catchment area covers four socioeconomically diverse South London boroughs: Croydon, Lambeth, Lewisham, and Southwark, alongside tertiary referrals from the rest of London and the United Kingdom. The Clinical Record Interactive Search (CRIS) system facilitates interrogation of de-identified EHRs held by the Trust, which adopted a system of electronic recording in 2007.22,23 SLaM now has EHRs for over 400 000 individuals, providing data on their sociodemographic and clinical characteristics.
Study Population
We extracted data for all individuals accessing the SLaM NHS Trust between January 1, 2008 and July 28, 2018. Inclusion and exclusion criteria followed those of the original analysis and development of the psychosis risk calculator: namely, all individuals who received a first index primary diagnosis of any nonorganic and nonpsychotic mental disorder were included.16
Model Specifications
Original Model
As detailed in Fusar-Poli et al,16 the original transdiagnostic, clinically based, and individualized risk calculator was developed using a retrospective cohort study leveraging EHRs from the SLaM boroughs of Lambeth and Southwark. Cox regression was used to predict the hazard ratio (HR) of developing any psychotic disorder over time (defined in Supplementary methods S1). Predictors included age (at the time of index diagnosis), sex, age by sex, self-assigned ethnicity, and cluster index diagnosis or CHR-P designation (defined in Supplementary tables S2 and S3). The model was externally validated first in the SLaM boroughs of Croydon and Lewisham and later in Camden and Islington NHS Trust.16,17 In this retrospective version of the risk calculator, individuals who developed psychosis within 3 months following their index diagnosis were excluded. However, implementing this diagnostic lag prospectively in the subsequent implementation study would have resulted in delays for referral to assessment. Therefore, a refined version of the risk calculator without the lag period, which demonstrated similar external prognostic accuracy, was optimized for prospective use and is considered in the current study (for details, see Supplementary table S4).
Model Refinement with NLP Data
In the present study, we refined the prospective version of the transdiagnostic model using all original predictors plus additional NLP-derived predictors. NLP tools were used to extract symptom and substance use data from free text recorded by clinicians within the 6 months prior to the index diagnosis. This time period was chosen to ensure that predictor data did not overlap with our outcome variables, and because symptom assessment tends to precede formal diagnosis. We employed CRIS-specific NLP algorithms that convert unstructured information from free text into structured and quantifiable data. Details on symptom algorithm development can be found in Jackson et al; 22 in general, these were developed using cross-validated support vector machines on a gold-standard, human-annotated training corpus for each symptom. A regularly updated algorithm library, with comprehensive detail on keywords used and validation efforts, can be found on the CRIS website.24 NLP algorithm performance is mainly measured in terms of precision (proportion of true positive instances of total NLP-labelled positive instances) and recall (proportion of true positive instances of all positive instances available in the text). As EHRs provide multiple opportunities for term detection, we favored precision over recall, using only NLP algorithms with at least 80% precision (see Supplementary methods S2 and Supplementary table S5). We also excluded predictors with near-zero variance, which can cause model instability across validation folds.25 This resulted in 14 NLP symptom and substance use predictors with a mean (SD) precision of 0.91 (0.06): tearfulness, poor appetite, weight loss, insomnia, cannabis use, cocaine use, guilt, irritability, delusions, hopelessness, disturbed sleep, poor insight, agitation, and paranoia. 
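The precision-based selection rule described above can be sketched as follows. This is a minimal illustration, not the CRIS implementation: the function names and the annotation counts are hypothetical, and only the 80% precision threshold comes from the text.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from counts against a human-annotated gold standard."""
    labelled_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    precision = true_positives / labelled_positive if labelled_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0
    return precision, recall


def select_algorithms(counts_by_algorithm, min_precision=0.80):
    """Keep only the algorithms whose precision meets the threshold."""
    kept = []
    for name, (tp, fp, fn) in counts_by_algorithm.items():
        precision, _ = precision_recall(tp, fp, fn)
        if precision >= min_precision:
            kept.append(name)
    return sorted(kept)


# Hypothetical (TP, FP, FN) counts; these are NOT the study's validation figures.
example_counts = {
    "paranoia": (90, 10, 30),
    "agitation": (85, 15, 5),
    "hypothetical_low_precision": (60, 40, 10),
}
selected = select_algorithms(example_counts)
```

Note that the first hypothetical algorithm is retained despite a recall of only 0.75, which mirrors the stated preference for precision over recall when a record offers many chances to detect a term.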
We dichotomized NLP symptom and substance use predictors as trigger terms tend to be repeated within and across records; treating predictors as continuous would otherwise have resulted in the model erroneously interpreting them as a linear reflection of severity.26–28 The value “0” indicated that a given symptom or substance was not mentioned in a patient’s EHR. We retained individuals without NLP-derivable symptom or substance data prior to index diagnosis, treating NLP data as bonus information where available.
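A minimal sketch of this dichotomization, assuming that the NLP output for each patient is available as a mapping from predictor name to number of mentions in the 6 months before index diagnosis. The data layout and identifier names are assumptions made for illustration.

```python
# The 14 NLP predictors named in the text (identifier spellings are illustrative).
NLP_PREDICTORS = [
    "tearfulness", "poor_appetite", "weight_loss", "insomnia",
    "cannabis", "cocaine", "guilt", "irritability", "delusions",
    "hopelessness", "disturbed_sleep", "poor_insight", "agitation", "paranoia",
]


def dichotomize(mention_counts, predictors=NLP_PREDICTORS):
    """Map raw mention counts to 0/1 indicators.

    A predictor that is absent from the mapping is coded 0, so patients
    with no NLP-derivable data are retained with all indicators at 0.
    """
    return {name: 1 if mention_counts.get(name, 0) > 0 else 0 for name in predictors}


# Hypothetical patients: one with repeated mentions, one with no NLP data.
with_mentions = dichotomize({"paranoia": 7, "insomnia": 1})
without_data = dichotomize({})
```

Seven mentions of paranoia and a single mention of insomnia both map to 1, so repetition of a trigger term is not read as severity.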
Statistical Analysis
This EHR clinical register-based study is reported according to the RECORD and STROBE statements (see Supplementary table S1).29 Model development and validation followed the methodological guidelines of Royston and Altman,30 Steyerberg et al,31 and the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD).32
We performed descriptive analyses of baseline clinical and sociodemographic characteristics of the sample, obtaining means and frequencies for continuous and categorical variables, respectively. The Kaplan–Meier failure function (1-survival) and Greenwood 95% confidence intervals were used to describe the cumulative risk of psychosis onset in SLaM patients. The primary outcome measure was model prognostic discrimination performance measured through Harrell’s C, a recommended measure for external validation of Cox models.33 A Harrell’s C value of 0.9–1.0 is considered outstanding, 0.8–0.9 excellent and 0.7–0.8 acceptable.34 Model development and validation analyses were conducted in R version 3.6.1.
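As an illustration of the cumulative risk estimate, the sketch below computes the Kaplan–Meier failure function at a fixed horizon with a plain (untransformed) Greenwood interval. It is written in Python rather than the R used in the study, the function name is hypothetical, and statistical software often applies a log or log-log transformation to the interval, so results need not match exactly.

```python
import math


def km_failure(times, events, horizon, z=1.96):
    """Kaplan-Meier failure probability (1 - S(t)) at `horizon` with a Greenwood CI.

    times  : follow-up time for each patient
    events : 1 if psychosis onset was observed, 0 if censored
    """
    # Distinct observed event times up to the horizon, in increasing order.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    survival = 1.0
    greenwood_sum = 0.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - failed / at_risk
        # The Greenwood term is undefined when everyone at risk fails; skip it then.
        if at_risk > failed:
            greenwood_sum += failed / (at_risk * (at_risk - failed))
    standard_error = survival * math.sqrt(greenwood_sum)
    risk = 1.0 - survival
    lower = max(0.0, risk - z * standard_error)
    upper = min(1.0, risk + z * standard_error)
    return risk, lower, upper


# Toy data (not from the study): five patients, two observed events.
risk, lower, upper = km_failure([1, 2, 3, 4, 5], [1, 0, 1, 0, 0], horizon=5)
```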
Model Development
We first divided our cohort into derivation and validation samples using nonrandom, geographic split-sampling, which is one of the recommended approaches to model building.32 Mirroring the original analysis, the derivation sample comprised all cases from Lambeth and Southwark until December 31, 2015.16 The validation sample comprised all cases from the same two boroughs from January 2016 plus cases from all other boroughs, constituting temporal and geographic forms of external validation. These samples differ on several sociodemographic characteristics.23
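The split rule can be expressed compactly. The sketch below assumes each case carries a borough name and an index diagnosis date; those field names, and the choice to label every case from outside Lambeth and Southwark as geographic validation regardless of date, are assumptions made for illustration.

```python
from datetime import date

DERIVATION_BOROUGHS = {"Lambeth", "Southwark"}
DERIVATION_CUTOFF = date(2015, 12, 31)


def assign_sample(borough, index_date):
    """Assign a case to the derivation set or to a validation subset."""
    if borough in DERIVATION_BOROUGHS:
        if index_date <= DERIVATION_CUTOFF:
            return "derivation"
        # Same boroughs, later period: temporal external validation.
        return "validation_temporal"
    # Any other borough: geographic external validation (assumed labelling).
    return "validation_geographic"


# Hypothetical cases.
examples = [
    assign_sample("Lambeth", date(2012, 5, 1)),
    assign_sample("Southwark", date(2016, 1, 1)),
    assign_sample("Croydon", date(2010, 1, 1)),
]
```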
We then trained a Cox proportional-hazards model on our derivation sample using the refined transdiagnostic model. Since adding large numbers of predictors can result in overfitting,21 we regularized our model using the Least Absolute Shrinkage and Selection Operator (LASSO) penalty, implemented via the glmnet package in R. The LASSO algorithm performs feature selection by shrinking the coefficients of redundant predictors toward zero. This penalty requires selection of a tuning parameter lambda, which controls the number of coefficients estimated to be non-zero. We used 10-fold cross-validation to select the optimal lambda, choosing the minimum lambda value that maximized partial likelihood from a range of possible values. With the resultant model coefficients, we developed a prognostic index in the derivation dataset by generating prognostic risk scores for each individual.
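For reference, the LASSO-penalized Cox estimator and the prognostic index described above can be written in their standard textbook form. The notation is introduced here for illustration: $x_i$ is the predictor vector of patient $i$, $\delta_i$ indicates an observed event, $R(t_i)$ is the set of patients still at risk at time $t_i$, and tied event times are assumed absent. Software such as glmnet may scale the likelihood term by a constant factor, which changes the numerical value of lambda but not the form of the problem.

```latex
\[
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \left[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right) \right]
\]
\[
\hat{\beta}(\lambda) = \arg\max_{\beta}
  \left\{ \ell(\beta) - \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert \right\},
\qquad
\mathrm{PI}_i = x_i^{\top}\hat{\beta}(\lambda)
\]
```

Larger values of $\lambda$ set more coefficients exactly to zero; cross-validation chooses $\lambda$ by comparing the held-out partial likelihood across folds.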
External Model Validation
We applied the regression coefficients from the derivation data set to each case in the external validation set to generate the prognostic index for the validation dataset. Model prognostic performance (Harrell’s C, which captures discrimination)31 was the primary outcome measure. We further assessed overall model performance using the Brier score (the mean squared difference between predicted probabilities and actual outcomes), which captures both calibration and discrimination.31 A lower score indicates higher precision and less bias. Calibration (the agreement between observed outcomes and predictions) was assessed with the regression slope of the prognostic index.35 Finally, we performed a sensitivity analysis to assess whether our model would perform better in temporal or geographic external validation by splitting our external validation set into these two groups.
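The two headline measures can be illustrated with a short sketch. It is a simplified Python illustration rather than the R code used in the study: the pairwise loop is quadratic and ignores tied event times, and the Brier function treats outcomes as fully observed binary labels, whereas survival-specific versions account for censoring.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's C: share of usable pairs in which the earlier event has the higher score."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only when the shorter time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half-concordant
    return concordant / usable


def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(predicted_probs, outcomes)) / len(outcomes)


# Toy data (not from the study): four patients, one censored.
c_index = harrell_c([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.4, 0.5, 0.1])
brier = brier_score([0.9, 0.1], [1, 0])
```

In the toy data, five pairs are usable and four are ordered correctly, so the C index is 0.8; a value of 0.5 would correspond to chance-level ranking.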
Results
Sociodemographic and Clinical Characteristics
Of 108 211 individuals receiving a first diagnosis of nonorganic and nonpsychotic mental disorder within SLaM between January 1, 2008 and July 28, 2018, 92 151 had complete data across all original predictors (see figure 1). Mean (SD) age was 33.6 (19.0) years; individuals were almost evenly split by sex (female 50.7%, male 49.3%), most were of White ethnicity (67.0%), and anxiety disorders were the most frequent index diagnoses (27.5%, table 1). With respect to the new NLP predictors, 44 368 (41%) individuals had no symptom or substance data in the 6 months prior to their index diagnosis. Derivation (n = 28 297) and validation sets (n = 63 854, figure 1) showed notable differences in terms of ethnic make-up and the spread of index diagnoses (eg, substance use disorder was more prevalent in the derivation set, see Supplementary table S6). Mean (SD) follow-up was 1590 (721) days with a significant difference between derivation and validation sets (derivation: mean [SD], 1896 [463]; validation: mean [SD], 1455 [772]). Overall 6-year risk of developing a psychotic disorder was 0.034 (95% CI, 0.033–0.036). Cumulative incidence (Kaplan–Meier failure function) for risk of development of psychotic disorders is presented in Supplementary figure S1.
Fig. 1.
Flowchart of study population.
Table 1.
Sample Characteristics
| Variable | Study Population (n = 92 151), No. (%) | Derivation Dataset (n = 28 297), No. (%) | Validation Dataset (n = 63 854), No. (%) |
|---|---|---|---|
| Age, mean (SD)a | 33.6 (19.0) | 34.8 (18.8) | 33.0 (19.1) |
| Sexa | |||
| Female | 46 741 (50.7%) | 13 861 (49.0%) | 32 880 (51.5%) |
| Male | 45 410 (49.3%) | 14 436 (51.0%) | 30 974 (48.5%) |
| Ethnicitya | |||
| White | 61 711 (67.0%) | 16 700 (59.0%) | 45 011 (70.5%) |
| Asian | 4549 (4.94%) | 1030 (3.64%) | 3519 (5.51%) |
| Black | 15 187 (16.5%) | 6029 (21.3%) | 9158 (14.3%) |
| Mixed | 3805 (4.13%) | 1183 (4.18%) | 2622 (4.11%) |
| Other | 6899 (7.49%) | 3355 (11.9%) | 3544 (5.55%) |
| Index diagnosis | |||
| CHR-P | 445 (0.48%) | 238 (0.84%) | 207 (0.32%) |
| Anxiety disorders | 25 323 (27.5%) | 6765 (23.9%) | 18 558 (29.1%) |
| Acute and transient psychotic disorders | 1568 (1.70%) | 552 (1.95%) | 1016 (1.59%) |
| Bipolar disorders | 3149 (3.42%) | 1018 (3.60%) | 2131 (3.34%) |
| Childhood onset disorders | 12 332 (13.4%) | 3351 (11.8%) | 8981 (14.1%) |
| Developmental disorders | 4645 (5.04%) | 923 (3.26%) | 3722 (5.83%) |
| Mental retardation | 1640 (1.78%) | 609 (2.15%) | 1031 (1.61%) |
| Nonbipolar affective disorders | 15 965 (17.3%) | 5240 (18.5%) | 10 725 (16.8%) |
| Personality disorders | 3524 (3.82%) | 1071 (3.78%) | 2453 (3.84%) |
| Physiological disorders | 6806 (7.39%) | 1958 (6.92%) | 4848 (7.59%) |
| Substance use disorders | 16 754 (18.2%) | 6572 (23.2%) | 10 182 (15.9%) |
| Symptoms | |||
| Tearfulness | 20 214 (21.9%) | 6379 (22.5%) | 13 835 (21.7%) |
| Appetite loss | 13 653 (14.8%) | 4331 (15.3%) | 9322 (14.6%) |
| Weight loss | 8623 (9.36%) | 2621 (9.26%) | 6002 (9.40%) |
| Insomnia | 5115 (5.55%) | 1714 (6.06%) | 3401 (5.33%) |
| Poor insight | 17 089 (18.5%) | 5089 (18.0%) | 12 000 (18.8%) |
| Guilt | 9953 (10.8%) | 3288 (11.6%) | 6665 (10.4%) |
| Irritability | 9049 (9.82%) | 2790 (9.86%) | 6259 (9.80%) |
| Delusions | 5352 (5.81%) | 1703 (6.02%) | 3649 (5.71%) |
| Hopelessness | 8883 (9.64%) | 2766 (9.77%) | 6117 (9.58%) |
| Disturbed sleep | 25 786 (28.0%) | 8210 (29.0%) | 17 576 (27.5%) |
| Agitation | 12 916 (14.0%) | 3862 (13.6%) | 9054 (14.2%) |
| Paranoia | 13 212 (14.3%) | 4011 (14.2%) | 9201 (14.4%) |
| Substance use | |||
| Cannabis | 13 604 (14.8%) | 4333 (15.3%) | 9271 (14.5%) |
| Cocaine | 10 229 (11.1%) | 3675 (13.0%) | 6554 (10.3%) |
aMissingness values for ethnicity, sex, and age were 11.9%, 0.06%, and 0.02%, respectively.
Model Development
There were 1060 transitions to psychosis in the derivation dataset (raw counts stratified per index diagnosis are available in Supplementary table S7), of which 55 were observed in the CHR-P group (5%). The refined risk prediction model significantly predicted psychosis onset (likelihood ratio χ2 test, 2769; P < .001, table 2). No variables were selected out of the model via LASSO regularization. The LASSO penalty improves model performance at the expense of bias in parameter estimates (which reduces coefficient interpretability); therefore, significance testing for individual predictors would not be appropriate.36 Paranoia, delusions, and agitation were the strongest positive NLP-derived predictors of psychosis, while hopelessness was the strongest negative one. The transdiagnostic model refined with NLP predictors showed good apparent prognostic performance (Harrell’s C index, 0.86, table 3), an increase of 0.05 from the Harrell’s C of the original model.
Table 2.
Characteristics of the Refined, Individualized, and Transdiagnostic Clinically Based Risk Prediction Model Employing NLP Predictors to Detect Individuals at Risk for Psychosis in EHRs
| Predictor | Hazard Ratio |
|---|---|
| Original model | |
| Male sex | 1.29 |
| Age | 1.01 |
| Sex * Age | 0.99 |
| Ethnicity | |
| White | Ref |
| Asian | 1.57 |
| Black | 2.16 |
| Mixed | 1.20 |
| Other | 1.18 |
| Primary diagnosis | |
| CHR-P | Ref |
| Anxiety disorders | 0.16 |
| Acute and transient psychotic disorders | 1.26 |
| Bipolar disorders | 0.38 |
| Childhood-onset disorders | 0.06 |
| Developmental disorders | 0.07 |
| Mental retardation | 0.07 |
| Nonbipolar affective disorders | 0.22 |
| Personality disorders | 0.17 |
| Physiological disorders | 0.11 |
| Substance use disorders | 0.15 |
| New NLP symptoms and substance use | |
| Agitation | 1.64 |
| Appetite loss | 1.06 |
| Cannabis | 1.13 |
| Cocaine | 0.87 |
| Delusions | 2.10 |
| Disturbed sleep | 1.12 |
| Guilt | 0.93 |
| Hopelessness | 0.70 |
| Insomnia | 1.05 |
| Irritability | 1.05 |
| Loss of insight | 1.02 |
| Paranoia | 2.62 |
| Tearfulness | 0.93 |
| Weight loss | 1.14 |
Coefficients obtained via LASSO-regularised, multivariable Cox proportional hazards regression using the derivation dataset (n = 28 297).
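To show how the hazard ratios in table 2 combine, the sketch below sums log hazard ratios for the binary NLP predictors recorded for a patient and exponentiates the result. This is illustrative arithmetic under the Cox model only: the function name is hypothetical, only four predictors are copied from the table, and because LASSO estimates are shrunken the output should not be read as a clinical risk estimate.

```python
import math

# Hazard ratios copied from table 2 for a subset of the NLP predictors.
HAZARD_RATIOS = {
    "paranoia": 2.62,
    "delusions": 2.10,
    "agitation": 1.64,
    "hopelessness": 0.70,
}


def relative_hazard(present, hazard_ratios=HAZARD_RATIOS):
    """Hazard relative to an otherwise identical patient with none of these predictors recorded."""
    log_hazard = sum(math.log(hazard_ratios[name]) for name in present)
    return math.exp(log_hazard)


# Hypothetical patient with paranoia and delusions recorded before index diagnosis.
combined = relative_hazard(["paranoia", "delusions"])
```

Here the combined relative hazard is the product of the two hazard ratios, approximately 5.50, because contributions add on the log-hazard scale.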
Table 3.
Performance Measures for the Original Transdiagnostic Individualized Risk Prediction Model vs the NLP Model Refinement
| Performance Measure | Original Transdiagnostic Model: Derivation | Original Transdiagnostic Model: Validation | Refined Model Including NLP Predictors: Derivation | Refined Model Including NLP Predictors: Validation |
|---|---|---|---|---|
| Overall | ||||
| Brier | 0.028 | 0.021 | 0.085 | 0.061 |
| R2 | 0.746 | 0.719 | 0.885 | 0.885 |
| Discrimination | ||||
| Harrell’s C (95% CI) | 0.809 (0.795–0.822) | 0.790 (0.775–0.806) | 0.861 (0.849–0.873) | 0.848 (0.838–0.858) |
| Calibration | ||||
| Calibration slope | 1 | 0.968 | 1 | 1.059 |
External Model Validation
There were 1662 transitions to psychosis in the external validation dataset, which far exceeds the minimum value of 100 events required for robust external validation.37 The transdiagnostic model refined with NLP predictors still retained good prognostic performance when applied to unseen data (Harrell’s C 0.85), an increase of 0.06 from the Harrell’s C of the original model. The calibration slope coefficient of 1.06 (95% CI 1.03–1.09) indicated no major miscalibration issues. A sensitivity analysis found stronger model discriminatory performance in temporal external validation (Harrell’s C = 0.91) than geographic (Harrell’s C = 0.86; Supplementary table S8).
Discussion
To our knowledge this is the first study demonstrating the potential of applying automated methods such as NLP to EHRs to detect individuals at risk for psychosis. By incorporating NLP-derived data on symptoms and substance use, we refined a previously validated transdiagnostic, individualized, and clinically based risk prediction model. Model refinement considerably improved the external prognostic accuracy of the model to a good level (Harrell’s C 0.85) compared with an adequate level when using structured field data alone (Harrell’s C 0.79).
Efficient detection of individuals at CHR-P has been a neglected area of research despite its necessity for successful early intervention. This study progresses the field by demonstrating, for the first time, that advanced data-mining NLP methods38 can improve prognostication of psychosis risk at large scale. Our NLP-refined model confers a substantial increase in external prognostic accuracy, with a Harrell’s C increment of 0.06. Harrell’s C indexes the probability that for any given case–control pair, the model will generate a higher predicted risk score for the case. Our refinement has effected an improvement from adequate to good, a level of prognostic accuracy exceeding that of the original CHR-P instruments (C-statistic 0.79).39 Compared with CHR-P instruments, the NLP-based risk calculator produces reasonably well-calibrated and individualized estimates of risk (as opposed to group-level estimates), is automatable in EHRs and can be applied in large datasets. Furthermore, our risk calculator detects psychosis risk transdiagnostically outside the CHR-P designation.40 This is crucial given recent evidence that a third of first episode psychosis cases do not evolve through a previous CHR-P stage.13 Indeed, the original risk calculator has already been externally validated, and shown to perform well, in other NHS Trusts that do not have CHR-P services.17 A recent review of psychosis risk prediction models found that clinical variables such as paranoia and unusual thought content consistently appear as significant predictors of psychosis.41 NLP techniques can extract symptom data at a fraction of the cost of individual patient recruitment. 
Prognostic performance of our NLP-based refinement of the transdiagnostic risk calculator also exceeds that achieved using harder-to-obtain neuroanatomical predictors (eg, grey matter volume), with accuracies ranging from 0.50 to 0.63.42 In a previous study, we found that the use of machine learning per se is not associated with improved prognostic accuracy.43
The first step toward translating NLP tools into clinical benefits for patients is to apply these methods in larger risk cohorts to test their reproducibility.44 This study represents the largest application of NLP in the area of predicting conversion to psychosis to date (the second largest study includes only 59 patients).44 Our promising findings align with the existing NLP literature. For example, NLP-based automated speech analysis has been used to measure subtle, clinically relevant mental state changes in emerging psychosis.45 Other studies have found that NLP-derived tools can identify symptom distributions in clinician notes beyond those captured by ICD codes, and that these domains usefully map onto Research Domain Criteria.46 Our group has also confirmed that CRIS-NLP algorithms can reliably extract data on typically complex symptomatic domains such as negative psychotic symptoms or insight,26,27 which has been replicated by an independent team.28 These algorithms can also extract substance use data that predict longitudinal outcomes.47 The accuracy of NLP-based estimates compared with gold-standard domains is further corroborated by their robust prognostic (ie, predicting the course of a condition) and predictive (ie, predicting the response to treatment) value.28,46,48 NLP-derived data can also be incorporated into dynamic risk prediction models such as those recently used to predict psychosis onset risk.49 For example, one could use NLP to extract dynamic treatment data relevant to CHR-P populations, such as exposure to cognitive behavioral therapy, which is not routinely recorded in structured fields.50 This information could dynamically flow into risk predictions that are updated every time a patient’s record is updated with new information.
The NLP-refined transdiagnostic risk calculator is well suited for implementation in routine clinical care. First, our calculator represents the only available, pragmatic option for improving detection of individuals at risk of psychosis in secondary mental health care. The only existing alternative is to conduct extensive outreach campaigns that promote referrals based on clinicians’ suspicion of psychosis risk. This approach is inefficient because it dilutes risk enrichment (ie, refers more patients with only low risk for psychosis).51 Second, NLP-derived data were available for most at-risk individuals. Third, the NLP-refined model performed well in external validation efforts both temporally and geographically. The NLP algorithms used can be transferred to other sites with electronic health registers interrogable via CRIS for further external validation (eg, the Oxford Health NHS Foundation Trust). Fourth, this study refines an already well-performing model. Enhancing the benefits of clinical implementation through continual refinement of a given prognostic model is preferable to repeatedly developing new models from scratch that may never enter clinical routine.21 As the original risk calculator has already been piloted for implementation,18 the new NLP-refined model can be easily absorbed into clinical practice. Finally, this refined risk prediction model is designed to work around the way clinicians and mental health professionals enter text into EHRs. Future developments could implement an automated algorithm to trigger a prompt when all five baseline predictors are entered, taking into account any NLP symptom or substance use data entered prior to this point. For those (41%) without symptom or substance use information recorded prior to formal diagnosis, the original risk calculator can still be used. 
We have recently integrated the original version of this risk calculator in EHRs and prospectively piloted a real-time and real-world psychosis risk detection and alerting system.18,52 This method leverages the CogStack platform, which is an open-source text extraction system.52 The CogStack platform offers full-text search of clinical data, real-time calculation of psychosis risk, early risk alerts to clinicians, and visual monitoring of patients over time.52 This method is highly transportable and can be easily deployed in NHS Trusts with a CRIS or CogStack platform. So far, the CRIS platform—including consenting procedures—is under expansion across 12 NHS Trusts in the UK, harnessing over 2 million deidentified patient records (https://crisnetwork.co/). This study also provides further empirical evidence supporting the expansion of EHRs in clinical psychiatry to facilitate precision and stratified medicine approaches on a global scale. Study limitations are appended in the eLimitations section.
Conclusions
Applying NLP techniques to EHR data recorded during routine clinical practice can facilitate robust research in large, representative samples of patients. NLP can add value to precision psychiatry by enhancing the prognostic accuracy of risk prediction models. Automatic text extraction from EHRs through NLP enhanced the transdiagnostic prognostic power achieved by a previously developed psychosis risk calculator. This can help to facilitate earlier detection of patients at risk of developing psychosis who require an assessment and specialized care, potentially improving outcomes of psychosis.
Supplementary Material
Acknowledgments
All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare that R.S. has received research funding from Roche, Janssen, GSK, and Takeda. P.F.P. has received grant funds from Lundbeck and honoraria from Lundbeck, Menarini, and Angelini outside the current study. D.S. has received grant funds from Lundbeck. The other authors report no financial relationships with commercial interests. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health. The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Author Contribution
The study was conceived by P.F.P. Data extraction and statistical analysis were performed by J.I. with support from C.C., R.P., M.P., and D.S. Reporting of findings was carried out by J.I. and P.F.P. All authors (J.I., C.C., D.S., M.P., R.S., P.F.P., H.B., and R.P.) contributed to study design and manuscript preparation and approved the final version.
Ethics Approval
The CRIS data resource received ethical approval as an anonymized dataset for secondary analyses from Oxfordshire REC C (Ref: 18/SC/0372).
Funding
This study was supported by the King’s College London Confidence in Concept award from the Medical Research Council (MRC) (MC_PC_16048) to P.F.P. H.S., M.P., R.S., D.S. and C.C. received funding from the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London, which also supports the development and maintenance of the BRC Case Register. R.P. has received support from a Medical Research Council (MRC) Health Data Research UK Fellowship (MR/S003118/1) and a Starter Grant for Clinical Lecturers (SGL015/1020) supported by the Academy of Medical Sciences, The Wellcome Trust, MRC, British Heart Foundation, Arthritis Research UK, the Royal College of Physicians and Diabetes UK.
Data Availability
The data accessed by CRIS remain within an NHS firewall and governance is provided by a patient-led oversight committee. Subject to these conditions, data access is encouraged and those interested should contact RS (robert.stewart@kcl.ac.uk), CRIS academic lead.
References
- 1. GBD 2017 Disease and Injury Incidence and Prevalence Collaborators SL, Abate D, Abate KH, et al. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet. 2018;392:1789–1858.
- 2. Gustavsson A, Svensson M, Jacobi F, et al. Cost of disorders of the brain in Europe 2010. Eur Neuropsychopharmacol. 2011;21(10):718–779.
- 3. Jääskeläinen E, Juola P, Hirvonen N, et al. A systematic review and meta-analysis of recovery in schizophrenia. Schizophr Bull. 2013;39(6):1296–1306.
- 4. Millan MJ, Andrieux A, Bartzokis G, et al. Altering the course of schizophrenia: progress and perspectives. Nat Rev Drug Discov. 2016;15(7):485–515.
- 5. Fusar-Poli P, Bauer M, Borgwardt S, et al. European college of neuropsychopharmacology network on the prevention of mental disorders and mental health promotion (ECNP PMD-MHP). Eur Neuropsychopharmacol. 2019;29(12):1301–1311.
- 6. Oliver D, Davies C, Crossland G, et al. Can we reduce the duration of untreated psychosis? A systematic review and meta-analysis of controlled interventional studies. Schizophr Bull. 2018;44(6):1362–1372.
- 7. Fusar-Poli P, Tantardini M, De Simone S, et al. Deconstructing vulnerability for psychosis: meta-analysis of environmental risk factors for psychosis in subjects at ultra high-risk. Eur Psychiatry. 2017;40:65–75.
- 8. Oliver D, Reilly TJ, Baccaredda Boy O, et al. What causes the onset of psychosis in individuals at clinical high risk? A meta-analysis of risk and protective factors. Schizophr Bull. 2020;46(1):110–120. doi:10.1093/schbul/sbz039
- 9. Radua J, Ramella-Cravaro V, Ioannidis JPA, et al. What causes psychosis? An umbrella review of risk and protective factors. World Psychiatry. 2018;17(1):49–66.
- 10. Fusar-Poli P, Rocchetti M, Sardella A, et al. Disorder, not just state of risk: meta-analysis of functioning and quality of life in people at high risk of psychosis. Br J Psychiatry. 2015;207(3):198–206.
- 11. Fusar-Poli P, Byrne M, Badger S, Valmaggia LR, McGuire PK. Outreach and support in south London (OASIS), 2001–2011: ten years of early diagnosis and treatment for young individuals at high clinical risk for psychosis. Eur Psychiatry. 2013;28(5):315–326.
- 12. Falkenberg I, Valmaggia L, Byrnes M, et al. Why are help-seeking subjects at ultra-high risk for psychosis help-seeking? Psychiatry Res. 2015;228(3):808–815.
- 13. Fusar-Poli P, Sullivan SA, Shah JL, Uhlhaas PJ. Improving the detection of individuals at clinical risk for psychosis in the community, primary and secondary care: an integrated evidence-based approach. Front Psychiatry. 2019;10:774.
- 14. Fusar-Poli P. Extending the benefits of indicated prevention to improve outcomes of first-episode psychosis. JAMA Psychiatry. 2017;74(7):667–668.
- 15. McGorry PD, Hartmann JA, Spooner R, Nelson B. Beyond the “at risk mental state” concept: transitioning to transdiagnostic psychiatry. World Psychiatry. 2018;17(2):133–142.
- 16. Fusar-Poli P, Rutigliano G, Stahl D, et al. Development and validation of a clinically based risk calculator for the transdiagnostic prediction of psychosis. JAMA Psychiatry. 2017;74(5):493–500. doi:10.1001/jamapsychiatry.2017.0284
- 17. Fusar-Poli P, Werbeloff N, Rutigliano G, et al. Transdiagnostic risk calculator for the automatic detection of individuals at risk and the prediction of psychosis: second replication in an independent national health service trust. Schizophr Bull. 2019;45(3):562–570.
- 18. Fusar-Poli P, Oliver D, Spada G, et al. Real world implementation of a transdiagnostic risk calculator for the automatic detection of individuals at risk of psychosis in clinical routine: study protocol. Front Psychiatry. 2019;10(MAR). doi:10.3389/fpsyt.2019.00109
- 19. Loken E, Gelman A. Measurement error and the replication crisis. Science. 2017;355(6325):584–585.
- 20. Begley CG, Ioannidis JP. Reproducibility in science: improving the standard for basic and preclinical research. Circ Res. 2015;116(1):116–126.
- 21. Fusar-Poli P, Hijazi Z, Stahl D, Steyerberg EW. The science of prognosis in psychiatry: a review. JAMA Psychiatry. 2018;75(12):1289–1297.
- 22. Jackson RG, Patel R, Jayatilleke N, et al. Natural language processing to extract symptoms of severe mental illness from clinical text: the Clinical Record Interactive Search Comprehensive Data Extraction (CRIS-CODE) project. BMJ Open. 2017;7(1):e012012.
- 23. Perera G, Broadbent M, Callard F, et al. Cohort profile of the South London and Maudsley NHS Foundation Trust Biomedical Research Centre (SLaM BRC) Case Register: current status and recent enhancement of an Electronic Mental Health Record-derived data resource. BMJ Open. 2016;6(3):e008721.
- 24. CRIS Natural Language Processing Applications Library. n.d. https://www.maudsleybrc.nihr.ac.uk/facilities/clinical-record-interactive-search-cris/cris-natural-language-processing/. Accessed July 23, 2020.
- 25. Kuhn M, Johnson K. Applied predictive modeling. New York: Springer; 2013:1–600. doi:10.1007/978-1-4614-6849-3
- 26. Ramu N, Kolliakou A, Sanyal J, Patel R, Stewart R. Recorded poor insight as a predictor of service use outcomes: cohort study of patients with first-episode psychosis in a large mental healthcare database. BMJ Open. 2019;9(6):e028929.
- 27. Patel R, Jayatilleke N, Broadbent M, et al. Negative symptoms in schizophrenia: a study in a large clinical sample of patients using a novel automated method. BMJ Open. 2015;5(9):e007619.
- 28. Downs J, Dean H, Lechler S, et al. Negative symptoms in early-onset psychosis and their association with antipsychotic treatment failure. Schizophr Bull. 2019;45(1):69–79. doi:10.1093/schbul/sbx197
- 29. Benchimol EI, Smeeth L, Guttmann A, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med. 2015;12(10):e1001885.
- 30. Royston P, Altman DG. External validation of a Cox prognostic model: principles and methods. BMC Med Res Methodol. 2013;13(1). doi:10.1186/1471-2288-13-33
- 31. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J. 2014;35(29):1925–1931.
- 32. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement. Eur Urol. 2015;67(6):1142–1151. doi:10.1016/j.eururo.2014.11.025
- 33. Uno H, Cai T, Pencina MJ, D’Agostino RB, Wei LJ. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat Med. 2011;30(10):1105–1117.
- 34. Hosmer DW, Lemeshow S, May S. Applied survival analysis: regression modeling of time to event data. 2nd ed. Wiley Blackwell; 2011:1–401. doi:10.1002/9780470258019
- 35. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 2010;21(1):128–138.
- 36. Hastie T, Tibshirani R, Wainwright M. Statistical learning with sparsity: the lasso and generalizations. CRC Press; 2015:1–337. doi:10.1201/b18401
- 37. Collins GS, Ogundimu EO, Altman DG. Sample size considerations for the external validation of a multivariable prognostic model: a resampling study. Stat Med. 2016;35(2):214–226. doi:10.1002/sim.6787
- 38. Maynard D, Bontcheva K. Natural language processing. Persp Ontol Learn. IOS Press; 2014:51–67. doi:10.4018/ijssoe.2014010105
- 39. Oliver D, Kotlicka-Antczak M, Minichino A, Spada G, McGuire P, Fusar-Poli P. Meta-analytical prognostic accuracy of the Comprehensive Assessment of at Risk Mental States (CAARMS): the need for refined prediction. Eur Psychiatry. 2018;49:62–68.
- 40. Fusar-Poli P, Solmi M, Brondino N, et al. Transdiagnostic psychiatry: a systematic review. World Psychiatry. 2019;18:192–207. doi:10.1002/wps.20631
- 41. Worthington MA, Cao H, Cannon TD. Discovery and validation of prediction algorithms for psychosis in youths at clinical high risk. Biol Psychiatry Cogn Neurosci Neuroimaging. 2020;5(8):738–747.
- 42. Vieira S, Gong QY, Pinaya WHL, et al. Using machine learning and structural neuroimaging to detect first episode psychosis: reconsidering the evidence. Schizophr Bull. 2020;46(1):17–26.
- 43. Fusar-Poli P, Stringer D, Durieux AMS, et al. Clinical-learning versus machine-learning for transdiagnostic prediction of psychosis onset in individuals at-risk. Transl Psychiatry. 2019;9(1):259.
- 44. Corcoran CM, Carrillo F, Fernández-Slezak D, et al. Prediction of psychosis across protocols and risk cohorts using automated language analysis. World Psychiatry. 2018;17(1):67–75.
- 45. Bedi G, Carrillo F, Cecchi GA, et al. Automated analysis of free speech predicts psychosis onset in high-risk youths. NPJ Schizophr. 2015;1:15030.
- 46. McCoy TH, Castro VM, Rosenfield HR, Cagan A, Kohane IS, Perlis RH. A clinical perspective on the relevance of research domain criteria in electronic health records. Am J Psychiatry. 2015;172(4):316–320.
- 47. Patel R, Wilson R, Jackson R, et al. Cannabis use and treatment resistance in first episode psychosis: a natural language processing study. Lancet. 2015;385(Suppl 1):S79.
- 48. Patel R, Wilson R, Jackson R, et al. Association of cannabis use with hospital admission and antipsychotic treatment failure in first episode psychosis: an observational study. BMJ Open. 2016;6:e009888. doi:10.1136/bmjopen-2015-009888
- 49. Studerus E, Beck K, Fusar-Poli P, Riecher-Rössler A. Development and validation of a dynamic risk prediction model to forecast psychosis onset in patients at clinical high risk. Schizophr Bull. 2020;46(2):252–260. doi:10.1093/schbul/sbz059
- 50. Colling C, Evans L, Broadbent M, et al. Identification of the delivery of cognitive behavioural therapy for psychosis (CBTp) using a cross-sectional sample from electronic health records and open-text information in a large UK-based mental health case register. BMJ Open. 2017;7(7):e015297.
- 51. Fusar-Poli P, Schultze-Lutter F, Cappucciati M, et al. The dark side of the moon: meta-analytical impact of recruitment strategies on risk enrichment in the clinical high risk state for psychosis. Schizophr Bull. 2016;42(3):732–743.
- 52. Wang T, Oliver DAP, Msosa YJ, et al. A real-time psychosis risk detection and alerting system based on electronic health records using CogStack. J Vis Exp. 2020;2020(159).

