BMJ Open. 2017 Jul 17;7(7):e015297. doi: 10.1136/bmjopen-2016-015297

Identification of the delivery of cognitive behavioural therapy for psychosis (CBTp) using a cross-sectional sample from electronic health records and open-text information in a large UK-based mental health case register

Craig Colling 1,2, Lauren Evans 1,2, Matthew Broadbent 1,2, David Chandran 1, Thomas J Craig 1, Anna Kolliakou 1, Robert Stewart 1,2, Philippa A Garety 1,2
PMCID: PMC5734297  PMID: 28716789

Abstract

Objective

Our primary objective was to identify cognitive behavioural therapy (CBT) delivery for people with psychosis (CBTp) using an automated method in a large electronic health record database. We also examined what proportion of service users with a diagnosis of psychosis were recorded as having received CBTp within their episode of care during defined time periods provided by early intervention or promoting recovery community services for people with psychosis, compared with published audits and whether demographic characteristics differentially predicted the receipt of CBTp.

Methods

Both free text using natural language processing (NLP) techniques and structured methods of identifying CBTp were combined and evaluated for positive predictive value (PPV) and sensitivity. Using inclusion criteria from two published audits, we identified anonymised cross-sectional samples of 2579 and 2308 service users respectively with a case note diagnosis of schizophrenia or psychosis for further analysis.

Results

The method achieved a PPV of 95% and a sensitivity of 96%. Using the National Audit of Schizophrenia 2 criteria, 34.6% of service users were identified as ever having received at least one session and 26.4% at least two sessions of CBTp; these percentages are higher than the 20.0% previously reported by a manual audit of a sample from the same trust. In the fully adjusted analysis, CBTp receipt was significantly (p<0.05) more likely in younger patients, in white and other when compared with black ethnic groups and patients with a diagnosis of other schizophrenia spectrum and schizoaffective disorder when compared with schizophrenia.

Conclusions

The methods presented here provide a potential means of evaluating delivery of CBTp on a large scale, offering more scope for routine monitoring, cross-site comparisons and the promotion of equitable access.

Keywords: cognitive behavioural therapy, CBT, CBT for psychosis, CBTp, electronic health records, EHR


Strengths and limitations of this study.

  • Key strengths of this study were the large sample and the innovative approaches adopted to identify cognitive behavioural therapy for psychosis (CBTp) delivery within the clinical record.

  • The ability to replicate the inclusion criteria of two previous audits also allowed us to contextualise the findings, and the large data set allowed access to data by year and also to examine clinical and demographic factors influencing delivery, identifying inequalities in access that are not detectable in smaller samples.

  • The use of routine data and automated ascertainment provides the scope for more in-depth evaluation of real-world treatment delivery and success, and the wider use of other EHR-derived data to investigate predictors of treatment receipt and outcome.

  • A limitation of this study was that it took place in a single (although large) service provider; however, our results have identified themes that are consistent with other findings in relation to CBTp provision.

  • This approach does not provide an assessment of quality of treatment, its specific therapeutic focus or its duration.

  • This approach does not identify offers of CBTp that are not taken up.

Introduction

Background

Pharmacotherapy as monotherapy for people with a diagnosis of psychosis or schizophrenia is no longer regarded as optimal treatment. The implementation of cognitive behavioural therapy for psychosis (CBTp) is of international concern and relevance,1 and CBTp, given its evidence base, is recommended in many countries including Australia and New Zealand,2 Canada,3 Spain4 and the USA.5 This paper is focused on the provision of CBTp at a single UK site, but the challenges associated with monitoring and improving the implementation of CBT for service users with psychosis have international relevance. For England and Wales, the NICE national guideline recommends that psychological therapies, in particular CBTp and family intervention, should be offered to all people with a diagnosis of schizophrenia and their carers.6 However, within the UK, service users, charities such as Rethink,7 policy makers and audits8 9 have repeatedly reported that only a small proportion of people are accessing these treatments. For example, the Schizophrenia Commission reported that only about 10% of service users access CBTp.10 To address these concerns, the Department of Health and National Health Service (NHS) England are undertaking various initiatives, including the Improving Access to Psychological Therapies for Severe Mental Illness11 programme and the new Early Intervention Access and Waiting Time initiative,12 both of which aim to drive up access. However, one area of uncertainty that will limit evaluation of progress is whether we have accurate baseline estimates of current levels of provision.

A recent national audit (National Audit of Schizophrenia 2 (NAS2))13 taking a random sample of 100 service users with a diagnosis of schizophrenia or schizoaffective disorder in the community in each of 64 participating mental health trusts or health boards in England and Wales concluded that there are significant gaps in the availability of CBTp and family interventions. For example, this manual case note audit found that trusts reported that on average 39% of service users had been offered CBTp and 19% of service users had taken up CBTp. However, there are grounds for thinking that the NAS2 audit might be inaccurate. The audit provided no definition or criteria for psychological therapy provision, asked whether a service user had ever been offered or received therapy and was based on reports by consultant psychiatrists. The audit report noted that responses probably encompassed a broader set of interventions than covered by the NICE recommendations. In contrast, a detailed manual survey of a random sample of 187 records reported a very much lower rate of offers (6.9%) and delivery (6.4%) of CBTp,14 employing expert reviews of reported therapy record content within a 1-year period in one large mental health trust.

Manually conducted audits of case notes and electronic records, such as NAS2, requiring individual responses of health professionals, are a labour intensive way of establishing these data, limit the number of cases that can reasonably be investigated and are too cumbersome to use routinely as practical tools to monitor service-level implementation. The UK’s national minimum data set15 does not currently require interventions to be recorded, although this may change. Although in the South London and Maudsley NHS Foundation Trust (SLaM), the site for this analysis, a structured drop-down record for psychological interventions in electronic records is available, there is concern that, as non-mandatory, it is incomplete and unreliable as a means to monitor activity.

In the current study we therefore sought to develop a method of using automated text-based searches of clinical records using natural language processing (NLP) techniques, supplemented by information from structured fields, to investigate how much this might enhance our ability to provide accurate routine automatic data reports and analysis, and thus provide an efficient method of monitoring the implementation of psychological therapy provision, overcoming the limitations of manual case note audits. The decision to focus initially on CBTp delivery instead of CBTp offer was a pragmatic one based on the perceived complexity and the resultant time required for each project.

Research question

The primary research question of the study was whether we could identify, with sufficiently high positive predictive value (PPV) and sensitivity, CBTp delivery using free text and structured methods in a large electronic service user record database. Applying the inclusion and exclusion criteria employed in published audits, we also examined how many and what proportion of service users with a case note diagnosis of schizophrenia or psychosis were recorded in the Clinical Record Interactive Search (CRIS) database as having received CBTp within their episode of care during defined time periods, combining NLP and structured records. We then compared these data with the results of two published audits. Finally, we examined whether demographic characteristics differentially predicted the receipt of CBTp.

Methods

Setting

SLaM is a large provider of mental healthcare, serving a catchment of around 1.3 million residents in four boroughs of south London (Croydon, Lambeth, Lewisham and Southwark). The majority of people with a diagnosis of a schizophrenia spectrum disorder are served by Early Intervention (EI) teams for the first 3 years from initial presentation and by Promoting Recovery (PR) teams subsequently.

Study design

Source of clinical data

The data for this study were obtained from the SLaM Biomedical Research Centre (BRC) Case Register and its Clinical Record Interactive Search (CRIS) application,16 which accesses anonymised data from the electronic health records (EHR) of individuals who have previously received or are currently receiving mental healthcare from SLaM within a robust, service user-led governance framework.17 At the time of writing, the register held over 265 000 service user records. We used CRIS to replicate the inclusion criteria of NAS2 and Haddock et al 14 as a means of comparison with these two published audits. The SLaM BRC Case Register contains structured fields, such as those coding demographic information, as well as unstructured (but de-identified) free text fields from case notes and correspondence where history, mental state examination, diagnostic formulation and management plan are primarily recorded. The CRIS data resource has been approved for secondary analysis by the Oxfordshire Research Ethics Committee,18 and a service user-led oversight committee considers all proposed research before access to the anonymised data is permitted. The EHR system was implemented in SLaM services in April 2006.

Overview of methodology

The initial step was to identify the delivery of CBT across all patient groups not distinguished by diagnostic groups or other characteristics and then subsequently, and as the specific focus of this study, to test the performance of the application for the delivery of CBT with a sample of service users with a diagnosis of psychosis (ie, ‘CBTp’).

Identification of CBT delivery using CRIS

NLP techniques19 were used to identify CBT delivery from free text fields within the BRC Case Register. The annotation strategy to identify whether a clinical record was an actual session of CBT was developed by three human annotators (CC, LE and MB), who also completed the initial feasibility work, which was signed off by an expert clinical lead (PAG). All annotations were double-annotated by two human annotators, and disagreements were resolved by consensus and by liaising with the clinical lead, if required. Inter-annotator agreement was evaluated following each completed batch of annotations, and the annotation strategy was updated according to issues raised and clarifications identified. Two annotators reviewed a training set of 300 instances in the development phase before annotating a gold standard data set of 200 instances where the term ‘CBT’ (or variants of it) occurred, recording whether the sentence that contained the term ‘CBT’ was an actual session of CBT rather than a historic reference to therapy, a referral for CBT, a decision not to offer CBT or another reference to CBT that was not a therapy session. When a positive instance of CBT delivery was identified, the following features were recorded: session number, stage of treatment, the recipient of treatment and whether the CBT was delivered individually or via a group. Once the human annotations were complete, the training set was reviewed by the NLP developer (DC) to establish the rules to determine whether the CBT text is an actual session or not. These rules were coded using General Architecture for Text Engineering (GATE) software.20 Within the development process, the impact of the rules applied to the training set was measured by the PPV and sensitivity. There is an inherent trade-off between the PPV and sensitivity (as one increases, the other tends to reduce), so a balance must be struck according to which is more important in the problem domain. We concluded that for this study an evenly weighted solution was preferred, with a slight preference for PPV. When PPV is prioritised, false positives are minimised, which increases confidence that an instance flagged as positive is a true session, at the expense of missing some true sessions (false negatives, FNs).

When all the rules had been developed based on the training set, the model was tested against an independent gold standard data set to evaluate how well it performed on unseen data, using PPV and sensitivity as the metrics of evaluation. Once the mean of the PPV and sensitivity on the gold standard was greater than 85%, the resulting application was applied against the CRIS database, and we further tested whether combining the NLP output with other relevant variables could be used to improve the performance of the application. These variables were the professional group of the clinician who entered the clinical note, whether the clinical note was classified as a psychological therapy in the structured drop-down menu and the position of the CBT reference in the clinical document.
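As a minimal sketch of the evaluation metrics described above (not the authors' code), PPV and sensitivity can be computed from counts of true positives, false positives and false negatives. The function names and the example counts are illustrative assumptions; only the 85% acceptance threshold on the mean of the two metrics comes from the text.

```python
def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: share of flagged instances that are true sessions."""
    return tp / (tp + fp)


def sensitivity(tp: int, fn: int) -> float:
    """Sensitivity: share of true sessions that the application flagged."""
    return tp / (tp + fn)


def meets_threshold(tp: int, fp: int, fn: int, threshold: float = 0.85) -> bool:
    """Acceptance check: mean of PPV and sensitivity must exceed the threshold."""
    mean_score = (ppv(tp, fp) + sensitivity(tp, fn)) / 2
    return mean_score > threshold


# Hypothetical counts for illustration only (not taken from the study).
tp, fp, fn = 90, 10, 15
print(f"PPV={ppv(tp, fp):.2f}, sensitivity={sensitivity(tp, fn):.2f}, "
      f"accepted={meets_threshold(tp, fp, fn)}")
```

Tightening the rules typically lowers `fp` (raising PPV) while raising `fn` (lowering sensitivity), which is the trade-off the text describes.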

Identification of CBTp delivery using CRIS

The output of the CBT application was generated in a sample of service users with a current diagnosis of psychosis to evaluate whether the PPV and sensitivity were of an acceptable standard or whether a specific CBTp application would need to be developed.

Within SLaM, psychological interventions can be recorded through a drop-down box within the clinical record, but as a non-mandatory field, the recording was considered as potentially poor. To assess the quality and use of this field, a senior clinician completed an assessment of 100 documents where CBT was indicated within the drop-down box, identifying whether the text associated with the document could be confirmed as a session of CBT.

Both free text and structured methods of identifying CBT were combined to create a single set of results, which was used for analysis purposes. As the focus of this paper is to identify the delivery of CBT for patients with a diagnosis of psychosis, the term ‘CBTp’ is used from this point forward.

Participants

We used the CRIS database to generate two large participant samples in this study: one replicating the inclusion criteria and the sampling time frame employed by the NAS2 audit and a second that replicated the Haddock et al 14 audit inclusion criteria, allowing a comparison with each publicly available study.

1. NAS2 audit inclusion criteria

All individuals ‘active’ (ie, receiving services rather than discharged from care) for at least 12 months on 1 July 2013 aged over 18 years receiving either an EI or a PR service, with a recorded diagnosis of schizophrenia (F20.0–F20.9) or schizoaffective disorder (F25.0–F25.9). The NAS2 audit requested whether CBTp was ‘taken up’, and we examined this in two ways: service users with at least one session of CBTp and service users with at least two sessions of CBTp prior to the census date.

2. Haddock et al audit inclusion criteria

All individuals active between 1 July 2012 and 1 July 2013 aged over 18 years receiving either an EI or a PR service, with a recorded diagnosis of schizophrenia spectrum diagnosis (schizophrenia, schizoaffective, schizotypal and delusional disorders (F20.0–F29.9)). CBTp delivery was defined as at least one session of CBTp within the 12-month audit period.

In addition to the original timeframe, we resampled the data using the Haddock et al 14 inclusion criteria for a separate 12-month timeframe in 2015 to check the robustness of the findings related to health inequalities.

If patients met the inclusion criteria across multiple teams within the same service type, to avoid double counting, the episodes were merged by selecting the earliest episode start date and latest end date for those episodes and presented as a single episode of care.
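The episode-merging rule above can be sketched as follows. This is an illustration under stated assumptions, not the study's extraction code: the tuple layout, the `merge_episodes` name and the example identifiers are hypothetical, and every episode is assumed to have a recorded end date.

```python
from collections import defaultdict
from datetime import date


def merge_episodes(episodes):
    """Merge episodes per (patient, service type) into a single span.

    episodes: iterable of (patient_id, service_type, start, end) tuples.
    Returns {(patient_id, service_type): (earliest_start, latest_end)}.
    """
    grouped = defaultdict(list)
    for patient_id, service_type, start, end in episodes:
        grouped[(patient_id, service_type)].append((start, end))

    merged = {}
    for key, spans in grouped.items():
        earliest_start = min(start for start, _ in spans)
        latest_end = max(end for _, end in spans)
        merged[key] = (earliest_start, latest_end)
    return merged


# Hypothetical patient seen by two PR teams and, earlier, one EI team.
example = [
    ("p1", "PR", date(2012, 1, 5), date(2012, 6, 1)),
    ("p1", "PR", date(2012, 5, 1), date(2013, 3, 1)),
    ("p1", "EI", date(2010, 1, 1), date(2011, 1, 1)),
]
merged_example = merge_episodes(example)
```

Episodes are merged only within a service type, so the same person can still contribute one EI episode and one PR episode.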

Demographic and service variables

The following variables were extracted for analyses: age, diagnosis, ethnicity, gender, marital status and service type. All data obtained were the most recent prior to the census date. Ethnicity was recorded according to categories defined by the UK Office for National Statistics and categorised for analysis purposes into three groups: black (comprising black African, black Caribbean and any other black background), other (comprising white and black African, white and Asian, white and black Caribbean, any other mixed background, Indian, Pakistani, Bangladeshi, any other Asian background, Chinese and any other ethnic group) and white (comprising white British, white Irish and any other white background). Marital status was aggregated into two groups: single/divorced (including dissolved civil partnerships and widowed) and married/cohabiting/civil partnerships. Diagnosis is routinely recorded in clinical services using the International Classification of Diseases, 10th Revision (ICD-10) classification system in drop-down fields and was limited to schizophrenia spectrum (F20–F29), with an additional subgrouping applied in line with the NAS2 diagnostic categories of schizophrenia (F20.0–F20.9), schizoaffective disorder (F25.0–F25.9) and ‘other schizophrenia spectrum’ (F21, F22.0–F22.9, F23.0–F23.9, F24, F28 and F29). We used the largest sample (using the Haddock et al 14 inclusion criteria) to investigate the delivery of CBTp across the following categories: age group, diagnosis, gender, ethnic group, marital status and whether the patient was in contact with either the EI or PR service.
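The groupings described above can be expressed as simple lookups. The sketch below is illustrative only: the function and constant names are assumptions, the ICD-10 subgroups follow the code ranges given in the text, and the single-letter ethnicity codes are the NHS codes shown alongside each category in table 3.

```python
def diagnosis_group(icd10_code: str) -> str:
    """Map an ICD-10 code to the diagnostic subgroup used in the analysis."""
    prefix = icd10_code.strip().upper()[:3]
    if prefix == "F20":
        return "schizophrenia"
    if prefix == "F25":
        return "schizoaffective disorder"
    if prefix in {"F21", "F22", "F23", "F24", "F28", "F29"}:
        return "other schizophrenia spectrum"
    return "excluded"  # outside the schizophrenia spectrum (F20-F29)


# NHS ethnic category letter -> three-level analysis group.
ETHNIC_GROUP = {
    **dict.fromkeys("NMP", "black"),         # black African, Caribbean, other black
    **dict.fromkeys("DEFGHJKLRS", "other"),  # mixed, Asian, Chinese, any other group
    **dict.fromkeys("ABC", "white"),         # white British, Irish, other white
}

print(diagnosis_group("F25.1"), ETHNIC_GROUP["M"])
```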

Statistical analysis

Descriptive statistics for demographic variables are reported as means and SD for continuous variables (age at referral) and as frequencies and percentages for all other variables. A binary logistic regression model was used to examine the differences for proportions of cases who received CBTp and those who did not. We initially undertook an unadjusted analysis by age group, diagnosis, ethnicity, gender, marital status and service type to establish whether the receipt of CBTp differed by these demographic factors. We subsequently undertook a multivariable analysis, adjusting for potential confounders by including covariates (age, diagnosis, ethnicity, gender, marital status and service type) in the model except the variable of interest. Due to the relationship between age and service type (EI services are by definition for a younger patient group), we included the partially adjusted model that excludes service as a predictor to check whether the increased likelihood of younger people receiving CBTp is still present.

Results

PPV and sensitivity of identification of CBT in case records

The developed NLP CBT delivery application was evaluated against the independent gold standard, resulting in PPV and sensitivity for CBT annotations of 85% and 86%, respectively. Following the development of the CBT NLP application, we concluded the PPV would be improved with a tolerable reduction in sensitivity if we applied the following postprocessing rule: to exclude CBT sentences that commenced after the first 200 characters of the clinical document. This postprocessing rule resulted in an improved overall performance of the application, with an increase in PPV of 12 percentage points to 97% and a reduction in sensitivity of 4 percentage points to 82%. The evaluation of the structured CBT entry alone resulted in a PPV of 89%. We then combined both methods and estimated the sensitivity of the combined method by reviewing the FNs from the NLP app and examining whether they were identified by the structured method: of the 12 FNs from the NLP app, 75% (9/12) were correctly identified by the structured data, with the effect of increasing the sensitivity from 82% (56/68) for the NLP app alone to 96% (65/68) for the combined method. By combining methods, we therefore achieved a PPV of 97% and a sensitivity of 96%. The NLP app identified 26% additional service users who received CBT that was not recorded by the drop-down box.
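The postprocessing rule and the combination of the two methods might be sketched as below. This is not the GATE application itself: the function names are hypothetical, the character offset is assumed to be zero-based (so offsets 0 to 199 fall within the first 200 characters), and treating the combined result as a logical OR of the NLP and structured indicators is an assumption consistent with structured data recovering NLP false negatives.

```python
MAX_START_OFFSET = 200  # sentences starting after the first 200 characters are dropped


def nlp_positive(sentence_start_offsets) -> bool:
    """True if any NLP-flagged CBT sentence starts within the first 200 characters."""
    return any(offset < MAX_START_OFFSET for offset in sentence_start_offsets)


def cbt_session(sentence_start_offsets, structured_cbt_flag: bool) -> bool:
    """Combined method: positive if either the NLP rule or the drop-down field fires."""
    return nlp_positive(sentence_start_offsets) or structured_cbt_flag


# Hypothetical documents: (offsets of NLP-flagged sentences, drop-down says CBT?)
print(cbt_session([12], False))    # flagged early in the note by NLP
print(cbt_session([350], False))   # only a late mention, so excluded by the rule
print(cbt_session([350], True))    # recovered through the structured field
```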

PPV and sensitivity of identification of CBTp in case records

We further evaluated the developed NLP CBT delivery application against a sample of service users with a diagnosis of psychosis. The performance against the independent gold standard resulted in PPV and sensitivity for CBTp annotations of 81% and 85%, respectively. Applying the above-mentioned postprocessing rule (to exclude CBTp sentences that commenced after the first 200 characters of the clinical document) resulted in an increase in PPV of 14 percentage points to 95% and a reduction in sensitivity of 7 percentage points to 78%. The evaluation of the structured CBT entry alone resulted in a PPV of 89%. Having combined both methods, of the 10 FNs from the NLP app, 80% (8/10) were correctly identified by the structured data, with the effect of increasing the sensitivity from 78% (36/46) for the NLP app alone to 96% (44/46) for the combined method. By combining methods, we therefore achieved a PPV of 95% and a sensitivity of 96%. The NLP app identified 21% additional service users who received CBTp that was not recorded by the drop-down box.
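The sensitivity figures reported for the NLP app alone and for the combined method follow directly from the counts given in the text, as this short worked example shows. The function name is an assumption; the counts (68 and 46 true sessions, 12 and 10 NLP false negatives, 9 and 8 recovered by structured data) are those reported above.

```python
def sensitivity_before_after(total_positives: int, nlp_false_negatives: int,
                             recovered_by_structured: int):
    """Return (NLP-only sensitivity, combined sensitivity) as proportions."""
    nlp_true_positives = total_positives - nlp_false_negatives
    nlp_only = nlp_true_positives / total_positives
    combined = (nlp_true_positives + recovered_by_structured) / total_positives
    return nlp_only, combined


cbt = sensitivity_before_after(68, 12, 9)    # all-diagnosis CBT evaluation
cbtp = sensitivity_before_after(46, 10, 8)   # psychosis (CBTp) evaluation

for label, (nlp_only, combined) in (("CBT", cbt), ("CBTp", cbtp)):
    print(f"{label}: NLP only {nlp_only:.1%}, combined {combined:.1%}")
```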

Delivery of CBTp using sample based on NAS2 inclusion criteria

Two thousand three hundred and eight service users were identified in the data set as fulfilling the NAS2 inclusion criteria. Service users had a mean age of 40.7 at referral (SD 12.1; range 18–83), 60.3% (1392/2308) were male, 51.9% (1197/2308) were of a black ethnic origin and 34.6% (799/2308) were from a white ethnic origin, 90.7% (2094/2308) were single/divorced, 78.2% (1806/2308) had a diagnosis of schizophrenia and 21.8% (502/2308) had a diagnosis of schizoaffective disorder.

The SLaM return for the actual NAS2 audit was that 20% of the random sample of n=100 were identified as having ever received CBTp. In contrast, using the current method, 34.6% (799/2308) were identified as having at least one session and 26.4% (610/2308) were identified as having at least two sessions of CBTp. A breakdown of CBTp delivery by diagnostic group can be viewed in table 1.

Table 1.

CBTp delivery by diagnostic groups using NAS2 audit criteria

| Diagnostic group | n | % episodes with at least one CBTp session | % episodes with at least two CBTp sessions |
|---|---|---|---|
| Schizoaffective disorder (F25.0–F25.9) | 502 | 42.4 | 32.9 |
| Schizophrenia (F20.0–F20.9) | 1806 | 32.4 | 24.6 |
| Total | 2308 | 34.6 | 26.4 |

CBTp, cognitive behavioural therapy for psychosis; NAS2, National Audit of Schizophrenia 2.

We also explored the level of CBTp provision by year, which can be viewed in figure 1.

Figure 1.

The NAS2 audit requested whether CBTp was ‘taken up’, and we examined this in two ways, split by year prior to the census date: service users with at least one session of CBTp (blue line) and service users with at least two sessions of CBTp prior to the census date (red line). The actual return for this trust was also added as a means of comparison (green line). CBTp, cognitive behavioural therapy for psychosis; NAS2, National Audit of Schizophrenia 2; SLaM, South London and Maudsley NHS Foundation Trust.

Delivery of CBTp using sample based on Haddock et al inclusion criteria

Two thousand five hundred and seventy-nine service users fulfilled the inclusion criteria within the same 12-month audit period. Service users had a mean age of 40 at referral (SD 12.4; range 18–83), 60.3% (1555/2579) were male, 50.9% (1314/2579) were of a black ethnic origin and 35.2% (908/2579) were from a white ethnic origin, 90.5% (2339/2579) were single/divorced, 70.0% (1806/2579) had a diagnosis of schizophrenia and 19.5% (502/2579) had a diagnosis of schizoaffective disorder. We found that 12.8% (330/2579) received CBTp interventions within the same 12-month audit period, whereas Haddock et al 14 reported 6.4% (12/187) in their sample.

We also examined a more recent time period: 2597 service users fulfilled the inclusion criteria within a 12-month audit period within 2015. Service users had a mean age of 39.6 at referral (SD 12.7; range 18–85), 60.4% (1568/2597) were male, 52.3% (1357/2597) were of a black ethnic origin and 32.1% (883/2579) were from a white ethnic origin, 90.5% (2351/2597) were single/divorced, 63.4% (1646/2597) had a diagnosis of schizophrenia and 20.0% (519/2597) had a diagnosis of schizoaffective disorder. We found that 14.8% (385/2597) received CBTp interventions within the 12-month audit period.

We additionally investigated whether participants received CBTp ‘year on year’ by checking whether those who received CBTp in the 2015 audit period had also received it in the 2013 audit period. This check found that 13.8% (53/385) of the participants who received CBTp in 2015 had also received CBTp in 2013.

Demographic predictors of at least one session of CBTp

The demographic characteristics of service users who received CBTp were compared with those who did not using our largest sample of n=2579, which employed the Haddock et al 14 inclusion criteria. The receipt of CBTp was more common in younger service users, in the white compared with the black group, in those in the schizoaffective disorder group compared with those in the schizophrenia group, and in those receiving care from the EI for psychosis teams rather than the PR teams. Table 2 provides a summary of the unadjusted and adjusted multivariate logistic regression for receipt of CBTp by age group, diagnostic group, ethnic group, gender, marital status and service type.

Table 2.

Unadjusted and adjusted logistic regressions for predictors of at least one session of CBTp

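| Group | n | Unadjusted OR | Unadjusted CI | Unadjusted significance | Partially adjusted* OR | Partially adjusted* CI | Partially adjusted* significance | Fully adjusted† OR | Fully adjusted† CI | Fully adjusted† significance |
|---|---|---|---|---|---|---|---|---|---|---|
| Age | | | | | | | | | | |
| Under 41 | 1346 | 1.57 | 1.24 to 1.99 | <0.001 | 1.57 | 1.23 to 2.01 | <0.001 | 1.32 | 1.01 to 1.72 | 0.043 |
| 41 and over | 1233 | Reference category | | | | | | | | |
| Ethnicity | | | | | | | | | | |
| Black | 1314 | Reference category | | | | | | | | |
| White | 908 | 1.34 | 1.04 to 1.72 | 0.024 | 1.40 | 1.08 to 1.80 | 0.011 | 1.43 | 1.10 to 1.85 | 0.007 |
| Other | 357 | 1.35 | 0.96 to 1.90 | 0.081 | 1.33 | 0.94 to 1.88 | 0.106 | 1.31 | 0.93 to 1.86 | 0.122 |
| Diagnosis | | | | | | | | | | |
| Other schizophrenia spectrum | 271 | 2.26 | 1.63 to 3.14 | <0.001 | 2.02 | 1.45 to 2.82 | <0.001 | 1.52 | 1.05 to 2.20 | 0.025 |
| Schizoaffective disorder | 502 | 1.53 | 1.15 to 2.03 | 0.003 | 1.47 | 1.10 to 1.97 | 0.009 | 1.48 | 1.11 to 1.98 | 0.008 |
| Schizophrenia | 1806 | Reference category | | | | | | | | |
| Gender | | | | | | | | | | |
| Male | 1555 | Reference category | | | | | | | | |
| Female | 1024 | 1.15 | 0.91 to 1.46 | 0.230 | 1.19 | 0.94 to 1.52 | 0.155 | 1.20 | 0.94 to 1.54 | 0.139 |
| Marital status | | | | | | | | | | |
| Single/divorced | 2339 | Reference category | | | | | | | | |
| Married/cohabiting | 240 | 0.93 | 0.62 to 1.40 | 0.729 | 0.90 | 0.60 to 1.37 | 0.623 | 0.95 | 0.63 to 1.44 | 0.809 |
| Service type | | | | | | | | | | |
| Early intervention | 327 | 2.49 | 1.87 to 3.31 | <0.001 | N/A | N/A | N/A | 1.98 | 1.40 to 2.81 | <0.001 |
| Promoting recovery | 2252 | Reference category | | | | | | | | |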

*Within the partially adjusted model the results were adjusted for age, ethnic group, diagnostic group, gender and marital status.

†Within the fully adjusted model, the results were adjusted for age, ethnic group, diagnostic group, gender, marital status and service.

CBTp, cognitive behavioural therapy for psychosis.

We additionally explored the number and percentage of participants who received CBTp by the standard NHS 16 ethnic groups to further detail the ethnic composition and CBTp provision, which can be viewed in table 3.

Table 3.

Participants by ethnic origin and CBTp delivery using largest sample

| Analysis group | NHS ethnic groups | Participants | Participants that received CBTp |
|---|---|---|---|
| Black | African (N) | 16.8% (432/2579) | 9.7% (42/432) |
| Black | Caribbean (M) | 14.9% (384/2579) | 9.9% (38/384) |
| Black | Any other black background (P) | 19.3% (498/2579) | 13.5% (67/498) |
| Black | Total black | 50.9% (1314/2579) | 11.2% (147/1314) |
| Other | White and black Caribbean (D) | 1.4% (37/2579) | 18.9% (7/37) |
| Other | White and black African (E) | 0.5% (12/2579) | 33.3% (4/12) |
| Other | White and Asian (F) | 0.2% (6/2579) | 16.7% (1/6) |
| Other | Any other mixed background (G) | 0.7% (19/2579) | 10.5% (2/19) |
| Other | Indian (H) | 1.4% (36/2579) | 11.1% (4/36) |
| Other | Pakistani (J) | 0.8% (21/2579) | 9.5% (2/21) |
| Other | Bangladeshi (K) | 0.5% (12/2579) | 8.3% (1/12) |
| Other | Any other Asian background (L) | 2.6% (67/2579) | 16.4% (11/67) |
| Other | Chinese (R) | 0.7% (18/2579) | 0.0% (0/18) |
| Other | Any other ethnic group (S) | 5.0% (129/2579) | 15.5% (20/129) |
| Other | Total other | 13.8% (357/2579) | 14.6% (52/357) |
| White | British (A) | 27.5% (710/2579) | 15.4% (109/710) |
| White | Irish (B) | 1.6% (41/2579) | 14.6% (6/41) |
| White | Any other white background (C) | 6.1% (157/2579) | 10.2% (16/157) |
| White | Total white | 35.2% (908/2579) | 14.4% (131/908) |
| Total | | 2579 | 330 |

Age, ethnicity, gender and marital status had a 100% completeness rate.

CBTp, cognitive behavioural therapy for psychosis.
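As a worked illustration of how an unadjusted odds ratio relates to the group totals in table 3, the sketch below computes the odds ratio and a Wald (log-scale) 95% CI from a 2×2 table. The function name and the choice of the Wald interval are assumptions rather than a description of the authors' software, although with the reported totals (131/908 white, 52/357 other and 147/1314 black participants receiving CBTp) the results appear to agree with the unadjusted ethnicity rows of table 2.

```python
import math


def odds_ratio_ci(group_yes: int, group_total: int,
                  ref_yes: int, ref_total: int, z: float = 1.96):
    """Unadjusted odds ratio versus a reference group, with a Wald 95% CI."""
    a, b = group_yes, group_total - group_yes   # comparison group: CBTp yes / no
    c, d = ref_yes, ref_total - ref_yes         # reference group: CBTp yes / no
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from the group totals in table 3; black is the reference category.
white_vs_black = odds_ratio_ci(131, 908, 147, 1314)
other_vs_black = odds_ratio_ci(52, 357, 147, 1314)

for label, (or_, lo, hi) in (("White", white_vs_black), ("Other", other_vs_black)):
    print(f"{label} vs black: OR {or_:.2f} ({lo:.2f} to {hi:.2f})")
```

The adjusted estimates in table 2 cannot be reproduced this way, because they depend on the individual-level covariates included in the logistic regression model.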

Discussion

To our knowledge, this is the first attempt at using NLP techniques on free text entries, supplementing structured fields, in order to identify the delivery of one type of psychological therapy in a large health record data set. This was broadly successful, in that we achieved a high level of PPV (95%) and sensitivity (96%) that is consistent with published CRIS NLP applications, which have measured other clinical activities or characteristics such as prescribed medication,21 Mini-Mental State Examination score,22 diagnosis23 and service user characteristics, such as smoking status24 and whether the service user lived alone.16 The methods presented here are therefore potentially effective and efficient for examining the delivery of CBTp on a large scale where manual audits are inevitably limited in sample size for logistical reasons.

NLP applications are increasingly being used to extract information from medical records for a wide range of health-related areas including but not limited to the detection of adverse drug events, falls, nosocomial infections,25–27 obesity status and obesity-related diseases28 29 and detecting patterns in patient care and patient treatment habits30 31 that highlight the potential for NLP to supplement other data collection methods. NLP applications for mental health services are less prominent, but there have been recent studies in the USA that used NLP to determine depression outcome, adverse drug reactions and characterisation of diagnostic profiles.32–34

When using this method, we identified higher levels of CBTp delivery than previously reported in the SLaM contribution to the NAS2 audit using the same sampling criteria but a very much larger sample. Note the published audits using NAS2 and Haddock et al 14 inclusion criteria differ on timeframe, diagnosis and interpretation of CBTp delivery. We also found higher levels of CBTp delivery (about double) than that reported by Haddock et al 14 in the same time period although in a different service setting. This suggests that manual audits may result in under-reporting, presumably because of the limitations of clinician knowledge or readily accessible recording in health records, and our development is encouraging because it may result in both better quality output and much less time-intensive data collection. It is notable that the NAS2 audit enquired whether CBTp had ever been provided: the methods described here can search by year, which is clinically more useful; the data also might suggest that clinicians in responding to such an audit are typically considering perhaps the previous 2 years. Furthermore, when we conducted the sampling twice for 2013 and 2015, we found some evidence of a modest increase in provision—from 12.8% to 14.8%. However, our results also continue to show that CBTp delivery falls very far short of the NICE recommendations of universal access. It is a matter of additional importance and concern that there do appear to be demographic predictors, suggesting access is inequitable in terms of age, diagnosis and ethnicity. Improving access to psychological therapies can be enhanced by examining data such as these and targeting provision towards underserved groups. The value of informatics to monitor the delivery of psychological therapy provision and the advantages described here are important for health systems internationally.

Strengths

Key strengths of this study were the large sample and the innovative approaches adopted to identify CBTp delivery within the clinical record. The ability to replicate the inclusion criteria of two previous audits allowed us to contextualise the findings, and the large data set allowed us to access data by year and to examine clinical and demographic factors influencing delivery. A large number of other variables in the EHR are also potentially available for examination without the need to repeat data extraction, as would be required in a manual audit. These might include service user characteristics, service delivery settings, therapist characteristics and aspects of therapy provision such as assessments, number of sessions, discontinuation and dropout, and clinical outcomes. The large sample size generated by this approach has enabled us to identify previously unknown inequalities in the provision of CBTp within our own trust. We have taken steps to address these, such as raising them with the senior team and providing clinical teams with regular monitoring reports split by demographic variables.

Limitations

A limitation of this study was that it took place in a single (although large) service provider. However, our results identified themes that are consistent with other findings on CBTp provision, which could indicate generalisability but would warrant further investigation. The sample presented here reflects local service provision, although SLaM services may benefit from some research-funded clinical activity, the extent of which may differ from that in other services within the UK and internationally. Nevertheless, Australia and New Zealand,2 Canada,3 Spain,4 the UK and the USA5 all have policies that recommend CBTp provision, and monitoring the implementation of these policies is therefore of international importance. If other services were interested in adopting the method described here to identify CBTp, we would recommend that a full de novo evaluation of the application’s performance be undertaken, as it cannot be assumed that performance on one cohort would be directly generalisable to others.16

A further limitation concerns the use of routine clinical data rather than de novo data collection. Clearly, the information available is limited by what is recorded in the source records. For fully electronic health records, such as those now used routinely in UK mental health services, there are no other information repositories that provide administrative or medico-legal back-up, and there are therefore incentives for clinicians to record details of interventions in order to provide evidence that these did actually take place. We believe that we were able to identify relevant CBT treatment receipt through the search approach used, which queried both structured and text fields; indeed, we demonstrated that additional querying of text fields identified significantly larger numbers of episodes. However, we are not at this stage able to automate the identification of more subtle and nuanced descriptions of the treatment and its context. That is, we could not identify the ‘offer’ rather than receipt of CBT, because of the wide range of wording used to record this, and we did not attempt to quantify the quality or nature of treatment received. Future advances in NLP may allow the automated ascertainment of these constructs, but it is also possible that de novo data collection and/or manual case note evaluation will remain the only solutions, although these are limited in the samples that can be generated.
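The idea of querying both structured and text fields can be illustrated with a minimal sketch. It is not the application evaluated in this study: the function name, the record layout (a patient identifier paired with a session date) and the example records are all hypothetical, and the sketch assumes that sessions from the two sources can be matched exactly on those two values.

```python
def combine_sources(structured_sessions, text_sessions):
    """Merge session records found in structured fields and in free text.

    Each record is a hypothetical (patient_id, session_date) pair.
    Returns the combined set and the records found only in free text.
    """
    structured = set(structured_sessions)
    from_text = set(text_sessions)
    combined = structured | from_text      # sessions found by either source
    text_only = from_text - structured     # sessions that only the text query found
    return combined, text_only


# Invented example records, not study data.
structured = [("A", "2015-01-05"), ("B", "2015-02-10")]
from_text = [("A", "2015-01-05"), ("A", "2015-01-19"), ("C", "2015-03-02")]

combined, text_only = combine_sources(structured, from_text)
print(len(combined), "sessions in total;", len(text_only), "found only in free text")
```

Using a set union means that a session recorded in both places is counted once, while the set difference isolates the additional yield from the text query.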

Clearly, an alternative approach would be to impose data collection on clinicians by requiring them to complete structured assessments that delineate the process of offering, commencing and monitoring treatment. This would obviate the need for NLP approaches. However, it depends on clinicians’ willingness to complete these instruments and on the approach sustaining itself over time. This is potentially problematic if clinicians also have to complete text fields to meet what may be seen as a more salient need: communicating information on sessions for their own and colleagues’ future reference as well as for medico-legal purposes. It therefore seems likely that medical records data will remain a mixed economy of structured and text-derived information and that audits will incorporate a mixture of large-scale, multi-site ‘big data’ analyses and targeted in-depth case-note review.

Next steps

The methods shown here allow the proactive analysis of large EHR-derived data sets. In the future, a refinement could be to combine data from NLP and structured fields to identify a course of CBTp treatment. An initial definition of a course of treatment would require at least two CBTp sessions with less than a 3-month break between sessions; other NLP features, such as the CBTp session number and stage of therapy, could additionally be used to enhance the creation of such a construct. We are also now developing an application that identifies the delivery of other therapy types, and applications that more precisely characterise the pathway from CBTp being considered, through its offer, to actual receipt.
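As an illustration only, the initial course-of-treatment definition (at least two sessions, with less than a 3-month break between consecutive sessions) might be sketched as follows. The function name and the choice of 91 days to stand in for "3 months" are assumptions made for this example, and the sketch uses session dates alone; it does not use session number or stage of therapy.

```python
from datetime import date, timedelta

# Assumption: "less than a 3-month break" is approximated as fewer than 91 days.
MAX_GAP = timedelta(days=91)
MIN_SESSIONS = 2


def find_courses(session_dates, max_gap=MAX_GAP, min_sessions=MIN_SESSIONS):
    """Group one service user's CBTp session dates into courses of treatment.

    Consecutive sessions stay in the same course while the gap between them
    is shorter than max_gap; groups with fewer than min_sessions are dropped.
    """
    courses = []
    current = []
    for d in sorted(set(session_dates)):
        if current and d - current[-1] >= max_gap:
            # The break is too long: close the current group.
            if len(current) >= min_sessions:
                courses.append(current)
            current = []
        current.append(d)
    if len(current) >= min_sessions:
        courses.append(current)
    return courses


# Invented dates for a single hypothetical service user.
sessions = [
    date(2015, 1, 5), date(2015, 1, 19), date(2015, 2, 2),  # three sessions close together
    date(2015, 9, 1),                                        # isolated session
    date(2016, 3, 1), date(2016, 3, 15),                     # two sessions close together
]
for course in find_courses(sessions):
    print(course[0], "to", course[-1], "-", len(course), "sessions")
```

Here the isolated session is not counted as a course because it has no neighbouring session within the permitted gap. Whether such sessions should be discarded or reviewed manually is a design choice that the definition in the text leaves open.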

Supplementary Material

Reviewer comments
Author's manuscript

Footnotes

Contributors: PAG and RS conceived the study and manuscript. CC, LE, MB, DC, AK, PAG and RS provided substantial contributions to the design of the work. Analyses were carried out by CC, DC, LE and MB. CC and PAG initially contributed significant text to the study manuscript. CC, PAG, TJC and RS provided the final approval of the version to be published.

Funding: This work was supported by the Clinical Records Interactive Search (CRIS) system, funded and developed by the National Institute for Health Research (NIHR) Mental Health Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London, and by a joint infrastructure grant from Guy’s and St Thomas’ Charity and the Maudsley Charity (grant number BRC-2011-10035). PAG and TJC acknowledge BRC support. CC and LE are funded by SLaM. All other authors receive salary support from the National Institute for Health Research (NIHR) Mental Health Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The above funding bodies had no role in the study design; in the collection, analysis and interpretation of the data; in the writing of the report; and in the decision to submit the paper for publication.

Competing interests: None declared.

Provenance and peer review: Not commissioned; externally peer reviewed.

Data sharing statement: No additional unpublished data are available.

References

  • 1. van der Gaag M. The efficacy of CBT for severe mental illness and the challenge of dissemination in routine care. World Psychiatry 2014;13:257–8. doi:10.1002/wps.20162
  • 2. Galletly C, Castle D, Dark F, et al. Royal Australian and New Zealand College of Psychiatrists clinical practice guidelines for the management of schizophrenia and related disorders. Aust N Z J Psychiatry 2016;50:410–72. doi:10.1177/0004867416641195
  • 3. Canadian Psychiatric Association. Clinical practice guidelines: treatment of schizophrenia. 2005. https://ww1.cpa-apc.org/Publications/Clinical_Guidelines/schizophrenia/november2005/cjp-cpg-suppl1-05_full_spread.pdf (accessed 14 Mar 2017).
  • 4. Working Group of the Clinical Practice Guideline for Schizophrenia and Incipient Psychotic Disorder. Mental Health Forum, coordination. Clinical Practice Guideline for Schizophrenia and Incipient psychotic disorder. Madrid: Quality Plan for the National Health System of the Ministry of Health and Consumer Affairs. Agency for Health Technology Assessment and Research, 2009. Clinical Practice; Guideline: CAHTA. Number 2006/05-2 http://www.guiasalud.es/GPC/GPC_495_Schizophrenia_compl_en.pdf (accessed 14 Mar 2017).
  • 5. Kreyenbuhl J, Buchanan RW, Dickerson FB, et al. The Schizophrenia Patient Outcomes Research Team (PORT): updated treatment recommendations 2009. Schizophr Bull 2010;36:94–103. doi:10.1093/schbul/sbp130
  • 6. National institute clinical excellence. Psychosis and schizophrenia in adults: prevention and management. UK: National Collaborating Centre for Mental Health, 2014.
  • 7. Rethink. Your treatment, your choice survey: final report. UK: Rethink, 2008.
  • 8. Prytys M, Garety PA, Jolley S, et al. Implementing the NICE guideline for schizophrenia recommendations for psychological therapies: a qualitative analysis of the attitudes of CMHT staff. Clin Psychol Psychother 2011;18:48–59. doi:10.1002/cpp.691
  • 9. Krupnik Y, Pilling S, Killaspy H, et al. A study of family contact with clients and staff of community mental health teams. Psychiatr Bull 2005;29:174–6. doi:10.1192/pb.29.5.174
  • 10. Schizophrenia Commission. An abandoned illness. UK: Schizophrenia Commission, 2012.
  • 11. Department of Health. Talking therapies: a four-year plan of action. UK: COI, 2011.
  • 12. Department of Health. Achieving better access to Mental Health Services by 2020. UK: COI, 2014.
  • 13. Royal College of Psychiatrists. Report of the second round of the National Audit of Schizophrenia (NAS2). UK: Royal College of Psychiatrists, 2014.
  • 14. Haddock G, Eisner E, Boone C, et al. An investigation of the implementation of NICE-recommended CBT interventions for people with schizophrenia. J Ment Health 2014;23:162–5. doi:10.3109/09638237.2013.869571
  • 15. Health and Social Care Information Centre. Mental Health and Learning Disabilities Data Set. UK: Health and Social Care Information Centre, 2015.
  • 16. Perera G, Broadbent M, Callard F, et al. Cohort profile of the South London and Maudsley NHS Foundation Trust Biomedical Research Centre (SLaM BRC) case register: current status and recent enhancement of an electronic mental health record-derived data resource. BMJ Open 2016;6:e008721. doi:10.1136/bmjopen-2015-008721
  • 17. Fernandes AC, Cloete D, Broadbent MT, et al. Development and evaluation of a de-identification procedure for a case register sourced from mental health electronic records. BMC Med Inform Decis Mak 2013;13:71. doi:10.1186/1472-6947-13-71
  • 18. Stewart R, Soremekun M, Perera G, et al. The South London and Maudsley NHS Foundation Trust Biomedical Research Centre (SLAM BRC) case register: development and descriptive data. BMC Psychiatry 2009;9:51. doi:10.1186/1471-244X-9-51
  • 19. Spyns P. Natural language processing in medicine: an overview. Methods Inf Med 1996;35:285–301.
  • 20. Cunningham H. GATE, a general architecture for text engineering. Comput Hum 2002;36:223–54. doi:10.1023/A:1014348124664
  • 21. Hayes RD, Downs J, Chang CK, et al. The effect of clozapine on premature mortality: an assessment of clinical monitoring and other potential confounders. Schizophr Bull 2015;41:644–55. doi:10.1093/schbul/sbu120
  • 22. Su YP, Chang CK, Hayes RD, et al. Mini-mental state examination as a predictor of mortality among older people referred to secondary mental healthcare. PLoS One 2014;9:e105312. doi:10.1371/journal.pone.0105312
  • 23. Sultana J, Chang CK, Hayes RD, et al. Associations between risk of mortality and atypical antipsychotic use in vascular dementia: a clinical cohort study. Int J Geriatr Psychiatry 2014;29:1249–54. doi:10.1002/gps.4101
  • 24. Wu CY, Chang CK, Robson D, et al. Evaluation of smoking status identification using electronic health records and open-text information in a large mental health case register. PLoS One 2013;8:e74262. doi:10.1371/journal.pone.0074262
  • 25. Chazard E, Ficheur G, Merlin B, et al. PSIP consortium, Beuscart R. Detection of adverse drug events detection: data aggregation and data mining. Stud Health Technol Inform 2009;148:75–84.
  • 26. Bates DW, Evans RS, Murff H, et al. Detecting adverse events using information technology. J Am Med Inform Assoc 2003;10:115–28. doi:10.1197/jamia.M1074
  • 27. Mendonça EA, Haas J, Shagina L, et al. Extracting information on pneumonia in infants using natural language processing of radiology reports. J Biomed Inform 2005;38:314–21. doi:10.1016/j.jbi.2005.02.003
  • 28. Yang H, Spasic I, Keane JA, et al. A text mining approach to the prediction of disease status from clinical discharge summaries. J Am Med Inform Assoc 2009;16:596–600. doi:10.1197/jamia.M3096
  • 29. Guillen R. Identifying obesity and co-morbidities from medical records. Proceedings of the I2b2 Workshop on Challenges in Natural Language Processing for Clinical Data 2009; 868.
  • 30. Pakhomov SV, Hanson PL, Bjornsen SS, et al. Automatic classification of foot examination findings using clinical notes and machine learning. J Am Med Inform Assoc 2008;15:198–202. doi:10.1197/jamia.M2585
  • 31. Rao RB, Krishnan S, Niculescu RS. Data mining for improved cardiac care. ACM SIGKDD Explorations Newsletter 2006;8:3–10. doi:10.1145/1147234.1147236
  • 32. Perlis RH, Iosifescu DV, Castro VM, et al. Using electronic medical records to enable large-scale studies in psychiatry: treatment resistant depression as a model. Psychol Med 2012;42:41–50. doi:10.1017/S0033291711000997
  • 33. Sohn S, Kocher JP, Chute CG, et al. Drug side effect extraction from clinical narratives of psychiatry and psychology patients. J Am Med Inform Assoc 2011;18:144–149. doi:10.1136/amiajnl-2011-000351
  • 34. Roque FS, Jensen PB, Schmock H, et al. Using electronic patient records to discover disease correlations and stratify patient cohorts. PLoS Comput Biol 2011;7:e1002141. doi:10.1371/journal.pcbi.1002141

Articles from BMJ Open are provided here courtesy of BMJ Publishing Group
