Abstract
Postoperative pain scores are widely monitored and collected in the electronic health record (EHR), yet current methods fail to fully leverage the data with fast implementation. A robust linear regression was fitted to describe the association between the log-scaled pain score and time from discharge after total knee replacement. The estimated trajectories were used for a subsequent K-medians cluster analysis to categorize the longitudinal pain score patterns into distinct clusters. For each cluster, a mixed effects regression model estimated the association between pain score and time to discharge, adjusting for confounding. The fitted regression model generated the pain trajectory pattern for the given cluster. Finally, regression analyses examined the association between pain trajectories and patient outcomes. A total of 3442 surgeries were identified with a median of 22 pain scores at an academic hospital, 2009–2016. Four pain trajectory patterns were identified and one was associated with higher rates of follow-up revisits and surgery-related pain after discharge. In conclusion, we describe a novel approach with fast implementation to model patients’ pain experience using EHRs. In the era of big data science, clinical research should be learning from all available data regarding a patient’s episode of care instead of focusing on an “average” patient outcome.
Keywords: Pain Scores, Electronic Health Records, Robust Linear Regression, K-Medians Cluster Analysis
1. Background
Every year over 53 million Americans have surgery and pain is an expected treatment-related side effect.1,2 Appropriate postoperative pain management is critical, as poor management can lead to adverse events (e.g. deep vein thrombosis, pneumonia), compromise care of the underlying disease, and promote the transition into chronic pain.3–5 However, appropriate postoperative pain management remains a major challenge.6,7,8 Although patient-reported pain scores are routinely collected and widely monitored in electronic health records (EHRs),9 the appropriate utilization of those scores is not clear from a policy, clinician or research point of view.10 Pain scores are typically reported on a 0–10 scale, where zero indicates “no pain” and ten indicates “worst pain”. These scores are generally reduced to a single value, e.g. the mean or the last pain score on the discharge day.
Postoperative pain scores are often used as critical indicators for quality of care, providing information on patients’ recovery, guiding pain medications, including opioids, and assisting with clinical judgement regarding their postoperative care. However, there is a big gap between condensing these abundant amounts of data with plausible statistical assumptions and delivering evidence based on these statistical results to care providers. Most studies examining postoperative pain use a single time point or simple summary measures of pain scores (e.g. mean or maximum). Nevertheless, within the EHR, pain scores are captured at multiple time points and vary greatly throughout the inpatient stay.9 The reduction of pain scores to a single value leads to loss of information that could be critical to pain management and hence patient recovery.
Currently, there is no consensus on best approaches for reducing the pain score data into a single value. One of the most commonly used methods is to select one summary score on the day of discharge, either mean, maximum or last pain score before discharge which is often then categorized into distinct groups (e.g., ‘no pain, pain score=0’; ‘mild pain, pain score 1–3’; ‘moderate pain, pain score 4–6’; ‘severe pain, pain score 7–10’).11 These categories are then used to represent patients’ entire postoperative pain experience during inpatient stay, which can range from days to weeks depending on the patient’s diagnosis. One criticism of this naïve method is that the selection of a single summary pain score is subjective and sometimes controversial.12 Simply averaging pain scores across the entire admission might overemphasize irrelevant portions of the clinical course.
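To make the single-summary approach concrete, the following minimal Python sketch computes the three discharge-day summaries mentioned above and maps each to the four categories quoted in the text. The function names are hypothetical, and the handling of non-integer means (a mean of 3.5 is treated as moderate) is an assumption, since the quoted categories are defined for integer scores only.

```python
from statistics import mean

def pain_category(score):
    """Map a 0-10 summary pain score to the four groups quoted in the text.

    Assumption: non-integer values (e.g. a mean of 3.5) fall into the next
    higher band; the source only defines the bands for integer scores.
    """
    if not 0 <= score <= 10:
        raise ValueError("pain scores are reported on a 0-10 scale")
    if score == 0:
        return "no pain"
    if score <= 3:
        return "mild pain"
    if score <= 6:
        return "moderate pain"
    return "severe pain"

def discharge_day_summaries(scores):
    """Return the mean, maximum and last score of one discharge day."""
    return {"mean": mean(scores), "max": max(scores), "last": scores[-1]}

# Hypothetical discharge-day scores for one stay, in recording order.
example = discharge_day_summaries([2, 8, 5, 1])
categories = {name: pain_category(value) for name, value in example.items()}
```

For this hypothetical stay the three summaries land in three different categories, which illustrates why the choice of summary score can be contentious.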
Statistical methods, such as latent class growth analysis implemented in PROC TRAJ (SAS)13, longitudinal latent class growth analysis (LCGA) and growth mixture models (GMM) implemented with Mplus14 and R15, make it possible to cluster patients’ longitudinal pain paths using a unified statistical model. However, most of these methods are sensitive to outliers and model assumptions, and are therefore not suitable for analyzing big data extracted from EHRs. Specifically, mixture models such as LCGA require the outcome of interest to follow a normal distribution.16 LCGA and GMM put restrictive parametric assumptions on the structure of clusters, e.g., the regression coefficients for an individual trajectory need to follow a normal distribution whose mean and variance-covariance matrix depend on the corresponding cluster. Violation of those assumptions may lead to false discoveries and non-reproducible results. Another pitfall of the latent class growth model is that some analytical programs, such as PROC TRAJ in SAS, only allow pain scores measured on the same schedule for all patients (e.g. 2 days post operation). This is not realistic for EHR data, where pain scores are recorded at irregular time points throughout the inpatient stay. Furthermore, models such as GMM, implemented in Mplus and R, are computationally intensive because they estimate the trajectory coefficients and the cluster parameters simultaneously, which can exhaust computational memory. To make things worse, computational time can rise exponentially with each additional cluster. To ensure fast implementation, recent publications on pain trajectory analysis have focused on semi-parametric methods. For example, Kannampallil et al. proposed a method to identify pain trajectories by applying k-means cluster analysis to the empirical Bayes (EB) estimates generated from a single mixed-effects model of the entire data.17 This method is easy to implement; however, the use of non-parametric k-means cluster analysis is not compatible with the key assumption that estimates generated from the mixed-effects model come from the distribution of one single target population, which also contradicts our original motivation for clustering patients into distinct underlying subgroups.
In conclusion, a new set of methods should be proposed to fully leverage the rich EHR data with fast implementation and appropriate model assumptions that current methods failed to consider.
2. Methods
The method we propose here can be separated into three major steps. First, we use robust linear regression to estimate the individual trajectory for each inpatient stay. Second, K-medians cluster analysis is applied to these trajectories to identify clusters. Finally, we fit generalized mixed effects models within each cluster to plot the corresponding trajectory patterns.
2.1. Construct Individual trajectory for each inpatient stay
As opposed to longitudinal cohorts, which have a limited number of baseline and follow-up measures, in EHRs each patient can have multiple pain scores recorded each day throughout the inpatient stay. This provides enough data to fit a separate regression model for each patient’s inpatient stay. Here, we propose to perform robust linear regression (M-estimator from the ‘rlm’ function in the MASS package of R) to model the log-transformed pain score as a function of time. This accommodates the non-normal distribution of pain score measures and potential outliers. Coefficient estimates from each regression model are obtained via the iteratively reweighted least squares method and used for further cluster analysis.
Specifically, for each individual inpatient stay $i$, we fit a robust linear regression as below.

$$\log(y_{ij}) = \boldsymbol{\beta}_i^{\top} B(t_{ij}) + \epsilon_{ij},$$

where

$y_{ij}$: the $j$th pain score for inpatient stay $i$;

$t_{ij}$: the $j$th time point at which the pain score is measured for inpatient stay $i$;

$B(t) = (B_1(t), \dots, B_m(t))^{\top}$: basis functions to expand the time variable and form an $m$-dimensional covariate in the regression model.

The ‘basis function’ can be any function that may characterize the trajectory pattern, for example, polynomial function, B-spline, S-spline, etc. Taking the three-degree polynomial function as an example, the regression model can be written as

$$\log(y_{ij}) = \beta_{i0} + \beta_{i1} t_{ij} + \beta_{i2} t_{ij}^{2} + \beta_{i3} t_{ij}^{3} + \epsilon_{ij}.$$
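The per-stay regression of 2.1 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the study's code (which used `rlm` in R): it fits a cubic polynomial to the log of the pain score plus one by iteratively reweighted least squares with Huber weights, using a tuning constant of 1.345 and a scale estimated from the median absolute residual. All function names are hypothetical, and details such as convergence checks may differ from `rlm`.

```python
import math
import statistics

def design_row(t):
    """Cubic polynomial basis (1, t, t^2, t^3) for one time point."""
    return [1.0, t, t ** 2, t ** 3]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def weighted_least_squares(rows, y, w):
    """Minimise sum_i w_i * (y_i - rows_i . beta)^2 via the normal equations."""
    p, n = len(rows[0]), len(y)
    xtwx = [[sum(w[i] * rows[i][a] * rows[i][b] for i in range(n))
             for b in range(p)] for a in range(p)]
    xtwy = [sum(w[i] * rows[i][a] * y[i] for i in range(n)) for a in range(p)]
    return solve_linear(xtwx, xtwy)

def fit_robust_cubic(times, scores, k=1.345, max_iter=50, tol=1e-10):
    """Huber M-estimate of a cubic trend in log(score + 1) for one stay."""
    rows = [design_row(t) for t in times]
    y = [math.log(s + 1.0) for s in scores]
    beta = weighted_least_squares(rows, y, [1.0] * len(y))  # OLS start
    for _ in range(max_iter):
        resid = [yi - sum(b * x for b, x in zip(beta, row))
                 for yi, row in zip(y, rows)]
        scale = statistics.median(abs(r) for r in resid) / 0.6745
        if scale < 1e-12:  # essentially a perfect fit; nothing to reweight
            break
        # Huber weights: 1 inside the threshold, shrinking outside it.
        weights = [1.0 if abs(r) <= k * scale else k * scale / abs(r)
                   for r in resid]
        new_beta = weighted_least_squares(rows, y, weights)
        change = max(abs(a - b) for a, b in zip(new_beta, beta))
        beta = new_beta
        if change < tol:
            break
    return beta
```

Calling `fit_robust_cubic(times, scores)` for each stay returns four coefficients (intercept, linear, quadratic, cubic), which are the inputs to the clustering step. A single aberrant score is down-weighted rather than allowed to bend the whole curve.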
2.2. Non-parametric cluster analysis to identify the trajectory subgroups
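The aim is to cluster patient stays according to their estimated trajectory $\hat{f}_i(t) = \hat{\boldsymbol{\beta}}_i^{\top} B(t)$, where $\hat{\boldsymbol{\beta}}_i$ denotes the coefficients estimated in 2.1 and $B(t)$ the basis functions. Specifically, patient stays $i$ and $j$ should be clustered together if $\int_{\mathcal{T}} |\hat{f}_i(t) - \hat{f}_j(t)|\,dt$ is small, where $\mathcal{T}$ is the time interval of interest. In practice, we employ the following approximation in the clustering algorithm:

$$\int_{\mathcal{T}} |\hat{f}_i(t) - \hat{f}_j(t)|\,dt \approx \frac{|\mathcal{T}|}{S} \sum_{s=1}^{S} |\hat{f}_i(t_s) - \hat{f}_j(t_s)|,$$

where $t_1, \dots, t_S$ are $S$ equally spaced points within the interval $\mathcal{T}$ and $|\mathcal{T}|$ is its length. Specifically, we have employed the following clustering algorithm:

- 1. For patient-stay $i$, obtain $Z_i = (\hat{f}_i(t_1), \dots, \hat{f}_i(t_S))^{\top}$, where $\hat{f}_i(t_s) = \hat{\boldsymbol{\beta}}_i^{\top} B(t_s)$.
- 2. K-medians cluster analysis18 is applied to $\{Z_i\}$ to categorize inpatient stays into distinguishable groups. Specifically, the clusters are constructed by minimizing the loss function measuring the within-cluster variation:

$$\sum_{k=1}^{K} \sum_{i \in C_k} \lVert Z_i - m_k \rVert_1,$$

where $C_1, \dots, C_K$ represent the index sets of the clusters and $m_k$ is the center (component-wise median) of cluster $k$. The $L_1$ metric, instead of the commonly used squared Euclidean distance, is used here for its robustness. Other unsupervised learning algorithms, such as k-means and k-medoids, can be implemented with different choices of the distance measure as well.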
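The clustering step can be sketched as follows. This is a minimal Python illustration of K-medians under stated assumptions, not the authors' R implementation: each stay is represented by a vector of trajectory values on a common time grid, points are assigned to the nearest center in absolute (L1) distance, and each center is updated to the component-wise median of its members. The function names and the choice to pass the initial centers explicitly are hypothetical.

```python
import statistics

def l1_distance(a, b):
    """Sum of absolute differences between two trajectory vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def k_medians(points, initial_centers, max_iter=100):
    """Alternate nearest-center assignment and component-wise median updates."""
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: l1_distance(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster empties
                centers[k] = [statistics.median(col) for col in zip(*members)]
    return labels, centers

def within_cluster_loss(points, labels, centers):
    """Total L1 distance from each point to its assigned center."""
    return sum(l1_distance(p, centers[lab]) for p, lab in zip(points, labels))

# Hypothetical trajectory vectors (three grid points per stay).
stays = [[0, 0, 0], [0.1, 0, 0.2], [0, 0.1, 0.1],
         [5, 5, 5], [5.2, 5, 4.9], [5, 5.1, 5]]
labels, centers = k_medians(stays, [stays[0], stays[3]])
loss = within_cluster_loss(stays, labels, centers)
```

In practice the initial centers would be chosen at random and the algorithm restarted several times, keeping the solution with the smallest loss; that refinement is omitted here for brevity.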
We conduct the k-medians clustering analysis with an increasing number of clusters, $K = 2, 3, \dots$. The process terminates as soon as either of the following criteria is violated:
- the increase of between cluster variation (BCV) is above 5%;
- the smallest cluster is over 5% of the overall population;
where the between cluster variation is measured as
These criteria can be adjusted according to the sample size and the cluster performance. The final clustering result is given based on the largest number of clusters prior to termination. The cluster performance is also graphically examined by plotting the first two principal components (PCs) from the principal components analysis (PCA) of the trajectory parameters. A “good” clustering result will typically demonstrate separable groups of observations projected onto the two-dimensional space spanned by the first two PCs. Other diagnostic methods can be used as well, such as plotting the distribution of the distance from the final centroid by cluster and bootstrapping the Rand index.18
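The stopping rule can be expressed as a short function. The sketch below is one reading of the two criteria, with two labeled assumptions: the BCV increase is measured in percentage points between consecutive values of K, and the first candidate only has to satisfy the cluster-size criterion. The function name and the input format are hypothetical.

```python
def choose_number_of_clusters(candidates, min_bcv_gain=5.0, min_fraction=0.05):
    """Return the largest K reached before either criterion is violated.

    candidates: list of (k, bcv_percent, smallest_cluster_fraction),
    ordered by increasing k.
    Assumption: the BCV gain is taken in percentage points.
    """
    chosen = None
    previous_bcv = None
    for k, bcv, smallest in candidates:
        gain_ok = previous_bcv is None or (bcv - previous_bcv) > min_bcv_gain
        size_ok = smallest > min_fraction
        if not (gain_ok and size_ok):
            break
        chosen, previous_bcv = k, bcv
    return chosen

# Values reported in the Results for K = 4 and K = 5 (3442 stays in total).
reported = [(4, 44.0, 294 / 3442), (5, 51.0, 127 / 3442)]
selected = choose_number_of_clusters(reported)
```

With the reported values, moving from four to five clusters fails the size criterion (127 of 3442 stays is about 3.7%), so four clusters are retained.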
2.3. Estimate the trajectory patterns for each subgroup
For each cluster we identify in 2.2, we further fit a generalized mixed effects model using the log-scaled pain score as the outcome measure. We may incorporate patient demographics, clinical variables and treatment variables as independent variables in the generalized mixed effects model to estimate the adjusted trajectory. Specifically, for all inpatient stays in an identified cluster, we fit the following mixed effects model

$$\log(y_{ij}) = \boldsymbol{\beta}^{\top} B(t_{ij}) + \boldsymbol{\gamma}^{\top} X_i + b_i + \epsilon_{ij},$$

where $X_i$ is the confounding factor to be adjusted, $b_i \sim N(0, \sigma_b^2)$ and $\epsilon_{ij} \sim N(0, \sigma^2)$. To display the cluster-specific pain score pattern, we predict the pain score using its estimated median based on the generalized mixed effects model. In the prediction, all the confounding factors are set at their sample median levels in the entire population (and are thus equal across clusters). In summary, the predicted pain score at time $t$ is

$$\hat{y}(t) = \exp\{\hat{\boldsymbol{\beta}}^{\top} B(t) + \hat{\boldsymbol{\gamma}}^{\top} \tilde{X}\},$$

where $\hat{\boldsymbol{\beta}}$ and $\hat{\boldsymbol{\gamma}}$ are estimated regression coefficients and $\tilde{X}$ is the sample median of the adjusted confounding factors. We may plot time vs the predicted pain score along with the corresponding 95% confidence interval for each cluster.
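The prediction step amounts to exponentiating the fixed-effects part of the fitted model with the covariates held at their sample medians. The Python sketch below assumes a polynomial time basis and takes the coefficient vectors as given; it does not fit the mixed effects model. The function name and the optional `shift` argument (to undo an offset added before taking logs) are hypothetical.

```python
import math

def predicted_median_score(beta, gamma, covariate_medians, t, shift=0.0):
    """Back-transform the fixed-effects linear predictor at time t.

    beta: polynomial time coefficients (intercept first).
    gamma: covariate coefficients, paired with covariate_medians.
    shift: constant added to the score before the log (0 if none).
    """
    time_part = sum(b * t ** power for power, b in enumerate(beta))
    covariate_part = sum(g * x for g, x in zip(gamma, covariate_medians))
    return math.exp(time_part + covariate_part) - shift

# Hypothetical coefficients: a flat trajectory at log(3) plus one covariate.
flat = predicted_median_score([math.log(3.0), 0.0, 0.0, 0.0], [0.1], [0.0], t=2.0)
```

Because the random effect and the error term have median zero on the log scale, the exponentiated linear predictor estimates the median rather than the mean pain score.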
The method was implemented in R (version 3.2.4).
3. Results
We used data captured in our institution’s EHR database, CLARITY, which is a component of the Epic Systems software. We identified patients undergoing total knee arthroplasty (TKA), which is a common and often painful surgery, using ICD-9-CM, ICD-10-CM and CPT codes, 2009 to 2016. We captured patient demographics, inpatient/outpatient medications (down to the ingredient level), pain scores spanning the episode of care, type of insurance coverage and follow-up visits/diagnoses/procedures up to 90 days after discharge. Patients were excluded if age at surgery was less than 18 years or death occurred during the hospitalization. This study was approved by our Institutional Review Board.
A total of 4 453 encounters were identified. We excluded encounters that had a length of stay (LOS) of less than one day or fewer than ten pain scores recorded during the inpatient stay. Stays with fewer than ten pain scores were excluded to limit the variability of the regression coefficient estimates for individual trajectories. A total of 3 442 encounters from 3 025 patients were included in our final analytical dataset. There were 81 106 pain scores for the first surgery during the last three days of the inpatient stay. The median number of pain scores per inpatient stay was 22 (IQR: 17–28), which is sufficient to fit a cubic polynomial regression model for each inpatient stay.
We focused on the last three postoperative days before discharge since the median length of inpatient stay for TKA was 3.2 days (IQR: 3.0–3.4); restricting every stay to this common window also makes the analysis uniform across patients. A three-degree polynomial function of time, which represented the linear, quadratic and cubic terms of time from discharge, was used for building the regression model (2.1). The pain scores were incremented by one so that a pain score of 0 (indicating no pain) remains defined under the logarithmic transformation. Estimated coefficients (including the intercept) per inpatient stay were used to calculate the trajectory values at seven equally spaced time points within the time interval [0, 3 days], referring to the last 3 postoperative days before discharge, as described in 2.2. From 3 442 estimated trajectories, four distinct clusters were established by the K-medians algorithm, with the percent of BCV at 44% and a minimum cluster size of 294 inpatient stays (8.5% of all trajectories). With five clusters, the percent of BCV would marginally increase to 51%, while the size of the smallest group fell to 127 encounters (<5%). Therefore, four clusters were ultimately selected according to the criteria we described in 2.2. To visualize the clustering results, trajectories were represented by their first two PCs, which are plotted and colored by cluster in Figure 1.
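The transformation and grid described in this paragraph are simple enough to write out. The Python sketch below shows the shifted log transform, the seven equally spaced time points on [0, 3] days, and the evaluation of a fitted cubic on that grid. The function names and the example coefficients are hypothetical.

```python
import math

def log_shifted(score):
    """Log of the pain score after adding one, so a score of 0 maps to 0."""
    return math.log(score + 1.0)

def time_grid(start=0.0, end=3.0, n_points=7):
    """Equally spaced time points covering [start, end], endpoints included."""
    step = (end - start) / (n_points - 1)
    return [start + i * step for i in range(n_points)]

def trajectory_values(beta, grid):
    """Evaluate beta0 + beta1*t + beta2*t^2 + beta3*t^3 at each grid point."""
    return [beta[0] + beta[1] * t + beta[2] * t ** 2 + beta[3] * t ** 3
            for t in grid]

grid = time_grid()
# Hypothetical coefficients for one stay, on the log(score + 1) scale.
values = trajectory_values([1.0, 0.5, 0.0, 0.0], grid)
```

Each stay is thus summarized by a vector of seven values, one per half day, and these vectors are what the clustering algorithm compares.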
Figure 1.
Distribution of the Robust Linear Regression by Cluster and Major Principal Components.
Next, we estimated the trajectory pattern by fitting the mixed effects model for each cluster, adjusting for several patient and clinical-related covariates: patients’ age at admission, race-ethnicity, gender, marital status, number of comorbidities at admission,19 body mass index (BMI), length of stay, length of the procedure in hours, pre-surgery pain score, and pre- and post-surgery morphine equivalent value per day calculated using oral morphine conversion factors.20,21 The predicted pain score as a function of time from discharge is plotted for each cluster in Figure 2.
Figure 2.
Patients’ Patterns of Pain Score vs Days from Discharge
* Figure adjusts for age, gender, marital status, BMI, race-ethnicity, number of comorbidities, length of procedure, length of stay, postoperative morphine-equivalent-values (MME) by day, preoperative morphine-equivalent-values(MME) by day and preoperative pain score (0–10)
Patients’ characteristics and clinical information are summarized by cluster in Table 1. Four unique patterns of postoperative pain experience were discovered in our patient cohort. Cluster 1 encounters had mild pain after surgery followed by a steady rise in pain scores before discharge (‘Slightly Rise’ Group). Their final pain levels at discharge were between three and four. Patients in this group were more likely to be female (66%) and living without a partner (21%), stayed in the hospital longer (3.5 days), and had higher opioid usage after surgery (67.1 mg/day) and higher preoperative pain scores (2.3). Cluster 2 encounters represented a pain trajectory of patients undergoing TKA with moderate pain scores after surgery and fluctuating pain levels during their stay, but very low reported pain at discharge (‘Completely Drop’ Group). Patients in this group were older (69.4 years), had higher BMI (35.3), took fewer opioids (55.0 mg/day) and received less complicated procedures (length of procedure: 1.7 hours). Cluster 3 was a small group of patients who experienced very low pain immediately following surgery, but whose pain rose sharply before discharge (‘Sudden Rise’ Group). These patients were younger (66.5 years old), more likely to be male (49%), more likely to be Hispanic or Black (20%), had lower preoperative pain scores (1.9) and had less complicated procedures (length of procedure: 1.6 hours) compared to patients in the other clusters. Cluster 4 consisted of patients who reported moderate pain (pain score=2–3) throughout their inpatient stay (‘Steady’ Group). The ‘Steady’ group tended to be younger (67.2 years), had higher preoperative pain scores (pain score=2.4), experienced longer procedure time (1.9 hours) and received more postoperative opioid drugs (69.3 mg/day) during the inpatient stay compared to other clusters.
Table 1.
Patients’ Characteristics by Cluster
| Variable | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | p value |
|---|---|---|---|---|---|
| n= | 1020 | 694 | 294 | 1434 | |
| Age at Admission, years, mean (sd) | 67.5(10.69) | 69.4(10.12) | 66.5(10.78) | 67.2(10.93) | <0.001 |
| Gender, n (%) | | | | | <0.001 |
| Female | 670(66%) | 426(61%) | 150(51%) | 894(62%) | |
| Male | 350(34%) | 268(39%) | 144(49%) | 540(38%) | |
| Race-Ethnicity, n (%) | | | | | 0.015 |
| White | 642(63%) | 480(69%) | 186(63%) | 929(65%) | |
| Black | 41(4%) | 23(3%) | 19(6%) | 58(4%) | |
| Hispanic | 122(12%) | 58(8%) | 42(14%) | 146(10%) | |
| Asian | 100(10%) | 64(9%) | 29(10%) | 172(12%) | |
| Other | 97(10%) | 61(9%) | 15(5%) | 107(7%) | |
| BMI, mean (sd) | 31.0(6.96) | 35.3(128.98) | 30.5(6.32) | 30.5(6.80) | 0.41 |
| Marital Status, n (%) | | | | | 0.038 |
| Married/Life Partner | 651(64%) | 476(69%) | 202(68%) | 916(64%) | |
| Single | 152(15%) | 87(13%) | 38(13%) | 246(17%) | |
| Widowed/Divorced/Separated | 213(21%) | 126(18%) | 50(17%) | 265(18%) | |
| Comorbidities, n (%) | | | | | 0.23 |
| >2 | 399(39%) | 240(35%) | 103(36%) | 544(38%) | |
| 2 | 261(26%) | 170(25%) | 75(26%) | 386(27%) | |
| 1 | 232(23%) | 183(26%) | 69(24%) | 301(21%) | |
| None | 127(12%) | 100(14%) | 43(15%) | 201(14%) | |
| Preoperative Pain Score, mean (sd) | 2.3(2.86) | 2.0(2.65) | 1.9(2.52) | 2.4(2.91) | 0.016 |
| Length of Procedure, hours, mean (sd) | 1.8(0.79) | 1.7(0.55) | 1.6(0.54) | 1.9(0.83) | <0.001 |
| Length of Stay, days, mean (sd) | 3.5(1.29) | 3.3(1.35) | 3.0(1.32) | 3.2(1.67) | <0.001 |
| Preoperative Anxiety Level, n (%) | | | | | 0.71 |
| Severe/Moderate | 53(6%) | 32(5%) | 9(3%) | 70(5%) | |
| Mild | 460(54%) | 302(52%) | 129(54%) | 613(51%) | |
| None | 346(40%) | 244(42%) | 102(42%) | 519(43%) | |
| Preoperative MMEa per Day, mg/day, mean (sd) | 7.3(35.65) | 7.6(35.50) | 4.3(24.18) | 9.0(40.30) | 0.10 |
| Postoperative MMEa per Day, mg/day, mean (sd) | 87.4(87.78) | 73.0(83.82) | 79.0(115.6) | 99.7(113.4) | <0.001 |
a: MME, Morphine Equivalent Value
In our cohort, patients’ inpatient pain experience was marginally associated with patients’ demographics, injury severity, treatment options and opioid medications. We hypothesized that these trajectories could be used as surrogates of patients’ recovery or early indicators of post-discharge complications in addition to other important clinical factors. Among all four groups, we also hypothesized that the ‘Sudden Rise’ group would be a distinct group of patients who might be more susceptible to complications after discharge.
To test these hypotheses, we conducted a set of logistic regressions with the occurrence of 30-, 60- and 90-day follow-up visits for any purpose, inpatient readmissions or subsequent emergency department visits, and post-discharge complications (surgery-pain related revisits, wound infection and others, see Supplementary Table 1) as the binary outcomes, respectively. The pain trajectory pattern, patients’ demographics, and clinical covariates were included as covariates. The ‘Steady’ cluster was set as the reference group in our analysis because it had the largest sample size and was considered a clinically typical ‘well-managed’ group. Compared to the ‘Steady’ group (Cluster 4), patients in the ‘Sudden Rise’ group (Cluster 3) had a higher risk of follow-up revisits (OR: 2.37, 2.11 and 1.97 for the 30-, 60- and 90-day windows, respectively), any surgery-related pain (OR: 5.49, 3.41, 2.73) and surgery-related chronic pain (OR: 5.82, 2.96, 2.03). In addition, we noticed that the ‘Completely Drop’ group had a higher risk of any surgery-related pain (OR: 2.36), follow-up visits of any type (OR: 1.36) and inpatient readmissions/subsequent ED visits (OR: 1.93) within 30 days after discharge, although we did not observe statistically significant effects in the 60- and 90-day windows for these outcomes. No statistical significance was detected for complications of any type across all observation windows (Table 2). Since readmissions and complications were rare in our population, Poisson regressions and negative-binomial regressions were also performed using the number of post-discharge revisits with any specific outcome as the dependent variable. Consistent with the results from logistic regression, the ‘Sudden Rise’ group had higher rates of follow-up visits at 30 days post-discharge as well as higher rates of any surgery-related pain, including chronic pain, in each of the 30-, 60- and 90-day windows (Supplementary Table 2).
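For readers less familiar with how the reported odds ratios relate to the fitted logistic regressions, the Python sketch below shows the usual back-transformation: the odds ratio is the exponential of the coefficient, and a Wald-type 95% interval is the exponential of the coefficient plus or minus 1.96 standard errors. It is an assumption that the reported intervals are of this type (they may instead be profile-likelihood intervals), and the function names are hypothetical.

```python
import math

def odds_ratio_with_ci(coef, std_err, z=1.96):
    """Odds ratio and Wald-type confidence limits from a logistic coefficient."""
    return (math.exp(coef),
            math.exp(coef - z * std_err),
            math.exp(coef + z * std_err))

def log_scale_summary(lower, upper, z=1.96):
    """Coefficient and standard error implied by a Wald-type CI for an OR."""
    coef = 0.5 * (math.log(lower) + math.log(upper))
    std_err = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return coef, std_err

# 30-day all-revisit interval reported for Cluster 3: 1.77 to 3.17.
coef, std_err = log_scale_summary(1.77, 3.17)
odds_ratio, lower, upper = odds_ratio_with_ci(coef, std_err)
```

The midpoint of the reported interval on the log scale back-transforms to about 2.37, which agrees with the reported point estimate and suggests the interval is roughly symmetric on the log-odds scale.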
Table 2.
Logistic Regression for Major Post-discharge Outcomes by Clustera
| Outcome | Cluster 1 | Cluster 2 | Cluster 3 |
|---|---|---|---|
| n = | 1020 | 694 | 294 |
| 30-Day (n=)b | 936 | 634 | 259 |
| All Revisits | 1.15 (0.97 – 1.38) | 1.36 (1.11 – 1.66)** | 2.37 (1.77 – 3.17)*** |
| Inpatient | 1.10 (0.53 – 2.23) | 1.84 (0.87 – 3.78) | 1.26 (0.35 – 3.55) |
| Emergency Department | 0.97 (0.52 – 1.78) | 1.67 (0.89 – 3.09) | 0.36 (0.06 – 3.54) |
| Inpatient + ED | 1.10 (0.68 – 1.77) | 1.93 (1.19 – 3.12)** | 0.78 (0.29 – 1.74) |
| Complications (Any) | 1.00 (0.67 – 1.50) | 1.14 (0.72 – 1.79) | 0.68 (0.29 – 1.40) |
| Surgery-related Pain | 1.77 (1.02 – 3.10)* | 2.36 (1.31 – 4.27) ** | 5.49 (2.99 – 10.09)*** |
| Surgery-related Acute Pain | 2.86 (1.06 – 8.54)* | 2.60 (0.79 – 8.70) | 2.44 (0.51 – 10.41) |
| Surgery-related Chronic Pain | 2.13 (0.94 – 5.33) | 2.00 (0.78 – 5.23) | 5.82 (2.44– 15.60)*** |
| 60-Day (n=)b | 921 | 631 | 254 |
| All Revisits | 1.11 (0.92 – 1.35) | 1.10 (0.89 – 1.37) | 2.11 (1.50 – 3.01)*** |
| Inpatient | 0.91 (0.51 – 1.59) | 1.31 (0.71 – 2.34) | 1.08 (0.40 – 2.50) |
| Emergency Department | 0.95 (0.56 – 1.57) | 1.41 (0.81 – 2.39) | 0.39 (0.09 – 1.12) |
| Inpatient + ED | 0.95 (0.64 – 1.41) | 1.48 (0.97 – 2.22) | 0.76 (0.34 – 1.50) |
| Complications (Any) | 0.91 (0.65 – 1.25) | 1.14 (0.79 – 1.62) | 0.69 (0.36 – 1.21) |
| Surgery-related Pain | 1.21 (0.80 – 1.84) | 1.35 (0.84 – 2.14) | 3.41 (2.09 – 5.52)*** |
| Surgery-related Acute Pain | 1.96 (0.86 – 4.57) | 1.64 (0.58 – 4.36) | 1.95 (0.52 – 6.10) |
| Surgery-related Chronic Pain | 1.11 (0.59 – 2.07) | 0.90 (0.42 – 1.84) | 2.96 (1.47 – 5.90)** |
| 90-Day (n=)b | 907 | 621 | 246 |
| All Revisits | 1.08 (0.89 – 1.31) | 1.08 (0.87 – 1.34) | 1.97 (1.39 – 2.85)*** |
| Inpatient | 0.96 (0.59 – 1.54) | 1.01 (0.57 – 1.73) | 1.07 (0.45 – 2.24) |
| Emergency Department | 0.93 (0.57 – 1.50) | 1.28 (0.75 – 2.13) | 0.36 (0.09 – 1.01) |
| Inpatient + ED | 0.95 (0.66 – 1.36) | 1.25 (0.84 – 1.84) | 0.78 (0.38 – 1.45) |
| Complications (Any) | 0.92 (0.67 – 1.24) | 1.21 (0.86 – 1.68) | 0.74 (0.41 – 1.26) |
| Surgery-related Pain | 1.10 (0.76 – 1.59) | 1.25 (0.82 – 1.88) | 2.73 (1.71 – 4.27)*** |
| Surgery-related Acute Pain | 1.79 (0.83 – 3.92) | 1.58 (0.61 – 3.93) | 2.11 (0.64 – 5.99) |
| Surgery-related Chronic Pain | 0.82 (0.46 – 1.42) | 0.86 (0.46 – 1.57) | 2.03 (1.06 – 3.80)* |
a: Logistic regression was fitted for each outcome, adjusting for age at admission, gender, race-ethnicity, marital status, length of stay, postoperative morphine-equivalent-values per day, number of comorbidities and preoperative pain score. Odds ratios (ORs) and their corresponding 95% confidence intervals (CIs) are reported in the table. Cluster 4 was the reference group for all analyses (n=1407, 1392, 1373 at 30-, 60-, 90-Day).
b: Inpatient stays whose admission date fell outside the corresponding observation window were not included in the analysis, although they were included in the original cluster analysis.
*: p<0.05
**: p<0.01
***: p<0.001
The trajectory analyses were compared with other basic analytical approaches that are common in the literature, i.e. last recorded pain score, mean pain score on discharge day, and maximum pain score on discharge day (Supplementary Figure 1). In terms of prediction and model fit, our trajectory analyses outperformed all the single-score discharge pain methods with regard to area under the curve (AUC) and Akaike information criterion (AIC) (Supplementary Table 3).
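The two comparison metrics can be computed as in the following Python sketch. It uses the standard definitions: AUC as the proportion of case/non-case pairs in which the case receives the higher predicted risk (ties count one half), and AIC as twice the number of parameters minus twice the log-likelihood. The function names and the toy numbers are hypothetical and are not taken from the supplementary tables.

```python
def auc(predicted_risks, outcomes):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    controls = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in controls)
    return wins / (len(cases) * len(controls))

def aic(log_likelihood, n_parameters):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2.0 * n_parameters - 2.0 * log_likelihood

# Toy example: four stays, two with the outcome.
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
toy_aic = aic(log_likelihood=-100.0, n_parameters=3)
```

A model is preferred when it has a higher AUC and a lower AIC than its competitors fitted to the same outcome and sample.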
4. Discussion
We have entered a new era in which the healthcare system has undergone dramatic changes. According to a 2015 National Electronic Health Records Survey, 87% of physicians in the US reported using an EHR system and 78% reported using a Certified EHR system.22 With the improvement of health informatics technology, massive amounts of patient and clinical information are now captured and stored in EHRs. However, how to meaningfully extract and analyze these data has become a new challenge in both the clinical and the statistical world. Pain scores derived from EHRs, as an example here, are often not used efficiently and effectively in clinical research. The practical difficulty is that current analytical methods, such as mixed growth curve modelling or growth mixture models, do not scale to the amount of data in large EHR datasets, and other methods with limiting assumptions cannot be used on the complex and abundant EHR data. Therefore, novel methods that are applicable to massive EHR data are critically needed in big data clinical research.
The method we proposed here, which combined robust linear regression and unsupervised K-medians cluster analysis, was able to compile all the pain score data recorded in the EHR and identify distinct patterns of inpatient pain scores after TKA surgery. Our method is flexible with respect to the time metric (e.g. time before discharge or time after surgery) and the hypothesized shape of the pain scores (e.g. polynomial, S-spline, etc.). It is not limited to modelling pain scores; other numeric or ordinal values commonly recorded in EHR data (e.g. lab values) can be modelled similarly with appropriate modifications. The method is scalable to large amounts of data and does not rely as heavily on restrictive distributional assumptions as other statistical methods, e.g., the growth mixture model. In addition, K-medians clustering was proposed for clustering, which minimized the effects of outliers upon the clustering results. Although we proposed to implement K-medians clustering and robust linear regression to address the negative effect of outliers, other techniques, such as weighted dynamic time warping or the longest common subsequence distance measure, could be incorporated in our proposed 3-step method with minor modification as well.
The method developed here was robust and superior in terms of prediction compared to other commonly used pain score analyses, i.e. mean, last, and maximum pain scores on discharge day23, 24. The single-value analyses were not able to distinguish between patients in the steady pain trajectory and those in the cluster with a sudden rise in pain scores at discharge. This is an important distinction, as patients in the sudden rise trajectory were more likely to have adverse pain-related events following discharge. When analyses focus on a single pain score, it is important to understand the ubiquity of pain score recordings in EHRs. For example, the last pain score recorded can occur minutes or hours prior to discharge, making this number extremely susceptible to inpatient pain medication. Our method, which leverages all pain scores captured during the inpatient stay, is a clear step towards patient-centered care, enabling clinicians to treat a patient’s pain experience rather than a single pain score.
From a clinical perspective, our method was able to identify subpopulations of patients whose distinguishable inpatient pain trajectories were associated with adverse outcomes, in particular pain-related readmissions. Pain-related readmissions following surgery are not uncommon and are costly to the healthcare system.25, 26 Methods developed in this study could be used to identify patients needing additional pain management resources upon discharge. Such pain trajectories could also be incorporated into clinical decision tools at the point of care, providing evidence to guide pain management – a clear need given our nation’s current opioid epidemic.27,28
There are several limitations in our method. First, the method was developed under the EHR setting, where pain scores are recorded at varying intervals over the entire inpatient stay. Furthermore, many question the utilization of pain scores to represent a patient’s pain experience29, 30. However, to date, pain scores are the best representation of a patient’s pain experience at a population level and outside of controlled, qualitative studies. For a single robust linear regression with a three-degree polynomial expansion of time, we need at least ten pain scores per stay to obtain an acceptable fit. Therefore, our method excludes stays with too few values, which may still contain valuable information. Second, we employed several criteria to select the optimal number of clusters. These criteria can be subjective, rest on both clinical and statistical judgements, and are often determined in an ad-hoc manner. Selection of different criteria will therefore lead to different results. In addition, for patients that lie close to the boundary between two clusters, the distances to the two cluster centers can be similar. Although we chose to assign them to the numerically nearest cluster, interpretation of these patients is difficult. Such an assignment rule ignores the uncertainty in cluster membership for those patients and tends to dilute the association between the trajectory pattern and clinical outcome. On the other hand, a large number of patients or inpatient stays is needed to identify a trajectory pattern which is not commonly observed but may be clinically important. For cohorts with small numbers of observations, it will be difficult to distinguish genuine trajectory patterns from ‘artificial’ clusters formed by random chance. Future application of the method in a nationwide EHR system should be implemented to validate our findings. If validated, the approach could be applied to many other longitudinal clinical data to predict meaningful patient outcomes. Last, our method could be criticized for failing to incorporate the sampling variability of the coefficient estimates in its first step of individual regressions. In our sample implementation, we specifically restricted the analysis to inpatient stays that had at least 10 pain scores to ensure stability of the coefficient estimates across all individual regressions. However, further modification that incorporates the sampling variability could be a promising extension of our current algorithm.
5. Conclusion
In summary, we have described a novel approach with fast implementation to model patients’ pain experience using EHRs. One empirically distinguishable inpatient pain trajectory identified from EHR data was associated with higher rates of surgery-related pain after discharge. This approach could be applied to many other types of longitudinal clinical data to predict meaningful patient outcomes. Moving towards a learning health care system, clinical care should learn from all available data regarding a patient’s episode of care instead of focusing on an “average” patient or score. Analytic capacity is advancing steadily, and our method provides a rigorous and statistically sound approach to leveraging longitudinal clinical data from EHRs for personalized treatment plans.
6. Acknowledgements
This project was supported by grant number R01HS024096 from the Agency for Healthcare Research and Quality. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality.
References
- 1. DeFrances CJ, Podgornik MN. 2004 national hospital discharge survey. Advance data from vital and health statistics. 2006 May 4;371:1–20.
- 2. Kozak LJ, DeFrances CJ, Hall MJ. National hospital discharge survey: 2004 annual summary with detailed diagnosis and procedure data. Vital and health statistics. Series 13, Data from the National Health Survey. 2006 October;(162):1–209.
- 3. Adamson SJ, Deering DE, Sellman JD, Sheridan J, Henderson C, Robertson R, Pooley S, Campbell SD, Frampton CM. An estimation of the prevalence of opioid dependence in New Zealand. International Journal of Drug Policy. 2012 January 31;23(1):87–9.
- 4. Katz JN, Losina E, Barrett J, Phillips CB, Mahomed NN, Lew RA, Guadagnoli E, Harris WH, Poss R, Baron JA. Association between hospital and surgeon procedure volume and outcomes of total hip replacement in the United States Medicare population. JBJS. 2001 November 1;83(11):1622–9.
- 5. Goldstein DH, Ellis J, Brown R, Wilson R, Penning J, Chisom K, VanDenKerkhof E, Canadian Collaborative Acute Pain Initiative. Meeting proceedings: recommendations for improved acute pain services: Canadian Collaborative Acute Pain Initiative. Pain Research and Management. 2004;9(3):123–30.
- 6. Apfelbaum JL, Chen C, Mehta SS, Gan TJ. Postoperative pain experience: results from a national survey suggest postoperative pain continues to be undermanaged. Anesthesia & Analgesia. 2003 August 1;97(2):534–40.
- 7. Edlund MJ, Martin BC, Russo JE, Devries A, Braden JB, Sullivan MD. The role of opioid prescription in incident opioid abuse and dependence among individuals with chronic non-cancer pain: the role of opioid prescription. The Clinical Journal of Pain. 2014 July;30(7):557.
- 8. Waljee JF, Li L, Brummett CM, Englesbe MJ. Iatrogenic Opioid Dependence in the United States: Are Surgeons the Gatekeepers? Annals of Surgery. 2017 April 1;265(4):728–30.
- 9. Department of Veterans Affairs. Pain as the 5th vital sign toolkit. Veterans Health Administration. 2000.
- 10. Ruau D, Liu LY, Clark JD, Angst MS, Butte AJ. Sex differences in reported pain across 11,000 patients captured in electronic medical records. The Journal of Pain. 2012 March 31;13(3):228–34.
- 11. Hughes R, editor. Patient safety and quality: An evidence-based handbook for nurses. Rockville, MD: Agency for Healthcare Research and Quality; 2008 April.
- 12. Lund I, et al. Lack of interchangeability between visual analogue and verbal rating pain scales: a cross sectional description of pain etiology groups. BMC Medical Research Methodology. 2005;5(1):31.
- 13. Jones BL, Nagin DS, Roeder K. A SAS procedure based on mixture models for estimating developmental trajectories. Sociological Methods & Research. 2001 February;29(3):374–93.
- 14. Jung T, Wickrama KA. An introduction to latent class growth analysis and growth mixture modeling. Social and Personality Psychology Compass. 2008 January 1;2(1):302–17.
- 15. Proust-Lima C, Philipps V, Liquet B. Estimation of extended mixed models using latent classes and latent processes: the R package lcmm. arXiv preprint arXiv:1503.00890. 2015 March 3.
- 16. Ram N, Grimm KJ. Methods and measures: Growth mixture modeling: A method for identifying differences in longitudinal change among unobserved groups. International Journal of Behavioral Development. 2009 November;33(6):565–76.
- 17. Kannampallil T, Galanter WL, Falck S, Gaunt MJ, Gibbons RD, McNutt R, Odwazny R, Schiff G, Vaida AJ, Wilkie DJ, Lambert BL. Characterizing the pain score trajectories of hospitalized adult medical and surgical patients: a retrospective cohort study. Pain. 2016 December;157(12):2739.
- 18. Leisch F. A toolbox for k-centroids cluster analysis. Computational Statistics & Data Analysis. 2006 November 15;51(2):526–44.
- 19. Wasey JO. icd: Tools for Working with ICD-9 and ICD-10 Codes, and Finding Comorbidities. R package version 2.1. 2016. https://CRAN.R-project.org/package=icd
- 20. Nielsen S, Degenhardt L, Hoban B, Gisev N. A synthesis of oral morphine equivalents (OME) for opioid utilisation studies. Pharmacoepidemiology and Drug Safety. 2016 June 1;25(6):733–7.
- 21. CDC compilation of opioid analgesic formulations with morphine milligram equivalent conversion factors, 2015 version. Centers for Disease Control and Prevention; 2015. Available at http://www.pdmpassist.org/pdf/BJA_performance_measure_aid_MME_conversion.pdf
- 22. Centers for Disease Control and Prevention (CDC). National Electronic Health Records Survey: 2015 Specialty and Overall Physicians Electronic Health Record Adoption Summary Tables. 2015. Retrieved from https://www.cdc.gov/nchs/data/ahcd/nehrs/2015_nehrs_ehr_by_specialty.pdf
- 23. Hwang U, Belland LK, Handel DA, Yadav K, Heard K, Rivera-Reyes L, Eisenberg A, Noble MJ, Mekala S, Valley M, Winkel G. Is all pain is treated equally? A multicenter evaluation of acute pain care by age. PAIN. 2014 December 31;155(12):2568–74.
- 24. Huang N, Cunningham F, Laurito CE, Chen C. Can we do better with postoperative pain management? The American Journal of Surgery. 2001 November 30;182(5):440–8.
- 25. Curtin CM, Hernandez-Boussard T. Readmissions after treatment of distal radius fractures. The Journal of Hand Surgery. 2014 October 31;39(10):1926–32.
- 26. Finnegan MA, Shaffer R, Remington A, Kwong J, Curtin C, Hernandez-Boussard T. Emergency Department Visits Following Elective Total Hip and Knee Replacement Surgery: Identifying Gaps in Continuity of Care. JBJS. 2017 June 21;99(12):1005–12.
- 27. Gawande AA. It’s Time to Adopt Electronic Prescriptions for Opioids. Annals of Surgery. 2017 April 1;265(4):693–4.
- 28. Murthy VH. Ending the opioid epidemic: a call to action. New England Journal of Medicine. 2016 December 22;375(25):2413–5.
- 29. Lorenz KA, Krebs EE, Bentley TG, Sherbourne CD, Goebel JR, Zubkoff L, Lanto AB, Asch SM. Exploring alternative approaches to routine outpatient pain screening. Pain Medicine. 2009 October 1;10(7):1291–9.
- 30. Krebs EE, Lorenz KA, Bair MJ, Damush TM, Wu J, Sutherland JM, Asch SM, Kroenke K. Development and initial validation of the PEG, a three-item scale assessing pain intensity and interference. Journal of General Internal Medicine. 2009 June 1;24(6):733–8.