JCO Clinical Cancer Informatics. 2024 Mar 12;8:e2300039. doi: 10.1200/CCI.23.00039

Prediction of Cancer Symptom Trajectory Using Longitudinal Electronic Health Record Data and Long Short-Term Memory Neural Network

Sena Chae 1, W Nick Street 2, Naveenkumar Ramaraju 3, Stephanie Gilbertson-White 1
PMCID: PMC10948138  PMID: 38471054

Abstract

PURPOSE

The ability to predict symptom severity and progression across treatment trajectories would allow clinicians to provide timely intervention and treatment planning. However, such predictions are difficult because of sparse and inconsistent assessments, and simplistic measures such as the last observed symptom severity are often used instead. The purpose of this study is to develop a model for predicting future cancer symptom experiences on the basis of past symptom experiences.

PATIENTS AND METHODS

We performed a retrospective, longitudinal analysis using records of patients with cancer (n = 208) hospitalized between 2008 and 2014. A long short-term memory (LSTM)–based recurrent neural network, a linear regression, and random forest models were trained on previous symptoms experienced and used to predict future symptom trajectories.

RESULTS

We found that at least one of three tested models (LSTM, linear regression, and random forest) outperformed predictions based solely on the previous clinical observation. LSTM models significantly outperformed linear regression and random forest models in predicting activity (P < .01), nausea (P < .1), and psychosocial status (P < .01). Linear regression outperformed all models when predicting oral health (P < .01), while random forest outperformed all models when predicting mobility (P < .01) and nutrition (P < .01).

CONCLUSION

We can successfully predict patients' symptom trajectories with a prediction model, built with sparse assessment data, using routinely collected nursing documentation. The results of this project can be applied to better individualize symptom management to support cancer patients' quality of life.

INTRODUCTION

Patients with cancer routinely suffer from symptoms related to both the disease and treatment, with patients undergoing chemotherapy experiencing more symptoms than patients who do not.1 Commonly reported symptoms in patients with cancer are pain, fatigue, sleep disturbance, nausea and vomiting, depression, emotional distress, appetite changes, constipation, and dyspepsia.2,3 Uncontrolled symptoms in patients with cancer can result in negative outcomes such as decreased quality of life,4 increased health care utilization,5 and potentially shorter life. Precision symptom management such as predicting symptom severity could improve physical activity, anxiety, and quality of life,4,6 and potentially reduce health care utilization7 and cost.8 Identifying the impact of factors such as patient characteristics, treatment regimens, and previous symptoms on symptom occurrence during chemotherapy is important for considering appropriate pretreatment measures, including nutrition and mental health care.9 However, the ability to predict symptom development during cancer treatment remains elusive.

CONTEXT

  • Key Objective

  • To develop a model for predicting future cancer symptom experiences on the basis of past symptom experiences using a long short-term memory (LSTM) network, which can address the issue of the sparse and irregular reporting time periods of electronic health record data.

  • Knowledge Generated

  • We found that at least one of three tested models (LSTM, linear regression, and random forest) outperformed predictions based solely on the previous clinical observation. The LSTM model outperformed other models in predicting activity (P < .01), nausea (P < .1), and psychosocial status (P < .01).

  • Relevance (J.L. Warner)

  • The authors have shown a way to take advantage of structured data from flow sheets in predictive analytics for an important aspect of the cancer journey, symptoms and psychosocial status. LSTM was formally shown to outperform the standard machine learning approaches of linear regression and random forest for several parameters.*

    *Relevance section written by JCO Clinical Cancer Informatics Editor-in-Chief Jeremy L. Warner, MD, MS, FAMIA, FASCO.

Automatic data-driven prediction of symptom severity would enable early intervention and treatment adaptation. Real-world electronic health record (EHR) data can potentially support automatic symptom prediction because of its wealth of patient symptom information.10

Symptoms are primarily recorded in the EHR during health care interactions, with little to no information about the patient's home experience. Additionally, patients' assessments are irregular and may show dynamic, nonlinear symptom severity changes in the current EHR flowsheet data (Appendix Fig A1). Although traditional statistical analyses using EHRs have not been successful in the face of these challenges,11 this study evaluates an alternative method for predicting symptom development in patients with cancer using complex EHR data.

Most studies that built predictive models with EHR data from patients with cancer focused on predicting incidence of cancer,12-14 cardiovascular risk,15-17 readmission,18 or survival,19-21 but prediction of symptom severity was rare.22

Artificial neural networks such as recurrent neural networks (RNNs) have been applied to analyze time-stamped events in EHRs, including predicting heart failure onset.15,16,23 Specifically, the LSTM algorithm, a type of RNN, has shown promise in health care.16,24-27 Lipton et al27 demonstrated LSTM's ability to capture long-range dependencies and nonlinear patterns in EHR data, while another study found that LSTM outperformed traditional models (K-nearest neighbor, logistic regression, support vector machines, and multilayer perceptron) in heart failure detection.16 LSTM networks are appropriate for processing time-stamped data, learning patterns across different time intervals while filtering out unnecessary information.

The purpose of this study is to develop a model for predicting future cancer symptom experiences on the basis of past symptom experiences using longitudinal EHR data and LSTM.

PATIENTS AND METHODS

Patient Population and Settings

A retrospective, longitudinal analysis was performed using patients with solid organ cancer who were hospitalized (n = 208) for any reason at a Midwestern academic medical center between 2008 and 2014 (institutional review board ID: 201505811). Inclusion criteria were (1) age 18 years or older, (2) solid organ cancer (stage III or IV disease using TNM staging),28 (3) receipt of at least one chemotherapy cycle, and (4) at least one inpatient hospitalization. All records meeting these criteria spanning 7 years were obtained from the Clinical Research Data Warehouse.

Variables

Demographic and clinical characteristics including age, sex, cancer site, TNM stage, and chemotherapy cycles were extracted from the EHR. Symptom assessment data came from flowsheets routinely documented by nurses during inpatient hospitalizations using various assessment scales, as summarized in Table 1. Because of the sparsity of symptom assessments, the eight most frequently documented symptoms (activity, appetite, mobility, nausea, nutrition, oral health, pain, and psychosocial status) were used.

TABLE 1.

Symptom Assessments Used in This Study

Symptoms (range) Response Options Average No. of Days With at Least One Documented Symptom Assessment Among 208 Patients (range)
Activity as part of the Braden scale (1-4) 1 is bedfast, 2 is chairfast, 3 is walks occasionally, and 4 is walks frequently 4.4 (1-25)
Appetite (1-4) Good, fair, poor, and no appetite, which were then transformed into the numerical values 1-4 5 (1-21)
Mobility as part of the Braden scale (1-4) 1 is completely immobile, 2 is very limited, 3 is slightly limited, and 4 is no limitations 4.7 (1-19)
Nausea (1-3) Absent, intermittent, and present, which were then transformed into the numerical values 1-3 4.9 (1-23)
Nutrition under the Braden scale (1-4) Very poor, probably inadequate, adequate, and excellent, which were then transformed into the numerical values 1-4 5.7 (1-28)
Oral health (0-15) Oral health assessment was scored by the sum score of voice, swallow, lips, tongue, saliva, mucous membrane, gingiva, and teeth. Each variable of oral health is assessed on a 0-2 scale with 2 being worst, and the scores are summed to create a total score of oral health 4.9 (1-24)
Pain severity (0-10) 0 is no pain, 10 is worst possible 17.5 (1-111)
Psychosocial status (1-2) WDL and exception to WDL, which were then transformed into the dichotomous values 1-2. WDL is considered normal and exception to WDL is considered abnormal on the basis of the depth of nursing assessment performed. WDL means appropriate affect, cooperative for developmental level, and able to express thoughts, feelings, and needs. Exception to WDL means the exception to appropriate affect, cooperative for developmental level, and able to express thoughts, feelings, and needs 4.2 (1-19)

Abbreviation: WDL, within defined limits.

LSTM Modeling

Data were analyzed using R (version 4.22; The R Foundation for Statistical Computing, Vienna, Austria),29 and prediction algorithms were trained in Python (version 3.10; Python Software Foundation, Wilmington, DE).30 We used the Keras deep learning library,31 which runs on top of TensorFlow, to train our models using the LSTM algorithm in Python.

We examined the prediction capabilities of each model trained using the LSTM algorithm for one of eight symptoms on 6 years of patient data, subsequently predicting symptom severity after the initiation of chemotherapy. When developing the model for each of the eight symptoms, the seven remaining symptoms were still included as inputs. Because the symptom distributions were inconsistent across the various symptoms, we log2-transformed all the symptom values before training the model. A log2 or natural-logarithm transformation is recommended when the data are skewed.32 In our data set, only a few patients had extreme values across symptoms, and the prediction performance of all the models improved with this transformation. The features included two variable types, static and temporal. The five static variables are age at diagnosis, TNM stage, primary cancer site, sex, and race. The temporal variables used to predict each symptom are defined in Table 2, using pain as an example.
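As an illustration of the transformation step, a minimal sketch follows; the log2(1 + x) form is our assumption to keep zero severities defined, since the paper does not state how zeros were handled:

```python
import numpy as np

# Hypothetical symptom severities on a 0-10 pain scale, with a skewed upper tail
pain = np.array([0.0, 2.0, 3.0, 8.0, 10.0])

# log2(1 + x) compresses extreme values while mapping 0 to 0
# (the +1 offset is an assumption, not stated in the paper)
pain_log2 = np.log2(1.0 + pain)
print(pain_log2.round(3))
```

The transformation is monotone, so symptom orderings are preserved while the influence of the few extreme values is reduced.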

TABLE 2.

Temporal Variables Included in the LSTM Model Using the Example of Pain

Temporal Variable Operational Definition
Days since diagnosis The number of days since cancer diagnosis
Chemo Whether this patient received chemotherapy or not on the particular day
Last chemo day The most recent date that the patient received chemotherapy
Chemo cycle The chemotherapy cycle number
Pain The pain severity on the particular day when a pain was assessed
Last pain The pain severity assessed on the most recent day
Last pain day The latest date (since cancer diagnosis) of pain record
Last pain diff The difference in pain severity between current day and the most recent day of assessment
Pain day diff The difference in days between the current day and the most recent day that pain severity was last recorded
Pain reported count The number of pain records so far

Abbreviation: LSTM, long short-term memory.

Table 3 provides an exemplar for the features included for pain for a selected patient. The exemplar includes the pain features (severity and longevity), as well as the frequency of chemotherapy treatments across the first 23 days of hospitalization (discrete days when the patient was hospitalized) after the date of cancer diagnosis. This included both single-day and more-than-one-consecutive-day hospitalizations. For symptom inputs, we used the most recent value of symptom severity and the count of instances of a documented symptom. We also input the number of days between records, which we call pain day diff. Using this information, the LSTM algorithm puts more weight on input from more recent days when training the model.
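The derivation of the temporal variables described above can be sketched as follows; the record values and the exact handling of missing assessments are illustrative assumptions based on the operational definitions in Table 2, not the authors' code:

```python
import math

# Hypothetical sparse pain records: (day since diagnosis, severity); NaN = not assessed
records = [(48, math.nan), (55, math.nan), (66, 8.0), (67, 0.0), (70, 2.0)]

def temporal_features(records):
    """Derive Table 2-style temporal pain variables for each visit (a sketch)."""
    feats, last_pain, last_day, count = [], math.nan, math.nan, 0
    for day, pain in records:
        feats.append({
            "day": day,
            "pain": pain,
            "last_pain": last_pain,        # most recent observed severity
            "last_pain_day": last_day,     # day of that observation
            "last_pain_diff": pain - last_pain
                if not (math.isnan(pain) or math.isnan(last_pain)) else math.nan,
            "pain_day_diff": day - last_day
                if not math.isnan(last_day) else math.nan,
            "pain_reported_count": count,  # pain records seen so far
        })
        if not math.isnan(pain):
            last_pain, last_day = pain, day
            count += 1
    return feats

f = temporal_features(records)
print(f[3]["last_pain"], f[3]["pain_day_diff"])  # 8.0 1
```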

TABLE 3.

Features Included in the LSTM Model for Specific Patient for Pain Severity

Visit No. Static Variables Temporal Variables
Days Since Diagnosis Chemo Last Chemo Day Chemo Cycle Pain Last Pain Last Pain Day Last Pain Diff Pain Day Diff Pain Reported Count
1 Age at diagnosis, TNM stage, cancer site, sex, race 48 1a NaNa 1a NaN NaN NaN NaN NaN 0
2 55 1 48 2 NaN NaN NaN NaN NaN 0
3 62 1 55 3 NaN NaN NaN NaN NaN 0
4 66 NaN 62 3 8.0 NaNb NaNb NaNb NaN 0b
5 67 NaNc 62c 3c 0.0c 8.0c 66c NaN 1c 1c
6 68 NaN 62 3 NaN 8.0 66d NaNd 2 1
7 70 1 62 4 2.0 0.0 67 –8.0 3 2
8 76 NaN 70 4 6.0 2.0 70 2.0 6 3
9 77 1 70 5 5.0 6.0 76 4.0 1 4
10 84 1 77 6 7.0 5.0 77 –1.0 7 5
20 115 NaN 113 8 NaN 3.0 113 –0.7 2 14
21 127 NaN 113 8 0.0 3.8 114 0.8 13 14
22 128 NaN 113 8 2.0 0.0 127 –3.8 1 15
23 129 NaN 113 8 3.0 2.0 128 2.0 1 16

Abbreviations: LSTM, long short-term memory; NaN, not a number—a value that is undefined or unrepresentable.

a

For this patient, 48 days after a cancer diagnosis, the patient started the first cycle of chemotherapy. There was no observed pain on this day. Before day 48, we do not have any information except demographic and clinical information. Therefore, this patient's chemo is one and chemo cycle is one, and last chemo day does not exist.

b

This patient had second chemotherapy on day 55, third chemotherapy on day 62, fourth chemotherapy on day 70, and so on. This patient did not have a pain record before day 66; therefore, last pain, last pain day, last pain diff, and pain day diff do not exist on day 66. Pain is eight on day 66 and pain reported count is zero.

c

When we use day 67 in the algorithm, the pain intensities included for this day are eight and zero, last pain is eight, and the count of how many times this patient was assessed so far is one. On day 67 after cancer diagnosis, this patient did not receive chemotherapy thus chemo is none, and chemo cycle is still three and last chemo day is 62. On day 67, last pain day is 66 and last pain is eight. Here, the current day with pain is 67 and the most recent day with a pain record is 66. Thus, pain day diff on day 67 is one.

d

Last pain diff is none because there is no pain record on day 68.

Our methodology used each patient's symptom assessment records from 10 days of hospitalization, applying a sliding time window to predict the symptom value for the next (11th) visit as a target. Time-series inputs were processed iteratively within this sliding window. We chose 10 days of hospitalization, not necessarily consecutive, because the number of days within a patient's hospitalization is not fixed; using the 10 most recent hospitalized days per patient keeps the window close to the mean length of stay (13.6 days) while minimizing the sparsity in the data. When a patient had fewer than 10 days of hospitalization, we initialized the input features for unobserved days with mean values. The median number of days between hospitalizations (unobserved days) is three, which was used to impute missing values in our model. We standardized the inputs to follow a standard normal distribution using the mean and standard deviation (SD) of the respective features calculated from the training data. Figure 1 illustrates LSTM modeling of longitudinal symptom record features as input data using the observation and prediction windows.
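The windowing and mean initialization described above might be sketched as follows; the feature count, the use of per-feature means as fill values, and the truncation rule are our assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_window(visits, n_steps=10, fill=None):
    """Pad or truncate one patient's visit matrix to a fixed 10-step window.
    visits: (n_visits, n_features); fill: per-feature values for unobserved days
    (mean initialization, per the text)."""
    n_visits, n_feat = visits.shape
    if fill is None:
        fill = visits.mean(axis=0)
    if n_visits >= n_steps:
        return visits[-n_steps:]                  # 10 most recent hospitalized days
    pad = np.tile(fill, (n_steps - n_visits, 1))  # mean-initialize missing days
    return np.vstack([pad, visits])

visits = rng.normal(size=(6, 4))                  # hypothetical patient, 6 visits
window = make_window(visits)
print(window.shape)                               # (10, 4)

# Standardize to a standard normal (training-set statistics in the real pipeline;
# the window itself is used here only for brevity)
mu, sd = window.mean(axis=0), window.std(axis=0)
z = (window - mu) / sd
```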

FIG 1.

Study design of experiment for predicting pain. LSTM, long short-term memory.

Hyperparameter Optimization for an LSTM

We fine-tuned hyperparameters to prevent overfitting and increase prediction accuracy. First, we initialized the model as sequential to add different layers, such as pooling and dropout.34,35 We optimized hyperparameters for the LSTM models in two stages: manual fine-tuning followed by a grid search. During the manual fine-tuning, we found that an LSTM model with an Adam optimizer, mean absolute error (MAE) as the loss function, a learning rate of 0.01 with exponential decay, and a leaky rectified linear unit as the activation function on the output unit consistently outperformed other choices. We then used a grid search to select the number of units in the first two LSTM hidden layers (4-16) and the fully connected layer (16-256); the type of regularization (Lasso, Ridge, or Elastic Net); and the dropout rate (2.5% or 5%) in every LSTM layer.34-36 We used a cross-validation approach to select the hyperparameters that resulted in the lowest error for each symptom. In all the LSTM models, 10 timesteps were used in the hyperparameter selection and the training process. During the grid search, we trained the models for 300 epochs,37 stopping the training prematurely if the validation error did not improve for 10 epochs. The epoch with the lowest validation error was selected as the final model, ensuring robust generalization for the chosen hyperparameters.
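A sketch of one grid-search candidate under the stated choices (Adam, MAE loss, exponential learning-rate decay, LeakyReLU on the output, dropout in each LSTM layer, early stopping with patience 10); the layer sizes, regularizer, decay schedule steps, and feature count are placeholders, not the authors' final values:

```python
import tensorflow as tf
from tensorflow.keras import callbacks, layers, regularizers

def build_model(n_steps=10, n_features=15, units1=8, units2=8, dense=64,
                l2=1e-4, dropout=0.05):
    """One hypothetical grid-search candidate mirroring the paper's setup."""
    model = tf.keras.Sequential([
        layers.Input(shape=(n_steps, n_features)),
        layers.LSTM(units1, return_sequences=True,
                    kernel_regularizer=regularizers.l2(l2)),
        layers.Dropout(dropout),
        layers.LSTM(units2, kernel_regularizer=regularizers.l2(l2)),
        layers.Dropout(dropout),
        layers.Dense(dense, activation="relu"),   # fully connected layer
        layers.Dense(1),
        layers.LeakyReLU(),                       # leaky ReLU on the output unit
    ])
    # Learning rate 0.01 with exponential decay (decay steps/rate are placeholders)
    lr = tf.keras.optimizers.schedules.ExponentialDecay(0.01, 1000, 0.9)
    model.compile(optimizer=tf.keras.optimizers.Adam(lr), loss="mae")
    return model

# Stop early when validation error has not improved for 10 epochs,
# keeping the weights from the best epoch
stop = callbacks.EarlyStopping(patience=10, restore_best_weights=True)
```

Passing `stop` to `model.fit(..., callbacks=[stop])` implements the premature-stopping rule described in the text.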

Alternate Models

The Wolpert-Macready "no free lunch" theorem states that there is no single best algorithm for all problems.28 Therefore, besides the LSTM algorithm, we also trained a linear regression model and a random forest model for all eight symptoms, both provided by the scikit-learn library in Python. Because of inherent limitations in these models, we trained them using only the most recent observation. We used the grid-search algorithm38 to automatically fine-tune the hyperparameters with five-fold cross-validation.39 For the random forest algorithm, we explored several model choices by varying the maximum depth of a tree (2, 5, and automatic), the maximum number of features to consider in each tree (2, 5, and automatic), and the number of trees in the model (25, 50, and 100). Similarly, for the linear regression algorithm, we explored the choice of regularization (Lasso, Ridge, and Elastic Net) and penalty (0.1, 0.5, and 1) using grid search. We used the best model identified by grid search in our results.
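The grid search over the stated hyperparameter choices might look like the following scikit-learn sketch; reading "automatic" as scikit-learn's defaults is our assumption:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.model_selection import GridSearchCV

# Random forest grid over the choices described in the text; "automatic" is read
# here as scikit-learn's defaults (max_depth=None, max_features=1.0) -- an assumption
rf_grid = {
    "max_depth": [2, 5, None],
    "max_features": [2, 5, 1.0],
    "n_estimators": [25, 50, 100],
}
rf_search = GridSearchCV(RandomForestRegressor(random_state=0),
                         rf_grid, cv=5, scoring="neg_mean_absolute_error")

# Regularized linear regression: Lasso, Ridge, or Elastic Net,
# each searched over the penalty strengths 0.1, 0.5, and 1
linear_candidates = [
    GridSearchCV(est, {"alpha": [0.1, 0.5, 1.0]}, cv=5,
                 scoring="neg_mean_absolute_error")
    for est in (Lasso(), Ridge(), ElasticNet())
]
```

Calling `.fit(X, y)` on each search object runs five-fold cross-validation and exposes the winning configuration via `best_params_`.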

Performance Evaluation

In clinical practice, symptoms are typically assessed using patients' historical records, often focusing on the most recent documented value unless there have been changes in medication or interventions. Hence, in this study, we used the last observed values as a baseline for evaluating the performance of LSTM and traditional machine learning approaches.

MAE is commonly used to evaluate regression problems.40 We used six pairwise t-tests to compare the performance of four models (LSTM, linear regression, random forest, and model-free prediction using the last observed value). MAE for the LSTM, linear regression, and random forest was estimated using five-fold cross-validation, in which models were trained on 80% of the data and evaluated on the remaining 20% of the data as a test set. This cross-validation was repeated on 30 different randomized partitions of the data set while ensuring that all patients had equal representation in the test set. The 30 MAEs and SDs computed for each model were used to compare the model performance.
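The paired comparison of per-partition MAEs can be illustrated with synthetic numbers; the MAE distributions below are made up for demonstration, and only the test procedure mirrors the text:

```python
import numpy as np
from scipy import stats

# Hypothetical per-repetition MAEs from 30 randomized cross-validation runs
rng = np.random.default_rng(42)
mae_lstm = rng.normal(0.52, 0.04, size=30)   # e.g., nausea, LSTM
mae_last = rng.normal(0.91, 0.08, size=30)   # model-free (last observed value)

# Paired t-test, because both models are scored on the same 30 partitions
t, p = stats.ttest_rel(mae_lstm, mae_last)
print(p < 0.01)  # True for clearly separated MAE distributions like these
```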

Symptom Variability and Model Selection

Each symptom displays distinct variation over time across patients. Complex models such as LSTM might be more suitable when a symptom exhibits variation over time for individual patients with some unknown relationship. However, if a symptom demonstrates minimal variation for individual patients, using LSTM might cause overfitting. In such cases, a linear regression or a random forest model using only the past observation could be preferable. To quantify symptom variability, we used an analysis of variance (ANOVA) approach, treating each patient as a group.33 The within-group variance, between-group variance, and the F-test statistics for variances are presented in Table 4. The F-test for the equality of variances is a statistical test used to determine whether two or more groups or populations have significantly different variances. The formula for the one-way ANOVA F-test statistic is F = (between-group variability)/(within-group variability).33
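The one-way ANOVA F-statistic used here can be computed directly; the patient groups below are hypothetical:

```python
import numpy as np

def f_statistic(groups):
    """One-way ANOVA F = between-group variability / within-group variability,
    treating each patient's repeated symptom scores as one group."""
    k = len(groups)                                # number of patients (groups)
    n = sum(len(g) for g in groups)                # total observations
    grand = np.mean(np.concatenate(groups))
    between = sum(len(g) * (np.mean(g) - grand) ** 2 for g in groups) / (k - 1)
    within = sum(((g - np.mean(g)) ** 2).sum() for g in groups) / (n - k)
    return between / within

# Hypothetical pain scores for three patients (one group per patient)
patients = [np.array([2.0, 3.0, 2.0]),
            np.array([8.0, 7.0]),
            np.array([5.0, 5.0, 6.0])]
print(round(f_statistic(patients), 2))  # 45.91
```

A high F, as here, means between-patient differences dominate, which per the analysis above favors the simpler models; a low F means within-patient variation dominates, favoring LSTM.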

TABLE 4.

Symptom Variability Using ANOVA

Symptom Observations, No. Patients, No. Within-Group Variance Between-Group Variance F-Score
Activity 810 189 0.09 0.15 1.76
Appetite 227 45 0.10 0.11 1.08
Mobility 861 190 0.05 0.09 1.71
Nausea 366 81 0.12 0.11 0.90
Nutrition 1,045 188 0.06 0.06 1.01
Oral health 675 135 0.02 0.11 6.16
Pain severity 1,715 99 0.04 0.24 6.71
Psychosocial status 515 92 0.13 0.12 0.94

Abbreviation: ANOVA, analysis of variance.

RESULTS

Sample Characteristics

The study sample consisted of 121 men and 87 women, a total of n = 208 patients diagnosed with cancer. Mean (SD) age was 60.7 (10.9) years, with a range of 28-88 years (Table 5). The majority (95.2%) of patients were White and had stage III (30.3%) or stage IV (63.5%) cancer at the time of diagnosis. The primary site of cancer varied, with 12 different sites in the sample, the most common being cancer of the bronchus and lungs (41.3%). The majority of the sample (83.7%) were deceased at the time of data extraction for this study.

TABLE 5.

Characteristics of the Sample

Characteristic n = 208
Age at diagnosis, years, mean (range) 60.7 ± 10.9 (28-88)
Sex, No. (%)
 Male 121 (58.2)
 Female 87 (41.8)
Race, No. (%)
 White 198 (95.2)
 African American 3 (1.4)
 Asian 4 (1.9)
 Declined to answer 3 (1.4)
TNM stage, No. (%)
 I 1 (0.5)
 II 8 (3.9)
 III 63 (30.3)
 IV 132 (63.5)
 Others 4 (1.9)
Cancer site, No. (%)
 Breast and endocrine 2 (1)
 H&N 32 (15.4)
 Esophagus 10 (4.8)
 HBP 26 (12.5)
 Bronchus and lung 86 (41.3)
 Stomach 4 (1.9)
 LGI 10 (4.8)
 UGI 2 (1)
 Skin 6 (2.9)
 GY 1 (0.5)
 Urology 17 (8.2)
 NA 12 (5.8)
Chemotherapy cycles received, No. (%)
 1 39 (18.8)
 2 29 (13.9)
 3 19 (9.1)
 4 15 (7.2)
 5 11 (5.3)
 6 21 (10.1)
 7 13 (6.3)
 8 10 (4.8)
 9 7 (3.4)
 Over 10 42 (20.2)

Abbreviations: GY, gynecology; H&N, head and neck; HBP, hepatobiliary and pancreas; LGI, low GI; NA, not available; UGI, upper GI.

Model Performance Evaluation

The results pertain to the evaluation of the model's performance on the basis of the training data. Table 6 shows a comparison of MAE between the real and predicted values using LSTM, linear regression, random forest, and the last observation. The results of t-tests showed that the performance of LSTM, linear regression, and random forest models were better than model-free prediction (last observed) for all eight symptoms except for the LSTM model to predict oral health. The best performing model for a given symptom improved the model-free performance by 41.03% (activity), 36.36% (appetite), 45.92% (mobility), 42.86% (nausea), 44.55% (nutrition), 10.38% (oral health), 16.57% (pain severity), and 36.67% (psychosocial status).
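The reported improvement percentages follow directly from Table 6; for example, for activity:

```python
# Improvement of the best model over the model-free baseline (Table 6, activity):
# last observed MAE 1.17 vs best (LSTM) MAE 0.69
last_observed, best = 1.17, 0.69
improvement = (last_observed - best) / last_observed * 100
print(round(improvement, 2))  # 41.03
```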

TABLE 6.

Comparison of MAE Between the Real and Predicted Values Using the Last Observation, LSTM, Linear Regression, and Random Forest (MAE ± SD)

Symptom (range) Last Observed LSTM Linear Regression Random Forest Improvement by the Best Model, %
Activity (1-4) 1.17 (0.04) 0.69* (0.05) 0.76 (0.04) 0.74 (0.05) 41.03
Appetite (1-4) 1.21 (0.14) 0.78 (0.1) 0.77 (0.09) 0.78 (0.09) 36.36
Mobility (1-4) 0.98 (0.02) 0.57 (0.03) 0.57 (0.03) 0.53* (0.04) 45.92
Nausea (1-3) 0.91 (0.08) 0.52** (0.04) 0.54 (0.04) 0.53 (0.04) 42.86
Nutrition (1-4) 1.01 (0.03) 0.59 (0.02) 0.58 (0.02) 0.56* (0.02) 44.55
Oral health (0-15) 1.83 (0.18) 1.85 (0.33) 1.64* (0.24) 1.79 (0.32) 10.38
Pain severity (0-10) 1.75 (0.17) 1.47 (0.17) 1.46 (0.12) 1.46 (0.14) 16.57
Psychosocial status (1-2) 0.3 (0.06) 0.19* (0.04) 0.26 (0.02) 0.27 (0.03) 36.67

NOTE. Values in bold indicate which of the three machine learning models performed best in comparison with the last observation.

Abbreviations: LSTM, long short-term memory; MAE, mean absolute error; SD, standard deviation.

*P < .01, **P < .1.

The t-tests show that the LSTM model outperformed other models in predicting activity (P < .01), nausea (P < .1), and psychosocial status (P < .01). The LSTM model's performance was comparable with that of the linear and random forest models in predicting appetite and pain. Random forest outperformed all models when predicting mobility (P < .01) and nutrition (P < .01; Table 6).

We found that the model selection depends on the inherent variability of the symptom. Comparing the results from Tables 4 and 6, it is noticeable that the performance improvement by the models increased as the F-test statistics decreased for LSTM, or as the F-test statistics increased for linear regression. The usage of models to predict symptoms proved advantageous when the unexplained variance was more pronounced for a specific symptom (low F-test statistics), particularly with complex models such as LSTM.

DISCUSSION

To our knowledge, this study marks the first application of the LSTM algorithm to predict the severity of eight symptoms (activity, appetite, mobility, nausea, nutrition status, oral health assessment, pain severity, and psychosocial status) in patients with cancer. The enhanced predictive capability of LSTM, using historical symptom assessments, aligns with previous research findings.42-46 Factors such as younger age, a history of nausea and vomiting, anxiety, and fatigue were associated with increased chemotherapy-induced nausea and vomiting.43 Additionally, nutrition status in this study can be predicted by considering the other seven symptoms, including nausea. These findings support earlier research indicating that uncontrolled, treatment-induced nausea and/or vomiting can lead to anorexia, weight loss, or nutritional issues.44

Psychosocial status is a valuable predictor of decreased activity and psychosocial distress, aligning with previous research indicating that depressive symptoms can exacerbate decreased activity in women with metastatic breast cancer.45 Similarly, baseline distress and neuroticism predicted longer-term emotional distress after cancer diagnosis.42 Additionally, pain intensity was influenced by patients' emotional distress, including anxiety and depression symptoms.46 These findings regarding the role of psychosocial status in predicting activity, emotional distress, and pain intensity emphasize the importance of early assessment for psychosocial status and emotional distress. Conversely, considering that pain intensity was the most frequently documented symptom on a flowsheet, one interpretation of the findings is that patients who initially reported significant pain are more likely to consistently report pain.

Similar to the findings of the predictive relationships between the symptoms above, trouble sleeping, depressed mood, and pain, as well as age and cancer site had a significant effect on fatigue in a previous study among patients with cancer.47 Those results corroborate our results from analyzing future fatigue severity among patients with cancer on the basis of several symptoms and patients' demographic and clinical characteristics using LSTM. These results suggest that interventions targeting treatment of depressed mood and pain will improve fatigue in patients with cancer.

In general, the LSTM-based model outperformed a clinically used approach of observing the most recent symptom to predict future symptoms in this study. We assume the reason for this is information loss: a clinically used approach cannot work with time-series sequences because it simply tracks changes in symptom severity, whereas the LSTM-based model considers irregular time intervals between hospitalizations. These findings are similar to those for early prediction of negative outcomes such as sepsis,26 mortality,25,48 and opioid overdose risk.24 However, the MAEs between predicted and actual values using LSTM for oral health were marginally greater than those using the last observations. A study investigating factors of mucositis in patients with cancer receiving chemotherapy showed that xerostomia and lower baseline neutrophil levels are significantly linked with oral mucositis.45 This result suggests the benefit of considering additional features predictive of oral health, such as xerostomia and lower baseline neutrophil levels, in future research.

Importantly, this study's current experiments suggest that LSTM models can be used in the prediction of three symptoms (activity, nausea, and psychosocial status) because LSTM models outperformed linear regression and random forest models significantly on these three symptoms. The regression model is suitable for oral health on the basis of the results that the linear regression model significantly outperformed LSTM and random forest models in the prediction of oral health. The random forest model is suitable for mobility and nutrition because it significantly outperformed LSTM and linear regression in the prediction of nutrition and mobility.

We found that the LSTM model was not able to outperform linear regression on oral health or random forest on nutrition and mobility, even with extensive fine-tuning. One possible explanation, from the standardized-variance perspective of the data, is that for symptoms with lower variability in both the predicted variable (symptom) and consecutive visits, using LSTM models becomes less advantageous. We found that the LSTM model is suitable when the F-test statistic is low (ie, more variance in the symptom within the same patient and for each patient) and that linear regression or random forest is better when the F-test statistic is high. The LSTM model may therefore not always be the optimal performer; alternative, simplified model-based approaches can be explored, guided by specific data characteristics such as variance over time and within individual patients. Additionally, combinations of multiple models need to be explored and evaluated.

The predictive method used in this study is not immediately ready for clinical implementation. Future research will aim to develop and validate a symptom prediction model, paving the way for pilot testing. Successful pilot testing will inform the potential implementation of a clinical decision support tool for symptom prediction in hospitals. This predictive approach has broader applications beyond cancer, potentially benefiting patients with various conditions such as chronic diseases, palliative care, or aging populations.

We acknowledge the following limitations: (1) for multilabel prediction problems related to symptoms, the input could be patient's demographics and treatment attributes, and the output could be multiple target variables such as types of symptoms across different symptoms. Although multilabel prediction could be an approach to consider, it may not be suitable for neural networks in this specific study because of their performance. We noted that addressing the use of multitask learning in future work could be a potential avenue for exploration. (2) These data were extracted from the EHR of a single institution. Future research is needed to predict symptoms during various treatment protocols using more comprehensive and multiple-dimensional symptom assessment tools that are incorporated in the EHR with larger and multicenter studies. (3) We had a relatively small sample size (n = 208) that consisted of a heterogeneous mix of solid tumor primary cancer sites. Future research is needed to validate this model with a larger sample size and multiple institutions and various patients.

In conclusion, we modeled temporal cancer symptoms to imitate clinician reasoning by incorporating old with new information using an LSTM network, which can address the issue of the sparse and irregular reporting time periods of EHR data. At least one out of three models (LSTM, linear regression, and random forest) has shown superior performance compared with predictions based solely on previous clinical observations. Particularly, LSTM models significantly outperformed linear regression and random forest models in predicting three symptoms (activity, nausea, and psychosocial status). Linear regression outperformed all models when predicting oral health, while random forest outperformed all models when predicting mobility and nutrition. Nevertheless, the LSTM model may not always be the optimal choice, and we can consider exploring alternative, simpler model–based approaches on the basis of the characteristics of the data. The further development of predictive models and future implementation can be used by clinicians for a better understanding of symptom prognosis patterns and for making timely decisions to plan treatments. We can predict patients' symptom trajectories using routinely collected nursing documentation with a predictive model that can be built into future EHRs.

ACKNOWLEDGMENT

This work was a part of a dissertation by S.C. in 2020 at the University of Iowa, and S.G.-W. and W.N.S. were co-chairs. The authors acknowledge the Iowa Health Data Resource (IHDR) at the University of Iowa for their support.

APPENDIX

FIG A1.


(A) Frequency of patients' hospitalization days and (B) spaghetti plot of observed individual pain trajectories for 6 months after the first cycle of chemotherapy. Blue dots: short-term (single-day) hospitalizations; blue lines: long-term (more than one consecutive day) hospitalizations.

PRIOR PRESENTATION

Presented at the MNRS's 44th Annual Research Conference, Chicago, IL, April 1-4, 2020.

SUPPORT

Supported in part by the Holden Comprehensive Cancer Center at the University of Iowa (National Cancer Institute [NCI] grant P30CA086862); the Institute for Clinical and Translational Science at the University of Iowa (Clinical and Translational Science Award [CTSA] grant UL1TR002537); and the Alma Miller Ware Nursing Endowment from the University of Iowa College of Nursing.

AUTHOR CONTRIBUTIONS

Conception and design: All authors

Provision of study materials or patients: Stephanie Gilbertson-White

Collection and assembly of data: Stephanie Gilbertson-White

Data analysis and interpretation: Sena Chae, W. Nick Street, Naveenkumar Ramaraju

Manuscript writing: All authors

Final approval of manuscript: All authors

Accountable for all aspects of the work: All authors

AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST

The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/cci/author-center.

Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians (Open Payments).

No potential conflicts of interest were reported.

