AMIA Summits on Translational Science Proceedings. 2021 May 17;2021:229–237.

Applications of Aspect-based Sentiment Analysis on Psychiatric Clinical Notes to Study Suicide in Youth

Amy George 1, David Johnson 1, Giuseppe Carenini 1, Ali Eslami 2,4, Raymond Ng 1, Elodie Portales-Casamar 3,4
PMCID: PMC8378644  PMID: 34457137

Abstract

Understanding and identifying the risk factors associated with suicide in youth experiencing mental health concerns is paramount to early intervention. Approximately 45% of patients are admitted annually for suicidality at BC Children's Hospital. Natural Language Processing (NLP) approaches have been applied with moderate success to psychiatric clinical notes to predict suicidality. Our objective was to explore whether machine-learning-based sentiment analysis could be informative in such a prediction task. We developed a psychiatry-relevant lexicon and identified specific categories of words, such as thought content and thought process, that had significantly different polarity between suicidal and non-suicidal cases. In addition, we demonstrated that the individual words with their associated polarity can be used as features in classification models and carry informative content to differentiate between suicidal and non-suicidal cases. In conclusion, our study reveals that there is much value in applying NLP to psychiatric clinical notes for suicide prediction.

Introduction

Although suicide accounts for fewer than 10% of deaths in youth globally, it is still the second leading cause of death in youth1. Suicidal thoughts and ideations are even more prevalent among youth, ranging from 20 to 30%2,3. Suicidal ideations and behaviours are strongly associated with co-occurring mental disorders and high-risk behaviours4,5. Despite a growing body of research on intervention and improved social awareness, mental health concerns in youth remain prevalent and under-treated. Recent research has shown a 60% increase in pediatric emergency visits for mental health disorders and a 329% increase in visits for intentional self-harm between 2007 and 20166. Research by Doan et al.7 has shown that over 35% of youth surveyed with universal psychosocial screening in the emergency room warranted further psychiatric follow-up. There is clearly a gap that can be addressed before these youth are admitted for self-harm or suicide attempts.

The Child and Adolescent Psychiatric Emergency (CAPE) unit at BC Children's Hospital, Vancouver, Canada, specializes in providing emergency intervention and stabilization for youth in psychiatric crisis. Approximately 45% of patients are admitted annually to CAPE for suicidality, amounting to over 100 admissions annually of children deemed to be at substantial and acute risk for suicide. The rate of readmission to CAPE is approximately 30% which has remained consistent over several years. Clinical notes are written at admission and discharge by psychiatrists and typically include, although without following a formal template, the patients' background, mental health, family history, current circumstances, and more, providing a wealth of information that can be analysed to help understand key factors associated with suicidality. Such understanding may be applied more broadly to help flag patients potentially at risk and offer care before reaching a critical stage.

Sentiment analysis is a branch of Natural Language Processing (NLP) used most often to identify and quantify the sentiment, feelings, or opinions associated with a topic. This branch of NLP is most often applied to the analysis of online content (Twitter, reviews, forums, etc.) in an attempt to understand users' opinions of a brand or topic. There have been only limited applications of NLP to psychiatric clinical notes, in part because they tend to be long and cover a variety of topics (family history, social context, clinical observations, etc.), making it hard to tease out the relevant information from the rest. Conversely, this is exactly why such approaches are particularly important to apply to mental health, where most clinical documentation is done through long narratives and does not fit well into the structured fields that are typical in case report forms8. Research has shown that NLP and Machine Learning (ML) techniques can be successfully applied to identify suicide-related crises in clinical notes9-12. However, sentiment analysis has rarely been applied to clinical notes, since one might not expect clinicians to express sentiment in their documentation. A study by McCoy et al.13 applied sentiment analysis to over 17,000 discharge notes, looking at the correlation between a sentiment score for each note and readmission and mortality risk. The study used the Pattern module14, which comes with a predefined lexicon that has an assigned polarity for each word but is not tailored to the clinical field. Similarly, a study by Waudby-Smith et al.15 performed a survival analysis by applying the Pattern module to free-text nursing notes, and a study by Weissman et al.16 looked at the construct validity of six different sentiment analysis methods on patient encounter notes, five of which were lexicon-based and only one of which (Stanford CoreNLP) was ML-based. These studies mostly focused on overall sentiment and how it correlates with patient outcomes such as mortality or readmission. They were also limited by their use of lexicon-based sentiment analysis techniques, which lack lexicons tailored to the clinical field. As noted by Holderness et al.17, most off-the-shelf sentiment analysis tools are not tailored to clinical notes and do not incorporate any medical ontologies, and thus cannot identify clinical sentiment well.

There is an increasing number of ML-based sentiment analysis tools that remove the need to use generic lexicons in specialized domains. One such tool is the recently developed sentiment analysis software ABSApp18 (part of NLP Architect by Intel® AI Lab), a system for weakly-supervised aspect-based sentiment extraction. Aspects are the words that sentiment words target within a sentence; sentiment words convey whether a given aspect has a positive or negative polarity. ABSApp can extract aspects and sentiment words from an unlabeled dataset sentence by sentence, producing an aspect-level sentiment report across the dataset. To reduce redundancy, the report combines aspects by their aliases so that plural or other forms of the same word are not listed multiple times.
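The aspect/sentiment distinction can be illustrated with a toy extractor (a minimal sketch only; the aspect and sentiment lexicons below are hypothetical, and ABSApp's actual weakly-supervised extraction is far more sophisticated than this nearest-word pairing):

```python
# Toy aspect-based sentiment extraction: for each known aspect term in a
# sentence, attach the closest sentiment term and record its polarity.
# This illustrates only the shape of ABSApp's output, not its algorithm.

ASPECTS = {"sleep", "appetite", "mood"}                        # hypothetical aspect lexicon
SENTIMENTS = {"poor": "NEG", "low": "NEG", "improved": "POS"}  # hypothetical sentiment lexicon

def extract_pairs(sentence):
    tokens = sentence.lower().strip(".").split()
    aspect_hits = [(i, t) for i, t in enumerate(tokens) if t in ASPECTS]
    sentiment_hits = [(i, t) for i, t in enumerate(tokens) if t in SENTIMENTS]
    pairs = []
    for ai, aspect in aspect_hits:
        if sentiment_hits:
            # pair the aspect with the nearest sentiment word by token distance
            _, word = min(sentiment_hits, key=lambda s: abs(s[0] - ai))
            pairs.append((aspect, word, SENTIMENTS[word]))
    return pairs

print(extract_pairs("Mood is low but sleep improved."))
# [('mood', 'low', 'NEG'), ('sleep', 'improved', 'POS')]
```

Aggregating such (aspect, sentiment, polarity) triples over a whole dataset yields the kind of aspect-level report described above.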

This study aims to investigate the utility of sentiment analysis using ABSApp to analyze psychiatric clinical notes specifically as it relates to suicidal risk. Our objective was to evaluate whether the use of a tailored lexicon combined with a quantification of the aspects at the topic level can enable classification of the notes related to suicide from other psychiatric crises.

Methods

With research ethics board approval (H18-01402; June 2018), we obtained 1,559 long-form clinical notes written by psychiatrists during encounters with patients at the Child and Adolescent Psychiatric Emergency (CAPE) unit of BC Children's Hospital, Vancouver, Canada, between January 1st, 2015 and May 5th, 2018. Of the 1,559 notes, 515 were labelled as related to suicide (thoughts, ideation, or attempt; "Suicidal dataset") according to ICD-10 codes, 151 were labelled as other psychiatric crises ("Non-suicidal dataset"), and the remaining 893 were not labelled; these were excluded from the analysis and used only in the initial step of creating the lexicon. The 666 files we included in our analysis represent 289 unique patients.

To create our tailored lexicon, we applied ABSApp's lexicon extraction feature to our entire dataset. This feature extracts aspects based on a prebuilt lexicon of sentiment words with an assigned polarity (positive or negative; neutral is not included)18. In brief, new aspect and sentiment terms are extracted through a bootstrap process initiated with a seed lexicon of generic sentiment terms. To initialize the bootstrap process, we used the opinion lexicon that comes with ABSApp, which contains around 6,800 sentiment terms along with their polarity. This ML-based lexicon extraction algorithm19 finds sentiment words not already in the lexicon, thereby expanding sentiment and aspect coverage, which has been a limitation of previous studies relying solely on lexicon-based analysis. By running our full dataset of 1,559 files through ABSApp, we obtained a report of 700 aspects, along with their aliases, such as plural forms, as well as up to 20 examples of the context in which the aspect and sentiment pairs were found. With reference to a protocol20 developed for the same software, we systematically reviewed all aspects to remove redundancies, analysed their nuance, and categorized them to ensure relevance to the given field. This review also included a detailed inspection of the context of the aspects within the example sentences. We retained only aspects for which at least 50% (10/20) of the examples were deemed relevant to patient characteristics or the care process; the rest were deleted. Categorizations were reviewed by the supervisor, and disagreements or uncertainty were resolved by discussion until there was consensus. Aspects were also consolidated into aliases during this process. This refinement resulted in a final lexicon of 330 aspects, which our patient partners then reviewed and provided feedback on.

We then ran ABSApp's sentiment extraction feature on the 666 labelled files using the edited lexicon to extract aspects, their contexts and their associated polarity in each instance. From this, we extracted all unique aspects that had been found, and counted the number of positive and negative sentiments for each unique aspect to get frequencies by polarity for both the Suicidal and Non-suicidal datasets.

We calculated the negativity proportion for each aspect found in both the Suicidal and Non-suicidal datasets by dividing the negative polarity count by the sum of the positive and negative polarity counts for each aspect in each dataset. For the aspect-level analysis, we performed Fisher's exact test on the positive and negative counts produced by each dataset for each aspect, and then corrected for multiple testing with Bonferroni. We also performed Fisher's exact test and Bonferroni correction on the aspect frequencies at the category level.
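The two computations above can be sketched as follows (the specific counts passed to the test function here are placeholders; the negativity example uses the Thought Content counts reported later in Table 2):

```python
# Sketch of the per-aspect statistics described above: the negativity
# proportion, and Fisher's exact test on a 2x2 table of positive/negative
# sentiment counts in the two datasets, Bonferroni-corrected.
from scipy.stats import fisher_exact

def negativity(pos, neg):
    """Negativity proportion: negative count over total sentiment count."""
    return neg / (pos + neg)

def aspect_test(suic_pos, suic_neg, non_pos, non_neg, n_tests):
    """Bonferroni-corrected Fisher's exact p-value for one aspect."""
    # rows = dataset (Suicidal, Non-suicidal); columns = polarity (pos, neg)
    _, p = fisher_exact([[suic_pos, suic_neg], [non_pos, non_neg]])
    return min(p * n_tests, 1.0)

print(negativity(665, 4539))  # Thought Content, Suicidal dataset: ~0.872
```

The same `aspect_test` call, applied to category-level totals instead of per-aspect counts, gives the category-level test.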

To understand whether any individual aspects or categories were strongly associated with suicidality, we ran two classification models on the document-level data. We chose Logistic Regression and Random Forest Classification for their interpretability. We created a matrix using the 330 aspects as features and calculated a net polarity for each feature by summing the positive (+1) and negative (-1) polarity instances associated with each aspect found in each of the 515 and 151 files in the Suicidal and Non-suicidal datasets, respectively. We organized our files by patient encounter first, so that admission and discharge notes for the same patient would not be split across training and testing data when dividing the datasets into folds. We then used random forests and logistic regression for the classification task and performed 3-fold cross-validation. We also ran the classifier on the whole dataset of 666 files and used the 'feature importance' function of the Random Forest Classifier to estimate which features are most important based on permutations. We ran these two models with default parameters21 and did not perform any hyperparameter tuning, as this experiment was meant to establish baseline performance. Finally, we shuffled the labels of the training data and reran the classification models with the 3 folds to confirm that our results were not a product of the disproportionate amount of Suicidal data relative to Non-suicidal data. Performance was measured by calculating the mean accuracy and the mean Receiver-Operating Characteristic (ROC) curve and Area Under the Curve (AUC) across the cross-validation folds. All analyses were run using Python and the scikit-learn library21.
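The setup above can be sketched with synthetic data (this is not the authors' code; the matrix shape mirrors the 330-aspect feature space, and scikit-learn's `GroupKFold` stands in for the patient-grouped fold assignment so that notes from one patient never straddle train and test):

```python
# Sketch of the classification pipeline: a document-by-aspect matrix of
# net polarity scores (+1/-1 instances summed per aspect), with 3-fold
# cross-validation grouped by patient. Data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
n_docs, n_aspects = 200, 330
X = rng.integers(-3, 4, size=(n_docs, n_aspects))  # net polarity per aspect
y = rng.integers(0, 2, size=n_docs)                # 1 = Suicidal label
groups = np.repeat(np.arange(n_docs // 2), 2)      # two notes per patient

cv = GroupKFold(n_splits=3)
for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(random_state=0)):
    scores = cross_val_score(model, X, y, cv=cv, groups=groups)
    print(type(model).__name__, round(scores.mean(), 3))
```

With random labels, as here, mean accuracy hovers near chance; the real feature matrix is what carries the signal reported in the Results.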

Results

In order to develop a lexicon tailored to our dataset, we ran ABSApp to extract all aspects associated with sentiments from our unlabeled dataset of 1,559 psychiatric clinical notes. The output saturated at just over 700 aspects, which made manual refinement possible. We categorized the aspects using seven major risk factor domains found in patient records associated with readmission of psychiatric patients: appearance, mood, interpersonal relationships, substance use, occupation, thought content, and thought process17. Since many of the aspects were related to either medications or disorders, we added these two categories as well. The 360 aspects that did not fit into one of the nine categories and were considered unrelated to patient characteristics or the care process were removed. Table 1 describes the number of unique aspects retained and their distribution by category, with the examples highlighting the three most frequent aspects in each category. The full lexicon is available upon request.

Table 1. Number of aspects and the top three examples in each category in our tailored lexicon.

Aspect category Count Examples
Appearance 19 Eye contact, gestures, tics
Disorders 82 Disorders, history, illnesses
Interpersonal relationships 58 Mother, parents, dad
Medications 30 SSRI, fluoxetine, effects
Mood 38 Felt, moods, behaviours
Occupation 16 Schools, grades, students
Substance use 13 Substances, drugs, medications
Thought content 43 Suicidal ideations, thoughts, intent
Thought process 31 Accessibility, speech, insight

To start investigating potential differences between our Suicidal vs. Non-suicidal datasets, we calculated the overall proportions of positive and negative sentiment contained across all clinical notes in each dataset. In the Suicidal dataset, we identified a total of 5,954 instances of positive sentiment and 22,926 instances of negative sentiment associated with one of the aspects, as compared to 1,452 and 5,752 respectively in the Non-suicidal dataset. Despite a large difference in the quantity of sentiment instances found in the two datasets, the ratio of positive to negative instances is almost the same in each dataset: roughly 20% positive and 80% negative.
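The reported ratios follow directly from the counts in the text:

```python
# Positive-sentiment proportion in each dataset, from the counts above:
# both sit at roughly 20% positive / 80% negative.
suicidal = (5954, 22926)      # (positive, negative) instances
non_suicidal = (1452, 5752)
for pos, neg in (suicidal, non_suicidal):
    print(round(100 * pos / (pos + neg), 1), "% positive")
# 20.6 % positive
# 20.2 % positive
```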

We next separated the aspects based on their category and counted the positive and negative sentiments contained in each category. Table 2 shows the break-down of negativity proportions in both datasets across all categories. Negativity varies by category, with the biggest difference in the Thought Process category (13.91 percentage points), followed by Thought Content (9.46 percentage points). Three categories were significantly different based on a Fisher's exact test and Bonferroni correction: Thought Content (corrected p-value < 0.001), Thought Process (corrected p-value < 0.05), and Mood (corrected p-value < 0.05).

Table 2. Break-down of positive and negative sentiment counts in the Suicidal and Non-suicidal datasets. Asterisks mark categories with statistically significant differences (Fisher's exact test with Bonferroni correction).

Aspect category Suicidal dataset positive:negative count Suicidal dataset negativity proportion Non-suicidal dataset positive:negative count Non-suicidal dataset negativity proportion
Appearance 349: 309 46.96% 54: 67 55.37%
Disorders 938: 11,207 92.28% 294: 3,215 91.62%
Interpersonal relationships 1,451: 1,995 57.89% 371: 546 59.54%
Medications 312: 237 43.17% 66: 66 50.00%
Mood * 930: 2,665 74.13% 219: 823 78.98%
Occupation 228: 418 64.71% 45: 76 62.81%
Substance use 811: 1,297 61.53% 198: 323 62.00%
Thought content * 665: 4,539 87.22% 147: 514 77.76%
Thought process * 277: 324 53.91% 56: 118 67.82%
Total 5,954: 22,926 79.38% 1,452: 5,752 79.84%

There were a total of 265 aspects found in both datasets out of the 330 in the lexicon. To investigate the differences between the Suicidal and Non-suicidal datasets at the aspect level, we performed a Fisher's exact test on each aspect, followed by a Bonferroni correction for multiple testing. After correction, the aspect "intent" remained statistically significant with a corrected p-value < 0.01 (Suicidal: 297:158; Non-suicidal: 21:39 positive:negative counts, respectively). The next closest aspect was "risk factors", with a corrected p-value of 0.08.
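The "intent" result can be reproduced directly from the counts reported above:

```python
# Fisher's exact test on the "intent" aspect: positive:negative counts of
# 297:158 in the Suicidal dataset vs 21:39 in the Non-suicidal dataset,
# Bonferroni-corrected over the 265 aspects tested.
from scipy.stats import fisher_exact

odds, p = fisher_exact([[297, 158], [21, 39]])
p_corrected = min(p * 265, 1.0)
print(round(odds, 2), p_corrected)  # corrected p-value remains < 0.01
```

The odds ratio of roughly 3.5 reflects how much more positive-leaning "intent" mentions are in the Suicidal dataset, consistent with the large negativity gap for this aspect.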

As a preliminary step toward assessing the value of this output in classification tasks, we investigated whether aspect polarity counts for each document could be used as informative features in logistic regression and random forest classifiers when applied to our datasets. We split the data proportionately into three folds and ran 3-fold cross-validation with the models. Then, we generated three new folds by shuffling which files were assigned to each fold and repeated the analysis. Finally, we took the mean of all six outputs for each model. The results are listed in Table 3. The 3-fold cross-validation resulted in an accuracy of 80.70% for logistic regression and 83.69% for random forest. The top three features from the random forest classifier were "suicidal ideations", "autism", and "behaviors". Interestingly, "intent" also ranks high in the list of top features, 6th out of 330 aspects.

Table 3. Mean classification accuracy (%) from 3-fold cross validation using the aspects and their polarity as features in a logistic regression model and random forest classifier.

Aspect Features Logistic Regression Random Forest
All aspects 80.70% 83.69%
All aspects (50:50 dataset) 69.89% 80.35%
All aspects (training labels shuffled) 68.00% 75.09%
All aspects (50:50 dataset and training labels shuffled) 43.50% 53.27%
Appearance aspects only 77.06% 74.50%
Disorders aspects only 81.00% 82.32%
Interpersonal aspects only 76.02% 75.09%
Medications aspects only 76.92% 75.27%
Mood aspects only 77.87% 77.37%
Occupation aspects only 77.05% 76.20%
Substance use aspects only 77.22% 73.54%
Thought content aspects only 79.18% 76.37%
Thought process aspects only 76.47% 75.87%

Given that we have far more Suicidal documents than Non-suicidal ones, we were concerned that the accuracy was a product of the unbalanced dataset. When the dataset was reduced to a 50:50 split (151 files in each of the Suicidal and Non-suicidal datasets), the accuracy dropped to 69.89% for logistic regression and 80.35% for random forest. Next, we shuffled the labels in the training data for the three folds, which resulted in a drop of accuracy to 68.00% and 75.09% for the full dataset and 43.50% and 53.27% for the 50:50 dataset. To see whether any category of aspects plays a stronger role in the classification task, we used only the aspects from each category in turn as the models' features. Table 3 shows that the Disorders category contains the most informative features for the classification, with the highest accuracy, similar to the accuracy of the models using all aspects. All other categories resulted in similarly lower accuracies. As expected, shuffling performed the worst of all.

Finally, to evaluate performance more thoroughly, we generated ROC curves for the models with all aspects and with aspects from individual categories (Figure 1). The models including all aspects perform the best, with AUCs of 0.80 and 0.85 for the logistic regression and random forest classifier, respectively. Three categories also perform fairly well on their own: Thought Content (AUCs of 0.78 and 0.77, respectively), Disorders (AUCs of 0.74 and 0.77, respectively), and Mood (AUC of 0.72 for both). The ROC curves demonstrate low performance for all other categories (AUCs ranging from 0.50 to 0.61).

Figure 1.

Mean Receiver-Operating Characteristic (ROC) curves from 3-fold cross validation using the aspects and their polarity as features in a logistic regression model (a) and random forest classifier (b).

Discussion

This exploratory study establishes a new lexicon, generated by extraction from clinical notes using sentiment analysis, tailored to the psychiatric and mental health fields. We identified 330 aspects relevant to the domain with attached sentiments, highlighting the fact that even though clinical notes do not express an individual's sentiments, the methodology still has relevance in the analysis of these notes. Our study not only looks at the polarity of sentiments within the notes as previous studies have done13,15,16, but also innovates by focusing on the specific aspects these sentiments are attached to. We also explored their relation to the mental health domain through categorization of the aspects into previously defined risk factors for readmission of psychiatric patients17. This categorization enables the user to drill down into the sentiment analysis results and facilitates interpretation. As a demonstration, looking at our small Suicidal vs. Non-suicidal datasets, despite not identifying an overall difference in negativity (79.38% vs. 79.84%), we observed variance within individual categories. The differences between the two datasets for Thought Content, Thought Process, and Mood were statistically significant.

Interestingly, looking at individual aspects, we observed that "intent" showed the most striking contrast between the two datasets, with many more negative sentiments in the Suicidal dataset. As "intent" is an aspect in the Thought Content category, this finding aligns with the fact that overall, Thought Content is attached to more negative sentiments in the Suicidal dataset and differs significantly from the Non-suicidal dataset.

As an additional way to explore the value of the sentiment analysis methodology applied to clinical notes, we investigated how informative the findings would be as features in classification models trained to differentiate between clinical notes related to suicidal patients vs other psychiatric crises. We selected two out-of-the-box classifiers to ensure that our results would not be biased by the tools selected, and we observed that the sentiment analysis data showed promising accuracy and AUC values in the classification task, similar to previous studies. For instance, a study by Le et al.22 showed that Support Vector Machine (SVM) and other algorithms could be used to predict risk of inpatient self-harm with an accuracy of 0.69-0.77 from free-text narrative clinical notes by using symptom, sentiment and frequency dictionaries. Another study by Fernandes et al.12 achieved precision of 82.8% for classifying suicide attempts with SVM on clinical notes using a manually curated list of features related to suicide. In our study, the tailored and categorized lexicon enabled a more in-depth investigation of which aspects are more informative than others both within and across categories. The Thought Content, Disorders, and Mood categories showed the best performance, which aligns with the fact that the top three features in the random forest classifier using all aspects belong to these three categories ("suicidal ideations" in Thought Content, "autism" in Disorders, and "behaviors" in Mood). It also aligns with the observation that the overall negativity scores for the Thought Content and Mood categories were statistically different between the Suicidal and Non-suicidal datasets. It is interesting to note that, although Disorders performed well both at the accuracy and AUC levels, it showed almost no overall negativity score difference at the category level between the Suicidal and Non-suicidal datasets. This shows that aggregating sentiment at the category level may lose some important information and reinforces the value of the aspect-level analysis.

Our study reveals that there is much that can be gained by applying NLP techniques to psychiatric clinical notes. Although sentiment analysis may often be the tool of businesses trying to improve their brand image, this study demonstrates that it can be applied to a complex, nuanced, and sensitive task such as the analysis of psychiatric clinical notes. Not only can it be used to understand the overall polarity of a document or a dataset, but it can be used to classify complex data and extract words that may be associated with suicidality. Research has shown there is both a gap in care and space to address mental health concerns in youth7. Moving forward, we plan to apply our findings and techniques in building predictive models that could be used to screen youth and offer help in advance.

Conclusion

Despite the challenges that free-text clinical notes pose, we have shown that they can be an excellent resource and that aspect-based sentiment analysis and machine learning models are up to the task. We acknowledge that this study has many limitations. Namely, the portion of our dataset we were able to use for classification is very small compared to what machine-learning-based techniques are typically applied to. In addition, our data is very unbalanced, with only a few non-suicidal files, which makes it difficult to do cross-validation with more folds. To address these issues, we are in the process of labelling the remaining 893 notes left aside in this analysis to use as an independent dataset to test our model. Our data is also very sparse, as each document contains only a handful of the 330 features we analysed. As we continue this project, we may address the sparsity to try to improve the accuracy of our models. Finally, as our data only spans the three years up to 2018, we are also working on obtaining the newest data from the last two years, which we hope will double our data size.

Our preliminary results are very encouraging despite working only with default parameters. Future steps will include tuning hyperparameters to improve performance, as well as a comparison of our sentiment-based method to a more classical bag-of-words or Naïve Bayes model.

In future work we hope to explore whether the differences between the aspect categories that we found may have some clinical relevance. Thought Content includes the aspect "intent", which is often how psychiatrists describe suicidality, so it is not surprising that it could be more negative in the suicide group. Other research has shown that behavioural disturbances and disordered thought process in the context of neurodevelopment disorders are common23,24. This may partially explain why there is more negativity in the Thought Process category within the Non-suicidal dataset.

A possible future direction could be to apply these techniques to a broader set of outpatient notes - for example, when a patient is seeing their psychiatrist for follow up in the community, we might be able to flag that the patient is doing worse and suggest interventions before the patient reaches the point of presenting to the emergency room. Semantic analysis with lexicons fine-tuned to the mental health domain, when used in conjunction with longitudinal predictive models, may help predict an individual's risk for suicide as well as their risk for developing various mental health conditions.

Acknowledgements

We acknowledge financial support for this project from 1) the BC SUPPORT Unit Data Science and Health Informatics Methods Cluster (Award Number: DaSHI-002), which is part of British Columbia's Academic Health Science Network; and 2) the Evidence to Innovation Stimulus Award at BC Children's Hospital Research Institute. The BC SUPPORT Unit receives funding from the Canadian Institutes of Health Research and the Michael Smith Foundation for Health Research. We thank Sinead Nugent from the CAPE unit for the manual annotations and labelling of the datasets; Dr. Ali Mussavi Rizi from PHSA Data Analytics, Reporting & Evaluation for providing and ensuring continuous access to the data; as well as the other students involved in other aspects of the project who provided feedback on the manuscript: Rebecca Lin, Esther Lin, Cindy Ou Yang, and John-Jose Nuñez. Finally, we would like to give special thanks to our patient and family partners, Ariel Qi, Alison Taylor, and Omar Bseiso, for their pointed questions, engagement, in-depth feedback, and suggestions on possible future directions.


References

1. Cha CB, Franz PJ, Guzmán EM, Glenn CR, Kleiman EM, Nock MK. Annual Research Review: Suicide among youth - epidemiology, (potential) etiology, and treatment. J Child Psychol Psychiatry. 2018;59:460–82. doi: 10.1111/jcpp.12831.
2. Nock MK, Borges G, Bromet EJ, Cha CB, Kessler RC, Lee S. Suicide and suicidal behavior. Epidemiol Rev. 2008;30:133–54. doi: 10.1093/epirev/mxn002.
3. Evans E, Hawton K, Rodham K, Deeks J. The prevalence of suicidal phenomena in adolescents: a systematic review of population-based studies. Suicide Life Threat Behav. 2005;35:239–50. doi: 10.1521/suli.2005.35.3.239.
4. Patton GC, Coffey C, Sawyer SM, Viner RM, Haller DM, Bose K, et al. Global patterns of mortality in young people: a systematic analysis of population health data. Lancet. 2009;374:881–92. doi: 10.1016/S0140-6736(09)60741-8.
5. Georgiades K, Boylan K, Duncan L, Wang L, Colman I, Rhodes AE, et al. Prevalence and Correlates of Youth Suicidal Ideation and Attempts: Evidence from the 2014 Ontario Child Health Study. Can J Psychiatry. 2019;64:265–74. doi: 10.1177/0706743719830031.
6. Lo CB, Bridge JA, Shi J, Ludwig L, Stanley RM. Children's Mental Health Emergency Department Visits: 2007-2016. Pediatrics. 2020;145. doi: 10.1542/peds.2019-1536.
7. Doan Q, Wright B, Atwal A, Hankinson E, Virk P, Azizi H, et al. Utility of MyHEARTSMAP for Universal Psychosocial Screening in the Emergency Department. J Pediatr. 2020;219:54–61.e1. doi: 10.1016/j.jpeds.2019.12.046.
8. Velupillai S, Suominen H, Liakata M, Roberts A, Shah AD, Morley K, et al. Using clinical Natural Language Processing for health outcomes research: Overview and actionable suggestions for future advances. J Biomed Inform. 2018;88:11–9. doi: 10.1016/j.jbi.2018.10.005.
9. Ben-Ari A, Hammond K. Text Mining the EMR for Modeling and Predicting Suicidal Behavior among US Veterans of the 1991 Persian Gulf War. 2015 48th Hawaii International Conference on System Sciences; 2015. pp. 3168–75.
10. Metzger M-H, Tvardik N, Gicquel Q, Bouvry C, Poulet E, Potinet-Pagliaroli V. Use of emergency department electronic medical records for automated epidemiological surveillance of suicide attempts: a French pilot study. Int J Methods Psychiatr Res. 2017;26:e1522. doi: 10.1002/mpr.1522.
11. Hammond KW, Laundry RJ. Application of a Hybrid Text Mining Approach to the Study of Suicidal Behavior in a Large Population. 2014 47th Hawaii International Conference on System Sciences; 2014. pp. 2555–61.
12. Fernandes AC, Dutta R, Velupillai S, Sanyal J, Stewart R, Chandran D. Identifying Suicide Ideation and Suicidal Attempts in a Psychiatric Clinical Research Database using Natural Language Processing. Sci Rep. 2018;8:7426. doi: 10.1038/s41598-018-25773-2.
13. McCoy TH, Castro VM, Cagan A, Roberson AM, Kohane IS, Perlis RH. Sentiment Measured in Hospital Discharge Notes Is Associated with Readmission and Mortality Risk: An Electronic Health Record Study. PLoS ONE. 2015;10:e0136341. doi: 10.1371/journal.pone.0136341.
14. De Smedt T, Daelemans W. Pattern for Python. J Mach Learn Res. 2012;13:2063–7.
15. Waudby-Smith IER, Tran N, Dubin JA, Lee J. Sentiment in nursing notes as an indicator of out-of-hospital mortality in intensive care patients. PLoS ONE. 2018;13:e0198687. doi: 10.1371/journal.pone.0198687.
16. Weissman GE, Ungar LH, Harhay MO, Courtright KR, Halpern SD. Construct validity of six sentiment analysis methods in the text of encounter notes of patients with critical illness. J Biomed Inform. 2019;89:114–21. doi: 10.1016/j.jbi.2018.12.001.
17. Holderness E, Miller N, Cawkwell P, Bolton K, Meteer M, Pustejovsky J, et al. Analysis of risk factor domains in psychosis patient health records. J Biomed Semant. 2019;10:19. doi: 10.1186/s13326-019-0210-8.
18. Pereg O, Korat D, Wasserblat M, Mamou J, Dagan I. ABSApp: A Portable Weakly-Supervised Aspect-Based Sentiment Extraction System. arXiv:1909.05608 [cs]. 2019. Available from: http://arxiv.org/abs/1909.05608.
19. Qiu G, Liu B, Bu J, Chen C. Opinion Word Expansion and Target Extraction through Double Propagation. Computational Linguistics. 2011;37:9–27.
20. Johnson D, Chen Y, Dragojlovic N, Kopac N, Carenini G, Ng R. Estimating Patient Preferences Directly from Patient-Generated Text Using Aspect Based Sentiment Analysis. Paper submitted for publication.
21. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12:2825–30.
22. Le DV, Montgomery J, Kirkby KC, Scanlan J. Risk prediction using natural language processing of electronic mental health records in an inpatient forensic psychiatry setting. J Biomed Inform. 2018;86:49–58. doi: 10.1016/j.jbi.2018.08.007.
23. Sadock BJ, Sadock VA. Kaplan & Sadock's Concise Textbook of Clinical Psychiatry. Lippincott Williams & Wilkins; 2008. p. 756.
24. Jean S, Kim B, Donna R. Psychiatric Emergencies in Children and Adolescents: An Emergency Department Audit. Australas Psychiatry. 2006;14:403–7. doi: 10.1080/j.1440-1665.2006.02313.x.
