Abstract
Patients can use social media to describe their healthcare experiences. Several social media platforms, such as the Care Opinion platform, host large volumes of patient stories. However, the large number of these stories and the healthcare system’s workload make exploring these stories a difficult task for healthcare providers and administrators. This study uses text mining for analyzing patient stories on the Care Opinion platform and exploring healthcare experiences described in these stories. We collected 367,573 stories, which were posted between September 2005 and September 2019. Topic modeling (Latent Dirichlet Allocation) and sentiment analysis were used to analyze the stories. Sixteen topics were identified representing five aspects of the healthcare experience: communication between patients and providers, quality of clinical services, quality of non-clinical services, human aspects of healthcare experiences, and patient satisfaction. There was also a clear sentiment in 99% of the stories. More than 55% of the stories that describe the patient’s request for information, the patient’s description of treatment, or the patient’s making of an appointment had a negative sentiment, which represents patient dissatisfaction. The study provides insights into the content of patient stories and demonstrates how topic modeling and sentiment analysis can be used to analyze large volumes of patient stories and provide insights into these stories. The findings suggest that these stories are not general social media posts; instead, they describe elements of healthcare experiences that can be helpful for quality improvement.
Supplementary Information
The online version contains supplementary material available at 10.1007/s41666-021-00097-5.
Keywords: Social media, Text mining, Topic modeling, Patient stories, Patient experience
Introduction
Social media refers to Internet-based applications that enable people to communicate, interact, publish, and exchange all types and formats of information, including text, pictures, audio, and video [1]. Since the beginning of the social media revolution in the 1990s, billions of people have used it for various human activities, including education, entertainment, social networking, marketing, healthcare, and news broadcasting [2, 3]. Patients use social media to exchange their health knowledge, share their illness and healthcare experiences, and get social or emotional support [4–6].
Healthcare experience refers to interactions of patients with healthcare providers, including nurses, physicians, and staff, and the resultant perceptions and behavioral and emotional effects [7]. Patient stories on social media describe several aspects of healthcare experiences, including quality of healthcare services and communication with healthcare staff, and they reflect the patient satisfaction level with these experiences [8]. These stories can also shed light on healthcare issues that may not be captured by patient experience surveys, which are the dominant method used in the healthcare system for assessing patient experience [9–11]. Patients use two types of social media platforms to post these stories: general platforms and specialized platforms. The general social media platforms host diverse types of posts and are not dedicated to patient stories; these include Facebook, Twitter, and Reddit. Specialized social media platforms are fully dedicated to collecting patient stories and facilitating patient-provider communication regarding these stories. An example of these platforms is Hao Dai Fu, a Chinese website (www.haodf.com), which in March 2020 contained 4,024,818 patient reviews for doctors across China [12]. Another example is the Care Opinion platform (www.careopinion.org.uk), which is the focus of this study.
Background
Patient Healthcare Experience
Zakkar [13] developed a framework that classifies the elements of patient experience into determinants and manifestations. The determinants are the factors that affect this experience: the patient’s expectations, the burdens of illness, the quality of healthcare, the healthcare system’s responsiveness to patients’ needs, and the politics in the healthcare system. The determinants related to healthcare quality receive more attention from healthcare providers than the other elements [13]. Several factors affect healthcare quality and, consequently, healthcare experiences, including patient safety, the effectiveness of care, timeliness of services, quality of communication between the patient and the healthcare team, the patient’s comfort, the respect patients receive for their values and preferences, and the empathy and support patients receive from the healthcare team [14]. The determinants of healthcare experiences can produce emotional and behavioral outcomes for the patient, called manifestations of healthcare experience, which include patient satisfaction and patient engagement [13]. Patient satisfaction is a sentimental judgment by the patients regarding achieving specific goals during their healthcare experiences [15]. Patient engagement is a behavioral reaction of the patients resulting from the healthcare experience, and it materializes into various levels of commitment to patient health and well-being [15].
Care Opinion Platform
The Care Opinion platform has been operating since 2005. It enables patients to post stories about their experiences with the UK healthcare system. Patients should name the healthcare settings, such as family practices or hospitals, where their experiences took place. They can also give their stories titles and add tags that describe the good and annoying elements of their healthcare experiences. A moderator examines each story before it is published to ensure that it does not contain defamatory content. The moderator assigns tags to the story that represent the type of healthcare services described, such as diabetes care or family medicine. The moderator also gives the story a criticality score, which represents its urgency. The platform enables healthcare providers to review and respond to the stories about them. All the stories and the responses are published on the platform’s website, where other patients and the public can read them. In September 2019, the platform had 63,537 members who had posted 367,573 stories, of which 73% had received responses from the healthcare providers. The platform does not use any computer software to process and analyze the content of the stories.
Text Mining
The diversified usage of social media has produced large volumes of data (i.e., big data), consisting of textual posts, pictures, and audio and video materials. Analysis and extraction of useful information from this vast and continuously growing data have been challenging [16]; however, the need for using this data has fostered the development and use of research methods and machine learning algorithms. Text mining refers to the use of computational algorithms (e.g., machine learning) for analyzing unstructured text data. These algorithms transfer data into a numerical format suitable for statistical and linguistic analyses [17]. Text mining also utilizes natural language processing (NLP) techniques. Text mining is used to perform several types of functions, including document classification, document clustering, information retrieval, and web mining [17]. It has been used to analyze social media in different domains such as business, politics, and healthcare [17]. In healthcare, text mining has been used in event-based public health surveillance, pharmacovigilance, health behavior monitoring, and exploring illness experiences [18]. Two text mining techniques that are frequently used are topic modeling and sentiment analysis.
Topic modeling denotes a group of unsupervised machine learning methods that identify themes in a collection of documents (a corpus) by analyzing the co-occurrence of words in these documents and identifying the prominent topics in the corpus and the prominent words in each topic [16, 19]. It utilizes several NLP methods to transform textual data, which is inherently unstructured, into a structured quantifiable form to which statistical analyses may be applied [20]. A key characteristic of topic modeling is that it does not use any form of pre-classification or human annotation of the documents. Therefore, it has been used to analyze high-volume data sources such as social media data, genetic sequences, and digitized library collections where such annotation is impractical [21, 22]. Topic modeling has also been used in healthcare research. For example, Myneni et al. [23] used topic modeling to analyze discussions on QuitNet, an online social network for smoking cessation. Kim et al. [24] used topic modeling and sentiment analysis to analyze 4,581,181 tweets and 14,818 news articles on the Ebola epidemic.
One topic modeling method that is widely used is the Latent Dirichlet Allocation (LDA). This method assumes that there is a set of topics in a corpus of text, each of which is a distribution of words where each word has a probability in this distribution. Each document in the corpus can be associated with any topic from the set of topics but with varying probabilities [21]. LDA has been used to analyze smoking-related posts on social media and to explore people’s experiences and attitudes towards smoking harms and cessation [25]. LDA was also used to analyze millions of posts on a Swedish social media platform and explore Muslim immigrants’ representations on social media [16].
Sentiment analysis is a text mining technique that can be used to analyze the polarity or valence of textual data [26]. In online product reviews, a review’s sentiment reflects the satisfaction level of the reviewer or customer [8, 26]. Sentiment analysis can be done using several approaches. Machine learning approaches develop models that can be trained to classify documents based on their sentiment [27]. The lexicon-based approaches use a dictionary with a set of words that have a distinct sentiment. Each word is assigned a positive or negative sentiment score depending on whether the word carries a positive or negative sentiment. The words’ sentiment scores in a document can be used to analyze and compute a document’s sentiment score, which provides an approximation of its overall sentiment [28]. Researchers have used sentiment analysis to analyze online product reviews, social media posts, and polls [20, 26].
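The lexicon-based approach described above can be illustrated with a minimal sketch. The tiny lexicon, its scores, and the function name are hypothetical and serve only as an illustration; real lexicons contain thousands of entries, and tools such as VADER add rules for negation and intensifiers that this sketch omits.

```python
import re

# Hypothetical toy lexicon: positive words carry positive scores,
# negative words carry negative scores.
TOY_LEXICON = {
    "excellent": 3.0,
    "friendly": 2.0,
    "good": 2.0,
    "late": -1.0,
    "awful": -3.0,
    "filthy": -3.0,
}


def document_sentiment(text, lexicon=TOY_LEXICON):
    """Average the scores of the lexicon words found in the text.

    Returns 0.0 when no lexicon word occurs (no clear sentiment).
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


print(document_sentiment("The staff were friendly and the care was excellent."))
print(document_sentiment("The ward was filthy and the appointment ran late."))
```

A positive average suggests a satisfied reviewer and a negative average a dissatisfied one; a score of zero means the sketch found no sentiment-bearing words.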
Purpose of the Study
This study explores the use of text mining, specifically topic modeling and sentiment analysis, to analyze a large volume of patient stories on the Care Opinion platform and to identify the elements of healthcare experiences described in these stories.
Methods
Data Collection
We collected 367,573 patient stories from the Care Opinion platform. The stories were posted between September 2005 and September 2019, and all are in English. We developed a web scraper that used the platform API to download the stories. We did not, however, collect the providers’ responses to the stories. The stories are anonymized. Each story has associated metadata, including a title, the date of posting, the name of the healthcare setting described in it, the number of provider responses to it, and the tags added by the patient and the moderator. For this study, however, this metadata was not treated as part of the story text. The average story length is 131 words (σ = 121), and the median length is 98 words.
Data Analysis
We developed several computer programs for doing all analyses and data processing. We used Python (version 3.7) as a programming language, along with a set of Python libraries, including Gensim (version 3.8.1) [29], which provided topic modeling functionality, NLTK (version 3.4.5) [30], which provided data preprocessing techniques such as tokenization and stop-word removal, SpaCy (Version 2.2.3) [31], which provided lemmatization functionality, and vaderSentiment (version 3.2.1) [28], which provided sentiment analysis functionality.
To prepare the stories for topic modeling, we applied several preprocessing techniques: tokenization, stop-word removal, and lemmatization [32, 33]. We performed unigram and bigram tokenization; however, we included only the dominant bigram phrases. The tokens were then used to build the document-term matrix for the corpus, which is the input analyzed by the topic modeling algorithm. Although Python libraries such as NLTK and SpaCy include several modules for data preprocessing, preprocessing is experimental to a large extent: we executed the preprocessing techniques multiple times, tweaking the execution parameters to ensure that the documents were tokenized accurately.
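The preprocessing steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the study's code: the stop-word list is a hypothetical stand-in for the NLTK list, lemmatization is omitted, and "dominant" bigrams are assumed to be those that reach a minimum corpus frequency (`min_count`), a criterion the text does not specify.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "a", "an", "and", "was", "were", "to", "of", "in", "i", "my", "for"}


def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def add_dominant_bigrams(docs, min_count=2):
    """Append bigrams that occur at least `min_count` times across the corpus."""
    counts = Counter(pair for doc in docs for pair in zip(doc, doc[1:]))
    dominant = {pair for pair, n in counts.items() if n >= min_count}
    return [
        doc + ["_".join(pair) for pair in zip(doc, doc[1:]) if pair in dominant]
        for doc in docs
    ]


def document_term_counts(docs):
    """Build one term-frequency Counter per document (a sparse document-term matrix)."""
    return [Counter(doc) for doc in docs]


stories = [
    "I waited for the car parking space.",
    "The car parking was awful.",
    "The nurses were friendly.",
]
tokenized = add_dominant_bigrams([tokenize(s) for s in stories])
matrix = document_term_counts(tokenized)
print(tokenized[1])
```

In this toy corpus only "car parking" occurs twice, so it is the only bigram kept as a token alongside the unigrams.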
We conducted LDA topic modeling using the full corpus of stories, each of which is considered a separate document. Several parameters can be configured to control the modeling process, including the document-topic density (i.e., alpha), which controls the per-document topic probabilities [34, 35]; the topic term density (i.e., beta), which controls the per-topic term probabilities [34, 35]; and the expected number of topics. Given each story’s small size, we estimated that a story might represent a few topics only, and therefore, we set alpha to 0.01. We set the beta parameter to 1/number of topics. To select the ideal number of topics, we conducted a topic number detection experiment (Fig. 1), using the UCI model quality indicator [36, 37], which showed that the ideal topic number is 16. The UCI indicator is premised on the idea that for a topic to be meaningful for humans, its word set should include words that may co-occur in human-generated articles such as Wikipedia articles [36, 37]. This indicator scores each topic by calculating the logs of probabilities of co-occurrence of the topic’s words in a corpus of Wikipedia articles [36]. The indicator is considered close to the human judgment of the meaningfulness of topics [38].
Fig. 1.
Model quality experiment
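The idea behind the UCI indicator can be sketched as the mean pointwise mutual information (PMI) over all pairs of a topic's top words. The sketch below is a simplification: it treats each reference document as one co-occurrence window, whereas the published indicator uses windows over Wikipedia text, and the reference corpus, the `eps` smoothing value, and the function name are all hypothetical.

```python
import math
from itertools import combinations


def uci_coherence(top_words, reference_docs, eps=1e-12):
    """Mean PMI over all pairs of a topic's top words.

    Probabilities are estimated as the share of reference documents that
    contain the word (or both words). `eps` avoids log(0) for pairs that
    never co-occur.
    """
    doc_sets = [set(doc) for doc in reference_docs]
    n_docs = len(doc_sets)

    def prob(*words):
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    pmis = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2 = prob(w1), prob(w2)
        if p1 == 0 or p2 == 0:
            continue  # word unseen in the reference corpus; skipped in this sketch
        pmis.append(math.log((prob(w1, w2) + eps) / (p1 * p2)))
    return sum(pmis) / len(pmis) if pmis else float("nan")


# Hypothetical reference corpus of tokenized documents.
reference = [
    ["car", "parking", "space"],
    ["car", "parking", "ticket"],
    ["nurse", "ward", "care"],
    ["nurse", "doctor", "care"],
]
coherent = uci_coherence(["car", "parking"], reference)
mixed = uci_coherence(["car", "nurse"], reference)
print(coherent, mixed)
```

Words that co-occur in the reference corpus score higher than words that never appear together. In a topic number experiment, one would train a model for each candidate number of topics and compare the average score across its topics. Gensim's `CoherenceModel` offers a `c_uci` option for this measure; the text does not state which implementation was used.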
LDA topic modeling produces a topic-term matrix (Online Resources 1 and 2), which contains the probabilities of the terms in each topic. We analyzed this matrix and the terms that have the highest probabilities within each topic. We then coded (labeled) each topic based on the main themes represented in the topic’s set of terms, using the patient experience determinants and manifestations framework developed by Zakkar [13] and explained in the Patient Healthcare Experience section. This framework provided the theoretical background and a systematic way of reading, comparing, and organizing qualitative data [39]. To improve the labels’ quality, we also examined the document-topic matrix, which is another output of the modeling process. This matrix shows the main topics comprising each document. Reviewing some of the documents where a topic is the most prominent and comparing them with the topic’s set of terms enabled us to verify the accuracy of the labels. To coin descriptive and sentiment-aware labels for each topic, we calculated the sentiment scores for all the documents where the topic is the most prominent. When the majority of these documents reveal a clear sentiment, the corresponding label reflects that sentiment.
Results
The topic modeling revealed 16 topics. Table 1 provides descriptive labels, brief descriptions, and categories for these topics, which are intended to help the reader understand the differences among these topics. Some of the topics are given more than one category to increase their specificity. In Table 2, we provide examples of the stories and the corresponding topics.
Table 1.
Topic labels and descriptions
| Topic | Topic label | What the story describes | Category | Topic sentiment |
|---|---|---|---|---|
| Topic 0 | Patient Requesting Information | Patients requesting information about their health conditions and treatment | Communication | Positive or negative |
| Topic 1 | Maternity Care | Healthcare experiences related to pregnancy and birth | Quality of non-clinical services | Positive or negative |
| Topic 2 | Patient Satisfaction with Staff Communication | Satisfactory communication with the healthcare team | Communication, patient satisfaction | Positive |
| Topic 3 | Wait Time in the Healthcare Setting | A patient’s view about the wait time in the healthcare setting | Quality of non-clinical services | Positive or negative |
| Topic 4 | Patient Expressing Satisfactory Encounter with the Staff | An empathetic and respectful encounter of the patient with the healthcare team | Human aspects of healthcare experience, patient satisfaction | Positive |
| Topic 5 | Patient Expressing Gratitude | Patients’ gratitude towards the healthcare team and satisfaction with the health outcomes of their healthcare experience | Patient satisfaction | Positive |
| Topic 6 | Timing of the Appointment | A patient’s view regarding the appropriateness of the healthcare appointment to a patient’s conditions | Quality of non-clinical services | Positive or negative |
| Topic 7 | Healthcare Experience of a Patient | Healthcare experiences | Quality of non-clinical services | Positive and negative |
| Topic 8 | Health Service Availability and Accessibility | Health needs and service availability | Quality of non-clinical services | Positive and negative |
| Topic 9 | Patient Thanking the Staff | General thankfulness to the healthcare team | Patient satisfaction | Positive |
| Topic 10 | Patient’s Description of Treatment | Patient’s impression about the received clinical treatment | Clinical quality of services | Positive or negative |
| Topic 11 | Cleanness of the Healthcare Setting | The cleanness of the healthcare setting | Quality of non-clinical services | Positive or negative |
| Topic 12 | A Patient Experience Described by a Family Member | A patient story is told by a family member and identifies several elements of the quality of service | Quality of non-clinical services | Positive and negative |
| Topic 13 | Patient’s Making of an Appointment | A patient’s view on elements of appointment process such as talking to a staff member or referral | Quality of non-clinical services | Positive or negative |
| Topic 14 | Musculoskeletal Health Conditions | Healthcare experiences related to accidents and injuries and the need for physiotherapy | Quality of non-clinical services | Positive or negative |
| Topic 15 | Car Parking | Issues related to car parking in a healthcare setting | Quality of non-clinical services | Positive or negative |
Table 2.
Examples of patient stories and the corresponding topic
| Topic | Topic label | Story example 1 | Story example 2 |
|---|---|---|---|
| Topic 0 | Patient Requesting Information | “On the whole, I don't want to criticize - the doctors and nurses have a difficult enough job, I'm sure - but if someone had just popped their head into my field of view once or twice and said "don't worry, we haven't forgotten you - we'll get to you as soon as we can" it would have made all the difference.” | “Language barrier with foreign doctors and even harder for me as I am partially deaf. I came away not understanding the instructions.” |
| Topic 1 | Maternity Care | “Six months ago, my partner gave birth to a lovely baby girl (Leeds Clarendon wing) the labour was long and complicated, but once again the staff were fantastic. Well done to all.” | “The doctors and midwife all tried their best to keep to my birth plan (my daughter didn't!). However, by contrast the maternity ward was awful - women and babies coming and going at all hours, those in difficulties next to those waiting to give birth, babies giving some concern next to those who were fine all causing extra stress for all the mums / mums to be. The staff were friendly but clearly overworked. The cleaning was appalling, though the ward was swept and mopped through each day a molding piece of food remained under my bed the 3 days I was there. I said nothing as I watched in horror to see if the job would be done properly.” |
| Topic 2 | Patient Satisfaction with Staff Communication | “Everything was explained. The nurses were thoughtful. Thank you to all the nurses. The surgery was clean and tidy. The Manfield Day surgery is probably the best part of the hospital.” | “When I recently attended for my first check up after having heart surgery, I was a little apprehensive at first on knowing that I was now under the care of a young, new consultant cardiologist. However, she was friendly and very informative, it was no bother to explain anything in as much detail as I wished to receive. I was put at ease and can look forward to a confident future.” |
| Topic 3 | Wait Time in the Healthcare Setting | “Yesterday I attended the Radiology Department at Solihull Hospital for an ultrasound scan. I was treated promptly, courteously and considerately. At the conclusion of the test, I was given a clear and helpful report.” | “Waited 20 mins to be seen. Not a very obvious waiting system, where you have to pick-up a ticket so you can be seen.” |
| Topic 4 | Patient Expressing Satisfactory Encounter with the Staff | “We were made very comfortable - especially at a time when patients are worried. Doctors and staff are excellent - very pleasant. I think that services have improved in the past two years. Thank you.” | “I recently had a small operation at Pinderfields and I found the service was very good. The nurses (ward 3) were very good, very attentive and polite. I was only in for one day, found staff to be friendly. Many thanks.” |
| Topic 5 | Patient Expressing Gratitude | “The first 48 hours after someone suffers a stroke are critical, and without the skills, knowledge and care of all at the Hallamshire Hospital, my brother would not have made such a good recovery. While he was critical, he was given one on one care at all times, always treated with dignity, even when 'out for the count' (he had to be sedated for several). I cannot thank everybody at The Hallamshire enough, from the Auxiliaries who cared enough to spend time talking to and reading with my brother, through to Consultants and Registrars who made vital decisions, and also were very thorough during follow-up appointments.” | “My brother was discharged from hospital after spending a long time very ill in hospital. All staff, chefs, cleaners, hostesses, volunteers, nurses, physiotherapists, occupational therapists, doctors & social workers worked extremely hard work, were caring with positive attitudes & treated him with respect which contributed to his recovery. Thank you.” |
| Topic 6 | Timing of the Appointment | “An optician diagnosed Bilateral Pigmentary Dispersal Syndrome during a routine eye test. I was referred to the ophthalmology department of the UHW, Cardiff and seen very quickly. I was offered a procedure to reduce the risk of the condition developing into glaucoma. The appointment came through three weeks after I agreed to the procedure. It was carried out successfully. I was very pleased with the treatment I was given. My only moan is the very long waiting time to see the consultant (over an hour one day.)” | “The department staff were running late. The appointment wasn't on time at all, in fact it was running 2 hours late. However, Dr Shankly was very good.” |
| Topic 7 | Healthcare Experience of a Patient | “Not much problem with the parking, we have a disabled sticker so can usually get somewhere. We were not rushed at all, really enjoyed the appointment experience. Good food when on wards.” | “I had a pacemaker inserted a couple of years ago after odd faints. I have to attend for check ups every year and I have also been seen a couple of times when I was worried that it wasn’t working quite right. Every time I have been there, I have been dealt with quickly, efficiently and competently. I wish other parts of the health service could follow their example.” |
| Topic 8 | Health Service Availability and Accessibility | “We saw a nurse within 5 minutes. She understood what we wanted, but we had to see the doctor. We waited about 4 hours for the doctor. He was very pleasant and authorised a course of antibiotics. I don't understand why the nurse could not have authorised the antibiotics and sent us home. Nor do I understand why sick children are made to wait in such unpleasant conditions.” | “My son had IRITIS an eye condition for a year and he attended regular hospital appointments at Fairfield Hospital in Bury. The main problem was that he would see a different doctor on each visit and due to the large number of patients at each session there was insufficient time for the doctor to read the notes and my son was being asked the same (sometimes embarrassing) questions each visit and referred for X rays and blood tests that he had already had.” |
| Topic 9 | Patient Thanking the Staff | “I would like to thank everyone on Richmond Ward @ West Middlesex University Hospital for the care they gave me recently, when I underwent a sinus operation. The care was second to none and I would like to say that even though the ward was sometimes under staffed the attention they gave was excellent.” | “I have had two knee replacement ops at Goole within a year. I have only the greatest praise for everyone involved there. I was looked after very well and am very happy with the results. The nurses and domestic staff and technicians are especially worthy of praise. Many thanks.” |
| Topic 10 | Patient’s Description of Treatment | “My lupus was diagnosed and treated by Dr Bothwell at dermatology, Barnsley for years to my satisfaction. last appointment I saw a different doctor who stated that I was in the wrong place as I had no skin problems at that time and my attempts to discuss other lupus-related issues were not listened to. I was left with no treatment for currant problems or referral to someone else who the doctor may have thought more suitable. I have had to wait for a new referral by my gp to a different dept. and therefore have had no management or support for my illness for several months.” | “My husband is on a lot of medication because of his rare illness. He has a regular repeat prescription from our doctors in North Yorkshire. For several years he has been on Oxybutilyn to help with waterworks problems. A few months ago, I noticed that this had been changed, read the leaflet inside and read that it wasn't advised to use with 3 of his medications. The chemist said that Dr should know so it should be OK. He had a very violent reaction, Dr refused to come out to him but agreed to get ambulance (after I pressed for some action), who 'blue-lighted' him to Scarborough hosp. I found out that the doctor had changed everybody from the surgery using Oxybutilyn to the other medication. I suggest that doctors should check each patient's other medication before changing especially in serious conditions.” |
| Topic 11 | Cleanness of the Healthcare Setting | “I think the hospital could do with a complete update. And not just a dust round, but a deep clean.” | “Whenever I go to a local health centre, I am shocked that so many have filthy carpets. Surely carpets are unhygienic with so many people passing through? Aren't there design standards for this sort of thing?” |
| Topic 12 | A Patient Experience Described by a Family Member | “I had to take my 14 month old daughter for her hearing test at outpatients and the service that we received was good. We did not have to wait too long to be seen and the staff there were extremely helpful and informative. Overall we were there between 20 - 30 mins and during that period, toys were provided that my child was able to play with. Come on Northampton General Hospital.” | “After spending six months in hospital due to a hip replacement, my grandmother was finally released on Mon 24th July. On returning home, my mother immediately saw that the leg brace she had been using whilst in hospital had caused an ulcer. She rang the hospital urgently as my Grandmother was diabetic and this could cause a serious problem. Teresa agreed to see my Grandmother within an hour. I must say she was very helpful and quickly fitted a new leg brace.” |
| Topic 13 | Patient’s Making of an Appointment | “Whenever I ring up my doctor’s surgery to make an appointment, if they don't have any slots within 48hrs, they refuse to book me in for a later slot, telling me I have to ring back tomorrow. Is this because they have a target of seeing X% of people within 48hrs? If so, it's ridiculous - for non-urgent appointments it actually helps the patient to be able to book several days ahead, as they can plan time off work, etc.” | “I moved to my current surgery 12 months ago, as I moved addresses and so had to change doctors. My old surgery allowed you to make appointments in advance. However my new surgery will not let you make appointments in advance - we have to phone the day before the appointment. I have to work shifts and this is most inconvenient when before I could fit the appointments around my shifts.” |
| Topic 14 | Musculoskeletal Health Conditions | “My daughter broke both her elbows, at different times, and the service she received was 2nd to none. I have been taken to hospital twice with problems with my ACL and the second time I also broke my elbow. My daughter was x-rayed both times on her follow up visit and given physio. I was not, although the injury was the same. I feel like a second class citizen.” | “I was referred to the physiotherapy department of the UHW because chronic ill health has made me immobile and I had a lot of muscular pain. I was seen within a month of referral and only needed three appointments. The exercises I was given were so good that the severe pain cleared up after a few weeks of doing them. My daughter also attended physiotherapy at UHW for lower back and neck pain and was given excellent treatment. Both physios were delightful.” |
| Topic 15 | Car Parking | “I struggled parking, I was driving around for ages. It's our first time at the children's and everything has been great.” | “Poor disabled spots for car parking. Grateful to all staff for keeping me going for all these years.” |
Our topic categorization distinguishes between two types of healthcare quality: quality of non-clinical service and clinical quality. The quality of non-clinical services (topics 1, 3, 6, 7, 8, 11, 12, 13, 14, 15) refers to patient perspective on healthcare quality, and it includes service elements that can be observed and understood by a patient such as cleanness of the setting, communication with staff members, and the wait time [40]. On the other hand, the clinical quality (topic 10) denotes health service effectiveness for diagnosing diseases and achieving good health outcomes [40], and it is associated with the healthcare provider’s expertise, medical equipment, and medicines used in treatment. The communication category (topics 0 and 2) refers to the communication between patients and the healthcare team. The patient satisfaction category (topics 2, 4, 5, and 9) represents a patient’s subjective evaluation of the healthcare experience or some elements of this experience. Lastly, the human aspects of healthcare experiences category (topic 4) describes how the healthcare team interacts with the patient respectfully and with empathy.
Topic Distribution Over Stories
We calculated the distribution of topics over the stories by counting the topic with the highest probability for each story. The distribution is presented in Table 3 and Table 4.
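The counting step described above can be sketched as follows, assuming the document-topic matrix is available as one list of per-topic probabilities per story. The function name and the toy matrix are hypothetical.

```python
from collections import Counter


def dominant_topic_distribution(doc_topic_rows):
    """Count, for each topic, the documents in which it has the highest probability.

    Returns {topic_index: (count, percentage_of_corpus)}, most frequent first.
    Ties within a document go to the lowest topic index.
    """
    dominant = [max(range(len(row)), key=row.__getitem__) for row in doc_topic_rows]
    counts = Counter(dominant)
    total = len(doc_topic_rows)
    return {t: (n, round(100 * n / total, 2)) for t, n in counts.most_common()}


# Hypothetical document-topic probabilities for four stories and three topics.
toy_matrix = [
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.60, 0.30, 0.10],
    [0.55, 0.05, 0.40],
]
distribution = dominant_topic_distribution(toy_matrix)
print(distribution)
```

Because each story is assigned to exactly one topic, the per-topic percentages produced this way always sum to 100%.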
Table 3.
Topic distribution over stories
| Topic | Topic label | Category | Number of stories | % of stories in the corpus |
|---|---|---|---|---|
| Topic 13 | Patient’s Making of an Appointment | Quality of non-clinical services | 52,737 | 14.35% |
| Topic 0 | Patient Requesting Information | Communication | 43,882 | 11.94% |
| Topic 2 | Patient Satisfaction with Staff Communication | Communication, patient satisfaction | 42,659 | 11.61% |
| Topic 5 | Patient Expressing Gratitude | Patient satisfaction | 40,322 | 10.97% |
| Topic 7 | Healthcare Experience of a Patient | Quality of non-clinical services | 31,857 | 8.67% |
| Topic 4 | Patient Expressing Satisfactory Encounter with the Staff | Human aspects of healthcare experience, patient satisfaction | 27,979 | 7.61% |
| Topic 9 | Patient Thanking the Staff | Patient satisfaction | 24,016 | 6.53% |
| Topic 3 | Wait Time in the Healthcare Setting | Quality of non-clinical services | 23,586 | 6.42% |
| Topic 10 | Patient’s Description of Treatment | Clinical quality of services | 20,123 | 5.47% |
| Topic 8 | Health Service Availability and Accessibility | Quality of non-clinical services | 17,793 | 4.84% |
| Topic 6 | Timing of the Appointment | Quality of non-clinical services | 13,525 | 3.68% |
| Topic 11 | Cleanness of Healthcare Setting | Quality of non-clinical services | 12,547 | 3.41% |
| Topic 15 | Car Parking | Quality of non-clinical services | 8,320 | 2.26% |
| Topic 1 | Maternity Care | Quality of non-clinical services | 7,696 | 2.09% |
| Topic 14 | Musculoskeletal Health Conditions | Quality of non-clinical services | 426 | 0.12% |
| Topic 12 | A Patient Experience Described by a Family Member | Quality of non-clinical services | 105 | 0.03% |
Table 4.
Category distribution over stories
| Category | % of stories in the corpus * |
|---|---|
| Quality of non-clinical services | 45.87% |
| Patient satisfaction | 36.72% |
| Communication | 23.54% |
| Human aspects of healthcare experience | 7.61% |
| Clinical quality of services | 5.47% |
*The percentages sum to more than 100% because some topics (topics 2 and 4) belong to two categories
Table 4 shows the percentage of stories corresponding to each topic category. Because topics 2 and 4 are each assigned to two categories, stories dominated by those topics are counted twice, and the figures add up to more than 100%. About 46% of the stories discuss issues related to the quality of non-clinical services. Notably, more than one-third of the stories (36.72%) fall under the patient satisfaction topics (topics 2, 4, 5, and 9) discussed above.
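The category percentages in Table 4 can be re-derived from the story counts in Table 3. The sketch below is illustrative only: the counts and topic-to-category assignments are copied from Table 3, while the variable and function names are our own. It also shows that the excess over 100% comes from the two topics (2 and 4) that are assigned to two categories each.

```python
# Story counts per dominant topic, copied from Table 3.
TOPIC_COUNTS = {
    13: 52737, 0: 43882, 2: 42659, 5: 40322, 7: 31857, 4: 27979,
    9: 24016, 3: 23586, 10: 20123, 8: 17793, 6: 13525, 11: 12547,
    15: 8320, 1: 7696, 14: 426, 12: 105,
}

# Topic-to-category assignments from Table 3.
# Topics 2 and 4 each appear under two categories.
CATEGORY_TOPICS = {
    "Quality of non-clinical services": [1, 3, 6, 7, 8, 11, 12, 13, 14, 15],
    "Patient satisfaction": [2, 4, 5, 9],
    "Communication": [0, 2],
    "Human aspects of healthcare experience": [4],
    "Clinical quality of services": [10],
}


def category_counts(topic_counts, category_topics):
    """Sum the story counts of all topics assigned to each category."""
    return {
        category: sum(topic_counts[t] for t in topics)
        for category, topics in category_topics.items()
    }


total_stories = sum(TOPIC_COUNTS.values())
cat_counts = category_counts(TOPIC_COUNTS, CATEGORY_TOPICS)
cat_shares = {
    category: round(100.0 * n / total_stories, 2)
    for category, n in cat_counts.items()
}

# Stories counted twice: those dominated by a topic with two categories.
double_counted = sum(cat_counts.values()) - total_stories
```

Here `double_counted` equals the combined story count of topics 2 and 4, which accounts for the category percentages summing to more than 100%.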
Sentiment Analysis
As we explained in the background, because these stories are strongly related to health, illness, and patients’ needs, we expected them to express clear sentiments. We found that 99% of the stories have a clear sentiment, either positive or negative. The sentiment distribution at the topic level is presented in Table 5. For clarity, the topics corresponding to different sentiment levels are color-coded differently.
Table 5.
Topic sentiment distribution
For topics 2, 4, 5, and 9, more than 85% of the stories in which the topic is the most prominent have a positive sentiment. For topics 0, 10, and 13, more than 55% of the stories have a negative sentiment.
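A topic-level sentiment breakdown of this kind can be computed by labeling each story from its sentiment score and tallying the labels per dominant topic. The following is a minimal sketch under stated assumptions: the score range of -1 to 1, the 0.05 cut-off, the function names, and the toy records are all hypothetical, since the exact thresholds used in the study are not given in this section.

```python
from collections import defaultdict


def label_sentiment(score, threshold=0.05):
    """Map a score in [-1, 1] to a label. The threshold is an assumed cut-off."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def sentiment_share_by_topic(records, threshold=0.05):
    """Percentage of each sentiment label among the stories of each topic.

    records: iterable of (dominant_topic, sentiment_score) pairs.
    """
    tallies = defaultdict(lambda: defaultdict(int))
    for topic, score in records:
        tallies[topic][label_sentiment(score, threshold)] += 1

    shares = {}
    for topic, labels in tallies.items():
        n = sum(labels.values())
        shares[topic] = {label: 100.0 * k / n for label, k in labels.items()}
    return shares


# Toy records: (dominant topic, made-up sentiment score).
toy_records = [
    (13, -0.6), (13, -0.2), (13, 0.4),
    (5, 0.9), (5, 0.7),
    (0, -0.5), (0, 0.01),
]

sentiment_shares = sentiment_share_by_topic(toy_records)
```

In this toy data, two of the three stories under topic 13 are labeled negative, and one story under topic 0 falls inside the neutral band and therefore has no clear sentiment.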
Discussion
In this study, we analyzed 367,573 patient stories that were posted on the Care Opinion platform. The stories describe healthcare experiences from the perspective of patients in the UK. The study findings show that patients have used this platform to express satisfaction with their healthcare experiences and the quality of non-clinical services. However, some patients have also expressed their dissatisfaction regarding some service elements that are, from a healthcare quality perspective, critical to achieving health outcomes.
As presented in Table 4, the analysis of topics and the sentiment of the relevant stories can reveal important issues about some elements of healthcare quality in the UK that the healthcare quality literature considers pivotal [41]. Varying levels of negative sentiment exist in stories of different topics, some of which represent key elements in the healthcare system, such as the clinical quality of services, communication, wait time in the healthcare setting, the timing of the appointment, cleanness of the healthcare setting, and service availability and accessibility. This negative sentiment warrants the attention of healthcare providers, administrators, and policymakers.
The topics identified in our study are described in current healthcare quality literature. For example, a systematic review and meta-synthesis study by Graham et al. [42] explored qualitative research studies published between 1997 and 2017. The reviewed studies explore adult patients’ experience in emergency departments in Sweden, Canada, the USA, the UK, and other countries. The review identified five types of patient needs that healthcare providers should fulfill to create an ideal patient experience. These types are communication needs, emotional needs, care needs, waiting needs, and physical and environmental needs [42]. The communication needs comprise a patient’s need for good, respectful, and empathetic interpersonal communication and interaction with healthcare providers. Patients also need accurate and understandable information about their health conditions and the required healthcare services. The emotional needs are the need to reduce patients’ uncertainty about their health conditions and recognize patients’ illness experiences and suffering by healthcare providers. The care needs represent patients’ needs for competent and effective care to solve their health issues and reduce their health concerns. Waiting needs represent patients’ needs for timely services and convenient waiting rooms. Patients also need to be informed about the expected wait time before receiving healthcare services. The physical and environmental needs refer to patients’ basic needs for a clean and comfortable healthcare setting that can also protect their privacy [42]. The 16 topics identified in our study are consistent with the five types of patient needs identified in Graham et al.’s study [42].
Additionally, the identified topics underpin several healthcare initiatives that aim to improve the quality of healthcare and patient experiences, such as healthcare quality control [7, 14, 43, 44], patient-centeredness [14, 43], people-centered health services [45, 46], or healthcare responsiveness [47, 48]. These topics are also explored in many healthcare surveys, including those developed by NHS England [41], the Agency for Healthcare Research and Quality in the USA [49, 50], Picker Institute [51], and Health Quality Ontario [52].
One of the major healthcare quality initiatives is the one introduced by the Institute of Medicine in the USA [14], which defines six aims for healthcare quality improvement: patient safety, effective and evidence-based care, patient-centeredness, timeliness of services, efficiency, and health equity [14]. The patient-centeredness aim focuses on improving the patient experience [14]. Patient-centeredness is a widely used concept in the healthcare literature. It focuses on providing an ideal healthcare experience to the patient [14], and it identifies a set of factors that positively affect the patient experience. These factors are respect for patients’ values, preferences, and needs; coordination and integration of healthcare services; appropriate communication between the patient and medical staff; the physical comfort of patients; the level of compassion in the care provided to patients; and the social support available to patients [14].
In this study, we used text mining methods comprising NLP, topic modeling, and sentiment analysis to analyze a large volume of patient stories posted on an online platform. This approach has three main benefits. First, as with all text mining methods, data processing is efficient. Because of the large volume of these stories, reading them manually to identify common issues or views about the patient experience would be difficult and time-consuming. Second, the topic model developed in this study from the Care Opinion corpus, or models developed in future studies from other corpora, can be used to label the stories that patients post on the respective platforms. Lastly, the analysis of the topics and sentiment of the stories can reveal essential issues about some elements of healthcare quality that may not be captured by traditional patient satisfaction surveys, which generally show overall satisfaction levels and do not capture the relation between satisfaction levels and specific elements of the patient experience.
Limitations
In this study, we used LDA topic modeling to analyze patient stories. In the LDA methodology, the researcher has to assign descriptive labels to the resulting topics. Label assignment is an interpretive process. We strove to improve the accuracy of the labels, and we used an existing theoretical framework. However, the labels may also reflect our personal perspectives and understanding of the healthcare experience phenomenon.
We used a lexicon-based method for sentiment analysis, which estimates sentiment scores based on specific vocabulary. Although this method’s accuracy is good, methods based on machine learning, such as supervised or semi-supervised classifiers trained on an annotated corpus of text that resembles patient stories, could provide more accurate results.
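To make the vocabulary dependence concrete, the toy scorer below averages the scores of the words it finds in a small lexicon. It is a deliberately simplified sketch, not the method used in this study: the four-word lexicon, the scores, and the function name are hypothetical, and real lexicon-based tools use far larger lexicons plus rules for negation and intensity.

```python
import re

# Hypothetical mini-lexicon; real lexicons contain thousands of scored entries.
TOY_LEXICON = {
    "great": 1.0,
    "grateful": 1.0,
    "poor": -1.0,
    "struggled": -1.0,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Average the lexicon scores of the known words in a text.

    Returns 0.0 when no word of the text appears in the lexicon.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[token] for token in tokens if token in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


# A story that mixes a complaint with praise (quoted from the examples above).
mixed_story = (
    "Poor disabled spots for car parking. "
    "Grateful to all staff for keeping me going for all these years."
)
mixed_score = lexicon_score(mixed_story)

# Words outside the lexicon contribute nothing to the score.
unknown_score = lexicon_score("The ward was spotless.")
positive_score = lexicon_score("Everything has been great.")
```

In this sketch, the negative and positive words in the mixed story cancel out, and a clearly favorable sentence built from out-of-lexicon words scores zero. A classifier trained on annotated patient stories would not be tied to a fixed word list in this way.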
Conclusion
Since the emergence of social media, patients have used it to post stories about their healthcare experiences. These stories describe different aspects of these experiences, including health conditions, healthcare quality, communication between patients and providers, and health outcomes. The stories can also reveal patient satisfaction or dissatisfaction with specific elements of these experiences. Text mining methods enable researchers and healthcare providers to analyze and benefit from the large volumes of patient stories available on social media to explore the healthcare experiences of patients and identify critical issues in these experiences.
Social media platforms dedicated to collecting patient stories, such as the Care Opinion platform, may be more credible than general social media platforms such as Facebook. A platform’s credibility increases the credibility of its stories and encourages healthcare providers to respond to them. Healthcare providers, administrators, and policymakers are invited to explore these platforms as sources of information that can be used to improve healthcare quality.
Supplementary Information
The Topic-Term Matrix (PDF 134 kb)
(XLSX 16 kb)
Acknowledgements
We would like to thank Dr. Craig Janes and Dr. Samantha Meyer at the University of Waterloo, Canada, for their valuable comments on an earlier draft of this manuscript. We are also grateful to the three anonymous peer reviewers for their valuable comments and questions.
Code Availability
Author Contribution
MAZ led the study design, data collection, analysis, and preparation of the manuscript. DJL contributed to the study design, preparation of this manuscript, and data analysis.
Data Availability
The data should be licensed by the Care Opinion Platform Operator.
Declarations
Ethics Approval
This study received ethics approval (ORE #41396) from the University of Waterloo’s Research Ethics Board.
Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Conflict of Interest
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Kaplan AM (2018) Social Media, definition, and history. In: Encyclopedia of Social Network Analysis and Mining. Presented at the
- 2. Mayer MA, Fernández-Luque L, Leis A (2016) Big data for health through social media. In: Participatory Health Through Social Media. pp. 67–82. Elsevier
- 3. McCay-Peet L, Quan-Haase A (2017) What is social media and what questions can social media research help us answer? In: The SAGE handbook of social media research methods. SAGE Publications Ltd, 55 City Road
- 4. Antheunis ML, Tates K, Nieboer TE. Patients’ and health professionals’ use of social media in health care: motives, barriers and expectations. Patient Educ Couns. 2013;92:426–431. doi: 10.1016/j.pec.2013.06.020.
- 5. Hagg E, Dahinten VS, Currie LM. The emerging use of social media for health-related purposes in low and middle-income countries: a scoping review. Int J Med Inform. 2018;115:92–105. doi: 10.1016/j.ijmedinf.2018.04.010.
- 6. Hamm MP, Chisholm A, Shulhan J, Milne A, Scott SD, Given LM, Hartling L. Social media use among patients and caregivers: a scoping review. BMJ Open. 2013;3:e002819. doi: 10.1136/bmjopen-2013-002819.
- 7. (2017) Agency for Healthcare Research and Quality: what is patient experience? Rockville
- 8. Kietzmann J, Canhoto A. Bittersweet! Understanding and managing electronic word of mouth. J Public Aff. 2013;13:146–159. doi: 10.1002/pa.1470.
- 9. Schlesinger M, Grob R, Shaller D, Martino SC, Parker AM, Finucane ML, Cerully JL, Rybowski L. Taking patients’ narratives about clinicians from anecdote to science. N Engl J Med. 2015;373:675–679. doi: 10.1056/NEJMsb1502361.
- 10. Greaves F, Ramirez-Cano D, Millett C, Darzi A, Donaldson L. Harnessing the cloud of patient experience: using social media to detect poor quality healthcare. BMJ Qual Saf. 2013;22:251–255. doi: 10.1136/bmjqs-2012-001527.
- 11. Liu X, Chen H. A research framework for pharmacovigilance in health social media: identification and evaluation of patient adverse drug event reports. J Biomed Inform. 2015;58:268–279. doi: 10.1016/j.jbi.2015.10.011.
- 12. Haodf: Home Page, www.haodf.com
- 13. Zakkar M. Patient experience: determinants and manifestations. Int J Heal Gov. 2019;24:143–154. doi: 10.1108/IJHG-09-2018-0046.
- 14. Institute of Medicine (2001) Crossing the quality chasm: a new health system for the 21st century. Institute of Medicine, Washington
- 15. LaVela SL, Gallan A. Evaluation and measurement of patient experience. Patient Exp J. 2014;1:28–36. doi: 10.1177/237437431400100206.
- 16. Törnberg A, Törnberg P. Muslims in social media discourse: combining topic modeling and critical discourse analysis. Discourse Context Media. 2016;13:132–142. doi: 10.1016/j.dcm.2016.04.003.
- 17. Miner G, Elder J IV, Fast A, Hill T, Nisbet R, Delen D (2012) Practical text mining and statistical analysis for non-structured text data applications. Academic Press
- 18. Velasco E, Agheneza T, Denecke K, Kirchner G, Eckmanns T. Social media and Internet-based data in global systems for public health surveillance: a systematic review. Milbank Q. 2014;92:7–33. doi: 10.1111/1468-0009.12038.
- 19. Jurafsky D, Martin JH (2018) Speech and language processing. Previously published by Pearson
- 20. Lane H, Howard C, Hapke HM (2019) Natural language processing in action: understanding, analyzing, and generating text with Python. Manning, Shelter Island
- 21. Blei DM. Probabilistic topic models. Commun ACM. 2012;55:77–84. doi: 10.1145/2133806.2133826.
- 22. Chen C, Ren J. Forum latent Dirichlet allocation for user interest discovery. Knowl-Based Syst. 2017;126:1–7. doi: 10.1016/j.knosys.2017.04.006.
- 23. Myneni S, Cobb NK, Cohen T (2013) Finding meaning in social media: content-based social network analysis of QuitNet to identify new opportunities for health promotion. In: MedInfo. pp. 807–811
- 24. Kim EH-J, Jeong YK, Kim Y, Kang KY, Song M. Topic-based content and sentiment analysis of Ebola virus on Twitter and in the news. J Inf Sci. 2015;42:763–781. doi: 10.1177/0165551515608733.
- 25. Chen AT, Zhu S-H, Conway M. What online communities can tell us about electronic cigarettes and hookah use: a study using text mining and visualization techniques. J Med Internet Res. 2015;17:e220. doi: 10.2196/jmir.4517.
- 26. Pang B, Lee L. Opinion mining and sentiment analysis. Found Trends Inf Retr. 2008;2:1–135. doi: 10.1561/1500000011.
- 27. Piryani R, Gupta V, Singh VK, Ghose U (2017) A linguistic rule-based approach for aspect-level sentiment analysis of movie reviews. In: Advances in Computer and Computational Sciences. Presented at the
- 28. Hutto CJ, Gilbert E (2014) Vader: A parsimonious rule-based model for sentiment analysis of social media text. In: Eighth International Conference on Weblogs and Social Media (ICWSM-14)
- 29. Rehurek R, Sojka P (2010) Software framework for topic modelling with large corpora. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. pp. 45–50. ELRA, Valletta, Malta
- 30. Bird S, Loper E, Klein E (2009) Natural language processing with Python. https://www.nltk.org/
- 31. Honnibal M, Montani I, Van Landeghem S, Boyd A (2020) spaCy: Industrial-strength natural language processing in Python, https://spacy.io/
- 32. Manning CD, Raghavan P, Schutze H (2008) An introduction to information retrieval. Cambridge University Press, Cambridge
- 33. Sarkar D (2019) Text analytics with Python: a practitioner’s guide to natural language processing. Apress, Berkeley
- 34. Srivastava AN, Sahami M (2009) Text mining: classification, clustering, and applications. Chapman and Hall/CRC, New York
- 35. Blei DM, Ng AY, Jordan MI. Latent Dirichlet allocation. J Mach Learn Res. 2003;3:993–1022.
- 36. Newman D, Lau JH, Grieser K, Baldwin T (2010) Automatic evaluation of topic coherence. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, USA, pp 100–108
- 37. Aletras N, Stevenson M (2013) Evaluating topic coherence using distributional semantics. In: Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013)–Long Papers. pp. 13–22
- 38. Röder M, Both A, Hinneburg A (2015) Exploring the space of topic coherence measures. In: Proceedings of the Eighth ACM International Conference on Web Search and Data Mining. pp. 399–408. Association for Computing Machinery, New York, NY, USA
- 39. Bauer MW (2000) Classical content analysis: a review. In: Bauer MW, Gaskell G (eds) Qualitative researching with text, image and sound. SAGE Publications Ltd, London, pp 131–151
- 40. Roberts MJ, Hsiao W, Herman P, Reich MR (2004) Getting health reform right: a guide to improving performance and equity. Oxford University Press, New York
- 41. (2018) Clinical programmes and patient insight analytical unit: statement of methodology for the overall patient experience scores. Statistics
- 42. Graham B, Endacott R, Smith JE, Latour JM. ‘They do not care how much you know until they know how much you care’: a qualitative meta-synthesis of patient experience in the emergency department. Emerg Med J. 2019;36:355–363. doi: 10.1136/emermed-2018-208156.
- 43. National Academies of Sciences, Engineering, and Medicine (2018) Crossing the global quality chasm: improving health care worldwide. The National Academies Press, Washington, DC
- 44. Donabedian A (2002) An introduction to quality assurance in health care. Oxford University Press, New York
- 45. World Health Organization. Framework on integrated, people-centred health services. Report by the Secretariat
- 46. World Health Organization (2018) Continuity and coordination of care: a practice brief to support implementation of the WHO Framework on integrated people-centred health services. World Health Organization, Geneva
- 47. World Health Organization (2000) The world health report 2000: health systems: improving performance. World Health Organization, Geneva
- 48. Valentine NB, de Silva A, Kawabata K, Darby C, Murray CJL, Evans DB (2003) Health system responsiveness: concepts, domains and operationalization. In: Murray CJL, Evans DB (eds) Health systems performance assessment: debates, methods and empiricism. World Health Organization, Geneva, pp 573–596
- 49. Agency for Healthcare Research and Quality. CAHPS Clinician & Group Survey. https://www.ahrq.gov/cahps/surveys-guidance/cg/index.html. Accessed 2 Oct 2017
- 50. The Centers for Medicare & Medicaid Services. CAHPS for Hospital. http://www.hcahpsonline.org/en/survey-instruments. Accessed 2 Oct 2017
- 51. Picker Institute. Questionnaire - Patient experiences of compassionate care. http://www.picker.org/tools-resources/toolkits. Accessed 2 Oct 2017
- 52. Health Quality Ontario (2015) Primary Care Patient Experience Survey. Toronto


