Abstract
Online health support groups are places for people to compare themselves with others and obtain informational and emotional support about their disease. To do so, they generally need to reveal private information about themselves, and in many support sites they can do this in public or private channels. However, we know little about how the publicness of the channels in health support groups influences the amount of self-disclosure people provide. Our work examines the extent to which members self-disclose in the private and public channels of an online cancer support group. We first built machine learning models to automatically identify the amount of positive and negative self-disclosure in messages exchanged in this community, with adequate validity (r>0.70). In contrast to findings from non-health-related sites, our results show that people generally self-disclose more in the public channel than the private one and are especially likely to reveal their negative thoughts and feelings publicly. We discuss theoretical and practical implications of our work.
Introduction
Self-disclosure refers to the process “by which one person verbally reveals information about himself or herself to another” (Dindia et al. 2002). The self-disclosure of feelings, thoughts and experiences can provide high-quality information about the communicator, help reduce stress, and facilitate the development of social relationships (Tamir and Mitchell 2012). Although self-disclosure fulfills people’s fundamental needs for social connectedness, it makes people vulnerable and opens them to risks because the disclosers “give up some degree of privacy and personal control” (Altman 1975).
Most online communities allow members to broadcast communication to a large audience or to communicate with selected others. In Facebook, for example, people can choose to broadcast a message on their news feed or communicate selectively to specific friends via private messages. The choice of using a public or private channel shapes the communication content people share. For example, Bazarova et al. 2015 found that Facebook users revealed more intense and negative emotions in private messages than in their public status updates. However, the goals of self-disclosure differ depending on the nature of the community. Users on Facebook, for example, are often concerned with positive self-presentation regarding their disclosure; users on online health support groups might care more about eliciting social support. Therefore, it is unclear whether the conclusions drawn from non-support groups can apply to health support groups.
Many people with serious diseases and their caregivers use online health support groups to seek social support, share personal experiences and form social ties with others in similar circumstances (Chou et al. 2011). To accomplish these goals, they often need to self-disclose sufficient personal details about their situations, emotions and diseases. For example, when one member of an online health support group was seeking support after a surgery, her appeal was filled with self-disclosure about negative thoughts and feelings: “…I had my surgery after 18 weeks of chemo/radiation…I’m having a hard time. I just burst out in tears at anything. They started giving me a antidepressant. Did anybody else have a problem like this?”. Prior research in mental health forums suggested that users find it easier to reveal personal details online compared with in a face-to-face context (Kummervold et al. 2002).
Evidence from Reddit, Twitter, and similar sites demonstrates that people often publicly disclose intimate details of their lives when exchanging social support. As in more generic social media sites, most online health communities provide tools that allow members to communicate publicly in discussion boards or privately in chats. However, it is unclear whether findings on self-disclosure and channel differences drawn from general SNSs can be directly applied to online health support groups. In contrast to users of other social media sites like Facebook, members of health support groups are more psychologically vulnerable given their disease state and must balance the value self-disclosure provides them in exchanging support against its risks. Failure to accurately assess the nature of the group and select the right channel for their discussion may harm their ability to receive desired social benefits from the group and cause them to leave. In this work, we examine how members of online health support groups self-disclose in the private and public channels afforded by the group.
Self-disclosure and Channel Difference
Self-disclosure in Social Media
Prior research has identified the types of information, especially information related to health issues, that people disclose in non-support groups. For instance, Pratt et al. (2015) found that members of Reddit tend to ask for advice immediately after diagnosis or during treatment; in contrast, cancer survivors there are more likely to share information through personal narratives. Ammari, Schoenebeck, and Morris (2014) showed that parents of children with special needs turn to online support groups to self-disclose, discussing parenting issues and seeking social support. On Twitter, women express more severe and enduring loneliness (Kivran-Swaine et al. 2014).
Risk-Reward Balance
As noted above, self-disclosure carries risks (Altman 1975), so people are often reluctant to divulge information about themselves. In online communities, once people self-disclose in a public space, the information they share is freely accessible to other members and even to unknown future audiences. As a result, people might regulate themselves and self-disclose less in public, balancing the rewards and risks associated with self-disclosure.
Self-disclosure Goal
Different self-disclosure goals can potentially account for the different disclosure behavior in private and public channels (Miller and Read 1991). People may refrain from public self-disclosure for impression management purposes, or they may allow for more self-disclosure in a dyadic conversation to aid relationship development. In health support groups, people may need to self-disclose publicly to get the support they seek. The functional theory of self-disclosure proposed by Derlega and Grzelak (1974) suggests that goals or subjective reasons for self-disclosure activate the disclosure decision-making process.
Valence
In addition to the total amount of self-disclosure, the valence of self-disclosure (i.e., whether it reveals positive or negative aspects of the self) may also change with the publicness of the communication channel. Existing studies show that people express fewer negative emotions in communication channels visible to a wider network (e.g., discussion forums) than in private channels (e.g., private chat). One explanation is that positive self-disclosure in the public channel is often strategic, serving to manage the discloser’s self-presentation. Thus, people tend to share more intimate information in private channels rather than in network-visible ones.
Online health support groups might differ from regular SNSs in this respect. They provide a conducive environment for people to share their experience of coping with the disease, as well as other details such as their pain, gender, etc.; in that way, members are able to receive proper help and advice. However, sharing these sensitive conditions or experiences in a public discussion forum might make people lose control of their self-disclosure, and lead them to regulate their disclosure behavior. Given the sensitivity of the content these members share, they tend to have a greater need to restrict their self-disclosure to an appropriate audience whom they trust or feel safe with. Private messages, on the other hand, are typically sent to an individual in a dyadic conversation setting. The directed and private nature of these exchanges might provide people with relatively more control over their self-disclosure, while still allowing them to obtain the desired social outcomes of self-disclosure. Thus, we expect that:
Hypothesis 1: People self-disclose more in the private channel compared to the public channel.
Hypothesis 2: People express more negative self-disclosure compared to positive self-disclosure in the private channel.
Dataset Preparation
Our analyses are conducted in the American Cancer Society’s Cancer Survivor Network (CSN). The CSN discussion boards (Public Channel) are public places where registered members can participate by starting new threads or commenting on other members’ existing threads. Registered members of CSN can also communicate directly with each other using a function called “CSN Email”. Conversations between two people are recorded in a format like email or private chat messages (Private Channel), and are only visible to individuals addressed in the message headers. Our collaboration with the American Cancer Society provided access to all public posts and comments and private messages posted on the site from Dec 2002 to Feb 2015. We removed posts from four sub-forums that were peripheral to the site’s mission and removed posts from administrators’ accounts. The analyses below are based on the 5,649 registered users who have used both the public discussion boards and private messages. In total, they exchanged 105,213 private messages and 826,389 public messages belonging to 28,911 threads.
Self-disclosure Identification
Self-disclosure refers to the verbal expression by which a person reveals information about oneself to others. We differentiated the type of self-disclosure based on valence. Positive self-disclosure refers to discussing positive thoughts or emotions, such as happiness, gratitude and love, e.g. “My family is so supportive and makes me feel like such a loved and special person”. Negative self-disclosure refers to discussing negative thoughts or emotions, such as worry, sadness or anger, e.g. “I am freaked out after reading my mammogram report”.
Corpus Annotation
One thousand threads were randomly sampled from the discussion forum of CSN and served as the units of our annotation. For each thread, we asked three nurse annotators to indicate separately the extent to which the thread starter was expressing positive and negative self-disclosure. We provided them with detailed instructions and two rounds of training, in which they discussed disagreements. They rated the amount of positive and negative self-disclosure in the message on Likert scales with end-points “1 (not at all)” and “7 (strongly)”. To assess the reliability of the judges’ ratings, we computed the intra-class correlations (ICC) for each task. The ICC for both positive and negative self-disclosure was 0.90. We aggregated the three annotators’ responses for each message by averaging their ratings. To improve the generalization ability of the models, we combined this corpus with a similar dataset retrieved from an online breast cancer support community (Wang, Kraut, and Levine 2015), resulting in 1,974 messages in total. Each message in the combined corpus had an average numerical score between 1 and 7 for each of positive and negative self-disclosure, indicating the amount it contains.
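The aggregation and reliability steps described above can be sketched as follows. This is a minimal illustration only: the text does not state which form of ICC was used, so the one-way, average-measures form ICC(1,k) is assumed here, and the ratings and function names (`average_ratings`, `icc_1k`) are hypothetical.

```python
from statistics import mean

def average_ratings(ratings):
    """Average the raters' scores for each message (one row per message)."""
    return [mean(row) for row in ratings]

def icc_1k(ratings):
    """One-way random-effects ICC for the mean of k raters, ICC(1,k).

    Assumption: the paper does not say which ICC form was used; this is
    one common choice, computed from one-way ANOVA mean squares.
    """
    n = len(ratings)      # number of rated messages
    k = len(ratings[0])   # number of raters per message
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Between-message and within-message mean squares.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / ms_between

# Hypothetical 1-7 Likert ratings from three annotators for four threads.
toy_ratings = [[1, 2, 1], [4, 4, 5], [6, 7, 7], [3, 3, 2]]
scores = average_ratings(toy_ratings)
reliability = icc_1k(toy_ratings)
```

When raters agree closely relative to how much messages differ from one another, the within-message mean square is small and the ICC approaches 1.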
Feature Space Design
We used the hand-coded annotations of the 1,974 messages to train machine learning models that correlate characteristics of messages with human judgments on the presence of self-disclosure. We introduced a set of textual features for the machine learning models.
LIWC Features:
(1) First-voiced words have been used in several studies as indicators of self-disclosure, so we computed the frequency of words in the following LIWC (Pennebaker et al. 2015) dictionaries: 1st-person singular, 1st-person plural, 2nd person, 3rd-person singular, 3rd-person plural and articles. (2) Messages that reveal underlying emotions, share personal life events, or describe positive or negative experiences often contain words that carry strong sentiment. We therefore computed several affect-relevant measures using LIWC: positive and negative emotion, anger, anxiety and sadness.
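A dictionary-based frequency feature of this kind can be sketched as below. The LIWC dictionaries are proprietary, so the word lists here are small hypothetical stand-ins, and the names `TOY_CATEGORIES` and `category_frequencies` are ours rather than part of LIWC.

```python
import re

# Hypothetical mini word lists standing in for LIWC categories.
# The real LIWC dictionaries are proprietary and much larger.
TOY_CATEGORIES = {
    "first_person_singular": {"i", "me", "my", "mine", "myself"},
    "positive_emotion": {"happy", "love", "loved", "grateful", "supportive"},
    "negative_emotion": {"worried", "sad", "angry", "afraid", "hurt"},
}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def category_frequencies(text, categories=TOY_CATEGORIES):
    """Return, per category, the share of tokens that fall in its word list."""
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return {
        name: sum(tok in words for tok in tokens) / total
        for name, words in categories.items()
    }

features = category_frequencies("I am so worried and sad about my scan")
```

Normalizing by message length keeps long and short messages comparable; whether the original features were normalized in this way is not stated in the text.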
Linguistic Style:
We introduce measures to characterize linguistic styles in the posts. (1) Sentence count and word count are extracted to represent the length and complexity of the messages; (2) Part-of-speech (POS) tags, such as proper nouns and adjectives, are counted to capture certain emotion or information cues; (3) We calculated the number of occurrences of all weak/strong subjectivity-oriented words in a sentence using the resource from Akkaya, Wiebe, and Mihalcea (2009); (4) A negation feature counts the number of negation words or phrases, e.g., “not”, “shouldn’t”; (5) The numbers of question marks and modal verbs are also considered; (6) Named entity recognition is performed to count how many named entities are mentioned in a post.
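Some of the simpler surface counts in this list (length, negation, question marks, modal verbs) might be computed roughly as follows. The word lists and the function name `style_features` are hypothetical, and the POS, subjectivity and named-entity features are omitted because they depend on external taggers and resources.

```python
import re

# Hypothetical, deliberately small word lists for illustration.
NEGATIONS = {"not", "no", "never", "don't", "can't", "shouldn't", "won't"}
MODALS = {"can", "could", "may", "might", "must", "shall", "should", "will", "would"}

def style_features(text):
    """Count simple surface cues: length, negations, questions and modals."""
    # Normalise curly apostrophes so "shouldn’t" matches "shouldn't".
    normalised = text.lower().replace("\u2019", "'")
    tokens = re.findall(r"[a-z']+", normalised)
    # Rough sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "word_count": len(tokens),
        "sentence_count": len(sentences),
        "negation_count": sum(tok in NEGATIONS for tok in tokens),
        "question_mark_count": text.count("?"),
        "modal_count": sum(tok in MODALS for tok in tokens),
    }

feats = style_features("I should not worry. Did anybody else have this problem?")
```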
Lexicon:
To characterize the topical language of individuals and what people talk about, we utilized a lexicon of terms derived from Latent Dirichlet Allocation (LDA), which discovers hidden topics in online support groups as well as the words associated with each topic (Wang, Kraut, and Levine 2015). We also considered four lexicons for diseases, symptoms, drugs, and drug ingredients, extracted from Freebase. We then used these lexicons to count how often their terms appear in each post.
Word Embedding:
We further considered the meaning of sentences via Word2Vec (Mikolov et al. 2013). That is, words or phrases from the vocabulary were mapped to vectors of real numbers, representing their distributional semantic meaning. Specifically, we trained word2vec word embeddings with 300 dimensions using approximately 100 million tokens from messages in several online health support groups. We looked up the vector for each word in the post and then aggregated these vectors using the coordinate-wise mean to obtain a representation of the meaning of each message.
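The coordinate-wise mean can be illustrated with a small sketch. The three-dimensional toy embeddings and the function name `mean_vector` are hypothetical, and the handling of out-of-vocabulary words (skipping them, and returning a zero vector when nothing is known) is an assumption, since the text does not describe it.

```python
def mean_vector(tokens, embeddings, dim):
    """Coordinate-wise mean of the vectors of all in-vocabulary tokens."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        # Assumption: a message with no known words maps to the zero vector.
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# Hypothetical 3-dimensional embeddings; the paper used 300 dimensions.
toy_embeddings = {
    "chemo": [1.0, 0.0, 2.0],
    "tired": [3.0, 2.0, 0.0],
}
message_vec = mean_vector(["chemo", "made", "me", "tired"], toy_embeddings, dim=3)
```

Averaging yields a fixed-length vector regardless of message length, which is what allows it to be concatenated with the other features.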
Identification Result
In summary, each message was represented as a 362-dimensional feature vector (62 linguistic features and 300 word2vec features). These vectors were the input to two Support Vector Machine regression models with RBF kernels, one outputting a numerical estimate of the amount of positive self-disclosure in a message and the other the amount of negative self-disclosure. We performed 10-fold cross-validation on the 1,974 coded messages to evaluate their performance, as shown in Table 1. These models predicted the human annotations well, with correlations of 0.708 and 0.767 for positive and negative self-disclosure respectively. We then applied them to estimate the amount of positive and negative self-disclosure contained in the 826,389 discussion board messages and 105,213 private messages. We also built two baseline regression models that used 6 features, including first-voiced personal pronouns and positive and negative emotion dictionaries. The baseline models were poorer fits to the human annotations, with correlations of 0.127 and 0.523 respectively.
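The evaluation protocol, correlating out-of-fold predictions with the human scores, can be sketched in plain Python. To keep the example self-contained, a one-feature least-squares line stands in for the SVM regression; it is not the model used in the paper, and the data, the fold assignment and all function names are hypothetical.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def fit_line(xs, ys):
    """Stand-in model: one-feature least squares (NOT the paper's SVR)."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return lambda x: my + slope * (x - mx)

def cross_validated_correlation(xs, ys, k=10, seed=0):
    """Correlate out-of-fold predictions with the human scores."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    predictions = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            predictions[i] = model(xs[i])
    return pearson(predictions, ys)

# Hypothetical data: one feature that tracks a 1-7 rating with some noise.
rng = random.Random(1)
feature = [rng.uniform(0, 1) for _ in range(200)]
rating = [1 + 6 * f + rng.gauss(0, 0.5) for f in feature]
r = cross_validated_correlation(feature, rating)
```

Because every message is scored by a model that never saw it during training, the resulting correlation estimates how well the model generalizes to unseen messages.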
Table 1:
| Models | R-Squared | Correlation |
|---|---|---|
| Predicting Positive Self-disclosure | | |
| First-voiced baseline | 0.016 | 0.127 |
| Our linguistic model | 0.501 | 0.708 |
| Predicting Negative Self-disclosure | | |
| First-voiced baseline | 0.274 | 0.523 |
| Our linguistic model | 0.588 | 0.767 |
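As a quick consistency check on Table 1, each reported R-squared value appears to equal the square of the reported correlation, to three decimal places. The short sketch below verifies this arithmetic; the dictionary keys and the helper name `matches_squared_correlation` are ours.

```python
# (R-squared, correlation) pairs as reported in Table 1.
table_1 = {
    "positive / first-voiced baseline": (0.016, 0.127),
    "positive / linguistic model": (0.501, 0.708),
    "negative / first-voiced baseline": (0.274, 0.523),
    "negative / linguistic model": (0.588, 0.767),
}

def matches_squared_correlation(r_squared, r, decimals=3):
    """True if r_squared equals r**2 after rounding to `decimals` places."""
    return round(r * r, decimals) == round(r_squared, decimals)

consistent = {
    name: matches_squared_correlation(*pair) for name, pair in table_1.items()
}
```

Read this way, the linguistic models account for roughly half or more of the variance in the human ratings, while the positive-disclosure baseline accounts for under 2%.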
Channel Differences in Self-Disclosure
Figure 1 shows the average amount of positive and negative self-disclosure per message in the private messages, thread-starting messages and comments. Contrary to hypotheses based on generic social networking sites, in this cancer support group, public thread-starting messages and comments both contained higher levels of self-disclosure than messages posted in the private channel. Specifically, (1) the public thread-starting messages had the highest negative self-disclosure. This is consistent with a hypothesis supported by prior work (Wang, Kraut, and Levine 2015) that people need to disclose inner turmoil or negative events in their lives when looking for advice and help. (2) Comments have relatively higher positive self-disclosure than thread-starting posts, indicating that people tend to behave and respond positively when providing support to others. This also confirms the supportive nature of these online support groups. (3) In the private channel, people tend to talk more about positive aspects of their lives than about negative ones.
Discussion and Conclusion
This research examined how members self-disclose in the private and public channels in an online cancer support community. In contrast to findings based on generic SNSs like Facebook, in the cancer support group, people overall self-disclosed more and expressed more negative self-disclosure in public channels than in private ones. Members of these groups are only able to receive appropriate information, advice and emotional support from fellow community members by describing private information about their disease and life circumstances. They are more likely to do this in public than in private in order to elicit support from, and make social comparisons with, the large but unknown audience participating in the forums. The public nature of these sites drives people to self-disclose information by allowing them to express themselves openly and receive social support from others. In contrast, users seem to use the private channel to continue conversations they started publicly, to update their progress and to show caring to others, all of which occur with positive self-disclosure.
Although the corpus that we used for annotation and model training was constructed from a public discussion board, our robustness check, which compared the amount of positive and negative emotional words in the public and private settings using LIWC, yielded similar findings. However, we urge future research to use a more representative corpus to validate our findings.
Footnotes
We admit that differences might exist between forum posts and private messages. However, due to privacy issues, annotators were not allowed to view and annotate private messages.
References
- Akkaya C; Wiebe J; and Mihalcea R 2009. Subjectivity word sense disambiguation. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1-Volume 1, 190–199. Association for Computational Linguistics.
- Altman I 1975. The Environment and Social Behavior: Privacy, Personal Space, Territory, and Crowding. ERIC.
- Ammari T; Schoenebeck SY; and Morris MR 2014. Accessing social support and overcoming judgment on social media among parents of children with special needs. In ICWSM.
- Bazarova NN; Choi YH; Schwanda Sosik V; Cosley D; and Whitlock J 2015. Social sharing of emotions on Facebook: Channel differences, satisfaction, and replies. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, CSCW ’15, 154–164. New York, NY, USA: ACM.
- Chaikin AL; and Derlega VJ 1974. Variables affecting the appropriateness of self-disclosure. Journal of Consulting and Clinical Psychology 42(4):588.
- Chou W-YS; Liu B; Post S; and Hesse B 2011. Health-related internet use among cancer survivors: data from the Health Information National Trends Survey, 2003–2008. Journal of Cancer Survivorship 5(3):263–270.
- Dindia K; Allen M; Preiss R; Gayle B; and Burrell N 2002. Self-disclosure research: Knowledge through meta-analysis. Interpersonal Communication Research: Advances Through Meta-analysis 169–185.
- Kivran-Swaine F; Ting J; Brubaker JR; Teodoro R; and Naaman M 2014. Understanding loneliness in social awareness streams: Expressions and responses. In ICWSM.
- Kummervold PE; Gammon D; Bergvik S; Johnsen J-AK; Hasvold T; and Rosenvinge JH 2002. Social support in a wired world: use of online mental health forums in Norway. Nordic Journal of Psychiatry 56(1):59–65.
- Mikolov T; Sutskever I; Chen K; Corrado GS; and Dean J 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, 3111–3119.
- Miller LC; and Read SJ 1991. On the coherence of mental models of persons and relationships: A knowledge structure approach. Cognition in Close Relationships.
- Pennebaker JW; Boyd RL; Jordan K; and Blackburn K 2015. The development and psychometric properties of LIWC2015. UT Faculty/Researcher Works.
- Pratt JEZDW 2015. Self-characterized illness phase and information needs of participants in an online cancer forum.
- Tamir DI; and Mitchell JP 2012. Disclosing information about the self is intrinsically rewarding. Proceedings of the National Academy of Sciences 109(21):8038–8043.
- Wang Y-C; Kraut RE; and Levine JM 2015. Eliciting and receiving online support: using computer-aided content analysis to examine the dynamics of online social support. Journal of Medical Internet Research 17(4):e99.