Author manuscript; available in PMC: 2017 Aug 22.
Published in final edited form as: Proc SIGCHI Conf Hum Factor Comput Syst. 2017 May;2017:1634–1646. doi: 10.1145/3025453.3025909

A Social Media Based Index of Mental Well-Being in College Campuses

Shrey Bagroy 1, Ponnurangam Kumaraguru 2, Munmun De Choudhury 3
PMCID: PMC5565736  NIHMSID: NIHMS891349  PMID: 28840202

Abstract

Psychological distress in the form of depression, anxiety and other mental health challenges among college students is a growing health concern. Dearth of accurate, continuous, and multi-campus data on mental well-being presents significant challenges to intervention and mitigation efforts in college campuses. We examine the potential of social media as a new “barometer” for quantifying the mental well-being of college populations. Utilizing student-contributed data in Reddit communities of over 100 universities, we first build and evaluate a transfer learning based classification approach that can detect mental health expressions with 97% accuracy. Thereafter, we propose a robust campus-specific Mental Well-being Index: MWI. We find that MWI is able to reveal meaningful temporal patterns of mental well-being in campuses, and to assess how their expressions relate to university attributes like size, academic prestige, and student demographics. We discuss the implications of our work for improving counselor efforts, and in the design of tools that can enable better assessment of the mental health climate of college campuses.

Keywords: college mental health, Reddit, social media, transfer learning

INTRODUCTION

College students confront many challenges in pursuit of their educational goals [51]. When such experiences are perceived as negative for a prolonged period of time, they can have an adverse effect on students’ well-being [1, 3, 66]. Beyond implications for personal health, the Federal Bureau of Investigation has reported that mental health concerns on college campuses can pertain directly to episodes of violence [22].

Yet few university students seek help related to mental illness. Only 18% of students with a past-year history of poor mental wellness are known to seek counseling, therapy or treatment [9]. This arises due to a variety of barriers: limited health insurance coverage, paucity of knowledge about psychiatric services, social stigma, and lack of time [1, 57].

It is recognized that campus-wide support measures, coping strategies, and mitigation programs might decrease the negative effects of mental illness in college students [3, 54, 66]. However, employing all of these interventions necessitates adequate assessment of the “mental health climate” within a university campus. Currently, such assessments are challenged due to the paucity of adequate and accurate data on students’ well-being. Local data is gathered through visits to the campus counseling center. However these services are often only availed by students when their mental well-being takes a downward turn [27]. Complementarily, many universities conduct periodic surveys [5] to gauge mental health challenges of students and supplement clinical data on students’ mental health [35, 9]. However the large temporal gaps across which these measurements are made, the retrospective nature of recalling past experiences, and limited consistency of survey findings across multiple campuses make it difficult for authorities to act upon such information and influence campus mental health intervention programs.

In this paper, we aim to bridge this gap by utilizing social media as an unobtrusive “lens” to gauge mental health expressions of students in university campuses. Our motivation stems from two observations. First, recent advances in HCI and social computing research have provided promising evidence that content shared on social media, especially its linguistic characteristics, can enable accurate inference, tracking, and understanding of mental health states [19, 14, 60]. Second, over 90% of young adults, or individuals of college-going age, use social media [40]. In fact, many college students are appropriating these platforms to meet a variety of their needs, such as self-disclosure, support seeking and social connectedness [30]. How can we build on these methodological advances and the pervasive use of social media by students to gauge their mental well-being? To answer this question, we focus on the following research aims:

  • Aim 1: Building and validating a machine learning methodology to identify mental health expressions of students in campus-geared online communities.

  • Aim 2: Analyzing the linguistic and temporal characteristics of the above inferred mental health expressions of students in different university campuses.

  • Aim 3: Developing an index of collective mental well-being in a campus, and examining its relationship to attributes of the university, including academic prestige, enrollment size, and student demographic distribution.

To accomplish these research aims, we use large-scale, passively gathered, longitudinal data shared in over 100 campus-geared communities on the social media site Reddit. We then show that an inductive transfer learning approach [49] can help to detect mental health expressions in student populations with high precision and accuracy (97%). Analyzing these expressions over time, we find that they show a monotonically increasing trend through the academic year. However, these expressions demonstrate a decreasing trend during the summer months. Alarmingly, we also observe that the relative proportion of these expressions has shown a 16% increase between 2011 and 2015. Then, we demonstrate that our transfer learning based classification approach can be used to develop a novel metric of college campus well-being, known as the “Mental Well-being Index” (MWI). MWI enables us to discover both established and previously underexplored differences across college campuses. We find MWI to be lower in public universities with large undergraduate student bodies and female students; counter-intuitively, it is not lower in colleges with higher academic prestige.

To the best of our knowledge, we present the first large-scale multi-campus study of college student mental health by leveraging social media data of over 100 campuses. Our findings bear implications for improving mental health support and counseling efforts within campuses. We conclude by discussing design considerations for technologies that enable unobtrusive, real-time tracking of collective mental health of college student populations.

RELATED WORK

Mental Health of College Campuses

Clinical and Epidemiological Assessments

There is a rich body of work in psychology, health policy, epidemiology, and public health around assessing and understanding mental health concerns of college students [31, 45, 59]. While some of these works focus on identifying the underlying factors that impact college students’ mental health [59], the bulk focuses on assessing the mental health of students who seek help at college counseling centers [31]. Most college counseling centers persistently collect local data from students seeking their services on individual campuses, such as through survey instruments validated against the Diagnostic and Statistical Manual of Mental Disorders (DSM) [45]. However it has been noted that these data are rarely shared nationally. Further, Soet and Sevig [56] argued that college students who do not come to counseling centers need to be investigated to examine similarities and differences between clinical (those who seek mental health treatment) and non-clinical populations. By focusing on content shared on university campus social media communities, we are able to identify an alternative mechanism to gather information about mental health challenges in college campuses, including of those who may be unwilling or unable to seek professional help.

Noting the challenges of solely focusing on clinical student populations, many universities have adopted a survey conducted by the American College Health Association [6]. The survey includes limited questions on mental health issues such as medication use, depression, and suicide. However these studies are limited in temporal granularity and involve significant resource commitment to conduct.

To overcome limitations of survey approaches, researchers have employed wearable sensing technologies and experience sampling methods to obtain a variety of physiological and psychological signals in a continuous fashion from college student populations [62, 28, 65, 64]. However, most of these works have been done in controlled lab settings or supervised real-life settings. An exception is the recent work of Wang et al. [62], who used smartphone sensing to validate attributes of campus life, including academic conduct, lifestyle, socialization, depression, stress, and mood [13, 63, 10].

Although these approaches capture rich, dense sensing data about people’s behaviors, activities and moods, they need considerable cooperation, intervention, and compliance from participants. This challenges conducting longitudinal, repeatable studies on student mental health spanning large populations. Social media data, on the other hand, can be passively collected, and therefore can scale to multiple campuses easily. Thus it can be more useful for longitudinal tracking of a specific university’s mental health climate.

Differences Across Academic and Student Attributes

Various research studies have explored how mental health of college students relates to demographic and social factors [7]. Racial and ethnic minority students are reported to under-use mental health services, even while reporting more distress [67]. Further, studies report gender differences in psychological distress among young adults and students [45]. Therefore, it is posited that a focus on reaching diverse groups of college students, such as gender or racial minorities in specific campuses, is needed [24].

However, there has been limited research examining the manifestation of mental health concerns among students in the light of the academic setting, such as enrollment size, competitiveness, prestige, supportiveness of academic personnel, and field of study [56]. Hunt and Eisenberg [32] argued that the risk factors for mental disorders among students must be understood in the context of not only their vulnerabilities, but also how they interact with external factors in college. Our work attempts to fill this gap by analyzing the relationship between social media derived mental well-being of students, and a range of attributes of the corresponding institutions, such as the size and demographics of their student body, and academic prestige.

Social Media and Mental Health

General Population

An emergent body of work in HCI and related disciplines has examined the relationship between mental well-being and self-disclosure on social media [18, 2]. De Choudhury and De [18] explored individuals who appropriated the Reddit platform to express a variety of emotional and mental distress, and how these expressions are characterized by disinhibiting behavior. Andalibi et al. [2] qualitatively characterized the variety of emotional expressions that are shared on Instagram via the hashtag “#depression”. In other work, social media content analysis, specifically of linguistic cues and conversational patterns, has enabled novel mechanisms to predict risk to mental health concerns, ranging from depression [19], substance abuse [47, 41], loneliness [36], eating disorders [12], and other mental health disorders [14, 60]. We extend this body of work by developing an automated machine learning method that can detect mental health expressions in social media, specifically in the context of college student populations.

College Students

There is limited work on college students’ social media use and their mental well-being. Ellison et al. [26], in a seminal study, found that there is a positive relationship between college students’ Facebook use and the maintenance and creation of social capital. Similarly, Manago et al. [42] found that social networking sites helped college students satisfy enduring human psychosocial needs. Other work has found that social media use may foster the development of intimate relationships, including supportive friendships among college students [52]. Moreover, the communication with friends that occurs on these platforms may help college students resolve key issues present during this transitional phase of life [44]. Given the pervasiveness of social media use among college students and its relationship with psychological well-being [30, 43, 61], we examine how data gathered from these platforms may lend insights into the collective mental well-being of college campuses.

Macro-Scale Mental Well-Being With Social Media

Leveraging social media, researchers have also sought to develop quantifiable indices of emotional and mental well-being of large populations [46, 34, 29]. Kramer [37] developed a “Gross National Happiness” index with Facebook posts, whereas Dodds et al. [21] developed a “happiness index”, a hedonometer, based on textual content shared on Twitter. More recently, Schwartz et al. [55] adopted more sophisticated methods like topic models to study the patterns of county-specific levels of well-being and life satisfaction using Twitter data. Utilizing clinical self-reported information about depression, De Choudhury et al. [17] also developed a Twitter-based state-level index of depression. Our contributions in this paper build on these investigations and methodologies. We examine how unobtrusive data gathered from students’ social media use may inform the development of a quantifiable index that tracks collective mental well-being of students in different college campuses.

DATA

University Data

We first obtained a list of 150 ranked major universities in the United States by crawling the US News and World Report website [48]. This list is constructed based on the Carnegie classification, employed extensively by higher education researchers, and using a set of 16 indicators of academic excellence, defined by US News. The list includes a variety of universities spread across the US in different settings (e.g., urban, rural), and with a wide range of student enrollment sizes. Figure 1(a) shows their geographic distribution. As a part of this crawl, we also obtained university metadata: gender distribution of students, average tuition and fees, and academic calendar (semester/quarter).

Figure 1.

(a) Geographical distribution of the 150 ranked universities used in this paper. (b) University subreddit size over student enrollment (as of July 2016).

To obtain further information about the nature of the student body, we crawled the Wikipedia pages of all of the 150 universities. From these pages, we extracted the size of student enrollment, type (public/private), and setting (rural/suburban/urban/city) at every institution. These definitions come from a formal categorization scheme used by the US Department of Education. The student body enrollment sizes ranged from 2,255 to 97,494, with 98 public and 52 private universities. 50 universities were reported to be urban, 47 city, 39 suburban, and 13 rural.

Finally, we obtained information on racial diversity of the universities from a website known as Priceonomics [58]. The website calculates the Herfindahl-Hirschman Index (HHI), by combining the race/ethnicity distribution of student bodies at different universities, with data given from the Department of Education. HHI ranges from 1 (the least diverse: a population of all one type) to 1/N (the most diverse), where N is the number of different racial categories being analyzed.
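The HHI described above is the sum of squared category shares, which can be illustrated with a minimal sketch. The function name and the example counts are hypothetical and are not Priceonomics' code or data.

```python
def herfindahl_hirschman_index(counts):
    """Return the HHI of a list of non-negative category counts.

    The index is the sum of squared shares: 1.0 when everyone falls in
    one category, and 1/N when N categories are equally represented.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return sum((count / total) ** 2 for count in counts)


# Hypothetical enrollment counts for four racial/ethnic categories.
single_group = herfindahl_hirschman_index([1000, 0, 0, 0])
evenly_split = herfindahl_hirschman_index([250, 250, 250, 250])
```

With these made-up inputs, `single_group` sits at the least diverse end of the range and `evenly_split` at the most diverse end (1/N with N = 4).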

Social Media Data of Universities

Next, we obtained social media data of the above universities. Specifically, we focused on the social media site Reddit.

Why Reddit?

Reddit is known to be a widely used online forum and social media site among the college student demographic [23]. Due to its forum structure, it is extensively used for both content sharing, as well as for obtaining feedback and information from communities of interest. Reddit harbors a variety of communities known as “subreddits”, including many dedicated to specific university campuses. This allows a large sample of posts shared by students of a university to be collected in one place. Our preliminary manual inspection of university subreddits (e.g., r/gatech or r/KState) revealed that these subreddits are appropriated by students to discuss college topics (Table 1). Focusing on these public Reddit communities also does not require explicit data collection efforts to be coordinated at each of the 150 university sites. Although more students are likely to use Facebook, due to its largely privately shared content, it is challenging to obtain access to a large dataset of a university’s students. Next, while Twitter is also widely adopted, without explicit self-reported information, it is challenging to identify college student accounts. Finally, prior work [2, 18] notes that semi-anonymity of Reddit enables candid self-disclosure around stigmatized topics like mental health.

Table 1.

Example (slightly paraphrased) subreddit posts identified to be written by university students.

What constitutes a “good” GPA? I’m currently entering my third quarter as a CS major and realize today that I’ve no idea what’s considered a “good” (or even “average”) GPA here
Not getting calculus I have a 3.7 GPA and I made an A in both [ class A] and [ class B], so I know I’m not dumb. I don’t know what the hell is wrong with me cause I sure feel dumb.
Need a little help please I’m not doing very well, I don’t mean academically I mean in my head. I really don’t think I can continue with this quarter.

Using Reddit’s subreddit search functionality, we crafted various search queries from the names and acronyms of the universities in our list. We were able to identify the public subreddit pages of 146 of the 150 universities. Thereafter, we employed a multi-step approach to secure posts and associated metadata shared in these 146 subreddits:

Initial Data Acquisition

We leveraged the archive of all Reddit data made available on Google’s BigQuery [11]. BigQuery is a cloud-based managed data warehouse that allows third parties to access large publicly available datasets through simple SQL-type queries. Our queries retrieved all posts between June 2011 and February 2016 available in the Reddit data archive. This included 424,984 posts from 153,378 unique users across all of the 146 universities, with a mean of 2,910.8 posts (σ = 4329.6) and 1,050 unique users (σ = 1407) per subreddit.

Filling the Gaps in Subreddit Data

The second step of our data collection process focused on identifying subreddits with insufficient data and supplementing them through additional data collection. Through Reddit’s official API (https://www.reddit.com/dev/api/), we obtained the most recent number of subscribers in the 146 university subreddits (as of July 2016). Then, to investigate if and to what extent some subreddits had unusually little data in step 1, we computed the unique user to subscriber ratio of each subreddit and the median of this ratio across subreddits. This allows us to capture the subreddits where the subscriber count is high but the data obtained is not sufficiently representative. For subreddits with a unique user to subscriber ratio below the median (.42) (73 in all), we performed a one-time data collection using the Reddit API. This gave us a set of (at most) 1000 most recent posts for each subreddit, with a total of 39,824 posts added to the data obtained in step 1, following de-duplication. We note that this procedure did not skew the yearly distributions of data across the subreddits: the skew (yearly rate of change) before and after data filling was 4.86 and 5.05 respectively, which were found to be statistically equivalent based on a two-sample equivalence test (p = .013, p = .025), a test that applies two one-sided t-tests to the before-after yearly rates of change, one from each side of a chosen difference interval [−1, 1].
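The selection and merge logic of this step can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names, the dictionary-based inputs, and the use of a post `id` field for de-duplication are all hypothetical.

```python
from statistics import median


def below_median_subreddits(unique_users, subscribers):
    """Return (median ratio, sorted subreddits whose ratio is below it).

    Both arguments map a subreddit name to a count. The ratio is the
    number of unique users seen in the archive divided by subscribers.
    """
    ratios = {
        name: unique_users[name] / subscribers[name]
        for name in subscribers
        if subscribers[name] > 0
    }
    cutoff = median(ratios.values())
    return cutoff, sorted(name for name, ratio in ratios.items() if ratio < cutoff)


def merge_without_duplicates(archive_posts, api_posts):
    """Append API posts to archive posts, skipping ids already present."""
    seen = {post["id"] for post in archive_posts}
    merged = list(archive_posts)
    for post in api_posts:
        if post["id"] not in seen:
            seen.add(post["id"])
            merged.append(post)
    return merged
```

Only the subreddits returned by `below_median_subreddits` would be re-queried, and `merge_without_duplicates` keeps the archive copy when the same post appears in both sources.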

Correcting for Under-Adoption of Reddit

Based on the Reddit data we collected, Figure 1(b) shows a high positive correlation between the size of a university’s student body (enrollment) and the number of users subscribing to the corresponding subreddit (R² = .38; ρ² = .6; p < .05). A subreddit that deviates from this trend implies that the associated university’s student body is under- or over-represented on Reddit. We therefore devised a method to identify these subreddits, to correct especially for under-adoption bias. We first calculated the ratio of the number of subreddit subscribers to student enrollment for each subreddit and the corresponding university. If, for a subreddit, this ratio was less than the expected adoption of Reddit for the same demographic group (4–8% as of 2016¹), we assumed that Reddit was under-adopted by the students in the corresponding university. We thus removed subreddits where this ratio was <4%. This brought down our subreddits from 146 to 109. In these 109 subreddits, the mean Reddit adoption was 8.6% (σ = 1.3), which is close to the highest adoption reported by Pew.
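The under-adoption filter amounts to a threshold on the subscriber-to-enrollment ratio. The sketch below assumes a hypothetical input mapping each subreddit to a `(subscribers, enrollment)` pair; the names and example numbers are illustrative and do not come from the dataset.

```python
MIN_ADOPTION = 0.04  # lower end of the 4-8% adoption range cited in the text


def drop_under_adopted(subreddits, min_adoption=MIN_ADOPTION):
    """Keep subreddits whose subscribers/enrollment ratio meets the threshold.

    `subreddits` maps a name to (subscribers, enrollment). The result maps
    each retained name to its adoption ratio.
    """
    kept = {}
    for name, (subscribers, enrollment) in subreddits.items():
        if enrollment <= 0:
            continue  # no usable enrollment figure, so no ratio can be formed
        adoption = subscribers / enrollment
        if adoption >= min_adoption:
            kept[name] = adoption
    return kept


# Hypothetical universities with 10,000 students each.
kept = drop_under_adopted({"r/x": (500, 10000), "r/y": (300, 10000)})
```

Here `r/x` (5% adoption) is retained and `r/y` (3%) is removed, mirroring the <4% exclusion rule.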

The final dataset employed in our ensuing analyses included 446,897 posts from 152,834 unique users (mean posts per subreddit: 4,100; mean users per subreddit: 1,402). Figure 2(a) gives a distribution of the volume of crawled posts over the years. Figure 2(b) gives the final distribution of subreddits over the unique user to subscriber count ratio. Figure 2(c–d) gives distribution of the posts and unique users across the final 109 subreddits.

Figure 2.

(a) Volume of posts over time, from the final 109 university subreddits. (b) Distribution of subreddits over unique user to subscriber count ratio. (c) Distribution of subreddits over total number of posts. (d) Distribution of subreddits over the number of unique users.

Demographic Representativeness

We note that the type of students who frequent the university subreddits could be systematically different from the student body at the same university. To examine the representativeness of our university subreddit data, we drew a random sample of 500 posts, distributed across the subreddits and the years, for manual examination of demographics. Two researchers then independently coded these posts for self-reported gender, race, or academic stage (undergraduate/graduate). For instance, from the post “I’m a junior transfer and this will be my second semester”, the researchers identified the post author to be an undergraduate, whereas from “Hi all! I’m a new grad student (male, 22) here”, the gender of the author can be inferred to be male. We found the interrater agreement to be high: Cohen’s κ = .84.
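For readers unfamiliar with the agreement statistic, Cohen’s κ compares the observed agreement between two coders with the agreement expected by chance from each coder’s label frequencies. The sketch below uses made-up labels, not the study’s annotations.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(labels_a)
    # Fraction of items on which the raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from the raters' marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical academic-stage codes from two raters for four posts.
rater_1 = ["undergrad", "undergrad", "grad", "grad"]
rater_2 = ["undergrad", "undergrad", "grad", "undergrad"]
kappa = cohens_kappa(rater_1, rater_2)
```

In this toy case the raters agree on three of four posts (observed agreement .75) against a chance agreement of .5, so κ is .5.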

The relative ratios between the gender, race, and academic stage distributions of the coded posts and the university student body (obtained based on our methodology in the subsection “University Data”) showed significant positive correlation: The mean undergrad/grad ratio in our labeled data was 2.9, while it was 2.6 in the universities. A two-sample test of equivalence gave p-values of .016 and .011 respectively, with respect to the difference interval [−.4, .4]. The sex (male/female) ratio for our labeled data was 1.6, also observed to be statistically equivalent to that of the student body, 1.1 (p = .02, p = .03, w.r.t. the difference interval [−.5, .5]). This establishes the validity of our acquired Reddit data as a representative data source for studying mental health disclosures in university campuses.

METHODS

We now present a methodology for identifying posts shared in university subreddits that are likely to be mental health expressions. Note that our Reddit data does not contain any gold standard information on whether a post shared in a university subreddit is about one’s mental health experience or condition. Our proposed method overcomes this challenge by employing an inductive transfer learning approach [16].

First, we include (as ground truth data) Reddit posts made on various mental health support communities. Prior work has established that, in these communities, individuals self-disclose a variety of mental health challenges explicitly [50]. In parallel, we utilize another set of Reddit posts, made on generic subreddits unrelated to mental health, as a control. Next, we build a machine learning classifier to distinguish between these two types of posts. Then we learn features that could detect whether a post shared in a university subreddit could be an expression of some mental health concern. We discuss these steps in detail in the following subsections.

Mental Health and Control Data

We gained access to a sample of 63,485 public posts from 35,038 unique users, shared between 2014 and 2016, in a variety of mental health subreddits. This repository of posts has been used in prior work to study mental health self-disclosure and support seeking manifested in social media [18, 50, 39, 20]. The dataset includes posts and associated metadata spanning 14 mental health related subreddits, such as r/depression, r/mentalhealth, r/traumatoolbox, and r/bipolarreddit. From this corpus, we excluded posts that contained only a title without a post body. This gave us 21,734 posts. We refer to these posts as MH posts.

Our control data also relied on a dataset compiled and utilized in prior work [50]; it contains posts from subreddits such as r/WorldNews, r/food, and r/AskReddit. We randomly sampled an equal number of posts (21,734) as the MH posts above for our control dataset. We refer to these posts as CL posts.

Automatic Identification of Mental Health Expressions

To automatically identify posts relating to mental health expressions, we adopted an inductive transfer learning approach [49]. As described above, we utilized the dataset of MH and CL posts as positive and negative examples in a binary classification framework. We utilized this data to train and test different classification techniques, including random forests, AdaBoost, Support Vector Machines (SVM), and Logistic Regression. We also adopted k-fold cross validation to evaluate the optimality and robustness of the approach, and included as features the linguistic content of the posts (stop-word eliminated uni-, bi-, and tri-grams).
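The feature representation named above (stop-word eliminated uni-, bi-, and tri-grams) can be illustrated with a small standard-library sketch. The tokenizer, the tiny stop-word list, and the function name are assumptions made for illustration; the paper does not specify which tokenizer or stop-word list was used.

```python
import re

# A deliberately tiny stop-word list for illustration only (an assumption;
# a real pipeline would use a much fuller list).
STOP_WORDS = frozenset({"i", "a", "an", "the", "to", "of", "and", "is", "my", "it", "t"})
TOKEN_RE = re.compile(r"[a-z0-9]+")


def ngram_features(text, max_n=3):
    """Return the uni-, bi-, and tri-grams of `text` after stop-word removal."""
    tokens = [tok for tok in TOKEN_RE.findall(text.lower()) if tok not in STOP_WORDS]
    features = []
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            features.append(" ".join(tokens[start:start + n]))
    return features


features = ngram_features("I don't know")
```

Under this tokenizer, punctuation splits “don't” into “don” and “t”, and stop-word removal then leaves the bi-gram “don know”. Tokenization of this kind is one plausible explanation for features of that form in a fitted model. The resulting n-gram counts would serve as the input to a binary classifier such as logistic regression.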

Mental Well-being Index of Universities (MWI)

Can the ability to predict whether or not a Reddit post involves an individual’s mental health expression provide the basis for an accurate, reliable, fine-grained model of mental well-being manifested in various university subreddits? To this end, we use our above developed machine learning classifier to automatically label the corpus of posts shared in the 109 university subreddits. Thereafter, we define a metric called the “Mental Well-being Index” (MWI). At a time t and for a university U, we define it as the standardized difference between the frequencies of users sharing non-mental health expression posts fn(t, U) observed until t in the subreddit of U, and that of the users sharing mental health expressions until t in the same subreddit (fp(t, U)):

$$\mathrm{MWI}(t, U) = \frac{f_n(t, U) - \mu_n}{\sigma_n} - \frac{f_p(t, U) - \mu_p}{\sigma_p} \qquad (1)$$

where μp (correspondingly μn) and σp (correspondingly σn) are the mean and standard deviations of the frequencies of mental health expression (correspondingly non-mental health expression) posts. Note that we consider separate terms for the two classes of posts. This allows mental health expression and other expressions in posts to be weighted equally (since their relative volumes are likely to be different in a subreddit). This way, we also focus on variation in each class separately. That is, even if per one’s behavior, some individuals dramatically under-express mental health concerns in their posts, the relative non-mental health expression compared to mental health expression will be informative.
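Read as a difference of two standardized frequencies, Equation (1) can be sketched in a few lines. One assumption is made loudly here: the text does not state the population over which the means and standard deviations are taken, so this sketch computes them over the observed series of frequencies for a single university. The function names and the input series are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score each value against the mean and std of the same series."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant series")
    return [(value - mu) / sigma for value in values]


def mental_wellbeing_index(f_n, f_p):
    """MWI per time step: z-scored non-MH frequency minus z-scored MH frequency.

    ASSUMPTION: mu and sigma are estimated from the given series themselves.
    """
    if len(f_n) != len(f_p):
        raise ValueError("both series must cover the same time steps")
    return [z_n - z_p for z_n, z_p in zip(standardize(f_n), standardize(f_p))]


# Hypothetical frequencies over three time steps for one university.
mwi = mental_wellbeing_index(f_n=[10, 20, 30], f_p=[3, 2, 1])
```

Because each class is standardized separately, a subreddit with few mental health posts in absolute terms can still move the index when that count departs from its own typical level.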

RESULTS

Aim 1: Evaluating the Mental Health Expression Classifier

Model Performance

First, we evaluate our transfer learning based classifier for detecting mental health expressions in Reddit posts. We split our corpus of 43,468 posts shared on the MH and CL communities (21,734 of each) into a training set (on which 5-fold cross validation is applied) and a validation set, with 80% of posts for training and 20% for validation. Following 5-fold cross validation with different classification approaches, we found our logistic regression model to yield the highest accuracy: 93.4%, with an average precision, recall, and F-1 score of .93 each (refer to the “Val. set” column of Table 2 for these metrics). Table 3 presents the confusion matrix corresponding to classification on this validation set. We also note the area-under-curve (AUC) value for this classifier, corresponding to the receiver operating characteristic (ROC) curve, to be high: .98 (see Figure 3). AUC is a widely used metric because it summarizes the tradeoff between true and false positive rates across classification thresholds. Figure 3 shows that with a false positive rate (FPR) of only about 5%, we can achieve a true positive rate (TPR) of over 90%, illustrating high performance.

Table 2.

Classifier performance on the validation set (8,692 MH and CL posts) and the test set (500 annotated university subreddit posts).

Metric   Val. set   Test set
Acc.     93.4%      96.8%
Prec.    0.93       0.98
Rec.     0.93       0.97
AUC      0.98       0.97
F1       0.93       0.97

Table 3.

Confusion matrix for classifying posts in the validation set as mental health (MH) or control (CL) posts.

             Predicted CL   Predicted MH
Actual CL    4080           266
Actual MH    308            4038

Figure 3.

ROC curve in classifying mental health (MH) and control (CL) posts.
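As a consistency check, the validation-set metrics can be recomputed from the confusion matrix counts in Table 3. The sketch below assumes that the reported “average” precision, recall, and F1 are unweighted (macro) averages over the two classes; the text does not say which averaging was used.

```python
def per_class_metrics(tp, fp, fn):
    """Return (precision, recall, F1) for one class from its counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from Table 3 (rows are actual classes, columns are predictions).
cl_as_cl, cl_as_mh = 4080, 266
mh_as_cl, mh_as_mh = 308, 4038

total = cl_as_cl + cl_as_mh + mh_as_cl + mh_as_mh
accuracy = (cl_as_cl + mh_as_mh) / total

mh_scores = per_class_metrics(tp=mh_as_mh, fp=cl_as_mh, fn=mh_as_cl)
cl_scores = per_class_metrics(tp=cl_as_cl, fp=mh_as_cl, fn=cl_as_mh)

# ASSUMPTION: "average" means the unweighted mean over the two classes.
macro_precision, macro_recall, macro_f1 = (
    (mh + cl) / 2 for mh, cl in zip(mh_scores, cl_scores)
)
```

Under that assumption, the recomputed accuracy rounds to 93.4% and each macro-averaged score rounds to .93, in line with the validation column of Table 2.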

We also evaluate this best performing logistic regression model in terms of its explanatory power in the validation set over an equivalent Null model. We find that the difference between the deviance of the Null model and the deviance of our model approximately follows a χ² distribution, with degrees of freedom equal to the number of additional predictor features in the latter model: χ²(250,000, N = 8,692) = 123,076 − 98,476 = 2.46 × 10⁴, p < 10⁻¹⁰. In summary, our model results in a significant reduction of deviance in classifying MH and CL posts.

Examining Significant Predictive Features

Which n-gram features given by the logistic regression model are the most predictive of a post being classified to be MH or CL? To answer this question, we present, in Table 4, the 30 predictor features with positive β coefficient weights (i.e., they are correlated with MH posts), and another 30 predictor features with negative β coefficient weights (i.e., they are correlated with CL posts). We illustrate the context of use of a sample of these features to understand their relationship to classification outcomes.

Table 4.

Selected top 60 predictor features and their most positive/negative β coefficients from our logistic regression classifier. We show features corresponding to both mental health disclosure and control posts.

n-gram       β      n-gram          β     n-gram       β      n-gram     β
bpd          10.5   manic           4.3   reddit       −9.7   favorite   −2.5
anxiety      9.8    don know        4.1   tifu         −7.2   movie      −2.5
bipolar      7.8    tired           4.1   dae          −6.0   did        −2.4
ptsd         7.7    going           4.1   eli5         −5.7   muscle     −2.4
suicide      7.6    fucking         4.0   lpt          −5.6   gym        −2.4
feel         7.3    hate            4.0   ysk          −5.6   edit       −2.4
depressed    6.6    mental health   4.0   women        −4.8   guy        −2.3
help         6.4    diagnosed       4.0   redditors    −4.6   pregnant   −2.2
sa           6.1    psychiatrist    3.9   men          −3.7   squat      −2.1
just         5.8    social          3.8   sex          −3.4   pill       −2.1
mental       5.8    suicidal        3.8   workout      −3.2   song       −2.1
talk         5.7    today           3.8   trp          −3.2   thread     −2.1
meds         5.6    dbt             3.8   baby         −3.1   story      −2.1
therapy      5.5    request         3.7   discussion   −3.0   nsfw       −2.0
kill         5.4    years           3.7   dad          −3.0   curious    −2.0

The vast majority of the features with positive weights pertain to expressions of a variety of mental health challenges (“bpd”, “bipolar”, “ptsd”, “suicide”, “depressed”, “mental health”). Some of the other features with positive coefficients include mentions of stress and anxiety (“anxiety”, “anxious”, “nervous”), which are known concomitants of mental health concerns. Other features appear to be calls for help and support seeking on Reddit (“help”, “talk”, “request”). Some features also appear in discussions of treatment, coping strategies and outcomes (“meds”, “therapy”, “medication”, “diagnosed”, “psychiatrist”, “recovery”). Finally, a set of features with positive coefficients reflects negative emotions, hopelessness and dejection, pain, and even extreme thoughts of harming and killing oneself (“kill”, “tired”, “suicidal”, “die”, “worse”).

On the other hand, the predictor features with large negative β coefficients span a diverse range of topics, ranging from events and experiences (‘christmas”, ‘pregnant”), lifestyle (“workout”, “gym”), sports (“game”), community participation and usage practices (“lpt request”, “want know”, “curious”), to Reddit specific topics (“reddit”, redditors”, “edit”).
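As an illustration of how a ranking like Table 4 can be read off a fitted linear model, the following minimal sketch sorts features by their β weights. The function name `top_features` and the small `coefs` dictionary are hypothetical; the six weights are copied from Table 4, whereas a real model would supply one weight per n-gram.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, float]


def top_features(coefs: Dict[str, float], k: int) -> Tuple[List[Pair], List[Pair]]:
    """Return the k most positive and the k most negative (feature, weight) pairs."""
    ranked = sorted(coefs.items(), key=lambda kv: kv[1], reverse=True)
    positive = [kv for kv in ranked if kv[1] > 0][:k]
    # Walk the ranking from the bottom to get the most negative weights first.
    negative = [kv for kv in reversed(ranked) if kv[1] < 0][:k]
    return positive, negative


# A handful of weights copied from Table 4 (illustrative subset only).
coefs = {"bpd": 10.5, "anxiety": 9.8, "bipolar": 7.8,
         "reddit": -9.7, "tifu": -7.2, "dae": -6.0}

mh_side, cl_side = top_features(coefs, 2)
print(mh_side)  # features most associated with MH posts
print(cl_side)  # features most associated with CL posts
```

Positive weights push a post toward the MH class and negative weights toward the CL class, so the two lists correspond to the left and right halves of Table 4.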

In summary, we observe that our transfer learning based classifier is able to robustly detect and characterize MH expressions in Reddit posts with high accuracy.

Expert Evaluation of MH Expressions in University Subreddits

Using the above trained and validated MH-CL post classifier, we then examined its performance in identifying mental health (henceforth MH) expressions in the posts belonging to the 109 university subreddits. For this purpose, we first employed two raters to annotate a random sample of 500 university subreddit posts (balanced across the two classes) as being about MH expressions or not. We used this annotated sample as a test set on which we applied our trained classifier.

Our qualitative annotation task proceeded as follows. Adopting an inductive semi-open coding approach, two raters, a clinical psychologist and a social media expert, first independently assigned binary annotations (MH expression or not) to a sub-sample of 100 posts. To arrive at their annotation rules, they referred to prior qualitative and quantitative studies of mental health disclosures on social media [8, 2], and to literature in psychology on markers of mental health expressions [53, 15, 33]. Following this initial rating exercise, the raters got together to resolve differences and constructed a final rulebook. Per this rulebook, a post had to satisfy one or more of these criteria to be annotated as an MH expression:

  • Explicit expressions of first hand experience of psychological distress or mental health concerns (“i get overwhelmingly depressed”, “i get into a negative spiral”, “I think I am on the verge and feel like it’s the end”).

  • Explicit expressions of support, help, or advice seeking around difficult life challenges and experiences (“are there any resources I can use to talk to someone about depression?”, “I have been going to therapy for 2 days a week every week for my anxiety”).

Using this rulebook, the raters then annotated the larger sample of remaining 400 posts. The final agreement was found to be high (Cohen’s κ = .83).
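For readers unfamiliar with the agreement statistic reported above, here is a minimal sketch of Cohen's κ for two raters. The ten binary labels are made up for illustration and are not the raters' annotations.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement if each rater labeled independently at their own base rates.
    count_a, count_b = Counter(a), Counter(b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[lab] * count_b[lab] for lab in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels (1 = MH expression, 0 = not) for ten posts.
rater_a = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
rater_b = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")
```

In this toy example the raters agree on nine of ten posts, but half of that agreement would be expected by chance, which is why κ is lower than the raw agreement rate.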

We then applied our trained classifier to this annotated test set of 500 posts. The classifier performed consistently with the earlier evaluation in distinguishing university subreddit posts that are related to MH expressions from those that are not. We achieved a mean accuracy of 96.8%, with an AUC of .97 (see the second column on “Test set” in Table 2). We, therefore, proceeded to use this classifier to machine label all of the other 446,397 university subreddit posts. Our classifier identified 13,914 posts (3.1%) as MH expressions, whereas the remaining 432,483 posts were marked as not about the topic. The MH posts corresponded to 9,010 unique users out of a total of 152,834 (mean=5%, std. dev.=1.5% across the 109 subreddits).
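The two evaluation metrics and the reported share of MH posts can be sketched as follows. The labels and scores are toy values, and the pairwise AUC shown here is one standard formulation, not necessarily the routine the authors used; only the two post counts are taken from the text.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of posts whose predicted label matches the annotation."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy annotated sample: 1 = MH expression, 0 = not.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]          # hypothetical classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # 0.5 threshold is an assumption

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"AUC      = {auc(y_true, scores):.3f}")

# Share of machine-labeled posts flagged as MH, using the counts reported in the text.
mh_share = 13_914 / 446_397
print(f"MH share = {mh_share:.1%}")
```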

Aim 2: Analyzing MH Expressions in Universities

As per our next research aim, we present analytical observations given by the above classification of university subreddit posts—specifically we seek to: 1) identify what linguistic constructs (n-grams) characterize posts predicted to be MH expressions, and 2) understand the temporal manifestations of MH expressions in the context of different universities.

Linguistic Characteristics of MH Expressions

In Table 5 we present the top 20 uni-, 20 bi-, and 20 tri-grams that appear uniquely in university subreddit posts identified to be MH expressions. For qualitative inspection of the context of use of these n-grams, we randomly sampled a set of 100 MH posts that contained at least one of these top n-grams. We performed qualitative semi-open coding on this sample, employing the same two raters as above. The rating task discovered various topical contexts in which these n-grams were used in the university subreddit posts:

Table 5.

Top 20 uni-, bi-, and tri-grams that appear uniquely in university subreddit posts detected by our classifier to be about mental health disclosures.

Unigrams | Bigrams | Trigrams
health | to talk | feel like i
care | my life | course an intro
worried | my parents | was doing great
family | dont think | but i feel
problems | really dont | the jobs i
feeling | just need | only one homework
guess | my story | feel like im
hate | can help | really dont want
exhausted | the people | i really need
talking | killing myself | to make friends
cheated | people i | im just not
depression | life isnt | want to live
honestly | though i | doing poorly in
issues | mental health | issues with depression
fucking | social life | could help me
strangers | up late | go into debt
ruin | worried about | to deal with
psychiatric | suicidal thoughts | dont know where
experiences | need help | to hang out
alone | isnt fair | i need help
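One way to obtain n-grams that appear uniquely in the MH-labeled posts, as in Table 5, is to count n-grams in those posts and discard any that also occur in the remaining posts. The sketch below assumes lowercase whitespace tokenization, which is a simplification; the helper names and the four example posts are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def ngrams(text: str, n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-token windows of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def unique_top_ngrams(mh_posts: Iterable[str], other_posts: Iterable[str],
                      n: int, k: int) -> List[str]:
    """Top-k n-grams by frequency in mh_posts that never occur in other_posts."""
    mh_counts = Counter(g for post in mh_posts for g in ngrams(post, n))
    seen_elsewhere = {g for post in other_posts for g in ngrams(post, n)}
    unique = Counter({g: c for g, c in mh_counts.items() if g not in seen_elsewhere})
    return [" ".join(g) for g, _ in unique.most_common(k)]


# Made-up posts for illustration only.
mh_posts = ["i need help with depression", "i need help now"]
other_posts = ["i need a parking pass", "help with housing"]

print(unique_top_ngrams(mh_posts, other_posts, n=2, k=1))
```

Here "i need" and "help with" are dropped because they also occur in the non-MH posts, leaving "need help" as the most frequent bigram unique to the MH posts.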

We find that students appropriate the Reddit communities to converse on a number of college, academic, relationship, and personal life challenges that relate to their mental well-being (“go into debt”, “doing poorly in”, “only one homework”, “course an intro”, “up late”, “the jobs i”):

I’m lost and overwhelmed. […] I feel sick to my stomach because it made me go into debt and I can’t seem to bring myself to go out there and find a job.

The n-grams also indicate that certain posts contain explicit mentions of mental health challenges (“psychiatric”, “depression”, “killing myself”, “suicidal thoughts”), as well as the difficulties students face in their lives due to these experiences (“life isnt”, “issues with depression”, “was doing great”, “ruin”, “cheated”):

New Fall transfer here. Can I use [ some service] to get psychiatric help, including a diagnosis and meds IF necessary? There’s definitely some psychological issues I’ve been carrying around with me my entire life.

Finally, some of the n-grams indicate that students tend to vent on the different subreddits about challenges of college life, or to share their personal stories, feelings, and experiences (“exhausted”, “isnt fair”, “my story”, “im just not”):

[…] Since it’s too late to apply to any colleges for the fall semester, and since my entire social life is here at [ some university], I would prefer to stick around here rather than going home. So that’s basically my story. Can anyone recommend me a mental health professional?

Some of the top n-grams are also used in the context of seeking support (“need help”, “i really need”, “could help me”):

Anyone willing to help me with [ some course] preparing for the final? I need help, this semester has been extremely stressful due to development of clinical depression and anxiety.

Taken together, this analysis helps us validate that our classifier is able to reveal markers of mental health challenges in college students that are known to relate to their academic, personal, or social lives [31, 45, 59].

Temporal Characteristics of MH Expressions

Next, we present an analysis of how the identified MH expressions in different university subreddits change over time. First, we compute the proportion of MH posts for each of the 109 subreddits. Aggregating this fraction over each of the years in our datasets (see footnote 2), we study the relative temporal change in expression of mental health in the university subreddits under consideration. In Figure 4(a), we show this trend, along with a corresponding linear least squares model fit (R² = .79, p < .05). We observe that the proportion of posts with MH expressions has been on the rise: there is a 16% increase in 2015 compared to 2011. Further, identifying groups of subreddits with strictly positive or negative slopes per their least squares fits (Figure 4(b)), we find that although there are some university subreddits with negative slopes (i.e., they show a decreasing trend over the years), for the vast majority (71% of the 109 subreddits) there has been a continual increase over time.
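The trend analysis described above can be sketched in three steps: compute the yearly proportion of MH posts, fit an ordinary least squares line, and report R² and the relative change between the first and last year. All numbers below are made up for illustration; they are not the values plotted in Figure 4, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple


def yearly_proportion(posts: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Map each year to the fraction of its posts labeled as MH."""
    totals: Dict[int, int] = defaultdict(int)
    mh: Dict[int, int] = defaultdict(int)
    for year, is_mh in posts:
        totals[year] += 1
        mh[year] += int(is_mh)
    return {year: mh[year] / totals[year] for year in sorted(totals)}


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least squares fit y = intercept + slope * x; returns (slope, intercept, R^2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, intercept, r2


# Hypothetical yearly MH proportions (not the paper's data).
years = [2011, 2012, 2013, 2014, 2015]
props = [0.0300, 0.0310, 0.0305, 0.0330, 0.0348]

slope, intercept, r2 = linear_fit(years, props)
relative_change = (props[-1] - props[0]) / props[0]
print(f"slope = {slope:.5f} per year, R^2 = {r2:.2f}, change = {relative_change:.0%}")
```

Applying `linear_fit` to each subreddit separately and checking the sign of the returned slope would give the grouping into increasing and decreasing subreddits.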

Figure 4.

(a) Yearly trend of the proportion of mental health (MH) posts aggregated across all of the university subreddits. (b) Distribution of university subreddits over their respective positive or negative slopes of linear least squares fit to (a).

Next, we examine how MH expressions in university subreddits change over the course of a typical academic year. Since academic years differ between universities on the semester system and those on the quarter system, we also compare the trends of MH expressions across these two university groups. From Figure 5(a–b) we find that over the course of a typical academic year, the proportion of MH posts of universities in both the semester and quarter systems shows a monotonically increasing trend. Note the positive slopes of the two linear least squares model fits: R² = .88, p < .05 and R² = .78, p < .05 respectively. Between August and April, for the universities on the semester system, we observe an 18.5% increase in MH expression (p < .05); this percentage is much higher, 78%, for those on the quarter system when compared between September and May (p < .05). On the other hand, during the summer months (Figure 5(c–d)), for both semester system and quarter system universities, we observe a reverse trend for the proportion of MH posts, i.e., a trend with a negative slope: R² = .99, p < .05 for both university groups.

Figure 5.

MWI during (a) the academic year for universities with the semester system; (b) the academic year for universities with the quarter system; (c) the summer for universities with the semester system; and (d) the summer for universities with the quarter system.

Aim 3: Relating MWI to University Attributes

For our final investigation (aim 3), we compute the MWI metric for each of the 109 university subreddits. We examine the relationship between the MWI of each university subreddit and the corresponding university’s attributes.

We glean several interesting observations from Figure 6(a–h). From Figure 6(a) we find that the MWI of the 66 public universities we consider is lower than that of the 43 private universities by 332%. This difference is statistically significant based on an independent sample t-test (t = 7.38, p < .05). Examining universities by their setting (Figure 6(b)), we find that MWI is lower in the 7 rural and 33 suburban universities by 40–266% compared to others (p < .05), while it is highest in the 31 universities categorized as being in cities (by 29–77%; p < .05). Next, Figure 6(c) and (d) show the relationship between the MWI of the universities and their academic prestige and tuition fees. We observe a negative slope in the scatter plot of the former (R² = .09; p < .05), and a positive slope in the case of the latter (R² = .17; p < .05). In essence, universities with higher academic prestige (i.e., a numerically lower rank) and higher tuition tend to be associated with higher MWI.
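The group comparison above relies on an independent sample t-test. A minimal sketch of the pooled-variance (Student) form of the statistic follows; whether the authors used the pooled or the unequal-variance (Welch) form is not stated, so the choice here is an assumption, and the two small samples are made up.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    # Pooled estimate of the common variance (statistics.variance is the sample variance).
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, df


# Hypothetical index values for two groups of universities.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat, dof = pooled_t(group_a, group_b)
print(f"t = {t_stat:.2f}, df = {dof}")
```

Turning the statistic into a p-value requires the t distribution's cumulative distribution function, which the Python standard library does not provide, so that step is omitted here.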

Figure 6.

Figure 6

Scatterplots of MWI vs (a) university type, (b) university setting, (c) rank, (d) fees, (e) enrollment, (f) student body ratio, (g) sex ratio, and (h) Racial Diversity (HHI).

Now we discuss the relationship between the MWI of the universities and four attributes of their student bodies. Figure 6(e) and Figure 6(f) show that universities with larger student bodies (enrollment), as well as those with a greater proportion of undergraduates, tend to be associated with lower MWI (R² = .15; p < .05 and R² = .16; p < .05 respectively). Finally, examining two demographic attributes of student bodies (Figure 6(g) and (h)), we find that MWI tends to be lower, by 850% (p < .01), in universities with more female than male students (i.e., a male-to-female sex ratio ≤ 1). Further, although our data show a marginally lower MWI in universities with greater racial diversity, this relationship is not statistically significant (R² = .01; p = .2).
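Figure 6(h) measures racial diversity with HHI, which we take to be the Herfindahl-Hirschman Index: the sum of squared group shares. The sketch below shows that computation under this assumption; how the authors mapped the index onto their diversity axis is not specified here, and the enrollment counts are made up.

```python
from typing import Sequence


def hhi(counts: Sequence[float]) -> float:
    """Herfindahl-Hirschman Index: sum of squared shares, ranging from 1/len(counts) to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return sum((c / total) ** 2 for c in counts)


# Hypothetical enrollment counts by group for two campuses.
even_campus = [250, 250, 250, 250]    # four equally sized groups
skewed_campus = [850, 50, 50, 50]     # one dominant group

print(f"even:   {hhi(even_campus):.3f}")
print(f"skewed: {hhi(skewed_campus):.3f}")
```

A lower index means the student body is spread more evenly across groups, so under this reading greater diversity corresponds to a smaller HHI.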

DISCUSSION

Theoretical Implications of the Findings

Mechanisms for collective assessment of mental health challenges in college populations are highly valued in the literature [56, 6]; however, they are rare in practice. By proposing an index of mental well-being in campuses that is derived from passively acquired social media data of students, we believe our work contributes to closing this gap. As our results have shown, with this kind of measurement, we are able to glean previously established as well as new insights into students’ mental well-being in different universities and types of student bodies.

MH expressions of universities have been increasing over the years

A notable finding of our analysis is the monotonically increasing trend of (normalized) MH expressions across university campuses: there was a 16% rise between 2011 and 2015. This finding aligns with observations from nationwide surveys on the mental health of college students. In a 2008 national survey of directors of campus counseling centers, 95% of directors reported a significant increase in severe psychological problems among their students [4]. While our findings do not cover the same set of universities or the same timeframe, the shared directionality of the trend provides some validation of our MH expression detection method.

MH expressions show increase during the academic year, but decrease over the summer

We also observe that MH expressions consistently increase through the academic year, while they consistently decrease during the summer. Prior work situates academic pressure as a notable contributing factor to mental health challenges in students [31, 45]. We hypothesize that, as the academic year advances, accumulating academic pressure may be one reason behind the monotonically increasing trend of MH expressions. On the other hand, students may treat the summer months as a time to unwind and relax, likely lowering the expression of MH challenges. However, we suggest caution in deriving causal claims from these findings.

MWI is lower for large, public universities with large undergraduate student bodies

Next, in relating student body attributes to a university’s MWI, we are able to confirm some known facts about college student mental health. The campuses most challenged by mental health issues tend to be public universities which have large student bodies and a greater proportion of undergraduate students. It is reported that the student bodies of these campuses include many who are the first in their families to attend college and therefore carry intense pressure to succeed [32]. Further, undergraduate students especially are known to be at an elevated risk [35, 7]. We conjecture these prior findings may provide some explanation behind the observed low MWI in universities with large undergraduate student bodies.

MWI is higher in high prestige universities

Further, we observe a positive correlation between the academic prestige (in terms of ranking) of a university and its MWI. Despite reports of mental health challenges being more prevalent in top ranked colleges [25], our results reveal an opposite trend. In a high prestige university, it is likely that the student body is self-selected, in that students have perhaps already internalized the need to deal with the academic pressure and rigor needed for success. Therefore, they might be unlikely to express being mentally and emotionally overwhelmed on a public social media community like Reddit.

MWI is higher in universities with higher tuition

Relatedly, the other somewhat surprising finding is the positive relationship between a university’s tuition fees and its MWI. Financial stress is identified to be a major factor behind college students’ mental health issues [31]. Our finding deviates from this expected behavior. We conjecture it might be explained by the socio-economic support structure that many of the students at high tuition universities may come with—those who get admitted are likely to have friends, family, and a sound financial backbone that may be mediating their risk to well-being challenges [24].

MWI is lower in universities with a larger female student body

As a final observation, our results indicate that there is greater expression of mental health challenges in universities that have a larger proportion of female students than male students. As noted in our literature review, female college students tend to seek mental health help more frequently than male students [45]. Hence, it can be presumed that they also tend to be more expressive about their mental health challenges on social media, which may explain our finding for MWI.

Taken together, in this paper we introduced social media as a means of detecting campus-specific mental health expressions of students. This has enabled us to obtain a variety of insights, like the ones discussed above, into the mental well-being of a campus at a granularity and scale not possible before. Moreover, with our approach, it is possible to gather these insights through unobtrusive and inexpensive means. Thus our work can expand and complement current survey-based efforts of assessing student mental health and its relationship to attributes of the university or the student body. Broadly, we contribute to the emergent body of HCI research that leverages naturalistically shared population data on social media for mental health measurement [19]: We are able to identify a variety of college-student specific linguistic markers of mental health challenges by employing a novel data source of university-specific Reddit communities.

Implications for Design

We believe that our work can enable technology design that promotes population-centric reflection of mental well-being in college campuses in ways not possible before. This can be accomplished in the following ways:

Technologies for Improving Counseling Efforts

Our work shows that when college students appropriate social media to express their mental health challenges, our method can accurately identify and measure such expressions. This observation and methodology can be incorporated into interactive applications for campus counselors. Such an application could surface specific linguistic attributes highly correlated with the mental health expressions of students on social media, as well as the temporal manifestations of these attributes in different student groups. This information can be highly beneficial to campus counseling centers and other campus health service providers in understanding the pervasiveness of mental health expressions, and the variety of topics that students in these communities relate to mental health challenges. They could act on this information to allocate their services and strategies to better reach and serve students.

Technologies for Assessing Campus Morale

Our method and findings bear implications for the design of novel student mental well-being tracking interfaces, visualizations, and systems for use by campus administrators. These interfaces could provide stakeholders with an interactive way to identify, over time and in near real time, the ebbs and flows of mental well-being, both during a typical academic year and over extended periods. This information can then be utilized for routine assessments of campus morale, as well as to understand the impacts of academic events like examinations, regulations, and policy decisions on campus life. Further, it can also contribute to improved preparedness on campus in case of an emergency, and to assessing the mental resilience of the student body in response to adverse events that affect the mental well-being of students. Finally, these systems can also empower campus administrators with collective information about students, given by the students themselves, to identify how to proactively employ, allocate, and manage campus-specific resources and mental health awareness and mitigation programs in order to best cater to the needs of the students and improve campus mental well-being.

Limitations and Future Work

Diagnostic Claims, Causality, and Generalizability

We note that the Mental Well-being Index is not meant to be a diagnostic tool to assess who is at risk of mental illness. Thus we caution against appropriating the index as a mechanism to identify specific college students who could be suffering from mental health concerns. However, as future work, it will be worthwhile to examine to what extent MWI’s assessments of mental health challenges correlate with psychometric assessments obtained via instruments like the Patient Health Questionnaire (PHQ) [38]. We also caution against deriving causal claims between various university attributes and manifested MWI. Further, we only studied 150 ranked universities in the US. We caution against arbitrary generalizations.

Prevalence

We found that the percentage of users with mental health expressions on Reddit was slightly lower (5%) than reported national statistics of college students with significant mental health concerns (7%) [6]. We hypothesize a couple of reasons behind this difference. 1) Students, due to the stigma of mental illness, may be underreporting their mental health concerns on a public platform like Reddit. 2) Students who appropriate social media for mental health needs may be ones who do not (or are not able to) seek professional help, accounting for the discrepancy.

Evaluation

Although our validation approach for MWI derived trends that align with some known patterns of college students’ mental health challenges, one limitation of our work is the lack of a more rigorous evaluation. Which are the campuses with the most mental health challenges, and does our MWI metric correlate with those statistics? Answering this question requires access to the normalized levels of mental health concerns in the universities studied here, which are not available for public use [56]. Moreover, university administrators may be hesitant to share such statistics more widely due to the stigma it may bring to a university student body. However, as we showed, MWI can be adopted to compare across subgroups of campuses that share similar attributes.

Alternative Data Sources

Finally, we leveraged data from Reddit. While this social media platform is very popular in the college student demographic, students likely also use a variety of other social media, such as Instagram, Twitter, and Snapchat, as noted earlier, whose data could be obtained through data volunteering efforts of students. Future research could examine how data from these various platforms may be integrated to improve the assessment of mental health expressions in campuses.

CONCLUSION

Many college students are appropriating online social platforms for mental health disclosure and support seeking needs. We used student-geared Reddit communities of over a hundred universities to build and evaluate a transfer learning based classification approach that can detect mental health expressions with 97% accuracy. Leveraging this classifier, we then developed a Mental Well-being Index (MWI) to evaluate the collective mental health status of over 100 university campuses in the US. We then showed the relationship between MWI and various attributes of the universities and their student bodies. We believe our work can enable technology design to tackle mental health challenges in college populations.

Acknowledgments

We thank Gregory Abowd and other members of the Georgia Tech CampusLife team for valuable feedback. We also thank members of Precog for their valuable inputs; special thanks to Niharika Sachdeva. De Choudhury was partly supported through NIH grant #1R01GM11269701.

Footnotes

2. We exclude 2016 since we don’t have complete data for the year.

Contributor Information

Shrey Bagroy, Precog, IIIT-Delhi, shrey14099@iiitd.ac.in.

Ponnurangam Kumaraguru, Precog, IIIT-Delhi, pk@iiitd.ac.in.

Munmun De Choudhury, College of Computing, Georgia Tech, munmund@gatech.edu.

References

  • 1.Abouserie Reda. Sources and levels of stress in relation to locus of control and self esteem in university students. Educational psychology. 1994;14(3):323–330. 1994. [Google Scholar]
  • 2.Andalibi Nazanin, Ozturk Pinar, Forte Andrea. Depression-related Imagery on Instagram. Proc. CSCW’15 Companion. 2015:231–234. [Google Scholar]
  • 3.Antonovsky Aaron. Health, stress, and coping. 1979 1979. [Google Scholar]
  • 4.American College Health Association and others. American College Health Association-National College Health Assessment Spring 2008 Reference Group Data Report (abridged): the American College Health Association. Journal of American college health: J of ACH. 2009;57(5):477. doi: 10.3200/JACH.57.5.477-488. 2009. [DOI] [PubMed] [Google Scholar]
  • 5.American College Health Association and others. American College Health Association-National College Health Assessment II: Reference Group Data Report Spring 2012. Linthicum, MD: American College Health Association; 2012. 2012. [Google Scholar]
  • 6.American College Health Association and others. National College Health Assessment II: Reference Group Executive Summary Fall 2012. Hanover, MD: American College Health Association; 2013. 2013. [Google Scholar]
  • 7.Bailey Roger C, Miller Christy. Life satisfaction and life demands in college students. Social Behavior and Personality: an international journal. 1998;26(1):51–56. 1998. [Google Scholar]
  • 8.Balani Sairam, Choudhury Munmun De. Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems. ACM; 2015. Detecting and Characterizing Mental Health Related Self-Disclosure in Social Media; pp. 1373–1378. [Google Scholar]
  • 9.Barlow David H, Lehrer Paul M, Woolfolk Robert L, Sime Wesley E. Principles and practice of stress management. Guilford Press; 2007. [Google Scholar]
  • 10.Ben-Zeev Dror, Scherer Emily A, Wang Rui, Xie Haiyi, Campbell Andrew T. Next-Generation Psychiatric Assessment: Using Smartphone Sensors to Monitor Behavior and Mental Health. 2015 doi: 10.1037/prj0000130. 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Google BigQuery. [Accessed: 2016-09-03];Reddit Archive on Google BigQuery. 2015 https://bigquery.cloud.google.com/table/fh-bigquery:redditposts.fullcorpus201512?pli=1. 2015.
  • 12.Chancellor Stevie, Lin Zhiyuan, Goodman Erica L, Zerwas Stephanie, Choudhury Munmun De. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing. ACM; 2016. Quantifying and Predicting Mental Illness Severity in Online Pro-Eating Disorder Communities; pp. 1171–1184. [Google Scholar]
  • 13.Chen Fanglin, Wang Rui, Zhou Xia, Campbell Andrew T. Proceedings of the 2014 workshop on physical analytics. ACM; 2014. My smartphone knows i am hungry; pp. 9–14. [Google Scholar]
  • 14.Coppersmith Glen, Dredze Mark, Harman Craig. Quantifying mental health signals in twitter. ACL Workshop on Computational Linguistics and Clinical Psychology 2014 [Google Scholar]
  • 15.Cozby Paul C. Self-disclosure: a literature review. Psychological bulletin. 1973;79(2):73. doi: 10.1037/h0033950. 1973. [DOI] [PubMed] [Google Scholar]
  • 16.Dai Wenyuan, Yang Qiang, Xue Gui-Rong, Yu Yong. Proceedings of the 24th international conference on Machine learning. ACM; 2007. Boosting for transfer learning; pp. 193–200. [Google Scholar]
  • 17.Choudhury Munmun De, Counts Scott, Horvitz Eric. Proceedings of the 5th Annual ACM Web Science conference. ACM; 2013a. Social media as a measurement tool of depression in populations; pp. 47–56. [Google Scholar]
  • 18.Choudhury Munmun De, De Sushovan. Mental Health Discourse on reddit: Self-disclosure, Social Support, and Anonymity; International conference on Weblogs and Social Media (ICWSM).2014. [Google Scholar]
  • 19.Choudhury Munmun De, Gamon Michael, Counts Scott, Horvitz Eric. Predicting depression via social media; AAAI conference on Weblogs and Social Media.2013b. [Google Scholar]
  • 20.Choudhury Munmun De, Kiciman Emre, Dredze Mark, Coppersmith Glen, Kumar Mrinal. Proceedings of the 2016 CHI conference on Human Factors in Computing Systems. ACM; 2016. Discovering shifts to suicidal ideation from mental health content in social media; pp. 2098–2110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Dodds Peter Sheridan, Harris Kameron Decker, Kloumann Isabel M, Bliss Catherine A, Danforth Christopher M. Temporal patterns of happiness and information in a global social network: Hedonometrics and Twitter. PloS one. 2011;6(12):e26752. doi: 10.1371/journal.pone.0026752. 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Drysdale Diana A, Modzeleski William, Simons Andre B, Napolitano Janet, Sullivan Mark, Duncan Arne, Jennings Kevin, Holder Eric, Mueller Robert S, III and others. [Accessed: 2010-09-30];Targeted Violence Affecting Institutions of Higher Education. 2010 https://www.fbi.gov/stats-services/publications/campus-attacks. 2010.
  • 23.Duggan Maeve, Smith Aaron. [Accessed: 2016-09-03];6% of Online Adults are reddit Users. 2013 http://www.pewinternet.org/2013/07/03/6-of-online-adults-are-reddit-users/ 2013.
  • 24.Eisenberg Daniel, Golberstein Ezra, Gollust Sarah E. Help-seeking and access to mental health care in a university student population. Medical care. 2007;45(7):1594–601. doi: 10.1097/MLR.0b013e31803bb4c1. 2007. [DOI] [PubMed] [Google Scholar]
  • 25.Eisenberg Daniel, Hunt Justin, Speer Nicole. Mental health in American colleges and universities: variation across student subgroups and across campuses. The Journal of nervous and mental disease. 2013;201(1):60–67. doi: 10.1097/NMD.0b013e31827ab077. 2013. [DOI] [PubMed] [Google Scholar]
  • 26.Ellison Nicole B, Steinfield Charles, Lampe Cliff. The benefits of Facebook “friends”: Social capital and college students’ use of online social network sites. Journal of Computer-Mediated Communication. 2007;12(4):1143–1168. 2007. [Google Scholar]
  • 27.Gallagher Robert P, Gill Al. National survey of counseling center directors. Alexandria, VA: International Association of Counseling Services; 2004. 2004. [Google Scholar]
  • 28.Gjoreski Martin, Gjoreski Hristijan, Lutrek Mitja, Gams Matja. Intelligent Environments (IE), 2015 International conference on. IEEE; 2015. Automatic detection of perceived stress in campus students using smartphones; pp. 132–135. [Google Scholar]
  • 29.Golder Scott A, Macy Michael W. Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures. Science. 2011;333:6051. 1878–1881. doi: 10.1126/science.1202775. 2011. [DOI] [PubMed] [Google Scholar]
  • 30.Hampton Keith, Goulet Lauren Sessions, Rainie Lee, Purcell Kristen. Social networking sites and our lives. 2011 Retrieved July 12, 2011 from (2011) [Google Scholar]
  • 31.Hudd Suzanne, Dumlao Jennifer, Erdmann-Sager Diane, Murray Daniel, Phan Emily, Soukas Nicholas, Yokozuka Nori. Stress at college: Effects on health habits, health status and self-esteem. College Student Journal. 2000 2000. [Google Scholar]
  • 32.Hunt Justin, Eisenberg Daniel. Mental health problems and help-seeking behavior among college students. Journal of Adolescent Health. 2010;46(1):3–10. doi: 10.1016/j.jadohealth.2009.08.008. 2010. [DOI] [PubMed] [Google Scholar]
  • 33.Jourard Sidney M. Mental Hygiene. New York: 1959. Healthy personality and self-disclosure. 1959. [PubMed] [Google Scholar]
  • 34.Kamvar Sepandar D, Harris Jonathan. Proceedings of the fourth ACM international conference on Web search and data mining. ACM; 2011. We feel fine and searching the emotional web; pp. 117–126. [Google Scholar]
  • 35.Kitzrow Martha Anne. The mental health needs of today’s college students: Challenges and recommendations. NASPA journal. 2003;41(1):167–181. 2003. [Google Scholar]
  • 36.Kivran-Swaine Funda, Ting Jeremy, Brubaker Jed R, Teodoro Rannie, Naaman Mor. Understanding Loneliness in Social Awareness Streams: Expressions and Responses. ICWSM 2014 [Google Scholar]
  • 37.Kramer Adam DI. An unobtrusive behavioral model of gross national happiness. Proceedings of the SIGCHI conference on Human Factors in Computing Systems. ACM; 2010. pp. 287–290.
  • 38.Kroenke Kurt, Spitzer Robert L, Williams Janet BW. The PHQ-9. Journal of general internal medicine. 2001;16(9):606–613. doi: 10.1046/j.1525-1497.2001.016009606.x.
  • 39.Kumar Mrinal, Dredze Mark, Coppersmith Glen, De Choudhury Munmun. Detecting Changes in Suicide Content Manifested in Social Media Following Celebrity Suicides. Proceedings of the 26th ACM conference on Hypertext & Social Media. ACM; 2015. pp. 85–94.
  • 40.Lenhart Amanda, Purcell Kristen, Smith Aaron, Zickuhr Kathryn. Social media and young adults. Pew Internet & American Life Project. 2010;3.
  • 41.MacLean Diana, Gupta Sonal, Lembke Anna, Manning Christopher, Heer Jeffrey. Forum77: An Analysis of an Online Health Forum Dedicated to Addiction Recovery. Computer-Supported Cooperative Work and Social Computing (CSCW). 2015.
  • 42.Manago Adriana M, Taylor Tamara, Greenfield Patricia M. Me and my 400 friends: the anatomy of college students’ Facebook networks, their communication patterns, and well-being. Developmental psychology. 2012;48(2):369. doi: 10.1037/a0026338.
  • 43.Mark Gloria, Wang Yiran, Niiya Melissa, Reich Stephanie. Sleep Debt in Student Life: Online Attention Focus, Facebook, and Mood. Proceedings of the 2016 CHI conference on Human Factors in Computing Systems. ACM; 2016. pp. 5517–5528.
  • 44.Matsuba M Kyle. Searching for self and relationships online. CyberPsychology & Behavior. 2006;9(3):275–284. doi: 10.1089/cpb.2006.9.275.
  • 45.Matsushima Rumi, Shiomi Kunio. Social self-efficacy and interpersonal stress in adolescence. Social Behavior and Personality: an international journal. 2003;31(4):323–332.
  • 46.Mishne Gilad, de Rijke Maarten. Capturing Global Mood Levels using Blog Posts. AAAI spring symposium: computational approaches to analyzing weblogs. 2006:145–152.
  • 47.Murnane Elizabeth L, Counts Scott. Unraveling abstinence and relapse: smoking cessation reflected in social media. Proceedings of the 32nd annual ACM conference on Human factors in computing systems. ACM; 2014. pp. 1345–1354.
  • 48.US News. US News University Rankings. 2015. http://colleges.usnews.rankingsandreviews.com/best-colleges (accessed 2016-09-03).
  • 49.Pan Sinno Jialin, Yang Qiang. A survey on transfer learning. IEEE Transactions on knowledge and data engineering. 2010;22(10):1345–1359.
  • 50.Pavalanathan Umashanthi, De Choudhury Munmun. Identity Management and Mental Health Discourse in Social Media. Proceedings of the 24th International conference on World Wide Web Companion. International World Wide Web Conferences Steering Committee; 2015. pp. 315–321.
  • 51.Pearlin Leonard I. Stress and mental health: A conceptual overview. 1999.
  • 52.Pempek Tiffany A, Yermolayeva Yevdokiya A, Calvert Sandra L. College students’ social networking experiences on Facebook. Journal of applied developmental psychology. 2009;30(3):227–238.
  • 53.Pennebaker James W, Mehl Matthias R, Niederhoffer Kate G. Psychological aspects of natural language use: Our words, our selves. Annual review of psychology. 2003;54(1):547–577. doi: 10.1146/annurev.psych.54.101601.145041.
  • 54.Rayle Andrea Dixon, Chung Kuo-Yi. Revisiting first-year college students’ mattering: Social support, academic stress, and the mattering experience. Journal of College Student Retention: Research, Theory & Practice. 2007;9(1):21–37.
  • 55.Schwartz Hansen Andrew, Eichstaedt Johannes C, Kern Margaret L, Dziurzynski Lukasz, et al. Characterizing Geographic Variation in Well-Being Using Tweets. Proc. ICWSM. 2013.
  • 56.Soet Johanna, Sevig Todd. Mental health issues facing a diverse sample of college students: Results from the College Student Mental Health Survey. NASPA journal. 2006;43(3):410–431.
  • 57.Struthers C Ward, Perry Raymond P, Menec Verena H. An examination of the relationship among academic stress, coping, motivation, and performance in college. Research in higher education. 2000;41(5):581–592.
  • 58.Priceonomics Data Studio. Ranking the Most (and Least) Diverse Colleges in America. 2016. https://priceonomics.com/ranking-the-most-and-least-diverse-colleges-in/ (accessed 2016-09-03).
  • 59.Towbes Lynn C, Cohen Lawrence H. Chronic stress in the lives of college students: Scale development and prospective prediction of distress. Journal of youth and adolescence. 1996;25(2):199–217.
  • 60.Tsugawa Sho, Kikuchi Yusuke, Kishino Fumio, Nakajima Kosuke, Itoh Yuichi, Ohsaki Hiroyuki. Recognizing Depression from Twitter Activity. Proceedings of the 33rd Annual ACM conference on Human Factors in Computing Systems. ACM; 2015. pp. 3187–3196.
  • 61.Valkenburg Patti M, Peter Jochen, Schouten Alexander P. Friend networking sites and their relationship to adolescents’ well-being and social self-esteem. CyberPsychology & Behavior. 2006;9(5):584–590. doi: 10.1089/cpb.2006.9.584.
  • 62.Wang Rui, Chen Fanglin, Chen Zhenyu, Li Tianxing, Harari Gabriella, Tignor Stefanie, Zhou Xia, Ben-Zeev Dror, Campbell Andrew T. StudentLife: assessing mental health, academic performance and behavioral trends of college students using smartphones. Proceedings of the 2014 ACM International Joint conference on Pervasive and Ubiquitous Computing. ACM; 2014. pp. 3–14.
  • 63.Wang Rui, Harari Gabriella, Hao Peilin, Zhou Xia, Campbell Andrew T. SmartGPA: how smartphones can assess and predict academic performance of college students. Proceedings of the 2015 ACM International Joint conference on Pervasive and Ubiquitous Computing. ACM; 2015. pp. 295–306.
  • 64.Watanabe Jun-ichiro, Matsuda Saki, Yano Kazuo. Using wearable sensor badges to improve scholastic performance. Proceedings of the 2013 ACM conference on Pervasive and ubiquitous computing adjunct publication. ACM; 2013a. pp. 139–142.
  • 65.Watanabe Jun-ichiro, Yano Kazuo, Matsuda Saki. Relationship between physical behaviors of students and their scholastic performance. Ubiquitous Intelligence and Computing, 2013 IEEE 10th International conference on and 10th International conference on Autonomic and Trusted Computing (UIC/ATC). IEEE; 2013b. pp. 170–177.
  • 66.Zajacova Anna, Lynch Scott M, Espenshade Thomas J. Self-efficacy, stress, and academic success in college. Research in higher education. 2005;46(6):677–706.
  • 67.Zivin Kara, Eisenberg Daniel, Gollust Sarah E, Golberstein Ezra. Persistence of mental health problems and needs in a college student population. Journal of affective disorders. 2009;117(3):180–185. doi: 10.1016/j.jad.2009.01.001.
