Journal of the Royal Society Interface. 2018 Sep 19;15(146):20180210. doi: 10.1098/rsif.2018.0210

Orderliness predicts academic performance: behavioural analysis on campus lifestyle

Yi Cao 1, Jian Gao 1, Defu Lian 2, Zhihai Rong 1, Jiatu Shi 2, Qing Wang 3, Yifan Wu 1, Huaxiu Yao 4, Tao Zhou 1,2
PMCID: PMC6170765  PMID: 30232241

Abstract

Quantitative understanding of relationships between students' behavioural patterns and academic performances is a significant step towards personalized education. In contrast to previous studies that were mainly based on questionnaire surveys, recent literature suggests that unobtrusive digital data bring us unprecedented opportunities to study students' lifestyles in the campus. In this paper, we collect behavioural records from undergraduate students' (N = 18 960) smart cards and propose two high-level behavioural characters, orderliness and diligence. The former is a novel entropy-based metric that measures the regularity of campus daily life, which is estimated here based on temporal records of taking showers and having meals. Empirical analyses on such large-scale unobtrusive behavioural data demonstrate that academic performance (GPA) is significantly correlated with orderliness. Furthermore, we show that orderliness is an important feature to predict academic performance, which improves the prediction accuracy even in the presence of students' diligence. Based on these analyses, education administrators could quantitatively understand the major factors leading to excellent or poor performance, detect undesirable abnormal behaviours in time and thus implement effective interventions to better guide students' campus lives at an early stage when necessary.

Keywords: computational social science, campus behaviour, academic performance, data science, orderliness, human behaviour

1. Introduction

A major challenge in education management is to uncover the underlying ingredients that affect students' academic performance, which is significant for working out teaching programmes, facilitating personalized education, detecting harmful abnormal behaviours and intervening in students' mentation, sentiments and behaviours when necessary. For example, it has been demonstrated that physical status (e.g. height and weight) [1–5], intelligence quotient (IQ) [6,7] and even DNA [8–10] are correlated with educational achievement. Accordingly, we can design personalized teaching and caring programmes for different individuals. Since we cannot change a student's height or DNA via education, more studies concentrate on the aspects of psychology and behaviour, with a belief that learning problems resulting from psychological and behavioural issues can be at least partially addressed through intervention. For example, early interventions according to the predictions on course scores or course failures have been discussed recently for K–12 education [11–14].

Extensive experiments about relationships between personality and academic performance have been reported in the literature, suggesting that agreeableness, openness and conscientiousness, among the big five personality traits, are significantly correlated with tertiary academic performance, say GPA and course performance [15–17]. In particular, the correlation between conscientiousness and GPA is the strongest (at about 0.2) [15–17]. Behaviours are also associated with academic performance. Class attendance has long been known as an important determinant of academic performance [18–22], and additional studying hours are positively correlated with GPA [23–25]. In addition to studying behaviours, some experimental evidence indicates that students with healthy lifestyles and good sleep habits have higher GPAs on average [26–29].

Under the traditional research framework, a large portion of datasets come from questionnaires and self-reports, which are usually very small (most sample sizes range from dozens to hundreds, see meta-analysis reviews [15,16,18,23,26]) and suffer from social desirability bias [30,31], making it difficult to draw valid and solid conclusions. Thanks to the fast development of modern information technology, we have unprecedented opportunities to collect real-time records of students' living and studying activities in an unobtrusive way, through smartphones [32], online courses [33], campus WiFi [34] and so on. Analyses of these data have revealed many previously unreported correlations between behavioural features and academic performance. For example, watching more of the video and pausing more than once are two strong indicators of better course performance in MOOCs [33], and students who spend more time partying at fraternities or sororities have lower GPAs on average [35].

We aim to quantitatively understand the relationships between university students' behavioural patterns and academic performance, as well as the predictive power of these patterns for students' future academic performance. To this end, through campus smart cards, we have collected digital records of undergraduate students' (N = 18 960, all of them are pseudonymous) daily activities at the University of Electronic Science and Technology of China (UESTC) from September 2009 to July 2015 (see Data collection in Material and methods for detailed description). The data resolution was reduced before analysis to protect the individuals' privacy (see Privacy protection in Material and methods). According to the methodology (figure 1) used in this study, we have extracted two high-level behavioural characters from the records. The first is orderliness (evaluated by the purchase records for showers (n = 3 151 783) and meals (n = 19 015 773)), which quantifies daily-life regularity. The second is diligence (evaluated by the entry–exit records in the library (n = 3 412 587) and fetching water records in teaching buildings (n = 2 279 592)), which estimates how much time is spent on studies. Empirical results suggest a significant correlation between academic performance (GPA) and orderliness. Further, we found that orderliness, as an important feature, improves the prediction accuracy for academic performance even in the presence of students' diligence. Our work helps education administrators quantitatively understand the major behavioural factors that affect academic performance and provides a promising methodology towards quantitative and personalized education management.

Figure 1.

Methodology used to analyse correlations between campus daily routine and academic performance, and then to predict future academic performance. First of all, a large volume of digital entry–exit and consumption records are collected by the real-name campus smart cards with ID encryption. Then, four kinds of behaviours are used to measure two high-level behavioural characters: orderliness and diligence. Specifically, taking showers in dormitories and having meals in cafeterias contribute to the orderliness measure, while entering/exiting the library and fetching water in teaching buildings contribute to the diligence measure. After that, empirical analysis is performed to show the correlation between academic performance and behavioural characters (i.e. orderliness and diligence). Last but not least, the predictive powers of orderliness and diligence are also presented and compared. (Online version in colour.)

2. Results

2.1. Orderliness

Intuitively, a regular lifestyle would stand us in good stead for college study. In particular, teachers and administrators in most Asian countries (e.g. Japan, Korea, Singapore and China) ask students to be self-disciplined both in and out of class [36], and a significantly positive relationship between disciplinary climate and school performance has been revealed [37,38]. Moreover, previous studies based on questionnaires showed that improving the regularity of class attendance [18,39] and cultivating regular studying habits [23] enhance academic performance. However, these studies have not distinguished orderliness in living patterns from diligence in study, since more regular studying habits will result in longer studying time. To our knowledge, a clear and quantitative relationship between orderliness in living patterns and academic performance of college students has not yet been established in the literature. Fortunately, with the large-scale behavioural data, especially the extracurricular behavioural records, we are able to quantitatively measure the orderliness of a student's campus lifestyle.

According to the dataset, taking a specific behaviour, say taking showers, as an example, if the starting times of taking showers of student A always fall into the range [21:00, 21:30] while student B may take a shower at any time, we could say student A has a higher orderliness than student B for showers. Next, we turn to the mathematical issue of quantifying the orderliness of a student. Again, consider a specific behaviour (e.g. taking showers, having meals, etc.) of an arbitrary student, with $n$ recorded actions in total happening at time stamps $\{(t_1, d_1), (t_2, d_2), \ldots, (t_n, d_n)\}$, where $t_i \in$ [00:01, 24:00] denotes the precise time with resolution in minutes, and $d_i \in$ [1 September 2009, 20 July 2015] records the date. All actions are arranged in order of occurrence, namely, the $i$-th action happens before the $j$-th action if $i < j$. A typical example could be {(21:12, 20 March 2012), (22:02, 22 March 2012), … , (12:10, 09 April 2014)}. In the analysis of orderliness, we only concentrate on the precise time within a day, say $\{t_1, t_2, \ldots, t_n\}$. We first divide 1 day into 48 time bins, each of which spans 30 min and is encoded from 1 to 48 (specifically, 0:01–0:30 is the 1st bin, 0:31–1:00 is the 2nd bin, … ). Then, the time series $\{t_1, t_2, \ldots, t_n\}$ can be mapped into a discrete sequence $\{t'_1, t'_2, \ldots, t'_n\}$ where $t'_i \in \{1, 2, \ldots, 48\}$. For example, if a student's starting times of five consecutive showers are {21:05, 21:33, 21:13, 21:48, 21:40}, the corresponding binned sequence is $\mathcal{E} = \{43, 44, 43, 44, 44\}$. In this paper, we apply the actual entropy [40,41] to measure the orderliness of any sequence $\mathcal{E}$ (see Material and methods for details). The actual entropy is considered as a metric for orderliness: the smaller the entropy, the higher the orderliness. The advantages of using actual entropy instead of some other well-known metrics, such as information entropy [42] and Simpson's diversity index [43], are presented in electronic supplementary material, S1.
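The binning step described above can be illustrated with a minimal Python sketch. The function name `time_to_bin` is hypothetical, and the treatment of 0:00 as 24:00 (bin 48) is an assumption made here for completeness, since the text only specifies the range [00:01, 24:00].

```python
def time_to_bin(hhmm: str) -> int:
    """Map a clock time 'HH:MM' to a half-hour bin in 1..48.

    Convention from the text: 0:01-0:30 is bin 1, 0:31-1:00 is bin 2, and so
    on, so a time exactly on a half-hour boundary belongs to the bin that
    ends there. Assumption: 0:00 is treated as 24:00, i.e. bin 48.
    """
    hours, minutes = map(int, hhmm.split(":"))
    total = hours * 60 + minutes      # minutes since midnight
    if total == 0:                    # assumed: midnight closes the last bin
        total = 24 * 60
    return (total + 29) // 30         # integer ceiling of total / 30


# The five consecutive shower times used as the example in the text.
showers = ["21:05", "21:33", "21:13", "21:48", "21:40"]
binned = [time_to_bin(t) for t in showers]
print(binned)  # [43, 44, 43, 44, 44]
```

Only the within-day bin index is kept, so two showers on different days at 21:05 and 21:13 are indistinguishable after binning.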

Among various daily activities on campus, we calculate orderliness based on two behaviours: taking showers in dormitories and having meals in cafeterias. The reasons to choose these two behaviours are fivefold: (i) they are both high-frequency behaviours so that we have a large number of records; (ii) the data are unobtrusive and thus can objectively reflect students' lifestyles without experimental bias; (iii) they are not directly related to diligence; (iv) they are less affected by the specific course schedules since any schedule will leave time for meals and showers; (v) most university students in China live and study on campus, and thus the used datasets have sufficient coverage to validate the results. We show the distributions p(S) of actual entropies of students on taking showers in dormitories (figure 2a) and having meals in cafeterias (figure 2b), respectively. The distributions are broad enough to discriminate between students with different orderliness. We compare two typical students (figure 2c), one with very high orderliness (at the 5th percentile of the distribution p(S), named student H) and one with very low orderliness (at the 95th percentile of the distribution p(S), named student L). As clearly shown by the behavioural clock, student H takes most showers around 21:00 while student L may take showers at any time of day except for a very short period before dawn, from about 2:30 to about 5:00. We observe a similar discrepancy between two students with very high and very low orderliness on having meals (figure 2d). In short, students with higher orderliness have more concentrated behaviours over time while students with lower orderliness have much more dispersed temporal activities.

Figure 2.

The distributions of actual entropies. (a,b) Distributions, p(S), of students in taking showers in dormitories (a) and having meals in cafeterias (b). The distributions are broad enough to discriminate between students with different orderliness. (c,d) To better illustrate the differences in behavioural patterns, the behavioural clocks of two students at the 5th percentile and the 95th percentile are shown for taking showers in dormitories (c) and having meals in cafeterias (d). Intuitively, the students with higher orderliness have more concentrated behaviours over time while the students with lower orderliness have much more dispersed temporal activities. The large differences between their behavioural patterns demonstrate the relevance of the orderliness measure. (Online version in colour.)

In addition to orderliness, we have also considered another high-level behavioural character called diligence, which estimates the effort a student makes in his/her academic studies. Considering the difficulties in quantifying diligence due to the lack of ground truth, we roughly estimate diligence based on two behaviours: entering/exiting the library and fetching water in teaching buildings. Specifically, we use a student's cumulative occurrences of entering/exiting the library and fetching water as a rough estimate of his/her diligence (see electronic supplementary material, S2 for details). Empirical analysis also demonstrates that the corresponding distributions are broad enough to distinguish students with different diligence (see electronic supplementary material, figure S1).

2.2. Analysis

Intuitively, students with higher orderliness are probably more self-disciplined since orderliness is an intrinsic personality trait that not only affects meals and showers but also acts on studying behaviours. Hence, we would like to explore whether orderliness is correlated with academic performance, say GPA. The orderliness is simply defined as $O_{\mathcal{E}} = -S_{\mathcal{E}}$, and both orderliness and GPA are first regularized by Z-score [44] (see Material and methods). The relationships between regularized GPA and regularized orderliness (meal and shower) indicate significantly positive correlations (see figure 3). Considering that the relationships between behavioural features and GPA are not simply linear (see electronic supplementary material, figure S3), we apply the well-known Spearman rank correlation coefficient [45] to quantify the correlation strength (see Material and methods). Spearman's rank correlation coefficient r lies in the range [−1, 1], and the larger the absolute value is, the stronger the correlation is. Spearman's rank correlation coefficients for meal (r = 0.182; p < 0.0001) and shower (r = 0.157; p < 0.0001) are both statistically significant.

Figure 3.

Relationship between orderliness and academic performance for meal (orange circles) and shower (blue squares). Binned statistics are used to aggregate the data points, where regularized orderliness is divided into 11 bins, each of which contains the same number of data points. The mean value of data points in each bin is presented, with error bars denoting the standard error of the regularized GPA. Spearman's rank correlation coefficients for GPA–meal (r = 0.182; p < 0.0001) and GPA–shower (r = 0.157; p < 0.0001) are both statistically significant. (Online version in colour.)

The significant correlation implies that orderliness can be considered as a feature class to predict students' academic performance. Diligence is also significantly correlated with academic performance (see electronic supplementary material, figure S2) and thus is considered to be another feature class in the prediction model. We apply a well-known supervised learning-to-rank algorithm named RankNet [46] (see Material and methods) to predict the ranks of students' semester grades. We train RankNet based on the extracted orderliness and diligence values in one of the first four semesters and predict students' ranks of grades in the next semester. We use the AUC value [47] to evaluate the prediction accuracy, which, in this case, is equal to the percentage of student pairs whose predicted relative ranks are consistent with the ground truth. The AUC value ranges from 0 to 1 with 0.5 being random chance, so the extent to which the AUC value exceeds 0.5 can be considered as the predictive power. We calculate the AUC values under different feature combinations (table 1). Both orderliness and diligence are effective for predicting academic performance in all testing semesters, and the introduction of orderliness remarkably improves the prediction accuracy even in the presence of diligence. At the same time, we have checked that orderliness and diligence are not significantly correlated (see electronic supplementary material, figure S4). That is to say, orderliness has its independent effects on academic performance. In particular, orderliness is for the first time, to our knowledge, proposed as an important behavioural character that is significantly correlated with a student's academic performance.
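The pairwise reading of the AUC given above (the share of student pairs whose predicted order agrees with the true order) can be sketched as follows. This is an illustration rather than the evaluation code used in the study: the function name `pairwise_auc` and the toy numbers are hypothetical, and the handling of ties (pairs with equal true GPA are skipped, equal predicted scores count as half) is an assumption.

```python
from itertools import combinations


def pairwise_auc(scores, truth):
    """Fraction of pairs ordered the same way by `scores` and by `truth`.

    Assumptions: pairs with identical ground-truth values are ignored, and
    pairs with identical predicted scores count as half-correct.
    """
    agree = 0.0
    total = 0
    for i, j in combinations(range(len(truth)), 2):
        if truth[i] == truth[j]:
            continue                      # no defined order for this pair
        total += 1
        sign = (scores[i] - scores[j]) * (truth[i] - truth[j])
        if sign > 0:
            agree += 1.0                  # predicted order matches
        elif sign == 0:
            agree += 0.5                  # tied prediction
    if total == 0:
        raise ValueError("no comparable pairs")
    return agree / total


# Hypothetical toy data: true GPAs and model scores for four students.
gpa = [3.8, 3.2, 2.9, 3.5]
predicted = [0.9, 0.4, 0.5, 0.7]
auc = pairwise_auc(predicted, gpa)
print(round(auc, 3))  # 0.833: five of the six pairs are ordered correctly
```

A value of 0.5 corresponds to random ordering, so the table entries between roughly 0.6 and 0.69 indicate modest but consistent predictive power.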

Table 1.

AUC values for the GPA prediction. The abbreviations O, D and O + D stand for utilizing features on orderliness only, on diligence only and on the combination of orderliness and diligence, respectively. SEM is short for semester; for example, SEM 3 represents the case in which we train on the data of semester 2 and predict the ranks of examination performance in semester 3.

features    SEM 2    SEM 3    SEM 4    SEM 5
O           0.618    0.617    0.611    0.597
D           0.630    0.655    0.663    0.668
O + D       0.668    0.681    0.685    0.683

3. Discussion

In this paper, we proposed novel metrics to measure two high-level behavioural characters, orderliness and diligence, on the university campus. These two types of behavioural features are not correlated with each other (see electronic supplementary material, figure S4), while the correlations between the two orderliness features and between the two diligence features are both positive and significant (see electronic supplementary material, figure S5 and figure S6), suggesting the robustness of the proposed indices. Extensive empirical analyses on tens of millions of digital records show statistically significant correlations between orderliness and academic performance, as well as between diligence and academic performance. Of particular interest, orderliness is calculated from temporal records of taking showers and having meals, which are not directly related to studying behaviours. We further show the considerable predictive power of orderliness for academic performance. Compared with most previous works in the literature, this work is characterized by large-scale unobtrusive data that allow robust statistical analyses.

The majority of known studies in this domain are based on questionnaires with sample sizes usually ranging from dozens to hundreds [15,16,18,23,26]. In addition, these studies suffer from experimental bias since subjects tend to report socially desirable information instead of disapproved behaviours [30,31]. Therefore, analysing large-scale unobtrusive digital records will become a promising or even mainstream methodology in the near future. However, we do not think such big-data analyses should replace questionnaire surveys. Instead, these two methodologies will complement and benefit each other. First of all, with the help of large-scale accessible data on individual daily routines, we can estimate the discrimination of a set of items in a questionnaire on the target behavioural character. Therefore, it is very possible that psychologists and computer scientists will work together not only to make use of unobtrusive digital records, but also to improve the quality of questionnaires [48,49]. Secondly, a few recent works [50–52] show the potential to predict personality and some other private attributes from behavioural data. If these types of reverse predictions are accurate enough to compare with diagnoses, human judgments, self-reports and questionnaire surveys, then we are able to infer questionnaire results of a large population based on the combination of behavioural records and questionnaires of a small fraction of the population.

The present report is relevant to education management. On the one hand, understanding the explicit relationship between behavioural patterns and academic performance could help education administrators to guide students to behave like excellent ones, so that they may become excellent later on. On the other hand, we can detect undesirable abnormal behaviours in time and thus implement effective interventions at an early stage. The behavioural pattern of students who are addicted to the Internet may be largely different from that of students without Internet addiction. For example, previous studies have shown that adolescents with Internet addiction have more irregular bedtimes and dietary behaviour [53], and there is a significant and negative correlation between Internet addiction and academic performance [54,55]. Therefore, identifying Internet addicts at an early stage is critical for effective interventions.

Yet, the current findings are not without limitations in data and method. First of all, some factors that have large effects on GPA could not be captured by our methods, such as psychological factors, talent and luck during the exam. Secondly, we do not have the full scope of data that could be used to estimate orderliness (such as bedtimes) and diligence (such as duration of self-studying). Thirdly, our method may underestimate the diligence of some students with different living habits; for example, some students may mainly drink bottled water instead of fetched water, even though they are also taking classes and studying in the teaching buildings. Some students with low orderliness and diligence may exhibit a high academic performance (see electronic supplementary material, figure S3). Therefore, we will collect more relevant data in future works. In addition, we could not establish the causal link between behavioural features and academic performance based on the current data. We expect to reveal causal relations by designing a controlled experiment.

Another interesting yet challenging issue for future study is the generality of our findings across different cultures and educational atmospheres. For example, schools in East Asia maintain a more disciplined atmosphere than those in other cultures, and student academic performance is significantly positively correlated with the disciplinary climate [37,38]. Although in China orderliness is positively correlated with academic performance, whether orderliness is a quality that is predictive across all cultures still remains an open question. Moreover, most undergraduate students in universities in China live in campus dormitories and most of their activities take place within the campus. However, students in other countries may live off-campus or spend a considerable portion of time doing part-time jobs. Accordingly, the ties between collectable behavioural data and academic performance in other countries may be weaker than those in China.

In summary, we hope the reported approaches in this paper, together with some other works [32,35,50–52] in the same direction, will induce methodological and ideational shifts in pedagogy, eventually resulting in quantitative and personalized education management in the future.

4. Material and methods

4.1. Data collection

In most Chinese universities, every student owns a campus smart card with real-name registration. The smart card can be used for student identification and serves as the unique payment medium for many purchases on campus. In addition, almost all Chinese undergraduate students live on campus in dormitories until graduation. In the case of UESTC, the university provides campus dormitories to all undergraduate students and in principle does not allow students to live off-campus. Therefore, smart cards record a large volume of behavioural data in terms of students' living and studying activities. For the 18 960 anonymous students under consideration (they cover almost the whole population of undergraduate students in UESTC, except for very few students who live off-campus for health reasons or have fewer than 15 actions in one or more types of behaviours under consideration), the data cover the period from the beginning of their first year to the end of their third year. The data used in this paper contain four kinds of daily behaviours within the campus. Specifically, there are 3 151 783 records for taking showers in dormitories, 19 015 773 records for having meals in cafeterias, 3 412 587 records for entering/exiting the library and 2 279 592 records for fetching water in teaching buildings. In addition, some other consumption and entry–exit behaviours are also recorded, including purchasing daily necessities in campus supermarkets, doing the laundry, having coffees in cafes, taking school buses, entering/exiting the dormitories and so on. GPAs of undergraduate students in each semester are also collected.

4.2. Privacy protection

In the data collection and analyses, we dealt with privacy issues very carefully and tried to avoid infringement of student privacy. The students are already pseudonymous in the raw data. Moreover, considering that outside information can be used to link the data back to an individual if the individual's spatio-temporal patterns are unique enough [56,57], we tried to reduce the resolution of the data. For instance, all the information about dates was removed and the precise times of behaviours were divided into 48 bins. From the data, we only know that a student started to take a shower sometime between 21:00 and 21:30 on some day, while there are about 1000 possible shower rooms, as well as over 15 cafeterias, over 10 teaching buildings and so on. After the raw data were processed, it would be reasonably hard to re-identify individuals by the method reported by Montjoye et al. [57].

4.3. Actual entropy

We take the actual entropy [40,41] to measure the orderliness of any sequence $\mathcal{E}$. Formally, the actual entropy is defined as

$$S_{\mathcal{E}} = \left(\frac{1}{n}\sum_{i=1}^{n}\Lambda_i\right)^{-1}\ln n, \tag{4.1}$$

where $\Lambda_i$ represents the length of the shortest subsequence of $\mathcal{E}$ starting from the $i$-th element that never appeared previously. If such a subsequence does not exist, we set $\Lambda_i = n - i + 2$ [41]. Following this definition, given the binned sequence $\mathcal{E} = \{43, 44, 43, 44, 44\}$, we have $\Lambda_1 = 1$, $\Lambda_2 = 1$, $\Lambda_3 = 3$, $\Lambda_4 = 2$, $\Lambda_5 = 2$, and thus $S_{\mathcal{E}} = (9/5)^{-1}\ln 5 \approx 0.894$. In this paper, the actual entropy is considered as a measurement for orderliness: the smaller the entropy, the higher the orderliness.
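The definition above can be checked with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the estimator is taken to be ln n divided by the mean of the Λ values (the usual Lempel-Ziv-type form), and "appeared previously" is read as "occurs entirely within the elements before position i". With that reading the sketch reproduces the Λ values quoted in the text for the five-shower example.

```python
import math


def shortest_new_lengths(seq):
    """Return [Lambda_1, ..., Lambda_n] for a binned sequence.

    Lambda_i is the length of the shortest subsequence starting at position i
    that does not occur within the preceding elements (assumed reading).
    If every subsequence starting at i has been seen, Lambda_i = n - i + 2.
    """
    n = len(seq)
    lambdas = []
    for i in range(n):                          # 0-based; position is i + 1
        found = None
        for length in range(1, n - i + 1):
            sub = seq[i:i + length]
            seen = any(seq[j:j + length] == sub
                       for j in range(0, i - length + 1))
            if not seen:
                found = length
                break
        if found is None:
            found = n - (i + 1) + 2             # fallback from the text
        lambdas.append(found)
    return lambdas


def actual_entropy(seq):
    """Assumed estimator: ln(n) divided by the mean of the Lambda values."""
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    return math.log(n) / (sum(shortest_new_lengths(seq)) / n)


example = [43, 44, 43, 44, 44]
print(shortest_new_lengths(example))      # [1, 1, 3, 2, 2]
print(round(actual_entropy(example), 3))  # 0.894
```

The search is cubic in the sequence length in the worst case, which is acceptable for a sketch but would need a suffix-based method for sequences with thousands of records.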

4.4. Data regularization

The distributions of orderliness and GPA are spread around different value scopes. To eliminate the potential effect on correlation analysis, we use the Z-score [44] to regularize the data, namely,

$$O'_{\mathcal{E}} = \frac{O_{\mathcal{E}} - \mu_O}{\sigma_O} = \frac{\mu_S - S_{\mathcal{E}}}{\sigma_S}, \tag{4.2}$$

where $O'_{\mathcal{E}}$ is the regularized orderliness for the student with binned sequence $\mathcal{E}$, $\mu_O$ and $\sigma_O$ are the mean and standard deviation of orderliness $O$ for all considered students, and $\mu_S$ and $\sigma_S$ are the mean and standard deviation of actual entropy $S$ for all considered students. Indeed, orderliness is simply defined as $O_{\mathcal{E}} = -S_{\mathcal{E}}$ under a monotone and one-to-one relationship. Obviously, $\mu_O = -\mu_S$ and $\sigma_O = \sigma_S$. As a result, the predictability of orderliness and entropy is the same. Analogously, the regularized GPA for an arbitrary student $i$ is defined as

$$G'_i = \frac{G_i - \mu_G}{\sigma_G}, \tag{4.3}$$

where $G_i$ is the GPA of student $i$, and $\mu_G$ and $\sigma_G$ are the mean and standard deviation of $G$ for all considered students.
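As a small numerical illustration of the regularization, the sketch below standardizes a handful of made-up entropy values and confirms that standardizing orderliness O = −S gives the same result as (μS − S)/σS. The data are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the population or sample form was used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) std deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical actual entropies for four students.
entropies = [0.6, 0.9, 1.2, 1.5]
orderliness = [-s for s in entropies]          # O = -S

# Left-hand form: standardize orderliness directly.
reg_from_o = z_scores(orderliness)

# Right-hand form: (mu_S - S) / sigma_S.
mu_s, sigma_s = mean(entropies), pstdev(entropies)
reg_from_s = [(mu_s - s) / sigma_s for s in entropies]

for a, b in zip(reg_from_o, reg_from_s):
    print(round(a, 6), round(b, 6))            # the two columns coincide
```

Because Spearman's coefficient depends only on ranks, this standardization does not change the reported correlations; it mainly puts orderliness and GPA on a common scale for plotting.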

4.5. Spearman's rank correlation

In the analysis of relationships between regularized orderliness and regularized GPA, Spearman's rank correlation coefficient [45] is defined as

$$r = 1 - \frac{6\sum_{i=1}^{N} d_i^{2}}{N(N^{2} - 1)}, \tag{4.4}$$

where $N$ is the number of students under consideration, $d_i = r(O_i) - r(G_i)$, with $r(O_i)$ and $r(G_i)$ being the ranks for student $i$'s orderliness and GPA, respectively. Spearman's rank correlation coefficient falls into the range [−1, 1], and the larger the absolute value is, the stronger the correlation is.
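The rank-difference formula can be sketched in plain Python as follows. The helper names and the toy data are hypothetical. The closed form 1 − 6Σd²/(N(N² − 1)) is exact only when there are no tied values, so this sketch rejects ties; real GPA data contain ties, and how they were handled in the study is not stated here.

```python
def ranks(values):
    """Rank values from 1 (smallest) to N; ties are not supported."""
    if len(set(values)) != len(values):
        raise ValueError("ties present: the simple formula does not apply")
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman's r via the rank-difference formula (no ties assumed)."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical regularized orderliness and GPA for five students.
orderliness = [0.1, 0.4, 0.3, 0.9, 0.7]
gpa = [2.0, 3.1, 2.5, 3.9, 3.0]
print(round(spearman(orderliness, gpa), 3))  # 0.9
```

In the toy data only two students swap adjacent ranks, giving Σd² = 2 and r = 1 − 12/120 = 0.9.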

4.6. Prediction approach

Given a characteristic feature vector $\mathbf{x} \in \mathbb{R}^{d}$ of each student, a pair-wise learning-to-rank algorithm, RankNet [46], has been exploited to predict students' academic performance. RankNet tries to learn a scoring function $f: \mathbb{R}^{d} \to \mathbb{R}$, so that the predicted ranks according to $f$ are as consistent as possible with the ground truth. In RankNet, such consistency is measured by the cross entropy between the actual probability and the predicted probability. Based on the scoring function, the predicted probability that a student $i$ has a higher GPA than another student $j$ (denoted as $i \succ j$) is defined as $P(i \succ j) = \sigma(f(\mathbf{x}_i) - f(\mathbf{x}_j))$, where $\sigma(z) = 1/(1 + e^{-z})$ is a sigmoid function. Here we consider a simple regression function $f = \mathbf{w}^{T}\mathbf{x}$, where $\mathbf{w}$ is the vector of parameters. The cost function of RankNet is formulated as follows:

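$$L = -\sum_{(i,j):\, i \succ j} \log \sigma\big(f(\mathbf{x}_i) - f(\mathbf{x}_j)\big) + \lambda\,\Omega(f), \tag{4.5}$$

where $\Omega(f) = \mathbf{w}^{T}\mathbf{w}$ is a regularization term to prevent over-fitting and $\lambda$ is its weight. Given all students' feature vectors and their ranks, we apply gradient descent to minimize the cost function. The gradient of the cost function with respect to parameter $\mathbf{w}$ in $f$ is

$$\frac{\partial L}{\partial \mathbf{w}} = \sum_{(i,j):\, i \succ j} \big(\sigma(f(\mathbf{x}_i) - f(\mathbf{x}_j)) - 1\big)\left(\frac{\partial f(\mathbf{x}_i)}{\partial \mathbf{w}} - \frac{\partial f(\mathbf{x}_j)}{\partial \mathbf{w}}\right) + \lambda\,\frac{\partial \Omega(f)}{\partial \mathbf{w}}. \tag{4.6}$$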
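To make the training procedure concrete, the following is a minimal sketch of a linear pairwise ranker trained by batch gradient descent. It is not the authors' implementation. It assumes a pairwise logistic loss, −log σ(w·(x_i − x_j)) summed over all pairs in which student i has the higher GPA, plus an L2 penalty lam · w·w. The names `train_ranknet_linear`, `lam`, `lr` and `epochs`, their default values and the toy data are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_ranknet_linear(features, gpa, lam=0.01, lr=0.1, epochs=200):
    """Fit w for the scorer f(x) = w . x with a pairwise logistic loss."""
    dim = len(features[0])
    w = [0.0] * dim
    # All ordered pairs (i, j) in which student i has the higher GPA.
    pairs = [(i, j)
             for i in range(len(gpa)) for j in range(len(gpa))
             if gpa[i] > gpa[j]]
    for _ in range(epochs):
        grad = [2.0 * lam * wk for wk in w]          # gradient of lam * w.w
        for i, j in pairs:
            diff = [a - b for a, b in zip(features[i], features[j])]
            score_gap = sum(wk * dk for wk, dk in zip(w, diff))
            coeff = sigmoid(score_gap) - 1.0         # d(-log sigma)/d(gap)
            for k in range(dim):
                grad[k] += coeff * diff[k]
        # Dividing by the number of pairs only rescales the step size.
        scale = lr / max(len(pairs), 1)
        w = [wk - scale * gk for wk, gk in zip(w, grad)]
    return w


# Hypothetical features (e.g. regularized orderliness, diligence) and GPAs.
X = [[1.0, 0.2], [0.5, 0.1], [-0.3, 0.4], [-1.0, -0.5]]
y = [3.9, 3.4, 3.0, 2.5]
w = train_ranknet_linear(X, y)
scores = [sum(wk * xk for wk, xk in zip(w, x)) for x in X]
predicted_order = sorted(range(len(X)), key=lambda k: -scores[k])
print(predicted_order)  # [0, 1, 2, 3]
```

In a setting like the one described in the text, the weights would be fitted on one semester's features and grades and then used to score students with the following semester's features; the resulting ranking can be evaluated with the pairwise AUC.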

Supplementary Material

Electronic supplementary material (rsif20180210supp1.pdf) and supplementary dataset.

Acknowledgements

The authors acknowledge the anonymous reviewers for valuable comments and suggestions. The authors would like to thank Hao Chen, Yan Wang from Nankai University, Qin Zhang, Junming Huang and Jiansu Pu from UESTC for helpful discussions.

Data accessibility

The dataset needed to evaluate the conclusions in the paper has been uploaded as part of the supplementary material. The original data of precise behavioural records, however, cannot be released in order to preserve the privacy of individuals.

Authors' contributions

Y.C., J.G., Y.W., H.Y. and T.Z. are co-first authors. D.L. and T.Z. designed the research. Y.C., J.S., Q.W., Y.W. and H.Y. performed the research. All authors analysed the data. T.Z. drafted the manuscript. Y.C., J.G. and T.Z. revised the manuscript. All authors gave final approval for publication.

Competing interests

We have no competing interests.

Funding

This work was partially supported by the National Natural Science Foundation of China (61603074, 61473060, 61433014, 61502083). D.L. acknowledges the Fundamental Research Funds for the Central Universities (no. ZYGX2016J087). T.Z. acknowledges the Science Promotion Programme of UESTC (no. Y03111023901014006).

References

  • 1. Jamison DT. 1986. Child malnutrition and school performance in China. J. Dev. Econ. 20, 299–309. (doi:10.1016/0304-3878(86)90026-x)
  • 2. Mo-Suwan L, Lebel L, Puetpaiboon A, Junjana C. 1999. School performance and weight status of children and young adolescents in a transitional society in Thailand. Int. J. Obes. 23, 272–277. (doi:10.1038/sj.ijo.0800808)
  • 3. Taras H, Potts-Datema W. 2005. Obesity and student performance at school. J. Sch. Health 75, 291–295. (doi:10.1111/j.1746-1561.2005.00040.x)
  • 4. Stabler B, Clopper RR, Siegel PT, Stoppani C, Compton PG, Underwood LE. 1994. Academic achievement and psychological adjustment in short children. J. Dev. Behav. Pediatr. 15, 1–6. (doi:10.1097/00004703-199402000-00001)
  • 5. Chang SM, Walker SP, Grantham-McGregor S, Powell CA. 2002. Early childhood stunting and later behaviour and school achievement. J. Child Psychol. Psychiatry 43, 775–783. (doi:10.1111/1469-7610.00088)
  • 6. Deary IJ, Strand S, Smith P, Fernandes C. 2007. Intelligence and educational achievement. Intelligence 35, 13–21. (doi:10.1016/j.intell.2006.02.001)
  • 7. Laidra K, Pullmann H, Allik J. 2007. Personality and intelligence as predictors of academic achievement: a cross-sectional study from elementary to secondary school. Pers. Individ. Dif. 42, 441–451. (doi:10.1016/j.paid.2006.08.001)
  • 8. Krapohl E. et al. 2014. The high heritability of educational achievement reflects many genetically influenced traits, not just intelligence. Proc. Natl Acad. Sci. USA 111, 15 273–15 278. (doi:10.1073/pnas.1408777111)
  • 9. Okbay A. et al. 2016. Genome-wide association study identifies 74 loci associated with educational attainment. Nature 533, 539–542. (doi:10.1038/nature17671)
  • 10. Selzam S, Krapohl E, Stumm SV, O'Reilly PF, Rimfeld K, Kovas Y, Dale PS, Lee JJ, Plomin R. 2017. Predicting educational achievement from DNA. Mol. Psychiatry 22, 267–272. (doi:10.1038/mp.2016.107)
  • 11. Bowers AJ, Sprott R, Taff SA. 2013. Do we know who will drop out? A review of the predictors of dropping out of high school: precision, sensitivity, and specificity. High Sch. J. 96, 77–100. (doi:10.1353/hsj.2013.0000)
  • 12. Tamhane A, Ikbal S, Sengupta B, Duggirala M, Appleton J. 2014. Predicting student risks through longitudinal analysis. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1544–1552. New York, NY: ACM Press.
  • 13. Jayaprakash SM, Moody EW, Lauría EJ, Regan JR, Baron JD. 2014. Early alert of academically at-risk students: an open source analytics initiative. J. Learn. Anal. 1, 6–47. (doi:10.18608/jla.2014.11.3)
  • 14. Lakkaraju H, Aguiar E, Shan C, Miller D, Bhanpuri N, Ghani R, Addison KL. 2015. A machine learning framework to identify students at risk of adverse academic outcomes. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1909–1918. New York, NY: ACM Press.
  • 15. Poropat AE. 2009. A meta-analysis of the Five-Factor Model of personality and academic performance. Psychol. Bull. 135, 322–338. (doi:10.1037/a0014996)
  • 16. Vedel A. 2014. The Big Five and tertiary academic performance: a systematic review and meta-analysis. Pers. Individ. Dif. 71, 66–76. (doi:10.1016/j.paid.2014.07.011)
  • 17. O'Connor M, Paunonen S. 2007. Big Five personality predictors of post-secondary academic performance. Pers. Individ. Dif. 43, 971–990. (doi:10.1016/j.paid.2007.03.017)
  • 18. Credé M, Roch SG, Kieszczynka UM. 2010. Class attendance in college: a meta-analytic review of the relationship of class attendance with grades and student characteristics. Rev. Educ. Res. 80, 272–295. (doi:10.3102/0034654310362998)
  • 19. Conard MA. 2006. Aptitude is not enough: how personality and behavior predict academic performance. J. Res. Pers. 40, 339–346. (doi:10.1016/j.jrp.2004.10.003)
  • 20. Schmidt RM. 1983. Who maximizes what? A study in student time allocation. Am. Econ. Rev. 73, 23–28.
  • 21. Romer D. 1993. Do students go to class? Should they? J. Econ. Perspect. 7, 167–174. (doi:10.1257/jep.7.3.167)
  • 22. Durden GC, Ellis LV. 1995. The effects of attendance on student learning in principles of economics. Am. Econ. Rev. 85, 343–346.
  • 23. Credé M, Kuncel NR. 2008. Study habits, skills, and attitudes: the third pillar supporting collegiate academic performance. Perspect. Psychol. Sci. 3, 425–453. (doi:10.1111/j.1745-6924.2008.00089.x)
  • 24. Stinebrickner R, Stinebrickner TR. 2008. The causal effect of studying on academic performance. B.E. J. Econ. Anal. Policy 8, 1–53. (doi:10.2202/1935-1682.1868)
  • 25. Grave B. 2011. The effect of student time allocation on academic achievement. Educ. Econ. 19, 291–310. (doi:10.1080/09645292.2011.585794)
  • 26. Dewald JF, Meijer AM, Oort FJ, Kerkhof GA, Bögels SM. 2010. The influence of sleep quality, sleep duration and sleepiness on school performance in children and adolescents: a meta-analytic review. Sleep. Med. Rev. 14, 179–189. (doi:10.1016/j.smrv.2009.10.004)
  • 27. Trockel MT, Barnes MD, Egget DL. 2000. Health-related variables and academic performance among first-year college students: implications for sleep and other behaviors. J. Am. Coll. Health. 49, 125–131. (doi:10.1080/07448480009596294)
  • 28. Taylor DJ, Vatthauer KE, Bramoweth AD, Ruggero C, Roane B. 2013. The role of sleep in predicting college academic performance: is it a unique predictor? Behav. Sleep. Med. 11, 159–172. (doi:10.1080/15402002.2011.602776)
  • 29. Wald A, Muennig PA, O'Connell KA, Garber CE. 2014. Associations between healthy lifestyle behaviors and academic performance in US undergraduates: a secondary analysis of the American College Health Association's National College Health Assessment II. Am. J. Health. Promot. 28, 298–305. (doi:10.4278/ajhp.120518-quan-265)
  • 30. Fisher RJ. 1993. Social desirability bias and the validity of indirect questioning. J. Consum. Res. 20, 303–315. (doi:10.1086/209351)
  • 31. Paulhus DL, Vazire S. 2007. The self-report method. In Handbook of research methods in personality psychology (eds Robins RW, Fraley RC, Krueger RF), pp. 224–239. New York, NY: The Guilford Press.
  • 32. Wang R, Chen F, Chen Z, Li T, Harari G, Tignor S, Zhou X, Ben-Zeev D, Campbell AT. 2014. StudentLife: assessing mental health, academic performance and behavioral trends of college students using smartphones. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 3–14. New York, NY: ACM Press.
  • 33. Brinton CG, Chiang M. 2015. MOOC performance prediction via clickstream data and social learning networks. In Proceedings of the 2015 IEEE Conference on Computer Communications, pp. 2299–2307. New York, NY: IEEE Press.
  • 34. Zhou M, Ma M, Zhang Y, Sui K, Pei D, Moscibroda T. 2016. EDUM: Classroom education measurements via large-scale WiFi networks. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 316–327. New York, NY: ACM Press.
  • 35. Wang R, Harari G, Hao P, Zhou X, Campbell AT. 2016. SmartGPA: how smartphones can assess and predict academic performance of college students. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 295–306. New York, NY: ACM Press.
  • 36. Baumann C, Krskova H. 2016. School discipline, school uniforms and academic performance. Int. J. Educ. Manage. 30, 1003–1029. (doi:10.1108/ijem-09-2015-0118)
  • 37. Ning B, Van Damme J, Van Den Noortgate W, Yang X, Gielen S. 2015. The influence of classroom disciplinary climate of schools on reading achievement: a cross-country comparative study. Sch. Effect. Sch. Improv. 26, 586–611. (doi:10.1080/09243453.2015.1025796)
  • 38. Guo S, Li L, Zhang D. 2018. A multilevel analysis of the effects of disciplinary climate strength on student reading performance. Asia Pacific Educ. Rev. 19, 1–15. (doi:10.1007/s12564-018-9516-y)
  • 39. Hijazi ST, Naqvi SR. 2006. Factors affecting students' performance. Bangladesh e-J. Sociol. 3, 1–10.
  • 40. Kontoyiannis I, Algoet PH, Suhov YM, Wyner AJ. 1998. Nonparametric entropy estimation for stationary processes and random fields, with applications to English text. IEEE Trans. Inf. Theory 44, 1319–1327. (doi:10.1109/18.669425)
  • 41. Xu P, Yin L, Yue Z, Zhou T. 2018. On predictability of time series. arXiv:1806.03876.
  • 42. Shannon CE. 1948. A mathematical theory of communication. Bell Syst. Techn. J. 27, 379–423. (doi:10.1002/j.1538-7305.1948.tb01338.x)
  • 43. Simpson EH. 1949. Measurement of diversity. Nature 163, 688. (doi:10.1038/163688a0)
  • 44. Kreyszig E. 2010. Advanced engineering mathematics. Hoboken, NJ: John Wiley and Sons.
  • 45. Spearman C. 1904. The proof and measurement of association between two things. Am. J. Psychol. 15, 72–101. (doi:10.2307/1412159)
  • 46. Burges C, Shaked T, Renshaw E, Lazier A, Deeds M, Hamilton N, Hullender G. 2005. Learning to rank using gradient descent. In Proceedings of the 22nd International Conference on Machine Learning, pp. 89–96. New York, NY: ACM Press.
  • 47. Hanley JA, McNeil BJ. 1982. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36. (doi:10.1148/radiology.143.1.7063747)
  • 48. Markowetz A, Blaszkiewicz K, Montag C, Switala C, Schlaepfer TE. 2014. Psycho-informatics: big data shaping modern psychometrics. Med. Hypotheses. 82, 405–411. (doi:10.1016/j.mehy.2013.11.030)
  • 49. Montag C, Duke É, Markowetz A. 2016. Toward psychoinformatics: computer science meets psychology. Comput. Math. Meth. Med. 2016, 1–10. (doi:10.1155/2016/2983685)
  • 50. Chittaranjan G, Blom J, Gatica-Perez D. 2013. Mining large-scale smartphone data for personality studies. Pers. Ubiquitous. Comput. 17, 433–450. (doi:10.1007/s00779-011-0490-1)
  • 51. Kosinski M, Stillwell D, Graepel T. 2013. Private traits and attributes are predictable from digital records of human behavior. Proc. Natl Acad. Sci. USA 110, 5802–5805. (doi:10.1073/pnas.1218772110)
  • 52. Youyou W, Kosinski M, Stillwell D. 2015. Computer-based personality judgments are more accurate than those made by humans. Proc. Natl Acad. Sci. USA 112, 1036–1040. (doi:10.1073/pnas.1418680112)
  • 53. Kim Y, Park JY, Kim SB, Jung IK, Lim YS, Kim JH. 2010. Effect of Internet addiction on academic performance of medical students. Nutr. Res. Pract. 4, 51–57. (doi:10.4162/nrp.2010.4.1.51)
  • 54. Akhter N. 2013. Relationship between Internet addiction and academic performance among university undergraduates. Educ. Res. Rev. 8, 1793–1796.
  • 55. Khan MA, Alvi AA, Shabbir F, Rajput TA. 2016. Effect of Internet addiction on academic performance of medical students. J. Islam. Int. Med. Coll. 11, 48–51.
  • 56. De Montjoye YA, Hidalgo CA, Verleysen M, Blondel VD. 2013. Unique in the crowd: the privacy bounds of human mobility. Sci. Rep. 3, 1376. (doi:10.1038/srep01376)
  • 57. De Montjoye YA, Radaelli L, Singh VK. 2015. Unique in the shopping mall: on the reidentifiability of credit card metadata. Science 347, 536–539. (doi:10.1126/science.1256297)

Associated Data

Supplementary Materials

Electronic Supplementary Material
rsif20180210supp1.pdf (3.2MB, pdf)
Supplementary Dataset


Articles from Journal of the Royal Society Interface are provided here courtesy of The Royal Society