Labour Economics. 2022 Jul 4;78:102220. doi: 10.1016/j.labeco.2022.102220

Can peer mentoring improve online teaching effectiveness? An RCT during the COVID-19 pandemic

David Hardt a, Markus Nagler b, Johannes Rincke c
PMCID: PMC9251909  PMID: 35815179

Abstract

Online delivery of higher education has taken center stage but is fraught with issues of student self-organization. We conducted an RCT to study the effects of remote peer mentoring at a German university that switched to online teaching due to the COVID-19 pandemic. Mentors and mentees met one-on-one online and discussed topics like self-organization and study techniques. We find positive impacts on motivation, studying behavior, and exam registrations. The intervention did not shift earned credits on average, but there is evidence for positive effects on the most able students.

Keywords: Peer Mentoring, Online Education, Covid-19, Teaching Effectiveness, Higher education

1. Introduction

Online delivery of tertiary education has taken center stage. The COVID-19 pandemic has forced virtually all education institutions to switch to online teaching. However, the literature has generally found online teaching to be inferior to classroom-based teaching (e.g., Bettinger et al., 2017; Figlio et al., 2013). Switching to online teaching may thus aggravate the problem that many students struggle to complete their studies successfully (Weiss et al., 2019). Accordingly, students expect and experience negative consequences of the COVID-19-induced shift to online teaching (Aucejo et al., 2020; Bird et al., 2022; Kofoed et al., 2021). This may be due to problems of disorganization among students in online teaching, as argued for massive open online courses (“MOOCs”; e.g., Banerjee and Duflo, 2014; McPherson and Bacow, 2015; Patterson, 2018). One way to improve outcomes of online education could therefore be to assist students through online peer mentoring. Evidence on the effectiveness of such programs is scarce for online teaching, where they may be particularly helpful.

In this paper, we report results of a randomized trial studying the effects of peer mentoring at a German university that, due to the COVID-19 pandemic, switched to online teaching for the spring term 2020. Our sample comprises 691 second term students from the core undergraduate program at the university’s School of Business and Economics. To assess the effectiveness of the program, we combine registry data with survey data on study behavior and motivation that we collected before the start of the examination period in the spring term 2020. Our paper presents the first evidence on the role of remote peer mentoring in online higher education.

The mentoring program focused on students’ general study skills, such as self-organization and study techniques. Mentors and mentees met one-on-one online. The program consisted of five meetings of around 40 minutes each, taking place roughly every two weeks. In each meeting, mentors discussed specific topics, such as mentees’ weekly study schedules, using materials we provided, and followed up on topics from prior meetings. Importantly, we instructed mentors not to discuss any coursework with mentees. As mentors, we hired students from a more advanced term in the same study program. This kind of mentoring could therefore be scaled up easily and at modest cost: including one additional mentee in the program for a three-month period would cost about € 60.

Our setting is common for public universities across the developed world. Each fall, students enroll in the three-year bachelor’s program Economics and Business Studies. In each of the first two terms, students are to pass six courses each worth five credits. Since the second term includes more rigorous courses relative to the first term, many students struggle in this term.1 A key advantage of our setting is that the spring term 2020 was conducted entirely online because the German academic spring term starts and ends later (April to July) than is common internationally. Thus, there are no spillovers from in-person to online classes.

Our main results are as follows. First, the mentoring program improved students’ motivation and study behavior. Treated students report higher overall motivation, are more likely to report having studied throughout the spring term 2020, and are more likely to state that they provided enough effort to reach their goals in this term. In contrast, students’ views on departmental services and on online teaching, whether in the spring term 2020 or in general, seem unaffected. Second, while these effects translate into more exam registrations, the average effect on earned credits is small and insignificant. Similarly, the students’ GPA is unaffected. Third, these results mask a heterogeneity that contrasts with common expectations about the potential impacts of peer mentoring. For instance, we observe a positive effect on students who previously performed well, with no effects on other students. In addition, male students benefit more from the program, if anything. These results contrast somewhat with expectations based on prior research suggesting that weaker students struggle most in online learning (e.g., Bettinger et al., 2017; Figlio et al., 2013) and that female students tend to benefit more from mentoring (e.g., Angrist et al., 2009; Rodriguez-Planas, 2012).

We contribute to research on the effectiveness of online education. This literature has found online teaching to be less effective than classroom-based teaching (Bettinger et al., 2017; Figlio et al., 2013), likely due to problems of disorganization among students in online teaching (e.g., Banerjee and Duflo, 2014).2 Research on specific aspects of the online education production function is scarce. Closest to our work, Oreopoulos et al. (2022) show that assigning online and regular students to an online planning module to construct weekly study schedules and reminders or loose mentoring via text messages does not affect students’ outcomes. We focus on a more comprehensive and intensive mentoring program with regular contact and guidance, which has been shown to matter (e.g., Oreopoulos and Petronijevic, 2018). We thus contribute by providing the first evidence on the role of peer mentoring programs for online higher education.

We also contribute to the experimental literature on mentoring interventions in higher education.3 Closest to our work, Oreopoulos and Petronijevic (2018) experimentally study different coaching methods. They find that close one-on-one peer mentoring programs are effective in raising student outcomes. Oreopoulos and Petronijevic (2019) show evidence from several nudging interventions that did not shift students’ academic outcomes, but improved their motivation. Angrist et al. (2009) test the impact of academic support and financial incentives on students’ GPA, finding performance increases among female students. However, they find no effects for academic support without financial incentives. Our program is targeted more towards individual mentor-mentee interactions, is more structured, and takes place in an online environment where mentoring may be more important. We thus contribute by providing the first evidence on the effectiveness of close (peer) mentoring in online education and by extending the small experimental literature on mentoring effects in higher education.4

Finally, we contribute to research on education responses to the COVID-19 pandemic, most of which has focused on primary or secondary education (e.g., Angrist et al., 2022; Bacher-Hicks et al., 2020; Grewenig et al., 2021). The closest paper is Carlana and La Ferrara (2020), who experimentally assigned struggling Italian middle school students an online tutor during the pandemic and report positive effects on performance and well-being. They specifically target students from less affluent backgrounds. Our paper contributes by studying the effectiveness of online mentoring in higher education. Our strongest effects are for already well-performing students, indicating that peer mentoring in online education may increase inequality in outcomes. This is very different from tutoring, where we find a reduction in inequality in a follow-on project in a similar setting (Hardt et al., 2022). Despite the nearly universal shift towards online teaching in higher education due to the pandemic, evidence on improving the effectiveness of online teaching in this context remains scarce. This is in spite of emerging evidence showing that the shift led to worse learning outcomes and depressed student expectations and outlooks (Altindag et al., 2021; Bird et al., 2022; De Paola et al., 2022; Jaeger et al., 2021; Kofoed et al., 2021; Rodriguez-Planas, 2022a, 2022b).

We acknowledge that the situation during the pandemic was special relative to regular online education settings. We nevertheless believe that our results carry lessons beyond the pandemic and discuss this in more detail in Section 2.

2. Experimental setting and design

2.1. Experimental setting

Our setting is typical of public universities in the Western world. The undergraduate study program Economics and Business Studies at the intervention university requires students to collect 180 credits to graduate, which is expected to take three years. The study plan assigns courses worth 30 credits to each term. Administrative data show that large shares of students do not complete 30 credits per term, delaying their graduation. Survey data collected from earlier cohorts of students suggest that most students do not work full-time even when summing up hours studied and hours worked to earn income.5 The salient study plan and target of achieving 30 credits per term, the fact that most students register for exams worth these credits, and the fact that students do not seem to work enough to pass these exams together suggest that many students have problems with self-organization and/or with studying efficiently.

Given prior findings on such problems in online education, these issues were most likely exacerbated by the switch to online teaching. In addition, online education may have affected other margins, such as instruction quality, peer interactions, differences in the study environment, distractions, and motivation. Given the program content, the mentoring primarily intervened in students’ self-organization, helped them to study more efficiently, and addressed students’ handling of distractions as well as their motivation.

Due to the COVID-19 pandemic, all courses of the university were conducted online in the spring term 2020. To this end, the university relied on Zoom, an online video tool used widely during the pandemic at universities around the globe. A key advantage of our setting is that the spring term 2020 was conducted entirely online because the German academic year starts and ends later than is common internationally.6 It is therefore cleaner than would be possible in similar settings during the pandemic since there are no spillovers from in-person to online teaching.

We leveraged the COVID-19 pandemic to assess the effectiveness of mentoring programs when students lack formal in-person interaction at university. However, in many respects, the pandemic situation differed from other online education settings. First, most instructors had little experience teaching virtually via Zoom (Altindag et al., 2021; Orlov et al., 2021). Second, students experienced a variety of negative shocks, from cancelled internships to adverse labor market and health events for their family members (Jaeger et al., 2021). Third, Germany went into a partial lockdown from the end of March to early May 2020. Thus, students may have suffered from isolation and depression more than in other online education settings. Finally, the mentors themselves may have been affected by the challenges that came with the pandemic. We still believe that our results carry implications beyond the pandemic. First, we do not observe strong impacts on survey questions that should capture dimensions of such pandemic effects. For instance, treated students do not report being in touch with peers more than control group students, nor do they feel differentially valued by the department. Second, similar impacts on student motivation and study behavior are found in the literature analyzing comparable interventions outside pandemics (Oreopoulos et al., 2020). Third, if anything, the fact that we implemented our intervention during a partial lockdown should make the treatment more effective than typical interventions, particularly for those students most affected by the pandemic (Rodriguez-Planas, 2022a, 2022b). However, we find similarly muted average effects as mentoring interventions outside the pandemic (e.g., Oreopoulos and Petronijevic, 2019) and show that the intervention had the strongest impact on students who already performed well before the onset of the pandemic. We also do not see differential impacts on students who had greater local support, as proxied by being from the region around the university.

2.2. The mentoring program

In the first term week of the spring term 2020, students in the treatment group were informed via e-mail about the launch of a mentoring program for students in the second term. They were invited to register for the program through a webpage.7

The program focused on self-organization and on making mentees aware of potential problems of studying online. We designed it to involve five one-on-one online meetings between mentors and mentees, taking place roughly every two weeks. The average length of meetings reported by the mentors was around 40 minutes, well within our prior target length of 30–45 minutes. For each meeting, we provided mentors with supporting materials. Because our sample is rather small, we combined several aspects of mentoring that the prior literature suggests are effective into one program.

The first meeting focused on mentees’ expectations of their term performance and contrasted these expectations with the average performance of previous cohorts to target student overconfidence (Lavecchia et al., 2016). Mentors also provided advice on self-organization when working from home, targeting student disorganization in online environments (e.g., Banerjee and Duflo, 2014). In the second meeting, mentors and mentees formulated specific goals for the mentee. This included study goals (weekly study schedule, see Figs. A.1 and A.2 in the Online Appendix), courses to be taken, and performance-based goals (credits earned), based on research on the effectiveness of goal-setting (e.g., Clark et al., 2020). The third meeting focused on exam preparation (timing of exams, implications for mentees’ preparation), targeting students’ alignment of long-term goals and short-term behavior (e.g., Bettinger and Baker, 2014).

The fourth meeting focused on studying effectively. This included the presentation of a simplified four-stage learning model (see Fig. A.3 in the Online Appendix) and how to implement these learning strategies in practice, targeting students’ study skills (e.g., Angrist et al., 2009). The final meeting focused on the mentee’s exam preparation, including a time schedule providing guidance on how to specifically prepare for exams. This targeted students’ underestimation of the time required to complete a task (e.g., Oreopoulos et al., 2022). In all meetings, mentors and mentees additionally discussed issues the mentee was currently facing, similar to general counseling services in other settings (e.g., Rodriguez-Planas, 2012). We instructed mentors to ensure that the information was only provided to mentees and not to other students.

In summary, our mentoring program targets elements similar to those in the program studied by Oreopoulos et al. (2022). They find no effects of mentoring on student outcomes and only slight effects on study behavior. Relative to their intervention, our program is more intensive, offering regular contact and guidance, which may be required to shift academic outcomes. In line with this, Oreopoulos and Petronijevic (2018) find that among several student coaching methods, one-on-one peer mentoring was effective in raising student outcomes while technology-based interventions were ineffective. These elements may be particularly important in an online context.

In the control group, there was no mentoring. However, the university provided general information on the topics that we focus on through its website. This included advice on how to work from home and general information regarding the online implementation of courses.8

2.3. Recruitment and training of mentors

We hired 15 peer mentors, with each mentor handling ten mentees at most. The mentoring program’s capacity was therefore 150 students.9 All mentors were students who had successfully completed the first year and were enrolled in the fourth term of the study program during the spring term 2020. They had good program and high-school GPAs and were more likely than average to hold student jobs alongside their studies. Among all applicants, we selected those we felt would be the most able mentors. Eight of the mentors were female and seven were male.10

All mentors took part in an online kick-off meeting where we explained the purpose and the structure of the program and laid out the planned sequence and contents of the mentoring sessions. They were not informed that the program was being evaluated experimentally, but were informed that the program’s capacity was limited and that a random subset of students in the second term was invited to participate. They subsequently took part in a training by professional coaches. The training focused on communication skills and took about five hours. Three weeks after program start, the mentors took part in a short supervision meeting with the coaches. In addition, we sent regular e-mails to the mentors (before each of the five meetings) and answered questions. Short feedback conversations also took place, mainly for us to get a sense of how the program was being implemented. An overview of the timing of the project is displayed in Fig. 1.

Fig. 1. Timeline of Intervention.

Note: This figure shows the timeline of our experiment, which took place in the spring term 2020. The lecture period in the spring term 2020 started on April 20th, 2020 and ended on July 27th, 2020. The term officially ended on September 30th, 2020.

2.4. Sampling and random assignment to treatment and control group

About 850 students enrolled for the study program Economics and Business Studies in the fall of 2019. We excluded students who dropped out after the first term, who were not formally in their second term in the spring term 2020 (e.g., because of having been enrolled with some credits at another university before), and who completed less than a full course (5 credits) in the winter term 2019/20, their first term at university.11

This leaves us with a sample of 694 students. Using a stratified procedure (strata variables: gender and earned credits in the winter term 2019/20), we randomly assigned all students in the sample to treatment or control group such that both groups were of equal size. We contacted all students in the treatment group via e-mail and offered them the chance to participate in the mentoring program. Students in the control group were not contacted and could not sign up for the program. After the intervention ended, we had to drop another three students from the sample who got credited for second-term courses earned elsewhere.12 Our final sample thus consists of 691 students (344 in treatment and 347 in control).13

We opted for a design that offers mentoring services to a random subset of the student population and against the alternative of an oversubscribed lottery for two main reasons. First, mentoring services are typically offered to students directly, without any pre-screening of students who might be interested in such services. Our design is thus a natural one and provides the most informative estimates. Second, in the specific situation of the pandemic and the uncertainties associated with it, we were concerned that an oversubscribed lottery could negatively affect students who in the end would not get an offer.

Of the 344 students in the treatment group, 142 signed up for the mentoring. We randomly assigned these 142 students to mentors. To achieve a balanced gender mix of mentee-mentor pairs, we used the mentees’ gender as a strata variable in the assignment. Among students in the treatment group who signed up for the mentoring program, about 54 percent were female. Given that eight of the mentors were female and seven were male, the number of pairs in each of the mentee-mentor gender combinations was similar.
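To make the assignment procedure concrete, the following is a minimal Python sketch of stratified random assignment under our assumptions about the data layout; the column names (female, credits_wt, credits_bin) are hypothetical stand-ins, and the actual randomization code may have differed (e.g., in how the credits variable was binned). The gender-stratified mentee-mentor matching described above can be implemented analogously.

```python
import numpy as np
import pandas as pd

def stratified_assignment(df, strata_cols, seed=2020):
    """Randomize treatment within strata so that treatment and control
    groups are (close to) equal-sized and balanced on the strata variables."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    out["treatment"] = 0
    for _, idx in out.groupby(strata_cols).groups.items():
        shuffled = rng.permutation(np.asarray(idx))
        # first half of each shuffled stratum is assigned to treatment
        out.loc[shuffled[: len(shuffled) // 2], "treatment"] = 1
    return out

# Hypothetical usage: gender and (binned) first-term credits as strata variables
# students = pd.read_csv("registry.csv")
# students["credits_bin"] = pd.cut(students["credits_wt"], bins=[-1, 10, 20, 25, 30, 60])
# students = stratified_assignment(students, ["female", "credits_bin"])
```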

3. Data and empirical strategy

3.1. Data

Survey data

After the fifth round of mentee-mentor meetings, we invited all 691 students in the experimental sample to an online survey. It was conducted on an existing platform at the department that is regularly used to survey students. Students who completed the survey, which lasted around ten minutes, received a payoff of € 8.00. The survey elicited the students’ assessment of their study effort, their satisfaction with the department’s effort to support online learning during the spring term 2020, views on online teaching generally, and beliefs about their academic achievement. The full set of questions is shown in the Online Appendix. We use all survey responses submitted until the beginning of the examination period to avoid spillovers from exams to the survey. 404 students (58.5% of the sample) participated and participation was balanced across treatment and control group.

Administrative data

We collected registry data from the university in mid-October 2020 to measure outcomes related to academic achievement in the students’ first study year. Our first outcome of interest is the number of credits for which students register (students receive five credits for each course that they pass); this measures attempted examinations and is interpreted as student effort. Students are not allowed to participate in exams if they did not register in advance. Since the registration window typically closes after only half of the term has passed, students have to commit rather early to whether they want to take an exam. Second, our main outcome of interest is credits earned in the spring term 2020. This most directly measures the students’ academic achievement during the intervention term. This variable includes all credits earned in the spring term 2020, i.e., credits earned from exams that were supposed to be taken in the spring term 2020, credits earned from retaken exams that were supposed to be taken in the winter term 2019/20, and credits earned from other exams.14 Note, however, that this might be a slow-moving variable since study effort has cumulative gains over time.

Students can in principle repeat each exam twice.15 An additional cost of not passing specific exams at the first attempt is that students must have passed 50 credits in total by the end of the third term. If they do not reach this goal, they need to leave the program. Due to the COVID-19 pandemic, these restrictions were loosened somewhat in that exam attempts did not count towards the total number of attempts for each exam. In addition, students were given an additional term to reach the goal of 50 credits. Thus, the opportunity costs of not passing courses decreased in this term. Note, however, that this is true both for the treatment and the control group and should thus not affect treated students differentially.

Note that in the second term, each exam is worth 5 credits since each compulsory module is graded only on the final exam. However, in principle, courses may also be graded through exams covering parts of the course, worth less than the 5 credits the course is worth. For example, courses may require students to give a presentation in addition to the final exam, with the credits split between both required assessments. Many courses in the first term use this option, which is why students can earn credits that are not multiples of 5. This is why we use credits registered for and credits earned as dependent variables instead of using the number of courses or exams that students register for and pass.

Third, we examine the impact on students’ GPA for passed courses, running from 1 (passed) to 4 (best grade).16 Given that we expect (and find) impacts of the treatment on the two main outcomes, however, treatment effects on GPA are not directly interpretable. This is in contrast to Angrist et al. (2009), whose main measure of achievement is students’ GPA. The reason for this difference is that in the German system, students are typically free to choose the timing of their courses even when a core curriculum is suggested. In addition, many students do not attempt to complete the curriculum in the suggested time period, making the extensive margin of how many courses to take more relevant than in North America. Following Angrist et al. (2009), we did not exclude withdrawing students from the sample; these students were coded as having zero attempted and earned credits and missing GPA.

The exams took place after the end of the teaching period between end of July and September 2020. In addition to the data on exams, the registry data also contain background information on individual students. The characteristics include information on enrollment, gender, age, type of high school completed, and information on high-school GPA (coded from 0 as the fail grade to 4 as the best grade).

3.2. Balancing checks and take-up

Balancing checks

Table 1 reports differences in means and standardized differences in students’ characteristics. The characteristics comprise gender, age (in years), high-school GPA, a dummy for the most common type of high school certificate (“Gymnasium”), a dummy for students who obtained their high school certificate abroad, credits earned in the first term, a dummy for being in their first year at university, and a dummy for part-time students.17 As can be seen from Table 1, the treatment and control groups were well balanced across all characteristics.

Table 1.

Summary statistics by treatment status.

                              Control    Treatment    Difference    Std. diff.
Female                         0.46        0.47          0.01          0.01
                              (0.50)      (0.50)        (0.04)
Age                           21.29       21.26         -0.03         -0.01
                              (2.48)      (2.69)        (0.20)
High-school GPA                2.37        2.38          0.01          0.01
                              (0.57)      (0.61)        (0.05)
Top-tier high-school type      0.76        0.74         -0.01         -0.02
                              (0.43)      (0.44)        (0.03)
Foreign univ. entrance exam    0.07        0.08          0.02          0.04
                              (0.25)      (0.27)        (0.02)
Credits WT                    25.23       25.26          0.02          0.00
                              (9.27)      (8.93)        (0.69)
First enrollment               0.63        0.68          0.05          0.08
                              (0.48)      (0.47)        (0.04)
Part-time student              0.09        0.08         -0.00         -0.01
                              (0.28)      (0.28)        (0.02)
Obs.                            347         344           691           691

Note: This table shows means of administrative student data (standard deviations in parentheses) by treatment status, together with differences between means and corresponding standard errors (in parentheses) and standardized differences. In the line where we report high-school GPA we need to drop 11 observations where we do not have information on students’ high-school GPA. CreditsWT is the number of (ECTS) credits earned by the student in the winter term 2019/20.
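As an illustration, a balance table of this kind can be computed along the following lines; this is a sketch with hypothetical column names, and it uses the common normalized-difference formula (difference in means divided by the square root of the average of the two group variances), which may differ in detail from the exact definition used in the paper.

```python
import numpy as np
import pandas as pd

def balance_table(df, covariates, treat_col="treatment"):
    """Group means, difference in means with its standard error,
    and the normalized (standardized) difference for each covariate."""
    treat, ctrl = df[df[treat_col] == 1], df[df[treat_col] == 0]
    rows = []
    for var in covariates:
        mt, mc = treat[var].mean(), ctrl[var].mean()
        vt, vc = treat[var].var(ddof=1), ctrl[var].var(ddof=1)
        se_diff = np.sqrt(vt / treat[var].count() + vc / ctrl[var].count())
        rows.append({
            "variable": var,
            "control": mc,
            "treatment": mt,
            "difference": mt - mc,
            "se_diff": se_diff,
            "std_diff": (mt - mc) / np.sqrt((vt + vc) / 2),
        })
    return pd.DataFrame(rows)

# Hypothetical usage:
# print(balance_table(students, ["female", "age", "hs_gpa", "credits_wt"]).round(2))
```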

To assess the quality of our survey data, we repeat the balancing checks using our survey respondents. We also study selection into survey participation by mean-comparison tests between survey participants and non-participants. Table B.2 in the Online Appendix shows that students who participated in the survey differ slightly from students who did not participate. Participants are somewhat younger, more likely to be female, have better high-school GPAs, earned more credits in the winter term, and are more likely to be part-time students. Importantly, the likelihood of survey completion is unrelated to treatment assignment. Within the sample of participants, treatment and control group are balanced across all characteristics.

Take-Up

Our main measure of take-up is whether students signed up for the program. Out of the 344 students assigned to the treatment group, 142 signed up for the program. Table A.1 in the Online Appendix shows that these students are slightly older, more likely female, and more likely to have a foreign university entrance exam than students from the treatment group who did not sign up. The differences are small, however.18 Students who registered for the program could drop out at any time with no penalty. Table 2 shows the impact of offering treatment on the likelihood of sign-up, i.e., the first stage of our IV regressions.

Table 2.

Take-up and first stage estimates.

Dependent Variable:     Sign-up      Sign-up & attended      Sign-up & attended
                                     at least one meeting    all five meetings
                        (1)          (2)                     (3)
Treatment               0.41***      0.37***                 0.32***
                        (0.03)       (0.03)                  (0.03)
Underid. Test           179.60       158.01                  130.20
Weak IV Test            241.30       203.70                  159.52
Obs.                    691          691                     691

Note: This table shows results of regressions of program take-up on initial treatment assignment controlling for student gender (where possible) and credits earned in the winter term 2019/20, i.e., the first stage of our instrumental variable regressions. Column (1) uses initial program sign-up as the dependent variable and measure of take-up. This is our main take-up measure. Column (2) uses a dummy for having met at least once with the mentor as the dependent variable. Column (3) uses an indicator of whether students met five times with their mentors as the dependent variable. The underidentification and weak identification tests are the heteroskedasticity-robust Kleibergen and Paap (2006) rk LM and Wald F statistics, respectively, as reported by the ivreg2 Stata command (Baum et al., 2007). Standard errors are robust. * p<0.10, ** p<0.05, *** p<0.01.

As mentioned before, students assigned to the control group could not sign up for the program. Table 2 shows that, of the 41% of students in the treatment group who signed up for the program (Column 1), some did not attend any meeting: 37% of students in the treatment group attended at least one meeting (Column 2). Further students leaving the program during the spring term 2020 reduced the share of students in the treatment group who attended all five meetings to 32% (Column 3).19

3.3. Estimation

To evaluate the intent-to-treat (ITT) effects of the peer mentoring program, we estimate the equation

$$y_i = \alpha + \beta\, \mathit{Treatment}_i + \gamma_1\, \mathit{Female}_i + \gamma_2\, \mathit{CreditsWT}_i + \epsilon_i, \qquad (1)$$

where $y_i$ is the outcome of student $i$, $\mathit{Treatment}_i$ is an indicator for (random) assignment to the mentoring treatment group, $\mathit{Female}_i$ is a dummy for female students, and $\mathit{CreditsWT}_i$ is the number of (ECTS) credits earned by the student in the winter term 2019/20, the first term in which the students were enrolled. Each of the outcomes is thus regressed on the indicator for random assignment to the treatment group and the strata variables. We report robust standard errors.

Since treatment take-up was imperfect (i.e., not all treatment group students actually received mentoring services), we additionally run IV regressions using the random treatment assignment as an instrument for actual program take-up. Our main variable for measuring take-up is program sign-up (i.e., our first stage can be seen in Column 1 of Table 2). The first stage is expectedly strong, with a Kleibergen and Paap (2006) F statistic of around 240. The IV estimations essentially scale up the ITT point estimates of the treatment effect by the inverse of the sign-up rate.
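For illustration, the ITT and IV specifications could be implemented as in the following Python sketch (using statsmodels and linearmodels); the variable names credits_earned, signup, treatment, female, and credits_wt are hypothetical stand-ins for the registry data, and the original analysis may have been run in different software.

```python
import statsmodels.api as sm
from linearmodels.iv import IV2SLS

# ITT: regress the outcome on random assignment and the strata variables (Eq. 1), robust SEs
exog_itt = sm.add_constant(students[["treatment", "female", "credits_wt"]])
itt = sm.OLS(students["credits_earned"], exog_itt).fit(cov_type="HC1")
print(itt.summary())

# IV / treatment-on-the-treated: instrument program sign-up with random assignment
iv = IV2SLS(
    dependent=students["credits_earned"],
    exog=sm.add_constant(students[["female", "credits_wt"]]),
    endog=students["signup"],
    instruments=students["treatment"],
).fit(cov_type="robust")
# first-stage diagnostics (cf. Table 2) are available via iv.first_stage
print(iv.summary)
```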

For several reasons, we considered it likely that the effects would be heterogeneous. First, prior evidence on online education shows more negative effects for weaker students (e.g., Bettinger et al., 2017; Figlio et al., 2013). We thus expected heterogeneous effects by credits earned in the winter term 2019/20.20 Second, male students suffer more from online relative to in-person teaching (e.g., Figlio et al., 2013; Xu and Jaggars, 2014). However, take-up rates in mentoring programs seem higher for female students (e.g., Angrist et al., 2009). Thus, while we expected the effects of mentoring on outcomes among randomly chosen students to be larger for male students, the relative effects of having been offered a mentor and of mentoring conditional on sign-up were unclear. We study treatment effect heterogeneities by including an interaction between the variable capturing the dimension of heterogeneity and the treatment indicator, jointly with the variable capturing the dimension itself.
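A corresponding sketch of the interaction specifications (same hypothetical variable names as above):

```python
import statsmodels.formula.api as smf

# Heterogeneity by prior performance: treatment * credits_wt expands to both main
# effects plus the interaction, as described above
het_prior = smf.ols("credits_earned ~ treatment * credits_wt + female",
                    data=students).fit(cov_type="HC1")

# Heterogeneity by gender
het_gender = smf.ols("credits_earned ~ treatment * female + credits_wt",
                     data=students).fit(cov_type="HC1")

print(het_prior.params[["treatment", "treatment:credits_wt"]])
print(het_gender.params[["treatment", "treatment:female"]])
```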

We investigated additional heterogeneities that we described as less central (and likely not to be reported) in the pre-analysis plan. First, we tested whether the effects of mentoring are larger for mentees mentored by female rather than male mentors, as well as mentor-mentee gender interactions (e.g., Dee, 2005, 2007; Hoffmann and Oreopoulos, 2009). We study this in Table A.5 in the Online Appendix, but find no differences. Second, the pre-analysis plan also specified that we would test whether the effect on students enrolled at university for the first time differs from that on students who had been enrolled before. Again, this is not the case (not reported).

4. Results

4.1. Effects on study behavior and motivation

We first study the effects of the mentoring program on self-reported study behavior and motivation and contrast those with effects on the perception of department services and online teaching more generally. Fig. 2 shows results from ITT and from IV estimations (instrumenting sign-up by treatment assignment). We show the treatment effects along with 95% confidence intervals. All dependent variables are responses on a five-point Likert scale, with higher values indicating higher agreement with the statement.21

Fig. 2. Impacts of Mentoring on Survey Outcomes.

Note: This figure shows impacts of peer mentoring on survey responses adapting Eq. (1). For each question, the first row uses OLS regressions, estimating intent-to-treat effects (labeled “ITT”). The second row uses (random) treatment assignment as an instrument for initial program sign-up, estimating treatment-on-the-treated effects (labeled “IV”). For the full set of survey questions, see the Online Appendix. Diamonds indicate the point estimates, bars the associated 95% confidence bounds. Full diamonds indicate significance at the ten percent level. Hollow diamonds indicate p-values above 0.10. The corresponding tables can be found in the Online Appendix. Standard errors are robust.

Panel (a) shows treatment effects on students’ assessment of their motivation and study behavior in the spring term 2020. The mentoring program specifically targeted these outcomes. The first two rows show positive impacts on students’ motivation. The estimated effect in the IV estimation amounts to around 18% of the control group mean. The next two rows show economically and statistically significant effects on students’ response to whether they managed to study continuously throughout the spring term 2020. The subsequent two rows show smaller effects on students’ response to whether they think they prepared for exams in time.

The final two rows again show significant effects on students’ response to the question whether they think they provided enough effort to reach their goals. To complement these results, we also estimate average standardized effects analogous to Kling et al. (2004) and Clingingsmith et al. (2009) in Online Appendix Table B.6. This part of the survey shows an average standardized effect of around 0.16 standard deviations (p-value = 0.048).22 We believe that it is unlikely that these results merely reflect impacts on students’ mental health since in a more intensive tutoring intervention in later terms, we find no impacts on any of several dimensions of student mental health (Hardt et al., 2022). Thus, the treatment seems to have worked through affecting study motivation and study behavior.23

In line with our expectations, Panel (b) shows that the treatment did not shift views on departmental services in general, an aspect that the mentoring program was not directly concerned with. The items include student services, communication by the department, whether there is a clear departmental contact person, and students’ views on whether the department cares about their success or takes their concerns seriously. The most pronounced effect is for students’ feeling that the department cares about their success, with point estimates of 7% relative to the control group mean; this effect is, however, insignificant. Panel (c) reports results on students’ general views on online teaching. Again, we did not expect to see treatment effects on the respective outcomes, as the mentoring program did not target students’ general perceptions of online teaching. The specific items include students’ satisfaction with the department’s online teaching content and implementation in the spring term 2020. We also asked students whether they feel online teaching can work in principle and whether it should play a large role in the future. Both effects are statistically and economically insignificant. We additionally analyze the students’ response to the question whether they frequently interacted with other students. This null result provides suggestive evidence that the program may not crowd out interaction among students, but further research is needed to document whether the finding is robust.24

We did not have baseline data and therefore did not pre-register minimum detectable effects. Considering a Z-score index (Kling et al., 2007) of all questions in the block of questions on motivation, continuous studying, timely exam preparation, and sufficient effort, we find that the treatment shifts the index by 0.19 standard deviations (p-value=0.05). The ex-post power for this effect is 0.66.
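The index construction and the ex-post power calculation could look roughly as follows; this is a sketch with hypothetical survey item names, and the paper's exact implementation may differ.

```python
import numpy as np
import statsmodels.formula.api as smf
from scipy import stats

def z_score_index(df, items, control_mask):
    """Kling-Liebman-Katz style index: standardize each item by the
    control-group mean and SD, then average across items."""
    z = [(df[c] - df.loc[control_mask, c].mean()) / df.loc[control_mask, c].std(ddof=1)
         for c in items]
    return np.mean(z, axis=0)

# Hypothetical item names for the motivation / study-behavior block
items = ["motivation", "studied_continuously", "prepared_in_time", "enough_effort"]
survey["z_index"] = z_score_index(survey, items, survey["treatment"] == 0)
itt = smf.ols("z_index ~ treatment + female + credits_wt", data=survey).fit(cov_type="HC1")

# Ex-post power at the 5% level (two-sided, normal approximation)
z_stat = abs(itt.params["treatment"]) / itt.bse["treatment"]
crit = stats.norm.ppf(0.975)
power = stats.norm.cdf(z_stat - crit) + stats.norm.cdf(-z_stat - crit)
print(itt.params["treatment"], power)
```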

Overall, our results suggest that the peer mentoring program improved students’ motivation and study behavior, hence working as intended. One concern is that these effects are driven by experimenter demand effects. It is therefore reassuring that we do not see effects on survey items that are unrelated to peer mentoring. In particular, the questions on students’ assessment of department services in general should also be susceptible to experimenter demand effects, yet we find no differences there. We further believe that demand effects are unlikely because the survey was sent from the e-mail address of the dean of studies, a departmental e-mail address frequently used to inform students on departmental issues and to reach out to students. The survey was sent to all students and not targeted at treated students only. Finally, it asked students for their opinion on how the department fared in the spring term 2020. We thus believe that experimenter demand effects are not an issue. We now investigate whether the positive effects on motivation and study behavior translated into improved academic outcomes.

4.2. Average impacts on primary outcomes

Table 3 shows differences between students in treatment and control group for academic outcomes.

Table 3.

Average impacts of online peer mentoring on student outcomes.

Dependent Variable:    Credits registered       Credits earned          GPA
                       ITT        IV            ITT        IV           ITT        IV
                       (1)        (2)           (3)        (4)          (5)        (6)
Treatment              1.39**     3.37**        0.54       1.30         0.03       0.07
                       (0.70)     (1.69)        (0.61)     (1.47)       (0.05)     (0.11)
Credits WT             0.28***    0.28***       0.81***    0.81***      0.05***    0.05***
                       (0.05)     (0.05)        (0.03)     (0.03)       (0.00)     (0.00)
Female                 1.91***    1.74**        1.28**     1.21*        0.01       0.00
                       (0.71)     (0.70)        (0.62)     (0.62)       (0.05)     (0.05)

Mean dep.              26.33      26.33         17.66      17.66        2.52       2.52
Underid. Test                     179.6                    179.6                   172.6
Weak IV Test                      241.3                    241.3                   241.7
Obs.                   691        691           691        691          595        595

Note: This table shows impacts of peer mentoring on administrative student outcomes using Eq. 1. The odd-numbered columns use OLS regressions, estimating intent-to-treat effects (labeled “ITT”). The even-numbered columns instrument a dummy for initial program sign-up by the (random) treatment assignment variable, estimating treatment-on-the-treated effects (labeled “IV”). Columns (1) and (2) use the number of credits for which students registered in the spring term 2020 as the dependent variable. Columns (3) and (4) use the number of earned credits in the spring term 2020 as the dependent variable. Columns (5) and (6) use students’ average GPA (running from 1=worst to 4=best) among earned credits in the spring term 2020 as the dependent variable. CreditsWT is the number of (ECTS) credits earned by the student in the winter term 2019/20. The number of observations differs from Columns (1)-(4) since we have several students who do not earn any credits and thus do not have a GPA. The underidentification and weak identification tests are the heteroskedasticity-robust Kleibergen and Paap (2006) rk LM and Wald F statistics, respectively, as reported by the ivreg2 Stata command (Baum et al., 2007). Standard errors are robust. * p<0.10, ** p<0.05, *** p<0.01.

The odd-numbered columns show ITT estimates, the even-numbered columns show corresponding IV estimates, where we use the indicator for assignment to the treatment group as an instrumental variable for program sign-up. Column (1) shows the impacts on credits registered for. Students who received a treatment offer register for around 1.4 more credits than students who did not. Column (2) shows that students who signed up for the treatment register for around 3.4 more credits than those who did not. This corresponds to around 67% of an additional course and 13% of the control group mean. Column (3) shows that students with a treatment offer earn around 0.5 more credits than control students. Students who signed up earn around 1.3 more credits (Column 4), which implies that students pass around 40% of the additional credits for which they register. These results are statistically insignificant, however.25 Finally, Columns (5) and (6) show that students’ GPA is unaffected, indicating that the (modest) average increase in attempted credits did not come at the expense of worse average grades.26

Overall, Table 3 suggests that the effects of the mentoring program on study behavior and motivation did translate into more exam registrations, but the impacts on performance (in terms of credits earned) are too noisy to rule out either zero or moderate effects.

In the pre-analysis plan, we stated a minimum detectable ITT effect for our main outcome (credits earned) of 2.2 credit points, or 21% of a standard deviation. This is a large effect, but from an ex-ante perspective, it did not seem impossible to strongly shift academic performance. For instance, Angrist et al. (2009) report ITT estimates of providing mentoring services combined with financial incentives on their main outcome of up to 35% of a standard deviation. On the one hand, our intervention did not involve financial incentives, suggesting smaller effect sizes. On the other hand, we implemented it in a pandemic situation involving a complete shift towards online education and very limited opportunities for students to interact socially with other students and faculty. In addition, our mentoring program was much more structured than the one studied by Angrist et al. (2009) and involved more commitment from students. The latter differences suggest at least the (ex ante) possibility of sizable effects.

The estimates in Table 3 imply that we can reasonably exclude effect sizes larger than 16% of a standard deviation in the control group. Our paper thus contributes to the literature on mentoring in higher education by showing that even in a situation with forced online education (i.e., a situation where mentoring arguably could be more powerful than normally), one-on-one peer mentoring is unlikely to significantly shift the average academic achievement of students.27 To obtain a more nuanced picture of the effects of the intervention, we next study the heterogeneity of effects by prior academic performance and gender.

4.3. Heterogeneity of effects

Prior evidence on online education suggests that its negative effects are larger for weak and for male students (e.g., Bettinger et al., 2017; Figlio et al., 2013; Xu and Jaggars, 2014). We therefore investigate the heterogeneity of our effects in Table 4.

Table 4.

Intent-to-treat effects by student characteristics.

                              By prior performance                    By gender
                              Credits      Credits      GPA           Credits      Credits      GPA
                              registered   earned                     registered   earned
                              (1)          (2)          (3)           (4)          (5)          (6)
Treatment                     -0.55        -4.13***     0.14          2.67***      0.91         0.03
                              (2.92)       (1.57)       (0.21)        (0.99)       (0.83)       (0.07)
Treatment × Credits WT        0.08         0.18***      -0.00
                              (0.10)       (0.06)       (0.01)
Treatment × Female                                                    -2.73*       -0.79        0.01
                                                                      (1.40)       (1.22)       (0.10)
Credits WT                    0.24***      0.72***      0.05***       0.28***      0.81***      0.05***
                              (0.07)       (0.04)       (0.01)        (0.05)       (0.03)       (0.00)
Female                        1.91***      1.29**       0.01          3.27***      1.67*        0.00
                              (0.71)       (0.62)       (0.05)        (1.00)       (0.89)       (0.07)

Mean dep.                     26.33        17.66        2.52          26.33        17.66        2.52
Obs.                          691          691          595           691          691          595

Note: This table shows ITT estimations of the impact of peer mentoring on administrative student outcomes by prior performance and gender adapting Eq. 1. Columns (1) to (3) estimate interactions by prior performance, using students’ credits earned in their first term, the winter term 2019/20, as the measure of prior performance. Columns (4) to (6) use interactions by gender. Columns (1) and (4) use the number of credits for which students registered in the spring term 2020 as the dependent variable. Columns (2) and (5) use the number of earned credits in the spring term 2020 as the dependent variable. Columns (3) and (6) use students’ average GPA (running from 1=worst to 4=best) among earned credits in the spring term 2020 as the dependent variable. Standard errors are robust. * p<0.10, ** p<0.05, *** p<0.01.

We start with the analysis by prior performance. Column (1) shows the impact on credits registered for. The interaction term is insignificant, but points towards a higher treatment effect for those with more credits in the winter term 2019/20. The interaction in Column (2) shows that students with better prior performance benefit more from the program in terms of credits earned. The point estimates suggest a positive effect starting at around 23 credits (about five of the six scheduled courses) passed in the winter term 2019/20.28 There are no effects on GPA (Column 3). This analysis suggests that highly motivated and highly able students, who often leverage the relatively easy first term to already take exams that they would otherwise take later (leading to more than 30 credits in the winter term 2019/20), drive the heterogeneity shown in Table 4. These students comprise around 30% of our sample, making this behavior far from unusual in our setting.

Fig. 3 further illustrates this heterogeneity in an exploratory analysis. It shows credits earned by students by treatment status and by students’ tercile in the distribution of credits earned in the winter term 2019/20. The control mean is calculated as the students’ mean in the control group. The treatment mean is calculated as the control mean plus the estimated ITT effect, following Bergman et al. (2020). The figure shows that treatment and control groups feature similar outcomes in the lower two terciles of the distribution. In contrast, those students who already fared better in their first term benefit substantially from the program.

Fig. 3. Treatment Effects by Tercile of Credits Earned in Winter Term 2019/20.

Note: This figure shows the number of credits students earned in the spring term 2020 by prior performance as measured by their tercile in the distribution of credits earned in the winter term 2019/20. The control mean is calculated as the students’ mean in the control group. Treatment effects, reported in the top center of each comparison, are estimated using an OLS regression of the outcome on a treatment indicator, an indicator for students’ gender, and students’ credits earned in their first term, the winter term 2019/20. The treatment mean is calculated as the control mean plus the estimated treatment effect. Standard errors reported are robust. * p<0.10, ** p<0.05, *** p<0.01.
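The tercile comparison underlying Fig. 3 could be reproduced roughly as follows (hypothetical column names; the exact tercile cut-offs may differ given ties in earned credits):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Terciles of first-term credits; duplicates="drop" guards against ties at the cut-offs
students["tercile"] = pd.qcut(students["credits_wt"], 3, labels=False, duplicates="drop")

for terc, grp in students.groupby("tercile"):
    est = smf.ols("credits_earned ~ treatment + female + credits_wt",
                  data=grp).fit(cov_type="HC1")
    control_mean = grp.loc[grp["treatment"] == 0, "credits_earned"].mean()
    treat_mean = control_mean + est.params["treatment"]  # as in Bergman et al. (2020)
    print(terc, round(control_mean, 2), round(treat_mean, 2),
          round(est.pvalues["treatment"], 3))
```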

Overall, our findings demonstrate that those who fared well in the winter term 2019/20 benefited from the mentoring program. In contrast, weak students do not seem to have benefitted from the program. This is interesting because in several evaluated (peer) mentoring programs in higher education, good students are excluded (e.g., Angrist et al., 2009).29 These results also raise the question whether similar patterns (more able students benefiting more from the program) are also observed in the survey. In Online Appendix B.4, we show heterogeneity analyses by credits earned in the winter term 2019/20 for our survey outcomes. While the results are more noisy, the overall pattern is similar. This bolsters our confidence that the heterogeneous effects by prior performance reflect a meaningful difference between more and less able students’ response to the mentoring.30

We then turn to effects by gender. Column (4) of Table 4 shows a positive treatment effect for men, who register for around 2.7 more credits (more than 0.5 additional courses) relative to students assigned to the control group. The interaction is negative and of around the same magnitude, suggesting that female students do not benefit from the program. Column (5) shows similar results for credits earned. The results are again attenuated, with an effect of around one more credit earned by male students and zero effects for female students. We again find no effects on GPA. Again, both female and male students benefit more when they passed more credits in the winter term 2019/20 (not shown). This pattern is more pronounced for male students.31 These results contrast somewhat with prior evidence on mentoring programs. Angrist et al. (2009) find that an in-person program combining academic counseling with financial incentives positively affected female college students, with no effects on male students. In our context, male students benefit more from the mentoring program. This may be explained by the online teaching environment, which has been shown to particularly impair the performance of male students (e.g., Figlio et al., 2013).

5. Conclusion

This paper presents the first evidence on the potential role of remote peer mentoring programs in online higher education. We conducted a field experiment that provided first year students with a more advanced online mentor. The structured one-on-one mentoring focused on study behavior, study skills, and students’ self-organization, some of the most common issues in online teaching. For our experiment, we leveraged the COVID-19-induced switch to online teaching at a German public university, where the entire spring term 2020 was conducted online.

We document three sets of main results. First, the peer mentoring program positively affected the students’ motivation and study behavior. Second, while the impacts on study behavior and motivation translate into an increase in exam registrations, the average treatment effect on passed credits is small and not significantly different from zero. Similarly, the students’ GPA is not affected by our intervention. Third, across various outcomes, we observe a consistent pattern of heterogeneity: while students in the bottom part of the distribution of prior performance in the first term, the winter term 2019/20, seem to be largely unaffected by the treatment, we observe a positive effect on students who performed well in the winter term 2019/20. Male students also benefit somewhat more from the program.

Our results provide the first evidence on the effectiveness of peer mentoring to improve student outcomes and student well-being in online higher education. Although we acknowledge that the pandemic situation may affect the effectiveness of such mentoring interventions, we believe that our results carry lessons beyond our setting. First, our intervention did not impact survey questions that were directly related to specific aspects of the pandemic, like being in touch with peers or feeling valued by the department. Second, our findings on student motivation and study behavior are well in line with previous literature analyzing comparable interventions outside pandemics. Third, a concern regarding external validity could be that our intervention is more effective during a partial lockdown than in normal times, particularly for those students most negatively affected by the pandemic. However, we find similarly muted effects on academic performance as mentoring interventions outside the pandemic, both on average and specifically for students performing poorly at baseline.

Yet, we do believe that our intervention informs on some interesting aspects related to the pandemic and a potential recovery. Most importantly, students seem more motivated when interacting with a peer mentor, even though the interactions take place remotely and at low intensity. This is useful since remote interactions are viable even when the local supply of potential mentors is limited (Carlana and La Ferrara, 2021; Kraft and Falken, 2021). Our results also suggest that the program helps students to focus on their studies, which is one issue that many students struggled with during the pandemic. In the recovery from this pandemic, remote mentoring programs could thus help students to organize themselves better, to stay motivated, and to manage the transition from online to in-person teaching. We would like to highlight that, given the cumulative nature of human capital accumulation, our results on students’ well-being and behavior may suggest that a more permanent peer mentoring program could improve student outcomes more.

Regarding more general lessons on how to optimally design mentoring programs in higher education, our results suggest that a “one-size-fits-all” approach may not be optimal. Most importantly, we document that a low-dose mentoring intervention can significantly shift the academic performance of the most able students upwards, while the lower part of the ability distribution is unlikely to benefit. This pattern suggests that the type of mentoring we offered is a complement (rather than a substitute) to traits and skills that positively affect the students’ academic achievement in the first place (like motivation, conscientiousness, etc.). Building on recent work on focused vs. general interventions in K-12 education (Christensen et al., 2020), an interesting route for further research in higher education would be to test the effectiveness of targeted mentoring and/or tutoring interventions that better address the individual needs of students. An optimal targeting of student support services may result in programs that offer mentoring services to well-performing students and more practical support services like tutoring to academically weaker students.

Footnotes

This field experiment was pre-registered at the AEA Social Science Registry under the ID AEARCTR-0005868 and has the approval from the ethics commission as well as from the data protection officer at the university where the experiment took place. We thank the editor, Margherita Fort, as well as three helpful referees for comments and suggestions. Uschi Backes-Gellner, Christine Binzel, Ben Castleman, Fabian Dehos, Ulrich Glogowsky, Joshua Goodman, Philipp Lergetporer, Claus Schnabel, Jeffrey Smith, Martin Watzinger, Kathrin Wernsdorf, Martin West and participants at various research seminars and conferences provided helpful comments and suggestions. We thank Jens Gemmel for excellent research assistance. We thank Sophie Andresen, Helen Dettmann, Eva-Maria Drasch, Daniel Geiger, Nina Graßl, Jana Hoffmann, Lukas Klostermeier, Alexander Lempp, Lennart Mayer, Jennifer Meyer, Annabell Peipp, Tobias Reiser, Sabrina Ried, Niklas Schneider, and Matthias Wiedemann for their excellent work as student mentors. Nagler gratefully acknowledges funding by the Joachim Herz Foundation through an Add-On Fellowship. Rincke gratefully acknowledges funding by the Innovationsfonds Lehre at the University of Erlangen-Nuremberg.

1

Administrative data from the year 2018/19 shows that even in regular times, many students underperform relative to the suggested curriculum: after the first term, only 59 percent of enrolled students have completed courses worth at least 30 credits.

2

In line with this, Patterson (2018) experimentally studies commitment devices, alerts, and distraction blocking tools in a MOOC and finds positive effects for treated students. Delivery-side frictions such as lack of experience in online teaching may also be important (e.g., Orlov et al., 2021).

3

Lavecchia et al. (2016) provide a recent review of behavioral interventions in education production. For other higher-education interventions, see, e.g. (Bettinger, Long, Oreopoulos, Sanbonmatsu, 2012, Clark, Gill, Prowse, Rush, 2020, Himmler, Jaeckle, Weinschenk, 2019). The literature on mentoring has so far mostly focused on settings before the onset of tertiary education (see, e.g., Lavy, Schlosser, 2005, Oreopoulos, Brown, Lavecchia, 2017, Rodriguez-Planas, 2012). There is an additional related literature on assistance provision in higher education (see, e.g. Bettinger et al., 2012). For research on mentoring in other settings, see, e.g., Lyle and Smith (2014).

4

Bettinger and Baker (2014) show that a (professional) student coaching service focusing on aligning long-term goals and self-organization and on providing study skills increased university retention. Castleman and Page (2015) provide evidence of an effective text-messaging mentoring program for high-school graduates. CUNY’s ASAP program combines several interventions, from financial support to tutoring, and seems highly effective (Scrivener, Weiss, Ratledge, Rudd, Sommo, Fresques, 2015, Sommo, Cullinan, Manno, Blake, Alonzo, 2018, Weiss, Ratledge, Sommo, Gupta, 2019). For a recent review of the tutoring literature in K-12 education, see Nickow et al. (2020) and Kraft and Falken (2021). For a comparison of mentoring and tutoring interventions in K-12, see Christensen et al. (2020).

5

On average in the first two terms, survey participants spend about 13.3 hours per week attending courses, about 9.8 hours self-studying, and about 7.5 hours working to earn income.

6

Teaching started on April 20th and the exam period started on July 20th, 2020.

7

The page asked for the students’ consent to use their personal information for research in anonymized form and for their consent to pass along names and e-mail addresses to mentors. Treatment group students who did not register for the program within two days received reminders.

8

For more information, see https://www.wiso-virtuell.fau.eu/, last accessed October 19, 2021.

9

The program could be scaled up easily and at low cost. Including one additional mentee for a three-month period would cost about € 60. Mentors were employed for three months, with work contracts on four hours per week and monthly net pay of about € 160. Employer wage costs were about € 200 per month and mentor.
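To make the cost figure transparent, the following back-of-the-envelope calculation reproduces the € 60 per additional mentee from the reported employer costs; the number of mentees per mentor is an assumption implied by these figures rather than a number stated in the text.

```python
# Back-of-the-envelope cost per mentee, based on the figures in this footnote.
# The number of mentees per mentor is an ASSUMPTION chosen to be consistent
# with the reported EUR 60 per additional mentee; it is not stated explicitly.

employer_cost_per_month = 200   # EUR per mentor and month (reported)
program_months = 3              # mentors employed for three months (reported)
mentees_per_mentor = 10         # ASSUMPTION implied by the reported figures

cost_per_mentor = employer_cost_per_month * program_months   # EUR 600
cost_per_mentee = cost_per_mentor / mentees_per_mentor        # EUR 60

print(f"Cost per mentor over the program: EUR {cost_per_mentor}")
print(f"Implied cost per additional mentee: EUR {cost_per_mentee:.0f}")
```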

10

Section A.3 in the Online Appendix shows that female and male mentors differ slightly in their mentoring. They do not seem to be differentially effective, though.

11

In Germany, some students enroll at a university because enrollment gives them access to heavily subsidized health insurance.

12

Students can choose when exactly to hand in certificates on credits earned elsewhere, delaying this information in the administrative data.

13

Because of the fixed capacity of the program and the (ex ante) unknown take-up rate, we first invited students sampled into treatment who had completed up to 30 credits in their first term, the winter term 2019/20. We sent a reminder to these students after 24 hours. Given the observed sign-up rate, we then invited the remaining students sampled into treatment to participate in the program. Hence, all 344 students sampled into treatment received an invitation e-mail during the first days of the spring term 2020, plus reminder e-mails in case they did not sign up.

14

When we run the same analyses using only credits earned from exams that were supposed to be taken in the spring term 2020, the results follow the same pattern, but the effect on exams registered is not significant anymore.

15

In each term, students can sit a given exam only once. The next opportunity to take the exam is in the subsequent term.

16

In Germany, a reversed scale is used, with 1 being the best and 4 the worst (passing) grade. We recoded the GPA to align with the U.S. system.

17

Students can be in the first year of the study program, but in a more advanced year at university if they were enrolled in a different program before. About 10% of students are enrolled as part-time students because their studies are integrated into a vocational training program.

18

To characterize compliers relative to the overall population in our sample, we also use the ivdesc command by Marbach and Hangartner (2020) and report results in Table A.2 in the same appendix. We find similar results. Comparing the impacts of sign-up on outcomes using OLS vs. IV in Appendix Table A.3, however, suggests that compliers are positively selected on unobservables.
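As an illustration of the complier-profiling logic behind the ivdesc command, the following is a minimal Python sketch of kappa-weighted complier means in the spirit of Marbach and Hangartner (2020); the data frame and column names (assigned, signed_up, and the baseline covariates) are hypothetical placeholders, not the variables used in the paper.

```python
import pandas as pd

def complier_means(df, z="assigned", d="signed_up",
                   covariates=("hs_gpa", "credits_wt1920")):
    """Kappa-weighted complier means of baseline covariates, the logic behind
    the ivdesc command (Marbach and Hangartner, 2020). Column names are
    hypothetical placeholders; z and d must be 0/1 indicators."""
    p = df[z].mean()  # share assigned to the treatment group
    kappa = 1 - df[d] * (1 - df[z]) / (1 - p) - (1 - df[d]) * df[z] / p
    rows = {}
    for x in covariates:
        rows[x] = {
            "sample_mean": df[x].mean(),
            "complier_mean": (kappa * df[x]).sum() / kappa.sum(),
        }
    return pd.DataFrame(rows).T
```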

19

In unreported analyses, we find that the intensity of take-up (i.e., the extent to which students assigned to the treatment group actually made use of mentoring services, measured by the number of mentoring meetings attended) correlates significantly with some student characteristics. In particular, across various measures of actual service use, take-up is positively associated with high-school GPA and the number of credits earned in the winter term 2019/20. We do not have information on why some students signed up but then did not make (full) use of the offered mentoring services.

20

In addition, there is a positive baseline correlation between students’ high school GPA and their university performance. In the Online Appendix, we therefore also show estimates using mentees’ high-school GPA as the dimension of effect heterogeneity; results are similar.

21

All corresponding tables, as well as the survey statements, are shown in the Online Appendix. Note that we treat the ordinal scales as if they were cardinal scales. Using ordered probit or ordered logit models yields qualitatively identical findings.
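As a minimal sketch of the robustness check described here, the snippet below compares an OLS regression that treats a survey item as cardinal with an ordered logit on the same item, using statsmodels; the data frame and column names are hypothetical placeholders, not the paper's code.

```python
import statsmodels.api as sm
from statsmodels.miscmodels.ordinal_model import OrderedModel

def cardinal_vs_ordered(df, item="motivation", treat="assigned"):
    """Compare OLS on the raw 1-5 scale with an ordered logit on the same item.
    Column names are hypothetical placeholders."""
    # OLS treating the ordinal item as cardinal
    ols = sm.OLS(df[item], sm.add_constant(df[treat])).fit(cov_type="HC1")
    # Ordered logit on the same item (no constant: thresholds play that role)
    ologit = OrderedModel(df[item], df[[treat]], distr="logit").fit(method="bfgs", disp=False)
    return ols.params[treat], ologit.params[treat]
```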

22

In unreported analyses, we also investigate whether our effects are robust to adjustments to multiple hypothesis testing. First, we follow Kling et al. (2007) and estimate impacts on a Z-score index of all questions in this block. The ITT is 0.19 SD (p-value = 0.05) and the treatment effect on the treated (TOT) shows an effect of 0.40 SD (p-value = 0.046). Second, we use randomization inference procedures (Heß, 2017). Our standard errors are unaffected. Third, we use the procedure by Barsbai et al. (2020) that corrects for multiple hypothesis testing. Here, the p-values are larger, at around 0.15 for the questions that are significant above. While these results are thus a bit noisy, they are overall robust.
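The two main procedures mentioned here, a Kling et al. (2007)-style Z-score index and randomization inference, can be sketched as follows; this is an illustrative Python implementation with hypothetical variable names, not the code used for the paper.

```python
import numpy as np
import pandas as pd

def zscore_index(df, items, control_mask):
    """Kling et al. (2007)-style summary index: standardize each survey item by
    the control group's mean and SD, then average across items."""
    z = pd.DataFrame({
        c: (df[c] - df.loc[control_mask, c].mean()) / df.loc[control_mask, c].std()
        for c in items
    })
    return z.mean(axis=1)

def permutation_pvalue(y, d, reps=2000, seed=0):
    """Randomization-inference p-value for the difference in means between the
    assigned (d == 1) and control (d == 0) groups."""
    rng = np.random.default_rng(seed)
    y, d = np.asarray(y, dtype=float), np.asarray(d, dtype=int)
    observed = y[d == 1].mean() - y[d == 0].mean()
    draws = np.empty(reps)
    for r in range(reps):
        d_perm = rng.permutation(d)  # re-randomize assignment labels
        draws[r] = y[d_perm == 1].mean() - y[d_perm == 0].mean()
    return (np.abs(draws) >= abs(observed)).mean()
```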

23

We show the heterogeneity of these effects by gender in Online Appendix B.5.

24

We additionally elicited students’ expectations of the likelihood of completing their studies on time and their planned credits. The results are noisy and show no difference between treatment and control group (not shown).

25

Using the latest available data from the early winter term 2021/22, when the students in our sample are in their fifth term of the program, we find that the program's effects fade out over time. Treated students register for more credits and earn more credits on average in the subsequent winter term 2020/21, although the estimates are imprecise. In the spring term 2021, these differences decrease. We also find that dropout from the study program was unaffected. All results are available on request.

26

In Appendix Table A.3, we show the impact of sign-up on student outcomes without accounting for endogenous sign-up. When comparing OLS and IV results of this relationship, the OLS point estimates are larger throughout and significantly different from zero, but not significantly different from the IV estimates using versions of Hausman (1978) tests. The direction of the changes in the point estimates, however, suggests that students who sign up are positively selected on unobservables. These results thus caution against analyzing impacts of mentoring without acknowledging endogenous participation decisions.
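A minimal sketch of this OLS-vs-IV comparison is the control-function (Durbin-Wu-Hausman) version of the test: the coefficient on sign-up from the control-function regression equals the 2SLS estimate, and a significant coefficient on the first-stage residual points to endogenous participation. The column names below are hypothetical placeholders.

```python
import statsmodels.api as sm

def ols_vs_control_function(df, y="credits", d="signed_up", z="assigned"):
    """OLS of the outcome on sign-up vs. a control-function (Durbin-Wu-Hausman)
    regression; the coefficient on d in the latter equals the 2SLS estimate,
    and the p-value on the first-stage residual tests endogeneity of sign-up.
    Column names are hypothetical placeholders."""
    # Naive OLS ignoring endogenous sign-up
    ols = sm.OLS(df[y], sm.add_constant(df[[d]])).fit(cov_type="HC1")
    # First stage: sign-up on random assignment (the instrument)
    first = sm.OLS(df[d], sm.add_constant(df[[z]])).fit()
    # Second stage with the first-stage residual as a control function
    cf_X = sm.add_constant(df[[d]].assign(fs_resid=first.resid))
    cf = sm.OLS(df[y], cf_X).fit(cov_type="HC1")
    return ols.params[d], cf.params[d], cf.pvalues["fs_resid"]
```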

27

Using machine-learning based methods to improve precision using more covariates (Opper, 2021, Wu, Gagnon-Bartsch, 2017) also does not meaningfully improve precision.
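As a rough illustration of how machine-learning covariate adjustment can tighten treatment-effect estimates, the sketch below residualizes the outcome on baseline covariates using cross-fitted predictions and compares residual means by arm. This is a generic adjustment sketch under assumed variable names, not an implementation of Opper (2021) or of the LOOP estimator of Wu and Gagnon-Bartsch (2017).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

def adjusted_difference(y, d, X, seed=0):
    """Cross-fitted regression adjustment: predict the outcome from baseline
    covariates out of fold, then compare residual means across treatment arms.
    A generic precision-improvement sketch, not the LOOP estimator itself."""
    y, d = np.asarray(y, dtype=float), np.asarray(d, dtype=int)
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    y_hat = cross_val_predict(model, X, y, cv=5)  # out-of-fold predictions
    resid = y - y_hat
    return resid[d == 1].mean() - resid[d == 0].mean()
```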

28

Taken at face value, the estimates suggest that the treatment effect is positive (at 0.41) for the average student in the treatment group, who earned 25 credits in the winter term 2019/20. For students at the 25th percentile, the effect is slightly negative (-0.53), while for top performers at the 90th percentile we see strong positive effects (2.17). In Online Appendix C.5, we show flexible estimates of the relationship between credits earned in the winter term 2019/20 and treatment effects. We find that while the treatment effect is positive for good students, there is no evidence that the treatment is detrimental for low performers.
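The effects quoted here come from a specification with a treatment-by-baseline-credits interaction; a minimal sketch of such a specification and the implied marginal effects is given below, with hypothetical column names (the exact specification used in the paper may differ).

```python
import statsmodels.formula.api as smf

def effect_by_baseline_credits(df, at=(10, 25, 40)):
    """Linear interaction model: outcome on assignment, baseline credits, and
    their interaction; returns the implied treatment effect at selected
    baseline-credit levels. Column names are hypothetical placeholders."""
    m = smf.ols("credits_ss20 ~ assigned * credits_wt1920", data=df).fit(cov_type="HC1")
    b, b_int = m.params["assigned"], m.params["assigned:credits_wt1920"]
    # Treatment effect implied by the linear interaction at each credit level
    return {c: b + b_int * c for c in at}
```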

29

We hired as mentors students with above-average performance in the same study program as their mentees. Hence, the heterogeneity in our data is in line with prior evidence suggesting that students perform better when taught by instructors who are similar to them (e.g. Dee, 2005, Hoffmann, Oreopoulos, 2009).

30

We show exploratory analyses in the Online Appendix, where we follow Abadie et al. (2018) and Ferwerda (2014) and estimate effects using endogenous stratification approaches. In line with the analysis above, students in the upper part of the distribution of predicted outcomes in the spring term 2020 seem to benefit most from the program.
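A simplified sketch of the endogenous-stratification idea is given below: predict outcomes from baseline covariates using the control group, stratify students by the prediction, and compare means within strata. The published procedure relies on leave-one-out or repeated split-sample predictions to avoid overfitting bias; the column names here are hypothetical placeholders.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

def endogenous_stratification(df, y, d, covars, n_groups=3):
    """Simplified endogenous-stratification sketch in the spirit of Abadie et
    al. (2018): predict the outcome from baseline covariates using control
    observations, stratify by the prediction, and take the treatment-control
    difference in means within each stratum. (The published procedure uses
    leave-one-out or repeated split samples to avoid overfitting bias.)"""
    controls = df[df[d] == 0]
    model = LinearRegression().fit(controls[covars], controls[y])
    df = df.assign(pred=model.predict(df[covars]))
    df["stratum"] = pd.qcut(df["pred"], n_groups, labels=False)
    return df.groupby("stratum").apply(
        lambda g: g.loc[g[d] == 1, y].mean() - g.loc[g[d] == 0, y].mean()
    )
```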

31

In the Online Appendix, Fig. C.1 shows heterogeneous treatment effects by credits earned in the winter term 2019/20, by gender. Again, both female and male students benefit more when they passed more credits in the winter term 2019/20. However, this pattern is much more pronounced for male students. Using credits registered as dependent variable shows similar effects (available on request).

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.labeco.2022.102220.

Appendix A. Supplementary materials

Supplementary Data S1

Supplementary Raw Research Data. This is open data under the CC BY license http://creativecommons.org/licenses/by/4.0/

mmc1.pdf (3.5MB, pdf)

References

  1. Abadie A., Chingos M.M., West M.R. Endogenous stratification in randomized experiments. Rev. Econ. Stat. 2018;100(4):567–580.
  2. Altindag D.T., Filiz E.S., Tekin E. Is Online Education Working? NBER Working Paper No. 29113. 2021.
  3. Angrist J., Lang D., Oreopoulos P. Incentives and services for college achievement: evidence from a randomized trial. Am. Econ. J. 2009;1(1):136–163.
  4. Angrist N., Bergman P., Matsheng M. School’s out: experimental evidence on limiting learning loss using ‘low-tech’ in a pandemic. Nat. Hum. Behav. 2022. Forthcoming. doi:10.1038/s41562-022-01381-z.
  5. Aucejo E.M., French J., Araya M.P.U., Zafar B. The impact of COVID-19 on student experiences and expectations: evidence from a survey. J. Public Econ. 2020;191:104271. doi:10.1016/j.jpubeco.2020.104271.
  6. Bacher-Hicks A., Goodman J., Mulhern C. Inequality in household adaptation to schooling shocks: Covid-induced online learning engagement in real time. J. Public Econ. 2020;193:104345. doi:10.1016/j.jpubeco.2020.104345.
  7. Banerjee A.V., Duflo E. (Dis)organization and success in an economics MOOC. Am. Econ. Rev. 2014;104(5):514–518.
  8. Barsbai T., Licuanan V., Steinmayr A., Tiongson E., Yang D. Information and the Acquisition of Social Network Connections. NBER Working Paper No. 27346. 2020.
  9. Baum C.F., Schaffer M.E., Stillman S. Enhanced routines for instrumental variables/generalized method of moments estimation and testing. Stata J. 2007;7(4):465–506.
  10. Bergman P., Chetty R., DeLuca S., Hendren N., Katz L.F., Palmer C. Creating Moves to Opportunity: Experimental Evidence on Barriers to Neighborhood Choice. NBER Working Paper No. 26164. 2020.
  11. Bettinger E.P., Baker R.B. The effects of student coaching: an evaluation of a randomized experiment in student advising. Educ. Eval. Policy Anal. 2014;36(1):3–19.
  12. Bettinger E.P., Fox L., Loeb S., Taylor E.S. Virtual classrooms: how online college courses affect student success. Am. Econ. Rev. 2017;107(9):2855–2875.
  13. Bettinger E.P., Long B.T., Oreopoulos P., Sanbonmatsu L. The role of application assistance and information in college decisions: results from the H&R Block FAFSA experiment. Q. J. Econ. 2012;127(3):1205–1242.
  14. Bird K.A., Castleman B.L., Lohner G. Negative impacts from the shift to online learning during the COVID-19 crisis: evidence from a statewide community college system. AERA Open. 2022;8(1):1–16. doi:10.26300/GX68-RQ13.
  15. Carlana M., La Ferrara E. Apart but Connected: Online Tutoring to Mitigate the Impact of COVID-19 on Educational Inequality. Mimeo. 2020.
  16. Castleman B.L., Page L.C. Summer nudging: can personalized text messages and peer mentor outreach increase college going among low-income high school graduates? J. Econ. Behav. Organ. 2015;115:144–160.
  17. Christensen K.M., Hagler M.A., Stams G.-J., Raposa E.B., Burton S., Rhodes J.E. Non-specific versus targeted approaches to youth mentoring: a follow-up meta-analysis. J. Youth Adolesc. 2020;49:959–972. doi:10.1007/s10964-020-01233-x.
  18. Clark D., Gill D., Prowse V., Rush M. Using goals to motivate college students: theory and evidence from field experiments. Rev. Econ. Stat. 2020;102(4):648–663.
  19. Clingingsmith D., Khwaja A.I., Kremer M. Estimating the impact of the Hajj: religion and tolerance in Islam’s global gathering. Q. J. Econ. 2009;124(3):1133–1170.
  20. De Paola M., Gioia F., Scoppa V. Online Teaching, Procrastination and Students’ Achievement: Evidence from COVID-19 Induced Remote Learning. Università della Calabria Working Paper No. 2. 2022.
  21. Dee T.S. A teacher like me: does race, ethnicity, or gender matter? Am. Econ. Rev. 2005;95(2):158–165.
  22. Dee T.S. Teachers and the gender gaps in student achievement. J. Hum. Resour. 2007;42(3):528–554.
  23. Ferwerda J. Estrat: Stata module to perform endogenous stratification for randomized experiments. Statistical Software Components S457801. 2014.
  24. Figlio D., Rush M., Yin L. Is it live or is it internet? Experimental estimates of the effects of online instruction on student learning. J. Labor Econ. 2013;31(4):763–784.
  25. Grewenig E., Lergetporer P., Werner K., Woessmann L., Zierow L. COVID-19 and educational inequality: how school closures affect low- and high-achieving students. Eur. Econ. Rev. 2021;140:103920. doi:10.1016/j.euroecorev.2021.103920.
  26. Hardt D., Nagler M., Rincke J. Tutoring in (Online) Higher Education: Experimental Evidence. CESifo Working Paper No. 9555. 2022.
  27. Hausman J. Specification tests in econometrics. Econometrica. 1978;46:1251–1271.
  28. Heß S. Randomization inference with Stata: a guide and software. Stata J. 2017;17(3):630–651.
  29. Himmler O., Jaeckle R., Weinschenk P. Soft commitments, reminders, and academic performance. Am. Econ. J. 2019;11(2):114–142.
  30. Hoffmann F., Oreopoulos P. A professor like me: the influence of instructor gender on college achievement. J. Hum. Resour. 2009;44(2):479–494.
  31. Jaeger D.A., Arellano-Bover J., Karbownik K., Matute M.M., Nunley J.M., Seals R.A., Jr., Almunia M., Alston M., Becker S.O., Beneito P., Böheim R., Bosc J.E., Brown J.H., Chang S., Cobb-Clark D.A., Danagoulian S., Donnally S., Eckrote-Nordland M., Farr L., Ferri J., Fort M., Fruehwirth J.C., Gelding R., Goodman A.C., Guldi M., Häckl S., Hankin J., Imberman S.A., Lahey J., Llull J., Mansour H., McFarlin I., Meriläinen J., Mortlund T., Nybom M., O’Connell S.D., Sausgruber R., Schwartz A., Stuhler J., Thiemann P., van Veldhuizen R., Wanamaker M.H., Zhu M. The Global COVID-19 Student Survey: First Wave Results. IZA Discussion Paper No. 14419. 2021.
  32. Kleibergen F., Paap R. Generalized reduced rank tests using the singular value decomposition. J. Econ. 2006;133(1):97–126.
  33. Kling J.R., Liebman J.B., Katz L.F. Experimental analysis of neighborhood effects. Econometrica. 2007;75(1):83–119.
  34. Kling J.R., Liebman J.B., Katz L.F., Sanbonmatsu L. Moving to Opportunity and Tranquility: Neighborhood Effects on Adult Economic Self-sufficiency and Health from a Randomized Housing Voucher Experiment. KSG Working Paper No. RWP04-035. 2004.
  35. Kofoed M., Gebhart L., Gilmore D., Moschitto R. Zooming to Class?: Experimental Evidence on College Students’ Online Learning During COVID-19. IZA Discussion Paper No. 14356. 2021.
  36. Kraft M.A., Falken G.T. A Blueprint for Scaling Tutoring Across Public Schools. EdWorkingPaper No. 20–335. 2021.
  37. Lavecchia A.M., Liu H., Oreopoulos P. Behavioral economics of education: progress and possibilities. In: Hanushek E.A., Machin S., Woessmann L., editors. Handbook of the Economics of Education. Vol. 5. Elsevier; 2016. pp. 1–74.
  38. Lavy V., Schlosser A. Targeted remedial education for underperforming teenagers: costs and benefits. J. Labor Econ. 2005;23(4):839–874.
  39. Lyle D.S., Smith J.Z. The effect of high-performing mentors on junior officer promotion in the US Army. J. Labor Econ. 2014;32(2):229–258.
  40. Marbach M., Hangartner D. Profiling compliers and noncompliers for instrumental-variable analysis. Polit. Anal. 2020;28:435–444.
  41. McPherson M.S., Bacow L.S. Online higher education: beyond the hype cycle. J. Econ. Perspect. 2015;29(4):135–154.
  42. Nickow A., Oreopoulos P., Quan V. The Impressive Effects of Tutoring on PreK-12 Learning: A Systematic Review and Meta-analysis of the Experimental Evidence. NBER Working Paper No. 27476. 2020.
  43. Opper I.M. Improving Average Treatment Effect Estimates in Small-scale Randomized Controlled Trials. EdWorkingPaper No. 21–344. 2021.
  44. Oreopoulos P., Brown R.S., Lavecchia A.M. Pathways to education: an integrated approach to helping at-risk high school students. J. Polit. Econ. 2017;125(4):947–984.
  45. Oreopoulos P., Patterson R.W., Petronijevic U., Pope N.G. Low-touch attempts to improve time management among traditional and online college students. J. Hum. Resour. 2022;57(1):1–43.
  46. Oreopoulos P., Petronijevic U. Student coaching: how far can technology go? J. Hum. Resour. 2018;53(2):299–329.
  47. Oreopoulos P., Petronijevic U. The Remarkable Unresponsiveness of College Students to Nudging and What We Can Learn from It. NBER Working Paper No. 26059. 2019.
  48. Oreopoulos P., Petronijevic U., Logel C., Beattie G. Improving non-academic student outcomes using online and text-message coaching. J. Econ. Behav. Organ. 2020;171:342–360.
  49. Orlov G., McKee D., Berry J., Boyle A., DiCiccio T., Ransom T., Rees-Jones A., Stoye J. Learning during the COVID-19 pandemic: it is not who you teach, but how you teach. Econ. Lett. 2021;202:109812. doi:10.1016/j.econlet.2021.109812.
  50. Patterson R.W. Can behavioral tools improve online student outcomes? Experimental evidence from a massive open online course. J. Econ. Behav. Organ. 2018;153:293–321.
  51. Rodriguez-Planas N. Longer-term impacts of mentoring, educational services, and learning incentives: evidence from a randomized trial in the United States. Am. Econ. J. 2012;4(4):121–139.
  52. Rodriguez-Planas N. COVID-19 and college academic performance: a longitudinal analysis. J. Public Econ. 2022;207:104606. doi:10.1016/j.jpubeco.2022.104606.
  53. Rodriguez-Planas N. Hitting where it hurts most: COVID-19 and low-income urban college students. Econ. Educ. Rev. 2022;87:102233. doi:10.1016/j.econedurev.2022.102233.
  54. Scrivener S., Weiss M.J., Ratledge A., Rudd T., Sommo C., Fresques H. Doubling graduation rates: three-year effects of CUNY’s Accelerated Study in Associate Programs (ASAP) for developmental education students. MDRC. 2015.
  55. Sommo C., Cullinan D., Manno M., Blake S., Alonzo E. Doubling graduation rates in a new state: two-year findings from the ASAP Ohio demonstration. MDRC Policy Brief 12/2018. 2018.
  56. Weiss M.J., Ratledge A., Sommo C., Gupta H. Supporting community college students from start to degree completion: long-term evidence from a randomized trial of CUNY’s ASAP. Am. Econ. J. 2019;11(3):253–297.
  57. Wu E., Gagnon-Bartsch J. The LOOP Estimator: Adjusting for Covariates in Randomized Experiments. Working Paper, Department of Statistics, University of Michigan, Ann Arbor. 2017. https://arxiv.org/abs/1708.01229.
  58. Xu D., Jaggars S.S. Performance gaps between online and face-to-face courses: differences across types of students and academic subject areas. J. Higher Educ. 2014;85(5):633–659.
