2025 Aug 26;14(1):20240028. doi: 10.1515/em-2024-0028

Investigating the association between school substance programs and student substance use: accounting for informative cluster size

Aya A Mitani 1, Yushu Zou 1, Scott T Leatherdale 2, Karen A Patte 3
PMCID: PMC12376993  PMID: 40861313

Abstract

Objectives

The use of substances in adolescents is an increasing public health problem. Many high schools in Canada have implemented school-based programs to mitigate student substance use, but their utility is not conclusive. Polysubstance use data collected on students from multiple schools may be subject to informative cluster size (ICS). The objective of this study was to investigate whether a multivariate analysis approach that addresses ICS provides different conclusions from univariate analyses and methods that do not account for ICS.

Methods

We used data from the 2018/2019 cycle of the Cannabis, Obesity, Mental health, Physical activity, Alcohol, Smoking, and Sedentary Behaviour (COMPASS) study, an ongoing prospective cohort study that annually collects data from Canadian high schools and students. We compared results from four analytical approaches that estimate marginal associations between each school substance program and the four substance use behaviours (binge drinking, cannabis, e-cigarette, and cigarette): univariate generalized estimating equations (GEE), univariate cluster-weighted GEE (CWGEE), multivariate GEE, and multivariate CWGEE.

Results

We observed that the proportion of students who engage in each of the four behaviours was higher in small schools and lower in large schools. In general, the univariate and multivariate analyses produced comparable results. Some differences existed between multivariate CWGEE and GEE. CWGEE indicated that the school program on cannabis had an odds ratio (OR) and 95 % confidence interval (CI) of 0.83 (0.73, 0.95) on all substance use, but GEE produced a null association with an OR (95 % CI) of 0.92 (0.79, 1.07).

Conclusions

When ICS is present in clustered school data, weighted and unweighted analyses may produce different results. Care is needed to investigate the relationship between cluster size and the outcome, and use appropriate methods for analysis. Certain substance programs may influence student behaviour in other substances, highlighting the need for a multivariate analytical approach when studying the use of substances by adolescents.

Keywords: adolescent health, clustered data, cluster-weighted generalized estimating equations, marginal models, multivariate analysis

Introduction

Adolescent substance use is a serious public health problem. With the increase in electronic cigarette (e-cigarette) use and cannabis legalisation in Canada and several U.S. states, researchers have explored risk factors and potential mitigation strategies for substance use among high school students [1], [2]. Although schools are well positioned to educate and equip students to deter and minimize their participation in substance-related activities [3], [4], the effectiveness of school-based programs in preventing substance use among students is inconclusive [5], [6], [7]. Burnett et al. found that the use of alcohol or cannabis was more prevalent when public health experts were involved in schools [5]. Williams et al. found that school prevention programs targeting e-cigarettes and cigarettes did not significantly reduce e-cigarette use among students [6]. Polysubstance use, the concurrent use of multiple substances, often shows correlated usage patterns [8]. However, many school-based studies build separate regression models for each substance use and exposures of interest [9], [10], [11], [12], overlooking the correlations between different substance uses. Furthermore, analysis must consider the clustering of students within schools, particularly when investigating school-level interventions.

Marginal models with a logit link fitted using generalized estimating equations (GEE) provide a population-average inference for the effect of school-level programs on the odds of student-level substance use. Past research has often used GEE with exchangeable working correlation with robust standard errors to account for the clustering of students within schools [5], [13], [14]. When students are clustered within schools, population-average inference can be either inference about a typical student in a typical school or inference about all students in the data [15]. The two types of inference are equal if the outcome (substance use at the student level) is independent of the size of the cluster (size of school) and can be consistently estimated using GEE with independence or non-independence (including exchangeable) working correlation [16]. However, when the outcome is related to cluster size, a condition known as informative cluster size (ICS), GEE with non-independence working correlation will produce biased results for both inferences [17], [18].

Research examining the relationship between school size and student substance use has produced different results. Some studies suggest that small schools are associated with a high percentage of student misconduct [19], while others suggest that large schools exhibit higher levels of misbehaviour [20]. This discrepancy underscores the importance of exploring the link between substance use rates and school size, especially because GEE with the exchangeable working correlation structure, a popular choice of analysis for school-based studies, produces biased results under ICS [16], [17]. When ICS exists, weighting the GEE with independence working correlation by the inverse of cluster size is an established method to consistently estimate the effects of cluster-level exposures where the inference is about a typical student in a typical school [16], [18]. Cluster-weighted GEE (CWGEE) has been applied to various clustered data, but is generally not applied to school-based studies. CWGEE has also been extended to account for the association between each pair of multiple outcomes when more than one outcome is of interest, such as in the case of multiple types of substance use [21].
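The distinction between the two targets of inference can be made concrete with a small sketch. The numbers below are made up for illustration and are not COMPASS data; the function names are hypothetical. The sketch only contrasts an unweighted prevalence (all students) with an inverse-cluster-size weighted prevalence (a typical student in a typical school); it is not a GEE fit.

```python
# Toy illustration with made-up numbers: each tuple is (school size, number of users).
# Smaller schools have higher prevalence, so cluster size is informative.
schools = [(100, 30), (200, 50), (1000, 100)]


def all_students_prevalence(schools):
    """Unweighted mean: every student counts equally, so large schools dominate."""
    total_users = sum(users for _, users in schools)
    total_students = sum(size for size, _ in schools)
    return total_users / total_students


def typical_school_prevalence(schools):
    """Cluster-weighted mean: each student gets weight 1/n_i, so each school counts equally."""
    weighted_sum = 0.0
    for size, users in schools:
        # Each user contributes 1 * (1 / size); non-users contribute 0.
        weighted_sum += users * (1.0 / size)
    # The weights within a school sum to 1, so the total weight is the number of schools.
    return weighted_sum / len(schools)


p_all = all_students_prevalence(schools)
p_typical = typical_school_prevalence(schools)
print(round(p_all, 4), round(p_typical, 4))
```

In this toy example the weighted prevalence is larger than the unweighted one because the small, high-prevalence schools are no longer swamped by the large school. With non-informative cluster size the two quantities would agree in expectation.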

The objective of this paper was to examine whether the use of a multivariate analysis approach that accounts for ICS yields conclusions distinct from those derived from univariate analyses and approaches that disregard ICS when studying the marginal association of school-level substance use prevention or cessation programs and student substance use. We first investigated the relationship between the rates of four behaviours (binge drinking, cannabis use, e-cigarette use, and cigarette use) and school size. Then using multivariate CWGEE, we estimated the effect of school programs on the odds of four behaviours, comparing the results with unweighted GEE analysis that assumes no ICS and univariate CWGEE and GEE analyses which do not account for the correlations between different substance use.

Materials and methods

Data source

We used cross-sectional student-level and school-level data from year 7 (2018/2019) of the COMPASS study, an ongoing prospective cohort study that annually collects data from a sample of Canadian high schools and their students [22]. School-level data were collected using the COMPASS School Program and Policies (SPP) tool and student-level data were collected using the COMPASS student questionnaire (Cq). Active-information, passive-consent parental permission protocols were used, given their importance in collecting robust substance use data among youth [23]. Under these protocols, all students attending participating schools were eligible to complete the questionnaire unless they were withdrawn by a parent/guardian. In year 7, the Cq was a machine-readable paper-and-pencil questionnaire completed in the classrooms of all schools. The SPP is an online survey completed at the same time by the school contacts most knowledgeable about their school’s health policy environment. Additional details on the methods and protocols of the COMPASS study can be found online (www.compass.uwaterloo.ca).

Outcome variables – binge drinking, cannabis use, e-cigarette use, and cigarette use

We derived four dichotomous variables that indicate current binge drinking and current use of cannabis, e-cigarettes, and cigarettes. A student could be classified as a current user of more than one substance. To identify binge drinkers, students were asked “In the last 12 months, how often did you have five drinks of alcohol or more on one occasion?” and responded with one of ten options ranging from ‘I have never done this’ to ‘Daily or almost daily’. A student was considered a current binge drinker if they reported having five or more drinks on one occasion at least monthly. To measure cannabis use, students were asked “In the last 12 months, how often did you use marijuana or cannabis (a joint, pot, weed, hash)?” and responded with one of nine options ranging from ‘I have never used marijuana’ to ‘every day’. A student was considered a current cannabis user if they reported using cannabis at least monthly. To measure e-cigarette and tobacco cigarette use, students were asked “On how many of the last 30 days did you use an e-cigarette?” and “On how many of the last 30 days did you smoke one or more cigarettes?” and responded with one of eight options ranging from ‘None’ to ‘30 days (every day)’. A student was considered a current e-cigarette or cigarette user if they reported using an e-cigarette or smoking a tobacco cigarette on at least one day in the last 30 days, respectively. Of note, the legal age to drink and smoke (cigarettes and e-cigarettes) is 18 in Alberta (AB) and Quebec (QC), and 19 in British Columbia (BC) and Ontario (ON). On 2018 October 17, cannabis was legalized in Canada for those over age 18 in AB, 19 in ON and BC, and 21 in QC.

Exposure variable – school prevention and/or cessation program on alcohol, cannabis, e-cigarettes, and cigarettes

For each school, we derived four dichotomous variables to indicate the presence of prevention or cessation programs for alcohol, cannabis, e-cigarettes, and cigarettes. School contacts were asked if their school offered programs outside the standard curriculum that addressed alcohol, cannabis, e-cigarette, or cigarette use prevention (“Other than classes/curriculum, does your school offer any programs that address alcohol/Marijuana/e-cigarette/tobacco use prevention?”), and e-cigarette or cigarette use cessation (“Other than classes/curriculum, does your school offer any programs that address e-cigarette/tobacco use cessation?”). Each school was considered to have: 1) a school program on alcohol if it indicated the existence of an alcohol prevention program; 2) a school program on cannabis if it indicated the existence of a cannabis prevention program; 3) a school program on e-cigarette if it indicated the existence of either e-cigarettes prevention or e-cigarette cessation program; 4) a school program on cigarettes if it indicated the existence of either tobacco prevention or tobacco cessation program.

Covariates

Several studies have found protective effects of school connectedness on adolescent substance use [24], [25], [26]. School connectedness was collected using a modified version of the National Longitudinal Study of Adolescent Health School Connectedness scale [25]. Students were asked to respond using a four-option Likert scale from ‘strongly agree’ to ‘strongly disagree’ to the following statements: (1) ‘I feel close to people at my school’, (2) ‘I am a part of my school’, (3) ‘I am happy to be at my school’, (4) ‘I feel the teachers at my school treat me fairly’, and (5) ‘I feel safe in my school’. These items were summed into a connectedness score ranging from 6 to 24 with a higher score indicating a higher level of school connectedness. Other covariates included age (years), gender (female/male), ethnicity (White/Black/Asian/Hispanic or Latin American/Other), weekly personal spending money (None, $1–20, $21–100, > $100, Not Sure) of each student and the province (ON/AB/BC/QC) of each school as well as their location (Large/Medium urban or Small urban/Rural). Students who identified with multiple ethnicities were classified into the “Other” category.

Investigating ICS in the data

ICS occurs when cluster size varies and the outcome is not independent of cluster size, given the covariates. For each school, we determined the proportions of students engaging in binge drinking, cannabis, e-cigarette, and cigarette use. Density plots were used to examine the distribution of proportion of each substance use. Scatter plots with Spearman’s rank correlation coefficients were used to investigate the relationship between each pair of substance use behaviours. We also visually inspected the relationship between school size and the proportion of students who engage in each behaviour, using Spearman’s rank correlation coefficient to measure the strength of these associations.
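As a minimal sketch of the descriptive check described above, Spearman's coefficient can be computed as the Pearson correlation of average ranks. The school sizes and proportions below are hypothetical values chosen for illustration, and the helper names are not taken from the study's code.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical school enrollment sizes and per-school proportions of current users.
sizes = [120, 300, 450, 800, 1500]
proportions = [0.30, 0.22, 0.25, 0.15, 0.10]
rho = spearman(sizes, proportions)
print(round(rho, 3))
```

A clearly negative coefficient in such a check, as in this toy example, is the kind of pattern that would suggest the outcome depends on cluster size.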

Multivariate CWGEE

We let $Y_{ijk}$ represent the $k$th binary outcome for the $j$th student enrolled in the $i$th school, where $k \in \{\text{Binge}, \text{Cannabis}, \text{E-cig}, \text{Cig}\}$, and let $Y_{ij} = (Y_{ij,\text{Binge}}, Y_{ij,\text{Cannabis}}, Y_{ij,\text{E-cig}}, Y_{ij,\text{Cig}})$ be the vector that contains the four outcomes for student $j$ in school $i$. Each school $i$ includes $n_i$ students, with a total of $N$ schools, $j = 1, \ldots, n_i$, $i = 1, \ldots, N$. We also let $X_{ij}$ be the $1 \times p$ row vector of covariates. School-level covariates included the presence of programs on alcohol, cannabis, e-cigarettes, and cigarettes, as well as the province and urbanicity. Student-level covariates included school connectedness score, age, sex, ethnicity, and weekly personal spending money. As the goal is to jointly describe the relationship between $\mu_{ijk} = E(Y_{ijk}) = \Pr(Y_{ijk} = 1)$ and $X_{ij}$, we built the following set of four logistic regression models:

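$$
\begin{aligned}
\operatorname{logit}(\mu_{ij,\text{Binge}}) &= \alpha_{\text{Binge}} + X_{ij}\beta \\
\operatorname{logit}(\mu_{ij,\text{Cannabis}}) &= \alpha_{\text{Cannabis}} + X_{ij}(\beta + \beta^{\text{Cannabis}}) \\
\operatorname{logit}(\mu_{ij,\text{E-cig}}) &= \alpha_{\text{E-cig}} + X_{ij}(\beta + \beta^{\text{E-cig}}) \\
\operatorname{logit}(\mu_{ij,\text{Cig}}) &= \alpha_{\text{Cig}} + X_{ij}(\beta + \beta^{\text{Cig}})
\end{aligned}
\tag{1}
$$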

where $\alpha_k$ is the outcome-specific intercept, $\beta$ is the $p \times 1$ column vector of coefficients for the outcome binge drinking, and $\beta^{\text{Cannabis}}$, $\beta^{\text{E-cig}}$, $\beta^{\text{Cig}}$ are the additive effects that each of the covariates has on the outcomes cannabis, e-cigarette, and cigarette use, respectively, compared to the reference outcome binge drinking. We constructed the following cluster-weighted generalized sum of squares for error to account for the correlation between the four outcomes:

$$
Q_W(\gamma, \rho) = \sum_{i=1}^{N} \frac{1}{n_i} \sum_{j=1}^{n_i} Z(\gamma)_{ij}^{\top} \, R(\rho)_{ij}^{-1} \, Z(\gamma)_{ij}
\tag{2}
$$

where $\gamma = (\alpha_{\text{Binge}}, \alpha_{\text{Cannabis}}, \alpha_{\text{E-cig}}, \alpha_{\text{Cig}}, \beta, \beta^{\text{Cannabis}}, \beta^{\text{E-cig}}, \beta^{\text{Cig}})$ is the vector of all parameters, $Z(\gamma)_{ij} = (Z_{ij,\text{Binge}}, Z_{ij,\text{Cannabis}}, Z_{ij,\text{E-cig}}, Z_{ij,\text{Cig}})$ is the vector of standardized residuals defined as $Z_{ijk} = (Y_{ijk} - \mu_{ijk}) / \sqrt{\mu_{ijk}(1 - \mu_{ijk})}$, and $R(\rho)_{ij}$ is the 4 × 4 working correlation matrix [21]. We assumed the unstructured working correlation structure because, in addition to estimating $\gamma$, we were also interested in estimating the correlation between each pair of outcomes. We constructed two sets of estimating equations by taking the partial derivatives of $Q_W(\gamma, \rho)$ with respect to $\gamma$ and $\rho$ and setting each equal to 0. We applied the Fisher scoring algorithm to solve for $\gamma$ and $\rho$ and iterated between the two equations until the convergence criterion was met [27]. The final estimates, $\hat{\gamma}$ and $\hat{\rho}$, minimize Equation (2). The estimation procedure is implemented in the mvoGEE function in the R package CWGEE available from https://github.com/ayamitani/CWGEE [21].
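To make the objective in Equation (2) concrete, the following sketch evaluates the cluster-weighted sum of squares for given fitted means and a given working correlation matrix. It is an illustration only, not the mvoGEE implementation: the function names are hypothetical, the toy data use two outcomes rather than four, a single working correlation matrix is assumed to be shared by all students, and no optimisation over the parameters is attempted.

```python
import math


def standardized_residuals(y, mu):
    """Pearson residuals Z_k = (y_k - mu_k) / sqrt(mu_k * (1 - mu_k))."""
    return [(yk - mk) / math.sqrt(mk * (1.0 - mk)) for yk, mk in zip(y, mu)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def quadratic_form(z, corr):
    """Compute z^T R^{-1} z without forming the inverse explicitly."""
    x = solve(corr, z)
    return sum(zi * xi for zi, xi in zip(z, x))


def objective(schools, corr, cluster_weighted=True):
    """Sum of quadratic forms; each school is a list of (y, mu) pairs, one per student."""
    total = 0.0
    for students in schools:
        weight = 1.0 / len(students) if cluster_weighted else 1.0
        total += weight * sum(
            quadratic_form(standardized_residuals(y, mu), corr) for y, mu in students
        )
    return total


# Toy data (assumed): two outcomes, fitted means of 0.5, identity working correlation.
identity = [[1.0, 0.0], [0.0, 1.0]]
student = ([1, 0], [0.5, 0.5])
toy_schools = [[student, student], [student]]
q_weighted = objective(toy_schools, identity, cluster_weighted=True)
q_unweighted = objective(toy_schools, identity, cluster_weighted=False)
print(q_weighted, q_unweighted)
```

Setting `cluster_weighted=False` drops the $1/n_i$ factor, so the two-student school contributes twice as much as the one-student school, which is the only difference between the weighted and unweighted criteria in this sketch.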

After fitting the full model, we used the multivariate Wald test to test the hypothesis $H_0: \beta_p^{\text{Cannabis}} = \beta_p^{\text{E-cig}} = \beta_p^{\text{Cig}} = 0$ vs. $H_1$: at least one of $\beta_p^{\text{Cannabis}}, \beta_p^{\text{E-cig}}, \beta_p^{\text{Cig}}$ does not equal 0, for each covariate $p$, to arrive at a more parsimonious model. If the null hypothesis was not rejected for covariate $p$, then we assumed that the effects of covariate $p$ on cannabis, e-cigarette, and cigarette use were similar to the effect of covariate $p$ on binge drinking and used a common slope in all four outcomes. We used the Benjamini-Hochberg procedure to control the false discovery rate at 0.1.
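The Benjamini-Hochberg step-up rule used for these tests can be sketched as follows. The p-values are made up for illustration and the function name is hypothetical; the sketch returns reject/do-not-reject decisions rather than adjusted p-values.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis whose rank is at or below the cutoff (step-up rule).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Made-up Wald-test p-values, one per covariate.
example_p = [0.001, 0.04, 0.03, 0.20, 0.60]
decisions = benjamini_hochberg(example_p, fdr=0.1)
print(decisions)
```

In the setting described above, a covariate whose test is not rejected would be given a common slope across the four outcomes, and a rejected one would keep outcome-specific slopes.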

We also performed three other analyses for comparison. The first was the unweighted multivariate GEE, which uses the model in Equation (1) but drops the cluster weights $1/n_i$ from the objective in Equation (2), giving

$$
Q(\gamma, \rho) = \sum_{i=1}^{N} \sum_{j=1}^{n_i} Z(\gamma)_{ij}^{\top} \, R(\rho)_{ij}^{-1} \, Z(\gamma)_{ij}
\tag{3}
$$

from which we similarly estimated the unweighted parameters. The second was a set of four separate univariate CWGEE analyses, one for each outcome. The third was a set of four separate univariate GEE analyses, one for each outcome, assuming the exchangeable working correlation structure, which is most commonly applied in school-based studies.

Results

Relationship between substance use and school size

Year 7 (2018/19) of the COMPASS study included 136 schools and 74,075 students in four Canadian provinces: Alberta (8 schools with 3,283 students), British Columbia (15 schools with 10,350 students), Quebec (52 schools with 29,956 students), and Ontario (61 schools with 30,486 students). The student response rate was 84.2 %, with less than 1 % missing or invalid data, which were excluded from the final analyses.

Figure 1 shows the distribution of the proportions of students that currently binge drink, use cannabis, e-cigarettes, and cigarettes in each school, along with the pairwise Spearman correlation coefficient (R) and p-value (p). The proportion of current binge drinkers per school ranged from 2 to 41 %. Most schools had 9–19 % of students using cannabis, 18–30 % using e-cigarettes, and 5–11 % using cigarettes. A strong positive correlation was evident across all pairs of substance use, indicating that schools with higher rates of binge drinking also had higher rates of cannabis, e-cigarette, and cigarette use. The strongest correlation was observed between binge drinking and e-cigarette use (R=0.771, p<0.001), while the weakest correlation was observed between binge drinking and cannabis use (R=0.506, p<0.001).

Figure 1:

The distribution of substance use proportions in each school and the relationship between each pair of substance use behaviours: binge drinking (Binge), cannabis, e-cigarette, and cigarette. R is the Spearman correlation coefficient, and p is the p-value for the hypothesis test, H 0:R=0 vs. H 1:R≠0.

Figure 2 illustrates the relationship between school size and the proportion of students involved in each substance, with a locally estimated scatterplot smoothing (LOESS) curve. Each substance use rate was higher in smaller schools and lower in larger ones, indicating that the data were subject to ICS for each substance use outcome. Binge drinking and school size had a Spearman correlation coefficient (R) of −0.248 (p=0.004). Cannabis use and school size had a similar R of −0.247 (p=0.004). E-cigarette use and school size had an R of −0.217 (p=0.011), and cigarette use and school size had an R of −0.467 (p<0.001).

Figure 2:

The relationships between proportions of binge drinkers, cannabis users, e-cigarette users, and cigarette users in each school vs. school enrollment size. R is the Spearman correlation coefficient and p is the p-value for the hypothesis test, H 0:R=0 vs. H 1:R≠0.

Summary of student characteristics

Table 1 summarizes the characteristics of the 74,075 student respondents. The median age was 15 years, with 49.1 % female. Most identified as White (68.5 %), followed by Other (14.1 %), Asian (10.0 %), Black (3.9 %), and Hispanic or Latin American (2.5 %). Over 70 % reported alcohol consumption, with 17.2 % as current binge drinkers. Cannabis, e-cigarette, and cigarette use were reported by 12.9 %, 23.9 %, and 7.4 % of students, respectively.

Table 1:

Summary of student characteristics, COMPASS 2018/19.

Student characteristics | Total (n=74,075)
Age, median [Q1, Q3] | 15 [14, 16]
  Missing, n (%) | 539 (0.7 %)
Sex, n (%) |
  Female | 36,371 (49.1 %)
  Male | 36,882 (49.8 %)
  Missing | 822 (1.1 %)
Ethnicity, n (%) |
  White | 50,764 (68.5 %)
  Asian | 7,429 (10.0 %)
  Black | 2,921 (3.9 %)
  Hispanic/Latin American | 1,864 (2.5 %)
  Other (including multiple ethnicities) | 10,439 (14.1 %)
  Missing | 658 (0.9 %)
Connectedness score, median [Q1, Q3] | 18.0 [16.0, 21.0]
Weekly personal spending money, n (%) |
  None | 11,621 (15.7 %)
  $1–20 | 17,647 (23.8 %)
  $21–100 | 16,701 (22.5 %)
  > $100 | 14,098 (19.0 %)
  Not sure | 12,932 (17.5 %)
  Missing | 1,076 (1.5 %)
Urbanicity of school location, n (%) |
  Rural/Small urban | 26,347 (35.6 %)
  Medium/Large urban | 47,728 (64.4 %)
Substance use, n (%) |
  Current alcohol user | 52,648 (71.1 %)
  Current binge drinker | 12,752 (17.2 %)
  Current cannabis user | 9,566 (12.9 %)
  Current e-cigarette user | 17,716 (23.9 %)
  Current cigarette user | 5,463 (7.4 %)

Results from full multivariate analyses and univariate analyses

The results of the full multivariate and univariate CWGEE along with the full multivariate and univariate GEE are compared in Figure 3. Each point represents the odds ratio (OR) of each type of school program for each type of student substance behaviour, and the horizontal line shows the 95 % confidence interval (CI). The exact ORs and 95 % CIs, and those for other covariates are available in Table S1. All analyses were performed on complete cases with no missing covariates (n=72,392). In general, the results from the multivariate and univariate CWGEE were similar and the ORs from the multivariate and univariate GEE were similar but the univariate GEE yielded slightly wider 95 % CIs.
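For readers who want to relate the plotted ORs to the underlying log-odds coefficients, a minimal sketch follows. It assumes Wald-type intervals computed on the log-odds scale from a coefficient and its standard error, which the text does not state explicitly; the function names and numeric inputs are hypothetical and are not estimates from this study.

```python
import math

# Standard normal quantile for a two-sided 95 % interval.
Z_975 = 1.959963984540054


def odds_ratio_ci(beta, se, z=Z_975):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error."""
    return (math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se))


def outcome_specific_or(beta_reference, beta_additive):
    """OR for a non-reference outcome under Equation (1): exp(beta + beta^k)."""
    return math.exp(beta_reference + beta_additive)


# Hypothetical coefficient and standard error for one school program.
or_point, or_lower, or_upper = odds_ratio_ci(beta=-0.2, se=0.1)
print(round(or_point, 3), round(or_lower, 3), round(or_upper, 3))

# Hypothetical reference-outcome effect combined with an additive effect.
or_combined = outcome_specific_or(math.log(2.0), math.log(1.5))
print(round(or_combined, 3))
```

An interval for the combined effect would additionally need the covariance between the two coefficients, which this sketch does not cover.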

Figure 3:

Effects of each school program on each substance use shown in odds ratios and 95 % CIs across four methods: 1. Multivariate CWGEE (orange); 2. Univariate CWGEE (red); 3. Multivariate GEE (green); 4. Univariate GEE with the exchangeable working correlation structure (blue).

Each school program had different effects on each type of substance use. All methods showed that school programs on alcohol had a null effect on student binge drinking and e-cigarette use. The multivariate and univariate CWGEE both yielded higher odds of student cannabis use in schools with an alcohol program with an OR (95 % CI) of 1.22 (1.00, 1.49) and 1.22 (1.01, 1.49), respectively, while the ORs from the multivariate and univariate GEE analyses were closer to 1. Both the multivariate and univariate CWGEE also yielded higher odds of cigarette use in schools with an alcohol program with an OR of 1.27 (0.98, 1.64) and 1.26 (0.98, 1.64), respectively. In contrast, the multivariate GEE indicated a null association with an OR of 1.25 (0.93, 1.68), while the univariate GEE showed a harmful association with an OR of 1.36 (1.02, 1.80), potentially leading to conflicting conclusions between the multivariate and univariate analyses. The associations of the other school programs (cannabis, e-cigarette, and cigarette) with each substance use were similar between the multivariate and univariate methods, but some differences were observed between CWGEE and GEE, which are discussed with the results from the parsimonious multivariate models.

We also observed discrepancies between the multivariate and univariate analyses on the association of student ethnicity and cannabis and cigarette use. Within GEE, the univariate analysis indicated that the odds of Black students using cannabis were 1.29 (1.09, 1.53) times higher compared to White students, while the multivariate analysis yielded an OR of 1.17 (0.96, 1.42), potentially leading to a different conclusion. Similarly, the univariate GEE indicated that the odds of Black and Hispanic students using cigarettes were 1.45 (1.19, 1.77) and 1.22 (1.02, 1.47) times higher compared to White students, but the multivariate analysis yielded an OR (95 % CI) of 1.24 (0.97, 1.57) and 1.11 (0.89, 1.38), respectively (Table S1).

Results from parsimonious multivariate CWGEE and GEE

Under the GEE approach, the multivariate Wald test indicated that school alcohol and cannabis programs each had similar effects across the four student substance use outcomes. The effects of British Columbia (BC) and Alberta (AB), relative to Ontario (ON), were also similar across outcomes, leading to a parsimonious multivariate GEE with common coefficients for school alcohol and cannabis programs and the provinces of BC and AB. Under the CWGEE approach, the effects of the school cigarette program and of Hispanic ethnicity were also similar across outcomes, resulting in common coefficients for school alcohol, cannabis, and cigarette programs, provinces BC and AB, and Hispanic ethnicity. p-values from the Wald tests with Benjamini-Hochberg adjustment are shown in Table S2.

Table 2 shows the OR estimates and 95 % CI of the school-level covariates from the parsimonious multivariate CWGEE and GEE analyses. The OR (95 % CI) of the student-level covariates (connectedness score, age, sex, ethnicity, weekly personal spending money) are shown in Table S3. Both analyses were performed on complete cases with no missing covariates (n=72,392).

Table 2:

Odds ratios and 95 % confidence intervals of school-level covariates from parsimonious multivariate CWGEE and GEE. ON was used as the reference level for the province. Each model was also adjusted for the student’s school connectedness score, age, gender, ethnicity, and weekly personal spending money.

Method | Covariate | Binge | Cannabis | E-cigarette | Cigarette
CWGEE | School program – alcohol | 1.13 (0.98, 1.30) (common to all four outcomes)
CWGEE | School program – cannabis | 0.83 (0.73, 0.95) (common to all four outcomes)
CWGEE | School program – E-cigarette | 0.89 (0.76, 1.04) | 0.95 (0.82, 1.10) | 0.97 (0.83, 1.13) | 0.83 (0.69, 1.00)
CWGEE | School program – cigarette | 1.06 (0.94, 1.19) (common to all four outcomes)
CWGEE | Province – QC | 1.30 (1.12, 1.50) | 0.66 (0.56, 0.78) | 1.14 (1.00, 1.31) | 1.19 (0.94, 1.51)
CWGEE | Province – AB | 1.03 (0.83, 1.28) (common to all four outcomes)
CWGEE | Province – BC | 0.86 (0.70, 1.04) (common to all four outcomes)
CWGEE | Medium/Large urban | 0.63 (0.55, 0.71) | 0.87 (0.76, 1.00) | 0.80 (0.71, 0.90) | 0.53 (0.43, 0.64)
GEE | School program – alcohol | 1.04 (0.89, 1.21) (common to all four outcomes)
GEE | School program – cannabis | 0.92 (0.79, 1.07) (common to all four outcomes)
GEE | School program – E-cigarette | 0.88 (0.75, 1.03) | 1.00 (0.85, 1.19) | 1.01 (0.85, 1.21) | 0.82 (0.65, 1.03)
GEE | School program – cigarette | 1.16 (1.01, 1.33) | 1.05 (0.89, 1.24) | 1.04 (0.88, 1.22) | 1.27 (1.01, 1.60)
GEE | Province – QC | 1.32 (1.15, 1.51) | 0.60 (0.52, 0.69) | 1.07 (0.93, 1.22) | 1.01 (0.84, 1.22)
GEE | Province – AB | 1.13 (0.92, 1.39) (common to all four outcomes)
GEE | Province – BC | 0.87 (0.72, 1.05) (common to all four outcomes)
GEE | Medium/Large urban | 0.66 (0.59, 0.75) | 0.92 (0.81, 1.04) | 0.84 (0.74, 0.95) | 0.59 (0.49, 0.71)

We observed some differences between CWGEE and GEE on the effect of school substance programs. The CWGEE results indicated higher odds of each substance use in schools with an alcohol program, with an OR (95 % CI) of 1.13 (0.98, 1.30), while the GEE results yielded an OR of 1.04 (0.89, 1.21), which could lead to different conclusions between CWGEE and GEE. In addition, the CWGEE results indicated lower odds of each substance use in schools with a cannabis program, with an OR of 0.83 (0.73, 0.95), while GEE produced an OR of 0.92 (0.79, 1.07), again potentially leading to different conclusions. School e-cigarette programs had varying effects on each substance use. Under CWGEE, the OR was 0.89 (0.76, 1.04) for binge drinking and 0.83 (0.69, 1.00) for cigarette use. GEE yielded similar results with an OR of 0.88 (0.75, 1.03) for binge drinking and 0.82 (0.65, 1.03) for cigarette use. Under CWGEE, school cigarette programs had a null effect on each substance use, while GEE indicated higher odds of binge drinking and cigarette use with ORs of 1.16 (1.01, 1.33) and 1.27 (1.01, 1.60), respectively.

Both methods showed higher odds of binge drinking and lower odds of cannabis use in QC compared to ON. Under CWGEE, we observed slightly higher odds of e-cigarette and cigarette use in QC with OR of 1.14 (1.00, 1.31) and 1.19 (0.94, 1.51), respectively, but under GEE the ORs were close to 1.00. Both CWGEE and GEE showed that students in BC had lower odds of substance use compared to students in ON. Regarding the school setting, both methods agreed that students in medium and large urban schools had lower odds of substance use than students in rural and small urban schools.

Male gender, older age, and more weekly spending money were associated with higher odds of each type of substance use, while feeling more connected to school and the Asian ethnicity were associated with lower odds in both CWGEE and GEE (Table S3), which is consistent with previous findings. Black students had lower odds of binge drinking and e-cigarette use but slightly higher odds of cannabis use and cigarette use compared to White students in both CWGEE and GEE (Table S3). Students in the “Other” ethnicity category also had higher odds of using cannabis and cigarettes compared to White students (Table S3).

Table 3 shows the estimates of the correlation coefficients from the multivariate CWGEE (upper triangular portion) and multivariate GEE (lower triangular portion) assuming the unstructured working correlation structure. Estimates were similar between the two methods and all pairs of outcomes showed moderately strong positive correlations. The highest correlation was between cannabis and cigarette use (0.418 with CWGEE and 0.420 with GEE), while the lowest was between binge drinking and cigarette use (0.274 with CWGEE and 0.263 with GEE).

Table 3:

Estimates of the correlation coefficients between each pair of substance use behaviours assuming the unstructured working correlation structure: CWGEE estimates are shown in the upper triangular portion of the table and GEE estimates are shown in the lower triangular portion of the table.

 | Binge | Cannabis | E-cigarette | Cigarette
Binge | – | 0.328 | 0.304 | 0.274
Cannabis | 0.314 | – | 0.340 | 0.418
E-cigarette | 0.288 | 0.339 | – | 0.290
Cigarette | 0.263 | 0.420 | 0.292 | –

Discussion

We found a negative association between school size and student substance use, with smaller schools exhibiting higher substance use rates. We also observed positive correlations between different substance use, indicating that schools with high rates of one substance tended to have high rates of others. These findings motivated us to use the multivariate CWGEE, given the presence of ICS and positive correlations between the outcomes. Unweighted GEE with exchangeable working correlation, a method commonly used in school-based analyses, has been shown to yield biased estimates if cluster size is informative on the outcome, and applying a univariate model to each correlated outcome may be less efficient [21].

Although most of the results were in agreement between multivariate and univariate analyses, we observed some differences in inference that may have led to different conclusions based on the method. With multivariate CWGEE, we found that school alcohol, cannabis, and cigarette programs similarly influenced all substance use, while school e-cigarette programs had different effects on each substance. These associations would not be detected by fitting a separate model for each substance use. In addition, the multivariate approach may be useful in estimating the model-based correlation between each pair of substance use after adjusting for covariates. For example, based on the school-level descriptive analysis, binge drinking and e-cigarette use had the highest correlation, but based on the model-based correlation estimates from the multivariate CWGEE, cannabis and cigarette use had the highest correlation.

Previous studies have suggested that school substance programs may be ineffective in lowering the rate of targeted substance use among students [4], [5], [6]. In our study, we also found that most school programs that aimed to prevent the use of a specific substance were not associated with decreased odds of students using that particular substance. Instead, we observed that some school programs had a statistically significant association with another type of student substance use. Based on multivariate CWGEE, students in schools with a cannabis program had lower odds of binge drinking and e-cigarette use and those in schools with an e-cigarette program had lower odds of cigarette use, but students in schools with an alcohol program had moderately higher odds of using each substance. Our findings suggest that interventions aimed at one substance may unintentionally influence the use of others by addressing shared health risks and behaviours, while also leading students to experiment with alternative substances. Understanding whether school programs truly reduce overall substance use or just alter consumption patterns is an important topic for future research. In addition, prevention and cessation strategies for youth may need to adopt a more comprehensive approach that addresses multiple substances simultaneously [28].

We used a limited number of covariates, which may not fully control for all potential confounders, including school neighbourhood characteristics such as socioeconomic status and student characteristics such as BMI. However, our objective was to investigate whether ICS is present in data that survey student substance use from multiple schools, and whether multivariate methods that account for ICS are needed to learn about the relationship between student substance use and exposures of interest. A more thorough investigation with longitudinal data and a comprehensive list of exposures and confounders is required to make novel scientific claims about risk factors for substance use in students. Another limitation is the lack of information on school programs, such as their duration and frequency. Our analytical approach also assumes that all students in the same school are exposed to the same program.

Our investigation revealed that ICS is present when the outcome is student-level substance use and students are clustered in schools. Analyzing such data with unweighted GEE could yield inaccurate parameter estimates and an incorrect assessment of school-based substance programs.
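To illustrate why weighting matters under ICS, the following is a minimal simulation sketch, not the analysis used in this study. It assumes a single binary school-level exposure, no other covariates, and an independence working correlation; in that simplified setting the (weighted) estimating equations are solved by the (weighted) proportion of users in each exposure group, so the odds ratio can be computed directly. All function names, school sizes, and effect sizes are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_schools(n_schools=200, seed=0):
    """Simulate school-level counts in which prevalence falls as school size grows."""
    rng = np.random.default_rng(seed)
    sizes = rng.integers(50, 1500, size=n_schools)   # hypothetical enrolment
    program = rng.integers(0, 2, size=n_schools)     # hypothetical program indicator (0/1)
    centred = sizes - sizes.mean()
    # Assumed mechanism: log-odds of use decrease with school size (ICS), and
    # the program effect weakens in larger schools.
    logit = -0.5 - 0.001 * centred + program * (-0.6 + 0.0006 * centred)
    prob = 1.0 / (1.0 + np.exp(-logit))
    cases = rng.binomial(sizes, prob)                # students reporting use
    return sizes, program, cases


def student_weights(sizes, cluster_weighted):
    """Per-student weight: 1 for ordinary GEE, 1 / n_i for cluster-weighted GEE."""
    sizes = np.asarray(sizes, dtype=float)
    return 1.0 / sizes if cluster_weighted else np.ones_like(sizes)


def prevalence(sizes, cases, cluster_weighted):
    """Weighted proportion of students reporting use."""
    w = student_weights(sizes, cluster_weighted)
    return np.sum(w * cases) / np.sum(w * sizes)


def odds_ratio(sizes, program, cases, cluster_weighted):
    """Marginal OR (program vs. no program) from the weighted group proportions."""
    sizes, program, cases = map(np.asarray, (sizes, program, cases))
    odds = []
    for group in (0, 1):
        mask = program == group
        p = prevalence(sizes[mask], cases[mask], cluster_weighted)
        odds.append(p / (1.0 - p))
    return odds[1] / odds[0]


sizes, program, cases = simulate_schools()
for label, cw in (("unweighted (typical student)", False),
                  ("cluster-weighted (typical school)", True)):
    print(label,
          "prevalence:", round(prevalence(sizes, cases, cw), 3),
          "OR:", round(odds_ratio(sizes, program, cases, cw), 3))
```

The unweighted estimate gives every student equal weight, so large schools dominate; the cluster-weighted estimate gives every school equal total weight. When the outcome depends on school size, the two target different quantities and need not agree.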

Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/em-2024-0028).

Footnotes

Research ethics: The University of Waterloo Office of Research Ethics and appropriate School Board committees approved all procedures for the COMPASS study. Secondary analysis of COMPASS data was approved by the University of Toronto Research Ethics Boards (RIS Human Protocol Number 44375).

Informed consent: Informed consent was obtained from all individuals included in this study, or their legal guardians or wards.

Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission. A.M. conceptualized the study, conducted the literature review, contributed to data interpretation, wrote the first draft of the manuscript, and contributed to editing the manuscript. Y.Z. conducted the literature review, cleaned the data, conducted data analysis, contributed to data interpretation, wrote the first draft of the manuscript, and contributed to editing the manuscript. S.L. contributed to the review and editing of the manuscript, planned and obtained funding for the data collection, and conceptualized the larger cost study. K.P. contributed to the review and editing of the manuscript and planned and obtained funding for the data collection. All authors supported the discussion, interpreted the data, critically reviewed the manuscript, and approved the final manuscript.

Use of Large Language Models, AI and Machine Learning Tools: None declared.

Conflict of interest: All other authors state no conflict of interest.

Research funding: The COMPASS study has been supported by a bridge grant from the CIHR Institute of Nutrition, Metabolism and Diabetes (INMD) through the “Obesity – Interventions to Prevent or Treat” priority funding awards (OOP-110788; awarded to S.L.), an operating grant from the CIHR Institute of Population and Public Health (IPPH) (MOP-114875; awarded to S.L.), a CIHR project grant (PJT-148562; awarded to S.L.), a CIHR bridge grant (PJT-149092; awarded to K.P./S.L.), a CIHR project grant (PJT-159693; awarded to K.P.), a research funding arrangement with Health Canada (#1617-HQ-000012; contract awarded to S.L.), a CIHR-Canadian Centre on Substance Use and Addiction (CCSA) team grant (OF7 B1-PCPEGT 410-10-9633; awarded to S.L.), and a project grant from the CIHR Institute of Population and Public Health (IPPH) (PJT-180262; awarded to S.L. and K.P.). A SickKids Foundation New Investigator Grant, in partnership with CIHR Institute of Human Development, Child and Youth Health (IHDCYH) (Grant No. NI21-1193; awarded to K.P.) funds a mixed methods study examining the impact of the COVID-19 pandemic on youth mental health, leveraging COMPASS study data. The COMPASS-Quebec project additionally benefits from funding from the Ministère de la Santé et des Services sociaux of the province of Québec, and the Direction régionale de santé publique du CIUSSS de la Capitale-Nationale. The research conducted for this paper was partially supported by the Natural Sciences and Engineering Research Council of Canada - Discovery Grants Program (RGPIN-2022-05356; awarded to A.M.). K.P. is supported by the Canada Research Chairs program.

Data availability: COMPASS study data are available upon request through completion and approval of an online form: https://uwaterloo.ca/compass-system/information-researchers/data-usage-application.

References

  • 1. Hopfer C. Implications of marijuana legalization for adolescent substance use. Subst Abuse. 2014;35:331–5. doi: 10.1080/08897077.2014.943386.
  • 2. Lim CC, Sun T, Leung J, Chung JY, Gartner C, Connor J, et al. Prevalence of adolescent cannabis vaping: a systematic review and meta-analysis of US and Canadian studies. JAMA Pediatr. 2022;176:42–51. doi: 10.1001/jamapediatrics.2021.4102.
  • 3. Alarcó-Rosales R, Sánchez-SanSegundo M, Ferrer-Cascales R, Albaladejo-Blazquez N, Lordan O, Zaragoza-Martí A. Effects of a school-based intervention for preventing substance use among adolescents at risk of academic failure: a pilot study of the reasoning and rehabilitation V2 program. Healthcare. 2021;9:1488. doi: 10.3390/healthcare9111488.
  • 4. Evans-Whipp T, Beyers JM, Lloyd S, Lafazia AN, Toumbourou JW, Arthur MW, et al. A review of school drug policies and their impact on youth substance use. Health Promot Int. 2004;19:227–34. doi: 10.1093/heapro/dah210.
  • 5. Burnett T, Battista K, Butt M, Sherifali D, Leatherdale ST, Dobbins M. The association between public health engagement in school-based substance use prevention programs and student alcohol, cannabis, e-cigarette and cigarette use. Can J Public Health. 2023;114:94–103. doi: 10.17269/s41997-022-00655-3.
  • 6. Williams GC, Cole AG, de Groh M, Jiang Y, Leatherdale ST. More support needed: evaluating the impact of school e-cigarette prevention and cessation programs on e-cigarette initiation among a sample of Canadian secondary school students. Prev Med. 2022;155:106924. doi: 10.1016/j.ypmed.2021.106924.
  • 7. Liu XQ, Guo YX, Wang X. Delivering substance use prevention interventions for adolescents in educational settings: a scoping review. World J Psychiatr. 2023;13:409. doi: 10.5498/wjp.v13.i7.409.
  • 8. Yang Y, Butt ZA, Leatherdale ST, Morita PP, Wong A, Rosella L, et al. Exploring the dynamic transitions of polysubstance use patterns among Canadian youth using latent Markov models on COMPASS data. Lancet Reg Health – Am. 2022;16:100389. doi: 10.1016/j.lana.2022.100389.
  • 9. Fagan MJ, Duncan MJ, Bedi RP, Puterman E, Leatherdale ST, Faulkner G. The prospective association between physical activity and initiation of current substance use among adolescents: examining the role of school connectedness. Ment Health Phys Act. 2023;24:100503. doi: 10.1016/j.mhpa.2023.100503.
  • 10. Fagan MJ, Duncan MJ, Bedi RP, Puterman E, Leatherdale ST, Faulkner G. Physical activity and substance use among Canadian adolescents: examining the moderating role of school connectedness. Front Public Health. 2022;10:889987. doi: 10.3389/fpubh.2022.889987.
  • 11. Williams GC, Burns KE, Battista K, de Groh M, Jiang Y, Leatherdale ST. High school intramural participation and substance use: a longitudinal analysis of COMPASS data. Subst Use Misuse. 2021;56:1108–18. doi: 10.1080/10826084.2021.1901932.
  • 12. Kristjansson AL, Sigfusdottir ID, Allegrante JP. Adolescent substance use and peer use: a multilevel analysis of cross-sectional population data. Subst Abuse Treat Prev Pol. 2013;8:1–10. doi: 10.1186/1747-597x-8-27.
  • 13. Doggett A, Godin KM, Schell O, Wong SL, Jiang Y, Leatherdale ST. Assessing the impact of sports and recreation facility density within school neighbourhoods on Canadian adolescents’ substance use behaviours: quasi-experimental evidence from the COMPASS study, 2015–2018. BMJ Open. 2021;11:e046171. doi: 10.1136/bmjopen-2020-046171.
  • 14. Williams GC, Burns KE, Battista K, de Groh M, Jiang Y, Leatherdale ST. High school sport participation and substance use: a cross-sectional analysis of students from the COMPASS study. Addict Behav Rep. 2020;12:100298. doi: 10.1016/j.abrep.2020.100298.
  • 15. Seaman S, Pavlou M, Copas A. Review of methods for handling confounding by cluster and informative cluster size in clustered data. Statistics Med. 2014;33:5371–87. doi: 10.1002/sim.6277.
  • 16. Kahan BC, Li F, Copas AJ, Harhay MO. Estimands in cluster-randomized trials: choosing analyses that answer the right question. Int J Epidemiol. 2022;52:107–18. doi: 10.1093/ije/dyac131.
  • 17. Hoffman EB, Sen PK, Weinberg CR. Within-cluster resampling. Biometrika. 2001;88:1121–34. doi: 10.1093/biomet/88.4.1121.
  • 18. Williamson JM, Datta S, Satten GA. Marginal analyses of clustered data when cluster size is informative. Biometrics. 2003;59:36–42. doi: 10.1111/1541-0420.00005.
  • 19. O’Malley PM, Johnston LD, Bachman JG, Schulenberg JE, Kumar R. How substance use differs among American secondary schools. Prev Sci. 2006;7:409–20. doi: 10.1007/s11121-006-0050-5.
  • 20. National Center on Addiction and Substance Abuse at Columbia University. National survey of American attitudes on substance abuse VIII: teens and parents. New York, NY: Columbia University; 2003.
  • 21. Mitani AA, Kaye EK, Nelson KP. Marginal analysis of multiple outcomes with informative cluster size. Biometrics. 2021;77:271–82. doi: 10.1111/biom.13241.
  • 22. Leatherdale ST, Brown KS, Carson V, Childs RA, Dubin JA, Elliott SJ, et al. The COMPASS study: a longitudinal hierarchical research platform for evaluating natural experiments related to changes in school-level programs, policies and built environment resources. BMC Public Health. 2014;14:1–7. doi: 10.1186/1471-2458-14-331.
  • 23. White VM, Hill DJ, Effendi Y. How does active parental consent influence the findings of drug-use surveys in schools? Eval Rev. 2004;28:246–60. doi: 10.1177/0193841x03259549.
  • 24. Bond L, Butler H, Thomas L, Carlin J, Glover S, Bowes G, et al. Social and school connectedness in early secondary school as predictors of late teenage substance use, mental health, and academic outcomes. J Adolesc Health. 2007;40:357-e9. doi: 10.1016/j.jadohealth.2006.10.013.
  • 25. McNeely CA, Nonnemaker JM, Blum RW. Promoting school connectedness: evidence from the national longitudinal study of adolescent health. J Sch Health. 2002;72:138–46. doi: 10.1111/j.1746-1561.2002.tb06533.x.
  • 26. Weatherson KA, O’Neill M, Lau EY, Qian W, Leatherdale ST, Faulkner GEJ. The protective effects of school connectedness on substance use and physical activity. J Adolesc Health. 2018;63:724–31. doi: 10.1016/j.jadohealth.2018.07.002.
  • 27. Chaganty NR. An alternative approach to the analysis of longitudinal data via generalized estimating equations. J Stat Plann Inference. 1997;63:39–54. doi: 10.1016/s0378-3758-96-00203-0.
  • 28. Akre C, Michaud PA, Berchtold A, Suris JC. Cannabis and tobacco use: where are the boundaries? A qualitative study on cannabis consumption modes among adolescents. Health Educ Res. 2010;25:74–82. doi: 10.1093/her/cyp027.

Articles from Epidemiologic Methods are provided here courtesy of De Gruyter
