BMJ Open. 2018 Oct 2;8(10):e022626. doi: 10.1136/bmjopen-2018-022626

Getting the most out of intensive longitudinal data: a methodological review of workload–injury studies

Johann Windt 1,2,3, Clare L Ardern 4,5, Tim J Gabbett 6,7, Karim M Khan 1,8, Chad E Cook 9, Ben C Sporer 8,10, Bruno D Zumbo 11
PMCID: PMC6169745  PMID: 30282683

Abstract

Objectives

To systematically identify and qualitatively review the statistical approaches used in prospective cohort studies of team sports that reported intensive longitudinal data (ILD) (>20 observations per athlete) and examined the relationship between athletic workloads and injuries. Since longitudinal research can be improved by aligning the (1) theoretical model, (2) temporal design and (3) statistical approach, we reviewed the statistical approaches used in these studies to evaluate how closely they aligned these three components.

Design

Methodological review.

Methods

After finding 6 systematic reviews and 1 consensus statement in our systematic search, we extracted 34 original prospective cohort studies of team sports that reported ILD (>20 observations per athlete) and examined the relationship between athletic workloads and injuries. Using Professor Linda Collins’ three-part framework of aligning the theoretical model, temporal design and statistical approach, we qualitatively assessed how well the statistical approaches aligned with the intensive longitudinal nature of the data, and with the underlying theoretical model. Finally, we discussed the implications of each statistical approach and provided recommendations for future research.

Results

Statistical methods such as correlations, t-tests and simple linear/logistic regression were commonly used. However, these methods did not adequately address the (1) themes of theoretical models underlying workloads and injury, nor the (2) temporal design challenges (ILD). Although time-to-event analyses (eg, Cox proportional hazards and frailty models) and multilevel modelling are better-suited for ILD, these were used in fewer than 10% of the studies (n=3).

Conclusions

Rapidly accelerating availability of ILD is the norm in many fields of healthcare delivery and thus health research. These data present an opportunity to better address research questions, especially when appropriate statistical analyses are chosen.

Keywords: methodology, workloads, training load, athletic injury, statistics


Strengths and limitations of this study.

  • As intensive longitudinal data become increasingly common across disciplines, catalysed by technological advances, this methodological review provides researchers with several considerations when determining how to analyse these data.

  • Whereas systematic reviews provide a quantitative synthesis of research findings, they do not account for the statistical approaches used in the original studies. Therefore, methodological reviews like this one fill an important void in the literature to highlight key shortcomings and ways forward from a methodological and statistical perspective.

  • By choosing a homogenous group of papers—prospective cohort studies in team sports that collected intensive longitudinal data—we were able to focus more directly on the statistical analyses that the authors employed.

  • It was beyond the scope of this review to list every challenge posed by intensive longitudinal data, and we are not exhaustive in our discussion of different analyses and their capacity to handle the challenges that we did highlight.

Introduction

Intensive longitudinal data (ILD) are being collected more frequently in various research areas,1 catalysed by technological advancements that simplify data collection and analysis.2 By collecting data repeatedly on the same participants, researchers can answer more detailed research questions, particularly regarding phenomena that change or fluctuate over time. However, arriving at these answers requires researchers to overcome the challenges of analysing ILD, which include: (1) the dependencies created by repeated measures, (2) missing/unbalanced data, (3) separating between-person and within-person effects, (4) time-varying and time-invariant (stable) factors and (5) specifying the role of time/temporality.3

The field of exercise and sports medicine provides one specific example which can illustrate principles that apply to the use of ILD broadly. In the field of sports performance, technological advances mean that a plethora of physiological, psychological and physical data are conveniently available from athletes.4 5 As one example, of 48 professional football clubs that responded to a survey on player monitoring, 100% reported collecting daily global positioning system (GPS) and heart rate (HR) data.6

One research question that has gained a great deal of interest in the last decade is how athletes’ training and competition workloads relate to injury risk. Since athletes’ training and injury risk continually vary over time, many researchers have used prospective cohort studies to collect and analyse ILD to answer this question.7 There is moderate evidence from systematic reviews and an International Olympic Committee (IOC) consensus statement suggesting a positive relationship between injury rates and high training workloads, increased risk of injury with low workloads and a pronounced increase in injury risk associated with rapid workload increases.7–11 However, such systematic reviews do not consider the statistical approaches used in included studies.12 Choosing the wrong statistical analysis or poorly implementing an otherwise correct one (eg, violating statistical assumptions) can bias results and lead to false conclusions. Even a perfectly performed systematic review cannot compensate for poorly designed or poorly analysed studies.13

Longitudinal data analysis is most effective when the chosen statistical approach aligns with the frequency of data collection and with the theoretical model underpinning the research question (box 1).14 Therefore, we used this lens to evaluate whether the statistical models employed in prospective cohort studies using ILD to investigate the relation between athletic workloads and injury were optimal. We had three aims: (1) to summarise researchers’ data collection, methodological, statistical and reporting practices12 15; (2) to evaluate the degree to which the adopted statistical analyses fit within Collins’ threefold alignment (box 1) and (3) to provide recommendations for future investigations in the field.

Box 1. Theoretical model, temporal design, statistical model.

In a landmark, highly cited paper, Professor Linda Collins described how aligning the (1) theoretical model (subject matter theory), (2) temporal design (data collection strategy/timing), and (3) statistical model (analytical strategy) is crucial when analysing longitudinal data.14 For example, if researchers (1) theorise that a given physiological variable fluctuates every hour, (2) data must be collected at least on an hourly basis. If researchers measure participants once a day, they will miss virtually all the hourly fluctuations that their theories predict. Once researchers have collected their hourly data, they should (3) select a statistical strategy that enables them to examine the relationship between these fluctuations and the outcome of interest. As Collins noted, perfect alignment of these three components may not be possible, but it provides researchers a target, and readers a lens through which longitudinal research can be evaluated.
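The hourly example in box 1 can be made concrete with a toy simulation. The sketch below is purely illustrative and assumes a hypothetical variable that flips between a high and a low value every hour; the function names and values are not from the original paper. It compares what an hourly and a once-daily temporal design would observe over one week.

```python
# Hypothetical illustration: a variable that alternates between a high and a
# low value every hour, observed under two temporal designs.

def hourly_signal(hour: int) -> float:
    """Toy physiological variable: 1.0 on even hours, -1.0 on odd hours."""
    return 1.0 if hour % 2 == 0 else -1.0


def sample(every_n_hours: int, total_hours: int) -> list:
    """Observe the signal once every `every_n_hours` hours."""
    return [hourly_signal(h) for h in range(0, total_hours, every_n_hours)]


def value_range(values: list) -> float:
    """Spread of the observed values (maximum minus minimum)."""
    return max(values) - min(values)


week = 7 * 24
hourly_obs = sample(1, week)   # temporal design matches the theorised fluctuation
daily_obs = sample(24, week)   # one measurement per day, always at the same hour

print(len(hourly_obs), value_range(hourly_obs))  # 168 observations, range 2.0
print(len(daily_obs), value_range(daily_obs))    # 7 observations, range 0.0
```

Under the daily design the variable appears constant, even though it changes every hour, so no statistical model applied afterwards could recover the theorised fluctuation.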

Methods

Article selection

We systematically searched the literature (MEDLINE, CINAHL, SPORTDISCUS, PsycINFO and EMBASE) (10 December 2016) to identify systematic reviews and consensus statements that investigated the relationship between workloads and athletic injuries, with the aim of extracting all original articles included in these reviews that met our inclusion criteria. A summary of the systematic search and article selection process is described in online supplementary appendix 1 (table A1 and figure A1), and the full systematic search is available from the authors.

A priori, we operationally defined ‘workload’ as either external (the amount of work completed by the athlete, eg, distance run, hours completed, etc) or internal (the athlete’s response to a given external workload, eg, session rating of perceived exertion, HR-based measures, etc). We acknowledge that athlete self-reported measures often evaluate how athletes are handling training demands and may be referred to as ‘internal’ load measures, but we considered these perceptual well-being measures as a distinct step from quantifying athletes’ internal or external workloads.16 Athletic injuries have been diversely defined in the literature, so we included any article that reported measuring ‘injury’, regardless of the specific definition used (eg, time loss, medical attention, etc).

Two authors (JW and TG) screened the titles/abstracts of the systematic reviews. Where necessary, the full texts were retrieved to determine whether they should be included. A total of six systematic reviews7–10 17 18 and one consensus statement11 were identified that included at least one article meeting the inclusion criteria.

We extracted and reviewed the full texts of all the original studies included (n=279) in these seven papers. For our analysis, we included all the original articles that met the following criteria:

  1. Original articles were prospective cohort studies that examined the relationship between at least one measure of internal or external workload (as defined above) and athletic injury. Since theoretical models describe the recursive nature of injury risk with each training or competition exposure, workloads had to be continually monitored and include both training and match workloads for the same athletes. Although some athletes may have entered or left the group during the study period (eg, through retirement or trades to other teams), the same team/group of athletes had to be followed throughout the study period, as opposed to repeated cross-sectional snapshots of different cohorts.

  2. Articles collected ILD. We defined ILD as >20 observations per athlete.14

  3. Articles studied team sport athletes. We chose team sports because (1) there are high amounts of ILD collected in applied team sport settings6 and (2) the majority of workload–injury studies are in team sport athletes.7 Military populations and individual sports (eg, distance running) were excluded due to the differences in task requirements and operating environment.

Patient and public involvement

As a methodological review, there was no patient or public involvement in this current investigation.

Article coding and description

To describe the methodological, statistical and reporting approaches used in each article, two authors (JW and CA) reviewed all the included papers and extracted 50 items of information for each article. These items included publication year, journal, variable operationalisation (eg, internal vs external load measures, injury definition, etc), methodological approaches, statistical analyses implemented, reported findings and more. To ensure consistency between coders, 10 articles were randomly selected, coded independently by both reviewers and compared to assess agreement. Discrepancies were discussed by the two coders and an additional five articles were randomly selected and coded independently. The remaining articles were coded by JW and checked by CA.

Assessing how statistical models aligned with Collins’ threefold framework

To evaluate the statistical approaches used in this field, we first identified the key themes and challenges within the theoretical models and temporal design features within the workload–injury field, then developed a qualitative assessment to evaluate the statistical approaches.

Collins' component 1: the theoretical models that underpin athletic workloads and injury risk (in brief)

Briefly, we identified at least three key elements of athletic injury aetiology models. First, sports injuries are multifactorial.19–21 Aetiology models since 1994 have all explained between-athlete differences in injury risk by identifying a host of ‘internal’ (eg, athlete characteristics, psychological well-being, previous injury) and ‘external’ (eg, opponent behaviour, playing surface) risk factors. More recently, the dynamic recursive model by Meeuwisse et al 22 and the workload–injury aetiology model23 have highlighted the recurrent nature of injury risk, meaning each athlete’s injury risk (ie, within-athlete risk) also fluctuates continually as they train or compete in their sport (figure 1). Thus, a second theme is that injury risk differs between-athletes and within-athletes. Finally, more recent injury aetiology models have highlighted injury risk as a complex, dynamic system (figure 2).24 25 Complex systems, as in weather forecasting or biological systems,26 27 possess many key features, including an open-system, inherent non-linearity between variables and outcomes, recursive loops where the system output becomes the new system input, self-organisation where regular patterns (risk profiles) may emerge for given outcomes (emergent pattern) and uncertainty.24

Figure 1.

The workload–injury aetiology model. Key features include the multifactorial nature of injury, between-athlete and within-athlete differences in risk and a recursive loop.

Figure 2.

Complex systems model of athletic injury. Webs of determinants are shown for an anterior cruciate ligament (ACL) injury in basketball players (A), and in a ballet dancer (B).

Collins’ component 2: temporal design/data collection

The theoretical models relating workloads and injury illustrate a continuously fluctuating injury risk, with many variables that influence risk on a daily or weekly basis.22–24 Thus, if researchers want to investigate the association between workloads and injuries, these data must be collected frequently enough to observe changes in these variables as they occur (temporal design). With technological advances, athletes’ physiological, psychological and physical variables are now often collected on a daily, weekly or monthly basis, along with ongoing injury surveillance data.4 5 Therefore, in the workload–injury field, the theoretical models (injury aetiology models that describe regular fluctuation in workloads and injury risk) and the temporal design (frequent, often daily, data collection) are often well-aligned, especially in prospective cohort studies using ILD. This leaves us to consider only whether Professor Collins’ third component—the statistical model—aligns with these first two.

Collins’ component 3: statistical model

From the theoretical aetiology models underpinning the workload–injury association, we highlighted three key themes to consider when choosing a statistical model: (1) injury risk is multifactorial, (2) between-athlete and within-athlete differences in injury risk fluctuate regularly and (3) injury risk may be considered a complex, dynamic system.

From a temporal design perspective, ILD are necessary to address these key themes, but they also carry at least five challenges that influence the choice of the statistical model.

  1. Differentiating between-person and within-person effects.

  2. ILD include time-varying variables (eg, workloads) and may also incorporate stable (time-invariant) variables (eg, sex).

  3. The ‘dependency’ created by repeated measurements of the same individuals violates the assumption of ‘independence’ common to many traditional analyses.28 29

  4. Almost all longitudinal datasets have missing or unbalanced data.14

  5. Longitudinal data analysis requires researchers to consider the role of time in their analysis.3
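To illustrate the first challenge in the list above, one common way to separate between-person from within-person effects is person-mean centring. The following is a minimal sketch with hypothetical athletes and workload values; it is not taken from any of the reviewed studies. Each observation is split into the athlete's deviation from the group mean (between-person) and the week's deviation from that athlete's own mean (within-person).

```python
from statistics import mean

# Hypothetical weekly workloads (arbitrary units) for two athletes.
weekly_load = {
    "athlete_a": [400.0, 500.0, 600.0],
    "athlete_b": [800.0, 900.0, 1000.0],
}


def decompose(loads_by_athlete):
    """Split each observation into a between-person and a within-person part."""
    grand_mean = mean(v for loads in loads_by_athlete.values() for v in loads)
    rows = []
    for athlete, loads in loads_by_athlete.items():
        person_mean = mean(loads)
        for week, load in enumerate(loads, start=1):
            rows.append({
                "athlete": athlete,
                "week": week,
                # How this athlete's typical load compares with the group.
                "between": person_mean - grand_mean,
                # How this week compares with the athlete's own typical load.
                "within": load - person_mean,
            })
    return rows


rows = decompose(weekly_load)
for row in rows:
    print(row)
```

The two components could then enter a model as separate predictors, so that 'athletes who usually train more' and 'weeks in which an athlete trains more than usual' are not conflated.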

Evaluating statistical approaches

We deliberately tried to align components 1 and 2 of Collins’ framework by describing the theoretical models underpinning the workload–injury association and only including articles that had a temporal design characterised by ILD. To review whether statistical approaches aligned with these two components, two authors (JW and BZ) qualitatively assessed whether the statistical models, as employed in the included studies, (1) were multifactorial, (2) differentiated between-athlete and within-athlete differences in injury risk and (3) analysed the data as a dynamic system—the three themes highlighted in the theoretical framework. From the temporal design, the same two authors evaluated whether the statistical analyses (4) included both time-varying and time-invariant variables, (5) were robust to missing/unbalanced data, (6) addressed the dependencies created by repeated measures and (7) incorporated time into the analysis.

Data synthesis approach

We first describe the characteristics of the included articles, then present our qualitative assessment of how well the various statistical approaches fit within Collins’ framework.

Results

Thirty-four articles were included in this methodological review (see online supplementary appendix 1). In the first 10 articles coded by both reviewers, there were 10 discrepancies out of 500 total coded entries (10 papers×50 items/paper), which gave us 98% agreement between reviewers. No item had more than two discrepancies. Of the 250 study criteria in the second set of 5 articles coded by both reviewers, there were 8 discrepancies (97% agreement).
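The agreement figures above follow from simple arithmetic, sketched here as a check. The function name is illustrative only.

```python
def percent_agreement(discrepancies: int, total_items: int) -> float:
    """Percentage of coded items on which the two reviewers agreed."""
    return 100.0 * (total_items - discrepancies) / total_items


# First set: 10 papers x 50 items each, 10 discrepancies.
first_round = percent_agreement(10, 10 * 50)
# Second set: 5 papers x 50 items each, 8 discrepancies.
second_round = percent_agreement(8, 5 * 50)

print(first_round)          # 98.0
print(round(second_round))  # 97 (96.8 before rounding)
```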

Included articles were published from 2003 to 2016, with 78% of the studies published since 2010. Sports studied included rugby league (n=10), soccer (n=7), Australian football (n=6), cricket (n=5), rugby union (n=2), multiple sports (n=1) and basketball, handball and volleyball (n=1 each). Studies included an average of 96 athletes (median=46), ranging from 12 athletes30 to 502 athletes.31 The observation period for these cohort studies ranged from 14 weeks32 to 6 years.33 Most studies investigated male athletes (n=30), with two studies on female athletes and two on both sexes. Table 1 summarises the included articles’ basic characteristics, while the full data extraction table is available from the authors on request.

Table 1.

Summary of included workload–injury investigations, sorted by sport then publication year

Reference | Journal | Study length | Sport | n | Level | Sex
Rogalski et al 88 | J Sci Med Sport | 1 Season | AFL | 46 | Elite | Male
Colby et al 43 | JSCR | 1 Season | AFL | 46 | Professional | Male
Duhig et al 119 | BJSM | 2 Seasons | AFL | 51 | Professional | Male
Murray et al 120 | Scand J Med Sci Sports | 2 Seasons | AFL | 59 | Professional | Male
Murray et al 121 | IJSPP | 1 Season | AFL | 46 | Professional | Male
Veugelers et al 53 | J Sci Med Sport | 15 Weeks | AFL | 45 | Elite | Male
Anderson et al 30 | JSCR | 21 Weeks | Basketball | 12 | Subelite competitive | Female
Dennis et al 37 | J Sci Med Sport | 2 Seasons | Cricket | 90 | Professional | Male
Dennis et al 58 | J Sci Med Sport | 1 Season | Cricket | 12 | Professional | Male
Dennis et al 38 | BJSM | 2002–2003 cricket season | Cricket | 44 | Subelite competitive | Male
Saw et al 46 | BJSM | 1 Season | Cricket | 28 | Elite | Male
Hulin et al 33 | BJSM | 43 Individual seasons/6 years | Cricket | 28 | Professional | Male
Bresciani et al 45 | Eur J Sport Sci | 1 Season (40 weeks) | Handball | 14 | Elite | Male
Gabbett 81 | BJSM | 3 Years | Rugby league | 220 | Subelite | Male
Gabbett 89 | J Sports Sci | 1 Season | Rugby league | 79 | Semi-professional | Male
Gabbett and Domrow 59 | J Sports Sci | 2 Seasons | Rugby league | 183 | ‘Subelite’ | Male
Gabbett 56 | JSCR | 4 Years | Rugby league | 91 | Professional | Male
Gabbett and Jenkins 122 | J Sci Med Sport | 4 Years | Rugby league | 79 | Professional | Male
Gabbett and Ullah 66 | JSCR | 1 Season | Rugby league | 34 | Professional | Male
Hulin et al 123 | BJSM | 2 Seasons | Rugby league | 28 | Professional | Male
Hulin et al 47 | BJSM | 2 Seasons | Rugby league | 53 | Professional | Male
Windt et al 57 | BJSM | 1 Season | Rugby league | 30 | Elite | Male
Killen et al 32 | JSCR | 14 Weeks | Rugby league | 36 | Professional | Male
Brooks et al 31 | J Sports Sci | 2 Seasons | Rugby union | 502 | Professional | Male
Cross et al 52 | IJSPP | 1 Season | Rugby union | 173 | Professional | Male
Arnason et al 87 | AJSM | 1 Season | Soccer | 306 | Professional | Male
Brink et al 42 | BJSM | 2 Seasons | Soccer | 53 | Elite youth players | Male
Mallo and Dellal 35 | J Sports Med Phys Fitness | 2 Seasons (2007/2008 and 2008/2009) | Soccer | 35 | Professional (Spanish Division II) | Male
Clausen et al 39 | AJSM | 1 Season | Soccer | 498 | Recreational | Female
Owen et al 36 | JSCR | 2 Consecutive seasons | Soccer | 23 | Professional | Male
Bowen et al 41 | BJSM | 2 Seasons | Soccer | 32 | Elite youth players | Male
Ehrmann et al 82 | JSCR | 1 Season | Soccer | 19 | Professional | Male
Malisoux et al 48 | J Sci Med Sport | 41 Weeks | Varied | 154 | High-school | Both (65% males)
Visnes and Bahr 40 | Scand J Med Sci Sports | 4 Years (231 student-seasons) | Volleyball | 141 | Elite high school | Both (72 females, 69 males)

Data collection

Injury definitions

Injury definitions varied across articles, with exact wording outlined in the online supplementary appendix. In table 2, we have categorised the definitions into more discrete injury categories (and subcategories) in accordance with recognised consensus statements.34 Where studies used multiple injury definitions, we categorised them according to the definition used for the primary analysis.

Table 2.

Broad injury definitions used in workload–injury investigations

Injury definition N
Time-loss
 All time-loss 13
 Match time-loss 2
 Non-contact time-loss 7
 Non-contact match time-loss 1
Medical attention
 Medical attention 7
 Player-reported pain, soreness or discomfort 1
 Non-contact medical attention injuries 1
 Clinical diagnosis of jumper’s knee 1
Other
 Injury scale on the Recovery-Stress Questionnaire for Athletes 1

Subsequent or recurrent injuries

Of the 34 articles, 30 did not define or include subsequent or recurrent injuries. Of those that explicitly addressed subsequent injuries, two defined these injuries as those occurring at the same time and occurring by the same mechanism.35 36 Two articles explicitly stated that they only considered time until first injury, meaning no injuries were subsequent or recurrent.37 38

Workload definitions

Workload variables varied widely across articles and are summarised in table 3. For a more detailed description of each article’s load measures, see the online supplementary appendix. Many articles used workload metrics to derive additional variables from workload distribution over time (eg, monotony, strain, acute:chronic workload ratios).
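The derived variables mentioned above can be sketched as follows. This is a hedged illustration, not the method of any particular included study. It assumes commonly used definitions: monotony as the mean daily load divided by the SD of daily load within a week, strain as weekly load multiplied by monotony, and the acute:chronic workload ratio as the most recent week's load divided by the mean of the most recent four weeks. The choice of sample SD, the 4-week chronic window and all data values are assumptions.

```python
from statistics import mean, stdev


def weekly_total(daily_loads):
    """Total load accumulated over one week of daily values."""
    return sum(daily_loads)


def monotony(daily_loads):
    """Mean daily load divided by its SD (sample SD assumed here).

    Undefined (division by zero) if every day has the same load.
    """
    return mean(daily_loads) / stdev(daily_loads)


def strain(daily_loads):
    """Weekly load multiplied by monotony."""
    return weekly_total(daily_loads) * monotony(daily_loads)


def acute_chronic_ratio(weekly_totals, chronic_weeks=4):
    """Most recent week's load relative to the mean of the last `chronic_weeks`."""
    acute = weekly_totals[-1]
    chronic = mean(weekly_totals[-chronic_weeks:])
    return acute / chronic


# Hypothetical data: one week of daily loads and four weekly totals.
daily = [300, 0, 400, 0, 500, 0, 200]
weekly = [1000, 1000, 1000, 2000]

print(weekly_total(daily))
print(monotony(daily), strain(daily))
print(acute_chronic_ratio(weekly))
```

In this toy example the final week is double the preceding weeks, which gives a ratio above 1 and would be read as a rapid workload increase.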

Table 3.

Independent variables used in workload–injury investigations

Workload measure N
Internal
 sRPE 15
 Heart rate zones 2
External
 Balls bowled or pitched 5
 GPS/accelerometry 10
 Hours 6

If articles included more than one type of workload variable they are counted more than once. sRPE scores could be the original Foster scale or modified. sRPE is calculated as the product of session intensity on a 1–10 Borg Scale and activity duration in minutes.

GPS, global positioning system; sRPE, session-rating of perceived exertion.
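The sRPE calculation described in the table footnote is a simple product and can be sketched as below. The session values are hypothetical, and the range check assumes the 1–10 scale stated above (modified scales may differ).

```python
def session_load(rpe: float, duration_min: float) -> float:
    """Session-RPE load: perceived intensity multiplied by duration in minutes."""
    if not 1 <= rpe <= 10:
        raise ValueError("rpe is expected on a 1-10 scale")
    if duration_min < 0:
        raise ValueError("duration cannot be negative")
    return rpe * duration_min


# Hypothetical sessions for one athlete: (RPE, duration in minutes).
sessions = [(7, 90), (4, 60), (8, 45)]
loads = [session_load(rpe, minutes) for rpe, minutes in sessions]

print(loads)       # [630, 240, 360]
print(sum(loads))  # 1230 arbitrary units for the week
```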

Measurement frequency

Most included articles (n=32) collected workload data at every session that athletes completed, while two studies recorded workload on a weekly basis.39 40

Handling missing data

Twenty-three of the 34 articles (67%) did not report any strategies for missing data. Of those that did, five used listwise or casewise deletion, and six used estimation. Estimation methods for players with missing data included techniques such as: using the full team average values for the drills a player completed,41 using an individual’s mean weekly value42 and multiplying a player’s preseason per-minute match data by the number of minutes they played in a match.43
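Two of the strategies above, listwise deletion and estimation from an individual's mean weekly value, can be contrasted in a short sketch. The data and function names are hypothetical and the code is not drawn from any included study.

```python
from statistics import mean


def impute_individual_mean(weekly_loads):
    """Replace missing weeks (None) with the athlete's own mean observed load."""
    observed = [load for load in weekly_loads if load is not None]
    if not observed:
        raise ValueError("no observed weeks to estimate from")
    fill_value = mean(observed)
    return [fill_value if load is None else load for load in weekly_loads]


def listwise_delete(loads_by_athlete):
    """Keep only athletes with complete records (no missing weeks)."""
    return {
        athlete: loads
        for athlete, loads in loads_by_athlete.items()
        if all(load is not None for load in loads)
    }


# Hypothetical weekly loads; None marks a missing week.
squad = {
    "athlete_a": [500, None, 700],
    "athlete_b": [300, 400, 500],
}

imputed = {name: impute_individual_mean(loads) for name, loads in squad.items()}
complete_cases = listwise_delete(squad)

print(imputed)
print(complete_cases)
```

Listwise deletion discards athlete_a entirely, whereas mean imputation retains the athlete but flattens the week-to-week variation that workload–injury analyses are trying to study.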

Statistical analysis and reporting in included articles

Data binning/aggregation

Although 32 articles collected daily workload measurements, many aggregated data for analysis. Most (n=16) summed workload metrics for a total or average weekly workload. Three studies aggregated workload data for the entire year, three aggregated data into season periods, two aggregated data monthly and three used multiple aggregation strategies.

Analysis methods

Table 4 summarises the statistical practices of applied researchers investigating the relationship between workload and injury. Although some studies had analysed other primary or secondary objectives, we recorded only the analyses used to investigate the workload–injury relationship.

Table 4.

The number of studies using various statistical analysis techniques

Analytical method N
Regression modelling
 Logistic
  Regular 10
  Generalised estimating equation 5
  Multilevel 1
 Linear
  Regular 2
 Poisson
  Generalised estimating equation* 1
 Multinomial regression
  Regular 1
 Cox proportional hazards model 1
 Frailty model 1
Correlation
 Pearson 9
 Spearman 1
Relative risk/rate ratio† 8
T-tests
 Paired and independent samples 4
 Independent samples only 2
Χ2 tests 1
Repeated measures ANOVA (one-way or two-way) 5

If articles used more than one statistical method to analyse workload and injury, they are included more than once in the table. We only report analyses used to analyse workload–injury associations, not other analyses reported in the articles (eg, ANOVA to test for differences in total workloads at separate times of the season).

*Clausen et al 39 also report fitting multilevel models, but do not report any of the results—presenting only their GEE findings in their results and discussion.

†Relative risk here refers to the use of RR as a primary analysis based on risks in different categorical groups, not as an effect estimated from another model. For example, comparing risks among different load groups like Hulin et al 33 47 are counted here, whereas Gabbett and Ullah66 derived RR from their frailty model, and Clausen et al 39 derived RR from their Poisson model, but neither are included in the count for RR.

ANOVA, analysis of variance; GEE, generalised estimating equation.

Typical uses of statistical tools

Regression approaches were used most commonly (22/34 studies). The most common approach was logistic regression (binary injury status as the outcome variable), independently or jointly modelling workload variables as independent variables. Generalised estimating equations (GEEs) were used in five studies to account for the clustering of observations within players and were used very similarly to simple logistic regression approaches.

Correlation was the second most common method (10/34 studies). Most studies that used correlation (7/10) measured the association between weekly or monthly workloads and injury incidence at the team level. Of those that used correlation at the individual level, two compared the number of completed preseason sessions with the number of completed in-season sessions,23 44 while the final compared workload with injury operationalised as a numerical score on the injury subscale of the Recovery-Stress Questionnaire for Athletes.45

Relative risk approaches were generally used in one of two ways. First, workload categories were established for the entire year, like cricket bowlers who averaged <2, 2–2.99, 3–3.99, 4–4.99 or >5 days between bowling sessions up until an injury, or for the entire year if they did not sustain an injury.37 Risks were calculated as the number of injuries/number of athletes in a given group, and relative risks were calculated to compare across groups.46 In the second approach, athletes contributed exposures on a weekly basis, and thus contributed to multiple workload classifications. In this case, the likelihood/risk was the number of injuries/number of weekly player exposures to that workload category.33 41 47
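The second relative risk approach can be sketched as follows, with injury and exposure counts that are entirely hypothetical. Risk in each workload category is the number of injuries divided by the number of weekly player exposures in that category, and each risk is then expressed relative to a chosen reference category.

```python
def risk(injuries: int, exposures: int) -> float:
    """Injuries per weekly player exposure in one workload category."""
    return injuries / exposures


def relative_risks(counts, reference):
    """Risk in each category divided by the risk in the reference category."""
    reference_risk = risk(*counts[reference])
    return {name: risk(*c) / reference_risk for name, c in counts.items()}


# Hypothetical (injuries, weekly player exposures) per workload category.
counts = {
    "low": (5, 200),
    "moderate": (4, 200),
    "high": (12, 200),
}

rr = relative_risks(counts, reference="moderate")
for category, value in rr.items():
    print(category, round(value, 2))
```

Because the same athlete contributes exposures to several categories across the season, the exposures are not independent, which is one reason this approach does not address repeated-measure dependency.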

Group differences were sometimes evaluated using t-tests, analysis of variance (ANOVA) or Χ2 analyses. Typically, unpaired t-tests contrasted workload variables (eg, mean sessions/week) between athletes who sustained an injury during the year and those who did not.37 38 Paired t-tests and repeated measures ANOVAs (one-way or two-way) were most often used to contrast the workloads of the same athletes at different time periods. For example, workloads in an ‘injury block’ (like the week preceding an injury) were contrasted with non-injury blocks, like other weeks in the season,37 46 or the 4 weeks preceding the injury block.48

Justifications for statistical approaches

Authors of 15 of the included articles (44%) did not cite any sources to support their analytical choices. Of those who did, most (n=14) cited previous literature in the sports medicine field. Eight articles referenced statistics or methodology articles, four cited articles on Professor Will Hopkins’ website (www.sportssci.org) and three cited statistical textbooks.49–51

Addressing analysis assumptions and model fit

More than half (n=20) of the included articles did not report on the assumptions underlying their statistical analyses. Among those that did, the reported checks covered normality, collinearity of predictor variables in regression analyses,52 sphericity for repeated measures ANOVA,45 overdispersion39 or correlation structures for GEEs.53

When authors reported checking for normality, Shapiro-Wilk44 or Kolmogorov-Smirnov tests32 were referenced. Regression modelling was the most common analysis to investigate the workload–injury association. In 8/10 instances where simple logistic regression was chosen, the authors appear to have conducted the analyses using weekly observations without accounting for the dependencies created by repeated measures across players. In all instances where regression was used, it was uncommon for authors to report that model assumptions were checked. Where multiple regression approaches were used, multicollinearity checks were rarely reported, an important consideration since multicollinearity can cause imprecise estimates of regression coefficients when multiple workload variables are simultaneously modelled.52 54 55
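The dependency problem can be stated compactly. As a sketch under assumed notation (not taken from the original paper), let Y_ij indicate whether athlete i was injured in week j and x_ij be that week's workload. Simple logistic regression treats every athlete-week as independent, whereas a random-intercept (multilevel) logistic model adds an athlete-specific term that is shared across that athlete's weeks.

```latex
% Simple logistic regression: athlete-weeks treated as independent observations
\operatorname{logit}\,\Pr(Y_{ij} = 1) = \beta_0 + \beta_1 x_{ij}

% Random-intercept logistic regression: u_i is shared by all weeks of athlete i,
% which allows observations from the same athlete to be correlated
\operatorname{logit}\,\Pr(Y_{ij} = 1 \mid u_i) = \beta_0 + \beta_1 x_{ij} + u_i,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2)
```

When the between-athlete variance \sigma_u^2 is greater than zero, ignoring u_i typically yields standard errors that are too small for the repeated weekly observations.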

Of the papers that modelled data using regression or similar techniques, six described how they assessed model fit. Some authors assessed specificity/sensitivity, or receiver operating characteristics, either on the current data set40 or on a future data set.56 Other in-sample model fit indices included R2 values,36 Akaike Information Criteria and Bayesian Information Criteria, which were sometimes mentioned as guiding the model selection process.57
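For readers unfamiliar with these indices, the following sketch shows their standard definitions. The confusion-matrix counts, log-likelihood and parameter count are hypothetical inputs chosen only for illustration.

```python
import math


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of injured cases the model correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of non-injured cases the model correctly clears."""
    return true_neg / (true_neg + false_pos)


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k ln(n) - 2 ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical classification results and model summary.
print(sensitivity(true_pos=8, false_neg=2))   # 0.8
print(specificity(true_neg=90, false_pos=10)) # 0.9
print(aic(log_likelihood=-100.0, n_params=3)) # 206.0
print(bic(log_likelihood=-100.0, n_params=3, n_obs=100))
```

With repeated measures, the appropriate n_obs for BIC (athletes versus athlete-weeks) is itself a modelling decision, so the value used here is an assumption.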

Alignment of authors’ statistical models with theoretical model and temporal design challenges

In table 5 (see online supplementary appendix 2 (table A2)), we qualitatively evaluated whether the statistical approaches chosen by the authors in our current review effectively addressed the key themes/challenges presented by the theoretical model and the temporal design (ILD). This table is an analytical tool to guide the reader through the discussion. It highlights the themes/challenges of the theoretical model and temporal design, as well as the strengths/weaknesses of the statistical tools used in included studies. The table has the challenges/themes in columns and statistical tools in rows. The reader can follow a row to see how well a given statistical tool addressed key challenges as used by researchers in our included articles, or they can choose a challenge and follow the column down to see which analyses were used in a way that addressed that challenge adequately. The rows are ordered according to their qualitative ‘score’. As one proceeds down the rows, the statistical tools address more of the temporal design and theoretical model challenges.

Table 5.

Evaluation of the degree to which authors’ use of statistical tools addressed theoretical and temporal design challenges

Columns: Method; n; themes of theoretical model (multifactorial aetiology; between-athlete and within-athlete differences; complex system); themes of temporal design, intensive longitudinal data (includes time-varying and time-invariant variables; missing/unbalanced data*; repeated measure dependency; incorporates time into the analysis)
Correlation (Pearson and Spearman) 10 X X X X X X X
Unpaired t-test 6 X X X X X X X
Χ2 test 1 X X X X X X X
Relative risk calculations 8 O X X X X X X
Regression (logistic, linear, multinomial) 13 O X X X X X X
Paired t-test 2 X X X X X
Repeated measures ANOVA (one-way or two-way) 5 O O X O X
Generalised estimating equations (Poisson and logistic) 6 O X X O O
Cox proportional hazards model 1 X X X
Multilevel modelling 1 X X
Frailty model 1 X

Qualitative assessment performed on a three-tiered scale. An ‘X’ (red formatting) means that none of the authors using this tool adequately addressed that specific challenge. In some cases, this may be because the statistical model was unable to address it, and other times it may be because of the way they used it. An ‘O’ (yellow formatting) indicates that some authors addressed that challenge while others did not. This generally happened when the statistical tool could address a challenge but the authors sometimes chose not to use it in that way. A ‘✓’ (green formatting) indicates that all authors using this statistical tool addressed that challenge adequately.

*Missing/unbalanced data here refers to that caused by intensive longitudinal data, meaning a different number of observations for each athlete during the observation period, some of which may be missing.

ANOVA, analysis of variance.

We caution the reader that (1) not every possible statistical tool is included in the table, only those used in at least one article in our review, and (2) the evaluation is based on whether researchers of our included papers used a test in a way that addressed a given challenge, not necessarily whether the test is capable of being used in a way that meets that challenge. For example, a logistic regression analysis conducted using a GEE framework can include multiple explanatory/predictor variables, thereby allowing for a multifactorial model. However, some authors used GEEs and only included one predictor variable,58 59 in which case the GEE did not address the multifactorial theme.

Discussion

We used the workload–injury field of medical research to examine whether statistical approaches analyse ILD optimally. By design, the theoretical models underpinning the workload–injury field and the temporal design (ILD) were aligned in all the included articles, but common statistical approaches varied in how adequately they addressed the key themes needed to align them with the other two components.

Consideration 1—theoretical theme—multifactorial aetiology

Sports injury aetiology models of the last two decades have highlighted the multifactorial nature of athletic injury.19 21 We asked whether the burgeoning body of research relating workloads and injury is using modern statistical methods to capture workloads while incorporating known risk factors. Few articles in this review incorporated previously identified risk factors and workload into the same analysis. In some instances, the analytical approach prevented this from being an option. For example, simple analyses like t-tests, correlations and Χ2 tests do not allow for multiple variables to be included. In other instances, the statistical approaches allowed a multifactorial approach (eg, GEEs) but researchers opted to focus on the effects of workloads in isolation.58 59

Including known risk factors in workload–injury investigations is important from an aetiological perspective in at least two ways. First, failing to control for known risk factors may mean that key confounding variables are not included in the analysis and the relationship between workloads and injury is spurious. For example, women have a 2–6 times higher risk of ACL injury in soccer than their male counterparts.60 61 If a study included both male and female soccer players and did not account for sex in the analysis, then differences in workload may be spuriously correlated with injury rates if male and female players performed varying levels of workload. Depending on the injury type and sporting group, previous injury, age, sex, physiological and/or biomechanical variables may all be important to include.

Second, by including additional risk factors in the analysis, the investigator may be able to identify moderation or effect-measure modification to better understand how risk factors and workload jointly contribute to injury risk.62 63 As a reminder, there are subtle but important differences between mediation, moderation and effect measure modification that will influence analytical choices.64 65 Effect modification occurs when the effect of a treatment or condition (eg, a given workload demand) differs among different athlete groups. Interaction (or moderation), although similar, examines the joint effect of two or more variables on an outcome. Finally, mediation is concerned with the pathway from exposure to a given outcome, and with potential intermediate variables along that pathway. Previously identified risk factors may aetiologically relate to workload in each of these three ways and may be explored through different modelling strategies.

Statistical approaches that allow multivariable analyses enabled researchers to examine the effects of workloads while controlling for known risk factors. Malisoux et al48 used a Cox proportional hazards model to control for age and sex while examining the effects of average training volume and intensity. The frailty model by Gabbett and Ullah66 incorporated previous injury (a proven injury risk factor) into the evaluation of the influence of different GPS workloads on injury risk. When investigating multifactorial phenomena, statistical approaches that enable multiple explanatory variables provide a more appropriate option.

Consideration 2—theoretical theme—between-athlete and within-athlete differences

One of the primary benefits of ILD is that it enables researchers (when using certain analyses) to differentiate within-person and between-person effects.3 In the sports medicine field, this would correspond to researchers asking (1) why do some athletes suffer few injuries (between-person inquiry) while others appear ‘injury-prone’? and on the other hand, (2) at what point is a given athlete (within-person inquiry) more likely to sustain an injury? The simpler statistical approaches used by researchers in our included studies (correlation, t-tests, ANOVAs, regular regression) are limited in the number of variables they can include, and consequently cannot differentiate risk between-athletes and within-athletes. Tests of group differences (independent sample t-tests and one-way ANOVAs) only differentiate between athletes (eg, injured vs uninjured), while repeated measures tests (repeated measures ANOVA and paired t-tests) only examine within-athlete differences (eg, loads preceding injuries vs loads during non-injury weeks).

GEEs were commonly used to address some of the longitudinal data challenges. Although this approach accounts for the clustering within-persons, it assumes the effects of predictor variables are constant across all athletes.67 Simple Cox proportional hazards models48 are common in survival analyses, but do not differentiate between-person and within-person effects.68

Only two statistical tools were used in a way that examined between-athlete and within-athlete differences in injury risk. The frailty model by Gabbett and Ullah66 modelled each athlete as a random effect with a given frailty. The multilevel model by Windt et al57 incorporated athlete-level variables (age, position, preseason sessions) and observation-level variables (weekly workload measures). In the latter case, athletes’ weekly distances did not affect their risk of injury in the subsequent week (OR 0.82 for 1 SD increase, 95% CI 0.55 to 1.21), a within-athlete inquiry. However, controlling for weekly distance and the proportion of distance at high speeds, athletes who had completed a greater number of preseason training sessions had significantly reduced odds of injury (OR 0.83 for each 10 preseason sessions, 95% CI 0.70 to 0.99), a between-athlete inquiry. These two examples highlight that certain analyses carry a distinct advantage of allowing researchers to tease out differences both between and within study participants.
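One common way to prepare data so that a multilevel model can separate the two kinds of inquiry is person-mean centring: each athlete's workload series is split into an athlete mean (a between-athlete component) and deviations from that mean (a within-athlete component). The sketch below is only an illustration of that idea; it is not the procedure used in the cited studies, and the function name and the weekly distances are hypothetical.

```python
from statistics import mean


def split_between_within(workloads_by_athlete):
    """Split each athlete's workloads into a between- and a within-athlete part.

    Returns {athlete: (athlete_mean, deviations)}, where athlete_mean is the
    between-athlete component and deviations are the within-athlete components
    (each observation minus that athlete's own mean).
    """
    result = {}
    for athlete, loads in workloads_by_athlete.items():
        athlete_mean = mean(loads)
        deviations = [load - athlete_mean for load in loads]
        result[athlete] = (athlete_mean, deviations)
    return result


# Hypothetical weekly distances (km) for two athletes.
weekly_km = {"A": [20.0, 24.0, 28.0], "B": [30.0, 30.0, 36.0]}
components = split_between_within(weekly_km)
```

Entering the athlete mean and the deviations as separate predictors lets a model estimate a between-athlete and a within-athlete association rather than a single blended one.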

Consideration 3—theoretical theme—injury risk as a complex dynamic system

Complex systems are defined, among other things, by multiple internal and external variables that interact to produce an outcome. Simple analyses (t-tests, correlations), which cannot incorporate multiple variables, cannot examine the interaction between multiple factors. However, even other traditional analyses which are more effective in handling the challenges of longitudinal data (eg, GEEs, Cox proportional hazards models) were not used to incorporate non-linear interactions between predictor variables.

The most recent reviews of athletic injury aetiology have highlighted complex systems models.24 69 None of the analyses included in our review analysed intensive longitudinal workload–injury data with statistical analyses that fit within a complex systems framework. This lack of research may reflect the fact that the suggestion that injury aetiology fits within a dynamic, complex systems framework is still relatively ‘new’ in this field. It remains to be seen whether a complex systems approach and the analyses recommended in such reviews (eg, self-organising feature maps, classification and regression trees, agent-based models, etc) are more effective for evaluating the association between workloads and injury.24

Consideration 4—ILD challenge—including time-varying and time-invariant variables

Tying back to the theoretical model of workloads and injury, some relevant factors may be relatively stable (time-invariant) over the course of an observation period (eg, height, age), while others are time-varying (eg, workload). Some analyses can incorporate both time-varying and time-invariant variables, while others are limited in this respect. All analyses that cannot or did not address the multifactorial nature of injury cannot include time-varying and time-invariant variables concurrently. Group difference tests (t-tests, ANOVAs, etc) may collect time-varying measures, but must aggregate them into a single average for analysis.

Including time-varying and stable variables in the same analysis links closely to between-athlete and within-athlete differences—with the frailty model66 and multilevel model57 both used in a way that allowed the researchers to include both. The one exception in our included studies was the GEE approach. As mentioned earlier, the GEE assumes an ‘overall’ effect for each explanatory variable, such that between-athlete and within-athlete differences cannot be differentiated.70 However, the major benefit to a GEE is that it accounts for the repeated measures for each participant and can therefore include both time-invariant and time-varying variables for each participant.

Consideration 5—ILD challenge—handling missing and unbalanced data

Dealing with missing and unbalanced data is a near certainty when collecting ILD, and is common in applied workload-monitoring settings.71 Such missing data decrease statistical power and increase bias, and may be missing completely at random, missing at random or missing not at random. When analysing aggregated data or using analyses that require balanced data, strategies may include complete-case analysis, last observation carried forward or various imputation methods.72 73 Multiple imputation methods, of which there are many, involve replacing missing values with values imputed from the observed data and are preferred over single imputation. Finally, if interactions are included in regression analyses, the transform-then-impute method has been recommended.74

However, these missing data approaches are not recommended for longitudinal analyses, since researchers have statistical analyses that are robust to missing and unbalanced data at their disposal.75 Statistically, four types of analyses used in this review are robust to missing and unbalanced data—Cox proportional hazards models, GEEs, multilevel models and frailty models, where all observations can be included in the analysis, and athletes can have different numbers of observations. Since mixed/multilevel models have less stringent assumptions for missing data (ie, missing at random) than GEEs (ie, missing completely at random), they have been suggested over GEEs.75

While the statistical concerns related to unbalanced data may be addressed with these analyses, missing data may also affect derived variables, which are common in workload–injury research. These derived variables include rolling workload averages (eg, 1-week ‘acute’ workloads, 4-week average ‘chronic’ workloads, etc),33 41 ‘monotony’ (average weekly workload divided by the SD of that workload) or ‘strain’ (the monotony multiplied by the average weekly workload).30 Since these measures are all calculated from workloads accumulated over time, failing to estimate workloads for these missing sessions (that end up being treated as ‘0’ workload days) means inferences from these derived measures may be underestimated and unreliable. Few authors discussed how they handled missing data. In these instances, it is important that researchers report how they accounted for missing data, whether through strategies employed in the past (for example, full team average values,41 weekly individual averages42 or player-specific per-minute values by time played43) or through other advanced imputation methods recommended for ILD.72 74
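The derived variables described above can be sketched in a few lines, which also shows how a missing week that is silently stored as 0 pulls the derived values down. This is a minimal illustration that follows the wording of the definitions in the text; the function names, the use of the sample SD and the workload values are assumptions made for the example.

```python
from statistics import mean, stdev


def acute_load(weekly_loads):
    """Most recent 1-week ('acute') workload."""
    return weekly_loads[-1]


def chronic_load(weekly_loads, window=4):
    """Average workload over the last `window` weeks ('chronic' workload)."""
    return mean(weekly_loads[-window:])


def monotony(loads):
    """Average workload divided by the SD of that workload (sample SD assumed)."""
    return mean(loads) / stdev(loads)


def strain(loads):
    """Monotony multiplied by the average workload."""
    return monotony(loads) * mean(loads)


# Hypothetical four weeks of workload (arbitrary units) for one athlete.
complete = [400.0, 500.0, 600.0, 500.0]
# Same athlete, but week 2 was not recorded and was stored as 0.
with_gap = [400.0, 0.0, 600.0, 500.0]

chronic_complete = chronic_load(complete)   # 500.0
chronic_with_gap = chronic_load(with_gap)   # 375.0
```

In this toy series the unrecorded week lowers the chronic workload from 500 to 375 and also lowers monotony, which is why the handling of missing sessions needs to be reported alongside any derived measure.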

Challenge 6—ILD challenge—dependencies created by repeated measures

Collecting ILD in applied sport settings means repeated (often daily) measurements of the same athlete, such that observations are clustered within athletes. Comparisons of independent groups, through Χ2 tests, independent sample t-tests and one-way/two-way ANOVAs all assume participants contribute a single observation to the analysis and force an aggregated variable (eg, average number of balls bowled in a week) to conduct the analysis.37 Similarly, correlation and simple regression (in its linear, logistic and multinomial forms) assume independence of observations.76 Paired t-tests and repeated measures ANOVA were used to deal with repeated measurements by comparing the same athletes’ workloads at different periods (eg, the week before injury vs weeks that did not precede injury).

Of the analyses that addressed this challenge, GEEs were used most commonly (six studies). GEE’s ability to handle clustering was also used in one article to control for players clustering within teams.39 Cox proportional hazards models, used in one article,48 can handle repeated measurements for participants.77 Multilevel models57 and frailty models66 (an extension of the Cox proportional hazards model) were also used in a single instance each, where repeated measures were clustered within players through a random player effect.

As mentioned in our introduction, there may be additional data dependency created by recurrent injuries.78 Previous recommendations to handle the recurrent injury challenges have included frailty models,79 and a multistate framework.80 However, as so few articles reported collecting information on recurrent injuries (n=5), we focused primarily on the dependencies caused by repeated measures across participants.

Challenge 7—ILD challenge—incorporating time into the analysis/temporality

One of the most relevant questions in ILD analyses is the way that time is accounted for.1 3 Some authors used one-way32 and two-way repeated measures ANOVAs81 to compare loading in different seasons or season-periods, a very simple way of accounting for time. Repeated measures ANOVAs44 48 82 and paired t-tests37 46 also account for time by categorising time-periods as preinjury blocks or non-injury blocks. Multilevel models have been used to examine change through the interactions of variables with time, but the one multilevel model used in this review did not include time as a covariate.57 Survival analyses explicitly account for time by calculating the effects of variables on the predicted time-to-event.48 66 Notably, only one analysis, the frailty model,66 adjusted the probability of long-term outcomes (eg, injury) based on variations after an initial capture of risk, something few traditional analyses accomplish.26

Temporality is also vital in considering potential causal associations. While making causal inferences from observational data is a topic beyond the scope of this paper, temporality is a well-accepted component of causality dating back, at least, to Bradford Hill’s ‘criteria’.83 Without temporality—where a postulated cause precedes the outcome—directional associations cannot be made.84 85 A lack of temporality can also skew associations since it allows for reverse causality. In the workload–injury field, findings that high weekly workloads are sometimes associated with lower odds of injury in a given week53 57 may be in part because players who get injured in a given week are less likely to accumulate high weekly workloads.

Trying to account for temporality, some researchers have included a latent period, where workload variables are examined for their association with injury occurrence in a given subsequent time window, such as the following week.33 57 While recent work has noted that the length of the latent period may affect model findings,86 it is clear that without some type of latent period, any directional inferences between workloads and injury cannot be made.
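In data preparation terms, a latent period amounts to pairing each workload value with the injury status observed in a later window rather than the same one. The following sketch shows this for a single athlete's weekly series; the function name, the default 1-week lag and the example values are hypothetical, and in practice the pairing would be done separately for each athlete.

```python
def pair_with_latent_period(weekly_workloads, weekly_injuries, lag_weeks=1):
    """Pair the workload in week t with injury status in week t + lag_weeks.

    Both inputs are equal-length sequences for ONE athlete, ordered in time.
    The last `lag_weeks` workloads have no later outcome and are dropped.
    """
    if len(weekly_workloads) != len(weekly_injuries):
        raise ValueError("workload and injury series must have the same length")
    if lag_weeks < 1:
        raise ValueError("lag_weeks must be at least 1")
    exposures = weekly_workloads[:-lag_weeks]
    outcomes = weekly_injuries[lag_weeks:]
    return list(zip(exposures, outcomes))


# Hypothetical series: the athlete is injured in week 4 (1 = injured).
workloads = [300, 450, 700, 200]
injuries = [0, 0, 0, 1]
pairs = pair_with_latent_period(workloads, injuries)
```

Here the injury in week 4 is paired with the week 3 workload of 700 rather than with the low week 4 workload of 200, which would otherwise reflect the injury itself.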

Methodological, statistical and reporting considerations

Data aggregation

Data aggregation was common, whether in data preparation, or forced through the analysis. In some cases, researchers aggregated individual level data into team-level measures (total/average workload and injury incidence). Although 32/34 articles collected daily data, most aggregated these daily data into weekly measures, potentially contributing to temporality problems if no latent period was included. Finally, certain analyses (eg, paired t-tests, simple logistic regression) aggregated data for athletes across an entire year so that workload measures were used to control for exposure.40 87 Such aggregation makes it impossible to measure the effect of fluctuations in workload and their potential impact on injury risk. Furthermore, with no latent period, the directionality of the relationship is unclear. For example, players with high exposure throughout the year were at a lower injury risk than the intermediate group, but it could be interpreted that players who do not sustain an injury throughout the year are more likely to accumulate high total training and match exposures (ie, higher workload).87 Aggregated data may be easier to analyse but comes at the cost of losing some of the inherent benefits of collecting ILD, such as the changes in injury risk that occur at a daily level. As a result, theory-driven questions that relate to daily workload fluctuations and injury risk will become challenging, or impossible to answer.

Checking model assumptions and fit

While many studies may have under-reported how they assessed model assumptions or fit, others52 provide an example for other researchers to emulate. In fitting a GEE to account for intrateam and intraplayer clustering effects, they explained how they selected an appropriate autocorrelation structure, reported how potential quadratic relationships were assessed in the case of non-linear associations and described checking their GEE for potential multicollinearity with defined thresholds (variance inflation factor >10).
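As a rough illustration of the multicollinearity check mentioned above, the variance inflation factor (VIF) for a predictor is 1/(1 − R²), where R² comes from regressing that predictor on the others. With only two predictors, R² is simply the squared Pearson correlation between them, which is the simplified case sketched here. The function names and the data are hypothetical, and this is not the cited authors' code.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical weekly values for two workload predictors.
total_distance = [1.0, 2.0, 3.0, 4.0, 5.0]
high_speed_distance = [2.0, 4.0, 5.0, 4.0, 5.0]

vif = vif_two_predictors(total_distance, high_speed_distance)
exceeds_threshold = vif > 10  # threshold quoted in the text
```

With more than two predictors the same formula applies, but R² must come from a regression of each predictor on all the others.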

Researcher ‘trade-offs’, consequences of misalignment

We used the workload–injury field to highlight seven themes that relate to theoretical injury aetiology models and temporal design (ILD). In many cases, the statistical models in published studies either could not address these themes or were not used in a way that addressed them. In some cases, misalignment may carry a severe cost, such as assumption violations that may bias study results.29 This is akin to building conclusions on an unstable foundation. Other times, researchers have properly employed their chosen statistical approach, but the approaches themselves were limited, and unable to answer research questions that ILD can address. This is more akin to having a grand building plan and all the necessary supplies, but only using a screwdriver to construct the building.

Simple regression models provide an ideal example of researchers’ trade-offs when using traditional statistical analyses on ILD, and the potential costs of misalignment. Although 13 papers used regular regression to analyse the association between workloads and injury outcome, they chose one of three paths when dealing with ILD. First, many proceeded to analyse each daily or weekly data point as an independent observation—not addressing the violation of the independence assumption.33 47 88 Second, some researchers aggregated the workload data into an average weekly workload or total workload exposure over the course of the year, such that each participant contributed only one observation to a classic logistic regression.40 87 Although the regression assumptions were not violated, workload was aggregated into a single metric, the temporal relationship between workload and injury was lost, and there was then no way to analyse the effects of workload fluctuations on injury risk. Third, some researchers converted individual data to team level data and examined team workloads with team injury incidence in a linear regression.36 89 In this final case, no differentiation could then be made between players or within-players, and inferences were only possible at the team level. This may be sufficient to inform research on the association of workloads and injury at the team level, but the theoretical model underpinning team injury rates may differ from those that underpin individual athletes’ injury risk.

Review limitations

Previous systematic reviews investigating the workload–injury relationship have documented the challenges of identifying articles through classic systematic review search strategies.7 9 Heterogeneous keywords and the breadth of sporting contexts have meant previous systematic reviews include many articles post hoc that were not originally identified by their systematic searches (eg, 29 of 67 articles in the paper by Jones et al,7 12 of 35 articles in the paper by Drew and Finch9). Therefore, although we worked to identify articles through six systematic reviews7–10 17 18 and the 2016 IOC consensus statement on athletic workloads and injury,11 we may have missed potentially eligible articles.

We used the cut-off for ILD (>20 observations) proposed by Collins.14 However, there is no universal cut-off for ILD, with previous thresholds of ‘more than a handful’,1 10 observations,90 or 40.3

In some instances, authors’ analytical choices may have been attributable to factors outside of statistical considerations. For example, in lower level competitions, or in organisations with lower budgets, it may not have been feasible to collect multiple variables longitudinally with the available equipment or staff. In these types of instances, the authors would be unable to employ a multifactorial approach, instead of choosing not to use one. Such external factors may have influenced the findings of this methodological review.

Finally, it was beyond the scope of this review to list every challenge posed by ILD, and we were not exhaustive in our discussion of different analyses and their capacity to handle the challenges. Where possible, we tried to identify the themes that are most common within the research field of sport and exercise medicine. Ultimately, our call to action is that statistical tools be chosen more thoughtfully so that the extensive work put into theory building and data collection is not short-changed by a suboptimal statistical model.

Longitudinal improvements in ILD analysis

Methods and statistical analyses evolve over time, as with all scientific inquiry. Therefore, it is possible that we were a little unfair to some earlier papers. For example, researchers may have chosen analyses that aligned with ‘their’ theoretical model at the time, not what is considered the most current theoretical model. However, most papers were published since 2010—the dynamic, recursive aetiology model was introduced in 2007, and the multifactorial nature of injury risk has been highlighted since 1994.21 As complex systems approaches are the most recently proposed theoretical model,24 69 it is not surprising that none of the included articles analysed the data within this type of framework, with the first analysis of its kind in sport injury research only appearing recently.91 Furthermore, some techniques for longitudinal data analysis have been developed and grown in popularity recently, so researchers may not have been aware of alternative approaches at the time of their studies.

As more statistical methods are developed and refined for longitudinal data analysis, researchers will continue to gain awareness and skills with these analyses and their implementation is likely to become more common. Some evidence for that progression can be seen in this review. If we were to assign a ‘method’ score to each analytical approach outlined in table 5, assigning 0 for each red box, 0.5 for each yellow box and 1 for each green box (eg, correlation would score 0, while GEEs would score a 3.5), and then assign that score to each paper in the study, we could obtain a rough estimate of whether analytical approaches were improving over time. Breaking the papers roughly into four periods, the ‘average score’ for papers up to 2005 (n=6) is 1.6, papers between 2006 and 2010 (n=7) score an average of 1.9, papers between 2011 and 2015 (n=11) score 1.7 and papers since 2016 (n=10) score an average of 2.3. Moreover, since the search for this current review was conducted, there have been promising developments in the sports medicine field and a continued improvement in longitudinal analysis. Recent publications have applied statistical models that more appropriately take advantage of the strengths inherent to ILD, and better align with the theoretical frameworks.92–97
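The scoring rule described above is simple enough to sketch directly. In the example below, the correlation row uses the seven ‘X’ cells listed in the table; for GEEs, the row as printed lists three ‘O’ and two ‘X’ cells, and the two ‘✓’ cells are inferred from the stated score of 3.5, so their column positions are not asserted here. The function and variable names are hypothetical.

```python
# Points per cell: red 'X' = 0, yellow 'O' = 0.5, green '✓' = 1.
CELL_SCORES = {"X": 0.0, "O": 0.5, "✓": 1.0}


def method_score(cells):
    """Sum the points for one statistical method's row of table cells."""
    return sum(CELL_SCORES[cell] for cell in cells)


# Correlation: all seven themes/challenges rated 'X' in the table.
correlation_cells = ["X"] * 7

# GEEs: three 'O' and two 'X' as listed; the two '✓' are inferred from the
# stated score of 3.5, and the order of cells here is not meaningful.
gee_cells = ["O", "X", "X", "O", "O", "✓", "✓"]

correlation_score = method_score(correlation_cells)
gee_score = method_score(gee_cells)
```

Each paper would then inherit the score of the method it used, and the period averages quoted above are the means of those paper-level scores.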

Mediation, effect measure modification and interaction/moderation are all causal models, which may also contribute to aetiological frameworks.98 We recently proposed that traditional intrinsic and extrinsic risk factors may act as moderators or effect measure modifiers of the workload–injury association.62 If that is true, the most appropriate statistical model would include workload measures as the independent variable of interest, and incorporate other risk factors such that these causal models can be investigated, whether by stratifying effects across different levels of these risk factors, or including an interaction term within regression.63 While no included articles performed such an analysis, recent studies (not included in this review because they were published after our search) have started to adopt these approaches.94 99 100 For example, Møller et al used a frailty model with weekly workload fluctuations (decrease or <20% increase, 20%–60% increase and >60% increase) as the primary predictor variable. Known shoulder risk factors were treated as ‘effect measure modifiers’, so the model was stratified based on the presence or absence of a given risk factor (eg, scapular dyskinesis).65 In so doing, the researchers used a statistical tool (component 3) that addressed all the challenges inherent to longitudinal data (component 2), conducting a multifactorial analysis that clearly differentiated both within-athlete and between-athlete injury risk, which are key aspects of the theoretical model (component 1).

Future directions and recommendations for ILD analysis

Researchers in the sports medicine field should be encouraged that the increased availability of ILD may improve understanding of athletes’ fluctuating injury risks—as articulated by their theoretical models. More advanced statistical techniques for longitudinal data are increasingly being developed and implemented across disciplines. This will enable sports medicine researchers to more accurately answer their theory-driven questions by taking advantage of the benefits of ILD. To capitalise on this understanding, researchers must choose statistical models that most closely align with their theory and that address longitudinal data challenges. GEEs, a Cox proportional hazards model, a multilevel logistic model, and a frailty model were the four analyses that most closely approached this alignment within our included papers. However, there remains some clear room for improvement in the future.

First, although mixed modelling was only used in one study, these forms of analyses have inherent advantages over GEE methods and have been recommended for this reason.101 Because of sample structure, mixed models prevent false-positive associations and have an applied correction method that increases the power of the analysis102; a finding that is useful with the commonly smaller samples. Mixed models also carry a less stringent missing data assumption (missing at random) when compared with GEEs (missing completely at random). Furthermore, whereas GEEs require the correlation structure to be chosen by the researcher (which may be wrong), mixed models model the correlation structure so that it can be investigated. Finally, GEEs assume a constant effect across all individuals in the model, while mixed models allow for individual level effects and for differentiating these individual effects.

To borrow an example from another field and demonstrate the flexibility and utility of mixed effect models, Russell et al used daily stressor values from students during their first three college years to demonstrate that students consumed more alcohol on high-stress days than low-stress days (within-person fixed effect).103 However, a significant random effect between students suggested that some students experienced this increase in alcohol consumption, while others did not. Finally, those students with a tendency to increase alcohol consumption with stressors were more likely to have drinking-related problems in their fourth year.103 For more information on multilevel/mixed effect models for longitudinal analysis, readers are referred to other helpful resources.1 28 75 104 105

Time-to-event models are another family of statistical models that have become very common in clinical research articles (reported in 61% of original articles in the New England Journal of Medicine in 2004–2005106) but were used infrequently within our included articles. Notably, these models answer a different research question: when does an event occur? These approaches can account for many of the ILD challenges.107–109 Time-to-event models account for censoring, can incorporate time-varying exposures, time-varying effect measure modifiers and time-varying changes in injury status, and may be used to control for competing risks.107 As with other modelling techniques, the appropriate number of events per variable has been investigated, and at least 5–10 events per variable are recommended for these types of models to prevent sparse data bias.110 As long as this and other model assumptions are met, more advanced time-to-event models may be a valuable tool for researchers analysing ILD.77 111 112
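The events-per-variable guideline quoted above translates into a quick planning calculation: the number of observed injuries divided by the chosen events-per-variable value gives a rough upper limit on the number of explanatory variables. The sketch below assumes a hypothetical data set with 40 injuries; the function name is also hypothetical.

```python
def max_variables(n_events, events_per_variable):
    """Largest number of explanatory variables supported by the guideline."""
    if events_per_variable <= 0:
        raise ValueError("events_per_variable must be positive")
    return n_events // events_per_variable


# Hypothetical cohort with 40 recorded injuries.
n_injuries = 40
strict_limit = max_variables(n_injuries, 10)   # 10 events per variable
lenient_limit = max_variables(n_injuries, 5)   # 5 events per variable
```

Under these assumptions the model could carry between four and eight explanatory variables, so workload measures and known risk factors would need to be prioritised accordingly.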

Lastly, computational modelling methods, which involve computer simulation, have both pros and cons when modelling injuries. They may provide insight on the best ways to model certain predictor variables,113 and open the door to more complex systems modelling (eg, agent-based modelling).91 Although they show promise, such simulation studies are based on artificially generated data and must be interpreted carefully.114

More analytical approaches are available for ILD, but a full discussion of each of these is beyond the scope of this paper. For the interested reader, functional data analysis,115 machine learning approaches,92 95 time series analysis116 and time-varying effect models117 all show promise. Such analyses and others for ILD can be found in the landmark ILD textbook by Walls and Schafer,1 and more recently, in the work of Bolger and Laurenceau.104

We believe ILD provide an exciting opportunity for applied researchers and statisticians to collaborate. As the field progresses toward more advanced analytical approaches that may better suit ILD, collaboration with statisticians will be vital. In our included papers, few researchers cited methodological or statistical sources to justify their analytical approaches. In some instances, this may be attributable to the use of common, relatively simple analyses (one likely does not expect a citation for a t-test). Where such citations existed, they were often to previous papers in the field, not statistical sources. In future longitudinal analyses, we encourage researchers to partner with a statistician, psychometrician, epidemiologist, biostatistician or similar specialist.118 Such collaborations may lead to statistical approaches that take full advantage of ILD by aligning theory, data collection and statistical analyses as seamlessly as possible.

Conclusion

We used studies investigating the relationship between workloads and injury as a substrate to highlight to researchers how important it is to align their theoretical model, temporal design and statistical model. In longitudinal research, thoughtfully chosen statistical analyses are those grounded in subject matter theory and that maximise the utility of the collected data. The three most common analyses in our included papers (logistic regression, correlations and relative risk calculations) addressed one or none of the three key theoretical themes, and one or fewer of the four inherent challenges of ILD. In this example discipline, researchers have developed sophisticated theories and frequently collect data that enable them to test these theoretical models. The missing step, and future opportunity for researchers, is to avail themselves of all the tools at their disposal—choosing statistical models that address the ILD challenges and answer theory-driven research questions.

Supplementary Material

Reviewer comments
Author's manuscript

Acknowledgments

The United States Coalition for the Prevention of Illness and Injury in Sport is a partner in the United States Coalition for the Prevention of Illness and Injury in Sport, one of the International Research Centres for Prevention of Injury and Protection of Athlete Health supported by the International Olympic Committee (IOC). Technical or equipment support for this study was not provided by any outside companies, manufacturers or organisations.

Footnotes

Contributors: JW and TJG searched for, screened and identified appropriate systematic reviews and consensus statements. JW identified appropriate original data papers from included systematic reviews/consensus statements. JW and CLA performed quality assessment of original data papers and coded the study methods. BDZ provided statistical feedback on the creation of the data extraction spreadsheet and advice on the approach of the methodological review. JW and BDZ completed the qualitative assessment of whether authors’ use of statistical tools aligned with theoretical or temporal design themes. BCS, CLA, BDZ and KMK provided input during the early stages of study development. CEC contributed to the original idea for the review and to the discussion section of the manuscript. JW compiled the first draft of the manuscript. CLA, CEC, BCS, BDZ and KMK all contributed to critical revision of multiple drafts of the current manuscript.

Funding: JW was a Vanier Scholar funded by the Canadian Institutes of Health Research.

Competing interests: None declared.

Patient consent: Not required.

Provenance and peer review: Not commissioned; externally peer reviewed.

Data sharing statement: No additional data available.

References

  • 1. Walls TA, Schafer JL. Models for intensive longitudinal data. Oxford: Oxford University Press, 2006.
  • 2. Ginexi EM, Riley W, Atienza AA, et al. The promise of intensive longitudinal data capture for behavioral health research. Nicotine Tob Res 2014;16(Suppl 2):S73–5. 10.1093/ntr/ntt273
  • 3. Hamaker EL, Wichers M. No time like the present: discovering the hidden dynamics in intensive longitudinal data. Curr Dir Psychol Sci 2017;26:10–15.
  • 4. Halson SL. Monitoring training load to understand fatigue in athletes. Sports Med 2014;44:139–47. 10.1007/s40279-014-0253-z
  • 5. McCall A, Carling C, Nedelec M, et al. Risk factors, testing and preventative strategies for non-contact injuries in professional football: current perceptions and practices of 44 teams from various premier leagues. Br J Sports Med 2014;48:1352–7. 10.1136/bjsports-2014-093439
  • 6. Akenhead R, Nassis GP. Training load and player monitoring in high-level football: current practice and perceptions. Int J Sports Physiol Perform 2016;11:587–93. 10.1123/ijspp.2015-0331
  • 7. Jones CM, Griffiths PC, Mellalieu SD. Training load and fatigue marker associations with injury and illness: a systematic review of longitudinal studies. Sports Med 2017;47:1–32. 10.1007/s40279-016-0619-5
  • 8. Black GM, Gabbett TJ, Cole MH, et al. Monitoring workload in throwing-dominant sports: a systematic review. Sports Med 2016;46:1503–16. 10.1007/s40279-016-0529-6
  • 9. Drew MK, Finch CF. The relationship between training load and injury, illness and soreness: a systematic and literature review. Sports Med 2016;46:861–83. 10.1007/s40279-015-0459-8
  • 10. Gabbett TJ, Whyte DG, Hartwig TB, et al. The relationship between workloads, physical performance, injury and illness in adolescent male football players. Sports Med 2014;44:989–1003. 10.1007/s40279-014-0179-5
  • 11. Soligard T, Schwellnus M, Alonso JM, et al. How much is too much? (Part 1) International Olympic Committee consensus statement on load in sport and risk of injury. Br J Sports Med 2016;50:1030–41. 10.1136/bjsports-2016-096581
  • 12. Keselman HJ, Huberty CJ, Lix LM, et al. Statistical practices of educational researchers: an analysis of their ANOVA, MANOVA, and ANCOVA Analyses. Rev Educ Res 1998;68:350–86. 10.3102/00346543068003350
  • 13. Weir A, Rabia S, Ardern C. Trusting systematic reviews and meta-analyses: all that glitters is not gold! Br J Sports Med 2016;50:1100–1. 10.1136/bjsports-2015-095896
  • 14. Collins LM. Analysis of longitudinal data: the integration of theoretical model, temporal design, and statistical model. Annu Rev Psychol 2006;57:505–28. 10.1146/annurev.psych.57.102904.190146
  • 15. Lemon SC, Wang ML, Haughton CF, et al. Methodological quality of behavioural weight loss studies: a systematic review. Obes Rev 2016;17:636–44. 10.1111/obr.12412
  • 16. Gabbett TJ, Nassis GP, Oetter E, et al. The athlete monitoring cycle: a practical guide to interpreting and applying training monitoring data. Br J Sports Med 2017;51:1451–2. 10.1136/bjsports-2016-097298
  • 17. Olivier B, Taljaard T, Burger E, et al. Which extrinsic and intrinsic factors are associated with non-contact injuries in adult cricket fast bowlers? Sports Med 2016;46:1–23. 10.1007/s40279-015-0383-y
  • 18. Whittaker JL, Small C, Maffey L, et al. Risk factors for groin injury in sport: an updated systematic review. Br J Sports Med 2015;49:803–9. 10.1136/bjsports-2014-094287
  • 19. Bahr R, Holme I. Risk factors for sports injuries--a methodological approach. Br J Sports Med 2003;37:384–92. 10.1136/bjsm.37.5.384
  • 20. Bahr R, Krosshaug T. Understanding injury mechanisms: a key component of preventing injuries in sport. Br J Sports Med 2005;39:324–9. 10.1136/bjsm.2005.018341
  • 21. Meeuwisse WH. Assessing causation in sport injury: a multifactorial model. Clin J Sport Med 1994;4:166–70.
  • 22. Meeuwisse WH, Tyreman H, Hagel B, et al. A dynamic model of etiology in sport injury: the recursive nature of risk and causation. Clin J Sport Med 2007;17:215–9. 10.1097/JSM.0b013e3180592a48
  • 23. Windt J, Gabbett TJ. How do training and competition workloads relate to injury? The workload-injury aetiology model. Br J Sports Med 2017;51:428–35. 10.1136/bjsports-2016-096040
  • 24. Bittencourt NFN, Meeuwisse WH, Mendonça LD, et al. Complex systems approach for sports injuries: moving from risk factor identification to injury pattern recognition-narrative review and new concept. Br J Sports Med 2016;50 10.1136/bjsports-2015-095850
  • 25. Hulme A, Salmon PM, Nielsen RO, et al. Closing Pandora’s Box: adapting a systems ergonomics methodology for better understanding the ecological complexity underpinning the development and prevention of running-related injury. Theor Issues Ergon Sci 2017;18:338–59. 10.1080/1463922X.2016.1274455
  • 26. Cook C. Predicting future physical injury in sports: it’s a complicated dynamic system. Br J Sports Med 2016;50:1356–7. 10.1136/bjsports-2016-096445
  • 27. Sun L, Wu R. Mapping complex traits as a dynamic system. Phys Life Rev 2015;13:155–85. 10.1016/j.plrev.2015.02.007
  • 28. Hoffman L. Longitudinal analysis: modeling within-person fluctuation and change: Routledge, 2014.
  • 29. Wilkinson M, Akenhead R. Violation of statistical assumptions in a recent publication? Int J Sports Med 2013;34:281 10.1055/s-0032-1331775
  • 30. Anderson L, Triplett-McBride T, Foster C, et al. Impact of training patterns on incidence of illness and injury during a women’s collegiate basketball season. J Strength Cond Res 2003;17:734–8.
  • 31. Brooks JH, Fuller CW, Kemp SP, et al. An assessment of training volume in professional rugby union and its impact on the incidence, severity, and nature of match and training injuries. J Sports Sci 2008;26:863–73. 10.1080/02640410701832209
  • 32. Killen NM, Gabbett TJ, Jenkins DG. Training loads and incidence of injury during the preseason in professional rugby league players. J Strength Cond Res 2010;24:2079–84. 10.1519/JSC.0b013e3181ddafff
  • 33. Hulin BT, Gabbett TJ, Blanch P, et al. Spikes in acute workload are associated with increased injury risk in elite cricket fast bowlers. Br J Sports Med 2014;48:708–12. 10.1136/bjsports-2013-092524
  • 34. Fuller CW, Ekstrand J, Junge A, et al. Consensus statement on injury definitions and data collection procedures in studies of football (soccer) injuries. Br J Sports Med 2006;40:193–201. 10.1136/bjsm.2005.025270
  • 35. Mallo J, Dellal A. Injury risk in professional football players with special reference to the playing position and training periodization. J Sports Med Phys Fitness 2012;52:631–8.
  • 36. Owen AL, Forsyth JJ, Wong delP, et al. Heart rate-based training intensity and its impact on injury incidence among elite-level professional soccer players. J Strength Cond Res 2015;29:1705–12. 10.1519/JSC.0000000000000810
  • 37. Dennis R, Farhart P, Goumas C, et al. Bowling workload and the risk of injury in elite cricket fast bowlers. J Sci Med Sport 2003;6:359–67. 10.1016/S1440-2440(03)80031-2
  • 38. Dennis RJ, Finch CF, Farhart PJ. Is bowling workload a risk factor for injury to Australian junior cricket fast bowlers? Br J Sports Med 2005;39:843–6. 10.1136/bjsm.2005.018515
  • 39. Clausen MB, Zebis MK, Møller M, et al. High injury incidence in adolescent female soccer. Am J Sports Med 2014;42:2487–94. 10.1177/0363546514541224
  • 40. Visnes H, Bahr R. Training volume and body composition as risk factors for developing jumper’s knee among young elite volleyball players. Scand J Med Sci Sports 2013;23:607–13. 10.1111/j.1600-0838.2011.01430.x
  • 41. Bowen L, Gross AS, Gimpel M, et al. Accumulated workloads and the acute:chronic workload ratio relate to injury risk in elite youth football players. Br J Sports Med 2017;51 10.1136/bjsports-2015-095820
  • 42. Brink MS, Nederhof E, Visscher C, et al. Monitoring load, recovery, and performance in young elite soccer players. J Strength Cond Res 2010;24:597–603. 10.1519/JSC.0b013e3181c4d38b
  • 43. Colby MJ, Dawson B, Heasman J, et al. Accelerometer and GPS-derived running loads and injury risk in elite Australian footballers. J Strength Cond Res 2014;28:2244–52. 10.1519/JSC.0000000000000362
  • 44. Murray NB, Gabbett TJ, Townshend AD, et al. Individual and combined effects of acute and chronic running loads on injury risk in elite Australian footballers. Scand J Med Sci Sports 2017;27:990–8. 10.1111/sms.12719
  • 45. Bresciani G, Cuevas MJ, Garatachea N, et al. Monitoring biological and psychological measures throughout an entire season in male handball players. Eur J Sport Sci 2010;10:377–84. 10.1080/17461391003699070
  • 46. Saw R, Dennis RJ, Bentley D, et al. Throwing workload and injury risk in elite cricketers. Br J Sports Med 2011;45:805–8. 10.1136/bjsm.2009.061309
  • 47. Hulin BT, Gabbett TJ, Caputi P, et al. Low chronic workload and the acute:chronic workload ratio are more predictive of injury than between-match recovery time: a two-season prospective cohort study in elite rugby league players. Br J Sports Med 2016;50:1008–12. 10.1136/bjsports-2015-095364
  • 48. Malisoux L, Frisch A, Urhausen A, et al. Monitoring of sport participation and injury risk in young athletes. J Sci Med Sport 2013;16:504–8. 10.1016/j.jsams.2013.01.008
  • 49. Cohen J. Statistical power analysis for the behavioral sciences. 2nd edn. Hillsdale, NJ: Erlbaum, 1988.
  • 50. Kraemer HC, Thiemann S. How many subjects: Citeseer, 1987.
  • 51. Kutner MH, Nachtsheim C, Neter J. Applied linear regression models: McGraw-Hill/Irwin, 2004.
  • 52. Cross MJ, Williams S, Trewartha G, et al. The influence of in-season training loads on injury risk in professional Rugby Union. Int J Sports Physiol Perform 2016;11:350–5. 10.1123/ijspp.2015-0187
  • 53. Veugelers KR, Young WB, Fahrner B, et al. Different methods of training load quantification and their relationship to injury and illness in elite Australian football. J Sci Med Sport 2016;19:24–8. 10.1016/j.jsams.2015.01.001
  • 54. Silvey SD. Multicollinearity and imprecise estimation. J R Stat Soc Ser B Methodol 1969;31:539–52.
  • 55. Zidek JV, Wong H, Le ND, et al. Causality, measurement error and multicollinearity in epidemiology. Environmetrics 1996;7:441–51.
  • 56. Gabbett TJ. The development and application of an injury prediction model for noncontact, soft-tissue injuries in elite collision sport athletes. J Strength Cond Res 2010;24:2593–603. 10.1519/JSC.0b013e3181f19da4
  • 57. Windt J, Gabbett TJ, Ferris D, et al. Training load--injury paradox: is greater preseason participation associated with lower in-season injury risk in elite rugby league players? Br J Sports Med 2017;51 10.1136/bjsports-2016-095973
  • 58. Dennis R, Farhart P, Clements M, et al. The relationship between fast bowling workload and injury in first-class cricketers: a pilot study. J Sci Med Sport 2004;7:232–6. 10.1016/S1440-2440(04)80014-8
  • 59. Gabbett TJ, Domrow N. Relationships between training load, injury, and fitness in sub-elite collision sport athletes. J Sports Sci 2007;25:1507–19. 10.1080/02640410701215066
  • 60. Datson N, Hulton A, Andersson H, et al. Applied physiology of female soccer: an update. Sports Med 2014;44:1225–40. 10.1007/s40279-014-0199-1
  • 61. Hewett TE, Myer GD, Ford KR. Anterior cruciate ligament injuries in female athletes: part 1, mechanisms and risk factors. Am J Sports Med 2006;34:299–311. 10.1177/0363546505284183
  • 62. Windt J, Zumbo BD, Sporer B, et al. Why do workload spikes cause injuries, and which athletes are at higher risk? Mediators and moderators in workload-injury investigations. Br J Sports Med 2017;51:993–4. 10.1136/bjsports-2016-097255
  • 63. Nielsen RO, Bertelsen ML, Møller M, et al. Training load and structure-specific load: applications for sport injury causality and data analyses. Br J Sports Med 2018;52 10.1136/bjsports-2017-097838
  • 64. Corraini P, Olsen M, Pedersen L, et al. Effect modification, interaction and mediation: an overview of theoretical insights for clinical investigators. Clin Epidemiol 2017;9:331–8. 10.2147/CLEP.S129728
  • 65. VanderWeele TJ. On the distinction between interaction and effect modification. Epidemiology 2009;20:863–71. 10.1097/EDE.0b013e3181ba333c
  • 66. Gabbett TJ, Ullah S. Relationship between running loads and soft-tissue injury in elite team sport athletes. J Strength Cond Res 2012;26:953–60. 10.1519/JSC.0b013e3182302023
  • 67. Ghisletta P, Spini D. An introduction to generalized estimating equations and an application to assess selectivity effects in a longitudinal study on very old individuals. J Educ Behav Stat 2004;29:421–37. 10.3102/10769986029004421
  • 68. Bellera CA, MacGrogan G, Debled M, et al. Variables with time-varying effects and the Cox model: some statistical concepts illustrated with a prognostic factor study in breast cancer. BMC Med Res Methodol 2010;10:20 10.1186/1471-2288-10-20
  • 69. Hulme A, Finch CF. From monocausality to systems thinking: a complementary and alternative conceptual approach for better understanding the development and prevention of sports injury. Inj Epidemiol 2015;2 10.1186/s40621-015-0064-1
  • 70. Ward P, Coutts AJ, Pruna R, et al. Putting the ‘i’ back in team. Int J Sports Physiol Perform 2018;14:1–14. 10.1123/ijspp.2018-0154
  • 71. Buchheit M. Applying the acute:chronic workload ratio in elite football: worth the effort? Br J Sports Med 2017;51:1325–7. 10.1136/bjsports-2016-097017
  • 72. Newgard CD, Lewis RJ. Missing data: how to best account for what is not known. JAMA 2015;314:940–1. 10.1001/jama.2015.10516
  • 73. El-Masri MM, Fox-Wasylyshyn SM. Missing data: an introductory conceptual overview for the novice researcher. Can J Nurs Res 2005;37:156–71.
  • 74. von Hippel PT. 8. How to impute interactions, squares, and other transformed variables. Sociol Methodol 2009;39:265–91. 10.1111/j.1467-9531.2009.01215.x
  • 75. Gibbons RD, Hedeker D, DuToit S. Advances in analysis of longitudinal data. Annu Rev Clin Psychol 2010;6:79–107. 10.1146/annurev.clinpsy.032408.153550
  • 76. McDonald JH. Handbook of biological statistics. Baltimore, MD: Sparky House Publishing, 2009.
  • 77. Therneau TM, Grambsch PM. Modeling survival data: extending the Cox model. New York: Springer, 2000.
  • 78. Finch CF, Cook J. Categorising sports injuries in epidemiological studies: the subsequent injury categorisation (SIC) model to address multiple, recurrent and exacerbation of injuries. Br J Sports Med 2014;48:1276–80. 10.1136/bjsports-2012-091729
  • 79. Ullah S, Gabbett TJ, Finch CF. Statistical modelling for recurrent events: an application to sports injuries. Br J Sports Med 2014;48:1287–93. 10.1136/bjsports-2011-090803
  • 80. Shrier I, Steele RJ, Zhao M, et al. A multistate framework for the analysis of subsequent injury in sport (M-FASIS). Scand J Med Sci Sports 2016;26:128–39. 10.1111/sms.12493
  • 81. Gabbett TJ. Reductions in pre-season training loads reduce training injury rates in rugby league players. Br J Sports Med 2004;38:743–9. 10.1136/bjsm.2003.008391
  • 82. Ehrmann FE, Duncan CS, Sindhusake D, et al. GPS and Injury Prevention in Professional Soccer. J Strength Cond Res 2016;30:360–7. 10.1519/JSC.0000000000001093
  • 83. Hill AB. The environment and disease: association or causation? Proc R Soc Med 1965;58:295–300.
  • 84. Höfler M. The Bradford Hill considerations on causality: a counterfactual perspective. Emerg Themes Epidemiol 2005;2:11 10.1186/1742-7622-2-11
  • 85. Rothman KJ, Greenland S. Causation and causal inference in epidemiology. Am J Public Health 2005;95(S1):S144–50. 10.2105/AJPH.2004.059204
  • 86. Carey DL, Blanch P, Ong KL, et al. Training loads and injury risk in Australian football-differing acute: chronic workload ratios influence match injury risk. Br J Sports Med 2017;51 10.1136/bjsports-2016-096309
  • 87. Arnason A, Sigurdsson SB, Gudmundsson A, et al. Risk factors for injuries in football. Am J Sports Med 2004;32:5–16. 10.1177/0363546503258912
  • 88. Rogalski B, Dawson B, Heasman J, et al. Training and game loads and injury risk in elite Australian footballers. J Sci Med Sport 2013;16:499–503. 10.1016/j.jsams.2012.12.004
  • 89. Gabbett TJ. Influence of training and match intensity on injuries in rugby league. J Sports Sci 2004;22:409–17. 10.1080/02640410310001641638
  • 90. Tan X, Shiyko MP, Li R, et al. A time-varying effect model for intensive longitudinal data. Psychol Methods 2012;17:61–77. 10.1037/a0025814
  • 91. Hulme A, Thompson J, Nielsen RO, et al. Towards a complex systems approach in sports injury research: simulating running-related injury development with agent-based modelling. Br J Sports Med 2018:bjsports-2017-098871 10.1136/bjsports-2017-098871
  • 92. Carey DL, Ong K, Whiteley R, et al. Predictive modelling of training loads and injury in Australian football. International Journal of Computer Science in Sport;17:49–66. 10.2478/ijcss-2018-0002
  • 93. Colby MJ, Dawson B, Peeling P, et al. Multivariate modelling of subjective and objective monitoring data improve the detection of non-contact injury risk in elite Australian footballers. J Sci Med Sport 2017;20:1068–74. 10.1016/j.jsams.2017.05.010
  • 94. Møller M, Nielsen RO, Attermann J, et al. Handball load and shoulder injury rate: a 31-week cohort study of 679 elite youth handball players. Br J Sports Med 2017;51:231–7. 10.1136/bjsports-2016-096927
  • 95. Rossi A, Pappalardo L, Cintia P, et al. Effective injury prediction in professional soccer with GPS data and machine learning. ArXiv. doi:http://arxiv.org/abs/1705.08079
  • 96. Stares J, Dawson B, Peeling P, et al. Identifying high risk loading conditions for in-season injury in elite Australian football players. J Sci Med Sport 2018;21:46–51. 10.1016/j.jsams.2017.05.012
  • 97. Williams S, Trewartha G, Cross MJ, et al. Monitoring what matters: a systematic process for selecting training-load measures. Int J Sports Physiol Perform 2017;12:1–20. 10.1123/ijspp.2016-0337
  • 98. Wu AD, Zumbo BD. Understanding and using mediators and moderators. Soc Indic Res 2007;87:367.
  • 99. Williams S, Booton T, Watson M, et al. Monitoring what matters: a systematic process for selecting training load measures. J Sports Sci Med 2017;16:443–9.
  • 100. Sampson JA, Murray A, Williams S, et al. Injury risk-workload associations in NCAA American college football. J Sci Med Sport 2018. 10.1016/j.jsams.2018.05.019
  • 101. Garcia TP, Marder K. Statistical approaches to longitudinal data analysis in neurodegenerative diseases: Huntington’s disease as a model. Curr Neurol Neurosci Rep 2017;17:14 10.1007/s11910-017-0723-4
  • 102. Yang J, Zaitlen NA, Goddard ME, et al. Advantages and pitfalls in the application of mixed-model association methods. Nat Genet 2014;46:100–6. 10.1038/ng.2876
  • 103. Russell MA, Almeida DM, Maggs JL. Stressor-related drinking and future alcohol problems among university students. Psychol Addict Behav 2017;31:676–87. 10.1037/adb0000303
  • 104. Bolger N, Laurenceau JP. Intensive longitudinal methods: an introduction to diary and experience sampling research. New York: Guilford Press, 2013.
  • 105. Gelman A, Hill J. Data analysis using regression and multilevel/hierarchical models. New York: Cambridge University Press, 2007.
  • 106. Horton NJ, Switzer SS. Statistical methods in the journal. N Engl J Med 2005;353:1977–9. 10.1056/NEJM200511033531823
  • 107. Nielsen RØ, Malisoux L, Møller M, et al. Shedding light on the etiology of sports injuries: a look behind the scenes of time-to-event analyses. J Orthop Sports Phys Ther 2016;46:300–11. 10.2519/jospt.2016.6510
  • 108. Wu L, Liu W, Yi GY, et al. Analysis of longitudinal and survival data: joint modeling, inference methods, and issues. J Probab Stat 2012.
  • 109. Andersen PK, Syriopoulou E, Parner ET. Causal inference in survival analysis using pseudo-observations. Stat Med 2017;36:2669–81. 10.1002/sim.7297
  • 110. Hansen SN, Andersen PK, Parner ET. Events per variable for risk differences and relative risks using pseudo-observations. Lifetime Data Anal 2014;20:584–98. 10.1007/s10985-013-9290-4
  • 111. Hickey GL, Philipson P, Jorgensen A, et al. Joint modelling of time-to-event and multivariate longitudinal outcomes: recent developments and issues. BMC Med Res Methodol 2016;16:117 10.1186/s12874-016-0212-5
  • 112. Mahmood A, Ullah S, Finch CF. Application of survival models in sports injury prevention research: a systematic review. Br J Sports Med 2014;48:630.2–630. 10.1136/bjsports-2014-093494.190
  • 113. Carey DL, Crossley KM, Whiteley R, et al. Modelling Training Loads and Injuries: The Dangers of Discretization. Med Sci Sports Exerc 2018. 10.1249/MSS.0000000000001685
  • 114. Maldonado G, Greenland S. The importance of critically interpreting simulation studies. Epidemiology 1997;8:453–6.
  • 115. Trail JB, Collins LM, Rivera DE, et al. Functional data analysis for dynamical system identification of behavioral processes. Psychol Methods 2014;19:175–87. 10.1037/a0034035
  • 116. de Haan-Rietdijk S, Kuppens P, Hamaker EL. What’s in a day? A guide to decomposing the variance in intensive longitudinal data. Front Psychol 2016;7 10.3389/fpsyg.2016.00891
  • 117. Biddle SJ, Edwardson CL, Wilmot EG, et al. A randomised controlled trial to reduce sedentary time in young adults at risk of type 2 diabetes mellitus: project STAND (Sedentary Time ANd Diabetes). PLoS One 2015;10:e0143398 10.1371/journal.pone.0143398
  • 118. Casals M, Finch CF. Sports biostatistician: a critical member of all sports science and medicine teams for injury prevention. Inj Prev 2017;23:423–7. 10.1136/injuryprev-2016-042211
  • 119. Duhig S, Shield AJ, Opar D, et al. Effect of high-speed running on hamstring strain injury risk. Br J Sports Med 2016;50:1536–40. 10.1136/bjsports-2015-095679
  • 120. Murray NB, Gabbett TJ, Townshend AD, et al. Individual and combined effects of acute and chronic running loads on injury risk in elite Australian footballers. Scand J Med Sci Sports 2017;27 10.1111/sms.12719
  • 121. Murray NB, Gabbett TJ, Townshend AD. Relationship between preseason training load and in-season availability in elite Australian football players. Int J Sports Physiol Perform 2017;12:749–55. 10.1123/ijspp.2015-0806
  • 122. Gabbett TJ, Jenkins DG. Relationship between training load and injury in professional rugby league players. J Sci Med Sport 2011;14:204–9. 10.1016/j.jsams.2010.12.002
  • 123. Hulin BT, Gabbett TJ, Lawson DW, et al. The acute:chronic workload ratio predicts injury: high chronic workload may decrease injury risk in elite rugby league players. Br J Sports Med 2016;50 10.1136/bjsports-2015-094817

Associated Data

Supplementary Materials

Supplementary file 1

bmjopen-2018-022626supp001.pdf (271.5KB, pdf)

Articles from BMJ Open are provided here courtesy of BMJ Publishing Group