Abstract
Aggressive cancer treatments that affect the central nervous system are associated with an increased risk of cognitive deficits. As treatment for pediatric brain tumors has become more effective, there has been a heightened focus on improving cognitive outcomes, which can significantly affect the quality of life for pediatric cancer survivors. This paper is motivated by and applied to a clinical trial for medulloblastoma, the most common malignant brain tumor in children. The trial collects comprehensive data including treatment-related clinical information, neuroimaging, and longitudinal neurocognitive outcomes to enhance our understanding of the responses to treatment and the enduring impacts of radiation therapy on the survivors of medulloblastoma. To this end, we have developed a new mediation model tailored for longitudinal outcomes with high-dimensional imaging mediators. Specifically, we adopt a joint binary Ising-Gaussian Markov random field prior distribution to account for spatial dependency and smoothness of ultra-high-dimensional neuroimaging mediators for enhancing detection power of informative voxels. By exploiting the proposed approach, we identify causal pathways and the corresponding white matter microstructures mediating the negative impact of irradiation on neurodevelopment. The results provide guidance on sparing the brain regions and improving long-term neurodevelopment for pediatric cancer survivors. Simulation studies also confirm the validity of the proposed method.
Keywords: Bayesian mediation analysis, DTI, High-dimensional data, Neurodevelopment, Neuroimaging, Longitudinal outcomes
AMS 2000 subject classifications: Primary 62M40, 62J05; secondary 62F15
1. INTRODUCTION
As cancer treatments have become more effective, researchers are collecting indices of neurotoxicity, such as neurocognitive functioning and neuroimaging data that reveal brain structures affected by treatment, to measure treatment response and treatment side effects. Medulloblastoma is the most common malignant brain tumor in children, with average overall survival of 70% to 75% with current therapy [12, 23, 15, 37, 45, 65]. The SJMB03 clinical trial was designed, in part, to investigate risk-adapted therapy, which reserves the most aggressive therapy for medulloblastoma patients with the worst prognosis, including evaluating the neurocognitive benefit of reduced-dose craniospinal radiation therapy (RT). To address this objective, the investigators collected: (1) diagnostic- and treatment-related clinical information; (2) quantitative magnetic resonance imaging (MRI) exams measuring white matter (WM) integrity after irradiation; and (3) longitudinal measurements of neurocognitive performance before and after irradiation and at 12, 24, 36, 48, and 60 months after diagnosis, including Woodcock-Johnson III Tests of Cognitive Abilities [58] and Tests of Achievement [59] that measure multiple vulnerable cognitive skills, such as processing speed. Longitudinal study designs facilitate the investigation of the developmental trends of neurocognitive outcomes and their relation to the development of brain structures, which cannot be assessed in a cross-sectional study design. The broad availability of high-quality MRI and comprehensive longitudinal neurocognitive assessments provides a unique opportunity to develop and test new approaches to better understand the underlying etiology of these impairments in children diagnosed with cancer [2].
In SJMB03, patients were categorized into average-risk (AR) and high-risk (HR) groups, which received different levels of craniospinal irradiation. Preliminary analysis reveals a clear difference in neurodevelopmental trajectories between AR and HR patients, particularly in the change in processing speed score from baseline, as depicted in the left panel of Figure 1. While the processing speed score of AR patients tends to increase over time, HR patients exhibit a declining trend. Hence, gaining a better understanding of how current radiation therapy affects brain structure and neurocognitive function is vital. Such understanding could help mitigate the significant negative effects on the neurocognitive and social functioning of survivors. Our goal is to locate WM microstructure that is damaged by irradiation and in turn affects neurocognitive outcomes (the treatment-to-mediator and mediator-to-outcome paths in the right panel of Figure 1), which will provide more convincing evidence of causal pathways among these variables. Identifying such pathways and the corresponding regions could improve future irradiation planning, spare the vulnerable brain regions, and improve long-term neurodevelopment for pediatric cancer survivors. Moreover, we aim to ascertain the proportion of the treatment effect on neurocognitive outcomes that can be attributed to WM damage. This will furnish further evidence to explain the impact of treatment on neurocognition.
Figure 1.

The left panel: Changes in processing speed scores from baseline over time. The thick lines represent the means of the patient groups. The right panel: Path diagram of the mediation analysis for treatment (exposure), brain microstructure (mediator), and cognition (outcome).
Previous studies have separately identified the relationship between pairs of the above three types of variables. First, cranial RT and some chemotherapeutic agents are well-established causes of structural alteration of cerebral WM [15]. Second, cognitive late effects are associated with decreased integrity of widely distributed neural networks supported by WM tracts [15]. Third, pediatric patients treated for medulloblastoma with cranial RT experience cognitive late effects [38, 23]. However, to the best of our knowledge, no studies have investigated whether there is a causal pathway from treatment through WM microstructure to neurodevelopmental trends.
Traditional mediation analysis models [32] were proposed to identify important mediators between exposure and outcome. However, these traditional models are mostly suited to univariate or low-dimensional exposures, mediators, and outcomes, and they are not designed to handle diverse data types. Attempts to analyze high-dimensional voxelwise neuroimaging data together with longitudinal outcomes using these models face several challenges. In our study, high-dimensional diffusion tensor imaging (DTI), processed by the Tract-Based Spatial Statistics (TBSS) pipeline [49], is used to measure the brain microstructure. The fractional anisotropy (FA) values on the TBSS skeleton (TBSS-FA) comprise nearly 90 thousand voxels. Moreover, the data analysis is further complicated by missing visits and potentially complex temporal dependency in the longitudinal neurocognitive outcomes. The most straightforward approach is to repeat a univariate mediation analysis for each voxel based on a linear mixed effects model. However, typically few voxels are identified as significant mediators after multiple comparison corrections due to the limited sample sizes. Recently, there has been a growing interest in the development of high-dimensional mediation models for biomedical data [6, 50, 9, 51]. Moreover, high-dimensional mediation analysis has been applied in imaging studies [4, 60, 56]. However, most of these methods fall short of adequately addressing the spatial dependencies and the extreme high-dimensionality intrinsic to neuroimaging mediators, as well as the longitudinal structure of cognitive outcomes. Indeed, to the best of our knowledge, there is no existing approach to analyze high-dimensional imaging mediators and longitudinal outcomes simultaneously.
In this work, we develop a new mediation model tailored for longitudinal outcomes with high-dimensional imaging mediators, aiming to (1) account for the spatial dependency and smoothness of the voxelwise DTI data to increase the power to detect mediating areas through a binary Ising-Gaussian Markov random field (BI-GMRF) prior distribution, whose clinical plausibility is elaborated in Section 3.1; (2) identify the sparse informative voxels through a Bayesian variable selection method; and (3) further increase the power to select voxels with mediating effects through a joint prior distribution on the coefficients in both causal pathways.
The remainder of the paper is structured as follows. Section 2 introduces the proposed model as well as causal assumptions and interpretations. Section 3 presents a Bayesian estimation method with prior specifications and an overview of posterior computation. In Sections 4 and 5, we illustrate the efficacy of the proposed method in an application to a pediatric brain tumor study and a series of simulations. Finally, Section 6 concludes the paper with a discussion. Technical details are deferred to the Appendix.
2. IMAGING MEDIATION MODEL
2.1. Model
We start by introducing the notation and structure of the observed data. Let denote an imaging mediator observed on a -dimensional compact space for the th participant, for example, for DTI in this case study, where and is the number of participants. In practice, neuroimages are observed at a dense discretized grid of voxels given specific image resolutions and . Therefore, imaging mediators are oftentimes complicated by its high-dimensionality, for example, is close to 90 thousand for the WM skeleton in our case. Let denote a binary treatment indicator, which represents either a high or average/low dose level of radiation therapy and is the exposure in our mediation analysis. Let be the confounder, such as age, gender, and other demographic variables. Without loss of generality, we assume univariate confounders here for ease of presentation but extension to multivariate confounders is straightforward. Denote by the longitudinal outcomes, such as cognitive scores of processing speed, measured intermittently at time for the th participant and th visit with , where is the subject-specific number of visits. Our objective is to identify the influential voxels in neuroimages that mediate the treatment effect of on the neurocognitive outcomes , with adjustment for the confounders .
There are two basic components of a mediation model: the mediator component and the outcome component (see Figure 1). The mediator component can be characterized by an image-on-scalar regression that models the relationship between high-dimensional neuroimaging mediators and treatment exposure adjusted by certain confounders,
| (1) |
where is the image intercept, and are the image and scalar coefficients corresponding to treatment and confounders, respectively, are assumed as independent Gaussian measurement errors with unknown variance . Note that accounts for the treatment effects of radiation dose on brain WM microstructure.
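As an illustration only, the mediator component described above could be written as follows. All symbols here ($M_i(v)$ for the mediator at voxel $v$, $X_i$ for treatment, $C_i$ for the confounder, $\mu(v)$ for the image intercept, $\alpha(v)$ for the image coefficient of treatment, $\zeta$ for the scalar confounder coefficient, and $\sigma_\epsilon^2$ for the error variance) are assumed notation for this sketch and need not coincide with the notation of equation (1).

```latex
% Sketch of an image-on-scalar regression under assumed notation.
M_i(v) = \mu(v) + \alpha(v)\, X_i + \zeta\, C_i + \epsilon_i(v),
\qquad
\epsilon_i(v) \overset{\text{ind}}{\sim} N\!\left(0, \sigma_\epsilon^2\right),
\qquad v \in \mathcal{V},\; i = 1, \ldots, n.
```

In this form, $\alpha(v)$ is the voxelwise effect of the treatment indicator on the imaging mediator.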
The outcome component consists of a trajectory-on-image regression that models the relationship between neurocognitive profiles and high-dimensional neuroimaging mediators and treatment exposure with adjustment for confounders. We model the longitudinal outcome component through a linear mixed effects model,
| (2) |
where , , , , and are fixed effects corresponding to the intercept, visit times, imaging mediator , treatment exposure , and confounders , respectively; is a subject-specific random intercept following normal distribution; we assume that are independent and normally distributed measurement errors. We should note that captures the effects of brain WM microstructures on the neurocognitive scores, while represents the direct treatment effects on the scores. The cross-sectional outcome model has been considered in a regression setup [16, 17], but the proposed longitudinal outcome model is new. In addition, we assume that the image coefficients and are voxel-wise sparsity-induced and grouped according to spatial adjacency as well as smoothness in nonzero regions, representing the mediation effect of radiation therapy on neurocognition through specific brain structures and their contiguous regions. Crucially, to enhance the detection of mediation effects, we also take into account the voxelwise overlap of the image coefficients and to acknowledge the interconnectedness of the two pathways.
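As an illustration only, a linear mixed effects model with the terms listed above could take the following form on a discretized voxel grid. The symbols ($Y_{ij}$, $t_{ij}$, $\beta_0$, $\beta_t$, $\beta(v)$, $\gamma$, $\eta$, $b_i$, $e_{ij}$, and the variance parameters) are assumed notation for this sketch and are not taken from equation (2).

```latex
% Sketch of a trajectory-on-image regression under assumed notation.
Y_{ij} = \beta_0 + \beta_t\, t_{ij} + \sum_{v \in \mathcal{V}} \beta(v)\, M_i(v)
         + \gamma\, X_i + \eta\, C_i + b_i + e_{ij},
\qquad
b_i \sim N\!\left(0, \sigma_b^2\right),
\quad
e_{ij} \overset{\text{ind}}{\sim} N\!\left(0, \sigma_e^2\right).
```

Here $\beta(v)$ plays the role of the image coefficient of the mediator, $\gamma$ the direct treatment effect, and $b_i$ the subject-specific random intercept.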
The proposed imaging mediation model is general in that it contains several imaging regression models as special cases. The formulation of the mediator component (1) is similar to linear regression model for matrix responses [24], but our model is more flexible as it has no restriction on the dimension of image responses. The response can be either a matrix, such as a brain connectivity, or a tensor, such as DTI. The outcome component (2) relates to some state-of-the-art scalar-on-image regressions, but our trajectory-on-image regression is carefully formulated for longitudinal outcomes. When and are set to zero, the outcome component (2) reduces to tensor regression [63] and image regression [17, 16], and the proposed joint model reduces to a high-dimensional mediation model for cross-sectional outcomes [56].
2.2. Causal assumptions and interpretations
In this section, we first introduce the potential outcome notation [40] corresponding to variables in the proposed model to define the causal estimands of interest using capital letters. Then, we introduce several assumptions commonly used in causal inference and the corresponding interpretations for defining causal effects. Finally, a proposition is established to identify causal effects.
Let , corresponding to with , denote the potential longitudinal outcome under exposure . Let , corresponding to , represent the imaging mediator under . Similarly, denotes the longitudinal outcome under and . We also make the generally applied composition constraint by letting following [39]. Let denote confounders. Suppose we are interested in comparing two arbitrary levels of exposure, we use and to represent the two treatment levels.
Several assumptions are required to identify causal effects. First, we make the stable unit treatment value assumption (SUTVA), which comprises no interference between subjects and consistency [41, 42, 18]. No interference implies that one individual’s treatment exposure assignment does not affect others’ outcomes. Consistency states that the observed variables are the same as the potential variables corresponding to the actually observed treatment level.
Second, we make the no-unmeasured-confounding assumptions [55, 5], including
Assumption 1. ,
Assumption 2. ,
Assumption 3. ,
Assumption 4. .
The interpretations corresponding to Assumptions 1-4 are: (1) there is no unmeasured confounding for the exposure effect on the outcome; (2) there is no unmeasured confounding for any mediator-outcome relationship after controlling for the exposure; (3) there is no unmeasured confounding for the exposure effect on all the mediators; and (4) there is no downstream effect of the exposure that confounds the mediator-outcome relationship for any of the mediators, or there is no mediator-outcome confounder that is affected by the exposure.
Third, we also make the temporal ordering assumption, that is, the exposure precedes the mediators, which in turn precede the outcome.
Proposition 1. Suppose equations (1) and (2) are correctly specified. In addition, suppose (i) the SUTVA, (ii) the no-unmeasured-confounding assumptions, and (iii) the temporal ordering assumption hold, and is compact. Let be a realization of , and be a vector of length with all entries equal to one. Then, under some mild conditions of imaging coefficients , , and , the average natural direct effect (), indirect effect (), total effect () and proportion mediated () are identifiable and given by
The proof of Proposition 1 is straightforward and proceeds analogously to the proof in [56]. Given the causal effects DE, IE, and TE, the PM captures the extent to which the effect of the exposure on the longitudinal outcome operates through the imaging mediator.
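To make the relationship among DE, IE, TE, and PM concrete, the following minimal Python sketch assumes the standard product-of-coefficients form that arises in linear mediation models without exposure-mediator interaction. The function name `mediation_effects` and its arguments (`alpha`, `beta`, `gamma`, `x`, `x_star`) are hypothetical, and the exact expressions in Proposition 1 may differ, for example in how they treat the vector of visits.

```python
def mediation_effects(alpha, beta, gamma, x=1.0, x_star=0.0):
    """Product-of-coefficients effects for a linear mediation model.

    alpha: voxelwise treatment-on-mediator coefficients (assumed name)
    beta:  voxelwise mediator-on-outcome coefficients (assumed name)
    gamma: scalar direct effect of treatment on the outcome (assumed name)
    x, x_star: the two exposure levels being compared
    """
    if len(alpha) != len(beta):
        raise ValueError("alpha and beta must have the same number of voxels")
    delta = x - x_star
    # Indirect effect: sum over voxels of the product of the two path coefficients.
    indirect = delta * sum(a * b for a, b in zip(alpha, beta))
    direct = delta * gamma
    total = direct + indirect
    # Proportion mediated is undefined when the total effect is zero.
    proportion = indirect / total if total != 0 else float("nan")
    return {"DE": direct, "IE": indirect, "TE": total, "PM": proportion}


effects = mediation_effects(
    alpha=[-0.5, 0.0, -1.0], beta=[2.0, 3.0, 1.0], gamma=-0.5
)
print(effects)
```

Only voxels where both coefficients are nonzero contribute to the indirect effect, which is why jointly selecting voxels in both pathways matters for detecting mediation.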
3. BAYESIAN ESTIMATION METHOD
3.1. Prior specification
To tackle the challenging estimation of ultra-high-dimensional imaging coefficients in the longitudinal mediation model, we employ a Bayesian method that aims to (1) select influential voxels among the ultra-high-dimensional mediator; (2) account for spatial dependency and smoothness; and (3) increase the power to detect influential voxels with mediation effects. The rationale behind these considerations is three-fold and closely tied to the scientific hypotheses of neuroimaging studies for pediatric cancer survivors. First, given the high-resolution structural MRI, the mediators can comprise hundreds of thousands of voxels, but compared to the whole brain, only a handful of regions are particularly vulnerable to injury from radiation and are associated with specific neurocognitive impairment [1, 2]. Second, spatial information and smooth structures are commonly incorporated in imaging analysis, leading to the discovery of latent synergies across and/or at the boundary of brain anatomical structures [10, 30]. Third, the detection rate of influential voxels is oftentimes low due to the vast number of voxels and limited sample sizes in pediatric cancer studies [2]. To address variable selection and spatial smoothness for neuroimaging mediators, we propose a BI-GMRF prior distribution for the imaging coefficients in both pathways, which is essentially a combination of a binary Ising random field and a Gaussian Markov random field. The choice of this specific prior rests on four key reasons that correspond to the aforementioned clinical plausibility. First, the latent binary mask prior, often referred to as the “spike and slab” prior [33], is frequently employed to foster sparsity in high-dimensional data analysis, mirroring the spirit of the Lasso [53] in frequentist statistics.
Second, the binary Ising prior, a binary spatial Markov random field, is specially used to account for the spatial dependency of imaging voxels. This is implemented alongside binary indicators to establish a latent mask that simultaneously governs the general sparsity of influential voxels and their grouping structures, paralleling the fused Lasso [54] in the context of spatial coherence. The Ising prior has seen wide application in imaging analysis, as evident in the early work [61], as well as more influential applications in functional MRI [48, 46, 29]. Third, recent studies have utilized the combination of BI and GMRF priors [17, 16]. The Gaussian Markov random field encapsulates the smoothness of influential or non-zero coefficients, a concept that has found widespread acceptance in imaging analysis [11, 3, 47]. Fourth, we employ a joint Ising prior that considers both paths concurrently, thereby enhancing detection power. Compared to the existing use of Ising prior for variable selection [29, 16], our joint analysis is innovative and important in that the joint binary Ising random field for imaging coefficients and encourages the identification of mediation effects .
The selection of influential/nonzero voxels and is controlled by the latent binary indicators and , respectively. Specifically, let if , and is to be estimated if . is defined in a similar way. Let , , , . Then, the joint binary Ising random field for is given by
| (3) |
where , , , , and are hyperparameters fixed a priori, is an indicator function, and is the set of voxels in the neighborhood of voxel . We make a few remarks. First, the neighborhood can be defined based on the application and the imaging geometrical structure. For example, we define as the subset of six adjacent voxels that are in the 3D TBSS skeleton. And it is natural to define as the subset of four adjacent voxels for 2D imaging data, such as slices of structural MRI or brain connectivity. The selection of a specific local neighborhood is typically based on the nearest voxels along each dimension. This approach is widely used in neuroimaging analysis [47]. Alternatively, a data-driven approach could be employed to establish a latent neighborhood through regularization, fostering spatial interdependence without the need for a priori assumptions [30]. Second, the hyperparameters of Ising distribution are assumed to be the same over all voxels, which is a compromise due to the computational consideration as it reduces the number of hyperparameters to a manageable scale. The hyperparameters play different roles in controlling model complexity. First, and control the proportion of ones in and and consequently the overall sparsity of and , respectively. Second, and are chosen to further induce spatial dependency of voxels with nonzero and to form grouped regions with influential voxels. Third, is used to characterize the concurrence between and , and therefore the mediation effect. The voxels with both and are assigned with a higher prior probability, leading to a higher detection rate of voxels with mediation effects. In Section 3.3, we will discuss the selection of hyperparameters through a cross-validation procedure.
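The roles of the five hyperparameters can be illustrated with a small Python sketch of an unnormalized joint Ising log prior on a six-neighbor 3D grid. The functional form below (sparsity terms, equality-indicator smoothness terms, and a concurrence term) is an assumption consistent with the description above rather than a transcription of equation (3), and the names `a_alpha`, `a_beta`, `b_alpha`, `b_beta`, and `c` are hypothetical.

```python
# Six face-adjacent offsets defining the assumed 3D neighborhood.
OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def neighbors(v, voxels):
    """Return the face-adjacent neighbors of voxel v that lie in `voxels`."""
    x, y, z = v
    candidates = [(x + dx, y + dy, z + dz) for dx, dy, dz in OFFSETS]
    return [u for u in candidates if u in voxels]


def ising_log_prior(d_alpha, d_beta, a_alpha, a_beta, b_alpha, b_beta, c):
    """Unnormalized log prior of two binary masks under an assumed joint Ising form.

    d_alpha, d_beta: dicts mapping voxel coordinates to 0/1 indicators.
    a_*: sparsity parameters; b_*: spatial-dependency parameters;
    c: concurrence parameter rewarding voxels selected in both masks.
    """
    total = 0.0
    for v in d_alpha:
        total += a_alpha * d_alpha[v] + a_beta * d_beta[v]
        total += c * d_alpha[v] * d_beta[v]
        for u in neighbors(v, d_alpha):
            if u > v:  # count each neighboring pair once
                total += b_alpha * (d_alpha[u] == d_alpha[v])
                total += b_beta * (d_beta[u] == d_beta[v])
    return total


mask_alpha = {(0, 0, 0): 1, (1, 0, 0): 1, (2, 0, 0): 0}
mask_beta = {(0, 0, 0): 1, (1, 0, 0): 0, (2, 0, 0): 0}
print(ising_log_prior(mask_alpha, mask_beta, -2.0, -2.0, 1.0, 1.0, 0.5))
```

Negative sparsity parameters penalize each selected voxel, positive spatial parameters reward agreement between neighbors, and a positive concurrence parameter raises the prior weight of voxels selected in both pathways.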
Let , , and be the sub-vectors of , , and with the component at the th voxel removed, respectively. The conditional prior distribution of , given and , is specified through the GMRF as , , where is the number of elements in , is the local average of the coefficients in the neighborhood of voxel , and is a point mass at zero. The conditional prior distribution of , given and , is defined similarly. , , where . The spatial smoothing of imaging coefficients is achieved by the local average of and , and and control the smoothness of and . Selection of the tuning parameters and are considered in Section 3.3.
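A minimal Python sketch of such a conditional prior is given below: a point mass at zero when the indicator is zero, and otherwise a Gaussian centered at the local neighborhood average. Scaling the variance by the number of neighbors is an assumption borrowed from common conditional autoregressive specifications, and the names `local_average`, `draw_conditional_prior`, and `sigma2` are hypothetical.

```python
import random


def local_average(coef, neighbor_ids):
    """Average of the coefficients over the neighborhood of a voxel."""
    return sum(coef[u] for u in neighbor_ids) / len(neighbor_ids)


def draw_conditional_prior(coef, neighbor_ids, delta_v, sigma2, rng):
    """Draw one coefficient from an assumed spike-and-slab GMRF conditional prior.

    delta_v == 0 returns the point mass at zero; otherwise the draw is Gaussian,
    centered at the local average, with variance sigma2 / (number of neighbors).
    """
    if delta_v == 0:
        return 0.0
    mean = local_average(coef, neighbor_ids)
    sd = (sigma2 / len(neighbor_ids)) ** 0.5
    return rng.gauss(mean, sd)


rng = random.Random(0)
coef = {0: 1.0, 1: 3.0, 2: 10.0}
print(draw_conditional_prior(coef, [0, 1], delta_v=1, sigma2=0.1, rng=rng))
```

A smaller `sigma2` pulls each nonzero coefficient more strongly toward its neighbors, which is how the variance parameter controls smoothness in this sketch.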
In addition, we assume a GMRF prior distribution for the mean function of imaging mediator , , where , and is a hyperparameter controlling local smoothness. Conjugate priors are assigned to scalar parameters. Specifically, Gaussian prior distributions are assigned to , , , , and , and inverse gamma distributions are assigned to and .
3.2. Posterior computation
Draws from the posterior distribution of the longitudinal mediation model are generated using a Markov chain Monte Carlo (MCMC) algorithm through Gibbs samplers. Specifically, for each voxel a single-site Gibbs sampler is deployed using the location-specific joint posterior probability of the latent binary indicator and the corresponding imaging coefficient. This is essentially a draw from a Bernoulli distribution accounting for BI-GMRF prior information. The site-specific joint posterior distribution of is given by , where is the odds, and
Similarly, the site-specific joint posterior distribution of is given by , and
The details of calculating and are included in the Appendix. Given that , the computational time mainly depends on the sampling of the parameters indexed by the total number of voxels . See the Appendix for detailed posterior distributions and an overview of the sampling schemes.
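The Bernoulli step of the single-site Gibbs sampler can be sketched in Python as follows. The sketch assumes the posterior log odds for a voxel have already been computed (their derivation is in the Appendix and is not reproduced here), and the function names `inclusion_probability` and `gibbs_indicator_draw` are hypothetical.

```python
import math
import random


def inclusion_probability(log_odds):
    """Convert log odds to a probability, odds / (1 + odds), in a stable way."""
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


def gibbs_indicator_draw(log_odds, rng):
    """Draw a binary indicator from Bernoulli(odds / (1 + odds))."""
    return 1 if rng.random() < inclusion_probability(log_odds) else 0


rng = random.Random(0)
print(gibbs_indicator_draw(math.log(3.0), rng))
```

Working on the log scale avoids overflow when the odds are extreme, which can easily occur when likelihood contributions are accumulated over many subjects.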
3.3. Implementation
Following [16], we adopt a five-fold cross-validation procedure to select hyperparameters in the proposed model. Alternative approaches for Bayesian model selection include information criteria, such as DIC and WAIC [14, 62]. Here, we choose to use cross-validation due to its simplicity and good empirical performance. The subjects in each fold are randomly selected. We hold out one fold and train the model using the rest of the data. We then test the trained model using the holdout and select tuning parameters with the minimum squared errors given by
| (4) |
| (5) |
where the hat version of the fixed effects is estimated without the th group, and is defined similarly but obtained by best linear unbiased prediction. To alleviate computational burden, a three-step strategy is employed. For , , , we select tuning parameters according to equation (4) with fixed to 0. For , , , and we choose tuning parameters based on equation (5) while fixing . Finally, we minimize the sum of equations (4) and (5) to determine given other hyperparameters.
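Because the outcome is longitudinal, the folds are formed over subjects rather than over individual visits, so that all visits of a held-out subject are excluded from training. A minimal Python sketch of such a split is shown below; the function names and the row layout `(subject_id, time, outcome)` are hypothetical.

```python
import random


def subject_folds(subject_ids, k=5, seed=0):
    """Randomly partition the unique subject IDs into k folds."""
    ids = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    return [ids[i::k] for i in range(k)]


def train_test_rows(rows, held_out):
    """Split longitudinal rows so that all visits of held-out subjects are test rows."""
    held = set(held_out)
    train = [r for r in rows if r[0] not in held]
    test = [r for r in rows if r[0] in held]
    return train, test


# Toy data: 10 subjects with 3 visits each; outcome values are placeholders.
rows = [(s, t, 0.0) for s in range(10) for t in (12, 24, 36)]
folds = subject_folds([r[0] for r in rows], k=5)
train, test = train_test_rows(rows, folds[0])
print(len(train), len(test))
```

Each candidate hyperparameter value would then be scored by the held-out squared error averaged over the folds.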
We offer some guidance about finding suitable ranges of tuning parameters for the cross-validation procedure. In accordance with [16], a useful starting range for the Ising prior parameters, and , is (−4, 0), where −4 and 0 represent sparse and dense coefficients, respectively. The suggested range for and is (0, 2), where 2 enforces the spatial dependency of the binary indicators whereas 0 assumes no spatial dependency. We also suggest choosing with cross-validation because it is important in controlling under- and over-fitting and its optimal choice depends on the scale of the imaging mediator and the signal strength. The spatial smoothness of and depends on and , respectively, the choices of which are less critical and also depend on the scale of the imaging mediator. The same is true for . An alternative is to adopt a fully Bayesian approach for finding tuning parameters, but this approach is computationally intensive and its performance is sensitive to the choice of according to our numerical studies. See [17, 16] for similar findings.
4. DATA ANALYSIS
4.1. Descriptions
In this section, we illustrate the proposed method using the SJMB03 data, with the aim of identifying influential brain regions that mediate the negative effect of treatment on long-term neurocognitive development. To form the analysis group, we include pediatric patients for whom both imaging after irradiation and processing speed test scores at baseline are available. We only include patients with at least one cognitive measurement after baseline. As a result, the final data include 103 patients with 513 observations and the average number of cognitive assessments after baseline is 4.05. Of these 103 patients, 31 are HR patients (16 males; baseline age 8.92 years), and the remaining 72 are AR patients (48 males; baseline age 11.1 years). Patients with minimal localized disease were assigned to the AR group and received lower radiation doses ranging from 23.4 Gy to 55.8 Gy. Patients with more than 1.5 cm² of residual disease and/or metastatic spread were assigned to the HR group and received higher radiation doses ranging from 36 Gy to 59.4 Gy. A more detailed description of the SJMB03 study can be found in Gajjar et al. [13]. The binary treatment variable distinguishes the AR group from the HR group. The neurocognitive score of interest is the processing speed score derived from the Woodcock-Johnson III Tests of Cognitive Abilities [44]. The measurement is based on visual matching and decision speed tests. A higher age-standardized score (mean = 100; standard deviation = 15) indicates better neurocognitive function. To evaluate processing speed over time, we use the change in processing speed score from baseline to follow-up visits at 12, 24, 36, 48, and 60 months as the longitudinal outcome. To adjust for confounding effects in our non-randomized study, we adopt the idea of the propensity score [8, 21].
Specifically, the propensity score in this analysis is obtained as the probability that an individual receives the HR treatment given sex, baseline age and processing speed via a logistic regression.
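The propensity score step can be sketched in plain Python as a logistic regression fitted by gradient ascent. This is an illustrative implementation on simulated covariates, not the software or data used in the analysis; the function names, learning rate, and iteration count are assumptions, and covariates are assumed to be standardized beforehand.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_predictor(w, xi):
    """Intercept w[0] plus the weighted sum of covariates."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))


def fit_logistic(X, y, lr=0.5, n_iter=500):
    """Fit logistic regression by gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            resid = yi - sigmoid(linear_predictor(w, xi))
            grad[0] += resid
            for j, xj in enumerate(xi):
                grad[j + 1] += resid * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def propensity_scores(X, w):
    """Estimated probability of receiving the treatment for each subject."""
    return [sigmoid(linear_predictor(w, xi)) for xi in X]


# Simulated, standardized covariates (placeholders for sex, age, baseline score).
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
y = [1 if rng.random() < sigmoid(-0.5 + x[1]) else 0 for x in X]
scores = propensity_scores(X, fit_logistic(X, y))
print(min(scores), max(scores))
```

The resulting score summarizes the measured covariates in a single number, which can then enter the mediation model as the confounder.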
All imaging studies are performed on one of two 1.5T MAGNETOM Avanto MRI scanners, a 3.0T MAGNETOM Trio MRI scanner, or a 3.0T MAGNETOM Skyra MRI scanner (all from Siemens Medical Systems). DTI data are acquired using bipolar diffusion-encoding gradients. In this analysis, we use the MRI data acquired after the completion of RT as the mediator. The diffusion tensor is used to compute the FA at each point in the image, which measures the directional organization of a region and reflects myelin integrity. FA maps are processed via the TBSS pipeline in FSL [20], which minimizes the inter-subject variability and is, therefore, more reliable [49]. We use TBSS to register all FA maps to a common atlas space, which is achieved by using the nonlinear registration tools in FSL. Each FA image is normalized to the FMRIB58 FA standard space in Montreal Neurological Institute coordinates. The mean image after registration for all subjects is calculated, and WM voxels are identified by using an FA lower threshold of 0.25. The final WM skeleton represents the fiber bundle centers across all subjects. Then, each participant’s FA data are mapped onto the skeleton to represent the physiologic characteristics of that participant, without having to make allowance for the structural variability of the participants or bias due to the alignment. The image dimensions are 182 × 218 × 182, and the number of WM voxels on the skeleton is 89,456, as shown in Figure 2. The FA variable derived from the DTI images is known as a useful index for the evaluation of radiation-induced WM injury in children with medulloblastoma [22, 35]. Specifically, TBSS-FA measurements after the completion of RT have been used to quantify the radiation-dose-dependent WM injury [36].
Figure 2.

3D visualization of the TBSS skeleton in the SJMB03 analysis.
After carefully evaluating the causal assumptions for assessing the validity of the causal conclusions regarding the SJMB03 data, we conduct data analysis using the proposed BI-GMRF approach. The tuning parameters are determined by a 5-fold cross-validation as described in Section 3.3. The other Gaussian-type priors are chosen to be noninformative, with mean 0 and large variance 100. The shape and scale hyperparameters of the inverse gamma priors are (3, 5) for , and (1, 1) for . The proposed MCMC algorithm converges within 4,000 iterations as suggested by trace plots. The subsequent 4,000 samples are collected for posterior inference with reasonable elapsed time of 1,293 secs on a generic laptop.
4.2. Results
Table 1 summarizes the number of significant voxels of the treatment-on-mediator coefficient, the mediator-on-outcome coefficient, and their voxelwise product (the mediation effect) within each anatomical region, where the positive and negative effects are counted separately. For the treatment-on-mediator coefficient, 13.6% (12,204/89,456) of voxels are significant, of which 61.0% are negative. For the mediator-on-outcome coefficient, 14.3% (12,784/89,456) of voxels are significant, of which 54.1% are positive. For the mediation effect, 12.9% (11,504/89,456) of voxels are significant, of which 62.3% are negative. 3D visualizations of the three estimates are provided in Figure 3. To facilitate the interpretation, the TBSS skeleton is matched with the JHU-ICBM-DTI-81 white matter atlas [34], which covers approximately 1/3 of all significant voxels.
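The reported percentages follow directly from the positive and negative voxel counts in Table 1 and the skeleton size of 89,456 voxels, as the short Python check below shows. The dictionary keys are descriptive placeholders for the three sets of estimates, listed in the order in which they are reported.

```python
TOTAL_VOXELS = 89_456

# (positive count, negative count) for each estimate, taken from Table 1.
COUNTS = {
    "first estimate": (4758, 7446),
    "second estimate": (6916, 5868),
    "third estimate": (4338, 7166),
}


def summarize(pos, neg, total=TOTAL_VOXELS):
    """Return (significant count, % significant, % positive, % negative)."""
    significant = pos + neg
    return (
        significant,
        round(100 * significant / total, 1),
        round(100 * pos / significant, 1),
        round(100 * neg / significant, 1),
    )


for name, (pos, neg) in COUNTS.items():
    print(name, summarize(pos, neg))
```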
Table 1.
Summary of the estimated nonzero imaging coefficients of the SJMB03 analysis. Column groups, in order: treatment effect on TBSS-FA (Trt→FA), TBSS-FA effect on processing speed (FA→PS), and mediation effect (Med); + and − denote positive and negative effects.
| Brain region | Trt→FA + (#) | Trt→FA + (%) | Trt→FA − (#) | Trt→FA − (%) | FA→PS + (#) | FA→PS + (%) | FA→PS − (#) | FA→PS − (%) | Med + (#) | Med + (%) | Med − (#) | Med − (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Total Voxel Count | 4758 | | 7446 | | 6916 | | 5868 | | 4338 | | 7166 | |
| Corpus Callosum | 153 | 12.6 | 159 | 6.5 | 267 | 11.1 | 61 | 4.3 | 149 | 11.5 | 141 | 6.5 |
| Corticospinal Tract | 77 | 6.3 | 401 | 16.4 | 330 | 13.8 | 145 | 10.2 | 119 | 9.2 | 350 | 16.1 |
| Medial Lemniscus | 4 | 0.3 | 118 | 4.8 | 122 | 5.1 | 4 | 0.3 | 8 | 0.6 | 112 | 5.1 |
| Cerebellar Peduncle | 385 | 31.6 | 802 | 32.8 | 816 | 34 | 381 | 26.9 | 357 | 27.7 | 810 | 37.2 |
| Cerebral Peduncle | 1 | 0.1 | 33 | 1.3 | 36 | 1.5 | 7 | 0.5 | 2 | 0.2 | 27 | 1.2 |
| Internal Capsule | 103 | 8.5 | 219 | 8.9 | 158 | 6.6 | 184 | 13 | 131 | 10.1 | 173 | 7.9 |
| Corona Radiata | 162 | 13.3 | 350 | 14.3 | 279 | 11.6 | 279 | 19.7 | 233 | 18 | 228 | 10.5 |
| Posterior Thalamic Radiation | 18 | 1.5 | 67 | 2.7 | 59 | 2.5 | 25 | 1.8 | 15 | 1.2 | 54 | 2.5 |
| Sagittal Stratum | 89 | 7.3 | 17 | 0.7 | 92 | 3.8 | 16 | 1.1 | 76 | 5.9 | 20 | 0.9 |
| External Capsule | 87 | 7.1 | 90 | 3.7 | 83 | 3.5 | 101 | 7.1 | 47 | 3.6 | 118 | 5.4 |
| Cingulum | 90 | 7.4 | 59 | 2.4 | 82 | 3.4 | 84 | 5.9 | 64 | 5 | 79 | 3.6 |
| Fornix | 10 | 0.8 | 33 | 1.3 | 31 | 1.3 | 11 | 0.8 | 9 | 0.7 | 25 | 1.1 |
| Superior Longitudinal Fasciculus | 38 | 3.1 | 74 | 3 | 41 | 1.7 | 99 | 7 | 61 | 4.7 | 40 | 1.8 |
| Superior Fronto-occip Fasciculus | 0 | 0 | 4 | 0.2 | 1 | 0 | 3 | 0.2 | 2 | 0.2 | 1 | 0 |
| Uncinate Fasciculus | 1 | 0.1 | 21 | 0.9 | 1 | 0 | 18 | 1.3 | 18 | 1.4 | 0 | 0 |
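The proportions quoted alongside Table 1 can be reproduced from its "Total Voxel Count" row and the skeleton size of 89,456 voxels. The short Python check below is only an illustration; the dictionary keys and the helper `summarize` are our own names, and the three entries follow the order of the three sign-split column groups of the table.

```python
# Totals copied from the "Total Voxel Count" row of Table 1.
SKELETON_VOXELS = 89_456
totals = {
    "coefficient 1": {"pos": 4758, "neg": 7446},
    "coefficient 2": {"pos": 6916, "neg": 5868},
    "coefficient 3": {"pos": 4338, "neg": 7166},
}


def summarize(pos, neg, n_voxels=SKELETON_VOXELS):
    """Return (#significant, % of skeleton, % positive, % negative)."""
    sig = pos + neg
    return sig, 100 * sig / n_voxels, 100 * pos / sig, 100 * neg / sig


summary = {name: summarize(**counts) for name, counts in totals.items()}
for name, (sig, pct_sig, pct_pos, pct_neg) in summary.items():
    print(f"{name}: {sig} significant ({pct_sig:.1f}% of skeleton), "
          f"{pct_pos:.1f}% positive, {pct_neg:.1f}% negative")
```

The same pattern applies to the per-region percentages, which are shares of the atlas-labeled significant voxels within each column.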
Figure 3.

3D visualization of the three estimated imaging coefficients (from top to bottom); from left to right: axial, coronal, and sagittal views. Regions in red demonstrate negative effects on processing speed scores, while those in blue indicate positive effects.
The treatment-effect coefficient accounts for the effect of treatment (HR) on brain white matter microstructure, as measured by TBSS-FA. The majority of its significant voxels are negative, which aligns with the expectation that a higher dose of irradiation should be associated with decreased FA [15]; see Figure 4 for details. The cerebellar peduncle predominantly demonstrates negative associations, as anticipated, and accounts for more than 30% of the significant labeled voxels in the atlas. This region is close to the irradiation boost region and therefore generally receives higher irradiation doses. The corpus callosum and corona radiata are deep white matter areas that are known to be more susceptible to radiation damage, and together they account for more than 20% of the significant negative voxels in the atlas. The negative associations demonstrated by these voxels reflect that a greater dose of irradiation is associated with decreased FA in these regions. The significant effects are seen primarily in posterior portions of the corpus callosum, which receive higher doses of irradiation due to the boost given to the posterior fossa region. The internal capsule also shows predominantly negative associations. This is likely because the white matter fibers that traverse from the cerebellar peduncle through the internal capsule are impacted. Consequently, the significant regions within the internal capsule are also negatively associated with the intensity of the treatment. In addition, regions such as the corticospinal tract and medial lemniscus predominantly display a negative association between FA values and treatment intensity.
Figure 4.

Selected 2D slices of the estimated treatment effect on TBSS-FA. The images are overlaid on the FMRIB58 FA standard space. The skeleton voxels are presented in yellow. Voxels depicted in red indicate significant negative effects, that is, areas where an increase in the irradiation dose corresponds to a decrease in TBSS-FA.
The second coefficient represents the effect of brain white matter microstructure, as measured by TBSS-FA, on the change in processing speed score. The majority of its significant voxels are positive, which aligns with the expectation that a more intact brain microstructure, or higher TBSS-FA, correlates with superior processing speed performance [2]; see Figure 5. The cerebellar peduncle accounts for approximately 34% of the significant positive voxels in the atlas. This region is close to the boost region and therefore generally receives higher irradiation doses. The cerebellar peduncle reveals regions that have a positive effect on processing speed, where higher TBSS-FA is associated with improved performance on test scores. The internal capsule and external capsule also demonstrate positive effects. As previously described, white matter fibers from the irradiation boost region generally pass through the internal and external capsules and display significant regions that positively influence processing speed performance. The detected relationship is consistent with existing studies showing that these regions are positively associated with neurocognitive performance [26, 52]. The corona radiata shows an almost even mix of positive and negative effects. Small regions with positive effects in the superior portions of the corona radiata have been identified in adults in the existing literature [7]. Regions with negative effects have been seen in the anterior corona radiata in patients who have experienced mild traumatic brain injury [64]. Additionally, the corpus callosum accounts for a substantial portion of the significant voxels positively associated with processing speed performance. The corpus callosum is the largest primary white matter tract through which communication between the hemispheres is conducted. The association of neurocognitive performance with white matter integrity in this region has also been recently demonstrated [31, 25].
Figure 5.

Selected 2D slices of the estimated TBSS-FA effect on processing speed. The images are overlaid on the FMRIB58 FA standard space. The skeleton voxels are presented in yellow. Voxels highlighted in blue indicate significant positive effects, that is, areas where an increase in TBSS-FA is associated with higher processing speed scores.
The mediation coefficient reflects the voxels accounting for mediation effects, which are of primary interest. Most of its significant voxels are negative, corroborating our hypothesis that brain white matter microstructure, as measured by TBSS-FA, mediates the detrimental treatment effect on processing speed scores; see Figure 6. The cerebellar peduncle, which generally receives higher doses of irradiation, demonstrates a substantial number of voxels with negative associations. This reflects the mediation pathway in which the high-risk treatment leads to lower brain TBSS-FA, eventually resulting in lower processing speed performance. The corticospinal tract also accounts for a substantial portion of the significant voxels that contribute to the negative mediation pathway. The tract, which plays a major role in cortical control of spinal cord activity, has been shown to be related to human corticospinal tract developmental disorders [57]. The internal capsule, which is connected to the irradiation boost region, also displays significant areas with negative association. The corona radiata, corpus callosum, and internal capsule display a nearly even mix of positive and negative associations, a phenomenon that may warrant further investigation. Overall, the predominant estimates are negative, as expected, and result in a net negative mediation effect.
Figure 6.

Selected 2D slices of the estimated mediation effect. The images are overlaid on the FMRIB58 FA standard space. The skeleton voxels are presented in yellow. Voxels illustrated in red indicate significant negative effects. These are the areas that mediate the negative influence of treatment on processing speed.
The estimated mean causal effects are evaluated from the posterior samples and reported with 95% credible intervals. The IE, which represents the mediation effect of the WM microstructure between treatment and the change in processing speed score, is −0.2376 (−0.2397, −0.2355). The DE, which represents the treatment effect on the change in processing speed unrelated to the WM microstructure, is −0.4184 (−0.4370, −0.3990). The TE, defined as IE + DE, is −0.6560 (−0.6767, −0.6345). Accordingly, PM = IE/TE = 36.22% (35.42%, 37.12%) is the estimated proportion of the treatment effect on the change in processing speed that is mediated by the WM microstructure. In addition, we observe neurodevelopment over time, as evidenced by the positive estimated time effect of 0.1891 (0.1773, 0.2000). The expected mean value of the change is 0.0530 (0.0421, 0.0642).
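As a rough illustration of how such summaries can be obtained from MCMC output, the sketch below computes posterior means and equal-tailed 95% credible intervals for IE, DE, TE = IE + DE, and PM = IE/TE, draw by draw. The function names, the use of equal-tailed intervals, and the synthetic draws are assumptions made for illustration; they are not the SJMB03 posterior samples.

```python
import numpy as np


def summarize_draws(draws, level=0.95):
    """Posterior mean and equal-tailed credible interval of a 1-D sample."""
    draws = np.asarray(draws, dtype=float)
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    return float(draws.mean()), float(lower), float(upper)


def causal_effect_summaries(ie_draws, de_draws, level=0.95):
    """Summaries of IE, DE, TE = IE + DE and PM = IE / TE, computed per draw."""
    ie = np.asarray(ie_draws, dtype=float)
    de = np.asarray(de_draws, dtype=float)
    te = ie + de
    pm = ie / te
    named = {"IE": ie, "DE": de, "TE": te, "PM": pm}
    return {name: summarize_draws(d, level) for name, d in named.items()}


# Synthetic draws for illustration only (hypothetical values).
rng = np.random.default_rng(0)
ie_draws = rng.normal(-0.24, 0.001, size=500)
de_draws = rng.normal(-0.42, 0.010, size=500)
effects = causal_effect_summaries(ie_draws, de_draws)
for name, (mean, lower, upper) in effects.items():
    print(f"{name}: {mean:.4f} ({lower:.4f}, {upper:.4f})")
```

The reported point estimates are internally consistent: −0.2376 + (−0.4184) = −0.6560, and 0.2376/0.6560 ≈ 0.3622.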
5. SIMULATIONS
5.1. Simulation settings
To evaluate the empirical performance of the proposed method, we conduct two sets of numerical studies. In the first set of simulations, we generate data in line with the model assumptions to compare the performance of our proposed model with that of competing models, which are detailed in Section 5.2. We simulate the data set with sample size using the 2D round and butterfly images, each with the dimension of 64 × 64. According to equation (1), the treatment indicator is a univariate binary variable generated with probability 0.5, and are generated based on the corresponding 2D image with in the shaded region and elsewhere. The scalar and are fixed to 0.01 and 0.5, respectively. The confounder is generated from a standard normal distribution. For equation (2), we let for , and set five time points for each patient to mimic the real data. are generated similarly to with in the shaded region and elsewhere. For the fixed effects, we let ,, , and . For the subject-specific random intercept , we let . We set . For simplicity, is not used. Hyperparameters are determined through five-fold cross-validation as before. For each scenario, the parameters are estimated by the MCMC algorithm with 1000 iterations. The first 500 draws are discarded as warm-up and the remaining 500 draws are used for posterior inference. We check the convergence of the posterior samples through trace plots of each parameter and several pilot chains with well-separated starting values; the diagnostic information is not shown here in the interest of space. For each image scenario, we replicate the numerical study 100 times.
In the second simulation study, we examine the robustness of the proposed model against violations of assumptions, in a manner similar to robustness investigations of the linear mixed effects model [19, 43]. Here, we consider three specific settings, corresponding to the outcome submodel and mediator submodel. First, we consider the scenario where the random intercept and measurement error follow non-Gaussian distributions: and , where and are as previously defined. Second, we consider the scenario where a heteroskedastic random intercept and measurement error are present in the longitudinal outcomes. While we continue to assume that and follow normal distributions with zero mean, we set as and as . Third, we generate the mediators with measurement errors that are correlated. Here, we assume that follow a multivariate normal distribution with zero mean and a covariance matrix whose diagonal elements are set to 1 and whose off-diagonal elements are all 0.01. Unless otherwise specified, we maintain the same settings as in the previous simulation study.
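The mediator part of this data-generating process can be sketched in Python as follows. Equation (1) is not restated in this section, so the additive form (intercept plus treatment effect plus confounder effect plus Gaussian noise), the sample size `n=100`, the disc radius, the signal value of 1, and the assignment of the quoted scalars 0.01 and 0.5 to the intercept and confounder coefficient are all assumptions made for illustration.

```python
import numpy as np


def round_mask(size=64, radius=12):
    """Boolean size-by-size image that is True inside a centered disc."""
    yy, xx = np.mgrid[:size, :size]
    center = (size - 1) / 2.0
    return (yy - center) ** 2 + (xx - center) ** 2 <= radius ** 2


def simulate_mediators(n=100, size=64, signal=1.0, intercept=0.01,
                       conf_coef=0.5, noise_sd=1.0, seed=0):
    """Draw treatment, confounder and image mediators (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    alpha = np.where(round_mask(size), signal, 0.0)  # coefficient image
    x = rng.binomial(1, 0.5, size=n)                 # binary treatment, P(X = 1) = 0.5
    c = rng.standard_normal(n)                       # standard normal confounder
    noise = noise_sd * rng.standard_normal((n, size, size))
    m = (intercept
         + x[:, None, None] * alpha                  # treatment effect, pixel by pixel
         + conf_coef * c[:, None, None]              # same confounder effect at every pixel
         + noise)
    return x, c, m, alpha


x, c, m, alpha = simulate_mediators()
print(m.shape, int(alpha.sum()), x.mean())
```

A butterfly-shaped coefficient image would only require replacing `round_mask` with a different boolean mask of the same size.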
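For the third setting, a covariance matrix with unit diagonal and a common off-diagonal value of 0.01 has a compound-symmetric structure, so correlated errors can be drawn without forming the full 4096 × 4096 matrix. The sketch below uses a shared-factor construction; this is one standard way to sample from such a distribution and is our illustration, not necessarily how the study generated its data. The function name and arguments are hypothetical.

```python
import numpy as np


def compound_symmetric_noise(n, p, rho=0.01, seed=0):
    """Draw n vectors from N(0, (1 - rho) * I + rho * 11^T) via a shared factor."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal((n, 1))   # one factor shared by all pixels of a subject
    own = rng.standard_normal((n, p))      # independent pixel-level component
    # Var = rho + (1 - rho) = 1 and Cov between two pixels = rho.
    return np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * own


errors = compound_symmetric_noise(n=500, p=64 * 64)
print(errors.shape, errors.var())
```

Each row can be reshaped to 64 × 64 and added to the mediator image of one subject.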
5.2. Simulation results
In the first simulation study, because there is no existing mediation model for longitudinal outcomes with high-dimensional imaging mediators, we compare the proposed BI-GMRF method to a combination of two matrix/tensor regression models widely used in imaging analysis. The first competitor, as an alternative to equation (1), is the low-rank linear regression model for matrix responses [24], denoted by L2RM. The second competitor, as an alternative to equation (2), is the tensor regression model [63], denoted by TR. These two models assume low-rank structures for the matrix/tensor imaging coefficients and through a soft- and hard-thresholding approach, respectively. The rank of L2RM is selected by nuclear norm regularization. Note that TR was not proposed for longitudinal data; thus, we stack all observations to fit TR with the optimal rank of 3 selected by BIC. We also use a small cutoff to threshold the nonzero estimates of L2RM and TR. We evaluate the estimation accuracy for imaging coefficients and causal effects with total squared errors and squared errors , respectively. The variable selection performance is evaluated by false positive rates (FPR) and false negative rates (FNR).
To visualize the imaging coefficients estimated using the various methods, we show heat maps of the estimates averaged over 100 replications in Figure 7. In all scenarios, the proposed BI-GMRF method outperforms its competitors in detecting nonzero regions with clear margins. The L2RM method is able to recover the desired region, but at the expense of false positives. The TR method performs least satisfactorily in recovering the true signal, which leads to inferior performance in recognizing mediation effects . This may be due to the misspecified rank of hard-thresholding.
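The evaluation criteria can be sketched as follows. The exact formulas are not restated in this section, so the definitions below are assumptions: FNR is the share of truly nonzero pixels that are not selected, FPR is the share of truly zero pixels that are selected, and the total squared error is the sum of squared differences over all pixels. The helper `median_iqr` mirrors the "median (IQR)" summaries reported over replications; all names are hypothetical.

```python
import numpy as np


def selection_and_error_metrics(estimate, truth, cutoff=0.0):
    """FNR, FPR and total squared error for one estimated coefficient image."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    selected = np.abs(estimate) > cutoff   # pixels declared nonzero
    signal = truth != 0                    # pixels that are truly nonzero
    n_signal = signal.sum()
    n_null = (~signal).sum()
    fnr = (signal & ~selected).sum() / n_signal if n_signal else 0.0
    fpr = (~signal & selected).sum() / n_null if n_null else 0.0
    tse = ((estimate - truth) ** 2).sum()
    return float(fnr), float(fpr), float(tse)


def median_iqr(values):
    """Median and interquartile range across replications."""
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    return float(med), float(q3 - q1)


# Toy example: one missed signal pixel and one false positive.
truth = np.array([[1.0, 1.0], [0.0, 0.0]])
estimate = np.array([[0.9, 0.0], [0.2, 0.0]])
fnr, fpr, tse = selection_and_error_metrics(estimate, truth)
print(fnr, fpr, round(tse, 2))
```

In a replication study, the three metrics would be collected per replicate and then passed column by column to `median_iqr`.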
Figure 7.

The 2D simulation study showing a comparison of the BI-GMRF method and its competitors. The three rows from top to bottom correspond to imaging coefficients , , and . The true images are shown in the first and fourth columns, the estimates from BI-GMRF are shown in the second and fifth columns, and those from L2RM+TR are shown in the third and sixth columns.
Table 2 summarizes details for evaluating estimation accuracy and variable selection. In terms of variable selection, the proposed BI-GMRF method achieves the lowest FNR and FPR in all scenarios. For FNR, the BI-GMRF method is capable of identifying signal regions with 100% accuracy. While L2RM+TR performs well for , it does not achieve the same success for , and the error rates for increase significantly. In terms of FPR, our proposed method consistently exhibits the smallest error rates across all settings, with a substantial improvement over the competing approach. For total squared errors, the BI-GMRF method outperforms the others for all images. This superiority is especially pronounced for and .
Table 2.
Medians (IQRs) of false negative rates, false positive rates, and total squared errors for 2D simulation results over 100 replications
| Scenario | Coefficient | Method | FNR | FPR | TSE |
|---|---|---|---|---|---|
| Round | Imaging coefficient, eq. (1) | BI-GMRF | 0 (0) | 0.0035 (0.0008) | 0.2005 (0.0524) |
| | | L2RM+TR | 0 (0) | 0.1563 (0.0239) | 0.9076 (0.1181) |
| | Imaging coefficient, eq. (2) | BI-GMRF | 0 (0) | 0.0045 (0.0011) | 0.0229 (0.0043) |
| | | L2RM+TR | 0.0515 (0.0294) | 0.9303 (0.0119) | 98.0116 (19.6011) |
| | Mediation effect | BI-GMRF | 0 (0) | 0.0010 (0.0005) | 0.0021 (0.0006) |
| | | L2RM+TR | 0.4138 (0.2069) | 0.0167 (0.0036) | 0.1008 (0.0324) |
| Butterfly | Imaging coefficient, eq. (1) | BI-GMRF | 0 (0) | 0.0207 (0.0026) | 0.4733 (0.1070) |
| | | L2RM+TR | 0 (0) | 0.2793 (0.0214) | 2.4480 (0.2029) |
| | Imaging coefficient, eq. (2) | BI-GMRF | 0 (0) | 0.0271 (0.0020) | 0.0851 (0.0141) |
| | | L2RM+TR | 0.0588 (0.0241) | 0.9326 (0.0079) | 112.4496 (20.1869) |
| | Mediation effect | BI-GMRF | 0 (0) | 0.0027 (0.0007) | 0.0065 (0.0022) |
| | | L2RM+TR | 0.4722 (0.1146) | 0.0211 (0.0053) | 0.1644 (0.0296) |
To evaluate the causal effects IE, DE, and TE, we summarize their squared errors in Table 3. The proposed BI-GMRF method yields the most accurate estimates in both scenarios. It is noteworthy that L2RM+TR shows a substantial bias even for the direct effect.
Table 3.
Medians (IQRs) of squared errors for 2D simulation results over 100 replications
| Scenario | Causal Effect | BI-GMRF | L2RM+TR |
|---|---|---|---|
| Round | IE | 0.0155 (0.0153) | 0.6639 (0.9590) |
| | DE | 0.0004 (0.0008) | 9.2977 (6.1188) |
| | TE | 0.0132 (0.0138) | 16.0264 (9.1187) |
| Butterfly | IE | 0.0534 (0.0711) | 4.6716 (2.7702) |
| | DE | 0.0073 (0.0093) | 2.7476 (4.151) |
| | TE | 0.0216 (0.0454) | 14.5419 (8.8784) |
Tables 4 and 5 present the robustness findings from the second simulation study. These results align closely with the performance metrics shown in Tables 2 and 3, respectively, which indicates the robustness of the proposed model to assumption violations. While performance deteriorates slightly, it remains within a comparable range. The most affected scenario is the one with correlated measurement errors, suggesting a potential area for future improvement in accounting for such correlations. Nonetheless, these violations of assumptions have only minor effects on the estimation. As revealed by the heat maps (not shown), the estimated image coefficients still closely resemble those seen in Figure 7.
6. DISCUSSION
In this work, we propose an imaging mediation model for longitudinal outcomes with high-dimensional mediators. We adopt a Bayesian approach to address challenges in the longitudinal mediation model, uncovering important and interpretable voxelwise mediators and increasing the power to detect causal effects. Numerical studies are carried out to examine the numerical properties of the proposed method and its competitors. We further apply the proposed Bayesian method to analyze a pediatric cancer survivor dataset. By taking into account spatial dependency and smoothness, we successfully identify voxelwise WM microstructural damage that mediates the treatment effect on longitudinal neurocognitive outcomes. The regions we identify are in agreement with the previous literature on cognitive outcomes in pediatric cancer. The newly found regions suggest future directions for neurocognitive research. Indeed, we could delve deeper by analyzing the influence of existing photon irradiation treatments on these specific brain regions and subsequent cognitive outcomes, such as processing speed. In addition, we could design treatment studies employing more targeted therapies, such as proton therapy, aiming to protect these vulnerable brain regions. One limitation of the proposed model is that it only allows for a single longitudinal outcome. It is of interest to develop mediation models for multiple longitudinal outcomes, as the correlations between cognitive measures could be leveraged to enhance the detection of informative voxels [27, 28].
ACKNOWLEDGEMENTS
The authors thank the editors, guest-editors, associate editor, and reviewer for their insightful feedback. Cai Li wishes to express his deep appreciation to Professor Heping Zhang for his invaluable mentorship and guidance.
This work is partially supported by NIH grant P30-CA021765 and the ALSAC.
APPENDIX A. APPENDIX
Using the BI-GMRF prior, we have the site-specific joint posterior distribution of and as and , respectively. With
we have as the product of the following three terms,
Similarly, with
we have as the product of the following three terms,
Table 4.
Medians (IQRs) of false negative rates, false positive rates, and total squared errors for robustness simulation results over 100 replications
| Scenario | Coefficient | Setting | FNR | FPR | TSE |
|---|---|---|---|---|---|
| Round | Imaging coefficient, eq. (1) | Non-Gaussian | 0 (0) | 0.0043 (0.0008) | 0.2048 (0.0461) |
| | | Heteroskedasticity | 0 (0) | 0.0039 (0.0013) | 0.2071 (0.0514) |
| | | Correlated Covariates | 0 (0.0074) | 0.0076 (0.0018) | 0.7947 (0.3343) |
| | Imaging coefficient, eq. (2) | Non-Gaussian | 0 (0) | 0.0045 (0.0011) | 0.0264 (0.0048) |
| | | Heteroskedasticity | 0 (0) | 0.0053 (0.0008) | 0.0298 (0.0059) |
| | | Correlated Covariates | 0 (0) | 0.0058 (0.0008) | 0.0362 (0.0068) |
| | Mediation effect | Non-Gaussian | 0 (0) | 0.0010 (0.0005) | 0.0022 (0.0009) |
| | | Heteroskedasticity | 0 (0) | 0.0012 (0.0005) | 0.0027 (0.0007) |
| | | Correlated Covariates | 0 (0) | 0.0012 (0.0005) | 0.0056 (0.003) |
| Butterfly | Imaging coefficient, eq. (1) | Non-Gaussian | 0 (0) | 0.0212 (0.0031) | 0.4987 (0.1082) |
| | | Heteroskedasticity | 0 (0) | 0.0227 (0.0026) | 0.4893 (0.0957) |
| | | Correlated Covariates | 0.0399 (0.0426) | 0.0222 (0.0040) | 0.6337 (0.1034) |
| | Imaging coefficient, eq. (2) | Non-Gaussian | 0 (0) | 0.0274 (0.0025) | 0.0939 (0.0168) |
| | | Heteroskedasticity | 0 (0) | 0.0282 (0.0026) | 0.0941 (0.0132) |
| | | Correlated Covariates | 0 (0) | 0.0307 (0.0035) | 0.1098 (0.0207) |
| | Mediation effect | Non-Gaussian | 0 (0) | 0.0030 (0.001) | 0.0071 (0.0016) |
| | | Heteroskedasticity | 0 (0) | 0.0027 (0.0007) | 0.0076 (0.0017) |
| | | Correlated Covariates | 0.0278 (0.1007) | 0.0045 (0.0015) | 0.0290 (0.0257) |
Table 5.
Medians (IQRs) of squared errors for robustness simulation results over 100 replications
| Scenario | Causal Effect | Non-Gaussian | Heteroskedasticity | Correlated Covariates |
|---|---|---|---|---|
| Round | IE | 0.0221 (0.0197) | 0.0221 (0.0204) | 0.0349 (0.0450) |
| | DE | 0.0005 (0.0011) | 0.0005 (0.0005) | 0.0007 (0.0003) |
| | TE | 0.0177 (0.0182) | 0.0198 (0.0178) | 0.0327 (0.0453) |
| Butterfly | IE | 0.0563 (0.0620) | 0.0578 (0.0463) | 0.9169 (1.6269) |
| | DE | 0.0089 (0.0107) | 0.0084 (0.0073) | 0.0089 (0.0221) |
| | TE | 0.0226 (0.0374) | 0.0283 (0.0374) | 0.5831 (0.7137) |
Posterior distributions of parameters in equation (1) are specified as follows. For with
we have
For with
we have
For with
we have
For with
we have
Posterior distributions of parameters in equation (2) are specified as follows. For with
we have
For with
we have
For with
we have
For with
we have
For with
we have
For with
we have
For with
we have
Finally, the sampling schemes are shown as below.
Sample from for .
Calculate and sample from for .
For , if , ; otherwise, .
Sample from for .
Sample from .
Sample from .
Sample from for .
Calculate and sample from for .
For , if , ; otherwise, .
Sample from .
Sample from .
Sample from .
Sample from .
Sample from for .
Sample from .
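The scheme above repeatedly computes an inclusion probability, draws a binary indicator, and sets the coefficient to zero where the indicator is zero. Because the conditional distributions are not reproduced here, the Python sketch below only illustrates that general pattern. It assumes an Ising prior of the form p(γ) ∝ exp(a Σ_v γ_v + b Σ_{v∼v'} γ_v γ_v') on a 2D lattice with four nearest neighbors, and a precomputed log-likelihood ratio per pixel. The parameters `a` and `b`, the array `log_lik_ratio`, and both function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def active_neighbors(gamma, i, j):
    """Number of the four nearest neighbors of pixel (i, j) with gamma = 1."""
    rows, cols = gamma.shape
    total = 0
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = i + di, j + dj
        if 0 <= r < rows and 0 <= c < cols:
            total += gamma[r, c]
    return int(total)


def gibbs_sweep(gamma, values, log_lik_ratio, a=-2.0, b=0.5, rng=None):
    """One sweep of single-site indicator updates; returns (gamma, coef)."""
    rng = np.random.default_rng() if rng is None else rng
    gamma = gamma.copy()
    for i in range(gamma.shape[0]):
        for j in range(gamma.shape[1]):
            # Conditional log-odds of gamma[i, j] = 1 given all other indicators.
            log_odds = a + b * active_neighbors(gamma, i, j) + log_lik_ratio[i, j]
            prob = 1.0 / (1.0 + np.exp(-log_odds))
            gamma[i, j] = rng.random() < prob
    coef = np.where(gamma == 1, values, 0.0)  # coefficient is zero where gamma = 0
    return gamma, coef


# Toy 4 x 4 lattice: strong evidence for signal in the top-left 2 x 2 block only.
gamma0 = np.zeros((4, 4), dtype=int)
values = np.full((4, 4), 2.0)
llr = np.full((4, 4), -50.0)
llr[:2, :2] = 50.0
gamma1, coef = gibbs_sweep(gamma0, values, llr, rng=np.random.default_rng(1))
print(gamma1)
```

A larger interaction parameter `b` encourages neighboring pixels to be selected together, which is the mechanism by which a spatial prior of this kind favors contiguous regions.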
Contributor Information
Yimei Li, Department of Biostatistics, St. Jude Children’s Research Hospital, USA.
Jade Xiaoqing Wang, Department of Biostatistics, University of Michigan, USA.
Grace Chen Zhou, Department of Biostatistics, St. Jude Children’s Research Hospital, USA.
Heather M. Conklin, Department of Psychology & Biobehavioral Sciences, St. Jude Children’s Research Hospital, USA
Arzu Onar-Thomas, Department of Biostatistics, St. Jude Children’s Research Hospital, USA.
Amar Gajjar, Department of Biostatistics, St. Jude Children’s Research Hospital, USA.
Wilburn E. Reddick, Department of Diagnostic Imaging, St. Jude Children’s Research Hospital, USA
Cai Li, Department of Biostatistics, St. Jude Children’s Research Hospital, USA.
REFERENCES
- [1] Acharya S, Wu S, Ashford JM, Tinkle CL, Lucas JT, Qaddoumi I, Gajjar A, Krasin MJ, Conklin HM and Merchant TE (2019). Association between hippocampal dose and memory in survivors of childhood or adolescent low-grade glioma: a 10-year neurocognitive longitudinal study. Neuro-Oncology 21 1175–1183.
- [2] Acharya S, Guo Y, Patni T, Li Y, Wang C, Gargone M, Ashford JM, Wilson L, Faught A, Reddick WE, Patay Z, Gajjar A, Conklin HM and Merchant TE (2022). Association between brain substructure dose and cognitive outcomes in children with medulloblastoma treated on SJMB03: a step toward substructure-informed planning. Journal of Clinical Oncology 40 83–95.
- [3] Ashburner J and Friston KJ (2000). Voxel-based morphometry—the methods. NeuroImage 11 805–821.
- [4] Bi X, Yang L, Li T, Wang B, Zhu H and Zhang H (2017). Genome-wide mediation analysis of psychiatric and cognitive traits through imaging phenotypes. Human Brain Mapping 38 4088–4097.
- [5] Bind M-A, Vanderweele T, Coull B and Schwartz J (2016). Causal mediation analysis for longitudinal data with exogenous exposure. Biostatistics 17 122–134.
- [6] Chén OY, Crainiceanu C, Ogburn EL, Caffo BS, Wager TD and Lindquist MA (2018). High-dimensional multivariate mediation with application to neuroimaging data. Biostatistics 19 121–136.
- [7] Chung S, Fieremans E, Kucukboyaci NE, Wang X, Morton CJ, Novikov DS, Rath JF and Lui YW (2018). Working memory and brain tissue microstructure: white matter tract integrity based on multi-shell diffusion MRI. Scientific Reports 8 1–7.
- [8] Coffman DL (2011). Estimating causal effects in mediation analysis using propensity scores. Structural Equation Modeling: A Multidisciplinary Journal 18 357–369.
- [9] Derkach A, Pfeiffer RM, Chen T-H and Sampson JN (2019). High dimensional mediation analysis with latent variables. Biometrics 75 745–756.
- [10] Feng L, Bi X and Zhang H (2021). Brain regions identified as being associated with verbal reasoning through the use of imaging regression via internal variation. Journal of the American Statistical Association 116 144–158.
- [11] Friston KJ, Frith C, Liddle P and Frackowiak R (1991). Comparing functional (PET) images: the assessment of significant change. Journal of Cerebral Blood Flow & Metabolism 11 690–699.
- [12] Gajjar AJ and Robinson GW (2014). Medulloblastoma—translating discoveries from the bench to the bedside. Nature Reviews Clinical Oncology 11 714–722.
- [13] Gajjar A, Robinson GW, Smith KS, Lin T, Merchant TE, Chintagumpala M, Mahajan A, Su J, Bouffet E, Bartels U, Schechter T, Hassall T, Robertson T, Nicholls W, Gururangan S, Schroeder K, Sullivan M, Wheeler G, Hansford JR, Kellie SJ, McCowage G, Cohn R, Fisher MJ, Krasin MJ, Stewart CF, Broniscer A, Buchhalter I, Tatevossian RG, Orr BA, Neale G, Klimo P, Boop F, Srinivasan A, Pfister SM, Gilbertson RJ, Onar-Thomas A, Ellison DW and Northcott PA (2021). Outcomes by clinical and molecular features in children with medulloblastoma treated with risk-adapted therapy: results of an international phase III trial (SJMB03). Journal of Clinical Oncology 39 822–835.
- [14] Gelman A, Carlin JB, Stern HS and Rubin DB (1995). Bayesian Data Analysis. Chapman and Hall/CRC.
- [15] Glass JO, Ogg RJ, Hyun JW, Harreld JH, Schreiber JE, Palmer SL, Li Y, Gajjar AJ and Reddick WE (2017). Disrupted development and integrity of frontal white matter in patients treated for pediatric medulloblastoma. Neuro-Oncology 19 1408–1418.
- [16] Goldsmith J, Huang L and Crainiceanu CM (2014). Smooth scalar-on-image regression via spatial Bayesian variable selection. Journal of Computational and Graphical Statistics 23 46–64.
- [17] Huang L, Goldsmith J, Reiss PT, Reich DS and Crainiceanu CM (2013). Bayesian scalar-on-image regression with application to association between intracranial DTI and cognitive outcomes. NeuroImage 83 210–223.
- [18] Imbens GW and Rubin DB (2015). Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press.
- [19] Jacqmin-Gadda H, Sibillot S, Proust C, Molina J-M and Thiébaut R (2007). Robustness of the linear mixed model to misspecified error distribution. Computational Statistics & Data Analysis 51 5142–5154.
- [20] Jenkinson M, Beckmann CF, Behrens TEJ, Woolrich MW and Smith SM (2012). FSL. NeuroImage 62 782–790.
- [21] Jo B, Stuart EA, MacKinnon DP and Vinokur AD (2011). The use of propensity scores in mediation analysis. Multivariate Behavioral Research 46 425–452.
- [22] Khong P-L, Kwong DL, Chan GC, Sham JS, Chan F-L and Ooi G-C (2003). Diffusion-tensor imaging for the detection and quantification of treatment-induced white matter injury in children with medulloblastoma: a pilot study. American Journal of Neuroradiology 24 734–740.
- [23] Knight SJ, Conklin HM, Palmer SL, Schreiber JE, Armstrong CL, Wallace D, Bonner M, Swain MA, Evankovich KD, Mabbott DJ, Boyle R, Huang Q, Zhang H, Anderson VA and Gajjar A (2014). Working memory abilities among children treated for medulloblastoma: parent report and child performance. Journal of Pediatric Psychology 39 501–511.
- [24] Kong D, An B, Zhang J and Zhu H (2020). L2RM: low-rank linear regression models for high-dimensional matrix responses. Journal of the American Statistical Association 115 403–424.
- [25] Koshiyama D, Fukunaga M, Okada N, Morita K, Nemoto K, Yamashita F, Yamamori H, Yasuda Y, Matsumoto J and Fujimoto M (2020). Association between the superior longitudinal fasciculus and perceptual organization and working memory: A diffusion tensor imaging study. Neuroscience Letters 738 135349.
- [26] Law N, Bouffet E, Laughlin S, Laperriere N, Brière M-E, Strother D, McConnell D, Hukin J, Fryer C, Rockel C, Dickson J and Mabbott D (2011). Cerebello–thalamo–cerebral connections in pediatric brain tumor patients: Impact on working memory. NeuroImage 56 2238–2248.
- [27] Li C, Xiao L and Luo S (2020). Fast covariance estimation for multivariate sparse functional data. Stat 9 e245.
- [28] Li C, Xiao L and Luo S (2022). Joint model for survival and multivariate sparse functional data with application to a study of Alzheimer’s Disease. Biometrics 78 435–447.
- [29] Li F and Zhang NR (2010). Bayesian variable selection in structured high-dimensional covariate spaces with applications in genomics. Journal of the American Statistical Association 105 1202–1214.
- [30] Li C and Zhang H (2021). Tensor quantile regression with application to association between neuroimages and human intelligence. The Annals of Applied Statistics 15 1455–1477.
- [31] Loe IM, Adams JN and Feldman HM (2019). Executive function in relation to white matter in preterm and full term children. Frontiers in Pediatrics 6 418.
- [32] MacKinnon DP (2008). Introduction to Statistical Mediation Analysis. Multivariate Applications Series. Taylor & Francis Group/Lawrence Erlbaum Associates, New York, NY.
- [33] Mitchell TJ and Beauchamp JJ (1988). Bayesian variable selection in linear regression. Journal of the American Statistical Association 83 1023–1032.
- [34] Mori S, Oishi K, Jiang H, Jiang L, Li X, Akhter K, Hua K, Faria AV, Mahmood A, Woods R, Toga AW, Pike GB, Neto PR, Evans A, Zhang J, Huang H, Miller MI, van Zijl P and Mazziotta J (2008). Stereotaxic white matter atlas based on diffusion tensor imaging in an ICBM template. NeuroImage 40 570–582.
- [35] Moxon-Emre I, Bouffet E, Taylor MD, Laperriere N, Sharpe MB, Laughlin S, Bartels U, Scantlebury N, Law N, Malkin D, Skocic J, Richard L and Mabbott DJ (2016). Vulnerability of white matter to insult during childhood: evidence from patients treated for medulloblastoma. Journal of Neurosurgery: Pediatrics 18 29–40.
- [36] Nagesh V, Tsien CI, Chenevert TL, Ross BD, Lawrence TS, Junick L and Cao Y (2008). Radiation-induced changes in normal-appearing white matter in patients with cerebral tumors: a diffusion tensor imaging study. International Journal of Radiation Oncology, Biology, Physics 70 1002–1010.
- [37] Palmer SL, Glass JO, Li Y, Ogg R, Qaddoumi I, Armstrong GT, Wright K, Wetmore C, Broniscer A, Gajjar A et al. (2012). White matter integrity is associated with cognitive processing in patients treated for a posterior fossa brain tumor. Neuro-Oncology 14 1185–1193.
- [38] Palmer SL, Armstrong C, Onar-Thomas A, Wu S, Wallace D, Bonner MJ, Schreiber J, Swain M, Chapieski L, Mabbott D, Knight S, Boyle R and Gajjar A (2013). Processing speed, attention, and working memory after treatment for medulloblastoma: an international, prospective, and longitudinal study. Journal of Clinical Oncology 31 3494–3500.
- [39] Pearl J (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge, UK.
- [40] Rubin DB (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66 688.
- [41] Rubin DB (1978). Bayesian inference for causal effects: The role of randomization. The Annals of Statistics 6 34–58.
- [42] Rubin DB (1980). Randomization analysis of experimental data: The Fisher randomization test comment. Journal of the American Statistical Association 75 591–593.
- [43].Schielzeth H, Dingemanse NJ, Nakagawa S, Westneat DF, Allegue H, Teplitsky C, Réale D, Dochtermann NA, Garamszegi LZ and Araya-Ajoy YG (2020). Robustness of linear mixed-effects models to violations of distributional assumptions. Methods in Ecology and Evolution 11 1141–1152. [Google Scholar]
- [44].Schrank FA (2011). Woodcock-Johnson III Tests of Cognitive Abilities. [Google Scholar]
- [45].Schreiber JE, Palmer SL, Conklin HM, Mabbott DJ, Swain MA, Bonner MJ, Chapieski ML, Huang L, Zhang H and Gajjar A (2017). Posterior fossa syndrome and long-term neuropsychological outcomes among children treated for medulloblastoma on a multi-institutional, prospective study. Neuro-oncology 19 1673–1682. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [46]. Smith M and Fahrmeir L (2007). Spatial Bayesian variable selection with application to functional magnetic resonance imaging. Journal of the American Statistical Association 102 417–431.
- [47]. Smith SM and Nichols TE (2009). Threshold-free cluster enhancement: addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage 44 83–98.
- [48]. Smith M, Pütz B, Auer D and Fahrmeir L (2003). Assessing brain activity through spatial Bayesian variable selection. NeuroImage 20 802–815.
- [49]. Smith SM, Jenkinson M, Johansen-Berg H, Rueckert D, Nichols TE, Mackay CE, Watkins KE, Ciccarelli O, Cader MZ, Matthews PM and Behrens TEJ (2006). Tract-based spatial statistics: Voxelwise analysis of multi-subject diffusion data. NeuroImage 31 1487–1505.
- [50]. Sohn MB, Li H et al. (2019). Compositional mediation analysis for microbiome studies. The Annals of Applied Statistics 13 661–681.
- [51]. Song Y, Zhou X, Zhang M, Zhao W, Liu Y, Kardia SL, Roux AVD, Needham BL, Smith JA and Mukherjee B (2020). Bayesian shrinkage estimation of high dimensional causal mediation effects in omics studies. Biometrics 76 700–710.
- [52]. Takahashi M, Iwamoto K, Fukatsu H, Naganawa S, Iidaka T and Ozaki N (2010). White matter microstructure of the cingulum and cerebellar peduncle is related to sustained attention and working memory: a diffusion tensor imaging study. Neuroscience Letters 477 72–76.
- [53]. Tibshirani R (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology 58 267–288.
- [54]. Tibshirani R, Saunders M, Rosset S, Zhu J and Knight K (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology 67 91–108.
- [55]. VanderWeele T (2015). Explanation in Causal Inference: Methods for Mediation and Interaction. Oxford University Press.
- [56]. Wang JX, Li Y, Reddick WE, Conklin HM, Glass JO, Onar-Thomas A, Gajjar A, Cheng C and Lu Z-H (2023). A high-dimensional mediation model for a neuroimaging mediator: Integrating clinical, neuroimaging, and neurocognitive data to mitigate late effects in pediatric cancer. Biometrics 79 2430–2443.
- [57]. Welniarz Q, Dusart I and Roze E (2017). The corticospinal tract: Evolution, development, and human disorders. Developmental Neurobiology 77 810–829.
- [58]. Woodcock RW, Mather N, McGrew KS and Wendling BJ (2001a). Woodcock-Johnson III Tests of Cognitive Abilities. Riverside Publishing Company, Itasca, IL.
- [59]. Woodcock RW, McGrew KS, Mather N et al. (2001b). Woodcock-Johnson III Tests of Achievement. Riverside Publishing Company, Itasca, IL.
- [60]. Yu D, Wang L, Kong D and Zhu H (2022). Mapping the genetic-imaging-clinical pathway with applications to Alzheimer’s disease. Journal of the American Statistical Association 117 1656–1668.
- [61]. Zhang H (1993). Image restoration: flexible neighborhood systems and iterated conditional expectations. Statistica Sinica 3 117–139.
- [62]. Zhang D, Chen M-H, Ibrahim JG, Boye ME and Shen W (2017). Bayesian model assessment in joint modeling of longitudinal and survival data with applications to cancer clinical trials. Journal of Computational and Graphical Statistics 26 121–133.
- [63]. Zhou H, Li L and Zhu H (2013). Tensor regression with applications in neuroimaging data analysis. Journal of the American Statistical Association 108 540–552.
- [64]. Zhu J, Ling J and Ding N (2019). Association between diffusion tensor imaging findings and cognitive outcomes following mild traumatic brain injury: a PRISMA-compliant meta-analysis. ACS Chemical Neuroscience 10 4864–4869.
- [65]. Zou P, Conklin HM, Scoggins MA, Li Y, Li X, Jones MM, Palmer SL, Gajjar A and Ogg RJ (2016). Functional MRI in medulloblastoma survivors supports prophylactic reading intervention during tumor treatment. Brain Imaging and Behavior 10 258–271.
