Human Brain Mapping. 2019 May 15;40(11):3299–3320. doi: 10.1002/hbm.24599

Evaluation of the 3D fractal dimension as a marker of structural brain complexity in multiple‐acquisition MRI

Stephan Krohn 1,2,3, Martijn Froeling 4, Alexander Leemans 5, Dirk Ostwald 3,6, Pablo Villoslada 7, Carsten Finke 1,2, Francisco J Esteban 8
PMCID: PMC6865657  PMID: 31090254

Abstract

Fractal analysis represents a promising new approach to structural neuroimaging data, yet systematic evaluation of the fractal dimension (FD) as a marker of structural brain complexity is scarce. Here we present in‐depth methodological assessment of FD estimation in structural brain MRI. On the computational side, we show that spatial scale optimization can significantly improve FD estimation accuracy, as suggested by simulation studies with known FD values. For empirical evaluation, we analyzed two recent open‐access neuroimaging data sets (MASSIVE and Midnight Scan Club), stratified by fundamental image characteristics including registration, sequence weighting, spatial resolution, segmentation procedures, tissue type, and image complexity. Deviation analyses showed high repeated‐acquisition stability of the FD estimates across both data sets, with differential deviation susceptibility according to image characteristics. While less frequently studied in the literature, FD estimation in T2‐weighted images yielded robust outcomes. Importantly, we observed a significant impact of image registration on absolute FD estimates. Applying different registration schemes, we found that unbalanced registration induced (a) repeated‐measurement deviation clusters around the registration target, (b) strong bidirectional correlations among image analysis groups, and (c) spurious associations between the FD and an index of structural similarity, and these effects were strongly attenuated by reregistration in both data sets. Indeed, differences in FD between scans did not simply track differences in structure per se, suggesting that structural complexity and structural similarity represent distinct aspects of structural brain MRI. In conclusion, scale optimization can improve FD estimation accuracy, and empirical FD estimates are reliable yet sensitive to image characteristics.

Keywords: fractal analysis, MRI biomarker, structural brain complexity, structural similarity, imaging validation

1. INTRODUCTION

Fractal analysis has attracted increasing interest from the neuroscience community as a versatile new tool for the analysis of structural brain data on a cellular as well as a macroscopic scale and in both health and disease (Di Ieva 2016; Di Ieva, Esteban, Grizzi, Klonowski, & Martín‐Landrove, 2015; Di Ieva, Grizzi, Jelinek, Pellionisz, & Losa, 2014). Fractal geometry, prominently developed by Mandelbrot (1983), features the fundamental insight that real‐world objects do not adhere to the smooth whole‐integer dimensions of Euclidean geometry and are instead more adequately described by the fractal dimension (FD), which is not limited to integers and can be regarded as a measure of morphometric complexity (Di Ieva 2016; Mandelbrot, 1967). While natural objects are constrained to finite physical scales and their self‐similarity is rather statistical than compositional, the analysis of an object's fractal properties has proven insightful in a variety of fields, from the inanimate (e.g., coastlines, clouds, lightning) and the cellular (e.g., protein surfaces, viral receptor molecules, cellular shapes) up to the realm of higher‐order organisms (e.g., human bronchial and vascular ramifications; Di Ieva et al., 2015, Di Ieva 2016; Di Ieva, Grizzi, et al., 2014; Mandelbrot, 1967, 1983). In biomedical neuroimaging, fractal analysis can be applied to estimate structural brain complexity (cf. Di Ieva 2016), i.e. the topological complexity of brain tissue segmentations as obtained from structural neuroimaging data, most commonly anatomical MRI. 
As such, fractal analysis has been employed in the anatomical description of cortical geometry (Im et al., 2006; Kiselev, Hahn, & Auer, 2003), and the FD has shown promise as a biomarker in the detection of early tissue alterations in multiple sclerosis (Esteban et al., 2007, 2009), brain abnormalities in infants with intrauterine growth restriction (Esteban et al., 2010), atherosclerotic white matter lesions (Takahashi et al., 2006), morphological changes in multiple system atrophy of the cerebellar type (Wu et al., 2010), angioarchitecture of cerebral arteriovenous malformations (Di Ieva et al., 2014), the cortical features in Alzheimer's disease (King et al., 2009, 2010; Ruiz de Miras et al., 2017), cerebral tumors (Iftekharuddin, Zheng, Islam, & Ogg, 2009), age‐related brain atrophy (Madan & Kensinger, 2016) as well as traumatic (Rajagopalan et al., 2018) and age‐induced (Reishofer et al., 2018) white matter changes.

However, while fractal analysis is now being applied in both fundamental research and clinical investigations, there is a relative scarcity of literature on the methodological evaluation of the FD in structural brain MRI. On the computational side, one aspect that warrants further study regards the optimal range of spatial scales for empirical estimation, i.e. the regression intervals applied to the log‐transformed data, specifically with respect to the commonly applied 3D box‐counting procedure. We here employ a simple spatial optimization procedure that automatically selects the optimal scale range for each individual estimation, and we present a series of simulation studies with known expected FDs to examine performance against non‐optimized estimation.

Empirically, further examination is warranted with regard to the impact of fundamental image characteristics on the FD estimates, for instance regarding segmentation procedures, tissue type, image complexity, image registration, and spatial resolution. Moreover, it is important to assess the stability of the FD over multiple repeated acquisitions, since a reasonable test–retest reliability is an essential prerequisite for a biomarker's diagnostic capacity. Furthermore, T1‐weighted images (T1WI) have been the mainstay of neuroimaging studies implementing fractal analysis such that systematic evaluation regarding the utility of T2‐weighted images (T2WI) in fractal analysis is comparatively scarce, even though the latter are essential to both fundamental neuroimaging research and clinical neuroradiological assessment.

To address these empirical questions, we analyzed structural MRI data from two independent openly available neuroimaging datasets. On the one hand, this includes the recently published Multiple Acquisitions for Standardization of Structural Imaging Validation and Evaluation database (MASSIVE, cf. Froeling, Tax, Vos, Luijten, & Leemans, 2017), featuring 10 repeated T1WI and T2WI acquisitions over a short amount of time. We hypothesized that in such an acquisition procedure, it is reasonable to assume that there was essentially no change in the underlying structural brain complexity and that, therefore, the estimated FD values should show high stability across these short‐interval measurements, allowing for detailed image parameter‐dependent analyses. While this data set is thus well‐suited to examine the above questions, it also emanates from a single subject, potentially restricting the generality of our findings. Therefore, we extended our analyses to the recently presented Midnight Scan Club (MSC) data set (Gordon et al., 2017), featuring repeated short‐interval acquisitions of T1WI and T2WI in 10 subjects. Our approach to the points raised above then rests on an image processing procedure differentiating between sequence weighting, spatial resolution, segmentation method, tissue type, and image complexity reduction by skeletonization (see Section 2.1). As detailed below, this leads to a stratification of 32 distinct image analysis groups. We then apply fractality estimation with spatial scale optimization on the 3‐dimensional input volumes obtained from image processing and implement a systematic analysis of the resulting FD estimates. The latter features a combination of random and systematic resampling methods, deviation detection, assessment of the sample distributions, similarity comparison, unsupervised machine learning techniques, correlation analyses, and image parameter‐dependent group comparisons. 
Based on these analyses, we assess (a) parameter‐dependent repeated‐sampling deviations, both within analysis groups and across the two data sets, (b) the impact of image registration on the FD estimates, (c) the within‐ and across‐subject FD sample distributions, (d) the estimated optimal spatial scales across data sets, subjects, and processing parameters, (e) the relationship between the FD and structural similarity, and (f) the impact of image weighting, spatial resolution, and processing parameters on the FD estimates.

2. METHODS

2.1. Image acquisition and processing

Structural MRI in the MASSIVE data set were acquired on a clinical 3T system (Philips Achieva). The data emanate from a healthy 25‐year‐old female subject scanned in five sessions over an interval of 2 weeks. Ten T1WI and ten T2WI were collected, each reconstructed with 1 mm³ isotropic resolution, and data for both weightings were resampled to 2.5 mm³ isotropic resolution, resulting in the four image categories T1 high resolution, T1 low resolution, T2 high resolution, and T2 low resolution for further processing. Data were registered to a common space using a rigid registration algorithm (http://elastix.isi.uu.nl, see Klein, Staring, Murphy, Viergever, & Pluim, 2010; Shamonin et al., 2014) with the first T1 volume as the registration target. For additional details on the acquisition procedure, please refer to Froeling et al., 2017. The MASSIVE data set is openly available from http://www.massive-data.org.

Structural MRI in the MSC data set were obtained on a 3T scanner (Siemens TRIO) across two separate days, with each session starting at midnight. Four T1 and four T2 scans with 0.8 mm³ isotropic resolution were acquired in each of the 10 healthy subjects (5 females, 5 males; age range: 24–34 years). Additionally, subject #8 had one extra T1 scan, and subject #6 had five T1 scans and six T2 scans in total, which we included in our analyses wherever feasible. Similar to the above, data were resampled to a 2.5 mm³ isotropic resolution, and subject‐wise rigid‐body registration to the respective subject's first T1 volume was carried out. For further details on the data set, see Gordon et al., 2017. The MSC data set is openly available from https://openneuro.org.

A standard FSL‐based pipeline (Jenkinson, Beckmann, Behrens, Woolrich, & Smith, 2012; Smith et al., 2004; Woolrich et al., 2009) was used to preprocess the MR images for subsequent fractal analysis. Specifically, the brain extraction routine (BET) was applied to all individual 3D volumes with default fractional intensity threshold (Smith, 2002). The brain‐extracted images entered the FAST routine for tissue segmentation into gray matter (GM), white matter (WM), and cerebrospinal fluid classes with default analysis parameters (Zhang, Brady, & Smith, 2001). Intensity inhomogeneity was accommodated by iterative bias‐field correction. We estimated partial volume maps for each of the three tissue classes, of which the GM and WM estimates entered fractal analysis. For qualitative comparison, we also included a forced‐decision binary classification (“hard” segmentation), in which voxels are labeled as 0 or 1 for a specific tissue class. Based on these segmentations, 3D image skeletons were estimated for each input volume. Image skeletons are the result of an iterative reduction process that computes a minimum complexity version of the input image. We here apply a publicly available 3D parallel thinning algorithm to build the skeleton models of the respective input volume (Kerschnitzki et al., 2013; Lee, Kashyap, & Chu, 1994). Intuitively, image skeletons aim at capturing the “essence” of an image and are thought to be more sensitive to pathological changes in some cases (Esteban et al., 2007, 2009, 2010; Jiménez et al., 2014; Sheelakumari et al., 2017), which is why we include them in the present study. We thus obtain an additional complexity‐reduced skeleton model for every input volume. In summary, the combination of image parameters amounts to a total of 32 analysis groups, on which we base the taxonomy applied throughout the manuscript: image weighting (T1 vs. T2), spatial resolution (low vs. high), segmentation procedure (partial volume estimates (pve) vs. 
binary segmentation (bin)), tissue type (gray matter (GM) vs. white matter (WM)), and image complexity reduction (skeletonized vs. unskeletonized images, where the former is abbreviated by “Skel”). Figure 1 summarizes the analysis stratification (Panel a) and provides an example of the processing results (Panel b) as well as a 3D rendering of the corresponding skeleton models (Panel c).
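
As a quick check on the stratification described above, the 32 analysis groups can be enumerated as the Cartesian product of the five binary image parameters. The sketch below is illustrative only: the constant names and the label format are assumptions, modeled on group names that appear in this manuscript such as "T1 high GM_pve" and "T2 low WM_Skel_bin".

```python
from itertools import product

# Five binary image parameters (names are illustrative, not from the paper's code)
WEIGHTINGS = ("T1", "T2")
RESOLUTIONS = ("low", "high")
TISSUES = ("GM", "WM")
SKELETONIZED = (False, True)
SEGMENTATIONS = ("pve", "bin")


def analysis_groups():
    """Return one label per combination of the five image parameters."""
    labels = []
    for weighting, resolution, tissue, skel, seg in product(
        WEIGHTINGS, RESOLUTIONS, TISSUES, SKELETONIZED, SEGMENTATIONS
    ):
        skel_tag = "_Skel" if skel else ""
        labels.append(f"{weighting} {resolution} {tissue}{skel_tag}_{seg}")
    return labels


groups = analysis_groups()
print(len(groups))  # 2 * 2 * 2 * 2 * 2 = 32
```

With 10 scans in MASSIVE and 42 scans in MSC, this stratification gives 10 × 32 = 320 and 42 × 32 = 1,344 FD estimates, respectively.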

Figure 1.

Analysis stratification and image processing. Panel (a) represents a schematic of the applied analysis stratification. Panel (b) visualizes this procedure for the first volume of the T1 high resolution images in the MASSIVE data set. Note the absence of gray voxels in the binary forced‐decision segmentations (bin) as compared to the partial volume estimates (pve). For each processed volume, image skeleton models were estimated, a 3D rendering of which is visualized for the WM_pve and GM_pve segmentations in Panel (c). The 3D volumes then entered the fractal dimension estimation. bin, binary segmentations; GM, gray matter; pve, partial volume estimates; Skel, skeleton model; WM, white matter [Color figure can be viewed at http://wileyonlinelibrary.com]

2.2. Fractal estimation and spatial optimization

The volumes obtained from preprocessing provided the input for the estimation of the 3D FD. In the empirical sciences, the FD of an object $A$ is commonly estimated by the box‐counting dimension $D_b$ given by

$$D_b(A) = \lim_{x \to 0} \frac{\log N(x)}{\log (1/x)} \tag{1}$$

where $x$ is the box edge length and $N(x)$ the minimum number of boxes needed to cover the object under scrutiny (cf. Di Ieva, Grizzi, et al., 2014). Box‐counting was applied here based on a function from the openly available calcFD toolbox (Madan & Kensinger, 2016). Due to the finite physical scales of natural objects, $D_b(A)$ is in practice calculated as the slope of the linear regression line over an interval of $x$ in the log–log plot (see Gneiting, Ševčíková, & Percival, 2012, for detailed treatment of the ordinary least squares regression fit in box‐counting). In terms of structural MR images, these intervals correspond to the range of voxel unit edge sizes over which the box‐counting dimension is computed. In this context, consider a finite sequence $X_k$ of spatial scales defined as

$$X_k := (x_k)_{k = 0, 1, \ldots, n} = (x_0, x_1, \ldots, x_n), \quad \text{with } n \in \mathbb{N}, \text{ and } x_k := b^k \tag{2}$$

where b defines a scale base and k specifies the exponents to be tested. For instance, we here define b = 2 and k = 0, 1, …, 8, yielding $X_k$ = (1, 2, 4, 8, 16, 32, 64, 128, 256). Nonetheless, the above raises the question over which particular range of k (i.e., which subsequence of $X_k$) one should compute the box‐counting regression in order to obtain the best FD estimate. One common solution is to simply define the k‐range for the estimation and keep it fixed over repeated estimations. This, however, entails the danger of introducing inaccuracies as it disregards potential differences between subjects, scanning sessions, or processed input volumes. Another option is to base the definition on prior validation studies suggesting an optimal range of k for a particular image analysis group (see e.g., Esteban et al., 2009; Jiménez et al., 2014). Albeit an improvement, optimal spatial scales may depend on the scanning equipment, image processing, or estimation algorithm applied, and there is no principled reason to believe that the best regression intervals generalize uniformly from one population to another. As such, a more flexible and data‐driven decision criterion may be desirable. We here apply a simple procedure to help alleviate this issue. Let $|X_k|$ denote the number of elements in the sequence of spatial scales resulting from Equation (2), and let ω ≤ $|X_k|$ indicate the upper bound on regression interval length, with ω = $|X_k|$ representing the case in which we allow estimation over all spatial scales in $X_k$ (i.e., here, k = 0, …, 8). However, we may also estimate the FD over a subsequence of spatial scales (e.g., k = 2, …, 5). Let τ ≥ 2 denote the lower bound on the number of elements in this subsequence, that is, the minimum length of the regression interval over which fractality estimation is carried out. The number of candidate regression intervals of at least length τ and at most length ω is then given by the number of contiguous subsequences of $X_k$ within these bounds,

$$m = \frac{n(n+1)}{2}, \quad \text{where } n = \omega - \tau + 1, \text{ and } \omega > \tau. \tag{3}$$

For a specified lower and upper bound on the regression interval, we thus obtain m possible k‐ranges over which to carry out the estimation, yielding a set of m regression models. From this set, we may then choose the best‐fitting model as suggested by the highest adjusted coefficient of determination $R^2_{\mathrm{adj}}$, where standard adjustment (Fritz, Morris, & Richler, 2012) is applied due to the varying cardinality of the different tested k‐ranges. The slope estimator of the thus selected model is then chosen as the optimal FD estimate, in the sense of being the best guess in approximating the true but unknown underlying dimension value based on the box‐counting results.
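
The selection procedure described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the calcFD implementation: boxes are counted on a single grid anchored at the origin (rather than a true minimum cover), the adjusted coefficient of determination uses the usual one-predictor form, the sketch requires at least three scales per interval so that this adjustment is defined, and all function names are hypothetical.

```python
import math


def box_count(voxels, size):
    """Count origin-anchored boxes of edge `size` that contain at least one voxel."""
    return len({(i // size, j // size, k // size) for i, j, k in voxels})


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, adjusted R^2). Needs len(xs) >= 3."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    if ss_tot == 0.0:
        # Flat interval (e.g., every box count equals 1): never select it
        return 0.0, float("-inf")
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r2 = 1.0 - ss_res / ss_tot
    return slope, 1.0 - (1.0 - r2) * (n - 1) / (n - 2)


def candidate_ranges(n_scales, tau):
    """All contiguous index ranges (start, stop) with tau <= length <= n_scales."""
    return [
        (start, start + length)
        for length in range(tau, n_scales + 1)
        for start in range(n_scales - length + 1)
    ]


def optimized_fd(voxels, base=2, k_max=8, tau=4):
    """Return (FD estimate, adjusted R^2, (k_first, k_last)) of the best-fitting range."""
    sizes = [base ** k for k in range(k_max + 1)]
    log_inv_size = [math.log(1.0 / s) for s in sizes]
    log_count = [math.log(box_count(voxels, s)) for s in sizes]
    best = None
    for start, stop in candidate_ranges(len(sizes), tau):
        slope, r2_adj = fit_line(log_inv_size[start:stop], log_count[start:stop])
        if best is None or r2_adj > best[1]:
            best = (slope, r2_adj, (start, stop - 1))
    return best


# Nine scales with a minimum interval length of four give 21 candidate k-ranges
n_models = len(candidate_ranges(9, 4))

# Toy check on a solid 16 x 16 x 16 cube, whose box-counting dimension is 3
cube = [(i, j, k) for i in range(16) for j in range(16) for k in range(16)]
fd, r2_adj, k_range = optimized_fd(cube, k_max=4, tau=3)
```

The toy cube only uses scales up to its own edge length; on larger scales every count equals one, which is why flat intervals are excluded from selection in this sketch.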

In this context, empirical estimation faces the challenge that the true underlying fractal properties of the natural object are unknown, making it intrinsically difficult to judge estimation accuracy. In order to examine the performance of the outlined procedure, we therefore ran a series of simulation studies, in which we applied the estimation process to objects whose FD is known and which can thus serve as a benchmark. Specifically, we created a series of 3D random Cantor sets whose expected FD is specified by the probability of retaining a particular subset during iterative removal (Falconer & Grimmett, 1992; Moisy, 2008). For each random Cantor set, we then estimated its FD over both the respective optimal spatial scales and over a randomly chosen non‐optimal interval. This randomization approach avoids prior assumptions about the estimation quality across non‐optimal scales and obviates arbitrary choices about which specific non‐optimal scale to use as a baseline for the comparison to optimal k‐ranges. Furthermore, the applied set of k‐values in Equation (2) (i.e., k = 0, …, 8) ensures comprehensive coverage of possible spatial scales in the simulated fractal set (as the size of the latter was $2^8$). We thus place no prior constraints on which k‐ranges are expected to yield better estimation accuracy than others. Following initial parameter search in fixed benchmarking objects, we here apply τ = 4 (i.e., computing the regression over at least four contiguous spatial scales; τ = 3 yielded similar outcome) and ω = $|X_k|$ = 9 (i.e., allowing a maximum interval over all examined spatial scales), leading to m = 21 different models based on Equation (3). Figure 2 relates the corresponding simulation results: Panel (a) displays the exemplary estimation of a non‐fractal object (cube with expected FD = 3) and a fractal object (3D random Cantor set with expected FD ≈ 2.7655). 
Compared to random non‐optimal spatial scales, the outlined procedure improved estimation accuracy by several orders of magnitude (e.g., the arbitrary k‐range estimates the Cantor set correctly to the first decimal, while the optimal k‐range first deviates from the expected FD only in the fourth decimal place), even though $R^2_{\mathrm{adj}}$ was very high in both cases. We then conducted a systematic simulation study, for which we created n = 100 distinct random Cantor sets for eight different retainment probabilities (from p = .6 to p = .95) yielding expected FD values in the range between 2 and 3 (with the aim of covering a biologically plausible range for FD estimates in brain MRI). Panel (b) displays the results of the subsequent fractality estimation over optimized and random non‐optimal scales. Here, the proposed spatial optimization procedure produced improved estimation results in virtually all simulation iterations and for all expected FD values. In contrast, choosing a random non‐optimal spatial scale led to both pronounced over‐ and underestimation of the expected FD, and comparing estimation variance with Levene's test suggested that optimized estimation precision was superior to non‐optimal spatial scales at p < .001 for all retainment probabilities (right subpanel). However, these simulation results also suggested that performance against optimization was not uniform across all non‐optimal spatial scales; under‐ and overestimation of the true FD varied from moderate to severe depending on how non‐optimal the particular control interval was. Furthermore, panel (c) visualizes a read‐out of which spatial scales were selected as optimal over the various simulation iterations. Optimization outcomes were selective (i.e., only a few of the 21 k‐ranges were ever selected as optimal), showing a preference for lower k‐values and shorter interval lengths with high consistency over the different retainment probabilities. 
Given these findings, we ran an additional simulation study to compare optimization outcomes against a set of fixed (i.e., non‐random) k‐ranges, summarized in Figure A1 in the online Appendix. While our optimization procedure resulted in improved estimation accuracy for all these comparisons, the magnitude of this improvement varied with the particular fixed k‐range and the expected FD. Improvement of estimation accuracy was less pronounced in those k‐ranges that were more often selected as optimal (e.g., k = 0, …, 4 as used by Madan & Kensinger, 2016) and more pronounced in those k‐ranges that were less often selected as optimal (e.g., k = 1, …, 4). Further details are reported in the Appendix. Based on these simulation studies, we applied the same estimation and optimization parameters to the FD estimation in the empirical data.
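
To make the benchmark objects more concrete, the following sketch generates a 3D random Cantor set and computes its expected FD. It assumes the standard construction in which each cube is split into 2 × 2 × 2 sub‐cubes and each sub‐cube is retained independently with probability p, so that the expected dimension is log(8p)/log 2 = 3 + log2(p). This assumption is consistent with the value quoted above, since p = .85 gives approximately 2.7655. The function names, the equal spacing of the eight probabilities, and the shallow recursion depth are illustrative choices, not the authors' code.

```python
import math
import random


def expected_fd(p, dim=3):
    """Expected dimension of a random Cantor set with retainment probability p."""
    return dim + math.log2(p)


def random_cantor(p, depth, rng):
    """Voxel coordinates of a random Cantor set on a grid with edge 2**depth."""
    cells = [(0, 0, 0)]
    for _ in range(depth):
        refined = []
        for i, j, k in cells:
            # Split the cell into eight sub-cells and keep each with probability p
            for di in (0, 1):
                for dj in (0, 1):
                    for dk in (0, 1):
                        if rng.random() < p:
                            refined.append((2 * i + di, 2 * j + dj, 2 * k + dk))
        cells = refined
    return cells


# Eight retainment probabilities from .6 to .95 (equal spacing is an assumption)
probabilities = [round(0.60 + 0.05 * step, 2) for step in range(8)]
expected = {p: expected_fd(p) for p in probabilities}

# Small example set; the simulations in the text used an edge length of 2**8
example_set = random_cantor(0.85, depth=5, rng=random.Random(0))
```

Under this construction, the expected FD values for the eight probabilities range from about 2.26 to about 2.93, in line with the stated range between 2 and 3.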

Figure 2.

Fractal dimension estimation with spatial scale optimization. Panel (a) contrasts estimation results over optimal and arbitrary k‐ranges for a non‐fractal (cube) and a fractal object (3D random Cantor set) whose expected fractal dimension values are known. In both examples, optimization increases estimation accuracy by several orders of magnitude. Panel (b) displays the results of a random Cantor set simulation over varying retainment probabilities, yielding different expected fractal dimensions. Green crosses correspond to the outlined optimization procedure, while red crosses indicate estimation results over randomly chosen non‐optimal spatial scales. The right subpanel relates the difference from the respective expected fractal dimension values over all estimations for the different retainment probabilities. Choosing a non‐optimal spatial scale led to both pronounced over‐ and underestimation of the expected FD, and optimized estimation precision was superior to non‐optimal spatial scales for all retainment probabilities. Panel (c) visualizes which spatial scales were selected as optimal over all simulation iterations. Optimization results showed a preference for lower k‐values and shorter interval lengths for all retainment probabilities [Color figure can be viewed at http://wileyonlinelibrary.com]

2.3. Data analysis

2.3.1. Deviation analysis

With the outlined processing stratification, we obtained a total of 320 FD estimates in the MASSIVE data set (10 subject scans × 32 analysis groups) and 1,344 FD estimates in the MSC data set (42 subject scans × 32 analysis groups). In order to qualitatively assess the data within each analysis group, we first applied a combination of random and systematic resampling procedures. Specifically, we performed a bootstrapping procedure in order to randomly sample the mean and the 99% normal approximation confidence interval (CI) of the FD over 2,000 resampling iterations. Bootstrapping provided an objective way of qualitative data assessment in terms of the tightness of the CI, which served as an indicator for the deviations within the analysis group, and the presence or absence of a skew in the clusters of the resampled means, indicative of important singular deviations in the raw estimates. Moreover, the bootstrapped CI was subsequently assessed as one of several criteria to identify meaningful deviations in the sampled FDs within each analysis group. We then applied a jackknife procedure, in which we systematically resampled the means by iteratively omitting each of the scans within the group in order to see if the variance changed significantly as assessed by Levene's tests. We then made the explicit assumption that the FDs obtained within each analysis group were sampled from a true but unknown normal distribution. We fitted a Gaussian distribution to the sampled FDs and assessed the coherence to a corresponding theoretical distribution by means of a quantile–quantile plot. In order to examine whether the sampled data was reasonably assumed to follow a normal distribution, we furthermore computed the Shapiro–Wilk test (Shapiro & Wilk, 1965), applicable to assess composite normality for smaller sample sizes.
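
The two resampling steps can be sketched as follows. The FD values are made up for illustration and are not data from either data set. Centering the normal approximation interval on the average of the bootstrapped means is an assumption about the exact computation, and the function names are hypothetical.

```python
import random
import statistics


def bootstrap_means(values, n_iter=2000, rng=None):
    """Means of `n_iter` resamples drawn with replacement from `values`."""
    rng = rng or random.Random(0)
    n = len(values)
    return [statistics.fmean(rng.choices(values, k=n)) for _ in range(n_iter)]


def normal_ci(boot_means, level=0.99):
    """Normal approximation confidence interval from the bootstrapped means."""
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)
    center = statistics.fmean(boot_means)
    spread = statistics.stdev(boot_means)  # bootstrap standard error of the mean
    return center - z * spread, center + z * spread


def jackknife_means(values):
    """Leave-one-out means: entry i is the mean with value i omitted."""
    total, n = sum(values), len(values)
    return [(total - v) / (n - 1) for v in values]


# Hypothetical FD estimates for one analysis group with 10 scans
fds = [2.512, 2.507, 2.515, 2.509, 2.511, 2.506, 2.514, 2.510, 2.508, 2.513]

boot = bootstrap_means(fds)
ci_low, ci_high = normal_ci(boot)
jack = jackknife_means(fds)
```

A tight interval and jackknife means that barely move when a scan is omitted correspond to the high repeated‐acquisition stability discussed in the text.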

As an example, Figure 3 visualizes these analysis steps for the exemplary group of binarized and skeletonized WM images in the T2 low resolution category (T2 low WM_Skel_bin) in the MASSIVE data set. The same analysis steps were applied to all 32 analysis groups in both data sets. In doing so, we sought to define a sensible criterion of when to “flag” an FD value due to a meaningful deviation within an analysis group. To this end, we compared various measures to find a balanced trade‐off between detection and discrimination ability. First, we assessed whether a single FD value was inside or outside the bootstrapped CI. As a second method, we assessed whether a particular value was within one or respectively two standard deviations (SDs) of the sample mean. Third, we assessed whether the variances of the jackknife means significantly differed from one another by evaluating Levene's test. Furthermore, we computed the Grubbs test to detect outliers within a given analysis group (Grubbs, 1969). The different methods were then assessed in terms of the original data and the effect that removing a flagged value had on the analysis in Figure 3. Specifically, we checked the flags against whether or not they occurred in groups in which the assumption of composite normality was first violated when considering all raw estimates, whether the removal of the flagged volume changed this, and if a deviation criterion would identify those analysis groups selectively. Based on the above points, the first method was deemed too conservative because the CI was tighter than even the one SD interval of the sample mean and because it was sensitive to arbitrary choices regarding the type of computation (normal approximation vs. percentile‐based, studentized or not, etc.). 
Systematic resampling nicely showed the qualitative effect that a single volume had on the overall mean and its variance but resulted in limited sensitivity in multivariate testing, despite increased accuracy in case of non‐normality. Jackknife resampling was thus considered too liberal for our purposes given the cases of deviation‐induced non‐adherence to composite normality. When the 1 SD interval around the sample mean was considered, volumes were more selectively flagged. However, this criterion does not account for the range of the data scatter, which was generally very small within analysis groups. See for example, Figure 3, where the data were sampled in the subdecimal scatter range of well under 0.03. As a result, scanning sessions were flagged with relatively low selectivity, which was alleviated by choosing a 2 SD interval around the sample mean. Even more selective, the Grubbs test procedure closely flagged nonadherence to composite normality, which was generally reversed after removal of the flag. Therefore this method was deemed the most appropriate criterion with the more conservative 2 SD method as a cross‐check. For an exemplary identification of a flag, see Figure 4, relating the results for T1 high GM_pve images in the MASSIVE data set. Here, Grubbs testing flags the FD that corresponds to the first scanning session (note that the more conservative 2 SD criterion equivalently identifies this flag). Systematic resampling shows that omitting the flagged value causes an upward shift of the mean and reduces its variance but this does not reach significance level in multivariate testing. The flagged FD causes the assumption of composite normality to be invalid although the remaining samples tightly follow the reference for normality. Omitting the flag restores normality and clearly “tightens” the distribution, while nonparametric distribution comparison was insignificant. 
Based on the results of the deviation analysis within each analysis group, we then examined the occurrence of flagged volumes by subjects, scanning session, image weighting, and processing parameters across the MASSIVE and the MSC data sets (see Section 3.1).
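
The two criteria retained above, the Grubbs test and the 2 SD cross‐check, can be illustrated with a short sketch. The FD values are invented so that the first scan deviates by roughly .05, similar to the example in Figure 4. The critical value 2.290 is the tabulated two‐sided 5% Grubbs threshold for n = 10; the significance level actually used in the analysis is not stated in this section, so that choice is an assumption, as are the function names.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, index) for the value farthest from the sample mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    idx = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[idx] - mean) / sd, idx


def outside_k_sd(values, k=2.0):
    """Indices of values farther than k sample SDs from the sample mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mean) > k * sd]


# Hypothetical FD estimates; the first scan deviates from the rest
fds = [2.450, 2.501, 2.499, 2.503, 2.498, 2.502, 2.500, 2.497, 2.504, 2.496]

# Assumed threshold: two-sided 5% Grubbs critical value for n = 10
GRUBBS_CRIT_N10 = 2.290

g_value, g_index = grubbs_statistic(fds)
grubbs_flag = g_index if g_value > GRUBBS_CRIT_N10 else None
sd_flags = outside_k_sd(fds, k=2.0)
```

In this toy sample both criteria flag the first scan only, mirroring the agreement between the Grubbs test and the 2 SD criterion described for Figure 4.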

Figure 3.

Main steps of within‐group deviation analysis. The figure displays the deviation analysis for the exemplary analysis group of low‐resolution T2 WM partial volume estimates in the MASSIVE data set. Panel (a) shows a near‐uniform resampling distribution for bootstrapping, indicating the absence of a priori weights. Panel (b) displays the bootstrapped mean fractal dimensions as well as the resulting 99% CI and average over all bootstrapped means. Panel (c) plots the raw estimates for the 10 scans in the data set and their sample mean, together with the bootstrapped CI and the intervals spanning one and two SDs, respectively. Panel (d) represents the jackknife means (i.e., systematic resampling), where each of the 10 raw estimates was iteratively omitted to compute the mean over the remaining nine samples. Levene's test to see if the variances of the thus obtained means significantly differed from one another was insignificant. Panel (e) shows a quantile–quantile plot for the original data versus a fitted normal distribution, where a theoretical Gaussian would precisely follow the reference line. The values of the current analysis group reasonably adhere to this reference, and the test decision suggested that assuming composite normality was acceptable. Panel (f) shows the corresponding estimated normal distribution together with the cluster of the sampled FDs. The same procedure was applied to all 32 analysis groups in both the MASSIVE and the MSC data sets. CI, confidence interval; FD, fractal dimension; PDF, probability density function; SD, standard deviation [Color figure can be viewed at http://wileyonlinelibrary.com]

Figure 4.

Exemplary identification of a within‐group deviation. The data presented here belongs to the high‐resolution T1 GM_pve images in the MASSIVE data set. If the fractal dimension of an image was identified to deviate from the remaining analysis group according to the chosen deviation criterion, the corresponding volume was flagged (indicated here by #). In this case, the FD value belonging to the first scan was flagged, and its deviation from the remaining samples is visible from Panel (a). Note that in Panel (b), the SD of the jackknife mean without this flagged volume is notably smaller, although this difference did not reach significance level in multivariate variance comparison. Panel (c) shows the corresponding quantile–quantile plot. Although the flagged FD only deviates by about .05 from the other FD estimates, normality assessment suggests that assuming an underlying Gaussian distribution is not recommendable. Clearly, however, the remaining samples tightly follow the normality reference and discarding the flagged FD indeed restores the acceptance of composite normality. Furthermore, nonparametric comparison between the distributions with and without the flagged volume yielded insignificant results, exemplified here in Panel (d). CI, confidence interval; PDF, probability density function; SD, standard deviation [Color figure can be viewed at http://wileyonlinelibrary.com]

2.3.2. Impact of image registration

Based on the above analysis, we tested the effect of image registration and the ensuing interpolation on the fractal analysis results. In the MASSIVE data set, images were originally registered to the first T1 volume, and thus not all images were subject to the same transformation targets. To assess the impact of registration, we therefore reregistered all images to the mean of the FLAIR images, also included in the MASSIVE data set but independent of the presented analyses, and extended our analyses to the thus reregistered data. For further examination, we moreover reregistered the MSC data using FSL's MNI152 structural template. We then compared the mean FDs in the 32 analysis groups between the respective first volume registration and the reregistered data nonparametrically by a series of Wilcoxon rank sum tests, with Bonferroni–Holm correction for multiple comparisons. Effect sizes for these comparisons are calculated based on the z‐value of the test statistic as $r_{zval} = z / \sqrt{n_1 + n_2}$, where $n_1$ and $n_2$ are the compared sample sizes (i.e., number of scans for the two respective registrations, see Fritz et al. (2012)). Moreover, we computed correlation matrices to examine if there were associations between the 32 image analysis groups and whether image registration had an effect on potential associations.
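As an illustration of the effect size and the multiple-comparison correction described above, the following is a minimal pure-Python sketch, not the authors' Matlab implementation. The function names, the hypothetical FD values, the omission of a tie correction in the variance, and the continuity correction of 0.5 are all assumptions made for this example.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

def rank_sum_z(x, y, continuity=True):
    """z-value of the rank sum of x under the normal approximation.

    Assumption: no tie correction of the variance; optional continuity correction.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    diff = w - mean_w
    if continuity and diff != 0:
        diff -= math.copysign(0.5, diff)
    return diff / sd_w

def effect_size_r(z, n1, n2):
    """Effect size r_zval = z / sqrt(n1 + n2)."""
    return z / math.sqrt(n1 + n2)

def holm_adjust(pvals):
    """Bonferroni-Holm adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted

# Hypothetical FD estimates for one analysis group (10 scans per registration)
first_volume = [2.30 + 0.01 * i for i in range(10)]
reregistered = [2.10 + 0.01 * i for i in range(10)]
z = rank_sum_z(first_volume, reregistered)
r = effect_size_r(z, len(first_volume), len(reregistered))
print(round(r, 2))  # 0.84 when two groups of 10 scans are completely separated
```

Under these assumptions, two completely separated groups of 10 scans each yield |r| of about 0.84, which illustrates why the effect size is bounded well below 1 for small samples.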

2.3.3. FD and structural similarity

Furthermore, we sought to investigate the relationship between structural complexity and structural similarity. The motivation behind this was to examine if differences in FD essentially just track differences in structure, that is, if two MRI volumes differ little in their fractal dimensionality simply if they are very similar to one another. In this context, we computed the structural similarity index (SSIM) between two given 3D volumes and related it to the difference of their respective FDs. The SSIM is a well‐known reference metric of structural similarity between two images based on luminance, contrast, and structure, and is commonly applied in signal processing and image quality assessment (Wang, Bovik, Sheikh, & Simoncelli, 2004). The SSIM aims at evaluating structural differences between two complex‐structured signals and is computed as the result of comparing local intensity patterns over image windows. Importantly, it satisfies a number of useful properties for our comparisons: first, it represents a single scalar measure of the overall image comparison. Moreover, the SSIM is bounded by [−1, 1], with the unique maximum SSIM(x, y) = 1 if and only if the two images x and y to be compared are identical. Furthermore, the SSIM exhibits symmetry, such that SSIM(x, y) = SSIM(y, x) holds for any two images x and y. For further details, please refer to Wang et al. (2004), Østergaard, Derpich, and Channappayya (2011), and Brunet, Vrscay, and Wang (2012). We here computed the SSIM in every possible pair‐wise comparison of two volumes within an analysis group (i.e., volume 1 vs. 2, volume 1 vs. 3, and so on) in both the MASSIVE and the MSC data set. The number of total unique comparisons between any two out of n input volumes is given by the binomial coefficient $m = \binom{n}{2}$, and we compute

$$\mathrm{SSIM}(x_i, x_j), \quad \text{with } i, j = 1, \ldots, n \text{ and } i \neq j. \tag{4}$$

For each of these comparisons, we calculate the difference of the corresponding FD values of volumes $x_i$ and $x_j$, that is,

$$\Delta\mathrm{FD}_{i,j} = \left| \mathrm{FD}(x_i) - \mathrm{FD}(x_j) \right| \tag{5}$$

where we take the absolute difference to match the symmetry of the SSIM. In the MASSIVE data set, there are n = 10 repeated scans of a single subject. For each of the 32 analysis groups, we thus obtain m = 45 ΔFD/SSIM pairs, each belonging to one particular comparison of two 3D volumes. In the MSC data set, there are n = 4 repeated scans in each of the 10 subjects, yielding m = 6 between‐volumes comparisons in each analysis group. While the within‐subject comparisons were thus considerably more limited, the MSC data set allowed us to extend the above procedure to across‐subject analyses. To this end, we computed all possible session‐wise comparisons between subject scans (i.e., session 1 subject 1 vs. session 1 subject 2, …, session 4 subject 9 vs. session 4 subject 10), yielding $m = 4 \times \binom{10}{2} = 180$ comparisons for each of the 32 analysis groups. As plotting ΔFD over SSIM was suggestive of data clusters in some cases, we carried out a group‐wise kmeans clustering analysis. To this end, k was chosen agnostically based on range‐constrained silhouette optimization (see Appendix for an example and further details), yielding k = 2 for most analysis groups, followed by k = 3 in some instances. The clustering algorithm was run on the corresponding ΔFD/SSIM pairs with 10 replicates to avoid convergence on nonglobal minima due to random initial conditions. Clustering quality was generally very good across the data sets as indicated by high average silhouette values and reasonably balanced cluster sizes. We furthermore examined whether there were significant associations between ΔFD and SSIM by means of nonparametric Kendall's τ correlation, and performed a linear regression for all significant dependencies. 
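The pair-wise comparison scheme and the resulting numbers of comparisons can be sketched as follows. This is a minimal illustration rather than the original analysis code; the function name and the FD values are hypothetical.

```python
from itertools import combinations
from math import comb

def delta_fd_pairs(fd_values):
    """Absolute FD difference for every unordered pair of volumes (i < j)."""
    return {
        (i, j): abs(fd_values[i] - fd_values[j])
        for i, j in combinations(range(len(fd_values)), 2)
    }

# Hypothetical FD estimates for 10 repeated scans of one subject
fds = [2.64, 2.65, 2.65, 2.66, 2.64, 2.65, 2.63, 2.65, 2.66, 2.64]
pairs = delta_fd_pairs(fds)

print(len(pairs))        # 45 within-subject comparisons for n = 10
print(comb(4, 2))        # 6 within-subject comparisons for n = 4
print(4 * comb(10, 2))   # 180 session-wise across-subject comparisons
```

Because both the absolute FD difference and the SSIM are symmetric, each unordered pair needs to be evaluated only once.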
In order to test if differences in FD induced by varying interpolation (see above) were related to structural similarity, and if the relationship between ΔFD and SSIM was altered due to different image registration, we conducted the above analysis in both first volume registration and the reregistered data sets with identical optimization settings and compared ΔFD, SSIM, and kmeans clustering results between the different registrations.
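To make the SSIM properties used above more concrete, the following sketch evaluates the SSIM formula of Wang et al. (2004) over a single global window. This is a deliberate simplification and an assumption of this example: the SSIM is normally averaged over local image windows, and the code below is not the implementation used for the reported analyses. The function name, the flattened intensity lists, and the assumed data range of 1 are illustrative; the constants 0.01 and 0.03 are the defaults suggested by Wang et al. (2004).

```python
def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM of two equally long, flattened intensity lists."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical voxel intensities of two small "volumes", scaled to [0, 1]
vol_a = [0.10, 0.40, 0.35, 0.80, 0.55, 0.20]
vol_b = [0.12, 0.38, 0.30, 0.85, 0.50, 0.25]

s_ab = global_ssim(vol_a, vol_b)
s_ba = global_ssim(vol_b, vol_a)
s_aa = global_ssim(vol_a, vol_a)
```

In this sketch, `s_ab` and `s_ba` coincide (symmetry), and `s_aa` equals 1 up to floating-point precision (identical inputs), mirroring the two properties the comparison with ΔFD relies on.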

2.3.4. FD by image characteristics

Finally, we assessed differences of the fractal estimates across analysis groups as a function of image characteristics and analysis parameters. To this end, we compared the corresponding mean FDs by computing an analysis of variance (ANOVA), which invariably yielded significant differences in FDs across groups, and applied a post‐hoc Tukey–Kramer test (Hayter, 1984) to investigate significant FD differences between analysis groups in pair‐wise parameter‐dependent comparisons. For all statistical tests employed in the present work, we defined a minimum significance level of α = 0.05.

Image processing was implemented with a set of Unix shell scripts. Skeletonization, spatial optimization studies, fractality estimation, and data analysis were carried out based on custom‐written Matlab code (The MathWorks, Inc., Natick, MA). For the interested reader wishing to retrace our analyses, all files are available from the Open Science Framework (http://osf.io/3mtqx).

3. RESULTS

3.1. Deviation analysis

The procedure detailed in Section 2.3.1 was applied to all 32 analysis groups across the MASSIVE and MSC data sets, the result of which is shown in Figure 5. The overall robustness of the FD against repeated‐sampling deviations was very high across both data sets, with over 95% unflagged volumes. For the detected flags, our analyses uniquely identified a single scanning session that was responsible for the majority of deviations in both the MASSIVE and the MSC data sets in original registration, in this case volume 1 (Figure 5a and 5c). As the first T1 volume served as the respective subject‐wise registration target, this finding motivated further examination in the reregistered data sets (see Section 2.3.2). Interestingly, reregistration consistently abolished the clustering of deviations in the first volume in both the MASSIVE and the MSC data (Figure 5b and 5d, respectively). Moreover, reregistration reduced the absolute number of deviations in both data sets by around 1.5–2%. Despite this general reduction, reregistration also induced a few previously absent deviations in both data sets (e.g., volume 6 in the MASSIVE data; subject 7, volume 4, in the MSC data). In terms of image parameters, high‐resolution images were more susceptible to the effect of registration (with a slight predilection for T1WI), and skeleton models were more prone to deviations than unskeletonized images, while deviations were rather balanced between segmentation procedure and tissue type.

Figure 5.

Deviation analysis across the MASSIVE and MSC data sets. Panels (a) and (b) depict sampling deviations by volume and analysis group in the MASSIVE data set in the original first volume registration and after reregistration to the mean FLAIR images. Panels (c) and (d) relate the results by volumes and subjects in the Midnight Scan Club (MSC) data in first volume and MNI registration. Note that only Subjects 8 and 6 underwent acquisition runs 5 and 6, respectively (indicated by * and #), while all other subjects had four acquisition runs. Panels (e) and (f) resolve the MSC deviations by analysis groups in the two registrations. The original registration resulted in a deviation cluster around the registration target in both the MASSIVE and the MSC data. This effect was abolished by reregistration in both data sets. High‐resolution images were more susceptible to the registration effect, and skeleton models were more prone to deviations than unskeletonized images. bin, binary tissue segmentation; GM, gray matter; pve, partial volume estimates; Skel, skeleton model; WM, white matter [Color figure can be viewed at http://wileyonlinelibrary.com]

3.2. Impact of image registration on FD profile

3.2.1. Absolute FD estimates

For further characterization of registration effects, we compared the FD profiles across all analysis groups between the two respective registrations for both data sets. As summarized in Table 1, image registration had a significant impact on the mean FD estimates for most analysis groups in T2WI for the MASSIVE data set, while the comparisons in T1WI were less often significant. For the MSC data set, all comparisons in the high resolution category for both T1WI and T2WI yielded significant results, while differences were less pronounced for low resolution volumes, especially in T1WI. Notably, in both registrations and both data sets, SDs for skeleton models across most analysis groups were up to one order of magnitude higher as compared to their unskeletonized counterparts (e.g., T1 low‐resolution WM estimates). Moreover, data scatter was generally higher in the MSC data (across‐subject means) as compared to the MASSIVE data (within‐subject means). Regarding the direction of the effects, all significant registration‐induced changes of the skeleton models in the MASSIVE data resulted in a decreased mean FD, that is, reregistration uniformly reduced FD values in image skeletons. In contrast, the opposite pattern occurred in all but one of the unskeletonized image groups, with reregistration yielding higher mean FD estimates. Across the MSC data set, on the other hand, reregistration invariably resulted in decreased FD estimates for both T1 and T2 high resolution volumes, while mean FDs of low resolution images were generally increased. While registration‐induced changes were thus quite consistent within each data set, the absolute mean values and the direction of registration‐induced changes did not generalize from one data set to another.

Table 1.

Impact of image registration on fractal dimension profile

| Analysis group | MASSIVE, first volume: Mean FD ± SD | h_n | MASSIVE, FLAIR: Mean FD ± SD | h_n | MASSIVE p_corr | MASSIVE r_zval | MSC, first volume: Mean FD ± SD | h_n (refuted without/with flagged volumes) | MSC, MNI: Mean FD ± SD | h_n (refuted without/with flagged volumes) | MSC p_corr | MSC r_zval |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| T1 high | | | | | | | | | | | | |
| GM_pve | 2.6394 ± 0.0158 | n* | 2.6489 ± 0.0098 | y | ns | −0.28 | 2.6734 ± 0.0229 | n (1/1) | 2.6393 ± 0.0104 | n (0/1) | 1.9e−11 | 0.78 |
| GM_bin | 2.6025 ± 0.0130 | n* | 2.6002 ± 0.0025 | y | ns | 0.53 | 2.6187 ± 0.0427 | n (3/3) | 2.5530 ± 0.0234 | n (1/2) | 5.1e−13 | 0.84 |
| GM_Skel_pve | 2.2805 ± 0.0762 | n | 2.2201 ± 0.0439 | n | ns | 0.50 | 2.2744 ± 0.0377 | n (4/4) * | 2.1185 ± 0.0919 | n (4/5) | 1.2e−12 | 0.82 |
| GM_Skel_bin | 2.3146 ± 0.0536 | n* | 2.3437 ± 0.0064 | y | ns | −0.55 | 2.3445 ± 0.0951 | n (9/9) | 2.1761 ± 0.0591 | y (2/2) | 2.1e−11 | 0.78 |
| WM_pve | 2.5685 ± 0.0103 | n* | 2.5426 ± 0.0028 | y | 0.0330 | 0.68 | 2.6459 ± 0.0268 | n (0/1) | 2.6093 ± 0.0125 | y (0/0) | 6.7e−11 | 0.76 |
| WM_bin | 2.4917 ± 0.0031 | y | 2.4878 ± 0.0089 | n | ns | 0.08 | 2.5833 ± 0.0429 | n (1/1) | 2.5391 ± 0.0122 | y (1/2) | 2.0e−09 | 0.70 |
| WM_Skel_pve | 2.2423 ± 0.0311 | n* | 2.0673 ± 0.0216 | y | 0.0058 | 0.84 | 2.1899 ± 0.0468 | n (2/2) | 1.9107 ± 0.0504 | y (0/0) | 9.9e−14 | 0.86 |
| WM_Skel_bin | 2.2078 ± 0.0780 | y | 2.1530 ± 0.0171 | y | ns | 0.62 | 2.2822 ± 0.0758 | n (2/2) | 1.9932 ± 0.0888 | n (0/2) | 9.9e−14 | 0.86 |
| T1 low | | | | | | | | | | | | |
| GM_pve | 2.5265 ± 0.0080 | y | 2.5374 ± 0.0066 | y | ns | −0.55 | 2.5280 ± 0.0888 | n (0/1) | 2.5070 ± 0.0452 | n (2/2) | ns | 0.17 |
| GM_bin | 2.3901 ± 0.0152 | y | 2.4539 ± 0.0039 | y | 0.0058 | −0.84 | 2.3773 ± 0.1595 | n (1/2) | 2.4216 ± 0.0324 | n (0/0) | ns | −0.07 |
| GM_Skel_pve | 2.2087 ± 0.0073 | y | 1.9864 ± 0.0266 | n* | 0.0057 | 0.84 | 2.0784 ± 0.1697 | n (2/2) | 2.2745 ± 0.1090 | n (0/3) | 6.5e−07 | −0.59 |
| GM_Skel_bin | 2.2367 ± 0.0114 | n* | 2.2371 ± 0.0080 | y | ns | −0.14 | 2.1512 ± 0.0957 | n (2/2) | 2.2587 ± 0.0925 | n (2/3) | 0.0009 | −0.43 |
| WM_pve | 2.4137 ± 0.0028 | y | 2.4103 ± 0.0032 | y | ns | 0.46 | 2.4444 ± 0.1949 | n (1/3) | 2.5162 ± 0.0308 | n (2/2) * | ns | 0.03 |
| WM_bin | 2.2861 ± 0.0119 | n | 2.2795 ± 0.0045 | y | ns | 0.33 | 2.3574 ± 0.1661 | n (2/5) | 2.4178 ± 0.0161 | y (1/1) | ns | −0.06 |
| WM_Skel_pve | 1.8732 ± 0.0350 | y | 1.7491 ± 0.0474 | y | 0.0055 | 0.84 | 1.8443 ± 0.1480 | n (2/3) | 1.7583 ± 0.1639 | n (2/2) | 0.0438 | 0.30 |
| WM_Skel_bin | 2.0148 ± 0.0219 | y | 1.8672 ± 0.0974 | n | 0.0053 | 0.84 | 1.9449 ± 0.1010 | n (0/0) | 1.9427 ± 0.1093 | n (1/1) | ns | 0.03 |
| T2 high | | | | | | | | | | | | |
| GM_pve | 2.6327 ± 0.0017 | y | 2.6352 ± 0.0018 | y | ns | −0.57 | 2.6650 ± 0.0365 | n (0/1) | 2.6362 ± 0.0072 | y (0/0) | 8.9e−07 | 0.59 |
| GM_bin | 2.5263 ± 0.0043 | y | 2.5539 ± 0.0043 | y | 0.0051 | −0.84 | 2.6268 ± 0.0418 | n (1/2) | 2.5665 ± 0.0211 | n (1/1) * | 6.8e−12 | 0.80 |
| GM_Skel_pve | 2.2925 ± 0.0142 | n | 2.2419 ± 0.0149 | y | 0.0042 | 0.82 | 2.2700 ± 0.0509 | n (1/2) * | 2.1373 ± 0.0589 | n (3/4) | 1.9e−11 | 0.78 |
| GM_Skel_bin | 2.3437 ± 0.0229 | y | 2.2741 ± 0.0311 | n | 0.0161 | 0.74 | 2.3969 ± 0.0699 | n (2/2) | 2.3037 ± 0.0443 | n (0/1) | 7.0e−09 | 0.68 |
| WM_pve | 2.6699 ± 0.0024 | y | 2.6727 ± 0.0036 | y | ns | −0.40 | 2.7047 ± 0.0333 | n (0/1) | 2.6559 ± 0.0088 | y (0/1) | 2.0e−13 | 0.85 |
| WM_bin | 2.5284 ± 0.0080 | n* | 2.5773 ± 0.0019 | y | 0.0049 | −0.84 | 2.6455 ± 0.0471 | n (3/3) | 2.5422 ± 0.0237 | n (2/2) | 5.1e−13 | 0.84 |
| WM_Skel_pve | 2.3031 ± 0.0319 | y | 2.2771 ± 0.0158 | y | ns | 0.48 | 2.2768 ± 0.0509 | y (3/3) | 2.1982 ± 0.0574 | n (3/3) | 1.5e−08 | 0.67 |
| WM_Skel_bin | 2.3808 ± 0.0130 | n* | 2.3340 ± 0.0056 | y | 0.0047 | 0.84 | 2.4624 ± 0.0900 | n (2/3) | 2.2730 ± 0.0332 | y (1/1) | 7.0e−12 | 0.80 |
| T2 low | | | | | | | | | | | | |
| GM_pve | 2.4427 ± 0.0014 | y | 2.4696 ± 0.0120 | n* | 0.0046 | −0.84 | 2.4760 ± 0.1729 | n (3/4) | 2.5023 ± 0.0159 | n (2/3) | ns | 0.14 |
| GM_bin | 2.4306 ± 0.0010 | y | 2.4597 ± 0.0046 | y | 0.0044 | −0.84 | 2.3153 ± 0.2169 | n (3/4) | 2.4318 ± 0.0365 | n (3/3) | 0.0066 | −0.37 |
| GM_Skel_pve | 2.1620 ± 0.0182 | n* | 1.8032 ± 0.0658 | n* | 0.0042 | 0.84 | 2.0175 ± 0.0837 | n (2/2) | 2.2725 ± 0.1170 | n (2/2) | 2.2e−10 | −0.74 |
| GM_Skel_bin | 2.3035 ± 0.0036 | y | 2.1593 ± 0.0076 | y | 0.0040 | 0.84 | 2.0624 ± 0.1015 | n (1/3) | 2.2635 ± 0.0800 | n (2/2) | 5.0e−12 | −0.80 |
| WM_pve | 2.5655 ± 0.0048 | y | 2.5595 ± 0.0049 | y | ns | 0.53 | 2.4888 ± 0.2084 | n (1/3) | 2.5485 ± 0.0147 | n (1/1) * | ns | −0.01 |
| WM_bin | 2.4405 ± 0.0011 | y | 2.4610 ± 0.0052 | y | 0.0038 | −0.84 | 2.3456 ± 0.2081 | n (0/2) | 2.4651 ± 0.0269 | n (0/0) | 0.0029 | −0.40 |
| WM_Skel_pve | 2.1466 ± 0.0146 | n* | 1.7757 ± 0.0562 | n* | 0.0037 | 0.84 | 2.0379 ± 0.1207 | n (1/1) | 2.3721 ± 0.1025 | n (0/2) | 6.7e−12 | −0.80 |
| WM_Skel_bin | 2.3265 ± 0.0034 | y | 2.0101 ± 0.0021 | y | 0.0035 | 0.84 | 2.1418 ± 0.1276 | n (3/4) | 2.3559 ± 0.0683 | n (1/1) | 1.0e−13 | −0.86 |

The table summarizes the mean fractal dimension values by image group for the first volume registration and the reregistered data in both the MASSIVE and the Midnight Scan Club (MSC) data sets. Assessment of within‐group composite normality (h_n) is indicated by “y” (yes) and “n” (no). Asterisks indicate those groups in which composite normality was first violated but restored after removal of a within‐group deviation (see Section 3.1). Mean fractal dimensions between registrations were compared nonparametrically by Wilcoxon rank sum tests with Bonferroni–Holm adjustment for multiple comparisons. Effect sizes are calculated based on the z value of the test statistic as r_zval (see Section 2.3.2). bin, binary segmentation; FD, fractal dimension; GM, gray matter; ns, not significant; p_corr, adjusted p‐value; pve, partial volume estimates; SD, standard deviation; Skel, skeleton model; WM, white matter.

3.2.2. Sample distributions

We furthermore assessed the sample distributions of the repeated‐acquisition FD estimates in response to image registration across the two data sets. Specifically, Table 1 summarizes the outcomes of composite normality assessment (h_n) for both within‐subject (MASSIVE and MSC data) and across‐subject samples (MSC data). Here, asterisks indicate the conversion cases, where composite normality was first refuted but acceptable upon removal of the within‐group deviations as identified by the deviation analysis from Section 2.3.1 (see Figures 4 and 5). For the MSC data, the test decision refers to the sample across all subject volumes, with the accompanying counts indicating how many within‐subject normality assumptions were refuted without and with these flagged volumes, respectively (a maximum of 10 for each analysis group based on the 10 subjects). As a general result, the normality assumption in within‐subject measurements was more often refuted in first volume registration than after reregistration, although this difference reached significance only for the MSC data (MASSIVE: 40.6% in first volume registration vs. 25% in FLAIR registration, χ² = 1.1, n = 32, p = .29; MSC: 25.3% in first volume registration vs. 17.2% in MNI registration, χ² = 5.8, n = 320, p = .01). Furthermore, the repeated‐sampling deviations constituted a main reason for a priori rejection of composite normality in within‐subject sampling: in the MASSIVE data set, 10/13 normality rejections were restored by omitting deviations in first volume registration, and 4/8 in the reregistered data set. Conversion rates were 28.4% in first volume registration and 29.1% in MNI registration for the MSC data set. Considering the conversion cases, a total of 28/32 analysis groups adhered to composite normality in the reregistered MASSIVE data (87.5%), with similar results for the within‐subject distributions in the MNI‐registered MSC data set (281/320 within‐subject measurements, 87.8%). 
While assuming an underlying normal distribution for within‐subject sampling was hence acceptable for most analysis groups across both data sets, this did not transfer to the across‐subject distributions in the MSC data set. Here, normality was refuted in the vast majority of analysis groups in first volume registration, and this was virtually unaltered by omitting within‐subject deviations. MNI‐registration yielded adherence to composite normality in 25% of the analysis groups, without any obvious distribution across image categories, and this was again practically unaffected by within‐subject deviations. Closer examination of the sample distributions suggested that reregistration had a discernible regularization effect on the across‐subject distributions in some analysis groups, but not in others, as exemplified in Figure 6 for high‐resolution gray matter partial volume estimates in T1WI and T2WI.
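The comparison of within-subject normality rejection rates between registrations can be illustrated with a 2 × 2 chi-square test. The sketch below is not the original analysis code and rests on two assumptions: first, that a continuity (Yates) correction was applied, and second, that the MSC rejection counts were 81 and 55 out of 320, which are back-calculated here from the reported percentages. The MASSIVE counts of 13 and 8 out of 32 follow from the rejections stated in the text. The function name is illustrative.

```python
import math

def yates_chi2_2x2(a, b, c, d):
    """Yates-corrected chi-square statistic and p-value (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: first volume registration, reregistered data
# Columns: normality refuted, normality acceptable
chi2_massive, p_massive = yates_chi2_2x2(13, 32 - 13, 8, 32 - 8)
chi2_msc, p_msc = yates_chi2_2x2(81, 320 - 81, 55, 320 - 55)

print(round(chi2_massive, 1))  # 1.1
print(round(chi2_msc, 1))      # 5.8
```

Under these assumptions, the rounded test statistics agree with the values of 1.1 and 5.8 given in the text.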

Figure 6.

Exemplary across‐subject distributions of fractal dimension estimates in the MSC data set. The figure reports the raw fractal dimension estimates by subjects. Panels (a) and (b) display exemplary sample distributions with kernel density estimations for high‐resolution gray matter partial volume estimates in T1WI and T2WI, respectively. While MNI registration of the T2WI resulted in a regularization of the across‐subject sample (and composite normality was acceptable), this was not the case for T1WI. GM, gray matter; pve, partial volume estimates [Color figure can be viewed at http://wileyonlinelibrary.com]

3.2.3. Across‐group associations

Based on the complex impact of image registration on the FD estimates in both data sets, we furthermore investigated whether there were any between‐group associations across the 32 analysis groups and whether image registration had an impact on these associations. Figure 7 reports the corresponding results for the MSC data set (results for the MASSIVE data set were similar but limited to 10 estimates in each group and only reflective of within‐subject associations). First volume registration featured a large number of systematic, strong, bidirectional, and highly significant between‐group correlations, reflected in a “checkerboard” pattern of the correlation matrix. Interestingly, reregistration to MNI space resulted in a pronounced overall across‐group decorrelation, reducing both the strength and the amount of associations between image analysis groups, while an across‐group association cluster was seen for some analysis groups in the T2 low‐resolution category.

Figure 7.

Across‐group correlations in MSC data set. Panels (a) and (b) depict the correlation coefficients across the 32 image analysis groups in the Midnight Scan Club (MSC) data set in first volume and MNI registration, respectively. Panels (c) and (d) show the corresponding p‐values below significance threshold after Bonferroni–Holm adjustment. While first volume registration induced strong systematic correlations between analysis groups, both the amount and the strength of these associations were markedly attenuated by reregistration. bin: binary tissue segmentation; pve: partial volume estimates; Skel: skeleton model [Color figure can be viewed at http://wileyonlinelibrary.com]

3.3. Optimal k‐ranges

We subsequently analyzed the optimization results across the two data sets in terms of analysis parameters and image registration. Specifically, for each individual fractality estimation, we tracked which spatial scale interval (i.e., which range of k in Equation (2)) was selected as the optimal range for that particular estimation according to the procedure in Section 2.2. Based on Equation (3), there were m = 21 distinct spatial scale intervals, ranging from k = 0, …, 3 to k = 0, …, 8. Figure 8 visualizes the frequency of the optimal k‐ranges as estimated from the data. Panels (a) and (b) display the optimization results across analysis groups for the MASSIVE data set in first volume and FLAIR registration, respectively. As a general result, optimal k‐ranges were highly selective in that they (a) displayed a clear preference for a subset of all possible spatial scales (i.e., were far from a uniform distribution), (b) differed markedly over the various analysis groups, and (c) showed a systematic tendency toward lower‐cardinality over higher‐cardinality scale intervals. Furthermore, the k‐ranges in Figure 8 are ordered from left to right by interval length and lower to higher k‐values within each of these groups (i.e., from k = 0, …, 3 to k = 5, …, 8 for a cardinality of 4, from k = 0, …, 4 to k = 4, …, 8 for a cardinality of 5, and so on). From this it becomes apparent that optimal spatial scales showed a further tendency toward lower k‐values (i.e., smaller box sizes) for a given interval length. For instance, considering a cardinality of 4, all estimations in the reregistered data set yielded optimal scales from k = 0, …, 3 to k = 3, …, 6, while the larger box edge sizes of k = 4, …, 7 and k = 5, …, 8 were never selected as optimal (Figure 8b). 
Interestingly, scale selectivity in the MASSIVE data was even further increased by reregistration to the FLAIR images (in Figure 8b, 11 k‐ranges contained all optimization results, while the remaining 10 were never chosen as the optimal spatial scales). Optimization outcome furthermore differed by image analysis groups. While there was no obvious distribution of optimal scales by weighting, resolution, segmentation procedure, or tissue type, a discernible pattern emerged as a function of skeletonization, on which we thus focus the visual comparison (with unskeletonized volumes in colder colors, and skeleton models in warmer tones). Optimal scales for image skeletons were systematically shifted to the right of unskeletonized images, such that intervals for skeleton models were generally of the same length but spanned higher k‐values. Furthermore, we examined how consistently a particular k‐range was selected in repeated estimations within the same image analysis group. To this end, we tracked how many volumes in each analysis group yielded the same optimal scale, regardless of the particular k‐range. Panel (c) visualizes this scale dispersion for the MASSIVE data set. For some analysis groups in first volume registration, estimation yielded the same optimal scales for all 10 input volumes, while there were nearly as many cases in which a k‐range was only chosen once in a particular analysis group. Interestingly, reregistration shifted this distribution to the right, indicating that more analysis groups now consistently yielded the same optimization outcome over all 10 input volumes. The same analyses were carried out over the MSC data set, summarized in Panels (d)–(f). Results closely mirrored the above findings in the MASSIVE data. Optimal k‐ranges showed highly similar convergence on lower‐cardinality intervals as well as lower k‐values for a given interval length, with high consistency across subjects. 
Moreover, the same distribution of skeleton models and unskeletonized images was observed, and this pattern as well as scale selectivity was equivalently augmented by reregistration (Figure 8e). Furthermore, the scale dispersion distribution in Panel (f) was also right‐shifted in the reregistered data set, indicating increased optimization consistency. This effect, however, was more pronounced in some subjects than in others, and absolute counts differed moderately among subjects, suggesting that despite high qualitative consistency, there was also some between‐subject variability in the numerical frequency of individual optimization results.
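The set of candidate spatial scale intervals and their ordering can be illustrated with a short enumeration. The sketch assumes that the candidates are all contiguous ranges of k between 0 and 8 with a cardinality of at least 4, which is consistent with the stated total of 21 intervals and the described ordering; the function name and parameters are illustrative and not taken from the original code.

```python
def k_ranges(k_min=0, k_max=8, min_cardinality=4):
    """Contiguous k-intervals, ordered by interval length and then by lower bound."""
    intervals = []
    for length in range(min_cardinality, k_max - k_min + 2):
        for start in range(k_min, k_max - length + 2):
            intervals.append((start, start + length - 1))
    return intervals

candidates = k_ranges()
print(len(candidates))   # 21
print(candidates[0])     # (0, 3)
print(candidates[-1])    # (0, 8)

# Intervals with a cardinality of 4, from k = 0..3 up to k = 5..8
cardinality_4 = [iv for iv in candidates if iv[1] - iv[0] + 1 == 4]
print(cardinality_4)     # [(0, 3), (1, 4), (2, 5), (3, 6), (4, 7), (5, 8)]
```

Tallying how often each of these intervals is selected as optimal, per analysis group, then yields frequency distributions of the kind summarized in Figure 8.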

Figure 8.

Optimal k‐ranges in MASSIVE and MSC data sets. Panels (a) and (b) display the optimal spatial scales across all fractal dimension estimations in the MASSIVE data set for first volume and FLAIR registration. Panel (c) quantifies how many of the 10 volumes in each image analysis group yielded the same respective optimal k‐ranges as a measure of scale dispersion. Reregistration shifted this distribution to the right, reflecting increased consistency of repeated optimization results. Panels (d) and (e) show the absolute frequencies of optimal spatial scales in the MSC data set for first volume and MNI registration (single bars represent subjects and stacks represent image analysis groups for each subject). There was notable similarity to the MASSIVE data in scale selectivity and distribution by image analysis groups, especially regarding skeleton models versus unskeletonized images. Panel (f) represents the consistency distribution over subjects in the MSC data. Note that only subjects 8 and 6 underwent acquisition runs 5 and 6, respectively (indicated by * and #), while all other subjects had four acquisition runs, and thus 4 represents the maximum repeated‐optimization consistency for those subjects. bin, binary tissue segmentation; GM, gray matter; pve, partial volume estimates; Skel, skeleton model; WM, white matter [Color figure can be viewed at http://wileyonlinelibrary.com]

3.4. FD and structural similarity

The procedure in Section 2.3.3 revealed an interesting relationship between the FD and structural similarity. Generally, SSIM values ranged between 0.7 and 1 for both data sets, indicating a high degree of similarity between any two MRI volumes across all image analysis groups. With regard to ΔFD/SSIM pairs in within‐subject comparisons, some cases were indicative of data clustering, and this was related to image registration. Figures 9 and 10 show the results for the exemplary group of T1 high‐resolution images in the MASSIVE data set. In first volume registration, kmeans clustering showed that the data was clearly separated into fractality‐similarity clusters (Figure 9a) across all analysis groups. Notably, this clustering was mainly driven by comparisons involving the first volume, i.e., the registration target (indexed by 0, see caption). Consequently, a number of across‐cluster correlations were found in various analysis groups, suggesting a systematic negative association between differences in FD and structural similarity (Figure 9b). However, this relationship was limited to clusters that were highly separated in both ΔFD and SSIM (see centroid location) and that were most clearly induced by comparisons involving the registration target. Indeed, when the same procedure was applied to high‐resolution T1 images in the reregistered MASSIVE data, these associations disappeared (Figure 10). Here, ΔFD/SSIM clusters as found by kmeans were generally less separated, mainly differed only by ΔFD in centroid location, and showed no systematic relationship between cluster assignment and which of the MRI volumes entered the comparison (Figure 10a). Similarly, the previous associations between ΔFD and SSIM were strongly attenuated, and all but one vanished altogether (Figure 10b). In fact, no general systematic relationship between FDs and structural similarity was observed in the reregistered MASSIVE data set. 
We then applied the same within‐subject analysis to the MSC data. While we observed similar target‐induced clustering and cluster‐driven ΔFD/SSIM associations in first volume registration as well as the attenuation of these effects in the reregistered images (see Appendix for an example), within‐subject analyses in the MSC data set were restricted to only a few possible between‐volumes comparisons due to the lower number of per‐subject scans (see Equation (4)). Nonetheless, the MSC data enabled us to compute extensive across‐subject comparisons, as detailed in Section 2.3.3. Figure 11 summarizes the results for the exemplary case of high‐resolution T1 images (but similar results were found for T2WI). kmeans clustering yielded two to three ΔFD/SSIM clusters for each image analysis group, with low between‐cluster separation and centroid locations driven predominantly by differences in ΔFD or SSIM but not both (Figure 11a). Furthermore, no systematic relationship between ΔFD and SSIM was observed for across‐subjects comparisons (Figure 11b).

Figure 9.

Fractal dimension differences and structural similarity in the MASSIVE high‐resolution T1 images (first volume registration). Panel (a) displays the kmeans clustering results within each analysis group. For all possible 45 comparisons, the structural similarity index (SSIM) between two input volumes was computed and related to the difference in the corresponding fractal dimensions (ΔFD). Numbers indicate which of the 10 volumes were compared, with indices running from 0 to 9 to avoid triple digits. For first volume registration, ΔFD/SSIM pairs showed strong clustering, and there was a systematic effect of comparisons involving the first volume (the original registration target, indexed by 0) for most image analysis groups. In these groups, clusters were driven by differences in both ΔFD and SSIM, and this induced strong negative associations between differences in fractal dimension and structural similarity shown in Panel (b). This effect, however, was attenuated by reregistration (see Figure 10 below). ΔFD, absolute difference in fractal dimension between two compared volumes; bin, binary tissue segmentation; GM, gray matter; pve, partial volume estimates; Skel, skeleton model; SSIM, structural similarity index between two compared volumes; WM, white matter [Color figure can be viewed at http://wileyonlinelibrary.com]

Figure 10.

Fractal dimension differences and structural similarity in the MASSIVE high‐resolution T1 images (reregistered to FLAIR). Similar to Figure 9, Panel (a) represents the ΔFD/SSIM pairs and kmeans clustering results for high‐resolution T1 images after reregistration to FLAIR. Here, ΔFD/SSIM clusters as found by kmeans clustering were generally less separated, mainly differed only by ΔFD in centroid location, and showed no systematic relationship between cluster assignment and which of the input volumes entered the comparison. Panel (b) shows that the previous associations between ΔFD and SSIM were strongly attenuated, and all but one vanished altogether. ΔFD, absolute difference in fractal dimension between two compared volumes; bin: binary tissue segmentation; GM, gray matter; pve, partial volume estimates; Skel, skeleton model; SSIM, structural similarity index between two compared volumes; WM, white matter [Color figure can be viewed at http://wileyonlinelibrary.com]

Figure 11.

Fractal dimension differences and structural similarity in across‐subjects comparisons in the MSC data set. Panel (a) visualizes the results of across‐subject comparisons for the high‐resolution T1 images in MNI registration. Each subject had four scans, and all possible between‐subject comparisons were computed for each of those scanning sessions across all image analysis groups (where we omit the comparison indices from above for visual coherence). Panel (b) relates the corresponding correlation results by image analysis groups. There was no systematic ΔFD/SSIM data clustering in across‐subject comparisons, and no systematic association between the fractal dimension and structural similarity was found. bin, binary tissue segmentation; GM, gray matter; pve, partial volume estimates; Skel, skeleton model; SSIM, structural similarity index between two compared volumes; WM, white matter [Color figure can be viewed at http://wileyonlinelibrary.com]

Further evidence against a systematic fractality‐similarity association comes from between‐registration comparisons of ΔFD and SSIM (see Appendix). While SSIM values in the MASSIVE data set were significantly different between first volume registration and the reregistered data across all analysis groups (p < .001 for all comparisons, Bonferroni–Holm‐adjusted), there was no significant difference in ΔFD values in the majority of the analysis groups (20/32 confirmed null hypotheses, see Table A1 in Appendix). This finding was corroborated and indeed more pronounced in the MSC data set, in which SSIM values for all analysis groups also showed a highly significant between‐registration difference, while there were essentially no significant differences in ΔFD values between the two image registrations (30/32 confirmed null hypotheses, see Table A2).
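The Bonferroni–Holm adjustment applied to these between‐registration comparisons can be illustrated with a minimal sketch. The function name `holm_adjust` and the example p‐values are hypothetical and not taken from the study; the code shows only the standard step‐down procedure, not the authors' implementation.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order.

    Illustrative helper (hypothetical name): the smallest p-value is
    multiplied by m, the next by m - 1, and so on; adjusted values are
    capped at 1 and forced to be non-decreasing along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up p-values for three comparisons (not study data).
example = holm_adjust([0.01, 0.04, 0.03])
```

A null hypothesis is then retained whenever its adjusted p‐value exceeds the chosen significance level.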

3.5. FD by image characteristics

Finally, we compare the mean FD estimates by image weighting and resolution in a parameter‐dependent fashion. Figure 12 reports the results for the MSC data set (results for the MASSIVE data were highly similar, see Figure A4 in Appendix). As a general result, FD estimates in both image weightings fell within the expected range, compatible with previous reports, and T1WI and T2WI were affected by image registration, binarization, skeletonization, and spatial resolution in a highly similar manner. While the results from Section 3.2 highlight that registration had a significant impact on the absolute FD values, the influence of sequence weighting, tissue type, and image processing parameters within a given set of input images was essentially unaltered by reregistration. As such, binary tissue segmentation consistently caused a moderate reduction of FD values in the unskeletonized volumes across both registrations, while it led to a slight increase or no significant change in the skeleton models for both T1WI and T2WI, for gray matter as well as white matter segmentations, and regardless of spatial resolution. Furthermore, skeleton models invariably resulted in significantly decreased FD values across all analysis groups and in both image registrations. Another interesting pattern was observed with regard to tissue type: while gray matter and white matter FDs showed no significant differences for most comparisons in unskeletonized analysis groups, skeleton models generally yielded significantly higher gray matter FDs in T1WI as well as slightly but significantly higher white matter FDs for most comparisons in T2WI. Moreover, lower spatial resolution invariably resulted in significantly decreased FD values for all unskeletonized image groups in both the MASSIVE and the MSC data sets, regardless of image registration (see Figure A5). The same effect was observed in most image skeleton groups across both data sets, with a few exceptions in the MNI‐registered MSC data.
Furthermore, comparing the SDs in Panels (a) and (b) of Figure 12, there was a marked reduction in between‐subject variability by reregistration to MNI space for all unskeletonized analysis groups, while within‐ and between‐subject variability were not equivalently reduced in skeleton models (see also by‐subject averages in Figure A6).

Figure 12.

Parameter‐dependent comparison of the fractal dimension estimates in the MSC data set. Panels (a) and (b) visualize the comparisons of the mean fractal dimension estimates over image analysis groups in first volume and MNI registration, respectively. Horizontal bars reflect pair‐wise significance levels. Comparisons for binary‐segmented images (second bar in each subpanel) invariably yielded the same significance levels as the partial volume estimates (first bar) so they were omitted here for visual coherence. Note that while image registration had a profound impact on the absolute fractal dimension estimates, the relative impact of sequence weighting, spatial resolution, segmentation procedure, tissue type, and skeletonization was essentially unaltered by registration. ns, not significant; *p < .05; **p < .01; ***p < .001; bin, binary tissue segmentation; GM, gray matter; pve, partial volume estimates; Skel, skeleton model; WM, white matter [Color figure can be viewed at http://wileyonlinelibrary.com]

4. DISCUSSION

The current study presents a systematic and in‐depth evaluation of the FD as a marker of structural brain complexity in human brain MRI. To this end, we first consider some computational aspects regarding FD estimation, and we report detailed empirical analyses of two recently published open‐access neuroimaging data sets.

As detailed above, the FD estimates obtained from box‐counting numerically depend on the spatial scale interval over which the linear regression of the log‐transformed data is computed, raising the question of which scale interval most adequately captures the underlying FD in the estimation process. We here applied an algorithmic scale optimization procedure to address this issue and examined the performance of optimized versus non‐optimized estimation in simulation studies of random Cantor sets, whose FD values were known. The outlined procedure improved estimation accuracy against both agnostic (random) and specific (fixed) non‐optimal k‐ranges, although the magnitude of this improvement depended on the particular non‐optimal k‐range as well as the underlying FD values. A further advantage of our procedure concerns the increased flexibility toward different object types. The simulated random Cantor sets only differed from each other over one degree of freedom (the retainment probability), while empirical objects such as brain MRI segmentations may differ over many degrees of freedom (e.g., sequence, spatial resolution, tissue type, and so on). As such, a particular spatial scale interval that yields good estimation accuracy for one type of object does not necessarily fare equally well for another type of object. Indeed, our procedure allowed us to analyze explicitly which optimal spatial scales were selected from the empirical data, and optimization results from Section 3.3 show that optimal k‐ranges were selective in terms of interval length, numerical k‐values (i.e., box edge sizes) and image analysis groups. With regard to the latter, an interesting pattern emerged as a function of skeletonization, with remarkably similar optimization outcomes in the MASSIVE and the MSC data sets and high consistency across repeated measurements and subjects. 
In consequence, those spatial scales that were optimal for unskeletonized volumes were not optimal for skeleton models and vice versa, such that no single fixed k‐range would accommodate both. In sum, we suggest that the applied procedure provides improvement over using fixed spatial scales because (a) it can estimate the underlying FD with improved accuracy and because (b) it is agnostic toward different object types. In similar spirit, group‐wise scale selection based on correlation maximization has been applied by Esteban et al. (2010). Nonetheless, generalization of optimal scales across distinct estimations may be limited by differences in populations, subjects, acquisition sessions, scanning equipment, or estimation software, and thus a more data‐driven approach offers increasingly individualized optimization. In the current study, we apply scale optimization to individual fractality estimations in a completely automatic fashion.
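The random Cantor set simulations and the box‐counting step referred to above can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: it assumes a 3D construction in which every cube is split into eight subcubes and each subcube is kept independently with the retainment probability `p`. Under that assumption the dimension is 3 + log2(p) almost surely, conditional on the set not dying out (which requires p > 1/8). All function names are hypothetical.

```python
import math
import random


def random_cantor_set(levels, p, rng):
    """Occupied voxel coordinates of a 3D random Cantor set on a grid of
    edge length 2**levels (assumed construction, see lead-in)."""
    cubes = [(0, 0, 0)]
    for _ in range(levels):
        children = []
        for (x, y, z) in cubes:
            for dx in (0, 1):
                for dy in (0, 1):
                    for dz in (0, 1):
                        # Keep each of the eight subcubes with probability p.
                        if rng.random() < p:
                            children.append((2 * x + dx, 2 * y + dy, 2 * z + dz))
        cubes = children
    return cubes


def box_count(voxels, k):
    """Number of boxes with edge length k (in voxels) containing any voxel."""
    return len({(x // k, y // k, z // k) for (x, y, z) in voxels})


def expected_fd(p):
    """Theoretical dimension under the assumed construction."""
    return 3.0 + math.log2(p)


# Example: one realization with a fixed seed and counts over dyadic box sizes.
voxels = random_cantor_set(levels=5, p=0.6, rng=random.Random(1))
counts = {k: box_count(voxels, k) for k in (1, 2, 4, 8, 16)}
```

Regressing the logarithm of the counts on the logarithm of the box sizes over a chosen k‐range then gives the FD estimate as the negative slope, which can be compared against `expected_fd(p)`.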

With regard to the latter, scale selection here was based on maximizing the adjusted coefficient of determination, a commonly used measure of goodness of fit. While this perhaps represents the most natural approach to the box‐counting regression, other well‐studied model selection criteria exist (e.g., the Bayesian Information Criterion), and future studies may examine if applying a different model selection criterion yields further improvement of estimation accuracy. Of note, the disadvantage of using all spatial scales in the box‐counting regression has been pointed out from an analytical perspective (e.g., Gneiting et al., 2012), and indeed avoidance of greater‐length k‐ranges was observed in our empirical optimization outcomes. Finally, further study is also warranted to examine if similar improvements can be achieved in other methods of fractality estimation, such as dilation‐based algorithms or the sandbox method, which are thought to possess several advantages over classical box‐counting (Lopes & Betrouni, 2009; Madan & Kensinger, 2016, 2017; Ruiz de Miras et al., 2017; Xue & Bogdan, 2017; Yotter, Nenadic, Ziegler, Thompson, & Gaser, 2011).
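Scale selection by maximizing the adjusted coefficient of determination can be sketched as an exhaustive search over contiguous k‐ranges. The minimum interval length of three scales and the tie‐breaking rule (prefer the longer interval) are assumptions made for this illustration, the function names are hypothetical, and the counts in the example are synthetic.

```python
import math


def fit_line(x, y):
    """Least-squares line through (x, y); returns (slope, adjusted R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - ss_res / syy
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return slope, adj_r2


def select_scale_range(box_sizes, counts, min_len=3):
    """Search all contiguous k-ranges with at least min_len scales and
    return (FD estimate, (k_min, k_max), adjusted R^2) for the best fit."""
    log_k = [math.log2(k) for k in box_sizes]
    log_n = [math.log2(c) for c in counts]
    best = None
    for i in range(len(box_sizes)):
        for j in range(i + min_len, len(box_sizes) + 1):
            if len(set(log_n[i:j])) == 1:
                continue  # constant counts: R^2 is undefined
            slope, adj_r2 = fit_line(log_k[i:j], log_n[i:j])
            # Assumed tie-break: equal fit quality favors the longer interval.
            key = (adj_r2, j - i)
            if best is None or key > best[0]:
                best = (key, -slope, (box_sizes[i], box_sizes[j - 1]))
    (adj_r2, _), fd, k_range = best
    return fd, k_range, adj_r2


# Synthetic counts: an exact power law with exponent 2 for k = 2..16,
# with deviations at the smallest and largest box size.
fd, k_range, adj_r2 = select_scale_range(
    [1, 2, 4, 8, 16, 32], [3000, 1024, 256, 64, 16, 1]
)
```

In this synthetic case the search discards the two deviating end scales, which mirrors the point that using all available scales is not necessarily optimal.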

Our empirical results suggested a high overall test–retest stability of the FD estimates (~95%) across both the MASSIVE and the MSC data sets. This is in accordance with a recent reliability study of brain morphology estimates in two open‐access data sets by Madan and Kensinger (2017), who found that the reliability of regional FD as computed by both dilation and box‐counting methods was generally very high and comparable to that of gyrification indices, while it was in fact superior to that of volumetric measures such as cortical thickness. Similarly, Goñi et al. (2013) analyzed the fractal properties of the pial surface, the gray matter/white matter boundary and the cortical ribbon and white matter volumes in MRI data from different imaging centers and found a high within‐subject reproducibility with region‐specific patterns of individual variability. While there is thus converging evidence for the robustness of fractal analysis in neuroimaging, these studies used parcellation‐ and surface‐based methods, and T2WI were not analyzed. In this regard, the present study provides additional information as our evaluation was stratified into 32 distinct analysis groups based on sequence weighting, spatial resolution, segmentation procedure, tissue type, and image complexity reduction by skeletonization, highlighting that the different image variables entail a differential susceptibility to repeated‐sampling deviations, observed here especially for high‐resolution images and skeleton models.
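The stratification into 32 analysis groups follows from five binary factors, which a short enumeration makes explicit. The label strings below are placeholders chosen for this sketch (the abbreviations GM, WM, pve, bin, and Skel follow the figure legends; the resolution labels are assumptions).

```python
from itertools import product

# Five binary stratification factors named in the text.
WEIGHTINGS = ("T1", "T2")
RESOLUTIONS = ("high", "low")        # placeholder labels
TISSUES = ("GM", "WM")
SEGMENTATIONS = ("pve", "bin")
SKELETONIZATION = ("unskeletonized", "Skel")


def fractal_profile_keys():
    """Processing combinations applied to a single input image."""
    return list(product(TISSUES, SEGMENTATIONS, SKELETONIZATION))


def analysis_groups():
    """All analysis groups across weighting and resolution."""
    return list(
        product(WEIGHTINGS, RESOLUTIONS, TISSUES, SEGMENTATIONS, SKELETONIZATION)
    )
```

Each input image thus contributes eight FD estimates, and the two weightings and two resolutions bring the total to 32 groups.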

In this context, one important finding of the current study concerns the complex and profound influence of image registration on the FD estimates. In both data sets, image registration had a significant impact on the absolute FD estimates, without obvious patterns across the various analysis groups. Furthermore, we found that unbalanced registration targets can induce test–retest deviations in the FD estimates that are reduced with reregistration, and this was consistently observed in both data sets and across subjects in the MSC data. These test–retest deviations were also found to render the assumption of composite normality to be invalid in repeated within‐subject sampling. While a high proportion of analysis groups in balanced registration adhered to composite normality for repeated within‐subject measurements, this did not transfer to the across‐subject sample distributions. Instead, here the assumption of normality was refuted in a large majority of image analysis groups, and differences between analysis groups appeared to be driven by a variable across‐subject sample regularization in balanced registration. This finding (together with the test‐inherent limitation that accepting the null hypothesis does not prove composite normality but rather indicates it should not be refuted) suggests that it may not be advisable to assume the FD estimates over various subjects to be sampled from an underlying normal distribution. Measuring multiple subjects with only one or a few respective samples is a very common empirical scenario, however. As such, it appears that distributional assumptions in comparisons across populations (e.g., patients vs. controls) may need to be relaxed, for instance by opting for nonparametric methods, or ought to be informed by explicit assessment.
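One nonparametric option for such group comparisons is an exact permutation test, sketched below for two small samples. The choice of test statistic (absolute difference in group means), the function name, and the example values are assumptions made for illustration; the study does not prescribe a particular method.

```python
from itertools import combinations


def exact_permutation_test(group_a, group_b):
    """Two-sided exact permutation p-value for the absolute difference in
    group means. Enumerates every relabeling, so it is only practical for
    small samples."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    hits = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        stat = abs_mean_diff(sum(pooled[i] for i in idx))
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            hits += 1
        n_splits += 1
    return hits / n_splits


# Made-up FD values for two hypothetical groups (not study data).
p_value = exact_permutation_test([2.51, 2.49, 2.53, 2.50], [2.44, 2.47, 2.45, 2.46])
```

Because the reference distribution is built from the data themselves, no normality assumption about the FD estimates is needed.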

Furthermore, image registration also had an interesting effect on between‐group ties within the data sets: while unbalanced registration induced strong associations among various analysis groups, reregistration caused a pronounced overall decorrelation (indeed, the presence of strong across‐groups associations also seems biologically implausible; for example, there is no principled reason to believe that structural complexity of white matter will generally follow that of gray matter). In summary, our results point to an important methodological question: given the profound impact of image registration on the fractality estimates, which registration scheme should be applied for fractal analysis of structural brain MRI? While our results clearly argue for balanced registration methods, it is at this point unclear if subject‐derived templates (that were found to increase between‐scan structural similarity, see below) carry any advantages over subject‐independent templates. In any case, as the former may not always be feasible (e.g., in single‐acquisition scenarios), registration to commonly used subject‐independent targets such as the MNI template may currently be a reasonable solution, perhaps also in the interest of between‐study comparisons.

The latter point also concerns procedural standardization and technical variance. Both the MASSIVE and the MSC data sets provide highly standardized images, while this may not always be the case in empirical reality. Motion artifacts, for instance, can be expected to obscure the utility of fractal analysis. Indeed, a recent study by Madan (2018a) has shown that head motion can cause a significant decrease of numerical FD estimates. Moreover, just as reference values for blood tests may differ depending on the laboratory where they are measured, fractal analysis may be influenced by the type of scanning equipment, sequences, preprocessing software or estimation method, as has been shown for other morphometric analyses (e.g., Wonderlick et al., 2009; Madan & Kensinger, 2017; Duché et al., 2017). In this context, it is noteworthy that the MASSIVE and the MSC data were acquired on scanning systems from two different manufacturers. While this provides some evidence that the results presented herein (which were very similar across the two data sets) were fairly independent of the scanning equipment, it may also constitute one reason why the absolute numerical dimension estimates were not generally transferable from one data set to another.

One finding with high consistency across the two data sets regards the impact of binary segmentation on the FD estimates, which caused a moderate FD reduction in unskeletonized images but no change or slight increases in skeleton models. Of note, image skeletons invariably yielded decreased fractality estimates as compared to their unskeletonized counterparts across both data sets. Since the skeleton models can be thought of as a minimum complexity version of the input volume, it seems rather plausible that the FD as a marker of tissue complexity was consistently reduced by skeletonization. The finding that SDs in skeleton models remained comparatively high regardless of registration (whereas this was not the case for unskeletonized images) further raises the interesting question of whether this could be interpreted as an indication of multifractal behavior (see also below) or a shift or destabilization of localized fractal scaling over a finite range of scales (cf., Xue & Bogdan, 2017), which could be related to the right‐shift of the optimization outcomes in Figure 8. Moreover, we found that lower voxel resolution invariably resulted in lower FD values in the unskeletonized images across both data sets. A similar pattern was observed for skeletonized images, with a few exceptions in the MNI‐registered MSC data set. Intuitively, a measure of structural brain complexity may be decreased in coarser spatial resolution because structural information is blunted by partial volume effects.

Furthermore, the present study systematically evaluates the methodological characteristics of fractality estimation in structural T2WI. While T2‐derived sequences have been used for fractal analysis in the realm of functional MRI (albeit predominantly with respect to time series analysis, see Bullmore et al., 2001; Eke et al., 2012; Foss, Apkarian, & Chialvo, 2006; Lai et al., 2010; Thurner, Windischberger, Moser, Walla, & Barth, 2003), T1WI have been the mainstay of structural neuroimaging studies employing fractal analysis. Nonetheless, there has been some prior indication that T2‐based fractal analysis is both feasible and useful, especially in clinical assessment. For instance, Iftekharuddin et al. (2009) successfully incorporated T2WI in fractality‐based multimodal feature extraction for tumor segmentation, and Takahashi et al. (2006) used multifractal analysis of deep white matter in T2WI to detect microstructural changes in early atherosclerotic alterations. Furthermore, Di Ieva et al. (2014) characterized nidus angioarchitecture of brain arteriovenous malformations with fractal analysis of T2WI. In the present study, we found T2WI to yield remarkably robust results, both in comparison to T1WI and in terms of stability over repeated measurements. Furthermore, T1WI and T2WI were affected by binarization, skeletonization, and spatial resolution in a similar manner, which may encourage further research given the importance of T2WI in clinical neuroradiological practice.

Finally, perhaps one of the most interesting findings of this study concerns the relationship between structural complexity and structural similarity. These analyses were motivated both by registration‐induced changes and by the general question of whether differences in FD essentially just reflect differences in structure per se. To our knowledge, the present study is the first to investigate the relationship between the FD and the SSIM in MRI.

Structural similarity as captured by the SSIM was generally very high across the two data sets. In relating structural similarity to the corresponding difference in complexity (ΔFD), we applied a kmeans clustering analysis, which provided a useful way to objectively assess data clusters, especially since k was chosen automatically and the same optimization settings were used for both image registrations and across both data sets. Due to the method's unsupervised character, it can be difficult to interpret qualitative differences in the cluster features. However, based on the procedure in Section 2.3.3, each ΔFD/SSIM pair represented a particular comparison of two MRI volumes, enabling us to check for systematic effects of between‐volumes comparisons as cluster features, and comparing the cluster centroids was useful in describing whether clustering was mostly driven by differences in ΔFD, SSIM, or both. We furthermore conducted the analyses in two distinct ways: we first examined fractality‐similarity relationships over repeated acquisitions within subjects and then extended the analyses to comparisons across subjects.
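How the ΔFD/SSIM pairs arise from a set of repeated acquisitions can be sketched as follows. The pairing logic follows the description above (10 volumes give 45 unordered comparisons). The `global_ssim` function is a deliberate simplification: it evaluates the SSIM formula once from whole‐image statistics with the commonly used default constants, whereas standard implementations average the index over local windows. Function names are hypothetical, and the study's own SSIM settings are not assumed here.

```python
from itertools import combinations


def pairwise_delta_fd(fd_values):
    """Map each unordered volume pair (i, j), i < j, to |FD_i - FD_j|."""
    return {
        (i, j): abs(fd_values[i] - fd_values[j])
        for i, j in combinations(range(len(fd_values)), 2)
    }


def involves_target(pair, target=0):
    """True if a comparison includes the registration target volume."""
    return target in pair


def global_ssim(x, y, dynamic_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM of two equally long intensity sequences
    (simplified sketch; not a windowed implementation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )


# Made-up FD estimates for 10 repeated acquisitions (not study data).
fds = [2.50, 2.47, 2.47, 2.48, 2.46, 2.47, 2.48, 2.47, 2.46, 2.48]
delta_fd = pairwise_delta_fd(fds)
target_pairs = [pair for pair in delta_fd if involves_target(pair)]
```

Flagging the comparisons that involve volume 0 is what allows cluster membership to be checked against the original registration target.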

In line with the results of Section 3.2, we found considerable within‐subject clustering in various analysis groups for the MASSIVE data set in first volume registration, with a systematic effect of comparisons involving the registration target that yielded pronounced between‐cluster separation in both ΔFD and SSIM and induced a number of strong fractality‐similarity correlations. However, we interpret these to be spurious correlations induced by unbalanced registration because (a) they were mostly limited to analysis groups with strong target‐induced clustering, (b) the direction of the association was not consistent across analysis groups, (c) structural similarity across all analysis groups was significantly different in the reregistered data set (as expected) while there was little difference in ΔFD, and (d) systematic ΔFD/SSIM clustering and fractality‐similarity associations essentially disappeared with reregistration. While we observed a similar tendency in the MSC data set toward target‐induced clustering entailing across‐cluster associations in first volume registration and the attenuation thereof in MNI registration, within‐subject comparisons were numerically limited by the lower number of per‐subject scans as compared to the MASSIVE data. However, the MSC data allowed for extensive across‐subject comparisons, which showed no systematic ΔFD/SSIM clustering and no association between FD differences and structural similarity. Furthermore, similar to the MASSIVE data, structural similarity across all image groups was significantly different between first volume and MNI registration, while there was essentially no difference in ΔFD. 
In this context, a closer examination of the numerical SSIM values in the MASSIVE and the MSC data reveals a subtle but interesting corollary of our analyses: while reregistration in the MASSIVE data set invariably caused a marked increase in the SSIM values to above 0.9 in all analysis groups, reregistration in the MSC data caused a decrease in SSIM in all but two analysis groups to values around 0.7–0.8 (see Tables A1 and A2 in Appendix). Bearing in mind that the MASSIVE data were reregistered to the mean FLAIR image (derived from the same subject) while the MSC data were reregistered to the MNI template (i.e., not derived from the same subjects), these findings suggest that subject‐specific common image registration increased between‐scan structural similarity while subject‐independent common registration decreased between‐scan similarity. Notably, however, differences in FD did not simply track differences in structural similarity in either case, i.e., regardless of whether scans were more or less similar to each other, and this applied to both within‐ and across‐subject analyses. In summary, the present results suggest that there is no general relationship between structural complexity as measured by the FD and structural similarity as captured by the SSIM and that, rather, they may represent two distinct aspects of structural brain MRI.

4.1. Future directions

In the current study, we obtain several FD values for every input volume due to the stratification of processing parameters (tissue type, segmentation procedure, skeletonization). Thus, instead of just mapping one FD to one image, we compute a fractal “profile” of eight FD estimates per input image. Since the different analysis groups seem to entail differential susceptibility to deviations, such a fractal profile could perhaps be useful to optimize diagnostic sensitivity‐specificity trade‐offs. Furthermore, we here employed monofractal analysis, and it may be useful to expand this to multifractal approaches. Indeed, a recent study by Xue and Bogdan (2017) presents reliable multifractal estimation algorithms for quantifying structural complexity and their application for community detection in structural brain networks. These authors also consider scale‐related biases of the estimation procedure, albeit in weighted complex networks. Similarly, a formal framework for pattern characterization by multifractal analysis has recently been put forward by Balaban, Lim, Gupta, Boedicker, and Bogdan (2018). Moreover, we here compute FD estimates on global tissue segmentations. Given the increasingly sophisticated brain parcellation methods, however, region‐ and substructure‐specific fractal analysis is also being developed and is likely to yield interesting additional information, especially in the clinical context (see Eickhoff, Yeo, & Genon, 2018; Glasser et al., 2016; Goñi et al., 2013; Madan, 2018b; Madan & Kensinger, 2017; Ruiz de Miras et al., 2017). As such, future work is warranted to expand upon the utility of fractal analysis for empirical neuroimaging, specifically with respect to clinical applications. As detailed above, fractal analysis has shown the potential to detect brain tissue alterations in a wide range of vascular, inflammatory, neoplastic, and neurodegenerative pathologies—even in the absence of radiologically visible lesions. 
These findings raise hopes of defining novel biomarkers for improved diagnostics and enhancing our understanding of disease‐induced brain changes, for instance by identifying previously unrecognized tissue alterations. To reach this goal, however, the diagnostic and prognostic capacity of fractal analysis needs further investigation, both on a population and on the individual subject level. With the present work, we hope to contribute some methodological groundwork to facilitate progress in this direction.

Supporting information

Figure S1: Scale optimization results against a set of fixed k‐ranges. The figure displays the scale optimization results of a random Cantor set simulation, similar to fig. 2 from the main text. Panels A‐E represent accuracy comparisons for the five highlighted k‐ranges in the upper left panel, where interval lengths are color‐coded. Each panel shows the coincidence rate (i.e. fixed k‐range coincided with the optimal k‐range for the particular estimation iteration) and the cumulative estimation error (i.e. the absolute deviation from the expected fractal dimension value over repeated iterations). Scale optimization reduced estimation inaccuracy for all comparisons but the magnitude of this improvement varied with the particular fixed k‐range and retainment probabilities, with a tendency for more pronounced improvement in lower retainment probabilities and lower coincidence rates.

Appendix

ACKNOWLEDGMENTS

S.K. and C.F. are supported by the German Federal Ministry for Education and Research (BMBF grant 13GW0206D). The research of A.L. is supported by VIDI Grant 639.072.411 from the Netherlands Organization for Scientific Research (NWO). P.V. is employed by Genentech Inc. for work unrelated to this project. P.V. holds stock in and serves on the advisory board of Health Engineering SL (which has licensed a platform for measuring the FD from brain images from the University of Jaén and IDIBAPS), QMenta SL and Bionure SL. The work of F.J.E. is supported by Junta de Andalucía (BIO‐302) and MEIC (Systems Medicine Excellence Network SAF2015‐70270‐REDT). Finally, the authors are grateful to Jakob Ludewig and Leonhard Waschke for inspiring discussions regarding the current work.

Krohn S, Froeling M, Leemans A, et al. Evaluation of the 3D fractal dimension as a marker of structural brain complexity in multiple‐acquisition MRI. Hum Brain Mapp. 2019;40:3299–3320. 10.1002/hbm.24599

Carsten Finke and Francisco J. Esteban contributed equally to the manuscript.

Funding information Bundesministerium für Bildung und Forschung, Grant/Award Number: 13GW0206D; Consejería de Economía, Innovación, Ciencia y Empleo, Junta de Andalucía, Grant/Award Number: BIO‐302; MEIC Systems Medicine Excellence Network, Grant/Award Number: SAF2015‐70270‐REDT; Nederlandse Organisatie voor Wetenschappelijk Onderzoek, Grant/Award Number: VIDI Grant 639.072.411

Contributor Information

Stephan Krohn, Email: stephan.krohn@charite.de.

Francisco J. Esteban, Email: festeban@ujaen.es.

REFERENCES

  1. Balaban, V., Lim, S., Gupta, G., Boedicker, J., & Bogdan, P. (2018). Quantifying emergence and self‐organisation of Enterobacter cloacae microbial communities. Scientific Reports, 8(1), 12416.
  2. Brunet, D., Vrscay, E. R., & Wang, Z. (2012). On the mathematical properties of the structural similarity index. IEEE Transactions on Image Processing, 21(4), 1488–1499.
  3. Bullmore, E., Long, C., Suckling, J., Fadili, J., Calvert, G., Zelaya, F., … Brammer, M. (2001). Colored noise and computational inference in neurophysiological (fMRI) time series analysis: Resampling methods in time and wavelet domains. Human Brain Mapping, 12(2), 61–78.
  4. Di Ieva, A., Esteban, F. J., Grizzi, F., Klonowski, W., & Martín‐Landrove, M. (2015). Fractals in the neurosciences, part II: Clinical applications and future perspectives. The Neuroscientist, 21(1), 30–43.
  5. Di Ieva, A., Grizzi, F., Jelinek, H., Pellionisz, A. J., & Losa, G. A. (2014). Fractals in the neurosciences, part I: General principles and basic neurosciences. The Neuroscientist, 20(4), 403–417.
  6. Di Ieva, A., Niamah, M., Menezes, R. J., Tsao, M., Krings, T., Cho, Y.‐B., … Cusimano, M. D. (2014). Computational fractal‐based analysis of brain arteriovenous malformation angioarchitecture. Neurosurgery, 75(1), 72–79.
  7. Di Ieva, A. (2016). The fractal geometry of the brain. Springer, New York.
  8. Duché, Q., Saint‐Jalmes, H., Acosta, O., Raniga, P., Bourgeat, P., Doré, V., … Salvado, O. (2017). Partial volume model for brain MRI scan using MP2RAGE. Human Brain Mapping, 38(10), 5115–5127.
  9. Eickhoff, S. B., Yeo, B. T., & Genon, S. (2018). Imaging‐based parcellations of the human brain. Nature Reviews Neuroscience, 19, 672–686.
  10. Eke, A., Herman, P., Sanganahalli, B. G., Hyder, F., Mukli, P., & Nagy, Z. (2012). Pitfalls in fractal time series analysis: fMRI BOLD as an exemplary case. Frontiers in Physiology, 3, 417.
  11. Esteban, F. J., Padilla, N., Sanz‐Cortés, M., de Miras, J. R., Bargalló, N., Villoslada, P., & Gratacós, E. (2010). Fractal‐dimension analysis detects cerebral changes in preterm infants with and without intrauterine growth restriction. NeuroImage, 53(4), 1225–1232.
  12. Esteban, F. J., Sepulcre, J., de Mendizábal, N. V., Goñi, J., Navas, J., de Miras, J. R., … Villoslada, P. (2007). Fractal dimension and white matter changes in multiple sclerosis. NeuroImage, 36(3), 543–549.
  13. Esteban, F. J., Sepulcre, J., de Miras, J. R., Navas, J., de Mendizábal, N. V., Goñi, J., … Villoslada, P. (2009). Fractal dimension analysis of grey matter in multiple sclerosis. Journal of the Neurological Sciences, 282(1), 67–71.
  14. Falconer, K. J., & Grimmett, G. (1992). On the geometry of random Cantor sets and fractal percolation. Journal of Theoretical Probability, 5(3), 465–485.
  15. Foss, J. M., Apkarian, A. V., & Chialvo, D. R. (2006). Dynamics of pain: Fractal dimension of temporal variability of spontaneous pain differentiates between pain states. Journal of Neurophysiology, 95(2), 730–736.
  16. Fritz, C. O., Morris, P. E., & Richler, J. J. (2012). Effect size estimates: Current use, calculations, and interpretation. Journal of Experimental Psychology: General, 141(1), 2–18.
  17. Froeling, M., Tax, C. M., Vos, S. B., Luijten, P. R., & Leemans, A. (2017). “MASSIVE” brain dataset: Multiple acquisitions for standardization of structural imaging validation and evaluation. Magnetic Resonance in Medicine, 77(5), 1797–1809.
  18. Glasser, M. F., Coalson, T. S., Robinson, E. C., Hacker, C. D., Harwell, J., Yacoub, E., … van Essen, D. C. (2016). A multi‐modal parcellation of human cerebral cortex. Nature, 536(7615), 171–178.
  19. Gneiting, T., Ševčíková, H., & Percival, D. B. (2012). Estimators of fractal dimension: Assessing the roughness of time series and spatial data. Statistical Science, 27, 247–277.
  20. Goñi, J., Sporns, O., Cheng, H., Aznárez‐Sanado, M., Wang, Y., Josa, S., … Pastor, M. A. (2013). Robust estimation of fractal measures for characterizing the structural complexity of the human brain: Optimization and reproducibility. NeuroImage, 83, 646–657.
  21. Gordon, E. M., Laumann, T. O., Gilmore, A. W., Newbold, D. J., Greene, D. J., Berg, J. J., … Dosenbach, N. U. F. (2017). Precision functional mapping of individual human brains. Neuron, 95(4), 791–807.
  22. Grubbs, F. E. (1969). Procedures for detecting outlying observations in samples. Technometrics, 11(1), 1–21.
  23. Hayter, A. J. (1984). A proof of the conjecture that the Tukey‐Kramer multiple comparisons procedure is conservative. The Annals of Statistics, 12, 61–75.
  24. Iftekharuddin, K. M., Zheng, J., Islam, M. A., & Ogg, R. J. (2009). Fractal‐based brain tumor detection in multimodal MRI. Applied Mathematics and Computation, 207(1), 23–41.
  25. Im, K., Lee, J.‐M., Yoon, U., Shin, Y.‐W., Hong, S. B., Kim, I. Y., … Kim, S. I. (2006). Fractal dimension in human cortical surface: Multiple regression analysis with cortical thickness, sulcal depth, and folding area. Human Brain Mapping, 27(12), 994–1003.
  26. Jenkinson, M., Beckmann, C. F., Behrens, T. E., Woolrich, M. W., & Smith, S. M. (2012). FSL. NeuroImage, 62(2), 782–790.
  27. Jiménez, J. , López, A. , Cruz, J. , Esteban, F. J. , Navas, J. , Villoslada, P. , & de Miras, J. R. (2014). A web platform for the interactive visualization and analysis of the 3D fractal dimension of MRI data. Journal of Biomedical Informatics, 51, 176–190. [DOI] [PubMed] [Google Scholar]
  28. Kerschnitzki, M. , Kollmannsberger, P. , Burghammer, M. , Duda, G. N. , Weinkamer, R. , Wagermaier, W. , & Fratzl, P. (2013). Architecture of the osteocyte network correlates with bone material quality. Journal of Bone and Mineral Research, 28(8), 1837–1845. [DOI] [PubMed] [Google Scholar]
  29. King, R. D. , Brown, B. , Hwang, M. , Jeon, T. , George, A. T. , Initiative, A. D. N. , et al. (2010). Fractal dimension analysis of the cortical ribbon in mild Alzheimer's disease. NeuroImage, 53(2), 471–479. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. King, R. D. , George, A. T. , Jeon, T. , Hynan, L. S. , Youn, T. S. , Kennedy, D. N. , et al. (2009). Characterization of atrophic changes in the cerebral cortex using fractal dimensional analysis. Brain Imaging and Behavior, 3(2), 154–166. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Kiselev, V. G. , Hahn, K. R. , & Auer, D. P. (2003). Is the brain cortex a fractal? NeuroImage, 20(3), 1765–1774. [DOI] [PubMed] [Google Scholar]
  32. Klein, S. , Staring, M. , Murphy, K. , Viergever, M. A. , & Pluim, J. P. (2010). Elastix: A toolbox for intensity‐based medical image registration. IEEE Transactions on Medical Imaging, 29(1), 196–205. [DOI] [PubMed] [Google Scholar]
  33. Lai, M.‐C. , Lombardo, M. V. , Chakrabarti, B. , Sadek, S. A. , Pasco, G. , Wheelwright, S. J. , et al. (2010). A shift to randomness of brain oscillations in people with autism. Biological Psychiatry, 68(12), 1092–1099. [DOI] [PubMed] [Google Scholar]
  34. Lee, T.‐C. , Kashyap, R. L. , & Chu, C.‐N. (1994). Building skeleton models via 3‐D medial surface axis thinning algorithms. CVGIP: Graphical Models and Image Processing, 56(6), 462–478. [Google Scholar]
  35. Lopes, R. , & Betrouni, N. (2009). Fractal and multifractal analysis: A review. Medical Image Analysis, 13(4), 634–649. [DOI] [PubMed] [Google Scholar]
  36. Madan, C. R. (2018a). Age differences in head motion and estimates of cortical morphology. PeerJ, 6, e5176. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Madan, C. R. (2018b). Shape‐related characteristics of age‐related differences in subcortical structures. Aging & Mental Health, 11, 1–11. [DOI] [PubMed] [Google Scholar]
  38. Madan, C. R. , & Kensinger, E. A. (2016). Cortical complexity as a measure of age‐related brain atrophy. NeuroImage, 134, 617–629. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Madan, C. R. , & Kensinger, E. A. (2017). Test–retest reliability of brain morphology estimates. Brain Informatics, 4(2), 107–121. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Mandelbrot, B. B. (1967). How long is the coast of britain? Statistical self‐similarity and fractional dimension. Science, 156(3775), 636–638. [DOI] [PubMed] [Google Scholar]
  41. Mandelbrot, B. B. (1983). The fractal geometry of nature (Vol. 173). San Francisco: Macmillan; [Google Scholar]
  42. Moisy, F. (2008). Computing a fractal dimension with Matlab: 1D, 2D and 3D Box‐counting Paris: Laboratory FAST, University Paris Sud. http://www.fast.u-psud.fr/moisy/ml/boxcount/html/demo.html.
  43. Østergaard, J. , Derpich, M. S. , & Channappayya, S. S. (2011). The high‐resolution rate‐distortion function under the structural similarity index. EURASIP Journal on Advances in Signal Processing, 2011(1), 857959. [Google Scholar]
  44. Rajagopalan, V. , Das, A. , Zhang, L. , Hillary, F. , Wylie, G. R. , & Yue, G. H. (2018). Fractal dimension brain morphometry: A novel approach to quantify white matter in traumatic brain injury. Brain Imaging and Behavior. 10.1007/s11682-018-9892-2. [Epub ahead of print] [DOI] [PubMed] [Google Scholar]
  45. Reishofer, G. , Studencnik, F. , Koschutnig, K. , Deutschmann, H. , Ahammer, H. , & Wood, G. (2018). Age is reflected in the fractal dimensionality of MRI diffusion based Tractography. Scientific Reports, 8(1), 5431. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Ruiz de Miras, J. , Costumero, V. , Belloch, V. , Escudero, J. , Ávila, C. , & Sepulcre, J. (2017). Complexity analysis of cortical surface detects changes in future Alzheimer's disease converters. Human Brain Mapping, 38(12), 5905–5918. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Shamonin, D. P. , Bron, E. E. , Lelieveldt, B. P. , Smits, M. , Klein, S. , & Staring, M. (2014). Fast parallel image registration on CPU and GPU for diagnostic classification of Alzheimer's disease. Frontiers in Neuroinformatics, 7, 50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Shapiro, S. S. , & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52(3/4), 591–611. [Google Scholar]
  49. Sheelakumari, R. , Rajagopalan, V. , Chandran, A. , Varghese, T. , Zhang, L. , Yue, G. H. , … Kesavadas, C. (2017). Quantitative analysis of grey matter degeneration in FTD patients using fractal dimension analysis. Brain Imaging and Behavior, 12(5), 1221–1228. 10.1007/s11682-017-9784-x. [DOI] [PubMed] [Google Scholar]
  50. Smith, S. M. (2002). Fast robust automated brain extraction. Human Brain Mapping, 17(3), 143–155. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Smith, S. M. , Jenkinson, M. , Woolrich, M. W. , Beckmann, C. F. , Behrens, T. E. , Johansen‐Berg, H. , et al. (2004). Advances in functional and structural MR image analysis and implementation as FSL. NeuroImage, 23, S208–S219. [DOI] [PubMed] [Google Scholar]
  52. Takahashi, T. , Murata, T. , Narita, K. , Hamada, T. , Kosaka, H. , Omori, M. , … Wada, Y. (2006). Multifractal analysis of deep white matter microstructural changes on MRI in relation to early‐stage atherosclerosis. NeuroImage, 32(3), 1158–1166. [DOI] [PubMed] [Google Scholar]
  53. Thurner, S. , Windischberger, C. , Moser, E. , Walla, P. , & Barth, M. (2003). Scaling laws and persistence in human brain activity. Physica A: Statistical Mechanics and its Applications, 326(3–4), 511–521. [Google Scholar]
  54. Wang, Z. , Bovik, A. C. , Sheikh, H. R. , & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612. [DOI] [PubMed] [Google Scholar]
  55. Wonderlick, J. , Ziegler, D. A. , Hosseini‐Varnamkhasti, P. , Locascio, J. , Bakkour, A. , Van Der Kouwe, A. , … Dickerson, B. C. (2009). Reliability of MRI‐derived cortical and subcortical morphometric measures: Effects of pulse sequence, voxel geometry, and parallel imaging. NeuroImage, 44(4), 1324–1333. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Woolrich, M. W. , Jbabdi, S. , Patenaude, B. , Chappell, M. , Makni, S. , Behrens, T. , … Smith, S. M. (2009). Bayesian analysis of neuroimaging data in FSL. NeuroImage, 45(1), S173–S186. [DOI] [PubMed] [Google Scholar]
  57. Wu, Y.‐T. , Shyu, K.‐K. , Jao, C.‐W. , Wang, Z.‐Y. , Soong, B.‐W. , Wu, H.‐M. , & Wang, P.‐S. (2010). Fractal dimension analysis for quantifying cerebellar morphological change of multiple system atrophy of the cerebellar type (MSA‐C). NeuroImage, 49(1), 539–551. [DOI] [PubMed] [Google Scholar]
  58. Xue, Y. , & Bogdan, P. (2017). Reliable multi‐fractal characterization of weighted complex networks: Algorithms and implications. Scientific Reports, 7(1), 7487. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Yotter, R. A. , Nenadic, I. , Ziegler, G. , Thompson, P. M. , & Gaser, C. (2011). Local cortical surface complexity maps from spherical harmonic reconstructions. NeuroImage, 56(3), 961–973. [DOI] [PubMed] [Google Scholar]
  60. Zhang, Y. , Brady, M. , & Smith, S. (2001). Segmentation of brain MR images through a hidden Markov random field model and the expectation‐maximization algorithm. IEEE Transactions on Medical Imaging, 20(1), 45–57. [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Figure S1: Scale optimization results against a set of fixed k‐ranges. The figure displays the scale optimization results of a random Cantor set simulation, similar to Figure 2 in the main text. Panels A–E represent accuracy comparisons for the five k‐ranges highlighted in the upper left panel, where interval lengths are color‐coded. Each panel shows the coincidence rate (i.e., how often the fixed k‐range coincided with the optimal k‐range for a given estimation iteration) and the cumulative estimation error (i.e., the absolute deviation from the expected fractal dimension value, summed over repeated iterations). Scale optimization reduced estimation inaccuracy in all comparisons, but the magnitude of this improvement varied with the particular fixed k‐range and the retainment probability, with a tendency toward more pronounced improvement at lower retainment probabilities and lower coincidence rates.
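
The two quantities reported in Figure S1 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the random Cantor set is built by repeated 2×2×2 subdivision, with each sub‐cube retained with probability p, so that the expected fractal dimension is 3 + log2(p); box sizes are powers of two, 2^k; and the "optimal" k‐range is taken to be the one with the highest R² of the log–log fit, which is a stand‐in criterion and may differ from the one used in the main text. All function names are hypothetical.

```python
import numpy as np

def random_cantor_3d(levels, p, rng):
    """Binary volume from recursive 2x2x2 subdivision; each child kept with probability p."""
    vol = np.ones((1, 1, 1), dtype=bool)
    for _ in range(levels):
        vol = vol.repeat(2, axis=0).repeat(2, axis=1).repeat(2, axis=2)
        vol &= rng.random(vol.shape) < p
    return vol

def box_counts(vol):
    """Occupied-box counts for box sizes 2**k, k = 0..log2(side length)."""
    n = vol.shape[0]
    kmax = int(np.log2(n))
    counts = []
    for k in range(kmax + 1):
        s, m = 2 ** k, n // 2 ** k
        occupied = vol.reshape(m, s, m, s, m, s).any(axis=(1, 3, 5))
        counts.append(int(occupied.sum()))
    if counts[0] == 0:
        raise ValueError("empty set: no fractal dimension to estimate")
    return np.arange(kmax + 1), np.array(counts)

def fd_on_range(ks, counts, lo, hi):
    """FD estimate (negative log-log slope) and R^2 for k in [lo, hi]."""
    x = ks[lo:hi + 1].astype(float)          # log2 of box size
    y = np.log2(counts[lo:hi + 1])           # log2 of box count
    slope, intercept = np.polyfit(x, y, 1)
    ss_res = ((y - (slope * x + intercept)) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return -slope, r2

def optimal_range(ks, counts, min_len=3):
    """Assumed criterion: the k-range (at least min_len scales) with the highest R^2."""
    best = None
    for lo in range(len(ks)):
        for hi in range(lo + min_len - 1, len(ks)):
            fd, r2 = fd_on_range(ks, counts, lo, hi)
            if best is None or r2 > best[0]:
                best = (r2, lo, hi, fd)
    return best[1], best[2], best[3]

def compare(p, fixed=(0, 3), levels=5, iterations=20, seed=0):
    """Coincidence rate and cumulative absolute errors (fixed vs. optimized k-range)."""
    rng = np.random.default_rng(seed)
    expected = 3 + np.log2(p)
    hits, err_fixed, err_opt = 0, 0.0, 0.0
    for _ in range(iterations):
        ks, counts = box_counts(random_cantor_3d(levels, p, rng))
        lo, hi, fd_opt = optimal_range(ks, counts)
        fd_fix, _ = fd_on_range(ks, counts, *fixed)
        hits += (lo, hi) == tuple(fixed)
        err_fixed += abs(fd_fix - expected)
        err_opt += abs(fd_opt - expected)
    return hits / iterations, err_fixed, err_opt

if __name__ == "__main__":
    for p in (0.7, 0.8, 0.9):
        rate, e_fix, e_opt = compare(p)
        print(f"p={p}: coincidence={rate:.2f}, fixed={e_fix:.3f}, optimized={e_opt:.3f}")
```

Because the selection criterion here is only a proxy, this sketch is not guaranteed to reproduce the error reduction shown in the figure; it is meant to make the definitions of coincidence rate and cumulative estimation error concrete.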


Articles from Human Brain Mapping are provided here courtesy of Wiley