Author manuscript; available in PMC: 2020 Sep 1.
Published in final edited form as: Neuroimage. 2020 Aug 18;222:117273. doi: 10.1016/j.neuroimage.2020.117273

Non-Negative Data-Driven Mapping of Structural Connections with Application to the Neonatal Brain

E Thompson 1, AR Mohammadi-Nejad 1,2, EC Robinson 3, JLR Andersson 4, S Jbabdi 4, MF Glasser 5,6, M Bastiani 1,4, SN Sotiropoulos 1,2,4
PMCID: PMC7116021  EMSID: EMS93775  PMID: 32818619

Abstract

Mapping connections in the neonatal brain can provide insight into the crucial early stages of neurodevelopment that shape brain organisation and lay the foundations for cognition and behaviour. Diffusion MRI and tractography provide unique opportunities for such explorations, through estimation of white matter bundles and brain connectivity. Atlas-based tractography protocols, i.e. a priori defined sets of masks and logical operations in a template space, have been commonly used in the adult brain to drive such explorations. However, rapid growth and maturation of the brain during early development make it challenging to ensure correspondence and validity of such atlas-based tractography approaches in the developing brain. An alternative can be provided by data-driven methods, which do not depend on predefined regions of interest. Here, we develop a novel data-driven framework to extract white matter bundles and their associated grey matter networks from neonatal tractography data, based on non-negative matrix factorisation that is inherently suited to the non-negative nature of structural connectivity data. We also develop a non-negative dual regression framework to map group-level components to individual subjects. Using in-silico simulations, we evaluate the accuracy of our approach in extracting connectivity components and compare with an alternative data-driven method, independent component analysis. We apply non-negative matrix factorisation to whole-brain connectivity obtained from publicly available datasets from the Developing Human Connectome Project, yielding grey matter components and their corresponding white matter bundles. We assess the validity and interpretability of these components against traditional tractography results and grey matter networks obtained from resting-state fMRI in the same subjects. We subsequently use them to generate a parcellation of the neonatal cortex using data from 323 new-born babies and we assess the robustness and reproducibility of this connectivity-driven parcellation.

Introduction

The neonatal period is a critical time for brain development, during which the refinement and maturation of white matter connections lay the groundwork for later cognitive development (Ball et al., 2015; Counsell et al., 2008; Girault et al., 2019). With diffusion MRI (dMRI) we can track these connections non-invasively and in vivo, which enables us to study the early development of structural connectivity and microstructure, even during the first weeks of life (see Ouyang et al., 2019 for a recent review).

DMRI studies of neonates have shown that the trajectory of fibre maturation reflects the neurodevelopmental hierarchy, with primary motor and sensory tracts developing earlier than the association tracts that enable higher order functioning (Dubois et al., 2008; Kulikova et al., 2015; Partridge et al., 2004). Studies have also demonstrated the impact of preterm birth (Ball et al., 2015; Batalle et al., 2017; Brown et al., 2014; Girault et al., 2019) and maternal environment (Deoni et al., 2013; Tam et al., 2016) on the early development of white matter.

Despite the large potential of diffusion imaging for exploring early developmental stages of the brain, current analysis techniques follow the paradigms that have been established for the adult brain. For instance, dMRI tractography protocols for identifying specific white matter bundles typically rely on delineation of regions of interest (ROIs) that provide a priori anatomical knowledge on the route of the tract; and these ROIs can be defined relative to a template for automated delineation (Bastiani et al., 2019; De Groot et al., 2013; Warrington et al., 2020).

However, neonatal brains are not simply small adult brains (Batalle et al., 2018), and this renders the above paradigm problematic. The rapid growth and changes in brain morphology during the neonatal period, as well as fast alterations in tissue composition that alter imaging contrast over time (Bastiani et al., 2019), render it challenging to ensure correspondence between template-driven ROIs and tractography protocols at different stages of development (Serag et al., 2012). Manual delineation on a subject-by-subject basis could be an alternative, but it is time-consuming, assumes very detailed knowledge of how neonatal neuroanatomy is depicted in MRI at various early development stages, and becomes prohibitive for large cohorts, such as the developing human connectome projects (Howell et al., 2019; Hughes et al., 2017).

In this paper, we propose an alternative approach for simultaneously mapping white matter bundles and the corresponding grey matter nodes in the neonatal brain using data-driven methods, which are model-free and are expected to be more immune to the challenges described above. Independent component analysis (ICA) has been a commonly used data-driven method for identifying brain networks from resting-state functional MRI (fMRI) data (McKeown et al., 1998), and recent work has shown that it can also be applied to dMRI tractography data of the adult human brain (O’Muircheartaigh and Jbabdi, 2017; Wu et al., 2015) or of the non-human primate brain (Mars et al., 2019). We develop an alternative approach to ICA and explore its applicability in the neonatal brain.

One limitation of applying ICA to tractography data is that the estimated independent components and the respective mixing matrix can contain both positive and negative values, whereas structural connectivity data are inherently non-negative. This leads to challenges in the interpretation of negative weights. To address this problem, we present an alternative data-driven method that can be used to identify non-negative connectivity components. Our approach is based on non-negative matrix factorisation (NMF) (Lee and Seung, 2001). Like ICA, NMF is an unsupervised technique that estimates a pre-defined number of components from the data. However, the elements and their weights are constrained to take non-negative values. Sparsity constraints in the decomposition allow identifiability and further provide an indirect means of requiring independence between the estimated components. This results in a set of components whose weighted summation represents the whole system. Due to these advantageous properties, NMF has been recently used to identify networks of structural covariance (Ball et al., 2019; Sotiras et al., 2017, 2015) from MRI data.

In this study, we present for the first time an NMF-based framework for extracting connectivity components from diffusion MRI data, both at the group and the individual level. We apply this approach within the context of mapping patterns of structural connections in new-born babies, aged 37 to 44 weeks post-menstrual age (PMA) at scan, using publicly-released data provided by the developing Human Connectome Project (dHCP) (Hughes et al., 2017; Hutter et al., 2018). First, we describe the theory for decomposing whole-brain tractography-induced connectivity matrices into grey matter networks and their corresponding white matter bundles. We subsequently use simulations to quantitatively evaluate the behaviour of the method and assess its performance against ICA. We explore the validity and interpretability of i) the automatically detected white matter patterns against results from standard tractography protocols available through the dHCP (Bastiani et al., 2019) and ii) the grey matter patterns against components obtained from data-driven mapping of resting-state fMRI in the same subjects. Finally, we use the extracted structural connectivity components from a group of 323 new-born babies to derive connectivity-driven cortical parcellations of the neonatal brain and assess their robustness and reproducibility.

Theory

Let X be an M x N dense¹ “connectivity” matrix, with Xij (i = 1:M, j = 1:N) carrying information on the likelihood of structural connections existing between locations i and j in the brain. Without loss of generality, let us assume that locations i represent the whole brain and comprise all imaging voxels, and that locations j represent grey matter and reside on the cortical white/grey matter boundary (WGB) and in subcortical grey matter. Diffusion MRI tractography can provide such a matrix if we seed streamlines from N seeds on the WGB and subcortical nuclei, and record visitation counts to M voxels across the brain, such that each column of X describes the connectivity profile of a grey matter location j. A data-driven decomposition of X can identify K components based on similarity of connectivity profiles. Different numbers of components can be obtained depending on the desired properties of the estimated components.

Independent component analysis (ICA) imposes statistical independence between the components to perform a linear decomposition. An observed matrix X is represented as X = AS, where S is the independent sources matrix (each row k corresponds to a source/component) and A is the weights or mixing matrix (each column k corresponds to the weights of source k). As this is an ill-posed problem in general, ICA uses source independence to estimate an un-mixing matrix W that best approximates A⁻¹, to recover the original sources from the observed data: WX ≈ S. This process is entirely data-driven by the statistical properties of the mixture, with no prior knowledge of the mixing matrix or the signals. The first step of all ICA algorithms is to centre and whiten the data, for normalisation. This can be achieved with a principal component analysis (PCA) (Wold et al., 1987) or singular value decomposition (SVD) (De Lathauwer et al., 2000). Then we seek an orthogonal rotation V to apply to the whitened data to optimise the statistical independence of the estimated components. This cannot be done analytically but there are a number of different methods of solving the problem iteratively. The FastICA algorithm (Hyvärinen and Oja, 2000), which uses non-Gaussianity as a proxy for independence, is one of the typically used algorithms.

ICA has been used to identify networks from resting-state functional MRI data (McKeown et al., 1998), where M = T, the number of timepoints, and the decomposition results in K spatial maps (covering all N brain voxels), each with a weight vector of length T. Each weight wik represents how much component k contributes to activity recorded at time point i. ICA has also been used recently in the case of dMRI tractography, where N is the number of seed locations (O’Muircheartaigh and Jbabdi, 2017). In that case, the decomposition provides K spatial maps (covering all N points on the grey matter), each representing a component with shared connectivity profile through white matter, associated with a weight vector of length M. Each weight wik represents in this case how much component k contributes to the connection patterns of voxel i.

Non-negative matrix factorisation is an alternative decomposition technique, where a matrix X is factorised into the product of two matrices A and H, under the constraint that all three contain only non-negative values (Lee and Seung, 1999). This is more naturally suited for use with structural connectivity data, which are inherently non-negative. In general, NMF is an ill-posed problem and there exist multiple solutions in most cases. The linear superposition of components, combined with the non-negativity constraint, leads to an implicit sparsity constraint in the algorithm (requiring a signal to be explained as a linear combination of non-negative regressors will lead to many weights close to zero). Additional explicit sparsity constraints can be applied to further constrain the solution space and improve the identifiability of the decomposition (Hoyer, 2004). Specifically, the cost function C to minimise is of the form:

$$C = \frac{1}{2}\|X - AH\|_F + \alpha_1\|A\|_{L1} + \alpha_2\|H\|_{L1}, \qquad (1)$$

where ‖X‖_F is the Frobenius norm, ‖x‖_L1 is the L1-norm, used to explicitly promote sparsity, and α1 and α2 are tuning parameters that allow us to control the degree of regularisation on the mixing matrix and component matrix, respectively. Higher values of α1 and α2 lead to more sparsity in the resultant decomposition. The NMF can be initialised with a non-negative SVD, which has been shown to improve the accuracy of the decompositions (Boutsidis and Gallopoulos, 2008). Most NMF algorithms use a two-block coordinate descent approach to optimise C over A and H alternately, updating one while keeping the other fixed. Each block is a convex problem that can be solved using non-negative least squares (Cichocki and Phan, 2009).
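As a rough illustration of how an objective of the form of equation (1) behaves, the sketch below evaluates such a cost and reduces it with alternating updates of H and A. Several things here are assumptions rather than the method described in the text: multiplicative updates are swapped in for the coordinate descent solver, the data term uses the squared Frobenius norm (a common choice in NMF implementations), random initialisation replaces the non-negative SVD, and the names `nmf_cost` and `sparse_nmf` and the toy dimensions are hypothetical.

```python
import numpy as np

def nmf_cost(X, A, H, alpha1, alpha2):
    """Cost of the same form as equation (1); squared Frobenius data term (assumption)."""
    resid = X - A @ H
    return 0.5 * np.sum(resid ** 2) + alpha1 * np.abs(A).sum() + alpha2 * np.abs(H).sum()

def sparse_nmf(X, K, alpha1=0.1, alpha2=0.1, n_iter=200, seed=0, eps=1e-12):
    """L1-regularised NMF via multiplicative updates (illustrative stand-in solver)."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    A = rng.random((M, K)) + 0.1   # random non-negative initialisation (assumption)
    H = rng.random((K, N)) + 0.1
    costs = [nmf_cost(X, A, H, alpha1, alpha2)]
    for _ in range(n_iter):
        # Two-block alternation: update H with A fixed, then A with H fixed.
        H *= (A.T @ X) / (A.T @ A @ H + alpha2 + eps)
        A *= (X @ H.T) / (A @ H @ H.T + alpha1 + eps)
        costs.append(nmf_cost(X, A, H, alpha1, alpha2))
    return A, H, np.array(costs)

# Toy non-negative data built from 5 underlying sources.
rng = np.random.default_rng(1)
X = rng.random((40, 5)) @ rng.random((5, 60))
A, H, costs = sparse_nmf(X, K=5)
print(costs[0], costs[-1])
```

Because the updates only ever multiply non-negative entries by non-negative ratios, A and H stay non-negative throughout; the L1 weights enter as constants in the denominators, which is what pushes small entries towards zero.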

From group to subject decompositions - Non-negative dual regression

When considering data matrices X from multiple subjects (e.g., by averaging across subjects in the simplest case), the components and mixing matrices will represent the group. Dual regression can then be used to generate subject-level representations of the group components and mixing matrices (Beckmann et al., 2009; Nickerson et al., 2017), both for ICA and NMF decompositions. Dual regression comprises two steps:

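  • i) Identify the subject-specific mixing matrix Ã from the group-level grey matter components S, using the subject-level connectivity matrix X̃ (where S⁺ denotes the pseudoinverse of S):
    $$\tilde{X} = \tilde{A} S \;\Rightarrow\; \tilde{A} = \tilde{X} S^{+}$$
  • ii) Find the subject-level grey matter components S̃, using the subject-specific mixing matrix Ã (where Ã⁺ denotes the pseudoinverse of Ã):
    $$\tilde{X} = \tilde{A} \tilde{S} \;\Rightarrow\; \tilde{S} = \tilde{A}^{+} \tilde{X}$$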

In previous work, this multivariate regression has been achieved by taking the pseudoinverse of the group-level components and the subject-level mixing matrix (O’Muircheartaigh and Jbabdi, 2017), as illustrated in Suppl. Figure 1. However, taking the pseudoinverse introduces negative values into the components and their weights, which leads to mixed-sign subject-level representations of the original non-negative group-level components. Instead, we have developed a “non-negative dual regression” technique for back-projecting NMF results, using non-negative least squares (NNLS) (Ling et al., 1977) for the regression steps. NNLS solves a problem of the form argmin_x ‖Bx − y‖₂ subject to x ≥ 0, in which x and y are vectors, and B is a matrix. Thus, the optimisation has to be performed separately for each target voxel in step (i) and each grey matter seed in step (ii) (see Suppl. Figure 2), but this process can be parallelised to reduce computation time. This provides an entirely non-negative framework for dual regression that retains the sparse characteristics of the group-level NMF components, as shown in Figure 2.
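The two NNLS steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' released implementation: it uses `scipy.optimize.nnls`, the function name `nonneg_dual_regression` and the toy matrices are hypothetical, and the loops are left serial for clarity.

```python
import numpy as np
from scipy.optimize import nnls

def nonneg_dual_regression(X_subj, S_group):
    """Two-step non-negative dual regression using NNLS.

    X_subj  : (M, N) non-negative subject-level connectivity matrix
    S_group : (K, N) non-negative group-level grey matter components
    """
    M, N = X_subj.shape
    K = S_group.shape[0]
    # Step (i): one NNLS problem per target voxel (row of X_subj).
    A_subj = np.zeros((M, K))
    for i in range(M):
        A_subj[i], _ = nnls(S_group.T, X_subj[i])
    # Step (ii): one NNLS problem per grey matter seed (column of X_subj).
    S_subj = np.zeros((K, N))
    for j in range(N):
        S_subj[:, j], _ = nnls(A_subj, X_subj[:, j])
    return A_subj, S_subj

# Toy example: a noiseless "subject" generated from known non-negative factors.
rng = np.random.default_rng(0)
A_true = rng.random((30, 4))
S_true = rng.random((4, 50))
X = A_true @ S_true
A_hat, S_hat = nonneg_dual_regression(X, S_true)
print(A_hat.min() >= 0, S_hat.min() >= 0)
```

Each row or column problem is independent of the others, which is why the two loops are natural candidates for parallel execution.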

Figure 2.

Example dual regression results for a component from a K = 50 NMF decomposition. On the left, the component has been dual regressed onto two subjects’ data with the standard approach using the pseudoinverse. On the right, the component has been dual regressed with our non-negative method that uses non-negative least squares. In all volumetric images, left is left.

Methods

We present the development of the above frameworks to map structural connectivity in the neonatal brain. We first give an overview of the data employed. We then describe a set of simulations that allow principled evaluation of the decomposition frameworks and we finally describe the methods we use to illustrate the benefits of our approach.

Data

We used structural, dMRI, and fMRI data made publicly available by the developing Human Connectome Project (dHCP) (www.developingconnectome.org). Briefly, data were acquired during natural sleep on a 3T Philips Achieva with a dedicated neonatal imaging system, including a neonatal 32 channel head coil (Hughes et al., 2017; Hutter et al., 2018). Diffusion MRI data were acquired over a spherically optimised set of directions on three shells (b = 400, 1000 and 2600 s/mm²). A total of 300 volumes were acquired per subject, including 20 with b = 0 s/mm². For each volume, 64 interleaved overlapping slices were acquired (in-plane resolution = 1.5 mm, thickness = 3 mm, overlap = 1.5 mm). The data were then super-resolved along the slice direction to achieve isotropic resolution of 1.5 mm³ (Kuklisova-Murgasova et al., 2012). Pre-processing was carried out according to the dHCP diffusion processing pipeline (Bastiani et al., 2019). This includes motion correction and distortion correction (Andersson et al., 2016; Andersson and Sotiropoulos, 2016). Cortical surface reconstruction was carried out from T2w images with an isotropic resolution of 0.5 mm³, using a pipeline specifically adapted for neonatal structural MRI data (Makropoulos et al., 2018). Resting-state functional MRI data were acquired for 15 minutes (TE/TR = 38/392 ms, 2300 volumes) with an acquired resolution of 2.15 mm isotropic. fMRI pre-processing was carried out as detailed in (Fitzgibbon et al., 2019), with an automated pipeline including fieldmap pre-processing to estimate susceptibility distortion; registration steps; susceptibility and motion correction; and denoising with ICA-FIX.

Data were considered from a group of 323 subjects born at term age (175 male, 148 female). Median (range) birth age was 40.1 (37.0, 42.3) postmenstrual weeks and age at scan 40.9 (37.4, 44.4) weeks. Pre-processed data are available through the latest dHCP data release².

Data processing and whole-brain tractography

Pre-processed data were further analysed to obtain structural connectivity matrices. To ensure alignment between subjects, we registered the anatomical surfaces to a representative template space before performing tractography. First, we used a surface registration pipeline (https://github.com/ecr05/dHCP_template_alignment), based on the multi-modal surface matching (MSM) algorithm (Robinson et al., 2018, 2014). Cortical folding was used to drive the alignment of neonatal WGB, cortical mid-thickness, and pial surfaces to the dHCP 40-week PMA surface templates (Bozek et al., 2018). This aligned vertices on the WGB surface to ensure consistent seed points for tractography across subjects. We then applied a previously computed non-linear volumetric registration (ANTs, Avants et al., 2011) to all MSM-derived surfaces to register them to 40-week PMA volumetric template space (Serag et al., 2012). This step was necessary to ensure that the tractography seeds were aligned to the target space, because the volumetric and surface-based neonatal templates are not aligned (Bozek et al., 2018; Serag et al., 2012).

Once the surfaces were aligned, we obtained connectivity matrices X for each subject, by performing whole-brain probabilistic tractography using FSL (Behrens et al., 2007; Hernandez-Fernandez et al., 2019). Fibre orientations (up to 3 per voxel) were estimated using a model-based deconvolution against a zeppelin response kernel, to accommodate the low anisotropy inherent in data from this age group (Bastiani et al., 2019; Hernández et al., 2013; Sotiropoulos et al., 2016). We subsequently seeded 10,000 streamlines from each of 58,551 vertices on the WGB of both hemispheres (average vertex spacing 1.2 mm, excluding the medial wall) and from each of 2548 subcortical 2 mm voxels (bilateral amygdala, caudate, hippocampus, putamen and thalamus), giving us a total of N = 61,099 seeds. This type of grey matter seeding has been shown to suffer less from the gyral bias in tractography, compared to whole-brain white matter seeding, even if gyral bias is less prominent in the neonatal brain (Thompson et al., 2019). Visitation counts were recorded between each seed point and each of M = 50,272 voxels in a whole-brain mask with the ventricles removed, down-sampled to 2 mm³. The pial surface was used as a termination mask to prevent streamlines from crossing between gyri, and streamlines were not allowed to cross the WGB more than twice (once at the seed point and again at termination), to reduce false positives (Hernandez-Fernandez et al., 2019; Smith et al., 2012). All masks (seeds, targets, exclusions) were defined in 40 post-menstrual weeks volume template space (Serag et al., 2012); however, tractography was carried out in native space with results resampled directly to template space. Visitation counts were multiplied by the length of the pathway to correct for compound uncertainty in the estimated trajectories (O’Muircheartaigh and Jbabdi, 2017). The resulting dense matrices describe the likelihood of a white matter connection between each grey matter seed and the rest of the brain. The connectivity matrices were normalised by the total number of viable streamlines before being averaged across the group. Connectivity matrices were saved and averaged in a sparse format to reduce computation time and memory requirements.

Dimensionality reduction and back-projection

We evaluated data-driven connection mapping using both ICA and NMF. Large M (i.e. a large number of imaging voxels) can pose computational and numerical convergence challenges for ICA. We therefore used PCA to reduce the M x N matrix X into a P x N matrix Xr of principal components. Applying ICA to this reduced matrix results in a K x N set of components S, and a P x K mixing matrix in PCA space Ar. In order to obtain the mixing matrix in the original space of M imaging voxels, we can take the pseudoinverse of the component matrix S and project it back onto the original data to obtain the tract space mixing matrix, i.e. A = XS⁺, where S⁺ denotes the pseudoinverse of S (see Suppl. Figure 3). We followed this approach for the ICA analysis in both simulations and on real data, to cope with excessive memory requirements of the full connectivity matrix. However, the dimensionality reduction step was not necessary for NMF.
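The reduction and back-projection steps can be sketched with plain linear algebra. In this illustration the ICA step itself is replaced by a random invertible un-mixing matrix applied to the reduced data, purely as a stand-in so that the shapes and the back-projection A = XS⁺ can be checked; a real analysis would estimate that matrix with FastICA. Mean-centring is omitted and the toy sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, P = 200, 80, 10            # toy sizes (assumptions); P = retained principal components
X = rng.random((M, N))

# PCA-style reduction via truncated SVD: X_r is P x N.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_r = s[:P, None] * Vt[:P]

# Stand-in for ICA: an invertible un-mixing W applied to the reduced data gives
# K = P components S (K x N). A real analysis would estimate W with FastICA.
W = rng.standard_normal((P, P))
S = W @ X_r

# Back-projection to the original M-voxel space: A = X S^+.
A = X @ np.linalg.pinv(S)

# A S equals the projection of X onto the row space of S, i.e. the rank-P truncation of X.
X_rankP = (U[:, :P] * s[:P]) @ Vt[:P]
print(A.shape, np.allclose(A @ S, X_rankP))
```

The check at the end makes the point that back-projection recovers a mixing matrix in the full voxel space without ever running the decomposition on the full M x N matrix.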

Simulations

We evaluated the performance of the decomposition frameworks (using NMF and ICA) in numerically simulated data, before applying them to real data. We simulated datasets X with a known number of underlying sources S, to observe how the behaviour of the decompositions over different model orders reflects the true dimensionality of the data. To find a realistic generative distribution to use for our sources, we used the spatial maps from standard tractography protocols in the neonatal brain (Bastiani et al., 2019) to generate connectivity blueprints (Mars et al., 2018) as proxies for the source spatial maps in grey matter space (Figure 1), and fit several distributions to the intensities of these maps (unwrapped to 1D). We found that log-beta distributions best described the data. The sources were therefore drawn from log-beta distributions, whose parameters in turn were drawn from Gaussian distributions according to the fits to the measured data. These sources are random and sparse, features that indirectly ensure a high degree of independence. Sources were scaled to lie in the range 0-1. The mixing matrix was randomly generated and normalised so that the sum of squares of each column was 1. The simulated data were calculated as the product of the mixing matrix with the source matrix. Zero-mean, additive Gaussian noise was applied to that product via a logit transform, to maintain non-negativity.

Figure 1.

Data-driven matrix decomposition methods applied to resting-state functional MRI and structural connectivity data. a) N functional time-courses of length T are recorded from points in the grey matter. We can apply a matrix decomposition technique, such as ICA, to this matrix, yielding a T x K mixing matrix of time courses and a K x N matrix of spatial components. b) An M x N connectivity matrix describes the likelihood of structural connections existing between each of N grey matter seeds and M locations in the brain. The equivalent decomposition applied to this matrix gives us an M x K mixing matrix of spatial maps, and a K x N matrix of components in the grey matter.

Varying L1-norm regularisation in NMF

The NMF decomposition can be regularised with L1-norm terms to promote sparsity in the components (see equation (1)) (Févotte and Idier, 2011). We first tested NMF on the simulated data with varying levels of regularisation to assess its effect on the accuracy and robustness of the decomposition. Data were simulated with K = 50 sources, and overall dimensions of N = 1200 and M = 1000, with noise added with σ² = 0.05 to best match the real data. We used the same regularisation parameter for the mixing matrix and the components, i.e. α1 = α2 = α, following the implementation in scikit-learn (Pedregosa et al., 2011). NMF was applied with model orders from 1 to 100 and with regularisation parameters α = 0, 0.1, 0.25 and 0.5. This process was repeated with 100 noisy realisations of the data in each case.

Varying number of sources

We performed the simulations with a varying number of sources in the data to check how this affects the results. The data were generated with σ² = 0.05 and with K = 25, 50 and 75 sources. ICA and NMF were applied with model orders from 1 to 100. For NMF, we used a regularisation parameter of α = 0.1 (see Simulation Results for justification). This was also repeated 100 times. ICA was first initialised with a PCA with P = 100 components, as described in the Dimensionality Reduction section.

Varying noise levels

Finally, we tested the impact of varying noise levels on the decompositions. Data were simulated as above. Gaussian noise was added to the data with varying σ² = 0.0005, 0.005, 0.05, and 0.5. 100 noisy realisations were generated in each case. The data were decomposed with ICA and NMF, with model orders K from 1 to 100. ICA was first initialised with a PCA with P = 100 components, as above.

Assessing Performance

We used three different metrics to assess the success of the decompositions on the simulated data: i) Reconstruction error: the sum of squared errors between the reconstructed data after decomposition and the original data: i.e. Σ(X − AS)². This gives us a measure of the information lost through the decomposition. ii) Source-component correlation: the correlation between each original source and the estimated components. The best-matched component to each source was identified and the mean of the maximum correlation values for each component was considered. This describes how well the decompositions have characterised the underlying signals in the data, and is sensitive to overfitting, as redundant components that are not well matched to sources will bring the value down. iii) Sparsity: Following the approach in (Hoyer, 2004; Sotiras et al., 2015), we used a sparsity measure for the derived components based on the relationship between the L1-norm and the L2-norm:

$$\mathrm{sparsity}(x) = \frac{\sqrt{N} - \left(\sum_i |x_i|\right) / \sqrt{\sum_i x_i^2}}{\sqrt{N} - 1} \qquad (2)$$

This returns values between 0 and 1, with 1 signifying a maximally sparse component with only one non-zero element. This was calculated for each component vector in S, and we report the mean value across all components. Sparse components are desirable because they provide an easily interpretable representation of the data with minimal redundant information. In the case of NMF, sparsity constraints also make results more reproducible, by constraining the solution space. For ICA, sparsity can be thought of as a proxy for independence.
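The three metrics can be sketched in a few lines. The function names are hypothetical, and the correlation summary reflects one reading of the description above (for every estimated component, take its highest correlation with any source, then average over components); signed correlations are used here, and taking absolute values would be another reasonable choice.

```python
import numpy as np

def reconstruction_error(X, A, S):
    """Sum of squared errors between the data and its reconstruction A S."""
    return np.sum((X - A @ S) ** 2)

def source_component_correlation(sources, components):
    """Mean, over components, of each component's best correlation with any source."""
    n_src = sources.shape[0]
    corr = np.corrcoef(sources, components)[:n_src, n_src:]  # sources x components
    return corr.max(axis=0).mean()

def hoyer_sparsity(x):
    """Sparsity measure relating the L1- and L2-norms, as in equation (2)."""
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    l1 = np.abs(x).sum()
    l2 = np.sqrt(np.sum(x ** 2))
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

# Toy check with known factors.
rng = np.random.default_rng(0)
sources = rng.random((3, 50))
A = rng.random((20, 3))
X = A @ sources
print(reconstruction_error(X, A, sources))
print(source_component_correlation(sources, sources[::-1]))
print(hoyer_sparsity([0, 0, 1, 0]), hoyer_sparsity(np.ones(4)))
```

With this definition a redundant component that matches no source contributes a low maximum correlation and pulls the mean down, which is the sensitivity to overfitting described above.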

In-vivo data decompositions

For real data, we decomposed group-average tractography matrices, using independent component analysis (ICA) and non-negative matrix factorisation (NMF), with a range of model orders K. ICA was initialised with regular PCA, in which the first 500 components were retained (explaining 97% of the total variance). ICA was applied to the reduced dataset using the FastICA algorithm (Hyvärinen and Oja, 2000), with independence imposed in the seed domain. The pseudoinverse of the resulting component matrix was projected back onto the group-level connectivity matrix to yield the corresponding components in target space (note that although we use a whole-brain target mask for tractography, the bulk of the tractography data, and therefore of the data-driven components, are in white matter, so we refer to these spatial maps as the white matter components throughout the rest of the text). To deal with the sign ambiguity of ICA, components that were negative in the long tail of their distribution were sign-flipped, for consistency with the other methods (i.e. so that the main mass of the distribution was in the positive valued domain).

NMF was performed with a coordinate descent algorithm (Cichocki and Phan, 2009), a Frobenius norm cost function (see equation (1)), and an L1-norm regularisation parameter α = 0.1. In NMF, the matrix is decomposed directly into the M x K mixing matrix and the K x N component matrix so there is no need for the back-projection step that was carried out for ICA after the PCA.

All decompositions were implemented using scikit-learn (Pedregosa et al., 2011) and the code is available on GitHub (https://github.com/ethompson93/Data-driven-tractography). An NMF decomposition on a group average matrix takes around 2 hours and 80 GB of RAM on a single CPU.

Comparison to tractography-derived white matter tracts

To assess validity and interpretability of the extracted components, we compared the automatically extracted white matter components with results obtained from standard, template-driven tractography protocols, developed for neonatal subjects, as described in (Bastiani et al., 2019). 28 tracts (13 bilateral) were mapped in each subject. The tracts included in this analysis were: acoustic radiation (AR), anterior thalamic radiation (ATR), cingulate gyrus part of cingulum (CGC), parahippocampal part of cingulum (CGH), cortico-spinal tract (CST), forceps minor (FMI), forceps major (FMA), fornix (FOR), inferior fronto-occipital fasciculus (IFO), inferior longitudinal fasciculus (ILF), medial lemniscus (ML), posterior thalamic radiation (PTR), superior longitudinal fasciculus (SLF), superior thalamic radiation (STR), and uncinate fasciculus (UNC). These were registered to a 40-week template and down-sampled to 2 mm for comparison with the tract space representations of our data-driven components. Each tract was averaged across all subjects within a split half, and we calculated the Pearson’s correlation coefficient between each of the average tracts and each of the data-driven components from the K = 100 decompositions. A one-to-one matching was performed between the standard tractography results and the component maps, based on the correlation scores, with the results displayed in Figure 6 and Supplementary Figure 4.

Figure 6.

Example group level results from NMF and ICA (model order = 100), displayed alongside their matching tract from the standard protocols. Data-driven components are un-thresholded to enable the comparison between the negative values in the ICA components and the sparse, non-negative representations from NMF, whereas the maps from standard protocols are lower-thresholded at 0.001 for clearer visualisation of the tract. All tractography and data-driven results are taken from split 1 of the split-half analysis. The full set of 28 tracts with their matched data-driven components are shown in the supplementary material.

Split-half reliability analysis

We performed a split-half analysis on a cohort of 323 term-age subjects to see how robust and reproducible our decompositions were across different model orders. We evaluated a number of model orders: K = 5, 10, 25, 50, 100, 200. For each value of K, we performed a one-to-one matching of components across the split-half, based on the Pearson’s correlation coefficients of their spatial maps, recording the correlation coefficients of the matched pairs as a measure of their similarity. This was repeated for the grey matter and white matter maps. We also measured the reconstruction error and the sparsity of the components for both ICA and NMF, as in the simulations.
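A one-to-one matching of components by spatial correlation can be sketched as below. The text does not state which assignment algorithm was used; the Hungarian algorithm (`scipy.optimize.linear_sum_assignment`) is assumed here as one standard way to obtain a one-to-one pairing that maximises total correlation, and `match_components` and the toy maps are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_components(maps_a, maps_b):
    """One-to-one matching of rows of maps_a to rows of maps_b by Pearson correlation.

    Returns, for each row of maps_a, the index of its partner in maps_b and
    the correlation of the matched pair.
    """
    k = maps_a.shape[0]
    corr = np.corrcoef(maps_a, maps_b)[:k, k:]     # (components in a) x (components in b)
    rows, cols = linear_sum_assignment(-corr)      # negate to maximise total correlation
    return cols, corr[rows, cols]

# Toy split-half: the second half holds the same maps, shuffled and slightly noisy.
rng = np.random.default_rng(0)
half1 = rng.random((6, 300))
perm = rng.permutation(6)
half2 = half1[perm] + 0.05 * rng.standard_normal((6, 300))
partners, r = match_components(half1, half2)
print(partners, r.round(3))
```

The vector of matched correlations `r` is the quantity that would be summarised per model order as the split-half similarity.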

Comparison to functional resting-state networks

The cortical patterns of structural connectivity from our NMF components were compared with resting state networks from fMRI. For this analysis, we selected a group of 55 subjects all born and scanned between 40 weeks and 41 weeks PMA (i.e. all subjects within this age range who had both structural and functional data available).

We first mapped the functional data onto the cortical surface, broadly following the fMRISurface pipeline outlined in (Glasser et al., 2013). The native WGB, midthickness and pial surfaces were affine registered to the same space as the functional data. The fMRI timeseries were then mapped onto the cortical surface using a partial volume weighted ribbon-constrained volume to surface mapping algorithm, as implemented in HCP’s Connectome Workbench (Marcus et al., 2011). These data were then downsampled from the native mesh and registered to the 32k resolution template (using the same MSM transform as for the WGB surface used to seed tractography). Spatial smoothing was applied over the cortical surface with a Gaussian kernel, FWHM = 2 mm.

Temporally-concatenated group-ICA was performed using FSL’s Melodic (Beckmann and Smith, 2004), with Melodic’s Incremental Group PCA (MIGP) for the PCA step (Smith et al., 2014). MIGP uses an incremental approach to closely approximate PCA of very large datasets but with a reduction in the amount of memory required. We specified 50 independent components. We performed NMF on the group-averaged structural connectivity matrices of the same group of subjects, with K = 50, for comparison. The similarity between the resultant grey matter spatial maps was assessed using Pearson’s correlation coefficient.

Cortical parcellations using structural connection patterns

We used the grey matter components to generate a parcellation of the cortex with K clusters, using a “winner-takes-all” approach, whereby each vertex on the cortical surface was labelled according to the component that had the highest weighting at that point. This results in a hard parcellation where each cluster corresponds to a component that best characterises the connection patterns of the vertices contained within it. We tested the robustness of these parcellations by calculating the Dice coefficient between parcellations generated on each of the split-halves. Dice coefficients measure the overlap between two clusters, normalised by the number of elements in each cluster. Subject-level parcellations were generated from the dual-regressed subject-level grey matter components. We assessed the variability of the subject-level parcellations by calculating the Dice coefficient between equivalent parcels in the subject-level and group-level parcellations.
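As a minimal sketch of the winner-takes-all labelling and the Dice overlap described above, the following uses two tiny hand-made component matrices. The array names are hypothetical, and the components are assumed to be already matched across the two splits (same row index in both).

```python
import numpy as np

def winner_takes_all(gm_components):
    """gm_components: (K, n_vertices) non-negative weights.
    Returns, for each vertex, the index of its highest-weighted component.
    Vertices with all-zero weights would need masking in real data."""
    return np.argmax(gm_components, axis=0)

def dice(labels_a, labels_b, parcel_a, parcel_b):
    """Dice coefficient 2|A and B| / (|A| + |B|) between two parcels."""
    a = labels_a == parcel_a
    b = labels_b == parcel_b
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else np.nan

# Toy grey matter components (K = 3) over 6 vertices for each split-half.
split1 = np.array([[0.9, 0.8, 0.1, 0.0, 0.2, 0.1],
                   [0.1, 0.3, 0.7, 0.9, 0.1, 0.0],
                   [0.0, 0.1, 0.2, 0.1, 0.8, 0.9]])
split2 = np.array([[0.8, 0.4, 0.2, 0.1, 0.1, 0.0],
                   [0.2, 0.6, 0.9, 0.7, 0.2, 0.1],
                   [0.1, 0.0, 0.1, 0.2, 0.9, 0.7]])

labels1 = winner_takes_all(split1)
labels2 = winner_takes_all(split2)
dice_scores = [dice(labels1, labels2, k, k) for k in range(3)]
```

In this toy case one vertex changes parcel between the splits, which lowers the Dice score of the two parcels involved while the third parcel overlaps perfectly.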

We also assessed the parcellation using a Silhouette coefficient, which assesses the similarity of the vertices in a cluster in relation to the vertices in other clusters (Rousseeuw, 1987). We used (1-Pearson’s R) as a distance metric for the connectivity profiles of different vertices. A successful parcellation would group vertices with similar connectivity profiles, which are distinct from the connections in other parcels.
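The Silhouette computation with a (1 - Pearson's R) distance can be sketched as follows. This is an illustrative implementation of the standard definition (Rousseeuw, 1987) on toy connectivity profiles, not the code used in the study; the function names are hypothetical and at least two parcels are assumed.

```python
import numpy as np

def silhouette_per_vertex(profiles, labels):
    """profiles: (n_vertices, n_features) connectivity profiles.
    labels: (n_vertices,) parcel label per vertex.
    Distance between two vertices is 1 - Pearson r of their profiles."""
    dist = 1.0 - np.corrcoef(profiles)
    parcels = np.unique(labels)
    scores = np.zeros(len(labels))
    for i in range(len(labels)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton parcels score 0 by convention
        # Mean distance to the other members of the same parcel
        # (the self-distance is 0, so dividing by n_own - 1 excludes it).
        a = dist[i, own].sum() / (n_own - 1)
        # Smallest mean distance to the members of any other parcel.
        b = min(dist[i, labels == p].mean() for p in parcels if p != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return scores

def mean_silhouette_across_parcels(profiles, labels):
    s = silhouette_per_vertex(profiles, labels)
    return float(np.mean([s[labels == p].mean() for p in np.unique(labels)]))

# Toy data: two groups of 10 vertices, each sharing a connectivity pattern.
rng = np.random.default_rng(0)
base_a, base_b = rng.random(50), rng.random(50)
profiles = np.vstack([base_a + 0.05 * rng.standard_normal((10, 50)),
                      base_b + 0.05 * rng.standard_normal((10, 50))])
good_labels = np.repeat([0, 1], 10)   # parcels follow the true groups
bad_labels = np.tile([0, 1], 10)      # parcels mix the two groups
mean_good = mean_silhouette_across_parcels(profiles, good_labels)
mean_bad = mean_silhouette_across_parcels(profiles, bad_labels)
```

A parcellation that groups vertices with similar profiles gives a score close to 1, whereas one that mixes the groups gives a score near or below 0.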

Results from our data-driven parcellations were benchmarked against a “null distribution” of 100 random Voronoi parcellations of the same model order (Aurenhammer, 1991). Voronoi parcellations are spatially-continuous and geodesic-distance based and were generated from seeds randomly distributed over the surface of two spheres, mapped to the surface of each hemisphere of the cortex. Each vertex on the sphere is labelled according to its nearest seed point on the surface. The spherical parcellations were projected onto the cortex, providing random parcellations with a set number of contiguous spatial regions.
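A random spherical Voronoi parcellation of this kind can be sketched in a few lines. The sketch assumes a unit sphere, on which the geodesic distance between two points is the arccosine of their dot product, so the nearest seed is the one with the largest dot product. The random points standing in for spherical mesh vertices and the function names are hypothetical.

```python
import numpy as np

def random_unit_vectors(n, rng):
    """Points uniformly distributed over the unit sphere."""
    v = rng.standard_normal((n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def voronoi_labels(vertices, seeds):
    """Label each vertex by its geodesically nearest seed.
    On the unit sphere this is the seed with the largest dot product."""
    return np.argmax(vertices @ seeds.T, axis=1)

rng = np.random.default_rng(0)
vertices = random_unit_vectors(2000, rng)  # stand-in for spherical mesh vertices
seeds = random_unit_vectors(25, rng)       # 25 random seeds for one hemisphere
labels = voronoi_labels(vertices, seeds)
```

If the spherical mesh shares its vertex indexing with the cortical surface mesh, which is assumed here, the labels transfer directly to the cortex, and repeating the procedure with different random seeds yields the set of random parcellations.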

Results

Simulations

We performed simulations to evaluate the performance of ICA and NMF decompositions on a synthetic dataset in which the underlying sources were known. We first looked at the effect of varying the degree of L1-norm regularisation in NMF. We then investigated how the number of sources and the noise level in the data affected results.

Varying L1-norm regularisation

Increasing the regularisation parameter, α, increases sparsity, but also increases the reconstruction error. The NMF decomposition breaks down for high regularisation (α = 0.5), with high error and very low source-component correlation. Smaller amounts of regularisation improve the agreement between the components and sources and reduce the reconstruction error at the cost of reducing sparsity. A good middle-ground solution is shown (α = 0.1), balancing reconstruction accuracy and sparsity. We therefore opted to use α = 0.1 for subsequent experiments.

Varying number of sources

We carried out the decompositions on data with varying numbers of underlying sources. Figure 4 shows that reconstruction error increases with the number of sources, so more information is lost between the decomposition and the original data as the data become more complex. For the source-component correlation, we can see two different regimes. When the number of components, N, is lower than the true number of sources in the data, K, the average correlation between the components and the true sources rises quickly for very low N, then plateaus until N = K. When N > K, the extra components overfit to the noise and bring down the average correlation with the sources. NMF achieves overall very high correlations between the reconstructed components and the true non-negative sources. NMF component sparsity increases rapidly for low N, then increases more slowly once the number of components exceeds the number of sources. In the case of ICA, sparsity reaches a peak when the number of components is equal to the number of underlying sources, then decreases.

Figure 4.

Simulation results to show how decompositions vary with differing numbers of underlying sources. The dotted vertical line shows the number of underlying sources in each case (from left to right: K = 25, 50, 75). Results are shown from ICA and NMF decompositions, in orange and blue, respectively. σ² = 0.05 and α = 0.1 for NMF.

Varying SNR

Overall, reconstruction error increases with noise level. In general, reconstruction error decreases as the model order approaches K, the true number of underlying sources, and then plateaus for higher model orders. The mean correlation between the components and the underlying non-negative sources increases as the number of components approaches K, and then decreases as the models overfit to noise. The sparsity of the components exhibits a relatively stable pattern for low and mid-levels of noise, but it becomes considerably reduced in the high noise scenario (σ² = 0.5).

Figures 4 and 5 also enable us to compare the performances of ICA and NMF on simulated, non-negative data. ICA shows a lower reconstruction error than NMF, particularly when model order exceeds the number of true sources. This could, however, signify that ICA is overfitting to noise more than NMF, particularly since ICA also exhibits a lower correlation between its components and the underlying sources than NMF, at all model orders. This reflects the better suitability of NMF for identifying inherently non-negative patterns within the data, in contrast to ICA, which generates components that contain both positive and negative values. NMF also generates components with consistently higher sparsity than those from ICA.

Figure 5.

Simulation results to assess the effect of varying noise levels on the ICA (orange) and NMF (blue) decompositions. The noise level increases from left to right across the plots (σ² = 0.0005, 0.005, 0.05, 0.5). Different metrics are shown from top to bottom: reconstruction error, correlation between derived components and underlying sources, sparsity of components. The true number of underlying sources (K = 50) is denoted by a vertical dashed line.

To summarise, we evaluated the performance of ICA and NMF on a simulated dataset with non-negative sources. Based on the results of these simulations, we have chosen a regularisation parameter of α = 0.1 for NMF to use on the real data, as this promotes sparsity in the components, without compromising too much accuracy in the reconstruction. We have found that NMF has a number of advantages over ICA for non-negative data: it generates components that are more closely matched to the real sources, with higher sparsity and potentially less overfitting to noise.

In-vivo results - Comparison between ICA, NMF and standard tractography

To investigate the interpretability and validity of the extracted components, we compared the white matter components from both ICA and NMF with the group-averaged results from standard tractography protocols. A number of our data-driven components exhibit strong spatial similarity to known white matter pathways (figure 6). In fact, all the considered 28 tracts have well-matching components (Suppl. Figure 4). Both ICA and NMF are able to identify spatially separate regions of grey matter (i.e. networks), along with their underlying white matter connections, for example in the forceps minor, the ILF and the various thalamic projections.

These examples demonstrate the advantages of using NMF over ICA. NMF components are inherently sparser (ICA-derived spatial maps typically cover the whole brain) and by construction non-negative. The main body of the anatomically relevant information conveyed by ICA components is present with NMF decompositions but in an inherently non-negative manner. This suggests that the NMF sparsity constraints effectively enforce independence in the decomposition, similarly to ICA. In addition, we can observe qualitative improvements of NMF over ICA for a number of tracts. For instance, the NMF component corresponding to the right IFO has a stronger peak in the occipital lobe than the equivalent ICA component, and NMF has fewer false positive frontal projections in the left ILF. These features are seen in the results from both split-halves of the cohort. Further detailed comparison between NMF and ICA components with differences between matched pairs is shown in Suppl. Figure 6. This demonstrates that the NMF results convey different information than the ICA results, even when the latter are thresholded to only retain positive values.

Interpretability can also be illustrated for components that do not match any tracts from the set we reconstructed using standard tractography protocols. An example is demonstrated in figure 7, where 10 components from the K = 100 NMF decomposition have been identified as corresponding to different segments of the corpus callosum. For each component, the grey matter (seed space) map is shown, along with the WM spatial map (tract space).

Figure 7.

Ten components from the K = 100 NMF decomposition that correspond to segments of the corpus callosum. For each component, the grey matter (seed space) map is shown, along with the WM spatial map (tract space) rendered in 3D to aid visualisation. All rendered WM segments are shown at the top.

Assessing the reliability and accuracy of the decompositions

To assess the reproducibility of the derived components, we performed a split-half reliability analysis for the ICA and NMF decompositions. Figure 8 presents histograms of correlations between the best-matching components across the split-halves, for both ICA and NMF. In all cases, the median value lies above 0.8, which shows that both methods are robust to different subject groups. Although patterns are more variable for lower model orders (K < 25), both methods perform similarly for higher K (50, 100, 200). Similar behaviour is observed for grey matter components and white matter mixing matrices.

Figure 8.

Split-half reliability analysis for ICA and NMF. Pearson’s correlation scores were calculated between the best-matched components in each split for the white matter spatial maps (a) and the grey matter maps (b). The dotted lines on the violin plots indicate the 25th and 75th percentiles and the median is represented by a dashed line.

We also computed the reconstruction error and component sparsity. In line with the results from the simulations, reconstruction error decreases with increasing numbers of components, with ICA having slightly higher reconstruction accuracy than NMF (Suppl. Figure 5). Sparsity is much higher for NMF than for ICA, as we would expect from a qualitative examination of the components in figure 6. Sparsity increases rapidly from 5 to 50 components, and the increases beyond 100 components become smaller. Both measures have been calculated for both splits, and confidence intervals are displayed but very small, which indicates that these measures are stable for different groups of subjects.

We explored how increasing the model order in the decomposition affects the splitting of components (Suppl. Figure 7). Equivalent components were identified across model orders by calculating the correlations between their spatial maps. We can see that the more coarse-grained connectivity patterns from the low dimensionality decompositions are broken down into more sparse, fine-grained spatial maps as we increase the number of components. For example, in the left panel of Suppl. Figure 7, we show an NMF component and the associated white matter spatial map from the K = 5 decomposition that delineates the left pyramidal tract. As we increase the number of components from K = 5 to K = 50, we see this bundle split into sub-components that characterise different parts of corona radiata projections. We can also see the increase in sparsity between the low and the high order components (which agrees with the quantitative results - Suppl. Figure 5b).

Having ascertained the reliability of the data-driven framework for a large group of subjects, we explored the behaviour of smaller groups. We performed a K = 50 decomposition on a single subject’s data, and then for groups of 5, 10, 50 and 200 subjects. The white matter and grey matter spatial maps from two of the resultant components are shown in figure 9. This shows that the patterns are robust even at the single-subject level, although the patterns are noisier with fewer subjects. A quantitative analysis of the similarity between the small group-size results and the full cohort components is shown in Suppl. Figure 8, from which we can see that components from 10 subjects and 50 subjects have similarly very strong correspondence with the full cohort, while even the single-subject results are reasonable.

Figure 9.

Two components and their corresponding white matter pathways from K = 50 group-level decompositions with varying numbers of subjects. Component 1 correlates well with the tractography-delineated cortico-spinal tract, and component 2 with the inferior longitudinal fasciculus.

Finally, we compared the results from single-subject NMF decompositions with the results from non-negative dual regression on the same subjects against a group NMF decomposition, as shown in Suppl. Figure 9. We found that there is a strong agreement between the component maps obtained from these different approaches, which is reassuring and highlights the benefit of using non-negative dual-regression against a group decomposition in ensuring consistency in the components between subjects, but also preserving individual subject features. A small number of cases (lower end of the depicted distribution in Suppl. Figure 9a) exhibit relatively lower agreement between the two sets of results. We anticipate that imperfections in registration and/or alignment of the surfaces to the volumetric template are reflected in these disagreements; but even in these cases (Subject B as representative example), the spatial maps of the components do not look too dissimilar, demonstrating the robustness of the approach.
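To make the idea of non-negative dual regression concrete, the sketch below shows one plausible two-stage form: group grey matter maps are first regressed onto a subject's connectivity matrix to obtain subject white matter maps, which are then regressed back to obtain subject grey matter maps, with both stages constrained to be non-negative. The matrix orientation, the multiplicative-update solver and all names are assumptions made for illustration; the formulation actually used is the one defined in the Methods.

```python
import numpy as np

def nn_regress(A, B, n_iter=500, eps=1e-12):
    """Approximate argmin over C >= 0 of ||B - A C||_F for non-negative
    A and B, using multiplicative updates (an assumed, simple solver)."""
    C = np.ones((A.shape[1], B.shape[1]))
    AtB, AtA = A.T @ B, A.T @ A
    for _ in range(n_iter):
        C *= AtB / (AtA @ C + eps)
    return C

def nn_dual_regression(X_subj, G_group):
    """X_subj: (n_gm, n_wm) subject connectivity matrix.
    G_group: (n_gm, K) group-level grey matter maps."""
    W_subj = nn_regress(G_group, X_subj)        # stage 1: (K, n_wm) white matter maps
    G_subj = nn_regress(W_subj.T, X_subj.T).T   # stage 2: (n_gm, K) grey matter maps
    return G_subj, W_subj

# Toy data: three block-like group maps and a subject whose true maps
# deviate slightly from the group maps.
rng = np.random.default_rng(0)
n_gm, n_wm, K = 30, 40, 3
G_group = 0.05 * rng.random((n_gm, K))
for k in range(K):
    G_group[10 * k:10 * (k + 1), k] = 1.0
G_true = G_group + 0.05 * rng.random((n_gm, K))
W_true = rng.random((K, n_wm))
X_subj = G_true @ W_true

G_subj, W_subj = nn_dual_regression(X_subj, G_group)
rel_err = np.linalg.norm(X_subj - G_subj @ W_subj) / np.linalg.norm(X_subj)
```

Because each subject-level map is tied to a specific group component, component correspondence across subjects comes for free, while the second stage lets the subject's own data reshape the grey matter maps.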

Comparison with functional resting-state networks

As an extra indirect validation, we compared the grey matter maps from the NMF decompositions of the tractography data, with resting-state networks (RSNs) obtained from ICA decomposition of fMRI data. We performed group-level ICA (K = 50) on fMRI data from 55 subjects and compared the resultant resting-state networks to those from a K = 50 NMF decomposition of the structural connectivity data from the same subjects. Through visual inspection, 24 of the functional components were found to contain noise or artefacts, so were discarded. We measured the similarity of the remaining 26 RSNs to our structural grey matter components using Pearson’s correlation coefficient, r, to identify the best matching pairs.

Most functional components were well matched to at least one structural component, with the lowest correlation value between an RSN and a tractography component being r = 0.2. Over half (14 out of the 26 networks identified) had a correlation value r > 0.5 with their best-matched structural component. The correlation matrix in figure 10 is sparse, which indicates that there is specificity in the matching. Where RSNs were strongly associated with multiple structural components, this was either a bilateral network split into the two hemispheres (e.g. fig 10b and c) or structural networks that overlapped with different regions of the RSN (fig 10a and d).

Figure 10.

Left: correlation matrix between the fMRI RSNs and their 26 best-matched tractography NMF components. Right: examples of the functional networks and their most spatially similar grey matter components from structural NMF. These correspond to the columns outlined in yellow on the correlation matrix. The corresponding white matter patterns are shown as maximum intensity projections.

Parcellations

The grey matter components from NMF were used to generate hard parcellations of the cortex, using a winner-takes-all approach. This process was carried out on each of the split-half groups to assess how robust the parcellations are to different groups of subjects. Figure 11a illustrates the parcellation results for different values of K. We can observe high reproducibility of the parcels between the two split-halves, and parcellation schemes are robust across different model orders. We also show a subject-specific parcellation generated from the results of a non-negative dual regression that demonstrates qualitatively how the group results correspond to single subjects. In order to quantify the variability of these group parcellations across subjects, we calculated the Dice coefficient between the equivalent parcels in the group-level and subject-level parcellations. The average coefficient for each parcel of the K = 100 parcellation is shown in figure 11b, alongside two example subject level parcellations, with the lowest and highest average (across parcels) Dice score, respectively. We can see that most parcels are relatively stable across subjects (average Dice > 0.7).

Figure 11.

a) Hard parcellations of the cortical surface from NMF, from each split-half of the cohort and from dual regression of the group-level results onto a single subject. Only the left hemisphere is displayed. Parcels are colour matched according to the correlation values between the original grey matter components. b) Left: variability of the K = 100 parcellation borders, colour coded according to the average Dice score between the group level parcellations and the subject level parcellations from split 1 of the cohort (dark red: small overlap of parcel across subjects, bright yellow: large overlap of parcel across subjects). Right: examples of subject-level parcellations with low and high average Dice score with the group parcellation.

Figure 12a further quantifies the similarity between split-half group parcellations by showing the distributions of Dice scores across all generated parcels. This can be compared against distributions of Dice scores obtained from 100 random Voronoi parcellations (with spatial continuity enforced) of the same order as the decomposition used in each case. The parcellations using the NMF components are significantly more consistent than the equivalent randomly generated parcellations.

Figure 12.

a) Dice scores of matching parcels across the split-half analysis. For comparison, we also calculated the Dice score between one of the splits’ NMF parcellations and 100 randomly generated Voronoi parcellations of the same model order. b) Mean Silhouette score across clusters for NMF and Voronoi parcellations with model orders of 5, 10, 25, 50, 100 and 200.

To further gain insight into the validity of these parcellations, we calculated the mean Silhouette score across parcels for the NMF-based parcellations at each model order, and for each split-half of the cohort. For comparison, we also computed the measure for 100 randomly generated Voronoi parcellations with the same number of parcels. A silhouette score measures the similarity of the data within a parcel, relative to their dissimilarity to data in other parcels. From figure 12b, we can see that the mean Silhouette score across parcels for our data-driven parcellations is consistently higher than for the equivalent random parcellations. Furthermore, we can see that the validity of the parcellations increases with increasing numbers of parcels in data-driven parcellations. On the contrary, for random parcellations, the Silhouette score peaks at K = 25, and then decreases for greater values of K. Our results show that our data-driven parcellations provide a more meaningful clustering of the data than random parcellations, even when the random parcellation has spatial contiguity enforced.

Discussion

We have developed and demonstrated a non-negative framework for simultaneously mapping white matter connections and corresponding grey-matter networks from diffusion MRI data in a data-driven manner. We presented this approach within the context of mapping structural connectivity in the neonatal brain. Non-negative matrix factorisation (NMF) is a powerful alternative to traditional tract delineation that has no parametric assumptions, no dependence on predefined ROIs and masks in a template space, and is inherently suited to the non-negative nature of tractography data. We directly evaluated the performance of the framework using numerically simulated scenarios and indirectly explored the validity of the extracted components by comparing them against known tracts and against networks obtained from a different modality (resting-state fMRI). We also developed a non-negative dual regression approach to allow group NMF decomposition results to be consistently applied to individual subjects and confirmed the similarity of dual-regressed results with single-subject decompositions. Finally, we showed benefits of the NMF framework compared to a similar-in-spirit approach that used ICA to map connections in the adult brain (O’Muircheartaigh and Jbabdi, 2017). NMF is an alternative decomposition method that provides more interpretable and accurate reconstructions of non-negative sources than ICA.

Our work falls within the family of other data-driven approaches for mapping structural connections from whole-brain tractograms, such as (Garyfallidis et al., 2012; O’Donnell and Westin, 2007; Siless et al., 2018). Our approach extends these efforts by allowing simultaneous reconstructions of white matter bundles, but also corresponding grey-matter networks that these bundles connect. Furthermore, none of the previous data-driven approaches have been applied for mapping connections from diffusion MRI data of the neonatal brain, as shown here.

Validation using Simulations

We used simulations to investigate the behaviour of the decompositions in controlled scenarios, in which the ground truth was known, and we could evaluate performance as a function of preselected features. In order to generate realistic simulations for such a decomposition framework, we therefore learned properties of the sources from distributions obtained from in vivo data, and mixed non-negative sources to generate synthetic data with a known number of components.

We first looked at the effect of adding an L1-norm regularisation term to the objective function for NMF (see equation 1). Increasing the regularisation reduces the accuracy of the data reconstruction, but a small amount (α = 0.1) improves the correlations between the sources and the components at lower model orders and promotes component sparsity. We decided to use an α value of 0.1 for subsequent work, as we deemed this to be a good compromise between higher component sparsity and source reproduction, with only a minimal impact on reconstruction accuracy. Increasing the sparsity of components has been shown to generate features that are inherently more independent, while constraining the NMF solution space to make the decomposition more reliable (Hoyer, 2004).
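The role of the L1 term can be illustrated with a small sparse NMF sketch. The objective assumed here, 0.5·||X − WH||² + α(ΣW + ΣH), and the multiplicative-update solver are illustrative choices and may differ in scaling or detail from equation 1 and from the solver used in the study; the function names and toy data are hypothetical.

```python
import numpy as np

def sparse_nmf(X, k, alpha=0.1, n_iter=300, seed=0, eps=1e-12):
    """Multiplicative updates for the assumed objective
    0.5 * ||X - W H||_F^2 + alpha * (sum(W) + sum(H)), with W, H >= 0.
    The L1 penalty enters as `alpha` in the denominators, shrinking weights."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k))
    H = rng.random((k, m))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + alpha + eps)
        W *= (X @ H.T) / (W @ H @ H.T + alpha + eps)
    return W, H

def objective(X, W, H, alpha):
    return 0.5 * np.linalg.norm(X - W @ H) ** 2 + alpha * (W.sum() + H.sum())

# Toy non-negative data generated from 5 sparse sources.
rng = np.random.default_rng(1)
W0 = rng.random((40, 5)) * (rng.random((40, 5)) > 0.5)
H0 = rng.random((5, 60)) * (rng.random((5, 60)) > 0.5)
X = W0 @ H0

W, H = sparse_nmf(X, k=5, alpha=0.1)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

Re-running the sketch with a larger `alpha` shows the trade-off discussed above: the penalty pushes more entries of W and H towards zero, at the expense of a larger reconstruction error.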

We also looked at the effect of adding varying levels of Gaussian noise to the data. As expected, the reconstruction error of the decompositions increased with increasing noise, but the correlation between components and true sources was fairly stable, particularly at low model orders. Comparing the results from ICA and NMF, both were able to reconstruct the original data (using the dot product of the mixing matrix and component matrix) with good accuracy, but the components from ICA were less well matched to the true non-negative sources themselves than those from NMF. This is because the components from ICA contain negative values that are not found in the real sources, although mutual cancellation of positive and negative values in the components and mixing matrix allows the data matrix to be reconstructed accurately.

Indirect Validation

White matter spatial maps of the NMF components show strong spatial similarity to known white matter pathways (Figures 6, 7, Suppl. Figure 4). Each of the 28 tracts that were considered had a corresponding component from the K = 100 decomposition. The tractography-matched patterns from ICA and NMF have similarities, as seen in figure 6. This hints towards NMF being able to separate spatially independent components, in an analogous manner to ICA, despite not having independence constraints enforced explicitly. This is because the sparsity constraint on the NMF decomposition implicitly promotes non-Gaussianity in the resultant components, which is used as a proxy for independence in the FastICA algorithm (Hyvärinen and Oja, 2000). Indeed, sparsity and independence criteria have previously been shown to generate very similar basis sets across several different data types (Saito et al., 2000).

Despite the overall similarity between the results from the two methods, there are some noticeable differences between the spatial maps from ICA and NMF, shown in Supplementary Figures 4 and 6. For example, the component corresponding to the forceps major extends more strongly into the right hemisphere in the NMF component than in the ICA component. In addition, the ILF component from ICA extends into the frontal lobe, mixing with the inferior fronto-occipital fasciculus, which is not seen in the NMF result. Suppl. Figure 6 shows further examples and illustrates the effects of different levels of thresholding on the ICA results. We can see from these results that a) the NMF results convey different information than the ICA results, even when the latter are thresholded to only retain positive values, and b) different ICA components would require different levels of thresholding to match the results from NMF.

There are also some tracts which are not so well-characterised by either method, such as the acoustic radiation, which contains a mixture of the middle longitudinal fasciculus, and the superior longitudinal fasciculus, which does not have separate lobes in the grey matter components. However, it is worth noting that the data-driven methods presented here are not meant to replace tractography for major bundle delineation, particularly in cases where we have well-defined tractography protocols. Instead, they can provide complementary ways to concurrently extract GM and WM connectivity patterns from all the data simultaneously, particularly for cases where this delineation of bundles is challenging or incomplete. This can potentially be a powerful novel way of summarising the information content of tractography data for applications other than bundle delineation, such as connectivity-driven functional localisation (for example (Mars et al., 2018)).

We explored a range of model orders from 5 to 200. The lower model orders generate more distributed components that contain multiple white matter bundles, whereas the higher model orders give more specificity, as shown in Suppl. Figure 7. The components from lower model orders (e.g. K = 5) are split into smaller constituent parts for higher model orders, providing a component hierarchy as K increases. At higher model orders, we also see additional components that do not have matching predefined tracts from the standard protocols. We show examples of these “unassigned” NMF components in Suppl. Figure 10. Many of these components are bilateral, and show short range connections in the frontal lobe, such as the fronto-marginal tract (bottom row, first two columns) (Catani et al., 2012). Others may reflect false positive connections, such as the thalamic loops in the fifth row. Interpreting and potentially classifying these components is an interesting topic for future exploration and similar ideas applied to fMRI ICA-based classification (Salimi-Khorshidi et al., 2014) could be aimed for here, particularly with respect to the NMF model order used.

We performed a quantitative analysis of the components, shown in Suppl. Figure 5, looking at the reconstruction error and the sparsity of the components. Reconstruction error decreases with more components, and the sparsity of the components increases. This reflects the higher degree of freedom afforded by more components that permit a more detailed reconstruction of the original data, and components that are more tightly localised around fine-grained regions of similar connectivity. NMF components are sparser than those from ICA, which indicates that the former is able to localise connectivity patterns more effectively, disregarding redundant information and keeping non-negativity in the reconstruction.

The grey matter maps of the NMF components were also shown to align well to resting-state networks from fMRI. This provides further evidence that these data-driven results are anatomically meaningful. It also opens up future possibilities for devising a multi-modal data-driven framework that can fuse information across modalities and perform decompositions simultaneously for dMRI and fMRI data.

Parcellations

We used the grey matter maps of the NMF components to generate a connectivity-based cortical parcellation scheme. Specifically, each vertex on the cortical mesh was labelled according to the component with the strongest weighting at each point. This leads to a parcellation in which clusters share similar patterns of structural connectivity to the rest of the brain. Depending on the model order of the decomposition, the parcellation can be coarse or more fine-grained (see figure 11a). An advantage of this approach is that it is entirely data-driven, so the parcellations are not biased by any subjective measures. It can also be used to generate subject specific parcellations, by using the subject-level grey-matter maps from dual regression.

We also performed a split-half reliability analysis of the parcellations, using Dice Score as a similarity measure, to see how reproducible the parcellations are for different model orders. We compared the results with the Dice score between one split and a set of randomly generated Voronoi parcellations. For all model orders, the data-driven parcellations were more consistent than random parcellations. In addition, we used Silhouette score as a measure of the parcel validity, and again compared the performance of the NMF-based parcellations against 100 random Voronoi parcellations. Silhouette score measures the similarity of the connectivity profile of a given grey matter vertex to others in its parcel, relative to the connectivity of vertices in other parcels. We found that our data-driven parcellations consistently scored higher on this measure than the random parcellations (see figure 12).

Despite the evidence provided by our results, validating a cortical parcellation is extremely challenging. Existing schemes for the neonatal brain have been derived from manual segmentation of high-resolution data (Alexander et al., 2019, 2017), or compared against expert manual segmentations (Adamson et al., 2020; Oishi et al., 2011). While these are extremely useful pieces of work, as they stem from traditional invasive parcellation approaches, they are based on gyral and sulcal landmarks. These landmarks may not necessarily coincide with functional boundaries (see e.g. Van Essen and Glasser, 2018, for a recent review). The hope is that connectivity patterns can provide additional information that is closely linked to non-invasive functional delineation, as shown in (Glasser et al., 2016). The NMF framework presented here may be extremely useful for providing another connectivity-based modality, in addition to, for instance, functional connectivity approaches, and may further augment multi-modal parcellations.

Decomposition Domain

In the results presented here, we have applied decompositions in the WGB seed domain, allowing white matter tract overlap. We also tried applying the decompositions to the transpose of the connectivity matrix, X^T, which meant decomposing (and, in the case of ICA, enforcing independence) in the tract domain. ICA and NMF were performed on the transpose of the split 1 connectivity matrix, X^T, with K = 50. Looking at the similarity between the results from both methods (see Suppl. Figure 11), we can see that the ICA components are most affected by this change, whereas most NMF components are nearly identical to the original results. This agrees with expectations, as in NMF the sparsity and non-negativity constraints are enforced in both the mixing matrix and the components (see equation 1).
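The expected symmetry of NMF under transposition can be made explicit with a short identity. This is a sketch under stated assumptions: it supposes that equation 1 combines a Frobenius-norm reconstruction term with entrywise L1 penalties on both factors, and the symbols W, H, \lambda_W and \lambda_H are placeholders that may differ from the notation of equation 1.

```latex
\begin{aligned}
\min_{W \ge 0,\; H \ge 0}\;
  & \tfrac{1}{2}\,\lVert X - WH \rVert_F^2
    + \lambda_W \lVert W \rVert_1 + \lambda_H \lVert H \rVert_1 \\
= \min_{W \ge 0,\; H \ge 0}\;
  & \tfrac{1}{2}\,\lVert X^{T} - H^{T} W^{T} \rVert_F^2
    + \lambda_H \lVert H^{T} \rVert_1 + \lambda_W \lVert W^{T} \rVert_1
\end{aligned}
```

Under these assumptions, factorising X^T is the same optimisation problem with the roles of the two factors exchanged, so any solution for X gives a solution for X^T by transposition. ICA, by contrast, imposes independence on only one of the two factors, so changing the domain in which independence is enforced changes the problem being solved.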

Limitations

Our decomposition framework uses whole-brain tractography data and its performance can therefore be challenged by tractography limitations, which are important to keep in mind when interpreting results. Tractography is an indirect measure of anatomy that is prone to identifying false positive connections (Maier-Hein et al., 2017). False positives in tractography can manifest in two ways: (a) in a noisy fashion, causing false paths that are inconsistent either spatially or across subjects; these are less likely to be major drivers of data-driven decompositions; and (b) in a biased fashion, i.e. consistent false positives that have a certain spatial extent and are reproducible across subjects; these can form the basis of extracted components in NMF, even at the group level. We nevertheless performed a number of indirect validations to gain confidence in the validity and interpretability of the results. NMF decompositions, without any constraints or anatomical knowledge imposed, identified patterns that resembled constrained tractography results in white matter and patterns obtained from an independent modality (rfMRI) in grey matter, and allowed whole-brain connectivity-based parcellations that were reproducible across subjects.

It has also been shown that tractography streamlines are biased towards terminations in the gyri rather than the sulci (Schilling et al., 2018; Van Essen et al., 2013), although the effects of this “gyral bias” can be minimised by seeding from the cortical surface rather than the whole brain (Donahue et al., 2016; Schilling et al., 2018), as we have done here. We have also shown in previous work that the effects of gyral bias are less prevalent in neonates than in adults due to the less developed cortical folding (Thompson et al., 2019), and we therefore expect less direct influence of such biases on NMF performance in the neonatal brain. In fact, our parcellation borders did not show a consistent overlap with sulcal fundi or gyral crowns (Suppl. Figure 12).

Data-driven decompositions can be more computationally demanding than standard tractography approaches, as they consider all data at once and extract all white matter and grey matter maps simultaneously, within the same decomposition. To reduce the memory requirements and the computational burden, we binned the whole-brain tractography data into a 2 mm spatial grid, which subsequently defined the size M in the decompositions (Figure 1), rather than using the native 1.5 mm spatial grid of the dMRI data. This provides WM components at a lower resolution than available in the original data but does not change any trends or conclusions drawn from the presented analyses.
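The saving from the coarser grid can be estimated with simple arithmetic. The sketch below assumes isotropic voxels covering a fixed volume and a dense connectivity matrix whose storage grows linearly with M; the function name is hypothetical and no actual matrix dimensions from the study are used.

```python
def voxel_count_ratio(native_mm=1.5, binned_mm=2.0):
    """Factor by which the number of isotropic voxels covering a fixed
    volume drops when moving from the native to the binned grid."""
    return (binned_mm / native_mm) ** 3

ratio = voxel_count_ratio()
# With a dense N x M matrix, memory scales linearly with M,
# so the same factor applies to storage (assumption: dense storage).
print(f"M shrinks by a factor of about {ratio:.2f}")  # about 2.37
```

Under these assumptions, the 2 mm grid needs fewer than half as many white matter voxels as the native 1.5 mm grid for the same coverage.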

Conclusions

We have shown that data-driven methods can be used to jointly map white matter bundles and their corresponding grey matter networks from dMRI tractography data from neonatal subjects. In particular, we show that non-negative matrix factorisation provides a robust decomposition that is a natural fit for the inherently non-negative structural connectivity data.

Supplementary Material

Supplementary Figures

Figure 3.


Simulation experiment to assess the effect of L1-norm regularisation on NMF. The degree of regularisation increases from left to right across the plots (a = 0.0, 0.1, 0.25, 0.5). Different metrics are shown from top to bottom: reconstruction error, correlation between derived components and underlying sources, sparsity of components. The true number of underlying sources (K = 50) is denoted by a vertical dashed line. Noise variance was σ^2 = 0.05. Results are shown averaged over 100 noisy realisations of the data.

Acknowledgements

E.T. is supported by funding from the Engineering and Physical Sciences Research Council (EPSRC) and Medical Research Council (MRC) [ONBI CDT, grant number EP/L016052/1]. S.N.S. is also supported by grant [217266/Z/19/Z] from the Wellcome Trust. Data were provided by the developing Human Connectome Project, a KCL-Imperial-Oxford Consortium funded by the European Research Council under the European Union Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement no. [319456]. We are grateful to the families who generously supported this trial. The computations described in this paper were performed using the University of Nottingham’s Augusta HPC service and the Precision Imaging Beacon Cluster, which provide High Performance Computing service to the University’s research community.

Footnotes

1

By “dense” we refer to voxel-wise / vertex-wise representations rather than areal-wise nodes, i.e. N and M are in the order of thousands.

References

  1. Adamson CL, Alexander B, Ball G, Beare R, Cheong JLY, Spittle AJ, Doyle LW, Anderson PJ, Seal ML, Thompson DK. Parcellation of the neonatal cortex using Surface-based Melbourne Children’s Regional Infant Brain atlases (M-CRIB-S) Sci Rep. 2020;10:4359. doi: 10.1038/s41598-020-61326-2.
  2. Alexander B, Loh WY, Matthews LG, Murray AL, Adamson C, Beare R, Chen J, Kelly CE, Anderson PJ, Doyle LW, Spittle AJ, et al. Desikan-Killiany-Tourville Atlas Compatible Version of M-CRIB Neonatal Parcellated Whole Brain Atlas: The M-CRIB 2.0. Front Neurosci. 2019;13 doi: 10.3389/fnins.2019.00034.
  3. Alexander B, Murray AL, Loh WY, Matthews LG, Adamson C, Beare R, Chen J, Kelly CE, Rees S, Warfield SK, Anderson PJ, et al. A new neonatal cortical and subcortical brain atlas: the Melbourne Children’s Regional Infant Brain (M-CRIB) atlas. Neuroimage. 2017;147:841–851. doi: 10.1016/j.neuroimage.2016.09.068.
  4. Andersson JLR, Graham MS, Zsoldos E, Sotiropoulos SN. Incorporating outlier detection and replacement into a non-parametric framework for movement and distortion correction of diffusion MR images. Neuroimage. 2016;141:556–572. doi: 10.1016/j.neuroimage.2016.06.058.
  5. Andersson JLR, Sotiropoulos SN. An integrated approach to correction for off-resonance effects and subject movement in diffusion MR imaging. Neuroimage. 2016;125:1063–1078. doi: 10.1016/j.neuroimage.2015.10.019.
  6. Aurenhammer F. Voronoi diagrams—a survey of a fundamental geometric data structure. ACM Comput. Surv. 1991;23:345–405. doi: 10.1145/116873.116880.
  7. Avants BB, Tustison NJ, Song G, Cook PA, Klein A, Gee JC. A reproducible evaluation of ANTs similarity metric performance in brain image registration. Neuroimage. 2011;54:2033–2044. doi: 10.1016/j.neuroimage.2010.09.025.
  8. Ball G, Beare R, Seal ML. Charting shared developmental trajectories of cortical thickness and structural connectivity in childhood and adolescence. Hum. Brain Mapp. 2019;40:4630–4644. doi: 10.1002/hbm.24726.
  9. Ball G, Pazderova L, Chew A, Tusor N, Merchant N, Arichi T, Allsop JM, Cowan FM, Edwards AD, Counsell SJ. Thalamocortical Connectivity Predicts Cognition in Children Born Preterm. Cereb. Cortex. 2015;25:4310–4318. doi: 10.1093/cercor/bhu331.
  10. Bastiani M, Andersson JLR, Cordero-Grande L, Murgasova M, Hutter J, Price AN, Makropoulos A, Fitzgibbon SP, Hughes E, Rueckert D, Victor S, et al. Automated processing pipeline for neonatal diffusion MRI in the developing Human Connectome Project. Neuroimage. 2019;185:750–763. doi: 10.1016/j.neuroimage.2018.05.064.
  11. Batalle D, Edwards AD, O’Muircheartaigh J. Annual Research Review: Not just a small adult brain: understanding later neurodevelopment through imaging the neonatal brain. J. Child Psychol. Psychiatry. 2018;59:350–371. doi: 10.1111/jcpp.12838.
  12. Batalle D, Hughes EJ, Zhang H, Tournier J-D, Tusor N, Aljabar P, Wali L, Alexander DC, Hajnal JV, Nosarti C, Edwards AD, et al. Early development of structural networks and the impact of prematurity on brain connectivity. Neuroimage. 2017;149:379–392. doi: 10.1016/j.neuroimage.2017.01.065.
  13. Beckmann C, Mackay C, Filippini N, Smith S. Group comparison of resting-state FMRI data using multi-subject ICA and dual regression. Neuroimage. 2009;47:S148. doi: 10.1016/S1053-8119(09)71511-3.
  14. Beckmann CF, Smith SM. Probabilistic Independent Component Analysis for Functional Magnetic Resonance Imaging. IEEE Trans. Med. Imaging. 2004;23:137–152. doi: 10.1109/TMI.2003.822821.
  15. Behrens TEJ, Berg HJ, Jbabdi S, Rushworth MFS, Woolrich MW. Probabilistic diffusion tractography with multiple fibre orientations: What can we gain? Neuroimage. 2007;34:144–155. doi: 10.1016/j.neuroimage.2006.09.018.
  16. Boutsidis C, Gallopoulos E. SVD based initialization: A head start for nonnegative matrix factorization. Pattern Recognit. 2008;41:1350–1362. doi: 10.1016/j.patcog.2007.09.010.
  17. Bozek J, Makropoulos A, Schuh A, Fitzgibbon S, Wright R, Glasser MF, Coalson TS, O’Muircheartaigh J, Hutter J, Price AN, Cordero-Grande L, et al. Construction of a neonatal cortical surface atlas using Multimodal Surface Matching in the Developing Human Connectome Project. Neuroimage. 2018;179:11–29. doi: 10.1016/j.neuroimage.2018.06.018.
  18. Brown CJ, Miller SP, Booth BG, Andrews S, Chau V, Poskitt KJ, Hamarneh G. Structural network analysis of brain development in young preterm neonates. Neuroimage. 2014;101:667–680. doi: 10.1016/j.neuroimage.2014.07.030.
  19. Catani M, Dell’Acqua F, Vergani F, Malik F, Hodge H, Roy P, Valabregue R, Thiebaut de Schotten M. Short frontal lobe connections of the human brain. Cortex. 2012;48:273–291. doi: 10.1016/j.cortex.2011.12.001.
  20. Cichocki A, Phan A-H. Fast Local Algorithms for Large Scale Nonnegative Matrix and Tensor Factorizations. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 2009;E92-A:708–721. doi: 10.1587/transfun.E92.A.708.
  21. Counsell SJ, Edwards AD, Chew ATM, Anjari M, Dyet LE, Srinivasan L, Boardman JP, Allsop JM, Hajnal JV, Rutherford MA, Cowan FM. Specific relations between neurodevelopmental abilities and white matter microstructure in children born preterm. Brain. 2008;131:3201–3208. doi: 10.1093/brain/awn268.
  22. De Groot M, Vernooij MW, Klein S, Ikram MA, Vos FM, Smith SM, Niessen WJ, Andersson JLR. Improving alignment in Tract-based spatial statistics: Evaluation and optimization of image registration. Neuroimage. 2013;76:400–411. doi: 10.1016/j.neuroimage.2013.03.015.
  23. De Lathauwer L, De Moor B, Vandewalle J. A Multilinear Singular Value Decomposition. SIAM J. Matrix Anal. Appl. 2000;21:1253–1278. doi: 10.1137/S0895479896305696.
  24. Deoni SCL, Dean DC, Piryatinsky I, O’Muircheartaigh J, Waskiewicz N, Lehman K, Han M, Dirks H. Breastfeeding and early white matter development: A cross-sectional study. Neuroimage. 2013;82:77–86. doi: 10.1016/j.neuroimage.2013.05.090.
  25. Donahue CJ, Sotiropoulos SN, Jbabdi S, Hernandez-Fernandez M, Behrens TE, Dyrby TB, Coalson T, Kennedy H, Knoblauch K, Van Essen DC, Glasser MF. Using Diffusion Tractography to Predict Cortical Connection Strength and Distance: A Quantitative Comparison with Tracers in the Monkey. J. Neurosci. 2016;36:6758–6770. doi: 10.1523/JNEUROSCI.0493-16.2016.
  26. Dubois J, Dehaene-Lambertz G, Perrin M, Mangin J-F, Cointepas Y, Duchesnay E, Le Bihan D, Hertz-Pannier L. Asynchrony of the early maturation of white matter bundles in healthy infants: Quantitative landmarks revealed noninvasively by diffusion tensor imaging. Hum. Brain Mapp. 2008;29:14–27. doi: 10.1002/hbm.20363.
  27. Févotte C, Idier J. Algorithms for Nonnegative Matrix Factorization with the β-Divergence. Neural Comput. 2011;23:2421–2456. doi: 10.1162/NECO_a_00168.
  28. Fitzgibbon SP, Harrison SJ, Jenkinson M, Baxter L, Robinson EC, Bastiani M, Bozek J, Karolis V, Grande LC, Price AN, Hughes E, et al. The developing Human Connectome Project (dHCP) automated resting-state functional processing framework for newborn infants. bioRxiv. 2019 doi: 10.1101/766030.
  29. Garyfallidis E, Brett M, Correia MM, Williams GB, Nimmo-Smith I. QuickBundles, a Method for Tractography Simplification. Front. Neurosci. 2012;6 doi: 10.3389/fnins.2012.00175.
  30. Girault JB, Munsell BC, Puechmaille D, Goldman BD, Prieto JC, Styner M, Gilmore JH. White matter connectomes at birth accurately predict cognitive abilities at age 2. Neuroimage. 2019;192:145–155. doi: 10.1016/j.neuroimage.2019.02.060.
  31. Glasser MF, Coalson TS, Robinson EC, Hacker CD, Harwell J, Yacoub E, Ugurbil K, Andersson J, Beckmann CF, Jenkinson M, Smith SM, et al. A multi-modal parcellation of human cerebral cortex. Nature. 2016;536:171–178. doi: 10.1038/nature18933.
  32. Glasser MF, Sotiropoulos SN, Wilson JA, Coalson TS, Fischl B, Andersson JL, Xu J, Jbabdi S, Webster M, Polimeni JR, Van Essen DC, et al. The minimal preprocessing pipelines for the Human Connectome Project. Neuroimage. 2013;80:105–124. doi: 10.1016/j.neuroimage.2013.04.127.
  33. Hernandez-Fernandez M, Reguly I, Jbabdi S, Giles M, Smith S, Sotiropoulos SN. Using GPUs to accelerate computational diffusion MRI: From microstructure estimation to tractography and connectomes. Neuroimage. 2019;188:598–615. doi: 10.1016/j.neuroimage.2018.12.015.
  34. Hernández M, Guerrero GD, Cecilia JM, García JM, Inuggi A, Jbabdi S, Behrens TEJ, Sotiropoulos SN. Accelerating Fibre Orientation Estimation from Diffusion Weighted Magnetic Resonance Imaging Using GPUs. PLoS One. 2013;8:e61892. doi: 10.1371/journal.pone.0061892.
  35. Howell BR, Styner MA, Gao W, Yap PT, Wang L, Baluyot K, Yacoub E, Chen G, Potts T, Salzwedel A, Li G, et al. The UNC/UMN Baby Connectome Project (BCP): An overview of the study design and protocol development. Neuroimage. 2019;185:891–905. doi: 10.1016/j.neuroimage.2018.03.049.
  36. Hoyer PO. Non-negative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 2004;5:1457–1469.
  37. Hughes EJ, Winchman T, Padormo F, Teixeira R, Wurie J, Sharma M, Fox M, Hutter J, Cordero-Grande L, Price AN, Allsop J, et al. A dedicated neonatal brain imaging system. Magn. Reson. Med. 2017;78:794–804. doi: 10.1002/mrm.26462.
  38. Hutter J, Price AN, Cordero-Grande L, Malik S, Ferrazzi G, Gaspar A, Hughes EJ, Christiaens D, McCabe L, Schneider T, Rutherford MA, et al. Quiet echo planar imaging for functional and diffusion MRI. Magn. Reson. Med. 2018;79:1447–1459. doi: 10.1002/mrm.26810.
  39. Hyvärinen A, Oja E. Independent component analysis: algorithms and applications. Neural Networks. 2000;13:411–430. doi: 10.1016/S0893-6080(00)00026-5.
  40. Kuklisova-Murgasova M, Quaghebeur G, Rutherford MA, Hajnal JV, Schnabel JA. Reconstruction of fetal brain MRI with intensity matching and complete outlier removal. Med. Image Anal. 2012;16:1550–1564. doi: 10.1016/j.media.2012.07.004.
  41. Kulikova S, Hertz-Pannier L, Dehaene-Lambertz G, Buzmakov A, Poupon C, Dubois J. Multi-parametric evaluation of the white matter maturation. Brain Struct. Funct. 2015;220:3657–3672. doi: 10.1007/s00429-014-0881-y.
  42. Lee DD, Seung HS. Algorithms for non-negative matrix factorization. Advances in Neural Information Processing Systems. 2001:556–562.
  43. Lee DD, Seung HS. Learning the parts of objects by non-negative matrix factorization. Nature. 1999;401:788–791. doi: 10.1038/44565.
  44. Ling RF, Lawson CL, Hanson RJ. Solving Least Squares Problems. J. Am. Stat. Assoc. 1977;72:930. doi: 10.2307/2286501.
  45. Maier-Hein KH, Neher PF, Houde J-C, Côté M-A, Garyfallidis E, Zhong J, Chamberland M, Yeh F-C, Lin Y-C, Ji Q, Reddick WE, et al. The challenge of mapping the human connectome based on diffusion tractography. Nat. Commun. 2017;8:1349. doi: 10.1038/s41467-017-01285-x.
  46. Makropoulos A, Robinson EC, Schuh A, Wright R, Fitzgibbon S, Bozek J, Counsell SJ, Steinweg J, Vecchiato K, Passerat-Palmbach J, Lenz G, et al. The developing human connectome project: A minimal processing pipeline for neonatal cortical surface reconstruction. Neuroimage. 2018;173:88–112. doi: 10.1016/j.neuroimage.2018.01.054.
  47. Marcus DS, Harwell J, Olsen T, Hodge M, Glasser MF, Prior F, Jenkinson M, Laumann T, Curtiss SW, Van Essen DC. Informatics and Data Mining Tools and Strategies for the Human Connectome Project. Front. Neuroinform. 2011;5:4. doi: 10.3389/fninf.2011.00004.
  48. Mars RB, O’Muircheartaigh J, Folloni D, Li L, Glasser MF, Jbabdi S, Bryant KL. Concurrent analysis of white matter bundles and grey matter networks in the chimpanzee. Brain Struct. Funct. 2019;224:1021–1033. doi: 10.1007/s00429-018-1817-8.
  49. Mars RB, Sotiropoulos SN, Passingham RE, Sallet J, Verhagen L, Khrapitchev AA, Sibson N, Jbabdi S. Whole brain comparative anatomy using connectivity blueprints. Elife. 2018;7:245209. doi: 10.7554/eLife.35237.
  50. McKeown MJ, Makeig S, Brown GG, Jung TP, Kindermann SS, Bell AJ, Sejnowski TJ. Analysis of fMRI data by blind separation into independent spatial components. Hum. Brain Mapp. 1998;6:160–188. doi: 10.1002/(SICI)1097-0193(1998)6:3<160::AID-HBM5>3.0.CO;2-1.
  51. Nickerson LD, Smith SM, Öngür D, Beckmann CF. Using Dual Regression to Investigate Network Shape and Amplitude in Functional Connectivity Analyses. Front. Neurosci. 2017;11:115. doi: 10.3389/fnins.2017.00115.
  52. O’Donnell LJ, Westin CF. Automatic tractography segmentation using a high-dimensional white matter atlas. IEEE Trans. Med. Imaging. 2007;26:1562–1575. doi: 10.1109/TMI.2007.906785.
  53. O’Muircheartaigh J, Jbabdi S. Concurrent white matter bundles and grey matter networks using independent component analysis. Neuroimage. 2017;170:296–306. doi: 10.1016/j.neuroimage.2017.05.012.
  54. Oishi K, Mori S, Donohue PK, Ernst T, Anderson L, Buchthal S, Faria A, Jiang H, Li X, Miller MI, van Zijl PCM, et al. Multi-contrast human neonatal brain atlas: Application to normal neonate development analysis. Neuroimage. 2011;56:8–20. doi: 10.1016/j.neuroimage.2011.01.051.
  55. Ouyang M, Dubois J, Yu Q, Mukherjee P, Huang H. Delineation of early brain development from fetuses to infants with diffusion MRI and beyond. Neuroimage. 2019;185:836–850. doi: 10.1016/j.neuroimage.2018.04.017.
  56. Partridge SC, Mukherjee P, Henry RG, Miller SP, Berman JI, Jin H, Lu Y, Glenn OA, Ferriero DM, Barkovich AJ, Vigneron DB. Diffusion tensor imaging: Serial quantitation of white matter tract maturity in premature newborns. Neuroimage. 2004;22:1302–1314. doi: 10.1016/j.neuroimage.2004.02.038.
  57. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, et al. Scikit learn: Machine Learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830.
  58. Robinson EC, Garcia K, Glasser MF, Chen Z, Coalson TS, Makropoulos A, Bozek J, Wright R, Schuh A, Webster M, Hutter J, et al. Multimodal surface matching with higher-order smoothness constraints. Neuroimage. 2018;167:453–465. doi: 10.1016/j.neuroimage.2017.10.037.
  59. Robinson EC, Jbabdi S, Glasser MF, Andersson J, Burgess GC, Harms MP, Smith SM, Van Essen DC, Jenkinson M. MSM: A new flexible framework for multimodal surface matching. Neuroimage. 2014;100:414–426. doi: 10.1016/j.neuroimage.2014.05.069.
  60. Rousseeuw PJ. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987;20:53–65. doi: 10.1016/0377-0427(87)90125-7.
  61. Saito N, Larson BM, Benichou B. Sparsity vs. statistical independence from a best-basis viewpoint. In: Aldroubi A, Laine AF, Unser MA, editors. Wavelet Applications in Signal and Image Processing VIII. 2000. p. 474.
  62. Salimi-Khorshidi G, Douaud G, Beckmann CF, Glasser MF, Griffanti L, Smith SM. Automatic denoising of functional MRI data: Combining independent component analysis and hierarchical fusion of classifiers. Neuroimage. 2014;90:449–468. doi: 10.1016/j.neuroimage.2013.11.046.
  63. Schilling K, Gao Y, Janve V, Stepniewska I, Landman BA, Anderson AW. Confirmation of a gyral bias in diffusion MRI fiber tractography. Hum. Brain Mapp. 2018;39:1449–1466. doi: 10.1002/hbm.23936.
  64. Serag A, Aljabar P, Ball G, Counsell SJ, Boardman JP, Rutherford MA, Edwards AD, Hajnal J V, Rueckert D. Construction of a consistent high-definition spatio-temporal atlas of the developing brain using adaptive kernel regression. Neuroimage. 2012;59:2255–2265. doi: 10.1016/j.neuroimage.2011.09.062.
  65. Siless V, Chang K, Fischl B, Yendiki A. AnatomiCuts: Hierarchical clustering of tractography streamlines based on anatomical similarity. Neuroimage. 2018;166:32–45. doi: 10.1016/j.neuroimage.2017.10.058.
  66. Smith RE, Tournier JD, Calamante F, Connelly A. Anatomically-constrained tractography: Improved diffusion MRI streamlines tractography through effective use of anatomical information. Neuroimage. 2012;62:1924–1938. doi: 10.1016/j.neuroimage.2012.06.005.
  67. Smith SM, Hyvärinen A, Varoquaux G, Miller KL, Beckmann CF. Group-PCA for very large fMRI datasets. Neuroimage. 2014;101:738–749. doi: 10.1016/j.neuroimage.2014.07.051.
  68. Sotiras A, Resnick SM, Davatzikos C. Finding imaging patterns of structural covariance via Non-Negative Matrix Factorization. Neuroimage. 2015;108:1–16. doi: 10.1016/J.NEUROIMAGE.2014.11.045.
  69. Sotiras A, Toledo JB, Gur RE, Gur RC, Satterthwaite TD, Davatzikos C. Patterns of coordinated cortical remodeling during adolescence and their associations with functional specialization and evolutionary expansion. Proc Natl Acad Sci U S A. 2017;114:3527–3532. doi: 10.1073/pnas.1620928114.
  70. Sotiropoulos SN, Hernández-Fernández M, Vu AT, Andersson JL, Moeller S, Yacoub E, Lenglet C, Ugurbil K, Behrens TEJ, Jbabdi S. Fusion in diffusion MRI for improved fibre orientation estimation: An application to the 3T and 7T data of the Human Connectome Project. Neuroimage. 2016;134:396–409. doi: 10.1016/j.neuroimage.2016.04.014.
  71. Tam EWY, Chau V, Barkovich AJ, Ferriero DM, Miller SP, Rogers EE, Grunau RE, Synnes AR, Xu D, Foong J, Brant R, et al. Early postnatal docosahexaenoic acid levels and improved preterm brain development. Pediatr. Res. 2016;79:723–730. doi: 10.1038/pr.2016.11.
  72. Thompson E, Robinson E, Bozek J, Jbabdi S, Bastiani M, Sotiropoulos SN. Annual Meeting of the Organization for Human Brain Mapping. Rome: 2019. Exploring the Gyral Bias on White Matter Tractography in Neonates.
  73. Van Essen DC, Glasser MF. Parcellating Cerebral Cortex: How Invasive Animal Studies Inform Noninvasive Mapmaking in Humans. Neuron. 2018;99:640–663. doi: 10.1016/j.neuron.2018.07.002.
  74. Van Essen DC, Jbabdi S, Sotiropoulos SN, Chen C, Dikranian K, Coalson T, Harwell J, Behrens TEJ, Glasser MF. Diffusion MRI: From Quantitative Measurement to In Vivo Neuroanatomy. Second Edition. Academic Press; 2013. Mapping Connections in Humans and Non-Human Primates. Aspirations and Challenges for Diffusion Imaging; pp. 337–358.
  75. Warrington S, Bryant KL, Khrapitchev AA, Sallet J, Charquero-Ballester M, Douaud G, Jbabdi S, Mars RB, Sotiropoulos SN. XTRACT - Standardised protocols for automated tractography in the human and macaque brain. Neuroimage. 2020;217:804641. doi: 10.1016/j.neuroimage.2020.116923.
  76. Wold S, Esbensen K, Geladi P. Principal component analysis. Chemom. Intell. Lab. Syst. 1987;2:37–52. doi: 10.1016/0169-7439(87)80084-9.
  77. Wu L, Calhoun VD, Jung RE, Caprihan A. Connectivity-based whole brain dual parcellation by group ICA reveals tract structures and decreased connectivity in schizophrenia. Hum. Brain Mapp. 2015;36:4681–4701. doi: 10.1002/hbm.22945.
