PLOS One
. 2022 Aug 2;17(8):e0268707. doi: 10.1371/journal.pone.0268707

Prediction of fluid intelligence from T1-w MRI images: A precise two-step deep learning framework

Mingliang Li 1,2, Mingfeng Jiang 1,*, Guangming Zhang 1, Yujun Liu 1, Xiaobo Zhou 3,*
Editor: Yiming Tang4
PMCID: PMC9345352  PMID: 35917308

Abstract

The Adolescent Brain Cognitive Development (ABCD) Neurocognitive Prediction Challenge (ABCD-NP-Challenge) is a community-driven competition that challenges competitors to develop algorithms that predict fluid intelligence scores from T1-w MRI images. In this work, a two-step deep learning pipeline is proposed to improve the prediction accuracy of fluid intelligence scores. In the first step, the main contributions of this study are as follows: (1) the concepts of the residual network (ResNet) and the squeeze-and-excitation network (SENet) are utilized to improve the original 3D U-Net; (2) in the segmentation process, the pixels in symmetrical brain regions are assigned the same label; (3) to remove redundant background information from the segmented regions of interest (ROIs), a minimum bounding cube (MBC) is used to enclose the ROIs. This new segmentation structure greatly improves the segmentation performance of brain ROIs compared with the classical convolutional neural network (CNN), yielding a Dice coefficient of 0.8920. In the second step, MBCs are used to train neural network regression models for enhanced nonlinearity. The fluid intelligence score predictions of the proposed method are superior to those of current state-of-the-art approaches, and the proposed method achieves a mean square error (MSE) of 82.56 on a test data set, a very competitive performance.

1 Introduction

Understanding cognitive development in children may potentially improve their health outcomes through adolescence. Thus, determining the neural mechanism underlying general intelligence is a critical task. Fluid intelligence is one crucial component of general human intelligence, which involves the capacity to think logically and solve problems in novel situations and is independent of acquired knowledge [1]. It has been widely accepted that fluid intelligence reaches a peak in late adolescence, after which it declines. Thus, its quantification and accurate prediction are important for teenagers, as it foresees creative achievement, scholastic performance, employment prospects, socioeconomic status, etc., in their future years. Structural and functional magnetic resonance imaging (MRI) images are one of the most powerful tools to help predict fluid intelligence. Aiming at the precise prediction of fluid intelligence scores, the ABCD dataset provides data and MRI images of a large number of adolescent participants, which have been adjusted for different data collection sites, demographic variables, and whole brain volumes.

The study of fluid intelligence has traditionally been concerned with the identification of the underlying mechanism responsible for cognitive ability. The research results indicate a strong correlation between brain volume and intelligence, and the magnitude of this effect is likely large [2]. More recently, MRIs have been shown to contain useful structural information with a strong correlation to fluid intelligence [3]. In the most related work, the reference [4] has outlined machine learning approaches employed to predict fluid intelligence from brain MRI data.

The traditional method for the prediction of fluid intelligence scores is to calculate features extracted with the assistance of existing computer-aided tools, and then to train a machine learning model on these features. FreeSurfer extracts the volume and thickness features describing the brain structure, which can provide more information for the prediction of fluid intelligence scores [5]. The National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA) pipeline can be used to complete brain image noise reduction, correction, and feature extraction [6]. Moreover, the subcortical regions of subjects have been segmented by FSL FIRST; these regions were mainly cortical and did not include any subcortical regions of interest (ROIs) [7]. Furthermore, brain global shape features have been calculated via the implementation of the Insight Segmentation and Registration Toolkit (ITK) [8].

In recent years, deep learning methods have emerged as state-of-the-art solutions to many problems spanning various domains, such as natural language processing, bioinformatics, and medical imaging [9]. The convolutional neural network (CNN), a type of deep learning model, has been a useful tool for the analysis of image data [10]. Some studies have utilized structural MRI images to predict fluid intelligence scores, and brain volume has been demonstrated to be related to quantitative reasoning and working memory [11]. Moreover, a novel framework has been proposed for the estimation of a subject’s intelligence score via sparse learning based on neuroimaging features [12].

The use of traditional deep learning methods for fluid intelligence score regression is characterized by the following weaknesses. (1) The structure of the segmentation model is not well optimized: first, the high-layer and low-layer features of the segmentation framework are not fused [13], so a substantial amount of spatial information in the image is lost in the lower layers; second, no attention mechanism is introduced into the segmentation model. An attention mechanism guides the segmentation model by assigning higher weights to salient features and lower weights to irrelevant ones. (2) The segmentation results are not used to train a neural network; instead, machine learning methods are used [14]. Thus, traditional methods cannot fit the intelligence score well, resulting in unsatisfactory prediction accuracy.

In the present work, T1-weighted MRI images of adolescents are utilized to predict their fluid intelligence scores with a novel precise two-step deep learning framework. The main contributions of this work are three-fold: (1) residualized fluid intelligence scores are predicted based on an improved 3D U-Net architecture that utilizes the concepts of the residual network (ResNet), squeeze-and-excitation network (SENet), and symmetry learning mechanism; (2) the pixels in symmetrical brain regions share the same label, and the minimum bounding cube (MBC) operation is employed to eliminate interference from the background; (3) more accurate and stable results are obtained via fine segmentation, and these results are more helpful for improving the prediction accuracy of fluid intelligence scores.

2 Data

2.1 Dataset

Data were provided by the 2019 ABCD Neurocognitive Prediction Challenge (ABCD-NP-Challenge) [15], and included data on children aged 9–10. Participants were given access to T1-weighted MRI scans from 3739 children for training, scans from 415 children for validation, and scans from 4515 children for testing. The fluid intelligence scores recorded by the ABCD study were measured via the NIH Toolbox Neurocognition battery, as detailed in the electronic S1 File (ESI). To minimize the impact of confounds that are not related to the brain structure, the raw scores were pre-residualized by the ABCD-NP-Challenge organizers based on sex at birth, ethnicity, highest parental education, parental income, parental marital status, brain volume, and image acquisition site.

The 3D T1-w MRI images were pre-processed by the challenge organizers. The pre-processing first transformed the raw data into the NIfTI format [16]. A brain mask was then created by majority voting among the outputs of a series of neuroimaging tools, including FSL BET, AFNI 3dSkullStrip, FreeSurfer mri_gcut, and Robust Brain Extraction (ROBEX); the resulting final mask removed noise, and the volumes were corrected for bias-field inhomogeneities. Based on the final masks, each T1-w MRI image was segmented into gray matter, white matter, and cerebrospinal fluid via Atropos [17]. Afterwards, the skull-stripped T1-w image and the corresponding gray matter segmentation were affinely mapped to the SRI24 atlas [18].

Finally, 122 brain regions of interest (ROIs) were extracted by the challenge organizers based on the SRI24 atlas. Of these, 14 ROIs with distinctive anatomical characteristics and roles in cognitive function were selected to predict the fluid intelligence score, as described in the next section.

2.2 Selected brain regions

Most of the 14 ROIs analyzed by the proposed method have previously been reported to be highly associated with cognitive ability, as shown in Table 1. It has been found that the hippocampus, an important component of the limbic system, plays an important role in memory and spatial navigation, and the thalamus is conceptualized as a switchboard that processes and relays sensory information [1, 19, 20]. The inferior frontal gyrus has also been found to be related to semantic task processing. Moreover, recent views of thalamic function emphasize integrative roles in cognition, ranging from learning and memory to flexible adaptation [21]. The caudate nucleus is related to cognitive tasks such as organizing behavioral responses and using verbal skills in problem-solving [22]. Considerable evidence suggests that the human amygdala plays an important role in higher cognitive functions in addition to its well-known role in emotional processing [23].

Table 1. The labels and names of the ROIs in the SRI24 space.

ROIs Label Name
L Inferior frontal gyrus—opercular 11 Frontal gyrus
R Inferior frontal gyrus—opercular 12
L Inferior frontal gyrus—triangular 13
R Inferior frontal gyrus—triangular 14
L Inferior frontal gyrus—orbital 15
R Inferior frontal gyrus—orbital 16
L Hippocampus 37 Hippocampus
R Hippocampus 38
L Amygdala 41 Amygdala
R Amygdala 42
L Caudate nucleus 71 Caudate nucleus
R Caudate nucleus 72
L Thalamus 77 Thalamus
R Thalamus 78

L/R indicates a location in the left/right hemisphere.

Anatomically, the frontal gyrus comprises six of the selected regions (L/R inferior frontal gyrus—opercular, L/R inferior frontal gyrus—triangular, and L/R inferior frontal gyrus—orbital), while the hippocampus, amygdala, caudate nucleus, and thalamus each comprise two regions, the left and right structures.
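As a concrete illustration, grouping the 14 SRI24 labels into five symmetric categories can be sketched as a simple label-mapping step. The SRI24 label numbers are those listed in Table 1; the category indices 1–5 are our own convention, not taken from the paper:

```python
import numpy as np

# Hypothetical mapping: left/right (symmetric) regions share one category label.
SRI24_TO_CATEGORY = {
    11: 1, 12: 1, 13: 1, 14: 1, 15: 1, 16: 1,  # frontal gyrus (6 sub-regions)
    37: 2, 38: 2,                              # hippocampus L/R
    41: 3, 42: 3,                              # amygdala L/R
    71: 4, 72: 4,                              # caudate nucleus L/R
    77: 5, 78: 5,                              # thalamus L/R
}

def merge_labels(label_volume: np.ndarray) -> np.ndarray:
    """Map a volume of SRI24 labels to the 5 merged categories (0 = background)."""
    out = np.zeros_like(label_volume)
    for sri_label, category in SRI24_TO_CATEGORY.items():
        out[label_volume == sri_label] = category
    return out
```

With this mapping, the left and right hippocampus (labels 37 and 38) both receive label 2, so the segmentation network treats them as one class.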

3 Methodology

3.1 Symmetry learning mechanism

The vertebrate cerebrum (brain) is formed by two cerebral hemispheres separated by a groove, the longitudinal fissure; the brain can thus be described as divided into left and right cerebral hemispheres. Macroscopically, the hemispheres are roughly mirror images of each other, with only subtle differences. On the microscopic level, however, the cytoarchitecture of the cerebral cortex, neurotransmitter levels, and receptor subtypes are markedly asymmetrical between the hemispheres [24, 25]. While some of these hemispheric differences are consistent across human beings, many vary from individual to individual [26]. It is precisely because the hemispheres are roughly mirror images of each other that the pixels in symmetrical brain regions were assigned the same label in the segmentation process. The ROIs present macroscopic symmetry; specific details are shown in Fig 1.

Fig 1. Example segmentation of the ROIs.

Fig 1

3.2 Technical details

The traditional pipeline for the regression of brain MRI images onto fluid intelligence scores is based on an original deep learning architecture plus post-processing. First, in the pre-processing stage, the segmentation framework needs further optimization. Second, the post-processing stage either uses only the median predicted score as the final prediction or extracts only high-level feature-map information for regression, which discards information produced in the pre-processing stage. Finally, the fluid intelligence score regression is performed for only one ROI at a time, which ignores the interactions between brain areas [13].

To improve the prediction accuracy of fluid intelligence scores, a two-step deep learning network is proposed. This network was inspired by the original 3D U-Net architecture, ResNet, SENet, and the symmetry learning mechanism. The improved 3D U-Net can perform accurate 3D segmentation tasks, after which the fluid intelligence score is predicted based on the feature of each finely segmented brain ROI.

A. Segmentation stage

The brain MRIs of subjects have high similarity, and the original 3D U-Net is able to extract a large number of features. However, due to the existence of individual differences, it is necessary to further enhance the attention mechanism of the network to obtain more refined segmentation results.

While deeper networks can extract more structural information, they lose more local information due to the continuous reduction of the feature map resolution. The architecture of the proposed network is illustrated in Fig 2, and consists of four layers including the bottleneck. It is assumed that the use of four layers is sufficient to extract more location information. The motivation behind the proposed architecture is to improve the attention mechanism.

Fig 2. The improved 3D U-Net architecture.

Fig 2

Blue boxes represent feature maps, and the number of channels is denoted after the size of each feature map. A skip connection is included in the bottleneck between the encoder and decoder layers. ⊕ refers to element-wise addition in the selected channel. The predicted labels are compared with the ground truth to calculate the Dice loss.

Therefore, skip connections in the bottleneck between the encoder and decoder layers, the recombination block, and the SegS-E block are added to the network architecture to improve its ability to capture spatial and spectral information; more details are provided in the next subsection.

The encoder takes a 3D input patch with a size of 112 × 112 × 112 from the set of input images. In the first layer, 16-channel 112 × 112 × 112 feature maps are generated with a 1 × 1 × 1 convolution operation, and 32-channel 112 × 112 × 112 feature maps are generated by the subsequent recombination block. The number of feature maps is increased in the subsequent layers to learn deep features, followed by max-pooling to down-sample the features in the encoding layers. To match the channel dimensions, the 64-channel 28 × 28 × 28 feature maps are transformed into 96-channel 28 × 28 × 28 feature maps by a 1 × 1 × 1 convolution. The skip connection performs an element-wise addition (⊕) in the selected channels, ensuring that the volumes at this addition are the same size. Similarly, in the decoding layers, the feature maps are upsampled. In the output layer, the segmentation map predicted by the model is compared with the corresponding ground truth, and the error is backpropagated.

1) Recombination block. The concepts of ResNet and SENet are referenced to construct the recombination block.

With ResNet, the gradients can flow backward directly through the skip connections from the later layers to the initial filters, which can effectively reduce model overfitting [27]. The recombination block aims to enhance the semantic information between different feature layers; primarily, more convolutions and nonlinear transformations are performed so that the model can adapt to its own structure during training.

In the recombination block, the convolution, batch normalization (BN), and SegS-E block feature extraction modules are used. For the convenience of explanation, the labels B1-B8 are denoted on the blue cube representing the feature map shown in Fig 3. The SegS-E block in the recombination block is specifically introduced in the next section.

Fig 3. The recombination block.

Fig 3

“Normal conv” refers to 3 × 3 × 3 convolution filters and batch normalization; “1 × 1 × 1 conv” refers to 1 × 1 × 1 convolution filters.

In the present work, the recombination block is formally defined as follows.

B8 = F{B1, B2, …, B6} + B7. (1)

In the following notation, B1 ∈ R^(D×H×W×C) and B8 ∈ R^(D×H×W×C) are used to denote the input and the output of the recombination block, respectively. Moreover, the addition is element-wise: the two matrices have the same dimensions, and each element of the sum is the sum of the corresponding elements of the two matrices.

The function F{…} represents the residual mapping to be learned, specifically, the convolution operations on the feature maps. A convolution with 1 × 1 × 1 kernels is performed on B1, resulting in the feature map B2 ∈ R^(D×H×W×4C) with 4C channels. B3 ∈ R^(D×H×W×4C) is obtained by performing a 3 × 3 × 3 convolution and batch normalization. Then, B5 ∈ R^(D×H×W×4C) is obtained by applying the SegS-E block to B4; the specific definition of the SegS-E block is provided in the next subsection. Finally, B6 ∈ R^(D×H×W×C) is obtained by a 1 × 1 × 1 convolution, and B7 ∈ R^(D×H×W×C) is similarly obtained by a 1 × 1 × 1 convolution.
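A minimal PyTorch sketch of the recombination block under these definitions follows. The B3 → B4 step is not fully specified in the text, so a ReLU is assumed there, and the SegS-E attention step is stubbed with `nn.Identity()` for brevity:

```python
import torch
import torch.nn as nn

class RecombinationBlock(nn.Module):
    """Sketch of the recombination block (Eq. 1): B8 = F{B1,...,B6} + B7.

    Assumptions not fixed by the paper: the B3 -> B4 step is taken to be a
    ReLU, and the SegS-E attention step is stubbed with nn.Identity().
    """
    def __init__(self, channels: int):
        super().__init__()
        expanded = 4 * channels
        self.expand = nn.Conv3d(channels, expanded, kernel_size=1)    # B1 -> B2
        self.conv = nn.Sequential(                                    # B2 -> B3
            nn.Conv3d(expanded, expanded, kernel_size=3, padding=1),
            nn.BatchNorm3d(expanded),
        )
        self.act = nn.ReLU(inplace=True)                              # B3 -> B4 (assumed)
        self.attention = nn.Identity()                                # B4 -> B5 (SegS-E stub)
        self.reduce = nn.Conv3d(expanded, channels, kernel_size=1)    # B5 -> B6
        self.shortcut = nn.Conv3d(channels, channels, kernel_size=1)  # B1 -> B7

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = self.reduce(self.attention(self.act(self.conv(self.expand(x)))))
        return residual + self.shortcut(x)  # B8 = F{...} + B7
```

Note that the input and output both have C channels, so the block can be dropped into the encoder or decoder without changing the surrounding shapes.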

2) SegS-E block. Via the introduction of an attention mechanism, useful features can be captured more accurately. In the proposed method, a fine-grained feature enhancement method with different weights at different locations is employed, as shown in Fig 4. The attention mechanism of the squeeze-and-excitation network (SENet) is used in the SegS-E block [28]. Dilated convolutions "inflate" the kernel by inserting holes between the kernel elements; dilated convolution is utilized in the SegS-E block because it systematically aggregates the contextual information of the input. This allows knowledge from a wider context to be integrated at low cost while keeping the output resolution high. The goal is to increase the sensitivity of the network by explicitly modeling channel interdependencies via gated networks. Consequently, the SegS-E block learns the importance of each feature map in the stack of feature maps extracted after a convolution operation, and recalibrates the output to reflect that importance before passing the information to the next layer.

Fig 4. The SegS-E block.

Fig 4

“Dilated conv” refers to a 3 × 3 × 3 kernel with a dilation rate of 2, and “1 × 1 × 1 conv” refers to 1 × 1 × 1 convolution filters.

The details of the SegS-E block are as follows. First, the size of the input feature map is D × H × W, and the number of channels is C. A dilated convolution with a 3 × 3 × 3 kernel and a dilation rate of 2 is then performed over each channel of the input feature maps. Next, a 1 × 1 × 1 kernel is used to perform channel-wise convolution, resulting in a set of feature maps with C channels. Finally, the elements of the input feature map are multiplied element-wise by the elements of the processed feature map.

A SegS-E block is a squeeze-and-excitation computational unit, which uses a 3D dilated convolution and a 1 × 1 × 1 convolution to map an input X ∈ R^(D×H×W×C) to feature maps W ∈ R^(D×H×W×C), after which a Hadamard product is used to map X and W to X̃ ∈ R^(D×H×W×C) as the output. In the following notation, the dilated convolution operation is considered, and V = [v1, v2, …, vC] is used to denote the learned set of filter kernels, where vc refers to the parameters of the c-th dilated convolution filter. Thus, U can be defined as U = [u1, u2, …, uC], where

uc = vc ∗l X = Σ_{s=1}^{C′} vc^s ∗l x^s, (2)

where ∗l denotes the dilated convolution, and vc is a 3D kernel that acts on the corresponding channel of X. Subsequently, a convolution operation is performed on U with C 1 × 1 × 1 convolution kernels, resulting in feature maps W ∈ R^(D×H×W×C) with C channels, where W = [w1, w2, …, wC]. To systematically aggregate the contextual information of the input, a Hadamard product operation is used to map X and W to X̃ ∈ R^(D×H×W×C) as the output. The final output of the SegS-E block is

X̃ = W ⊗ X, with x̃c = wc ⊗ xc for c = 1, …, C, (3)

where X̃ = [x̃1, x̃2, …, x̃C], and ⊗ refers to the Hadamard product. In mathematics, the Hadamard product of two matrices of the same dimensions has the same dimensions as the operands; each element is the product of the corresponding elements of the original two matrices.
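Under these definitions, the SegS-E block can be sketched in PyTorch as a channel-wise (grouped) dilated convolution followed by a 1 × 1 × 1 convolution and a Hadamard product with the input. The sigmoid gate on W is our assumption, following the gating used in SENet; the text itself does not name the gating nonlinearity:

```python
import torch
import torch.nn as nn

class SegSEBlock(nn.Module):
    """Sketch of the SegS-E block: a 3x3x3 dilated (rate 2) channel-wise
    convolution (Eq. 2), a 1x1x1 convolution producing the weight maps W,
    and a Hadamard product with the input (Eq. 3)."""
    def __init__(self, channels: int):
        super().__init__()
        # Depthwise dilated conv: each kernel acts on its corresponding channel.
        self.dilated = nn.Conv3d(channels, channels, kernel_size=3,
                                 padding=2, dilation=2, groups=channels)  # U = V *l X
        self.pointwise = nn.Conv3d(channels, channels, kernel_size=1)     # U -> W
        self.gate = nn.Sigmoid()  # assumed, in the spirit of SENet gating

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.gate(self.pointwise(self.dilated(x)))
        return x * w  # element-wise (Hadamard) recalibration, Eq. (3)
```

With `padding=2` and `dilation=2`, the dilated convolution preserves the D × H × W spatial size, so the output X̃ matches the input shape exactly.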

B. Regression stage

In Section 3.2-A, we described the segmentation framework. In this section, we first define the minimum bounding cube (MBC) of a segmented ROI, and then describe the process of building a neural network that takes resized MBCs as input to predict fluid intelligence scores.

  1. Minimum bounding cube (MBC). To improve the prediction accuracy of fluid intelligence scores, we have performed minimum bounding cube (MBC) operations on the segmented ROIs in place of the traditional resizing operation. An MBC is generated as follows: (1) the minimum bounding box (MBB) of the ROI is generated [29]; (2) Lmax, the longest edge of the MBB, is determined; (3) the MBB is resized to Lmax × Lmax × Lmax; (4) this Lmax × Lmax × Lmax volume is resized again to a cube of a fixed size, resulting in the MBC. An MBC is thus obtained by two interpolation operations, whereas traditional resizing requires only one. In this work, the input size of the regression model is 64 × 64 × 64.
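The MBC generation steps above can be sketched as follows. Nearest-neighbour resampling is used here only to keep the sketch dependency-free; the paper does not specify its interpolation scheme:

```python
import numpy as np

def _resize_nn(vol: np.ndarray, shape: tuple) -> np.ndarray:
    """Nearest-neighbour resampling of a 3D volume to a target shape."""
    idx = [np.minimum((np.arange(n) * s) // n, s - 1)
           for n, s in zip(shape, vol.shape)]
    return vol[np.ix_(*idx)]

def minimum_bounding_cube(mask: np.ndarray, out_size: int = 64) -> np.ndarray:
    """Sketch of the MBC operation on a binary ROI mask: crop the minimum
    bounding box (MBB), resize it to Lmax^3 (Lmax = longest MBB edge),
    then resize again to out_size^3 -- i.e., two interpolation operations."""
    coords = np.argwhere(mask > 0)
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1
    box = mask[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]   # MBB crop
    lmax = max(box.shape)                                # longest edge
    cube = _resize_nn(box, (lmax, lmax, lmax))           # first interpolation
    return _resize_nn(cube, (out_size,) * 3)             # second interpolation
```

Cropping to the MBB first discards the background voxels, so the interpolation budget is spent entirely on the ROI itself.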

  2. Neural network construction. In the second stage, to explore the relationship between brain MRI volumes by incorporating morphological information and fluid intelligence scores, a convolutional neural network (CNN) was constructed to map each subject to the corresponding fluid intelligence score based on the ROI segmentation [30].

The greatest advantage of deep learning algorithms compared with traditional machine learning models is that they learn high-level features from data in an incremental manner. In the regression stage, convolution, BN, ReLU, and flatten operations were conducted, as shown in Fig 5. The inputs of the regression model were the resized MBCs of the ROIs, each with a size of 64 × 64 × 64, consisting of the frontal gyrus, hippocampus, amygdala, caudate nucleus, and thalamus, as indicated by the orange cubes in Fig 5. The dropout rate was set to 0.5.

Fig 5. Creating a neural network to predict fluid intelligence score.

Fig 5

The “Flatten” operation converts the data into a one-dimensional array for input to the next layer. “Conv(3 × 3 × 3) + BN + ReLU” refers to 3 × 3 × 3 convolution filters followed by batch normalization (BN) and a rectified linear unit (ReLU).

Convolution with a kernel size of 3 × 3 × 3 was applied, resulting in feature maps with a size of 64 × 64 × 64 and 10 channels. By performing the same set of operations, feature maps with a size of 64 × 64 × 64 × 20 were obtained. Further, feature maps with a size of 64 × 64 × 64 × 1 were obtained, as indicated by the purple cube in Fig 5.

After flattening, the feature map was passed through a fully connected network. The dimensions of the three fully connected layers were 262,144, 4096, and 64, respectively. Finally, the network mapped the flattened features to the fluid intelligence score.
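A sketch of the regression network with these shapes is given below. The conv channel counts (10, 20, 1), FC sizes (262,144, 4096, 64), and dropout rate (0.5) follow the text; feeding the five ROI cubes as five input channels and the placement of BN/ReLU are our assumptions. The `side` parameter is included so the sketch can be tried at smaller resolutions (the paper uses 64):

```python
import torch
import torch.nn as nn

class FluidIQRegressor(nn.Module):
    """Sketch of the second-stage regression CNN on resized MBC inputs."""
    def __init__(self, in_channels: int = 5, side: int = 64, hidden: int = 4096):
        super().__init__()
        def conv_bn_relu(cin, cout):
            return nn.Sequential(nn.Conv3d(cin, cout, 3, padding=1),
                                 nn.BatchNorm3d(cout), nn.ReLU(inplace=True))
        # 10 -> 20 -> 1 channel feature extraction at full resolution
        self.features = nn.Sequential(conv_bn_relu(in_channels, 10),
                                      conv_bn_relu(10, 20),
                                      conv_bn_relu(20, 1))
        # flatten side^3 features (64^3 = 262,144), then FC layers 4096 -> 64 -> score
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(side ** 3, hidden), nn.ReLU(inplace=True), nn.Dropout(0.5),
            nn.Linear(hidden, 64), nn.ReLU(inplace=True),
            nn.Linear(64, 1),  # residualized fluid intelligence score
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.regressor(self.features(x))
```

At the paper's full 64³ resolution, the first FC layer alone holds 262,144 × 4096 weights, which dominates the memory footprint of the model.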

4 Experiment

In this section, we first present the materials and experimental settings used in our study, and then the quantitative evaluation metrics for the segmentation results.

A. Experimental settings

In this work, the segmentation and regression components of the framework were trained separately. The segmentation model was trained first; once its training was completed, the regression model was trained on the segmented ROIs. In the first step, the improved 3D U-Net was trained, and we compare it with conventional counterparts in the experiments. In the second step, the CNN was trained for the regression of fluid intelligence scores, and we compare it with conventional machine learning methods.

Model training was carried out with 10 RTX 2080Ti 11 GB GPUs. The raw volumes of size 240 × 240 × 240 were cropped to 224 × 224 × 224 by removing the outermost pixels, which contain only background information. Due to GPU memory limitations, a patch size of 112 × 112 × 112 was adopted, with a batch size of 10. Patches were randomly selected from the training data, and each epoch was set to 200 iterations, i.e., 200 × 10 patches were effectively selected per epoch. In both the segmentation and regression stages, we resampled the training, test, and validation sets 5 times separately and performed the same training and testing procedures on each resampled data set.
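The cropping and patch-sampling procedure described above can be sketched as follows (a centre crop is assumed, as the exact cropping scheme is not specified):

```python
import numpy as np

rng = np.random.default_rng(0)

def crop_and_sample_patch(volume: np.ndarray, crop: int = 224, patch: int = 112):
    """Strip the outermost background voxels of a 240^3 volume down to
    crop^3 (centre crop assumed), then draw one random patch^3 training patch."""
    margin = (volume.shape[0] - crop) // 2
    vol = volume[margin:margin + crop,
                 margin:margin + crop,
                 margin:margin + crop]
    # uniformly random corner so the patch lies fully inside the cropped volume
    starts = rng.integers(0, crop - patch + 1, size=3)
    return vol[starts[0]:starts[0] + patch,
               starts[1]:starts[1] + patch,
               starts[2]:starts[2] + patch]
```

Drawing 10 such patches forms one batch; 200 iterations of this per epoch yields the 200 × 10 patches mentioned above.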

B. Quantitative evaluation metrics

To evaluate the performance of our segmentation approach against its counterparts, we implemented the following evaluation metrics [31]. The Dice coefficient (DC) is used as the first evaluation metric; it is defined as the region-based similarity between the segmentation result B and the ground truth A, which can be written as

DC=2|AB||A|+|B|, (4)

where |A ∩ B| denotes the overlap between A and B, and |A| + |B| denotes the sum of the sizes of the two regions.

Meanwhile, the average surface distance (ASD) is also used to measure the performance of the different segmentation algorithms; it can be written as

ASD(A, B) = (1 / |A|) Σ_{a∈A} min_{b∈B} d(a, b), (5)

where d(a, b) is the Euclidean distance between surface points a and b, and |A| is the number of vertices on surface A.

We use the mean square error (MSE) as the evaluation metric for both our CNN method and the machine learning methods. In statistics, the MSE is defined as the average of the squared differences between true and predicted values, which can be written as

MSE = (1/N) Σ_{i=1}^{N} (yi − yi*)², (6)

where N is the total number of subjects, yi is the true intelligence score of subject i, and yi* is the predicted score from the prediction model.
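Eqs (4) and (6) can be implemented directly; a minimal NumPy sketch is:

```python
import numpy as np

def dice_coefficient(a: np.ndarray, b: np.ndarray) -> float:
    """Eq. (4): DC = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def mean_square_error(y_true, y_pred) -> float:
    """Eq. (6): mean of the squared differences between true and predicted scores."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean((y_true - y_pred) ** 2))
```

The ASD of Eq. (5) is omitted here because it requires extracting the segmentation surfaces (mesh vertices) first, which is outside the scope of this short sketch.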

5 Results

The proposed approach was compared with four recently proposed methods that use brain MRIs to predict fluid intelligence scores, as listed in Table 2. In comparison with the other methods, the proposed method achieved good performance, with MSE = 60.29 on the training set, MSE = 51.72 on the validation set, and MSE = 82.56 on the test set.

Table 2. The comparison of the MSE values of the proposed method and current state-of-the-art methods.

Method Train: MSE Val: MSE Test: MSE References
SVR 85.82 71.19 93.83 [32]
CNN + GBM 18.44 68.79 96.18 [33]
3D ConvNets 79.28 70.58 92.74 [13]
3D U-Net — 71.57 102.25 [34]
Our method 60.29 (53.12, 67.46) 51.72 (45.95, 57.49) 82.56 (75.75, 89.37) -

Note: — denotes that the result is not reported; ( , ) denotes the lower and upper bounds of the confidence interval.

Regarding the method proposed by Neil P. Oxtoby et al. [32], the structure and function of some brain regions were relatively tightly coupled. The structural covariance network (SCN) of the cerebral cortex was extracted, and the nodes were used as the input of the support vector regression (SVR) model to predict the intelligence score. Yeeleng S. Vang et al. [33] trained a CNN to compress 3D MRI data to a feature map size of 123 × 1 × 1 × 1, and used the 123 extracted features to train a gradient boosting machine (GBM) that predicts the intelligence score of the subject. In the experiment conducted in this study, the model performed well on the training set, but it performed poorly on the test set. To a certain extent, the model was in a state of overfitting, and was found to lack generalization ability. In the method proposed by Yukai Zou et al. [13], multiple brain regions were selected to predict the residualized fluid intelligence scores using a 3D CNN, but the median predicted score was used as the final prediction result, which caused a substantial amount of information loss in the pre-processing stage. Moreover, the regression operation of the fluid intelligence scores was performed for only one ROI at a time, which ignored the interaction between other brain areas. Lihao Liu et al. [34] used a basic 3D U-Net to enhance the segmentation performance. In this experiment, the weights of the encoder component were fixed, and the regression component was updated using the brain volume and the provided intelligence score.

From the preceding discussion, it is evident that the current models are not further optimized in terms of their algorithms and mechanisms; instead, only the original models are used. This is the main reason for their poor prediction performance. The results of the experiments and the comparisons with existing methods demonstrate the advantages of the proposed method, which are mainly reflected in the following aspects. The current pipeline for the regression of brain MRI images onto fluid intelligence scores is based on machine learning, 3D ConvNets, and 3D U-Net. However, 3D ConvNets are characterized by the following weaknesses. First, in the pre-processing stage, the high-layer and low-layer features of the 3D ConvNets framework are not fused; as a result, a large amount of spatial information in the image is lost in the lower layers. Second, the post-processing stage uses only feature maps fed to machine learning models as the basis of the final score prediction. As for the basic 3D U-Net, only the bottom features are used for regression against the intelligence scores.

Our method achieves relatively good results in predicting fluid intelligence scores, mainly due to the following contributions. First, the 14 candidate ROIs are grouped into five categories, i.e., when performing pixel classification, ROIs of the same category are assigned the same label; in the segmentation task, the fewer the categories of segmentation targets, the higher the final segmentation accuracy. Second, the symmetry learning mechanism and the MBC operation help improve the prediction accuracy. Third, the improvements made to the original 3D U-Net continuously strengthen the model's attention mechanism and contribute to better segmentation accuracy. Finally, in the second stage, the introduction of the CNN model increases the nonlinearity of the model, which is more conducive to fitting the fluid intelligence scores.

6 Discussion

In this section, we first compare our proposed method with several methods for brain ROI segmentation. Then, we study the influence on the segmentation results of giving macro-symmetric ROIs the same label. Thirdly, we compare the convolutional neural network for regression with classical machine learning methods. Finally, we present the limitations of this work.

6.1 Comparison with current deep learning methods

Examples of frontal gyrus, hippocampus, amygdala, caudate nucleus, and thalamus segmentation results of FCN, U-Net, ResNet, FC DenseNet, and our method on the test dataset are shown in Fig 6. Table 3 shows the Dice coefficient (DC) and average surface distance (ASD) values achieved by FCN [35], U-Net [36], ResNet [27], FC DenseNet [37], and our method. The proposed segmentation method shows a significant improvement over its counterparts on the ROI segmentation task.

Fig 6. Examples of frontal gyrus, hippocampus, amygdala, caudate nucleus, and thalamus segmentation results of FCN, U-Net, ResNet, FC DenseNet, and our method on the test dataset.


The pixels in symmetrical brain regions are assigned the same label.

Table 3. Comparison of frontal gyrus, hippocampus, amygdala, caudate nucleus, thalamus segmentation results on the ABCD dataset.

Method	Dice coefficient	ASD (mm)
*FCN	0.7958 ± 0.0257	0.577 ± 0.081
*U-Net	0.8394 ± 0.0197	0.533 ± 0.074
ResNet	0.8566 ± 0.0256	0.5168 ± 0.059
FC DenseNet	0.8622 ± 0.0270	0.496 ± 0.056
Ours	0.8920 ± 0.0241	0.390 ± 0.055

The terms a and b in “a ± b” denote the mean and standard deviation across different subjects, respectively. The symbol ‘*’ indicates that the proposed method achieved a significant improvement over the corresponding segmentation method in terms of the Dice coefficient, based on the Mann-Whitney U test (p < 0.05).

From Table 3, we can observe that the proposed method achieves the best ROI segmentation performance in terms of the Dice coefficient. For example, our method achieves the highest Dice coefficient (0.8920), which is significantly better than that of U-Net (0.8394). Overall, the proposed method achieves Dice coefficient improvements of 0.0962, 0.0526, 0.0354, and 0.0298 over its counterparts, i.e., FCN, U-Net, ResNet, and FC DenseNet, respectively. The proposed method also achieves better ASD values than its counterparts: the ASDs achieved by FCN, U-Net, ResNet, and FC DenseNet are 0.577, 0.533, 0.5168, and 0.496 mm, respectively, whereas that of our method is 0.390 mm. Comparing FCN with U-Net, U-Net produces better segmentation results because it fuses high-level contextual feature information. ResNet outperforms U-Net because increasing the network depth allows more complex feature inputs to be fitted, and residual connections effectively alleviate the degradation problem. FC-DenseNet has a structure similar to that of U-Net, adding skip connections from the encoder to the decoder and deepening the network; it therefore segments better than ResNet and U-Net. Our method achieves better segmentation performance than all of these counterparts, for two possible reasons: first, the segmentation network improves on U-Net and can integrate high-level and low-level feature maps; second, the concepts of ResNet and SENet are used to construct the network, respectively suppressing the degradation of the weight matrix and adding an attention mechanism [38]. We performed 5 training and testing procedures to calculate the p-values. The proposed method shows significant improvements over FCN (p = 0.00023) and U-Net (p = 0.0229) in terms of the Dice coefficient on the ABCD dataset for brain segmentation, whereas the p-values with respect to ResNet and FC DenseNet are 0.1448 and 0.2025, respectively.
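The two quantities used throughout this comparison, the Dice coefficient and the Mann-Whitney U test on per-run scores, can be sketched as follows. The per-run Dice values below are hypothetical numbers invented for illustration, not the study's actual runs:

```python
import numpy as np
from scipy.stats import mannwhitneyu

def dice_coefficient(pred, target):
    """Dice overlap between two binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, target).sum() / denom

# Hypothetical Dice scores from 5 independent training/testing runs.
ours = [0.893, 0.889, 0.895, 0.890, 0.893]
unet = [0.841, 0.837, 0.842, 0.838, 0.839]
stat, p = mannwhitneyu(ours, unet, alternative="two-sided")
print(p < 0.05)
```

With five runs per method and no overlap between the two samples, the two-sided Mann-Whitney U test reaches its minimum attainable p-value of roughly 0.008, which is why five repetitions are enough to report p < 0.05.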

6.2 Influence of symmetry learning mechanism

To validate the proposed symmetry learning mechanism, we segmented the 14 ROIs in the ABCD dataset with a separate label for each ROI. The segmentation results achieved by the different methods are shown in Table 4; the Dice coefficient and ASD are again used as the evaluation metrics for our method and its counterparts.

Table 4. Comparison of segmentation results on the ABCD dataset for the L inferior frontal gyrus (opercular), L inferior frontal gyrus (triangular), L inferior frontal gyrus (orbital), L hippocampus, L amygdala, L caudate nucleus, L thalamus, R inferior frontal gyrus (opercular), R inferior frontal gyrus (triangular), R inferior frontal gyrus (orbital), R hippocampus, R amygdala, R caudate nucleus, and R thalamus.

Method	Dice coefficient	ASD (mm)
*FCN	0.7632 ± 0.0244	1.225 ± 0.058
*U-Net	0.8084 ± 0.0254	1.214 ± 0.063
ResNet	0.8432 ± 0.0310	1.119 ± 0.057
FC DenseNet	0.8505 ± 0.0280	1.063 ± 0.055
Ours	0.8686 ± 0.0236	1.049 ± 0.053

L/R indicates a location in the left/right hemisphere. In this experiment, the pixels in symmetrical brain regions are not assigned the same label. The terms a and b in “a ± b” denote the mean and standard deviation across different subjects, respectively. The symbol ‘*’ indicates that the proposed method achieved a significant improvement over the corresponding method based on the Mann-Whitney U test (p < 0.05).

From Table 4, we can see that the average Dice coefficients over the 14 ROIs yielded by the counterparts are 0.7632, 0.8084, 0.8432, and 0.8505, respectively, all lower than that achieved by the proposed method (0.8686). The average surface distance over the 14 ROIs achieved by our method is 1.049 mm, compared with 1.225 mm, 1.214 mm, 1.119 mm, and 1.063 mm for FCN, U-Net, ResNet, and FC DenseNet, respectively. Comparing Table 3 with Table 4, FCN, U-Net, ResNet, FC DenseNet, and the proposed method achieve Dice coefficient improvements of 0.0326, 0.031, 0.0134, 0.0117, and 0.0234, respectively, when symmetric labeling is used. The proposed method is thus superior to the traditional counterparts under both labeling schemes. Again, we performed 5 training and testing procedures to calculate the p-values; the proposed method shows significant improvements over FCN (p = 2.594e-05) and U-Net (p = 0.0196) in terms of the Dice coefficient on the ABCD dataset for brain segmentation.

It should be noted that the two experiments used the same experimental data. As shown in Table 3, when the macroscopic symmetry of the ROIs is exploited and the pixels in symmetrical brain regions are assigned the same label during segmentation, better segmentation results are achieved. The results in Table 4 do not use the symmetric learning mechanism; each ROI is assigned its own label.

These results demonstrate that incorporating this anatomical prior into the networks further improves brain ROI segmentation performance. A possible reason is that symmetrical brain regions sharing the same label provide additional anatomical information, from which the deep neural network can learn image features that boost segmentation performance around the ROI boundaries.
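The symmetry learning mechanism amounts to a label remapping before training: each right-hemisphere ROI receives the label of its left counterpart, and the three inferior frontal subregions collapse into one frontal-gyrus category, turning 14 labels into 5. The numeric label IDs and their ordering below are assumptions for illustration; the actual atlas IDs may differ:

```python
import numpy as np

# Hypothetical label IDs: 1-7 are the left-hemisphere ROIs in the order
# IFG-opercular, IFG-triangular, IFG-orbital, hippocampus, amygdala,
# caudate nucleus, thalamus; 8-14 are their right-hemisphere mirrors.
CATEGORY_OF = {1: 1, 2: 1, 3: 1,  # IFG subregions -> frontal gyrus
               4: 2, 5: 3, 6: 4, 7: 5}
for left, cat in list(CATEGORY_OF.items()):
    CATEGORY_OF[left + 7] = cat  # mirror labels share the category

def merge_symmetric_labels(label_map):
    """Collapse the 14 per-hemisphere ROI labels into 5 categories
    (background label 0 is left untouched)."""
    merged = np.zeros_like(label_map)
    for label, category in CATEGORY_OF.items():
        merged[label_map == label] = category
    return merged
```

Training the segmentation network on the merged map is what reduces the number of target categories and, per the results above, raises the Dice coefficient.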

6.3 Comparison with classical machine learning methods

Traditional regression methods have been widely studied for predicting a continuous variable from a set of features. The support vector machine (SVM) [39], random forest (RF) [40], and gradient boosting machine (GBM) [41] are applied to predict fluid intelligence scores in comparison with the proposed CNN method. In this section, we study how three different ways of preprocessing the segmented ROIs influence the fluid intelligence score prediction.

1) MBCs after dimensionality reduction. The inputs to the classical regression models were the five MBCs, each with a size of 64×64×64; after flattening, the dimension of the feature map is 262,144×5. High-dimensional data are often not useful for regression analysis [42], so principal component analysis (PCA) is used to reduce the data dimension to 256. The inputs to the three machine learning models are therefore vectors with a dimension of 256×5, whereas the convolutional neural network (CNN) model is trained on vectors with a dimension of 262,144×5.
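The flatten-then-PCA step can be sketched with scikit-learn. The subject count below is a small hypothetical value, so PCA caps the component count at the number of samples rather than reaching the 256 components used in the study:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_subjects = 40                              # hypothetical; the study uses more
mbc = rng.random((n_subjects, 64, 64, 64))   # one of the five MBCs per subject

flat = mbc.reshape(n_subjects, -1)           # (40, 262144) flattened cubes
n_components = min(256, n_subjects)          # PCA cannot exceed the sample count
reduced = PCA(n_components=n_components).fit_transform(flat)
print(reduced.shape)
```

The same reduction is applied to each of the five MBCs before they are concatenated into the 256×5 inputs of the classical regressors.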

From Table 5, we can observe that the proposed CNN method achieves the best performance, with an MSE of 82.56 on the test set. Compared with the SVM, GB, and RF methods, the proposed CNN method achieves improvements of 26.87, 22.91, and 21.3 on the test set, respectively, a significant improvement over the traditional counterparts on the fluid intelligence score prediction task. A possible reason is that the CNN can adaptively learn a spatial hierarchy of low- to high-level features [42]. In addition, the proposed method shows significant improvements over the SVM (p = 2.0205e-06), GB (p = 3.2441e-06), and RF (p = 2.694e-06) for fluid intelligence score prediction. Fig 7 shows the MSEs of each model on the training, validation, and test sets; we performed 5 training and testing procedures to estimate the means and 95% confidence intervals of the results.

Table 5. The comparison of the MSE values of the proposed CNN method and classical machine learning methods.

Method Train: MSE Val: MSE Test: MSE
*SVM 91.79 (83.06, 100.52) 78.33 (68.69, 87.97) 109.43 (98.8, 120.06)
*GB 85.75 (77.45, 94.05) 71.88 (64.89, 78.87) 105.47 (96.86, 114.08)
*RF 83.24 (75.30, 91.18) 70.84 (63.94, 77.74) 103.86 (96.37, 111.35)
Our method 60.29 (53.12, 67.46) 51.72 (45.95, 57.49) 82.56 (75.75, 89.37)

Note: (a, b) denotes the lower and upper bounds of the 95% confidence interval, respectively.

The symbol ‘*’ indicates that the proposed method achieved a significant improvement over the corresponding method based on the Mann-Whitney U test (p < 0.05).

Fig 7. The MSEs of different methods.


The heights of the bars denote the means, and the gray lines denote the 95% confidence intervals.
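The comparison above can be reproduced in outline with scikit-learn. The features and scores below are synthetic placeholders for the PCA-reduced inputs and fluid intelligence scores, so the printed MSEs illustrate only the workflow, not the reported values:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
# Synthetic stand-ins for the PCA-reduced features and the scores.
X_train, y_train = rng.random((100, 256)), rng.normal(0.0, 9.0, 100)
X_test, y_test = rng.random((30, 256)), rng.normal(0.0, 9.0, 30)

results = {}
for name, model in [("SVM", SVR()),
                    ("GB", GradientBoostingRegressor(random_state=0)),
                    ("RF", RandomForestRegressor(random_state=0))]:
    model.fit(X_train, y_train)
    results[name] = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: test MSE = {results[name]:.2f}")
```

Each classical model is trained on the same reduced features; in the study, the CNN is trained on the unreduced 262,144×5 vectors instead.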

2) ROIs after dimensionality reduction. We now compare the influence of the MBC operation with that of the traditional resizing operation on fluid intelligence score prediction. The segmented ROIs are resized to 64×64×64 by the traditional resizing operation, and PCA is used to reduce each resized ROI to a dimension of 256. The inputs to the three machine learning models are therefore vectors with a dimension of 256×5, whereas the convolutional neural network (CNN) model is trained on vectors with a dimension of 262,144×5.

From Table 6, we can observe that the proposed CNN method achieves the best performance, with an MSE of 86.41 on the test set. Compared with Table 5, the models trained with ROIs resized by the traditional method perform slightly worse. A possible reason for the slightly better performance in Table 5 is that the MBC operation (two interpolation operations) produces a smoother result than the traditional single interpolation, improving the regularity of the data; the MBC details are described in Section 3.2-B-1). Meanwhile, the proposed method shows significant improvements over the SVM (p = 2.8675e-05), GB (p = 9.0128e-06), and RF (p = 3.8012e-05) for fluid intelligence score prediction. Fig 8 shows the MSEs of each model on the training, validation, and test sets; we performed 5 training and testing procedures to estimate the means and 95% confidence intervals of the results.
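The MBC preprocessing contrasted here can be sketched as crop-to-bounding-box, pad-to-cube, then a single trilinear resize. This is an illustrative reading of the operation, and the paper's exact padding and interpolation choices may differ:

```python
import numpy as np
from scipy.ndimage import zoom

def minimum_bounding_cube(volume, mask, out_size=64):
    """Crop `volume` to the minimum bounding box of `mask`, pad the box
    symmetrically into a cube, and resize the cube to out_size^3."""
    coords = np.argwhere(mask)
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1
    crop = volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    side = max(crop.shape)                       # expand the box to a cube
    pad = [((side - s) // 2, side - s - (side - s) // 2)
           for s in crop.shape]
    cube = np.pad(crop, pad)
    return zoom(cube, out_size / side, order=1)  # trilinear resize
```

Compared with resizing the whole ROI volume directly, cropping first removes the redundant background, so the 64×64×64 cube is spent entirely on the ROI.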

Table 6. The comparison of the MSE values of the proposed CNN method and classical machine learning methods.

Method Train: MSE Val: MSE Test: MSE
*SVM 93.25 (84.43, 102.07) 78.03 (68.11, 87.95) 110.08 (99.6, 120.56)
*GB 89.46 (81.39, 97.53) 72.75 (65.01, 80.49) 108.36 (99.82, 116.9)
*RF 82.37 (74.55, 90.19) 74.64 (68.06, 81.22) 105.56 (97.92, 113.2)
Our method 62.63 (55.33, 69.93) 54.37 (48.4, 60.34) 86.41 (79.47, 93.35)

Note: (a, b) denotes the lower and upper bounds of the 95% confidence interval, respectively.

The symbol ‘*’ indicates that the proposed method achieved a significant improvement over the corresponding method based on the Mann-Whitney U test (p < 0.05).

Fig 8. The MSEs of different methods.


The heights of the bars denote the means, and the gray lines denote the 95% confidence intervals.

3) MBCs without dimensionality reduction. We now study the influence of data dimensionality on model performance. The MBCs used for model training are not dimensionally reduced; the dimension of one flattened MBC is 262,144, as described in Section 3.2-B-2). Both the convolutional neural network (CNN) model and the machine learning models are trained on vectors with a dimension of 262,144×5.

From Table 7, worse experimental results are observed compared with Table 5. Nevertheless, the proposed method still shows significant improvements over the SVM (p = 7.1721e-07), GB (p = 5.4398e-06), and RF (p = 1.0912e-05) for fluid intelligence score prediction. Fig 9 shows the MSEs of each model on the training, validation, and test sets; we performed 5 training and testing procedures to estimate the means and 95% confidence intervals of the results.

Table 7. The comparison of the MSE values of the proposed CNN method and classical machine learning methods.

Method Train: MSE Val: MSE Test: MSE
*SVM 101.64 (92.92, 110.36) 97.86 (88.38, 107.34) 127.34 (117.04, 137.64)
*GB 97.86 (89.25, 106.47) 92.53 (84.89, 100.17) 123.55 (114.12, 132.98)
*RF 92.58 (84.97, 100.19) 90.68 (84.01, 97.35) 122.55 (113.2, 131.9)
Our method 74.26 (67.4, 81.12) 67.24 (59.89, 74.59) 95.37 (85.28, 105.46)

Note: (a, b) denotes the lower and upper bounds of the 95% confidence interval, respectively.

Fig 9. The MSEs of different methods.


The heights of the bars denote the means, and the gray lines denote the 95% confidence intervals.

In theory, a higher number of dimensions allows more information to be stored, but it also increases the likelihood of noise and data redundancy, which is not conducive to model training. As a powerful and general dimensionality reduction algorithm, PCA can unearth latent trends in the data. Therefore, applying PCA before training a model on high-dimensional data is more conducive to model convergence.

6.4 Limitations

The proposed method was inspired by the basic 3D U-Net framework, the main concept of which is to take a holistic perspective on intelligence prediction obtained from multiple ROIs. State-of-the-art segmentation results were achieved for multiple brain regions simultaneously. The fluid intelligence scores were then predicted from the fine segmentation results, which eliminated a large amount of interference and yielded more accurate and stable predictions. The proposed framework can be generalized to other related regression problems.

However, the SRI24 atlas is an MRI-based atlas of normal adult human brain anatomy, whereas the participants in the ABCD project were aged 9-10 years; this age difference may lead to deviations in anatomical structure matching and is an important factor affecting the fluid intelligence score prediction accuracy. Second, the complexity of the brain is still not fully understood, and its functional areas are quite intricate; only a few selected brain regions were used to train the model and verify the feasibility of the proposed method. Finally, although the proposed method achieves good results in predicting fluid intelligence scores, in subsequent research the model will be further optimized and more brain regions will be considered to further improve the prediction accuracy.

7 Conclusion

In this paper, a two-step deep learning pipeline was proposed to predict fluid intelligence scores from T1-w MRI images. In the first step, an improved 3D U-Net was trained to segment the 3D MRI data and obtain the target brain areas; the proposed architecture combines ResNet, SENet, and the symmetry learning mechanism to increase the segmentation accuracy. In the second step, a CNN was trained to predict the fluid intelligence scores from the fine segmentation results. Compared with the current state-of-the-art methods for predicting fluid intelligence scores from T1-weighted MRI images, the proposed method adds different modules that improve the attention mechanism of the entire model, thereby contributing to better prediction results. The proposed framework can be validated and improved in the future, and it offers a new and unique perspective on predicting fluid intelligence scores from brain morphometry.

Supporting information

S1 File. Fluid intelligence score measurement.

(DOCX)

Data Availability

Data were obtained from the NIMH Data Archive (NDA) database, generated by the Adolescent Brain Cognitive Development (ABCD) study, the largest long-term study of brain development and child health in the United States. Information about the ABCD Data Repository can be found at https://nda.nih.gov/abcd/about.

Funding Statement

This research was supported by the Center of Excellence-International Collaboration Initiative Grant (Grant Number: 139170052) and the 1.3.5 Project for Disciplines of Excellence, West China Hospital, Sichuan University (Grant Number: ZYJC18010). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Supekar K. et al., "Neural predictors of individual differences in response to math tutoring in primary-grade school children," Proceedings of the National Academy of Sciences, vol. 110, no. 20, pp. 8230–8235, 2013. doi: 10.1073/pnas.1222154110
  • 2. Gignac G. E. and Bates T. C., "Brain volume and intelligence: The moderating role of intelligence measurement quality," Intelligence, vol. 64, pp. 18–29, 2017.
  • 3. Cole M. W., Yarkoni T., Repovs G., Anticevic A., and Braver T. S., "Global connectivity of prefrontal cortex predicts cognitive control and intelligence," Journal of Neuroscience, vol. 32, no. 26, pp. 8988–8999, 2012. doi: 10.1523/JNEUROSCI.0536-12.2012
  • 4. Wang L., Chong-Yaw W., Heung-Il S., Tang X., Shen D., and Chen K., "MRI-Based Intelligence Quotient (IQ) Estimation with Sparse Learning," PLOS ONE, vol. 10, no. 3, p. e0117295, 2015. doi: 10.1371/journal.pone.0117295
  • 5. Sämann P. G. et al., "FreeSurfer-based segmentation of hippocampal subfields: A review of methods and applications, with a novel quality control procedure for ENIGMA studies and other collaborative efforts."
  • 6. Pominova M. et al., "Ensemble of 3D CNN Regressors with Data Fusion for Fluid Intelligence Prediction," 2019.
  • 7. Patenaude B., Smith S. M., Kennedy D. N., and Jenkinson M., "A Bayesian model of shape and appearance for subcortical brain segmentation," NeuroImage, vol. 56, no. 3, pp. 907–922, 2011. doi: 10.1016/j.neuroimage.2011.02.046
  • 8. Yushkevich P. A. et al., "User-Guided Segmentation of Multi-modality Medical Imaging Datasets with ITK-SNAP," pp. 1–20, 2018.
  • 9. Ohlsson S., "Deep Learning: The Nature of the Enterprise," 2011.
  • 10. Krizhevsky A., Sutskever I., and Hinton G., "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems, vol. 25, no. 2, 2012.
  • 11. Paul E. J. et al., "Dissociable brain biomarkers of fluid intelligence," NeuroImage, vol. 137, pp. 201–211, 2016. doi: 10.1016/j.neuroimage.2016.05.037
  • 12. Eickhoff S. B. and Langner R., "Neuroimaging-based prediction of mental traits: Road to utopia or Orwell?," PLOS Biology, vol. 17, 2019. doi: 10.1371/journal.pbio.3000497
  • 13. Zou Y., Jang I., Reese T. G., Yao J., and Rispoli J. V., "Cortical and Subcortical Contributions to Predicting Intelligence Using 3D ConvNets," Adolescent Brain Cognitive Development Neurocognitive Prediction, 2019.
  • 14. Guerdan L. et al., "Deep Learning vs. Classical Machine Learning: A Comparison of Methods for Fluid Intelligence Prediction," in Challenge in Adolescent Brain Cognitive Development Neurocognitive Prediction, 2019.
  • 15. Volkow N. D. et al., "The conception of the ABCD study: From substance use to a broad NIH collaboration," pp. 4–7, 2017. doi: 10.1016/j.dcn.2017.10.002
  • 16. Hagler D. J., Hatton S., Cornejo M., Makowski C., and Dale A. M., "Image processing and analysis methods for the Adolescent Brain Cognitive Development Study," NeuroImage, vol. 202, p. 116091, 2019. doi: 10.1016/j.neuroimage.2019.116091
  • 17. Avants B. B., Tustison N. J., Wu J., Cook P. A., and Gee J. C., "An Open Source Multivariate Framework for n-Tissue Segmentation with Evaluation on Public Data," Neuroinformatics, vol. 9, no. 4, pp. 381–400, 2011. doi: 10.1007/s12021-011-9109-y
  • 18. Rohlfing T., Zahr N. M., Sullivan E. V., and Pfefferbaum A., "The SRI24 multichannel atlas of normal adult human brain structure," Human Brain Mapping, vol. 31, no. 5, pp. 798–819, 2010. doi: 10.1002/hbm.20906
  • 19. Maguire E. A., Gadian D. G., et al., "Navigation-related structural change in the hippocampi of taxi drivers," Proceedings of the National Academy of Sciences of the United States of America, 2000. doi: 10.1073/pnas.070039597
  • 20. Xiao H. et al., "Reliability of MRI-derived measurements of human cerebral cortical thickness: the effects of field strength, scanner upgrade and manufacturer," NeuroImage, vol. 32, no. 1, pp. 180–194, 2006. doi: 10.1016/j.neuroimage.2006.02.051
  • 21. Costafreda S. G., Fu C., Lee L., Everitt B., and David A. S., "A systematic review and quantitative appraisal of fMRI studies of verbal fluency: role of the left inferior frontal gyrus," Human Brain Mapping, vol. 27, no. 10, pp. 799–810, 2010.
  • 22. Packard M., Hirsh R., and White N., "Differential effects of fornix and caudate nucleus lesions on two radial maze tasks: evidence for multiple memory systems," Journal of Neuroscience, vol. 9, no. 5, pp. 1465–1472, 1989. doi: 10.1523/JNEUROSCI.09-05-01465.1989
  • 23. McGaugh J. L., "The amygdala modulates the consolidation of memories of emotionally arousing experiences," Annual Review of Neuroscience, vol. 27, no. 1, pp. 1–28, 2004. doi: 10.1146/annurev.neuro.27.070203.144157
  • 24. Anderson B. and Rutledge V., "Age and hemisphere effects on dendritic structure," Brain, vol. 119, no. 6, pp. 1983–1990, 1996. doi: 10.1093/brain/119.6.1983
  • 25. Hutsler J. and Galuske R., "Hemispheric asymmetries in cerebral cortical networks," Trends in Neurosciences, vol. 26, no. 8, pp. 429–435, 2003. doi: 10.1016/S0166-2236(03)00198-X
  • 26. Li M., Chen H., Wang J., et al., "Handedness- and hemisphere-related differences in small-world brain networks: a diffusion tensor imaging tractography study," Brain Connectivity, vol. 4, no. 2, p. 145, 2014. doi: 10.1089/brain.2013.0211
  • 27. He K., Zhang X., Ren S., and Sun J., "Deep Residual Learning for Image Recognition," 2016.
  • 28. Hu J., Shen L., Sun G., and Albanie S., "Squeeze-and-Excitation Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
  • 29. O'Rourke J., "Finding minimal enclosing boxes," International Journal of Computer & Information Sciences, vol. 14, no. 3, pp. 183–199, 1985.
  • 30. Koenig K. A. et al., "The role of the thalamus and hippocampus in episodic memory performance in patients with multiple sclerosis," 2018. doi: 10.1177/1352458518760716
  • 31. Anuar N. and Sultan A. M., "Validate Conference Paper using Dice Coefficient," vol. 3, no. 3, 2010.
  • 32. Oxtoby N. P., Ferreira F. S., Mihalik A., Wu T., and Mourao-Miranda J., "ABCD Neurocognitive Prediction Challenge 2019: Predicting individual residual fluid intelligence scores from cortical grey matter morphology," 2019.
  • 33. Vang Y. S., Cao Y., and Xie X., "A Combined Deep Learning-Gradient Boosting Machine Framework for Fluid Intelligence Prediction," Springer, Cham, 2019.
  • 34. Liu L., Yu L., Wang S., and Heng P. A., "Predicting Fluid Intelligence from MRI Images with Encoder-Decoder Regularization," 2019.
  • 35. Long J., Shelhamer E., and Darrell T., "Fully Convolutional Networks for Semantic Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 4, pp. 640–651, 2015.
  • 36. Ronneberger O., Fischer P., and Brox T., "U-Net: Convolutional Networks for Biomedical Image Segmentation," 2015.
  • 37. Zhang R. et al., "Automatic Segmentation of Acute Ischemic Stroke From DWI Using 3-D Fully Convolutional DenseNets," 2018. doi: 10.1109/TMI.2018.2821244
  • 38. Orhan A. E. and Pitkow X., "Skip Connections Eliminate Singularities," arXiv preprint arXiv:1701.09175, 2017.
  • 39. Awad M. and Khanna R., Efficient Learning Machines, 2015. doi: 10.1007/978-1-4302-5990-9
  • 40. Strobl C., Boulesteix A. L., Kneib T., Augustin T., and Zeileis A., "Conditional variable importance for random forests," BMC Bioinformatics, vol. 9, no. 1, p. 307, 2008. doi: 10.1186/1471-2105-9-307
  • 41. Friedman J. H., "Greedy Function Approximation: A Gradient Boosting Machine," Annals of Statistics, vol. 29, no. 5, pp. 1189–1232, 2001.
  • 42. Murphy K. P., Machine Learning: A Probabilistic Perspective, MIT Press, 2012.

Decision Letter 0

Yiming Tang

13 Jan 2022

PONE-D-21-37778: Prediction of fluid intelligence from T1-w MRI images: A precise two-step deep learning framework (PLOS ONE)

Dear Dr. Li,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Feb 27 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Yiming Tang, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf.

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. Thank you for stating the following in the Acknowledgments Section of your manuscript: 

[This research was supported by Center of Excellence-International Collaboration Initiative Grant (Grant Number: 139170052), 1.3.5 project for disciplines of excellence, West China Hospital, Sichuan University (Grant Number: ZYJC18010).]

We note that you have provided funding information that is currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. 

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: 

 [This research was supported by Center of Excellence-International Collaboration Initiative Grant (Grant Number: 139170052), 1.3.5 project for disciplines of excellence, West China Hospital, Sichuan University (Grant Number: ZYJC18010). ]

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

4. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

5. PLOS requires an ORCID iD for the corresponding author in Editorial Manager on papers submitted after December 6th, 2016. Please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager. Please see the following video for instructions on linking an ORCID iD to your Editorial Manager account: https://www.youtube.com/watch?v=_xcclfuvtxQ.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Thank you very much for the opportunity to review the manuscript entitled "Prediction of fluid intelligence from T1-w MRI images: a precise two-step deep learning framework".

Although the methodology seems fine in general, there remain open questions on what the authors did exactly. During reading I had the impression that the authors had some difficulties with the English language and structuring the text in an intelligible manner. This made it not only difficult for me to follow sometimes, but also led to wrong statements. One example for this is the last sentence in the 2nd paragraph of the Introduction: "In the most related work, MRI data were used as a machine learning method by which to predict fluid intelligence." This statement implies that MRI data is a machine learning method, however, MRI data itself is NOT a machine learning method. Machine learning methods are APPLIED TO MRI data. Sentences like the one mentioned unfortunately worsen the overall impression of the manuscript.

In the following I will list some points, which would in my opinion improve the manuscript:

- Introduction 5th paragraph: The drawbacks that are listed are not drawbacks of the MRI data itself but of the methodology applied to it, so the first sentence should be reformulated. It should also be pointed out in what terms the segmentation model is not well optimized in previous work.

- Section 2.1: The structure of the section is a bit strange, for me it seems as if paragraphs had been randomly shuffled. For me it would make more sense to start the section with the first sentence of the 3rd paragraph. Then it should be described what processing has been performed by the challenge organizers and after that the preprocessing steps done by the authors. Furthermore, the last sentence of the section is the same as the one forming the second paragraph.

- Section 2.2 Please also mention somewhere in the text explicitly that the ROIs are assigned to five categories. Please also describe how the ground truth for the parcellation was created. Was the parcellation provided through the challenge or a freely available parcellation used, etc.?

- Section 3.2 A, paragraph 4: The description of the first layer does not seem to fit to Fig. 2. In figure the dimensions change 112*112*112*112*1 -->112*112*112*16 --> 112*112*112*32, but in the text it is unclear if it is 112*112*112*112*1 -->112*112*112*32 --> 112*112*112*12, or 112*112*112*112*1 -->112*112*112*12 --> 112*112*112*32. Also note that in the image there are 16 channels vs. 12 channels mentioned in the text.

- Section 3.2. A1 Recombination Block: In the text it is said that the operation from B2 to B3 is a 1x1x1 conv, however in the image it is a 3x3x3 conv.

-Section 3.2. A2 SegS-E Block: In the 3rd line below Fig 4, the output is denoted as Y but later the output is denoted as X tilde

- Section 3.2 B1 Minimum bounding cube: This section is very confusing. It seems that there is only 1 sentence on what the minimum bounding cube is and the rest of the 2 paragraphs is about model training. I would suggest to add a separate section on model training after the description of the whole framework and focus in this section only on the MBC. Actually, I do not really understand what the MBC exactly is and what exactly the input to the neural network for regression is.

Concerning model training: It is not totally clear for me if the segmentation and regression component are trained together or if the segmentation component is trained at first and after completed training the regression component is trained based on the results of the segmentation component.

- Section 3.2 B2 Neural network construction: In the last sentence "onto the ROIs" should be deleted.

Results and Discussion in general: I would suggest to put all experimental results (especially the tables) into the results section and every interpretation of results (why one method might be better than another, etc) into the discussion section.

Table 2: If possible, it would be nice to have mean and standard deviation over several training runs/ seeds reported, instead of just a value for training each method only once.

Section 5.1: Please check the grammar of the first sentence.

Tables 3 and 4: Please provide the mean and standard deviation over several training runs instead of just one.

Section 5.3: I would like to see also the results of SVM, RF and GB based on the higher dimensional data. Since the CNN has more information available than the other approaches, this seems to be a bit of an unfair comparison. By reducing the dimension with PCA you might discard information that is useful for prediction, since the components corresponding to highest variance, do not necessary have to reflect also the most useful information. How much variance the kept components explain would also be interesting to know.

The authors also claim that the CNN shows SIGNIFICANT improvement over the other methods. However, Table 5 includes only MSE-values for training the CNN and other methods only once. p-values or confidence intervals have to be added to provide evidence for significant improvement.

Reviewer #2: Major remark :

1) We know that the quality of measurement of intelligence is linked to validity and can interfere with the results of each study (see, for example, Gignac et al., 2017, https://doi.org/10.1016/j.intell.2017.06.004). The authors do not give any indication of the measurement of fluid intelligence. This information, although necessary, is difficult to obtain from the link https://nda.nih.gov/abcd/about and requires an in-depth analysis of the database from which the data were extracted (NIMH Data Archive (NDA) database, ABCD study). This is not within the reach of all readers. A summary table of the measurements and tests used would certainly be useful.

Minor remarks :

2) The authors used the MSE value to compare the proposed CNN method to the classical machine learning methods. Choosing MSE is a good choice since it is a well-behaved metric, but correlation criteria should also be provided because the correlation score is used in many other studies on fluid intelligence prediction.

3) Results need to be supported by statistical analysis showing at what extent the differences between the proposed method and the other methods are significant.

4) On p. 3: "Fluid intelligence scores were decidualized" (??). Is « decidualized » the right word?

5) On p. 13: « The ROIs of frontal gyrus, hippocampus, amygdala, caudate nucleus and thalamus are performed segmentation ». This should be rephrased.

6) In Fig 6, it is not clear that the proposed method shows significant improvement over its counterparts on the ROI segmentation task.

7) p.17, « adaptively learn » is repeated twice. Please correct.

8) Caption table 4 : the word « sementation » is repeated twice. Please correct.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Abdel-Kader Boulanouar

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Aug 2;17(8):e0268707. doi: 10.1371/journal.pone.0268707.r002

Author response to Decision Letter 0


29 Mar 2022

Reviewer #1: Thank you very much for the opportunity to review the manuscript entitled "Prediction of fluid intelligence from T1-w MRI images: a precise two-step deep learning framework".

Although the methodology seems fine in general, there remain open questions on what the authors did exactly. During reading I had the impression that the authors had some difficulties with the English language and structuring the text in an intelligible manner. This made it not only difficult for me to follow sometimes, but also led to wrong statements. One example for this is the last sentence in the 2nd paragraph of the Introduction: "In the most related work, MRI data were used as a machine learning method by which to predict fluid intelligence." This statement implies that MRI data is a machine learning method, however, MRI data itself is NOT a machine learning method. Machine learning methods are APPLIED TO MRI data. Sentences like the one mentioned unfortunately worsen the overall impression of the manuscript.

In the following I will list some points, which would in my opinion improve the manuscript:

Q1- One example for this is the last sentence in the 2nd paragraph of the Introduction: "In the most related work, MRI data were used as a machine learning method by which to predict fluid intelligence." This statement implies that MRI data is a machine learning method, however, MRI data itself is NOT a machine learning method. Machine learning methods are APPLIED TO MRI data. Sentences like the one mentioned unfortunately worsen the overall impression of the manuscript.

Reply: This was indeed a serious mistake affecting the overall quality of our article, and we apologize for our carelessness. We have corrected it and thank you for pointing it out. The corrected statement, "In the most related work, reference [4] outlined the machine learning approaches employed to predict fluid intelligence from brain MRI data.", now appears as the last sentence of the 2nd paragraph of the Introduction.

Q2 - Introduction 5th paragraph: The drawbacks that are listed are not drawbacks of the MRI data itself but of the methodology applied to it, so the first sentence should be reformulated. It should also be pointed out in what terms the segmentation model is not well optimized in previous work.

Reply: Thanks for your suggestions. We have revised the first sentence of the fifth paragraph and have also pointed out in what respects the segmentation models in previous work were not well optimized.

Q3 - Section 2.1: The structure of the section is a bit strange, for me it seems as if paragraphs had been randomly shuffled. For me it would make more sense to start the section with the first sentence of the 3rd paragraph. Then it should be described what processing has been performed by the challenge organizers and after that the preprocessing steps done by the authors. Furthermore, the last sentence of the section is the same as the one forming the second paragraph.

Reply: We agree that the structure of Section 2.1 in the original manuscript was somewhat illogical. Following your comments, we have reformulated the section; the revision appears in the 1st paragraph of Section 2.1.

The 2nd paragraph of Section 2.1 now describes the preprocessing performed by the challenge organizers, which includes format conversion, selection of the brain mask, and creation of the parcellation. The challenge organizers performed the brain parcellation using neuroimaging software; finally, 122 brain regions of interest (ROIs) were extracted by the organizers based on the SRI24 atlas.

The 3rd paragraph of Section 2.1 describes the preprocessing performed by the authors: 14 brain ROIs with unique anatomical characteristics and roles in cognitive function were selected to predict fluid intelligence scores.

Q4- Section 2.2 Please also mention somewhere in the text explicitly that the ROIs are assigned to five categories. Please also describe how the ground truth for the parcellation was created. Was the parcellation provided through the challenge or a freely available parcellation used, etc.?

Reply: In the 2nd paragraph of Section 2.2, we now state explicitly the five categories to which the ROIs are assigned. We placed the description of the ground truth for the parcellation in the 2nd paragraph of Section 2.1, because creating the parcellation is the focus of the preprocessing.

The challenge organizers created the brain parcellation using neuroimaging software; the details are given in the 2nd paragraph of Section 2.1. Finally, 122 brain regions of interest (ROIs) were extracted by the challenge organizers based on the SRI24 atlas.

Q5- Section 3.2 A, paragraph 4: The description of the first layer does not seem to fit to Fig. 2. In figure the dimensions change 112*112*112*112*1 -->112*112*112*16 --> 112*112*112*32, but in the text it is unclear if it is 112*112*112*112*1 -->112*112*112*32 --> 112*112*112*12, or 112*112*112*112*1 -->112*112*112*12 --> 112*112*112*32. Also note that in the image there are 16 channels vs. 12 channels mentioned in the text.

Reply: We have redrawn Fig 2 and revised the description of the first layer in the first two sentences of paragraph 4, Section 3.2 A. Thank you for your detailed review! There are indeed 16 channels in the image, not 12.

Q6- Section 3.2. A1 Recombination Block: In the text it is said that the operation from B2 to B3 is a 1x1x1 conv, however in the image it is a 3x3x3 conv.

Reply: We have revised it; the correction appears in the 3rd sentence of the last paragraph of Section 3.2-A-1).

Q7-Section 3.2. A2 SegS-E Block: In the 3rd line below Fig 4, the output is denoted as Y but later the output is denoted as X tilde

Reply: We apologize; 'Y' was a typo. We now use X tilde in the 3rd line below Fig 4.

Q8- Section 3.2 B1 Minimum bounding cube: This section is very confusing. It seems that there is only 1 sentence on what the minimum bounding cube is and the rest of the 2 paragraphs is about model training. I would suggest to add a separate section on model training after the description of the whole framework and focus in this section only on the MBC. Actually, I do not really understand what the MBC exactly is and what exactly the input to the neural network for regression is.

Concerning model training: It is not totally clear for me if the segmentation and regression component are trained together or if the segmentation component is trained at first and after completed training the regression component is trained based on the results of the segmentation component.

Reply: This section now focuses only on the MBC, and we have re-described what the MBC is in Section 3.2 B-1). In this work, fluid intelligence scores are predicted in two stages: the segmentation model is trained first, and the regression model is then trained on the segmented ROIs to predict the fluid intelligence scores.
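As an illustration only (not the authors' implementation, whose details are in Section 3.2 B-1) of the revised manuscript), the idea of cropping a segmented ROI to a minimum bounding cube can be sketched in NumPy. The function name, array shapes, and centring strategy below are assumptions for the sketch:

```python
import numpy as np

def minimum_bounding_cube(mask: np.ndarray) -> np.ndarray:
    """Crop a 3D binary ROI mask to the smallest cube enclosing its nonzero voxels.

    `mask` is an illustrative (D, H, W) binary array; the paper's actual
    cropping details may differ.
    """
    coords = np.argwhere(mask)                       # voxel coordinates of the ROI
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1
    side = int((hi - lo).max())                      # cube edge = largest ROI extent
    centre = (lo + hi) // 2                          # centre the cube on the ROI
    start = np.maximum(centre - side // 2, 0)
    stop = np.minimum(start + side, mask.shape)
    start = stop - side                              # shift back if clipped at the border
    return mask[start[0]:stop[0], start[1]:stop[1], start[2]:stop[2]]

# toy example: a small blob inside a large volume
vol = np.zeros((64, 64, 64), dtype=np.uint8)
vol[10:20, 12:18, 30:44] = 1
cube = minimum_bounding_cube(vol)
print(cube.shape)  # (14, 14, 14): the largest ROI extent defines the cube edge
```

Cropping this way removes the redundant background voxels that a full-volume input would carry into the regression stage.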

We have added a new Section 4 (Experiment). The Experimental Settings are described in the new Section 4 – A and the Quantitative Evaluation Metrics in the new Section 4 – B.

In addition, we have compared the influence of the MBC operation and the traditional resizing operation on fluid intelligence score prediction; details are given in the new Section 6.3-2). We have also performed statistical analysis on these results.

Q9- Section 3.2 B2 Neural network construction: In the last sentence "onto the ROIs" should be deleted.

Reply: We have deleted "onto the ROIs" from the last sentence of Section 3.2 B2 (Neural network construction).

Q10-Results and Discussion in general: I would suggest to put all experimental results (especially the tables) into the results section and every interpretation of results (why one method might be better than another, etc) into the discussion section.

Reply: Thank you for your suggestion. The predicted fluid intelligence scores are now presented in the new Section 5 (Results) as the main results.

In the new Section 6 (Discussion), we have added further experimental results, including tables and figures. After discussion among the authors, we believe that placing these additional results in Section 6 (Discussion) makes the paper easier for readers to follow.

Q11-Table 2: If possible, it would be nice to have mean and standard deviation over several training runs/ seeds reported, instead of just a value for training each method only once.

Reply: Unfortunately, the parameter settings of the models in the referenced works are not reported, so it is difficult for us to reproduce their methods.

In Table 2, we have given the mean and the confidence interval of our method. In addition, all the results in this paper have been supported by statistical analysis.

In our work, in both the segmentation and regression stages, we resampled the training, validation, and test sets five times and performed the same training and testing procedures on each resampled data set.

Q12-Section 5.1: Please check the grammar of the first sentence.

Reply: We have reformulated the first sentence of Section 6.1 in the revised manuscript.

Q13-Table 3 and 4: Pleas provide mean and standard deviation over several training runs instead just 1.

Reply: We resampled the training, validation, and test sets five times and performed the same training and testing procedures on each resampled data set. The new means and standard deviations are given in Tables 3 and 4.

In addition, we have performed statistical analysis for Tables 3 and 4. The details are presented in the last two sentences of the last paragraph of the new Section 6.1 and in the last two sentences of the 2nd paragraph of the new Section 6.2.

Q14-Section 5.3: I would like to see also the results of SVM, RF and GB based on the higher dimensional data. Since the CNN has more information available than the other approaches, this seems to be a bit of an unfair comparison. By reducing the dimension with PCA you might discard information that is useful for prediction, since the components corresponding to highest variance, do not necessary have to reflect also the most useful information. How much variance the kept components explain would also be interesting to know.

Reply: We have added the results of models trained with the higher-dimensional data as input; details are described in the new Section 6.3-3).

In theory, a higher number of dimensions allows more information to be stored, but it also increases the likelihood of noise and data redundancy, which is not conducive to model training. Dimensionality reduction can uncover latent trends in the data. Our experimental results confirm that dimensionality reduction is still necessary for high-dimensional data before model training.
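A minimal sketch of this kind of comparison (purely illustrative, using random placeholder data rather than the ABCD features): PCA-reduced features feeding a classical regressor, with the explained variance of the kept components reported, as the reviewer requested. The feature dimensions and the choice of 50 components are assumptions for the sketch:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5000))   # hypothetical high-dimensional voxel features
y = rng.normal(size=200)           # hypothetical residualized fluid intelligence scores

# standardize -> reduce dimension -> regress; Pipeline fits the PCA step in place,
# so its explained_variance_ratio_ is available after fitting
pca = PCA(n_components=50)
model = make_pipeline(StandardScaler(), pca, SVR())
model.fit(X, y)

# how much variance the kept components explain
print(f"explained variance of 50 components: {pca.explained_variance_ratio_.sum():.3f}")
```

The same pipeline with the PCA step removed gives the higher-dimensional baseline for comparison.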

Q15-The authors also claim that the CNN shows SIGNIFICANT improvement over the other methods. However, Table 5 includes only MSE-values for training the CNN and other methods only once. p-values or confidence intervals have to be added to provide evidence for significant improvement.

Reply: We resampled the training, validation, and test sets five times and performed the same training and testing procedures on each resampled data set. All results are now supported by statistical analysis, and p-values or confidence intervals are provided as evidence of significant improvement.

Reviewer #2: Major remark :

1) We know that the quality of measurement of intelligence is linked to validity and can interfere with the results of each study (see, for example, Gignac et al., 2017, https://doi.org/10.1016/j.intell.2017.06.004). The authors do not give any indication of the measurement of fluid intelligence. This information, although necessary, is difficult to obtain from the link https://nda.nih.gov/abcd/about and requires an in-depth analysis of the database from which the data were extracted (NIMH Data Archive (NDA) database, ABCD study). This is not within the reach of all readers. A summary table of the measurements and tests used would certainly be useful.

Reply: Thanks for your suggestion. We have cited the reference (Gignac et al., 2017, https://doi.org/10.1016/j.intell.2017.06.004) in the 2nd sentence of the 2nd paragraph of Section 1 (Introduction). We have also provided a summary table of the measurements and tests used; the details are given in the supplementary material.

Minor remarks :

2) The authors used the MSE value to compare the proposed CNN method to the classical machine learning methods. Choosing MSE is a good choice since it is a well-behaved metric, but correlation criteria should also be provided because the correlation score is used in many other studies on fluid intelligence prediction.

Reply: In addition to the MSE, we have provided the correlation coefficient as an evaluation metric for both our CNN method and the machine learning methods, in the new Section 4 (Experiment) – B (Quantitative Evaluation Metrics).

3) Results need to be supported by statistical analysis showing at what extent the differences between the proposed method and the other methods are significant.

Reply: All the results have been supported by statistical analysis.

4) On p. 3: "Fluid intelligence scores were decidualized" (??). Is « decidualized » the right word?

Reply: Thanks for your suggestion. "Decidualized" was indeed the wrong word; we have revised it.

5) On p. 13: « The ROIs of frontal gyrus, hippocampus, amygdala, caudate nucleus and thalamus are performed segmentation ». This should be rephrased.

Reply: We have reformulated the first sentence of the new Section 6.1.

6) In Fig 6, it is not clear that the proposed method shows significant improvement over its counterparts on the ROI segmentation task.

Reply: We agree with your comment. To accurately evaluate the performance of our segmentation approach against its counterparts, we report the following evaluation metrics: the Dice coefficient (DC) and the average surface distance (ASD); details are given in the new Section 4 (Experiment) – B.
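For readers unfamiliar with the metric, the Dice coefficient can be illustrated with a minimal sketch (this is the standard DC formula, not the paper's evaluation code; the toy masks are invented for the example):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient DC = 2|P ∩ T| / (|P| + |T|) for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return 2.0 * np.logical_and(pred, truth).sum() / denom if denom else 1.0

# toy masks of 6 voxels each with 4 overlapping: DC = 2*4 / (6+6) = 0.667
a = np.zeros((4, 4, 4), dtype=np.uint8); a[1:3, 0:3, 1] = 1
b = np.zeros((4, 4, 4), dtype=np.uint8); b[1:3, 1:4, 1] = 1
print(round(dice_coefficient(a, b), 3))  # 0.667
```

A DC of 1.0 indicates perfect overlap with the ground truth; the manuscript reports a DC of 0.8920 for the proposed segmentation network.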

7) p.17, « adaptively learn » is repeated twice. Please correct.

Reply: We have deleted the extra "adaptively learn"; see the 4th sentence of the 2nd paragraph of the new Section 6.3-1).

8) Caption table 4 : the word « sementation » is repeated twice. Please correct.

Reply: We have deleted the extra "segmentation" in the caption of Table 4.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Yiming Tang

6 May 2022

Prediction of fluid intelligence from T1-w MRI images: A precise two-step deep learning framework

PONE-D-21-37778R1

Dear Dr. Li,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Yiming Tang, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: the authors have extensively modified the manuscript to correctly answer the remarks and questions of the reviewers.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Acceptance letter

Yiming Tang

22 Jul 2022

PONE-D-21-37778R1

Prediction of fluid intelligence from T1-w MRI images: A precise two-step deep learning framework

Dear Dr. Li:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Professor Yiming Tang

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 File. Fluid intelligence score measurement.

    (DOCX)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Data Availability Statement

    Data were obtained from the NIMH Data Archive (NDA) database, generated by the Adolescent Brain Cognitive Development (ABCD) study, the largest long-term study of brain development and child health in the United States. Information about the ABCD Data Repository can be found at https://nda.nih.gov/abcd/about.


    Articles from PLoS ONE are provided here courtesy of PLOS
